gdb: recognize 64 bits Windows executables as Cygwin osabi

Message ID 20200307041742.31158-1-simon.marchi@efficios.com
State New

Commit Message

Simon Marchi March 7, 2020, 4:17 a.m.
If I generate two Windows PE executables, one 32 bits and one 64 bits:

    $ x86_64-w64-mingw32-gcc test.c -g3 -O0 -o test_64
    $ i686-w64-mingw32-gcc test.c -g3 -O0 -o test_32
    $ file test_64
    test_64: PE32+ executable (console) x86-64, for MS Windows
    $ file test_32
    test_32: PE32 executable (console) Intel 80386, for MS Windows

When I load the 32 bits binary in my GNU/Linux-hosted GDB, the osabi is
correctly recognized as "Cygwin":

    $ ./gdb --data-directory=data-directory -nx test_32
    (gdb) show osabi
    The current OS ABI is "auto" (currently "Cygwin").

When I load the 64 bits binary in GDB, the osabi is incorrectly
recognized as "GNU/Linux":

    $ ./gdb --data-directory=data-directory -nx test_64
    (gdb) show osabi
    The current OS ABI is "auto" (currently "GNU/Linux").

The 32 bits one gets recognized by the i386_cygwin_osabi_sniffer
function, by its target name:

    if (strcmp (target_name, "pei-i386") == 0)
      return GDB_OSABI_CYGWIN;

The target name for the 64 bits binaries is "pei-x86-64".  It doesn't
get recognized by any osabi sniffer, so GDB falls back on its default
osabi, "GNU/Linux".

This patch adds an osabi sniffer function for the Windows 64 bits
executables in amd64-windows-tdep.c.  With it, the osabi is recognized
as "Cygwin", just like with the 32 bits binary.
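
As a rough illustration of the name-based check described above, here is a
standalone C model that treats both PE target names the same way.  The enum
and the function name are hypothetical stand-ins, not GDB's own definitions;
the actual change is the diff at the end of this page.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for GDB's osabi enum; only two values are modelled.  */
enum osabi_model { OSABI_UNKNOWN, OSABI_CYGWIN };

/* Map a BFD target name to an osabi, covering both PE flavours.  */
enum osabi_model
sniff_pe_osabi (const char *target_name)
{
  assert (target_name != NULL);

  if (strcmp (target_name, "pei-i386") == 0
      || strcmp (target_name, "pei-x86-64") == 0)
    return OSABI_CYGWIN;

  /* Anything else is left for other sniffers to decide.  */
  return OSABI_UNKNOWN;
}
```

Returning the "unknown" value for unrecognized names mirrors the patch, which
leaves the decision to other sniffers or to the default.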

I'm not very familiar with the Windows platform, and from what I know
Cygwin and mingw are different things.  A binary compiled with mingw
does not mean it runs on Cygwin.  But from what I understand, the osabi
"Cygwin" in GDB is used for everything Windows.  Not sure that's the
ideal, but that's the way it seems to be right now.

I think it would be good to have a test for this, testing that both 32
bits and 64 bits Windows binaries get correctly recognized, regardless
of the host GDB runs on.  However, I'm not sure how to proceed.

The easiest way would be to generate two small executables and check
them in the repo.  That would allow the test to run on pretty much any
platform GDB is built on.  However, that requires checking in binaries
in the repo, something we don't really do at the moment.

The alternative would be to try to call a mingw/mingw64 compiler
(x86_64-w64-mingw32-gcc in my case), and hope it is installed on the
host.  I find that this greatly reduces the effectiveness of the test,
as it will not be run in as many environments.

If someone has an opinion on this or a better idea, I'd like to hear it.

gdb/ChangeLog:

	* amd64-windows-tdep.c (amd64_windows_osabi_sniffer): New
	function.
	(_initialize_amd64_windows_tdep): Register osabi sniffer.
---
 gdb/amd64-windows-tdep.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

-- 
2.25.1

Comments

Eli Zaretskii March 7, 2020, 8:09 a.m. | #1
> From: Simon Marchi <simon.marchi@efficios.com>
> Cc: Simon Marchi <simon.marchi@efficios.com>
> Date: Fri,  6 Mar 2020 23:17:42 -0500
>
> When I load the 32 bits binary in my GNU/Linux-hosted GDB, the osabi is
> correctly recognized as "Cygwin":
>
>     $ ./gdb --data-directory=data-directory -nx test_32
>     (gdb) show osabi
>     The current OS ABI is "auto" (currently "Cygwin").


Why is this correct?  Your debuggee is a MinGW program, not a Cygwin
program.  The OS ABI should say "MinGW" or perhaps even "MS-Windows"
(since MinGW programs are just native Windows executables).

I'm guessing that this is some historical left-over: only Cygwin took
care of returning an OS ABI name at some point in the past, and we
never bothered to augment that for MinGW programs.  I suggest that we
do TRT now, as long as we are on this subject.

The difference between a Cygwin program and a native Windows program
is that the former has a dependency on the Cygwin DLL (or MSYS DLL, if
we want to support MSYS/MSYS2 executables).  Is it possible to make
this distinction where we decide on the OS ABI?

In any case, I see no reason to say that "pei-i386" executables are
necessarily Cygwin programs, the default should be "MS-Windows".

Thanks.
Simon Marchi March 7, 2020, 4:51 p.m. | #2
On 2020-03-07 3:09 a.m., Eli Zaretskii wrote:
>> From: Simon Marchi <simon.marchi@efficios.com>
>> Cc: Simon Marchi <simon.marchi@efficios.com>
>> Date: Fri,  6 Mar 2020 23:17:42 -0500
>>
>> When I load the 32 bits binary in my GNU/Linux-hosted GDB, the osabi is
>> correctly recognized as "Cygwin":
>>
>>     $ ./gdb --data-directory=data-directory -nx test_32
>>     (gdb) show osabi
>>     The current OS ABI is "auto" (currently "Cygwin").
>
> Why is this correct?  Your debuggee is a MinGW program, not a Cygwin
> program.  The OS ABI should say "MinGW" or perhaps even "MS-Windows"
> (since MinGW programs are just native Windows executables).


As I said, I know that it's not absolutely correct.  But having the native
executable be detected as "Cygwin" seems relatively more correct compared
to the current situation.  The context is the following:

  https://sourceware.org/ml/gdb-patches/2020-03/msg00151.html

Currently, loading the 64-bits .exe in a GNU/Linux-hosted GDB ends up calling
the svr4 libraries code, which is plain wrong.  By using the Cygwin osabi,
at least the right shared libraries functions are used.

I agree with what you suggest below, but I think that the current patch is
still a step forward and improves things.

> I'm guessing that this is some historical left-over: only Cygwin took
> care of returning an OS ABI name at some point in the past, and we
> never bothered to augment that for MinGW programs.  I suggest that we
> do TRT now, as long as we are on this subject.


I think it makes sense and would avoid some confusion.  I was surprised
to see the Cygwin osabi used when debugging a native Windows program.

> The difference between a Cygwin program and a native Windows program
> is that the former has a dependency on the Cygwin DLL (or MSYS DLL, if
> we want to support MSYS/MSYS2 executables).  Is it possible to make
> this distinction where we decide on the OS ABI?


One question is, do we need this distinction at all?  Let's look at
where the Cygwin osabi comes into play.  When the GDB_OSABI_CYGWIN
osabi is detected, the "i386_cygwin_init_abi" function from
i386-cygwin-tdep.c is called.

In this function, I see nothing Cygwin-specific, all that is in there
seems to apply equally to native Windows executables and Cygwin
executables.  So it seems like we could just rename the "Cygwin" osabi
to "MS-Windows".  Except that would be a breaking change, as the
command "set osabi Cygwin" wouldn't work anymore.

So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
"MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
executable would be detected as "MS-Windows".

We could try to detect whether the binary is using the cygwin or msys
dll and if so apply the Cygwin osabi, but that would be just for
aesthetic purposes, as the two osabis would be functionally equivalent
(at least for now).  It can still be useful to avoid confusion: if we
have a Cygwin osabi, but Cygwin binaries are not recognized to use the
Cygwin osabi, it just looks like a bug even if it isn't.  I would need
to dig into BFD and the PE/coff format to see how we could find this
information.
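
A minimal sketch of that idea in C, assuming the list of imported DLL names
has already been extracted from the executable.  The type and function names
are hypothetical, and the runtime DLL names "cygwin1.dll" and "msys-2.0.dll"
are assumptions about what a Cygwin or MSYS2 binary would import:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classification; not part of GDB.  */
enum pe_flavour { PE_NATIVE_WINDOWS, PE_CYGWIN };

/* Case-insensitive equality, since DLL names are not case-sensitive.  */
int
dll_name_equal (const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0')
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  /* Equal only if both strings ended at the same time.  */
  return *a == *b;
}

/* Return PE_CYGWIN if any of the N names in DLLS is a Cygwin-style runtime.  */
enum pe_flavour
classify_by_imports (const char *const *dlls, size_t n)
{
  /* Assumed runtime DLL names (Cygwin and MSYS2).  */
  static const char *const runtime_dlls[] = { "cygwin1.dll", "msys-2.0.dll" };
  const size_t n_runtime = sizeof runtime_dlls / sizeof runtime_dlls[0];

  assert (dlls != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n_runtime; j++)
      if (dll_name_equal (dlls[i], runtime_dlls[j]))
        return PE_CYGWIN;

  return PE_NATIVE_WINDOWS;
}
```

Defaulting to the native Windows result matches the suggestion that
"MS-Windows" be the default and "Cygwin" the special case.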

If we do such a change, I would like it to be done on top of the current
patch, as to not mix concerns.

> In any case, I see no reason to say that "pei-i386" executables are
> necessarily Cygwin programs, the default should be "MS-Windows".


Agreed.

Simon
Eli Zaretskii March 7, 2020, 5:45 p.m. | #3
> Cc: gdb-patches@sourceware.org
> From: Simon Marchi <simon.marchi@efficios.com>
> Date: Sat, 7 Mar 2020 11:51:08 -0500
>
>   https://sourceware.org/ml/gdb-patches/2020-03/msg00151.html
>
> Currently, loading the 64-bits .exe in a GNU/Linux-hosted GDB ends up calling
> the svr4 libraries code, which is plain wrong.  By using the Cygwin osabi,
> at least the right shared libraries functions are used.
>
> I agree with what you suggest below, but I think that the current patch is
> still a step forward and improves things.


I agree.  I just think we can do better.

> So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
> "MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
> executable would be detected as "MS-Windows".


That's fine with me, and IMO will be more accurate than calling them
all "Cygwin", since Cygwin programs are just a peculiar kind of
Windows executables.

> If we do such a change, I would like it to be done on top of the current
> patch, as to not mix concerns.


I'm okay with that, thanks.
Simon Marchi March 7, 2020, 6:16 p.m. | #4
On 2020-03-07 12:45 p.m., Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org
>> From: Simon Marchi <simon.marchi@efficios.com>
>> Date: Sat, 7 Mar 2020 11:51:08 -0500
>>
>>   https://sourceware.org/ml/gdb-patches/2020-03/msg00151.html
>>
>> Currently, loading the 64-bits .exe in a GNU/Linux-hosted GDB ends up calling
>> the svr4 libraries code, which is plain wrong.  By using the Cygwin osabi,
>> at least the right shared libraries functions are used.
>>
>> I agree with what you suggest below, but I think that the current patch is
>> still a step forward and improves things.
>
> I agree.  I just think we can do better.
>
>> So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
>> "MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
>> executable would be detected as "MS-Windows".
>
> That's fine with me, and IMO will be more accurate than calling them
> all "Cygwin", since Cygwin programs are just a peculiar kind of
> Windows executables.


Ok, I can make a second patch that introduces this new "MS-Windows" osabi.

>
>> If we do such a change, I would like it to be done on top of the current
>> patch, as to not mix concerns.
>
> I'm okay with that, thanks.
>


I looked up some information about how PE executables list their DLL dependencies;
apparently it's in the ".idata" section.  There is some doc here:

  https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#the-idata-section

and here:

  https://blog.kowalczyk.info/articles/pefileformat.html#9ccef823-67e7-4372-9172-045d7b1fb006

With that, I was able to parse enough of .idata in a very crude way (consider it just a prototype)
and get the DLL names.
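
For illustration only (this is not the prototype mentioned above), walking the
import directory could look roughly like the C sketch below.  It assumes the
section contents are already in memory, that the import directory table starts
at the beginning of the section, and that the DLL name strings live in the same
section; none of that is guaranteed for arbitrary PE files.  The entry layout
(20-byte entries, Name RVA at offset 12, all-zero terminator) follows the
Microsoft PE format documentation linked above.  All identifiers are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMPORT_ENTRY_SIZE 20	/* Size of one import directory entry.  */
#define NAME_RVA_OFFSET   12	/* Offset of the Name RVA field in an entry.  */

/* Read a little-endian 32-bit value.  */
uint32_t
read_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
	 | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Walk the import directory held in SEC (SIZE bytes, mapped at RVA SEC_RVA)
   and store pointers to at most MAX DLL names in NAMES.  Return the number
   of names stored.  */
size_t
list_imported_dlls (const unsigned char *sec, size_t size, uint32_t sec_rva,
		    const char **names, size_t max)
{
  static const unsigned char zero[IMPORT_ENTRY_SIZE];
  size_t count = 0;

  assert (sec != NULL && names != NULL);

  for (size_t off = 0; off + IMPORT_ENTRY_SIZE <= size && count < max;
       off += IMPORT_ENTRY_SIZE)
    {
      const unsigned char *entry = sec + off;

      /* An all-zero entry terminates the table.  */
      if (memcmp (entry, zero, IMPORT_ENTRY_SIZE) == 0)
	break;

      uint32_t name_rva = read_le32 (entry + NAME_RVA_OFFSET);

      /* Skip names that point outside this section.  */
      if (name_rva < sec_rva || name_rva - sec_rva >= size)
	continue;

      size_t name_off = name_rva - sec_rva;
      const char *name = (const char *) sec + name_off;

      /* Require a NUL terminator inside the section.  */
      if (memchr (name, '\0', size - name_off) == NULL)
	continue;

      names[count++] = name;
    }

  return count;
}
```

The bounds checks are there because the input is an untrusted file; a real
implementation would more likely go through BFD than re-parse the section by
hand.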

I noticed that there is some BFD code that also parses it:

  https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/peXXigen.c;h=e42d646552a0ca1e856e082256cd3d943b54ddf0;hb=HEAD#l1261

however, it's coupled with the printing code, so not very easy to re-use.  Ideally, one would
refactor this code to provide a nice BFD interface to look up this information, but I don't
really have time for that.  I wrote to the binutils mailing list (sourceware is in maintenance
right now so I can't provide the archive link) to ask what we can do about that.

Simon
Jon Turney March 8, 2020, 2:05 p.m. | #5
On 07/03/2020 17:45, Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org
>> From: Simon Marchi <simon.marchi@efficios.com>
>> Date: Sat, 7 Mar 2020 11:51:08 -0500
>>
>>    https://sourceware.org/ml/gdb-patches/2020-03/msg00151.html
>>
>> Currently, loading the 64-bits .exe in a GNU/Linux-hosted GDB ends up calling
>> the svr4 libraries code, which is plain wrong.  By using the Cygwin osabi,
>> at least the right shared libraries functions are used.
>>
>> I agree with what you suggest below, but I think that the current patch is
>> still a step forward and improves things.
>
> I agree.  I just think we can do better.
>
>> So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
>> "MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
>> executable would be detected as "MS-Windows".


I believe this suggestion for x86_64 is wrong, in the other direction:
x86_64 Cygwin is LP64, but Windows is LLP64 (See also table in [1])

(currently 'print sizeof(long)' incorrectly returns 4 on a gdb built for 
Cygwin)

There was some discussion that these need to be separate osabis 
previously, I think.

[1] https://cygwin.com/faq.html#faq.programming.64bitporting
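
To make the difference concrete, here is a small C table of the two data
models.  The sizes are the standard definitions of LP64 and LLP64; the struct
and function names are hypothetical and nothing here is taken from GDB:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Widths, in bits, of the basic C types under a given data model.  */
struct data_model
{
  const char *name;
  int int_bits;
  int long_bits;
  int long_long_bits;
  int pointer_bits;
};

/* Return the model called NAME, or NULL if it is not in the table.  */
const struct data_model *
lookup_data_model (const char *name)
{
  static const struct data_model models[] = {
    { "LP64", 32, 64, 64, 64 },	 /* x86_64 Cygwin, GNU/Linux.  */
    { "LLP64", 32, 32, 64, 64 }, /* Native 64-bit Windows.  */
  };

  assert (name != NULL);

  for (size_t i = 0; i < sizeof models / sizeof models[0]; i++)
    if (strcmp (models[i].name, name) == 0)
      return &models[i];

  return NULL;
}
```

Only the width of "long" differs between the two rows, which is why
'print sizeof(long)' is a convenient check of which model the debugger
assumes.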

> That's fine with me, and IMO will be more accurate than calling them
> all "Cygwin", since Cygwin programs are just a peculiar kind of
> Windows executables.
>
>> If we do such a change, I would like it to be done on top of the current
>> patch, as to not mix concerns.
>
> I'm okay with that, thanks.
Jon Turney March 9, 2020, 3:39 p.m. | #6
On 07/03/2020 17:45, Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org
>> From: Simon Marchi <simon.marchi@efficios.com>
>> Date: Sat, 7 Mar 2020 11:51:08 -0500
>>
>>    https://sourceware.org/ml/gdb-patches/2020-03/msg00151.html
>>
>> Currently, loading the 64-bits .exe in a GNU/Linux-hosted GDB ends up calling
>> the svr4 libraries code, which is plain wrong.  By using the Cygwin osabi,
>> at least the right shared libraries functions are used.
>>
>> I agree with what you suggest below, but I think that the current patch is
>> still a step forward and improves things.
>
> I agree.  I just think we can do better.
>
>> So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
>> "MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
>> executable would be detected as "MS-Windows".


I believe this suggestion for x86_64 is wrong, in the other direction:
x86_64 Cygwin is LP64, but Windows is LLP64 (See also table in [1])

(This is handled incorrectly in Cygwin at the moment, e.g. in our build 
of gdb 'print sizeof(long)' returns 4)

There was some discussion that these need to be separate osabis 
previously, I think.

[1] https://cygwin.com/faq.html#faq.programming.64bitporting


> That's fine with me, and IMO will be more accurate than calling them
> all "Cygwin", since Cygwin programs are just a peculiar kind of
> Windows executables.
>
>> If we do such a change, I would like it to be done on top of the current
>> patch, as to not mix concerns.
>
> I'm okay with that, thanks.
Eli Zaretskii March 10, 2020, 3:16 p.m. | #7
> From: Jon Turney <jon.turney@dronecode.org.uk>
> Date: Sun, 8 Mar 2020 14:05:10 +0000
>
> >> So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
> >> "MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
> >> executable would be detected as "MS-Windows".
>
> I believe this suggestion for x86_64 is wrong, in the other direction:
> x86_64 Cygwin is LP64, but Windows is LLP64 (See also table in [1])


If the LP64 thing is part of what defines the OS ABI, then yes, Cygwin
should have a separate value.

> (currently 'print sizeof(long)' incorrectly returns 4 on a gdb built for
> Cygwin)


So I guess calling it "Cygwin" doesn't help? ;-)
Simon Marchi March 10, 2020, 3:45 p.m. | #8
On 2020-03-10 11:16 a.m., Eli Zaretskii wrote:
>> From: Jon Turney <jon.turney@dronecode.org.uk>
>> Date: Sun, 8 Mar 2020 14:05:10 +0000
>>
>>>> So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
>>>> "MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
>>>> executable would be detected as "MS-Windows".
>>
>> I believe this suggestion for x86_64 is wrong, in the other direction:
>> x86_64 Cygwin is LP64, but Windows is LLP64 (See also table in [1])
>
> If the LP64 thing is part of what defines the OS ABI, then yes, Cygwin
> should have a separate value.


If I understand correctly, that's one practical reason for introducing the separate
"Windows" OS ABI?

Simon
Jon Turney March 14, 2020, 3:35 p.m. | #9
On 10/03/2020 15:45, Simon Marchi wrote:
> On 2020-03-10 11:16 a.m., Eli Zaretskii wrote:
>>> From: Jon Turney <jon.turney@dronecode.org.uk>
>>> Date: Sun, 8 Mar 2020 14:05:10 +0000
>>>
>>>>> So what we can do is add an "MS-Windows" osabi and make "Cygwin" and
>>>>> "MS-Windows" functionally equivalent.  Any "pei-i386" or "pei-x86-64"
>>>>> executable would be detected as "MS-Windows".
>>>
>>> I believe this suggestion for x86_64 is wrong, in the other direction:
>>> x86_64 Cygwin is LP64, but Windows is LLP64 (See also table in [1])
>>
>> If the LP64 thing is part of what defines the OS ABI, then yes, Cygwin
>> should have a separate value.
>
> If I understand correctly, that's one practical reason for introducing the separate
> "Windows" OS ABI?


That's the suggestion in:

https://sourceware.org/bugzilla/show_bug.cgi?id=21500

Patch

diff --git a/gdb/amd64-windows-tdep.c b/gdb/amd64-windows-tdep.c
index d4d79682dd..2ca979513c 100644
--- a/gdb/amd64-windows-tdep.c
+++ b/gdb/amd64-windows-tdep.c
@@ -1244,10 +1244,24 @@  amd64_windows_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   set_gdbarch_auto_wide_charset (gdbarch, amd64_windows_auto_wide_charset);
 }
 
+static gdb_osabi
+amd64_windows_osabi_sniffer (bfd *abfd)
+{
+  const char *target_name = bfd_get_target (abfd);
+
+  if (strcmp (target_name, "pei-x86-64") == 0)
+    return GDB_OSABI_CYGWIN;
+
+  return GDB_OSABI_UNKNOWN;
+}
+
 void _initialize_amd64_windows_tdep ();
 void
 _initialize_amd64_windows_tdep ()
 {
   gdbarch_register_osabi (bfd_arch_i386, bfd_mach_x86_64, GDB_OSABI_CYGWIN,
                           amd64_windows_init_abi);
+
+  gdbarch_register_osabi_sniffer (bfd_arch_i386, bfd_target_coff_flavour,
+				  amd64_windows_osabi_sniffer);
 }