Fix lookup of separate debug file on MS-Windows

Message ID 83ef5ze84y.fsf@gnu.org
State New
Headers show
Series
  • Fix lookup of separate debug file on MS-Windows
Related show

Commit Message

Eli Zaretskii April 18, 2019, 1:49 p.m.
If you put the separate debug file in a global debug directory, GDB on
MS-Windows will currently fail to find it.  This happens because we
obtain the directory to look up the debug file by concatenating the
debug directory name with the leading directories of the executable,
and the latter includes the drive letter on MS-Windows.  So we get an
invalid file name like

   d:/usr/lib/debug/d:/usr/bin/foo.debug

The patch below fixes that:



OK to commit to both branches (with the necessary ChangeLog entries)?

Btw, the removal of the leading slash of dir_notarget could
potentially benefit Posix systems as well.  Or maybe we should not
append the literal slash in this snippet from
find_separate_debug_file:

      debugfile = target_prefix ? "target:" : "";
      debugfile += debugdir.get ();
      debugfile += "/";  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      debugfile += dir_notarget;
      debugfile += debuglink;

since AFAICT dir_notarget will always begin with a slash (I added a
test for that because I wasn't sure that is indeed so).

Comments

LRN April 18, 2019, 4:19 p.m. | #1
On 18.04.2019 16:49, Eli Zaretskii wrote:
> If you put the separate debug file in a global debug directory, GDB on

> MS-Windows will currently fail to find it.  This happens because we

> obtain the directory to look up the debug file by concatenating the

> debug directory name with the leading directories of the executable,

> and the latter includes the drive letter on MS-Windows.  So we get an

> invalid file name like

> 

>    d:/usr/lib/debug/d:/usr/bin/foo.debug

> 

> +  /* MS-Windows/MS-DOS don't allow colons in file names; we must strip

> +     the drive letter, so that the file name resulting from splicing

> +     below will be valid.  */

> +  if (HAS_DRIVE_SPEC (dir_notarget))

> +    {

> +      dir_notarget = STRIP_DRIVE_SPEC (dir_notarget);

> +      /* We will append a slash to debugdir, so remove the leading

> +	 slash as well.  */

> +      if (IS_DIR_SEPARATOR (dir_notarget[0]))

> +	dir_notarget++;

> +    }

> +


While this is *a* fix, the result, AFAIU is:

d:/usr/lib/debug/usr/bin/foo.debug

This is a functional filename, but is it correct? What if the path to "foo" is:

c:/some/directory/bin/foo

? In that case your patch will produce:

d:/usr/lib/debug/some/directory/bin/foo

Actually, no, that's not right. If gdb and 'foo' are in the same tree (this is
a valid case, for example, if you're dealing with a mingw installation), then
gdb will autodetect debug directory as 'c:/some/directory/lib/debug', and the
debug path will be:

c:/some/directory/lib/debug/some/directory/bin/foo

To make it more obvious:

c:/some/directory/mingw/bin/foo + c:/some/directory/mingw/bin/gdb (meaning
c:/some/directory/mingw/lib/debug debug directory)

will produce

c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo

, if i'm not mistaken. So the problems i see with this approach are:

1) Drive letters are lost (meaning that you can't have two paths with equal
names on two different drives be mapped to different debug tree paths - they
only differ in the drive letter, and you make that letter disappear)

A more correct way to fix this is to replace 'X:/' with 'X/'.

2) This doesn't take runtime tree location into account. Your debug tree has to
reflect the absolute, Windows path to the binary (since you're not doing any
other adjustments, and since gdb gives you an asbolute, Windows path to begin
with), otherwise gdb won't find the debug files.
I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'
example above the second '/some/directory' is clearly incidental, as the user
just used that as a root of an installation. The package maintainer can't know
where the binaries will end up, and can only predict the '/mingw/bin/foo' part.
I.e. the path that gdb should look for should be:

c:/some/directory/mingw/lib/debug/mingw/bin/foo

or

c:/some/directory/mingw/lib/debug/bin/foo

(depending on whether we consider '/mingw' or '/' as the root directory;
probably the former, although some people might make a case for the latter),
and this is where a package manager can put the files (the package will contain
'/mingw/lib/debug/bin/foo' debug file, and the package manager will prepend the
install root to that path when installing).

Note that (1) and (2) somewhat contradict one another.

One way to satisfy everyone is to implement all lookups (with the prefix first,
then with the absolute path with a drive letter; after that it might even try
the without-drive-letter case, or something else, as long as the debug info
file isn't found). This is not unheard of - for example, gcc can look for
include files in multiple non-existing directories, to support different
installation schemes.

(the reason why i have such a detailed opinion on the matter is that i've
debugged this very code for my latest gdb patch[0], which i submitted last
month, and back then i noticed the same behaviour you did).

[0]: https://www.sourceware.org/ml/gdb-patches/2019-03/msg00082.html
Eli Zaretskii April 18, 2019, 6:40 p.m. | #2
> From: LRN <lrn1986@gmail.com>

> Date: Thu, 18 Apr 2019 19:19:39 +0300

> 

> While this is *a* fix, the result, AFAIU is:

> 

> d:/usr/lib/debug/usr/bin/foo.debug

> 

> This is a functional filename, but is it correct?


It fits the documentation, so yes.

> 1) Drive letters are lost (meaning that you can't have two paths with equal

> names on two different drives be mapped to different debug tree paths - they

> only differ in the drive letter, and you make that letter disappear)

> 

> A more correct way to fix this is to replace 'X:/' with 'X/'.


We could do that, but it will need a documentation change.  I think
it's overkill, as the chances of the binaries and the debug directory
to be on different drives are nil, but if others prefer to convert X:/
to X/, I'm fine with that.

> 2) This doesn't take runtime tree location into account. Your debug tree has to

> reflect the absolute, Windows path to the binary (since you're not doing any

> other adjustments, and since gdb gives you an asbolute, Windows path to begin

> with), otherwise gdb won't find the debug files.

> I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'

> example above the second '/some/directory' is clearly incidental, as the user

> just used that as a root of an installation.


It is as "incidental" as the same situation of Posix platforms, where
we simply splice the two directories.

> The package maintainer can't know where the binaries will end up,

> and can only predict the '/mingw/bin/foo' part.


Package maintainers shouldn't use global debug directories, because
for starters such directories might not exist at all, and even if they
do, their location cannot be predicted in advance.  Package
maintainers should instead use the .debug subdirectory of where they
put the binaries, that arrangement already works, and is predictable.

> I.e. the path that gdb should look for should be:

> 

> c:/some/directory/mingw/lib/debug/mingw/bin/foo

> 

> or

> 

> c:/some/directory/mingw/lib/debug/bin/foo


This would deviate from what we do on Unix, so I'm opposed to it.
Unless we also change the Unix behavior, of course.
LRN April 18, 2019, 9:41 p.m. | #3
On 18.04.2019 21:40, Eli Zaretskii wrote:
>> From: LRN 

>> Date: Thu, 18 Apr 2019 19:19:39 +0300

>>

>> 2) This doesn't take runtime tree location into account. Your debug tree has to

>> reflect the absolute, Windows path to the binary (since you're not doing any

>> other adjustments, and since gdb gives you an asbolute, Windows path to begin

>> with), otherwise gdb won't find the debug files.

>> I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'

>> example above the second '/some/directory' is clearly incidental, as the user

>> just used that as a root of an installation.

> 

> It is as "incidental" as the same situation of Posix platforms, where

> we simply splice the two directories.

> 

>> The package maintainer can't know where the binaries will end up,

>> and can only predict the '/mingw/bin/foo' part.

> 

> Package maintainers shouldn't use global debug directories, because

> for starters such directories might not exist at all, and even if they

> do, their location cannot be predicted in advance.  Package

> maintainers should instead use the .debug subdirectory of where they

> put the binaries, that arrangement already works, and is predictable.


Wait, what?
I've just built a "foo" program, it installs into `/mingw/bin/foo`, and i know
that `/mingw/bin/gdb` will look for debug files in `/mingw/lib/debug` (note
that gdb already does this, AFAIR). So it makes sense for me to install debug
file for 'foo' as `/mingw/lib/debug/mingw/bin/foo`. I don't see why debug files
can't be packaged like that.

I was also going to point out that distros do this, but Debian doesn't at the
moment - it uses build-ids instead. But --build-id is relatively new, appeared
in 2011. I distinctly remember that distros used debug directories before then
(unless my memory is starting to give, that is)

> 

>> I.e. the path that gdb should look for should be:

>>

>> c:/some/directory/mingw/lib/debug/mingw/bin/foo

>>

>> or

>>

>> c:/some/directory/mingw/lib/debug/bin/foo

> 

> This would deviate from what we do on Unix, so I'm opposed to it.

> Unless we also change the Unix behavior, of course.

> 


Looking for debug info for `/usr/bin/foo` in `/usr/lib/debug/usr/bin/foo` is
OK, and the paths i described above are exactly the same, not deviating - just
accounting for Windows specifics. The specifics in this case being the fact
that on Windows there's no pre-defined root directory, so packages that want to
have a notion of a root directory have to pick *some* directory and consider it
"root". Usually it's autodetected at runtime as the parent of the bin
subdirectory where the binary is (conveniently, all binaries go into bin
subdir; and yes, this is problematic for plugins - but they usually don't need
to know where the root is). Once you consider the "c:/some/directory" part of
the paths above as "some path to root directory that we don't take into
account", everything becomes clear. (i'm explaining this just for the sake of
the argument, since you already know all this).

Unless you meant strictly the second string (the one with a stripped '/mingw'
prefix), in which case i would agree that this is more-or-less unortodox, even
if somewhat plausible.

As i've said, all these approaches are not mutually-exclusive, gdb can be made
to look in *all* of these places until it finds a debug file.
Eli Zaretskii April 19, 2019, 6:48 a.m. | #4
> From: LRN <lrn1986@gmail.com>

> Date: Fri, 19 Apr 2019 00:41:53 +0300

> 

> >> The package maintainer can't know where the binaries will end up,

> >> and can only predict the '/mingw/bin/foo' part.

> > 

> > Package maintainers shouldn't use global debug directories, because

> > for starters such directories might not exist at all, and even if they

> > do, their location cannot be predicted in advance.  Package

> > maintainers should instead use the .debug subdirectory of where they

> > put the binaries, that arrangement already works, and is predictable.

> 

> Wait, what?

> I've just built a "foo" program, it installs into `/mingw/bin/foo`, and i know

> that `/mingw/bin/gdb` will look for debug files in `/mingw/lib/debug` (note

> that gdb already does this, AFAIR). So it makes sense for me to install debug

> file for 'foo' as `/mingw/lib/debug/mingw/bin/foo`. I don't see why debug files

> can't be packaged like that.


You assume that you know how the end-user's GDB was configured.  But
that assumption is false in general.  E.g., I build my own GDB, and I
configure it with a global debug directory that you cannot possibly
know about in advance.

Moreover, the global debug directory is "relocatable", i.e. it is
computed based on the directory where gdb.exe lives, which makes its
exact place even less predictable in advance when an unrelated package
is being prepared.

By contrast, the .debug subdirectory of the binary's installation
directory is searched by default by every GDB build, so it is safer.
(You could also put the debug info file in the same directory as the
binary, but that is less clean, IMO.)

> I was also going to point out that distros do this, but Debian doesn't at the

> moment - it uses build-ids instead. But --build-id is relatively new, appeared

> in 2011. I distinctly remember that distros used debug directories before then

> (unless my memory is starting to give, that is)


The --build-id way is less convenient because AFAIK it requires to
tweak the link command.  It also requires creation of directories with
ugly long human-unreadable names.

> >> I.e. the path that gdb should look for should be:

> >>

> >> c:/some/directory/mingw/lib/debug/mingw/bin/foo

> >>

> >> or

> >>

> >> c:/some/directory/mingw/lib/debug/bin/foo

> > 

> > This would deviate from what we do on Unix, so I'm opposed to it.

> > Unless we also change the Unix behavior, of course.

> > 

> 

> Looking for debug info for `/usr/bin/foo` in `/usr/lib/debug/usr/bin/foo` is

> OK, and the paths i described above are exactly the same, not deviating - just

> accounting for Windows specifics.


They omit the common prefix /some/directory/, unlike on Unix, where
/usr of /usr/bin is not omitted.  That's the deviation I alluded to.
And it runs afoul of the use case that you raised as an objection to
dropping the drive letter: programs by the same name installed in
different places.  If having such programs on different drives is not
rare enough, having them in different directories on the same drive is
even less rare.

> The specifics in this case being the fact

> that on Windows there's no pre-defined root directory, so packages that want to

> have a notion of a root directory have to pick *some* directory and consider it

> "root". Usually it's autodetected at runtime as the parent of the bin

> subdirectory where the binary is (conveniently, all binaries go into bin

> subdir; and yes, this is problematic for plugins - but they usually don't need

> to know where the root is). Once you consider the "c:/some/directory" part of

> the paths above as "some path to root directory that we don't take into

> account", everything becomes clear. (i'm explaining this just for the sake of

> the argument, since you already know all this).


What you say will make sense only if people install software from a
single source, like the MSYS2 project.  But that is a dangerous
assumption, IMO.  It is better to have a system that doesn't make such
an assumption, especially since what I propose is more closely
following what happens on Unix.

> As i've said, all these approaches are not mutually-exclusive, gdb can be made

> to look in *all* of these places until it finds a debug file.


That would need to be a general feature, not limited to Windows.
LRN April 19, 2019, 10:06 a.m. | #5
On 19.04.2019 9:48, Eli Zaretskii wrote:
>> From: LRN

>> Date: Fri, 19 Apr 2019 00:41:53 +0300

>>

>> I've just built a "foo" program, it installs into `/mingw/bin/foo`, and i know

>> that `/mingw/bin/gdb` will look for debug files in `/mingw/lib/debug` (note

>> that gdb already does this, AFAIR). So it makes sense for me to install debug

>> file for 'foo' as `/mingw/lib/debug/mingw/bin/foo`. I don't see why debug files

>> can't be packaged like that.

> 

> You assume that you know how the end-user's GDB was configured.  But

> that assumption is false in general.  E.g., I build my own GDB, and I

> configure it with a global debug directory that you cannot possibly

> know about in advance.


If you built your gdb to look for debug files in a completely unpredictable
global directory (i.e. hardcoded 'f:/debugfiles' instead of letting gdb
autodetect the debug directory relative to its own runtime prefix), then your
only choice is to tell gdb at runtime that it should look for debug files in
'f:/debugfiles' (there's probably a command or an envvar that achieves this).
And you will have to place debug files in there by hand (or using some kind of
custom tool), because no one else knows in advance that debug files should go
into 'f:/debugfiles'.
This covers there "where the debug directory root is" part of the lookup. The
second part (path to the binary -> path to the debug file mapping) doesn't
depend on how you configure gdb.

> 

> Moreover, the global debug directory is "relocatable", i.e. it is

> computed based on the directory where gdb.exe lives, which makes its

> exact place even less predictable in advance when an unrelated package

> is being prepared.


If you didn't hardcode "f:/debugfiles", but hardcoded something along the lines
of "/debug/files" (instead of the default "/lib/debug"), then that is indeed
the case. Either way, the conventional "/lib/debug" path is no longer correct,
and the assumption made when debug files were built (the assumption that debug
files should be installed into "/lib/debug") and installed is false. So you are
right - in this case no one but you can know where the debug files should go. I
don't see how your desire to use unconventional debug directories is a valid
justification for gdb to not look in conventional debug directories.

If gdb and debugee are installed completely separately (in different root
directories), then you're right - prefix-aware lookup might not find the debug
files. Well, it *might* find them, if both packages follow FHS (i.e.
"c:/foo/bin/gdb" will look in "c:/foo/debugfiles/bin/bar" for a debug file for
"d:/zool/bin/bar", having concluded (how? i'm starting to doubt that this
scenario is going to work after all) that the adjusted path to 'bar' is
'/bin/bar'), but you certainly won't be able to have different debug files for
two different installs of a debugee in two different root directories.

Though i wonder who would want to do that - install gdb in a:/foo, install
debugee_v1 in b:/bar and install debugee_v2 in c:/zool...i mean, Windows
doesn't make it easy to install multiple things into multiple places, as you
have to add each 'bin' subdir to PATH, otherwise DLLs won't be found. This is
not scalable, and i wouldn't want to encourage anyone to do this.

> 

> By contrast, the .debug subdirectory of the binary's installation

> directory is searched by default by every GDB build, so it is safer.

> (You could also put the debug info file in the same directory as the

> binary, but that is less clean, IMO.)


This is why i didn't raise the issue originally - i actually do place dbg files
in the same directory as their binaries. It's not clean, indeed, but it worked,
works, and will likely always work. So gdb's inability to correctly find debug
files in a separate debug directory is not critical to me.

>>>> I.e. the path that gdb should look for should be:

>>>>

>>>> c:/some/directory/mingw/lib/debug/mingw/bin/foo

>>>>

>>>> or

>>>>

>>>> c:/some/directory/mingw/lib/debug/bin/foo

>>>

>>> This would deviate from what we do on Unix, so I'm opposed to it.

>>> Unless we also change the Unix behavior, of course.

>>>

>>

>> Looking for debug info for `/usr/bin/foo` in `/usr/lib/debug/usr/bin/foo` is

>> OK, and the paths i described above are exactly the same, not deviating - just

>> accounting for Windows specifics.

>> The specifics in this case being the fact

>> that on Windows there's no pre-defined root directory, so packages that want to

>> have a notion of a root directory have to pick *some* directory and consider it

>> "root". Usually it's autodetected at runtime as the parent of the bin

>> subdirectory where the binary is (conveniently, all binaries go into bin

>> subdir; and yes, this is problematic for plugins - but they usually don't need

>> to know where the root is). Once you consider the "c:/some/directory" part of

>> the paths above as "some path to root directory that we don't take into

>> account", everything becomes clear. (i'm explaining this just for the sake of

>> the argument, since you already know all this).

> 

> What you say will make sense only if people install software from a

> single source, like the MSYS2 project.


It also makes sense if everyone follows the convention. It's like FHS - no one
forces you to follow it, but everyone benefits if you do, because everything is
where you'd expect it to be.

>  But that is a dangerous

> assumption, IMO.


Dangerous how?

>  It is better to have a system that doesn't make such

> an assumption, especially since what I propose is more closely

> following what happens on Unix.


See below.

> 

>> As i've said, all these approaches are not mutually-exclusive, gdb can be made

>> to look in *all* of these places until it finds a debug file.

> 

> That would need to be a general feature, not limited to Windows.

> 


And yet that feature is a way to *both* follow an established convention for
Windows-installed Unix-originated programs *and* be able to install debugees in
multiple separate places with separate debug files that are correctly found by
a single instance gdb (which is also installed separately).

If your debugdir is 'c:/path/to/debugdir' and you're debugging
'f:/some/dir/usr/bin/foo' with gdb located in 'd:/programs/mingw/bin/gdb',
start by looking for

c:/path/to/debugdir/f/some/dir/usr/bin/foo

(no ambiguity), then try

c:/path/to/debugdir/some/dir/usr/bin/foo

(ambiguous drive), and then...that's it. If gdb and debugee are in different
root directories, prefix-aware lookup won't work (gdb doesn't know which part
of 'f:/some/dir/usr/bin/' to strip away). If you had gdb in
'x:/zool/usr/bin/gdb' and debugee in 'x:/zool/mingw/bin/foo', you'd also be
able to look for

c:/path/to/debugdir/mingw/bin/foo

This is both a feature and a bug.
A bug - because programs installed in different root directories have no way to
locate each other (as long as they don't all get put into PATH, in which case
you'll spend hours chasing after some delicious heisenbugs).
A feature - because you can have multiple root directories with multiple
independent installs, and they won't interfere with each other in any way (i
have, like, 4 of them at the moment, and the number will increase to 5 soon).

You can't just blindly do what Unix does and expect a good result - this isn't
Unix. There's an extra layer of complexity - arbitrary root directories.
Eli Zaretskii April 21, 2019, 12:05 p.m. | #6
Ping!

Would people please voice their opinions regarding preservation of the
drive letter (and removing the colon) vs just dropping the drive
letter altogether?  I'd like to fix this issue for the upcoming
release of GDB 8.3.

> Date: Thu, 18 Apr 2019 21:40:56 +0300

> From: Eli Zaretskii <eliz@gnu.org>

> CC: gdb-patches@sourceware.org

> 

> > From: LRN <lrn1986@gmail.com>

> > Date: Thu, 18 Apr 2019 19:19:39 +0300

> > 

> > While this is *a* fix, the result, AFAIU is:

> > 

> > d:/usr/lib/debug/usr/bin/foo.debug

> > 

> > This is a functional filename, but is it correct?

> 

> It fits the documentation, so yes.

> 

> > 1) Drive letters are lost (meaning that you can't have two paths with equal

> > names on two different drives be mapped to different debug tree paths - they

> > only differ in the drive letter, and you make that letter disappear)

> > 

> > A more correct way to fix this is to replace 'X:/' with 'X/'.

> 

> We could do that, but it will need a documentation change.  I think

> it's overkill, as the chances of the binaries and the debug directory

> to be on different drives are nil, but if others prefer to convert X:/

> to X/, I'm fine with that.

> 

> > 2) This doesn't take runtime tree location into account. Your debug tree has to

> > reflect the absolute, Windows path to the binary (since you're not doing any

> > other adjustments, and since gdb gives you an asbolute, Windows path to begin

> > with), otherwise gdb won't find the debug files.

> > I.e. in the 'c:/some/directory/mingw/lib/debug/some/directory/mingw/bin/foo'

> > example above the second '/some/directory' is clearly incidental, as the user

> > just used that as a root of an installation.

> 

> It is as "incidental" as the same situation of Posix platforms, where

> we simply splice the two directories.

> 

> > The package maintainer can't know where the binaries will end up,

> > and can only predict the '/mingw/bin/foo' part.

> 

> Package maintainers shouldn't use global debug directories, because

> for starters such directories might not exist at all, and even if they

> do, their location cannot be predicted in advance.  Package

> maintainers should instead use the .debug subdirectory of where they

> put the binaries, that arrangement already works, and is predictable.

> 

> > I.e. the path that gdb should look for should be:

> > 

> > c:/some/directory/mingw/lib/debug/mingw/bin/foo

> > 

> > or

> > 

> > c:/some/directory/mingw/lib/debug/bin/foo

> 

> This would deviate from what we do on Unix, so I'm opposed to it.

> Unless we also change the Unix behavior, of course.

>
Simon Marchi April 21, 2019, 12:55 p.m. | #7
On 2019-04-21 8:05 a.m., Eli Zaretskii wrote:
> Ping!

> 

> Would people please voice their opinions regarding preservation of the

> drive letter (and removing the colon) vs just dropping the drive

> letter altogether?  I'd like to fix this issue for the upcoming

> release of GDB 8.3.


I first read your patch (which initiated this thread), and my first reaction was
also to be surprised that we would lose the drive letter.  The risk of clash
is low, but why take that risk at all when it's easy enough to keep the drive letter
and just strip the colon?

I don't know the complete history of this, so if you know about distributions that
place debug files in a path without the drive letter (what your patch implements),
then I would be fine if GDB looked in both places, starting with the path including
the drive letter.

Simon
Eli Zaretskii April 22, 2019, 9:19 a.m. | #8
> From: Simon Marchi <simark@simark.ca>

> Date: Sun, 21 Apr 2019 08:55:07 -0400

> 

> On 2019-04-21 8:05 a.m., Eli Zaretskii wrote:

> > Ping!

> > 

> > Would people please voice their opinions regarding preservation of the

> > drive letter (and removing the colon) vs just dropping the drive

> > letter altogether?  I'd like to fix this issue for the upcoming

> > release of GDB 8.3.

> 

> I first read your patch (which initiated this thread), and my first reaction was

> also to be surprised that we would lose the drive letter.  The risk of clash

> is low, but why take that risk at all when it's easy enough to keep the drive letter

> and just strip the colon?


OK, so how about the patch below?

> I don't know the complete history of this, so if you know about distributions that

> place debug files in a path without the drive letter (what your patch implements),

> then I would be fine if GDB looked in both places, starting with the path including

> the drive letter.


I'm not aware of any Windows distributions that put debug info in the
directory suggested by my previous patch.  So I think we can safely
use just the /X/ subdirectory method.

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index a3a5f3e..533a52c 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -19917,7 +19917,11 @@
 the directory of the executable file, then in a subdirectory of that
 directory named @file{.debug}, and finally under each one of the global debug
 directories, in a subdirectory whose name is identical to the leading
-directories of the executable's absolute file name.
+directories of the executable's absolute file name.  (On MS-Windows,
+the drive letter of the executable's leading directories is converted
+to a one-letter subdirectory, i.e.@: @file{d:/usr/bin/} is converted
+to @file{/d/usr/bin/}, because Windows filesystems disallow colons in
+file names.)
 
 @item
 For the ``build ID'' method, @value{GDBN} looks in the
diff --git a/gdb/symfile.c b/gdb/symfile.c
index 5736666..5749c61 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -1443,11 +1443,24 @@ find_separate_debug_file (const char *dir,
     = dirnames_to_char_ptr_vec (debug_file_directory);
   gdb::unique_xmalloc_ptr<char> canon_sysroot = gdb_realpath (gdb_sysroot);
 
+ /* MS-Windows/MS-DOS don't allow colons in file names; we must
+    convert the drive letter into a one-letter directory, so that the
+    file name resulting from splicing below will be valid.  */
+  std::string drive;
+  if (HAS_DRIVE_SPEC (dir_notarget))
+    {
+      drive = std::string (1, dir_notarget[0]);
+      dir_notarget = STRIP_DRIVE_SPEC (dir_notarget);
+    }
+  else
+    drive = "";
+
   for (const gdb::unique_xmalloc_ptr<char> &debugdir : debugdir_vec)
     {
       debugfile = target_prefix ? "target:" : "";
       debugfile += debugdir.get ();
       debugfile += "/";
+      debugfile += drive;
       debugfile += dir_notarget;
       debugfile += debuglink;
Eli Zaretskii April 27, 2019, 10:56 a.m. | #9
Ping!  OK to commit the below to both branches?

> Date: Mon, 22 Apr 2019 12:19:19 +0300

> From: Eli Zaretskii <eliz@gnu.org>

> CC: gdb-patches@sourceware.org

> 

> > From: Simon Marchi <simark@simark.ca>

> > Date: Sun, 21 Apr 2019 08:55:07 -0400

> > 

> > On 2019-04-21 8:05 a.m., Eli Zaretskii wrote:

> > > Ping!

> > > 

> > > Would people please voice their opinions regarding preservation of the

> > > drive letter (and removing the colon) vs just dropping the drive

> > > letter altogether?  I'd like to fix this issue for the upcoming

> > > release of GDB 8.3.

> > 

> > I first read your patch (which initiated this thread), and my first reaction was

> > also to be surprised that we would lose the drive letter.  The risk of clash

> > is low, but why take that risk at all when it's easy enough to keep the drive letter

> > and just strip the colon?

> 

> OK, so how about the patch below?

> 

> > I don't know the complete history of this, so if you know about distributions that

> > place debug files in a path without the drive letter (what your patch implements),

> > then I would be fine if GDB looked in both places, starting with the path including

> > the drive letter.

> 

> I'm not aware of any Windows distributions that put debug info in the

> directory suggested by my previous patch.  So I think we can safely

> use just the /X/ subdirectory method.

> 

> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo

> index a3a5f3e..533a52c 100644

> --- a/gdb/doc/gdb.texinfo

> +++ b/gdb/doc/gdb.texinfo

> @@ -19917,7 +19917,11 @@

>  the directory of the executable file, then in a subdirectory of that

>  directory named @file{.debug}, and finally under each one of the global debug

>  directories, in a subdirectory whose name is identical to the leading

> -directories of the executable's absolute file name.

> +directories of the executable's absolute file name.  (On MS-Windows,

> +the drive letter of the executable's leading directories is converted

> +to a one-letter subdirectory, i.e.@: @file{d:/usr/bin/} is converted

> +to @file{/d/usr/bin/}, because Windows filesystems disallow colons in

> +file names.)

>  

>  @item

>  For the ``build ID'' method, @value{GDBN} looks in the

> diff --git a/gdb/symfile.c b/gdb/symfile.c

> index 5736666..5749c61 100644

> --- a/gdb/symfile.c

> +++ b/gdb/symfile.c

> @@ -1443,11 +1443,24 @@ find_separate_debug_file (const char *dir,

>      = dirnames_to_char_ptr_vec (debug_file_directory);

>    gdb::unique_xmalloc_ptr<char> canon_sysroot = gdb_realpath (gdb_sysroot);

>  

> + /* MS-Windows/MS-DOS don't allow colons in file names; we must

> +    convert the drive letter into a one-letter directory, so that the

> +    file name resulting from splicing below will be valid.  */

> +  std::string drive;

> +  if (HAS_DRIVE_SPEC (dir_notarget))

> +    {

> +      drive = std::string (1, dir_notarget[0]);

> +      dir_notarget = STRIP_DRIVE_SPEC (dir_notarget);

> +    }

> +  else

> +    drive = "";

> +

>    for (const gdb::unique_xmalloc_ptr<char> &debugdir : debugdir_vec)

>      {

>        debugfile = target_prefix ? "target:" : "";

>        debugfile += debugdir.get ();

>        debugfile += "/";

> +      debugfile += drive;

>        debugfile += dir_notarget;

>        debugfile += debuglink;

>  

>
Simon Marchi April 27, 2019, 4:16 p.m. | #10
On 2019-04-22 5:19 a.m., Eli Zaretskii wrote:
>> From: Simon Marchi <simark@simark.ca>

>> Date: Sun, 21 Apr 2019 08:55:07 -0400

>>

>> On 2019-04-21 8:05 a.m., Eli Zaretskii wrote:

>>> Ping!

>>>

>>> Would people please voice their opinions regarding preservation of the

>>> drive letter (and removing the colon) vs just dropping the drive

>>> letter altogether?  I'd like to fix this issue for the upcoming

>>> release of GDB 8.3.

>>

>> I first read your patch (which initiated this thread), and my first reaction was

>> also to be surprised that we would lose the drive letter.  The risk of clash

>> is low, but why take that risk at all when it's easy enough to keep the drive letter

>> and just strip the colon?

> 

> OK, so how about the patch below?

> 

>> I don't know the complete history of this, so if you know about distributions that

>> place debug files in a path without the drive letter (what your patch implements),

>> then I would be fine if GDB looked in both places, starting with the path including

>> the drive letter.

> 

> I'm not aware of any Windows distributions that put debug info in the

> directory suggested by my previous patch.  So I think we can safely

> use just the /X/ subdirectory method.

> 

> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo

> index a3a5f3e..533a52c 100644

> --- a/gdb/doc/gdb.texinfo

> +++ b/gdb/doc/gdb.texinfo

> @@ -19917,7 +19917,11 @@

>  the directory of the executable file, then in a subdirectory of that

>  directory named @file{.debug}, and finally under each one of the global debug

>  directories, in a subdirectory whose name is identical to the leading

> -directories of the executable's absolute file name.

> +directories of the executable's absolute file name.  (On MS-Windows,

> +the drive letter of the executable's leading directories is converted

> +to a one-letter subdirectory, i.e.@: @file{d:/usr/bin/} is converted

> +to @file{/d/usr/bin/}, because Windows filesystems disallow colons in

> +file names.)

>  

>  @item

>  For the ``build ID'' method, @value{GDBN} looks in the

> diff --git a/gdb/symfile.c b/gdb/symfile.c

> index 5736666..5749c61 100644

> --- a/gdb/symfile.c

> +++ b/gdb/symfile.c

> @@ -1443,11 +1443,24 @@ find_separate_debug_file (const char *dir,

>      = dirnames_to_char_ptr_vec (debug_file_directory);

>    gdb::unique_xmalloc_ptr<char> canon_sysroot = gdb_realpath (gdb_sysroot);

>  

> + /* MS-Windows/MS-DOS don't allow colons in file names; we must

> +    convert the drive letter into a one-letter directory, so that the

> +    file name resulting from splicing below will be valid.  */

> +  std::string drive;

> +  if (HAS_DRIVE_SPEC (dir_notarget))


I think we should consider the case where we build GDB on GNU/Linux, to remotely
debug a Windows program.  When building on GNU/Linux, HAS_DRIVE_SPEC always return false,
since it's defined as (see include/filenames.h):

  #define HAS_DRIVE_SPEC(f) (0)

Let's suppose DEBUGDIR is "D:/my/debug", DIR is "target:E:/the/directory/" and DEBUGLINK
is "program.debug".  On GNU/Linux, we would build the path

  target:D:/my/debug/E:/the/directory/program.debug

And I suppose that the "E:" would result in the debug file not being found.

So I think we should be using HAS_DRIVE_SPEC_1, which allows us to do the same check.  We
just need to pass to the macro whether the target filesystem id DOS-based.  The only problem
is, how do we know whether the target filesystem is DOS-based?  We wouldn't want
HAS_DRIVE_SPEC_1 to do this stripping erroneously when debugging natively on GNU/Linux...

If we can't find a practical solution to this, then I would say it's better to keep using
HAS_DRIVE_SPEC, and it will be a known deficiency for the moment that it is wrong when debugging
remotely from GNU/Linux to Windows.  It wouldn't be a regression, as it is already wrong today).
But you will have fixed it for the native Windows case, which is a net gain.

> +    {

> +      drive = std::string (1, dir_notarget[0]);


This should be equivalent to:

    drive = dir_notarget[0];

> +      dir_notarget = STRIP_DRIVE_SPEC (dir_notarget);

> +    }

> +  else

> +    drive = "";


The else clause is unnecessary, as drive is already the empty string;

So you could reduce it to:

  std::string drive;
  if (HAS_DRIVE_SPEC (dir_notarget))
    {
      drive = dir_notarget[0];
      dir_notarget = STRIP_DRIVE_SPEC(dir_notarget);
    }

Simon
Eli Zaretskii April 27, 2019, 4:39 p.m. | #11
> Cc: gdb-patches@sourceware.org

> From: Simon Marchi <simark@simark.ca>

> Date: Sat, 27 Apr 2019 12:16:23 -0400

> 

> I think we should consider the case where we build GDB on GNU/Linux, to remotely

> debug a Windows program.  When building on GNU/Linux, HAS_DRIVE_SPEC always return false,

> since it's defined as (see include/filenames.h):

> 

>   #define HAS_DRIVE_SPEC(f) (0)

> 

> Let's suppose DEBUGDIR is "D:/my/debug", DIR is "target:E:/the/directory/" and DEBUGLINK

> is "program.debug".  On GNU/Linux, we would build the path

> 

>   target:D:/my/debug/E:/the/directory/program.debug

> 

> And I suppose that the "E:" would result in the debug file not being found.


When debugging remotely, is the debug info on a Windows or on a
GNU/Linux filesystem?  If the latter, the above will work.  I always
thought that in remote debugging, GDB itself runs on the local host,
i.e. on GNU/Linux in this case, and the part that runs on the remote
is gdbserver.  Isn't that correct?

> So I think we should be using HAS_DRIVE_SPEC_1, which allows us to do the same check.  We

> just need to pass to the macro whether the target filesystem id DOS-based.  The only problem

> is, how do we know whether the target filesystem is DOS-based?  We wouldn't want

> HAS_DRIVE_SPEC_1 to do this stripping erroneously when debugging natively on GNU/Linux...


But if we do that, how do we distinguish between the use case you
describe above and a use case where the we debug locally and the file
names just happen to include colons?  Are we willing to restrict file
names on GNU/Linux to support the remote debugging on Windows?

Thanks for the other comments, I will implement them when we agree
about the above issue.
Simon Marchi April 27, 2019, 6:56 p.m. | #12
On 2019-04-27 12:39, Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org

>> From: Simon Marchi <simark@simark.ca>

>> Date: Sat, 27 Apr 2019 12:16:23 -0400

>> 

>> I think we should consider the case where we build GDB on GNU/Linux, 

>> to remotely

>> debug a Windows program.  When building on GNU/Linux, HAS_DRIVE_SPEC 

>> always return false,

>> since it's defined as (see include/filenames.h):

>> 

>>   #define HAS_DRIVE_SPEC(f) (0)

>> 

>> Let's suppose DEBUGDIR is "D:/my/debug", DIR is 

>> "target:E:/the/directory/" and DEBUGLINK

>> is "program.debug".  On GNU/Linux, we would build the path

>> 

>>   target:D:/my/debug/E:/the/directory/program.debug

>> 

>> And I suppose that the "E:" would result in the debug file not being 

>> found.

> 

> When debugging remotely, is the debug info on a Windows or on a

> GNU/Linux filesystem?  If the latter, the above will work.  I always

> thought that in remote debugging, GDB itself runs on the local host,

> i.e. on GNU/Linux in this case, and the part that runs on the remote

> is gdbserver.  Isn't that correct?


The latter part is correct, gdbserver would run on the Windows machine, 
while GDB would run on the local GNU/Linux machine.

But the debug info can come from either places, depending on the setup.  
If the executable read by GDB comes from the "target:" filesystem, we 
will search for the separate debug files on the target: filesystem 
(hence all this complexity with target_prefix in this function).  If you 
are debugging remotely, the target: filesystem represents the remote 
filesystem, which GDB can read through requests serviced by gdbserver.  
If you have a local copy of the remote filesystem (a sysroot) and made 
"set sysroot" point to it, then GDB will read from there.

If you are debugging locally on Windows, you want the path to end up 
like:

   target:D:/my/debug/E/the/directory/program.debug

If you are debugging remotely your Windows from another Windows, using 
"set sysroot target:" (the default), you also want the path to end up 
like

   target:D:/my/debug/E/the/directory/program.debug

If you are debugging remotely from a Windows machine to another Windows 
machine, but installed a sysroot of the debugged machine in F:/sysroot, 
then I guess you'll want the path to end up like

   F:/sysroot/D/my/debug/E/the/directory/program.debug

So maybe we would want to apply the same treatment to the sysroot drive 
letter?

If you are debugging remotely your Windows from GNU/Linux, using "set 
sysroot target:", we would want the path to end up like

   target:D:/my/debug/E/the/directory/program.debug

Since the target filesystem wouldn't support a "E:" directory.

If you are debugging remotely your Windows from GNU/Linux, but installed 
a sysroot of the debugged machine in /sysroot, then either of these 
could work:

   /sysroot/D:/my/debug/E:/the/directory/program.debug
   /sysroot/D:/my/debug/E/the/directory/program.debug
   /sysroot/D/my/debug/E:/the/directory/program.debug
   /sysroot/D/my/debug/E/the/directory/program.debug

This leads me to say that for simplicity, when we convert a Windows 
drive letter into a directory component, we should always convert it to 
the letter without the colon.  This way, whatever platform GDB itself 
runs on, the searched path will be the same and predictable.  Also, if 
you were to copy D:/my/debug over to your GNU/Linux machine, you 
wouldn't have to rename the "E" directory to "E:".


? I would suggest the first one,

If you are debugging remotely from GNU/Linux and installed a sysroot of 
your Windows machine in /sysroot, then I suppose you would want the path 
to end up like:

   /sysroot/D:/my/debug/E/the/directory/program.debug

?  Or

   /sysroot/D/my/debug/E/the/directory/program.debug

?

In this last example, the file is lookup up on the GNU/Linux filesystem, 
so the path technically could include the E:.  However, since the

So if you are using a local sysroot of your
In the first case since you are reading from the GNU/Linux filesystem, 
so colons would be supported.  But in the latter case, they wouldn't.

One use case to consider is: you start debugging your Windows machine 
from your GNU/Linux machine, we do strip the colon, such that the 
separate debug info is searched at:

   target:D:/my/debug/E/the/directory/program.debug

and it is found.  However, it is quite slow, because everytime you start 
debugging, GDB needs to download this information.  So you make a 
sysroot of your Windows machine on your GNU/Linux machine, at /sysroot, 
such that you have locally:

   /sysroot/E:/the/directory/program.exe
   /sysroot/D:/my/debug/E/the/directory/program.debug

Then, you point GDB to the executable in your local sysroot, and it 
should At this point, the local filesystem

> 

>> So I think we should be using HAS_DRIVE_SPEC_1, which allows us to do 

>> the same check.  We

>> just need to pass to the macro whether the target filesystem id 

>> DOS-based.  The only problem

>> is, how do we know whether the target filesystem is DOS-based?  We 

>> wouldn't want

>> HAS_DRIVE_SPEC_1 to do this stripping erroneously when debugging 

>> natively on GNU/Linux...

> 

> But if we do that, how do we distinguish between the use case you

> describe above and a use case where the we debug locally and the file

> names just happen to include colons?  Are we willing to restrict file

> names on GNU/Linux to support the remote debugging on Windows?

> 

> Thanks for the other comments, I will implement them when we agree

> about the above issue.

On 2019-04-27 12:39, Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org

>> From: Simon Marchi <simark@simark.ca>

>> Date: Sat, 27 Apr 2019 12:16:23 -0400

>> 

>> I think we should consider the case where we build GDB on GNU/Linux, 

>> to remotely

>> debug a Windows program.  When building on GNU/Linux, HAS_DRIVE_SPEC 

>> always return false,

>> since it's defined as (see include/filenames.h):

>> 

>>   #define HAS_DRIVE_SPEC(f) (0)

>> 

>> Let's suppose DEBUGDIR is "D:/my/debug", DIR is 

>> "target:E:/the/directory/" and DEBUGLINK

>> is "program.debug".  On GNU/Linux, we would build the path

>> 

>>   target:D:/my/debug/E:/the/directory/program.debug

>> 

>> And I suppose that the "E:" would result in the debug file not being 

>> found.

> 

> When debugging remotely, is the debug info on a Windows or on a

> GNU/Linux filesystem?  If the latter, the above will work.  I always

> thought that in remote debugging, GDB itself runs on the local host,

> i.e. on GNU/Linux in this case, and the part that runs on the remote

> is gdbserver.  Isn't that correct?


The latter part is correct, gdbserver would run on the Windows machine, 
while GDB would run on the local GNU/Linux machine.

But the debug info can come from either places, depending on the setup.  
If the executable read by GDB comes from the "target:" filesystem, we 
will search for the separate debug files on the target: filesystem 
(hence all this complexity with target_prefix in this function).  If you 
are debugging remotely, the target: filesystem represents the remote 
filesystem, which GDB can read through requests serviced by gdbserver.  
If you have a local copy of the remote filesystem (a sysroot) and made 
"set sysroot" point to it, then GDB will read from there.

I tried to think about different scenarios, it brought more questions 
than answers, but here it is.

If you are debugging locally on Windows, you want the path to end up 
like:

   target:D:/my/debug/E/the/directory/program.debug

If you are debugging remotely your Windows from another Windows, using 
"set sysroot target:" (the default), you also want the path to end up 
like

   target:D:/my/debug/E/the/directory/program.debug

If you are debugging remotely from a Windows machine to another Windows 
machine, but installed a sysroot of the debugged machine in F:/sysroot, 
then I guess you'll want the path to end up like

   F:/sysroot/D/my/debug/E/the/directory/program.debug

We couldn't have a "D:" directory, so maybe we would want to apply the 
same treatment to the sysroot drive letter?

If you are debugging remotely your Windows from GNU/Linux, using "set 
sysroot target:", we would want the path to end up like

   target:D:/my/debug/E/the/directory/program.debug

... since the target filesystem wouldn't support a "E:" directory.

If you are debugging remotely your Windows from GNU/Linux, but installed 
a sysroot of the debugged machine in /sysroot, then either of these 
could work:

   /sysroot/D:/my/debug/E:/the/directory/program.debug
   /sysroot/D:/my/debug/E/the/directory/program.debug
   /sysroot/D/my/debug/E:/the/directory/program.debug
   /sysroot/D/my/debug/E/the/directory/program.debug

This leads me to say that for simplicity, when we convert a Windows 
drive letter into a directory component, we should always convert it to 
the letter without the colon.  This way, whatever platform GDB itself 
runs on, the searched path will be the same and predictable.  Also, if 
you were to copy D:/my/debug/E/the/directory/program.debug over to your 
sysroot on the GNU/Linux machine, you wouldn't have to rename the "E" 
directory to "E:" for it to be found.

Now, the question remains: how do we know for sure a path is a Windows 
path, in which we need to strip the colon?  I don't know.

Given the complexity of supporting all these scenarios (if we want to 
support them, we at least want to test them, which is a lot of work) and 
the fact that they are all theoretical at this point, I would be fine to 
go with what you had in your patch (using HAS_DRIVE_SPEC).  If somebody 
ever needs to remotely debug a Windows machine from a Linux machine and 
have GDB find separate debug file in the remote debug file directory... 
then they can add the missing pieces :).

Simon
Eli Zaretskii April 27, 2019, 7:04 p.m. | #13
> Date: Sat, 27 Apr 2019 14:56:11 -0400

> From: Simon Marchi <simark@simark.ca>

> Cc: gdb-patches@sourceware.org

> 

> Given the complexity of supporting all these scenarios (if we want to 

> support them, we at least want to test them, which is a lot of work) and 

> the fact that they are all theoretical at this point, I would be fine to 

> go with what you had in your patch (using HAS_DRIVE_SPEC).  If somebody 

> ever needs to remotely debug a Windows machine from a Linux machine and 

> have GDB find separate debug file in the remote debug file directory... 

> then they can add the missing pieces :).


OK, thanks.  I will wait for a day or two for any other comments, and
if nothing comes up, will push with a FIXME comment regarding the
remote debugging use cases.
Eli Zaretskii May 3, 2019, 7:35 a.m. | #14
> Date: Sat, 27 Apr 2019 22:04:37 +0300

> From: Eli Zaretskii <eliz@gnu.org>

> CC: gdb-patches@sourceware.org

> 

> OK, thanks.  I will wait for a day or two for any other comments, and

> if nothing comes up, will push with a FIXME comment regarding the

> remote debugging use cases.


Done.

Patch

--- gdb/symfile.c~0	2019-03-27 00:52:05.000000000 +0200
+++ gdb/symfile.c	2019-04-18 13:19:05.231697600 +0300
@@ -1443,6 +1443,18 @@  find_separate_debug_file (const char *di
     = dirnames_to_char_ptr_vec (debug_file_directory);
   gdb::unique_xmalloc_ptr<char> canon_sysroot = gdb_realpath (gdb_sysroot);
 
+  /* MS-Windows/MS-DOS don't allow colons in file names; we must strip
+     the drive letter, so that the file name resulting from splicing
+     below will be valid.  */
+  if (HAS_DRIVE_SPEC (dir_notarget))
+    {
+      dir_notarget = STRIP_DRIVE_SPEC (dir_notarget);
+      /* We will append a slash to debugdir, so remove the leading
+	 slash as well.  */
+      if (IS_DIR_SEPARATOR (dir_notarget[0]))
+	dir_notarget++;
+    }
+
   for (const gdb::unique_xmalloc_ptr<char> &debugdir : debugdir_vec)
     {
       debugfile = target_prefix ? "target:" : "";