Fix missing symtab includes

Message ID 20200327204948.GA23365@delia
State New
Headers show
Series
  • Fix missing symtab includes
Related show

Commit Message

Tom de Vries March 27, 2020, 8:49 p.m.
Hi,

Consider hello.h:
...
 inline static const char*
 foo (void)
 {
   return "foo";
 }
...
and hello.c:
...
 #include <stdio.h>
 #include "hello.h"

 int
 main (void)
 {
   printf ("hello: %s\n", foo ());
   return 0;
 }
...
compiled with -g, and dwz-compressed:
...
$ gcc -g hello.c
$ dwz a.out
...

When breaking on foo and printing the symbol tables, we have two includes for
the hello.c compunit_symtab, representing two imported partial units:
...
$ gdb -iex "set language c" -batch a.out -ex "b foo" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x38fa890)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x39af9f0)
    user ((struct compunit_symtab *) (null))
    ( includes
      ((struct compunit_symtab *) 0x39afb10)
      ((struct compunit_symtab *) 0x39b00c0)
    )
        { symtab hello.c ((struct symtab *) 0x38fa940)
          fullname (null)
          linetable ((struct linetable *) 0x39afa80)
        }
...

But if we instead break on the same location using a linespec, the includes
are gone:
...
$ gdb -iex "set language c" -batch a.out -ex "b hello.h:4" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x2283500)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x23086c0)
    user ((struct compunit_symtab *) (null))
        { symtab hello.c ((struct symtab *) 0x22835b0)
          fullname (null)
          linetable ((struct linetable *) 0x2308750)
        }
...

The includes are calculated by process_cu_includes in gdb/dwarf2/read.c.

In the case of "b foo":
- the hello.c partial symtab is found to contain foo
- the hello.c partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_psymtab::read_symtab
- dwarf2_psymtab::read_symtab calls dwarf2_psymtab::expand_psymtab
- dwarf2_psymtab::read_symtab calls process_cu_includes, and we have the
  includes

In the case of "b hello.h:4":
- the hello.h partial symtab is found to represent hello.h
- the hello.h partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_include_psymtab::read_symtab
- dwarf2_include_psymtab::read_symtab calls
  dwarf2_include_psymtab::expand_psymtab
- dwarf2_include_psymtab::expand_psymtab calls
  partial_symtab::read_dependencies
- partial_symtab::read_dependencies calls dwarf2_psymtab::expand_psymtab
  for partial symtab hello.c
- the hello.c partial symtab is expanded using dwarf2_psymtab::expand_psymtab
- process_cu_includes is never called

Fix this by making sure in dwarf2_include_psymtab::expand_psymtab that
read_symtab rather than expand_symtab is called for the hello.c partial
symtab.

Tested on x86_64-linux, with native, and target board cc-with-dwz and
cc-with-dwz-m.

OK for trunk?

Thanks,
- Tom

[gdb] Fix missing symtab includes

gdb/ChangeLog:

2020-03-27  Tom de Vries  <tdevries@suse.de>

	PR symtab/25718
	* dwarf2/read.c (dwarf2_include_psymtab::expand_psymtab): Call
	read_symtab instead of expand_symtab for dependency.

---
 gdb/dwarf2/read.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Simon Marchi March 28, 2020, 4:32 p.m. | #1
On 2020-03-27 4:49 p.m., Tom de Vries wrote:
> Hi,

> 

> Consider hello.h:

> ...

>  inline static const char*

>  foo (void)

>  {

>    return "foo";

>  }

> ...

> and hello.c:

> ...

>  #include <stdio.h>

>  #include "hello.h"

> 

>  int

>  main (void)

>  {

>    printf ("hello: %s\n", foo ());

>    return 0;

>  }

> ...

> compiled with -g, and dwz-compressed:

> ...

> $ gcc -g hello.c

> $ dwz a.out

> ...

> 

> When breaking on foo and printing the symbol tables, we have two includes for

> the hello.c compunit_symtab, representing two imported partial units:

> ...

> $ gdb -iex "set language c" -batch a.out -ex "b foo" -ex "maint info symtabs"

> Breakpoint 1 at 0x40050b: file hello.h, line 4.

>   ...

>   { ((struct compunit_symtab *) 0x38fa890)

>     debugformat DWARF 2

>     producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g

>     dirname /data/gdb_versions/devel

>     blockvector ((struct blockvector *) 0x39af9f0)

>     user ((struct compunit_symtab *) (null))

>     ( includes

>       ((struct compunit_symtab *) 0x39afb10)

>       ((struct compunit_symtab *) 0x39b00c0)

>     )

>         { symtab hello.c ((struct symtab *) 0x38fa940)

>           fullname (null)

>           linetable ((struct linetable *) 0x39afa80)

>         }

> ...


I am not able to reproduce this.  In both cases, I don't get the `includes`.

What transformation is dwz expected to do on the binary?  Here, it looks like
it just compressed the debug info a little bit, changing the addresses, but the
general structure was untouched.

I'm using the latest git version of dwz (commit b7111689a2ccec2f57343f1051ec8f1df5e89e5c).

Simon
Tom de Vries March 28, 2020, 5:24 p.m. | #2
On 28-03-2020 17:32, Simon Marchi wrote:
> On 2020-03-27 4:49 p.m., Tom de Vries wrote:

>> Hi,

>>

>> Consider hello.h:

>> ...

>>  inline static const char*

>>  foo (void)

>>  {

>>    return "foo";

>>  }

>> ...

>> and hello.c:

>> ...

>>  #include <stdio.h>

>>  #include "hello.h"

>>

>>  int

>>  main (void)

>>  {

>>    printf ("hello: %s\n", foo ());

>>    return 0;

>>  }

>> ...

>> compiled with -g, and dwz-compressed:

>> ...

>> $ gcc -g hello.c

>> $ dwz a.out

>> ...

>>

>> When breaking on foo and printing the symbol tables, we have two includes for

>> the hello.c compunit_symtab, representing two imported partial units:

>> ...

>> $ gdb -iex "set language c" -batch a.out -ex "b foo" -ex "maint info symtabs"

>> Breakpoint 1 at 0x40050b: file hello.h, line 4.

>>   ...

>>   { ((struct compunit_symtab *) 0x38fa890)

>>     debugformat DWARF 2

>>     producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g

>>     dirname /data/gdb_versions/devel

>>     blockvector ((struct blockvector *) 0x39af9f0)

>>     user ((struct compunit_symtab *) (null))

>>     ( includes

>>       ((struct compunit_symtab *) 0x39afb10)

>>       ((struct compunit_symtab *) 0x39b00c0)

>>     )

>>         { symtab hello.c ((struct symtab *) 0x38fa940)

>>           fullname (null)

>>           linetable ((struct linetable *) 0x39afa80)

>>         }

>> ...

> 

> I am not able to reproduce this.  In both cases, I don't get the `includes`.

> 

> What transformation is dwz expected to do on the binary?  Here, it looks like

> it just compressed the debug info a little bit, changing the addresses, but the

> general structure was untouched.

> 

> I'm using the latest git version of dwz (commit b7111689a2ccec2f57343f1051ec8f1df5e89e5c).


Hi Simon,

thanks for trying this out.

I've attached the original a.out here (
https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c3 ) and the
dwz-ed a.out here (
https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c4  ).

I'm hoping you might be able to reproduce using the latter file (and
FWIW, I'm using the same dwz version).

I think the reason for the difference in what we are seeing is due to me
using openSUSE, which has debug info on various linked in objects like
glibc's init.c and elf-init.c. Looking at the readelf -wi output of the
dwz-ed executable, dwz exploits commonality between those objects and
hello.c, so it's not surprising dwz does not create partial units for
platforms that do not have debug info for those objects.

Anyway, I'll need to construct a better test-case that reproduces the
problem on other platforms.

Thanks,
- Tom
Tom Tromey March 28, 2020, 7:08 p.m. | #3
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:


Thanks for the analysis.  This is very helpful.

Tom> - the hello.c partial symtab is expanded using dwarf2_psymtab::expand_psymtab
Tom> - process_cu_includes is never called

It seems to me that perhaps the call to process_cu_includes is
misplaced.  Why is it where it is and not elsewhere?

I guess the idea is that it should be called after a group of CUs have
been processed.  So maybe your patch is the correct way.  But it seems
like it would be worth understanding this a bit better and perhaps
sticking a comment somewhere explaining the approach.

Tom
Simon Marchi March 29, 2020, 3:39 p.m. | #4
On 2020-03-28 1:24 p.m., Tom de Vries wrote:
>> I am not able to reproduce this.  In both cases, I don't get the `includes`.

>>

>> What transformation is dwz expected to do on the binary?  Here, it looks like

>> it just compressed the debug info a little bit, changing the addresses, but the

>> general structure was untouched.

>>

>> I'm using the latest git version of dwz (commit b7111689a2ccec2f57343f1051ec8f1df5e89e5c).

> 

> Hi Simon,

> 

> thanks for trying this out.

> 

> I've attached the original a.out here (

> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c3 ) and the

> dwz-ed a.out here (

> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c4  ).

> 

> I'm hoping you might be able to reproduce using the latter file (and

> FWIW, I'm using the same dwz version).

> 

> I think the reason for the difference in what we are seeing is due to me

> using openSUSE, which has debug info on various linked in objects like

> glibc's init.c and elf-init.c. Looking at the readelf -wi output of the

> dwz-ed executable, dwz exploits commonality between those objects and

> hello.c, so it's not surprising dwz does not create partial units for

> platforms that do not have debug info for those objects.

> 

> Anyway, I'll need to construct a better test-case that reproduces the

> problem on other platforms.

> 

> Thanks,

> - Tom

> 


Thanks, your explanations make more sense with the the DWARF in those files,
I can reproduce the problem now.

I have a bit of trouble understanding the current code.  In particular, I am
trying to put in words what's the difference between the `read_symtab` and
`expand_symtab` methods of `partial_symtab`, but I can't quite do it.  I think
this would help to determine what's the right solution.

Simon
Tom de Vries March 29, 2020, 3:56 p.m. | #5
On 29-03-2020 17:39, Simon Marchi wrote:
> On 2020-03-28 1:24 p.m., Tom de Vries wrote:

>>> I am not able to reproduce this.  In both cases, I don't get the `includes`.

>>>

>>> What transformation is dwz expected to do on the binary?  Here, it looks like

>>> it just compressed the debug info a little bit, changing the addresses, but the

>>> general structure was untouched.

>>>

>>> I'm using the latest git version of dwz (commit b7111689a2ccec2f57343f1051ec8f1df5e89e5c).

>>

>> Hi Simon,

>>

>> thanks for trying this out.

>>

>> I've attached the original a.out here (

>> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c3 ) and the

>> dwz-ed a.out here (

>> https://sourceware.org/bugzilla/show_bug.cgi?id=25718#c4  ).

>>

>> I'm hoping you might be able to reproduce using the latter file (and

>> FWIW, I'm using the same dwz version).

>>

>> I think the reason for the difference in what we are seeing is due to me

>> using openSUSE, which has debug info on various linked in objects like

>> glibc's init.c and elf-init.c. Looking at the readelf -wi output of the

>> dwz-ed executable, dwz exploits commonality between those objects and

>> hello.c, so it's not surprising dwz does not create partial units for

>> platforms that do not have debug info for those objects.

>>

>> Anyway, I'll need to construct a better test-case that reproduces the

>> problem on other platforms.

>>

>> Thanks,

>> - Tom

>>

> 

> Thanks, your explanations make more sense with the the DWARF in those files,

> I can reproduce the problem now.

> 

> I have a bit of trouble understanding the current code.  In particular, I am

> trying to put in words what's the difference between the `read_symtab` and

> `expand_symtab` methods of `partial_symtab`, but I can't quite do it.  I think

> this would help to determine what's the right solution.


Hi,

I gave that a try, but got stalled working on the test-case.

Anyway here's the current state.

Thanks,
- Tom
[gdb] Fix missing symtab includes

Consider hello.h:
...
 inline static const char*
 foo (void)
 {
   return "foo";
 }
...
and hello.c:
...
 #include <stdio.h>
 #include "hello.h"

 int
 main (void)
 {
   printf ("hello: %s\n", foo ());
   return 0;
 }
...
compiled with -g, and dwz-compressed:
...
$ gcc -g hello.c
$ dwz a.out
...

When breaking on foo and printing the symbol tables, we have two includes for
the hello.c compunit_symtab, representing two imported partial units:
...
$ gdb -iex "set language c" -batch a.out -ex "b foo" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x38fa890)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x39af9f0)
    user ((struct compunit_symtab *) (null))
    ( includes
      ((struct compunit_symtab *) 0x39afb10)
      ((struct compunit_symtab *) 0x39b00c0)
    )
        { symtab hello.c ((struct symtab *) 0x38fa940)
          fullname (null)
          linetable ((struct linetable *) 0x39afa80)
        }
...

But if we instead break on the same location using a linespec, the includes
are gone:
...
$ gdb -iex "set language c" -batch a.out -ex "b hello.h:4" -ex "maint info symtabs"
Breakpoint 1 at 0x40050b: file hello.h, line 4.
  ...
  { ((struct compunit_symtab *) 0x2283500)
    debugformat DWARF 2
    producer GNU C11 7.5.0 -mtune=generic -march=x86-64 -g
    dirname /data/gdb_versions/devel
    blockvector ((struct blockvector *) 0x23086c0)
    user ((struct compunit_symtab *) (null))
        { symtab hello.c ((struct symtab *) 0x22835b0)
          fullname (null)
          linetable ((struct linetable *) 0x2308750)
        }
...

The includes are calculated by process_cu_includes in gdb/dwarf2/read.c.

In the case of "b foo":
- the hello.c partial symtab is found to contain foo
- the hello.c partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_psymtab::read_symtab
- dwarf2_psymtab::read_symtab calls dwarf2_psymtab::expand_psymtab
- dwarf2_psymtab::read_symtab calls process_cu_includes, and we have the
  includes

In the case of "b hello.h:4":
- the hello.h partial symtab is found to represent hello.h
- the hello.h partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_include_psymtab::read_symtab
- dwarf2_include_psymtab::read_symtab calls
  dwarf2_include_psymtab::expand_psymtab
- dwarf2_include_psymtab::expand_psymtab calls
  partial_symtab::read_dependencies
- partial_symtab::read_dependencies calls dwarf2_psymtab::expand_psymtab
  for partial symtab hello.c
- the hello.c partial symtab is expanded using dwarf2_psymtab::expand_psymtab
- process_cu_includes is never called

Fix this by making sure in dwarf2_include_psymtab::read_symtab that
read_symtab is called for the hello.c partial symtab.

Tested on x86_64-linux, with native, and target board cc-with-dwz and
cc-with-dwz-m.

gdb/ChangeLog:

2020-03-27  Tom de Vries  <tdevries@suse.de>

	PR symtab/25718
	* psympriv.h (struct partial_symtab::read_symtab)
	(struct partial_symtab::expand_psymtab)
	(struct partial_symtab::read_dependencies): Update comments.
	* dwarf2/read.c (dwarf2_include_psymtab::read_symtab): Directly call
	read_symtab for includer.
	(dwarf2_include_psymtab::expand_psymtab): Assert false.

---
 gdb/dwarf2/read.c | 28 +++++++++++++++++++++-------
 gdb/psympriv.h    | 22 ++++++++++++++++------
 2 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 8c5046ef41..9d366449f7 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -5830,20 +5830,34 @@ struct dwarf2_include_psymtab : public partial_symtab
   }
 
   void read_symtab (struct objfile *objfile) override
-  {
-    expand_psymtab (objfile);
-  }
-
-  void expand_psymtab (struct objfile *objfile) override
   {
     if (m_readin)
       return;
+
     /* It's an include file, no symbols to read for it.
-       Everything is in the parent symtab.  */
-    read_dependencies (objfile);
+       Everything is in the includer symtab.  */
+
+    /* The expansion of a dwarf2_include_psymtab is just a trigger for
+       expansion of the includer psymtab.  We use the dependencies[0] field to
+       model the includer.  But if we go the regular route of calling
+       expand_psymtab here, and having expand_psymtab call read_dependencies to
+       expand the includer, we'll only use expand_psymtab on the includer
+       (making it a non-toplevel psymtab), while if we expand the includer via
+       another path, we'll use read_symtab (making it a toplevel psymtab).
+       So, don't pretend a dwarf2_include_psymtab is an actual toplevel
+       psymtab, and trigger read_symtab on the includer here directly.  */
+    if (!dependencies[0]->readin_p ())
+      dependencies[0]->read_symtab (objfile);
     m_readin = true;
   }
 
+  void expand_psymtab (struct objfile *objfile) override
+  {
+    /* This is not called by read_symtab, and should not be called by any
+       read_dependencies.  */
+    gdb_assert (false);
+  }
+
   bool readin_p () const override
   {
     return m_readin;
diff --git a/gdb/psympriv.h b/gdb/psympriv.h
index 0effedc4ec..4317392149 100644
--- a/gdb/psympriv.h
+++ b/gdb/psympriv.h
@@ -124,16 +124,26 @@ struct partial_symtab
   {
   }
 
-  /* Read the full symbol table corresponding to this partial symbol
-     table.  */
+  /* Psymtab expansion is done in two steps:
+     - a call to read_symtab
+     - while that call is in progress, calls to expand_psymtab can be made,
+       both for this psymtab, and its dependencies.
+     This makes a distinction between a toplevel psymtab (for which both
+     read_symtab and expand_psymtab will be called) and a non-toplevel
+     psymtab (for which only expand_psymtab will be called). The
+     distinction can be used f.i. to do things before and after all
+     dependencies of a top-level psymtab have been expanded.
+
+     Read the full symbol table corresponding to this partial symbol
+     table.  Typically calls expand_psymtab.  */
   virtual void read_symtab (struct objfile *) = 0;
 
-  /* Psymtab expansion is done in two steps.  The first step is a call
-     to read_symtab; but while that is in progress, calls to
-     expand_psymtab can be made.  */
+  /* Expand the full symbol table for this partial symbol table.  Typically
+     calls read_dependencies.  */
   virtual void expand_psymtab (struct objfile *) = 0;
 
-  /* Ensure that all the dependencies are read in.  */
+  /* Ensure that all the dependencies are read in.  Typically calls
+     expand_psymtab for each dependency.  */
   void read_dependencies (struct objfile *);
 
   /* Return true if the symtab corresponding to this psymtab has been
Simon Marchi March 29, 2020, 9:44 p.m. | #6
On 2020-03-29 11:56 a.m., Tom de Vries wrote:
> diff --git a/gdb/psympriv.h b/gdb/psympriv.h

> index 0effedc4ec..4317392149 100644

> --- a/gdb/psympriv.h

> +++ b/gdb/psympriv.h

> @@ -124,16 +124,26 @@ struct partial_symtab

>    {

>    }

>  

> -  /* Read the full symbol table corresponding to this partial symbol

> -     table.  */

> +  /* Psymtab expansion is done in two steps:

> +     - a call to read_symtab

> +     - while that call is in progress, calls to expand_psymtab can be made,

> +       both for this psymtab, and its dependencies.

> +     This makes a distinction between a toplevel psymtab (for which both

> +     read_symtab and expand_psymtab will be called) and a non-toplevel

> +     psymtab (for which only expand_psymtab will be called). The

> +     distinction can be used f.i. to do things before and after all

> +     dependencies of a top-level psymtab have been expanded.

> +

> +     Read the full symbol table corresponding to this partial symbol

> +     table.  Typically calls expand_psymtab.  */

>    virtual void read_symtab (struct objfile *) = 0;

>  

> -  /* Psymtab expansion is done in two steps.  The first step is a call

> -     to read_symtab; but while that is in progress, calls to

> -     expand_psymtab can be made.  */

> +  /* Expand the full symbol table for this partial symbol table.  Typically

> +     calls read_dependencies.  */

>    virtual void expand_psymtab (struct objfile *) = 0;

>  

> -  /* Ensure that all the dependencies are read in.  */

> +  /* Ensure that all the dependencies are read in.  Typically calls

> +     expand_psymtab for each dependency.  */

>    void read_dependencies (struct objfile *);


I don't think the "Typically" here is right.  This is not a virtual function that can
be implemented/overriden by sub-classes, it's a helper that sub-classes call, that calls
expand_psymtab on all dependencies.  As you probably saw, I renamed it to
expand_dependencies to be more accurate (since it calls expand, not read).

But otherwise, I am starting to think that it's the right thing to to, to call read_symtab
on the includer (top-level psymtab) in dwarf2_include_psymtab::read_symtab.  To read in
an include psymtab, we read in the includer.

Following this, I also came to the conclusion that dwarf2_include_psymtab::m_readin should
probably be removed.  An include psymtab can't be read in on its own.  It can be considered
read in when its includer is read in.  So it doesn't make sense to maintain a "read in" state
separate from its includer.

Here's the version I came up with, if you want to get inspiration.

From ec02d85b88d6a9bb842c69c0630e3e944357e4d3 Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@polymtl.ca>

Date: Sun, 29 Mar 2020 14:21:16 -0400
Subject: [PATCH] gdb: make dwarf2_include_psymtab defer to its includer

---
 gdb/dwarf2/read.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 1ec5c1e582b7..feabaae324d9 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -5831,22 +5831,17 @@ struct dwarf2_include_psymtab : public partial_symtab

   void read_symtab (struct objfile *objfile) override
   {
-    expand_psymtab (objfile);
+    includer ()->read_symtab (objfile);
   }

   void expand_psymtab (struct objfile *objfile) override
   {
-    if (m_readin)
-      return;
-    /* It's an include file, no symbols to read for it.
-       Everything is in the parent symtab.  */
-    expand_dependencies (objfile);
-    m_readin = true;
+    gdb_assert (false);
   }

   bool readin_p () const override
   {
-    return m_readin;
+    return includer ()->readin_p ();
   }

   struct compunit_symtab *get_compunit_symtab () const override
@@ -5855,8 +5850,13 @@ struct dwarf2_include_psymtab : public partial_symtab
   }

 private:
-
-  bool m_readin = false;
+  partial_symtab *includer () const
+  {
+    /* An include psymtab has exactly one dependency: the psymtab that
+       includes it.  */
+    gdb_assert (this->number_of_dependencies == 1);
+    return this->dependencies[0];
+  }
 };

 /* Allocate a new partial symtab for file named NAME and mark this new
@@ -8772,8 +8772,7 @@ process_queue (struct dwarf2_per_objfile *dwarf2_per_objfile)
 void
 dwarf2_psymtab::expand_psymtab (struct objfile *objfile)
 {
-  if (readin)
-    return;
+  gdb_assert (!readin);

   expand_dependencies (objfile);

-- 
2.25.2
Tom de Vries March 30, 2020, 5:42 p.m. | #7
On 29-03-2020 23:44, Simon Marchi wrote:
> On 2020-03-29 11:56 a.m., Tom de Vries wrote:

>> diff --git a/gdb/psympriv.h b/gdb/psympriv.h

>> index 0effedc4ec..4317392149 100644

>> --- a/gdb/psympriv.h

>> +++ b/gdb/psympriv.h

>> @@ -124,16 +124,26 @@ struct partial_symtab

>>    {

>>    }

>>  

>> -  /* Read the full symbol table corresponding to this partial symbol

>> -     table.  */

>> +  /* Psymtab expansion is done in two steps:

>> +     - a call to read_symtab

>> +     - while that call is in progress, calls to expand_psymtab can be made,

>> +       both for this psymtab, and its dependencies.

>> +     This makes a distinction between a toplevel psymtab (for which both

>> +     read_symtab and expand_psymtab will be called) and a non-toplevel

>> +     psymtab (for which only expand_psymtab will be called). The

>> +     distinction can be used f.i. to do things before and after all

>> +     dependencies of a top-level psymtab have been expanded.

>> +

>> +     Read the full symbol table corresponding to this partial symbol

>> +     table.  Typically calls expand_psymtab.  */

>>    virtual void read_symtab (struct objfile *) = 0;

>>  

>> -  /* Psymtab expansion is done in two steps.  The first step is a call

>> -     to read_symtab; but while that is in progress, calls to

>> -     expand_psymtab can be made.  */

>> +  /* Expand the full symbol table for this partial symbol table.  Typically

>> +     calls read_dependencies.  */

>>    virtual void expand_psymtab (struct objfile *) = 0;

>>  

>> -  /* Ensure that all the dependencies are read in.  */

>> +  /* Ensure that all the dependencies are read in.  Typically calls

>> +     expand_psymtab for each dependency.  */

>>    void read_dependencies (struct objfile *);

> 

> I don't think the "Typically" here is right.  This is not a virtual function that can

> be implemented/overriden by sub-classes, it's a helper that sub-classes call, that calls

> expand_psymtab on all dependencies.


Ack, comment updated accordingly.

>  As you probably saw, I renamed it to

> expand_dependencies to be more accurate (since it calls expand, not read).

> 


Ack, patch rebased.

> But otherwise, I am starting to think that it's the right thing to to, to call read_symtab

> on the includer (top-level psymtab) in dwarf2_include_psymtab::read_symtab.  To read in

> an include psymtab, we read in the includer.

> 

> Following this, I also came to the conclusion that dwarf2_include_psymtab::m_readin should

> probably be removed.  An include psymtab can't be read in on its own.  It can be considered

> read in when its includer is read in.  So it doesn't make sense to maintain a "read in" state

> separate from its includer.

> 


That makes sense to me.

> Here's the version I came up with, if you want to get inspiration.

> 


I've integrated it into my patch and re-tested.

Also, added test-case.

Thanks,
- Tom
[gdb] Fix missing symtab includes

[ The test-case requires patch "[gdb] Expand symbolless symtabs using maint
expand-symtabs" submitted at
https://sourceware.org/pipermail/gdb-patches/2020-March/167131.html . ]

Consider the debug info for the test-case included in this patch.  It consists
of a PU:
...
 <0><d2>: Abbrev Number: 2 (DW_TAG_partial_unit)
 <1><d3>: Abbrev Number: 0
...
imported by a CU:
...
 <0><df>: Abbrev Number: 2 (DW_TAG_compile_unit)
    <e0>   DW_AT_language    : 2        (non-ANSI C)
    <e1>   DW_AT_stmt_list   : 0xe9
 <1><e5>: Abbrev Number: 3 (DW_TAG_imported_unit)
    <e6>   DW_AT_import      : <0xd2>   [Abbrev Number: 2]
 <1><ea>: Abbrev Number: 0
...
and the CU has a dw2-symtab-includes.h file in the .debug_line file name
table:
...
 The Directory Table (offset 0x101):
  1     /data/gdb_versions/devel/src/gdb/testsuite/gdb.dwarf2

 The File Name Table (offset 0x138):
  Entry Dir     Time    Size    Name
  1     1       0       0       dw2-symtab-includes.h
...

After expanding all symtabs, we can see the CU listed in the user field of the
PU, and vice-versa the PU listed in the includes of the CU:
...
$ gdb.sh -batch \
  -iex "set language c" \
  outputs/gdb.dwarf2/dw2-symtab-includes/dw2-symtab-includes \
  -ex "maint expand-symtabs" \
  -ex "maint info symtabs"
  ...
  { ((struct compunit_symtab *) 0x394dd60)
    debugformat DWARF 2
    producer (null)
    dirname (null)
    blockvector ((struct blockvector *) 0x394dea0)
    user ((struct compunit_symtab *) 0x394dba0)
  }
  { ((struct compunit_symtab *) 0x394dba0)
    debugformat DWARF 2
    producer (null)
    dirname (null)
    blockvector ((struct blockvector *) 0x394dd10)
    user ((struct compunit_symtab *) (null))
    ( includes
      ((struct compunit_symtab *) 0x394dd60)
    )
  }
...

But if we instead only expand the symtab for the dw2-symtab-includes.h file,
the includes and user links are gone:
...
$ gdb -batch \
  -iex "set language c" \
  outputs/gdb.dwarf2/dw2-symtab-includes/dw2-symtab-includes \
  -ex "maint expand-symtabs dw2-symtab-includes.h" \
  -ex "maint info symtabs"
  ...
  { ((struct compunit_symtab *) 0x2728210)
    debugformat DWARF 2
    producer (null)
    dirname (null)
    blockvector ((struct blockvector *) 0x2728350)
    user ((struct compunit_symtab *) (null))
  }
  { ((struct compunit_symtab *) 0x2728050)
    debugformat DWARF 2
    producer (null)
    dirname (null)
    blockvector ((struct blockvector *) 0x27281c0)
    user ((struct compunit_symtab *) (null))
  }
...

The includes are calculated by process_cu_includes in gdb/dwarf2/read.c.

In the case of expanding all symtabs:
- the CU partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_psymtab::read_symtab
- dwarf2_psymtab::read_symtab calls dwarf2_psymtab::expand_psymtab
- dwarf2_psymtab::read_symtab calls process_cu_includes, and we have the
  includes

In the case of expanding the symtab for dw2-symtab-includes.h:
- the dw2-symtab-includes.h partial symtab is expanded using psymtab_to_symtab
- psymtab_to_symtab calls dwarf2_include_psymtab::read_symtab
- dwarf2_include_psymtab::read_symtab calls
  dwarf2_include_psymtab::expand_psymtab
- dwarf2_include_psymtab::expand_psymtab calls
  partial_symtab::expand_dependencies
- partial_symtab::expand_dependencies calls dwarf2_psymtab::expand_psymtab
  for the CU partial symtab
- the CU partial symtab is expanded using dwarf2_psymtab::expand_psymtab
- process_cu_includes is never called

Fix this by making sure in dwarf2_include_psymtab::read_symtab that
read_symtab is called for the CU partial symtab.

Tested on x86_64-linux, with native, and target board cc-with-dwz and
cc-with-dwz-m.

In addition, tested test-case with target boards cc-with-gdb-index.exp,
cc-with-debug-names.exp and readnow.exp.

gdb/ChangeLog:

2020-03-30  Simon Marchi  <simon.marchi@polymtl.ca>
	    Tom de Vries  <tdevries@suse.de>

	PR symtab/25718
	* psympriv.h (struct partial_symtab::read_symtab)
	(struct partial_symtab::expand_psymtab)
	(struct partial_symtab::read_dependencies): Update comments.
	* dwarf2/read.c (struct dwarf2_include_psymtab::read_symtab): Call
	read_symtab for includer.
	(struct dwarf2_include_psymtab::expand_psymtab): Assert false.
	(struct dwarf2_include_psymtab::readin_p): Call readin_p () for includer.
	(struct dwarf2_include_psymtab::m_readin): Remove.
	(struct dwarf2_include_psymtab::includer): New member function.
	(dwarf2_psymtab::expand_psymtab): Assert !readin.

gdb/testsuite/ChangeLog:

2020-03-30  Tom de Vries  <tdevries@suse.de>

	PR symtab/25718
	* gdb.dwarf2/dw2-symtab-includes.exp: New file.

---
 gdb/dwarf2/read.c                                | 37 +++++++----
 gdb/psympriv.h                                   | 22 +++++--
 gdb/testsuite/gdb.dwarf2/dw2-symtab-includes.exp | 80 ++++++++++++++++++++++++
 3 files changed, 121 insertions(+), 18 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 673f2c2904..80a6b7f2bf 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -5855,22 +5855,31 @@ struct dwarf2_include_psymtab : public partial_symtab
 
   void read_symtab (struct objfile *objfile) override
   {
-    expand_psymtab (objfile);
+    /* It's an include file, no symbols to read for it.
+       Everything is in the includer symtab.  */
+
+    /* The expansion of a dwarf2_include_psymtab is just a trigger for
+       expansion of the includer psymtab.  We use the dependencies[0] field to
+       model the includer.  But if we go the regular route of calling
+       expand_psymtab here, and having expand_psymtab call expand_dependencies
+       to expand the includer, we'll only use expand_psymtab on the includer
+       (making it a non-toplevel psymtab), while if we expand the includer via
+       another path, we'll use read_symtab (making it a toplevel psymtab).
+       So, don't pretend a dwarf2_include_psymtab is an actual toplevel
+       psymtab, and trigger read_symtab on the includer here directly.  */
+    includer ()->read_symtab (objfile);
   }
 
   void expand_psymtab (struct objfile *objfile) override
   {
-    if (m_readin)
-      return;
-    /* It's an include file, no symbols to read for it.
-       Everything is in the parent symtab.  */
-    expand_dependencies (objfile);
-    m_readin = true;
+    /* This is not called by read_symtab, and should not be called by any
+       expand_dependencies.  */
+    gdb_assert (false);
   }
 
   bool readin_p () const override
   {
-    return m_readin;
+    return includer ()->readin_p ();
   }
 
   struct compunit_symtab *get_compunit_symtab () const override
@@ -5879,8 +5888,13 @@ struct dwarf2_include_psymtab : public partial_symtab
   }
 
 private:
-
-  bool m_readin = false;
+  partial_symtab *includer () const
+  {
+    /* An include psymtab has exactly one dependency: the psymtab that
+       includes it.  */
+    gdb_assert (this->number_of_dependencies == 1);
+    return this->dependencies[0];
+  }
 };
 
 /* Allocate a new partial symtab for file named NAME and mark this new
@@ -8796,8 +8810,7 @@ process_queue (struct dwarf2_per_objfile *dwarf2_per_objfile)
 void
 dwarf2_psymtab::expand_psymtab (struct objfile *objfile)
 {
-  if (readin)
-    return;
+  gdb_assert (!readin);
 
   expand_dependencies (objfile);
 
diff --git a/gdb/psympriv.h b/gdb/psympriv.h
index 9bc960a77d..fdcee99e33 100644
--- a/gdb/psympriv.h
+++ b/gdb/psympriv.h
@@ -124,16 +124,26 @@ struct partial_symtab
   {
   }
 
-  /* Read the full symbol table corresponding to this partial symbol
-     table.  */
+  /* Psymtab expansion is done in two steps:
+     - a call to read_symtab
+     - while that call is in progress, calls to expand_psymtab can be made,
+       both for this psymtab, and its dependencies.
+     This makes a distinction between a toplevel psymtab (for which both
+     read_symtab and expand_psymtab will be called) and a non-toplevel
+     psymtab (for which only expand_psymtab will be called). The
+     distinction can be used f.i. to do things before and after all
+     dependencies of a top-level psymtab have been expanded.
+
+     Read the full symbol table corresponding to this partial symbol
+     table.  Typically calls expand_psymtab.  */
   virtual void read_symtab (struct objfile *) = 0;
 
-  /* Psymtab expansion is done in two steps.  The first step is a call
-     to read_symtab; but while that is in progress, calls to
-     expand_psymtab can be made.  */
+  /* Expand the full symbol table for this partial symbol table.  Typically
+     calls expand_dependencies.  */
   virtual void expand_psymtab (struct objfile *) = 0;
 
-  /* Ensure that all the dependencies are read in.  */
+  /* Ensure that all the dependencies are read in.  Calls
+     expand_psymtab for each non-shared dependency.  */
   void expand_dependencies (struct objfile *);
 
   /* Return true if the symtab corresponding to this psymtab has been
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-symtab-includes.exp b/gdb/testsuite/gdb.dwarf2/dw2-symtab-includes.exp
new file mode 100644
index 0000000000..1eaaf4af4f
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-symtab-includes.exp
@@ -0,0 +1,80 @@
+# Copyright 2020 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check that symtab user and includes are present after symtab expansion
+# triggered by an include file.
+
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2 and use gas.
+if {![dwarf2_support]} {
+    return 0
+}
+
+standard_testfile main.c .S
+
+# Create the DWARF.
+set asm_file [standard_output_file $srcfile2]
+Dwarf::assemble $asm_file {
+    declare_labels partial_label lines_label
+    global srcdir subdir srcfile
+
+    extern main
+
+    cu {} {
+	partial_label: partial_unit {} {
+	}
+    }
+
+    cu {} {
+	compile_unit {
+	    {language @DW_LANG_C}
+	    {stmt_list ${lines_label} DW_FORM_sec_offset}
+	} {
+	    imported_unit {
+		{import $partial_label ref_addr}
+	    }
+	}
+    }
+
+    lines {version 2} lines_label {
+	include_dir "${srcdir}/${subdir}"
+	file_name "dw2-symtab-includes.h" 1
+	program {
+	    {DW_LNS_advance_line 1}
+	}
+    }
+}
+
+if { [prepare_for_testing "failed to prepare" $testfile \
+	  "${asm_file} ${srcfile}" {}] } {
+    return -1
+}
+
+# Check that no symtabs are expanded.
+set test "no symtabs expanded"
+if { [readnow] } {
+    unsupported $test
+    return -1
+}
+gdb_test_no_output "maint info symtabs" $test
+
+# Expand dw2-symtab-includes.h symtab
+gdb_test "maint expand-symtab dw2-symtab-includes.h"
+
+# Check that there are includes.
+gdb_test "maint info symtabs" \
+    "\r\n    \\( includes\r\n.*" \
+    "check symtab includes"
Tom de Vries April 14, 2020, 1:31 p.m. | #8
On 30-03-2020 19:42, Tom de Vries wrote:
> On 29-03-2020 23:44, Simon Marchi wrote:

>> On 2020-03-29 11:56 a.m., Tom de Vries wrote:

>>> diff --git a/gdb/psympriv.h b/gdb/psympriv.h

>>> index 0effedc4ec..4317392149 100644

>>> --- a/gdb/psympriv.h

>>> +++ b/gdb/psympriv.h

>>> @@ -124,16 +124,26 @@ struct partial_symtab

>>>    {

>>>    }

>>>  

>>> -  /* Read the full symbol table corresponding to this partial symbol

>>> -     table.  */

>>> +  /* Psymtab expansion is done in two steps:

>>> +     - a call to read_symtab

>>> +     - while that call is in progress, calls to expand_psymtab can be made,

>>> +       both for this psymtab, and its dependencies.

>>> +     This makes a distinction between a toplevel psymtab (for which both

>>> +     read_symtab and expand_psymtab will be called) and a non-toplevel

>>> +     psymtab (for which only expand_psymtab will be called). The

>>> +     distinction can be used f.i. to do things before and after all

>>> +     dependencies of a top-level psymtab have been expanded.

>>> +

>>> +     Read the full symbol table corresponding to this partial symbol

>>> +     table.  Typically calls expand_psymtab.  */

>>>    virtual void read_symtab (struct objfile *) = 0;

>>>  

>>> -  /* Psymtab expansion is done in two steps.  The first step is a call

>>> -     to read_symtab; but while that is in progress, calls to

>>> -     expand_psymtab can be made.  */

>>> +  /* Expand the full symbol table for this partial symbol table.  Typically

>>> +     calls read_dependencies.  */

>>>    virtual void expand_psymtab (struct objfile *) = 0;

>>>  

>>> -  /* Ensure that all the dependencies are read in.  */

>>> +  /* Ensure that all the dependencies are read in.  Typically calls

>>> +     expand_psymtab for each dependency.  */

>>>    void read_dependencies (struct objfile *);

>>

>> I don't think the "Typically" here is right.  This is not a virtual function that can

>> be implemented/overriden by sub-classes, it's a helper that sub-classes call, that calls

>> expand_psymtab on all dependencies.

> 

> Ack, comment updated accordingly.

> 

>>  As you probably saw, I renamed it to

>> expand_dependencies to be more accurate (since it calls expand, not read).

>>

> 

> Ack, patch rebased.

> 

>> But otherwise, I am starting to think that it's the right thing to to, to call read_symtab

>> on the includer (top-level psymtab) in dwarf2_include_psymtab::read_symtab.  To read in

>> an include psymtab, we read in the includer.

>>

>> Following this, I also came to the conclusion that dwarf2_include_psymtab::m_readin should

>> probably be removed.  An include psymtab can't be read in on its own.  It can be considered

>> read in when its includer is read in.  So it doesn't make sense to maintain a "read in" state

>> separate from its includer.

>>

> 

> That makes sense to me.

> 

>> Here's the version I came up with, if you want to get inspiration.

>>

> 

> I've integrated it into my patch and re-tested.

> 

> Also, added test-case.


And committed.

Thanks,
- Tom

Patch

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 2465ecebe3..fb90a04061 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -5840,7 +5840,14 @@  struct dwarf2_include_psymtab : public partial_symtab
       return;
     /* It's an include file, no symbols to read for it.
        Everything is in the parent symtab.  */
-    read_dependencies (objfile);
+
+    /* Make sure that we call read_symtab rather than just expand_symtab,
+       otherwise process_cu_includes is never called, and the expanded symtab
+       will not have the correct value for the includes field (note:
+       confusingly, the mentioned includes field is related to imported units,
+       and has no relation to include as used in 'dwarf2_include_psymtab').  */
+    if (!dependencies[0]->readin_p ())
+      dependencies[0]->read_symtab (objfile);
     m_readin = true;
   }