[1/2] Fix BZ 25065 - Ensure that physnames are computed for inherited DIEs

Message ID 20191014001842.27413-2-kevinb@redhat.com
State New
Headers show
Series
  • Fix BZ 25065 (LTO related GDB segfault)
Related show

Commit Message

Kevin Buettner Oct. 14, 2019, 12:18 a.m.
This is a fix for BZ 25065.

GDB segfaults when running either gdb.cp/subtypes.exp or
gdb.cp/local.exp in conjunction with using the -flto compiler/linker
flag.

A much simpler program, which was used to help create the test for
this fix, is:

-- doit.cc --
int main()
{
  class Foo {
  public:
    int doit ()
    {
      return 0;
    }
  };

  Foo foo;

  return foo.doit ();
}
-- end doit.cc --

gcc -o doit -flto -g doit.cc
gdb -q doit
Reading symbols from doit...
(gdb) ptype main::Foo
type = class Foo {
Segmentation fault (core dumped)

The segfault occurs due to a NULL physname in
c_type_print_base_struct_union in c-typeprint.c.  Specifically,
calling is_constructor_name() eventually causes the SIGSEGV is this
code in c-typeprint.c:

	      const char *physname = TYPE_FN_FIELD_PHYSNAME (f, j);
	      int is_full_physname_constructor =
		TYPE_FN_FIELD_CONSTRUCTOR (f, j)
		|| is_constructor_name (physname)
		|| is_destructor_name (physname)
		|| method_name[0] == '~';

However, looking at compute_delayed_physnames(), we see that
the TYPE_FN_FIELD_PHYSNAME field should never be NULL.  This
field will be set to "" for NULL physnames:

      physname = dwarf2_physname (mi.name, mi.die, cu);
      TYPE_FN_FIELD_PHYSNAME (fn_flp->fn_fields, mi.index)
	= physname ? physname : "";

For this particular case, it turns out that compute_delayed_physnames
wasn't being called, which left TYPE_FN_FIELD_PHYSNAME set to the NULL
value that it started with when that data structure was allocated.

The place to fix it, I think, is towards the end of
inherit_abstract_dies().  My fix causes the origin CU's method_list
(which is simply the list of methods whose physnames still
need to be computed) to be added to the CU which is doing the
inheriting.

Another way to fix it might be call compute_delayed_physnames() from
inherit_abstract_dies(), but I wasn't confident that all needed types
would be known at that point.  It seemed safer to defer the physname
computation until the inheriting CU is completely processed.

gdb/ChangeLog:

	* dwarf2read.c (inherit_abstract_dies): Ensure that delayed
	physnames are computed for inherited DIEs.
---
 gdb/dwarf2read.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

-- 
2.21.0

Comments

Simon Marchi Oct. 14, 2019, 3:02 a.m. | #1
Hi Kevin,

I don't really have the big picture of these advanced DWARF use cases, 
so I can't really weigh on this, but I have a question:

> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c

> index ee9df34ed2..f7ef122933 100644

> --- a/gdb/dwarf2read.c

> +++ b/gdb/dwarf2read.c

> @@ -13666,6 +13666,17 @@ inherit_abstract_dies (struct die_info *die,

> struct dwarf2_cu *cu)

>        origin_child_die = sibling_die (origin_child_die);

>      }

>    origin_cu->list_in_scope = origin_previous_list_in_scope;

> +

> +  if (cu != origin_cu && !origin_cu->method_list.empty ())

> +    {

> +      /* Add list of methods found in origin_cu to the list in cu.  

> These

> +         methods still need to have their physnames computed in

> +	 compute_delayed_physnames().  */

> +      cu->method_list.insert (cu->method_list.end (),

> +                              origin_cu->method_list.begin (),

> +			      origin_cu->method_list.end ());

> +      origin_cu->method_list.clear ();

> +    }


So, this effectively makes the inheriting CU steal the method_list 
content from the inherited from CU?  Is it possible for a CU to be 
inherited from multiple times?  If so, what happens the second time we 
process something that inherits from this CU, the method_list vector 
will then be empty?  Is it that we want?

Simon
Kevin Buettner Oct. 15, 2019, 4:27 p.m. | #2
On Sun, 13 Oct 2019 23:02:01 -0400
Simon Marchi <simon.marchi@polymtl.ca> wrote:

> Hi Kevin,

> 

> I don't really have the big picture of these advanced DWARF use cases, 

> so I can't really weigh on this, but I have a question:

> 

> > diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c

> > index ee9df34ed2..f7ef122933 100644

> > --- a/gdb/dwarf2read.c

> > +++ b/gdb/dwarf2read.c

> > @@ -13666,6 +13666,17 @@ inherit_abstract_dies (struct die_info *die,

> > struct dwarf2_cu *cu)

> >        origin_child_die = sibling_die (origin_child_die);

> >      }

> >    origin_cu->list_in_scope = origin_previous_list_in_scope;

> > +

> > +  if (cu != origin_cu && !origin_cu->method_list.empty ())

> > +    {

> > +      /* Add list of methods found in origin_cu to the list in cu.  

> > These

> > +         methods still need to have their physnames computed in

> > +	 compute_delayed_physnames().  */

> > +      cu->method_list.insert (cu->method_list.end (),

> > +                              origin_cu->method_list.begin (),

> > +			      origin_cu->method_list.end ());

> > +      origin_cu->method_list.clear ();

> > +    }  

> 

> So, this effectively makes the inheriting CU steal the method_list 

> content from the inherited from CU?  Is it possible for a CU to be 

> inherited from multiple times?  If so, what happens the second time we 

> process something that inherits from this CU, the method_list vector 

> will then be empty?  Is it that we want?


Hi Simon,

You raise some good questions.  I modified the test associated with
this bug so that two different CUs attempt to inherit a third CU. 
This has turned up another bug in GDB which I spent the rest of the
day looking at.  (GDB goes into an infinite loop!)

I'll make the following observations, however...

- method_list is used solely for keeping track of C++ methods for
delayed physname computations.

- method_list is cleared in process_full_comp_unit(),
process_full_type_unit(), and compute_delayed_physnames().
That latter function(), compute_delayed_physnames(), is called
after DIE processing in the first two functions.  So method_list
is made to be empty prior to DIE processing and then made empty
(as a result of delayed physname computation) again after DIE
processing (in the above mentioned functions).

So, what is the right thing to do with regard to method_list for
inherit_abstract_dies()?

Yesterday, as part of my investigations, I made inherit_abstract_dies()
call compute_delayed_physnames in place of the code in my posted
patch.  That also works, at least for my test case.  I'll note that
for my original (posted) patch, compute_delayed_physnames will be
called with the inheriting CU.  In the alternate approach (in which
compute_delayed_physnames is called from inherit_abstract_dies),
compute_delayed_physnames is called with the origin CU.  I don't
yet know which is more correct or if there are cases where it'll
make a difference.

So... I'm continuing my investigations and will let you know when
I have some answers.  In the interim, if anyone has some insights
about these matters, I'd very much like to hear them.

Kevin
Simon Marchi Oct. 17, 2019, 3:54 a.m. | #3
On 2019-10-15 12:27 p.m., Kevin Buettner wrote:
> Hi Simon,

> 

> You raise some good questions.  I modified the test associated with

> this bug so that two different CUs attempt to inherit a third CU. 

> This has turned up another bug in GDB which I spent the rest of the

> day looking at.  (GDB goes into an infinite loop!)

> 

> I'll make the following observations, however...

> 

> - method_list is used solely for keeping track of C++ methods for

> delayed physname computations.

> 

> - method_list is cleared in process_full_comp_unit(),

> process_full_type_unit(), and compute_delayed_physnames().

> That latter function(), compute_delayed_physnames(), is called

> after DIE processing in the first two functions.  So method_list

> is made to be empty prior to DIE processing and then made empty

> (as a result of delayed physname computation) again after DIE

> processing (in the above mentioned functions).

> 

> So, what is the right thing to do with regard to method_list for

> inherit_abstract_dies()?

> 

> Yesterday, as part of my investigations, I made inherit_abstract_dies()

> call compute_delayed_physnames in place of the code in my posted

> patch.  That also works, at least for my test case.  I'll note that

> for my original (posted) patch, compute_delayed_physnames will be

> called with the inheriting CU.  In the alternate approach (in which

> compute_delayed_physnames is called from inherit_abstract_dies),

> compute_delayed_physnames is called with the origin CU.  I don't

> yet know which is more correct or if there are cases where it'll

> make a difference.

> 

> So... I'm continuing my investigations and will let you know when

> I have some answers.  In the interim, if anyone has some insights

> about these matters, I'd very much like to hear them.

> 

> Kevin

> 

Hi Kevin,

I've spent a bit of time to better understand what inherited abstract DIEs
are and to step in that code.

I observed that when a DIE A inherits from another DIE X, we process DIE X completely
from scratch, going through process_die, creating the struct type instances, adding
items to method_list for delayed name computation.  If another DIE B inherits from
DIE X, then we'll go through process_die again, creating another struct type, adding
items to method_list for delayed name computation again.  There is no (and there
shouldn't be) any state shared between the processing of X while inherited by A and
the processing of X while inherited by B.  In theory, the compiler could have decided
to have two completely separate DIE trees under A and under B.  But to reduce
duplication, it decided to use that abstract origin thing, so A and B could share some
children.  However, we should still conceptually treat them as separate trees.  Btw,
I'm writing all this so I assimilate it better, but also so you can tell me if you
think I'm wrong.

I am confident that what you do in this patch is right.  Let's say A, B and X, the same
DIEs as above, are all in different CUs.  Before your patch, while processing A, the
delayed method info for things described in X would be put in X's CU's method_info list
and stay there, and it would never get used.  When processing B, the delayed method info
for things described in X would also be put in X's CU's method_info list and never get
used.  If, we then happen to process the CU that contains X, we'll start by clearing that
CU's method_info list in process_full_comp_unit and start from scratch.

With your patch, the entries generated while parsing X as inherited by A get moved to A's
CU, and the entries generated while parsing X as inherited by B get transferred to B's CU.
That's as if A and B had their own independent subtree, which is what we want.  If we then
happen to process X's CU, that CU's method_list will be empty, as expected.

I think that what's confusing in all this is the fact that the method_info list is
currently attached to a particular CU.  Instead, I think it should be attached to the
operation of processing of a CU, and used to collect all delayed method infos while
processing that CU, even if some of these infos come from inherited DIEs from another
CU.  Concretely, it would mean to have a local instance of
std::vector<delayed_method_info> in process_full_comp_unit/process_full_type_unit and to
pass it by pointer/reference through the call stack to any code who might want to append
to it.  We wouldn't have to do anything special in inherit_abstract_dies, just pass this
reference to the list down the stack.  I don't know how feasible it would be in practice
to do that change, maybe it's too much work or would end up ugly.  I'll give it a try.
But your patch gives essentially the same result, and works with what we have today.

Also, note this in inherit_abstract_dies:

  /* We're inheriting ORIGIN's children into the scope we'd put DIE's
     symbols in.  */
  origin_previous_list_in_scope = origin_cu->list_in_scope;
  origin_cu->list_in_scope = cu->list_in_scope;

  ...

  origin_cu->list_in_scope = origin_previous_list_in_scope;

I think this code is solving the same kind of problem for `list_in_scope`, but differently:
it temporarily moves A's CU's list in X's CU.  Here too, I think it would be clearer if
`list_in_scope` wasn't attached to a CU, but something passed down the stack.  Again, we
wouldn't have to do anything special in inherit_abstract_dies, just pass that list down.

One last thing, I think that those method_list.clear() calls in process_full_comp_unit and
process_full_type_unit are unnecessary, and are confusing because they suggest that there
might be something in there.  When we start processing a CU with either of these functions,
I don't think there should ever be some content.  I'd try to change them to

  gdb_assert (cu->method_list.empty ());

Simon
Simon Marchi Oct. 17, 2019, 5:30 a.m. | #4
On 2019-10-16 11:54 p.m., Simon Marchi wrote:
> I think that what's confusing in all this is the fact that the method_info list is

> currently attached to a particular CU.  Instead, I think it should be attached to the

> operation of processing of a CU, and used to collect all delayed method infos while

> processing that CU, even if some of these infos come from inherited DIEs from another

> CU.  Concretely, it would mean to have a local instance of

> std::vector<delayed_method_info> in process_full_comp_unit/process_full_type_unit and to

> pass it by pointer/reference through the call stack to any code who might want to append

> to it.  We wouldn't have to do anything special in inherit_abstract_dies, just pass this

> reference to the list down the stack.  I don't know how feasible it would be in practice

> to do that change, maybe it's too much work or would end up ugly.  I'll give it a try.

> But your patch gives essentially the same result, and works with what we have today.


A little follow up to the above.

I prototyped that change here in a WIP patch (so, not intended to be reviewed):

https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/128

I got no regression in gdb.dwarf2.  However, it's a bit invasive.  If we want to pass other
objects in the same fashion, it will quickly become very heavy.

What we could do though, is to introduce a new type (e.g. struct dwarf2_cu_processing_context)
and pass that around instead.  Its lifetime would be the duration of process_full_comp_unit /
process_full_type_unit, just like std::vector in the patch above, but could contain many fields.

I found something potentially problematic though (applies to both your and my patch).  When we process
the delayed_method_info objects in compute_delayed_physnames, we call:

  dwarf2_physname (mi.name, mi.die, cu);

mi.die could be a die coming from X's CU (to keep the example from my previous message), but the
cu in the call above is A's CU (the CU we are processing).  I am pretty sure that this function
(and what it calls) expect the passed DIE to come from the passed CU.  If they don't match, I guess
we could have some bad surprises.

Simon
Kevin Buettner Oct. 18, 2019, 1:08 a.m. | #5
On Thu, 17 Oct 2019 01:30:12 -0400
Simon Marchi <simon.marchi@polymtl.ca> wrote:

> On 2019-10-16 11:54 p.m., Simon Marchi wrote:

> > I think that what's confusing in all this is the fact that the

> > method_info list is currently attached to a particular CU. 

> > Instead, I think it should be attached to the operation of

> > processing of a CU, and used to collect all delayed method infos

> > while processing that CU, even if some of these infos come from

> > inherited DIEs from another CU.  Concretely, it would mean to have

> > a local instance of std::vector<delayed_method_info> in

> > process_full_comp_unit/process_full_type_unit and to pass it by

> > pointer/reference through the call stack to any code who might

> > want to append to it.  We wouldn't have to do anything special in

> > inherit_abstract_dies, just pass this reference to the list down

> > the stack.  I don't know how feasible it would be in practice to

> > do that change, maybe it's too much work or would end up ugly. 

> > I'll give it a try.  But your patch gives essentially the same

> > result, and works with what we have today. 

> 

> A little follow up to the above.

> 

> I prototyped that change here in a WIP patch (so, not intended to be

> reviewed):

> 

> https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/128

> 

> I got no regression in gdb.dwarf2.  However, it's a bit invasive. 

> If we want to pass other objects in the same fashion, it will

> quickly become very heavy.

> 

> What we could do though, is to introduce a new type (e.g.  struct

> dwarf2_cu_processing_context) and pass that around instead.  Its

> lifetime would be the duration of process_full_comp_unit /

> process_full_type_unit, just like std::vector in the patch above,

> but could contain many fields.


I looked over your patch; it looks reasonable to me.  If we go that
route, I like the idea of introducing a dwarf2_cu_processing_context
struct.  (But see below for some later misgivings that I have/had.)

> I found something potentially problematic though (applies to both

> your and my patch).  When we process the delayed_method_info objects

> in compute_delayed_physnames, we call:

> 

> dwarf2_physname (mi.name, mi.die, cu);

> 

> mi.die could be a die coming from X's CU (to keep the example from

> my previous message), but the cu in the call above is A's CU (the CU

> we are processing).  I am pretty sure that this function (and what

> it calls) expect the passed DIE to come from the passed CU.  If they

> don't match, I guess we could have some bad surprises.


I spent a little time trying to figure out what CU is used for in
dwarf2_physname() and its callees.  What I noticed most is that
cu->language is used to figure out some appropriate thing to do.
I seem to remember Keith running into problems with mismatched
languages in different CUs.  So there might be problems if the
languages associated with the CUs are different.

There are also obstacks associated with each CU.  I noticed this
call in dwarf2_compute_name():

		  dwarf2_const_value_attr (attr, type, name,
					   &cu->comp_unit_obstack, cu,
					   &value, &bytes, &baton);

For this matter, I think that we want the inheriting CU's obstack to
be used, so this might be okay.  However, due to potentially differing
lifetimes, there could be problems if data structures allocated in one
obstack point to data structures in another; I don't know if that could
happen or not.

There are also errors which fetch the objfile name via...

  objfile_name (cu->per_cu->dwarf2_per_objfile->objfile)

I don't think this will cause a crash or anything, but it has
the potential to misidentify the problematic objfile.

There may be other concerns too; I'm certain that I didn't look at all
of the ways that CU is used in dwarf2_physname and its callees.

Also, while looking at all of that, it occurred to me that struct
dwarf2_cu already contains elements of a CU processing context.  E.g. 
the rust_unions member looks very similar to method_list.  Plus the
comment for dwarf2_cu says "Internal state when decoding a particular
compilation unit."  That led me to wonder if it truly makes sense to
pass around both a struct dwarf2_cu and a struct
dwarf2_cu_processing_context.  (I don't know the answer, but it's
something to mull over...)

Kevin
Simon Marchi Oct. 18, 2019, 3:07 p.m. | #6
On 2019-10-17 9:08 p.m., Kevin Buettner wrote:
> On Thu, 17 Oct 2019 01:30:12 -0400

> Simon Marchi <simon.marchi@polymtl.ca> wrote:

> 

>> On 2019-10-16 11:54 p.m., Simon Marchi wrote:

>>> I think that what's confusing in all this is the fact that the

>>> method_info list is currently attached to a particular CU. 

>>> Instead, I think it should be attached to the operation of

>>> processing of a CU, and used to collect all delayed method infos

>>> while processing that CU, even if some of these infos come from

>>> inherited DIEs from another CU.  Concretely, it would mean to have

>>> a local instance of std::vector<delayed_method_info> in

>>> process_full_comp_unit/process_full_type_unit and to pass it by

>>> pointer/reference through the call stack to any code who might

>>> want to append to it.  We wouldn't have to do anything special in

>>> inherit_abstract_dies, just pass this reference to the list down

>>> the stack.  I don't know how feasible it would be in practice to

>>> do that change, maybe it's too much work or would end up ugly. 

>>> I'll give it a try.  But your patch gives essentially the same

>>> result, and works with what we have today. 

>>

>> A little follow up to the above.

>>

>> I prototyped that change here in a WIP patch (so, not intended to be

>> reviewed):

>>

>> https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/128

>>

>> I got no regression in gdb.dwarf2.  However, it's a bit invasive. 

>> If we want to pass other objects in the same fashion, it will

>> quickly become very heavy.

>>

>> What we could do though, is to introduce a new type (e.g.  struct

>> dwarf2_cu_processing_context) and pass that around instead.  Its

>> lifetime would be the duration of process_full_comp_unit /

>> process_full_type_unit, just like std::vector in the patch above,

>> but could contain many fields.

> 

> I looked over your patch; it looks reasonable to me.  If we go that

> route, I like the idea of introducing a dwarf2_cu_processing_context

> struct.  (But see below for some later misgivings that I have/had.)


Ok, I can try to make something cleaner, but I don't know when it would be ready,
and I wouldn't want that to block the GDB 9.1 branch creation (or release) for
that.  Would you like to still push your patch (or a perhaps updated version of it)
so that we have the fix in GDB 9.1?

>> I found something potentially problematic though (applies to both

>> your and my patch).  When we process the delayed_method_info objects

>> in compute_delayed_physnames, we call:

>>

>> dwarf2_physname (mi.name, mi.die, cu);

>>

>> mi.die could be a die coming from X's CU (to keep the example from

>> my previous message), but the cu in the call above is A's CU (the CU

>> we are processing).  I am pretty sure that this function (and what

>> it calls) expect the passed DIE to come from the passed CU.  If they

>> don't match, I guess we could have some bad surprises.

> 

> I spent a little time trying to figure out what CU is used for in

> dwarf2_physname() and its callees.  What I noticed most is that

> cu->language is used to figure out some appropriate thing to do.

> I seem to remember Keith running into problems with mismatched

> languages in different CUs.  So there might be problems if the

> languages associated with the CUs are different.


Yeah, I remember this saga, but I did not really follow it.  Keith, would
you have an idea of what would be the right CU to pass here?

> There are also obstacks associated with each CU.  I noticed this

> call in dwarf2_compute_name():

> 

> 		  dwarf2_const_value_attr (attr, type, name,

> 					   &cu->comp_unit_obstack, cu,

> 					   &value, &bytes, &baton);

> 

> For this matter, I think that we want the inheriting CU's obstack to

> be used, so this might be okay.  However, due to potentially differing

> lifetimes, there could be problems if data structures allocated in one

> obstack point to data structures in another; I don't know if that could

> happen or not.


Hmm right.  Maybe it's fine in practice since the lifetimes of the two dwarf_cu
objects are probably similar.  But I agree it's not ideal.

> 

> There are also errors which fetch the objfile name via...

> 

>   objfile_name (cu->per_cu->dwarf2_per_objfile->objfile)

> 

> I don't think this will cause a crash or anything, but it has

> the potential to misidentify the problematic objfile.


Aren't the two CUs necessarily in the same objfile?

> There may be other concerns too; I'm certain that I didn't look at all

> of the ways that CU is used in dwarf2_physname and its callees.


I don't think it's humanly possible to manually check all the possible branches
this code can take.  I say, let's do a quick pass to check for the obvious (like
what you found above), but otherwise I'm fine with this patch, it already makes
things better than they are now.

> Also, while looking at all of that, it occurred to me that struct

> dwarf2_cu already contains elements of a CU processing context.  E.g. 

> the rust_unions member looks very similar to method_list.  Plus the

> comment for dwarf2_cu says "Internal state when decoding a particular

> compilation unit."  That led me to wonder if it truly makes sense to

> pass around both a struct dwarf2_cu and a struct

> dwarf2_cu_processing_context.  (I don't know the answer, but it's

> something to mull over...)


I'll check closer when actually trying to write that patch, but for now
I believe that it's fine to pass both objects, as they have different
meanings.

The dwarf2_cu_processing_ctx would contain things related to the root CU
we are parsing.  The dwarf2_cu would represent things related to the CU
we are in right now.  So when crossing a CU boundary (like in the
abstract origin case that started this thread), the passed
dwarf2_cu_processing_ctx wouldn't change, but the passed dwarf2_cu would
change.

Simon
Keith Seitz Oct. 21, 2019, 8:05 p.m. | #7
On 10/18/19 8:07 AM, Simon Marchi wrote:
>> I spent a little time trying to figure out what CU is used for in

>> dwarf2_physname() and its callees.  What I noticed most is that

>> cu->language is used to figure out some appropriate thing to do.

>> I seem to remember Keith running into problems with mismatched

>> languages in different CUs.  So there might be problems if the

>> languages associated with the CUs are different.

> 

> Yeah, I remember this saga, but I did not really follow it.  Keith, would

> you have an idea of what would be the right CU to pass here?


The language mismatch that Kevin refers to was a problem that appeared
with LTO. The primary DIE would be marked artificial and having language C99.
That DIE would then import other DIEs of different languages (mostly C++),
and that would cause problems because all these symbols are supposed to
live in the same dictionary. That was resolved by moving to multidictionary
a while back. [Reminder, a symbol's language is used for searching. Different
languages now have different searching algorithms. This was introduced to
help with Pedro's multi-breakpoint/linespec work to deal with namespaces,
ABI tags, etc.]

As far as the correct DIEs to use when passing to dwarf2_physname, I don't
think/remember it being anything particularly unexpected. If the physname
for a DIE is computed, the DIE/CU passed must be able to give us this
information.

Since dwarf2_physname largely uses linkage names nowadays (there was only
one or two instances where we actually needed to compute the physname),
the correct CU/DIE would be the pair containing DW_AT_linkage_name for
the symbol.

It is debatable whether we even still need delayed physnames. I don't
think we actually need to compute physnames much anymore. [As I recall,
one place we've had to do this with was dtors. GCC did not emit linkage
names for them.]

Keith
Kevin Buettner Oct. 22, 2019, 10:23 p.m. | #8
On Fri, 18 Oct 2019 11:07:31 -0400
Simon Marchi <simon.marchi@polymtl.ca> wrote:

> > I looked over your patch; it looks reasonable to me.  If we go that

> > route, I like the idea of introducing a dwarf2_cu_processing_context

> > struct.  (But see below for some later misgivings that I have/had.)  

> 

> Ok, I can try to make something cleaner, but I don't know when it

> would be ready, and I wouldn't want that to block the GDB 9.1 branch

> creation (or release) for that.  Would you like to still push your

> patch (or a perhaps updated version of it) so that we have the fix

> in GDB 9.1?


[...]

> > There may be other concerns too; I'm certain that I didn't look at all

> > of the ways that CU is used in dwarf2_physname and its callees.  

> 

> I don't think it's humanly possible to manually check all the

> possible branches this code can take.  I say, let's do a quick pass

> to check for the obvious (like what you found above), but otherwise

> I'm fine with this patch, it already makes things better than they

> are now.


My testing shows that the patch below still fixes the problem while
also avoiding the poential problems of passing a CU to
compute_delayed_physnames() which is different from the CU of the
methods for which want to compute physnames.

I think that this patch is safer than the one I originally proposed
and is, therefore, a better short term solution.

What do you think?

- - -

Fix BZ 25065 - Ensure that physnames are computed for inherited DIEs

This is a fix for BZ 25065.

GDB segfaults when running either gdb.cp/subtypes.exp or
gdb.cp/local.exp in conjunction with using the -flto compiler/linker
flag.

A much simpler program, which was used to help create the test for
this fix, is:

-- doit.cc --
int main()
{
  class Foo {
  public:
    int doit ()
    {
      return 0;
    }
  };

  Foo foo;

  return foo.doit ();
}
-- end doit.cc --

gcc -o doit -flto -g doit.cc
gdb -q doit
Reading symbols from doit...
(gdb) ptype main::Foo
type = class Foo {
Segmentation fault (core dumped)

The segfault occurs due to a NULL physname in
c_type_print_base_struct_union in c-typeprint.c.  Specifically,
calling is_constructor_name() eventually causes the SIGSEGV is this
code in c-typeprint.c:

	      const char *physname = TYPE_FN_FIELD_PHYSNAME (f, j);
	      int is_full_physname_constructor =
		TYPE_FN_FIELD_CONSTRUCTOR (f, j)
		|| is_constructor_name (physname)
		|| is_destructor_name (physname)
		|| method_name[0] == '~';

However, looking at compute_delayed_physnames(), we see that
the TYPE_FN_FIELD_PHYSNAME field should never be NULL.  This
field will be set to "" for NULL physnames:

      physname = dwarf2_physname (mi.name, mi.die, cu);
      TYPE_FN_FIELD_PHYSNAME (fn_flp->fn_fields, mi.index)
	= physname ? physname : "";

For this particular case, it turns out that compute_delayed_physnames
wasn't being called, which left TYPE_FN_FIELD_PHYSNAME set to the NULL
value that it started with when that data structure was allocated.

The place to fix it, I think, is towards the end of
inherit_abstract_dies().

My first attempt at fix caused the origin CU's method_list (which is
simply the list of methods whose physnames still need to be computed)
to be added to the CU which is doing the inheriting.  One drawback
with this approach is that compute_delayed_physnames is (eventually)
called with a CU that's different than the CU in which the methods
were found.  It's not clear whether this will cause problems or not.

A safer approach, which is what I ultimately settled on, is to call
compute_delayed_physnames() from inherit_abstract_dies().  One
potential drawback is that all needed types might not be known at that
point.  However, in my testing, I haven't seen a problem along these
lines.

gdb/ChangeLog:

	* dwarf2read.c (inherit_abstract_dies): Ensure that delayed
	physnames are computed for inherited DIEs.
    
    Change-Id: I6c6ffe96b301a9daab9f653956b89e3a33fa9445
---
 gdb/dwarf2read.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index ee9df34ed2..976153640a 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -13666,6 +13666,9 @@ inherit_abstract_dies (struct die_info *die, struct dwarf2_cu *cu)
       origin_child_die = sibling_die (origin_child_die);
     }
   origin_cu->list_in_scope = origin_previous_list_in_scope;
+
+  if (cu != origin_cu)
+    compute_delayed_physnames (origin_cu);
 }
 
 static void
Kevin Buettner Nov. 4, 2019, 8:41 p.m. | #9
Ping.

On Tue, 22 Oct 2019 15:23:07 -0700
Kevin Buettner <kevinb@redhat.com> wrote:

> On Fri, 18 Oct 2019 11:07:31 -0400

> Simon Marchi <simon.marchi@polymtl.ca> wrote:

> 

> > > I looked over your patch; it looks reasonable to me.  If we go that

> > > route, I like the idea of introducing a dwarf2_cu_processing_context

> > > struct.  (But see below for some later misgivings that I have/had.)    

> > 

> > Ok, I can try to make something cleaner, but I don't know when it

> > would be ready, and I wouldn't want that to block the GDB 9.1 branch

> > creation (or release) for that.  Would you like to still push your

> > patch (or a perhaps updated version of it) so that we have the fix

> > in GDB 9.1?  

> 

> [...]

> 

> > > There may be other concerns too; I'm certain that I didn't look at all

> > > of the ways that CU is used in dwarf2_physname and its callees.    

> > 

> > I don't think it's humanly possible to manually check all the

> > possible branches this code can take.  I say, let's do a quick pass

> > to check for the obvious (like what you found above), but otherwise

> > I'm fine with this patch, it already makes things better than they

> > are now.  

> 

> My testing shows that the patch below still fixes the problem while

> also avoiding the poential problems of passing a CU to

> compute_delayed_physnames() which is different from the CU of the

> methods for which want to compute physnames.

> 

> I think that this patch is safer than the one I originally proposed

> and is, therefore, a better short term solution.

> 

> What do you think?

> 

> - - -

> 

> Fix BZ 25065 - Ensure that physnames are computed for inherited DIEs

> 

> This is a fix for BZ 25065.

> 

> GDB segfaults when running either gdb.cp/subtypes.exp or

> gdb.cp/local.exp in conjunction with using the -flto compiler/linker

> flag.

> 

> A much simpler program, which was used to help create the test for

> this fix, is:

> 

> -- doit.cc --

> int main()

> {

>   class Foo {

>   public:

>     int doit ()

>     {

>       return 0;

>     }

>   };

> 

>   Foo foo;

> 

>   return foo.doit ();

> }

> -- end doit.cc --

> 

> gcc -o doit -flto -g doit.cc

> gdb -q doit

> Reading symbols from doit...

> (gdb) ptype main::Foo

> type = class Foo {

> Segmentation fault (core dumped)

> 

> The segfault occurs due to a NULL physname in

> c_type_print_base_struct_union in c-typeprint.c.  Specifically,

> calling is_constructor_name() eventually causes the SIGSEGV is this

> code in c-typeprint.c:

> 

> 	      const char *physname = TYPE_FN_FIELD_PHYSNAME (f, j);

> 	      int is_full_physname_constructor =

> 		TYPE_FN_FIELD_CONSTRUCTOR (f, j)

> 		|| is_constructor_name (physname)

> 		|| is_destructor_name (physname)

> 		|| method_name[0] == '~';

> 

> However, looking at compute_delayed_physnames(), we see that

> the TYPE_FN_FIELD_PHYSNAME field should never be NULL.  This

> field will be set to "" for NULL physnames:

> 

>       physname = dwarf2_physname (mi.name, mi.die, cu);

>       TYPE_FN_FIELD_PHYSNAME (fn_flp->fn_fields, mi.index)

> 	= physname ? physname : "";

> 

> For this particular case, it turns out that compute_delayed_physnames

> wasn't being called, which left TYPE_FN_FIELD_PHYSNAME set to the NULL

> value that it started with when that data structure was allocated.

> 

> The place to fix it, I think, is towards the end of

> inherit_abstract_dies().

> 

> My first attempt at fix caused the origin CU's method_list (which is

> simply the list of methods whose physnames still need to be computed)

> to be added to the CU which is doing the inheriting.  One drawback

> with this approach is that compute_delayed_physnames is (eventually)

> called with a CU that's different than the CU in which the methods

> were found.  It's not clear whether this will cause problems or not.

> 

> A safer approach, which is what I ultimately settled on, is to call

> compute_delayed_physnames() from inherit_abstract_dies().  One

> potential drawback is that all needed types might not be known at that

> point.  However, in my testing, I haven't seen a problem along these

> lines.

> 

> gdb/ChangeLog:

> 

> 	* dwarf2read.c (inherit_abstract_dies): Ensure that delayed

> 	physnames are computed for inherited DIEs.

>     

>     Change-Id: I6c6ffe96b301a9daab9f653956b89e3a33fa9445

> ---

>  gdb/dwarf2read.c | 3 +++

>  1 file changed, 3 insertions(+)

> 

> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c

> index ee9df34ed2..976153640a 100644

> --- a/gdb/dwarf2read.c

> +++ b/gdb/dwarf2read.c

> @@ -13666,6 +13666,9 @@ inherit_abstract_dies (struct die_info *die, struct dwarf2_cu *cu)

>        origin_child_die = sibling_die (origin_child_die);

>      }

>    origin_cu->list_in_scope = origin_previous_list_in_scope;

> +

> +  if (cu != origin_cu)

> +    compute_delayed_physnames (origin_cu);

>  }

>  

>  static void

>
Simon Marchi Nov. 4, 2019, 8:49 p.m. | #10
On 2019-10-22 6:23 p.m., Kevin Buettner wrote:
> On Fri, 18 Oct 2019 11:07:31 -0400

> Simon Marchi <simon.marchi@polymtl.ca> wrote:

> 

>>> I looked over your patch; it looks reasonable to me.  If we go that

>>> route, I like the idea of introducing a dwarf2_cu_processing_context

>>> struct.  (But see below for some later misgivings that I have/had.)  

>>

>> Ok, I can try to make something cleaner, but I don't know when it

>> would be ready, and I wouldn't want that to block the GDB 9.1 branch

>> creation (or release) for that.  Would you like to still push your

>> patch (or a perhaps updated version of it) so that we have the fix

>> in GDB 9.1?

> 

> [...]

> 

>>> There may be other concerns too; I'm certain that I didn't look at all

>>> of the ways that CU is used in dwarf2_physname and its callees.  

>>

>> I don't think it's humanly possible to manually check all the

>> possible branches this code can take.  I say, let's do a quick pass

>> to check for the obvious (like what you found above), but otherwise

>> I'm fine with this patch, it already makes things better than they

>> are now.

> 

> My testing shows that the patch below still fixes the problem while

> also avoiding the poential problems of passing a CU to

> compute_delayed_physnames() which is different from the CU of the

> methods for which want to compute physnames.

> 

> I think that this patch is safer than the one I originally proposed

> and is, therefore, a better short term solution.

> 

> What do you think?


Yep, this looks good and relatively safe to me.

Thanks!

Simon
Kevin Buettner Nov. 27, 2019, 8:17 p.m. | #11
On Mon, 4 Nov 2019 15:49:21 -0500
Simon Marchi <simon.marchi@polymtl.ca> wrote:

> > I think that this patch is safer than the one I originally proposed

> > and is, therefore, a better short term solution.

> > 

> > What do you think?  

> 

> Yep, this looks good and relatively safe to me.

> 

> Thanks!


I've (finally) pushed this patch along with the test case.

Thanks again for the review.

Kevin

Patch

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index ee9df34ed2..f7ef122933 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -13666,6 +13666,17 @@  inherit_abstract_dies (struct die_info *die, struct dwarf2_cu *cu)
       origin_child_die = sibling_die (origin_child_die);
     }
   origin_cu->list_in_scope = origin_previous_list_in_scope;
+
+  if (cu != origin_cu && !origin_cu->method_list.empty ())
+    {
+      /* Add list of methods found in origin_cu to the list in cu.  These
+         methods still need to have their physnames computed in
+	 compute_delayed_physnames().  */
+      cu->method_list.insert (cu->method_list.end (),
+                              origin_cu->method_list.begin (),
+			      origin_cu->method_list.end ());
+      origin_cu->method_list.clear ();
+    }
 }
 
 static void