[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder(), again

Message ID 20210609162305.GA7082@delia.home
State New
Headers show
Series
  • [gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder(), again
Related show

Commit Message

Tom de Vries June 9, 2021, 4:23 p.m.
Hi,

This is another attempt at fixing the problem described in commit 4cf88725da1
"[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder()", which was
reverted in commit 3db19b2d724.

First off, some context.

A DWARF CU can be viewed as a symbol table: toplevel children of a CU DIE
represent symbol table entries for that CU.  Furthermore, there is a
hierarchy: a symbol table entry such as a function itself has a symbol table
containing parameters and local variables.

The dwarf reader maintains a notion of current symbol table (that is: the
symbol table a new symbol needs to be entered into) in dwarf2_cu member
list_in_scope.

A problem then presents itself when reading inter-CU references:
- a new symbol read from a CU B needs to be entered into the symbol table of
  another CU A.
- the notion of current symbol table is tracked on a per-CU basis.
This is addressed in inherit_abstract_dies by temporarily overwriting the
list_in_scope for CU B with the one for CU A.

The current symbol table is one aspect of the current dwarf reader context
that is tracked, but there are more, f.i. ones that are tracked via the
dwarf2_cu member m_builder, f.i. m_builder->m_local_using_directives.

A similar problem exists in relation to inter-CU references, but a different
solution was chosen:
- to keep track of an ancestor field in dwarf2_cu, which is updated
  when traversing inter-CU references, and
- to use the ancestor field in dwarf2_cu::get_builder to return the m_builder
  in scope.

There is no actual concept of a CU having an ancestor, it just marks the most
recent CU from which a CU was inter-CU-referenced.  Consequently, when
following inter-CU references from a CU A to another CU B and back to CU A,
the ancestors form a cycle, which causes dwarf2_cu::get_builder to hang or
segfault, as reported in PR26327.

ISTM that the ancestor implementation is confusing and fragile, and should
go.  Furthermore, it seems that keeping track of the m_builder in scope can be
handled simply with a global variable.

Fix the hang / segfault by:
- keeping track of the m_builder in scope using a new variable
  dwarf2_cu::sym_cu, and
- using it in dwarf2_cu::get_builder.

Tested on x86_64-linux (openSUSE Leap 15.2), no regressions for config:
- using default gcc version 7.5.0
  (with 5 unexpected FAILs)
- gcc 10.3.0 and target board
  unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects
  (with 1000 unexpected FAILs)

Any comments?

Thanks,
- Tom

[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder(), again

gdb/ChangeLog:

2021-06-09  Tom de Vries  <tdevries@suse.de>

	PR symtab/26327
	* dwarf2/cu.c (dwarf2_cu::sym_cu): New var.
	* dwarf2/cu.h (dwarf2_cu::ancestor): Remove.
	(dwarf2_cu::sym_cu): Declare.
	(dwarf2_cu::get_builder): Use sym_cu instead of ancestor.  Assert
	return value is non-null.
	* dwarf2/read.c (read_file_scope): Set dwarf2_cu::sym_cu.
	(follow_die_offset, follow_die_sig_1): Remove setting of ancestor.

---
 gdb/dwarf2/cu.c   |  3 +++
 gdb/dwarf2/cu.h   | 11 +++++------
 gdb/dwarf2/read.c | 13 ++++++-------
 3 files changed, 14 insertions(+), 13 deletions(-)

Comments

Tom Tromey June 14, 2021, 8:22 p.m. | #1
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:


Tom> ISTM that the ancestor implementation is confusing and fragile, and should
Tom> go.  Furthermore, it seems that keeping track of the m_builder in scope can be
Tom> handled simply with a global variable.

I'd rather not have any global variables.

Maybe a new member of dwarf2_per_objfile would be equivalent.

Tom> +  gdb_assert (dwarf2_cu::sym_cu == nullptr);
Tom> +  scoped_restore restore_sym_cu = make_scoped_restore (&dwarf2_cu::sym_cu);
Tom> +  dwarf2_cu::sym_cu = cu;

You can pass 'cu' as a second argument to make_scoped_restore.

thanks,
Tom
Tom de Vries June 15, 2021, 3:51 p.m. | #2
On 6/14/21 10:22 PM, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

> 

> Tom> ISTM that the ancestor implementation is confusing and fragile, and should

> Tom> go.  Furthermore, it seems that keeping track of the m_builder in scope can be

> Tom> handled simply with a global variable.

> 

> I'd rather not have any global variables.

> 

> Maybe a new member of dwarf2_per_objfile would be equivalent.

> 


Done.

[ My concern here was problems with inter-obj references, but I've
tested with target board cc-with-dwz-m and didn't run into any problems. ]

> Tom> +  gdb_assert (dwarf2_cu::sym_cu == nullptr);

> Tom> +  scoped_restore restore_sym_cu = make_scoped_restore (&dwarf2_cu::sym_cu);

> Tom> +  dwarf2_cu::sym_cu = cu;

> 

> You can pass 'cu' as a second argument to make_scoped_restore.


Also done, and retested in both test configurations.

Any further comments?

Thanks,
- Tom
[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder(), again

This is another attempt at fixing the problem described in commit 4cf88725da1
"[gdb/symtab] Fix infinite recursion in dwarf2_cu::get_builder()", which was
reverted in commit 3db19b2d724.

First off, some context.

A DWARF CU can be viewed as a symbol table: toplevel children of a CU DIE
represent symbol table entries for that CU.  Furthermore, there is a
hierarchy: a symbol table entry such as a function itself has a symbol table
containing parameters and local variables.

The dwarf reader maintains a notion of current symbol table (that is: the
symbol table a new symbol needs to be entered into) in dwarf2_cu member
list_in_scope.

A problem then presents itself when reading inter-CU references:
- a new symbol read from a CU B needs to be entered into the symbol table of
  another CU A.
- the notion of current symbol table is tracked on a per-CU basis.
This is addressed in inherit_abstract_dies by temporarily overwriting the
list_in_scope for CU B with the one for CU A.

The current symbol table is one aspect of the current dwarf reader context
that is tracked, but there are more, f.i. ones that are tracked via the
dwarf2_cu member m_builder, f.i. m_builder->m_local_using_directives.

A similar problem exists in relation to inter-CU references, but a different
solution was chosen:
- to keep track of an ancestor field in dwarf2_cu, which is updated
  when traversing inter-CU references, and
- to use the ancestor field in dwarf2_cu::get_builder to return the m_builder
  in scope.

There is no actual concept of a CU having an ancestor, it just marks the most
recent CU from which a CU was inter-CU-referenced.  Consequently, when
following inter-CU references from a CU A to another CU B and back to CU A,
the ancestors form a cycle, which causes dwarf2_cu::get_builder to hang or
segfault, as reported in PR26327.

ISTM that the ancestor implementation is confusing and fragile, and should
go.  Furthermore, it seems that keeping track of the m_builder in scope can be
handled simply with a per-objfile variable.

Fix the hang / segfault by:
- keeping track of the m_builder in scope using a new variable
  per_obj->sym_cu, and
- using it in dwarf2_cu::get_builder.

Tested on x86_64-linux (openSUSE Leap 15.2), no regressions for config:
- using default gcc version 7.5.0
  (with 5 unexpected FAILs)
- gcc 10.3.0 and target board
  unix/-flto/-O0/-flto-partition=none/-ffat-lto-objects
  (with 1000 unexpected FAILs)

gdb/ChangeLog:

2021-06-09  Tom de Vries  <tdevries@suse.de>

	PR symtab/26327
	* dwarf2/cu.h (dwarf2_cu::ancestor): Remove.
	(dwarf2_cu::get_builder): Declare and move ...
	* dwarf2/cu.c (dwarf2_cu::get_builder): ... here.  Use sym_cu instead
	of ancestor.  Assert return value is non-null.
	* dwarf2/read.c (read_file_scope): Set per_objfile->sym_cu.
	(follow_die_offset, follow_die_sig_1): Remove setting of ancestor.
	(dwarf2_per_objfile): Add sym_cu field.

---
 gdb/dwarf2/cu.c   | 15 +++++++++++++++
 gdb/dwarf2/cu.h   | 17 +----------------
 gdb/dwarf2/read.c | 13 ++++++-------
 gdb/dwarf2/read.h |  3 +++
 4 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/gdb/dwarf2/cu.c b/gdb/dwarf2/cu.c
index 1031ed3aa00..bc3f26dc28f 100644
--- a/gdb/dwarf2/cu.c
+++ b/gdb/dwarf2/cu.c
@@ -138,3 +138,18 @@ dwarf2_cu::add_dependence (struct dwarf2_per_cu_data *ref_per_cu)
   if (*slot == nullptr)
     *slot = ref_per_cu;
 }
+
+/* See dwarf2/cu.h.  */
+
+buildsym_compunit *
+dwarf2_cu::get_builder ()
+{
+  /* If this CU has a builder associated with it, use that.  */
+  if (m_builder != nullptr)
+    return m_builder.get ();
+
+  if (per_objfile->sym_cu != nullptr)
+    return per_objfile->sym_cu->m_builder.get ();
+
+  gdb_assert_not_reached ("");
+}
diff --git a/gdb/dwarf2/cu.h b/gdb/dwarf2/cu.h
index b4a5b08d5a6..fe606f3eab7 100644
--- a/gdb/dwarf2/cu.h
+++ b/gdb/dwarf2/cu.h
@@ -272,23 +272,8 @@ struct dwarf2_cu
 
   struct partial_die_info *find_partial_die (sect_offset sect_off);
 
-  /* If this CU was inherited by another CU (via specification,
-     abstract_origin, etc), this is the ancestor CU.  */
-  dwarf2_cu *ancestor;
-
   /* Get the buildsym_compunit for this CU.  */
-  buildsym_compunit *get_builder ()
-  {
-    /* If this CU has a builder associated with it, use that.  */
-    if (m_builder != nullptr)
-      return m_builder.get ();
-
-    /* Otherwise, search ancestors for a valid builder.  */
-    if (ancestor != nullptr)
-      return ancestor->get_builder ();
-
-    return nullptr;
-  }
+  buildsym_compunit *get_builder ();
 };
 
 #endif /* GDB_DWARF2_CU_H */
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 96009f1418f..07bc08fba14 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -10521,6 +10521,10 @@ read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
 
   cu->start_symtab (fnd.name, fnd.comp_dir, lowpc);
 
+  gdb_assert (per_objfile->sym_cu == nullptr);
+  scoped_restore restore_sym_cu
+    = make_scoped_restore (&per_objfile->sym_cu, cu);
+
   /* Decode line number information if present.  We do this before
      processing child DIEs, so that the line header table is available
      for DW_AT_decl_file.  */
@@ -10536,6 +10540,7 @@ read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
 	  child_die = child_die->sibling;
 	}
     }
+  per_objfile->sym_cu = nullptr;
 
   /* Decode macro information, if present.  Dwarf 2 macro information
      refers to information in the line number info statement program
@@ -23114,9 +23119,6 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz,
   *ref_cu = target_cu;
   temp_die.sect_off = sect_off;
 
-  if (target_cu != cu)
-    target_cu->ancestor = cu;
-
   return (struct die_info *) htab_find_with_hash (target_cu->die_hash,
 						  &temp_die,
 						  to_underlying (sect_off));
@@ -23469,7 +23471,7 @@ follow_die_sig_1 (struct die_info *src_die, struct signatured_type *sig_type,
 		  struct dwarf2_cu **ref_cu)
 {
   struct die_info temp_die;
-  struct dwarf2_cu *sig_cu, *cu = *ref_cu;
+  struct dwarf2_cu *sig_cu;
   struct die_info *die;
   dwarf2_per_objfile *per_objfile = (*ref_cu)->per_objfile;
 
@@ -23505,9 +23507,6 @@ follow_die_sig_1 (struct die_info *src_die, struct signatured_type *sig_type,
 	}
 
       *ref_cu = sig_cu;
-      if (sig_cu != cu)
-	sig_cu->ancestor = cu;
-
       return die;
     }
 
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index ae1608fa822..ee454ad300a 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -579,6 +579,9 @@ struct dwarf2_per_objfile
   /* Table containing line_header indexed by offset and offset_in_dwz.  */
   htab_up line_header_hash;
 
+  /* The CU containing the m_builder in scope.  */
+  dwarf2_cu *sym_cu = nullptr;
+
 private:
   /* Hold the corresponding compunit_symtab for each CU or TU.  This
      is indexed by dwarf2_per_cu_data::index.  A NULL value means

Patch

diff --git a/gdb/dwarf2/cu.c b/gdb/dwarf2/cu.c
index 1031ed3aa00..b5b1f4551c6 100644
--- a/gdb/dwarf2/cu.c
+++ b/gdb/dwarf2/cu.c
@@ -21,6 +21,9 @@ 
 #include "dwarf2/cu.h"
 #include "dwarf2/read.h"
 
+/* See cu.h.  */
+dwarf2_cu *dwarf2_cu::sym_cu;
+
 /* Initialize dwarf2_cu to read PER_CU, in the context of PER_OBJFILE.  */
 
 dwarf2_cu::dwarf2_cu (dwarf2_per_cu_data *per_cu,
diff --git a/gdb/dwarf2/cu.h b/gdb/dwarf2/cu.h
index b4a5b08d5a6..68f213d9d48 100644
--- a/gdb/dwarf2/cu.h
+++ b/gdb/dwarf2/cu.h
@@ -272,9 +272,8 @@  struct dwarf2_cu
 
   struct partial_die_info *find_partial_die (sect_offset sect_off);
 
-  /* If this CU was inherited by another CU (via specification,
-     abstract_origin, etc), this is the ancestor CU.  */
-  dwarf2_cu *ancestor;
+  /* The CU containing the m_builder in scope.  */
+  static dwarf2_cu *sym_cu;
 
   /* Get the buildsym_compunit for this CU.  */
   buildsym_compunit *get_builder ()
@@ -283,10 +282,10 @@  struct dwarf2_cu
     if (m_builder != nullptr)
       return m_builder.get ();
 
-    /* Otherwise, search ancestors for a valid builder.  */
-    if (ancestor != nullptr)
-      return ancestor->get_builder ();
+    if (sym_cu != nullptr)
+      return sym_cu->m_builder.get ();
 
+    gdb_assert_not_reached ("");
     return nullptr;
   }
 };
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 96009f1418f..f904073da21 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -10521,6 +10521,10 @@  read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
 
   cu->start_symtab (fnd.name, fnd.comp_dir, lowpc);
 
+  gdb_assert (dwarf2_cu::sym_cu == nullptr);
+  scoped_restore restore_sym_cu = make_scoped_restore (&dwarf2_cu::sym_cu);
+  dwarf2_cu::sym_cu = cu;
+
   /* Decode line number information if present.  We do this before
      processing child DIEs, so that the line header table is available
      for DW_AT_decl_file.  */
@@ -10536,6 +10540,7 @@  read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
 	  child_die = child_die->sibling;
 	}
     }
+  dwarf2_cu::sym_cu = nullptr;
 
   /* Decode macro information, if present.  Dwarf 2 macro information
      refers to information in the line number info statement program
@@ -23114,9 +23119,6 @@  follow_die_offset (sect_offset sect_off, int offset_in_dwz,
   *ref_cu = target_cu;
   temp_die.sect_off = sect_off;
 
-  if (target_cu != cu)
-    target_cu->ancestor = cu;
-
   return (struct die_info *) htab_find_with_hash (target_cu->die_hash,
 						  &temp_die,
 						  to_underlying (sect_off));
@@ -23469,7 +23471,7 @@  follow_die_sig_1 (struct die_info *src_die, struct signatured_type *sig_type,
 		  struct dwarf2_cu **ref_cu)
 {
   struct die_info temp_die;
-  struct dwarf2_cu *sig_cu, *cu = *ref_cu;
+  struct dwarf2_cu *sig_cu;
   struct die_info *die;
   dwarf2_per_objfile *per_objfile = (*ref_cu)->per_objfile;
 
@@ -23505,9 +23507,6 @@  follow_die_sig_1 (struct die_info *src_die, struct signatured_type *sig_type,
 	}
 
       *ref_cu = sig_cu;
-      if (sig_cu != cu)
-	sig_cu->ancestor = cu;
-
       return die;
     }