Simple speedup for DWARF CU expansion

Message ID 20200523212154.15024-1-tom@tromey.com
State New
Headers show
Series
  • Simple speedup for DWARF CU expansion
Related show

Commit Message

Tom Tromey May 23, 2020, 9:21 p.m.
I noticed that DWARF CU expansion had an easy-to-fix hot spot:
following DIE references in dwarf2_attr.

This patch fixes the problem by caching the referent in the die_info,
if the two DIEs are in the same CU.  (This restriction avoids any
issues with CU invalidation.)

I ran "gdb -readnow" on itself 10 times.  The mean before on this
machine was 14.522 seconds; with the patch it was 13.685 seconds.

Historically I figured that CU expansion was not very important; but I
think occasionally it results in unexpected delays for users.  So,
it's probably worth fixing.  I'm not sure the micro-optimization
approach is really all that good in the long run.  Better, I think,
would be to read types and function bodies lazily -- this would vastly
improve performance.  (I attempted the latter once and saw a 40%
speedup, IIRC.)

The use of dwarf2_attr could also be improved.  Rather than looking up
attributes individually, it would probably be better for (most)
functions to loop once over all attributes, collecting useful
information.

Meanwhile, this patch seems reasonably worthwhile.

gdb/ChangeLog
2020-05-23  Tom Tromey  <tom@tromey.com>

	* dwarf2/read.c (dwarf2_attr): Use die_info::ref.
	* dwarf2/die.h (struct die_info) <ref>: New member.
---
 gdb/ChangeLog     |  5 +++++
 gdb/dwarf2/die.h  |  5 +++++
 gdb/dwarf2/read.c | 11 ++++++++++-
 3 files changed, 20 insertions(+), 1 deletion(-)

-- 
2.17.2

Comments

Simon Marchi May 23, 2020, 10:02 p.m. | #1
On 2020-05-23 5:21 p.m., Tom Tromey wrote:
> I noticed that DWARF CU expansion had an easy-to-fix hot spot:

> following DIE references in dwarf2_attr.

> 

> This patch fixes the problem by caching the referent in the die_info,

> if the two DIEs are in the same CU.  (This restriction avoids any

> issues with CU invalidation.)

> 

> I ran "gdb -readnow" on itself 10 times.  The mean before on this

> machine was 14.522 seconds; with the patch it was 13.685 seconds.

> 

> Historically I figured that CU expansion was not very important; but I

> think occasionally it results in unexpected delays for users.  So,

> it's probably worth fixing.  I'm not sure the micro-optimization

> approach is really all that good in the long run.  Better, I think,

> would be to read types and function bodies lazily -- this would vastly

> improve performance.  (I attempted the latter once and saw a 40%

> speedup, IIRC.)

> 

> The use of dwarf2_attr could also be improved.  Rather than looking up

> attributes individually, it would probably be better for (most)

> functions to loop once over all attributes, collecting useful

> information.

> 

> Meanwhile, this patch seems reasonably worthwhile.

> 

> gdb/ChangeLog

> 2020-05-23  Tom Tromey  <tom@tromey.com>

> 

> 	* dwarf2/read.c (dwarf2_attr): Use die_info::ref.

> 	* dwarf2/die.h (struct die_info) <ref>: New member.

> ---

>  gdb/ChangeLog     |  5 +++++

>  gdb/dwarf2/die.h  |  5 +++++

>  gdb/dwarf2/read.c | 11 ++++++++++-

>  3 files changed, 20 insertions(+), 1 deletion(-)

> 

> diff --git a/gdb/dwarf2/die.h b/gdb/dwarf2/die.h

> index 5522ebdf311..83bcdf69a17 100644

> --- a/gdb/dwarf2/die.h

> +++ b/gdb/dwarf2/die.h

> @@ -94,6 +94,11 @@ struct die_info

>    struct die_info *sibling;	/* Its next sibling, if any.  */

>    struct die_info *parent;	/* Its parent, if any.  */

>  

> +  /* If the DIE has a DW_AT_specification or DW_AT_abstract_origin,

> +     and the referenced DIE appears in the same CU as this DIE, then

> +     this caches the referenced DIE.  */

> +  struct die_info *ref;

> +

>    /* An array of attributes, with NUM_ATTRS elements.  There may be

>       zero, but it's not common and zero-sized arrays are not

>       sufficiently portable C.  */

> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c

> index e6d08110b2a..973c6c95ded 100644

> --- a/gdb/dwarf2/read.c

> +++ b/gdb/dwarf2/read.c

> @@ -19556,7 +19556,16 @@ dwarf2_attr (struct die_info *die, unsigned int name, struct dwarf2_cu *cu)

>        if (!spec)

>  	break;

>  

> -      die = follow_die_ref (die, spec, &cu);

> +      if (die->ref != nullptr)

> +	die = die->ref;

> +      else

> +	{

> +	  struct dwarf2_cu *save_cu = cu;

> +	  struct die_info *ref = follow_die_ref (die, spec, &cu);

> +	  if (cu == save_cu)

> +	    die->ref = ref;

> +	  die = ref;

> +	}


It wouldn't hurt to add some comment here, something like:

/* Cache referred* DIE to speed up subsequent accesses.  Don't cache it
   if it's from another CU, as the other CU could be freed/invalidated
   and this pointer would become invalid.  */

* Or "referent", as you used in the commit message?  I didn't know about
  this meaning of the referent word.

If I just saw the code without the explanation, I probably wouldn't
understand the rationale.

Otherwise, LGTM.

Simon

Patch

diff --git a/gdb/dwarf2/die.h b/gdb/dwarf2/die.h
index 5522ebdf311..83bcdf69a17 100644
--- a/gdb/dwarf2/die.h
+++ b/gdb/dwarf2/die.h
@@ -94,6 +94,11 @@  struct die_info
   struct die_info *sibling;	/* Its next sibling, if any.  */
   struct die_info *parent;	/* Its parent, if any.  */
 
+  /* If the DIE has a DW_AT_specification or DW_AT_abstract_origin,
+     and the referenced DIE appears in the same CU as this DIE, then
+     this caches the referenced DIE.  */
+  struct die_info *ref;
+
   /* An array of attributes, with NUM_ATTRS elements.  There may be
      zero, but it's not common and zero-sized arrays are not
      sufficiently portable C.  */
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index e6d08110b2a..973c6c95ded 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -19556,7 +19556,16 @@  dwarf2_attr (struct die_info *die, unsigned int name, struct dwarf2_cu *cu)
       if (!spec)
 	break;
 
-      die = follow_die_ref (die, spec, &cu);
+      if (die->ref != nullptr)
+	die = die->ref;
+      else
+	{
+	  struct dwarf2_cu *save_cu = cu;
+	  struct die_info *ref = follow_die_ref (die, spec, &cu);
+	  if (cu == save_cu)
+	    die->ref = ref;
+	  die = ref;
+	}
     }
 
   return NULL;