gdb: improve error reporting from the disassembler

Message ID 20211011131001.1693647-1-andrew.burgess@embecosm.com
State New
Headers show
Series
  • gdb: improve error reporting from the disassembler
Related show

Commit Message

Andrew Burgess Oct. 11, 2021, 1:10 p.m.
If the libopcodes disassembler returns a negative value then this
indicates that the disassembly failed for some reason.  In disas.c, in
the function gdb_disassembler::print_insn we can see how this is
handled; when we get a negative value back, we call the memory_error
function, which throws an exception.

The problem here is that the address used in the memory_error call is
gdb_disassembler::m_err_memaddr, which is set in
gdb_disassembler::dis_asm_memory_error, which is called from within
the libopcodes disassembler through the
disassembler_info::memory_error_func callback.

However, for this to work correctly, every time the libopcodes
disassembler returns a negative value, the libopcodes disassembler
must have first called the memory_error_func callback.

My first plan was to make m_err_memaddr a gdb::optional, and assert
that it always had a value prior to calling memory_error, however, a
quick look in opcodes/*-dis.c shows that there _are_ cases where a
negative value is returned without first calling the memory_error_func
callback, for example in arc-dis.c and cris-dis.c.

Now, I think that a good argument can be made that these disassemblers
must therefore be broken, except for the case where we can't read
memory, we should always be able to disassemble the memory contents to
_something_, even if it's just '.word 0x....'.  However, I certainly
don't plan to go and fix all of the disassemblers.

What I do propose to do then, is make m_err_memaddr a gdb::optional,
but now, instead of always calling memory_error, I add a new path
which just calls error complaining about an unknown error.  This new
path is only used if m_err_memaddr doesn't have a value (indicating
that the memory_error_func callback was not called).

To test this I just augmented one of the disassemblers to always
return -1, before this patch I see this:

  Dump of assembler code for function main:
     0x000101aa <+0>:	Cannot access memory at address 0x0

And after this commit I now see:

  Dump of assembler code for function main:
     0x000101aa <+0>:	unknown disassembler error (error = -1)

This doesn't really help much, but that's because there's no way to
report non memory errors out of the disasembler, because, it was not
expected that the disassembler would ever report non memory errors.
---
 gdb/disasm.c | 14 +++++++++-----
 gdb/disasm.h |  7 ++++++-
 2 files changed, 15 insertions(+), 6 deletions(-)

-- 
2.25.4

Comments

Simon Marchi Oct. 12, 2021, 3:23 p.m. | #1
On 2021-10-11 9:10 a.m., Andrew Burgess wrote:
> If the libopcodes disassembler returns a negative value then this

> indicates that the disassembly failed for some reason.  In disas.c, in

> the function gdb_disassembler::print_insn we can see how this is

> handled; when we get a negative value back, we call the memory_error

> function, which throws an exception.

>

> The problem here is that the address used in the memory_error call is

> gdb_disassembler::m_err_memaddr, which is set in

> gdb_disassembler::dis_asm_memory_error, which is called from within

> the libopcodes disassembler through the

> disassembler_info::memory_error_func callback.

>

> However, for this to work correctly, every time the libopcodes

> disassembler returns a negative value, the libopcodes disassembler

> must have first called the memory_error_func callback.

>

> My first plan was to make m_err_memaddr a gdb::optional, and assert

> that it always had a value prior to calling memory_error, however, a

> quick look in opcodes/*-dis.c shows that there _are_ cases where a

> negative value is returned without first calling the memory_error_func

> callback, for example in arc-dis.c and cris-dis.c.

>

> Now, I think that a good argument can be made that these disassemblers

> must therefore be broken, except for the case where we can't read

> memory, we should always be able to disassemble the memory contents to

> _something_, even if it's just '.word 0x....'.  However, I certainly

> don't plan to go and fix all of the disassemblers.

>

> What I do propose to do then, is make m_err_memaddr a gdb::optional,

> but now, instead of always calling memory_error, I add a new path

> which just calls error complaining about an unknown error.  This new

> path is only used if m_err_memaddr doesn't have a value (indicating

> that the memory_error_func callback was not called).

>

> To test this I just augmented one of the disassemblers to always

> return -1, before this patch I see this:

>

>   Dump of assembler code for function main:

>      0x000101aa <+0>:	Cannot access memory at address 0x0

>

> And after this commit I now see:

>

>   Dump of assembler code for function main:

>      0x000101aa <+0>:	unknown disassembler error (error = -1)

>

> This doesn't really help much, but that's because there's no way to

> report non memory errors out of the disasembler, because, it was not

> expected that the disassembler would ever report non memory errors.


The patch LGTM since it makes GDB more resilient against the bugs in
opcodes.  Given infinite time, we would fix opcodes, but this looks like
a good compromise.

I'm wondering how you stumbled on that, did you actually get bitten by
opcodes returning -1 but not calling memory_error_func?

Simon
Andrew Burgess Oct. 13, 2021, 9:40 a.m. | #2
* Simon Marchi <simark@simark.ca> [2021-10-12 11:23:51 -0400]:

> On 2021-10-11 9:10 a.m., Andrew Burgess wrote:

> > If the libopcodes disassembler returns a negative value then this

> > indicates that the disassembly failed for some reason.  In disas.c, in

> > the function gdb_disassembler::print_insn we can see how this is

> > handled; when we get a negative value back, we call the memory_error

> > function, which throws an exception.

> >

> > The problem here is that the address used in the memory_error call is

> > gdb_disassembler::m_err_memaddr, which is set in

> > gdb_disassembler::dis_asm_memory_error, which is called from within

> > the libopcodes disassembler through the

> > disassembler_info::memory_error_func callback.

> >

> > However, for this to work correctly, every time the libopcodes

> > disassembler returns a negative value, the libopcodes disassembler

> > must have first called the memory_error_func callback.

> >

> > My first plan was to make m_err_memaddr a gdb::optional, and assert

> > that it always had a value prior to calling memory_error, however, a

> > quick look in opcodes/*-dis.c shows that there _are_ cases where a

> > negative value is returned without first calling the memory_error_func

> > callback, for example in arc-dis.c and cris-dis.c.

> >

> > Now, I think that a good argument can be made that these disassemblers

> > must therefore be broken, except for the case where we can't read

> > memory, we should always be able to disassemble the memory contents to

> > _something_, even if it's just '.word 0x....'.  However, I certainly

> > don't plan to go and fix all of the disassemblers.

> >

> > What I do propose to do then, is make m_err_memaddr a gdb::optional,

> > but now, instead of always calling memory_error, I add a new path

> > which just calls error complaining about an unknown error.  This new

> > path is only used if m_err_memaddr doesn't have a value (indicating

> > that the memory_error_func callback was not called).

> >

> > To test this I just augmented one of the disassemblers to always

> > return -1, before this patch I see this:

> >

> >   Dump of assembler code for function main:

> >      0x000101aa <+0>:	Cannot access memory at address 0x0

> >

> > And after this commit I now see:

> >

> >   Dump of assembler code for function main:

> >      0x000101aa <+0>:	unknown disassembler error (error = -1)

> >

> > This doesn't really help much, but that's because there's no way to

> > report non memory errors out of the disasembler, because, it was not

> > expected that the disassembler would ever report non memory errors.

> 

> The patch LGTM since it makes GDB more resilient against the bugs in

> opcodes.  Given infinite time, we would fix opcodes, but this looks like

> a good compromise.


Thanks.  I agree fixing libopcodes would be best in an ideal world,
but it's nice to harden GDB against these things.

> I'm wondering how you stumbled on that, did you actually get bitten by

> opcodes returning -1 but not calling memory_error_func?


No.  I have an incoming patch to add a Python API for the
disassembler.  As part of this work I was reviewing when the
disassembler can return -1.  I initially reasoned that it should only
return -1 following a memory_error, but a quick grep disproved that
idea.

Thanks,
Andrew

Patch

diff --git a/gdb/disasm.c b/gdb/disasm.c
index c788f5b618c..dc6426718bb 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -148,7 +148,7 @@  gdb_disassembler::dis_asm_memory_error (int err, bfd_vma memaddr,
   gdb_disassembler *self
     = static_cast<gdb_disassembler *>(info->application_data);
 
-  self->m_err_memaddr = memaddr;
+  self->m_err_memaddr.emplace (memaddr);
 }
 
 /* Wrapper of print_address.  */
@@ -754,8 +754,7 @@  get_all_disassembler_options (struct gdbarch *gdbarch)
 gdb_disassembler::gdb_disassembler (struct gdbarch *gdbarch,
 				    struct ui_file *file,
 				    di_read_memory_ftype read_memory_func)
-  : m_gdbarch (gdbarch),
-    m_err_memaddr (0)
+  : m_gdbarch (gdbarch)
 {
   init_disassemble_info (&m_di, file, fprintf_disasm);
   m_di.flavour = bfd_target_unknown_flavour;
@@ -790,12 +789,17 @@  int
 gdb_disassembler::print_insn (CORE_ADDR memaddr,
 			      int *branch_delay_insns)
 {
-  m_err_memaddr = 0;
+  m_err_memaddr.reset ();
 
   int length = gdbarch_print_insn (arch (), memaddr, &m_di);
 
   if (length < 0)
-    memory_error (TARGET_XFER_E_IO, m_err_memaddr);
+    {
+      if (m_err_memaddr.has_value ())
+	memory_error (TARGET_XFER_E_IO, *m_err_memaddr);
+      else
+	error (_("unknown disassembler error (error = %d)"), length);
+    }
 
   if (branch_delay_insns != NULL)
     {
diff --git a/gdb/disasm.h b/gdb/disasm.h
index 6cbfdcd4f6b..d3642d8ca01 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -75,7 +75,12 @@  class gdb_disassembler
      using this field.  */
   std::string m_disassembler_options_holder;
 
-  CORE_ADDR m_err_memaddr;
+  /* This member variable is given a value by calling dis_asm_memory_error.
+     If after calling into the libopcodes disassembler we get back a
+     negative value (which indicates an error), then, if this variable has
+     a value, we report a memory error to the user, otherwise, we report a
+     non-memory error.  */
+  gdb::optional<CORE_ADDR> m_err_memaddr;
 
   static int dis_asm_read_memory (bfd_vma memaddr, gdb_byte *myaddr,
 				  unsigned int len,