[2/3] gdb: make thread_suspend_state::stop_pc optional

Message ID d2f8d7496d6bfcff1ff61f0debb9bcefdccf7ca9.1630353626.git.andrew.burgess@embecosm.com
State New
Headers show
Series
  • Changes to thread state tracking
Related show

Commit Message

Andrew Burgess Aug. 30, 2021, 8:03 p.m.
Currently the stop_pc field of thread_suspect_state is a CORE_ADDR and
when we want to indicate that there is no stop_pc available we set
this field back to a special value.

There are actually two special values used, in post_create_inferior
the stop_pc is set to 0.  This is a little unfortunate, there are
plenty of embedded targets where 0 is a valid pc address.  The more
common special value for stop_pc was set in
thread_info::set_executing, where the value (~(CORE_ADDR) 0) was used.

This commit changes things so that the stop_pc is instead a
gdb::optional.  We can now explicitly reset the field to an
uninitialised state, we also have (when compiling with _GLIBCXX_DEBUG
defined) asserts that we don't read the stop_pc when its in an
uninitialised state (see gdbsupport/gdb_optional.h).

One situation where a thread will not have a stop_pc value is when the
thread is stopped as a consequence of GDB being in all stop mode, and
some other thread stopped at an interesting event.  When GDB brings
all the other threads to a stop those other threads will not have a
stop_pc set (thus avoiding an unnecessary read of $pc).

Previously, when GDB passed through handle_one (in infrun.c) the
threads executing flag was set to false and the stop_pc field was left
unchanged, i.e. it would (previous) have been left as ~0.

Now, handle_one leaves the stop_pc with no value.

This caused a problem when we later try to set these threads running
again, in proceed() we compare the current pc with the cached
stop_pc.  If the thread was stopped in via handle_one then the stop_pc
would have been left as ~0, and the compare (in proceed)
would (likely) fail.  Now however, this compare tries to read the
stop_pc when it has no value, this would trigger an assert.

To resolve this I've added thread_info::stop_pc_p() which returns true
if the thread has a cached stop_pc.  We should only ever call
thread_info::stop_pc() if we know that there is a cached stop_pc.

After running the testsuite I've seen no other situations where
stop_pc is read uninitialised.
---
 gdb/gdbthread.h | 26 ++++++++++++++++++++++----
 gdb/infcmd.c    |  2 +-
 gdb/infrun.c    |  3 ++-
 gdb/thread.c    |  2 +-
 4 files changed, 26 insertions(+), 7 deletions(-)

-- 
2.25.4

Comments

Lancelot SIX via Gdb-patches Sept. 1, 2021, 2:23 p.m. | #1
On 2021-08-30 4:03 p.m., Andrew Burgess wrote:
> Currently the stop_pc field of thread_suspect_state is a CORE_ADDR and

> when we want to indicate that there is no stop_pc available we set

> this field back to a special value.

> 

> There are actually two special values used, in post_create_inferior

> the stop_pc is set to 0.  This is a little unfortunate, there are

> plenty of embedded targets where 0 is a valid pc address.  The more

> common special value for stop_pc was set in

> thread_info::set_executing, where the value (~(CORE_ADDR) 0) was used.

> 

> This commit changes things so that the stop_pc is instead a

> gdb::optional.  We can now explicitly reset the field to an

> uninitialised state, we also have (when compiling with _GLIBCXX_DEBUG

> defined) asserts that we don't read the stop_pc when its in an

> uninitialised state (see gdbsupport/gdb_optional.h).


Thanks, I think it's a good idea.

> One situation where a thread will not have a stop_pc value is when the

> thread is stopped as a consequence of GDB being in all stop mode, and

> some other thread stopped at an interesting event.  When GDB brings

> all the other threads to a stop those other threads will not have a

> stop_pc set (thus avoiding an unnecessary read of $pc).

> 

> Previously, when GDB passed through handle_one (in infrun.c) the

> threads executing flag was set to false and the stop_pc field was left

> unchanged, i.e. it would (previous) have been left as ~0.

> 

> Now, handle_one leaves the stop_pc with no value.

> 

> This caused a problem when we later try to set these threads running

> again, in proceed() we compare the current pc with the cached

> stop_pc.  If the thread was stopped in via handle_one then the stop_pc

> would have been left as ~0, and the compare (in proceed)

> would (likely) fail.  Now however, this compare tries to read the

> stop_pc when it has no value, this would trigger an assert.

> 

> To resolve this I've added thread_info::stop_pc_p() which returns true

> if the thread has a cached stop_pc.  We should only ever call

> thread_info::stop_pc() if we know that there is a cached stop_pc.


We could also make stop_pc return gdb::optional<CORE_ADDR>.  I think it
would be slightly better, since anybody calling stop_pc would see that
it returns an optional and be forced to consider that.  Otherwise, one
could call stop_pc and not know that stop_pc_p exists.  But otherwise
it's the same.

LGTM in any case.

Simon
Andrew Burgess Sept. 7, 2021, 1:21 p.m. | #2
* Simon Marchi <simon.marchi@polymtl.ca> [2021-09-01 10:23:32 -0400]:

> 

> 

> On 2021-08-30 4:03 p.m., Andrew Burgess wrote:

> > Currently the stop_pc field of thread_suspect_state is a CORE_ADDR and

> > when we want to indicate that there is no stop_pc available we set

> > this field back to a special value.

> > 

> > There are actually two special values used, in post_create_inferior

> > the stop_pc is set to 0.  This is a little unfortunate, there are

> > plenty of embedded targets where 0 is a valid pc address.  The more

> > common special value for stop_pc was set in

> > thread_info::set_executing, where the value (~(CORE_ADDR) 0) was used.

> > 

> > This commit changes things so that the stop_pc is instead a

> > gdb::optional.  We can now explicitly reset the field to an

> > uninitialised state, we also have (when compiling with _GLIBCXX_DEBUG

> > defined) asserts that we don't read the stop_pc when its in an

> > uninitialised state (see gdbsupport/gdb_optional.h).

> 

> Thanks, I think it's a good idea.

> 

> > One situation where a thread will not have a stop_pc value is when the

> > thread is stopped as a consequence of GDB being in all stop mode, and

> > some other thread stopped at an interesting event.  When GDB brings

> > all the other threads to a stop those other threads will not have a

> > stop_pc set (thus avoiding an unnecessary read of $pc).

> > 

> > Previously, when GDB passed through handle_one (in infrun.c) the

> > threads executing flag was set to false and the stop_pc field was left

> > unchanged, i.e. it would (previous) have been left as ~0.

> > 

> > Now, handle_one leaves the stop_pc with no value.

> > 

> > This caused a problem when we later try to set these threads running

> > again, in proceed() we compare the current pc with the cached

> > stop_pc.  If the thread was stopped in via handle_one then the stop_pc

> > would have been left as ~0, and the compare (in proceed)

> > would (likely) fail.  Now however, this compare tries to read the

> > stop_pc when it has no value, this would trigger an assert.

> > 

> > To resolve this I've added thread_info::stop_pc_p() which returns true

> > if the thread has a cached stop_pc.  We should only ever call

> > thread_info::stop_pc() if we know that there is a cached stop_pc.

> 

> We could also make stop_pc return gdb::optional<CORE_ADDR>.  I think it

> would be slightly better, since anybody calling stop_pc would see that

> it returns an optional and be forced to consider that.  Otherwise, one

> could call stop_pc and not know that stop_pc_p exists.  But otherwise

> it's the same.


I did consider that initially, but most of the places where
thread_info::stop_pc is called the value is being immediately passed
through to some other function, here's an example pulled randomly from
infrun.c:

      ecs->event_thread->control.stop_bpstat
	= bpstat_stop_status (get_current_regcache ()->aspace (),
			      ecs->event_thread->stop_pc (),
			      ecs->event_thread, &ecs->ws);

if we are returned a gdb::optional<> then we might change the code to
do this:

      ecs->event_thread->control.stop_bpstat
	= bpstat_stop_status (get_current_regcache ()->aspace (),
			      *ecs->event_thread->stop_pc (),
			      ecs->event_thread, &ecs->ws);

Or maybe, like this:

      auto stop_pc = ecs->event_thread->stop_pc ();
      gdb_assert (stop_pc.has_value ());
      ecs->event_thread->control.stop_bpstat
	= bpstat_stop_status (get_current_regcache ()->aspace (),
			      *stop_pc,
			      ecs->event_thread, &ecs->ws);

In the first case, it doesn't feel like we've gained much over my
patch, where thead_info::stop_pc() accesses the value for us.
Further, once we've normalised the pattern of accessing the stop_pc as
`*ecs->event_thread->stop_pc ()`, I worry people still wouldn't
actually consider whether the stop_pc value was valid or not, they'd
just duplicate the existing code.

The second case seems excessively verbose, so much so, that you might
even be tempted to write a wrapper, say thread_info::stop_pc_value(),
which kind lands us back on my original patch...

Initially, I'd relied on the asserts within gdb::optional to ensure
that we didn't access the stop_pc when it had no value, but these
asserts are only present when compiling with _GLIBCXX_DEBUG defined -
I do this, but it's certainly not going to be standard in a release
build of GDB.  So, I wonder if this would be a good change:

  /* Return this thread's stop PC.  This should only be called when it is
     known that stop_pc has a value.  If this function is being used in a
     situation where a thread may not have had a stop_pc assigned, then
     stop_pc_p() can be used to check if the stop_pc is defined.  */

  CORE_ADDR stop_pc () const
  {
    gdb_assert (m_suspend.stop_pc.has_value ());
    return *m_suspend.stop_pc;
  }

I've (a) extended the comment to mention stop_pc_p(), and added an
assert that the stop_pc has a value.  Given that (currently) all
builds of GDB do check assertions, this should hopefully make it much
more likely that if someone does access stop_pc when they shouldn't
then GDB will rapidly point this out to them.

What are your thoughts?

Thanks,
Andrew
Lancelot SIX via Gdb-patches Sept. 7, 2021, 2:10 p.m. | #3
On 2021-09-07 9:21 a.m., Andrew Burgess wrote:
> * Simon Marchi <simon.marchi@polymtl.ca> [2021-09-01 10:23:32 -0400]:

> 

>>

>>

>> On 2021-08-30 4:03 p.m., Andrew Burgess wrote:

>>> Currently the stop_pc field of thread_suspect_state is a CORE_ADDR and

>>> when we want to indicate that there is no stop_pc available we set

>>> this field back to a special value.

>>>

>>> There are actually two special values used, in post_create_inferior

>>> the stop_pc is set to 0.  This is a little unfortunate, there are

>>> plenty of embedded targets where 0 is a valid pc address.  The more

>>> common special value for stop_pc was set in

>>> thread_info::set_executing, where the value (~(CORE_ADDR) 0) was used.

>>>

>>> This commit changes things so that the stop_pc is instead a

>>> gdb::optional.  We can now explicitly reset the field to an

>>> uninitialised state, we also have (when compiling with _GLIBCXX_DEBUG

>>> defined) asserts that we don't read the stop_pc when its in an

>>> uninitialised state (see gdbsupport/gdb_optional.h).

>>

>> Thanks, I think it's a good idea.

>>

>>> One situation where a thread will not have a stop_pc value is when the

>>> thread is stopped as a consequence of GDB being in all stop mode, and

>>> some other thread stopped at an interesting event.  When GDB brings

>>> all the other threads to a stop those other threads will not have a

>>> stop_pc set (thus avoiding an unnecessary read of $pc).

>>>

>>> Previously, when GDB passed through handle_one (in infrun.c) the

>>> threads executing flag was set to false and the stop_pc field was left

>>> unchanged, i.e. it would (previous) have been left as ~0.

>>>

>>> Now, handle_one leaves the stop_pc with no value.

>>>

>>> This caused a problem when we later try to set these threads running

>>> again, in proceed() we compare the current pc with the cached

>>> stop_pc.  If the thread was stopped in via handle_one then the stop_pc

>>> would have been left as ~0, and the compare (in proceed)

>>> would (likely) fail.  Now however, this compare tries to read the

>>> stop_pc when it has no value, this would trigger an assert.

>>>

>>> To resolve this I've added thread_info::stop_pc_p() which returns true

>>> if the thread has a cached stop_pc.  We should only ever call

>>> thread_info::stop_pc() if we know that there is a cached stop_pc.

>>

>> We could also make stop_pc return gdb::optional<CORE_ADDR>.  I think it

>> would be slightly better, since anybody calling stop_pc would see that

>> it returns an optional and be forced to consider that.  Otherwise, one

>> could call stop_pc and not know that stop_pc_p exists.  But otherwise

>> it's the same.

> 

> I did consider that initially, but most of the places where

> thread_info::stop_pc is called the value is being immediately passed

> through to some other function, here's an example pulled randomly from

> infrun.c:

> 

>       ecs->event_thread->control.stop_bpstat

> 	= bpstat_stop_status (get_current_regcache ()->aspace (),

> 			      ecs->event_thread->stop_pc (),

> 			      ecs->event_thread, &ecs->ws);

> 

> if we are returned a gdb::optional<> then we might change the code to

> do this:

> 

>       ecs->event_thread->control.stop_bpstat

> 	= bpstat_stop_status (get_current_regcache ()->aspace (),

> 			      *ecs->event_thread->stop_pc (),

> 			      ecs->event_thread, &ecs->ws);

> 

> Or maybe, like this:

> 

>       auto stop_pc = ecs->event_thread->stop_pc ();

>       gdb_assert (stop_pc.has_value ());

>       ecs->event_thread->control.stop_bpstat

> 	= bpstat_stop_status (get_current_regcache ()->aspace (),

> 			      *stop_pc,

> 			      ecs->event_thread, &ecs->ws);

> 

> In the first case, it doesn't feel like we've gained much over my

> patch, where thead_info::stop_pc() accesses the value for us.

> Further, once we've normalised the pattern of accessing the stop_pc as

> `*ecs->event_thread->stop_pc ()`, I worry people still wouldn't

> actually consider whether the stop_pc value was valid or not, they'd

> just duplicate the existing code.

> 

> The second case seems excessively verbose, so much so, that you might

> even be tempted to write a wrapper, say thread_info::stop_pc_value(),

> which kind lands us back on my original patch...

> 

> Initially, I'd relied on the asserts within gdb::optional to ensure

> that we didn't access the stop_pc when it had no value, but these

> asserts are only present when compiling with _GLIBCXX_DEBUG defined -

> I do this, but it's certainly not going to be standard in a release

> build of GDB.  So, I wonder if this would be a good change:

> 

>   /* Return this thread's stop PC.  This should only be called when it is

>      known that stop_pc has a value.  If this function is being used in a

>      situation where a thread may not have had a stop_pc assigned, then

>      stop_pc_p() can be used to check if the stop_pc is defined.  */

> 

>   CORE_ADDR stop_pc () const

>   {

>     gdb_assert (m_suspend.stop_pc.has_value ());

>     return *m_suspend.stop_pc;

>   }


That one looks good to me, stating preconditions in the documentation
and enforcing them using assertions.  That's similar to how I designed
other things, like dynamic_prop.

Simon
Andrew Burgess Sept. 8, 2021, 9:50 a.m. | #4
* Simon Marchi <simon.marchi@polymtl.ca> [2021-09-07 10:10:07 -0400]:

> On 2021-09-07 9:21 a.m., Andrew Burgess wrote:

> > * Simon Marchi <simon.marchi@polymtl.ca> [2021-09-01 10:23:32 -0400]:

> > 

> >>

> >>

> >> On 2021-08-30 4:03 p.m., Andrew Burgess wrote:

> >>> Currently the stop_pc field of thread_suspect_state is a CORE_ADDR and

> >>> when we want to indicate that there is no stop_pc available we set

> >>> this field back to a special value.

> >>>

> >>> There are actually two special values used, in post_create_inferior

> >>> the stop_pc is set to 0.  This is a little unfortunate, there are

> >>> plenty of embedded targets where 0 is a valid pc address.  The more

> >>> common special value for stop_pc was set in

> >>> thread_info::set_executing, where the value (~(CORE_ADDR) 0) was used.

> >>>

> >>> This commit changes things so that the stop_pc is instead a

> >>> gdb::optional.  We can now explicitly reset the field to an

> >>> uninitialised state, we also have (when compiling with _GLIBCXX_DEBUG

> >>> defined) asserts that we don't read the stop_pc when its in an

> >>> uninitialised state (see gdbsupport/gdb_optional.h).

> >>

> >> Thanks, I think it's a good idea.

> >>

> >>> One situation where a thread will not have a stop_pc value is when the

> >>> thread is stopped as a consequence of GDB being in all stop mode, and

> >>> some other thread stopped at an interesting event.  When GDB brings

> >>> all the other threads to a stop those other threads will not have a

> >>> stop_pc set (thus avoiding an unnecessary read of $pc).

> >>>

> >>> Previously, when GDB passed through handle_one (in infrun.c) the

> >>> threads executing flag was set to false and the stop_pc field was left

> >>> unchanged, i.e. it would (previous) have been left as ~0.

> >>>

> >>> Now, handle_one leaves the stop_pc with no value.

> >>>

> >>> This caused a problem when we later try to set these threads running

> >>> again, in proceed() we compare the current pc with the cached

> >>> stop_pc.  If the thread was stopped in via handle_one then the stop_pc

> >>> would have been left as ~0, and the compare (in proceed)

> >>> would (likely) fail.  Now however, this compare tries to read the

> >>> stop_pc when it has no value, this would trigger an assert.

> >>>

> >>> To resolve this I've added thread_info::stop_pc_p() which returns true

> >>> if the thread has a cached stop_pc.  We should only ever call

> >>> thread_info::stop_pc() if we know that there is a cached stop_pc.

> >>

> >> We could also make stop_pc return gdb::optional<CORE_ADDR>.  I think it

> >> would be slightly better, since anybody calling stop_pc would see that

> >> it returns an optional and be forced to consider that.  Otherwise, one

> >> could call stop_pc and not know that stop_pc_p exists.  But otherwise

> >> it's the same.

> > 

> > I did consider that initially, but most of the places where

> > thread_info::stop_pc is called the value is being immediately passed

> > through to some other function, here's an example pulled randomly from

> > infrun.c:

> > 

> >       ecs->event_thread->control.stop_bpstat

> > 	= bpstat_stop_status (get_current_regcache ()->aspace (),

> > 			      ecs->event_thread->stop_pc (),

> > 			      ecs->event_thread, &ecs->ws);

> > 

> > if we are returned a gdb::optional<> then we might change the code to

> > do this:

> > 

> >       ecs->event_thread->control.stop_bpstat

> > 	= bpstat_stop_status (get_current_regcache ()->aspace (),

> > 			      *ecs->event_thread->stop_pc (),

> > 			      ecs->event_thread, &ecs->ws);

> > 

> > Or maybe, like this:

> > 

> >       auto stop_pc = ecs->event_thread->stop_pc ();

> >       gdb_assert (stop_pc.has_value ());

> >       ecs->event_thread->control.stop_bpstat

> > 	= bpstat_stop_status (get_current_regcache ()->aspace (),

> > 			      *stop_pc,

> > 			      ecs->event_thread, &ecs->ws);

> > 

> > In the first case, it doesn't feel like we've gained much over my

> > patch, where thead_info::stop_pc() accesses the value for us.

> > Further, once we've normalised the pattern of accessing the stop_pc as

> > `*ecs->event_thread->stop_pc ()`, I worry people still wouldn't

> > actually consider whether the stop_pc value was valid or not, they'd

> > just duplicate the existing code.

> > 

> > The second case seems excessively verbose, so much so, that you might

> > even be tempted to write a wrapper, say thread_info::stop_pc_value(),

> > which kind lands us back on my original patch...

> > 

> > Initially, I'd relied on the asserts within gdb::optional to ensure

> > that we didn't access the stop_pc when it had no value, but these

> > asserts are only present when compiling with _GLIBCXX_DEBUG defined -

> > I do this, but it's certainly not going to be standard in a release

> > build of GDB.  So, I wonder if this would be a good change:

> > 

> >   /* Return this thread's stop PC.  This should only be called when it is

> >      known that stop_pc has a value.  If this function is being used in a

> >      situation where a thread may not have had a stop_pc assigned, then

> >      stop_pc_p() can be used to check if the stop_pc is defined.  */

> > 

> >   CORE_ADDR stop_pc () const

> >   {

> >     gdb_assert (m_suspend.stop_pc.has_value ());

> >     return *m_suspend.stop_pc;

> >   }

> 

> That one looks good to me, stating preconditions in the documentation

> and enforcing them using assertions.  That's similar to how I designed

> other things, like dynamic_prop.

> 

> Simon


Thanks, I pushed this patch with that change in place.

Andrew

Patch

diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index 734fe3bce64..245445a859b 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -197,9 +197,12 @@  struct thread_suspend_state
        stop_reason: if the thread's PC has changed since the thread
        last stopped, a pending breakpoint waitstatus is discarded.
 
-     - If the thread is running, this is set to -1, to avoid leaving
-       it with a stale value, to make it easier to catch bugs.  */
-  CORE_ADDR stop_pc = 0;
+     - If the thread is running, then this field has its value removed by
+       calling stop_pc.reset() (see thread_info::set_executing()).
+       Attempting to read a gdb::optional with no value is undefined
+       behaviour and will trigger an assertion error when _GLIBCXX_DEBUG is
+       defined, which should make error easier to track down.  */
+  gdb::optional<CORE_ADDR> stop_pc;
 };
 
 /* Base class for target-specific thread data.  */
@@ -326,7 +329,7 @@  class thread_info : public refcounted_object,
 
   CORE_ADDR stop_pc () const
   {
-    return m_suspend.stop_pc;
+    return *m_suspend.stop_pc;
   }
 
   /* Set this thread's stop PC.  */
@@ -336,6 +339,21 @@  class thread_info : public refcounted_object,
     m_suspend.stop_pc = stop_pc;
   }
 
+  /* Remove the stop_pc stored on this thread.  */
+
+  void clear_stop_pc ()
+  {
+    m_suspend.stop_pc.reset ();
+  }
+
+  /* Return true if this thread has a cached stop pc value, otherwise
+     return false.  */
+
+  bool stop_pc_p () const
+  {
+    return m_suspend.stop_pc.has_value ();
+  }
+
   /* Return true if this thread has a pending wait status.  */
 
   bool has_pending_waitstatus () const
diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index e6ee49bc43d..d948f4bafc5 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -249,7 +249,7 @@  post_create_inferior (int from_tty)
      missing registers info), ignore it.  */
   thread_info *thr = inferior_thread ();
 
-  thr->set_stop_pc (0);
+  thr->clear_stop_pc ();
   try
     {
       regcache *rc = get_thread_regcache (thr);
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 11440894b15..5554523a049 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3051,7 +3051,8 @@  proceed (CORE_ADDR addr, enum gdb_signal siggnal)
 
   if (addr == (CORE_ADDR) -1)
     {
-      if (pc == cur_thr->stop_pc ()
+      if (cur_thr->stop_pc_p ()
+	  && pc == cur_thr->stop_pc ()
 	  && breakpoint_here_p (aspace, pc) == ordinary_breakpoint_here
 	  && execution_direction != EXEC_REVERSE)
 	/* There is a breakpoint at the address we will resume at,
diff --git a/gdb/thread.c b/gdb/thread.c
index c95a9186681..10c3dcd6991 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -324,7 +324,7 @@  thread_info::set_executing (bool executing)
 {
   m_executing = executing;
   if (executing)
-    this->set_stop_pc (~(CORE_ADDR) 0);
+    this->clear_stop_pc ();
 }
 
 /* See gdbthread.h.  */