[pushed] Don't touch frame_info objects if frame cache was reinitialized (was: Re: [PATCH 0/3] Fix crash if connection drops in scoped_restore_current_thread's ctor)

Message ID 8eb81bc8-4b7d-74d2-0658-36f637d1f74e@palves.net
State New
Headers show
Series
  • [pushed] Don't touch frame_info objects if frame cache was reinitialized (was: Re: [PATCH 0/3] Fix crash if connection drops in scoped_restore_current_thread's ctor)
Related show

Commit Message

Pedro Alves July 23, 2020, 3:28 p.m.
On 7/22/20 9:47 PM, Simon Marchi wrote:
> On 2020-07-22 4:37 p.m., Pedro Alves wrote:

>> I'm thinking that to fix this we will need a generation counter in

>> reinit_frame_cache.  Then in frame_unwind_try_unwinder, don't call

>> frame_cleanup_after_sniffer if the generation is not the same as it was

>> on entry.

>>

>> Something like this.  Does it fix it for you?  I can't seem to reproduce

>> the crash here.

> 

> I don't have time to try, but the approach makes sense.  It also crossed my mind,

> but I thought it would be more complicated to implement than that.

> 


Alright, I managed to reproduce it reliably with just:

 - not having debug info loaded to avoid the dwarf unwinder
 - killing gdbserver
 - backtrace

The tentative fix was incomplete, because the exception is rethrown, and
then caught by get_prev_frame_if_no_cycle which also touches frame_info
objects.  We just need to apply the same fix there.

Simon is away, so I'm going ahead assuming that this also fixes it for him.

Here's what I'm merging.

From 9552defd7ee0424165599b578dcd8ed71f0aef49 Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>

Date: Thu, 23 Jul 2020 16:19:38 +0100
Subject: [PATCH] Don't touch frame_info objects if frame cache was
 reinitialized

This fixes yet another bug exposed by ASAN + multi-target.exp

Running an Asan-enabled GDB against gdb.multi/multi-target.exp exposed
yet another latent GDB bug.  See here for the full log:

  https://sourceware.org/pipermail/gdb-patches/2020-July/170761.html

As Simon described, the problem is:

 - We create a new frame_info object in restore_selected_frame (by
   calling find_relative_frame)

 - The frame is allocated on the frame_cache_obstack

 - In frame_unwind_try_unwinder, we try to find an unwinder for that
   frame

 - While trying unwinders, memory read fails because the remote target
   closes, because of "monitor exit"

 - That calls reinit_frame_cache (as shown above), which resets
   frame_cache_obstack

 - When handling the exception in frame_unwind_try_unwinder, we try to
   set some things on the frame_info object (like *this_cache, which
   in fact tries to write into frame_info::prologue_cache), but the
   frame_info object is no more, it went away with the obstack.

Fix this by maintaining a frame cache generation counter.  Then in
exception handling code paths, don't touch frame objects if the
generation is not the same as it was on entry.

This commit generalizes the gdb.server/server-kill.exp testcase and
reuses it to test the scenario in question.  The new tests fail
without the GDB fix.

gdb/ChangeLog:

	* frame-unwind.c (frame_unwind_try_unwinder): On exception, don't
	touch THIS_CACHE/THIS_FRAME if the frame cache was cleared
	meanwhile.
	* frame.c (frame_cache_generation, get_frame_cache_generation):
	New.
	(reinit_frame_cache): Increment FRAME_CACHE_GENERATION.
	(get_prev_frame_if_no_cycle): On exception, don't touch
	PREV_FRAME/THIS_FRAME if the frame cache was cleared meanwhile.
	* frame.h (get_frame_cache_generation): Declare.

gdb/testsuite/ChangeLog:

	* gdb.server/server-kill.exp (prepare): New, factored out from the
	top level.
	(kill_server): New.
	(test_tstatus, test_unwind_nosyms, test_unwind_syms): New.
	(top level) : Call test_tstatus, test_unwind_nosyms, test_unwind_syms.
---
 gdb/frame-unwind.c                       |  13 +++-
 gdb/frame.c                              |  22 +++++-
 gdb/frame.h                              |   4 ++
 gdb/testsuite/gdb.server/server-kill.exp | 114 ++++++++++++++++++++++++-------
 4 files changed, 124 insertions(+), 29 deletions(-)


base-commit: 32fa152e3bfcf021ce49767be547fae5129d922b
-- 
2.14.5

Patch

diff --git a/gdb/frame-unwind.c b/gdb/frame-unwind.c
index 3334c472d02..064f6ebebda 100644
--- a/gdb/frame-unwind.c
+++ b/gdb/frame-unwind.c
@@ -121,6 +121,8 @@  frame_unwind_try_unwinder (struct frame_info *this_frame, void **this_cache,
 {
   int res = 0;
 
+  unsigned int entry_generation = get_frame_cache_generation ();
+
   frame_prepare_for_sniffer (this_frame, unwinder);
 
   try
@@ -130,9 +132,14 @@  frame_unwind_try_unwinder (struct frame_info *this_frame, void **this_cache,
   catch (const gdb_exception &ex)
     {
       /* Catch all exceptions, caused by either interrupt or error.
-	 Reset *THIS_CACHE.  */
-      *this_cache = NULL;
-      frame_cleanup_after_sniffer (this_frame);
+	 Reset *THIS_CACHE, unless something reinitialized the frame
+	 cache meanwhile, in which case THIS_FRAME/THIS_CACHE are now
+	 dangling.  */
+      if (get_frame_cache_generation () == entry_generation)
+	{
+	  *this_cache = NULL;
+	  frame_cleanup_after_sniffer (this_frame);
+	}
 
       if (ex.error == NOT_AVAILABLE_ERROR)
 	{
diff --git a/gdb/frame.c b/gdb/frame.c
index ac1016b083f..a3599e8af69 100644
--- a/gdb/frame.c
+++ b/gdb/frame.c
@@ -53,6 +53,17 @@ 
 
 static struct frame_info *sentinel_frame;
 
+/* Number of calls to reinit_frame_cache.  */
+static unsigned int frame_cache_generation = 0;
+
+/* See frame.h.  */
+
+unsigned int
+get_frame_cache_generation ()
+{
+  return frame_cache_generation;
+}
+
 /* The values behind the global "set backtrace ..." settings.  */
 set_backtrace_options user_set_backtrace_options;
 
@@ -1843,6 +1854,8 @@  reinit_frame_cache (void)
 {
   struct frame_info *fi;
 
+  ++frame_cache_generation;
+
   /* Tear down all frame caches.  */
   for (fi = sentinel_frame; fi != NULL; fi = fi->prev)
     {
@@ -1922,6 +1935,8 @@  get_prev_frame_if_no_cycle (struct frame_info *this_frame)
   if (prev_frame->level == 0)
     return prev_frame;
 
+  unsigned int entry_generation = get_frame_cache_generation ();
+
   try
     {
       compute_frame_id (prev_frame);
@@ -1944,8 +1959,11 @@  get_prev_frame_if_no_cycle (struct frame_info *this_frame)
     }
   catch (const gdb_exception &ex)
     {
-      prev_frame->next = NULL;
-      this_frame->prev = NULL;
+      if (get_frame_cache_generation () == entry_generation)
+	{
+	  prev_frame->next = NULL;
+	  this_frame->prev = NULL;
+	}
 
       throw;
     }
diff --git a/gdb/frame.h b/gdb/frame.h
index cfc15022ed5..8d029cc065d 100644
--- a/gdb/frame.h
+++ b/gdb/frame.h
@@ -949,6 +949,10 @@  extern const gdb::option::option_def set_backtrace_option_defs[2];
 /* The values behind the global "set backtrace ..." settings.  */
 extern set_backtrace_options user_set_backtrace_options;
 
+/* Get the number of calls to reinit_frame_cache.  */
+
+unsigned int get_frame_cache_generation ();
+
 /* Mark that the PC value is masked for the previous frame.  */
 
 extern void set_frame_previous_pc_masked (struct frame_info *frame);
diff --git a/gdb/testsuite/gdb.server/server-kill.exp b/gdb/testsuite/gdb.server/server-kill.exp
index 0072b28eb97..37b42460730 100644
--- a/gdb/testsuite/gdb.server/server-kill.exp
+++ b/gdb/testsuite/gdb.server/server-kill.exp
@@ -15,6 +15,9 @@ 
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
+# Check that GDB handles GDBserver disconnecting abruptly, in several
+# scenarios.
+
 load_lib gdbserver-support.exp
 
 standard_testfile
@@ -23,40 +26,103 @@  if {[skip_gdbserver_tests]} {
     return 0
 }
 
-if { [prepare_for_testing "failed to prepare" ${testfile}] } {
+if { [build_executable "failed to prepare" ${testfile}] } {
     return -1
 }
 
-# Make sure we're disconnected, in case we're testing with an
-# extended-remote board, therefore already connected.
-gdb_test "disconnect" ".*"
+# Spawn GDBserver, run to main, extract GDBserver's PID and save it in
+# the SERVER_PID global.
+
+proc prepare {} {
+    global binfile gdb_prompt srcfile decimal
+    global server_pid
+
+    clean_restart $binfile
+
+    # Make sure we're disconnected, in case we're testing with an
+    # extended-remote board, therefore already connected.
+    gdb_test "disconnect" ".*"
 
-gdbserver_run ""
+    gdbserver_run ""
 
-# Continue past server_pid assignment.
-gdb_breakpoint ${srcfile}:[gdb_get_line_number "i = 0;"]
-gdb_continue_to_breakpoint "after server_pid assignment"
+    # Continue past server_pid assignment.
+    gdb_breakpoint ${srcfile}:[gdb_get_line_number "i = 0;"]
+    gdb_continue_to_breakpoint "after server_pid assignment"
 
-# Get the pid of GDBServer.
-set test "p server_pid"
-gdb_test_multiple $test $test {
-    -re " = ($decimal)\r\n$gdb_prompt $" {
-	set server_pid $expect_out(1,string)
-	pass $test
+    # Get the pid of GDBServer.
+    set test "p server_pid"
+    set server_pid 0
+    gdb_test_multiple $test $test {
+	-re " = ($decimal)\r\n$gdb_prompt $" {
+	    set server_pid $expect_out(1,string)
+	    pass $test
+	}
     }
+
+    if {$server_pid == 0} {
+	return 0
+    }
+
+    return 1
 }
 
-if ![info exists server_pid] {
-    return -1
+# Kill GDBserver using the PID saved by prepare.
+
+proc kill_server {} {
+    global server_pid
+
+    remote_exec target "kill -9 $server_pid"
+}
+
+# Test issuing "tstatus" right after the connection is dropped.
+
+proc_with_prefix test_tstatus {} {
+    if ![prepare] {
+	return
+    }
+
+    kill_server
+
+    # Enable trace status packet which is disabled after the
+    # connection if the remote target doesn't support tracepoint at
+    # all.  Otherwise, no RSP packet is sent out.
+    gdb_test_no_output "set remote trace-status-packet on"
+
+    # Force GDB to talk with GDBserver, so that we can get the
+    # "connection closed" error.
+    gdb_test "tstatus" {Remote connection closed|Remote communication error\.  Target disconnected\.: Connection reset by peer\.}
+}
+
+# Test unwinding with no debug/unwind info, right after the connection
+# is dropped.
+
+proc_with_prefix test_unwind_nosyms {} {
+    if ![prepare] {
+	return
+    }
+
+    # Remove symbols, so that we try to unwind with one of the
+    # heuristic unwinders, and read memory from within its sniffer.
+    gdb_unload
+
+    kill_server
+
+    gdb_test "bt" "(Target disconnected|Remote connection closed|Remote communication error).*"
 }
 
-remote_exec target "kill -9 $server_pid"
+# Test unwinding with debug/unwind info, right after the connection is
+# dropped.
 
-# Enable trace status packet which is disabled after the connection
-# if the remote target doesn't support tracepoint at all.  Otherwise,
-# no RSP packet is sent out.
-gdb_test_no_output "set remote trace-status-packet on"
+proc_with_prefix test_unwind_syms {} {
+    if ![prepare] {
+	return
+    }
+
+    kill_server
+
+    gdb_test "bt" "(Target disconnected|Remote connection closed|Remote communication error).*"
+}
 
-# Force GDB to talk with GDBserver, so that we can get the
-# "connection closed" error.
-gdb_test "tstatus" {Remote connection closed|Remote communication error\.  Target disconnected\.: Connection reset by peer\.}
+test_tstatus
+test_unwind_nosyms
+test_unwind_syms