[6/6] gdb: don't share aspace/pspace on fork with "detach-on-fork on" and "follow-fork-mode child"

Message ID 20210910205402.3853607-6-simon.marchi@efficios.com
State New
Headers show
Series
  • [1/6] gdb.base/foll-fork.exp: remove DUPLICATEs
Related show

Commit Message

Lancelot SIX via Gdb-patches Sept. 10, 2021, 8:54 p.m.
We found that when handling forks, two inferiors can unexpectedly share
their program space and address space.  To reproduce:

 1. Using a test program that forks...
 2. "set follow-fork-mode child"
 3. "set detach-on-fork on" (the default)
 4. run to a breakpoint somewhere after the fork

Step 4 should have created a new inferior:

    (gdb) info inferiors
      Num  Description       Connection           Executable
      1    <null>                                 /home/smarchi/build/wt/amd/gdb/fork
    * 2    process 251425    1 (native)           /home/smarchi/build/wt/amd/gdb/fork

By inspecting the state of GDB, we can see that the two inferiors now
share one program space and one address space:

Inferior 1:

    (top-gdb) p inferior_list.m_front.num
    $2 = 1
    (top-gdb) p inferior_list.m_front.aspace
    $3 = (struct address_space *) 0x5595e2520400
    (top-gdb) p inferior_list.m_front.pspace
    $4 = (struct program_space *) 0x5595e2520440

Inferior 2:

    (top-gdb) p inferior_list.m_front.next.num
    $5 = 2
    (top-gdb) p inferior_list.m_front.next.aspace
    $6 = (struct address_space *) 0x5595e2520400
    (top-gdb) p inferior_list.m_front.next.pspace
    $7 = (struct program_space *) 0x5595e2520440

You can then run inferior 1 again and the two inferiors will still
erroneously share their spaces, but already at this point this is wrong.

The cause of the bad {a,p}space sharing is in follow_fork_inferior.
When following the child and detaching from the parent, we just re-use
the parent's spaces, rather than cloning them.  When we switch back to
inferior 1 and run again, we find ourselves with two unrelated inferiors
sharing spaces.

Fix that by creating new spaces for the parent after having moved them
to the child.  My initial implementation created new spaces for the
child instead.  Doing this breaks doing "next" over fork().  When "next"
start, we record the symtab of the starting location.  When the program
stops, we compare that symtab with the symtab the program has stopped
at.  If the symtab or the line number has changed, we conclude the
"next" is done.  If we create a new program space for the child and copy
the parent's program space to it with clone_program_space, it creates
new symtabs for the child as well.  When the child stop, but still on
the fork() line, GDB thinks the "next" is done because the symtab
pointers no longer match.  In reality they are two symtab instances that
represent the same file.  But moving the spaces to the child and
creating new spaces for the parent, we avoid this problem.

Note that the problem described above happens today with "detach-on-fork
off" and "follow-fork-mode child", because we create new spaces for the
child.  This will have to be addressed later.

Test-wise, improve gdb.base/foll-fork.exp to set a breakpoint that is
expected to have a location in each inferiors.  Without the fix, when
the two inferiors erroneously share a program space, GDB reports a
single location.

Change-Id: Ifea76e14f87b9f7321fc3a766217061190e71c6e
---
 gdb/infrun.c                         | 39 +++++++++++++++++++++-------
 gdb/testsuite/gdb.base/foll-fork.exp | 18 +++++++++++++
 2 files changed, 48 insertions(+), 9 deletions(-)

-- 
2.33.0

Comments

John Baldwin Sept. 10, 2021, 11:33 p.m. | #1
On 9/10/21 1:54 PM, Simon Marchi via Gdb-patches wrote:
> We found that when handling forks, two inferiors can unexpectedly share

> their program space and address space.  To reproduce:

> 

>   1. Using a test program that forks...

>   2. "set follow-fork-mode child"

>   3. "set detach-on-fork on" (the default)

>   4. run to a breakpoint somewhere after the fork

> 

> Step 4 should have created a new inferior:

> 

>      (gdb) info inferiors

>        Num  Description       Connection           Executable

>        1    <null>                                 /home/smarchi/build/wt/amd/gdb/fork

>      * 2    process 251425    1 (native)           /home/smarchi/build/wt/amd/gdb/fork

> 

> By inspecting the state of GDB, we can see that the two inferiors now

> share one program space and one address space:

> 

> Inferior 1:

> 

>      (top-gdb) p inferior_list.m_front.num

>      $2 = 1

>      (top-gdb) p inferior_list.m_front.aspace

>      $3 = (struct address_space *) 0x5595e2520400

>      (top-gdb) p inferior_list.m_front.pspace

>      $4 = (struct program_space *) 0x5595e2520440

> 

> Inferior 2:

> 

>      (top-gdb) p inferior_list.m_front.next.num

>      $5 = 2

>      (top-gdb) p inferior_list.m_front.next.aspace

>      $6 = (struct address_space *) 0x5595e2520400

>      (top-gdb) p inferior_list.m_front.next.pspace

>      $7 = (struct program_space *) 0x5595e2520440

> 

> You can then run inferior 1 again and the two inferiors will still

> erroneously share their spaces, but already at this point this is wrong.

> 

> The cause of the bad {a,p}space sharing is in follow_fork_inferior.

> When following the child and detaching from the parent, we just re-use

> the parent's spaces, rather than cloning them.  When we switch back to

> inferior 1 and run again, we find ourselves with two unrelated inferiors

> sharing spaces.

> 

> Fix that by creating new spaces for the parent after having moved them

> to the child.  My initial implementation created new spaces for the

> child instead.  Doing this breaks doing "next" over fork().  When "next"

> start, we record the symtab of the starting location.  When the program

> stops, we compare that symtab with the symtab the program has stopped

> at.  If the symtab or the line number has changed, we conclude the

> "next" is done.  If we create a new program space for the child and copy

> the parent's program space to it with clone_program_space, it creates

> new symtabs for the child as well.  When the child stop, but still on

> the fork() line, GDB thinks the "next" is done because the symtab

> pointers no longer match.  In reality they are two symtab instances that

> represent the same file.  But moving the spaces to the child and

> creating new spaces for the parent, we avoid this problem.

> 

> Note that the problem described above happens today with "detach-on-fork

> off" and "follow-fork-mode child", because we create new spaces for the

> child.  This will have to be addressed later.

> 

> Test-wise, improve gdb.base/foll-fork.exp to set a breakpoint that is

> expected to have a location in each inferiors.  Without the fix, when

> the two inferiors erroneously share a program space, GDB reports a

> single location.


So I wonder about the case where follow-fork-mode is parent and
detach-on-fork is off?  In that case, should the existing aspace/pspace
stay with the parent and the child get clones?  That is, using the
follow-fork-mode setting to determine which inferior gets the existing
aspace/pspace and assigning the cloned copies to the !follow-fork-mode
inferior?

-- 
John Baldwin
Lancelot SIX via Gdb-patches Sept. 11, 2021, 3:16 a.m. | #2
>> Note that the problem described above happens today with "detach-on-fork

>> off" and "follow-fork-mode child", because we create new spaces for the

>> child.  This will have to be addressed later.

>>

>> Test-wise, improve gdb.base/foll-fork.exp to set a breakpoint that is

>> expected to have a location in each inferiors.  Without the fix, when

>> the two inferiors erroneously share a program space, GDB reports a

>> single location.

> 

> So I wonder about the case where follow-fork-mode is parent and

> detach-on-fork is off?  In that case, should the existing aspace/pspace

> stay with the parent and the child get clones?  That is, using the

> follow-fork-mode setting to determine which inferior gets the existing

> aspace/pspace and assigning the cloned copies to the !follow-fork-mode

> inferior?


I think that would work, to address the problem described above.

Simon
Lancelot SIX via Gdb-patches Sept. 11, 2021, 1:02 p.m. | #3
On 2021-09-10 11:16 p.m., Simon Marchi wrote:
>>> Note that the problem described above happens today with "detach-on-fork

>>> off" and "follow-fork-mode child", because we create new spaces for the

>>> child.  This will have to be addressed later.

>>>

>>> Test-wise, improve gdb.base/foll-fork.exp to set a breakpoint that is

>>> expected to have a location in each inferiors.  Without the fix, when

>>> the two inferiors erroneously share a program space, GDB reports a

>>> single location.

>>

>> So I wonder about the case where follow-fork-mode is parent and

>> detach-on-fork is off?  In that case, should the existing aspace/pspace

>> stay with the parent and the child get clones?  That is, using the

>> follow-fork-mode setting to determine which inferior gets the existing

>> aspace/pspace and assigning the cloned copies to the !follow-fork-mode

>> inferior?

> 

> I think that would work, to address the problem described above.

> 

> Simon

> 


Btw, I'm off until the 20th.

Simon
Lancelot SIX via Gdb-patches Sept. 11, 2021, 1:03 p.m. | #4
On 2021-09-11 9:02 a.m., Simon Marchi via Gdb-patches wrote:
> 

> 

> On 2021-09-10 11:16 p.m., Simon Marchi wrote:

>>>> Note that the problem described above happens today with "detach-on-fork

>>>> off" and "follow-fork-mode child", because we create new spaces for the

>>>> child.  This will have to be addressed later.

>>>>

>>>> Test-wise, improve gdb.base/foll-fork.exp to set a breakpoint that is

>>>> expected to have a location in each inferiors.  Without the fix, when

>>>> the two inferiors erroneously share a program space, GDB reports a

>>>> single location.

>>>

>>> So I wonder about the case where follow-fork-mode is parent and

>>> detach-on-fork is off?  In that case, should the existing aspace/pspace

>>> stay with the parent and the child get clones?  That is, using the

>>> follow-fork-mode setting to determine which inferior gets the existing

>>> aspace/pspace and assigning the cloned copies to the !follow-fork-mode

>>> inferior?

>>

>> I think that would work, to address the problem described above.

>>

>> Simon

>>

> 

> Btw, I'm off until the 20th.

> 

> Simon

> 


Haha oops, replied to the wrong message.  Now everybody knows I guess :).

Simon

Patch

diff --git a/gdb/infrun.c b/gdb/infrun.c
index d1ac9b4cbbb..b80884322fa 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -539,25 +539,46 @@  holding the child stopped.  Try \"set detach-on-fork\" or \
       child_inf->gdbarch = parent_inf->gdbarch;
       copy_inferior_target_desc_info (child_inf, parent_inf);
 
-      program_space *parent_pspace = parent_inf->pspace;
-
-      /* If this is a vfork child, then the address-space is shared
-	 with the parent.  If we detached from the parent, then we can
-	 reuse the parent's program/address spaces.  */
-      if (has_vforked || detach_fork)
+      if (has_vforked)
 	{
-	  child_inf->pspace = parent_pspace;
-	  child_inf->aspace = child_inf->pspace->aspace;
+	  /* If this is a vfork child, then the address-space is shared
+	     with the parent.  */
+	  child_inf->aspace = parent_inf->aspace;
+	  child_inf->pspace = parent_inf->pspace;
 
 	  exec_on_vfork (child_inf);
 	}
+      else if (detach_fork)
+	{
+	  /* We follow the child and detach from the parent: move the parent's
+	     program space to the child.  This simplifies some things, like
+	     doing "next" over fork() and landing on the expected line in the
+	     child (note, that is broken with "set detach-on-fork off").
+
+	     Before assigning brand new spaces for the parent, remove
+	     breakpoints from it: because the new pspace won't match
+	     currently inserted locations, the normal detach procedure
+	     wouldn't remove them, and we would leave them inserted when
+	     detaching.  */
+	  remove_breakpoints_inf (parent_inf);
+
+	  child_inf->aspace = parent_inf->aspace;
+	  child_inf->pspace = parent_inf->pspace;
+	  parent_inf->aspace = new_address_space ();
+	  parent_inf->pspace = new program_space (parent_inf->aspace);
+	  clone_program_space (parent_inf->pspace, child_inf->pspace);
+
+	  /* The parent inferior is still the current one, so keep things
+	     in sync.  */
+	  set_current_program_space (parent_inf->pspace);
+	}
       else
 	{
 	  child_inf->aspace = new_address_space ();
 	  child_inf->pspace = new program_space (child_inf->aspace);
 	  child_inf->removable = 1;
 	  child_inf->symfile_flags = SYMFILE_NO_READ;
-	  clone_program_space (child_inf->pspace, parent_pspace);
+	  clone_program_space (child_inf->pspace, parent_inf->pspace);
 	}
     }
 
diff --git a/gdb/testsuite/gdb.base/foll-fork.exp b/gdb/testsuite/gdb.base/foll-fork.exp
index 3a0cc2fe456..7f9e1cf87c6 100644
--- a/gdb/testsuite/gdb.base/foll-fork.exp
+++ b/gdb/testsuite/gdb.base/foll-fork.exp
@@ -181,6 +181,24 @@  proc_with_prefix test_follow_fork { follow-fork-mode detach-on-fork cmd } {
 		".* set breakpoint here.*"
 	}
     }
+
+    # If we end up with two inferiors, verify that they each end up with their
+    # own program space.  Do this by setting a breakpoint, if we see two
+    # locations it means there are two program spaces.
+    if {${detach-on-fork} == "off" || ${follow-fork-mode} == "child"} {
+	set bpnum "<unset>"
+	gdb_test_multiple "break callee" "break callee" {
+	    -re -wrap "Breakpoint ($::decimal) at $::hex: callee\\. \\(2 locations\\)" {
+		set bpnum $expect_out(1,string)
+		pass $gdb_test_name
+	    }
+	}
+	gdb_test "info breakpoints $bpnum" \
+	    [multi_line \
+		"$bpnum\\.1 .* inf 1" \
+		"$bpnum\\.2 .* inf 2"] \
+	    "info breakpoints"
+    }
 }
 
 set reading_in_symbols_re {(?:\r\nReading in symbols for [^\r\n]*)?}