Fix gdb.base/fork-running-state.exp race

Message ID 20180328150630.18319-1-palves@redhat.com
State New
Headers show
Series
  • Fix gdb.base/fork-running-state.exp race
Related show

Commit Message

Pedro Alves March 28, 2018, 3:06 p.m.
On my multi-target branch I was occasionaly seeing a FAIL like this:

  (gdb) PASS: gdb.base/fork-running-state.exp: detach-on-fork=off: follow-fork=parent: non-stop: kill parent
  [Inferior 2 (process 32672) exited normally]
  kill inferior 2
  warning: Inferior ID 2 is not running.
  (gdb) FAIL: gdb.base/fork-running-state.exp: detach-on-fork=off: follow-fork=parent: non-stop: kill child (the program exited)
  ... other similar fails ...

Turns out to be a testcase bug/race.  A tweak like this increases the
changes of hitting the race substancially:

  --- a/gdb/testsuite/gdb.base/fork-running-state.c
  +++ b/gdb/testsuite/gdb.base/fork-running-state.c
  @@ -29,7 +29,7 @@ fork_child (void)
   {
     while (1)
       {
  -      sleep (1);
  +      usleep (100);


The testcase has two processes, parent and child fork.  The problem is
that the child exits itself if it notices the parent is gone, but the
testcase .exp does not expect that.

I first wrote a patch that handled the different combinations of
non-stop/detach-on-fork/follow-fork/schedule-multiple, making the .exp
file know when to expect the child to exit itself vs when to kill it
explicitly, but the result was that the code to kill the parent and
child was getting about as large as the test code that is the actual
point of the testcase, above the kills.

So I scratched that approach and came up with a simpler patch --
simply make the child not exit itself when the parent exits.

The .exp file is going to kill both parent and child explicitly, and,
main() already calls alarm() as a safeguard.  I don't think we lose
anything.

gdb/testsuite/ChangeLog:
yyyy-mm-dd  Pedro Alves  <palves@redhat.com>

	* gdb.base/fork-running-state.c (fork_child): Don't exit if parent
	exits.  Instead loop running forever.
	(fork_parent): Run forever too.
---
 gdb/testsuite/gdb.base/fork-running-state.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

-- 
2.14.3

Comments

Simon Marchi March 30, 2018, 6:23 a.m. | #1
On 2018-03-28 11:06, Pedro Alves wrote:
> On my multi-target branch I was occasionaly seeing a FAIL like this:

> 

>   (gdb) PASS: gdb.base/fork-running-state.exp: detach-on-fork=off:

> follow-fork=parent: non-stop: kill parent

>   [Inferior 2 (process 32672) exited normally]

>   kill inferior 2

>   warning: Inferior ID 2 is not running.

>   (gdb) FAIL: gdb.base/fork-running-state.exp: detach-on-fork=off:

> follow-fork=parent: non-stop: kill child (the program exited)

>   ... other similar fails ...

> 

> Turns out to be a testcase bug/race.  A tweak like this increases the

> changes of hitting the race substancially:

> 

>   --- a/gdb/testsuite/gdb.base/fork-running-state.c

>   +++ b/gdb/testsuite/gdb.base/fork-running-state.c

>   @@ -29,7 +29,7 @@ fork_child (void)

>    {

>      while (1)

>        {

>   -      sleep (1);

>   +      usleep (100);

> 

> 

> The testcase has two processes, parent and child fork.  The problem is

> that the child exits itself if it notices the parent is gone, but the

> testcase .exp does not expect that.

> 

> I first wrote a patch that handled the different combinations of

> non-stop/detach-on-fork/follow-fork/schedule-multiple, making the .exp

> file know when to expect the child to exit itself vs when to kill it

> explicitly, but the result was that the code to kill the parent and

> child was getting about as large as the test code that is the actual

> point of the testcase, above the kills.

> 

> So I scratched that approach and came up with a simpler patch --

> simply make the child not exit itself when the parent exits.

> 

> The .exp file is going to kill both parent and child explicitly, and,

> main() already calls alarm() as a safeguard.  I don't think we lose

> anything.


Does the parent exit as part of the test, or only when we kill it at the 
end when we clean up?

If I understand correctly, we kill the parent, and by the time we want 
to kill the child, it has already noticed the parent is gone and has 
itself exited, is that right?  In that case I think it makes sense to 
have only one way of cleaning up, either we kill the process or we let 
it exit, not both.  So the patch LGTM.

Simon
Pedro Alves April 10, 2018, 2:05 p.m. | #2
On 03/30/2018 07:23 AM, Simon Marchi wrote:
> On 2018-03-28 11:06, Pedro Alves wrote:


>> So I scratched that approach and came up with a simpler patch --

>> simply make the child not exit itself when the parent exits.

>>

>> The .exp file is going to kill both parent and child explicitly, and,

>> main() already calls alarm() as a safeguard.  I don't think we lose

>> anything.

> 

> Does the parent exit as part of the test, or only when we kill it at the end when we clean up?


It's not part of the test, the parent only dies when we kill it for clean up.

> 

> If I understand correctly, we kill the parent, and by the time we want to kill the child, it has already noticed the parent is gone and has itself exited, is that right?


Yup, that's right.

  In that case I think it makes sense to have only one way of cleaning up, either we kill the process or we let it exit, not both.  So the patch LGTM.

Thanks, I've pushed it in now.

Pedro Alves

Patch

diff --git a/gdb/testsuite/gdb.base/fork-running-state.c b/gdb/testsuite/gdb.base/fork-running-state.c
index 233b515831..8ea4739609 100644
--- a/gdb/testsuite/gdb.base/fork-running-state.c
+++ b/gdb/testsuite/gdb.base/fork-running-state.c
@@ -28,30 +28,18 @@  static int
 fork_child (void)
 {
   while (1)
-    {
-      sleep (1);
-
-      /* Exit if GDB kills the parent.  */
-      if (getppid () != save_parent)
-	break;
-      if (kill (getppid (), 0) != 0)
-	break;
-    }
+    pause ();
 
   return 0;
 }
 
-/* The fork parent.  Just runs forever waiting for the child to
-   exit.  */
+/* The fork parent.  Just runs forever.  */
 
 static int
 fork_parent (void)
 {
-  if (wait (NULL) == -1)
-    {
-      perror ("wait");
-      return 1;
-    }
+  while (1)
+    pause ();
 
   return 0;
 }