[v4,01/21] nptl: Do not close the pipe on tst-cancel{2,3}

Message ID 20200403203201.7494-2-adhemerval.zanella@linaro.org
State New
Headers show
Series
  • nptl: Fix Race conditions in pthread cancellation [BZ#12683]
Related show

Commit Message

Florian Weimer via Libc-alpha April 3, 2020, 8:31 p.m.
This can cause a SIGPIPE before SIGCANCEL is processed, which makes
write fail and the thread return an non expected result.

Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
---
 nptl/tst-cancel2.c | 3 ---
 nptl/tst-cancel3.c | 3 ---
 2 files changed, 6 deletions(-)

-- 
2.17.1

Comments

Zack Weinberg April 7, 2020, 3:24 p.m. | #1
On Fri, Apr 3, 2020 at 4:32 PM Adhemerval Zanella via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>

> This can cause a SIGPIPE before SIGCANCEL is processed, which makes

> write fail and the thread return an non expected result.


1) This is a sensible explanation for tst-cancel2.c, but tst-cancel3.c
does a read, not a write; the commit message should explain why this
is an appropriate change for both files.  Something like

# These tests cancel a thread that's supposed to be blocked while
writing or reading from a pipe.  If we close the other end of the
pipe, the closure might be reported to the thread before the
cancellation, causing a spurious failure.

2) The intention of the closes, seems to have been to prevent these
test cases blocking forever in pthread_join if the cancel isn't
delivered.  We don't actually need to do that, because the 20-second
timeout built into the test harness will suffice to make the test fail
in that case, but it might be worth adding a comment somewhere,
explaining that we're relying on the timeout.

3) I think it's no longer necessary to ignore SIGPIPE after this
change, and I think it's no longer necessary to have a loop in 'tf',
please remove that code also.

4) There's still a race in these tests; the cancellation might or
might not yet be pending when the child thread calls read/write.
Abstractly, we should be testing both a cancellation that's already
pending when we reach the cancellation point, and a cancellation
that's delivered while the thread is blocked on I/O.  It's possible to
ensure that the cancellation is already pending with a mutex, e.g.

static pthread_mutex_t prep_gate = PTHREAD_MUTEX_INITIALIZER;
static int fd[2];

static void *
tf (void *arg)
{
  /* The buffer size must be larger than the pipe size so that the
     write blocks.  */
  char buf[100000];

  /* Once we acquire this mutex, a cancellation request will be
     pending for this thread.  (pthread_mutex_(un)lock are not
     cancellation points.)  */
  pthread_mutex_lock (&prep_gate);
  pthread_mutex_unlock (&prep_gate);

  /* This write operation should be immediately cancelled.  */
  write (fd[1], buf, sizeof (buf));

  /* If control reaches this point, the test has failed;
     the parent will detect this.  */
  return arg;
}

static int
do_test (void)
{
  pthread_t th;
  void *r;

  if (pipe (fd) != 0)
    {
      puts ("pipe failed");
      return 1;
    }

  /* The child thread will wait to acquire this mutex.  */
  pthread_mutex_lock (&prep_gate);

  if (pthread_create (&th, NULL, tf, NULL) != 0)
    {
      puts ("create failed");
      return 1;
    }

  if (pthread_cancel (th) != 0)
    {
      puts ("cancel failed");
      return 1;
    }

  pthread_mutex_unlock (&prep_gate);

  // ...
}

But I don't know a good way to _guarantee_ that 'tf' is blocked on
read/write before the parent calls pthread_cancel.   Can you think of
anything?

zw
Florian Weimer via Libc-alpha April 7, 2020, 8:07 p.m. | #2
On 07/04/2020 12:24, Zack Weinberg wrote:
> On Fri, Apr 3, 2020 at 4:32 PM Adhemerval Zanella via Libc-alpha

> <libc-alpha@sourceware.org> wrote:

>>

>> This can cause a SIGPIPE before SIGCANCEL is processed, which makes

>> write fail and the thread return an non expected result.

> 

> 1) This is a sensible explanation for tst-cancel2.c, but tst-cancel3.c

> does a read, not a write; the commit message should explain why this

> is an appropriate change for both files.  Something like

> 

> # These tests cancel a thread that's supposed to be blocked while

> writing or reading from a pipe.  If we close the other end of the

> pipe, the closure might be reported to the thread before the

> cancellation, causing a spurious failure.


Either a read or write exercises the same issue, but I agree that
the commit message could be more comprehensive. I have added your
suggestion.

> 

> 2) The intention of the closes, seems to have been to prevent these

> test cases blocking forever in pthread_join if the cancel isn't

> delivered.  We don't actually need to do that, because the 20-second

> timeout built into the test harness will suffice to make the test fail

> in that case, but it might be worth adding a comment somewhere,

> explaining that we're relying on the timeout.


I am not sure if the close is indeed to prevent it, if the 
pthread_cancel does not cancel the thread the test will be stuck
regardless.

> 

> 3) I think it's no longer necessary to ignore SIGPIPE after this

> change, and I think it's no longer necessary to have a loop in 'tf',

> please remove that code also.


Ack.

> 

> 4) There's still a race in these tests; the cancellation might or

> might not yet be pending when the child thread calls read/write.

> Abstractly, we should be testing both a cancellation that's already

> pending when we reach the cancellation point, and a cancellation

> that's delivered while the thread is blocked on I/O.  It's possible to

> ensure that the cancellation is already pending with a mutex, e.g.


Indeed it is not intent of the patch to actually fix this race,
but I also agree that this is an opportunity to do so.

> 

> static pthread_mutex_t prep_gate = PTHREAD_MUTEX_INITIALIZER;

> static int fd[2];

> 

> static void *

> tf (void *arg)

> {

>   /* The buffer size must be larger than the pipe size so that the

>      write blocks.  */

>   char buf[100000];

> 

>   /* Once we acquire this mutex, a cancellation request will be

>      pending for this thread.  (pthread_mutex_(un)lock are not

>      cancellation points.)  */

>   pthread_mutex_lock (&prep_gate);

>   pthread_mutex_unlock (&prep_gate);

> 

>   /* This write operation should be immediately cancelled.  */

>   write (fd[1], buf, sizeof (buf));

> 

>   /* If control reaches this point, the test has failed;

>      the parent will detect this.  */

>   return arg;

> }

> 

> static int

> do_test (void)

> {

>   pthread_t th;

>   void *r;

> 

>   if (pipe (fd) != 0)

>     {

>       puts ("pipe failed");

>       return 1;

>     }

> 

>   /* The child thread will wait to acquire this mutex.  */

>   pthread_mutex_lock (&prep_gate);

> 

>   if (pthread_create (&th, NULL, tf, NULL) != 0)

>     {

>       puts ("create failed");

>       return 1;

>     }

> 

>   if (pthread_cancel (th) != 0)

>     {

>       puts ("cancel failed");

>       return 1;

>     }

> 

>   pthread_mutex_unlock (&prep_gate);

> 

>   // ...

> }

> 

> But I don't know a good way to _guarantee_ that 'tf' is blocked on

> read/write before the parent calls pthread_cancel.   Can you think of

> anything?

> 

> zw


I think we cance use the support_process_state_wait on the thread ID 
and check for support_process_state_sleeping (meaning that the process is
blocked on the syscall).

For the write operation with new cancellation it might require a second
write to actually acts on the pending cancellation (since kernel might
write some partial data on internal pipe buffer since it is larger 
than PIPE_BUF).

Update patch based in your comments below.

--

nptl: Do not close the pipe on tst-cancel{2,3}

These tests cancel a thread that is supposed to be blocked while
writing or reading from a pipe.  If we close the other end of the
pipe, the closure might be reported to the thread before the
cancellation, causing a spurious failure.

For the write operation, pthread_cancel might not act on the call
since kernel might write some partial data on pipe internal buffer.
This requires a second write call to act on pending cancellation.

Also, to avoid a race between pthread_cancel and the read/write
operation the master thread wait the target thread reach a sleeping
state prior starting the cancellation.

Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
---
 nptl/tst-cancel2.c | 89 ++++++++++++++++++++--------------------------
 nptl/tst-cancel3.c | 83 +++++++++++++++---------------------------
 2 files changed, 68 insertions(+), 104 deletions(-)

diff --git a/nptl/tst-cancel2.c b/nptl/tst-cancel2.c
index 1e86711596..860085c1f9 100644
--- a/nptl/tst-cancel2.c
+++ b/nptl/tst-cancel2.c
@@ -18,21 +18,39 @@
 
 #include <pthread.h>
 #include <signal.h>
-#include <stdio.h>
-#include <unistd.h>
+#include <fcntl.h>
 
+#include <support/xunistd.h>
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/process_state.h>
 
+
+/* The pipe will be used pass the thread TID to master thread.  */
+static int tidfd[2];
+/* The pipe will be used to check the cancellation.  */
 static int fd[2];
 
 
 static void *
 tf (void *arg)
 {
+  int pipesz = fcntl(fd[1], F_GETPIPE_SZ);
+  TEST_VERIFY (pipesz != -1);
+
+  pid_t tid = gettid ();
+  TEST_COMPARE (write (tidfd[1], &tid, sizeof (tid)), sizeof (tid));
+
   /* The buffer size must be larger than the pipe size so that the
      write blocks.  */
-  char buf[100000];
+  char buf[pipesz + 1];
 
-  while (write (fd[1], buf, sizeof (buf)) > 0);
+  /* When cancellation acts, first write might return if kernel fill
+     the socket buffer with partial data.  */
+  TEST_COMPARE (write (fd[1], buf, sizeof (buf)), pipesz);
+
+  /* This will act on the pending cancellation.  */
+  write (fd[1], buf, sizeof (buf));
 
   return (void *) 42l;
 }
@@ -43,53 +61,24 @@ do_test (void)
 {
   pthread_t th;
   void *r;
-  struct sigaction sa;
-
-  sa.sa_handler = SIG_IGN;
-  sigemptyset (&sa.sa_mask);
-  sa.sa_flags = 0;
-
-  if (sigaction (SIGPIPE, &sa, NULL) != 0)
-    {
-      puts ("sigaction failed");
-      return 1;
-    }
-
-  if (pipe (fd) != 0)
-    {
-      puts ("pipe failed");
-      return 1;
-    }
-
-  if (pthread_create (&th, NULL, tf, NULL) != 0)
-    {
-      puts ("create failed");
-      return 1;
-    }
-
-  if (pthread_cancel (th) != 0)
-    {
-      puts ("cancel failed");
-      return 1;
-    }
-
-  /* This will cause the write in the child to return.  */
-  close (fd[0]);
-
-  if (pthread_join (th, &r) != 0)
-    {
-      puts ("join failed");
-      return 1;
-    }
-
-  if (r != PTHREAD_CANCELED)
-    {
-      printf ("result is wrong: expected %p, got %p\n", PTHREAD_CANCELED, r);
-      return 1;
-    }
+
+  xpipe (tidfd);
+  xpipe (fd);
+
+  th = xpthread_create (NULL, tf, NULL);
+
+  pid_t tid;
+  TEST_COMPARE (read (tidfd[0], &tid, sizeof (tid)), sizeof (tid));
+
+  support_process_state_wait (tid, support_process_state_sleeping);
+
+  xpthread_cancel (th);
+
+  r = xpthread_join (th);
+
+  TEST_VERIFY (r == PTHREAD_CANCELED);
 
   return 0;
 }
 
-#define TEST_FUNCTION do_test ()
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/nptl/tst-cancel3.c b/nptl/tst-cancel3.c
index 0a531dbcdb..cf810aef09 100644
--- a/nptl/tst-cancel3.c
+++ b/nptl/tst-cancel3.c
@@ -18,23 +18,27 @@
 
 #include <pthread.h>
 #include <signal.h>
-#include <stdio.h>
-#include <unistd.h>
 
+#include <support/xunistd.h>
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/process_state.h>
 
+
+/* The pipe will be used pass the thread TID to master thread.  */
+static int tidfd[2];
+/* The pipe will be used to check the cancellation.  */
 static int fd[2];
 
 
 static void *
 tf (void *arg)
 {
-  char buf[100];
+  pid_t tid = gettid ();
+  TEST_COMPARE (write (tidfd[1], &tid, sizeof (tid)), sizeof (tid));
 
-  if (read (fd[0], buf, sizeof (buf)) == sizeof (buf))
-    {
-      puts ("read succeeded");
-      return (void *) 1l;
-    }
+  char buf[100];
+  read (fd[0], buf, sizeof (buf));
 
   return NULL;
 }
@@ -45,53 +49,24 @@ do_test (void)
 {
   pthread_t th;
   void *r;
-  struct sigaction sa;
-
-  sa.sa_handler = SIG_IGN;
-  sigemptyset (&sa.sa_mask);
-  sa.sa_flags = 0;
-
-  if (sigaction (SIGPIPE, &sa, NULL) != 0)
-    {
-      puts ("sigaction failed");
-      return 1;
-    }
-
-  if (pipe (fd) != 0)
-    {
-      puts ("pipe failed");
-      return 1;
-    }
-
-  if (pthread_create (&th, NULL, tf, NULL) != 0)
-    {
-      puts ("create failed");
-      return 1;
-    }
-
-  if (pthread_cancel (th) != 0)
-    {
-      puts ("cancel failed");
-      return 1;
-    }
-
-  /* This will cause the read in the child to return.  */
-  close (fd[0]);
-
-  if (pthread_join (th, &r) != 0)
-    {
-      puts ("join failed");
-      return 1;
-    }
-
-  if (r != PTHREAD_CANCELED)
-    {
-      puts ("result is wrong");
-      return 1;
-    }
+
+  xpipe (tidfd);
+  xpipe (fd);
+
+  th = xpthread_create (NULL, tf, NULL);
+
+  pid_t tid;
+  TEST_COMPARE (read (tidfd[0], &tid, sizeof (tid)), sizeof (tid));
+
+  support_process_state_wait (tid, support_process_state_sleeping);
+
+  xpthread_cancel (th);
+
+  r = xpthread_join (th);
+
+  TEST_VERIFY (r == PTHREAD_CANCELED);
 
   return 0;
 }
 
-#define TEST_FUNCTION do_test ()
-#include "../test-skeleton.c"
+#include <support/test-driver.c>

Patch

diff --git a/nptl/tst-cancel2.c b/nptl/tst-cancel2.c
index 1e86711596..1732980b0f 100644
--- a/nptl/tst-cancel2.c
+++ b/nptl/tst-cancel2.c
@@ -73,9 +73,6 @@  do_test (void)
       return 1;
     }
 
-  /* This will cause the write in the child to return.  */
-  close (fd[0]);
-
   if (pthread_join (th, &r) != 0)
     {
       puts ("join failed");
diff --git a/nptl/tst-cancel3.c b/nptl/tst-cancel3.c
index 0a531dbcdb..3a0acac5e2 100644
--- a/nptl/tst-cancel3.c
+++ b/nptl/tst-cancel3.c
@@ -75,9 +75,6 @@  do_test (void)
       return 1;
     }
 
-  /* This will cause the read in the child to return.  */
-  close (fd[0]);
-
   if (pthread_join (th, &r) != 0)
     {
       puts ("join failed");