[pushed] Block signals during the initial part of dlopen

Message ID 20191127203048.27C2028173@gnutoolchain-gerrit.osci.io
State New
Headers show
Series
  • [pushed] Block signals during the initial part of dlopen
Related show

Commit Message

Simon Marchi (Code Review) Nov. 27, 2019, 8:30 p.m.
Sourceware to Gerrit sync has submitted this change.

Change URL: https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/472
......................................................................

Block signals during the initial part of dlopen

Lazy binding in a signal handler that interrupts a dlopen sees
intermediate dynamic linker state.  This has likely been always
unsafe, but with the new pending NODELETE state, this is clearly
incorrect.  Other threads are excluded via the loader lock, but the
current thread is not.  Blocking signals until right before ELF
constructors run is the safe thing to do.

Change-Id: Iad079080ebe7442c13313ba11dc2797953faef35
---
M elf/dl-open.c
1 file changed, 26 insertions(+), 2 deletions(-)

Approvals:
  Christian Brauner: Looks good to me, approved



-- 
Gerrit-Project: glibc
Gerrit-Branch: master
Gerrit-Change-Id: Iad079080ebe7442c13313ba11dc2797953faef35
Gerrit-Change-Number: 472
Gerrit-PatchSet: 6
Gerrit-Owner: Florian Weimer <fweimer@redhat.com>
Gerrit-Reviewer: Christian Brauner <christian.brauner@ubuntu.com>
Gerrit-Reviewer: Florian Weimer <fweimer@redhat.com>
Gerrit-MessageType: merged

Comments

Joseph Myers Nov. 28, 2019, 6:04 p.m. | #1
On Wed, 27 Nov 2019, Sourceware to Gerrit sync (Code Review) wrote:

> Block signals during the initial part of dlopen

> 

> Lazy binding in a signal handler that interrupts a dlopen sees

> intermediate dynamic linker state.  This has likely been always

> unsafe, but with the new pending NODELETE state, this is clearly

> incorrect.  Other threads are excluded via the loader lock, but the

> current thread is not.  Blocking signals until right before ELF

> constructors run is the safe thing to do.


This change (commit a2e8aa0d9ea648068d8be52dd7b15f1b6a008e23) breaks the 
build for i686-gnu (unfortunately this was after a syntax error was 
introduced in build-many-glibcs.py so my bots quietly died on re-execing 
with the syntax error, and so didn't get to find this build failure).

There are a series of errors linking ld.so starting with:

/scratch/jmyers/glibc/many9/install/compilers/i686-gnu/lib/gcc/i686-glibc-gnu/9.2.1/../../../../i686-glibc-gnu/bin/ld: /scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/libc_pic.a(libc_fatal.os): in function `__GI___libc_fatal':
/scratch/jmyers/glibc/many9/src/glibc/libio/../sysdeps/posix/libc_fatal.c:161: multiple definition of `__libc_fatal'; /scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/elf/dl-allobjs.os:/scratch/jmyers/glibc/many9/src/glibc/elf/dl-minimal.c:188: first defined here

and leading up to:

make[3]: *** [/scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/elf/librtld.map] Error 1

-- 
Joseph S. Myers
joseph@codesourcery.com
Florian Weimer Nov. 28, 2019, 6:48 p.m. | #2
* Joseph Myers:

> On Wed, 27 Nov 2019, Sourceware to Gerrit sync (Code Review) wrote:

>

>> Block signals during the initial part of dlopen

>> 

>> Lazy binding in a signal handler that interrupts a dlopen sees

>> intermediate dynamic linker state.  This has likely been always

>> unsafe, but with the new pending NODELETE state, this is clearly

>> incorrect.  Other threads are excluded via the loader lock, but the

>> current thread is not.  Blocking signals until right before ELF

>> constructors run is the safe thing to do.

>

> This change (commit a2e8aa0d9ea648068d8be52dd7b15f1b6a008e23) breaks the 

> build for i686-gnu (unfortunately this was after a syntax error was 

> introduced in build-many-glibcs.py so my bots quietly died on re-execing 

> with the syntax error, and so didn't get to find this build failure).

>

> There are a series of errors linking ld.so starting with:

>

> /scratch/jmyers/glibc/many9/install/compilers/i686-gnu/lib/gcc/i686-glibc-gnu/9.2.1/../../../../i686-glibc-gnu/bin/ld:

> /scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/libc_pic.a(libc_fatal.os):

> in function `__GI___libc_fatal':

> /scratch/jmyers/glibc/many9/src/glibc/libio/../sysdeps/posix/libc_fatal.c:161:

> multiple definition of `__libc_fatal';

> /scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/elf/dl-allobjs.os:/scratch/jmyers/glibc/many9/src/glibc/elf/dl-minimal.c:188:

> first defined here

>

> and leading up to:

>

> make[3]: ***

> [/scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/elf/librtld.map]

> Error 1


I failed to test the later commits using build-many-glibcs.py because
I deemed them more straightforward.

The patch below should fix it.

8<------------------------------------------------------------------8<
hurd: Add stub implementation of __sigprocmask to the dynamic loader

The actual __sigprocmask cannot be built for the loader since it
depends on per-thread state.

diff --git a/sysdeps/mach/hurd/Versions b/sysdeps/mach/hurd/Versions
index f69d5fef67..57d45e6d0d 100644
--- a/sysdeps/mach/hurd/Versions
+++ b/sysdeps/mach/hurd/Versions
@@ -41,6 +41,6 @@ ld {
 
     # functions that must be shared with libc
     __access_noerrno; __libc_read; __libc_write; __libc_lseek64;
-    __libc_lock_self0;
+    __libc_lock_self0; __sigprocmask;
   }
 }
diff --git a/sysdeps/mach/hurd/dl-sysdep.c b/sysdeps/mach/hurd/dl-sysdep.c
index 719d603f44..e35ce69c4a 100644
--- a/sysdeps/mach/hurd/dl-sysdep.c
+++ b/sysdeps/mach/hurd/dl-sysdep.c
@@ -693,6 +693,15 @@ _hurd_intr_rpc_mach_msg (mach_msg_header_t *msg,
 		     timeout, notify);
 }
 
+/* This function does nothing and will be interposed with the actual
+   implementation once is libc is loaded.  Before that, manipulating
+   the signal mask does not make sense because no signal handlers have
+   been installed.  */
+int weak_function
+__sigprocmask (int how, const sigset_t *set, sigset_t *oset)
+{
+  return 0;
+}
 
 void
 _dl_show_auxv (void)
Florian Weimer Nov. 28, 2019, 7:08 p.m. | #3
* Florian Weimer:

> * Joseph Myers:

>

>> On Wed, 27 Nov 2019, Sourceware to Gerrit sync (Code Review) wrote:

>>

>>> Block signals during the initial part of dlopen

>>> 

>>> Lazy binding in a signal handler that interrupts a dlopen sees

>>> intermediate dynamic linker state.  This has likely been always

>>> unsafe, but with the new pending NODELETE state, this is clearly

>>> incorrect.  Other threads are excluded via the loader lock, but the

>>> current thread is not.  Blocking signals until right before ELF

>>> constructors run is the safe thing to do.

>>

>> This change (commit a2e8aa0d9ea648068d8be52dd7b15f1b6a008e23) breaks the 

>> build for i686-gnu (unfortunately this was after a syntax error was 

>> introduced in build-many-glibcs.py so my bots quietly died on re-execing 

>> with the syntax error, and so didn't get to find this build failure).

>>

>> There are a series of errors linking ld.so starting with:

>>

>> /scratch/jmyers/glibc/many9/install/compilers/i686-gnu/lib/gcc/i686-glibc-gnu/9.2.1/../../../../i686-glibc-gnu/bin/ld:

>> /scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/libc_pic.a(libc_fatal.os):

>> in function `__GI___libc_fatal':

>> /scratch/jmyers/glibc/many9/src/glibc/libio/../sysdeps/posix/libc_fatal.c:161:

>> multiple definition of `__libc_fatal';

>> /scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/elf/dl-allobjs.os:/scratch/jmyers/glibc/many9/src/glibc/elf/dl-minimal.c:188:

>> first defined here

>>

>> and leading up to:

>>

>> make[3]: ***

>> [/scratch/jmyers/glibc/many9/build/glibcs/i686-gnu/glibc/elf/librtld.map]

>> Error 1

>

> I failed to test the later commits using build-many-glibcs.py because

> I deemed them more straightforward.

>

> The patch below should fix it.


Posted too soon.  The localplt update was missing.

8<------------------------------------------------------------------8<
hurd: Add stub implementation of __sigprocmask to the dynamic loader

The actual __sigprocmask cannot be built for the loader since it
depends on per-thread state.

Fixes commit a2e8aa0d9ea648068d8be52dd7b15f1b6a008e23 ("Block signals
during the initial part of dlopen").

diff --git a/sysdeps/mach/hurd/Versions b/sysdeps/mach/hurd/Versions
index f69d5fef67..57d45e6d0d 100644
--- a/sysdeps/mach/hurd/Versions
+++ b/sysdeps/mach/hurd/Versions
@@ -41,6 +41,6 @@ ld {
 
     # functions that must be shared with libc
     __access_noerrno; __libc_read; __libc_write; __libc_lseek64;
-    __libc_lock_self0;
+    __libc_lock_self0; __sigprocmask;
   }
 }
diff --git a/sysdeps/mach/hurd/dl-sysdep.c b/sysdeps/mach/hurd/dl-sysdep.c
index 719d603f44..e35ce69c4a 100644
--- a/sysdeps/mach/hurd/dl-sysdep.c
+++ b/sysdeps/mach/hurd/dl-sysdep.c
@@ -693,6 +693,15 @@ _hurd_intr_rpc_mach_msg (mach_msg_header_t *msg,
 		     timeout, notify);
 }
 
+/* This function does nothing and will be interposed with the actual
+   implementation once is libc is loaded.  Before that, manipulating
+   the signal mask does not make sense because no signal handlers have
+   been installed.  */
+int weak_function
+__sigprocmask (int how, const sigset_t *set, sigset_t *oset)
+{
+  return 0;
+}
 
 void
 _dl_show_auxv (void)
diff --git a/sysdeps/mach/hurd/i386/localplt.data b/sysdeps/mach/hurd/i386/localplt.data
index a5b5241b84..b3f5cefeea 100644
--- a/sysdeps/mach/hurd/i386/localplt.data
+++ b/sysdeps/mach/hurd/i386/localplt.data
@@ -41,6 +41,7 @@ ld.so: __strtoul_internal
 #ld.so: _exit
 ld.so: abort
 ld.so: _hurd_intr_rpc_mach_msg
+ld.so: __sigprocmask
 ld.so: __errno_location
 # rtld_hidden is currently disabled to avoid having to special-case the
 # functions above which do need a PLT.  These are thus currently expected.

Patch

diff --git a/elf/dl-open.c b/elf/dl-open.c
index 7415c09..1051e22 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -34,6 +34,7 @@ 
 #include <atomic.h>
 #include <libc-internal.h>
 #include <array_length.h>
+#include <internal-signals.h>
 
 #include <dl-dst.h>
 #include <dl-prop.h>
@@ -52,6 +53,10 @@ 
   /* Namespace ID.  */
   Lmid_t nsid;
 
+  /* Original signal mask.  Used for unblocking signal handlers before
+     running ELF constructors.  */
+  sigset_t original_signal_mask;
+
   /* Original value of _ns_global_scope_pending_adds.  Set by
      dl_open_worker.  Only valid if nsid is a real namespace
      (non-negative).  */
@@ -524,12 +529,16 @@ 
   if (new == NULL)
     {
       assert (mode & RTLD_NOLOAD);
+      __libc_signal_restore_set (&args->original_signal_mask);
       return;
     }
 
   if (__glibc_unlikely (mode & __RTLD_SPROF))
-    /* This happens only if we load a DSO for 'sprof'.  */
-    return;
+    {
+      /* This happens only if we load a DSO for 'sprof'.  */
+      __libc_signal_restore_set (&args->original_signal_mask);
+      return;
+    }
 
   /* This object is directly loaded.  */
   ++new->l_direct_opencount;
@@ -565,6 +574,7 @@ 
 
       assert (_dl_debug_initialize (0, args->nsid)->r_state == RT_CONSISTENT);
 
+      __libc_signal_restore_set (&args->original_signal_mask);
       return;
     }
 
@@ -748,6 +758,10 @@ 
   if (mode & RTLD_GLOBAL)
     add_to_global_resize (new);
 
+  /* Unblock signals.  Data structures are now consistent, and
+     application code may run.  */
+  __libc_signal_restore_set (&args->original_signal_mask);
+
   /* Run the initializer functions of new objects.  Temporarily
      disable the exception handler, so that lazy binding failures are
      fatal.  */
@@ -837,6 +851,10 @@ 
   args.argv = argv;
   args.env = env;
 
+  /* Recursive lazy binding during manipulation of the dynamic loader
+     structures may result in incorrect behavior.  */
+  __libc_signal_block_all (&args.original_signal_mask);
+
   struct dl_exception exception;
   int errcode = _dl_catch_exception (&exception, dl_open_worker, &args);
 
@@ -877,10 +895,16 @@ 
 
 	  _dl_close_worker (args.map, true);
 
+	  /* Restore the signal mask.  In the success case, this
+	     happens inside dl_open_worker.  */
+	  __libc_signal_restore_set (&args.original_signal_mask);
+
 	  /* All link_map_nodelete_pending objects should have been
 	     deleted at this point, which is why it is not necessary
 	     to reset the flag here.  */
 	}
+      else
+	__libc_signal_restore_set (&args.original_signal_mask);
 
       assert (_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT);