[07/15] elf: Refactor _dl_update_slotinfo to avoid use after free

Message ID 3ecdb956cbf6d1b46e36311ffe7f491ce186cdbc.1613390045.git.szabolcs.nagy@arm.com
State Superseded
Headers show
Series
  • Dynamic TLS related data race fixes
Related show

Commit Message

Stafford Horne via Libc-alpha Feb. 15, 2021, noon
map is not valid to access here because it can be freed by a
concurrent dlclose, so don't check the modid.

The map == 0 and map != 0 code paths can be shared (avoiding
the dtv resize in case of map == 0 is just an optimization:
larger dtv than necessary would be fine too).
---
 elf/dl-tls.c | 21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

-- 
2.17.1

Comments

Stafford Horne via Libc-alpha April 6, 2021, 7:40 p.m. | #1
On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:
> map is not valid to access here because it can be freed by a

> concurrent dlclose, so don't check the modid.


Won't it be protected by the recursive GL(dl_load_lock) in such case?
I think the concurrency issue is between dlopen and _dl_deallocate_tls
called by pthread stack handling (nptl/allocatestack.c).  Am I missing
something here?

> 

> The map == 0 and map != 0 code paths can be shared (avoiding

> the dtv resize in case of map == 0 is just an optimization:

> larger dtv than necessary would be fine too).

> ---

>  elf/dl-tls.c | 21 +++++----------------

>  1 file changed, 5 insertions(+), 16 deletions(-)

> 

> diff --git a/elf/dl-tls.c b/elf/dl-tls.c

> index 24d00c14ef..f8b32b3ecb 100644

> --- a/elf/dl-tls.c

> +++ b/elf/dl-tls.c

> @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)

>  	{

>  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)

>  	    {

> +	      size_t modid = total + cnt;

> +

>  	      size_t gen = listp->slotinfo[cnt].gen;

>  

>  	      if (gen > new_gen)

> @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)

>  

>  	      /* If there is no map this means the entry is empty.  */

>  	      struct link_map *map = listp->slotinfo[cnt].map;

> -	      if (map == NULL)

> -		{

> -		  if (dtv[-1].counter >= total + cnt)

> -		    {

> -		      /* If this modid was used at some point the memory

> -			 might still be allocated.  */

> -		      free (dtv[total + cnt].pointer.to_free);

> -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;

> -		      dtv[total + cnt].pointer.to_free = NULL;

> -		    }

> -

> -		  continue;

> -		}

> -

>  	      /* Check whether the current dtv array is large enough.  */

> -	      size_t modid = map->l_tls_modid;

> -	      assert (total + cnt == modid);

>  	      if (dtv[-1].counter < modid)

>  		{

> +		  if (map == NULL)

> +		    continue;

> +

>  		  /* Resize the dtv.  */

>  		  dtv = _dl_resize_dtv (dtv);

>  

>
Stafford Horne via Libc-alpha April 7, 2021, 8:01 a.m. | #2
The 04/06/2021 16:40, Adhemerval Zanella wrote:
> On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:

> > map is not valid to access here because it can be freed by a

> > concurrent dlclose, so don't check the modid.

> 

> Won't it be protected by the recursive GL(dl_load_lock) in such case?

> I think the concurrency issue is between dlopen and _dl_deallocate_tls

> called by pthread stack handling (nptl/allocatestack.c).  Am I missing

> something here?



_dl_update_slotinfo is called both with and without
the dlopen lock held: during dynamic tls access
the lock is not held (see the __tls_get_addr path)

we cannot add a lock there because that would cause
new deadlocks, dealing with this is the tricky part
of the patchset.

> > 

> > The map == 0 and map != 0 code paths can be shared (avoiding

> > the dtv resize in case of map == 0 is just an optimization:

> > larger dtv than necessary would be fine too).

> > ---

> >  elf/dl-tls.c | 21 +++++----------------

> >  1 file changed, 5 insertions(+), 16 deletions(-)

> > 

> > diff --git a/elf/dl-tls.c b/elf/dl-tls.c

> > index 24d00c14ef..f8b32b3ecb 100644

> > --- a/elf/dl-tls.c

> > +++ b/elf/dl-tls.c

> > @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)

> >  	{

> >  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)

> >  	    {

> > +	      size_t modid = total + cnt;

> > +

> >  	      size_t gen = listp->slotinfo[cnt].gen;

> >  

> >  	      if (gen > new_gen)

> > @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)

> >  

> >  	      /* If there is no map this means the entry is empty.  */

> >  	      struct link_map *map = listp->slotinfo[cnt].map;

> > -	      if (map == NULL)

> > -		{

> > -		  if (dtv[-1].counter >= total + cnt)

> > -		    {

> > -		      /* If this modid was used at some point the memory

> > -			 might still be allocated.  */

> > -		      free (dtv[total + cnt].pointer.to_free);

> > -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;

> > -		      dtv[total + cnt].pointer.to_free = NULL;

> > -		    }

> > -

> > -		  continue;

> > -		}

> > -

> >  	      /* Check whether the current dtv array is large enough.  */

> > -	      size_t modid = map->l_tls_modid;

> > -	      assert (total + cnt == modid);

> >  	      if (dtv[-1].counter < modid)

> >  		{

> > +		  if (map == NULL)

> > +		    continue;

> > +

> >  		  /* Resize the dtv.  */

> >  		  dtv = _dl_resize_dtv (dtv);

> >  

> >
Stafford Horne via Libc-alpha April 7, 2021, 2:28 p.m. | #3
On 07/04/2021 05:01, Szabolcs Nagy wrote:
> The 04/06/2021 16:40, Adhemerval Zanella wrote:

>> On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:

>>> map is not valid to access here because it can be freed by a

>>> concurrent dlclose, so don't check the modid.

>>

>> Won't it be protected by the recursive GL(dl_load_lock) in such case?

>> I think the concurrency issue is between dlopen and _dl_deallocate_tls

>> called by pthread stack handling (nptl/allocatestack.c).  Am I missing

>> something here?

> 

> 

> _dl_update_slotinfo is called both with and without

> the dlopen lock held: during dynamic tls access

> the lock is not held (see the __tls_get_addr path)



Right, revising the patch I mapped the calls (not sure if it is 
fully complete):

| _dl_open
|   __rtld_lock_lock_recursive (GL(dl_load_lock));
|   dl_open_worker
|     update_tls_slotinfo
|       _dl_update_slotinfo
|   __rtld_lock_unlock_recursive (GL(dl_load_lock));
  
| __tls_get_addr   
|   update_get_addr
|     _dl_update_slotinfo 

| rtld
|   _dl_resolve_conflicts
|     elf_machine_rela
|       TRY_STATIC_TLS
|         _dl_try_allocate_static_tls
|          _dl_update_slotinfo
|    
|    elf_machine_rela 
|      CHECK_STATIC_TLS   
|        _dl_allocate_static_tls
|          _dl_try_allocate_static_tls
|            _dl_update_slotinfo

The rtld part should not matter, since it is done before thread
is supported. 

> 

> we cannot add a lock there because that would cause

> new deadlocks, dealing with this is the tricky part

> of the patchset.


I understand this patch from previous discussion about it.  The
part is confusing me is "because it can be freed by a concurrent
dlclose".  My understanding is '_dl_deallocate_tls' might be called
in thread exit / deallocation without the GL(dl_load_lock) (which
is a potential issue); what I can't see is how concurrent dlclose 
might trigger this issue (since it should be synchronized with dlopen
through the lock).


> 

>>>

>>> The map == 0 and map != 0 code paths can be shared (avoiding

>>> the dtv resize in case of map == 0 is just an optimization:

>>> larger dtv than necessary would be fine too).

>>> ---

>>>  elf/dl-tls.c | 21 +++++----------------

>>>  1 file changed, 5 insertions(+), 16 deletions(-)

>>>

>>> diff --git a/elf/dl-tls.c b/elf/dl-tls.c

>>> index 24d00c14ef..f8b32b3ecb 100644

>>> --- a/elf/dl-tls.c

>>> +++ b/elf/dl-tls.c

>>> @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)

>>>  	{

>>>  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)

>>>  	    {

>>> +	      size_t modid = total + cnt;

>>> +

>>>  	      size_t gen = listp->slotinfo[cnt].gen;

>>>  

>>>  	      if (gen > new_gen)

>>> @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)

>>>  

>>>  	      /* If there is no map this means the entry is empty.  */

>>>  	      struct link_map *map = listp->slotinfo[cnt].map;

>>> -	      if (map == NULL)

>>> -		{

>>> -		  if (dtv[-1].counter >= total + cnt)

>>> -		    {

>>> -		      /* If this modid was used at some point the memory

>>> -			 might still be allocated.  */

>>> -		      free (dtv[total + cnt].pointer.to_free);

>>> -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;

>>> -		      dtv[total + cnt].pointer.to_free = NULL;

>>> -		    }

>>> -

>>> -		  continue;

>>> -		}

>>> -

>>>  	      /* Check whether the current dtv array is large enough.  */

>>> -	      size_t modid = map->l_tls_modid;

>>> -	      assert (total + cnt == modid);

>>>  	      if (dtv[-1].counter < modid)

>>>  		{

>>> +		  if (map == NULL)

>>> +		    continue;

>>> +

>>>  		  /* Resize the dtv.  */

>>>  		  dtv = _dl_resize_dtv (dtv);

>>>  

>>>
Stafford Horne via Libc-alpha April 7, 2021, 2:36 p.m. | #4
On 07/04/2021 11:28, Adhemerval Zanella wrote:
> 

> 

> On 07/04/2021 05:01, Szabolcs Nagy wrote:

>> The 04/06/2021 16:40, Adhemerval Zanella wrote:

>>> On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:

>>>> map is not valid to access here because it can be freed by a

>>>> concurrent dlclose, so don't check the modid.

>>>

>>> Won't it be protected by the recursive GL(dl_load_lock) in such case?

>>> I think the concurrency issue is between dlopen and _dl_deallocate_tls

>>> called by pthread stack handling (nptl/allocatestack.c).  Am I missing

>>> something here?

>>

>>

>> _dl_update_slotinfo is called both with and without

>> the dlopen lock held: during dynamic tls access

>> the lock is not held (see the __tls_get_addr path)

> 

> 

> Right, revising the patch I mapped the calls (not sure if it is 

> fully complete):

> 

> | _dl_open

> |   __rtld_lock_lock_recursive (GL(dl_load_lock));

> |   dl_open_worker

> |     update_tls_slotinfo

> |       _dl_update_slotinfo

> |   __rtld_lock_unlock_recursive (GL(dl_load_lock));

>   

> | __tls_get_addr   

> |   update_get_addr

> |     _dl_update_slotinfo 

> 

> | rtld

> |   _dl_resolve_conflicts

> |     elf_machine_rela

> |       TRY_STATIC_TLS

> |         _dl_try_allocate_static_tls

> |          _dl_update_slotinfo

> |    

> |    elf_machine_rela 

> |      CHECK_STATIC_TLS   

> |        _dl_allocate_static_tls

> |          _dl_try_allocate_static_tls

> |            _dl_update_slotinfo

> 

> The rtld part should not matter, since it is done before thread

> is supported. 

> 

>>

>> we cannot add a lock there because that would cause

>> new deadlocks, dealing with this is the tricky part

>> of the patchset.

> 

> I understand this patch from previous discussion about it.  The

> part is confusing me is "because it can be freed by a concurrent

> dlclose".  My understanding is '_dl_deallocate_tls' might be called

> in thread exit / deallocation without the GL(dl_load_lock) (which

> is a potential issue); what I can't see is how concurrent dlclose 

> might trigger this issue (since it should be synchronized with dlopen

> through the lock).


I think I got what you meant: the concurrency issues is not related to 
dlopen open, but rather to __tls_get_addr and dclose.  Maybe making this
explicit on the commit message.
Stafford Horne via Libc-alpha April 7, 2021, 5:05 p.m. | #5
On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:
> map is not valid to access here because it can be freed by a

> concurrent dlclose, so don't check the modid.

> 

> The map == 0 and map != 0 code paths can be shared (avoiding

> the dtv resize in case of map == 0 is just an optimization:

> larger dtv than necessary would be fine too).


Please extend a bit the patch description and add that __tls_get_addr
is the public interface that show concurrency issues with dlclose.

The patch looks ok, thanks.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>


> ---

>  elf/dl-tls.c | 21 +++++----------------

>  1 file changed, 5 insertions(+), 16 deletions(-)

> 

> diff --git a/elf/dl-tls.c b/elf/dl-tls.c

> index 24d00c14ef..f8b32b3ecb 100644

> --- a/elf/dl-tls.c

> +++ b/elf/dl-tls.c

> @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)

>  	{

>  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)

>  	    {

> +	      size_t modid = total + cnt;

> +

>  	      size_t gen = listp->slotinfo[cnt].gen;

>  

>  	      if (gen > new_gen)

> @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)

>  

>  	      /* If there is no map this means the entry is empty.  */

>  	      struct link_map *map = listp->slotinfo[cnt].map;

> -	      if (map == NULL)

> -		{

> -		  if (dtv[-1].counter >= total + cnt)

> -		    {

> -		      /* If this modid was used at some point the memory

> -			 might still be allocated.  */

> -		      free (dtv[total + cnt].pointer.to_free);

> -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;

> -		      dtv[total + cnt].pointer.to_free = NULL;

> -		    }

> -

> -		  continue;

> -		}

> -

>  	      /* Check whether the current dtv array is large enough.  */

> -	      size_t modid = map->l_tls_modid;

> -	      assert (total + cnt == modid);

>  	      if (dtv[-1].counter < modid)

>  		{

> +		  if (map == NULL)

> +		    continue;

> +

>  		  /* Resize the dtv.  */

>  		  dtv = _dl_resize_dtv (dtv);

>  

>

Patch

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 24d00c14ef..f8b32b3ecb 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -743,6 +743,8 @@  _dl_update_slotinfo (unsigned long int req_modid)
 	{
 	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)
 	    {
+	      size_t modid = total + cnt;
+
 	      size_t gen = listp->slotinfo[cnt].gen;
 
 	      if (gen > new_gen)
@@ -758,25 +760,12 @@  _dl_update_slotinfo (unsigned long int req_modid)
 
 	      /* If there is no map this means the entry is empty.  */
 	      struct link_map *map = listp->slotinfo[cnt].map;
-	      if (map == NULL)
-		{
-		  if (dtv[-1].counter >= total + cnt)
-		    {
-		      /* If this modid was used at some point the memory
-			 might still be allocated.  */
-		      free (dtv[total + cnt].pointer.to_free);
-		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;
-		      dtv[total + cnt].pointer.to_free = NULL;
-		    }
-
-		  continue;
-		}
-
 	      /* Check whether the current dtv array is large enough.  */
-	      size_t modid = map->l_tls_modid;
-	      assert (total + cnt == modid);
 	      if (dtv[-1].counter < modid)
 		{
+		  if (map == NULL)
+		    continue;
+
 		  /* Resize the dtv.  */
 		  dtv = _dl_resize_dtv (dtv);