[RFC,0/6] Demangle minimal symbol names in worker threads

Message ID 20190309172300.2764-1-tom@tromey.com
Headers show
Series
  • Demangle minimal symbol names in worker threads
Related show

Message

Tom Tromey March 9, 2019, 5:22 p.m.
I've thought for a while that gdb should take advantage of multiple
cores in order to speed up its processing.  This series is some
initial work in that direction.

In particular, this patch arranges to do the demangling for minimal
symbols in worker threads.  I chose this because it seemed relatively
simple to reason about, as the demangler is already mostly thread-safe
(except, as it turns out, the Ada demangler, which is fixed in this
series).  It isn't actually a very important thing to speed up, as
minimal symbol reading is already reasonably speedy; but I thought it
best to start with something straightforward to facilitate flushing
out thread safety bugs.

That said, this does yield a modest speedup on the largest .so on my
system:

A typical pre-patch run looks like:

    /bin/time ./gdb -nx --batch --data-directory ./data-directory/ /usr/lib64/firefox/libxul.so 
    1.22user 0.21system 0:01.46elapsed 98%CPU (0avgtext+0avgdata 188204maxresident)k

And with the patches:

    1.47user 0.39system 0:01.11elapsed 167%CPU (0avgtext+0avgdata 206380maxresident)k

I've only seen %CPU up to 185% or so; but since this is a 4-core
machine I suppose that means there are still gains to be had
somewhere?

This implementation adds a lock to the demangled hash table code.  I
tried a different implementation which did not add locking here.  In
that implementation, minsym names were simply not entered into the
hash table and thus not shared.  However, the lock turned out to be no
slower, and I believe the hash table saves some memory, so I decided
on this approach.

One possible enhancement in this area would be call
build_minimal_symbol_hash_tables in a background thread.  This is
trickier than it sounds, though.  First, it would require more
locking, so that gdb would not attempt to use the minsym hash tables
before they were constructed.  Second, it is possible to introduce
races where the objfile is deleted while the worker thread is still
running.

I have some more ideas for areas in gdb that could benefit from
threads.  I'm happy to discuss if you're interested.

Because threads are tricky, I think this requires some extra scrutiny.
Also I've done various tests using helgrind, which pointed out some of
the things fixed in this series.

You can't readily apply this, because it depends on both the cleanup
removal series and the minsym improvement series.  However it is in my
github repository on the branch "t/parallel-minsyms-mutex".

Tested by the buildbot.

Tom

Comments

Eli Zaretskii March 9, 2019, 6:09 p.m. | #1
> From: Tom Tromey <tom@tromey.com>

> Date: Sat,  9 Mar 2019 10:22:54 -0700

> 

> I've thought for a while that gdb should take advantage of multiple

> cores in order to speed up its processing.  This series is some

> initial work in that direction.

> 

> In particular, this patch arranges to do the demangling for minimal

> symbols in worker threads.  I chose this because it seemed relatively

> simple to reason about, as the demangler is already mostly thread-safe

> (except, as it turns out, the Ada demangler, which is fixed in this

> series).  It isn't actually a very important thing to speed up, as

> minimal symbol reading is already reasonably speedy; but I thought it

> best to start with something straightforward to facilitate flushing

> out thread safety bugs.


Thanks, but is std::thread portable enough?  E.g., I recall problems
with it in MinGW.

Same question regarding delivering signals to threads.
John Baldwin March 11, 2019, 5:26 p.m. | #2
On 3/9/19 9:22 AM, Tom Tromey wrote:
> I have some more ideas for areas in gdb that could benefit from

> threads.  I'm happy to discuss if you're interested.


My only caution here is that I find it handy that I can use single
stepping to debug many of the things I encounter in gdb, especially
the main event loop.  When I've worked with lldb I've found single
stepping less useful and have had to rely on their builtin tracing
framework to usefully debug things.  Peeling off tasks that aren't
part of the main event loop probably won't impact that, but I think
it's worth considering that tradeoff for future patches at least.

-- 
John Baldwin
John Baldwin March 11, 2019, 5:35 p.m. | #3
On 3/9/19 10:09 AM, Eli Zaretskii wrote:
>> From: Tom Tromey <tom@tromey.com>

>> Date: Sat,  9 Mar 2019 10:22:54 -0700

>>

>> I've thought for a while that gdb should take advantage of multiple

>> cores in order to speed up its processing.  This series is some

>> initial work in that direction.

>>

>> In particular, this patch arranges to do the demangling for minimal

>> symbols in worker threads.  I chose this because it seemed relatively

>> simple to reason about, as the demangler is already mostly thread-safe

>> (except, as it turns out, the Ada demangler, which is fixed in this

>> series).  It isn't actually a very important thing to speed up, as

>> minimal symbol reading is already reasonably speedy; but I thought it

>> best to start with something straightforward to facilitate flushing

>> out thread safety bugs.

> 

> Thanks, but is std::thread portable enough?  E.g., I recall problems

> with it in MinGW.

> 

> Same question regarding delivering signals to threads.


I can't speak to MinGW, but the signal in question here is a synchronous
SIGSEGV signal due to a page fault or the like and in POSIX systems that
is always sent to the thread executing the faulting instruction (and that
is the only sane semantic since the thread can't execute any more
instructions until the signal is resolved).

-- 
John Baldwin
Tom Tromey March 11, 2019, 10:39 p.m. | #4
>>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:


Eli> Thanks, but is std::thread portable enough?  E.g., I recall problems
Eli> with it in MinGW.

I asked around and learned that std::thread should work fine if you have
winpthreads, which apparently is used by default for mingw-w64.

I don't actually know much about the mingw world, so I don't know if
that is good enough or not.  Could you say?

Jonathan Wakely also once wrote a port of libstdc++ to the Win32 API,
but couldn't find help with testing, so never checked it in.

Eli> Same question regarding delivering signals to threads.

If the signal thing is broken in this series, then it is probably
already broken today.  However I think it ought to be ok as SEGV is a
synchronous signal; that is, it is delivered to the faulting thread.

Does that address your concern here?

Tom
Tom Tromey March 11, 2019, 10:40 p.m. | #5
>>>>> "John" == John Baldwin <jhb@FreeBSD.org> writes:


Tom> I have some more ideas for areas in gdb that could benefit from
Tom> threads.  I'm happy to discuss if you're interested.

John> My only caution here is that I find it handy that I can use single
John> stepping to debug many of the things I encounter in gdb, especially
John> the main event loop.

At least for the "parallel for" code that is in this series, it would be
simple to add a debugging flag to force it to take the single-thread
path.

Tom
Eli Zaretskii March 12, 2019, 3:33 a.m. | #6
> From: Tom Tromey <tom@tromey.com>

> Cc: Tom Tromey <tom@tromey.com>,  gdb-patches@sourceware.org

> Date: Mon, 11 Mar 2019 16:39:15 -0600

> 

> >>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:

> 

> Eli> Thanks, but is std::thread portable enough?  E.g., I recall problems

> Eli> with it in MinGW.

> 

> I asked around and learned that std::thread should work fine if you have

> winpthreads, which apparently is used by default for mingw-w64.

> 

> I don't actually know much about the mingw world, so I don't know if

> that is good enough or not.  Could you say?


I could try.  But regardless, I think we should allow for ports whose
std::thread is not up to the task, because evidently this is a
problematic area of libstdc++.  It is IMO better to allow running this
code in a single thread than breaking the entire build on some
platform because other platforms might benefit from multiple execution
cores.

> Eli> Same question regarding delivering signals to threads.

> 

> If the signal thing is broken in this series, then it is probably

> already broken today.  However I think it ought to be ok as SEGV is a

> synchronous signal; that is, it is delivered to the faulting thread.


I thought you were going to use that for signals in general.  Maybe
SEGV is OK.  But in general, signals are weird on Windows wrt threads;
for example, a SIGINT handler runs in a separate thread created by
Windows.

Thanks.
Eli Zaretskii March 13, 2019, 3:38 p.m. | #7
> Date: Tue, 12 Mar 2019 05:33:30 +0200

> From: Eli Zaretskii <eliz@gnu.org>

> CC: gdb-patches@sourceware.org

> 

> > I asked around and learned that std::thread should work fine if you have

> > winpthreads, which apparently is used by default for mingw-w64.

> > 

> > I don't actually know much about the mingw world, so I don't know if

> > that is good enough or not.  Could you say?

> 

> I could try.


Here's the answer.  The simple test program attached at the end
doesn't compile:

  D:\usr\eli\data>g++ -c ./threaded-hello.cc
  ./threaded-hello.cc: In function 'int main()':
  ./threaded-hello.cc:12:8: error: 'thread' is not a member of 'std'
     std::thread t1(call_from_thread);
	  ^~~~~~
  ./threaded-hello.cc:12:8: note: suggested alternative: 'tera'
     std::thread t1(call_from_thread);
	  ^~~~~~
	  tera
  ./threaded-hello.cc:15:3: error: 't1' was not declared in this scope
     t1.join();
     ^~
  ./threaded-hello.cc:15:3: note: suggested alternative: 'tm'
     t1.join();
     ^~
     tm

So I think only MinGW64 might have support for std::thread,
mingw.org's MinGW (which is what I'm using) definitely doesn't.

> But regardless, I think we should allow for ports whose

> std::thread is not up to the task, because evidently this is a

> problematic area of libstdc++.  It is IMO better to allow running this

> code in a single thread than breaking the entire build on some

> platform because other platforms might benefit from multiple execution

> cores.


This is still true.  We could detect at configure time that
std::thread isn't supported and revert to serial alternative code
instead.  Is that reasonable?  If that is deemed too much of a
maintenance burden, I could perhaps, with some guidance, implement a
simple replacement using Win32 primitives, if all we need is to start
a thread and then do the thread-join thing.  Let me know what you
think.

Here's the program I used to test this:

#include <iostream>
#include <thread>

//This function will be called from a thread

void call_from_thread() {
  std::cout << "Hello, World" << std::endl;
}

int main() {
  //Launch a thread
  std::thread t1(call_from_thread);

  //Join the thread with the main thread
  t1.join();

  return 0;
}
Tom Tromey March 15, 2019, 11:28 p.m. | #8
>>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:


Eli> This is still true.  We could detect at configure time that
Eli> std::thread isn't supported and revert to serial alternative code
Eli> instead.  Is that reasonable?

Yeah, I think it's not too hard to do this.

Eli> If that is deemed too much of a
Eli> maintenance burden, I could perhaps, with some guidance, implement a
Eli> simple replacement using Win32 primitives, if all we need is to start
Eli> a thread and then do the thread-join thing.

It seems like it would be good for mingw, and maybe even mingw-64, if
somebody wrote a Windows API port of the libstdc++ thread primitives.
Jonathan Wakely sent me his patches for the start of one, I can forward
them if you're interested.

Tom
Tom Tromey March 15, 2019, 11:40 p.m. | #9
>>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:


Eli> Here's the answer.  The simple test program attached at the end
Eli> doesn't compile:
[...]

Could you see if thread-locals work?
A simple program like:

    thread_local int x;

... should be enough to check, though maybe adding a main() and making
sure it could link would also be good.

Tom
Eli Zaretskii March 16, 2019, 7:45 a.m. | #10
> From: Tom Tromey <tom@tromey.com>

> Cc: tom@tromey.com,  gdb-patches@sourceware.org

> Date: Fri, 15 Mar 2019 17:28:54 -0600

> 

> >>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:

> 

> Eli> This is still true.  We could detect at configure time that

> Eli> std::thread isn't supported and revert to serial alternative code

> Eli> instead.  Is that reasonable?

> 

> Yeah, I think it's not too hard to do this.


Then I think this would be a good way forward in the shirt run.

> Eli> If that is deemed too much of a

> Eli> maintenance burden, I could perhaps, with some guidance, implement a

> Eli> simple replacement using Win32 primitives, if all we need is to start

> Eli> a thread and then do the thread-join thing.

> 

> It seems like it would be good for mingw, and maybe even mingw-64, if

> somebody wrote a Windows API port of the libstdc++ thread primitives.

> Jonathan Wakely sent me his patches for the start of one, I can forward

> them if you're interested.


Please do, I will take a look.  (No promises yet until I've seen the
code, sorry ;-)
Eli Zaretskii March 16, 2019, 7:51 a.m. | #11
> From: Tom Tromey <tom@tromey.com>

> Cc: tom@tromey.com,  gdb-patches@sourceware.org

> Date: Fri, 15 Mar 2019 17:40:54 -0600

> 

> Could you see if thread-locals work?

> A simple program like:

> 

>     thread_local int x;

> 

> ... should be enough to check, though maybe adding a main() and making

> sure it could link would also be good.


Yes, the simple program below compiles, runs, and returns zero status.

thread_local int x;

int main (void)
{
  return x;
}
Eli Zaretskii March 16, 2019, 8:31 a.m. | #12
> Date: Sat, 16 Mar 2019 09:45:45 +0200

> From: Eli Zaretskii <eliz@gnu.org>

> CC: gdb-patches@sourceware.org

> 

> > Eli> This is still true.  We could detect at configure time that

> > Eli> std::thread isn't supported and revert to serial alternative code

> > Eli> instead.  Is that reasonable?

> > 

> > Yeah, I think it's not too hard to do this.

> 

> Then I think this would be a good way forward in the shirt run.

                                                       ^^^^^^^^^
Oops, I meant "short run", of course...