gdb: set current thread in btrace_compute_ftrace_1

Message ID 20210714213744.490264-1-simon.marchi@efficios.com
State New
Headers show
Series
  • gdb: set current thread in btrace_compute_ftrace_1
Related show

Commit Message

Simon Marchi via Gdb-patches July 14, 2021, 9:37 p.m.
From: Simon Marchi <simon.marchi@polymtl.ca>


As documented in bug 28086, test gdb.btrace/enable-new-thread.exp
started failing with commit 0618ae414979 ("gdb: optimize
all_matching_threads_iterator"):

    (gdb) record btrace^M
    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace
    break 24^M
    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M
    (gdb) continue^M
    Continuing.^M
    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M
    A problem internal to GDB has been detected,^M
    further debugging may prove unreliable.^M
    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp: continue to breakpoint: cont to bp.1 (GDB internal error)

Note that I only see the failure if GDB is compiled without libipt
support.  I presume that this makes it use BTS instead of PT, so
exercises different code paths.  Could we arrange the btrace tests to be
ran with both BTS and PT, to cover more code paths in a single
configuration?

I think that the commit above just exposed an existing problem.  The
stack trace of the internal error is:

    #8  0x0000561cb81e404e in internal_error (file=0x561cb83aa2f8 "/home/smarchi/src/binutils-gdb/gdb/inferior.c", line=303, fmt=0x561cb83aa099 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:55
    #9  0x0000561cb7b5c031 in find_inferior_pid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, pid=0) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:303
    #10 0x0000561cb7b5c102 in find_inferior_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:317
    #11 0x0000561cb7f1d1c3 in find_thread_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:487
    #12 0x0000561cb7f1b921 in all_matching_threads_iterator::all_matching_threads_iterator (this=0x7ffc4ee34678, filter_target=0x561cb8aafb60 <the_amd64_linux_nat_target>, filter_ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.c:125
    #13 0x0000561cb77bc462 in filtered_iterator<all_matching_threads_iterator, non_exited_thread_filter>::filtered_iterator<process_stratum_target* const&, ptid_t const&> (this=0x7ffc4ee34670) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/filtered-iterator.h:42
    #14 0x0000561cb77b97cb in all_non_exited_threads_range::begin (this=0x7ffc4ee34650) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.h:243
    #15 0x0000561cb7d8ba30 in record_btrace_target::record_is_replaying (this=0x561cb8aa6250 <record_btrace_ops>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1411
    #16 0x0000561cb7d8bb83 in record_btrace_target::xfer_partial (this=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1437
    #17 0x0000561cb7ef73a9 in raw_memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1504
    #18 0x0000561cb7ef77da in memory_xfer_partial_1 (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1635
    #19 0x0000561cb7ef78b5 in memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1664
    #20 0x0000561cb7ef7ba4 in target_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1721
    #21 0x0000561cb7ef8503 in target_read_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1974
    #22 0x0000561cb7ef861f in target_read (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:2014
    #23 0x0000561cb7ef809f in target_read_code (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:1869
    #24 0x0000561cb7937f4d in gdb_disassembler::dis_asm_read_memory (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:139
    #25 0x0000561cb80ab66d in fetch_data (info=0x7ffc4ee34e88, addr=0x7ffc4ee34c59 "g\343N\374\177") at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:194
    #26 0x0000561cb80ab7e2 in ckprefix () at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8628
    #27 0x0000561cb80adbd8 in print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:9587
    #28 0x0000561cb80abe4f in print_insn_i386 (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8894
    #29 0x0000561cb7744a19 in default_print_insn (memaddr=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/arch-utils.c:1029
    #30 0x0000561cb7b33067 in i386_print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/i386-tdep.c:4013
    #31 0x0000561cb7acd8f4 in gdbarch_print_insn (gdbarch=0x561cbae2fb60, vma=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/gdbarch.c:3478
    #32 0x0000561cb793a32d in gdb_disassembler::print_insn (this=0x7ffc4ee34e80, memaddr=140737352774277, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:795
    #33 0x0000561cb793a5b0 in gdb_print_insn (gdbarch=0x561cbae2fb60, memaddr=140737352774277, stream=0x561cb8ac99f8 <null_stream>, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:850
    #34 0x0000561cb793a631 in gdb_insn_length (gdbarch=0x561cbae2fb60, addr=140737352774277) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:859
    #35 0x0000561cb77f53f4 in btrace_compute_ftrace_bts (tp=0x561cbba11210, btrace=0x7ffc4ee35188, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1107
    #36 0x0000561cb77f55f5 in btrace_compute_ftrace_1 (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1527
    #37 0x0000561cb77f5705 in btrace_compute_ftrace (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1560
    #38 0x0000561cb77f583b in btrace_add_pc (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1589
    #39 0x0000561cb77f5a86 in btrace_enable (tp=0x561cbba11210, conf=0x561cb8ac6878 <record_btrace_conf>) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1629
    #40 0x0000561cb7d88d26 in record_btrace_enable_warn (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:294
    #41 0x0000561cb7c603dc in std::__invoke_impl<void, void (*&)(thread_info*), thread_info*> (__f=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:60
    #42 0x0000561cb7c5e5a6 in std::__invoke_r<void, void (*&)(thread_info*), thread_info*> (__fn=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:153
    #43 0x0000561cb7c5dc92 in std::_Function_handler<void (thread_info*), void (*)(thread_info*)>::_M_invoke(std::_Any_data const&, thread_info*&&) (__functor=..., __args#0=@0x7ffc4ee35310: 0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:291
    #44 0x0000561cb7f2600f in std::function<void (thread_info*)>::operator()(thread_info*) const (this=0x561cbb6c4878, __args#0=0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:622
    #45 0x0000561cb7f23dc8 in gdb::observers::observable<thread_info*>::notify (this=0x561cb8ac5aa0 <gdb::observers::new_thread>, args#0=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/observable.h:150
    #46 0x0000561cb7f1c436 in add_thread_silent (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:263
    #47 0x0000561cb7f1c479 in add_thread_with_info (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., priv=0x561cbb3f7ab0) at /home/smarchi/src/binutils-gdb/gdb/thread.c:272
    #48 0x0000561cb7bfa1d0 in record_thread (info=0x561cbb0413a0, tp=0x0, ptid=..., th_p=0x7ffc4ee35610, ti_p=0x7ffc4ee35620) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1380
    #49 0x0000561cb7bf7a2a in thread_from_lwp (stopped=0x561cba81db20, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:429
    #50 0x0000561cb7bf7ac5 in thread_db_notice_clone (parent=..., child=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:447
    #51 0x0000561cb7bdc9a2 in linux_handle_extended_wait (lp=0x561cbae25720, status=4991) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:1981
    #52 0x0000561cb7bdf0f3 in linux_nat_filter_event (lwpid=435403, status=198015) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:2920
    #53 0x0000561cb7bdfed6 in linux_nat_wait_1 (ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3202
    #54 0x0000561cb7be0b68 in linux_nat_target::wait (this=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3440
    #55 0x0000561cb7bfa2fc in thread_db_target::wait (this=0x561cb8a9acd0 <the_thread_db_target>, ptid=..., ourstatus=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1412
    #56 0x0000561cb7d8e356 in record_btrace_target::wait (this=0x561cb8aa6250 <record_btrace_ops>, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:2547
    #57 0x0000561cb7ef996d in target_wait (ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2608
    #58 0x0000561cb7b6d297 in do_target_wait_1 (inf=0x561cba6d8780, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3640
    #59 0x0000561cb7b6d43e in operator() (__closure=0x7ffc4ee36190, inf=0x561cba6d8780) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3701
    #60 0x0000561cb7b6d7b2 in do_target_wait (ecs=0x7ffc4ee36370, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3720
    #61 0x0000561cb7b6e67d in fetch_inferior_event () at /home/smarchi/src/binutils-gdb/gdb/infrun.c:4069
    #62 0x0000561cb7b4659b in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:41
    #63 0x0000561cb7be25f7 in handle_target_event (error=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4227
    #64 0x0000561cb81e4ee2 in handle_file_event (file_ptr=0x561cbae24e10, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #65 0x0000561cb81e5490 in gdb_wait_for_event (block=0) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #66 0x0000561cb81e41be in gdb_do_one_event () at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #67 0x0000561cb7c18096 in start_event_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:421
    #68 0x0000561cb7c181e0 in captured_command_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:481
    #69 0x0000561cb7c19d7e in captured_main (data=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1353
    #70 0x0000561cb7c19df0 in gdb_main (args=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1368
    #71 0x0000561cb7693186 in main (argc=11, argv=0x7ffc4ee367b8) at /home/smarchi/src/binutils-gdb/gdb/gdb.c:32

At frame 45, the new_thread observable is fired.  At this moment, the
new thread isn't the current thread, inferior_ptid is null_ptid.  I
think this is ok: the new_thread observable doesn't give any guarantee
on the global context when observers are invoked.  Frame 35,
btrace_compute_ftrace_bts, calls gdb_insn_length.  gdb_insn_length
doesn't have a thread_info or other parameter what could indicate where
to read memory from, it implicitly uses the global context
(inferior_ptid).

So we reach the all_non_exited_threads_range in
record_btrace_target::record_is_replaying with a null inferior_ptid.
The previous implemention of all_non_exited_threads_range didn't care,
but the new one does.  The problem of calling gdb_insn_length and
ultimately trying to read memory with a null inferior_ptid already
existed, but the commit mentioned above made it visible.

Something between frames 40 (record_btrace_enable_warn) and 35
(btrace_compute_ftrace_bts) needs to be switching the global context to
make TP the current thread.  By inspection, it looks like
btrace_compute_ftrace_pt may also call functions sensitive to the global
context: it installs the btrace_pt_readmem_callback callback in the PT
instruction decoder.  When this function gets called, inferior_ptid must
be set appropriately.  So it has the same potential problem.

I think that a reasonable place to switch the thread is in
btrace_compute_ftrace_1 (this is what this patch implements) although it
would be fine to do it separately in btrace_compute_ftrace_bts and
btrace_compute_ftrace_pt as well.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086
Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3
---
 gdb/btrace.c | 6 ++++++
 1 file changed, 6 insertions(+)

-- 
2.26.2

Comments

Simon Marchi via Gdb-patches July 15, 2021, 1:26 p.m. | #1
Hello Simon,

>    (gdb) record btrace^M

>    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace

>    break 24^M

>    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-

>gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M

>    (gdb) continue^M

>    Continuing.^M

>    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior*

>find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M

>    A problem internal to GDB has been detected,^M

>    further debugging may prove unreliable.^M

>    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp:

>continue to breakpoint: cont to bp.1 (GDB internal error)

>

>Note that I only see the failure if GDB is compiled without libipt

>support.  I presume that this makes it use BTS instead of PT, so

>exercises different code paths.  Could we arrange the btrace tests to be

>ran with both BTS and PT, to cover more code paths in a single

>configuration?


We run those tests on different boxes with and without PT support.  It suffices
if PT is not available for GDB to fall back to BTS; you don't need a separate build.

It shouldn't be too hard to run them in a loop over 'pt' and 'bts' and test both
configurations in a single run, if available.  I can send patches, but not immediately.


>I think that the commit above just exposed an existing problem.  The

>stack trace of the internal error is:

>

>    #8  0x0000561cb81e404e in internal_error (file=0x561cb83aa2f8

>"/home/smarchi/src/binutils-gdb/gdb/inferior.c", line=303, fmt=0x561cb83aa099

>"%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-

>gdb/gdbsupport/errors.cc:55

>    #9  0x0000561cb7b5c031 in find_inferior_pid (targ=0x561cb8aafb60

><the_amd64_linux_nat_target>, pid=0) at /home/smarchi/src/binutils-

>gdb/gdb/inferior.c:303

>    #10 0x0000561cb7b5c102 in find_inferior_ptid (targ=0x561cb8aafb60

><the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-

>gdb/gdb/inferior.c:317

>    #11 0x0000561cb7f1d1c3 in find_thread_ptid (targ=0x561cb8aafb60

><the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-

>gdb/gdb/thread.c:487

>    #12 0x0000561cb7f1b921 in

>all_matching_threads_iterator::all_matching_threads_iterator

>(this=0x7ffc4ee34678, filter_target=0x561cb8aafb60

><the_amd64_linux_nat_target>, filter_ptid=...) at /home/smarchi/src/binutils-

>gdb/gdb/thread-iter.c:125

>    #13 0x0000561cb77bc462 in filtered_iterator<all_matching_threads_iterator,

>non_exited_thread_filter>::filtered_iterator<process_stratum_target* const&,

>ptid_t const&> (this=0x7ffc4ee34670) at /home/smarchi/src/binutils-

>gdb/gdb/../gdbsupport/filtered-iterator.h:42

>    #14 0x0000561cb77b97cb in all_non_exited_threads_range::begin

>(this=0x7ffc4ee34650) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.h:243

>    #15 0x0000561cb7d8ba30 in record_btrace_target::record_is_replaying

>(this=0x561cb8aa6250 <record_btrace_ops>, ptid=...) at

>/home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1411

>    #16 0x0000561cb7d8bb83 in record_btrace_target::xfer_partial

>(this=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_MEMORY,

>annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>/home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1437

>    #17 0x0000561cb7ef73a9 in raw_memory_xfer_partial (ops=0x561cb8aa6250

><record_btrace_ops>, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177",

>writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>/home/smarchi/src/binutils-gdb/gdb/target.c:1504

>    #18 0x0000561cb7ef77da in memory_xfer_partial_1 (ops=0x561cb8aa6250

><record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY,

>readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>/home/smarchi/src/binutils-gdb/gdb/target.c:1635

>    #19 0x0000561cb7ef78b5 in memory_xfer_partial (ops=0x561cb8aa6250

><record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY,

>readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>/home/smarchi/src/binutils-gdb/gdb/target.c:1664

>    #20 0x0000561cb7ef7ba4 in target_xfer_partial (ops=0x561cb8aa6250

><record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0,

>readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>/home/smarchi/src/binutils-gdb/gdb/target.c:1721

>    #21 0x0000561cb7ef8503 in target_read_partial (ops=0x561cb8aa6250

><record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0,

>buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1,

>xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-

>gdb/gdb/target.c:1974

>    #22 0x0000561cb7ef861f in target_read (ops=0x561cb8aa6250

><record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0,

>buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1) at

>/home/smarchi/src/binutils-gdb/gdb/target.c:2014

>    #23 0x0000561cb7ef809f in target_read_code (memaddr=140737352774277,

>myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1) at

>/home/smarchi/src/binutils-gdb/gdb/target.c:1869

>    #24 0x0000561cb7937f4d in gdb_disassembler::dis_asm_read_memory

>(memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177",

>len=1, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:139

>    #25 0x0000561cb80ab66d in fetch_data (info=0x7ffc4ee34e88,

>addr=0x7ffc4ee34c59 "g\343N\374\177") at /home/smarchi/src/binutils-

>gdb/opcodes/i386-dis.c:194

>    #26 0x0000561cb80ab7e2 in ckprefix () at /home/smarchi/src/binutils-

>gdb/opcodes/i386-dis.c:8628

>    #27 0x0000561cb80adbd8 in print_insn (pc=140737352774277,

>info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:9587

>    #28 0x0000561cb80abe4f in print_insn_i386 (pc=140737352774277,

>info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8894

>    #29 0x0000561cb7744a19 in default_print_insn (memaddr=140737352774277,

>info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/arch-utils.c:1029

>    #30 0x0000561cb7b33067 in i386_print_insn (pc=140737352774277,

>info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/i386-tdep.c:4013

>    #31 0x0000561cb7acd8f4 in gdbarch_print_insn (gdbarch=0x561cbae2fb60,

>vma=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-

>gdb/gdb/gdbarch.c:3478

>    #32 0x0000561cb793a32d in gdb_disassembler::print_insn

>(this=0x7ffc4ee34e80, memaddr=140737352774277, branch_delay_insns=0x0) at

>/home/smarchi/src/binutils-gdb/gdb/disasm.c:795

>    #33 0x0000561cb793a5b0 in gdb_print_insn (gdbarch=0x561cbae2fb60,

>memaddr=140737352774277, stream=0x561cb8ac99f8 <null_stream>,

>branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:850

>    #34 0x0000561cb793a631 in gdb_insn_length (gdbarch=0x561cbae2fb60,

>addr=140737352774277) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:859

>    #35 0x0000561cb77f53f4 in btrace_compute_ftrace_bts (tp=0x561cbba11210,

>btrace=0x7ffc4ee35188, gaps=...) at /home/smarchi/src/binutils-

>gdb/gdb/btrace.c:1107

>    #36 0x0000561cb77f55f5 in btrace_compute_ftrace_1 (tp=0x561cbba11210,

>btrace=0x7ffc4ee35180, cpu=0x0, gaps=...) at /home/smarchi/src/binutils-

>gdb/gdb/btrace.c:1527

>    #37 0x0000561cb77f5705 in btrace_compute_ftrace (tp=0x561cbba11210,

>btrace=0x7ffc4ee35180, cpu=0x0) at /home/smarchi/src/binutils-

>gdb/gdb/btrace.c:1560

>    #38 0x0000561cb77f583b in btrace_add_pc (tp=0x561cbba11210) at

>/home/smarchi/src/binutils-gdb/gdb/btrace.c:1589

>    #39 0x0000561cb77f5a86 in btrace_enable (tp=0x561cbba11210,

>conf=0x561cb8ac6878 <record_btrace_conf>) at /home/smarchi/src/binutils-

>gdb/gdb/btrace.c:1629

>    #40 0x0000561cb7d88d26 in record_btrace_enable_warn (tp=0x561cbba11210)

>at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:294

>    #41 0x0000561cb7c603dc in std::__invoke_impl<void, void (*&)(thread_info*),

>thread_info*> (__f=@0x561cbb6c4878: 0x561cb7d88cdc

><record_btrace_enable_warn(thread_info*)>) at

>/usr/include/c++/10/bits/invoke.h:60

>    #42 0x0000561cb7c5e5a6 in std::__invoke_r<void, void (*&)(thread_info*),

>thread_info*> (__fn=@0x561cbb6c4878: 0x561cb7d88cdc

><record_btrace_enable_warn(thread_info*)>) at

>/usr/include/c++/10/bits/invoke.h:153

>    #43 0x0000561cb7c5dc92 in std::_Function_handler<void (thread_info*), void

>(*)(thread_info*)>::_M_invoke(std::_Any_data const&, thread_info*&&)

>(__functor=..., __args#0=@0x7ffc4ee35310: 0x561cbba11210) at

>/usr/include/c++/10/bits/std_function.h:291

>    #44 0x0000561cb7f2600f in std::function<void

>(thread_info*)>::operator()(thread_info*) const (this=0x561cbb6c4878,

>__args#0=0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:622

>    #45 0x0000561cb7f23dc8 in gdb::observers::observable<thread_info*>::notify

>(this=0x561cb8ac5aa0 <gdb::observers::new_thread>, args#0=0x561cbba11210) at

>/home/smarchi/src/binutils-gdb/gdb/../gdbsupport/observable.h:150

>    #46 0x0000561cb7f1c436 in add_thread_silent (targ=0x561cb8aafb60

><the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-

>gdb/gdb/thread.c:263

>    #47 0x0000561cb7f1c479 in add_thread_with_info (targ=0x561cb8aafb60

><the_amd64_linux_nat_target>, ptid=..., priv=0x561cbb3f7ab0) at

>/home/smarchi/src/binutils-gdb/gdb/thread.c:272

>    #48 0x0000561cb7bfa1d0 in record_thread (info=0x561cbb0413a0, tp=0x0,

>ptid=..., th_p=0x7ffc4ee35610, ti_p=0x7ffc4ee35620) at

>/home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1380

>    #49 0x0000561cb7bf7a2a in thread_from_lwp (stopped=0x561cba81db20,

>ptid=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:429

>    #50 0x0000561cb7bf7ac5 in thread_db_notice_clone (parent=..., child=...) at

>/home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:447

>    #51 0x0000561cb7bdc9a2 in linux_handle_extended_wait (lp=0x561cbae25720,

>status=4991) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:1981

>    #52 0x0000561cb7bdf0f3 in linux_nat_filter_event (lwpid=435403,

>status=198015) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:2920

>    #53 0x0000561cb7bdfed6 in linux_nat_wait_1 (ptid=...,

>ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-

>gdb/gdb/linux-nat.c:3202

>    #54 0x0000561cb7be0b68 in linux_nat_target::wait (this=0x561cb8aafb60

><the_amd64_linux_nat_target>, ptid=..., ourstatus=0x7ffc4ee36398,

>target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3440

>    #55 0x0000561cb7bfa2fc in thread_db_target::wait (this=0x561cb8a9acd0

><the_thread_db_target>, ptid=..., ourstatus=0x7ffc4ee36398, options=...) at

>/home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1412

>    #56 0x0000561cb7d8e356 in record_btrace_target::wait (this=0x561cb8aa6250

><record_btrace_ops>, ptid=..., status=0x7ffc4ee36398, options=...) at

>/home/smarchi/src/binutils-gdb/gdb/record-btrace.c:2547

>    #57 0x0000561cb7ef996d in target_wait (ptid=..., status=0x7ffc4ee36398,

>options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2608

>    #58 0x0000561cb7b6d297 in do_target_wait_1 (inf=0x561cba6d8780, ptid=...,

>status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-

>gdb/gdb/infrun.c:3640

>    #59 0x0000561cb7b6d43e in operator() (__closure=0x7ffc4ee36190,

>inf=0x561cba6d8780) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3701

>    #60 0x0000561cb7b6d7b2 in do_target_wait (ecs=0x7ffc4ee36370, options=...)

>at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3720

>    #61 0x0000561cb7b6e67d in fetch_inferior_event () at

>/home/smarchi/src/binutils-gdb/gdb/infrun.c:4069

>    #62 0x0000561cb7b4659b in inferior_event_handler

>(event_type=INF_REG_EVENT) at /home/smarchi/src/binutils-gdb/gdb/inf-

>loop.c:41

>    #63 0x0000561cb7be25f7 in handle_target_event (error=0, client_data=0x0) at

>/home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4227

>    #64 0x0000561cb81e4ee2 in handle_file_event (file_ptr=0x561cbae24e10,

>ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575

>    #65 0x0000561cb81e5490 in gdb_wait_for_event (block=0) at

>/home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701

>    #66 0x0000561cb81e41be in gdb_do_one_event () at

>/home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212

>    #67 0x0000561cb7c18096 in start_event_loop () at /home/smarchi/src/binutils-

>gdb/gdb/main.c:421

>    #68 0x0000561cb7c181e0 in captured_command_loop () at

>/home/smarchi/src/binutils-gdb/gdb/main.c:481

>    #69 0x0000561cb7c19d7e in captured_main (data=0x7ffc4ee366a0) at

>/home/smarchi/src/binutils-gdb/gdb/main.c:1353

>    #70 0x0000561cb7c19df0 in gdb_main (args=0x7ffc4ee366a0) at

>/home/smarchi/src/binutils-gdb/gdb/main.c:1368

>    #71 0x0000561cb7693186 in main (argc=11, argv=0x7ffc4ee367b8) at

>/home/smarchi/src/binutils-gdb/gdb/gdb.c:32

>

>At frame 45, the new_thread observable is fired.  At this moment, the

>new thread isn't the current thread, inferior_ptid is null_ptid.  I

>think this is ok: the new_thread observable doesn't give any guarantee

>on the global context when observers are invoked.  Frame 35,

>btrace_compute_ftrace_bts, calls gdb_insn_length.  gdb_insn_length

>doesn't have a thread_info or other parameter what could indicate where

>to read memory from, it implicitly uses the global context

>(inferior_ptid).

>

>So we reach the all_non_exited_threads_range in

>record_btrace_target::record_is_replaying with a null inferior_ptid.

>The previous implemention of all_non_exited_threads_range didn't care,

>but the new one does.  The problem of calling gdb_insn_length and

>ultimately trying to read memory with a null inferior_ptid already

>existed, but the commit mentioned above made it visible.

>

>Something between frames 40 (record_btrace_enable_warn) and 35

>(btrace_compute_ftrace_bts) needs to be switching the global context to

>make TP the current thread.  By inspection, it looks like

>btrace_compute_ftrace_pt may also call functions sensitive to the global

>context: it installs the btrace_pt_readmem_callback callback in the PT

>instruction decoder.  When this function gets called, inferior_ptid must

>be set appropriately.  So it has the same potential problem.

>

>I think that a reasonable place to switch the thread is in

>btrace_compute_ftrace_1 (this is what this patch implements) although it

>would be fine to do it separately in btrace_compute_ftrace_bts and

>btrace_compute_ftrace_pt as well.

>

>Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086

>Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3

>---

> gdb/btrace.c | 6 ++++++

> 1 file changed, 6 insertions(+)

>

>diff --git a/gdb/btrace.c b/gdb/btrace.c

>index 5e689c11d4bd..9dd6f1dc9e94 100644

>--- a/gdb/btrace.c

>+++ b/gdb/btrace.c

>@@ -1518,6 +1518,12 @@ btrace_compute_ftrace_1 (struct thread_info *tp,

> {

>   DEBUG ("compute ftrace");

>

>+ /* btrace_compute_ftrace_bts and btrace_compute_ftrace_pt may end up

>doing

>+    target calls that require the current thread to be TP, for example to read

>+    memory.  Make sure TP is the current thread.  */

>+  scoped_restore_current_thread restore_thread;

>+  switch_to_thread (tp);

>+

>   switch (btrace->format)

>     {

>     case BTRACE_FORMAT_NONE:

>--

>2.26.2


I see a lot of those switch-to-thread and restore-current-thread sprinkled all over GDB.
I counted close to 200 switch_to_.*thread.* using grep.

It is not clear to me who is responsible for establishing the global state and who may
assume that the global state has already been setup.

This patch apparently fixes an issue but the choice to put the switch-to-thread here
seems arbitrary to me.

Regards,
Markus.

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
Simon Marchi via Gdb-patches July 15, 2021, 1:44 p.m. | #2
On 2021-07-15 9:26 a.m., Metzger, Markus T via Gdb-patches wrote:
> Hello Simon,

> 

>>    (gdb) record btrace^M

>>    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace

>>    break 24^M

>>    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-

>> gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M

>>    (gdb) continue^M

>>    Continuing.^M

>>    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior*

>> find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M

>>    A problem internal to GDB has been detected,^M

>>    further debugging may prove unreliable.^M

>>    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp:

>> continue to breakpoint: cont to bp.1 (GDB internal error)

>>

>> Note that I only see the failure if GDB is compiled without libipt

>> support.  I presume that this makes it use BTS instead of PT, so

>> exercises different code paths.  Could we arrange the btrace tests to be

>> ran with both BTS and PT, to cover more code paths in a single

>> configuration?

> 

> We run those tests on different boxes with and without PT support.  It suffices

> if PT is not available for GDB to fall back to BTS; you don't need a separate build.

> 

> It shouldn't be too hard to run them in a loop over 'pt' and 'bts' and test both

> configurations in a single run, if available.  I can send patches, but not immediately.


Yes, running them in loop to test both technologies was what I was
thinking.  Otherwise, a failure only in bts wouldn't be seen in a
pt-enabled build (like what happened here).

> 

> 

>> I think that the commit above just exposed an existing problem.  The

>> stack trace of the internal error is:

>>

>>    #8  0x0000561cb81e404e in internal_error (file=0x561cb83aa2f8

>> "/home/smarchi/src/binutils-gdb/gdb/inferior.c", line=303, fmt=0x561cb83aa099

>> "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-

>> gdb/gdbsupport/errors.cc:55

>>    #9  0x0000561cb7b5c031 in find_inferior_pid (targ=0x561cb8aafb60

>> <the_amd64_linux_nat_target>, pid=0) at /home/smarchi/src/binutils-

>> gdb/gdb/inferior.c:303

>>    #10 0x0000561cb7b5c102 in find_inferior_ptid (targ=0x561cb8aafb60

>> <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-

>> gdb/gdb/inferior.c:317

>>    #11 0x0000561cb7f1d1c3 in find_thread_ptid (targ=0x561cb8aafb60

>> <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-

>> gdb/gdb/thread.c:487

>>    #12 0x0000561cb7f1b921 in

>> all_matching_threads_iterator::all_matching_threads_iterator

>> (this=0x7ffc4ee34678, filter_target=0x561cb8aafb60

>> <the_amd64_linux_nat_target>, filter_ptid=...) at /home/smarchi/src/binutils-

>> gdb/gdb/thread-iter.c:125

>>    #13 0x0000561cb77bc462 in filtered_iterator<all_matching_threads_iterator,

>> non_exited_thread_filter>::filtered_iterator<process_stratum_target* const&,

>> ptid_t const&> (this=0x7ffc4ee34670) at /home/smarchi/src/binutils-

>> gdb/gdb/../gdbsupport/filtered-iterator.h:42

>>    #14 0x0000561cb77b97cb in all_non_exited_threads_range::begin

>> (this=0x7ffc4ee34650) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.h:243

>>    #15 0x0000561cb7d8ba30 in record_btrace_target::record_is_replaying

>> (this=0x561cb8aa6250 <record_btrace_ops>, ptid=...) at

>> /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1411

>>    #16 0x0000561cb7d8bb83 in record_btrace_target::xfer_partial

>> (this=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_MEMORY,

>> annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>> offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>> /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1437

>>    #17 0x0000561cb7ef73a9 in raw_memory_xfer_partial (ops=0x561cb8aa6250

>> <record_btrace_ops>, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177",

>> writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>> /home/smarchi/src/binutils-gdb/gdb/target.c:1504

>>    #18 0x0000561cb7ef77da in memory_xfer_partial_1 (ops=0x561cb8aa6250

>> <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY,

>> readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>> memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>> /home/smarchi/src/binutils-gdb/gdb/target.c:1635

>>    #19 0x0000561cb7ef78b5 in memory_xfer_partial (ops=0x561cb8aa6250

>> <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY,

>> readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>> memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>> /home/smarchi/src/binutils-gdb/gdb/target.c:1664

>>    #20 0x0000561cb7ef7ba4 in target_xfer_partial (ops=0x561cb8aa6250

>> <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0,

>> readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0,

>> offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at

>> /home/smarchi/src/binutils-gdb/gdb/target.c:1721

>>    #21 0x0000561cb7ef8503 in target_read_partial (ops=0x561cb8aa6250

>> <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0,

>> buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1,

>> xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-

>> gdb/gdb/target.c:1974

>>    #22 0x0000561cb7ef861f in target_read (ops=0x561cb8aa6250

>> <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0,

>> buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1) at

>> /home/smarchi/src/binutils-gdb/gdb/target.c:2014

>>    #23 0x0000561cb7ef809f in target_read_code (memaddr=140737352774277,

>> myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1) at

>> /home/smarchi/src/binutils-gdb/gdb/target.c:1869

>>    #24 0x0000561cb7937f4d in gdb_disassembler::dis_asm_read_memory

>> (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177",

>> len=1, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:139

>>    #25 0x0000561cb80ab66d in fetch_data (info=0x7ffc4ee34e88,

>> addr=0x7ffc4ee34c59 "g\343N\374\177") at /home/smarchi/src/binutils-

>> gdb/opcodes/i386-dis.c:194

>>    #26 0x0000561cb80ab7e2 in ckprefix () at /home/smarchi/src/binutils-

>> gdb/opcodes/i386-dis.c:8628

>>    #27 0x0000561cb80adbd8 in print_insn (pc=140737352774277,

>> info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:9587

>>    #28 0x0000561cb80abe4f in print_insn_i386 (pc=140737352774277,

>> info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8894

>>    #29 0x0000561cb7744a19 in default_print_insn (memaddr=140737352774277,

>> info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/arch-utils.c:1029

>>    #30 0x0000561cb7b33067 in i386_print_insn (pc=140737352774277,

>> info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/i386-tdep.c:4013

>>    #31 0x0000561cb7acd8f4 in gdbarch_print_insn (gdbarch=0x561cbae2fb60,

>> vma=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-

>> gdb/gdb/gdbarch.c:3478

>>    #32 0x0000561cb793a32d in gdb_disassembler::print_insn

>> (this=0x7ffc4ee34e80, memaddr=140737352774277, branch_delay_insns=0x0) at

>> /home/smarchi/src/binutils-gdb/gdb/disasm.c:795

>>    #33 0x0000561cb793a5b0 in gdb_print_insn (gdbarch=0x561cbae2fb60,

>> memaddr=140737352774277, stream=0x561cb8ac99f8 <null_stream>,

>> branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:850

>>    #34 0x0000561cb793a631 in gdb_insn_length (gdbarch=0x561cbae2fb60,

>> addr=140737352774277) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:859

>>    #35 0x0000561cb77f53f4 in btrace_compute_ftrace_bts (tp=0x561cbba11210,

>> btrace=0x7ffc4ee35188, gaps=...) at /home/smarchi/src/binutils-

>> gdb/gdb/btrace.c:1107

>>    #36 0x0000561cb77f55f5 in btrace_compute_ftrace_1 (tp=0x561cbba11210,

>> btrace=0x7ffc4ee35180, cpu=0x0, gaps=...) at /home/smarchi/src/binutils-

>> gdb/gdb/btrace.c:1527

>>    #37 0x0000561cb77f5705 in btrace_compute_ftrace (tp=0x561cbba11210,

>> btrace=0x7ffc4ee35180, cpu=0x0) at /home/smarchi/src/binutils-

>> gdb/gdb/btrace.c:1560

>>    #38 0x0000561cb77f583b in btrace_add_pc (tp=0x561cbba11210) at

>> /home/smarchi/src/binutils-gdb/gdb/btrace.c:1589

>>    #39 0x0000561cb77f5a86 in btrace_enable (tp=0x561cbba11210,

>> conf=0x561cb8ac6878 <record_btrace_conf>) at /home/smarchi/src/binutils-

>> gdb/gdb/btrace.c:1629

>>    #40 0x0000561cb7d88d26 in record_btrace_enable_warn (tp=0x561cbba11210)

>> at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:294

>>    #41 0x0000561cb7c603dc in std::__invoke_impl<void, void (*&)(thread_info*),

>> thread_info*> (__f=@0x561cbb6c4878: 0x561cb7d88cdc

>> <record_btrace_enable_warn(thread_info*)>) at

>> /usr/include/c++/10/bits/invoke.h:60

>>    #42 0x0000561cb7c5e5a6 in std::__invoke_r<void, void (*&)(thread_info*),

>> thread_info*> (__fn=@0x561cbb6c4878: 0x561cb7d88cdc

>> <record_btrace_enable_warn(thread_info*)>) at

>> /usr/include/c++/10/bits/invoke.h:153

>>    #43 0x0000561cb7c5dc92 in std::_Function_handler<void (thread_info*), void

>> (*)(thread_info*)>::_M_invoke(std::_Any_data const&, thread_info*&&)

>> (__functor=..., __args#0=@0x7ffc4ee35310: 0x561cbba11210) at

>> /usr/include/c++/10/bits/std_function.h:291

>>    #44 0x0000561cb7f2600f in std::function<void

>> (thread_info*)>::operator()(thread_info*) const (this=0x561cbb6c4878,

>> __args#0=0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:622

>>    #45 0x0000561cb7f23dc8 in gdb::observers::observable<thread_info*>::notify

>> (this=0x561cb8ac5aa0 <gdb::observers::new_thread>, args#0=0x561cbba11210) at

>> /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/observable.h:150

>>    #46 0x0000561cb7f1c436 in add_thread_silent (targ=0x561cb8aafb60

>> <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-

>> gdb/gdb/thread.c:263

>>    #47 0x0000561cb7f1c479 in add_thread_with_info (targ=0x561cb8aafb60

>> <the_amd64_linux_nat_target>, ptid=..., priv=0x561cbb3f7ab0) at

>> /home/smarchi/src/binutils-gdb/gdb/thread.c:272

>>    #48 0x0000561cb7bfa1d0 in record_thread (info=0x561cbb0413a0, tp=0x0,

>> ptid=..., th_p=0x7ffc4ee35610, ti_p=0x7ffc4ee35620) at

>> /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1380

>>    #49 0x0000561cb7bf7a2a in thread_from_lwp (stopped=0x561cba81db20,

>> ptid=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:429

>>    #50 0x0000561cb7bf7ac5 in thread_db_notice_clone (parent=..., child=...) at

>> /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:447

>>    #51 0x0000561cb7bdc9a2 in linux_handle_extended_wait (lp=0x561cbae25720,

>> status=4991) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:1981

>>    #52 0x0000561cb7bdf0f3 in linux_nat_filter_event (lwpid=435403,

>> status=198015) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:2920

>>    #53 0x0000561cb7bdfed6 in linux_nat_wait_1 (ptid=...,

>> ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-

>> gdb/gdb/linux-nat.c:3202

>>    #54 0x0000561cb7be0b68 in linux_nat_target::wait (this=0x561cb8aafb60

>> <the_amd64_linux_nat_target>, ptid=..., ourstatus=0x7ffc4ee36398,

>> target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3440

>>    #55 0x0000561cb7bfa2fc in thread_db_target::wait (this=0x561cb8a9acd0

>> <the_thread_db_target>, ptid=..., ourstatus=0x7ffc4ee36398, options=...) at

>> /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1412

>>    #56 0x0000561cb7d8e356 in record_btrace_target::wait (this=0x561cb8aa6250

>> <record_btrace_ops>, ptid=..., status=0x7ffc4ee36398, options=...) at

>> /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:2547

>>    #57 0x0000561cb7ef996d in target_wait (ptid=..., status=0x7ffc4ee36398,

>> options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2608

>>    #58 0x0000561cb7b6d297 in do_target_wait_1 (inf=0x561cba6d8780, ptid=...,

>> status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-

>> gdb/gdb/infrun.c:3640

>>    #59 0x0000561cb7b6d43e in operator() (__closure=0x7ffc4ee36190,

>> inf=0x561cba6d8780) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3701

>>    #60 0x0000561cb7b6d7b2 in do_target_wait (ecs=0x7ffc4ee36370, options=...)

>> at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3720

>>    #61 0x0000561cb7b6e67d in fetch_inferior_event () at

>> /home/smarchi/src/binutils-gdb/gdb/infrun.c:4069

>>    #62 0x0000561cb7b4659b in inferior_event_handler

>> (event_type=INF_REG_EVENT) at /home/smarchi/src/binutils-gdb/gdb/inf-

>> loop.c:41

>>    #63 0x0000561cb7be25f7 in handle_target_event (error=0, client_data=0x0) at

>> /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4227

>>    #64 0x0000561cb81e4ee2 in handle_file_event (file_ptr=0x561cbae24e10,

>> ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575

>>    #65 0x0000561cb81e5490 in gdb_wait_for_event (block=0) at

>> /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701

>>    #66 0x0000561cb81e41be in gdb_do_one_event () at

>> /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212

>>    #67 0x0000561cb7c18096 in start_event_loop () at /home/smarchi/src/binutils-

>> gdb/gdb/main.c:421

>>    #68 0x0000561cb7c181e0 in captured_command_loop () at

>> /home/smarchi/src/binutils-gdb/gdb/main.c:481

>>    #69 0x0000561cb7c19d7e in captured_main (data=0x7ffc4ee366a0) at

>> /home/smarchi/src/binutils-gdb/gdb/main.c:1353

>>    #70 0x0000561cb7c19df0 in gdb_main (args=0x7ffc4ee366a0) at

>> /home/smarchi/src/binutils-gdb/gdb/main.c:1368

>>    #71 0x0000561cb7693186 in main (argc=11, argv=0x7ffc4ee367b8) at

>> /home/smarchi/src/binutils-gdb/gdb/gdb.c:32

>>

>> At frame 45, the new_thread observable is fired.  At this moment, the

>> new thread isn't the current thread, inferior_ptid is null_ptid.  I

>> think this is ok: the new_thread observable doesn't give any guarantee

>> on the global context when observers are invoked.  Frame 35,

>> btrace_compute_ftrace_bts, calls gdb_insn_length.  gdb_insn_length

>> doesn't have a thread_info or other parameter what could indicate where

>> to read memory from, it implicitly uses the global context

>> (inferior_ptid).

>>

>> So we reach the all_non_exited_threads_range in

>> record_btrace_target::record_is_replaying with a null inferior_ptid.

>> The previous implemention of all_non_exited_threads_range didn't care,

>> but the new one does.  The problem of calling gdb_insn_length and

>> ultimately trying to read memory with a null inferior_ptid already

>> existed, but the commit mentioned above made it visible.

>>

>> Something between frames 40 (record_btrace_enable_warn) and 35

>> (btrace_compute_ftrace_bts) needs to be switching the global context to

>> make TP the current thread.  By inspection, it looks like

>> btrace_compute_ftrace_pt may also call functions sensitive to the global

>> context: it installs the btrace_pt_readmem_callback callback in the PT

>> instruction decoder.  When this function gets called, inferior_ptid must

>> be set appropriately.  So it has the same potential problem.

>>

>> I think that a reasonable place to switch the thread is in

>> btrace_compute_ftrace_1 (this is what this patch implements) although it

>> would be fine to do it separately in btrace_compute_ftrace_bts and

>> btrace_compute_ftrace_pt as well.

>>

>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086

>> Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3

>> ---

>> gdb/btrace.c | 6 ++++++

>> 1 file changed, 6 insertions(+)

>>

>> diff --git a/gdb/btrace.c b/gdb/btrace.c

>> index 5e689c11d4bd..9dd6f1dc9e94 100644

>> --- a/gdb/btrace.c

>> +++ b/gdb/btrace.c

>> @@ -1518,6 +1518,12 @@ btrace_compute_ftrace_1 (struct thread_info *tp,

>> {

>>   DEBUG ("compute ftrace");

>>

>> + /* btrace_compute_ftrace_bts and btrace_compute_ftrace_pt may end up

>> doing

>> +    target calls that require the current thread to be TP, for example to read

>> +    memory.  Make sure TP is the current thread.  */

>> +  scoped_restore_current_thread restore_thread;

>> +  switch_to_thread (tp);

>> +

>>   switch (btrace->format)

>>     {

>>     case BTRACE_FORMAT_NONE:

>> --

>> 2.26.2

> 

> I see a lot of those switch-to-thread and restore-current-thread sprinkled all over GDB.

> I counted close to 200 switch_to_.*thread.* using grep.

> 

> It is not clear to me who is responsible for establishing the global state and who may

> assume that the global state has already been setup.

> 

> This patch apparently fixes an issue but the choice to put the switch-to-thread here

> seems arbitrary to me.


Yeah, I really don't like that either.  I would much prefer if we could
always pass the context (inferior, thread, frame, whatever) down to
functions that need them.  Unfortunately, it's not an easy change, as a
lot of functions / interfaces rely on the global state, so it's hard to
do local, small-step changes.  Notably, reading memory through
target_ops::xfer_partial (the culprit here) relies on inferior_ptid.
Also, even doing a target call relies on the current inferior being the
right one, because target_ops::beneath finds the target beneath using
the current inferior's target stack.

It is indeed confusing who is responsible for switching / restoring the
current thread.  After talking to Pedro about this problem a few times,
I think a guideline to make progressive changes could be: if a function
A (such as btrace_compute_ftrace_bts) takes a thread pointer as
parameter (or inferior, or frame), it's because it advertises that it
will be working on that thread and does not require that thread to be
the current one.  If it did require the current thread to be the same
thread it receives as argument, then why have the parameter at all?  So
if function A (who doesn't care about the global context) needs to call
a function B (like gdb_insn_length in our example) that relies on the
current global thread, then it needs to set the current thread to
prepare the call B.  It also needs to restore the current thread that
was there on entry, because it could happen to be called from a function
that cares about the current thread.  And that caller wouldn't expect
a function that doesn't care about the global context to modify the
global context.

Following this guideline, it might make more sense to put the
switching/restoring in btrace_compute_ftrace_bts and
btrace_compute_ftrace_pt, since they are the functions A that call
functions B.  If/when we ever change gdb_insn_length and others to take
the context (thread or other) by parameter, then
btrace_compute_ftrace_bts and btrace_compute_ftrace_pt won't need to
context-switch anymore.  But from their caller's point of view, they'll
work exactly the same.

Simon
Simon Marchi via Gdb-patches July 17, 2021, 1:09 p.m. | #3
On 2021-07-15 9:44 a.m., Simon Marchi via Gdb-patches wrote:
> Yeah, I really don't like that either.  I would much prefer if we could

> always pass the context (inferior, thread, frame, whatever) down to

> functions that need them.  Unfortunately, it's not an easy change, as a

> lot of functions / interfaces rely on the global state, so it's hard to

> do local, small-step changes.  Notably, reading memory through

> target_ops::xfer_partial (the culprit here) relies on inferior_ptid.

> Also, even doing a target call relies on the current inferior being the

> right one, because target_ops::beneath finds the target beneath using

> the current inferior's target stack.

> 

> It is indeed confusing who is responsible for switching / restoring the

> current thread.  After talking to Pedro about this problem a few times,

> I think a guideline to make progressive changes could be: if a function

> A (such as btrace_compute_ftrace_bts) takes a thread pointer as

> parameter (or inferior, or frame), it's because it advertises that it

> will be working on that thread and does not require that thread to be

> the current one.  If it did require the current thread to be the same

> thread it receives as argument, then why have the parameter at all?  So

> if function A (who doesn't care about the global context) needs to call

> a function B (like gdb_insn_length in our example) that relies on the

> current global thread, then it needs to set the current thread to

> prepare the call B.  It also needs to restore the current thread that

> was there on entry, because it could happen to be called from a function

> that cares about the current thread.  And that caller wouldn't expect

> a function that doesn't care about the global context to modify the

> global context.

> 

> Following this guideline, it might make more sense to put the

> switching/restoring in btrace_compute_ftrace_bts and

> btrace_compute_ftrace_pt, since they are the functions A that call

> functions B.  If/when we ever change gdb_insn_length and others to take

> the context (thread or other) by parameter, then

> btrace_compute_ftrace_bts and btrace_compute_ftrace_pt won't need to

> context-switch anymore.  But from their caller's point of view, they'll

> work exactly the same.


In light of this, I modified my patch to follow the above:


From b153656b2b8890ab67ed697da64c8fc6fbfde03f Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@polymtl.ca>

Date: Wed, 14 Jul 2021 16:31:09 -0400
Subject: [PATCH v2] gdb: set current thread in btrace_compute_ftrace_1

As documented in bug 28086, test gdb.btrace/enable-new-thread.exp
started failing with commit 0618ae414979 ("gdb: optimize
all_matching_threads_iterator"):

    (gdb) record btrace^M
    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace
    break 24^M
    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M
    (gdb) continue^M
    Continuing.^M
    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior* find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M
    A problem internal to GDB has been detected,^M
    further debugging may prove unreliable.^M
    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp: continue to breakpoint: cont to bp.1 (GDB internal error)

Note that I only see the failure if GDB is compiled without libipt
support.  I presume that this makes it use BTS instead of PT, so
exercises different code paths.

I think that the commit above just exposed an existing problem.  The
stack trace of the internal error is:

    #8  0x0000561cb81e404e in internal_error (file=0x561cb83aa2f8 "/home/smarchi/src/binutils-gdb/gdb/inferior.c", line=303, fmt=0x561cb83aa099 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:55
    #9  0x0000561cb7b5c031 in find_inferior_pid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, pid=0) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:303
    #10 0x0000561cb7b5c102 in find_inferior_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/inferior.c:317
    #11 0x0000561cb7f1d1c3 in find_thread_ptid (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:487
    #12 0x0000561cb7f1b921 in all_matching_threads_iterator::all_matching_threads_iterator (this=0x7ffc4ee34678, filter_target=0x561cb8aafb60 <the_amd64_linux_nat_target>, filter_ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.c:125
    #13 0x0000561cb77bc462 in filtered_iterator<all_matching_threads_iterator, non_exited_thread_filter>::filtered_iterator<process_stratum_target* const&, ptid_t const&> (this=0x7ffc4ee34670) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/filtered-iterator.h:42
    #14 0x0000561cb77b97cb in all_non_exited_threads_range::begin (this=0x7ffc4ee34650) at /home/smarchi/src/binutils-gdb/gdb/thread-iter.h:243
    #15 0x0000561cb7d8ba30 in record_btrace_target::record_is_replaying (this=0x561cb8aa6250 <record_btrace_ops>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1411
    #16 0x0000561cb7d8bb83 in record_btrace_target::xfer_partial (this=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:1437
    #17 0x0000561cb7ef73a9 in raw_memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1504
    #18 0x0000561cb7ef77da in memory_xfer_partial_1 (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1635
    #19 0x0000561cb7ef78b5 in memory_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, memaddr=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1664
    #20 0x0000561cb7ef7ba4 in target_xfer_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, readbuf=0x7ffc4ee34c58 "\260g\343N\374\177", writebuf=0x0, offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1721
    #21 0x0000561cb7ef8503 in target_read_partial (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1, xfered_len=0x7ffc4ee34ad8) at /home/smarchi/src/binutils-gdb/gdb/target.c:1974
    #22 0x0000561cb7ef861f in target_read (ops=0x561cb8aa6250 <record_btrace_ops>, object=TARGET_OBJECT_CODE_MEMORY, annex=0x0, buf=0x7ffc4ee34c58 "\260g\343N\374\177", offset=140737352774277, len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:2014
    #23 0x0000561cb7ef809f in target_read_code (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1) at /home/smarchi/src/binutils-gdb/gdb/target.c:1869
    #24 0x0000561cb7937f4d in gdb_disassembler::dis_asm_read_memory (memaddr=140737352774277, myaddr=0x7ffc4ee34c58 "\260g\343N\374\177", len=1, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:139
    #25 0x0000561cb80ab66d in fetch_data (info=0x7ffc4ee34e88, addr=0x7ffc4ee34c59 "g\343N\374\177") at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:194
    #26 0x0000561cb80ab7e2 in ckprefix () at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8628
    #27 0x0000561cb80adbd8 in print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:9587
    #28 0x0000561cb80abe4f in print_insn_i386 (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/opcodes/i386-dis.c:8894
    #29 0x0000561cb7744a19 in default_print_insn (memaddr=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/arch-utils.c:1029
    #30 0x0000561cb7b33067 in i386_print_insn (pc=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/i386-tdep.c:4013
    #31 0x0000561cb7acd8f4 in gdbarch_print_insn (gdbarch=0x561cbae2fb60, vma=140737352774277, info=0x7ffc4ee34e88) at /home/smarchi/src/binutils-gdb/gdb/gdbarch.c:3478
    #32 0x0000561cb793a32d in gdb_disassembler::print_insn (this=0x7ffc4ee34e80, memaddr=140737352774277, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:795
    #33 0x0000561cb793a5b0 in gdb_print_insn (gdbarch=0x561cbae2fb60, memaddr=140737352774277, stream=0x561cb8ac99f8 <null_stream>, branch_delay_insns=0x0) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:850
    #34 0x0000561cb793a631 in gdb_insn_length (gdbarch=0x561cbae2fb60, addr=140737352774277) at /home/smarchi/src/binutils-gdb/gdb/disasm.c:859
    #35 0x0000561cb77f53f4 in btrace_compute_ftrace_bts (tp=0x561cbba11210, btrace=0x7ffc4ee35188, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1107
    #36 0x0000561cb77f55f5 in btrace_compute_ftrace_1 (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0, gaps=...) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1527
    #37 0x0000561cb77f5705 in btrace_compute_ftrace (tp=0x561cbba11210, btrace=0x7ffc4ee35180, cpu=0x0) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1560
    #38 0x0000561cb77f583b in btrace_add_pc (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1589
    #39 0x0000561cb77f5a86 in btrace_enable (tp=0x561cbba11210, conf=0x561cb8ac6878 <record_btrace_conf>) at /home/smarchi/src/binutils-gdb/gdb/btrace.c:1629
    #40 0x0000561cb7d88d26 in record_btrace_enable_warn (tp=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:294
    #41 0x0000561cb7c603dc in std::__invoke_impl<void, void (*&)(thread_info*), thread_info*> (__f=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:60
    #42 0x0000561cb7c5e5a6 in std::__invoke_r<void, void (*&)(thread_info*), thread_info*> (__fn=@0x561cbb6c4878: 0x561cb7d88cdc <record_btrace_enable_warn(thread_info*)>) at /usr/include/c++/10/bits/invoke.h:153
    #43 0x0000561cb7c5dc92 in std::_Function_handler<void (thread_info*), void (*)(thread_info*)>::_M_invoke(std::_Any_data const&, thread_info*&&) (__functor=..., __args#0=@0x7ffc4ee35310: 0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:291
    #44 0x0000561cb7f2600f in std::function<void (thread_info*)>::operator()(thread_info*) const (this=0x561cbb6c4878, __args#0=0x561cbba11210) at /usr/include/c++/10/bits/std_function.h:622
    #45 0x0000561cb7f23dc8 in gdb::observers::observable<thread_info*>::notify (this=0x561cb8ac5aa0 <gdb::observers::new_thread>, args#0=0x561cbba11210) at /home/smarchi/src/binutils-gdb/gdb/../gdbsupport/observable.h:150
    #46 0x0000561cb7f1c436 in add_thread_silent (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/thread.c:263
    #47 0x0000561cb7f1c479 in add_thread_with_info (targ=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., priv=0x561cbb3f7ab0) at /home/smarchi/src/binutils-gdb/gdb/thread.c:272
    #48 0x0000561cb7bfa1d0 in record_thread (info=0x561cbb0413a0, tp=0x0, ptid=..., th_p=0x7ffc4ee35610, ti_p=0x7ffc4ee35620) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1380
    #49 0x0000561cb7bf7a2a in thread_from_lwp (stopped=0x561cba81db20, ptid=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:429
    #50 0x0000561cb7bf7ac5 in thread_db_notice_clone (parent=..., child=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:447
    #51 0x0000561cb7bdc9a2 in linux_handle_extended_wait (lp=0x561cbae25720, status=4991) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:1981
    #52 0x0000561cb7bdf0f3 in linux_nat_filter_event (lwpid=435403, status=198015) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:2920
    #53 0x0000561cb7bdfed6 in linux_nat_wait_1 (ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3202
    #54 0x0000561cb7be0b68 in linux_nat_target::wait (this=0x561cb8aafb60 <the_amd64_linux_nat_target>, ptid=..., ourstatus=0x7ffc4ee36398, target_options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:3440
    #55 0x0000561cb7bfa2fc in thread_db_target::wait (this=0x561cb8a9acd0 <the_thread_db_target>, ptid=..., ourstatus=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/linux-thread-db.c:1412
    #56 0x0000561cb7d8e356 in record_btrace_target::wait (this=0x561cb8aa6250 <record_btrace_ops>, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/record-btrace.c:2547
    #57 0x0000561cb7ef996d in target_wait (ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2608
    #58 0x0000561cb7b6d297 in do_target_wait_1 (inf=0x561cba6d8780, ptid=..., status=0x7ffc4ee36398, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3640
    #59 0x0000561cb7b6d43e in operator() (__closure=0x7ffc4ee36190, inf=0x561cba6d8780) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3701
    #60 0x0000561cb7b6d7b2 in do_target_wait (ecs=0x7ffc4ee36370, options=...) at /home/smarchi/src/binutils-gdb/gdb/infrun.c:3720
    #61 0x0000561cb7b6e67d in fetch_inferior_event () at /home/smarchi/src/binutils-gdb/gdb/infrun.c:4069
    #62 0x0000561cb7b4659b in inferior_event_handler (event_type=INF_REG_EVENT) at /home/smarchi/src/binutils-gdb/gdb/inf-loop.c:41
    #63 0x0000561cb7be25f7 in handle_target_event (error=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:4227
    #64 0x0000561cb81e4ee2 in handle_file_event (file_ptr=0x561cbae24e10, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #65 0x0000561cb81e5490 in gdb_wait_for_event (block=0) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #66 0x0000561cb81e41be in gdb_do_one_event () at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #67 0x0000561cb7c18096 in start_event_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:421
    #68 0x0000561cb7c181e0 in captured_command_loop () at /home/smarchi/src/binutils-gdb/gdb/main.c:481
    #69 0x0000561cb7c19d7e in captured_main (data=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1353
    #70 0x0000561cb7c19df0 in gdb_main (args=0x7ffc4ee366a0) at /home/smarchi/src/binutils-gdb/gdb/main.c:1368
    #71 0x0000561cb7693186 in main (argc=11, argv=0x7ffc4ee367b8) at /home/smarchi/src/binutils-gdb/gdb/gdb.c:32

At frame 45, the new_thread observable is fired.  At this moment, the
new thread isn't the current thread, inferior_ptid is null_ptid.  I
think this is ok: the new_thread observable doesn't give any guarantee
on the global context when observers are invoked.  Frame 35,
btrace_compute_ftrace_bts, calls gdb_insn_length.  gdb_insn_length
doesn't have a thread_info or other parameter what could indicate where
to read memory from, it implicitly uses the global context
(inferior_ptid).

So we reach the all_non_exited_threads_range in
record_btrace_target::record_is_replaying with a null inferior_ptid.
The previous implemention of all_non_exited_threads_range didn't care,
but the new one does.  The problem of calling gdb_insn_length and
ultimately trying to read memory with a null inferior_ptid already
existed, but the commit mentioned above made it visible.

Something between frames 40 (record_btrace_enable_warn) and 35
(btrace_compute_ftrace_bts) needs to be switching the global context to
make TP the current thread.  Since btrace_compute_ftrace_bts takes the
thread_info to work with as a parameter, that typically means that it
doesn't require its caller to also set the global current context
(current thread) when calling.  If it needs to call other functions
that do require the global current thread to be set, then it needs to
temporarily change the current thread while calling these other
functions.  Therefore, switch and restore the current thread in
btrace_compute_ftrace_bts.

By inspection, it looks like btrace_compute_ftrace_pt may also call
functions sensitive to the global context: it installs the
btrace_pt_readmem_callback callback in the PT instruction decoder.  When
this function gets called, inferior_ptid must be set appropriately.  Add
a switch and restore in there too.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086
Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3
---
 gdb/btrace.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/gdb/btrace.c b/gdb/btrace.c
index 5e689c11d4bd..6fec8103d4ce 100644
--- a/gdb/btrace.c
+++ b/gdb/btrace.c
@@ -1052,6 +1052,12 @@ btrace_compute_ftrace_bts (struct thread_info *tp,
 			   const struct btrace_data_bts *btrace,
 			   std::vector<unsigned int> &gaps)
 {
+ /* btrace_compute_ftrace_bts may end up doing target calls that require the
+    current thread to be TP, for example reading memory through gdb_insn_length.
+    Make sure TP is the current thread.  */
+  scoped_restore_current_thread restore_thread;
+  switch_to_thread (tp);
+
   struct btrace_thread_info *btinfo;
   struct gdbarch *gdbarch;
   unsigned int blk;
@@ -1427,6 +1433,12 @@ btrace_compute_ftrace_pt (struct thread_info *tp,
 			  const struct btrace_data_pt *btrace,
 			  std::vector<unsigned int> &gaps)
 {
+ /* btrace_compute_ftrace_bts may end up doing target calls that require the
+    current thread to be TP, for example reading memory through
+    btrace_pt_readmem_callback.  Make sure TP is the current thread.  */
+  scoped_restore_current_thread restore_thread;
+  switch_to_thread (tp);
+
   struct btrace_thread_info *btinfo;
   struct pt_insn_decoder *decoder;
   struct pt_config config;
-- 
2.32.0
Simon Marchi via Gdb-patches July 19, 2021, 9:17 a.m. | #4
Hello Simon,

>> It is indeed confusing who is responsible for switching / restoring the

>> current thread.  After talking to Pedro about this problem a few times,

>> I think a guideline to make progressive changes could be: if a function

>> A (such as btrace_compute_ftrace_bts) takes a thread pointer as

>> parameter (or inferior, or frame), it's because it advertises that it

>> will be working on that thread and does not require that thread to be

>> the current one.  If it did require the current thread to be the same

>> thread it receives as argument, then why have the parameter at all?  So


Because I don't like global state and didn't want to add to the problem.


>> if function A (who doesn't care about the global context) needs to call

>> a function B (like gdb_insn_length in our example) that relies on the

>> current global thread, then it needs to set the current thread to

>> prepare the call B.  It also needs to restore the current thread that

>> was there on entry, because it could happen to be called from a function

>> that cares about the current thread.  And that caller wouldn't expect

>> a function that doesn't care about the global context to modify the

>> global context.


OK, that makes sense.  Those switch-to would go away once we got rid of
global state so we'd want to mark them somehow.

It would really be nice to put them right in front of the call that required it.
But that would mean a lot of switching back and forth.


>From b153656b2b8890ab67ed697da64c8fc6fbfde03f Mon Sep 17 00:00:00 2001

>From: Simon Marchi <simon.marchi@polymtl.ca>

>Date: Wed, 14 Jul 2021 16:31:09 -0400

>Subject: [PATCH v2] gdb: set current thread in btrace_compute_ftrace_1

>

>As documented in bug 28086, test gdb.btrace/enable-new-thread.exp

>started failing with commit 0618ae414979 ("gdb: optimize

>all_matching_threads_iterator"):

>

>    (gdb) record btrace^M

>    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace

>    break 24^M

>    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-

>gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M

>    (gdb) continue^M

>    Continuing.^M

>    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior*

>find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M

>    A problem internal to GDB has been detected,^M

>    further debugging may prove unreliable.^M

>    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp:

>continue to breakpoint: cont to bp.1 (GDB internal error)

>

>Note that I only see the failure if GDB is compiled without libipt

>support.  I presume that this makes it use BTS instead of PT, so

>exercises different code paths.


Only BTS calls gdb_insn_length ().  But, as you point out below, PT passes a
read-memory callback to the decoder.  I wonder why we are not seeing a
similar problem there?

static int
btrace_pt_readmem_callback (gdb_byte *buffer, size_t size,
                            const struct pt_asid *asid, uint64_t pc,
                            void *context)
{
  int result, errcode;

  result = (int) size;
  try
    {
      errcode = target_read_code ((CORE_ADDR) pc, buffer, size);
      if (errcode != 0)
        result = -pte_nomap;
    }
  catch (const gdb_exception_error &error)
    {
      result = -pte_nomap;
    }

  return result;
}

We turn the exception into a decoder error code.  In the corresponding
test, gdb.btrace/enable-new-thread.exp, we only check that we are recording;
we are not looking at the actual trace.


>Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086

>Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3

>---

> gdb/btrace.c | 12 ++++++++++++

> 1 file changed, 12 insertions(+)

>

>diff --git a/gdb/btrace.c b/gdb/btrace.c

>index 5e689c11d4bd..6fec8103d4ce 100644

>--- a/gdb/btrace.c

>+++ b/gdb/btrace.c

>@@ -1052,6 +1052,12 @@ btrace_compute_ftrace_bts (struct thread_info *tp,

> 			   const struct btrace_data_bts *btrace,

> 			   std::vector<unsigned int> &gaps)

> {

>+ /* btrace_compute_ftrace_bts may end up doing target calls that require the

>+    current thread to be TP, for example reading memory through

>gdb_insn_length.

>+    Make sure TP is the current thread.  */

>+  scoped_restore_current_thread restore_thread;

>+  switch_to_thread (tp);

>+

>   struct btrace_thread_info *btinfo;

>   struct gdbarch *gdbarch;

>   unsigned int blk;

>@@ -1427,6 +1433,12 @@ btrace_compute_ftrace_pt (struct thread_info *tp,

> 			  const struct btrace_data_pt *btrace,

> 			  std::vector<unsigned int> &gaps)

> {

>+ /* btrace_compute_ftrace_bts may end up doing target calls that require the


The function name is no longer valid in this copy.  Maybe just say 'We may end up ...'?


>+    current thread to be TP, for example reading memory through

>+    btrace_pt_readmem_callback.  Make sure TP is the current thread.  */

>+  scoped_restore_current_thread restore_thread;

>+  switch_to_thread (tp);

>+

>   struct btrace_thread_info *btinfo;

>   struct pt_insn_decoder *decoder;

>   struct pt_config config;

>--

>2.32.0


LGTM.

Thanks,
Markus.
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
Simon Marchi via Gdb-patches July 19, 2021, 1:36 p.m. | #5
> Hello Simon,

> 

>>> It is indeed confusing who is responsible for switching / restoring the

>>> current thread.  After talking to Pedro about this problem a few times,

>>> I think a guideline to make progressive changes could be: if a function

>>> A (such as btrace_compute_ftrace_bts) takes a thread pointer as

>>> parameter (or inferior, or frame), it's because it advertises that it

>>> will be working on that thread and does not require that thread to be

>>> the current one.  If it did require the current thread to be the same

>>> thread it receives as argument, then why have the parameter at all?  So

> 

> Because I don't like global state and didn't want to add to the problem.


Sorry if that wasn't clear, that wasn't a "why did Markus do it like
this" question but just part of the reasoning of why a function
shouldn't use both a parameter _and_ rely on the global context.

>>> if function A (who doesn't care about the global context) needs to call

>>> a function B (like gdb_insn_length in our example) that relies on the

>>> current global thread, then it needs to set the current thread to

>>> prepare the call B.  It also needs to restore the current thread that

>>> was there on entry, because it could happen to be called from a function

>>> that cares about the current thread.  And that caller wouldn't expect

>>> a function that doesn't care about the global context to modify the

>>> global context.

> 

> OK, that makes sense.  Those switch-to would go away once we got rid of

> global state so we'd want to mark them somehow.


Exact.

I guess we could imagine a system of attributes to mark functions as
using the global context and have some static analysis making sure that
when a function not using the global context calls one using the global
context, a function setting the global context is called before.  But,
that's probably not realistic.

> It would really be nice to put them right in front of the call that required it.

> But that would mean a lot of switching back and forth.


Indeed.  I initially put the switch in the narrowest scope possible,
that is in the try around the gdb_insn_length call.  But then some other
function call later also needed the current thread to be set.  I think
in this case it's reasonable to have just one switch for the whole
function.

>>From b153656b2b8890ab67ed697da64c8fc6fbfde03f Mon Sep 17 00:00:00 2001

>> From: Simon Marchi <simon.marchi@polymtl.ca>

>> Date: Wed, 14 Jul 2021 16:31:09 -0400

>> Subject: [PATCH v2] gdb: set current thread in btrace_compute_ftrace_1

>>

>> As documented in bug 28086, test gdb.btrace/enable-new-thread.exp

>> started failing with commit 0618ae414979 ("gdb: optimize

>> all_matching_threads_iterator"):

>>

>>    (gdb) record btrace^M

>>    (gdb) PASS: gdb.btrace/enable-new-thread.exp: record btrace

>>    break 24^M

>>    Breakpoint 2 at 0x555555555175: file /home/smarchi/src/binutils-

>> gdb/gdb/testsuite/gdb.btrace/enable-new-thread.c, line 24.^M

>>    (gdb) continue^M

>>    Continuing.^M

>>    /home/smarchi/src/binutils-gdb/gdb/inferior.c:303: internal-error: inferior*

>> find_inferior_pid(process_stratum_target*, int): Assertion `pid != 0' failed.^M

>>    A problem internal to GDB has been detected,^M

>>    further debugging may prove unreliable.^M

>>    Quit this debugging session? (y or n) FAIL: gdb.btrace/enable-new-thread.exp:

>> continue to breakpoint: cont to bp.1 (GDB internal error)

>>

>> Note that I only see the failure if GDB is compiled without libipt

>> support.  I presume that this makes it use BTS instead of PT, so

>> exercises different code paths.

> 

> Only BTS calls gdb_insn_length ().  But, as you point out below, PT passes a

> read-memory callback to the decoder.  I wonder why we are not seeing a

> similar problem there?

> 

> static int

> btrace_pt_readmem_callback (gdb_byte *buffer, size_t size,

>                             const struct pt_asid *asid, uint64_t pc,

>                             void *context)

> {

>   int result, errcode;

> 

>   result = (int) size;

>   try

>     {

>       errcode = target_read_code ((CORE_ADDR) pc, buffer, size);

>       if (errcode != 0)

>         result = -pte_nomap;

>     }

>   catch (const gdb_exception_error &error)

>     {

>       result = -pte_nomap;

>     }

> 

>   return result;

> }

> 

> We turn the exception into a decoder error code.  In the corresponding

> test, gdb.btrace/enable-new-thread.exp, we only check that we are recording;

> we are not looking at the actual trace.


Good point.  I didn't have much time to dig into why PT didn't fail, but
it sounds like the test could be improved.  Would you like to improve
the test in such a way that it would have caught the issue with PT?

>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28086

>> Change-Id: I407fbfe41aab990068bd102491aa3709b0a034b3

>> ---

>> gdb/btrace.c | 12 ++++++++++++

>> 1 file changed, 12 insertions(+)

>>

>> diff --git a/gdb/btrace.c b/gdb/btrace.c

>> index 5e689c11d4bd..6fec8103d4ce 100644

>> --- a/gdb/btrace.c

>> +++ b/gdb/btrace.c

>> @@ -1052,6 +1052,12 @@ btrace_compute_ftrace_bts (struct thread_info *tp,

>> 			   const struct btrace_data_bts *btrace,

>> 			   std::vector<unsigned int> &gaps)

>> {

>> + /* btrace_compute_ftrace_bts may end up doing target calls that require the

>> +    current thread to be TP, for example reading memory through

>> gdb_insn_length.

>> +    Make sure TP is the current thread.  */

>> +  scoped_restore_current_thread restore_thread;

>> +  switch_to_thread (tp);

>> +

>>   struct btrace_thread_info *btinfo;

>>   struct gdbarch *gdbarch;

>>   unsigned int blk;

>> @@ -1427,6 +1433,12 @@ btrace_compute_ftrace_pt (struct thread_info *tp,

>> 			  const struct btrace_data_pt *btrace,

>> 			  std::vector<unsigned int> &gaps)

>> {

>> + /* btrace_compute_ftrace_bts may end up doing target calls that require the

> 

> The function name is no longer valid in this copy.  Maybe just say 'We may end up ...'?


Sorry, copy-pasto.  You're right I don't need to name the function in
the comment.  That came from my previous comment that was in
btrace_compute_ftrace_1.

>> +    current thread to be TP, for example reading memory through

>> +    btrace_pt_readmem_callback.  Make sure TP is the current thread.  */

>> +  scoped_restore_current_thread restore_thread;

>> +  switch_to_thread (tp);

>> +

>>   struct btrace_thread_info *btinfo;

>>   struct pt_insn_decoder *decoder;

>>   struct pt_config config;

>> --

>> 2.32.0

> 

> LGTM.


Thanks, I'll commit it with the comments adjusted.

Simon

Patch

diff --git a/gdb/btrace.c b/gdb/btrace.c
index 5e689c11d4bd..9dd6f1dc9e94 100644
--- a/gdb/btrace.c
+++ b/gdb/btrace.c
@@ -1518,6 +1518,12 @@  btrace_compute_ftrace_1 (struct thread_info *tp,
 {
   DEBUG ("compute ftrace");
 
+ /* btrace_compute_ftrace_bts and btrace_compute_ftrace_pt may end up doing
+    target calls that require the current thread to be TP, for example to read
+    memory.  Make sure TP is the current thread.  */
+  scoped_restore_current_thread restore_thread;
+  switch_to_thread (tp);
+
   switch (btrace->format)
     {
     case BTRACE_FORMAT_NONE: