[Bug gdb/23392] New: GDB hangs forever when attaching to threaded program with futexes
etesta at undo dot io
2018-07-10 04:49:56 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=23392

Bug ID: 23392
Summary: GDB hangs forever when attaching to threaded program
with futexes
Product: gdb
Version: 8.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: gdb
Assignee: unassigned at sourceware dot org
Reporter: etesta at undo dot io
Target Milestone: ---

Created attachment 11119
--> https://sourceware.org/bugzilla/attachment.cgi?id=11119&action=edit
contains all files required to reproduce the issue.

I have an automated test running gdb through pexpect and it intermittently
times out waiting for the gdb prompt.
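The timeout check described above can be sketched without pexpect, using only the Python standard library. This is not the attached test2364_repro.py; the function name finishes_in_time and the gdb command line in the comment are illustrative assumptions.

```python
import subprocess


def finishes_in_time(cmd, timeout_s):
    """Run cmd; return True if it exits within timeout_s seconds,
    False if it is still running when the timeout expires."""
    try:
        subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising, so a hung
        # process is not left behind.
        return False
    return True


# Hypothetical usage against an already running test program (PID assumed):
#   finishes_in_time(["gdb", "-batch", "-p", "13136"], timeout_s=30)
```

A False result only says that the command did not return in time; the gdb backtrace is still needed to see where it is stuck.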

The program GDB attaches to has 10 threads that perform a variety of
operations on futexes.
We spawn all the threads before attaching, and sometimes GDB blocks forever in
the sigsuspend() call at line 3428 of linux-nat.c, in function
linux_nat_wait_1().

I read the code and I'm struggling to understand the logic. The comments help,
but only to an extent. What I'm sure about is that, being a timing issue, it
might be hard to reproduce:
on a test machine running suse 123 with kernel 3.7 it takes about 4 to 8 runs;
on my machine (Ubuntu 16.04 with kernel 4.4) it takes around 1000 runs.

I tested only two versions, GDB 7.7 and GDB 8.1, and I can reproduce the hang
only on 8.1. I tried compiling the git repository but I keep banging my head
against a YACC crash.

Please find attached all the files required to reproduce:

test2364.c is the C file with all the threads that GDB will attach to. Compile
it with just "$ gcc -g -o test2364 test2364.c -lpthread"

test2364_repro.py is the test infrastructure. I cut it down, so it is a bit
rough: you'll have to manually edit the paths to the test executable and the
gdb installation.
Run it once with "$ python test2364_repro.py"

wrapper_script.sh is just a small utility to run the test until it fails (on my
machine it takes around 500 runs). Run it with "$ ./wrapper_script.sh"
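The run-until-failure idea behind the wrapper can be sketched as follows. This is not the attached wrapper_script.sh; runs_until_failure and the run_once callable are hypothetical names, and run_once is assumed to return True when a single test run passes.

```python
def runs_until_failure(run_once, max_runs):
    """Call run_once() up to max_runs times.

    Return the 1-based number of the first failing run, or None if all
    max_runs runs passed.
    """
    for attempt in range(1, max_runs + 1):
        if not run_once():
            return attempt
    return None
```

Capping the loop with max_runs keeps the script from spinning forever on a build where the hang no longer reproduces.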

The backtrace of gdb when the bug manifests itself is:

(gdb) bt
#0 0x00007fc287eab7e4 in __GI___sigsuspend (set=set@entry=0xd6ad20
<suspend_mask>)
at ../sysdeps/unix/sysv/linux/sigsuspend.c:30
#1 0x0000000000445a7e in linux_nat_wait_1 (ops=<optimised out>,
target_options=0, ourstatus=0x7ffd1e345f40, ptid=...)
at ../../gdb-8.1/gdb/linux-nat.c:3428
#2 linux_nat_wait (ops=<optimised out>, ptid=..., ourstatus=0x7ffd1e345f40,
target_options=0)
at ../../gdb-8.1/gdb/linux-nat.c:3636
#3 0x000000000043d51a in thread_db_wait (
During symbol reading, Only single DW_OP_reg or DW_OP_fbreg is supported for
DW_FORM_block* DW_AT_location is supported for DW_TAG_call_site child DIE
0x2068fa0 [in module /home/etesta/tmp/gdb81_build/gdb/gdb].
ops=<optimised out>, ptid=..., ourstatus=0x7ffd1e345f40, options=0) at
../../gdb-8.1/gdb/linux-thread-db.c:1117
#4 0x0000000000673bfd in delegate_wait (self=<optimised out>, arg1=...,
arg2=<optimised out>, arg3=<optimised out>)
at ../../gdb-8.1/gdb/target-delegates.c:138
#5 0x0000000000682ca4 in target_wait (ptid=...,
status=status@entry=0x7ffd1e345f40, options=options@entry=0)
at ../../gdb-8.1/gdb/target.c:2179
#6 0x00000000005cfb97 in wait_one (ws=0x7ffd1e345f40) at
../../gdb-8.1/gdb/infrun.c:4347
#7 stop_all_threads () at ../../gdb-8.1/gdb/infrun.c:4559
#8 0x00000000005d0f0f in stop_waiting (ecs=0x7ffd1e346240) at
../../gdb-8.1/gdb/infrun.c:7661
#9 0x00000000005d42c5 in handle_signal_stop (ecs=ecs@entry=0x7ffd1e346240) at
../../gdb-8.1/gdb/infrun.c:5754
#10 0x00000000005d58e8 in handle_inferior_event_1 (ecs=0x7ffd1e346240) at
../../gdb-8.1/gdb/infrun.c:5379
#11 handle_inferior_event (ecs=ecs@entry=0x7ffd1e346240) at
../../gdb-8.1/gdb/infrun.c:5414
#12 0x00000000005d6e3e in fetch_inferior_event (client_data=<optimised out>) at
../../gdb-8.1/gdb/infrun.c:3930
#13 0x0000000000595a7d in gdb_wait_for_event (block=block@entry=0) at
../../gdb-8.1/gdb/event-loop.c:859
#14 0x0000000000595ba7 in gdb_do_one_event () at
../../gdb-8.1/gdb/event-loop.c:322
#15 0x0000000000595cf6 in gdb_do_one_event () at
../../gdb-8.1/gdb/event-loop.c:353
#16 0x0000000000694d2c in wait_sync_command_done () at
../../gdb-8.1/gdb/top.c:503
#17 0x0000000000694d6a in maybe_wait_sync_command_done
(was_sync=was_sync@entry=0) at ../../gdb-8.1/gdb/top.c:520
#18 0x00000000005ec36f in catch_command_errors (command=0x5c6270
<attach_command(char const*, int)>,
arg=arg@entry=0x7ffd1e347179 "13136", from_tty=1) at
../../gdb-8.1/gdb/main.c:380
#19 0x00000000005ed449 in captured_main_1 (context=0x7ffd1e3463a0,
this=<optimised out>)
at ../../gdb-8.1/gdb/main.c:1061
#20 captured_main (data=0x7ffd1e3463a0) at ../../gdb-8.1/gdb/main.c:1147
#21 gdb_main (args=args@entry=0x7ffd1e3464e0) at ../../gdb-8.1/gdb/main.c:1173
#22 0x000000000040e835 in main (argc=<optimised out>, argv=<optimised out>) at
../../gdb-8.1/gdb/gdb.c:32

I am available to give more information should I have forgotten something.
--
You are receiving this mail because:
You are on the CC list for the bug.
etesta at undo dot io
2018-07-10 12:45:44 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=23392

Emiliano Testa <etesta at undo dot io> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |etesta at undo dot io
--
autkin at undo dot io
2018-07-25 07:56:15 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=23392

Andrey Utkin <autkin at undo dot io> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |autkin at undo dot io
--
tromey at sourceware dot org
2018-07-27 14:30:21 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=23392

Tom Tromey <tromey at sourceware dot org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |tromey at sourceware dot org

--- Comment #1 from Tom Tromey <tromey at sourceware dot org> ---
Could you try git master or the 8.2 branch?
This sounded a lot like the problem addressed here:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=7010835a6c5fd3679feae7e6345f2f10d4d761b4

> but I keep banging my head against a YACC crash
Either byacc or bison should be fine. If you post the
details maybe I can help with this problem too.
--
etesta at undo dot io
2018-07-30 06:01:36 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=23392

--- Comment #2 from Emiliano Testa <etesta at undo dot io> ---
Hi Tom,

The same test I attached passes 100 runs with the current master branch of gdb
on a machine where the test failed after 5 runs on average.

If it is not fixed (with intermittent failures it's hard to say they are gone
forever) it is definitely way harder to reproduce.
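As a rough check on how much weight 100 clean runs carry, here is a small sketch. It assumes the runs are independent and that the old per-run failure rate was about 1 in 5 (taken from "failed on average after 5 runs"); the function name prob_all_pass is hypothetical.

```python
def prob_all_pass(p_fail, runs):
    """Probability that `runs` independent runs all pass when each run
    fails with probability p_fail."""
    return (1.0 - p_fail) ** runs


# Assumed old failure rate: 1 in 5 per run.
p_unfixed = prob_all_pass(0.2, 100)
print(f"{p_unfixed:.1e}")  # roughly 2e-10
```

Under those assumptions, 100 passes in a row would be extremely unlikely if nothing had changed. It does not rule out a much rarer failure, for example one that shows up once in thousands of runs.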

Thanks!

Emiliano
--
tromey at sourceware dot org
2018-07-30 12:54:47 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=23392

Tom Tromey <tromey at sourceware dot org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
Target Milestone|--- |8.2

--- Comment #3 from Tom Tromey <tromey at sourceware dot org> ---
Thank you for trying it out.
I am going to close this, but if you encounter it again, feel
free to reopen it.
--