[Bug backtrace/23134] New: gdb attached to a crashed process shows a different call stack than when running directly

Discussion:

ralf.habacker at freenet dot de

2018-05-03 10:06:42 UTC

https://sourceware.org/bugzilla/show_bug.cgi?id=23134

Bug ID: 23134
Summary: gdb attached to a crashed process shows a different
call stack than when running directly
Product: gdb
Version: 7.8
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: backtrace
Assignee: unassigned at sourceware dot org
Reporter: ralf.habacker at freenet dot de
Target Milestone: ---

Attaching gdb to a crashed process on Windows shows a different backtrace
compared to running the related process directly from gdb.

How to reproduce:
1. Compile the following source with the MinGW compiler

#include <stddef.h>

static void
foo()
{
  int *f = NULL;
  *f = 0;  /* deliberate NULL dereference to trigger the crash */
}

static void
bar()
{
  foo();
}

int
main()
{
  bar();
  return 0;
}

2. Run

f:>gdb test.exe
GNU gdb (GDB) 7.8.1
...
Reading symbols from test.exe...done.
(gdb) r
Starting program: f:\test.exe
[New Thread 1532.0xee0]

Program received signal SIGSEGV, Segmentation fault.
0x00401560 in foo () at f:/test.c:7
(gdb) bt
#0 0x00401560 in foo () at f:/test.c:7
#1 0x00401571 in bar () at f:/test.c:13
#2 0x00401584 in main () at f:/test.c:20
(gdb) thread apply all bt

Thread 1 (Thread 1532.0xee0):
#0 0x00401560 in foo () at f:/test.c:7
#1 0x00401571 in bar () at f:/test.c:13
#2 0x00401584 in main () at f:/test.c:20
(gdb)

This shows the expected call stack

3. Run test.exe directly and attach with gdb after the crash has happened

F:\>gdb --pid=1316
GNU gdb (GDB) 7.8.1
...
Attaching to process 1316
[New Thread 1316.0x4e8]
[New Thread 1316.0xd40]
Reading symbols from f:\test.exe...done.
0x7700000d in ntdll!DbgBreakPoint () from C:\Windows\SysWOW64\ntdll.dll
(gdb) bt
#0 0x7700000d in ntdll!DbgBreakPoint () from C:\Windows\SysWOW64\ntdll.dll
#1 0x7708fbe6 in ntdll!DbgUiRemoteBreakin () from C:\Windows\SysWOW64\ntdll.dll
#2 0x73d89d90 in ?? ()
#3 0x00000000 in ?? ()
(gdb) thread apply all bt

Thread 2 (Thread 1316.0xd40):
#0 0x7700000d in ntdll!DbgBreakPoint () from C:\Windows\SysWOW64\ntdll.dll
#1 0x7708fbe6 in ntdll!DbgUiRemoteBreakin () from C:\Windows\SysWOW64\ntdll.dll
#2 0x73d89d90 in ?? ()
#3 0x00000000 in ?? ()

Thread 1 (Thread 1316.0x4e8):
#0 0x7701019d in ntdll!ZwWaitForMultipleObjects () from C:\Windows\SysWOW64\ntdll.dll
#1 0x7701019d in ntdll!ZwWaitForMultipleObjects () from C:\Windows\SysWOW64\ntdll.dll
#2 0x74e015f7 in WaitForMultipleObjectsEx () from C:\Windows\syswow64\KernelBase.dll
#3 0x00000002 in ?? ()
#4 0x0028f6f0 in ?? ()
#5 0x76161a0c in WaitForMultipleObjectsEx () from C:\Windows\syswow64\kernel32.dll
#6 0x76164200 in WaitForMultipleObjects () from C:\Windows\syswow64\kernel32.dll
#7 0x761880bc in KERNEL32!GetApplicationRecoveryCallback () from C:\Windows\syswow64\kernel32.dll
#8 0x76187f7b in KERNEL32!GetApplicationRecoveryCallback () from C:\Windows\syswow64\kernel32.dll
#9 0x76187870 in UnhandledExceptionFilter () from C:\Windows\syswow64\kernel32.dll
#10 0x0028f8ec in ?? ()
#11 0x761877ef in UnhandledExceptionFilter () from C:\Windows\syswow64\kernel32.dll
#12 0x0028f8ec in ?? ()
#13 0x7706344f in ntdll!RtlKnownExceptionFilter () from C:\Windows\SysWOW64\ntdll.dll
#14 0x77029855 in ntdll!RtlInitializeExceptionChain () from C:\Windows\SysWOW64\ntdll.dll
#15 0x00000000 in ?? ()

This call stack is not related to the call stack shown in step 2.

--
You are receiving this mail because:
You are on the CC list for the bug.

palves at redhat dot com

2018-05-03 11:44:01 UTC

Pedro Alves <palves at redhat dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |palves at redhat dot com

--- Comment #1 from Pedro Alves <palves at redhat dot com> ---

Post by ralf.habacker at freenet dot de
3. Run test.exe directly and attach with gdb after crash happened
F:\>gdb --pid=1316

If the crash already happened, how come you can still attach to the process?

Is it showing some dialog box? If something like that, then the difference
should be expected -- thread #1 appears to be waiting for something, while
inside some GetApplicationRecoveryCallback function, after a SEH exception was
raised (UnhandledExceptionFilter).

When running under gdb, gdb intercepts the exception/crash _before_ any of that
has a chance to run.

Post by ralf.habacker at freenet dot de
GNU gdb (GDB) 7.8.1

Note that that's old by now.

ralf.habacker at freenet dot de

2018-05-03 12:18:27 UTC

--- Comment #2 from Ralf Habacker <ralf.habacker at freenet dot de> ---
(In reply to Pedro Alves from comment #1)

Post by palves at redhat dot com

Post by ralf.habacker at freenet dot de
3. Run test.exe directly and attach with gdb after crash happened
F:\>gdb --pid=1316

If the crash already happened, how come you can still attach to the process?
Is it showing some dialog box?

There is a default crash handler, which shows a message box saying that the
application has crashed.

Post by palves at redhat dot com
If something like that, then the difference
should be expected -- thread #1 appears to be waiting for something, while
inside some GetApplicationRecoveryCallback function, after a SEH exception
was raised (UnhandledExceptionFilter).
When running under gdb, gdb intercepts the exception/crash _before_ any of
that has a chance to run.

If I start a KDE application on Linux, a crash handler (drkonqi) is installed
by default.
In case of an unhandled exception the crash handler is started
(https://cgit.kde.org/kdelibs.git/tree/kdeui/util/kcrash.cpp?h=KDE/4.14#n481),
similar to the default crash handler on Windows. This crash handler starts gdb
in the background to collect the backtrace, which contains the same call stack
for the crashing thread as when the application is started from gdb and crashes
inside gdb.

On Windows, drkonqi does not use gdb but a backtrace generator named kdbgwin
(https://cgit.kde.org/drkonqi.git/tree/src/kdbgwin) with broken mingw symbol
support, although there is also support for using gdb on Windows (which has
working mingw symbol support). Unfortunately, gdb on Windows does not behave as
it does on Linux for this use case, which would be required to replace kdbgwin,
reduce the maintenance burden and have up-to-date mingw symbol support (and
maybe pdb symbol support too).

Post by palves at redhat dot com

Post by ralf.habacker at freenet dot de
GNU gdb (GDB) 7.8.1

Note that that's old by now.

Has the mentioned behavior been changed in newer gdb versions ?

palves at redhat dot com

2018-05-03 13:50:24 UTC

--- Comment #3 from Pedro Alves <palves at redhat dot com> ---
(In reply to Ralf Habacker from comment #2)

Post by ralf.habacker at freenet dot de
Unfortunately, gdb on Windows does
not behave as it does on Linux for this use case, which would be required to
replace kdbgwin, reduce the maintenance burden and have up-to-date mingw
symbol support (and maybe pdb symbol support too)

gdb does not support pdb symbols.

Post by ralf.habacker at freenet dot de

Post by palves at redhat dot com

Post by ralf.habacker at freenet dot de
GNU gdb (GDB) 7.8.1

Note that that's old by now.

Has the mentioned behavior been changed in newer gdb versions ?

I'm not sure what there is to change, if you attach at that point. I assume
that what happened is that the application threw an access violation exception,
nothing in the program caught the exception, and so it unwound the
stack all the way to main, and the default handler opened a dialog box for the
uncaught exception. So when you attach, that's what you're debugging.

Or is it that the user code should be visible further up the stack, above
frame #15? I doubt it, since we see 2 threads, and I'd expect 3: 2 for the
program's original threads, and a third for the one that Windows injects in the
process to halt it (see DbgUiRemoteBreakin in the stack).

gdb 7.12 added the "signal-event" command. From the gdb/NEWS file:

signal-event EVENTID
Signal ("set") the given MS-Windows event object. This is used in
conjunction with the Windows JIT debugging (AeDebug) support, where
the OS suspends a crashing process until a debugger can attach to
it. Resuming the crashing process, in order to debug it, is done by
signalling an event.

See more in the manual:
https://www.sourceware.org/gdb/onlinedocs/gdb.html#index-signal_002devent

I believe that's the recommended way to attach to a dying process on Windows.
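
To illustrate how the two numbers reach gdb: the gdb manual suggests an AeDebug
"Debugger" registry value of roughly the form
<path-to-gdb.exe> -ex "attach %ld" -ex "signal-event %ld" -ex "continue",
where Windows replaces the first %ld with the crashing process ID and the
second with the event ID. The C sketch below only mimics that substitution to
show the command line gdb ends up with. The helper name
expand_aedebug_command is hypothetical and not part of gdb or Windows, and the
exact template should be checked against the manual for the gdb version in use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: mimics how the two %ld placeholders of an AeDebug
   "Debugger" value get filled in.  The first %ld becomes the process ID,
   the second the event ID.  Returns 0 on success, -1 if the template
   lacks two placeholders or the result does not fit in OUT.  */
static int
expand_aedebug_command (const char *tmpl, long pid, long event_id,
                        char *out, size_t out_size)
{
  const char *first, *second;
  int n;

  assert (tmpl != NULL && out != NULL);

  first = strstr (tmpl, "%ld");
  if (first == NULL)
    return -1;
  second = strstr (first + 3, "%ld");
  if (second == NULL)
    return -1;

  /* Copy the three literal pieces around the two substituted numbers.  */
  n = snprintf (out, out_size, "%.*s%ld%.*s%ld%s",
                (int) (first - tmpl), tmpl, pid,
                (int) (second - (first + 3)), first + 3, event_id,
                second + 3);
  return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```

With the process ID 1316 from the report and a made-up event ID, the expanded
command would attach to process 1316, signal the event, and continue.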

It may be that newer gdbs unwind through system dlls better. I'm not sure.
Try it yourself, and let us know.

ralf.habacker at freenet dot de

2018-05-04 08:06:55 UTC

--- Comment #4 from Ralf Habacker <ralf.habacker at freenet dot de> ---
(In reply to Pedro Alves from comment #3)

Post by palves at redhat dot com

Post by ralf.habacker at freenet dot de

Post by palves at redhat dot com

Post by ralf.habacker at freenet dot de
GNU gdb (GDB) 7.8.1

Note that that's old by now.

Has the mentioned behavior been changed in newer gdb versions ?

I need to inspect this in more detail to give an answer, but it may be
superseded by the approach mentioned below.

Post by palves at redhat dot com
signal-event EVENTID
Signal ("set") the given MS-Windows event object. This is used in
conjunction with the Windows JIT debugging (AeDebug) support, where
the OS suspends a crashing process until a debugger can attach to
it. Resuming the crashing process, in order to debug it, is done by
signalling an event.
https://www.sourceware.org/gdb/onlinedocs/gdb.html#index-signal_002devent
I believe that's the recommended way to attach to a dying process on Windows.

I tried this with gdb and it works out of the box - thanks for this pointer.
Unfortunately, adding the registry key(s) requires administrative privileges,
which is a no-go for portable installations. Using a custom crash handler as
implemented in KDE apps is the only solution here.

https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/enabling-postmortem-debugging
states

" DWORD (%ld) - Event Handle duplicated into the postmortem debugger process.
If the postmortem debugger signals the event, WER will continue the target
process without waiting for the postmortem debugger to terminate. The event
should only be signaled if the issue has been resolved. If the postmortem
debugger terminates without signaling the event, WER continues the collection
of information about the target processes."

This looks like a way to make gdb handle this case. Currently, crashing KDE
apps wait unconditionally for the previously started crash handler process to
terminate (
https://cgit.kde.org/kdelibs.git/tree/kdeui/util/kcrash.cpp?h=KDE/4.14#n475),
which would need to be changed to waiting on an event signaled by the attached
gdb process - what do you think ?
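
A minimal, untested sketch of that idea in C, not the KDE code and not the test
case attached to this bug: the crash handler creates an inheritable event,
starts gdb with "attach" and "signal-event", and waits on the event instead of
on gdb's termination. The names format_gdb_command, crash_filter and
install_crash_filter are hypothetical, "gdb.exe" is assumed to be on the PATH,
and returning EXCEPTION_CONTINUE_EXECUTION (so that the fault repeats while gdb
is attached) is an assumption about how to get the original stack.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: format a gdb command line that attaches to PID,
   signals EVENT_ID via "signal-event", resumes the process and prints a
   backtrace once it stops again.  Returns 0 on success, -1 if the
   result does not fit in OUT.  */
static int
format_gdb_command (char *out, size_t out_size, const char *gdb,
                    unsigned long pid, unsigned long event_id)
{
  int n;

  assert (out != NULL && gdb != NULL);
  n = snprintf (out, out_size,
                "\"%s\" -ex \"attach %lu\" -ex \"signal-event %lu\""
                " -ex \"continue\" -ex \"bt\"",
                gdb, pid, event_id);
  return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}

#ifdef _WIN32
/* Unhandled-exception filter: start gdb, then block until gdb signals
   the event rather than until gdb exits.  */
static LONG WINAPI
crash_filter (EXCEPTION_POINTERS *info)
{
  SECURITY_ATTRIBUTES sa = { sizeof sa, NULL, TRUE };  /* inheritable */
  STARTUPINFOA si = { sizeof si };
  PROCESS_INFORMATION pi;
  char cmd[512];
  LONG result = EXCEPTION_CONTINUE_SEARCH;
  /* Manual-reset event, initially unsignaled, unnamed.  */
  HANDLE event = CreateEventA (&sa, TRUE, FALSE, NULL);

  (void) info;
  if (event == NULL)
    return result;

  if (format_gdb_command (cmd, sizeof cmd, "gdb.exe",
                          GetCurrentProcessId (),
                          (unsigned long) (uintptr_t) event) == 0
      && CreateProcessA (NULL, cmd, NULL, NULL, TRUE /* inherit handles */,
                         0, NULL, NULL, &si, &pi))
    {
      WaitForSingleObject (event, INFINITE);
      CloseHandle (pi.hThread);
      CloseHandle (pi.hProcess);
      /* Assumption: re-running the faulting instruction lets the now
         attached gdb see the fault in its original context.  */
      result = EXCEPTION_CONTINUE_EXECUTION;
    }
  CloseHandle (event);
  return result;
}

/* Call once at startup, for example at the top of main().  */
static void
install_crash_filter (void)
{
  SetUnhandledExceptionFilter (crash_filter);
}
#endif
```

Passing the event handle value on the command line relies on the handle being
inherited by the gdb process, which is why the event is created inheritable
and the child is started with handle inheritance enabled. No registry changes
are involved in this variant.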

Post by palves at redhat dot com
It may be that newer gdbs unwind through system dlls better. I'm not sure.
Try it yourself, and let us know.

I tried that with gdb 8.1 and did not see any changes.

ralf.habacker at freenet dot de

2018-05-07 08:39:47 UTC

--- Comment #5 from Ralf Habacker <ralf.habacker at freenet dot de> ---
Created attachment 10997
--> https://sourceware.org/bugzilla/attachment.cgi?id=10997&action=edit
test case source

Got it working with gdb 8.1 and the 'signal-event' command (see the attached
test case source).

ralf.habacker at freenet dot de

2018-05-07 08:50:28 UTC

--- Comment #6 from Ralf Habacker <ralf.habacker at freenet dot de> ---
(In reply to Pedro Alves from comment #3)

Post by palves at redhat dot com
(In reply to Ralf Habacker from comment #2)

gdb does not support pdb symbols.

That is a pity. Is it because no one has cared about it in the past, or are
there other reasons ?
I'm asking because gdb is very easy to integrate with the help of the '-ex'
command option, and it does not look like much work to add pdb symbol support
for backtraces compared to mingw using bfd (see
https://github.com/cloudwu/backtrace-mingw/blob/master/backtrace.c#L323 for
initializing and
https://github.com/cloudwu/backtrace-mingw/blob/master/backtrace.c#L289 for
fetching symbol information).
