Discussion:
[Bug threads/22960] New: brew gdb 8.1 (but not 8.0.1) breakpoint trap in mac os high sierra 10.13.3
xdavidliu at gmail dot com
2018-03-13 13:51:55 UTC
Permalink
https://sourceware.org/bugzilla/show_bug.cgi?id=22960

Bug ID: 22960
Summary: brew gdb 8.1 (but not 8.0.1) breakpoint trap in mac os
high sierra 10.13.3
Product: gdb
Version: 8.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: threads
Assignee: unassigned at sourceware dot org
Reporter: xdavidliu at gmail dot com
Target Milestone: ---

I filed this bug at the homebrew page, so the relevant info can be found there.
https://github.com/Homebrew/homebrew-core/issues/25172

If someone wants me to copy the info into a post here, for further convenience,
I would be happy to.

This may not actually be the same as bug #20266 here, which was from two years
ago, since the issue occurs only with gdb 8.1 from Homebrew on macOS, *not* with
gdb 8.0.1 from Homebrew. With 8.0.1, gdb works correctly (assuming codesigning is
done correctly, which is unrelated but has caused many users trouble), with
only a dyld version warning.
--
You are receiving this mail because:
You are on the CC list for the bug.
xdavidliu at gmail dot com
2018-03-13 19:24:33 UTC

xdavidliu at gmail dot com changed:

What |Removed |Added
----------------------------------------------------------------------------
Component|threads |gdb
--
thor.lilei at gmail dot com
2018-04-05 11:33:41 UTC

thor.lilei at gmail dot com changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |thor.lilei at gmail dot com

--- Comment #1 from thor.lilei at gmail dot com ---
Same problem with me.
--
ray.seyfarth at gmail dot com
2018-04-27 19:21:22 UTC

Ray Seyfarth <ray.seyfarth at gmail dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |ray.seyfarth at gmail dot com

--- Comment #2 from Ray Seyfarth <ray.seyfarth at gmail dot com> ---
Same problem with me.

I have to wonder how Apple gets lldb to work with no problems. It appears to
not be codesigned nor setuid/gid or anything special. There is an option which
works which Apple should share with the open source community. They have
wasted a lot of my time.
--
palves at redhat dot com
2018-04-28 10:55:34 UTC

Pedro Alves <palves at redhat dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |palves at redhat dot com

--- Comment #3 from Pedro Alves <palves at redhat dot com> ---
If gdb 8.0 works, but gdb 8.1 doesn't, then that suggests doing a git bisect to
find the exact change in gdb that caused the problem. Any takers?
--
saagar at saagarjha dot com
2018-05-24 08:05:38 UTC

Saagar Jha <saagar at saagarjha dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |saagar at saagarjha dot com

--- Comment #4 from Saagar Jha <saagar at saagarjha dot com> ---
I think I've found the culprit:

$ git bisect run ../gdb-bisect.sh
running /Users/saagarjha/Git/bisect-test.sh

[Snip]

f6ac5f3d63e03a81c4ff3749aba234961cc9090e is the first bad commit
commit f6ac5f3d63e03a81c4ff3749aba234961cc9090e
Author: Pedro Alves <***@redhat.com>
Date: Thu May 3 00:37:22 2018 +0100

Convert struct target_ops to C++

[Snip]

bisect run success

Could someone confirm this for me? Commits before this one can successfully
follow the debuggee to completion without incident, but the ones after and
including this one crash with a null pointer dereference in
gdb`push_target(struct target_ops *) at target.c:653. From a cursory glance, it
seems a little fishy that darwin-nat.c doesn't have any sort of add_target call
in it, but I can't understand the code in the C/C++ frankenstein state it's in
right now, so I wasn't able to come up with a fix. (I did find a bunch of
undefined behavior being hit, though, which I *do* have patches for. Let me
know if you're interested in seeing them.)

<rant>
Just as a FYI, confirming this particular commit took well over two days and
testing over two hundred revisions, which is something that I find as an
outside observer to be truly horrible. Does GDB have *no* automated testing or
continuous integration whatsoever? Putting aside the fact that any such
infrastructure would catch simple bugs like this one, which are easy to
reproduce, it would have also made my life bisecting a lot easier. Many
intermediate commits are broken, as in they *literally don't build on macOS*,
because someone forgot a header file or messed up a Makefile. Others
dereference null pointers or overflow ints during startup, which really threw
off my bisect script with false positives: I had to restart the bisect from the
beginning at least half a dozen times because it homed in on the wrong bug. I'm
aghast that it's possible for such clearly broken patches to land in the master
branch. I do apologize for the vitriolic tone here, but I'm extremely
frustrated at the amount of time I had to spend finding this when it should
have been a rather trivial task. I do hope none of you take it personally–but
if you're looking for things to improve, this is one thing I think you should
focus on.
</rant>
--
palves at redhat dot com
2018-05-24 06:17:01 UTC

--- Comment #5 from Saagar Jha <saagar at saagarjha dot com> ---
(In reply to Ray Seyfarth from comment #2)
> Same problem with me.
>
> I have to wonder how Apple gets lldb to work with no problems. It appears
> to not be codesigned nor setuid/gid or anything special. There is an option
> which works which Apple should share with the open source community. They
> have wasted a lot of my time.

LLDB is signed with Apple's certificate:

$ codesign -dvv `xcrun -find lldb`
Executable=/Applications/Xcode-beta.app/Contents/Developer/usr/bin/lldb
Identifier=com.apple.lldb
Format=Mach-O thin (x86_64)
CodeDirectory v=20200 size=622 flags=0x0(none) hashes=15+2 location=embedded
Signature size=4535
Authority=Software Signing
Authority=Apple Code Signing Certification Authority
Authority=Apple Root CA
Info.plist entries=6
TeamIdentifier=59GAB85EFG
Sealed Resources=none
Internal requirements count=1 size=64
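
To compare signing chains between binaries, the "Authority=" lines can be pulled out of a report like the one above. This is a minimal sketch, assuming the report format shown here; the helper name signing_authorities is hypothetical and not part of codesign.

```shell
#!/bin/sh
# Print only the certificate chain from "codesign -dvv" style output
# read on stdin. signing_authorities is a hypothetical helper name.
signing_authorities() {
    # Keep lines that start with "Authority=" and strip that prefix.
    sed -n 's/^Authority=//p'
}

# Typical use on macOS (codesign writes this report to stderr):
#   codesign -dvv "$(xcrun -find lldb)" 2>&1 | signing_authorities
```

Applied to the report above, this would list the three authorities from "Software Signing" down to "Apple Root CA".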

--- Comment #6 from Pedro Alves <palves at redhat dot com> ---
(In reply to Saagar Jha from comment #4)
> $ git bisect run ../gdb-bisect.sh
> running /Users/saagarjha/Git/bisect-test.sh
> [Snip]
> f6ac5f3d63e03a81c4ff3749aba234961cc9090e is the first bad commit
> commit f6ac5f3d63e03a81c4ff3749aba234961cc9090e
> Date: Thu May 3 00:37:22 2018 +0100
> Convert struct target_ops to C++
> [Snip]
> bisect run success

That commit can't be the culprit for the issue reported in this bug,
because that commit is recent; it is in master only, not in 8.1.
If it caused some breakage, it's something else. A separate bug report
would have been better.

> Could someone confirm this for me? Commits before this one can successfully
> follow the debuggee to completion without incident, but the ones after and
> including this one crash with a null pointer dereference in
> gdb`push_target(struct target_ops *) at target.c:653. From a cursory glance,
> it seems a little fishy that darwin-nat.c doesn't have any sort of
> add_target call in it,

The add_target call is in i386-darwin-nat.c:_initialize_i386_darwin_nat

add_inf_child_target (&darwin_target);

> but I can't understand the code in the C/C++
> frankenstein state it's in right now,

Yeah. Anything in particular you'd like to point out?

> so I wasn't able to come up with a
> fix. (I did find a bunch of undefined behavior being hit, though, which I
> *do* have patches for. Let me know if you're interested in seeing them.)

Yes please. If you could contribute fixes, it'd be awesome:

https://sourceware.org/gdb/wiki/ContributionChecklist

In case it isn't obvious, the macOS port is in real need of someone motivated
to maintain it. I'm afraid that none of the day-to-day maintainers uses macOS,
AFAIK. You can see it as an opportunity.

> <rant>
> Just as a FYI, confirming this particular commit took well over two days and
> testing over two hundred revisions, which is something that I find as an
> outside observer to be truly horrible.

Wow. Sorry about that. Two hundred revisions sounds like way too many for a git
bisect. How could that have happened?

> Does GDB have *no* automated testing
> or continuous integration whatsoever?

It does, see <https://sourceware.org/gdb/wiki/BuildBot>. The problem is nobody
ever contributed a macOS buildslave.

> Putting aside the fact that any such
> infrastructure would catch simple bugs like this one, which are easy to
> reproduce, it would have also made my life bisecting a lot easier. Many
> intermediate commits are broken, as in they *literally don't build on
> macOS*, because someone forgot a header file or messed up a Makefile. Others
> dereference null pointers or overflow ints during startup, which really
> threw off my bisect script with false positives: I had to restart the bisect
> from the beginning at least half a dozen times because it homed in on the
> wrong bug.

:-( Sounds like maybe "git bisect skip" would have helped?

> I'm aghast that it's possible for such clearly broken patches to
> land in the master branch. I do apologize for the vitriolic tone here, but
> I'm extremely frustrated at the amount of time I had to spend finding this
> when it should have been a rather trivial task. I do hope none of you take
> it personally–but if you're looking for things to improve, this is one thing
> I think you should focus on.

Nope, sorry. The thing to improve is _getting someone that actually cares
about the port to step up and help maintain it_. That could be you.
Otherwise, I fear that at some point, the port will just end up deprecated and
removed.
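
On the "git bisect skip" point: "git bisect run" treats an exit status of 125 from the test command as "this revision cannot be tested, skip it", so a run script can step past commits that fail to build instead of marking them bad. The following is a minimal sketch of that mapping, not the script used in this thread; the helper name bisect_verdict and the split into a build step and a test step are assumptions for illustration.

```shell
#!/bin/sh
# Translate the outcome of a build step and a test step into the exit
# status that "git bisect run" expects:
#   0   -> revision is good
#   1   -> revision is bad (any status 1-127 other than 125 means bad)
#   125 -> revision cannot be tested, skip it
# bisect_verdict is a hypothetical helper name used for illustration.
bisect_verdict() {
    build_status=$1
    test_status=$2
    if [ "$build_status" -ne 0 ]; then
        echo 125        # broken build: do not count it as the bug
    elif [ "$test_status" -eq 0 ]; then
        echo 0
    else
        echo 1
    fi
}

# Example wiring inside a run script (the commands are placeholders):
#   make >/dev/null 2>&1; build=$?
#   tst=0
#   [ "$build" -eq 0 ] && { ./run-debuggee-test.sh; tst=$?; }
#   exit "$(bisect_verdict "$build" "$tst")"
```

Unrelated startup crashes could be routed to 125 in the same way, provided the test step can tell them apart from the failure being bisected.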
--
palves at redhat dot com
2018-05-24 12:20:34 UTC

--- Comment #7 from Pedro Alves <palves at redhat dot com> ---
saagar at saagarjha dot com wrote:
> including this one crash with a null pointer dereference in
> gdb`push_target(struct target_ops *) at target.c:653.

I think I see what is going on here. I'll send a patch.
--
grassfedcode at gmail dot com
2018-05-27 04:55:53 UTC

grassfedcode at gmail dot com changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |grassfedcode at gmail dot com
--
tromey at sourceware dot org
2018-06-27 22:33:03 UTC

Tom Tromey <tromey at sourceware dot org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2018-06-27
CC| |tromey at sourceware dot org
Ever confirmed|0 |1

--- Comment #13 from Tom Tromey <tromey at sourceware dot org> ---
I built gdb 8.0 from git and that did not work for me on macOS 10.13.5.
Neither did git master. I also tried the 8.0.1 from brew.

They both fail in the same way, with "Unknown signal".

I tend to think this is a dup of 20266.
--
palves at redhat dot com
2018-06-28 00:05:44 UTC

--- Comment #14 from Pedro Alves <palves at redhat dot com> ---
Tom, does reverting the offending commit work for you?
--
tromey at sourceware dot org
2018-06-28 13:57:40 UTC

--- Comment #15 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Pedro Alves from comment #14)
> Tom, does reverting the offending commit work for you?

No.

What happens for me is that darwin_decode_message gets
a MACH_NOTIFY_DEAD_NAME (the "== 0x48" case). Then the
subsequent wait4() call returns with wstatus=5.

wstatus=5 is a strange response. It is not WIFEXITED,
but neither is it WIFSIGNALED. So far I haven't found
any documentation about what it might be.

One wild guess is that maybe this mach message actually
does carry the name of the new port and it could be
extracted via darwin_find_new_inferior. But that seems
like a longshot.

I looked at the lldb patch that Jason Molenda posted
(see https://sourceware.org/bugzilla/show_bug.cgi?id=20266#c6),
but lldb seems to work in a completely different way here,
I guess hooking into some low-level mach thing somehow? Like,
those functions aren't obviously called from anywhere.
So, I do wonder whether the answer is a bigger rewrite of
darwin-nat.c, to use mach stuff everywhere and not ptrace
or wait. However, this experience has shown me that even
minor revisions of macOS can come with big changes, so
modifying this code seems somewhat tricky.
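
For reference, a raw wait status can be decoded by hand. The sketch below assumes the traditional Unix layout of the status word; whether the value seen above follows that layout is not established in this thread, and the helper name decode_wstatus is hypothetical. Under that assumption, a status of 5 would read as termination by signal 5, which is SIGTRAP on macOS.

```shell
#!/bin/sh
# Decode a raw wait status, ASSUMING the traditional Unix layout:
#   low 7 bits == 0    -> exited normally, exit code in bits 8-15
#   low 7 bits == 127  -> stopped, stop signal in bits 8-15
#   otherwise          -> terminated by the signal in the low 7 bits
# decode_wstatus is a hypothetical helper name used for illustration.
decode_wstatus() {
    status=$1
    low=$((status & 127))
    high=$(((status >> 8) & 255))
    if [ "$low" -eq 0 ]; then
        echo "exited code=$high"
    elif [ "$low" -eq 127 ]; then
        echo "stopped signal=$high"
    else
        echo "signaled signal=$low"
    fi
}

decode_wstatus 5    # prints "signaled signal=5"
```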
--
palves at redhat dot com
2018-06-28 14:20:29 UTC

--- Comment #16 from Pedro Alves <palves at redhat dot com> ---
Are you testing with "set startup-with-shell off", perhaps? Or maybe Saagar
was? I could see that affecting whether you see the SIGTRAP, since this all
seems to be exec-event related.
--
tromey at sourceware dot org
2018-06-28 14:47:59 UTC

--- Comment #17 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Pedro Alves from comment #16)
> Are you testing with "set startup-with-shell off", perhaps? Or maybe Saagar
> was? I could see that affecting whether you see the SIGTRAP, since this all
> seems to be exec-event related.

I have tried it both ways to no avail.
--
tromey at sourceware dot org
2018-06-29 14:53:36 UTC

--- Comment #18 from Tom Tromey <tromey at sourceware dot org> ---
In my case the first problem was that I was trying "gdb /bin/ls" --
but that is subject to System Integrity Protection.
Using my own test executable gives a different problem.

I'll file a separate bug about detecting SIP.
gdb could at least tell the user what is going on.
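
As a rough illustration of the kind of check that could warn a user up front, the sketch below flags paths under the directories that System Integrity Protection covers by default (/System, /bin, /sbin, and /usr apart from /usr/local). It is a string-only heuristic under that assumption, not how gdb detects SIP, and the helper name looks_sip_protected is hypothetical.

```shell
#!/bin/sh
# Rough heuristic: report whether a path lies under a directory that SIP
# protects by default (/System, /bin, /sbin, /usr except /usr/local).
# This only inspects the path string; it does not query the system.
# looks_sip_protected is a hypothetical helper name.
looks_sip_protected() {
    case $1 in
        /usr/local/*) echo no ;;
        /System/*|/bin/*|/sbin/*|/usr/*) echo yes ;;
        *) echo no ;;
    esac
}

looks_sip_protected /bin/ls    # prints "yes"
```

On a real macOS system, "csrutil status" reports whether SIP is enabled at all.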
--
rtmills at anl dot gov
2018-10-15 16:06:55 UTC

Richard Tran Mills <rtmills at anl dot gov> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |rtmills at anl dot gov

--- Comment #19 from Richard Tran Mills <rtmills at anl dot gov> ---
I'd just like to confirm that I am seeing the exact same error on Mac OS
10.13.6 using GDB 8.2 installed via Homebrew. Downgrading to 8.0.1 via Homebrew
gives me a working version of GDB.
--
tromey at sourceware dot org
2018-10-15 23:31:40 UTC

--- Comment #20 from Tom Tromey <tromey at sourceware dot org> ---
Try git master gdb. There have been a few High Sierra fixes there.
I used to get this problem there but now I no longer do.
--
saagar at saagarjha dot com
2018-10-16 02:50:50 UTC

--- Comment #21 from Saagar Jha <saagar at saagarjha dot com> ---
Ooh, it's nice to see that the underlying issue has been fixed. Can confirm
that this works on macOS Mojave with a small patch to deal with new load
commands. I'll look into the process of getting this merged in so we can extend
support to 10.14 as well.
--
You are receiving this mail because:
You are on the CC list for the bug.
tromey at sourceware dot org
2018-10-16 19:42:49 UTC
Permalink
https://sourceware.org/bugzilla/show_bug.cgi?id=22960

--- Comment #22 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Saagar Jha from comment #21)
> Ooh, it's nice to see that the underlying issue has been fixed. Can confirm
> that this works on macOS Mojave with a small patch to deal with new load
> commands. I'll look into the process of getting this merged in so we can
> extend support to 10.14 as well.

Looking forward to that.
See also bug #23728, bug #23742, and bug #23746.
--
saagar at saagarjha dot com
2018-10-26 13:35:49 UTC

--- Comment #23 from Saagar Jha <saagar at saagarjha dot com> ---
I've taken the time to clean up my patches and submit them to the gdb-patches
mailing list (though, I don't see them in the archives. Is this just a standard
delay, or did I mess up somewhere?)
--
tromey at sourceware dot org
2018-10-26 19:34:27 UTC

--- Comment #24 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Saagar Jha from comment #23)
> I've taken the time to clean up my patches and submit them to the
> gdb-patches mailing list (though, I don't see them in the archives. Is this
> just a standard delay, or did I mess up somewhere?)

I didn't see them either, so maybe try re-sending.
--
saagar at saagarjha dot com
2018-10-27 05:09:37 UTC

--- Comment #25 from Saagar Jha <saagar at saagarjha dot com> ---
The patches should be on the list now. Turns out gdb-patches is extremely
strict about emails that contain any kind of HTML ;P