Re: firefox3-bin crashes near arc4random_buf() - FreeBSD

  1. Re: firefox3-bin crashes near arc4random_buf()

    On Sat, Oct 04, 2008 at 08:10:24PM +0400, Andrey Chernov wrote:
    > On Sat, Oct 04, 2008 at 01:05:11AM -0700, Jos Backus wrote:
    > > For a few weeks now firefox3-bin has been crashing semi-regularly for me.
    > > Backtrace attached. I selected `Build a debugging image' but the resulting
    > > binary is stripped, so no symbols.
    > > #3 0x28237381 in XRE_InitEmbedding () from /usr/local/lib/firefox3/libxul.so
    > > #4 <signal handler called>
    > > #5 0x2a39eb2d in arc4random_buf () from /lib/libc.so.7
    > > #6 0x2a39aa7d in dbopen () from /lib/libc.so.7
    > > #7 0x2a39973a in __srget () from /lib/libc.so.7
    > > #8 0x2a39ab49 in dbopen () from /lib/libc.so.7
    > > #9 0x2a39916f in __srget () from /lib/libc.so.7
    > > #10 0x2a39c220 in __hash_open () from /lib/libc.so.7
    > > #11 0x2aae9b9c in ?? () from /usr/local/lib/firefox3/libnssdbm3.so

    >
    > It looks like the stack is damaged at this point. No libc functions,
    > including the db* functions, call arc4random_buf().


    I was surprised to see that, too. The problem is perfectly repeatable on my
    system. I tried building firefox3 using

    WITH_DEBUG=true STRIP= make deinstall reinstall clean

    but the resulting binary is still stripped:

    lizzy:~% file /usr/local/lib/firefox3/firefox-bin
    /usr/local/lib/firefox3/firefox-bin: ELF 32-bit LSB executable, Intel 80386,
    version 1 (FreeBSD), for FreeBSD 8.0 (800049), dynamically linked (uses shared
    libs), FreeBSD-style, stripped
    lizzy:~%

    A few weeks ago, after these crashes had started happening, I rebuilt most
    ports on this machine, hoping it would fix the issue, but it has not.

    Any suggestions on how to debug this?

    --
    Jos Backus
    jos at catnook.com
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/lis...reebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


  2. Re: firefox3-bin crashes near arc4random_buf()

    On Sat, Oct 04, 2008 at 05:49:06PM -0700, Tim Kientzle wrote:
    > First, you need to share the first items in the
    > backtrace, as they're more likely to be correct.
    > I agree with Andrey that it looks like there's
    > some stack corruption, so it's hard to trust
    > anything except the first couple of entries.


    Attached is a tarball containing firefox3.gdb which has the full output of
    `bt'. Unfortunately it doesn't tell me very much more.

    > You should also look at several independent core
    > dumps and see how much the backtraces have in common.


    I watched it crash a bunch more times and the backtraces are the same. That's
    good, right? :-)

    > It might also be worth running it under ktrace,
    > forcing the crash, then sharing the last few dozen
    > lines from kdump output.


    Also attached is firefox3.kdump. The last few lines look like:

    6855 firefox-bin RET clock_gettime 0
    6855 firefox-bin CALL _umtx_op(0x8179760,0x8,0x1,0x8179740,0xbf8fdddc)
    6855 firefox-bin PSIG SIGSEGV caught handler=0x28237290 mask=0x0 code=0x1
    6855 firefox-bin CALL unlink(0x8179600)
    6855 firefox-bin NAMI "/home/jos/.mozilla/firefox/tosfxhak.default/lock"
    6855 firefox-bin RET unlink 0
    6855 firefox-bin CALL sigaction(SIGSEGV,0x2978dfb4,0)
    6855 firefox-bin RET sigaction 0
    6855 firefox-bin CALL sigprocmask(SIG_UNBLOCK,0xbf4f906c,0)
    6855 firefox-bin RET sigprocmask 0
    6855 firefox-bin CALL thr_kill(0x1878c,SIGSEGV)
    6855 firefox-bin RET thr_kill 0
    6855 firefox-bin PSIG SIGSEGV SIG_DFL
    6855 firefox-bin NAMI "firefox-bin.core"
    6855 firefox-bin RET poll -1 errno 4 Interrupted system call
    6855 firefox-bin RET _umtx_op -1 errno 4 Interrupted system call
    6855 firefox-bin RET _umtx_op -1 errno 4 Interrupted system call
    6855 firefox-bin RET _umtx_op -1 errno 60 Operation timed out
    6855 firefox-bin RET _umtx_op -1 errno 4 Interrupted system call
    6850 sh RET wait4 6855/0x1ac7
    6850 sh CALL write(0x1,0x814e400,0x21)
    6850 sh GIO fd 1 wrote 33 bytes
    "Segmentation fault (core dumped)
    "
    6850 sh RET write 33/0x21
    6850 sh CALL exit(0x8b)
    6846 sh RET wait4 6850/0x1ac2
    6846 sh CALL exit(0x8b)

    This to me suggests that the segfault happens inside _umtx_op. Am I reading
    that correctly?

    Thanks for looking into this!

    --
    Jos Backus
    jos at catnook.com


  3. Re: firefox3-bin crashes near arc4random_buf()

    On Mon, Oct 06, 2008 at 02:57:39AM +0300, Giorgos Keramidas wrote:
    > Unfortunately, tarballs are stripped off by the list software.
    >
    > Can you upload this online somewhere and point us to a URL?


    Oops, thanks for reminding me, Giorgos. Tarball at
    http://lizzy.dyndns.org/~jos/firefox3.crash.tgz

    --
    Jos Backus
    jos at catnook.com


  4. Re: firefox3-bin crashes near arc4random_buf()

    > I watched it crash a bunch more times and the backtraces are the same. That's
    > good, right? :-)


    Yes. For a suitable definition of "good." ;-)

    >>It might also be worth running it under ktrace,
    >>forcing the crash, then sharing the last few dozen
    >>lines from kdump output.

    >
    > Also attached is firefox3.kdump. The last few lines look like:
    >
    > 6855 firefox-bin RET clock_gettime 0
    > 6855 firefox-bin CALL _umtx_op(0x8179760,0x8,0x1,0x8179740,0xbf8fdddc)
    > 6855 firefox-bin PSIG SIGSEGV caught handler=0x28237290 mask=0x0 code=0x1
    > 6855 firefox-bin CALL unlink(0x8179600)
    > 6855 firefox-bin NAMI "/home/jos/.mozilla/firefox/tosfxhak.default/lock"
    > 6855 firefox-bin RET unlink 0
    > 6855 firefox-bin CALL sigaction(SIGSEGV,0x2978dfb4,0)
    > 6855 firefox-bin RET sigaction 0
    > 6855 firefox-bin CALL sigprocmask(SIG_UNBLOCK,0xbf4f906c,0)
    > 6855 firefox-bin RET sigprocmask 0
    > 6855 firefox-bin CALL thr_kill(0x1878c,SIGSEGV)
    > 6855 firefox-bin RET thr_kill 0
    > 6855 firefox-bin PSIG SIGSEGV SIG_DFL
    >
    > This to me suggests that the segfault happens inside _umtx_op. Am I reading
    > that correctly?


    Not necessarily. Firefox is multi-threaded. The thread that
    called _umtx_op() is not the thread that crashed (_umtx_op()
    hadn't returned to userspace, so that thread was still in
    the kernel).

    This does, however, answer one puzzle: Firefox appears to
    have a signal handler that catches SEGV, releases the lock
    file, then re-throws SEGV to actually kill the program.
    That explains stack frames #0-#4 in your backtrace; that's
    the signal handler executing after the segfault but before
    the program is terminated.
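
    For readers who have not seen this pattern, here is a minimal sketch of
    a catch, clean up, re-raise handler. It is not Firefox's actual code. It
    only mirrors the syscall sequence visible in the kdump above (unlink,
    sigaction, sigprocmask, then a kill). The lock path and the function
    names on_segv and run_crash_demo are made up for illustration, and the
    crash is simulated with raise() in a forked child rather than a real
    bad memory access.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock file; the real one lives in the Firefox profile dir. */
static const char *lock_path = "/tmp/crash_demo.lock";

/* SIGSEGV handler: release the lock, restore the default action, re-raise. */
static void on_segv(int sig)
{
    struct sigaction dfl;
    sigset_t set;

    unlink(lock_path);                    /* release the lock file */

    memset(&dfl, 0, sizeof(dfl));
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    sigaction(sig, &dfl, NULL);           /* next delivery kills the process */

    sigemptyset(&set);
    sigaddset(&set, sig);
    sigprocmask(SIG_UNBLOCK, &set, NULL); /* signal is blocked inside handler */

    raise(sig);                           /* now terminates with SIGSEGV */
}

/*
 * Create the lock file, crash a child process, and report the signal that
 * terminated it. Returns -1 on any setup failure or if the child exited
 * normally.
 */
int run_crash_demo(void)
{
    FILE *f = fopen(lock_path, "w");
    if (f == NULL)
        return -1;
    fclose(f);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_segv;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        raise(SIGSEGV);                   /* stand-in for a real segfault */
        _exit(0);                         /* not reached if re-raise works */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

    Calling run_crash_demo() from a small main() should report SIGSEGV as
    the terminating signal, with the lock file already gone. A backtrace
    taken from such a core shows the handler's frames on top of the frame
    that actually faulted, which is the effect described above.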

    Something is still screwy about the backtrace. dbopen()
    doesn't call arc4random_buf. However, it does call
    mkstemp() which does call arc4random_uniform, which should
    be right next to arc4random_buf in memory. GCC optimizations
    could be obscuring the call stack here.

    It's certainly possible that arc4random is involved
    somehow but I don't yet see it. It does seem likely
    that we're looking at a libc problem, so a debug
    version of libc might help. Replacing libc on a
    running system is a little tricky. I believe the
    following works, though I've not tried it:

    % cd /usr/src/lib/libc
    % make clean
    % make DEBUG_FLAGS=-g
    % cp /lib/libc.so.7 /lib/libc.so.7-backup
    ... reboot to single user, use /rescue/sh as your shell ...
    % cp /usr/src/lib/libc/libc.so.7 /lib/libc.so.7
    ... reboot ...

    This should give you a standard libc with full
    debugging symbols. Hopefully, the backtrace will
    now give more details.

    I think we're getting closer.

    Tim


  5. Re: firefox3-bin crashes near arc4random_buf()

    On Sun, Oct 05, 2008 at 05:34:22PM -0700, Tim Kientzle wrote:
    > > This to me suggests that the segfault happens inside _umtx_op. Am I reading
    > > that correctly?

    >
    > Not necessarily. Firefox is multi-threaded. The thread that
    > called _umtx_op() is not the thread that crashed (_umtx_op()
    > hadn't returned to userspace, so that thread was still in
    > the kernel).
    >
    > This does, however, answer one puzzle: Firefox appears to
    > have a signal handler that catches SEGV, releases the lock
    > file, then re-throws SEGV to actually kill the program.
    > That explains stack frames #0-#4 in your backtrace; that's
    > the signal handler executing after the segfault but before
    > the program is terminated.


    Understood.

    > Something is still screwy about the backtrace. dbopen()
    > doesn't call arc4random_buf. However, it does call
    > mkstemp() which does call arc4random_uniform, which should
    > be right next to arc4random_buf in memory. GCC optimizations
    > could be obscuring the call stack here.
    >
    > It's certainly possible that arc4random is involved
    > somehow but I don't yet see it. It does seem likely
    > that we're looking at a libc problem, so a debug
    > version of libc might help. Replacing libc on a
    > running system is a little tricky. I believe the
    > following works, though I've not tried it:
    >
    > % cd /usr/src/lib/libc
    > % make clean
    > % make DEBUG_FLAGS=-g
    > % cp /lib/libc.so.7 /lib/libc.so.7-backup
    > ... reboot to single user, use /rescue/sh as your shell ...
    > % cp /usr/src/lib/libc/libc.so.7 /lib/libc.so.7


    chflags noschg /lib/libc.so.7
    /rescue/cp /usr/obj/usr/src/lib/libc/libc.so.7 /lib/libc.so.7

    > ... reboot ...
    >
    > This should give you a standard libc with full
    > debugging symbols. Hopefully, the backtrace will
    > now give more details.
    >
    > I think we're getting closer.


    Yeah. Oddly enough the debug version seems to make a difference; firefox3
    hasn't crashed yet. Normally even without touching it firefox3 will segfault
    within an hour or so. I will leave it up all night to see what happens.

    Thanks, Tim. I'll keep you posted.

    --
    Jos Backus
    jos at catnook.com


  6. Re: firefox3-bin crashes near arc4random_buf()

    > Yeah. Oddly enough the debug version seems to make a difference; firefox3
    > hasn't crashed yet. Normally even without touching it firefox3 will segfault
    > within an hour or so. I will leave it up all night to see what happens.


    Either, as Peter Jeremy suggested, using -g changed
    the compiled code, or else you've built different sources.

    Have you updated your source since you last updated libc?

    Tim



  7. Re: firefox3-bin crashes near arc4random_buf()

    > Before following your instructions, I cvsupped, ran a `make kernel' and booted
    > into the new kernel. My userland is from Oct 4th.


    I presume your 'cvsup' also updated your libc sources.
    So you've basically upgraded to the newest libc...

    > As of this moment, firefox3 is still running. Would you like me to try
    > anything different?


    ... which seems to have fixed your problem. I suggest
    you install a regular non-debugging libc and see if
    everything remains fixed.

    Tim


  8. Re: firefox3-bin crashes near arc4random_buf()

    On Tue, Oct 07, 2008 at 06:50:09PM -0700, Tim Kientzle wrote:
    > This is a lot more interesting. This points to a crash
    > within libc's db code. Somehow, it's trying to compute
    > a hash for some element with length -10618, which is
    > getting converted to an unsigned 4294956678, which is
    > causing the crash.
    >
    > Does Firefox have knobs to use a newer Berkeley DB?


    Not that I am aware of. Maybe I should ask ports@...

    > I can't
    > recall whether newer Berkeley DB versions are thread-safe but
    > I'm pretty sure the old version in our libc isn't. If Firefox
    > is assuming the BDB code is thread-safe that could certainly
    > cause corruption of the BDB data with all sorts of unpleasant
    > consequences. That's just a random guess, though. Maybe someone
    > else on this mailing list knows better.


    I think you're on to something.

    Also, I have found a reliable way to cause the crash. It happens when I go to
    https://wellpointnextrx.com/ and try to accept the cert for the session.
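
    As a side note on the numbers Tim quotes: the jump from -10618 to
    4294956678 is just what a 32-bit signed-to-unsigned conversion produces.
    The sketch below shows only that arithmetic, assuming a 32-bit length as
    on this i386 system. It is not the libc hash code, and the function
    names length_as_unsigned and print_length are invented for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Reinterpret a signed 32-bit length as unsigned. The conversion is
 * well-defined in C: the result is the value modulo 2^32, so a small
 * negative number becomes a huge positive one.
 */
uint32_t length_as_unsigned(int32_t len)
{
    return (uint32_t)len;
}

/* Print a length both ways to make the wrap-around visible. */
void print_length(int32_t len)
{
    printf("signed %" PRId32 " -> unsigned %" PRIu32 "\n",
           len, length_as_unsigned(len));
}
```

    Calling print_length(-10618) shows the unsigned value from the
    backtrace. Code that then walks that many bytes will run far past the
    end of the real buffer, which is consistent with a segfault.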

    --
    Jos Backus
    jos at catnook.com

