Bug 211233 - www/firefox: frequent crashes
Summary: www/firefox: frequent crashes
Status: Closed Overcome By Events
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-gecko (Nobody)
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2016-07-19 18:44 UTC by Martin Birgmeier
Modified: 2016-07-25 05:59 UTC

See Also:
bugzilla: maintainer-feedback? (gecko)
koobs: merge-quarterly?


Description Martin Birgmeier 2016-07-19 18:44:29 UTC
Since yesterday, firefox crashes very frequently. After a crash it leaves an on-disk state leading to even more frequent crashes up to the point where it crashes immediately after starting.

Restoring .mozilla/firefox and .cache/mozilla/firefox from a backup improves the situation for a while.

I suspect that one of the port upgrades is causing this behavior.

Getting the list of dependencies:
# pkg query %dn-%dv firefox-47.0.1_2,1 > /tmp/x1

Checking which ports have been installed more recently than firefox itself:
# pkg query '%n-%v   %t' `cat /tmp/x1` firefox-47.0.1_2,1 | sort -k2,2n | sed -n '/firefox-47/,$ p'
firefox-47.0.1_2,1      1467920026
ffmpeg-2.8.7_2,1        1468341989
harfbuzz-1.2.7  1468342073
png-1.6.23      1468519635
#

So I guess it could be one of ffmpeg, harfbuzz, or png.
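
The timestamp comparison above can also be written as a small filter. This is a minimal sketch, assuming input lines of the form `name-version timestamp` like those printed by the `pkg query` command above; the function name `newer_than` is hypothetical and not part of pkg.

```shell
# newer_than REF: read "name-version epoch" lines on stdin and print the
# names whose install timestamp is strictly greater than REF.
newer_than() {
    awk -v ref="$1" '$2 > ref { print $1 }'
}

# Example with the timestamps listed above (firefox itself is the reference):
printf '%s\n' \
    'firefox-47.0.1_2,1 1467920026' \
    'ffmpeg-2.8.7_2,1 1468341989' \
    'harfbuzz-1.2.7 1468342073' \
    'png-1.6.23 1468519635' | newer_than 1467920026
```

With this input only the three dependencies installed after firefox are printed.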
Comment 1 Martin Birgmeier 2016-07-19 19:20:04 UTC
After each crash, two core files are left:
firefox.core, plugin-container.core
Comment 2 Martin Birgmeier 2016-07-19 19:31:13 UTC
Hmmm...

# pkg query '%n-%v   %t' `pkg query %dn-%dv firefox-47.0.1_2,1` firefox-47.0.1_2,1 | sort -k2,2n | sed -n '/firefox-47/,$ p' | gawk 'BEGIN { OFS = "\t"; }
{ $2 = strftime("%Y-%m-%d.%H:%M:%S", $2); print; }'       
firefox-47.0.1_2,1      2016-07-08.13:37:15
ffmpeg-2.8.7_2,1        2016-07-12.19:06:44
harfbuzz-1.2.7  2016-07-12.19:06:47
png-1.6.23      2016-07-14.21:02:48
# 

So all three ports had already been updated before firefox started crashing.

# pkg query -a '%n-%v        %o      %t' | sort -t'  ' -k3,3n | gawk 'BEGIN { OFS = "\t"; }                                                               
{ $3 = strftime("%Y-%m-%d.%H:%M:%S", $3); print; }' | tail -10
png-1.6.23      graphics/png    2016-07-14.21:02:48
e2fsprogs-libuuid-1.43.1_1      misc/e2fsprogs-libuuid  2016-07-15.19:58:53
p7zip-15.14_1   archivers/p7zip 2016-07-15.19:58:54
py27-pytz-2016.6.1,1    devel/py-pytz   2016-07-15.19:58:55
portmaster-3.17.9_3     ports-mgmt/portmaster   2016-07-16.19:09:03
tiff-4.0.6_2    graphics/tiff   2016-07-16.19:09:04
telepathy-qt4-0.9.7     net-im/telepathy-qt4    2016-07-17.20:04:29
nmap-7.25.b1    security/nmap   2016-07-17.20:04:31
webfonts-0.30_11        x11-fonts/webfonts      2016-07-17.20:04:34
git-2.9.2       devel/git       2016-07-18.19:42:00
# 

So only portmaster, tiff, telepathy-qt4, nmap, webfonts, and git have been updated since firefox started crashing.
Comment 3 Kubilay Kocak freebsd_committer freebsd_triage 2016-07-20 08:24:27 UTC
@Martin, if you could provide backtraces of the core files as attachments that would be great
Comment 4 Martin Birgmeier 2016-07-20 15:44:28 UTC
Hmmm... not really much information here...

% gdb /usr/local/lib/firefox/firefox firefox.core 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `firefox'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000080208c35a in ?? ()
(gdb) where
#0  0x000000080208c35a in ?? ()
#1  0x000000080208c346 in ?? ()
#2  0x0000000000018a30 in double_conversion::DoubleToStringConverter::ToPrecision ()
#3  0x0000000000000000 in ?? ()
(gdb)
%

[0]% gdb /usr/local/lib/firefox/plugin-container plugin-container.core 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `plugin-container'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000010294dd in ?? ()
(gdb) where
#0  0x00000000010294dd in ?? ()
#1  0x0000000000000003 in ?? ()
#2  0x0000000802072e56 in ?? ()
#3  0x00007fffdfffd890 in ?? ()
#4  0x0000000802072c6d in ?? ()
#5  0x00007fffdfffd910 in ?? ()
#6  0x0000000801276f8b in ?? ()
#7  0x000007eb00000001 in ?? ()
#8  0x0000000804e39f80 in ?? ()
#9  0x3220646c6968435b in ?? ()
#10 0x232323205d393431 in ?? ()
#11 0x524f424120212121 in ?? ()
#12 0x74726f6241203a54 in ?? ()
#13 0x63206e6f20676e69 in ?? ()
#14 0x65206c656e6e6168 in ?? ()
#15 0x66203a2e726f7272 in ?? ()
#16 0x7273752f20656c69 in ?? ()
#17 0x2e2e2e2f706d742f in ?? ()
#18 0x532f7a2f6c61682f in ?? ()
#19 0x42656572462f4352 in ?? ()
#20 0x7374726f702d4453 in ?? ()
#21 0x77772f646165682f in ?? ()
#22 0x6f66657269662f77 in ?? ()
---Type <return> to continue, or q <return> to quit---
#23 0x662f6b726f772f78 in ?? ()
#24 0x342d786f66657269 in ?? ()
#25 0x70692f312e302e37 in ?? ()
#26 0x4d2f65756c672f63 in ?? ()
#27 0x6843656761737365 in ?? ()
#28 0x70632e6c656e6e61 in ?? ()
#29 0x20656e696c202c70 in ?? ()
#30 0x0000000037323032 in ?? ()
#31 0x00007fffdfffd918 in ?? ()
#32 0x00007fffdfffd9c0 in ?? ()
#33 0x000000080dd71140 in ?? ()
#34 0x000000080dd71170 in ?? ()
#35 0x0000000000001140 in ?? ()
#36 0x000000080148f950 in ?? ()
#37 0x00007fffdfff037f in ?? ()
#38 0x0000000801276d1d in ?? ()
#39 0x0000000000000000 in ?? ()
(gdb) 
(gdb)
%
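
Frames #9 through #30 of the plugin-container backtrace do not look like code addresses: every byte in them is printable ASCII, which suggests gdb is walking over a string stored on the stack. The following is a minimal sketch for decoding such words, assuming little-endian byte order (amd64) and a POSIX shell; the helper name `decode_word` is hypothetical.

```shell
# decode_word 0xHEX: print the bytes of a 64-bit word as ASCII, least
# significant byte first (little-endian), skipping NUL bytes.
decode_word() {
    hex=${1#0x}
    while [ -n "$hex" ]; do
        byte=${hex#"${hex%??}"}    # last two hex digits = next byte in memory
        hex=${hex%??}
        if [ "$byte" != "00" ]; then
            # Convert the hex byte to a \0ooo escape that printf %b understands.
            printf '%b' "\\0$(printf '%03o' "0x$byte")"
        fi
    done
}

# First three and last of the ASCII-looking frames from the backtrace above.
for w in 0x3220646c6968435b 0x232323205d393431 0x524f424120212121 \
         0x0000000037323032; do
    decode_word "$w"
    echo
done
```

Decoded in order, frames #9 to #30 spell out `[Child 2149] ###!!! ABORT: Aborting on channel error.: file /usr/tmp/.../hal/z/SRC/FreeBSD-ports/head/www/firefox/work/firefox-47.0.1/ipc/glue/MessageChannel.cpp, line 2027`, so the child process had apparently formatted an abort message before it died.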
Comment 5 Martin Birgmeier 2016-07-20 15:49:17 UTC
Some observations - don't know how useful they are:

- Seems to happen with new, empty tabs
- Happened when I switched (using the tab groups extension) to a tab group which did not contain any windows (I then deleted that tab group and hoped for the best, but today got core dumps again)
- Probably has to do with sync? - Might be a wrong lead.
- Sometimes when I opened a new tab, the letter "g" would be shown in the address bar and/while firefox was already core dumping (normally the bar should be empty)

-- Martin
Comment 6 Martin Birgmeier 2016-07-20 17:24:55 UTC
It seems that opening a new tab reliably triggers the crash.
Comment 7 Martin Birgmeier 2016-07-21 18:25:22 UTC
Since a new tab shows data from the history, and the address bar tries to autocomplete from the history as well, I had the idea to clear the history.

It seems the crashes are gone.

So let's close the report (for the time being).

Still, firefox is most likely at fault for not being robust enough against odd history entries.

-- Martin
Comment 8 Martin Birgmeier 2016-07-23 12:23:06 UTC
It is still crashing, even on multiple machines (running the same binaries).

I tried to track it down further: Since there are always two cores, namely firefox.core and plugin-container.core, I tried to watch any plugin-container processes. Sure enough, they appear when I open a new tab. And their memory footprint (SIZE in 'top') increases rapidly. So I started a small script to kill any plugin-container as soon as it would appear and voila - firefox does not crash any more.

I found a vaguely similar bug report: https://support.mozilla.org/en-US/questions/1062571

However, even disabling all plugins does not help, the plugin-container is still started when opening a new tab. The following plugins are installed according to 'about:addons':
OpenH264 Video Codec provided by Cisco Systems, Inc. ............ always activate
IcedTea-Web Plugin (using IcedTea-Web 1.6.2) .................... ask to activate
Skype Buttons for Kopete ........................................ ask to activate

At least, the plugin-container is being activated most of the time. Now, on one machine, after several 'killall plugin-container', it does not start any more when opening a new tab (this is still the same firefox process). On the other machine, with the kill script running twice per second, between 4 and 5 plugin-containers are created until no further ones appear.
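
The kill script itself is not shown in this report. The following is a minimal sketch of what such a watchdog might look like, assuming `pkill` is available and that `sleep` accepts fractional seconds (true on FreeBSD, not guaranteed by POSIX); the function name and its parameters are hypothetical.

```shell
# watch_and_kill NAME INTERVAL ROUNDS
# Every INTERVAL seconds, kill all processes whose name is exactly NAME.
# ROUNDS limits the number of polls; 0 means run until interrupted.
# Prints the number of polls in which at least one process was signalled.
watch_and_kill() {
    name=$1
    interval=$2
    rounds=$3
    hits=0
    i=0
    while [ "$rounds" -eq 0 ] || [ "$i" -lt "$rounds" ]; do
        # pkill exits 0 only if it signalled at least one process.
        if pkill -x "$name" 2>/dev/null; then
            hits=$((hits + 1))
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    echo "$hits"
}

# Usage (twice per second, until interrupted):
#   watch_and_kill plugin-container 0.5 0
```

Counting the polls that found something to kill gives a rough equivalent of the "between 4 and 5 plugin-containers" observation above.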
Comment 9 Martin Birgmeier 2016-07-24 07:46:47 UTC
Some more analysis...

A new tab shows the top sites in a 5x3 preview grid. It seems that if a preview is unavailable for at least one site, a plugin-container process is spawned, which then crashes.

The plugin-container is started as follows:

/usr/local/lib/firefox/plugin-container -greomni /usr/local/lib/firefox/omni.ja -appomni /usr/local/lib/firefox/browser/omni.ja -appdir /usr/local/lib/firefox/browser 3188 tab

3188 is the PID of the firefox main process.

Whenever such a plugin-container is killed, the following lines are added to .xsession-errors:

(process:12117): GLib-CRITICAL **: g_path_get_basename: assertion 'file_name != NULL' failed

###!!! [Parent][MessageChannel] Error: (msgtype=0x2C0076,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv

So most likely the two processes - firefox and plugin-container - exchange some information, with the plugin-container ultimately running out of memory and crashing.

More info: plugin-container works fine for icedtea-web - I can start Java plugins without a problem.

-- Martin
Comment 10 Martin Birgmeier 2016-07-24 14:17:20 UTC
New observations.

So plugin-container is started to render missing previews of the preview tiles shown in a new tab.

Instead of killing it with my script, I now just let it run.

Whenever both firefox and plugin-container crash, two lines are added to .xsession-errors:

^G[Child 11184] ###!!! ABORT: Aborting on channel error.: file /usr/tmp/.../hal/z/SRC/FreeBSD-ports/head/www/firefox/work/firefox-47.0.1/ipc/glue/MessageChannel.cpp, line 2027
[Child 11184] ###!!! ABORT: Aborting on channel error.: file /usr/tmp/.../hal/z/SRC/FreeBSD-ports/head/www/firefox/work/firefox-47.0.1/ipc/glue/MessageChannel.cpp, line 2027
Comment 11 Martin Birgmeier 2016-07-24 15:03:01 UTC
Continuing this on https://bugzilla.mozilla.org/show_bug.cgi?id=1288962
Comment 12 Jan Beich freebsd_committer freebsd_triage 2016-07-25 05:59:00 UTC
> /usr/local/lib/firefox/plugin-container -greomni /usr/local/lib/firefox/omni.ja -appomni /usr/local/lib/firefox/browser/omni.ja -appdir /usr/local/lib/firefox/browser 3188 tab

This is an e10s content process, which can be confirmed via "Multiprocess Windows" in about:support. e10s is controlled by the browser.tabs.remote.autostart* preferences in about:config and isn't enabled by default yet[1]. I haven't tested e10s much on anything but Nightly, 11.0-CURRENT, default options, so it's possible there are stability issues.
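
If e10s turns out to be enabled, switching it off is one way to test whether the content process is involved. The following is a minimal sketch, assuming the profile directory is known and that the base preference name `browser.tabs.remote.autostart` is the one in effect (suffixed variants covered by the `*` may also need setting); the function name is hypothetical.

```shell
# disable_e10s PROFILE_DIR
# Append a user.js override that turns the autostart preference off.
# Firefox reads user.js from the profile directory at startup.
disable_e10s() {
    printf '%s\n' 'user_pref("browser.tabs.remote.autostart", false);' \
        >> "$1/user.js"
}

# Usage (the profile directory name is a placeholder):
#   disable_e10s "$HOME/.mozilla/firefox/<profile>"
```

The same preference can be toggled interactively in about:config; a user.js entry merely makes the setting persist across restarts.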

To get useful stack traces, rebuild www/firefox and all its dependencies with debugging symbols, either by specifying WITH_DEBUG=1 on the command line or by adding something like the following to make.conf:
  
  CFLAGS += -g -O0
  STRIP = # empty

Otherwise, try building the vanilla source, e.g.:

  $ hg clone https://hg.mozilla.org/releases/mozilla-release/
  $ cd mozilla-release
  $ ./mach bootstrap
  $ nice ./mach build
  $ ./mach run
  $ ./mach run --debug # requires devel/gdb

[1] https://wiki.mozilla.org/Electrolysis#Schedule