| Field | Value |
|---|---|
| Summary | devel/glib20 Apps using glib 2.42.2 crashing with 'pthread_mutex_lock' abort |
| Product | Ports & Packages |
| Component | Individual Port(s) |
| Status | Closed FIXED |
| Severity | Affects Some People |
| Priority | --- |
| Version | Latest |
| Hardware | amd64 |
| OS | Any |
| Reporter | Will B <will.brokenbourgh2877> |
| Assignee | Koop Mast <kwm> |
| CC | gnome, kwm, lantw44, lightside, madpilot, olivierd, pi, rozhuk.im, tobik, vermaden, wulf |
| Flags | kwm: maintainer-feedback+, kwm: merge-quarterly+ |
| Bug Depends on | |
| Bug Blocks | 217946 |

Description
Will B
2015-05-03 03:17:15 UTC
For higher probability of reproducing this bug:
* Double-click/open a text file in Thunar, then close the text editor
* Switch to another program that uses glib (Thunderbird, Firefox)
* Take some action, such as double-clicking a list item or switching tabs
* Repeatedly execute the commands shown above in the bug report in xterm

The crashes do not happen every time these steps are taken, only randomly, but the error message is always the same.

Sorry for the noise (no editing of existing comments). The above steps *should* be:

For higher probability of reproducing this bug:
* While the commands shown above are repeatedly executing in xterm...
* ...double-click a folder in Thunar to open it, then...
* ...double-click a file (text, picture, sound) in the newly opened folder
* Switch the focus between Thunar, Firefox and Thunderbird
* Take some action in each program, such as double-clicking a list item or switching tabs
* Continue executing the commands in xterm and one of the glib-using apps will eventually crash

I think it is the known use-after-free problem of glib file monitor: https://bugzilla.gnome.org/show_bug.cgi?id=739424

> I think it is the known use-after-free problem of glib file monitor:
> https://bugzilla.gnome.org/show_bug.cgi?id=739424
Thank you for that. Yes, it appears to be similar. I saw my first crash after a clean install when I deleted selected files from ~/.local/share/applications, and when I was installing something with pkg (which probably installed an item in /usr/local/share/applications), it also happened.
From your link it looks like there is progress being made. This would fix a long-standing issue I've had when running FreeBSD as a desktop system.
Thanks :-)
Mark as in progress. GLib 2.45.x has a fix for this problem, but since the change is quite deep, it will not be backported to the 2.44.x series.

Does GLib 2.45 really fix this problem? I still get many gnome-shell and firefox crashes on GLib 2.45.

(In reply to Ting-Wei Lan from comment #6)
No, this bug is not fixed. I can trigger Firefox crashes just by exiting llpp. My theory is that ~/.config is monitored and llpp writes to ~/.config/llpp.conf on exit, which may (or may not) trigger a crash. I solved the problem related to llpp by moving the config file to ~/.llpp.conf instead (bug #214458), but there is clearly still a larger problem. Firefox most often dies with SIGBUS, but sometimes it prints the pthread_mutex_lock error when this happens.

Created attachment 177083 [details]
glib20-nokqueue.diff
Disabling kqueue support in glib seems to stop apps from randomly crashing.
One consequence of this (there are probably more) is that e.g. Thunar or
open/save dialogs won't auto-refresh their views anymore when files get added to
directories etc. But having stable apps is more important to me than having
features like this.
(In reply to Tobias Kortkamp from comment #8)
> But having stable apps is more important to me than having
> features like this.
Agreed. Firefox, Thunderbird and PCManFM still crash for me, but I just restart them and keep working (although it *is* annoying).

(In reply to Tobias Kortkamp from comment #8)
No, it's not a good idea to disable kqueue. It's our monitoring backend for BSD systems, and disabling it has a lot of side effects. But it's true, there is a problem with GLib.

Created attachment 177118 [details]
glib20-libinotify.patch
You can try glib20 with the inotify backend.
In fact, the gio-kqueue backend is a stripped-down version of the libinotify-kqueue library (devel/libinotify), imported in 2012 and modified to use the gio API rather than the inotify one.
Lots of bugs have been fixed upstream since then.
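
With an inotify-style backend, the set of events watched on a directory is governed by the mask passed to inotify_add_watch(). The sketch below is an illustration only, not code from the attached patch: it assumes the Linux numeric values of the inotify flags (devel/libinotify is described here as inotify-compatible, so it is assumed to share the names), and `watch_mask()` is a hypothetical helper. It shows one way a mask could keep the content-change flags (IN_MODIFY, IN_ATTRIB, IN_CLOSE_WRITE) for local directories and drop them for network mounts and removable devices, a trade-off raised elsewhere in this thread.

```python
# Event bits as defined in Linux <sys/inotify.h>. Assumption: an
# inotify-compatible library uses the same names for the same purposes.
IN_MODIFY = 0x00000002
IN_ATTRIB = 0x00000004
IN_CLOSE_WRITE = 0x00000008
IN_MOVED_FROM = 0x00000040
IN_MOVED_TO = 0x00000080
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200
IN_DELETE_SELF = 0x00000400
IN_MOVE_SELF = 0x00000800

# Flags that report directory-level changes (entries added, removed, moved).
STRUCTURE_FLAGS = (IN_MOVED_FROM | IN_MOVED_TO | IN_CREATE | IN_DELETE
                   | IN_DELETE_SELF | IN_MOVE_SELF)

# Flags that report changes to file contents or metadata.
CONTENT_FLAGS = IN_MODIFY | IN_ATTRIB | IN_CLOSE_WRITE


def watch_mask(is_local: bool) -> int:
    """Hypothetical helper: choose the watch mask for a directory.

    Local directories get structure and content flags; anything else
    (network mounts, removable devices) gets structure flags only.
    """
    if is_local:
        return STRUCTURE_FLAGS | CONTENT_FLAGS
    return STRUCTURE_FLAGS
```

How a directory is classified as local is left out on purpose; that decision is the hard part and depends on the platform.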
Comment on attachment 177083 [details]
glib20-nokqueue.diff

(In reply to Vladimir Kondratyev from comment #11)
Hi Vladimir, nice idea. I've been running with your patch the entire day and did not have a single crash. I'm also unable to trigger crashes intentionally. This looks very good to me so far, and much, much better than disabling kqueue/file monitoring entirely. Thanks!

(In reply to Tobias Kortkamp from comment #12)
> I'm also unable to trigger crashes intentionally.
I have used inotify as the gio backend for more than a year without any issues. But this patch has one drawback: it unconditionally disables watching for file content modifications in watched directories by removing the IN_MODIFY, IN_ATTRIB and IN_CLOSE_WRITE flags from the inotify_add_watch() arguments. The right way, IMO, is to keep these flags for local directories and remove them only when watching network mounts and removable devices. Something similar is done for the native kqueue gio backend but not for inotify: https://github.com/GNOME/glib/blob/master/gio/kqueue/kqueue-exclusions.c

Alternate kqueue backend: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214338

Any idea when this will be fixed?

Created attachment 186826 [details]
Update glib patch

The present glib port patches glib with upstream code from https://bugzilla.gnome.org/show_bug.cgi?id=739424

That bug has received an updated patch since the code was imported. The new patch fixes a locking problem in the specific code path causing this problem. The same findings can be found in bug #217946 and bug #221679.

I compiled a patch importing the newer upstream fix, which, even if not perfect, seems to be better than what we have at present.

I had to rename the two patch files to force the correct order when applying, because the new patch in upstream bug 739424 depends on code from upstream bug 778515.

I'm asking other xfce4 users and glib consumers to test this patch and report back.

Hope this can be committed soon.
In case my patch is accepted, it should also be merged to quarterly.

Created attachment 188059 [details]
Further update glib patch
I have analysed another crash, in x11-wm/xfce4-desktop this time.
I've discovered a code path in the kqueue helper causing glib to exit due to a failed assertion.
This is triggered when a kqueue NOTE_RENAME is generated on a file. The code translates it into a G_FILE_MONITOR_EVENT_MOVED and passes it to g_file_monitor_source_handle_event(), where it ends in a failed assertion every time.
I simply commented out the code causing this, so this event is actually ignored.
It does not look like a problem since this code path is bound to end in a failed assertion every time, while the same file move causing this event will also generate an event for the parent directory, which is handled through a slightly different code path.
My intuition is that while kqueue generates NOTE_RENAME events for single files, GIO semantics are that such events should be generated only for moves happening in watched directories, so ignoring a moved event for a file here is actually correct.
Please test this patch, I think it will fix a few more problems, but needs ample testing to make sure it's not causing problems in other areas.
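
The approach described above can be modelled in a few lines. This is a minimal sketch, not the glib code or the attached patch: the `MonitorEvent` enum and the `translate()` function are hypothetical stand-ins for the G_FILE_MONITOR_EVENT_* values and the kqueue helper, and the flag values are those of the EVFILT_VNODE notes in FreeBSD's <sys/event.h>. It shows a translation step for a single watched file in which NOTE_RENAME produces no event, on the assumption that the watch on the parent directory reports the move.

```python
from enum import Enum, auto
from typing import List

# EVFILT_VNODE note flags, with the values used in FreeBSD <sys/event.h>.
NOTE_DELETE = 0x0001
NOTE_WRITE = 0x0002
NOTE_ATTRIB = 0x0008
NOTE_RENAME = 0x0020


class MonitorEvent(Enum):
    """Hypothetical stand-ins for a few G_FILE_MONITOR_EVENT_* values."""
    CHANGED = auto()
    ATTRIBUTE_CHANGED = auto()
    DELETED = auto()


def translate(fflags: int) -> List[MonitorEvent]:
    """Map vnode flags reported for one watched file to monitor events.

    NOTE_RENAME is deliberately left unmapped: the move is expected to
    be reported through the watch on the parent directory instead.
    """
    events = []
    if fflags & NOTE_WRITE:
        events.append(MonitorEvent.CHANGED)
    if fflags & NOTE_ATTRIB:
        events.append(MonitorEvent.ATTRIBUTE_CHANGED)
    if fflags & NOTE_DELETE:
        events.append(MonitorEvent.DELETED)
    return events
```

A caller that receives an empty list simply emits nothing, so a rename of a watched file no longer reaches the code that asserts.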
Don't waste your time on the ugly kqueue() fam backend in glib; you can't fix this crap with simple patches.
The kqueue-based fam was broken and disabled (in ports) for over a year, and upstream does nothing.

Forget about it and use libinotify or my patch: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214338
It works fine: no crashes, no heavy CPU usage.

(In reply to rozhuk.im from comment #19)
> Don't waste your time on the ugly kqueue() fam backend in glib; you can't fix
> this crap with simple patches.
> The kqueue-based fam was broken and disabled (in ports) for over a year, and
> upstream does nothing.
>
> Forget about it and use libinotify or my patch:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214338
> It works fine: no crashes, no heavy CPU usage.

At present the glib20 port does not include an option to use inotify or your patch.

My patch is a simple incremental fix, which requires only importing upstream bits. (I'm going to update the patch shortly, removing my code and replacing it with code recently committed upstream.)

I'm not on the gnome team, so I don't know which solution they will import into the tree, but I don't think diverging from upstream is considered an option.

(In reply to Guido Falsi from comment #20)
> At present the glib20 port does not include an option to use inotify or your patch.
Yes, and no one cares about this.
But patches for both cases are available.

> My patch is a simple incremental fix, which requires only importing upstream bits.
A small patch can't fix big ugly code.
Why don't you remove/disable this code and use the new kqueue backend or libinotify?

> I'm not on the gnome team, so I don't know which solution they will import into
> the tree, but I don't think diverging from upstream is considered an option.
Then you can:
- use libinotify
- promote my patch upstream or here
- do nothing and continue to live with apps crashing

My patch (~1000 lines) has a small (150 lines) code wrapper from plain C to glib; it is easy to rewrite/support.
I also published C code to debug the kqueue() backend without building it with glib.

IMHO the FreeBSD desktop is unusable with the current glib fam.
We can discuss here what the right way to fix it is, but users will migrate to other OSes where developers fix problems.

This bug is 2.5 years old; there are two stable solutions (patches), and neither has been committed yet.
In a commercial or more adequate community, this situation would be impossible!

(In reply to rozhuk.im from comment #21)
> (In reply to Guido Falsi from comment #20)
>
> > At present the glib20 port does not include an option to use inotify or your patch.
>
> Yes, and no one cares about this.
> But patches for both cases are available.
>
[...]
>
> IMHO the FreeBSD desktop is unusable with the current glib fam.
> We can discuss here what the right way to fix it is, but users will migrate to
> other OSes where developers fix problems.
>
> This bug is 2.5 years old; there are two stable solutions (patches), and
> neither has been committed yet.
> In a commercial or more adequate community, this situation would be impossible!

I am not a gnome committer, so I cannot decide which code should go in. I have not even said mine is better; in fact, I don't know. You already stated you have another PR with a better solution; the people in charge of this port will choose what they prefer to commit.

Please avoid this kind of aggressive post.

Created attachment 188365 [details]
Upstream updates patch

A recent upstream commit [1] implements changes very similar to what I did, but better. I'm updating the patch to import the upstream commit. I tested it and it seems to correctly avoid the assertion.

[1] https://git.gnome.org/browse/glib/commit/?id=76072a2dde4a4acc8be8d3c47efbc6811ebe0c1e

(In reply to Guido Falsi from comment #22)
This is a very soft reaction from a user who lived almost a year without any fam, and another year with a fam that worked very poorly.
And now, when there are two ready solutions to the problem, the community does nothing at all.
For a whole year the community has been unable to decide whether to adopt one variant or add both as options.
It seems that the community does not care about users and their convenience, or even about the developers who do something for the community.
FreeBSD is not very popular, and with this attitude the last users will turn away from it.

(In reply to Guido Falsi from comment #23)
The original code has a very bad design; it can't be fixed by a series of one-line patches.
Also, my code handles NOTE_RENAME in a more correct way: it tries to find the file in the parent folder by inode and reports it as deleted only if it is not found. libinotify probably does the same smart thing.

This is one of the biggest problems of FreeBSD: this powerless attitude of NOT fixing problematic bugs that affect almost everybody, which turns users away from FreeBSD.

It seems that FreeBSD developers still mostly use OS X, because if they used a FreeBSD desktop, they would fix that bug in a week ...

Created attachment 189330 [details]
Proposed patch for options (since 453790)

Hello.

(In reply to comment #24)
> For a whole year the community has been unable to decide whether to adopt one
> variant or add both as options.

I attached a patch which adds INOTIFY and KQUEUE as options, based on attachment #177118 [details] from Vladimir Kondratyev. OPTIONS_DEFAULT can be changed to another value if needed.

Created attachment 189354 [details]
Proposed patch for options (since 453790 revision)

(In reply to comment #26)
Note that the proposed port options (INOTIFY and KQUEUE) allow switching between the gio/inotify and gio/kqueue (affected) implementations. I'm aware that the devel/libinotify port is a "Kevent based inotify compatible library" (i.e. it also uses kqueue; see comment #11). The descriptions of the port's options can be clarified.

A commit references this bug:

Author: kwm
Date: Fri Jan 26 21:26:58 UTC 2018
New revision: 460052
URL: https://svnweb.freebsd.org/changeset/ports/460052

Log:
Update glib to 2.50.3.

Also redo the kqueue patches.
Now we patch files only once, and add some bits that got lost somewhere (which is probably my fault). Which where causing crashes when for example nautilus or thundar where monitoring directories and files where added/removed.

PR: 199872

Changes:
head/devel/glib20/Makefile
head/devel/glib20/distinfo
head/devel/glib20/files/patch-bug739424
head/devel/glib20/files/patch-bug778515
head/devel/glib20/files/patch-gio_kqueue_gkqueuefilemonitor.c
head/devel/glib20/files/patch-gio_kqueue_kqueue-helper.c

Thank you for fixing this one, now deleting directories is not a PITA.

Another small step in the right direction for FreeBSD on the road to 'desktop' :p

Regards.

There is still a code path which causes the G_FILE_MONITOR_EVENT_MOVED event to be passed to g_file_monitor_source_handle_event(), causing a failed assertion and crash.

It's fixed in this upstream commit: https://github.com/GNOME/glib/commit/76072a2dde4a4acc8be8d3c47efbc6811ebe0c1e

I think this should also be imported into our port. My proposed patch included this.

(In reply to Guido Falsi from comment #30)
Is there something that 'blocks' importing this fixed version into Ports?

(In reply to vermaden from comment #31)
> (In reply to Guido Falsi from comment #30)
> Is there something that 'blocks' importing this fixed version into Ports?

Many ports depend on the glib20 port, so thorough testing is needed, and it should be performed by people familiar with the tested port.

Apart from this, I'm not in the gnome group, so to commit to this port I would need gnome group approval anyway. Otherwise gnome group members can commit this once they are confident it does not break anything in the many ports that depend on glib.

Please allow time for testing and evaluation by all involved parties.

(In reply to Guido Falsi from comment #32)
Thank you for the explanation.
A commit references this bug:

Author: kwm
Date: Sun Jan 28 20:29:16 UTC 2018
New revision: 460230
URL: https://svnweb.freebsd.org/changeset/ports/460230

Log:
Fix another crash bug in the kqueue backend.

PR: 199872 217946

Changes:
head/devel/glib20/Makefile
head/devel/glib20/files/patch-gio_kqueue_kqueue-helper.c

(In reply to Guido Falsi from comment #32)
Do you and the gnome group understand the situation!?
This is not some small bug in an obscure port with few users.
This is a BIG ANNOYING BUG that all users can see every day.
They see that many apps are crashing and think "the whole of FreeBSD is crap", because on Linux, Mac and Windows this does not happen, or only rarely.
Even on Windows 98SE, Explorer and apps did not crash on file operations. Is Windows 98SE better than FreeBSD 11 on the desktop?

How long will you keep using Mac/Windows and talking here about tests and the right way to apply and support patches!?

As a FreeBSD desktop user I do not care about "need tests with other ports" and "I don't think it good to include this into ports"; just fix those crashes. What are you waiting for?

Forget about the legacy glib kqueue() FAM backend. It has been totally broken for many years.
Choose between libinotify and my patch, or let users choose via options. Is it so difficult?

This is a ticket of shame. IMHO.

A commit references this bug:

Author: kwm
Date: Tue Jan 30 07:04:21 UTC 2018
New revision: 460371
URL: https://svnweb.freebsd.org/changeset/ports/460371

Log:
MFH: r460052 r460230

Update glib to 2.50.3.

Also redo the kqueue patches. Now we patch files only once, and add some bits that got lost somewhere (which is probably my fault). Which where causing crashes when for example nautilus or thundar where monitoring directories and files where added/removed.

PR: 199872

Fix another crash bug in the kqueue backend.
PR: 199872 217946
Approved by: ports-secteam (swills@)

Changes:
_U branches/2018Q1/
branches/2018Q1/devel/glib20/Makefile
branches/2018Q1/devel/glib20/distinfo
branches/2018Q1/devel/glib20/files/patch-bug739424
branches/2018Q1/devel/glib20/files/patch-bug778515
branches/2018Q1/devel/glib20/files/patch-gio_kqueue_gkqueuefilemonitor.c
branches/2018Q1/devel/glib20/files/patch-gio_kqueue_kqueue-helper.c

The problems with glib crashes should be fixed now. Please see bug 217946 for thunar related issues. Sorry for taking so long for fixing this.

Thank You.
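
For reference, the inode-based NOTE_RENAME handling mentioned in the discussion (look for the file in the parent folder by inode and report a deletion only if it is not found) could look roughly like the following. This is a sketch under stated assumptions, not the code from bug 214338: `find_by_inode()` and `classify_rename()` are hypothetical names, and the lookup assumes that the renamed file stays on the same filesystem and that no hard link to it lives in the same directory.

```python
import os
from typing import Optional, Tuple


def find_by_inode(directory: str, inode: int) -> Optional[str]:
    """Return the name of the entry in `directory` with this inode, or None."""
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if os.lstat(entry.path).st_ino == inode:
                    return entry.name
            except FileNotFoundError:
                # The entry vanished between listing and lstat(); skip it.
                continue
    return None


def classify_rename(directory: str, inode: int) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: decide what to report after a rename notification.

    If the inode is still present in the watched directory, the file was
    renamed in place; otherwise it left the directory and is reported
    as deleted.
    """
    new_name = find_by_inode(directory, inode)
    if new_name is None:
        return ("deleted", None)
    return ("renamed", new_name)
```

The scan is linear in the number of directory entries, so a real backend would probably want to cache the listing rather than rescan on every notification.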