Bug 199872 - devel/glib20 Apps using glib 2.42.2 crashing with 'pthread_mutex_lock' abort
Summary: devel/glib20 Apps using glib 2.42.2 crashing with 'pthread_mutex_lock' abort
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Koop Mast
URL:
Keywords:
Depends on:
Blocks: 217946
Reported: 2015-05-03 03:17 UTC by Will B
Modified: 2018-01-30 07:51 UTC (History)
CC List: 11 users

See Also:
kwm: maintainer-feedback+
kwm: merge-quarterly+


Attachments
gdb output while debugging Thunar (1.55 KB, text/plain)
2015-05-03 03:17 UTC, Will B
glib20-nokqueue.diff (1.40 KB, patch)
2016-11-16 17:33 UTC, Tobias Kortkamp
glib20-libinotify.patch (2.70 KB, patch)
2016-11-17 16:06 UTC, Vladimir Kondratyev
Update glib patch (8.19 KB, patch)
2017-10-01 09:05 UTC, Guido Falsi
Further update glib patch (9.65 KB, patch)
2017-11-16 23:13 UTC, Guido Falsi
Upstream updates patch (9.38 KB, patch)
2017-11-28 19:57 UTC, Guido Falsi
madpilot: maintainer-approval? (gnome)
Proposed patch for options (since 453790) (3.00 KB, patch)
2018-01-01 23:40 UTC, lightside
Proposed patch for options (since 453790 revision) (2.96 KB, patch)
2018-01-03 01:37 UTC, lightside
lightside: maintainer-approval? (gnome)

Description Will B 2015-05-03 03:17:15 UTC
Created attachment 156258 [details]
gdb output while debugging Thunar

Problem description:
- - -
Apps that use glib, such as Thunar, Firefox and Thunderbird, are crashing with the following error when performing operations with ports-mgmt/pkg:

  "GLib (gthread-posix.c): Unexpected error from C library during 'pthread_mutex_lock': Invalid argument.  Aborting."


How crash is produced:
- - -
* Thunar, Firefox and Thunderbird are started from lxpanel
* xterm is started from lxpanel, then the following commands are executed repeatedly:

    sudo pkg remove -y smplayer && \
    sudo pkg autoremove -y && \
    sudo pkg clean -ay && \
    sudo pkg install -y smplayer && \
    sudo ldconfig 

* In no discernible order, Thunar, Firefox and Thunderbird crash with SIGABRT with the above error message.
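
For anyone who wants to automate the loop above rather than re-running it by hand, here is a minimal Python sketch. The command lines are the ones from this report; the function names `build_cycle` and `run_cycles`, the `runner` parameter and the default of 50 iterations are assumptions made for illustration only.

```python
import subprocess


def build_cycle(package="smplayer"):
    """Return the argv lists for one remove/reinstall cycle from the report."""
    return [
        ["sudo", "pkg", "remove", "-y", package],
        ["sudo", "pkg", "autoremove", "-y"],
        ["sudo", "pkg", "clean", "-ay"],
        ["sudo", "pkg", "install", "-y", package],
        ["sudo", "ldconfig"],
    ]


def run_cycles(count=50, package="smplayer", runner=subprocess.run):
    """Run the cycle `count` times and return how many cycles fully completed.

    Like the `&&` chain in the report, a cycle stops at the first command
    that exits with a non-zero status.
    """
    completed = 0
    for _ in range(count):
        for argv in build_cycle(package):
            if runner(argv).returncode != 0:
                break
        else:
            completed += 1
    return completed
```

Calling `run_cycles()` in one terminal while clicking around in Thunar, Firefox and Thunderbird should approximate the manual procedure; it does not detect the crash itself.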


Other information:
- - -
The gdb debug output of Thunar crashing is attached.

uname -a:
FreeBSD will-freebsd 10.1-RELEASE-p9 FreeBSD 10.1-RELEASE-p9 #0: Tue Apr  7 01:09:46 UTC 2015 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

dmesg at time of crash:
pid 941 (thunar), uid 1001: exited on signal 6 (core dumped)

Other apps running:
openbox, lxpanel, thunderbird, firefox and xterm

Hardware info:
  * AMD 6-core processor (dmesg: "AMD FX(tm)-6300 Six-Core Processor (3511.78-MHz K8-class CPU)")
  * Asus M5A97 LE R2.0 motherboard
  * Radeon HD 5450 video card
  * 8 GB RAM
  * 500 GB 7,200 RPM Samsung hdd (dmesg: "SAMSUNG HD502HJ 1AJ10001")

Thank you! :-)
Comment 1 Will B 2015-05-03 03:34:52 UTC
For higher probability of reproducing this bug:
  * Double-click/open a text file in Thunar, then close the text editor
  * Switch to another program that uses glib (Thunderbird, Firefox)
  * Take some action, such as double-clicking a list item or switching tabs
  * Repeatedly execute the commands shown above in the bug report in xterm

The crashes do not happen every time these steps are taken, only randomly, but the error message is always the same.
Comment 2 Will B 2015-05-03 03:58:35 UTC
Sorry for the noise (no editing of existing comments).  The above steps *should* be:

For higher probability of reproducing this bug:
  * While the commands shown above are repeatedly executing in xterm...
  * ...double-click a folder in Thunar to open it then...
  * ...double-click a file (text, picture, sound) in the newly opened folder
  * Switch the focus between Thunar, Firefox and Thunderbird
  * Take some action in each program, such as double-clicking a list item or switching tabs
  * Continue executing the commands in xterm and one of the glib-using apps will eventually crash
Comment 3 Ting-Wei Lan 2015-05-03 07:00:26 UTC
I think it is the known use-after-free problem of glib file monitor:
https://bugzilla.gnome.org/show_bug.cgi?id=739424
Comment 4 Will B 2015-05-03 07:49:10 UTC
> I think it is the known use-after-free problem of glib file monitor:
> https://bugzilla.gnome.org/show_bug.cgi?id=739424

Thank you for that.  Yes, it appears to be similar.  I saw my first crash after a clean install when I deleted selected files from ~/.local/share/applications, and when I was installing something with pkg (which probably installed an item in /usr/local/share/applications), it also happened.

From your link it looks like there is progress being made.  This would fix a long-standing issue I've had when running FreeBSD as a desktop system.

Thanks :-)
Comment 5 Koop Mast freebsd_committer 2015-07-14 14:37:28 UTC
Mark as in progress.

Glib 2.45.x has a fix for this problem. But since the change is quite deep, it will not be backported to the 2.44.x series.
Comment 6 Ting-Wei Lan 2015-07-14 18:11:29 UTC
Does GLib 2.45 really fix this problem? I still get many gnome-shell and firefox crashes on GLib 2.45.
Comment 7 Tobias Kortkamp freebsd_committer 2016-11-13 00:11:20 UTC
(In reply to Ting-Wei Lan from comment #6)
No, this bug is not fixed.  I can trigger Firefox crashes just by exiting llpp.  My theory is that ~/.config is monitored and llpp writes to ~/.config/llpp.conf on exit, which may (or may not) trigger a crash.  I solved the problem related to llpp by moving the config file to ~/.llpp.conf instead (bug #214458), but there is clearly still a larger problem.

Firefox dies with SIGBUS most often but sometimes prints the pthread_mutex_lock error when this happens.
Comment 8 Tobias Kortkamp freebsd_committer 2016-11-16 17:33:13 UTC
Created attachment 177083 [details]
glib20-nokqueue.diff

Disabling kqueue support in glib seems to stop apps from randomly crashing.

One consequence of this (there are probably more) is that e.g. Thunar or
open/save dialogs won't auto-refresh their views anymore when files get added to
directories etc.  But having stable apps is more important to me than having
features like this.
Comment 9 Will B 2016-11-16 17:37:12 UTC
(In reply to Tobias Kortkamp from comment #8)
> But having stable apps is more important to me than having
> features like this.

Agreed.  Firefox, Thunderbird and PCManFM still crash for me, but I just restart them and keep working (although it *is* annoying).
Comment 10 Olivier Duchateau freebsd_committer 2016-11-16 17:47:03 UTC
(In reply to Tobias Kortkamp from comment #8)

No, it's not a good idea to disable kqueue. It's our monitoring backend for BSD systems (and there are a lot of side effects).

But it's true, there's a problem with GLib.
Comment 11 Vladimir Kondratyev freebsd_committer 2016-11-17 16:06:16 UTC
Created attachment 177118 [details]
glib20-libinotify.patch

You can try glib20 with the inotify backend.

In fact, the gio-kqueue backend is a stripped-down version of the libinotify-kqueue library (devel/libinotify), imported in 2012 and modified to use the gio API rather than the inotify one.
Lots of bugs have been fixed upstream since then.
Comment 12 Tobias Kortkamp freebsd_committer 2016-11-18 16:13:51 UTC
Comment on attachment 177083 [details]
glib20-nokqueue.diff

(In reply to Vladimir Kondratyev from comment #11)
Hi Vladimir,

nice idea.  I've been running with your patch the entire day and did
not have a single crash.  I'm also unable to trigger crashes
intentionally.  This looks very good to me so far, and much, much
better than disabling kqueue/file monitoring entirely.

Thanks!
Comment 13 Vladimir Kondratyev freebsd_committer 2016-11-20 21:42:00 UTC
(In reply to Tobias Kortkamp from comment #12)

> I'm also unable to trigger crashes intentionally.

I have used inotify as the gio backend for more than a year without any issues.

But this patch has one drawback: it unconditionally disables watching for file content modifications in watched directories by removing the IN_MODIFY, IN_ATTRIB and IN_CLOSE_WRITE flags from the inotify_add_watch() arguments.
The right way, IMO, is to keep these flags for local directories and remove them only when watching network mounts and removable devices. Something similar is done for the native kqueue gio backend but not for inotify:
https://github.com/GNOME/glib/blob/master/gio/kqueue/kqueue-exclusions.c
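
The mask selection suggested here could look roughly like the Python sketch below. The bit values are the ones from <sys/inotify.h>; `FULL_MASK`, `CONTENT_FLAGS` and `watch_mask` are hypothetical names, and how a directory is classified as local is left out entirely. It only illustrates the idea and is not code from the patch or from glib.

```python
# Event bits as defined in <sys/inotify.h>.
IN_MODIFY = 0x00000002
IN_ATTRIB = 0x00000004
IN_CLOSE_WRITE = 0x00000008
IN_MOVED_FROM = 0x00000040
IN_MOVED_TO = 0x00000080
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200
IN_DELETE_SELF = 0x00000400
IN_MOVE_SELF = 0x00000800

# Hypothetical full mask a directory monitor might request.
FULL_MASK = (IN_MODIFY | IN_ATTRIB | IN_CLOSE_WRITE | IN_MOVED_FROM
             | IN_MOVED_TO | IN_CREATE | IN_DELETE | IN_DELETE_SELF
             | IN_MOVE_SELF)

# The three flags the patch currently strips unconditionally.
CONTENT_FLAGS = IN_MODIFY | IN_ATTRIB | IN_CLOSE_WRITE


def watch_mask(is_local):
    """Keep content-change flags for local directories, strip them otherwise."""
    return FULL_MASK if is_local else FULL_MASK & ~CONTENT_FLAGS
```

With this split, a watch on a network mount or removable device would still report creations, deletions and moves, just not content changes.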
Comment 14 rozhuk.im 2017-04-29 19:57:03 UTC
Alternate kqueue backend: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214338
Comment 15 vermaden 2017-09-01 14:15:23 UTC
Any idea when this will be fixed?
Comment 16 Guido Falsi freebsd_committer 2017-10-01 09:05:27 UTC
Created attachment 186826 [details]
Update glib patch

The present glib port patches glib with upstream code from https://bugzilla.gnome.org/show_bug.cgi?id=739424

That bug has received an updated patch since it was imported. The new patch fixes a locking problem in the specific code path causing this problem.

The same findings can be found in bug #217946 and bug #221679.

I compiled a patch importing the newer upstream fix, which, even if not perfect, seems to be better than what we have at present.

I had to rename the two patch files to force the correct order when applying, because the new patch in upstream bug 739424 depends on code from upstream bug 778515.
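
To illustrate why a rename is needed, here is a small Python sketch. It assumes the patch files are applied in plain lexical order of their names, which is what the renaming implies; the new file names shown are hypothetical and are not necessarily the ones used in the attached patch.

```python
def apply_order(patch_names):
    """Assumed application order: a plain lexical sort of the file names."""
    return sorted(patch_names)


# Current names: the 739424 patch sorts first, before the code it depends on.
original = ["patch-bug778515", "patch-bug739424"]

# Hypothetical renames that put the prerequisite (778515) first.
renamed = ["patch-0-bug778515", "patch-1-bug739424"]
```

Under that assumption, `apply_order(original)` starts with the 739424 patch, while `apply_order(renamed)` starts with the 778515 patch it depends on.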

I'm asking other xfce4 users and glib consumers to test this patch, and report back.

Hope this can be committed soon.
Comment 17 Guido Falsi freebsd_committer 2017-10-01 09:19:50 UTC
In case my patch is accepted, it should also be merged to quarterly.
Comment 18 Guido Falsi freebsd_committer 2017-11-16 23:13:59 UTC
Created attachment 188059 [details]
Further update glib patch

I have analysed another crash, in x11-wm/xfce4-desktop this time.

I've discovered a code path in the kqueue helper causing glib to exit due to a failed assertion.

This is triggered when a kqueue NOTE_RENAME is generated on a file. The code translates it into a G_FILE_MONITOR_EVENT_MOVED and passes it to g_file_monitor_source_handle_event(), where it ends in an assertion failure every time.

I simply commented out the code causing this, so this event is now ignored.

This does not look like a problem, since this code path is bound to end in a failed assertion every time, while the same file move causing this event will also generate an event for the parent directory, which is handled through a slightly different code path.

My intuition is that while kqueue generates NOTE_RENAME events for single files, GIO semantics are that such events should be generated only for moves happening in watched directories, so ignoring a moved event for a file here is actually correct.

Please test this patch. I think it will fix a few more problems, but it needs ample testing to make sure it's not causing problems in other areas.
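
As a rough model of the behaviour after this change, here is a Python sketch. The NOTE_* values are the EVFILT_VNODE fflags from FreeBSD's <sys/event.h>; the function name `translate_file_event` and the event-name strings are made up for illustration, and the mapping of the other flags is an assumption, not a copy of the kqueue helper.

```python
# fflags bits for EVFILT_VNODE as defined in FreeBSD's <sys/event.h>.
NOTE_DELETE = 0x0001
NOTE_WRITE = 0x0002
NOTE_EXTEND = 0x0004
NOTE_ATTRIB = 0x0008
NOTE_LINK = 0x0010
NOTE_RENAME = 0x0020


def translate_file_event(fflags):
    """Map kqueue flags seen on a single watched file to GIO-style event names.

    Simplified, hypothetical model: NOTE_RENAME is deliberately not turned
    into a MOVED event, because the move is expected to be reported through
    the watch on the parent directory instead.
    """
    events = []
    if fflags & NOTE_DELETE:
        events.append("DELETED")
    if fflags & (NOTE_WRITE | NOTE_EXTEND):
        events.append("CHANGED")
    if fflags & NOTE_ATTRIB:
        events.append("ATTRIBUTE_CHANGED")
    # NOTE_RENAME (and NOTE_LINK) fall through without producing an event.
    return events
```

The point of the sketch is only that a file-level rename yields no MOVED event, so nothing reaches the assertion.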
Comment 19 rozhuk.im 2017-11-28 16:35:31 UTC
Don't waste your time on the ugly kqueue() FAM backend in glib; you can't fix this crap with simple patches.
The kqueue-based FAM has been broken and disabled (in ports) for over a year, and upstream does nothing.

Forget about it and use libinotify or my patch: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214338
It works fine: no crashes, no heavy CPU usage.
Comment 20 Guido Falsi freebsd_committer 2017-11-28 16:54:58 UTC
(In reply to rozhuk.im from comment #19)
> Don't waste your time on the ugly kqueue() FAM backend in glib; you can't fix
> this crap with simple patches.
> The kqueue-based FAM has been broken and disabled (in ports) for over a year,
> and upstream does nothing.
> 
> Forget about it and use libinotify or my patch:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214338
> It works fine: no crashes, no heavy CPU usage.

At present the glib20 port does not include an option to use inotify or your patch.

My patch is a simple incremental fix, which requires only importing upstream bits. (I'm going to update the patch shortly, removing my code and replacing it with code recently committed upstream.)

I'm not on the gnome team, so I don't know which solution they will import into the tree, but I don't think diverging from upstream is considered an option.
Comment 21 rozhuk.im 2017-11-28 18:43:48 UTC
(In reply to Guido Falsi from comment #20)

> At present the glib20 port does not include an option to use inotify or your patch.

Yes, and no one cares about this.
But patches for both cases are available.


> My patch is a simple incremental fix, which requires only importing upstream bits.

A small patch can't fix big, ugly code.
Why don't you remove/disable this code and use the new kqueue backend or libinotify?


> I'm not on the gnome team so I don't know which solution they will import in
> the tree, but I don't think diverging from upstream is accounted as an option.

Then you can:
- use libinotify
- promote my patch upstream or here
- do nothing and continue to live with apps crashing

My patch (~1000 lines) has a small (150 lines) code wrapper from plain C to glib, so it is easy to rewrite/support. I have also published C code to debug the kqueue() backend without building it with glib.


IMHO the FreeBSD desktop is unusable with the current glib FAM.
We can discuss here what the right way to fix it is, but users will migrate to other OSes where developers fix problems.

This bug is 2.5 years old, there are two stable solutions (patches), and it has not been committed/fixed yet.
In a commercial or a more adequate community this situation would be impossible!
Comment 22 Guido Falsi freebsd_committer 2017-11-28 19:02:31 UTC
(In reply to rozhuk.im from comment #21)
> (In reply to Guido Falsi from comment #20)
> 
> > At present the glib20 port does not include an option to use inotify or your patch.
> 
> Yes, and no one cares about this.
> But patches for both cases are available.
> 
[...]
> 
> IMHO the FreeBSD desktop is unusable with the current glib FAM.
> We can discuss here what the right way to fix it is, but users will migrate to
> other OSes where developers fix problems.
> 
> This bug is 2.5 years old, there are two stable solutions (patches), and it
> has not been committed/fixed yet.
> In a commercial or a more adequate community this situation would be impossible!

I am not a gnome committer, so I cannot decide which code should go in. I've not even said mine is better. In fact, I don't know.

You already stated you have another PR with a better solution; the people in charge of this port will choose what they prefer to commit.

Please avoid this kind of aggressive post.
Comment 23 Guido Falsi freebsd_committer 2017-11-28 19:57:04 UTC
Created attachment 188365 [details]
Upstream updates patch

A recent upstream commit [1] implements changes very similar to what I did, but better.

I'm updating the patch to import the upstream commit. I tested it and it seems to correctly avoid the assertion.


[1] https://git.gnome.org/browse/glib/commit/?id=76072a2dde4a4acc8be8d3c47efbc6811ebe0c1e
Comment 24 rozhuk.im 2017-11-28 20:19:41 UTC
(In reply to Guido Falsi from comment #22)

This is a very mild reaction from a user who lived almost a year without any FAM, and another year with a FAM that worked very poorly.
And now, when there are two ready solutions to the problem, the community does nothing at all.
For a whole year the community has not been able to decide to adopt a variant or add both as options.
It seems that the community does not care about users and their convenience, or even about the developers who do something for the community.
FreeBSD is not very popular, and with this attitude, the last users will turn away from it.


(In reply to Guido Falsi from comment #23)

The original code has a very bad design; it can't be fixed by a series of one-line patches.

Also, my code handles NOTE_RENAME in a more correct way: it tries to find the file in the parent folder by inode and reports it as deleted only if it is not found.
Probably libinotify does the same smart thing.
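
The inode lookup described above can be sketched in Python as follows. This is only an illustration of the idea, not the actual patch (which is C); the function names `find_by_inode` and `classify_rename` and the returned tuples are hypothetical.

```python
import os


def find_by_inode(parent_dir, inode):
    """Return the current name of the entry with `inode` in parent_dir, or None."""
    with os.scandir(parent_dir) as entries:
        for entry in entries:
            if entry.inode() == inode:
                return entry.name
    return None


def classify_rename(parent_dir, inode):
    """Decide what to report after a rename notification on a watched file.

    If the inode is still present in the parent directory, the file was
    renamed in place; only if it is gone is it reported as deleted.
    """
    name = find_by_inode(parent_dir, inode)
    if name is not None:
        return ("renamed", name)
    return ("deleted", None)
```

A file moved out of the watched directory is reported as deleted by this scheme, since its inode no longer appears in the parent folder.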
Comment 25 vermaden 2017-11-28 22:36:24 UTC
This is one of the biggest problems of FreeBSD.

This powerless attitude of NOT fixing problematic bugs that affect almost everybody turns users away from FreeBSD.

It seems that FreeBSD developers still mostly use OS X, because if they used a FreeBSD desktop, they would fix that bug in a week ...
Comment 26 lightside 2018-01-01 23:40:27 UTC
Created attachment 189330 [details]
Proposed patch for options (since 453790)

Hello.

(In reply to comment #24)
> For a whole year the community can not decide to adopt a variant or add both
> as options.
I attached a patch which adds INOTIFY and KQUEUE as options, based on attachment #177118 [details] from Vladimir Kondratyev. It is possible to change OPTIONS_DEFAULT to another value, if needed.
Comment 27 lightside 2018-01-03 01:37:54 UTC
Created attachment 189354 [details]
Proposed patch for options (since 453790 revision)

(In reply to comment #26)
Note that the proposed port options (INOTIFY and KQUEUE) allow switching between the gio/inotify and gio/kqueue (affected) implementations. I'm aware that the devel/libinotify port is a "Kevent based inotify compatible library" (i.e. it also uses kqueue; see comment #11). The descriptions of the port options could be clarified.
Comment 28 commit-hook freebsd_committer 2018-01-26 21:27:31 UTC
A commit references this bug:

Author: kwm
Date: Fri Jan 26 21:26:58 UTC 2018
New revision: 460052
URL: https://svnweb.freebsd.org/changeset/ports/460052

Log:
  Update glib to 2.50.3.

  Also redo the kqueue patches. Now we patch files only once, and add some
  bits that got lost somewhere (which is probably my fault). Which where
  causing crashes when for example nautilus or thundar where monitoring
  directories and files where added/removed.

  PR:		199872

Changes:
  head/devel/glib20/Makefile
  head/devel/glib20/distinfo
  head/devel/glib20/files/patch-bug739424
  head/devel/glib20/files/patch-bug778515
  head/devel/glib20/files/patch-gio_kqueue_gkqueuefilemonitor.c
  head/devel/glib20/files/patch-gio_kqueue_kqueue-helper.c
Comment 29 vermaden 2018-01-27 15:55:28 UTC
Thank you for fixing this one, now deleting directories is not a PITA.

Another small step in the right direction for FreeBSD on the road to 'desktop' :p

Regards.
Comment 30 Guido Falsi freebsd_committer 2018-01-28 11:37:16 UTC
There is still a code path which causes the G_FILE_MONITOR_EVENT_MOVED event to be passed to g_file_monitor_source_handle_event(), causing a failed assertion and crash.

It's fixed in this upstream commit:

https://github.com/GNOME/glib/commit/76072a2dde4a4acc8be8d3c47efbc6811ebe0c1e

I think this should also be imported in our port.

My proposed patch included this.
Comment 31 vermaden 2018-01-28 17:18:06 UTC
(In reply to Guido Falsi from comment #30)
Is there something that 'blocks' importing this fixed version into Ports?
Comment 32 Guido Falsi freebsd_committer 2018-01-28 17:42:18 UTC
(In reply to vermaden from comment #31)
> (In reply to Guido Falsi from comment #30)
> Is there something that 'blocks' importing this fixed version into Ports?

Many ports depend on the glib20 port, so thorough testing is needed, and it should be performed by people familiar with the ports being tested.

Apart from this, I'm not in the gnome group, so to commit to this port I would need gnome group approval anyway. Otherwise gnome group members can commit this, once they are confident it does not break anything in the many ports that depend on glib.

Please allow all involved parties time for testing and evaluating.
Comment 33 vermaden 2018-01-28 17:49:40 UTC
(In reply to Guido Falsi from comment #32)
Thank you for the explanation.
Comment 34 commit-hook freebsd_committer 2018-01-28 20:29:54 UTC
A commit references this bug:

Author: kwm
Date: Sun Jan 28 20:29:16 UTC 2018
New revision: 460230
URL: https://svnweb.freebsd.org/changeset/ports/460230

Log:
  Fix another crash bug in the kqueue backend.

  PR:		199872 217946

Changes:
  head/devel/glib20/Makefile
  head/devel/glib20/files/patch-gio_kqueue_kqueue-helper.c
Comment 35 rozhuk.im 2018-01-28 21:26:33 UTC
(In reply to Guido Falsi from comment #32)

Do you and the gnome group understand the situation!?

This is not some small bug in an obscure port with few users.
This is a BIG ANNOYING BUG that all users can see every day.
They see that many apps are crashing and think: "the whole of FreeBSD is crap", because on Linux, Mac and Windows the same thing does not happen nearly as often. Even on Windows 98SE, Explorer and apps did not crash on file operations.
Is Windows 98SE better than FreeBSD 11 on the desktop?

How long will you use Mac/Windows and talk here about tests and the right way to apply and support patches!?

As a FreeBSD desktop user I do not care about "need tests with other ports" and "I don't think it good to include this into ports" - just fix the crashes, what are you waiting for?

Forget about the legacy glib kqueue() FAM backend. It has been totally broken for many years.
Choose between libinotify and my patch, or let users choose via options.
Is it so difficult?


This is a ticket of shame. IMHO.
Comment 36 commit-hook freebsd_committer 2018-01-30 07:04:34 UTC
A commit references this bug:

Author: kwm
Date: Tue Jan 30 07:04:21 UTC 2018
New revision: 460371
URL: https://svnweb.freebsd.org/changeset/ports/460371

Log:
  MFH: r460052 r460230

  Update glib to 2.50.3.

  Also redo the kqueue patches. Now we patch files only once, and add some
  bits that got lost somewhere (which is probably my fault). Which where
  causing crashes when for example nautilus or thundar where monitoring
  directories and files where added/removed.

  PR:		199872

  Fix another crash bug in the kqueue backend.

  PR:		199872 217946

  Approved by:	ports-secteam (swills@)

Changes:
_U  branches/2018Q1/
  branches/2018Q1/devel/glib20/Makefile
  branches/2018Q1/devel/glib20/distinfo
  branches/2018Q1/devel/glib20/files/patch-bug739424
  branches/2018Q1/devel/glib20/files/patch-bug778515
  branches/2018Q1/devel/glib20/files/patch-gio_kqueue_gkqueuefilemonitor.c
  branches/2018Q1/devel/glib20/files/patch-gio_kqueue_kqueue-helper.c
Comment 38 Koop Mast freebsd_committer 2018-01-30 07:22:31 UTC
The problems with glib crashes should be fixed now. Please see bug 217946 for Thunar-related issues. Sorry for taking so long to fix this.
Comment 39 vermaden 2018-01-30 07:51:44 UTC
Thank You.