Description
Ivan Rozhuk
2016-11-08 21:15:40 UTC
(In reply to rozhuk.im from comment #0) > - no additional threads created Just wondering. Is it safe to call readdir() from glib worker thread context? Are there any chances that glib-based application would just freeze if readdir() would take tooooo looong time to complete? I don`t know much about glib internals, that is why I am asking I use this patch mo than 10 days and all OK - no freezes, no high CPU usage. I think kernel cache files/dirs info for most cases. I don`t know much about glib internals too. :) Thats why code have only tinny API shim from my code that can be used in any application on BSD to glib filemon plugin API. (In reply to rozhuk.im from comment #2) > I use this patch mo than 10 days and all OK - no freezes, no high CPU usage. > I think kernel cache files/dirs info for most cases. No doubt, it should work nice in most desktop usages. But there are some worst -case scenarios like: 1. Working over slow and unreliable media like mounted network shares especially when public networks are used 2. Working with userspace filesystems under heavy CPU load and memory pressure 3. Working with large directories on medium-speed media like USB1/2-dongles I have seen readdir()s lasting for more than minute in real life and it is undesirable to see application stopped that time. I think you can avoid this with using of some async io libraries like libeio but that just moves thread creation from gio to external library. I see more issues in your backend but this one is most important IMO (In reply to rozhuk.im from comment #0) I opened an upstream bug report for kqueue backend reworking 2 years ago: https://bugzilla.gnome.org/show_bug.cgi?id=739424 You can regenerate your patch in 'git format-patch' format and upload it there to get feedback from upstream developers. kqueue backend is used by many operating systems, so your code will be tested on other operating system like OpenBSD before it can be accepted by upstream. From the discussion on IRC we had last year, the reason to have a polling fallback is that kqueue requires a file descriptor to the directory to be monitored, but having an open file descriptor blocks unmount. Linux inotify doesn't have this problem because inotify uses a file path instead of a file descriptor, and Apple Mac allows getting a file descriptor for use in kqueue without blocking unmount by using O_EVTONLY. (In reply to Ting-Wei Lan from comment #4) I can give it a try on OpenBSD this week-end if you want. AFAIK the polling code is also used when running out of file descriptors (which is limited by kern.maxfiles sysctl in and openfiles in login.conf). (In reply to Antoine Jacoutot from comment #5) I didn't mean to ask for testing now. I think we can test it after it has been reviewed by at least one person. (In reply to Vladimir Kondratyev from comment #3) All your "worst cases" match better then high CPU usage after short time of use original backend. 1, 3 No events received from remote systems if other host change files. If files changed by system with my code - all changes cached by kernel. I see some freezes on dir open, where thunar add other dir to monitoring and they open() for read and they filetime updates. 2. In this case original code will do same, no difference. In some previous version I use additional thread for receive and process kqueue events, but for now I se no difference. May be glib create its own thread to handle read events from kqueue, I do not investigate this. (In reply to Ting-Wei Lan from comment #6) > I think we can test it after it has been reviewed by at least one person. For now I see 3 major issues with this patch. 1. Possible glib worker thread blocking (discutable, I like the idea to move event processing from dedicated thread to glib worker) 2. Multiple hardlinked files in single directory are not handled properly - rename tracking code just search for first file with the same inode number not taking in account if it still persists in directory or not 3. No watching for modifying/attribute changing of files inside watched directory. I think, implementing 2 & 3 will lead to reinventing libinotify which old version is a base of current kqueue backend (In reply to Ting-Wei Lan from comment #4) unmount block if one or more filemanagers watch for dir, user can close filemanager and retry unmount. I use umount -f, and make this fix: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214336 O_EVTONLY may be some one add to FreeBSD... I will post my patch to upstream later. (In reply to Vladimir Kondratyev from comment #8) 1. You can add sleep(10000) over kevent() to test freezes. 2. If file deleted it will reported. 3. Im not sure about it. Code compare current files/subdirs metadata in dir with previous and report if changed. But if no event from kqueue() received for dir then no compare/report for files/subdirs. I can add polling by timer to handle this case and network mounts but I think that it bad idea. (In reply to rozhuk.im from comment #10) > 2. If file deleted it will reported. It seems that I did not explain well what i meant. We are watching for directory with hardlinked childs "1" and "2". They are sharing same inode number as it is one file with 2 links to parent dir. Than we rename "1" to "3" triggering NOTE_WRITE for parent directory filedesc gio handler for NOTE_WRITE calls your code and at some point of time readdir() loop reads entry for file "3" and find it as "new file". Next step it tries to find out if this file has been renamed and after some looping over previous directory content with 50% probability it stops at file "2" with the same inode number and decides that "2" is the filename before rename But that is wrong as you would send 3 events in that case: '"1" is deleted', '"2" is renamed to "3"' and '"2" is created'. > 3. Im not sure about it. Code compare current files/subdirs metadata in dir with previous and report if changed. > But if no event from kqueue() received for dir then no compare/report for files/subdirs. IIRC, if directory is watched, file monitor should report child files attribute changes and modifications. In kqueue() case it means that you should open() every child file and watch for NOTE_ATTRIB, NOTE_WRITE and NOTE_CLOSE_WRITE (and maybe NOTE_DELETE as it mapped to attribute changes sometimes) events. (In reply to Vladimir Kondratyev from comment #11) > But that is wrong as you would send 3 events in that case: > '"1" is deleted', '"2" is renamed to "3"' and '"2" is created'. Current gio-kqueue backend uses 4-pass algorithm to solve issues like this one and implement correct event-ordering Is there some real issues with my current implementation? Created attachment 178938 [details]
kqueue() file monitoring backend: async add/delete monitor
Now add/remove file/dir monitoring is async.
Created attachment 181557 [details]
readdir_r() -> getdirentries()
After some system/ports updates during this months app start crash on standard open dialog: (geany in this trace)
I fail to reproduce this in demo app, so I decide replace readdir_r() to getdirentries() and this helps.
#0 0x0000000803e856ea in thr_kill () from /lib/libc.so.7
#1 0x0000000803e856b4 in raise () from /lib/libc.so.7
#2 0x0000000803e85629 in abort () from /lib/libc.so.7
#3 0x000000080206641a in pthread_attr_getaffinity_np () from /lib/libthr.so.3
#4 0x00000008020601c1 in pthread_mutex_destroy () from /lib/libthr.so.3
#5 0x00000008020606bd in pthread_mutex_destroy () from /lib/libthr.so.3
#6 0x000000080205f8c9 in pthread_mutex_lock () from /lib/libthr.so.3
#7 0x0000000803ef89d2 in readdir_r () from /lib/libc.so.7
#8 0x0000000802f9666a in kqueue_file_mon_data_init (fmd=0x80f99b700) at kqueue_fnm.c:262
#9 0x0000000802f96020 in kqueue_fnm_delay_call_process (kfnm=0x80f6f14e0, forced_msg_cb=0) at kqueue_fnm.c:498
#10 0x0000000802f969fe in kqueue_fnm_proccess_events (kfnm=0x80f6f14e0, cb_func=0x802f95e00 <kqueue_event_handler>) at kqueue_fnm.c:621
#11 0x0000000802f95de4 in g_kqueue_file_monitor_callback (fd=14, condition=G_IO_IN, user_data=0x0) at gkqueuefilemonitor.c:92
#12 0x000000080393893a in g_unix_fd_source_dispatch (source=0x80cb6faf0, callback=0x802f95db0 <g_kqueue_file_monitor_callback>, user_data=0x0) at glib-unix.c:308
#13 0x00000008038d2be3 in g_main_dispatch (context=0x80ae41700) at gmain.c:3203
#14 0x00000008038d2a30 in g_main_context_dispatch (context=0x80ae41700) at gmain.c:3856
#15 0x00000008038d2f7e in g_main_context_iterate (context=0x80ae41700, block=1, dispatch=1, self=0x80b08b280) at gmain.c:3929
#16 0x00000008038d2ff3 in g_main_context_iteration (context=0x80ae41700, may_block=1) at gmain.c:3990
#17 0x00000008038d498d in glib_worker_main (data=0x0) at gmain.c:5783
#18 0x000000080390ae9d in g_thread_proxy (data=0x80b08b280) at gthread.c:784
#19 0x0000000802058bd5 in pthread_create () from /lib/libthr.so.3
#20 0x0000000000000000 in ?? ()
Created attachment 181816 [details]
fix mem corruption
Created attachment 182423 [details]
update to build with last svn version
Koop Mast, may be we try to use this instead of buggy gnome version? Created attachment 182749 [details]
fix notify sequence
Now all deleted events send before rename/mod.
This fix for case where pcmanfm show not all files after copy to sshfs mount folder.
Created attachment 184001 [details]
patch for Makefile.in
Build fix after last glib update.
Created attachment 184287 [details]
make patch as option; call autoreconf
Created attachment 184290 [details]
fix patch sequence
Created attachment 184301 [details]
replace patch by file for gio/kqueue/gkqueuefilemonitor.c
I will commit this in from of non-default option after some testing if no objectins from gnome@ Created attachment 184456 [details]
merge in NLS option
Ok so first, sorry I haven't responded to this PR. Secondly, I don't think it good to include this into ports [1]. This should go via upstream, and interaction with the OpenBSD people (who are also using the kqueue backend) and maybe NetBSD but I'm not sure what backend they are using. [1] this is not so much about that rozkuh.im@ isn't doing good work, but more of that the patches stay in the ports tree and never make it upstream, kind of deal. It is something I'm guilty off myself, so I rather not step into that trap this time. Upstream wants that I test it with OpenBSD and all other systems that use this code. I have no time to do this and I dont want play with other systems. Also they wants that I use all staff (like malloc() and other) from their glib instead of system. I do not want to do this because now I can debug and test code with simple C prog without any relations with glib. (In reply to Koop Mast from comment #26) 1. Upstream wait for patch that tested with all systems where kequeue(). 2. Current kqueue() backend is ugly, buggy (and broken/disabled few long times in past 3 years) at least for desktop purposes. 3. Patch for port is optional. OpenBSD and NetBSD peoples wait with FreeBSD people then some one do all job. No one want to do this. I cant spent many time to promote this patch. So, FreeBSD peoples can continue wait with others or use this optional patch. Any idea when this will be fixed? (In reply to vermaden from comment #29) Upstream does not want the patch in its current form. Author of the patch does not want to rewrite it using glibc memory management instead of standard library and test under other BSD systems as upstrem demands. Maintainer does not want local patch at all. (In reply to Eugene Grosbein from comment #30) Politics ... (In reply to Eugene Grosbein from comment #30) Maybe FreeBSD Foundation could jump in with sponsored rewrite/project? (In reply to Eugene Grosbein from comment #30) > Upstream does not want the patch in its current form. Patch in its current form is feature incomplete. It can not track directory subfile modifications. That is why it is tiny. It handles only filename modifications not file content changes as required. > Author of the patch does not want to rewrite it using glc memory management instead of standard library and test under other BSD systems as upstrem demands. > Maintainer does not want local patch at all. Unfortunately, author is not very responsible for his patch too. Hi just ignored some of my questions I had asked him in this thread before. P.S. I am not a glib developer. I am current maintainer (not original author) of libinotify that is a base of current glib`s kqueue file monitoring backend. The change is not approved in its current form by maintainer nor upstream. Some changes are required to make progress. (In reply to Vladimir Kondratyev from comment #33) There is 3 options: 1. Ignore files changes in dir 2. Use timer for polling (rescan all monitored dirs every 1-5-10 sec) 3. Open all files in monitored dir and add their fd to kqueue() to monitor size/attr changes. I choose 1. It good for me on desktop. 2. Looks low additional resources need, but I dont know what time interval to chose. Also Im not sure that this good idea for sshfs on slow links. 3. FreeBSD does not have O_EVTONLY or at least O_NOATIME. Only small number of file systems can be mounted with NOATIME. I dont want update atime on all files in dir every time I open it in filemanager. Also I dont was lags on sshfs with slow net. In next patch version I will add thread to handle kqueue() events, also all open(), close(), getdents() will be handled by this thread. Created attachment 188088 [details]
add internal thread for syscals that may block
(In reply to rozhuk.im from comment #35) > I choose 1. libinotify is more flexible in doing this. It defaults to p.3. but can choose p.1. for certain filesystems. Look at relevant commit: https://github.com/libinotify-kqueue/libinotify-kqueue/commit/ec365538c488572e4d68dacd9498d46dbd62e257 Personally I disable opening of subfiles on fusefs with passing --enable-skip-subfiles=fusefs parameter to configure script. But our devel/libinotify port does not utilize this option yet. Created attachment 188134 [details]
add monitoring files/subdirs in monitored dir
Created attachment 188144 [details]
fix mem access bug, code cleanup
Now all syscalls that can block - executed in special internal thread
All files/subdirs on local filesystems added to kqueue() to track changes
(fufsefs treat as local fs, except sshfs)
Created attachment 188261 [details]
patch: code cleanup and optimization
Created attachment 188262 [details]
tool to test patch
Tool to test code.
Place in folder with latest kqueue_fnm.h and kqueue_fnm.c, edit const char *paths
build:
cc gkqueuefilemonitor.c kqueue_fnm.c -O0 -DDEBUG -lpthread -o gkqueuefilemonitor
run:
./gkqueuefilemonitor
Created attachment 190711 [details]
tool to test
Created attachment 190712 [details]
patch
Add:
- adaptive rate limit
- dir objects count limit
- tunable to disable monitor local dir subfiles changes
- tunable to disable monitor local dir subdirs changes
- tunable to disable monitor for subfiles and subdirs on non local filesystems
Fix:
- descriptors leak on procfs, may cause use mem after free
- add workaround for procfs and tmpfs, where st_ino may != d_fileno
Also done some test with valgring and cppcheck.
Created attachment 191448 [details]
fix broken patch
Created attachment 195450 [details] Proposed patch (since 473551 revision) Hello. I tested build for patch in attachment #191448 [details] after ports r473551 changes and found following linker error on FreeBSD 10.4 amd64: -8<-- ./.libs/libgio-2.0.so: undefined reference to `mstosbt' cc: error: linker command failed with exit code 1 (use -v to see invocation) -->8- Therefore, I added implementation for mstosbt function to files/kqueue_fnm.c file, based on: https://svnweb.freebsd.org/base?view=revision&revision=321686 https://svnweb.freebsd.org/base/head/sys/sys/time.h?view=markup&pathrev=321686#l198 https://svnweb.freebsd.org/base/head/sys/sys/param.h?revision=321688&view=markup#l61 The `make check-plist` command found some errors (excerpt): -8<-- ====> Running Q/A tests (stage-qa) ====> Checking for pkg-plist issues (check-plist) ===> Parsing plist ===> Checking for items in STAGEDIR missing from pkg-plist Error: Orphaned: lib/charset.alias Error: Orphaned: share/doc/gio/GAction.html <..> Error: Orphaned: share/doc/gobject/up.png ===> Checking for items in pkg-plist which are not in STAGEDIR ===> Error: Plist issues found. *** Error code 1 Stop. <..> -->8- This is because of autoreconf usage on pre-configure-FAM_ALTBACKEND-on stage, which overwrites previous changes for Makefile.in files (see bug #226920, comment #4 for another explanation). I fixed this with using some sed changes for ${WRKSRC}/gio/kqueue/Makefile.in file. The files/extra-patch-gio_kqueue_Makefile.am patch was removed. Attached modified patch. Probably, there is a need to clarify description for FAM_ALTBACKEND option. For example, usage of "Alternative" instead of "Alternate" word. (In reply to comment #45) > Probably, there is a need to clarify description for FAM_ALTBACKEND option. > For example, usage of "Alternative" instead of "Alternate" word. Maybe it's ok, if "Alternate" has a meaning of changing/replacing something. Created attachment 195570 [details]
add patched Makefile.in
(In reply to lightside from comment #45) Thanks for pointing to mstosbt(). I got this issue and fix it, but forget update patch there. To avoid playing with __FreeBSD_version I make always defined macro: #define MSTOSBT(__ms) ((sbintime_t)((((int64_t)(__ms)) * (int64_t)(((uint64_t)1 << 63) / 500) >> 32))) Your way to fix `make check-plist` is a bit hacky to me :) In last patch I put patched/regenerated Makefile.in and now no need to do any patch or autoreconf operations. "Alternative" vs "Alternate" - I do not care, feel will set any other. It's been almost a year since last maintainer response and patches were updates since then. Let's see if maintainer changed their mind. Created attachment 195597 [details] Proposed patch (since 473551 revision) (In reply to comment #48) > To avoid playing with __FreeBSD_version I make always defined macro Ok, I updated files/kqueue_fnm.c file based on your changes in attachment #195570 [details], but fixed spelling error for "compiler" word. (In reply to comment #48) > Your way to fix `make check-plist` is a bit hacky to me :) The following sed patch: @${REINPLACE_CMD} -e '/^libkqueue_la_OBJECTS =/s|$$(am_libkqueue_la_OBJECTS)|libkqueue_la-gkqueuefilemonitor.lo libkqueue_la-kqueue-helper.lo|' \ ${WRKSRC}/gio/kqueue/Makefile.in changes following line of ${WRKSRC}/gio/kqueue/Makefile.in file from: libkqueue_la_OBJECTS = $(am_libkqueue_la_OBJECTS) to: libkqueue_la_OBJECTS = libkqueue_la-gkqueuefilemonitor.lo libkqueue_la-kqueue-helper.lo This is because of other changes for am_libkqueue_la_OBJECTS variable in files/patch-gio_filemonitor patch: -8<-- --- gio/kqueue/Makefile.in.orig 2017-02-13 15:22:04 UTC +++ gio/kqueue/Makefile.in @@ -183,9 +183,8 @@ LTLIBRARIES = $(installed_test_LTLIBRARI libkqueue_la_LIBADD = am__objects_1 = am_libkqueue_la_OBJECTS = libkqueue_la-gkqueuefilemonitor.lo \ - libkqueue_la-kqueue-helper.lo libkqueue_la-kqueue-thread.lo \ - libkqueue_la-kqueue-sub.lo libkqueue_la-kqueue-missing.lo \ - libkqueue_la-kqueue-utils.lo libkqueue_la-kqueue-exclusions.lo \ + libkqueue_la-kqueue-helper.lo \ + libkqueue_la-kqueue-missing.lo \ libkqueue_la-dep-list.lo $(am__objects_1) libkqueue_la_OBJECTS = $(am_libkqueue_la_OBJECTS) AM_V_lt = $(am__v_lt_@AM_V@) -->8- The "/^libkqueue_la_OBJECTS =/" fragment searches for line(s) which starts with "libkqueue_la_OBJECTS =" text and "s|$$(am_libkqueue_la_OBJECTS)|libkqueue_la-gkqueuefilemonitor.lo libkqueue_la-kqueue-helper.lo|" fragment changes "$(am_libkqueue_la_OBJECTS)" text to "libkqueue_la-gkqueuefilemonitor.lo libkqueue_la-kqueue-helper.lo", if line(s) with "libkqueue_la_OBJECTS =" text was found. The libkqueue_la-gkqueuefilemonitor.lo and libkqueue_la-kqueue-helper.lo have available Makefile's rules for gkqueuefilemonitor.c and kqueue-helper.c files. This is why the files/kqueue_fnm.c was copied to ${WRKSRC}/gio/kqueue/kqueue-helper.c, instead of ${WRKSRC}/gio/kqueue/kqueue_fnm.c. (In reply to comment #48) > In last patch I put patched/regenerated Makefile.in and now no need to do any > patch or autoreconf operations. This works, but may require to apply more changes for devel/glib20 port. I think, the maintainer may decide about used approach. Attached modified patch. Thanks for attention. Created attachment 195598 [details]
Proposed patch (since 473551 revision)
Added devel/glib20 prefix, just in case (for easy comparison between existing patches on FreeBSD Bugzilla).
Need to note, that I just proposed some changes related to integration of this PR's proposal for current devel/glib20 port (based on ports r469308 and ports r473551 changes). I saw, that PR's reporter also proposed to replace existing gio/kqueue backend on: https://gitlab.gnome.org/GNOME/glib/issues/1460 I maybe wrong, but I guess, this may be counterproductive, if consider amount of other people's work for existing gio/kqueue backend, some of which maybe against of replacing it for some reason(s). Probably, possible to propose these changes as an alternative backend, for example as an additional gio's sub-directory (e.g. there are gio/inotify (see bug #199872, comment #26, where it was possible), gio/kqueue, etc.). But this may require to adapt it for existing build system(s). So, other people may test/use this backend, if needed. (In reply to lightside from comment #52) https://gitlab.gnome.org/GNOME/glib/commits/master/gio/kqueue GNOME peoples broke kqueue() backend at 2015 and not fix it during years. After fixes it is still ugly: eat CPU, freeze GUI, etc... I see no reason to keep it. Even migration to libinotify backend is much better. Created attachment 195618 [details] Alternative patch (since 473551 revision) (In reply to comment #52) > Probably, possible to propose these changes as an alternative backend, > for example as an additional gio's sub-directory I attached (obsoleted) patch, which adds/uses --enable-kqueue-fnm configure's option, if FAM_ALTBACKEND port's option enabled. But this may require to use autoreconf for current devel/glib20 version. I converted changes for Makefile.in to Makefile.am changes, with using some sed patches (similar to attachment #191861 [details]). For example, this adds "--enable-kqueue-fnm" configure option: -8<-- AC_ARG_ENABLE([kqueue-fnm], [AS_HELP_STRING([--enable-kqueue-fnm], [enable gio/kqueue_fnm backend, if kqueue supported])]) AM_CONDITIONAL(ENABLE_KQUEUE_FNM, [test "$enable_kqueue_fnm" = "yes"]) -->8- and "gio/kqueue_fnm/Makefile" to list of AC_CONFIG_FILES for configure.ac file. The ENABLE_KQUEUE_FNM define checked in gio/Makefile.am: -8<-- if HAVE_KQUEUE if ENABLE_KQUEUE_FNM SUBDIRS += kqueue_fnm platform_libadd += kqueue_fnm/libkqueue.la platform_deps += kqueue_fnm/libkqueue.la else SUBDIRS += kqueue platform_libadd += kqueue/libkqueue.la platform_deps += kqueue/libkqueue.la endif endif -->8- It's just to show that such a concept is possible. Created attachment 195619 [details] Alternative patch (since 473551 revision) (In reply to comment #54) Updated changes for gio/kqueue/Makefile.am file in files/patch-gio_filemonitor patch. Created attachment 195649 [details] Proposed patch (since 475857 revision) Fixed MSTOSBT macro for files/kqueue_fnm.c. Test case: -8<-- /* `cc -O2 -o test test.c` or `cc -O2 -DFIXED=1 -o test test.c` * ./test && echo ok || echo not ok * ./test > test.txt * cat test.txt | grep true * cat test.txt | grep false */ #include <stdio.h> #include <sys/param.h> #include <sys/types.h> #include <sys/time.h> #ifndef FIXED #define MSTOSBT(__ms) ((sbintime_t)((((int64_t)(__ms)) * (int64_t)(((uint64_t)1 << 63) / 500) >> 32))) #else #define MSTOSBT(__ms) (sbintime_t)(((int64_t)__ms * (((uint64_t)1 << 63) / 500)) >> 32) #endif #if defined(__FreeBSD_version) && (__FreeBSD_version < 1200040) /* https://svnweb.freebsd.org/base?view=revision&revision=321686 * https://svnweb.freebsd.org/base/head/sys/sys/time.h?view=markup&pathrev=321686#l198 */ static __inline sbintime_t mstosbt(int64_t _ms) { return ((_ms * (((uint64_t)1 << 63) / 500)) >> 32); } #endif int main() { sbintime_t funct, macro; int match, result = 0; for (int64_t i = 0; i < 1000; ++i) { funct = mstosbt(i); macro = MSTOSBT(i); match = (funct == macro) ? 1 : 0; printf("%ld: funct = %ld, macro = %ld, %s\n", i, funct, macro, match ? "true" : "false"); if ((result == 0) && (match == 0)) result = 1; } return result; } -->8- % cc -O2 -o test test.c % ./test && echo ok || echo not ok <..> 997: funct = 4282082394, macro = -12884902, false 998: funct = 4286377361, macro = -8589935, false 999: funct = 4290672328, macro = -4294968, false not ok % ./test | grep true | head -5 0: funct = 0, macro = 0, true 1: funct = 4294967, macro = 4294967, true 2: funct = 8589934, macro = 8589934, true 3: funct = 12884901, macro = 12884901, true 4: funct = 17179869, macro = 17179869, true % ./test | grep false | head -5 501: funct = 2151778615, macro = -2143188681, false 502: funct = 2156073582, macro = -2138893714, false 503: funct = 2160368549, macro = -2134598747, false 504: funct = 2164663517, macro = -2130303779, false 505: funct = 2168958484, macro = -2126008812, false % cc -O2 -DFIXED=1 -o test_fixed test.c % ./test_fixed && echo ok || echo not ok <..> 997: funct = 4282082394, macro = 4282082394, true 998: funct = 4286377361, macro = 4286377361, true 999: funct = 4290672328, macro = 4290672328, true ok Created attachment 195650 [details] Alternative patch (since 475857 revision) (In reply to comment #56) > Fixed MSTOSBT macro for files/kqueue_fnm.c. Fixed the same for files/kqueue_fnm/kqueue_fnm.c (In reply to comment #56) > Fixed MSTOSBT macro for files/kqueue_fnm.c. Need to say, that testcase was compiled with using FreeBSD 10.4 amd64 base compiler: % cc --version | head -1 FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512 If test the same testcase with using Clang 6.0.1 compiler from devel/llvm60 port, then there are no errors detected for new (attachment #195649 [details]) and previous (attachment #195598 [details]) MSTOSBT macro: % clang60 --version | head -1 clang version 6.0.1 (tags/RELEASE_601/final) % clang60 -O2 -o test test.c % ./test && echo ok || echo not ok <..> 997: funct = 4282082394, macro = 4282082394, true 998: funct = 4286377361, macro = 4286377361, true 999: funct = 4290672328, macro = 4290672328, true ok % clang60 -O2 -DFIXED=1 -o test_fixed test.c % ./test_fixed && echo ok || echo not ok <..> 997: funct = 4282082394, macro = 4282082394, true 998: funct = 4286377361, macro = 4286377361, true 999: funct = 4290672328, macro = 4290672328, true ok (In reply to comment #56) > Fixed MSTOSBT macro for files/kqueue_fnm.c. The MSTOSBT macro used in kq_fnm_create function in files/kqueue_fnm.c file as: kfnm->rate_lim_time_init = MSTOSBT(kfnm->s.rate_limit_time_init); where KQUEUE_MON_RATE_LIMIT_TIME_INIT defined as: #ifndef KQUEUE_MON_RATE_LIMIT_TIME_INIT # define KQUEUE_MON_RATE_LIMIT_TIME_INIT 1000 #endif and used for "kfms.rate_limit_time_init = KQUEUE_MON_RATE_LIMIT_TIME_INIT;" in g_kqueue_file_monitor_is_supported function in files/gkqueuefilemonitor.c file before kq_fnm_create function invocation. If change "<" to "<=" in "for (int64_t i = 0; i < 1000; ++i) {" for testcase in comment #56, then possible to get result for 1000 value: % cc --version | head -1 FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512 % cc -O2 -o test test.c % ./test && echo ok || echo not ok <..> 998: funct = 4286377361, macro = -8589935, false 999: funct = 4290672328, macro = -4294968, false 1000: funct = 4294967295, macro = -1, false not ok % cc -O2 -DFIXED=1 -o test test.c % ./test && echo ok || echo not ok <..> 998: funct = 4286377361, macro = 4286377361, true 999: funct = 4290672328, macro = 4290672328, true 1000: funct = 4294967295, macro = 4294967295, true ok % clang60 --version | head -1 clang version 6.0.1 (tags/RELEASE_601/final) % clang60 -O2 -o test test.c % ./test && echo ok || echo not ok <..> 998: funct = 4286377361, macro = 4286377361, true 999: funct = 4290672328, macro = 4290672328, true 1000: funct = 4294967295, macro = 4294967295, true ok % clang60 -O2 -DFIXED=1 -o test test.c % ./test && echo ok || echo not ok <..> 998: funct = 4286377361, macro = 4286377361, true 999: funct = 4290672328, macro = 4290672328, true 1000: funct = 4294967295, macro = 4294967295, true ok In other words, if kfnm->s.rate_limit_time_init was assigned to 1000 value, then kfnm->rate_lim_time_init maybe assigned to -1 instead of 4294967295, if compiled without fix for MSTOSBT macro (attachment #195598 [details]) with using Clang 3.4.1 base compiler on FreeBSD 10.4 amd64. Sorry about long explanations. (In reply to comment #59) > In other words, if kfnm->s.rate_limit_time_init was assigned to 1000 value, > then kfnm->rate_lim_time_init maybe assigned to -1 instead of 4294967295, if > compiled without fix for MSTOSBT macro (attachment #195598 [details]) with using > Clang 3.4.1 base compiler on FreeBSD 10.4 amd64. Looks like, my testcase in comment #56 wasn't correct, because rate_limit_time_init has type uint32_t, while I used int64_t as for "mstosbt(int64_t _ms)" function: uint32_t rate_limit_time_init; /* Fire events for dir min interval, mseconds. */ If change "int64_t" to "uint32_t" in "for (int64_t i = 0; i < 1000; ++i) {" (and "%ld: funct" to "%u: funct" for printf function), then results are ok for Clang 3.4.1 and 6.0.1 compilers, including for new (attachment #195649 [details]) and previous (attachment #195598 [details]) MSTOSBT macro. Created attachment 198889 [details]
COLLATION_FIX removed
Created attachment 198900 [details] Proposed patch (since 483807 revision) Updated patch after ports r483807 changes. Created attachment 199221 [details]
up to 2.56.3
Created attachment 199229 [details] Proposed patch (since 484886 revision) Updated patch after ports r484886 changes. Created attachment 202832 [details]
new kqueue
- Add sharing/caching for file descriptor: create only one file handle for one path, all events duplicateted to multiple subscribers.
- no trailing slash for dir names
Sharing/caching reduce x2 files descriptors in Thunar and improve browse speed for non local dirs.
This already done in gio for inotify backend only.
Created attachment 202839 [details]
Proposed patch (since 493725 revision)
Created attachment 202935 [details]
fix: close(0) called mistakely, cause gtk3 apps errors
File monitoring from upstream still buggy: 1. Run thunar 2. Open /tmp in thunar 3. cd /usr/ports/devel/glib20/ 4. make Wait less than 1 minute to crash. Core was generated by `/usr/local/bin/Thunar /home/rim'. Program terminated with signal SIGILL, Illegal instruction. #0 0x000000080103006d in accept_ready (accept_socket=<optimized out>, condition=<optimized out>, user_data=0x805386888) at gsocketlistener.c:776 776 g_object_set_qdata_full (G_OBJECT (task), [Current thread is 1 (LWP 101870)] (gdb) bt #0 0x000000080103006d in accept_ready (accept_socket=<optimized out>, condition=<optimized out>, user_data=0x805386888) at gsocketlistener.c:776 #1 0x000000080113e919 in handler_unref_R (signal_id=<optimized out>, instance=<optimized out>, handler=<optimized out>) at gsignal.c:700 #2 0x000000080113f61d in handler_match_free1_R (node=0x806866d80, instance=<optimized out>) at gsignal.c:524 #3 0x000000080113f61d in signal_handlers_foreach_matched_R (instance=<optimized out>, mask=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, closure=<optimized out>, func=0x801039040 <task_thread_cancelled>, data=<optimized out>, callback=<optimized out>) at gsignal.c:2797 #4 0x000000080113f61d in g_signal_handlers_disconnect_matched (instance=0x8059550d0, mask=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, closure=<optimized out>, func=0x801039040 <task_thread_cancelled>, data=0x8058344e0) at gsignal.c:2937 #5 0x0000000801037c5e in g_task_thread_complete (task=0x8058344e0 [GTask]) at gtask.c:1268 #6 0x0000000801039319 in g_task_thread_pool_thread (thread_data=0x8058344e0, pool_data=<optimized out>) at gtask.c:1333 #7 0x0000000801261f58 in g_thread_pool_thread_proxy (data=0x805327f60) at gthreadpool.c:307 #8 0x0000000801260bb1 in g_thread_proxy (data=0x8054b3190) at gthread.c:784 #9 0x0000000800c8a7ab in () at /lib/libthr.so.3 #10 0x0000000000000000 in () To get proper debug build: -INSTALL_TARGET= install-strip +INSTALL_TARGET= install (In reply to rozhuk.im from comment #68) > Program terminated with signal SIGILL, Illegal instruction. Are you sure the crash is related to the file monitor? Illegal instruction suggests that your GLib may be compiled with options which are not compatible with your machine. You may want to disassemble the code to see which instruction causes the error. (In reply to Ting-Wei Lan from comment #69) These crashes happen only if I use glib without my patch. This happen then multiple files added and deleted. This happen then files deleted, not first and not last in file list. I think this is not because glib compiled with some strange flags, it some memory corruption. (In reply to Ting-Wei Lan from comment #69) There is another case when SIGILL means a bug and not wrong building options: code producing undefined behaviour may be compiled to "ud2a" machine instruction that generates SIGILL if execution reaches it. See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208027 for example we once had. Created attachment 202948 [details] Proposed patch (since 493725 revision) Updated patch, based on attachment #202935 [details], including changes for patching method (comment #50) and MSTOSBT macro (comment #56). This patch was for informative purposes, therefore can be marked as obsolete. (In reply to rozhuk.im from comment #70) Can you type 'disassemble' in GDB and paste the output? Seeing illegal instruction error without knowing which instruction causes the error isn't very useful because the backtrace isn't related to the file monitor. I don't claim that GLib port shipped with FreeBSD doesn't have problems. In fact, I know it has problems and I modified hundreds of lines of GLib code in the kqueue file monitor backend during the 2.58 cycle. This includes fixes for crashes caused by race conditions in order to let GLib pass all tests on FreeBSD and run tests regularly on the upstream CI. It will be nice if you can test the latest version of GLib (currently 2.60) instead of the outdated version included in FreeBSD ports to know whether upstream GLib still has problems. Created attachment 204279 [details]
patch update
Crash fixes and improvements
- Add reaction to mount points changed.
- kq_fnme_init_cb() now check is existing fnmo is alive and try to reinit if needed.
- Add few additional checks for -1 == fnmo->fd (is_removed).
- use mem addr as timer id (for EVFILT_TIMER)
- split kq_fnmo_init() into kq_fnmo_init() + kq_fnmo_reopen_fd()
- fix MSTOSBT macro and comment.
- fix: readdir_next() may returns empty entryes without filename
- fix: kq_fnmo_readdir() may return already deleted files, that returns from readdir_next()
- fix: wrong event "add file" after event "remove file" (die to kq_fnmo_readdir())
- fix: fd leak and may crash on dir remove: use kq_fnmo_reopen_fd() instead of kq_fnmo_init()
- fix: duplicate event about monitoring dir deleted
- cosmetic changes...
Also fix DEBUG option: INSTALL_TARGET=install-strip - causes install without debug symbols. Created attachment 204331 [details]
patch update
- fix: replace getmntinfo() -> getfsstat(): getmntinfo() alloc mem that can not be free, causes crash
- fix: mem leak on exit: free(kfnm->mounts) missed
- fix: some notifications about deleted files on unmount was lost
- kq_fnmo_free() and kq_fnme_free() now use only 1 arg
- do not notify about previosly deleted objects if they detected after mount is back
Change "In Progress" status to "Open" due to maintainer lack of interest. Created attachment 218219 [details]
patch
Rebased to 2.66.0
Moin moin Is there a chance of this being upstreamed? mfg Tobias (In reply to Tobias C. Berner from comment #79) IMHO no. Upstream wants: 1. Series of small patches for existing backend. 2. Test on all BSD systems. 3. Code style changes. I agree to do only 3. :) That's not quite the answer I hoped for :) Just for my edification: what makes 1 and 2 painful to do? mfg Tobias (In reply to Tobias C. Berner from comment #81) 1. Short answer: there is no way to patch original kqueue to thing that I made. Long version. IMHO glib`s kqueue gio fam backend is peace of crap and it is much easy for me drop it and write new. glib/gnome does not care about BSD users at all: a) It was broken and disabled from 2015.08 to 2016.01 or longer. b) After it was "fixed" - it consume to much CPU in thunar and crash it rarely. This happen with most apps that uses glib`s gio. Summary: it was totally unusable on desktop during at least 3+ years, probably in 2019-03 there was some fixes. But I do not want waste my time again with it to test this crap, at least because my implementation has many features to improve performance, like: fd cache/deduplication, rate limits and not use any polling. c) Even here, on FreeBSD bug trucker all patches for gnome staff from me was newer go to our port tree. Even then glib take it to their tree: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240517 gnome@freebsd.org = /dev/null. d) Even simple build documentation update take to many time if it is *BSD specific: https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1114 2. I use only FreeBSD and Open/DragonFly/NET BSD, MacOS is not in my comfort zone. (In reply to rozhuk.im from comment #82) OK, sounds like "fun" :) Well, I'm more than happy to commit a better back-end to the tree. mfg Toibas What's the current state of the impelementation? Are the issues raised by wulf@ fixed? mfg Tobias (In reply to Tobias C. Berner from comment #84) > Are the issues raised by wulf@ fixed? At first glance, there are snippets of code addressing them. Though, I did not dig deep here. There are lot of fixes to upstream file monitoring backend made by Ting-Wei since glib2.56 so we definitely should keep it enabled by default, but IMO as we have very negative experience with 2.56, it would be good to have backup option if some problems would rise again. (In reply to Tobias C. Berner from comment #84) I think all issues are fixed. At least on mine desktops I do not see any crashes in glib and I have almost zero annoying things with fam. BTW, as far I remember, Vladimir uses (or was use) libinotify as fam backend. Is is nice option to have. (In reply to Vladimir Kondratyev from comment #85) I think glib kqueue is still ugly. Last time I try it 2019-03-17, most changes was before this, and it crash. Even more, it still use polling and consume cpu. No caches, no event ratelimit/dalying, no grouping/deduplication and no other pretty things that I had done. But some of these things exist in linux backend. Created attachment 222442 [details]
patch
A commit references this bug: Author: tcberner Date: Sun Feb 14 19:38:26 UTC 2021 New revision: 565264 URL: https://svnweb.freebsd.org/changeset/ports/565264 Log: devel/glib20: add alternative file monitoring backend This altnerative file monitoring backend implemented by the submitter may lead to a big decrease in cpu time usage in for example thunar. Consider it experimental, but worth a try, if your having issues with high cpu. PR: 214338 Submitted by: rozhuk.im@gmail.com Changes: head/devel/glib20/Makefile head/devel/glib20/files/gkqueuefilemonitor.c head/devel/glib20/files/kqueue_fnm.c head/devel/glib20/files/kqueue_fnm.h Moin moin You're latest version has now been added to the ports tree, so that people can more easily give it a try. I would really like to see it integrated into upstream's codebase though, as otherwise, it will likely just rot away over the weeks/months/years. mfg Tobias (In reply to Tobias C. Berner from comment #90) Thanks! |