Bug 197696

Summary: Implement file private locks as implemented in Linux 3.15 and submitted to the AWG for standardisation
Product: Base System
Reporter: Niall Douglas <s_bugzilla>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: New ---
Severity: Affects Many People
CC: emaste, german.mb, mtk.manpages, trasz
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   

Description Niall Douglas 2015-02-16 01:39:38 UTC
Linux kernel 3.15 adds support for file private advisory locks, which are essentially POSIX byte range advisory locks without the unfortunate per-process first-close unlocking semantics that POSIX requires. Much more info is available at http://lwn.net/Articles/586904/, but in essence the API is simply new flags for fcntl():

F_OFD_GETLK
F_OFD_SETLK
F_OFD_SETLKW

These still take a struct flock as before. The Linux man page describing these is available at http://man7.org/linux/man-pages/man2/fcntl.2.html.
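
For illustration, here is a minimal sketch of how these commands are used on Linux today, assuming Linux 3.15 or later and glibc 2.20 or later. The helper names ofd_try_wrlock() and ofd_unlock() are made up for this example, and the fallback command number is the Linux generic ABI value, so it applies to Linux only.

```c
#define _GNU_SOURCE          /* glibc only exposes F_OFD_* with this defined */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Assumption: Linux generic ABI value, used only if the headers above
 * were already processed before _GNU_SOURCE could take effect. */
#if defined(__linux__) && !defined(F_OFD_SETLK)
#define F_OFD_SETLK 37
#endif

/* Fill in a struct flock for the byte range [start, start + len).
 * len == 0 means "from start to the end of the file". */
static void fill_flock(struct flock *fl, short type, off_t start, off_t len)
{
    memset(fl, 0, sizeof *fl);   /* also leaves l_pid at 0, as Linux requires */
    fl->l_type = type;
    fl->l_whence = SEEK_SET;
    fl->l_start = start;
    fl->l_len = len;
}

/* Hypothetical helper: try to take a write lock without blocking.
 * Returns 0 on success, or -1 with errno set (EAGAIN or EACCES when a
 * conflicting lock is held through another open file description). */
int ofd_try_wrlock(int fd, off_t start, off_t len)
{
    struct flock fl;
    fill_flock(&fl, F_WRLCK, start, len);
    return fcntl(fd, F_OFD_SETLK, &fl);
}

/* Hypothetical helper: release whatever this open file description
 * holds on the given range. */
int ofd_unlock(int fd, off_t start, off_t len)
{
    struct flock fl;
    fill_flock(&fl, F_UNLCK, start, len);
    return fcntl(fd, F_OFD_SETLK, &fl);
}
```

Because the lock belongs to the open file description rather than the process, two separate open() calls on the same file conflict with each other even inside one process, and closing one descriptor leaves locks taken through the other one alone.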

BSD flock() becomes a wrapper for F_OFD_SETLK with the struct flock length (l_len) set to 0 (i.e. the whole file).
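
To make the mapping concrete, here is a rough userspace sketch, runnable on Linux, of a flock()-style call layered over the OFD commands. It is not how a kernel would implement it. The name ofd_flock() is invented, and using F_OFD_SETLKW for the blocking case is my assumption, since only F_OFD_SETLK is mentioned above. One visible difference from real flock(): fcntl() locks need the descriptor open for reading (shared) or writing (exclusive).

```c
#define _GNU_SOURCE          /* glibc only exposes F_OFD_* with this defined */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>        /* LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB */
#include <unistd.h>

/* Assumption: Linux generic ABI values, used only if the headers above
 * were already processed before _GNU_SOURCE could take effect. */
#if defined(__linux__) && !defined(F_OFD_SETLK)
#define F_OFD_SETLK  37
#define F_OFD_SETLKW 38
#endif

/* Hypothetical flock() lookalike built on OFD locks. */
int ofd_flock(int fd, int operation)
{
    struct flock fl;
    int cmd, rc;

    memset(&fl, 0, sizeof fl);
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 = through end of file, i.e. the whole file */

    switch (operation & ~LOCK_NB) {
    case LOCK_SH: fl.l_type = F_RDLCK; break;
    case LOCK_EX: fl.l_type = F_WRLCK; break;
    case LOCK_UN: fl.l_type = F_UNLCK; break;
    default:
        errno = EINVAL;
        return -1;
    }

    /* LOCK_NB selects the non-blocking command, otherwise wait. */
    cmd = (operation & LOCK_NB) ? F_OFD_SETLK : F_OFD_SETLKW;
    rc = fcntl(fd, cmd, &fl);

    /* flock() reports contention as EWOULDBLOCK. */
    if (rc == -1 && (errno == EACCES || errno == EAGAIN))
        errno = EWOULDBLOCK;
    return rc;
}
```

Called as ofd_flock(fd, LOCK_EX | LOCK_NB), this behaves much like flock(fd, LOCK_EX | LOCK_NB) with respect to other descriptors that use the same wrapper.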

I'd also suggest, seeing as this is BSD, that you add asynchronous lock notification support to kqueues.

The naive approach ought to be trivial, if not hugely efficient: add a NOTE_LOCK and a NOTE_UNLOCK to EVFILT_VNODE, where kevent_data.ident is set to the fd used to register the kevent and kevent_data.data is a pointer to a struct flock (not sure how to bundle this through, maybe embedded via a null subsequent kevent? I suppose the struct flock isn't absolutely needed). When the process receives a NOTE_UNLOCK kevent, it can try fcntl(F_OFD_SETLK); if that succeeds, it makes its changes to the file, else it goes back to waiting for the next NOTE_UNLOCK kevent to arrive.
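
The retry loop on the consumer side might look roughly like the sketch below. NOTE_UNLOCK does not exist anywhere yet, so the wait step is abstracted behind a callback: wait_unlock_fn, lock_when_notified(), struct demo_ctx and demo_wait() are all hypothetical names. In a real implementation the callback would block in kevent() until the proposed NOTE_UNLOCK arrives. The sketch tries the lock once before the first wait, and it uses the Linux F_OFD_SETLK so that it can be run today.

```c
#define _GNU_SOURCE          /* glibc only exposes F_OFD_* with this defined */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Assumption: Linux generic ABI value, used only if the headers above
 * were already processed before _GNU_SOURCE could take effect. */
#if defined(__linux__) && !defined(F_OFD_SETLK)
#define F_OFD_SETLK 37
#endif

/* Hypothetical: blocks until an unlock notification arrives.
 * Returns 0 to retry the lock, non-zero to give up. */
typedef int (*wait_unlock_fn)(void *ctx);

/* Try the lock; on contention wait for a notification and try again.
 * Returns 0 once the lock is held, -1 on any other error or give-up. */
int lock_when_notified(int fd, const struct flock *want,
                       wait_unlock_fn wait_for_unlock, void *ctx)
{
    for (;;) {
        struct flock fl = *want;         /* fresh copy for every attempt */
        if (fcntl(fd, F_OFD_SETLK, &fl) == 0)
            return 0;                    /* granted: caller may modify the file */
        if (errno != EAGAIN && errno != EACCES)
            return -1;                   /* real error, not contention */
        if (wait_for_unlock(ctx) != 0)
            return -1;                   /* waiter asked us to stop */
    }
}

/* Demo stand-in for the notification: it releases the competing lock held
 * through another descriptor, which is the event NOTE_UNLOCK would report. */
struct demo_ctx {
    int other_fd;   /* descriptor currently holding the conflicting lock */
    int wakeups;    /* number of simulated notifications delivered */
};

static int demo_wait(void *p)
{
    struct demo_ctx *c = p;
    struct flock un;

    memset(&un, 0, sizeof un);
    un.l_type = F_UNLCK;
    un.l_whence = SEEK_SET;              /* l_start 0, l_len 0: whole file */
    c->wakeups++;
    return fcntl(c->other_fd, F_OFD_SETLK, &un);
}
```

A wakeup only says that some lock was released, so the attempt after it can still fail and the loop simply waits again. That is why this approach is simple but not especially efficient when many waiters contend for the same range.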

The more sophisticated approach is to add an extra F_OFD_SETLKQ, modelled on EVFILT_AIO, using a new EVFILT_LOCK filter. One issues F_OFD_SETLKQ with a target kqueue to notify when the lock is granted. This would be a much more efficient solution, but probably not trivial to implement.

Comment 1 mtk.manpages 2015-03-09 08:19:04 UTC
For info, the Austin bug on adding this feature is