Bug 256952 - kqueue(2): Improve epoll Linux compatibility (compat/linux/linux_event)
Summary: kqueue(2): Improve epoll Linux compatibility (compat/linux/linux_event)
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: feature, needs-qa, standards
Depends on:
Blocks: 247219
Reported: 2021-07-03 08:34 UTC by Vico
Modified: 2021-10-19 12:42 UTC
CC List: 6 users

See Also:
koobs: maintainer-feedback? (wulf)
koobs: maintainer-feedback? (trasz)
koobs: mfc-stable13?
koobs: mfc-stable12?


Attachments
Introduce 'pevents' in kevent to record the poll events, and handle them in 'linux_event.c' to report correct epoll events to match Linux behavior. (3.68 KB, patch)
2021-07-03 08:34 UTC, Vico
no flags
note_hup.patch (4.26 KB, patch)
2021-08-08 22:23 UTC, Vladimir Kondratyev
no flags
Test cases (2.69 KB, text/plain)
2021-08-09 09:55 UTC, Vico
no flags
note_hup.patch (4.50 KB, patch)
2021-08-09 23:44 UTC, Vladimir Kondratyev
no flags

Description Vico 2021-07-03 08:34:58 UTC
Created attachment 226189 [details]
Introduce 'pevents' in kevent to record the poll events, and handle them in 'linux_event.c' to report correct epoll events to match Linux behavior.

The epoll behavior for a Unix socket on Linux is:
1. If both the sender and the receiver are shut down, Linux reports 'EPOLLHUP'.
2. If only the receiver is shut down, Linux reports 'EPOLLRDHUP | EPOLLRDNORM | EPOLLIN'.
3. On an epoll error, Linux reports the error together with the other pending epoll events; it does not report the error alone as soon as it is detected.

The current socket code handles only 'CANTRCVMORE' (receiver shutdown) in the kevent read filter and only 'CANTSENDMORE' in the kevent write filter.
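
To illustrate why this split matters, here is a minimal user-space model, not kernel code. The struct, the M_* flag values and the function names are all hypothetical; only the idea that each filter checks one direction comes from the description above. With a read filter alone, "receiver shut down" and "both sides shut down" look identical, so EPOLLRDHUP and EPOLLHUP cannot be told apart.

```c
#include <stdbool.h>

/* Hypothetical model flags; the values are illustrative only. */
#define M_CANTRCVMORE   0x1u
#define M_CANTSENDMORE  0x2u

/* Hypothetical stand-in for the two per-direction socket buffer states. */
struct model_socket {
    unsigned rcv_state;
    unsigned snd_state;
};

/* The read filter looks only at the receive side. */
bool
model_read_filter_eof(const struct model_socket *so)
{
    return ((so->rcv_state & M_CANTRCVMORE) != 0);
}

/* The write filter looks only at the send side. */
bool
model_write_filter_eof(const struct model_socket *so)
{
    return ((so->snd_state & M_CANTSENDMORE) != 0);
}
```

In this model a socket with only the receive side shut down and a socket with both sides shut down give the same answer from model_read_filter_eof(), which is the information gap the patch tries to close.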

On Linux, epoll behavior differs considerably between object types such as pipes and sockets. This patch creates a new mechanism that reports epoll events per component. It fixes only the socket case to align with Linux epoll behavior; other components, such as pipes, can be improved later on top of the new mechanism.
Comment 1 Edward Tomasz Napierala freebsd_committer freebsd_triage 2021-08-02 13:58:35 UTC
Can you submit the patch for review on https://reviews.freebsd.org?

Also, can you suggest how to test it?
Comment 2 Vico 2021-08-06 08:41:58 UTC
Hi,
I have submitted the patch to https://reviews.freebsd.org/D31037

I have a linux test case, and how could I upload the test case?
Comment 3 Ed Maste freebsd_committer freebsd_triage 2021-08-06 16:30:21 UTC
(In reply to Vico from comment #2)
> I have a linux test case, and how could I upload the test case?

If the file is small, you can add it as an attachment to this PR or to the Phabricator review (cloud "Upload File" icon).
Comment 4 Vladimir Kondratyev freebsd_committer freebsd_triage 2021-08-08 22:23:09 UTC
Created attachment 227028 [details]
note_hup.patch

This is a PoC patch for a less intrusive way of adding new events through filter flags. It adds a NOTE_HUP filter flag, which behaves like epoll's EPOLLHUP, and uses it in the linuxolator.
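
A rough sketch of the idea, as a user-space model and not the attached patch: a hang-up is carried in the event's filter flags and translated to EPOLLHUP when the event is converted for the Linux caller. The struct, the M_* names and their values (including the placeholder for NOTE_HUP) are assumptions made for illustration; the X_EPOLL* values are the Linux epoll bits. Mapping a read event with EOF to EPOLLRDHUP is also an assumption of this model.

```c
#include <stdint.h>

/* Linux epoll bits, prefixed to avoid clashing with system headers. */
#define X_EPOLLIN     0x0001u
#define X_EPOLLOUT    0x0004u
#define X_EPOLLHUP    0x0010u
#define X_EPOLLRDHUP  0x2000u

/* Hypothetical model constants; not the kernel's definitions. */
#define M_FILT_READ   1
#define M_FILT_WRITE  2
#define M_EV_EOF      0x1u   /* placeholder for an EOF flag */
#define M_NOTE_HUP    0x1u   /* placeholder; the real value is set by the patch */

/* Hypothetical, reduced stand-in for a kevent. */
struct model_kevent {
    int      filter;
    uint32_t flags;
    uint32_t fflags;
};

/* Translate a model kevent into a Linux epoll event mask. */
uint32_t
model_kevent_to_epoll(const struct model_kevent *kev)
{
    uint32_t ev = 0;

    switch (kev->filter) {
    case M_FILT_READ:
        ev |= X_EPOLLIN;
        if ((kev->flags & M_EV_EOF) != 0)
            ev |= X_EPOLLRDHUP;
        break;
    case M_FILT_WRITE:
        ev |= X_EPOLLOUT;
        break;
    }
    /* The filter flag carries the hang-up independently of the filter. */
    if ((kev->fflags & M_NOTE_HUP) != 0)
        ev |= X_EPOLLHUP;
    return (ev);
}
```

The appeal of this approach is that the kevent structure itself stays unchanged; the extra information travels in a field that already exists.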
Comment 5 Vico 2021-08-09 08:54:11 UTC
Comment on attachment 227028 [details]
note_hup.patch

Hi Vladimir,
I have several questions:

1. 'LINUX_EPOLL_EVRD' is defined only as "LINUX_EPOLLIN|LINUX_EPOLLRDNORM". If the user calls epoll with only "EPOLLRDHUP", the filter takes no action.

2. The Linux behaviors are:
    a. If both the sender and the receiver are shut down, Linux reports 'EPOLLHUP | EPOLLRDHUP | EPOLLRDNORM | EPOLLIN'.
    b. If only the receiver is shut down, Linux reports 'EPOLLRDHUP | EPOLLRDNORM | EPOLLIN'.
    c. On an epoll error, Linux reports the error together with the other pending epoll events; it does not report the error alone as soon as it is detected.
    d. LINUX_EPOLLERR and LINUX_EPOLLHUP are always in the epoll mask.
    With this patch, given Linux behaviors a) and b), EPOLLRDNORM is missing for sockets.

3. This patch applies no masks. For example, if the application requests only EPOLLHUP but EPOLLIN is detected, the application is woken up and receives EPOLLIN.
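
Points 2 and 3 can be written down as a small user-space model. This is only a sketch of the behavior described above, not code from either patch: the struct and function names are hypothetical, and the X_EPOLL* constants are the Linux epoll bit values under a private prefix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Linux epoll bits, prefixed to avoid clashing with system headers. */
#define X_EPOLLIN      0x0001u
#define X_EPOLLERR     0x0008u
#define X_EPOLLHUP     0x0010u
#define X_EPOLLRDNORM  0x0040u
#define X_EPOLLRDHUP   0x2000u

/* Hypothetical snapshot of socket state; not a kernel structure. */
struct sock_state {
    bool rcv_shutdown;
    bool snd_shutdown;
    bool error;
};

/* Raw events for a socket, following items 2a to 2c. */
uint32_t
raw_epoll_events(const struct sock_state *s)
{
    uint32_t ev = 0;

    if (s->rcv_shutdown)
        ev |= X_EPOLLRDHUP | X_EPOLLRDNORM | X_EPOLLIN;
    if (s->rcv_shutdown && s->snd_shutdown)
        ev |= X_EPOLLHUP;
    if (s->error)
        ev |= X_EPOLLERR;   /* added to the other events, not a replacement */
    return (ev);
}

/*
 * Items 2d and 3: EPOLLERR and EPOLLHUP are always in the mask; every
 * other event is delivered only if the caller asked for it.
 */
uint32_t
reported_epoll_events(uint32_t requested, uint32_t raw)
{
    return (raw & (requested | X_EPOLLERR | X_EPOLLHUP));
}
```

In this model, a caller that requests only EPOLLRDHUP on a fully shut down socket gets EPOLLHUP | EPOLLRDHUP, and a caller that requests only EPOLLHUP gets nothing when just EPOLLIN is pending, so it should not be woken.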

It would be better to design a generic framework that handles all cases to match Linux behavior, and to let each module handle its own epoll events: the socket code knows its own epoll behavior, the pipe code knows its own, and epoll_to_kevent does the generic handling.
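
One way to picture such a framework is a per-object callback that returns the raw epoll events, with the generic layer applying the mask. The following is a hypothetical user-space sketch; none of these names exist in the tree or in D31037, and the pipe rule (writer closed means EPOLLHUP) is included only as an example of a second module.

```c
#include <stddef.h>
#include <stdint.h>

/* Linux epoll bits, prefixed to avoid clashing with system headers. */
#define X_EPOLLIN      0x0001u
#define X_EPOLLERR     0x0008u
#define X_EPOLLHUP     0x0010u
#define X_EPOLLRDNORM  0x0040u
#define X_EPOLLRDHUP   0x2000u

/* Hypothetical per-object hook: each module computes its own raw events. */
struct model_fileops {
    const char *name;
    uint32_t  (*epoll_events)(const void *obj);
};

struct model_sock { int rcv_shutdown; int snd_shutdown; };
struct model_pipe { int writer_closed; };

static uint32_t
sock_epoll_events(const void *obj)
{
    const struct model_sock *so = obj;
    uint32_t ev = 0;

    if (so->rcv_shutdown)
        ev |= X_EPOLLRDHUP | X_EPOLLRDNORM | X_EPOLLIN;
    if (so->rcv_shutdown && so->snd_shutdown)
        ev |= X_EPOLLHUP;
    return (ev);
}

static uint32_t
pipe_epoll_events(const void *obj)
{
    const struct model_pipe *p = obj;

    return (p->writer_closed ? X_EPOLLHUP : 0);
}

const struct model_fileops model_sock_ops = { "socket", sock_epoll_events };
const struct model_fileops model_pipe_ops = { "pipe", pipe_epoll_events };

/* Generic layer: ask the module, then apply the caller's mask. */
uint32_t
model_collect(const struct model_fileops *ops, const void *obj,
    uint32_t requested)
{
    uint32_t raw = (ops->epoll_events != NULL) ? ops->epoll_events(obj) : 0;

    return (raw & (requested | X_EPOLLERR | X_EPOLLHUP));
}
```

The generic layer never needs to know which object type it is looking at; adding pipe-specific behavior later means adding one callback.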

Please refer to https://reviews.freebsd.org/D31037, where someone has raised concerns about introducing pevent and pmask in the kevent structure. Could you please help clarify the issue?
Comment 6 Vico 2021-08-09 09:55:27 UTC
Created attachment 227040 [details]
Test cases

This test case only handles EPOLLRDHUP:
1. Run this case on Linux: it prints two lines showing that events are received ('0x2010': EPOLLHUP | EPOLLRDHUP).
2. Run this case on BSD: it hangs, as no EPOLLHUP or EPOLLRDHUP is reported.
3. Run this case on BSD with this patch: it prints two events:
   a. First: prints 'eventmask 0'. EPOLLIN is reported as READ; this patch masks it, but it still wakes up epoll_wait. (We have a follow-up patch to fix this issue.)
   b. Second: prints 'eventmask 0x2010'. EPOLLHUP | EPOLLRDHUP are received for the close.
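
For readers without the attachment, a check along these lines can be sketched as follows. This is not the attached test case; it is a minimal, hypothetical helper built on the Linux epoll API (to be compiled as a Linux binary) that registers one end of a Unix stream socketpair for EPOLLRDHUP only, closes the peer, and returns whatever mask epoll_wait() reports. A timeout is used so that a system that never reports the hang-up returns 0 instead of blocking.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the attached test.
 * Returns the reported event mask, 0 on timeout, or -1 on failure.
 */
long
peer_close_eventmask(int timeout_ms)
{
    struct epoll_event ev = { 0 }, out = { 0 };
    int sv[2], ep = -1, n;
    long mask = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return (-1);
    ep = epoll_create1(0);
    if (ep == -1)
        goto done;

    /* Request EPOLLRDHUP only; EPOLLERR and EPOLLHUP are implicit. */
    ev.events = EPOLLRDHUP;
    ev.data.fd = sv[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) == -1)
        goto done;

    /* Close the peer so the registered end sees a hang-up. */
    close(sv[1]);
    sv[1] = -1;

    n = epoll_wait(ep, &out, 1, timeout_ms);
    if (n >= 0)
        mask = (n == 0) ? 0 : (long)out.events;
done:
    if (ep != -1)
        close(ep);
    close(sv[0]);
    if (sv[1] != -1)
        close(sv[1]);
    return (mask);
}
```

A caller could print the result with printf("eventmask 0x%lx\n", ...) and compare it against the masks listed above.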
Comment 7 Vladimir Kondratyev freebsd_committer freebsd_triage 2021-08-09 23:44:02 UTC
Created attachment 227065 [details]
note_hup.patch