Bug 203366 - kevent: EV_CLEAR on fifo does not work correctly
Summary: kevent: EV_CLEAR on fifo does not work correctly
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 10.2-STABLE
Hardware: Any Any
: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords: patch
: 198168 (view as bug list)
Depends on:
Blocks:
 
Reported: 2015-09-26 09:04 UTC by Marko Turk
Modified: 2020-05-11 15:21 UTC (History)
7 users (show)

See Also:


Attachments
patch (591 bytes, patch)
2015-10-21 18:28 UTC, Marko Turk
no flags Details | Diff
patch (601 bytes, patch)
2016-02-29 19:20 UTC, Mark Johnston
no flags Details | Diff
wake up pipe writers in edge triggered way (912 bytes, patch)
2017-12-19 16:43 UTC, Jan Kokemüller
no flags Details | Diff
patch for FIFO related poll/kqueue issues (2.84 KB, patch)
2018-02-17 11:52 UTC, Jan Kokemüller
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Marko Turk 2015-09-26 09:04:19 UTC
Hi,

according to kqueue(2) man page for EVFILT_READ on a fifo:
"When the last writer disconnects, the filter will set EV_EOF in flags. This may be cleared by passing in EV_CLEAR, at which point the filter will resume waiting for data to become available before returning."

But, EV_CLEAR on fifo does not work (EV_EOF is returned).

To reproduce (tail uses kevent and EV_CLEAR internally):
mkfifo foo
tail -f foo
(in another terminal:) echo xxx > foo
After this, top process uses all CPU.

I tried to fix the bug. After applying this patch the problem is not visible, but I'm not sure if this is the correct solution:

Index: sys/kern/sys_pipe.c
===================================================================
--- sys/kern/sys_pipe.c (revision 288220)
+++ sys/kern/sys_pipe.c (working copy)
@@ -1791,9 +1791,10 @@
        if ((kn->kn_data == 0) && (rpipe->pipe_state & PIPE_DIRECTW))
                kn->kn_data = rpipe->pipe_map.cnt;
 
-       if ((rpipe->pipe_state & PIPE_EOF) ||
+       if (!(kn->kn_flags & EV_CLEAR) &&
+           ((rpipe->pipe_state & PIPE_EOF) ||
            wpipe->pipe_present != PIPE_ACTIVE ||
-           (wpipe->pipe_state & PIPE_EOF)) {
+           (wpipe->pipe_state & PIPE_EOF))) {
                kn->kn_flags |= EV_EOF;
                return (1);
        }
Comment 1 Marko Turk 2015-10-21 18:28:40 UTC
Created attachment 162301 [details]
patch
Comment 2 Marius Halden 2015-12-03 10:47:01 UTC
Hi,

I am able to reproduce this under 10.2-RELEASE-p7 and 11-CURRENT r291671.

This might be related to bug #198168.
Comment 3 Marko Turk 2016-02-25 19:24:41 UTC
(In reply to Marius Halden from comment #2)

Yes, it is probably the same problem as in #198168.

Did you maybe try the patch?
Comment 4 Mark Johnston freebsd_committer 2016-02-29 19:20:24 UTC
Created attachment 167574 [details]
patch

I think the patch isn't completely correct based on my interpretation of the kevent man page: if EV_EOF isn't already set, we want to activate the knote, even if EV_CLEAR is specified. That is, we should get an event if the remote side closes the FIFO. Once EV_EOF has been returned once, EV_CLEAR should clear it.

To see the difference with the first patch, try trussing a tail -f <fifo> and run "(echo a; sleep 1) > <fifo>".

That suggests something like the attached diff.

The kevent man page also documents similar behaviour for EVFILE_WRITE, so filt_pipewrite() should be fixed as well for completeness.
Comment 5 Mark Johnston freebsd_committer 2016-02-29 19:28:28 UTC
*** Bug 198168 has been marked as a duplicate of this bug. ***
Comment 6 Marko Turk 2016-03-07 10:35:58 UTC
Hi,

yes, the new patch is better. I tested it and it fixes my problem.

BR,
Marko
Comment 7 Jan Kokemüller 2017-12-19 16:43:46 UTC
Created attachment 188967 [details]
wake up pipe writers in edge triggered way

What do you think about the attached patch?

I believe the EV_EOF condition is cleared correctly by EV_CLEAR -- but each time you try to read(2) at EOF the pipeselwakeup() down in pipe_read() re-triggers the EOF condition. It should be enough to just wake up the writers when there was actually some data read.
Comment 8 Jan Kokemüller 2017-12-20 10:17:05 UTC
NetBSD does something similar, but in addition they trigger the wakeup on some other condition (flag PIPE_SIGNALR):

https://github.com/NetBSD/src/blob/7c38ee86e0812f49fe3889bfc1c59757688d8d16/sys/kern/sys_pipe.c#L622

This flag is set for example if this is the first read(2) after a write(2). I'm not quite sure yet why this is needed, though.
Comment 9 Bryan Drewery freebsd_committer 2018-02-15 19:15:22 UTC
r232055 dropped the fifo filter:
-static int
-filt_fiforead(struct knote *kn, long hint)
-{
-       struct socket *so = (struct socket *)kn->kn_hook;
-
-       SOCKBUF_LOCK_ASSERT(&so->so_rcv);
-       kn->kn_data = so->so_rcv.sb_cc;
-       if (so->so_rcv.sb_state & SBS_CANTRCVMORE) {
-               kn->kn_flags |= EV_EOF;
-               return (1);
-       } else {
-               kn->kn_flags &= ~EV_EOF;
-               return (kn->kn_data > 0);
-       }
-}

in fifo_open (for writer) this change was made:
                if (fip->fi_writers == 1) {
-                       SOCKBUF_LOCK(&fip->fi_readsock->so_rcv);
-                       fip->fi_readsock->so_rcv.sb_state &= ~SBS_CANTRCVMORE;
-                       SOCKBUF_UNLOCK(&fip->fi_readsock->so_rcv);
-                       if (fip->fi_readers > 0) {
+                       fpipe->pipe_state &= ~PIPE_EOF;
+                       if (fip->fi_readers > 0)
                                wakeup(&fip->fi_readers);
-                               sorwakeup(fip->fi_readsock);
-                       }

The filter used is:
static int                                                           
filt_piperead(struct knote *kn, long hint)                           
{                                                                    
        struct pipe *rpipe = kn->kn_hook;                            
        struct pipe *wpipe = rpipe->pipe_peer;                       
        int ret;                                                     
                                                                     
        PIPE_LOCK_ASSERT(rpipe, MA_OWNED);                           
        kn->kn_data = rpipe->pipe_buffer.cnt;                        
        if ((kn->kn_data == 0) && (rpipe->pipe_state & PIPE_DIRECTW))
                kn->kn_data = rpipe->pipe_map.cnt;                   
                                                                     
        if ((rpipe->pipe_state & PIPE_EOF) ||                        
            wpipe->pipe_present != PIPE_ACTIVE ||                    
            (wpipe->pipe_state & PIPE_EOF)) {                        
                kn->kn_flags |= EV_EOF;                              
                return (1);                                          
        }                                                            
        ret = kn->kn_data > 0;                                       
        return ret;                                                  
}                                                                    

So it seems we just need to clear EV_EOF again if PIPE_EOF is not set.
Comment 10 Jan Kokemüller 2018-02-16 06:14:14 UTC
So the EV_EOF flag is 'sticky' in kn_flags and does not get cleared anywhere, right? Then I was wrong in comment #7.

There is also bug #224615 for an interesting edgecase at FIFO EOF where kevent and poll differ.

I also wonder why the man page speaks of using EV_CLEAR to clear the EV_EOF condition. What about level triggered kevent() without EV_CLEAR?
See also this commit from DragonFly[1]. They had the same problem. They still use the filt_fiforead filter in 'sys/vfs/fifofs/fifo_vnops.c', though.

I'll try to write a test program that reproduces all those EOF related issues.


[1]: http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/acb71b22f1443bd8a34f6e457cf57c4527d7ab52
Comment 11 Jan Kokemüller 2018-02-17 11:52:06 UTC
Created attachment 190730 [details]
patch for FIFO related poll/kqueue issues

I've added some tests here:
https://github.com/jiixyj/fifo-kqueue-tests

I've attached a patch that fixes all those issues for me.

Here is a list of the tests that currently fail and their bug IDs.
Should I open new bug reports where appropriate and split up the patch?


bug #203366 - "sticky EV_EOF"
----
10.1
18.1

no bug ID yet (but see comment #7/comment #8 in bug #203366) - "spurious wakeups at EOF"
----
14.1

no bug ID yet - "writer gets POLLIN"
----
16.1

no bug ID yet - "no wakeups when new readers/writers connect"
----
 8.1
17.1

bug #224615 - "EV_EOF for new readers/writers"
----
 6.2
11.2
12.2
13.2
14.2
Comment 12 commit-hook freebsd_committer 2020-04-27 15:59:52 UTC
A commit references this bug:

Author: markj
Date: Mon Apr 27 15:59:20 UTC 2020
New revision: 360380
URL: https://svnweb.freebsd.org/changeset/base/360380

Log:
  Fix handling of EV_EOF for named pipes.

  Contrary to the kevent man page, EV_EOF on a fifo is not cleared by
  EV_CLEAR.  Modify the read and write filters to clear EV_EOF when the
  fifo's PIPE_EOF flag is clear, and update the man page to document the
  new behaviour.

  Modify the write filter to return the amount of buffer space available
  even if no readers are present.  This matches the behaviour for sockets.

  When reading from a pipe, only call pipeselwakeup() if some data was
  actually read.  This prevents the continuous re-triggering of a
  EVFILT_READ event on EOF when in edge-triggered mode.

  PR:		203366, 224615
  Submitted by:	Jan Kokem?ller <jan.kokemueller@gmail.com>
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D24528

Changes:
  head/lib/libc/sys/kqueue.2
  head/sys/kern/sys_pipe.c
Comment 13 commit-hook freebsd_committer 2020-05-11 15:20:58 UTC
A commit references this bug:

Author: markj
Date: Mon May 11 15:20:41 UTC 2020
New revision: 360897
URL: https://svnweb.freebsd.org/changeset/base/360897

Log:
  MFC r360380:
  Fix handling of EV_EOF for named pipes.

  PR:	203366, 224615, 246350

Changes:
_U  stable/12/
  stable/12/lib/libc/sys/kqueue.2
  stable/12/sys/kern/sys_pipe.c