Bug 170203 - [fifo] piped dd's don't behave sanely when dealing with a fifo
Summary: [fifo] piped dd's don't behave sanely when dealing with a fifo
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: Unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-27 00:00 UTC by Enji Cooper
Modified: 2017-12-31 22:29 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Enji Cooper freebsd_committer freebsd_triage 2012-07-27 00:00:21 UTC
Creating a fifo and then dd'ing across the fifo using /dev/zero doesn't seem to yield the behavior one would expect to have; dd should either exit thanks to SIGPIPE being sent or the count being completed.

Furthermore, the count is bogus:

Terminal 1:

$ dd if=fifo bs=512k count=4
0+4 records in
0+4 records out
32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
$ dd if=fifo bs=512k count=4
0+4 records in
0+4 records out
32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
$ dd if=fifo bs=512M count=4
0+4 records in
0+4 records out
32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)

Terminal 2:

$ dd if=/dev/zero bs=512k count=4 of=fifo
^T
load: 0.40  cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k
1+0 records in
0+0 records out
0 bytes transferred in 2.639078 secs (0 bytes/sec)
^T
load: 0.37  cmd: dd 1779 [sbwait] 8.19r 0.00u 0.00s 0% 1804k
1+0 records in
0+1 records out
40960 bytes transferred in 8.191172 secs (5001 bytes/sec)

How-To-Repeat: mkfifo fifo

Terminal 1:

dd if=fifo bs=512k count=4

Terminal 2:

dd if=/dev/zero bs=512k count=4 of=fifo
Comment 1 Bruce Evans freebsd_committer freebsd_triage 2012-07-27 03:07:07 UTC
On Thu, 26 Jul 2012, Garrett Cooper wrote:

>> Description:
> Creating a fifo and then dd'ing across the fifo using /dev/zero doesn't seem to yield the behavior one would expect to have; dd should either exit thanks to SIGPIPE being sent or the count being completed.
>
> Furthermore, the count is bogus:
>
> Terminal 1:
>
> $ dd if=fifo bs=512k count=4
> 0+4 records in
> 0+4 records out
> 32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
> $ dd if=fifo bs=512k count=4
> 0+4 records in
> 0+4 records out
> 32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
> ...

I think it's working almost as expected.  Large blocks give non-atomic
I/O, so the reader sees small blocks, then EOF when it gets ahead of
the writer.  This always happens without SMP.

Not is a bug (debugged below).  There is no SIGPIPE at the start of
write() because there is a reader then, and no SIGPIPE for the next
write() because there is no next write() -- the current one doesn't
notice when the reader goes away.

This is what happens under FreeBSD-~5.2 with the old fifo implementation,
at least.  It also shows a bug in truss(1) -- the current write() is not
shown, because it hasn't returned.  kdump shows that the write() has
started but not returned.

> $ dd if=fifo bs=512M count=4
> 0+4 records in
> 0+4 records out
> 32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)
>
> Terminal 2:
>
> $ dd if=/dev/zero bs=512k count=4 of=fifo
> ^T
> load: 0.40  cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k

FreeBSD-~5.2 shows [runnable] for the wait channel.  This is
strange.  dd should be blocked waiting for a reader, and only
sbwait makes sense for that.  FreeBSD-9 apparently doesn't
have the new named pipe implementation either.  -current shows
[pipdwt].  This makes it clearer that is waiting in write()
and not in open().  dd probably does the wrong thing for
fifos, by always trying to open files in O_RDWR mode first.
This breaks the normal synchronization of readers and writers.
In fact, this explains why there is no SIGPIPE -- there is
always a reader since dd can always talk to itself.  First
the open succeeds without blocking as expected.

After changing the O_RDWR to O_WRONLY in FreeBSD-~5.2, dd almost
works as expected.  The reader reads 4 blocks of size 8K and
then exits.  The writer first blocks in open.  Then it is
killed by SIGPIPE.  Its SIGPIPE handling is broken (nonexistent),
and the signal kills it without it printing a status message:

%   1266 dd       RET   read 524288/0x80000
%   1266 dd       CALL  write(0x4,0x8063000,0x80000)
%   1266 dd       RET   write -1 errno 32 Broken pipe
%   1266 dd       PSIG  SIGPIPE SIG_DFL

The read is from /dev/zero.  The write is of 512K to the fifo.
This delivers 4*8K then is killed.  If dd caught the signal
like it should, then we would expect to see either a short
write().  The signal handling should clear SA_RESTART, else
the write() would be restarted and would deliver endless
SIGPIPEs, now for failing writes.  Reporting of short writes
is quite broken and this is an interesting test for it.

-current delivers 4*64K instead of 4*8K.  This is because
the i/o unit is BIG_PIPE_SIZE = 64K for nameless pipes and
now for nameless pipes.  Apparently the unit is 8K for
sockets.  I think the unit of atomicity is only 512 bytes
for both.  Certainly, PIPE_BUF is still 512 in limits.h.
I think limits.h is broken since the unit isn't actually
512 bytes for _all_ file types.  For sockets, you can control
the watermarks and I think this changes the unit of atomicity.
I wonder if the socket ioctls for this the old named pipe
implemention.

The pipe wait channel names are less than perfect.  "pipdw"
means "pipe direct write".  "wt" looks like an abreviation
for "write", but there are 3 waits in pipe_direct_write()
and they are distinguished by the suffixes "w", "c" and "t".
It isn't clear what these mean.

>> How-To-Repeat:
> mkfifo fifo
>
> Terminal 1:
>
> dd if=fifo bs=512k count=4
>
> Terminal 2:
>
> dd if=/dev/zero bs=512k count=4 of=fifo

Remember to kill the writing dd if you stop it with ^Z.  Otherwise, since
the unhacked version is talking to itself, the fifo acts strangely for
other tests.

conv=block and conv=noerror (with cbs=512k) change the behaviour only
slightly (slightly worse).  What works easily is omitting the count.
dd then reads until EOF, in 256 records of size exactly 8K each under
FreeBSD-~5.2.  Not giving the count is normal practice, since you
rarely know the block size for pipes and many other file types.  It
there is another bug here, then it is conv=foo not working.  But
reblocking is confusing, and I probably did it wrong.

ANother thing that doesn't work well here is trying to control the
writer with SIGPIPE from the reader.  Even if you can get the reblocking
right and read precisily 2MB, and fix SIGPIPE, then the SIGPIPE may be
delivered after the writer has dirtied the fifo with a little more than
2MB.  The unread data then remains to bite the next reader.

Bruce
Comment 2 listlog2011 2012-07-27 09:52:22 UTC
On 2012/7/27 10:07, Bruce Evans wrote:
> On Thu, 26 Jul 2012, Garrett Cooper wrote:
>
>>> Description:
>> Creating a fifo and then dd'ing across the fifo using /dev/zero 
>> doesn't seem to yield the behavior one would expect to have; dd 
>> should either exit thanks to SIGPIPE being sent or the count being 
>> completed.
>>
>> Furthermore, the count is bogus:
>>
>> Terminal 1:
>>
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
>> ...
>
> I think it's working almost as expected.  Large blocks give non-atomic
> I/O, so the reader sees small blocks, then EOF when it gets ahead of
> the writer.  This always happens without SMP.
>
> Not is a bug (debugged below).  There is no SIGPIPE at the start of
> write() because there is a reader then, and no SIGPIPE for the next
> write() because there is no next write() -- the current one doesn't
> notice when the reader goes away.
>
After fixed dd to not open fifo output file in O_RDWR mode, I still 
found the
writer is blocked there even the reader is already exited.
I think this is definitely a bug. if reader is exited, the writer should 
be aborted too,
but I found it still be blocked in state "pipedwt", obviously, the code in
/sys/fs/fifo_vnops.c wants to wake up the writer when the reader is 
closing the fifo,
but it failed, because the bit flag PIPE_WANTW is forgotten to be set by 
writer,
so it skips executing wakeup(), and then the writer has no chance to 
find EOF bit flag
is set.

I have to apply the following two patches to make the bug  go away:
http://people.freebsd.org/~davidxu/patch/fifopipe/kernel_pipe.diff 
<http://people.freebsd.org/%7Edavidxu/patch/fifopipe/kernel_pipe.diff>
http://people.freebsd.org/~davidxu/patch/fifopipe/dd.diff 
<http://people.freebsd.org/%7Edavidxu/patch/fifopipe/dd.diff>



> This is what happens under FreeBSD-~5.2 with the old fifo implementation,
> at least.  It also shows a bug in truss(1) -- the current write() is not
> shown, because it hasn't returned.  kdump shows that the write() has
> started but not returned.
>
>> $ dd if=fifo bs=512M count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)
>>
>> Terminal 2:
>>
>> $ dd if=/dev/zero bs=512k count=4 of=fifo
>> ^T
>> load: 0.40  cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k
>
> FreeBSD-~5.2 shows [runnable] for the wait channel.  This is
> strange.  dd should be blocked waiting for a reader, and only
> sbwait makes sense for that.  FreeBSD-9 apparently doesn't
> have the new named pipe implementation either.  -current shows
> [pipdwt].  This makes it clearer that is waiting in write()
> and not in open().  dd probably does the wrong thing for
> fifos, by always trying to open files in O_RDWR mode first.
> This breaks the normal synchronization of readers and writers.
> In fact, this explains why there is no SIGPIPE -- there is
> always a reader since dd can always talk to itself.  First
> the open succeeds without blocking as expected.
>
> After changing the O_RDWR to O_WRONLY in FreeBSD-~5.2, dd almost
> works as expected.  The reader reads 4 blocks of size 8K and
> then exits.  The writer first blocks in open.  Then it is
> killed by SIGPIPE.  Its SIGPIPE handling is broken (nonexistent),
> and the signal kills it without it printing a status message:
>
> %   1266 dd       RET   read 524288/0x80000
> %   1266 dd       CALL  write(0x4,0x8063000,0x80000)
> %   1266 dd       RET   write -1 errno 32 Broken pipe
> %   1266 dd       PSIG  SIGPIPE SIG_DFL
>
> The read is from /dev/zero.  The write is of 512K to the fifo.
> This delivers 4*8K then is killed.  If dd caught the signal
> like it should, then we would expect to see either a short
> write().  The signal handling should clear SA_RESTART, else
> the write() would be restarted and would deliver endless
> SIGPIPEs, now for failing writes.  Reporting of short writes
> is quite broken and this is an interesting test for it.
>
> -current delivers 4*64K instead of 4*8K.  This is because
> the i/o unit is BIG_PIPE_SIZE = 64K for nameless pipes and
> now for nameless pipes.  Apparently the unit is 8K for
> sockets.  I think the unit of atomicity is only 512 bytes
> for both.  Certainly, PIPE_BUF is still 512 in limits.h.
> I think limits.h is broken since the unit isn't actually
> 512 bytes for _all_ file types.  For sockets, you can control
> the watermarks and I think this changes the unit of atomicity.
> I wonder if the socket ioctls for this the old named pipe
> implemention.
>
> The pipe wait channel names are less than perfect.  "pipdw"
> means "pipe direct write".  "wt" looks like an abreviation
> for "write", but there are 3 waits in pipe_direct_write()
> and they are distinguished by the suffixes "w", "c" and "t".
> It isn't clear what these mean.
>
>>> How-To-Repeat:
>> mkfifo fifo
>>
>> Terminal 1:
>>
>> dd if=fifo bs=512k count=4
>>
>> Terminal 2:
>>
>> dd if=/dev/zero bs=512k count=4 of=fifo
>
> Remember to kill the writing dd if you stop it with ^Z. Otherwise, since
> the unhacked version is talking to itself, the fifo acts strangely for
> other tests.
>
> conv=block and conv=noerror (with cbs=512k) change the behaviour only
> slightly (slightly worse).  What works easily is omitting the count.
> dd then reads until EOF, in 256 records of size exactly 8K each under
> FreeBSD-~5.2.  Not giving the count is normal practice, since you
> rarely know the block size for pipes and many other file types. It
> there is another bug here, then it is conv=foo not working.  But
> reblocking is confusing, and I probably did it wrong.
>
> ANother thing that doesn't work well here is trying to control the
> writer with SIGPIPE from the reader.  Even if you can get the reblocking
> right and read precisily 2MB, and fix SIGPIPE, then the SIGPIPE may be
> delivered after the writer has dirtied the fifo with a little more than
> 2MB.  The unread data then remains to bite the next reader.
>
> Bruce
> _______________________________________________
> freebsd-bugs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-bugs
> To unsubscribe, send any mail to "freebsd-bugs-unsubscribe@freebsd.org"
> .
>
Comment 3 Bruce Evans freebsd_committer freebsd_triage 2012-07-27 12:08:13 UTC
On Fri, 27 Jul 2012, David Xu wrote:

> On 2012/7/27 10:07, Bruce Evans wrote:
>> 
>> I think it's working almost as expected.  Large blocks give non-atomic
>> I/O, so the reader sees small blocks, then EOF when it gets ahead of
>> the writer.  This always happens without SMP.
>> 
>> Not is a bug (debugged below).  There is no SIGPIPE at the start of
>> write() because there is a reader then, and no SIGPIPE for the next
>> write() because there is no next write() -- the current one doesn't
>> notice when the reader goes away.
>> 
> After fixed dd to not open fifo output file in O_RDWR mode, I still found the
> writer is blocked there even the reader is already exited.

I'm not sure that dd's open is a bug.  It must be intentional to use
O_RDWR for some cases.

POSIX (old 2001 draft) doesn't say anything about dd's open mode.

> I think this is definitely a bug. if reader is exited, the writer should be 
> aborted too,
> but I found it still be blocked in state "pipedwt", obviously, the code in
> /sys/fs/fifo_vnops.c wants to wake up the writer when the reader is closing 
> the fifo,
> but it failed, because the bit flag PIPE_WANTW is forgotten to be set by 
> writer,
> so it skips executing wakeup(), and then the writer has no chance to find EOF 
> bit flag
> is set.

Does this affect nameless pipes too?  The old implementation presumably
doesn't have this bug.

Bruce
Comment 4 listlog2011 2012-07-27 13:29:02 UTC
On 2012/7/27 19:08, Bruce Evans wrote:
> On Fri, 27 Jul 2012, David Xu wrote:
>
>> On 2012/7/27 10:07, Bruce Evans wrote:
>>>
>>> I think it's working almost as expected.  Large blocks give non-atomic
>>> I/O, so the reader sees small blocks, then EOF when it gets ahead of
>>> the writer.  This always happens without SMP.
>>>
>>> Not is a bug (debugged below).  There is no SIGPIPE at the start of
>>> write() because there is a reader then, and no SIGPIPE for the next
>>> write() because there is no next write() -- the current one doesn't
>>> notice when the reader goes away.
>>>
>> After fixed dd to not open fifo output file in O_RDWR mode, I still 
>> found the
>> writer is blocked there even the reader is already exited.
>
> I'm not sure that dd's open is a bug.  It must be intentional to use
> O_RDWR for some cases.
>
Don't know if original author even thought about FIFO.

> POSIX (old 2001 draft) doesn't say anything about dd's open mode.
>
>> I think this is definitely a bug. if reader is exited, the writer 
>> should be aborted too,
>> but I found it still be blocked in state "pipedwt", obviously, the 
>> code in
>> /sys/fs/fifo_vnops.c wants to wake up the writer when the reader is 
>> closing the fifo,
>> but it failed, because the bit flag PIPE_WANTW is forgotten to be set 
>> by writer,
>> so it skips executing wakeup(), and then the writer has no chance to 
>> find EOF bit flag
>> is set.
>
> Does this affect nameless pipes too?  The old implementation presumably
> doesn't have this bug.
>
It is easy to repeat the bug for named pipes,  don't know if nameless 
pipes have
same bug,  I can not reproduce it yet.


> Bruce
> .
>
Comment 5 Enji Cooper freebsd_committer freebsd_triage 2012-07-27 22:47:47 UTC
Hi David!
    I'll give the patches you provided a shot this weekend.
Thanks!
-Garrett
Comment 6 dfilter service freebsd_committer freebsd_triage 2012-07-31 03:00:54 UTC
Author: davidxu
Date: Tue Jul 31 02:00:37 2012
New Revision: 238928
URL: http://svn.freebsd.org/changeset/base/238928

Log:
  When a thread is blocked in direct write state, it only sets PIPE_DIRECTW
  flag but not PIPE_WANTW, but FIFO pipe code does not understand this internal
  state, when a FIFO peer reader closes the pipe, it wants to notify the writer,
  it checks PIPE_WANTW, if not set, it skips calling wakeup(), so blocked writer
  never noticed the case, but in general, the writer should return from the
  syscall with EPIPE error code and may get SIGPIPE signal. Setting the
  PIPE_WANTW fixed problem, or you can turn off direct write, it should fix the
  problem too. This bug is found by PR/170203.
  
  Another bug in FIFO pipe code is when peer closes the pipe, another end which
  is being blocked in select() or poll() is not notified, it missed to call
  pipeselwakeup().
  
  Third problem is found in poll regression test, the existing code can not
  pass 6b,6c,6d tests, but FreeBSD-4 works. This commit does not fix the
  problem, I still need to study more to find the cause.
  
  PR: 170203
  Tested by: Garrett Copper &lt; yanegomi at gmail dot com &gt;

Modified:
  head/sys/fs/fifofs/fifo_vnops.c
  head/sys/kern/sys_pipe.c
  head/sys/sys/pipe.h

Modified: head/sys/fs/fifofs/fifo_vnops.c
==============================================================================
--- head/sys/fs/fifofs/fifo_vnops.c	Tue Jul 31 00:46:19 2012	(r238927)
+++ head/sys/fs/fifofs/fifo_vnops.c	Tue Jul 31 02:00:37 2012	(r238928)
@@ -283,8 +283,11 @@ fifo_close(ap)
 		if (fip->fi_readers == 0) {
 			PIPE_LOCK(cpipe);
 			cpipe->pipe_state |= PIPE_EOF;
-			if (cpipe->pipe_state & PIPE_WANTW)
+			if ((cpipe->pipe_state & PIPE_WANTW)) {
+				cpipe->pipe_state &= ~PIPE_WANTW;
 				wakeup(cpipe);
+			}
+			pipeselwakeup(cpipe);
 			PIPE_UNLOCK(cpipe);
 		}
 	}
@@ -293,10 +296,13 @@ fifo_close(ap)
 		if (fip->fi_writers == 0) {
 			PIPE_LOCK(cpipe);
 			cpipe->pipe_state |= PIPE_EOF;
-			if (cpipe->pipe_state & PIPE_WANTR)
+			if ((cpipe->pipe_state & PIPE_WANTR)) {
+				cpipe->pipe_state &= ~PIPE_WANTR;
 				wakeup(cpipe);
+			}
 			fip->fi_wgen++;
 			FIFO_UPDWGEN(fip, cpipe);
+			pipeselwakeup(cpipe);
 			PIPE_UNLOCK(cpipe);
 		}
 	}

Modified: head/sys/kern/sys_pipe.c
==============================================================================
--- head/sys/kern/sys_pipe.c	Tue Jul 31 00:46:19 2012	(r238927)
+++ head/sys/kern/sys_pipe.c	Tue Jul 31 02:00:37 2012	(r238928)
@@ -227,7 +227,6 @@ static int pipe_create(struct pipe *pipe
 static int pipe_paircreate(struct thread *td, struct pipepair **p_pp);
 static __inline int pipelock(struct pipe *cpipe, int catch);
 static __inline void pipeunlock(struct pipe *cpipe);
-static __inline void pipeselwakeup(struct pipe *cpipe);
 #ifndef PIPE_NODIRECT
 static int pipe_build_write_buffer(struct pipe *wpipe, struct uio *uio);
 static void pipe_destroy_write_buffer(struct pipe *wpipe);
@@ -607,7 +606,7 @@ pipeunlock(cpipe)
 	}
 }
 
-static __inline void
+void
 pipeselwakeup(cpipe)
 	struct pipe *cpipe;
 {
@@ -738,7 +737,7 @@ pipe_read(fp, uio, active_cred, flags, t
 			rpipe->pipe_map.pos += size;
 			rpipe->pipe_map.cnt -= size;
 			if (rpipe->pipe_map.cnt == 0) {
-				rpipe->pipe_state &= ~PIPE_DIRECTW;
+				rpipe->pipe_state &= ~(PIPE_DIRECTW|PIPE_WANTW);
 				wakeup(rpipe);
 			}
 #endif
@@ -1001,6 +1000,7 @@ retry:
 			wakeup(wpipe);
 		}
 		pipeselwakeup(wpipe);
+		wpipe->pipe_state |= PIPE_WANTW;
 		pipeunlock(wpipe);
 		error = msleep(wpipe, PIPE_MTX(wpipe), PRIBIO | PCATCH,
 		    "pipdwt", 0);

Modified: head/sys/sys/pipe.h
==============================================================================
--- head/sys/sys/pipe.h	Tue Jul 31 00:46:19 2012	(r238927)
+++ head/sys/sys/pipe.h	Tue Jul 31 02:00:37 2012	(r238928)
@@ -143,5 +143,5 @@ struct pipepair {
 
 void	pipe_dtor(struct pipe *dpipe);
 int	pipe_named_ctor(struct pipe **ppipe, struct thread *td);
-
+void	pipeselwakeup(struct pipe *cpipe);
 #endif /* !_SYS_PIPE_H_ */
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 7 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:00:43 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped