| Summary: | EAGAIN on shell pipes / O_NONBLOCK error in kernel ? | | |
|---|---|---|---|
| Product: | Base System | Reporter: | Poul-Henning Kamp <phk> |
| Component: | kern | Assignee: | freebsd-bugs (Nobody) <bugs> |
| Status: | Closed Works As Intended | | |
| Severity: | Affects Only Me | CC: | jilles, kib, truckman |
| Priority: | --- | | |
| Version: | CURRENT | | |
| Hardware: | Any | | |
| OS: | Any | | |
Description
Poul-Henning Kamp
2016-05-14 22:01:52 UTC
I don't see this with slightly older -CURRENT (r299612). It may also be dependent on the write size, with partial writes being the trigger.

This is how I did my test:

    # truss -do /tmp/truss.out cat /usr/share/dict/web2 | & ( sleep 60 ; cat > /tmp/out )

The point at which the pipe fills is visible here:

    0.008018050 write(1,"\nalif\naliferous\naliform\nalig"...,4096) = 4096 (0x1000)
    0.008089335 read(3,"oplasmatic\nalloplasmic\nallopla"...,4096) = 4096 (0x1000)
    0.008183362 write(1,"oplasmatic\nalloplasmic\nallopla"...,4096) = 4096 (0x1000)
    0.008255003 read(3,"\naltininck\naltiplano\naltiscop"...,4096) = 4096 (0x1000)
    0.008353817 write(1,"\naltininck\naltiplano\naltiscop"...,4096) = 4096 (0x1000)
    0.008425411 read(3,"ambrosiate\nambrosin\nambrosine"...,4096) = 4096 (0x1000)
    60.037870962 write(1,"ambrosiate\nambrosin\nambrosine"...,4096) = 4096 (0x1000)
    60.038272972 read(3,"n\nAmnionata\namnionate\namnioni"...,4096) = 4096 (0x1000)
    60.038579389 write(1,"n\nAmnionata\namnionate\namnioni"...,4096) = 4096 (0x1000)

I just tried an odd-size write test and that works properly as well:

    # truss -do /tmp/truss.out dd if=/usr/share/dict/web2 bs=111 | & ( sleep 60 ; cat > /tmp/out )

    0.281516906 read(3,"\namboceptor\nAmbocoelia\nAmboin"...,111) = 111 (0x6f)
    0.281774261 write(1,"\namboceptor\nAmbocoelia\nAmboin"...,111) = 111 (0x6f)
    0.281966423 read(3,"n\nambrein\nambrette\nAmbrica\na"...,111) = 111 (0x6f)
    0.282239613 write(1,"n\nambrein\nambrette\nAmbrica\na"...,111) = 111 (0x6f)
    0.282433782 read(3,"ous\nambrosial\nambrosially\nAmb"...,111) = 111 (0x6f)
    60.078868711 write(1,"ous\nambrosial\nambrosially\nAmb"...,111) = 111 (0x6f)
    60.079279323 read(3,"y\nambsace\nambulacral\nambulacr"...,111) = 111 (0x6f)
    60.079588632 write(1,"y\nambsace\nambulacral\nambulacr"...,111) = 111 (0x6f)
    60.079823710 read(3,"ative\nambulator\nAmbulatoria\na"...,111) = 111 (0x6f)
    60.080113161 write(1,"ative\nambulator\nAmbulatoria\na"...,111) = 111 (0x6f)

(In reply to Poul-Henning Kamp from comment #0)
Can you provide a minimal reproduction case?

The problem may be caused by ssh. When it starts, ssh sets fd 0, 1 and 2 to non-blocking mode if they are not TTYs, restoring them to their original state on exit. This causes breakage if you use the open files (pipes or sockets) for other things while ssh is running.
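The breakage described above follows from O_NONBLOCK being a property of the open file description rather than of an individual descriptor, so every process sharing the pipe sees the change. The following is a minimal Python sketch of that effect, not ssh's code: `dup()` stands in for a second process that inherited the pipe, and the helper names are made up for illustration.

```python
import errno
import fcntl
import os


def nonblock_flag(fd):
    """Return True if O_NONBLOCK is set on fd's open file description."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def demo_shared_nonblock():
    read_end, write_end = os.pipe()
    # dup() stands in for another process that inherited the same pipe end.
    other = os.dup(write_end)
    try:
        before = nonblock_flag(other)
        # One holder switches the pipe to non-blocking mode, as ssh is
        # described as doing for fd 0, 1 and 2.
        os.set_blocking(write_end, False)
        after = nonblock_flag(other)
        # Nobody reads, so the pipe fills up and the "innocent" descriptor
        # gets EAGAIN instead of blocking.
        err = None
        try:
            while True:
                os.write(other, b"x" * 4096)
        except BlockingIOError as exc:
            err = exc.errno
        return before, after, err
    finally:
        for fd in (read_end, write_end, other):
            os.close(fd)
```

Calling `demo_shared_nonblock()` shows the flag flipping on the descriptor that was never touched directly, followed by the EAGAIN from the summary of this report.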
Unfortunately, fixing ssh requires adding threads or processes to do blocking reads and writes. On FreeBSD, although socket receives support MSG_DONTWAIT, socket sends do not (although their behaviour is affected by it slightly) and pipe reads and writes do not support anything like it. Performance of common use cases may be affected negatively.
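For comparison, this is roughly what a per-call non-blocking receive looks like. It is a small Python sketch on a local socket pair, assuming a platform where `socket.MSG_DONTWAIT` is available; it only illustrates the receive side mentioned above, since the comment notes that sends and pipes have no equivalent.

```python
import fcntl
import os
import socket


def recv_dontwait_demo():
    a, b = socket.socketpair()
    try:
        # No data queued: MSG_DONTWAIT makes this one call fail with EAGAIN
        # instead of blocking.
        try:
            a.recv(1, socket.MSG_DONTWAIT)
            would_block = False
        except BlockingIOError:
            would_block = True
        # The shared open file description is untouched: O_NONBLOCK is not set.
        flags = fcntl.fcntl(a.fileno(), fcntl.F_GETFL)
        still_blocking = not (flags & os.O_NONBLOCK)
        # Once data is queued, the same call returns it immediately.
        b.sendall(b"hi")
        data = a.recv(2, socket.MSG_DONTWAIT)
        return would_block, still_blocking, data
    finally:
        a.close()
        b.close()
```

The point of the design is that the non-blocking behaviour is requested per call, so other holders of the same socket keep their blocking semantics.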
As a workaround, try redirecting ssh's stderr through a 'cat'. For example,
{ ssh ... 2>&1 >&3 3>&- | cat >&2; } 3>&1
(Bug: this example loses ssh's exit status. Using fifos or doing pipe manipulations from C will avoid that.)
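The same idea can be sketched outside the shell in a way that keeps the exit status. The Python function below is a hypothetical helper, not something from ssh or the base system: it gives the child a private pipe for stderr, copies that pipe to the real stderr with ordinary blocking I/O (the role `cat >&2` plays above), and returns the child's exit status. Like the shell example, it only isolates stderr.

```python
import shutil
import subprocess
import sys


def run_with_private_stderr(argv, sink=None):
    """Run argv with stderr on a private pipe; return the exit status.

    sink is a binary file object receiving the child's stderr output
    (defaults to this process's own stderr).
    """
    if sink is None:
        sink = sys.stderr.buffer
    # The child may set O_NONBLOCK on this pipe without affecting the
    # descriptor that our own stderr refers to.
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE)
    shutil.copyfileobj(proc.stderr, sink)  # blocking relay, like 'cat >&2'
    proc.stderr.close()
    return proc.wait()
```

A caller would use it as `status = run_with_private_stderr(["ssh", ...])`, with the arguments filled in as appropriate.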
That would explain why things don't break left, right and center. I'll test that theory later today.

Isn't this going to screw up programs like rsync or do they take special precautions?

This was actually pretty trivial to test, and yes, ssh seems to be the culprit.
This might be a good thing to note in the ssh manual page, since the focus on encryption these days makes popen("ssh...") quite a logical thing to do.
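One precaution a program could take when its output descriptor may be shared with a running ssh is to treat EAGAIN as "wait and retry" rather than as a fatal error. The following Python sketch shows such a write loop; `write_all` is a hypothetical helper name, and nothing here is taken from rsync or any other existing program.

```python
import os
import select


def write_all(fd, data):
    """Write all of data to fd, even if fd was switched to non-blocking mode."""
    view = memoryview(data)
    while view:
        try:
            written = os.write(fd, view)
            view = view[written:]
        except BlockingIOError:
            # The pipe or socket is full: sleep until it is writable again
            # instead of reporting EAGAIN to the caller.
            select.select([], [fd], [])
    return len(data)
```

This costs an extra `select()` only when the descriptor is both non-blocking and full, so the common blocking case is unaffected.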