Bug 176916 - [patch] sh(1): implement multiple arguments to wait builtin
Summary: [patch] sh(1): implement multiple arguments to wait builtin
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: standards (show other bugs)
Version: 9.0-RELEASE
Hardware: Any Any
: Normal Affects Only Me
Assignee: Jilles Tjoelker
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-03-13 12:20 UTC by Vadim Goncharov
Modified: 2014-08-31 21:00 UTC (History)
0 users

See Also:


Attachments
file.diff (1.89 KB, patch)
2013-03-13 12:20 UTC, Vadim Goncharov
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Vadim Goncharov 2013-03-13 12:20:00 UTC
POSIX.2 requires that wait(1) built-in in sh(1) shall support multiple arguments, and that unknown (e.g. errorneous) job specification be treated as known with exit status 127 (it is currently 2), see http://pubs.opengroup.org/onlinepubs/009695399/utilities/wait.html

The patch attached implements this functionality which was absent in our shell.

This feature is essential for implementing e.g. parallel execution of rc.d scripts, so please commit/MFC it as soon as possible :)

Fix: Patch attached.

Patch attached with submission follows:
How-To-Repeat: Try to do

  sleep 33 & sleep 44 &
  wait %2 %1
  echo $?

other args are ignored.
Comment 1 Jilles Tjoelker freebsd_committer freebsd_triage 2013-03-15 17:13:47 UTC
Responsible Changed
From-To: freebsd-standards->jilles

sh(1) is my area. 

The wait builtin is indeed required to accept multiple operands 
and to treat unknown process IDs as known process IDs that 
exited with status 127. 

The way you patched getjob seems incorrect as it will still 
print an error message. Also, I would like to avoid calling 
setjmp(). 

I will also implement "--" handling and rejection of unknown 
options in "wait" first. 

About working around the lack of these features, invoking 
"wait" with multiple operands is equivalent to invoking it 
multiple times with one operand each. Passing unknown process 
IDs can be avoided iff there are no trap handlers that do 
not exit.
Comment 2 Vadim Goncharov 2013-03-21 12:56:14 UTC
Hi jilles@FreeBSD.org!

On Fri, 15 Mar 2013 17:25:53 GMT; jilles@FreeBSD.org <jilles@FreeBSD.org>  
wrote:

> The way you patched getjob seems incorrect as it will still
> print an error message.

> About working around the lack of these features, invoking
> "wait" with multiple operands is equivalent to invoking it
> multiple times with one operand each.

I've also thought so at first, but then I've realized that error printing  
is
needed for debugging. See here, I've implemented parallel rcNG:

http://nuclight.ipfw.ru/vadim/freebsd/rc_parallel.{8.3R,HEAD}.patch

and printing errors about unknown jobs helped a lot during debug (after  
all,
shell still prints many errors to stderr in non-interactive scripts, and  
that
helps, too). I think the same will be true for anyone utilizing job control
(in scripts) with 'wait' for other tasks, too (BTW, you'll find in the 2  
links
above [substitute needed fbsd ver to URL] another one-line patch hunk about
'mflag' to utilize %? in non-interactive scripts with job control).

POSIX.2 does not mandate that 'wait' must be silent, only about error-code
behaviour. I've digged into code of 3 other shells, bash, ksh and zsh,
supporting multiple args - they don't do suppressing error messages, while
bash's code comments explicitly say about POSIX.2 in this place.

So finally, given some experience for what it is needed for scripts, I've
left getjob() untouched - and this also lead to simpler and cleaner diff.

> Also, I would like to avoid calling setjmp().

Also, just curious, what is bad with setjmp() ? It is common in sh(1) code,
nice idiom with savehandler which is easy to write and understand,  
especially
in this place, where the code is short.

> Passing unknown process IDs can be avoided iff there are no
> trap handlers that do not exit.

I don't understood. In script, you can't be always sure your argument is
known or unknown, sh(1) knows better. This patch was done for real task
(parallel rcNG from link above) where jobspecs are coming from external
source, we, as script writers, don't know if they are known or unknown.

> I will also implement "--" handling and rejection of unknown
> options in "wait" first.

Is it really useful? Several built-ins don't have options and don't need to
check for them. Seems like over-complication.

-- 
P.S. Resent by copy-paste without all headers, as freebsd.org responded
"blocked using b.barracudacentral.org" to my sendmail about address of
my home machine.

WBR, Vadim Goncharov. ICQ#166852181       mailto:vadim_nuclight@mail.ru
[Anti-Greenpeace][Sober FreeBSD zealot][http://nuclight.livejournal.com]
Comment 3 Jilles Tjoelker freebsd_committer freebsd_triage 2013-03-23 22:55:26 UTC
On Thu, Mar 21, 2013 at 04:56:14PM +0400, Vadim Goncharov wrote:
> On Fri, 15 Mar 2013 17:25:53 GMT; jilles@FreeBSD.org <jilles@FreeBSD.org>  
> wrote:

> > The way you patched getjob seems incorrect as it will still
> > print an error message.

> > About working around the lack of these features, invoking
> > "wait" with multiple operands is equivalent to invoking it
> > multiple times with one operand each.

> I've also thought so at first, but then I've realized that error printing  
> is
> needed for debugging. See here, I've implemented parallel rcNG:

> http://nuclight.ipfw.ru/vadim/freebsd/rc_parallel.{8.3R,HEAD}.patch

Hmm, a third implementation. Two others are
https://github.com/buganini/rcexecr/ and
https://github.com/kil/rcorder/.

What appears missing in your implementation is a limit (greater than 1
and less than the number of scripts) on how many scripts may be running
at once. This is like make -jN and prevents thrashing.

> and printing errors about unknown jobs helped a lot during debug
> (after  all, shell still prints many errors to stderr in
> non-interactive scripts, and  that helps, too). I think the same will
> be true for anyone utilizing job control (in scripts) with 'wait' for
> other tasks, too (BTW, you'll find in the 2  links above [substitute
> needed fbsd ver to URL] another one-line patch hunk about 'mflag' to
> utilize %? in non-interactive scripts with job control).

I need to think about this. I am not sure it is appropriate to tie
together the command text and the separate process group for each job.
The latter effect of set -m is definitely inappropriate for rc scripts
since it will prevent interrupting a script using ^C or ^\.

> POSIX.2 does not mandate that 'wait' must be silent, only about error-code
> behaviour. I've digged into code of 3 other shells, bash, ksh and zsh,
> supporting multiple args - they don't do suppressing error messages, while
> bash's code comments explicitly say about POSIX.2 in this place.

In general, POSIX does not permit printing diagnostics (warnings) if the
exit status is still 0. This is indicated by a STDERR section saying
"The standard error shall be used only for diagnostic messages."
and fully detailed in XCU 1.4 Utility Description Defaults.

Furthermore, the text "If one or more pid operands are specified that
represent unknown process IDs, wait shall treat them as if they were
known process IDs that exited with exit status 127." seems clear enough.
That would apply to both numeric process IDs and job control job IDs.
Implementations mostly follow this for the former but not for the
latter. That bash (even in POSIX mode) and zsh (even in sh and ksh
emulations) generate a diagnostic seems a bug to me. Note that zsh has
many of this kind of bug (the developers don't consider them very
important) but bash in POSIX mode should probably fix it.

Note that in the sequence
  :& wait 1 $!
the exit status shall be 0, so no diagnostic shall be written about pid
1. And it makes no sense to write a diagnostic only if the unknown pid
is last.

> So finally, given some experience for what it is needed for scripts, I've
> left getjob() untouched - and this also lead to simpler and cleaner diff.

> > Also, I would like to avoid calling setjmp().

> Also, just curious, what is bad with setjmp() ? It is common in sh(1)
> code, nice idiom with savehandler which is easy to write and
> understand,  especially in this place, where the code is short.

Compilers do not like setjmp(). Mainly because of this issue, the
warning level is set low. If it is set higher, many warnings like the
below appear:

eval.c:1028: warning: variable 'argv' might be clobbered by 'longjmp' or 'vfork'

I have checked the C standard and POSIX and made variables volatile
where they say it is needed. Until now, I have refused to make more
variables volatile where the standards do not require it but compilers
complain anyway.

I am willing to make deeper changes to avoid adding more setjmp() or to
remove existing setjmp().

> > Passing unknown process IDs can be avoided iff there are no
> > trap handlers that do not exit.

> I don't understood. In script, you can't be always sure your argument is
> known or unknown, sh(1) knows better. This patch was done for real task
> (parallel rcNG from link above) where jobspecs are coming from external
> source, we, as script writers, don't know if they are known or unknown.

You can avoid it but may not want to for complexity reasons... OK.

> > I will also implement "--" handling and rejection of unknown
> > options in "wait" first.

> Is it really useful? Several built-ins don't have options and don't
> need to check for them. Seems like over-complication.

I would like to have the possibility of adding options in future. As
long as things that look like options cause the command to fail, it is
not a problem not to support them, but if such things will be accepted,
something needs to happen. I committed that change.

-- 
Jilles Tjoelker
Comment 4 Vadim Goncharov 2013-03-26 22:27:48 UTC
24.03.13 @ 02:55 Jilles Tjoelker wrote:

> On Thu, Mar 21, 2013 at 04:56:14PM +0400, Vadim Goncharov wrote:
>> On Fri, 15 Mar 2013 17:25:53 GMT; jilles@FreeBSD.org  
>> <jilles@FreeBSD.org>
>> wrote:
>
>> > The way you patched getjob seems incorrect as it will still
>> > print an error message.
>
>> > About working around the lack of these features, invoking
>> > "wait" with multiple operands is equivalent to invoking it
>> > multiple times with one operand each.
>
>> I've also thought so at first, but then I've realized that error  
>> printing
>> is
>> needed for debugging. See here, I've implemented parallel rcNG:
>
>> http://nuclight.ipfw.ru/vadim/freebsd/rc_parallel.{8.3R,HEAD}.patch
>
> Hmm, a third implementation. Two others are
> https://github.com/buganini/rcexecr/ and
> https://github.com/kil/rcorder/.
>
> What appears missing in your implementation is a limit (greater than 1
> and less than the number of scripts) on how many scripts may be running
> at once. This is like make -jN and prevents thrashing.

Yes, I am aware about these two implementations and about this problem. My  
version is not final, and was born after looking at these two. The problem  
with them is that they pass too much to rcorder's shoulders - it is  
converted to launcher, too much C code is touched, and result is still  
very far from being flexible, while already being incompatible with  
current rcNG scripts. In fact, that's something like systemd in Linux,  
having all systemd's cons (e.g. inability for admin to fix problem with  
boot, which was always possible with shell), but not even any of systemd's  
pros.

I've tried to take different approach, making it The Right Way(tm):

* make it Unix way, every utility does it's small part and does it well
* minimal possible patch to C code: do things in higher-level lang (sh)
* being fully compatible with current rcNG system (ready to import now)

This is proof-of-concept, and it worked - you see, first part is not  
parallel, rcorder's output ($1 of every line) is not changed, being  
compatible to previous (being able to disable); rcorder's output can be  
possibly used for other purposes, not just startup; any system  
administrator can now implement any other scheme in pure shell. Yes, this  
version is actually thrashing, but implementing any variant of 'make -jN'  
is now very easy for everyone in 20 lines of shell, compared to hundredths  
of C code lines graven-in-stone.

(It may be possible, given your comments below, that some minor changes to  
sh(1) is more proper way, though. But that's probably theme for discussion  
in more wider community (hackers@ ?).)

>> and printing errors about unknown jobs helped a lot during debug
>> (after  all, shell still prints many errors to stderr in
>> non-interactive scripts, and  that helps, too). I think the same will
>> be true for anyone utilizing job control (in scripts) with 'wait' for
>> other tasks, too (BTW, you'll find in the 2  links above [substitute
>> needed fbsd ver to URL] another one-line patch hunk about 'mflag' to
>> utilize %? in non-interactive scripts with job control).
>
> I need to think about this. I am not sure it is appropriate to tie
> together the command text and the separate process group for each job.

That's definitely a question for more competent people than me. I wasn't  
sure about adding a new flag or making storing that info permanent for all  
modes of operation (is that IIRC 200 bytes per job critical, though?), so  
tried to do minimal change keeping that turned off until explicitly  
requested.

> The latter effect of set -m is definitely inappropriate for rc scripts
> since it will prevent interrupting a script using ^C or ^\.

This probably is not so seriuos as rc.subr sets 'kill -sig $$' for both  
INT and QUIT traps. This happens not under all rc_* settings, and making  
separate process group is not necessary, though (and makes it a little  
slower).

>> POSIX.2 does not mandate that 'wait' must be silent, only about  
>> error-code
>> behaviour. I've digged into code of 3 other shells, bash, ksh and zsh,
>> supporting multiple args - they don't do suppressing error messages,  
>> while
>> bash's code comments explicitly say about POSIX.2 in this place.
>
> In general, POSIX does not permit printing diagnostics (warnings) if the
> exit status is still 0. This is indicated by a STDERR section saying
> "The standard error shall be used only for diagnostic messages."
> and fully detailed in XCU 1.4 Utility Description Defaults.

That's strange for me as user and admin. Exit status is checked in  
scripts, and if anyone don't want warnings, they can always be suppressed  
with 2>/dev/null (BTW, that's what is already done with rcorder in  
/etc/rc). It is better to have them and silence if you don't need, than  
not to have when you need.

> Furthermore, the text "If one or more pid operands are specified that
> represent unknown process IDs, wait shall treat them as if they were
> known process IDs that exited with exit status 127." seems clear enough.
> That would apply to both numeric process IDs and job control job IDs.
> Implementations mostly follow this for the former but not for the
> latter. That bash (even in POSIX mode) and zsh (even in sh and ksh
> emulations) generate a diagnostic seems a bug to me. Note that zsh has
> many of this kind of bug (the developers don't consider them very
> important) but bash in POSIX mode should probably fix it.
>
> Note that in the sequence
>   :& wait 1 $!
> the exit status shall be 0, so no diagnostic shall be written about pid
> 1. And it makes no sense to write a diagnostic only if the unknown pid
> is last.

Hmm, "clear enough" ? For me it's not clear that this is about suppressing  
warnings, too. I've read POSIX page and relevant places in other shells  
several times, and decided to leave printing warnings as is. It seems to  
me that primary focus is on the exit status, which is almost always the  
only relevant/usable thing in scripts. Think about it from user's point of  
view where you do grep(1) or find(1) over /etc and see 'Permission  
denied', but you get your request satisfied from other files - so, for  
you, both the program worked correct (produced useful answer to you) AND  
warnings also told you some useful info.

Yes, it makes no sense to write a diagnostic only if the unknown pid is  
last. Either about all, or about nothing. The grep/find analogy makes it  
intuitive (for user) to see all diagnostic messages about (transient!)  
failures while final result was still OK.

This grep analogy, however, reminds us about 'grep -q' thus suggesting  
possible solution - control diagnostic printing based on current mode. At  
least, user definitely wants warnings when shell is interactive. For  
scripts... may be mflag for those who explicitly request it? Or, if not  
-m, tie it with another problem with command text mentioned above, if some  
other flag will be chosen (-b ? -T ? probably depends on what will be  
chosen for proper implementing 'make -jN' for parallel rcNG).

>
>> So finally, given some experience for what it is needed for scripts,  
>> I've
>> left getjob() untouched - and this also lead to simpler and cleaner  
>> diff.
>
>> > Also, I would like to avoid calling setjmp().
>
>> Also, just curious, what is bad with setjmp() ? It is common in sh(1)
>> code, nice idiom with savehandler which is easy to write and
>> understand,  especially in this place, where the code is short.
>
> Compilers do not like setjmp(). Mainly because of this issue, the
> warning level is set low. If it is set higher, many warnings like the
> below appear:
>
> eval.c:1028: warning: variable 'argv' might be clobbered by 'longjmp' or  
> 'vfork'
>
> I have checked the C standard and POSIX and made variables volatile
> where they say it is needed. Until now, I have refused to make more
> variables volatile where the standards do not require it but compilers
> complain anyway.

That's pretty strange. The setjmp/longjmp are required by standards, so  
these are bugs in compilers, not in sh(1). I heard Clang people are doing  
many things based on reports from FreeBSD, may be they will fix it?

> I am willing to make deeper changes to avoid adding more setjmp() or to
> remove existing setjmp().

But how exceptions will be handled, then? Personally I find this method,  
as it is done in /bin/sh, very elegant for C code (previously I feared  
setjmp() but in sh(1) it is very clear, now I'll use it in some mine  
projects). It is sometimes even better and more understandable than  
C++ exceptions, while being so simple: I looked at setvarsafe(), the most  
refined example - waitcmd() could be another :-) The code of our /bin/sh  
is much better (including this aspect) than three other shells I looked at  
(except may be ksh in OpenBSD).

>> > Passing unknown process IDs can be avoided iff there are no
>> > trap handlers that do not exit.
>
>> I don't understood. In script, you can't be always sure your argument is
>> known or unknown, sh(1) knows better. This patch was done for real task
>> (parallel rcNG from link above) where jobspecs are coming from external
>> source, we, as script writers, don't know if they are known or unknown.
>
> You can avoid it but may not want to for complexity reasons... OK.

Probably I still do not understand you, but this is irrelevant for  
discussion as the whole needed feature is required by POSIX.2, no need to  
think about workarounds.

>> > I will also implement "--" handling and rejection of unknown
>> > options in "wait" first.
>
>> Is it really useful? Several built-ins don't have options and don't
>> need to check for them. Seems like over-complication.
>
> I would like to have the possibility of adding options in future. As
> long as things that look like options cause the command to fail, it is
> not a problem not to support them, but if such things will be accepted,
> something needs to happen. I committed that change.

Then may be wait -q for interactive mode or wait -v for scripts will  
handle printing warnings (different defaults for iflag/mflag/whatever) ?  
Still seems like over-complication to me, however, but just "--" for  
future (looked at your commit) is OK... until somebody will request  
processing 'wait -1' like 'kill -INT -1' :-).

-- 
WBR, Vadim Goncharov
Comment 5 Jilles Tjoelker freebsd_committer freebsd_triage 2013-03-29 14:52:06 UTC
On Wed, Mar 27, 2013 at 02:27:48AM +0400, Vadim Goncharov wrote:
> 24.03.13 @ 02:55 Jilles Tjoelker wrote:

> > On Thu, Mar 21, 2013 at 04:56:14PM +0400, Vadim Goncharov wrote:
> >> On Fri, 15 Mar 2013 17:25:53 GMT; jilles@FreeBSD.org wrote:

> >> > The way you patched getjob seems incorrect as it will still
> >> > print an error message.

> >> > About working around the lack of these features, invoking
> >> > "wait" with multiple operands is equivalent to invoking it
> >> > multiple times with one operand each.

> >> I've also thought so at first, but then I've realized that error  
> >> printing is
> >> needed for debugging. See here, I've implemented parallel rcNG:

> >> http://nuclight.ipfw.ru/vadim/freebsd/rc_parallel.{8.3R,HEAD}.patch

> > Hmm, a third implementation. Two others are
> > https://github.com/buganini/rcexecr/ and
> > https://github.com/kil/rcorder/.

> > What appears missing in your implementation is a limit (greater than 1
> > and less than the number of scripts) on how many scripts may be running
> > at once. This is like make -jN and prevents thrashing.

> Yes, I am aware about these two implementations and about this problem. My  
> version is not final, and was born after looking at these two. The problem  
> with them is that they pass too much to rcorder's shoulders - it is  
> converted to launcher, too much C code is touched, and result is still  
> very far from being flexible, while already being incompatible with  
> current rcNG scripts. In fact, that's something like systemd in Linux,  
> having all systemd's cons (e.g. inability for admin to fix problem with  
> boot, which was always possible with shell), but not even any of systemd's  
> pros.

> I've tried to take different approach, making it The Right Way(tm):

> * make it Unix way, every utility does it's small part and does it well
> * minimal possible patch to C code: do things in higher-level lang (sh)
> * being fully compatible with current rcNG system (ready to import now)

> This is proof-of-concept, and it worked - you see, first part is not  
> parallel, rcorder's output ($1 of every line) is not changed, being  
> compatible to previous (being able to disable); rcorder's output can be  
> possibly used for other purposes, not just startup; any system  
> administrator can now implement any other scheme in pure shell. Yes, this  
> version is actually thrashing, but implementing any variant of 'make -jN'  
> is now very easy for everyone in 20 lines of shell, compared to hundredths  
> of C code lines graven-in-stone.

> (It may be possible, given your comments below, that some minor changes to  
> sh(1) is more proper way, though. But that's probably theme for discussion  
> in more wider community (hackers@ ?).)

I think it is fine to make rcorder a launcher, but this is a topic for
freebsd-rc@freebsd.org.

> >> and printing errors about unknown jobs helped a lot during debug
> >> (after  all, shell still prints many errors to stderr in
> >> non-interactive scripts, and  that helps, too). I think the same will
> >> be true for anyone utilizing job control (in scripts) with 'wait' for
> >> other tasks, too (BTW, you'll find in the 2  links above [substitute
> >> needed fbsd ver to URL] another one-line patch hunk about 'mflag' to
> >> utilize %? in non-interactive scripts with job control).

> > I need to think about this. I am not sure it is appropriate to tie
> > together the command text and the separate process group for each job.

> That's definitely a question for more competent people than me. I wasn't  
> sure about adding a new flag or making storing that info permanent for all  
> modes of operation (is that IIRC 200 bytes per job critical, though?), so  
> tried to do minimal change keeping that turned off until explicitly  
> requested.

> > The latter effect of set -m is definitely inappropriate for rc scripts
> > since it will prevent interrupting a script using ^C or ^\.

> This probably is not so seriuos as rc.subr sets 'kill -sig $$' for both  
> INT and QUIT traps. This happens not under all rc_* settings, and making  
> separate process group is not necessary, though (and makes it a little  
> slower).

That trap is in run_rc_script which would already be in the background,
so keyboard ^C/^\ would do nothing.

> >> POSIX.2 does not mandate that 'wait' must be silent, only about
> >> error-code behaviour. I've digged into code of 3 other shells,
> >> bash, ksh and zsh, supporting multiple args - they don't do
> >> suppressing error messages,  while bash's code comments explicitly
> >> say about POSIX.2 in this place.

> > In general, POSIX does not permit printing diagnostics (warnings) if the
> > exit status is still 0. This is indicated by a STDERR section saying
> > "The standard error shall be used only for diagnostic messages."
> > and fully detailed in XCU 1.4 Utility Description Defaults.

> That's strange for me as user and admin. Exit status is checked in  
> scripts, and if anyone don't want warnings, they can always be suppressed  
> with 2>/dev/null (BTW, that's what is already done with rcorder in  
> /etc/rc). It is better to have them and silence if you don't need, than  
> not to have when you need.

> > Furthermore, the text "If one or more pid operands are specified that
> > represent unknown process IDs, wait shall treat them as if they were
> > known process IDs that exited with exit status 127." seems clear enough.
> > That would apply to both numeric process IDs and job control job IDs.
> > Implementations mostly follow this for the former but not for the
> > latter. That bash (even in POSIX mode) and zsh (even in sh and ksh
> > emulations) generate a diagnostic seems a bug to me. Note that zsh has
> > many of this kind of bug (the developers don't consider them very
> > important) but bash in POSIX mode should probably fix it.

> > Note that in the sequence
> >   :& wait 1 $!
> > the exit status shall be 0, so no diagnostic shall be written about pid
> > 1. And it makes no sense to write a diagnostic only if the unknown pid
> > is last.

> Hmm, "clear enough" ? For me it's not clear that this is about suppressing  
> warnings, too. I've read POSIX page and relevant places in other shells  
> several times, and decided to leave printing warnings as is.

Another example:
jilles@jaguar /home/jilles% bash -c 'break; echo $?'
bash: line 0: break: only meaningful in a `for', `while', or `until'
loop
0
jilles@jaguar /home/jilles% bash --posix -c 'break; echo $?'
0

Bash is suppressing the diagnostic in POSIX mode because the exit status
remains 0.

Because ksh variants don't generate a diagnostic for unknown numeric
PIDs, I think that is the way to go. Unknown %jobspecs are somewhat
different; these are usually best avoided in scripts anyway,
particularly %?substring because of its uniqueness requirement.

> It seems to  me that primary focus is on the exit status, which is
> almost always the  only relevant/usable thing in scripts. Think about
> it from user's point of  view where you do grep(1) or find(1) over
> /etc and see 'Permission  denied', but you get your request satisfied
> from other files - so, for  you, both the program worked correct
> (produced useful answer to you) AND  warnings also told you some
> useful info.

The cases with grep(1) and find(1) are different. The error affects the
exit status here, even though the utility does not abort. This is a
frequent pattern in POSIX.

> Yes, it makes no sense to write a diagnostic only if the unknown pid is  
> last. Either about all, or about nothing. The grep/find analogy makes it  
> intuitive (for user) to see all diagnostic messages about (transient!)  
> failures while final result was still OK.

> This grep analogy, however, reminds us about 'grep -q' thus suggesting  
> possible solution - control diagnostic printing based on current mode. At  
> least, user definitely wants warnings when shell is interactive. For  
> scripts... may be mflag for those who explicitly request it? Or, if not  
> -m, tie it with another problem with command text mentioned above, if some  
> other flag will be chosen (-b ? -T ? probably depends on what will be  
> chosen for proper implementing 'make -jN' for parallel rcNG).

> >> So finally, given some experience for what it is needed for scripts,  
> >> I've
> >> left getjob() untouched - and this also lead to simpler and cleaner  
> >> diff.

> >> > Also, I would like to avoid calling setjmp().

> >> Also, just curious, what is bad with setjmp() ? It is common in sh(1)
> >> code, nice idiom with savehandler which is easy to write and
> >> understand,  especially in this place, where the code is short.

> > Compilers do not like setjmp(). Mainly because of this issue, the
> > warning level is set low. If it is set higher, many warnings like the
> > below appear:

> > eval.c:1028: warning: variable 'argv' might be clobbered by 'longjmp' or  
> > 'vfork'

> > I have checked the C standard and POSIX and made variables volatile
> > where they say it is needed. Until now, I have refused to make more
> > variables volatile where the standards do not require it but compilers
> > complain anyway.

> That's pretty strange. The setjmp/longjmp are required by standards, so  
> these are bugs in compilers, not in sh(1). I heard Clang people are doing  
> many things based on reports from FreeBSD, may be they will fix it?

It looks like Clang has already fixed this. These wrong warnings do not
appear when compiling with clang with WARNS=7.

> > I am willing to make deeper changes to avoid adding more setjmp() or to
> > remove existing setjmp().

> But how exceptions will be handled, then? Personally I find this method,  
> as it is done in /bin/sh, very elegant for C code (previously I feared  
> setjmp() but in sh(1) it is very clear, now I'll use it in some mine  
> projects). It is sometimes even better and more understandable than  
> C++ exceptions, while being so simple: I looked at setvarsafe(), the most  
> refined example - waitcmd() could be another :-) The code of our /bin/sh  
> is much better (including this aspect) than three other shells I looked at  
> (except may be ksh in OpenBSD).

The other method is to return a value from a function and check that.

> >> > Passing unknown process IDs can be avoided iff there are no
> >> > trap handlers that do not exit.

> >> I don't understood. In script, you can't be always sure your argument is
> >> known or unknown, sh(1) knows better. This patch was done for real task
> >> (parallel rcNG from link above) where jobspecs are coming from external
> >> source, we, as script writers, don't know if they are known or unknown.

> > You can avoid it but may not want to for complexity reasons... OK.

> Probably I still do not understand you, but this is irrelevant for  
> discussion as the whole needed feature is required by POSIX.2, no need to  
> think about workarounds.

OK.

> >> > I will also implement "--" handling and rejection of unknown
> >> > options in "wait" first.

> >> Is it really useful? Several built-ins don't have options and don't
> >> need to check for them. Seems like over-complication.

> > I would like to have the possibility of adding options in future. As
> > long as things that look like options cause the command to fail, it is
> > not a problem not to support them, but if such things will be accepted,
> > something needs to happen. I committed that change.

> Then may be wait -q for interactive mode or wait -v for scripts will  
> handle printing warnings (different defaults for iflag/mflag/whatever) ?  
> Still seems like over-complication to me, however, but just "--" for  
> future (looked at your commit) is OK... until somebody will request  
> processing 'wait -1' like 'kill -INT -1' :-).

I'm not thinking of any concrete new options yet.

-- 
Jilles Tjoelker
Comment 6 Vadim Goncharov 2013-05-12 19:04:56 UTC
29.03.13 @ 18:52 Jilles Tjoelker wrote:

> On Wed, Mar 27, 2013 at 02:27:48AM +0400, Vadim Goncharov wrote:
>> 24.03.13 @ 02:55 Jilles Tjoelker wrote:
>
>> > On Thu, Mar 21, 2013 at 04:56:14PM +0400, Vadim Goncharov wrote:
>> >> On Fri, 15 Mar 2013 17:25:53 GMT; jilles@FreeBSD.org wrote:
>
>> >> > The way you patched getjob seems incorrect as it will still
>> >> > print an error message.
>
>> >> > About working around the lack of these features, invoking
>> >> > "wait" with multiple operands is equivalent to invoking it
>> >> > multiple times with one operand each.
>
>> >> I've also thought so at first, but then I've realized that error
>> >> printing is
>> >> needed for debugging. See here, I've implemented parallel rcNG:
>
>> >> http://nuclight.ipfw.ru/vadim/freebsd/rc_parallel.{8.3R,HEAD}.patch
>
>> > Hmm, a third implementation. Two others are
>> > https://github.com/buganini/rcexecr/ and
>> > https://github.com/kil/rcorder/.
>
>> > What appears missing in your implementation is a limit (greater than 1
>> > and less than the number of scripts) on how many scripts may be  
>> running
>> > at once. This is like make -jN and prevents thrashing.
>
>> Yes, I am aware about these two implementations and about this problem.  
>> My
>> version is not final, and was born after looking at these two. The  
>> problem
>> with them is that they pass too much to rcorder's shoulders - it is
>> converted to launcher, too much C code is touched, and result is still
>> very far from being flexible, while already being incompatible with
>> current rcNG scripts. In fact, that's something like systemd in Linux,
>> having all systemd's cons (e.g. inability for admin to fix problem with
>> boot, which was always possible with shell), but not even any of  
>> systemd's
>> pros.
>
>> I've tried to take different approach, making it The Right Way(tm):
>
>> * make it Unix way, every utility does it's small part and does it well
>> * minimal possible patch to C code: do things in higher-level lang (sh)
>> * being fully compatible with current rcNG system (ready to import now)
>
>> This is proof-of-concept, and it worked - you see, first part is not
>> parallel, rcorder's output ($1 of every line) is not changed, being
>> compatible to previous (being able to disable); rcorder's output can be
>> possibly used for other purposes, not just startup; any system
>> administrator can now implement any other scheme in pure shell. Yes,  
>> this
>> version is actually thrashing, but implementing any variant of 'make  
>> -jN'
>> is now very easy for everyone in 20 lines of shell, compared to  
>> hundredths
>> of C code lines graven-in-stone.
>
>> (It may be possible, given your comments below, that some minor changes  
>> to
>> sh(1) is more proper way, though. But that's probably theme for  
>> discussion
>> in more wider community (hackers@ ?).)
>
> I think it is fine to make rcorder a launcher, but this is a topic for
> freebsd-rc@freebsd.org.

I think this is inappropriate from the long-term maintenance point of view  
(look e.g. at number commits to rcorder, probably it scares people), but  
yes, this is a topic for freebsd-rc@freebsd.org. But, I think that an  
alternative to this still should exist (that this shall be doable via  
shell).

>> >> and printing errors about unknown jobs helped a lot during debug
>> >> (after  all, shell still prints many errors to stderr in
>> >> non-interactive scripts, and  that helps, too). I think the same will
>> >> be true for anyone utilizing job control (in scripts) with 'wait' for
>> >> other tasks, too (BTW, you'll find in the 2  links above [substitute
>> >> needed fbsd ver to URL] another one-line patch hunk about 'mflag' to
>> >> utilize %? in non-interactive scripts with job control).
>
>> > I need to think about this. I am not sure it is appropriate to tie
>> > together the command text and the separate process group for each job.
>
>> That's definitely a question for more competent people than me. I wasn't
>> sure about adding a new flag or making storing that info permanent for  
>> all
>> modes of operation (is that IIRC 200 bytes per job critical, though?),  
>> so
>> tried to do minimal change keeping that turned off until explicitly
>> requested.
>

Have you any thoughts about this now?..

>> > The latter effect of set -m is definitely inappropriate for rc scripts
>> > since it will prevent interrupting a script using ^C or ^\.
>
>> This probably is not so seriuos as rc.subr sets 'kill -sig $$' for both
>> INT and QUIT traps. This happens not under all rc_* settings, and making
>> separate process group is not necessary, though (and makes it a little
>> slower).
>
> That trap is in run_rc_script which would already be in the background,
> so keyboard ^C/^\ would do nothing.

Hmm. How it is supposed to handle interrupting parallel boot, then?

>> >> POSIX.2 does not mandate that 'wait' must be silent, only about
>> >> error-code behaviour. I've digged into code of 3 other shells,
>> >> bash, ksh and zsh, supporting multiple args - they don't do
>> >> suppressing error messages,  while bash's code comments explicitly
>> >> say about POSIX.2 in this place.
>
>> > In general, POSIX does not permit printing diagnostics (warnings) if  
>> the
>> > exit status is still 0. This is indicated by a STDERR section saying
>> > "The standard error shall be used only for diagnostic messages."
>> > and fully detailed in XCU 1.4 Utility Description Defaults.
>
>> That's strange for me as user and admin. Exit status is checked in
>> scripts, and if anyone don't want warnings, they can always be  
>> suppressed
>> with 2>/dev/null (BTW, that's what is already done with rcorder in
>> /etc/rc). It is better to have them and silence if you don't need, than
>> not to have when you need.

Given the facts/discussion below, may be just add (another) option for  
POSIX compatibility? Then admins/users will not be impaired by this  
"standard requirement" if they don't want to.

>> > Furthermore, the text "If one or more pid operands are specified that
>> > represent unknown process IDs, wait shall treat them as if they were
>> > known process IDs that exited with exit status 127." seems clear  
>> enough.
>> > That would apply to both numeric process IDs and job control job IDs.
>> > Implementations mostly follow this for the former but not for the
>> > latter. That bash (even in POSIX mode) and zsh (even in sh and ksh
>> > emulations) generate a diagnostic seems a bug to me. Note that zsh has
>> > many of this kind of bug (the developers don't consider them very
>> > important) but bash in POSIX mode should probably fix it.
>
>> > Note that in the sequence
>> >   :& wait 1 $!
>> > the exit status shall be 0, so no diagnostic shall be written about  
>> pid
>> > 1. And it makes no sense to write a diagnostic only if the unknown pid
>> > is last.
>
>> Hmm, "clear enough" ? For me it's not clear that this is about  
>> suppressing
>> warnings, too. I've read POSIX page and relevant places in other shells
>> several times, and decided to leave printing warnings as is.
>
> Another example:
> jilles@jaguar /home/jilles% bash -c 'break; echo $?'
> bash: line 0: break: only meaningful in a `for', `while', or `until'
> loop
> 0
> jilles@jaguar /home/jilles% bash --posix -c 'break; echo $?'
> 0
>
> Bash is suppressing the diagnostic in POSIX mode because the exit status
> remains 0.
>
> Because ksh variants don't generate a diagnostic for unknown numeric
> PIDs, I think that is the way to go. Unknown %jobspecs are somewhat
> different; these are usually best avoided in scripts anyway,
> particularly %?substring because of its uniqueness requirement.

OK, unknown numeric PIDs are not the my case (at least yet, I don't know  
if this would be changed to job PIDs instead of %jobspecs, but hope this  
is unlikely). Thanks for clarifying this difference. So, %?substring will  
still generate a warning, or we'll need to introduce somewhat like  
'--posix' option?

>> It seems to  me that primary focus is on the exit status, which is
>> almost always the  only relevant/usable thing in scripts. Think about
>> it from user's point of  view where you do grep(1) or find(1) over
>> /etc and see 'Permission  denied', but you get your request satisfied
>> from other files - so, for  you, both the program worked correct
>> (produced useful answer to you) AND  warnings also told you some
>> useful info.
>
> The cases with grep(1) and find(1) are different. The error affects the
> exit status here, even though the utility does not abort. This is a
> frequent pattern in POSIX.

Hmm, yes, they do. But the ultimate goal is [reducing] amount of work for  
admin and/or user, that's also the goal which lies behind the goals for  
which standards are created. Making debug more easy helps reaching this  
ultimate goal.

Personally I was happy to make a patch which touches very little lines of  
code and thus more simple to understand and maintain - getjob() was  
unmodified, no need to handle many cases and modes scattered throughout  
the code, that's usually worse for anyone who'll look to that code for  
first time.

>> Yes, it makes no sense to write a diagnostic only if the unknown pid is
>> last. Either about all, or about nothing. The grep/find analogy makes it
>> intuitive (for user) to see all diagnostic messages about (transient!)
>> failures while final result was still OK.
>
>> This grep analogy, however, reminds us about 'grep -q' thus suggesting
>> possible solution - control diagnostic printing based on current mode.  
>> At
>> least, user definitely wants warnings when shell is interactive. For
>> scripts... may be mflag for those who explicitly request it? Or, if not
>> -m, tie it with another problem with command text mentioned above, if  
>> some
>> other flag will be chosen (-b ? -T ? probably depends on what will be
>> chosen for proper implementing 'make -jN' for parallel rcNG).
>
>> >> So finally, given some experience for what it is needed for scripts,
>> >> I've
>> >> left getjob() untouched - and this also lead to simpler and cleaner
>> >> diff.
>
>> >> > Also, I would like to avoid calling setjmp().
>
>> >> Also, just curious, what is bad with setjmp() ? It is common in sh(1)
>> >> code, nice idiom with savehandler which is easy to write and
>> >> understand,  especially in this place, where the code is short.
>
>> > Compilers do not like setjmp(). Mainly because of this issue, the
>> > warning level is set low. If it is set higher, many warnings like the
>> > below appear:
>
>> > eval.c:1028: warning: variable 'argv' might be clobbered by 'longjmp'  
>> or
>> > 'vfork'
>
>> > I have checked the C standard and POSIX and made variables volatile
>> > where they say it is needed. Until now, I have refused to make more
>> > variables volatile where the standards do not require it but compilers
>> > complain anyway.
>
>> That's pretty strange. The setjmp/longjmp are required by standards, so
>> these are bugs in compilers, not in sh(1). I heard Clang people are  
>> doing
>> many things based on reports from FreeBSD, may be they will fix it?
>
> It looks like Clang has already fixed this. These wrong warnings do not
> appear when compiling with clang with WARNS=7.

Good news to here! So the setjmp() is now persona grata again? :)

>> > I am willing to make deeper changes to avoid adding more setjmp() or  
>> to
>> > remove existing setjmp().
>
>> But how exceptions will be handled, then? Personally I find this method,
>> as it is done in /bin/sh, very elegant for C code (previously I feared
>> setjmp() but in sh(1) it is very clear, now I'll use it in some mine
>> projects). It is sometimes even better and more understandable than
>> C++ exceptions, while being so simple: I looked at setvarsafe(), the  
>> most
>> refined example - waitcmd() could be another :-) The code of our /bin/sh
>> is much better (including this aspect) than three other shells I looked  
>> at
>> (except may be ksh in OpenBSD).
>
> The other method is to return a value from a function and check that.

This other method may be better and may be worse, it depends. In /bin/sh  
case it will be most probably worse (need to repeat the same checks again  
and again will lead to more bugs, if one places is forgotten, etc.).

>> >> > I will also implement "--" handling and rejection of unknown
>> >> > options in "wait" first.
>
>> >> Is it really useful? Several built-ins don't have options and don't
>> >> need to check for them. Seems like over-complication.
>
>> > I would like to have the possibility of adding options in future. As
>> > long as things that look like options cause the command to fail, it is
>> > not a problem not to support them, but if such things will be  
>> accepted,
>> > something needs to happen. I committed that change.
>
>> Then may be wait -q for interactive mode or wait -v for scripts will
>> handle printing warnings (different defaults for iflag/mflag/whatever) ?
>> Still seems like over-complication to me, however, but just "--" for
>> future (looked at your commit) is OK... until somebody will request
>> processing 'wait -1' like 'kill -INT -1' :-).
>
> I'm not thinking of any concrete new options yet.

OK, me is interested in solving some specific subset of problems, not the  
every possible options in the future :) Notice that not just rcNG, but  
other bg-jobs-handling problems, too: I've implemented once a script which  
did several 'make fetch' in ports at once, excluding by pattern (so 'make  
-j' could not be uused) - to control no more than N jobs I've started it  
as '$0 fetchdcookie' and then parsed 'ps ax | grep -c [f]etchdcookie' -  
ugly, but worked. More perfect wait/jobs primitives could help to solve  
such tasks more elegant and perfect way.

-- 
WBR, Vadim Goncharov
Comment 7 dfilter service freebsd_committer freebsd_triage 2013-06-05 20:08:30 UTC
Author: jilles
Date: Wed Jun  5 19:08:22 2013
New Revision: 251429
URL: http://svnweb.freebsd.org/changeset/base/251429

Log:
  sh: Allow multiple operands in wait builtin.
  
  This is only part of the PR; the behaviour for unknown/invalid pids/jobs
  remains unchanged (aborts the builtin with status 2).
  
  PR:		176916
  Submitted by:	Vadim Goncharov

Added:
  head/tools/regression/bin/sh/builtins/wait8.0   (contents, props changed)
Modified:
  head/bin/sh/jobs.c

Modified: head/bin/sh/jobs.c
==============================================================================
--- head/bin/sh/jobs.c	Wed Jun  5 18:49:28 2013	(r251428)
+++ head/bin/sh/jobs.c	Wed Jun  5 19:08:22 2013	(r251429)
@@ -95,6 +95,7 @@ static int ttyfd = -1;
 static void restartjob(struct job *);
 #endif
 static void freejob(struct job *);
+static int waitcmdloop(struct job *);
 static struct job *getjob(char *);
 pid_t getjobpgrp(char *);
 static pid_t dowait(int, struct job *);
@@ -459,15 +460,26 @@ int
 waitcmd(int argc __unused, char **argv __unused)
 {
 	struct job *job;
-	int status, retval;
-	struct job *jp;
+	int retval;
 
 	nextopt("");
-	if (*argptr != NULL) {
+	if (*argptr == NULL)
+		return (waitcmdloop(NULL));
+
+	do {
 		job = getjob(*argptr);
-	} else {
-		job = NULL;
-	}
+		retval = waitcmdloop(job);
+		argptr++;
+	} while (*argptr != NULL);
+
+	return (retval);
+}
+
+static int
+waitcmdloop(struct job *job)
+{
+	int status, retval;
+	struct job *jp;
 
 	/*
 	 * Loop until a process is terminated or stopped, or a SIGINT is

Added: head/tools/regression/bin/sh/builtins/wait8.0
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/tools/regression/bin/sh/builtins/wait8.0	Wed Jun  5 19:08:22 2013	(r251429)
@@ -0,0 +1,7 @@
+# $FreeBSD$
+
+exit 44 & p44=$!
+exit 45 & p45=$!
+exit 7 & p7=$!
+wait "$p44" "$p7" "$p45"
+[ "$?" = 45 ]
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 8 dfilter service freebsd_committer freebsd_triage 2013-06-05 20:41:01 UTC
Author: jilles
Date: Wed Jun  5 19:40:52 2013
New Revision: 251430
URL: http://svnweb.freebsd.org/changeset/base/251430

Log:
  sh: Return status 127 for unknown jobs in wait builtin.
  
  This is required by POSIX, at least for pids that are not known child
  processes.
  
  Other problems with job specifications still cause wait to abort with
  exit status 2.
  
  PR:		176916

Added:
  head/tools/regression/bin/sh/builtins/wait10.0   (contents, props changed)
  head/tools/regression/bin/sh/builtins/wait9.127   (contents, props changed)
Modified:
  head/bin/sh/jobs.c

Modified: head/bin/sh/jobs.c
==============================================================================
--- head/bin/sh/jobs.c	Wed Jun  5 19:08:22 2013	(r251429)
+++ head/bin/sh/jobs.c	Wed Jun  5 19:40:52 2013	(r251430)
@@ -96,6 +96,7 @@ static void restartjob(struct job *);
 #endif
 static void freejob(struct job *);
 static int waitcmdloop(struct job *);
+static struct job *getjob_nonotfound(char *);
 static struct job *getjob(char *);
 pid_t getjobpgrp(char *);
 static pid_t dowait(int, struct job *);
@@ -467,8 +468,11 @@ waitcmd(int argc __unused, char **argv _
 		return (waitcmdloop(NULL));
 
 	do {
-		job = getjob(*argptr);
-		retval = waitcmdloop(job);
+		job = getjob_nonotfound(*argptr);
+		if (job == NULL)
+			retval = 127;
+		else
+			retval = waitcmdloop(job);
 		argptr++;
 	} while (*argptr != NULL);
 
@@ -558,7 +562,7 @@ jobidcmd(int argc __unused, char **argv)
  */
 
 static struct job *
-getjob(char *name)
+getjob_nonotfound(char *name)
 {
 	int jobno;
 	struct job *found, *jp;
@@ -623,12 +627,22 @@ currentjob:	if ((jp = getcurjob(NULL)) =
 				return jp;
 		}
 	}
-	error("No such job: %s", name);
-	/*NOTREACHED*/
 	return NULL;
 }
 
 
+static struct job *
+getjob(char *name)
+{
+	struct job *jp;
+
+	jp = getjob_nonotfound(name);
+	if (jp == NULL)
+		error("No such job: %s", name);
+	return (jp);
+}
+
+
 pid_t
 getjobpgrp(char *name)
 {

Added: head/tools/regression/bin/sh/builtins/wait10.0
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/tools/regression/bin/sh/builtins/wait10.0	Wed Jun  5 19:40:52 2013	(r251430)
@@ -0,0 +1,5 @@
+# $FreeBSD$
+# Init cannot be a child of the shell.
+exit 49 & p49=$!
+wait 1 "$p49"
+[ "$?" = 49 ]

Added: head/tools/regression/bin/sh/builtins/wait9.127
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/tools/regression/bin/sh/builtins/wait9.127	Wed Jun  5 19:40:52 2013	(r251430)
@@ -0,0 +1,3 @@
+# $FreeBSD$
+# Init cannot be a child of the shell.
+wait 1
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 9 dfilter service freebsd_committer freebsd_triage 2013-06-05 20:54:36 UTC
Author: jilles
Date: Wed Jun  5 19:54:28 2013
New Revision: 251432
URL: http://svnweb.freebsd.org/changeset/base/251432

Log:
  sh(1): Document new features in wait builtin.
  
  PR:		176916

Modified:
  head/bin/sh/sh.1

Modified: head/bin/sh/sh.1
==============================================================================
--- head/bin/sh/sh.1	Wed Jun  5 19:46:39 2013	(r251431)
+++ head/bin/sh/sh.1	Wed Jun  5 19:54:28 2013	(r251432)
@@ -32,7 +32,7 @@
 .\"	from: @(#)sh.1	8.6 (Berkeley) 5/4/95
 .\" $FreeBSD$
 .\"
-.Dd May 3, 2013
+.Dd June 5, 2013
 .Dt SH 1
 .Os
 .Sh NAME
@@ -2642,12 +2642,17 @@ If the
 option is specified, the
 .Ar name
 arguments are treated as function names.
-.It Ic wait Op Ar job
-Wait for the specified
+.It Ic wait Op Ar job ...
+Wait for each specified
 .Ar job
 to complete and return the exit status of the last process in the
+last specified
 .Ar job .
-If the argument is omitted, wait for all jobs to complete
+If any
+.Ar job
+specified is unknown to the shell, it is treated as if it
+were a known job that exited with exit status 127.
+If no operands are given, wait for all jobs to complete
 and return an exit status of zero.
 .El
 .Ss Commandline Editing
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 10 Mark Linimon freebsd_committer freebsd_triage 2013-06-05 20:59:48 UTC
State Changed
From-To: open->patched

over to committer.
Comment 11 Jilles Tjoelker freebsd_committer freebsd_triage 2014-08-31 21:00:43 UTC
This was implemented in 10-current. No MFC to older branches is planned.