Created attachment 171553
Be more precise in the manpage.

The manpage of /usr/bin/comm currently says that file1 and file2 "should be sorted". This gives the impression that comm will work as expected if the files are not sorted, perhaps with a slight performance penalty. In fact, unsorted files do not produce the expected output.

Example:

    % cat file1
    line1
    line3
    line2
    % cat file2
    line1
    line2
    line3
    % comm -12 file1 file2
    line1
    line3

The attached patch changes the language to "have to be sorted".
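The output in the example is what a single-pass merge comparison would produce. The following is a minimal sketch, assuming comm walks both files once and advances whichever side holds the smaller line; it is not taken from the FreeBSD source. The function name `comm_common` is hypothetical, and Python's plain string comparison stands in for the locale collation the real utility uses.

```python
def comm_common(lines1, lines2):
    """Return the lines a merge-style walk reports as common to both
    inputs, roughly what `comm -12` prints (assumed behaviour)."""
    common = []
    i = j = 0
    while i < len(lines1) and j < len(lines2):
        if lines1[i] == lines2[j]:
            # Same line on both sides: goes to column 3.
            common.append(lines1[i])
            i += 1
            j += 1
        elif lines1[i] < lines2[j]:
            # Smaller line is in file1 only (column 1); skip it.
            i += 1
        else:
            # Smaller line is in file2 only (column 2); skip it.
            j += 1
    return common


file1 = ["line1", "line3", "line2"]   # unsorted, as in the report
file2 = ["line1", "line2", "line3"]

print(comm_common(file1, file2))          # ['line1', 'line3']
print(comm_common(sorted(file1), file2))  # ['line1', 'line2', 'line3']
```

With the unsorted input, "line2" in file2 is skipped while file1 is still positioned at "line3", so it is never matched. Sorting file1 first gives the full intersection.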
The POSIX man page for comm uses "should" and additionally states:

    If the lines in both files are not ordered according to the collating sequence of the current locale, the results are unspecified.

The FreeBSD man page states:

    The comm utility assumes that the files are lexically sorted; all characters participate in line comparisons.

I think it is generally well understood that in this context "should" indicates a requirement of conformance. The additional statement reinforces that requirement. There is no need to replace "should".
Agreed.

Even something like

    -which should be
    +which must be

is still ambiguous as to what happens if lines are unsorted. Additionally, we do mention that comm(1) assumes lines are sorted.

Appreciate the contribution, but I'm going to close this.