Bug 15458

Summary: sort(1) doesn't sort correctly in some cases
Product: Base System
Reporter: Rudolf Čejka <cejkar>
Component: bin
Assignee: Gabor Kovesdan <gabor>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.0-CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description  Flags
file.diff    none

Description Rudolf Čejka 1999-12-13 13:10:02 UTC
Sort(1) doesn't work in some cases for some locales. In cs_CZ.ISO_8859-2
(it will be committed shortly; a similar problem may be visible with es_ES)
there is this collation definition:

	(H,h);\
	(CH,Ch,ch);\
	(I,i);\

So sort should order "h" and "ch" as "h", "ch". But it sorts
these two words incorrectly as "ch", "h". If I sort, for
example, "ha" and "ch", they are sorted correctly: "ha", "ch".

The problem is in an "optimization": strcoll() is called only on
prefixes of the two strings, each cut to the length of the shorter
string. This cannot work for languages in which a collating element
can be longer than one character.
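
The effect can be reproduced with a toy model. The sketch below is not
the code from sort.c and does not use the real cs_CZ table: it assumes a
hypothetical collation of the lowercase ASCII letters with "ch" added as
one collating element between "h" and "i", and it assumes that the
shortcut breaks ties by length. All names in it are invented for
illustration.

```python
import string

# Hypothetical toy collation (not the real cs_CZ table): lowercase ASCII
# letters in alphabetical order, with the digraph "ch" inserted as a
# single collating element between "h" and "i".
ELEMENTS = list(string.ascii_lowercase)
ELEMENTS.insert(ELEMENTS.index("i"), "ch")
WEIGHT = {elem: w for w, elem in enumerate(ELEMENTS)}


def collation_key(s):
    """Split s into collating elements and return their weights."""
    key = []
    i = 0
    while i < len(s):
        if s.startswith("ch", i):
            key.append(WEIGHT["ch"])
            i += 2
        else:
            key.append(WEIGHT[s[i]])
            i += 1
    return key


def cmp(a, b):
    """Three-way comparison returning -1, 0 or 1."""
    return (a > b) - (a < b)


def full_compare(a, b):
    """Collate the complete strings."""
    return cmp(collation_key(a), collation_key(b))


def truncated_compare(a, b):
    """Model of the flawed shortcut: collate only the first
    min(len(a), len(b)) characters, then break ties by length
    (the tie-break is an assumption of this sketch)."""
    n = min(len(a), len(b))
    result = cmp(collation_key(a[:n]), collation_key(b[:n]))
    return result if result != 0 else cmp(len(a), len(b))
```

With these definitions, truncated_compare("h", "ch") compares "h"
against the bare "c" and puts "ch" first, while full_compare("h", "ch")
sees the digraph and puts "h" first. For "ha" and "ch" both functions
agree, because the two-character prefix of "ch" still contains the
whole digraph.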

Fix: Here is my patch for /usr/src/gnu/usr.bin/sort/sort.c:
Comment 1 Andrey A. Chernov freebsd_committer freebsd_triage 1999-12-22 01:06:47 UTC
On Mon, Dec 13, 1999 at 02:01:02PM +0100, cejkar@dcse.fee.vutbr.cz wrote:
> Sort(1) doesn't work in some cases for some locales. In cs_CZ.ISO_8859-2
> (it will be committed shortly; a similar problem may be visible with es_ES)
> there is this collation definition:
> 
> 	(H,h);\
> 	(CH,Ch,ch);\
> 	(I,i);\

> Here is my patch for /usr/src/gnu/usr.bin/sort/sort.c:

This is a general problem in GNU sort, which compares strings character by
character. Your patch does not help if, for example, the ignore-case or
skip-blanks flags are given. A correct patch requires a big redesign of
sort. Try to contact the GNU sort maintainers first and ask them to fix
this bug in future sort versions.
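
One way to read this remark is sketched below. It is a toy model, not
the sort.c code: it assumes hypothetical weights in which "ch" is a
single element between "h" and "i", and it compares a per-character
ignore-case comparison with one that folds the whole key first. All
function names are invented for illustration.

```python
from functools import cmp_to_key


def weights(s):
    """Toy collation key: one weight per collating element, with the
    digraph "ch" placed between "h" and "i" (hypothetical table)."""
    out = []
    i = 0
    while i < len(s):
        if s[i:i + 2] == "ch":
            out.append(ord("h") + 0.5)  # sorts after "h", before "i"
            i += 2
        else:
            out.append(float(ord(s[i])))
            i += 1
    return out


def per_char_ignore_case(a, b):
    """Fold and compare one character at a time, so a digraph is
    never seen as a unit."""
    for x, y in zip(a, b):
        wx, wy = weights(x.lower()), weights(y.lower())
        if wx != wy:
            return -1 if wx < wy else 1
    return (len(a) > len(b)) - (len(a) < len(b))


def whole_key_ignore_case(s):
    """Fold the complete key first, then collate it as a whole."""
    return weights(s.lower())
```

Sorting ["Ch", "h"] with cmp_to_key(per_char_ignore_case) keeps "Ch"
first, because only "c" and "h" are ever compared. Sorting with
key=whole_key_ignore_case puts "h" first, which is the order the
collation definition asks for.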

-- 
Andrey A. Chernov
http://nagual.pp.ru/~ache/
MTH/SH/HE S-- W-- N+ PEC>+ D A a++ C G>+ QH+(++) 666+>++ Y
Comment 2 cejkar 1999-12-22 14:18:22 UTC
Andrey A. Chernov wrote (1999/12/21):
> This is a general problem in GNU sort, which compares strings character by
> character. Your patch does not help if, for example, the ignore-case or
> skip-blanks flags are given.

On this point you are right.

> A correct patch requires a big redesign of sort. Try to contact the GNU
> sort maintainers first and ask them to fix this bug in future sort versions.

We should not contact the GNU sort maintainers, because this is a
FreeBSD-specific problem: our source tree contains a very old, patched
sort-1.14, and they already have sort-2.0. And sort-2.0 works much better
and doesn't have this problem.

So the best solution would be to import sort-2.0 from textutils-2.0.
I have tried this and it seems to work: we have to configure textutils-2.0
with "configure --with-catgets" and copy out these files: COPYING,
intl/cat-compat.c, po/cat-id-tbl.c, lib/closeout.[ch], config.h,
lib/error.[ch], lib/getopt.[ch], lib/getopt1.c, lib/hard-locale.[ch],
intl/libgettext.h, intl/libintl.h, lib/long-options.[ch], lib/memcoll.[ch],
man/sort.1, src/sort.c, src/sys2.h, src/system.h, lib/version-etc.[ch],
lib/xalloc.h and lib/xmalloc.c. After this, in the Makefile we have to list
all *.c files as SRCS and add -DLOCALEDIR=\"/usr/share/nls\", and it works.

But I expect other problems to show up there again ;-)

-- 
Rudolf Cejka   (cejkar@dcse.fee.vutbr.cz;  http://www.fee.vutbr.cz/~cejkar)
Brno University of Technology, Faculty of El. Engineering and Comp. Science
Bozetechova 2, 612 66  Brno, Czech Republic
Comment 3 Andrey A. Chernov freebsd_committer freebsd_triage 1999-12-22 20:57:30 UTC
State Changed
From-To: open->analyzed

I agree that we need to switch to sort-2.0.
Comment 4 Andrey A. Chernov freebsd_committer freebsd_triage 2002-06-08 21:05:01 UTC
State Changed
From-To: analyzed->feedback

We switched to the latest GNU sort in -current. Does the problem in this PR
still exist with it?
Comment 5 Andrey A. Chernov freebsd_committer freebsd_triage 2002-06-10 12:45:02 UTC
State Changed
From-To: feedback->patched

Problem fixed in -current by upgrading to the new GNU sort.
Comment 6 le freebsd_committer freebsd_triage 2004-07-22 14:25:43 UTC
State Changed
From-To: patched->closed

As the problem seems to be fixed long ago, close this PR.
Comment 7 le freebsd_committer freebsd_triage 2004-07-22 14:38:47 UTC
State Changed
From-To: closed->patched

Re-open this PR as submitter says problem still exists in -stable.
Comment 8 Gabor Kovesdan freebsd_committer freebsd_triage 2007-03-16 21:49:27 UTC
State Changed
From-To: patched->feedback

Dear Submitter, 

could you check if it's still an issue on a recent release, please? 

Thanks in advance, 
Gabor 


Comment 9 Gabor Kovesdan freebsd_committer freebsd_triage 2007-03-16 21:49:27 UTC
Responsible Changed
From-To: freebsd-bugs->gabor

Track.
Comment 10 Gabor Kovesdan freebsd_committer freebsd_triage 2007-03-17 21:06:36 UTC
State Changed
From-To: feedback->closed

Submitter agreed that this can be closed now.