This isn't really a problem, since FreeBSD's sort command is more than fast enough for my purposes, but it seems worth reporting in case it exposes an undiscovered problem that might be easy to fix. All tests were run on the same machine, a ThinkCenter i5 with 8G RAM. NetBSD running under VirtualBox. NetBSD netbsd9.acadix bacon ~/Prog/Src/peak-classifier 435: (pkgsrc): time sort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed > /dev/null 1.409u 2.548s 0:03.95 99.7% 0+0k 0+0io 0pf+0w FreeBSD coral.acadix bacon ~/Prog/Src/peak-classifier 1005: time sort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed > /dev/null 24.574u 0.795s 0:25.37 99.9% 55+172k 0+0io 0pf+0w FreeBSD coral.acadix bacon ~/Prog/Src/peak-classifier 1006: time gsort --parallel=1 -n -k 1 -k 2 -k 3 pc-gff-stripped.bed > /dev/null 3.005u 0.149s 0:03.16 99.3% 152+177k 6+0io 3pf+0w I generated a simpler file using just column 2 from the bed file and ran a simple sort -n on it (no -k flags). Results were similar, so the performance difference is not limited to multicolumn sorting. Results are the same on 12.2 and 13.0.
Can you share the data that you're using for the comparison, or something with equivalent properties?
Created attachment 224652 [details] Column 2 from the original BED (TSV) file time sort -n test.txt > /dev/null 3.004u 0.063s 0:03.06 100.0% 55+172k 0+0io 0pf+0w time gsort --parallel 1 -n test.txt > /dev/null 0.628u 0.047s 0:00.67 98.5% 156+182k 0+0io 0pf+0w
You may be interested in the patch here which shows an optimization that speeds up sort by a significant amount, though not equivalently to the speed of gsort: https://reviews.freebsd.org/D30170. The flags --qsort, --mergesort may also be of interest, since they are faster than heapsort, the default algorithm used by sort.
(In reply to cyril from comment #3) The patch about cuts the time in half for my data. FreeBSD coral.acadix bacon ~/Prog/Src/peak-classifier 1018: time sort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed > /dev/null 12.361u 0.794s 0:13.16 99.9% 55+171k 0+0io 0pf+0w
(In reply to Jason W. Bacon from comment #4) It'd be interesting to know whether --mergesort gives a further improvement.
(In reply to Mark Johnston from comment #5) FreeBSD coral.acadix bacon ~/Prog/Src/peak-classifier 1004: !972 time sort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed > /dev/null 12.467u 0.906s 0:13.38 99.8% 55+171k 634+0io 0pf+0w FreeBSD coral.acadix bacon ~/Prog/Src/peak-classifier 1005: time sort --qsort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed > /dev/null 11.188u 0.811s 0:12.00 99.9% 55+171k 0+0io 0pf+0w FreeBSD coral.acadix bacon ~/Prog/Src/peak-classifier 1006: time sort --mergesort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed > /dev/null 8.271u 0.857s 0:09.13 99.8% 55+171k 0+0io 0pf+0w
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=71ec05a21257e159f40d54e26ad0011bb19b5134 commit 71ec05a21257e159f40d54e26ad0011bb19b5134 Author: Cyril Zhang <cyril@freebsdfoundation.org> AuthorDate: 2021-05-13 12:55:06 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2021-05-13 13:33:19 +0000 sort: Cache value of MB_CUR_MAX Every usage of MB_CUR_MAX results in a call to __mb_cur_max. This is inefficient and redundant. Caching the value of MB_CUR_MAX in a global variable removes these calls and speeds up the runtime of sort. For numeric sorting, runtime is almost halved in some tests. PR: 255551 PR: 255840 Reviewed by: markj MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30170 usr.bin/sort/bwstring.c | 54 ++++++++++++++++++++++++------------------------ usr.bin/sort/bwstring.h | 9 ++++---- usr.bin/sort/radixsort.c | 4 ++-- usr.bin/sort/sort.c | 6 +++++- usr.bin/sort/sort.h | 6 ++++++ 5 files changed, 45 insertions(+), 34 deletions(-)
We are looking at changing the default to mergesort. One other thing that will help a lot is to set LC_ALL=C in the sort(1) invocation. In this case sort(1) will use radix sort and will assume that each character occupies a single byte, which makes the comparison function much faster.
sh -c ./sort-test.sh + time sort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed 13.28 real 12.32 user 0.95 sys + time gsort '--parallel=1' -n -k 1 -k 2 -k 3 pc-gff-stripped.bed 3.05 real 2.97 user 0.08 sys + export 'LC_ALL=C' + time sort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed 7.64 real 7.30 user 0.33 sys + time sort --mergesort -n -k 1 -k 2 -k 3 pc-gff-stripped.bed 4.02 real 3.61 user 0.41 sys + time gsort '--parallel=1' -n -k 1 -k 2 -k 3 pc-gff-stripped.bed 2.40 real 2.29 user 0.11 sys
A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=df40dcbf7c794f5448c13e23670466658a620933 commit df40dcbf7c794f5448c13e23670466658a620933 Author: Cyril Zhang <cyril@freebsdfoundation.org> AuthorDate: 2021-05-13 12:55:06 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2021-05-20 13:15:43 +0000 sort: Cache value of MB_CUR_MAX Every usage of MB_CUR_MAX results in a call to __mb_cur_max. This is inefficient and redundant. Caching the value of MB_CUR_MAX in a global variable removes these calls and speeds up the runtime of sort. For numeric sorting, runtime is almost halved in some tests. PR: 255551 PR: 255840 Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30170 (cherry picked from commit 71ec05a21257e159f40d54e26ad0011bb19b5134) usr.bin/sort/bwstring.c | 54 ++++++++++++++++++++++++------------------------ usr.bin/sort/bwstring.h | 9 ++++---- usr.bin/sort/radixsort.c | 4 ++-- usr.bin/sort/sort.c | 6 +++++- usr.bin/sort/sort.h | 6 ++++++ 5 files changed, 45 insertions(+), 34 deletions(-)
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=68d3790ba0bce162f9fcaed09cfecd9adeab3943 commit 68d3790ba0bce162f9fcaed09cfecd9adeab3943 Author: Cyril Zhang <cyril@freebsdfoundation.org> AuthorDate: 2021-06-17 17:40:16 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2021-06-17 17:53:03 +0000 sort: Change default algorithm to mergesort This results in a significant improvement in the runtime of sort(1) when radix sort cannot be used. This comes at the expense of increased memory usage, but this is small relative to sort's overall memory usage. PR: 255551 Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30319 usr.bin/sort/file.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
I'm going to close this one now that PR 255840 is fixed as well. On main we're still not as fast as GNU sort, but the difference is much smaller: markj@nuc> time sort -n test.txt > /dev/null real 0m0.925s user 0m0.875s sys 0m0.050s markj@nuc> time LC_ALL=C sort -n test.txt > /dev/null real 0m0.782s user 0m0.738s sys 0m0.033s markj@nuc> time gsort --parallel 1 -n test.txt > /dev/null real 0m0.423s user 0m0.380s sys 0m0.041s Per > This isn't really a problem, since FreeBSD's sort command is more than fast enough for my purposes, but it seems worth reporting in case it exposes an undiscovered problem that might be easy to fix. I think this PR has achieved its goals. :) Please re-open if you disagree or have suggestions for further improvements.
That's a huge improvement, thanks! Even less motivation to bother specifying gsort now.