Bug 219933 - bsddiff: Implement --suppress-common-lines // -y/--side-by-side // -W
Summary: bsddiff: Implement --suppress-common-lines // -y/--side-by-side // -W
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Baptiste Daroussin
URL:
Keywords: feature, needs-patch
Depends on:
Blocks:
 
Reported: 2017-06-12 04:05 UTC by Enji Cooper
Modified: 2020-03-04 11:29 UTC (History)
4 users (show)

See Also:


Attachments
Patch for diff.c diffreg.c diff.h diff.1 (14.86 KB, patch)
2020-02-07 09:59 UTC, fehmi noyan isi
no flags Details | Diff
Patch for /usr/test/usr.bin/diff/diff_test (313 bytes, patch)
2020-02-07 10:00 UTC, fehmi noyan isi
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Enji Cooper freebsd_committer 2017-06-12 04:05:35 UTC
I use `--suppress-common-lines -y` frequently to compare two files with diff. This isn't currently implemented in bsd diff:

$ gdiff -y --suppress-common-lines A B
A                                                             | D
C                                                             | E
$ diff -y --suppress-common-lines A B
diff: invalid option -- y
usage: diff [-abdilpTtw] [-c | -e | -f | -n | -q | -u] [--ignore-case]
            [--no-ignore-case] [--normal] [--strip-trailing-cr] [--tabsize]
            [-I pattern] [-L label] file1 file2
       diff [-abdilpTtw] [-I pattern] [-L label] [--ignore-case]
            [--no-ignore-case] [--normal] [--strip-trailing-cr] [--tabsize]
            -C number file1 file2
       diff [-abdiltw] [-I pattern] [--ignore-case] [--no-ignore-case]
            [--normal] [--strip-trailing-cr] [--tabsize] -D string file1 file2
       diff [-abdilpTtw] [-I pattern] [-L label] [--ignore-case]
            [--no-ignore-case] [--normal] [--tabsize] [--strip-trailing-cr]
            -U number file1 file2
       diff [-abdilNPprsTtw] [-c | -e | -f | -n | -q | -u] [--ignore-case]
            [--no-ignore-case] [--normal] [--tabsize] [-I pattern] [-L label]
            [-S name] [-X file] [-x pattern] dir1 dir2
Comment 1 Enji Cooper freebsd_committer 2017-06-12 04:09:39 UTC
(In reply to Ngie Cooper from comment #0)

-W isn't implemented either.
Comment 2 commit-hook freebsd_committer 2017-06-12 05:12:04 UTC
A commit references this bug:

Author: ngie
Date: Mon Jun 12 05:11:44 UTC 2017
New revision: 319847
URL: https://svnweb.freebsd.org/changeset/base/319847

Log:
  Add some testcases for `diff --side-by-side` support

  These are were created proactively, in anticipation of the support being
  fully implemented sometime in the future.

  The tests currently fail on ^/head@r319845, however. Expect them to fail.

  PR:		219933
  Tested with:	gdiff

Changes:
  head/usr.bin/diff/tests/diff_test.sh
Comment 3 Enji Cooper freebsd_committer 2017-06-12 20:19:10 UTC
I think bapt@ might be a better owner -- I just wrote some tests for the feature, once it's been implemented.
Comment 4 commit-hook freebsd_committer 2018-03-23 21:06:20 UTC
A commit references this bug:

Author: ian
Date: Fri Mar 23 21:06:07 UTC 2018
New revision: 331465
URL: https://svnweb.freebsd.org/changeset/base/331465

Log:
  MFC r315051, r315101, r315103, r315107, r315180, r315197, r315293, r315319,
      r315590, r315649, r315726, r315743, r315746-r315747, r315779, r315985,
      r316002, r316639, r316959, r317187, r317194, r317205-r317207, r317381,
      r319489, r319847, r321076-r321079, r321227, r326822

  Add the BSD-licensed diff from OpenBSD, which is optionally built and
  installed when WITHOUT_GNU_DIFF is set.

  r315051:
  Import diff from OpenBSD

  Some of the modifications from the previous summer of code has been integrated
  Modification for compatibility with GNU diff output has been added

  Main difference with OpenBSD:
  Implement multiple GNU diff options:
  * --ignore-file-name-case
  * --no-ignore-file-name-case
  * --normal
  * --tabsize
  * --strip-trailing-cr
  Make diff -p compatible with GNU diff
  Implement diff -l
  Make diff -r compatible with GNU diff

  Capsicumize diffing 2 regular files
  Add a simple test suite

  Approved by:	AsiaBSDcon devsummit
  Obtained from:	OpenBSD, GSoC
  Relnotes:	yes

  r315101:
  Fix wrong date in diff(1)

  Reported by:	rgrimes

  r315103:
  Implement a stub --horizon-lines=NUM for compatibility with GNU diff3

  some options of GNU diff3 would call diff with --horizon-lines, rcs is depending
  on that.

  Reported by:	antoine

  r315107:
  Fix building with recent gcc

  Reported by:	lwhsu, ngie

  r315180:
  Readd codes that creates a tmp file for diffing stdout or devices

  r315197:
  Do not die if cap_rights_limit reports ENOSYS

  Reported by:	mmel

  r315293:
  Integrate contrib/netbsd-tests/usr.bin/diff/t_diff.sh in as
  .../usr.bin/diff/diff_test

  Some minor adjustment needed to be done for :same as it currently
  has the test script hardcoded into the test, instead of using an
  idiom like $(dirname $0)

  Sponsored by:	Dell EMC Isilon

  r315319:
  diff(1): sort long options under -D example in SYNOPSYS

  Sponsored by:	Dell EMC Isilon

  r315590:
  diff(1): add --strip-trailing-cr to last example in the SYNOPSIS

  This syncs the last example in the SYNOPSIS with the other examples.

  Reviewed by:	bapt
  Sponsored by:	Dell EMC Isilon
  Differential Revision:	D10017

  r315649:
  Cache tzdata when running under capsicum

  PR:		217957
  Reported by:	tobik@

  r315726:
  diff(1): fix SYNOPSIS section noting non-existent option, --no-ignore-case

  `--no-ignore-case` should be `--no-ignore-file-name-case` per code for
  compatibility with [g]diff(1).

  Sponsored by:	Dell EMC Isilon

  r315743:
  Use MAX and MIN macros from sys/param.h

  r315746:
  Use strndup(3) instead of malloc + memcpy

  r315747:
  Use MIN macros from sys/param.h

  r315779:
  diff(1): document remaining long options

  While here, try and tie together some of the short options with
  their long option equivalents, where possible.

  Sponsored by:	Dell EMC Isilon

  r315985:
  diff: Fix mtime of file1 in -u/-c header line.

  PR:		218018
  Reviewed by:	bapt
  Differential Revision:	https://reviews.freebsd.org/D10140

  r316002:
  diff: Show nanoseconds in -u/-c header line.

  Show nanoseconds in the -u/-c header line.

  The present portability conditionals cannot handle the POSIX standard
  st_mtim, so remove them and unconditionally use st_mtim.

  PR:		218018
  Reported by:	jbeich
  Reviewed by:	bapt
  Differential Revision:	https://reviews.freebsd.org/D10145

  r316639:
  add a stub --speed-large-files for compatibility with GNU diff

  There is no intention to implement it, but lots of scripts/tools using
  diff(1) passes GNU diff option

  r316959:
  Clean up headers declaration

  r317187:
  Add a regression test for diff -D

  r317194:
  Implement a basic --changed-group-format

  etcupdate(8) requires that option, while GNU diff supports many more variation
  of that options, their behaviour beside the simple verion implemented here are
  quite inconsistent as such I do not plan to implement those.

  The only special keyword supported by this implementation are: %< and %>
  %= is not implemented as the documentation of GNU diff says: common lines, but
  it actually when tested print the changes from the first file

  r317205:
  Document all long options

  r317206:
  Update the TODO list to reflect what has been changed

  r317207:
  Cross reference pr(1) which diff might call with -l option

  r317381:
  Fix the following warning from gcc 4.2 in usr.bin/diff:

  usr.bin/diff/diffreg.c: In function 'change':
  usr.bin/diff/diffreg.c:1085: warning: 'i' may be used uninitialized in this function

  This version of gcc is not smart enough to see that 'i' cannot actually
  be used unitialized.  However, the variable is confusingly re-used, so
  it is better to give it another name, and clearly initialize it before
  attempting to use it.

  Reviewed by:	bapt
  Differential Revision: https://reviews.freebsd.org/D10484

  r319489:
  Add -H as an alias for --speed-large-file to match GNU diff.

  This is undocumented to match GNU diff where -H is also undocumented.
  Some existing software (such as kompare) uses this option by default.

  Reviewed by:	emaste, rpokala
  Differential Revision:	https://reviews.freebsd.org/D11022

  r319847:
  Add some testcases for `diff --side-by-side` support

  These are were created proactively, in anticipation of the support being
  fully implemented sometime in the future.

  The tests currently fail on ^/head@r319845, however. Expect them to fail.

  PR:		219933
  Tested with:	gdiff

  r321076:
  Don't emit "diff: diff <options> arguments" when diffing files if
  -q is specified.

  This improves compatibility with GNU diff.

  Found by accident with `diff -Nrq /usr/tests /usr/tests.new | grep Kyuafile`.

  Relnotes:	yes

  r321077:
  Add some tests for brief (--brief/-q) format

  MFC with:	r321076

  r321078:
  Fix exit status with -rq when there is a file in one directory but not another,
  i.e., when print_only is called.

  Prior to this change, -rq was always returning 0. After this change it will
  return 1 if there is a difference between two directories.

  This fixes compatibility with GNU diff and unbreaks backwards compatibility
  expectations.

  Found when trying to extend diff_test:brief_format_test.

  MFC with:	r321076, r321077

  r321079:
  Add tests that exercise -q, like -rq and add tests that test -q like -Nrq

  MFC with:	r321076, r321077, r321078

  r321227:
  Use more flexible expression for replacing t_diff in
  contrib/netbsd-tests/usr.bin/diff/t_diff.sh with the name of the script via
  `basename $0`.

  This was a change I forgot to port over from
  ^/head/gnu/usr.bin/diff/tests/Makefile@r272787.

  r326822:
  Replace homemade equivalent of tolower(3) by towlower(3)

  This will help in the futur making diff -i works with multibyte

  Relnotes:	Yes

Changes:
_U  stable/11/
  stable/11/usr.bin/Makefile
  stable/11/usr.bin/diff/
  stable/11/usr.bin/diff/TODO
  stable/11/usr.bin/diff/diff.1
  stable/11/usr.bin/diff/diff.c
  stable/11/usr.bin/diff/diff.h
  stable/11/usr.bin/diff/diffdir.c
  stable/11/usr.bin/diff/diffreg.c
  stable/11/usr.bin/diff/tests/Makefile
  stable/11/usr.bin/diff/tests/diff.sh
  stable/11/usr.bin/diff/tests/diff_test.sh
  stable/11/usr.bin/diff/tests/group-format.out
  stable/11/usr.bin/diff/tests/header.out
  stable/11/usr.bin/diff/tests/header_ns.out
  stable/11/usr.bin/diff/tests/ifdef.out
Comment 5 fehmi noyan isi 2020-01-20 11:09:44 UTC
While testing my patch for diff(1) [1], I noticed this one and started working on this two days ago. I had some progress so far and now working on -W now (I can pass the parameter but need to figure out a way to calculate the proper loop counter for printing characters). Once it is sorted, I can submit a patch I think (-y / --side-by-side and --suppres-common-lines seem to be working).

root@patch:~/diff_patch2 # cat file1.txt
this is a test file
name of this file is file 1
some lines are the same
more lines are the same
whereas some are different
here is a random number : 388163
that's all from me
thanks
root@patch:~/diff_patch2 # cat file2.txt
this is a test file
name of this file is file 2 
some lines are the same
more lines are the same
whereas some are different
here is a random number : 6354 
that's all from me
ta
root@patch:~/diff_patch2 # diff file1.txt file2.txt -y --suppress-common-lines  
name of this file is file 1 | name of this file is file 2 
here is a random number : 388163 | here is a random number : 6354 
thanks | ta
root@patch:~/diff_patch2 # diff file1.txt file2.txt -y 
this is a test file   this is a test file
name of this file is file 1 | name of this file is file 2 
some lines are the same   some lines are the same
more lines are the same   more lines are the same
whereas some are different   whereas some are different
here is a random number : 388163 | here is a random number : 6354 
that's all from me   that's all from me
thanks | ta
root@patch:~/diff_patch2 # 

I then need to modify usage() and diff.1 accordingly. May need some style(9) changes to the code as well.
Comment 6 fehmi noyan isi 2020-01-20 11:10:25 UTC
[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=242828
Comment 7 fehmi noyan isi 2020-01-26 20:53:28 UTC
well...as far as the number of space characters are concerned, are we looking to get exactly the same output as the GNU diff when -W is used?

This may take  while to *replicate* but I got -W working already (in its own way).
Comment 8 fehmi noyan isi 2020-02-07 09:59:00 UTC
Created attachment 211439 [details]
Patch for diff.c diffreg.c diff.h diff.1

Patch to add --side-by-side // -W // --suppress-common-lines options to diff(1). For atf test cases to succeed, one of the tests needs updating. The patch for this is also attached

root@patch:~/patch/diff/2 # cat A
same1
differ1


differ3
root@patch:~/patch/diff/2 # cat B
same1  
differ2
differ4
root@patch:~/patch/diff/2 # diff A B -y -W 40
same1              |    same1  
differ1            |    differ2
                   |    differ4
                   <    
differ3            <    
root@patch:~/patch/diff/2 # cd /usr/tests
root@patch:/usr/tests # kyua test -k Kyuafile usr.bin/diff/
usr.bin/diff/diff_test:Bflag  ->  passed  [0.047s]
usr.bin/diff/diff_test:b230049  ->  passed  [0.016s]
usr.bin/diff/diff_test:brief_format  ->  passed  [0.045s]
usr.bin/diff/diff_test:group_format  ->  passed  [0.017s]
usr.bin/diff/diff_test:header  ->  passed  [0.020s]
usr.bin/diff/diff_test:header_ns  ->  passed  [0.020s]
usr.bin/diff/diff_test:ifdef  ->  passed  [0.021s]
usr.bin/diff/diff_test:side_by_side  ->  passed  [0.031s]
usr.bin/diff/diff_test:simple  ->  passed  [0.053s]
usr.bin/diff/diff_test:unified  ->  passed  [0.025s]
usr.bin/diff/netbsd_diff_test:mallocv  ->  passed  [0.016s]
usr.bin/diff/netbsd_diff_test:nomallocv  ->  passed  [0.020s]
usr.bin/diff/netbsd_diff_test:same  ->  passed  [0.021s]

Results file id is usr_tests.20200207-225114-281378
Results saved to /root/.kyua/store/results.usr_tests.20200207-225114-281378.db

13/13 passed (0 failed)
root@patch:/usr/tests #
Comment 9 fehmi noyan isi 2020-02-07 10:00:44 UTC
Created attachment 211440 [details]
Patch for /usr/test/usr.bin/diff/diff_test

This patch is required to update the atf test case
Comment 10 fehmi noyan isi 2020-02-07 10:07:46 UTC
I noticed a couple of bugs while owrkign on this. I will create new bug reports and work on them.

- We probably should print a message and usage() when two formatting options are used together. Instead, diff(1) carries on picks one of the formatting options (this is not random, but needs a bit of debugging)

- On another machine, I tried to use --tabsize but getting SIGSEGV
fnoyanisi@beastie:~ % diff A B --tabsize 4
Segmentation fault (core dumped)
fnoyanisi@beastie:~ % uname -a
FreeBSD beastie 12.1-RELEASE-p2 FreeBSD 12.1-RELEASE-p2 GENERIC  amd64
fnoyanisi@beastie:~ % 

I will have a look and create a bug report if needed
Comment 11 commit-hook freebsd_committer 2020-02-07 10:18:14 UTC
A commit references this bug:

Author: bapt
Date: Fri Feb  7 10:17:14 UTC 2020
New revision: 357648
URL: https://svnweb.freebsd.org/changeset/base/357648

Log:
  diff: implement -y (--side-by-side) along with -W and --suppress-common-lines

  PR:		219933
  Submitted by:	fehmi noyan isi <fnoyanisi@yahoo.com>
  MFC after:	3 weeks

Changes:
  head/usr.bin/diff/diff.1
  head/usr.bin/diff/diff.c
  head/usr.bin/diff/diff.h
  head/usr.bin/diff/diffreg.c
  head/usr.bin/diff/tests/diff_test.sh
Comment 12 Baptiste Daroussin freebsd_committer 2020-02-07 10:20:14 UTC
Thank you very much for your contribution!! I have tested and added it to the tree.

Don't hesitate to ping me if needed on further diff or diff3 related contributions!

Looking forward to merge more patches!
Comment 13 fehmi noyan isi 2020-02-08 09:28:48 UTC
Thanks Baptiste...

I created two bug reports for the issues I mentioned and working on the patches for them.

diff(1) should print the usage information when two different formatting options are used at the same time
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243975

diff(1) generates core dump when --tabsize is used
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243974
Comment 14 commit-hook freebsd_committer 2020-03-04 11:29:19 UTC
A commit references this bug:

Author: bapt
Date: Wed Mar  4 11:28:50 UTC 2020
New revision: 358610
URL: https://svnweb.freebsd.org/changeset/base/358610

Log:
  MFC r357648:

  diff: implement -y (--side-by-side) along with -W and --suppress-common-lines

  PR:		219933
  Submitted by:	fehmi noyan isi <fnoyanisi@yahoo.com>

Changes:
_U  stable/12/
  stable/12/usr.bin/diff/diff.1
  stable/12/usr.bin/diff/diff.c
  stable/12/usr.bin/diff/diff.h
  stable/12/usr.bin/diff/diffreg.c
  stable/12/usr.bin/diff/tests/diff_test.sh