Bug 267032 - diff(1) -U and -C with very large number of context may produce incorrect line indicator
Summary: diff(1) -U and -C with very large number of context may produce incorrect lin...
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 13.0-RELEASE
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-14 00:40 UTC by Yasuhito FUTATSUKI
Modified: 2022-10-17 13:32 UTC (History)
3 users (show)

See Also:


Attachments
proposed fix for the integer overflows. (2.10 KB, patch)
2022-10-15 09:59 UTC, Daniel Tameling
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yasuhito FUTATSUKI 2022-10-14 00:40:08 UTC
diff(1) can accept number of context lines up to INT_MAX-1, however, it can cause signed integer overflow while calculating line number in line indicator.

e.g. on FreeBSD/amd64,

$ /usr/bin/diff -U2147483645 /usr/src/usr.bin/diff/diff.h /usr/src/usr.bin/diff/diff.c | /usr/bin/grep '^@@'
@@ -1,117 +1,659 @@
@@ --2147483646,0 +-2147483647,0 @@
@@ --2147483634,0 +-2147483643,0 @@
@@ --2147483622,0 +-2147483635,0 @@
@@ --2147483619,0 +-2147483631,0 @@
@@ --2147483616,0 +-2147483627,0 @@
@@ --2147483613,0 +-2147483625,0 @@
@@ --2147483598,0 +-2147483615,0 @@
@@ --2147483596,0 +-2147483612,0 @@
@@ --2147483595,0 +-2147483597,0 @@
@@ --2147483588,0 +-2147483581,0 @@
@@ --2147483574,0 +-2147483535,0 @@
@@ --2147483573,0 +-2147483167,0 @@
@@ --2147483571,0 +-2147483165,0 @@
@@ --2147483562,0 +-2147483158,0 @@
@@ --2147483561,0 +-2147483142,0 @@
@@ --2147483559,0 +-2147483140,0 @@
@@ --2147483555,0 +-2147483135,0 @@
@@ --2147483550,0 +-2147483129,0 @@
@@ --2147483538,0 +-2147483124,0 @@
@@ --2147483534,0 +-2147482992,0 @@

I tried it on FreeBSD/amd64 13.0-RELEASE, with diff built from source brought from commit 3736b2dd327050d2e6c925964b210eccbaac51ab in main branch.
Comment 1 Daniel Tameling 2022-10-15 09:54:51 UTC
I had a look at this and there are two kinds of overflows that happen.

The first is for the line number. Here the code does something like

upd = MIN(len[1], context_vec_ptr->d + diff_context);

where the addition can overflow. The fix is to just use len[1] if that would happen.

The second overflow happens when checking whether the current chuck overlaps with the next one. This is why the command you posted has so much output; it should just be one large chunk. The code of the check is

} else if (a > context_vec_ptr->b + (2 * diff_context) + 1 &&
           c > context_vec_ptr->d + (2 * diff_context) + 1) {

To fix this one should check if an overflow would happen and if that is the case to not execute the body of the else if.
Comment 2 Daniel Tameling 2022-10-15 09:59:05 UTC
Created attachment 237325 [details]
proposed fix for the integer overflows.

The patch assumes that a,b,c,d represent line numbers and are thus always positive.