Bug 144325

Summary: [libpcap] tcpdump compiles complex expression to incorrect BPF code
Product: Base System Reporter: Vadim Goncharov <vadim_nuclight>
Component: kernAssignee: freebsd-bugs (Nobody) <bugs>
Status: Open ---    
Severity: Affects Only Me CC: archit, delphij, gharris, guy, imp
Priority: Normal    
Version: 7.2-RELEASE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
test file demonstrating the bug
none
expression demonstrating the bug
none
shell script to run the test case
none
Proposed patch none

Description Vadim Goncharov 2010-02-26 16:00:13 UTC
I tried to gather statistics on some packets based on signature in data
payload, for plain traffic that was simple "tcpdump 'udp[20:4]=0x7fffffff'"
(this works) but for PPTP things go complex and I was forced to write very
complex expression. I've used cpp(1) for this. This has worked earlier in
6.4 days for me for another packets, I've just redone it for another bytes,
file for cpp was roughtly the same. But this didn't match anything.

After cutting down the full expression to only the most relevant part for
my generated test IP-packets (see most parts below /* commented */), I was
able to look at the tcpdump -d debug BPF assembly code output and identify
that generated code was incorrect.

Fix: 

No known. In some cases BPF code could be manually edited and installed
to kernel, but not all programs support it (I need tcpdump).

Also note that this is too complex due to one need to manually get IP
headers length - slightly easier preprocessor works for me. If tcpdump's
syntax supported things like ipdata[] or tcpdata[] (utilising BPF_MSH),
it should be shorter and perhaps correct - I suspect that libpcap's code
optimizer is buggy on long expressions.
How-To-Repeat: 
Here is the packet which I try to match, and tcpdump debug output.
Note that if you uncomment all final expression's parts and do
s/INNER_IS_UTP/INNER_IS_UDP/g, this will work for up to all UDP
packets inside GRE (but without signatures, of course). So bugs
start to happen when IS_TORRENT_UTP(INNER_UDP_OFFSET(ppp_hdr_len)) is
included.

$ uuencode bug100225.pcap < bug100225.pcap  # test packet to catch
begin 644 bug100225.pcap
MU,.RH0(`!````````````/__````````W=B&2P98"P!:````6@````(```!%
M``!6S'4``/PO;#!.C``,V1U>'3"!B`L`,L`````+IP``"^#_`P`A10``+CU*
H```X$<?3;7N^'DZ,`WS-9P*:`!IH\@`!`@,$!08'"`D*"W____^K````
`
end
$ cat tcpdump-gre-utp-cpp
#define IPHDRLEN(firstbyte) ((ip[firstbyte]&0xf)<<2)
#define GRESTART IPHDRLEN(0)
/* Check that is GREv1 with seq num and proto set per RFC 2637 */
#define VALID_PPTP_GRE ((ip[GRESTART:4] & 0xff7fffff) = 0x3001880b)
/* ACK is optional 4 bytes to previous 12 */
#define GRE_DATA_START (GRESTART + ((ip[GRESTART+1] & 0x80) >> 5) + 12)
/* Actual IP byte values to find in the UDP payload of inner IP datagram */
#define IS_TORRENT_UTP(udp_hdr_start)   (ip[(udp_hdr_start+20):4]=0x7fffffff)
/* Check inner IP has UDP payload (proto 17) then calculate offset and pass it to UTP macro */
#define INNER_IS_UDP(ppp_hdr_len)       (ip[GRE_DATA_START+ppp_hdr_len+9]=17)
#define INNER_UDP_OFFSET(ppp_hdr_len)   (GRE_DATA_START+ppp_hdr_len+IPHDRLEN(GRE_DATA_START+ppp_hdr_len))
#define INNER_IS_UTP(ppp_hdr_len)       (INNER_IS_UDP(ppp_hdr_len) and IS_TORRENT_UTP(INNER_UDP_OFFSET(ppp_hdr_len)))

/*
 * Finally, expression: sort by most frequent pattern first.
 * We check four possible PPP headers corresponding to IP, then
 * pass length of matched PPP header to checking macros.
 */
proto gre /*and VALID_PPTP_GRE*/ and (
/*      (
                (ip[GRE_DATA_START]=0x21) and INNER_IS_UTP(1)
        ) or (
                (ip[GRE_DATA_START:2]=0xff03) and (ip[GRE_DATA_START+2]=0x21) and INNER_IS_UTP(3)
        ) or (*/
                (ip[GRE_DATA_START:4]=0xff030021) and INNER_IS_UTP(4)
/*      ) or (
                (ip[GRE_DATA_START:2]=0x0021) and INNER_IS_UTP(2)
        )*/
)
$ tcpdump -dni ng0 `cpp -P tcpdump-gre-utp-cpp`
(000) ld       [0]
(001) jeq      #0x2000000       jt 2    jf 73
(002) ldb      [13]
(003) jeq      #0x2f            jt 4    jf 73
(004) ldb      [4]
(005) and      #0xf
(006) lsh      #2
(007) st       M[3]
(008) ldb      [4]
(009) and      #0xf
(010) lsh      #2
(011) add      #1
(012) tax
(013) ldb      [x + 4]
(014) and      #0x80
(015) rsh      #5
(016) tax
(017) ld       M[3]
(018) add      x
(019) add      #12
(020) tax
(021) ld       [x + 4]
(022) jeq      #0xff030021      jt 23   jf 73
(023) ldb      [4]
(024) and      #0xf
(025) lsh      #2
(026) st       M[1]
(027) ldb      [4]
(028) and      #0xf
(029) lsh      #2
(030) add      #1
(031) tax
(032) ldb      [x + 4]
(033) and      #0x80
(034) rsh      #5
(035) tax
(036) ld       M[1]
(037) add      x
(038) add      #12
(039) add      #4
(040) add      #9
(041) tax
(042) ldb      [x + 4]
(043) jeq      #0x11            jt 44   jf 73
(044) ldb      [4]
(045) and      #0xf
(046) lsh      #2
(047) add      #1
(048) tax
(049) ldb      [4]
(050) and      #0xf
(051) lsh      #2
(052) st       M[15]
(053) ldb      [x + 4]
(054) and      #0x80
(055) rsh      #5
(056) tax
(057) ld       M[15]
(058) add      x
(059) add      #12
(060) add      #4
(061) tax
(062) ldb      [x + 4]
(063) and      #0xf
(064) lsh      #2
(065) tax              ; here is the BUG - if this and next line cut out, then
(066) ld       M[11]   ; it will be correct...  and M[11] is never stored above
(067) add      x
(068) add      #20
(069) tax
(070) ld       [x + 4]
(071) jeq      #0x7fffffff      jt 72   jf 73
(072) ret      #96
(073) ret      #0
Comment 1 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:01:41 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 2 Archit Shah 2020-10-17 07:30:23 UTC
Created attachment 218830 [details]
test file demonstrating the bug
Comment 3 Archit Shah 2020-10-17 07:31:05 UTC
Created attachment 218831 [details]
expression demonstrating the bug
Comment 4 Archit Shah 2020-10-17 07:38:25 UTC
Created attachment 218832 [details]
shell script to run the test case
Comment 5 Archit Shah 2020-10-17 08:03:39 UTC
Created attachment 218833 [details]
Proposed patch

The bug appears to be in libpcap. The libpcap optimizer (contrib/libpcap/optimizer.c) removes statements as dead that store certain values but does not account for the fact that a successor block may attempt to read the value written by the dead statemenent.  The proposed patch marks the "val" data structure as having unknown value when statements are removed as dead to indicate to successor blocks that the value is not available. (I will also report this upstream.)
Comment 6 Xin LI freebsd_committer freebsd_triage 2022-11-22 22:37:20 UTC
(In reply to Archit Shah from comment #5)
Hi, was this reported to the upstream?  (I haven't found one at https://github.com/the-tcpdump-group/libpcap/issues?q=is%3Aissue ) If so, could you please give us a URL for reference?

(also +guy@ for visibility)
Comment 7 Archit Shah 2023-03-06 18:39:25 UTC
Upstream pull request is here: https://github.com/the-tcpdump-group/libpcap/pull/976
Comment 8 Warner Losh freebsd_committer freebsd_triage 2024-01-14 18:48:48 UTC
Since the import of 1.10.3, this patch conflicts.
I don't know if the changes related to detecting the optimizer loop or not will fix this issue though.

Can you confirm either that (a) the upstream fix in 1.10.3 fixes your issue or (b) making the trivial modifications to your patch will fix the still lingering issue?

I see that the upstream pull request remains open, so I'm unsure what the status is... The import was just a couple of weeks after the last comment...

Thanks and sorry for the delay.
Comment 9 Archit Shah 2024-01-20 09:07:16 UTC
It appears that upstream just merged my proposed fix today and I don't believe 1.10.3 addresses the bug.  However, trivial modifications to the patch to work with 1.10.3 should also work until the next upstream release is made and imported into FreeBSD.