Bug 287229 - IP reassembly issue in FreeBSD 14.1
Summary: IP reassembly issue in FreeBSD 14.1
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 14.2-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: Michael Tuexen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-06-02 07:46 UTC by Lucas Aubard
Modified: 2025-06-11 09:58 UTC
CC List: 6 users

See Also:


Attachments
PCAP files (4.49 KB, application/zip)
2025-06-02 07:46 UTC, Lucas Aubard
plots (44.22 KB, application/zip)
2025-06-02 07:48 UTC, Lucas Aubard

Description Lucas Aubard 2025-06-02 07:46:59 UTC
Created attachment 260886
PCAP files

Dear FreeBSD development team, 

I am Lucas Aubard. I am a PhD student in an Inria lab in Rennes, France. 
This PhD is supervised by Gilles Guette (IMT Atlantique), Pierre Chifflier (ANSSI) and Johan Mazel (ANSSI).

During our research work, we analyzed FreeBSD 14.1 when processing overlapping IPv4 and IPv6 data fragments.

Our platform exhaustively generates and tests overlapping and non-overlapping test cases built from pairs (12 test cases) and triplets (409 test cases) of chunks. Every test case is run under several testing scenarios, i.e., contexts surrounding the original test case chunks.

For a given testing scenario, we noticed that FreeBSD does not reassemble at least one test case consistently across the multiple testing runs. 
For IPv4 (resp. IPv6), this eventually affects 25 (resp. 31) of the 42 implemented testing scenarios. Here are descriptions of some of the affected scenarios:
- peoef: an ending contiguous extra chunk follows (in time) the overlapping test case chunks.
- peoep: an ending contiguous extra chunk precedes (in time) the overlapping test case chunks.
- peosfef: a starting and an ending contiguous extra chunk follow (in time) the overlapping test case chunks.
- peospep: a starting and an ending contiguous extra chunk precede (in time) the overlapping test case chunks.
- peoepsf: an ending contiguous extra chunk precedes (in time) and a starting contiguous extra chunk follows (in time) the overlapping test case chunks.
- peosf: a starting contiguous extra chunk follows (in time) the overlapping test case chunks.
    + af: all the rightmost finishing fragments have the More Fragments bit unset.
    + ns: only the newest starting fragment has the More Fragments bit unset.
    + of: only the oldest finishing fragment has the More Fragments bit unset.
- peosp: a starting contiguous extra chunk precedes (in time) the overlapping test case chunks.
    + as: all the rightmost starting fragments have the More Fragments bit unset.
    + nf: only the newest finishing fragment has the More Fragments bit unset.
    + oms: the oldest and mid starting fragments have the More Fragments bit unset.
- pep: no extra chunks.
    + os: only the oldest starting fragment has the More Fragments bit unset.
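
For readers less familiar with the header fields involved, here is a minimal sketch of how the More Fragments bit and the fragment offset are encoded on the wire for IPv4 and for the IPv6 Fragment header. The function names are made up for illustration and are not taken from the reporter's tool.

```python
import struct

IPV4_MF = 0x2000  # More Fragments flag inside the IPv4 flags/offset field


def ipv4_frag_field(offset_bytes: int, more_fragments: bool) -> bytes:
    """Pack the 16-bit IPv4 flags + fragment offset field (network byte order)."""
    if offset_bytes % 8:
        raise ValueError("fragment offsets are expressed in 8-byte units")
    value = (offset_bytes // 8) & 0x1FFF  # 13-bit offset
    if more_fragments:
        value |= IPV4_MF
    return struct.pack("!H", value)


def ipv6_frag_field(offset_bytes: int, more_fragments: bool) -> bytes:
    """Pack the 16-bit offset / reserved / M field of the IPv6 Fragment header."""
    if offset_bytes % 8:
        raise ValueError("fragment offsets are expressed in 8-byte units")
    # 13-bit offset in the high bits, 2 reserved bits, M flag in the lowest bit
    value = (((offset_bytes // 8) & 0x1FFF) << 3) | int(more_fragments)
    return struct.pack("!H", value)


if __name__ == "__main__":
    print(ipv4_frag_field(8, True).hex())
    print(ipv6_frag_field(8, True).hex())
```

A fragment with the bit unset announces the end of the datagram, which is why the choice of which fragment carries it matters for the scenarios listed above.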

According to what we have observed, when a test case inconsistency occurs, FreeBSD reassembles at run x favoring some overlapping data, but at run y it either ignores the test case chunks or favors other overlapping data.
Less parallelization yields fewer inconsistencies, but we still observe some residual inconsistencies without any parallelization.

Attached are the PCAP files and plots for some (random) overlap test cases illustrating the problem. Note that we test FreeBSD 14.1 IPv4 (resp. IPv6) fragment reassembly using the ICMP (resp. ICMPv6) Echo service, and that 192.168.56.37 (resp. fd00:0:0:56::37) is the FreeBSD host IP address in the PCAP files.

While this non-deterministic behavior cannot be classified as a bug, we believe that this behavior is not intended. Can you confirm this?

Do not hesitate to contact me if you have any questions.

Best regards,
Lucas Aubard.
Comment 1 Lucas Aubard 2025-06-02 07:48:00 UTC
Created attachment 260887
plots
Comment 2 Olivier Cochard freebsd_committer freebsd_triage 2025-06-02 12:50:13 UTC
Is your testing script framework/sources available online?

We are running extensive regression tests and TCP regression tests (https://github.com/freebsd-net/tcp-testsuite), so I would like to reproduce one of your tests.
Comment 3 Michael Tuexen freebsd_committer freebsd_triage 2025-06-02 17:14:23 UTC
(In reply to Olivier Cochard from comment #2)
I think the reporter is talking about IP level reassembly, whereas the testsuite you are referring to is testing TCP level reassembly. These are two different things.
I can look at the report, but packetdrill does not (yet) support testing of IP level fragmentation and reassembly...
Comment 4 Lucas Aubard 2025-06-03 08:33:56 UTC
Yes, I confirm I am talking about IP-level reassembly. We tested it with ICMP/ICMPv6 Echo Requests/Replies.

Unfortunately, I cannot yet give you access to the platform. The work is currently being peer-reviewed at RAID'25.
I can, however, provide more PCAP files or log information in the meantime.
Comment 5 Michael Tuexen freebsd_committer freebsd_triage 2025-06-08 18:16:06 UTC
I looked at the code and I think there are not many differences between 14.1 and CURRENT, at least for the IPv4 case, which I am focussing on. The reassembly algorithm is also deterministic, and therefore looking into cases which you describe as non-deterministic is interesting.
But to be clear: deterministic means that the IPv4 stack reassembles the same sequence of incoming packets the same way.
I have two questions for you:
(1) how do you ensure that different test cases do not interfere? Manually selecting different IP IDs?
(2) how do you ensure that the reassembly buffer is empty, before you start running a test? If fragments stay there, the second run has a different input.
For example, in test_495_peosp-oms, the reassembly can be performed after the first two packets have been received (see the corresponding y trace). So the last two fragments stay in the reassembly buffer. When you run the test again, the stack effectively receives (in this order) the 3rd, 4th, 1st, 2nd, 3rd, 4th fragment. After the first four of these, the packet is reassembled, but differently, since the sequence is different. I guess this results in an ICMP checksum error and therefore you don't see the ICMP echo response.
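
To illustrate the checksum guess, here is a minimal sketch of the Internet checksum (RFC 1071) applied to an ICMPv4 Echo Request. The payload strings and helper names are hypothetical and are not taken from the attached PCAP files. ICMPv6 additionally covers a pseudo-header and is not modelled here.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMPv4 Echo Request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def checksum_ok(message: bytes) -> bool:
    """A message with a correct checksum field sums to zero."""
    return internet_checksum(message) == 0


if __name__ == "__main__":
    original = build_echo_request(ident=1, seq=1, payload=b"AAAABBBB")
    # Same header (and thus same checksum field), but a payload as it might
    # look if overlapping bytes were taken from different fragments.
    altered = original[:8] + b"AAAABcdd"
    print(checksum_ok(original), checksum_ok(altered))
```

If the receiver's reassembled payload differs from the one the checksum was computed over, verification fails and no Echo Reply is expected.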
Maybe I can hack packetdrill a bit to test this locally. But right now I suspect that the cause for the difference in behavior is related to the reassembly buffer not being in the same state when running the tests.
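
The suspicion about leftover buffer state can be sketched with a toy model. Everything below is an assumption made for illustration: the "first arrived byte wins" overlap policy, the byte-granular offsets, and the four fragments are all made up. It is not the FreeBSD implementation and not the actual layout of test case 495.

```python
from typing import Dict, List, Optional, Tuple

Fragment = Tuple[int, bytes, bool]  # (byte offset, payload, more_fragments)


class ReassemblyQueue:
    """Toy per-datagram queue; overlapping bytes keep the first value seen."""

    def __init__(self) -> None:
        self.data: Dict[int, int] = {}
        self.total_len: Optional[int] = None

    def add(self, offset: int, payload: bytes, more_fragments: bool) -> None:
        for i, byte in enumerate(payload):
            self.data.setdefault(offset + i, byte)  # assumed policy: first wins
        if not more_fragments and self.total_len is None:
            self.total_len = offset + len(payload)

    def complete(self) -> bool:
        return self.total_len is not None and all(
            i in self.data for i in range(self.total_len)
        )

    def payload(self) -> bytes:
        return bytes(self.data[i] for i in range(self.total_len))


class ToyStack:
    """Keeps one queue per IP ID and frees it once a datagram is complete."""

    def __init__(self) -> None:
        self.queues: Dict[int, ReassemblyQueue] = {}

    def receive(self, ip_id: int, frag: Fragment) -> Optional[bytes]:
        queue = self.queues.setdefault(ip_id, ReassemblyQueue())
        queue.add(*frag)
        if queue.complete():
            del self.queues[ip_id]
            return queue.payload()
        return None


# Hypothetical test case: the first two fragments already form a full datagram.
FRAGMENTS: List[Fragment] = [
    (0, b"AAAA", True),
    (4, b"BBBB", False),
    (5, b"c", True),
    (6, b"dd", False),
]


def run_once(stack: ToyStack, ip_id: int = 21152) -> List[bytes]:
    """Send the four fragments in order and collect any reassembled datagrams."""
    delivered = []
    for frag in FRAGMENTS:
        out = stack.receive(ip_id, frag)
        if out is not None:
            delivered.append(out)
    return delivered


if __name__ == "__main__":
    stack = ToyStack()
    print(run_once(stack))  # fresh buffer
    print(run_once(stack))  # same IP ID, stale fragments from the first run
```

In this model the algorithm is fully deterministic, yet the second run delivers a different payload because the last two fragments of the first run are still queued under the same IP ID.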
Comment 6 Lucas Aubard 2025-06-10 15:48:53 UTC
(1) Every scenario has its own IPID offset (0, 1000, 2000, ...), which allows for uniquely identifying a test case within one run. For example, test case 152 has an IPID of 21152 with scenario 21 and 36152 with scenario 36.
(2) These IPIDs are, however, repeated across the runs. For the other tested IPv4 and IPv6 targets, we do not usually reboot the target between the runs, since we do not observe inconsistent test case reassemblies. However, even after rebooting the FreeBSD host (which should discard any previously received fragments, right?), we still observe inconsistent reassemblies.
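
The IPID scheme in (1) can be written down as a short sketch. The exact mapping is inferred from the two examples given (21152 and 36152); the function names and the range checks are assumptions, not code from the tool.

```python
from typing import Tuple

SCENARIO_STRIDE = 1000  # assumed from the offsets 0, 1000, 2000, ...


def ipid_for(scenario: int, test_case: int) -> int:
    """Derive the IP Identification value for a (scenario, test case) pair."""
    if not 0 <= test_case < SCENARIO_STRIDE:
        raise ValueError("test case index must fit below the scenario stride")
    ipid = scenario * SCENARIO_STRIDE + test_case
    if ipid > 0xFFFF:
        raise ValueError("value does not fit the 16-bit IPv4 Identification field")
    return ipid


def split_ipid(ipid: int) -> Tuple[int, int]:
    """Recover (scenario, test case) from an IPID built by ipid_for()."""
    return divmod(ipid, SCENARIO_STRIDE)


if __name__ == "__main__":
    print(ipid_for(21, 152), ipid_for(36, 152), split_ipid(21152))
```

Because the value depends only on the scenario and the test case, every run of the same test reuses the same IPID, which is what makes leftover fragments from a previous run relevant.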

You can find here https://filesender.renater.fr/?s=download&token=5f407a38-1bc5-47db-aeef-cce91bdabc73 a minimal version of our testing tool with two testing scenarios for which we've found inconsistencies across our runs, namely peoef and peosp-nf. I prepared some instructions in the README, but you might need additional ones. Do not hesitate to ask if you need anything.
Comment 7 Michael Tuexen freebsd_committer freebsd_triage 2025-06-10 20:19:04 UTC
Thanks for sharing your tool. I am planning to add the needed functionality to packetdrill, but that needs a bit of time.
Can I run your tool on a FreeBSD arm64 box, or do I need FreeBSD on Intel, since I saw that you are using VirtualBox? Or do I need some other systems?
Comment 8 Lucas Aubard 2025-06-11 08:34:40 UTC
Unfortunately, I have only VirtualBox-based boxes prepared. Since I don't have an arm64 laptop, I can't test an adaptation of the Vagrantfile for arm64 hosts.
However, I don't think it would require many changes.
* The current Base box (debian/testing64) is amd64 only, but this one https://portal.cloud.hashicorp.com/vagrant/discover/generic/debian12, for example, has arm64 support. 
* The current Target box (bento/freebsd-14.1) is also amd64 only. But, this one has arm64 support https://portal.cloud.hashicorp.com/vagrant/discover/bento/freebsd-14.

As a reminder, the tool deploys two Vagrant boxes: Base (Debian) and Target (FreeBSD 14.1). The Base box is not essential; you can actually execute the commands from your host machine (Base mainly allows us to launch generic commands in our IP testing). You may prefer to run/prepare your own FreeBSD VM (Target). In any case, if you change our default setup, be careful to 1) adapt the MAC and IP addresses in the PCAP files, and 2) disable any NIC offloading that could interfere with reassembly on the FreeBSD host.

I forgot to mention that we would like the entire PCAP file set not to be shared publicly until our work is published. We plan to publish the entire tool after that.
Comment 9 Michael Tuexen freebsd_committer freebsd_triage 2025-06-11 09:58:34 UTC
(In reply to Lucas Aubard from comment #8)
Thanks for the information. I have no experience with Vagrant, but I do have amd64 based FreeBSD systems. Will try and report.
I will not share any .pcap files or any other stuff except discussing why the behavior observed happens.