Created attachment 152865
Dmesg and kernel panic on CURRENT

When I set the network interface address, I get a series of "Memory modified after free" messages:

Memory modified after free 0xfffff800039de800(2048) val=ffffffff @ 0xfffff800039de800
Memory modified after free 0xfffff800039d4800(2048) val=ffffffff @ 0xfffff800039d4800

If I wait long enough (a couple of minutes), I get a kernel panic. I have attached an example (dmesg + kernel panic).

I also tested 10.1-STABLE: the same messages appear after ifconfig, but the kernel panic is different. On 10, I very often see the value 0x3201c040 causing a segmentation fault (!), but I don't know where it comes from.

As for the messages, it could be that the init procedure of re(4) cannot correctly stop the device (an ordinary Realtek 8168), so the DMA buffers are overwritten by received packets.
Created attachment 152866
dmesg (verbose) and kernel panic on 10-STABLE
I add the pciconf output:

pciconf -lvbce pci0:3:0:0
re0@pci0:3:0:0: class=0x020000 card=0x012310ec chip=0x816810ec rev=0x0c hdr=0x00
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller'
    class = network
    subclass = ethernet
    bar [10] = type I/O Port, range 32, base rx1000, size 256, enabled
    bar [18] = type Memory, range 64, base rx90500000, size 4096, enabled
    bar [20] = type Prefetchable Memory, range 64, base rx90400000, size 16384, enabled
    cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit
    cap 10[70] = PCI-Express 2 endpoint IRQ 1 max data 128(128) link x1(x1)
                 speed 2.5(2.5) ASPM disabled(L0s/L1)
    cap 11[b0] = MSI-X supports 4 messages, enabled
                 Table in map 0x20[0x0], PBA in map 0x20[0x800]
    cap 03[d0] = VPD
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected
    ecap 0002[140] = VC 1 max VC0
    ecap 0003[160] = Serial 1 01000000684ce000
    ecap 0018[170] = LTR 1
    PCI-e errors = Correctable Error Detected
       Corrected = Receiver Error
                   Bad DLLP
More information:
* The chip is recognized as 8168G
* Neither Rx nor Tx works
* While debugging the driver, I discovered that 0x3201c040 is the value of rl_desc->cmd_stat
* The Rx path gets 256 Rx descriptors, after which no more packets arrive (RL_RDESC_STAT_OWN always set)

It seems to me that the DMA/device is not correctly initialized and that communication with the device works in an unexpected way.
Created attachment 152990
re_cfgv2.diff

You could give this patch a try (note that the location of if_rlreg.h depends on the version of FreeBSD). It fixes a couple of bugs, mainly in the area of receiver configuration of newer chips. Given that these configuration bits got repurposed and it's unknown what exactly both the old and the new bits do in later MACs, this patch might make a difference for you.

That said, your problem generally appears to be caused by a hardware defect of some sort. For one, rev. 0x4c000000 chips are known to work at this time. Also, the memory used for descriptors shouldn't suddenly go away and cause a page fault when accessed. Moreover, the freed memory containing neither the expected 0xdeadc0de nor some random bits, but always all ones in your case, is very suspicious.

Have you tested whether that piece of hardware works with Linux?
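For readers following along, the "Memory modified after free" check can be pictured roughly as follows: a freed buffer is filled with a known pattern and verified before it is reused. This is a simplified userland model of that idea, not the kernel's actual code. The helper names `poison_fill`, `poison_check` and `simulate_late_dma` are hypothetical; only the 0xdeadc0de pattern and the 2048-byte buffer size come from this thread. The stray write used here (six 0xff bytes, as in a broadcast destination MAC) is an assumption chosen to show one way an all-ones value could appear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON 0xdeadc0deU  /* fill pattern mentioned in this thread */

/* Fill a freed buffer with the poison pattern (hypothetical helper). */
static void poison_fill(void *buf, size_t len)
{
    uint32_t *p = buf;

    assert(len % sizeof(*p) == 0);  /* model only handles whole words */
    for (size_t i = 0; i < len / sizeof(*p); i++)
        p[i] = POISON;
}

/* Return the byte offset of the first modified word, or -1 if the
 * buffer is intact. On a hit, *val receives the unexpected value. */
static long poison_check(const void *buf, size_t len, uint32_t *val)
{
    const uint32_t *p = buf;

    for (size_t i = 0; i < len / sizeof(*p); i++) {
        if (p[i] != POISON) {
            *val = p[i];
            return (long)(i * sizeof(*p));
        }
    }
    return -1;
}

/* Model a late device write: six 0xff bytes land at the start of a
 * 2048-byte buffer after it has already been freed and poisoned. */
static long simulate_late_dma(uint32_t *val)
{
    static uint32_t cluster[2048 / sizeof(uint32_t)];
    static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    poison_fill(cluster, sizeof(cluster));
    memcpy(cluster, bcast, sizeof(bcast));  /* the stray write */
    return poison_check(cluster, sizeof(cluster), val);
}
```

In this model the check trips on the very first word with a value of all ones, which has the same shape as the val=ffffffff reports at the start of the buffer quoted in this PR.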
Thanks for the patch! I tested it and unfortunately it doesn't make any difference.

I tested the board on Linux and it works. I briefly looked at the Linux re driver, and it seems that it can dynamically load a known firmware onto the chip, and that it does so for this specific chip. It could be that my 0x4c000000 card has a different firmware version. I'll try to get more information on that side.

It really seems that the network card is writing received packets (via DMA?) to unexpected memory addresses, overwriting portions of kernel memory and causing the panics.
Your rev. 0x4c000000 MAC coming up with broken firmware could be another reason, which would be unfortunate, though, as these images are GPLed. However, the Atom E3800 errata has an entry (VLI30 in the non-NDA October 2013 version of that document) suggesting that the MMU will not behave correctly when employing super pages, which would be a more plausible cause for the problems you are experiencing. Thus, I'd give a kernel with super page usage disabled a try. Last time I tested, unfortunately, the corresponding loader tunable didn't take effect. So manually setting pg_ps_enabled to 0 in sys/amd64/amd64/pmap.c and recompiling likely is safest in order to do so.
I tested with super pages disabled and I get the same behavior. I verified the sysctl bit and it was disabled. I tested with TSO disabled (as suggested on the mailing list, even though TSO was already disabled) and I get the same behavior.

I would love to implement the API to load this firmware blob in order to test it. It could take some time, but it seems doable. Firmware blobs are not GPLed; they're distributed as binaries with the vendor's permission. In any case, I would only use it for testing.

In parallel, I'll try to understand where these new rx_desc entries are written, and whether there's a logical connection with the loaded DMA maps.
See also Bug 193743 - RTL8111/8168B PCI Express Gigabit Ethernet controller: doesn't work properly, problems getting UP automatically
I collected some more information about this error and I'm a little confused. I added some printfs to the re_rxeof() function to find out what is overwriting the memory after free (my suspicion was that the DMA was writing to the wrong place):

re0: idx 22 - rxstat 0x3201C040 - cur_rx at 0xfffff8000179b160
re0: rl_rx_list 0xfffff8000179b000 - rl_rx_list_addr 0x179b000
Memory modified after free 0xfffff800069da800(2048) val=ffffffff @ 0xfffff800069da800
re0: newbuf m 0xfffff800069f0600 - segs.ds_addr 0x00000000069da800
re0: newbuf m->m_data 0xfffff800069da800

rl_rx_list is the pointer to the rl_desc list.
rl_rx_list_addr is the physical address used by the device (DMA segment).

The newbuf function allocates a new mbuf and loads it into a new DMA segment:
m->m_data is the virtual address.
segs.ds_addr is the physical address used by the device (DMA segment).

Apparently, the driver really can get data via DMA. I still have to explain:
* the driver received XX packets, yet "netstat -s" shows 0 packets received
* after the first 256 packets, the first descriptor of the ring is overwritten by a new one whose rl_desc->cmdstat is always 0x80000800 => no packets
* how some rl_desc entries are flying away, causing page faults...

The Linux firmware is actually not real firmware; it seems to me to be a way to encode and hide the chip initialization.
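The descriptor values in this comment can be read with a small decoder. The following is a minimal sketch, assuming the usual layout in which the top bits of the 32-bit command/status word are OWN, end-of-ring, first-segment and last-segment flags and the low bits hold a length. The `DESC_*` names, the masks and the 16-byte descriptor size are assumptions chosen to be consistent with the values and addresses printed above; they should be checked against if_rlreg.h before being relied on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout; hypothetical names, verify against if_rlreg.h. */
#define DESC_OWN     0x80000000U  /* descriptor still owned by the NIC */
#define DESC_EOR     0x40000000U  /* last descriptor of the ring */
#define DESC_SOF     0x20000000U  /* first segment of a frame */
#define DESC_EOF     0x10000000U  /* last segment of a frame */
#define DESC_LENMASK 0x00003fffU  /* buffer or frame length in bytes */

#define DESC_SIZE    16U          /* assumed size of one descriptor */

struct desc_info {
    bool own, eor, sof, eof;
    uint32_t len;
};

/* Split a command/status word into its assumed fields. */
static struct desc_info decode_cmdstat(uint32_t v)
{
    struct desc_info d = {
        .own = (v & DESC_OWN) != 0,
        .eor = (v & DESC_EOR) != 0,
        .sof = (v & DESC_SOF) != 0,
        .eof = (v & DESC_EOF) != 0,
        .len = v & DESC_LENMASK,
    };
    return d;
}

/* Recover a ring index from the ring base and a descriptor address. */
static uint32_t desc_index(uint64_t ring_base, uint64_t desc_addr)
{
    assert(desc_addr >= ring_base);
    assert((desc_addr - ring_base) % DESC_SIZE == 0);
    return (uint32_t)((desc_addr - ring_base) / DESC_SIZE);
}
```

Under these assumptions, 0x3201C040 reads as a complete single-descriptor frame of 64 bytes that the NIC has handed back, while 0x80000800 reads as an empty 2048-byte buffer still owned by the NIC. The address arithmetic also agrees with the log: 0xfffff8000179b160 minus 0xfffff8000179b000 is 0x160, which is index 22 with 16-byte descriptors.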
I finally understand the behavior of my re0 interface, unfortunately still without a solution.

When the driver gets a new descriptor on the Rx ring, the mbuf is still full of 0xdeadc0de. As for the "Memory modified after free" message:

Memory modified after free 0xfffff800039de800(2048) val=ffffffff @ 0xfffff800039de800

The value is the first 4 bytes of a broadcast Ethernet packet, and the address is one previously used to store an mbuf. The first conclusion is that packets have not yet been written when an Rx ring entry is handed over.

I also checked why Rx ring entries are flying away and causing page faults: simply, the ring is acting as an array :) In other words, new entries are always written sequentially, running past the end of the ring. When the last element of the ring (idx=255) is used, it is correctly rewritten with 0xC0000800 instead of 0x80000800, signalling that it's the last entry of the ring.

Any ideas?
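The "ring acting as an array" observation can be illustrated with a toy model. This is only a sketch of the index arithmetic, assuming a 256-entry ring as reported in this thread; the function names are hypothetical and nothing here models the real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define RING_CNT 256U  /* ring size reported in this thread */

/* Next index when the end-of-ring marker is honoured: wrap to 0. */
static uint32_t next_idx_ring(uint32_t idx)
{
    assert(idx < RING_CNT);
    return (idx + 1) % RING_CNT;
}

/* Next index when the marker is ignored: keep walking past the end. */
static uint32_t next_idx_linear(uint32_t idx)
{
    return idx + 1;
}

/* Count how many of n consecutive descriptor writes, starting at
 * slot 0, land outside the memory that belongs to the ring. */
static uint32_t writes_outside_ring(uint32_t n, uint32_t (*next)(uint32_t))
{
    uint32_t idx = 0, outside = 0;

    for (uint32_t i = 0; i < n; i++) {
        if (idx >= RING_CNT)
            outside++;
        idx = next(idx);
    }
    return outside;
}
```

With wrap-around, every write stays inside the ring. Without it, each write after slot 255 lands in whatever memory follows the ring, which would be consistent with the corruption and page faults described above.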
Created attachment 153780
patch solving the issue using this hardware

Explanation and solution

The actual problem was that the Rx ring (and probably the Tx ring too) wasn't updated by the driver or, more precisely, the device wasn't able to see these updates. That explains why the last entry in the ring was ignored and why only a few packets were actually copied via DMA.

I found this log entry in the Linux git:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d6e572911a4cb2b9fcd1c26a38d5317a3971f2fd

It seems that for "some" chips, Rx and Tx should be enabled in the RL_COMMAND register later, after ring configuration and so on. The patch only shows how I applied this tip; I don't know how it could affect other devices.
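The ordering constraint described here can be written down as a small checker. This is a sketch under stated assumptions: the step names are hypothetical labels rather than driver symbols, `late_order` paraphrases the sequence described in this PR (rings and configuration first, then enable, then interrupts), and `early_order` is an illustrative counter-example, not a transcript of the old driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical init steps; illustrative labels only. */
enum init_step {
    STEP_SET_RING_ADDRS,
    STEP_SET_RX_TX_CONFIG,
    STEP_SET_INTR_MODERATION,
    STEP_ENABLE_RX_TX,
    STEP_ENABLE_INTR
};

/* True if RX/TX are enabled only after the rings, the RX/TX
 * configuration and interrupt moderation are set up, and before
 * interrupts are turned on. */
static bool enable_is_late(const enum init_step *seq, size_t n)
{
    bool rings = false, cfg = false, moder = false, enabled = false;

    assert(seq != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (seq[i]) {
        case STEP_SET_RING_ADDRS:      rings = true; break;
        case STEP_SET_RX_TX_CONFIG:    cfg = true;   break;
        case STEP_SET_INTR_MODERATION: moder = true; break;
        case STEP_ENABLE_RX_TX:
            if (!rings || !cfg || !moder)
                return false;   /* enabled too early */
            enabled = true;
            break;
        case STEP_ENABLE_INTR:
            if (!enabled)
                return false;   /* interrupts on before RX/TX */
            break;
        default:
            break;
        }
    }
    return enabled;
}

/* Illustrative orderings (assumptions, see the text above). */
static const enum init_step early_order[] = {
    STEP_SET_RING_ADDRS, STEP_ENABLE_RX_TX, STEP_SET_RX_TX_CONFIG,
    STEP_SET_INTR_MODERATION, STEP_ENABLE_INTR
};
static const enum init_step late_order[] = {
    STEP_SET_RING_ADDRS, STEP_SET_RX_TX_CONFIG,
    STEP_SET_INTR_MODERATION, STEP_ENABLE_RX_TX, STEP_ENABLE_INTR
};
```

The checker says nothing about why a given chip needs this order; it only makes the intended sequence explicit so that a reordering is easy to spot.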
The patch (minus the commented out code) LGTM. I’ll test it out on my workstation (a few kldunload/kldload/ifconfig/dhclient runs) and commit it if no one objects in a few days.
And just for reference, here's what I'm running:

re0@pci0:6:0:0: class=0x020000 card=0x83671043 chip=0x816810ec rev=0x02 hdr=0x00
    vendor = 'Realtek Semiconductor Co., Ltd.'
    device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller'
    class = network
    subclass = ethernet

$ uname -a
FreeBSD bayonetta.local 9.3-RELEASE FreeBSD 9.3-RELEASE #0 r268512: Thu Jul 10 23:44:39 UTC 2014 root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
A commit references this bug:

Author: marius
Date: Thu Apr 9 21:35:45 UTC 2015
New revision: 281337
URL: https://svnweb.freebsd.org/changeset/base/281337

Log:
  Don't enable RX and TX before their initial configuration is done, i. e. after setting up interrupt moderation but before turning interrupts on. This matches what Realtek's r8168 Linux driver does as of version 8.039.00 and fixes problems with certain incarnations of certain MAC revisions like the interface requiring an extra up/down-cycle after boot to start working or DMA configuration not being adhered to.

  PR: 193743, 197535
  MFC after: 1 week

Changes:
  head/sys/dev/re/if_re.c
A commit references this bug:

Author: marius
Date: Sun Jul 5 20:16:39 UTC 2015
New revision: 285177
URL: https://svnweb.freebsd.org/changeset/base/285177

Log:
  MFC: r281337

  Don't enable RX and TX before their initial configuration is done, i. e. after setting up interrupt moderation but before turning interrupts on. This matches what Realtek's r8168 Linux driver does as of version 8.039.00 and fixes problems with certain incarnations of certain MAC revisions like the interface requiring an extra up/down-cycle after boot to start working or DMA configuration not being adhered to.

  PR: 193743, 197535
  Approved by: re (kib)

Changes:
  _U stable/10/
  stable/10/sys/dev/re/if_re.c
A commit references this bug:

Author: marius
Date: Sun Jul 5 20:16:46 UTC 2015
New revision: 285178
URL: https://svnweb.freebsd.org/changeset/base/285178

Log:
  MFC: r281337

  Don't enable RX and TX before their initial configuration is done, i. e. after setting up interrupt moderation but before turning interrupts on. This matches what Realtek's r8168 Linux driver does as of version 8.039.00 and fixes problems with certain incarnations of certain MAC revisions like the interface requiring an extra up/down-cycle after boot to start working or DMA configuration not being adhered to.

  PR: 193743, 197535

Changes:
  _U stable/9/sys/
  _U stable/9/sys/dev/
  stable/9/sys/dev/re/if_re.c
A commit references this bug:

Author: marius
Date: Sun Jul 5 20:16:52 UTC 2015
New revision: 285179
URL: https://svnweb.freebsd.org/changeset/base/285179

Log:
  MFC: r281337

  Don't enable RX and TX before their initial configuration is done, i. e. after setting up interrupt moderation but before turning interrupts on. This matches what Realtek's r8168 Linux driver does as of version 8.039.00 and fixes problems with certain incarnations of certain MAC revisions like the interface requiring an extra up/down-cycle after boot to start working or DMA configuration not being adhered to.

  PR: 193743, 197535

Changes:
  _U stable/8/sys/
  _U stable/8/sys/dev/
  _U stable/8/sys/dev/re/
  stable/8/sys/dev/re/if_re.c