Bug 230144

Summary: [stable/11] Linux emulator does not work on Ryzen / Epic processors
Product: Base System
Reporter: pete
Component: kern
Assignee: freebsd-emulation (Nobody) <emulation>
Status: New
Severity: Affects Some People
CC: emaste, kib, meowthink, nwhitehorn, shoesoft
Priority: ---    
Version: 11.2-STABLE   
Hardware: amd64   
OS: Any   
Bug Depends on:    
Bug Blocks: 247219    

Description pete 2018-07-29 11:22:49 UTC
Trying to run any Linux binary on Ryzen or EPYC coredumps. I tried to backtrace the core, but the debugger does not seem to be able to get any information out of the binary.

This is completely AMD Zen specific - I can take the same disc and boot it up with a Xeon processor and it works fine. I have also used the Linux emulator on older AMD processors fine (my desktop was a Phenom II until I upgraded it to the Ryzen) but I have to admit I haven't been able to try the latest kernel on older AMD processors.

Reproducing is trivial - enable Linux in rc.conf, install linux_base-c7 and try and run /compat/linux/bin/bash. It will immediately coredump on a Zen machine, but plugging the same disc into an Intel machine will run it fine.
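
The reproduction steps above can be sketched as a small script. This is only a sketch under assumptions, not the reporter's exact commands: it assumes `sysrc` and `pkg` are used for the rc.conf change and the package install, and that the 64-bit `linux64` module is loaded by hand (as in comment 3). The `run` and `repro` helpers are hypothetical names; by default the script only prints the commands.

```sh
#!/bin/sh
# Sketch of the reproduction steps; prints the commands by default.
# Set DRY_RUN=0 and run as root on a FreeBSD test machine to execute them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    run sysrc linux_enable=YES               # enable Linux emulation in rc.conf
    run kldload linux64                      # assumption: 64-bit module loaded by hand
    run pkg install -y linux_base-c7         # CentOS 7 userland under /compat/linux
    run /compat/linux/bin/bash -c 'echo ok'  # reported to dump core on Zen
}

repro
```

If the last command prints `ok`, the machine is not affected; on the affected systems described here it is reported to die with SIGSEGV instead.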
Comment 1 pete 2018-07-29 11:23:53 UTC
Here is what gdb does, if it's of any use...

[webadmin@epyc-test ~]$ gdb /compat/linux/bin/bash bash.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
warning: A handler for the OS ABI "GNU/Linux" is not built into this configuration
of GDB.  Attempting to continue with the default i386:x86-64 settings.

(no debugging symbols found)...

warning: core file may not match specified executable file.
Core was generated by `/compat/linux/bin/bash'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000037f in ?? ()
(gdb) bt
#0  0x0000037f in ?? ()
(gdb)
Comment 2 Conrad Meyer freebsd_committer 2018-07-29 16:49:12 UTC
(GDB 6.1.1 is pretty useless in general.  I'd suggest installing the latest GDB from ports instead.)

Re: Zen specific, have you run any of the Zen-specific tests to try and determine if you have a faulty CPU?  Latest BIOS?
Comment 3 Conrad Meyer freebsd_committer 2018-07-29 16:53:05 UTC
By the way, it seems fine on my TR system running CURRENT:

$ sudo mount -t linprocfs linprocfs /compat/linux/proc
$ sudo mount -t linsysfs linsysfs /compat/linux/sys
$ sudo mount -t tmpfs -o rw,mode=1777 tmpfs /compat/linux/dev/shm
$ /compat/linux/bin/bash
ELF binary type "0" not known.
zsh: exec format error: /compat/linux/bin/bash
$ sudo kldload linux64.ko
$ sudo kldload linux
$ /compat/linux/bin/bash
bash-4.2$

Do you have all Linux-compat filesystems mounted?
Comment 4 Conrad Meyer freebsd_committer 2018-07-29 16:56:56 UTC
I observe that running the process under the debugger immediately segfaults, although I don't have another non-AMD system with linux compat available to compare.
Comment 5 pete 2018-07-29 17:05:45 UTC
I did try gdb81 from ports too - that did the same thing.

Am interested that you actually have it working - I haven't tried under CURRENT, but will as soon as I can on the EPYC boxes. My desktop CPU is not buggy, and I doubt the EPYC ones are (they are Azure virtual machines so I don't have direct access to the hardware and BIOS). Another person on the STABLE mailing list confirmed that it coredumps for them on their Ryzen machine too. I would have expected Threadripper to also not work - any chance you could try it with 11.2-STABLE?

Will let you know how it goes on CURRENT when I have built it...
Comment 6 Conrad Meyer freebsd_committer 2018-07-29 17:21:01 UTC
I can't test 11.x, sorry.  Perhaps bash is affected by the brk bug fixed in CURRENT at r335702?  Or r335516?  It's odd to me that you're seeing a behavior difference between Ryzen and Intel with the exact same software.  Linuxulator does not do anything (AFAIK) that touches on any Zen errata I'm aware of.
Comment 7 pete 2018-07-29 19:27:56 UTC
(In reply to Conrad Meyer from comment #6)

So, I tested 12-CURRENT, and that works fine, so that's good news - however, the original behaviour is still concerning, I think.

The issue is not just with bash - all Linux binaries I have tried do this. I have not yet tried a statically linked executable, however, which I will do tomorrow when I get access to a real Linux machine to compile a simple "hello world".

The test I am doing is with the same disc, by the way, so I know it's identical - I am literally shutting it down and switching the processor type. It does puzzle me a lot though - is there any assembler in the Linuxulator?
Comment 8 Nathan Whitehorn freebsd_committer 2018-08-17 19:57:20 UTC
It might be the brk() issue. I'm experiencing the same trouble on EPYC hardware with 11.2. Truss gives the following:

$ truss /compat/linux/bin/uname
linux_brk(0x0)                                   = 6324224 (0x608000)
SIGNAL 11 (SIGSEGV) code=SEGV_MAPERR trapno=12 addr=0x7ffffffff508
process killed, signal = 11 (core dumped)

(Output is identical for all Linux executables)

Unfortunately, this is a production machine on which I cannot install custom kernels.
Comment 9 Conrad Meyer freebsd_committer 2018-08-17 20:03:46 UTC
On CURRENT truss uname shows:

$ truss /compat/linux/bin/uname
linux_brk(0x0)                                   = 6324224 (0x608000)
linux_newuname(0x7fffffffb59a)                   = 0 (0x0)
...
write(1,"Linux\n",6)                             = 6 (0x6)

So I suspect it is not the brk, but the newuname() with the stack address within a page of the invalid range (on 11.2 Nathan reports 0x7ffffffff508).  On Zen, addresses >= 0x7ffffffff000 should be avoided, and it looks like they are on CURRENT.
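
The address comparison in this comment can be checked with shell arithmetic. This is a minimal sketch that assumes 4 KiB pages and a 47-bit user address space, so that the top user page starts at 0x7ffffffff000; `in_top_user_page` is a hypothetical helper name. It only illustrates the arithmetic and does not show whether the top page is actually the cause.

```sh
#!/bin/sh
# Assumption: 4 KiB pages and a 47-bit user address space, so the
# top user page covers 0x7ffffffff000 .. 0x7fffffffffff.
TOP_PAGE_START=$((0x7ffffffff000))

# Print "yes" if the given hex address is at or above the top user page.
in_top_user_page() {
    if [ $(($1)) -ge "$TOP_PAGE_START" ]; then
        echo yes
    else
        echo no
    fi
}

in_top_user_page 0x7ffffffff508   # fault address reported on 11.2: yes
in_top_user_page 0x7fffffffb59a   # linux_newuname() argument on CURRENT: no
```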
Comment 10 Konstantin Belousov freebsd_committer 2018-08-17 20:38:40 UTC
(In reply to Conrad Meyer from comment #9)
The top user page only needs to be avoided for execution.  I am not aware of any erratum that makes it unsafe for data.

The lowering of the top user address is only implemented for FreeBSD ELF, not for linux ELF, and this is done both on HEAD and stable/11.
Comment 11 Konstantin Belousov freebsd_committer 2018-08-17 20:41:14 UTC
gdb from ports should be able to attach to the linux process, at least on HEAD.  Most likely I merged the changes to stable/11, but I do not remember for sure.

You might set sysctl machdep.uprintf_signal to 1 to get more information about the fault if gdb does not attach.
Comment 12 pete 2018-08-17 21:59:51 UTC
I did try applying the bits from r335702 to 11 by hand, but got part way through and realised that it was basically a reorganisation of the brk function instead of a bug fix - the rounding fix is, I believe, an incidental part of the patch having looked at the commit message:

"This also addresses a minor bug in linux_brk in that we now return the
  actual (rounded up) break address, rather than the requested value."

Haven't had time to work out which is the actual change in order to try it though. I also no longer have access to a system running CURRENT, which is a bit annoying. Does anyone more familiar with this code know which bit to change to test it?
Comment 13 Conrad Meyer freebsd_committer 2018-08-17 22:04:15 UTC
(In reply to pete from comment #12)
I don't think that's the problem, since e.g. uname is only passing aligned addresses to brk anyway.  The rounding would have no effect.
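
To illustrate why rounding changes nothing for an aligned address, here is a minimal sketch, assuming 4 KiB pages as on amd64. `round_up_page` is a hypothetical helper, not the kernel's code; 0x608000 is the break value from the truss output in comment 8, and 0x608001 is a made-up unaligned value for contrast.

```sh
#!/bin/sh
# Assumption: 4 KiB pages, as on amd64.
PAGE_SIZE=4096

# Round an address up to the next page boundary and print it in hex.
round_up_page() {
    printf '0x%x\n' $(( ($1 + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE ))
}

round_up_page 0x608000   # already aligned: prints 0x608000
round_up_page 0x608001   # made-up unaligned value: prints 0x609000
```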
Comment 14 meowthink 2018-09-16 07:18:54 UTC
Same issue here. stable/11 r338649 on a Ryzen 2400G.

$ truss -f /compat/linux/usr/bin/bash
 4163: linux_brk(0x0)				 = 7221248 (0x6e3000)
pid 4163 comm bash: signal 11 err 4 code 1 type 12 addr 0x7ffffffff508 rsp 0x7fffffffcb10 rip 0x8006ea3c1 <80 3f 00 0f 85 9e 01 00>
 4163: SIGNAL 11 (SIGSEGV) code=SEGV_MAPERR trapno=12 addr=0x7ffffffff508
 4163: process killed, signal = 11 (core dumped)

But 32-bit linux binaries just run fine.

I'm pretty sure this isn't related to r335702, as brk(0) will always return the old value. The weirdest thing here is that this does not seem to affect Intel CPUs, but I can't find any differences in the code.
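
The `err 4` field in the kernel message above can be decoded by hand. The sketch below assumes that `err` is the raw x86 page-fault error code from the trap frame and uses the architectural bit layout (bit 0 present, bit 1 write, bit 2 user, bit 4 instruction fetch). `decode_pf_err` is a hypothetical helper name.

```sh
#!/bin/sh
# Decode the low bits of an x86 page-fault error code (the "err" field).
# Assumption: the value printed by the kernel is the raw hardware code.
decode_pf_err() {
    err=$(($1))
    if [ $((err & 1)) -ne 0 ]; then p="protection violation"; else p="page not present"; fi
    if [ $((err & 2)) -ne 0 ]; then a="write"; else a="read"; fi
    if [ $((err & 4)) -ne 0 ]; then m="user mode"; else m="kernel mode"; fi
    # Bit 4 marks an instruction fetch rather than a data access.
    if [ $((err & 16)) -ne 0 ]; then a="instruction fetch"; fi
    echo "$m, $a, $p"
}

decode_pf_err 4   # prints: user mode, read, page not present
```

Under that assumption, the fault is a user-mode data read of an unmapped page, which is consistent with the SEGV_MAPERR code that truss reports.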