The easiest way to reproduce this problem on a fresh FreeBSD-11.2/amd64 is to attempt "make test" in multimedia/x265. This works on my older system (where march is core2), but the two new ones -- with newer Ivy Bridge Xeons (E5-1620) -- break:

===> Testing for x265-2.6_1
/symbion/ports/multimedia/x265/work/x265_v2.6/source/test/TestBench
Using random seed 5B4B6D10
12bit
Testing primitives: SSE2
Testing primitives: SSE3
Testing primitives: SSSE3
cuTreeFix8Pack failed
x265: asm primitive has failed. Go and fix that Right Now!

It does not matter whether OPTIMIZED_CFLAGS is enabled (which adds -O3) -- it builds either way. The only way to get it through is to use a lower march (like core2) or none at all.

Another port which does not build properly with march=ivybridge for me is qt5-core, but that's a lot hairier to deal with...

I noticed this problem after rebuilding world with the recently-committed clang-6.0.1, but I haven't tested it with the earlier clang-6.0.0, so this may be an older bug.
Probably related: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226059
(In reply to Gleb Popov from comment #1)
In Bug #226059 things work for Ivy Bridge, but not for Skylake. In my case, it is the Ivy Bridge that has a problem (I tried march=sandybridge on the same machine and had the same issue as well).
I can reproduce, but I'm not sure at this point if it's a clang problem at all. Maybe x265 is doing something fishy? I would like to dig deeper but am short on time, so this will end up on my TODO-later list.
FWIW, adding an explicit -mno-avx helps. I do not know why -- hopefully, dim@ will find out soon...
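For anyone who wants to confirm which of their builds actually have AVX code generation turned on: clang and gcc predefine the __AVX__ macro when it is enabled, which -march=sandybridge and -march=ivybridge imply and -mno-avx switches off again. A minimal sketch, where the function names are made up for illustration and are not part of x265:

```cpp
// Hypothetical helper (not from x265): reports whether this translation
// unit was compiled with AVX code generation enabled.
constexpr bool compiled_with_avx()
{
#if defined(__AVX__)
    // Set by clang/gcc for -mavx and for any -march that implies AVX.
    return true;
#else
    // Plain builds, lower -march values such as core2, or -mno-avx.
    return false;
#endif
}

// Hypothetical helper: a human-readable form of the same answer.
constexpr const char *avx_codegen_state()
{
    return compiled_with_avx() ? "enabled" : "disabled";
}
```

Printing avx_codegen_state() from a small test program built with the same CFLAGS as the port should show whether a given make.conf setting puts the build into the failing configuration.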
I filed a bug report upstream, just in case: https://bitbucket.org/multicoreware/x265/issues/422/when-using-clang-with-avx-enabled

Notably, using gcc8 with the same -march string works.
Created attachment 195456
Avoid undefined behavior in cuTreeFix8Pack

As noted in the upstream x265 bug report, it turns out cuTreeFix8Pack() has undefined behavior when it attempts to fit negative double values into uint16_t. There needs to be an intermediate cast to int16_t to avoid this, and I verified that the input values are never outside the range [INT16_MIN..INT16_MAX].

In addition I added a part to the port Makefile which marks the port LLD_UNSAFE on i386. (I have had this patch in my tree for a while, might as well get it applied now...)

Note that independently of this patch, or adding -mno-avx, for me TestBench still crashes with a segfault at the end, but that is apparently unrelated to this particular bug:

Using random seed 5B58D03E
8bit
Testing primitives: SSE2
Testing primitives: SSE3
Testing primitives: SSSE3
Testing primitives: SSE4
Testing primitives: AVX
Testing primitives: AVX2
Testing primitives: BMI2
Testing primitives: ARMv6
Testing primitives: NEON
Testing primitives: FastNeonMRC

Test performance improvement with full optimizations

== pixel primitives ==
satd[ 4x4] 3.65x 147.78 538.83
avg_pp[ 4x4] 1.25x 244.77 305.50
[... much more of these stats ... ]
pelFilterLumaStrong_Vertical 1.49x 723.31 1075.46
*** Signal 11

Stop.
make: stopped in /share/dim/ports/multimedia/x265
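To illustrate the undefined behavior described above, here is a minimal sketch. It is not the x265 source: the helper names are hypothetical, and the 8.8 fixed-point scale factor of 256.0 is an assumption based on the "Fix8" in the function name. Converting a negative floating-point value straight to an unsigned integer type is undefined in C++, so a compiler is free to produce different results depending on the instructions it selects. Going through int16_t first makes both steps well-defined, as long as the scaled value stays within [INT16_MIN..INT16_MAX].

```cpp
#include <cstdint>

// Hypothetical helper, not the x265 code: packs a double into 8.8 fixed
// point stored in a uint16_t. The 256.0 scale is an assumption.
//
// The undefined variant would be:
//     static_cast<std::uint16_t>(value * 256.0)
// which has no defined result once value * 256.0 is negative.
constexpr std::uint16_t pack_fix8(double value)
{
    // double -> int16_t truncates toward zero and is defined while the
    // scaled value is representable in int16_t;
    // int16_t -> uint16_t is then defined as wrap-around modulo 2^16.
    return static_cast<std::uint16_t>(
        static_cast<std::int16_t>(value * 256.0));
}

// Hypothetical range check: true when the scaled value can safely pass
// through the intermediate int16_t cast above.
constexpr bool fits_fix8(double value)
{
    return value * 256.0 >= INT16_MIN && value * 256.0 <= INT16_MAX;
}

// Compile-time examples of the well-defined path.
static_assert(pack_fix8(1.5) == 0x0180, "1.5 * 256 = 384");
static_assert(pack_fix8(-0.5) == 0xFF80, "-128 wraps to 65408");
```

Since the result of the undefined conversion is whatever the compiler happens to emit, a C reference and an assembly primitive can legitimately disagree for some build flags and agree for others.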
Can I commit this patch?
Comment on attachment 195456
Avoid undefined behavior in cuTreeFix8Pack

Request approval from multimedia/x265 maintainer
(In reply to Dimitry Andric from comment #7) Dimitry, what does the first hunk do? The LLD_UNSAFE= yes? Thanks!
(In reply to Mikhail Teterin from comment #9)
> (In reply to Dimitry Andric from comment #7)
> Dimitry, what does the first hunk do? The LLD_UNSAFE= yes? Thanks!

It ensures that x265 is linked with ld.bfd instead of ld.lld, at least on i386. There is still some issue with either lld or x265's assembly routines, and it fails to link it on i386.
(In reply to Dimitry Andric from comment #10)
> There is still some issue with either lld or x265's assembly routines,
> and it fails to link it on i386.

I don't think I had a problem, either on 10.x/i386 or on 11.2/i386. Are you sure that's not something local to your system?
(In reply to Mikhail Teterin from comment #11)
> (In reply to Dimitry Andric from comment #10)
> > There is still some issue with either lld or x265's assembly routines,
> > and it fails to link it on i386.
>
> I don't think, I had a problem - neither on 10.x/i386, nor on 11.2/i386. Are
> you sure, that's not something local to your system?

On 10.x and 11.x lld is not the default /usr/bin/ld, but on my 12.x system, it is. This LLD_UNSAFE setting simply avoids using lld if it is the default linker.
(In reply to Mikhail Teterin from comment #11)
> On 10.x and 11.x lld is not the default /usr/bin/ld, but on my 12.x
> system, it is.

Only on your 12.x system, or on _all_ 12.x/i386 systems?

> This LLD_UNSAFE setting simply avoids using lld if it is the default

Ok, but I don't want to clutter the port for the sake of a feature that is still experimental...
A commit references this bug:

Author: mi
Date: Sun Aug 5 22:15:03 UTC 2018
New revision: 476480
URL: https://svnweb.freebsd.org/changeset/ports/476480

Log:
  Fix the underlying problem in the code, which previously required
  disabling AVX as a work-around:

  PR: 229788
  Submitted by: dim@

  Also, switch the build-dependency from yasm to nasm -- upstream made
  the switch in version 2.6

  Reported by: Callum Aitchison

  Bump PORTREVISION...

Changes:
  head/multimedia/x265/Makefile
  head/multimedia/x265/files/patch-bug-422
  head/multimedia/x265/files/patch-disable-avx-for-clang
Many thanks to @dim for getting to the root of the problem!
(In reply to Mikhail Teterin from comment #15)
> Many thanks to @dim for getting to the root of the problem!

Note that upstream has applied the patch now: https://bitbucket.org/multicoreware/x265/commits/88ee12651e3031dc1fc2f3f6a8bbac5f67839579