Bug 209412

Summary: math/openblas SEGFAULT during build on AMD64 BARCELONA and BULLDOZER
Product: Ports & Packages
Component: Individual Port(s)
Status: Closed FIXED
Severity: Affects Only Me
Priority: ---
Version: Latest
Hardware: amd64
OS: Any
Reporter: Cy Schubert <cy>
Assignee: Cy Schubert <cy>
CC: natbsd, phd_kimberlite, pi, zaphod
Flags: bugzilla: maintainer-feedback? (phd_kimberlite)
URL: https://github.com/xianyi/OpenBLAS/issues/786
See Also: https://github.com/xianyi/OpenBLAS/issues/786
Attachments (description, flags):
  This patch circumvents the problem by disabling stack checking. (flags: none)
  Fixes stack allocation failure for AMD Bulldozer CPU. (flags: none)

Description Cy Schubert freebsd_committer 2016-05-10 04:43:24 UTC
Created attachment 170165
This patch circumvents the problem by disabling stack checking.

Builds on i386, and builds on amd64 with Intel processors, succeed, whereas builds on amd64 with AMD CPUs (BARCELONA and BULLDOZER) still fail.

The problem is discussed at https://github.com/xianyi/OpenBLAS/issues/786. Unfortunately, that thread does not offer a fix, and my builds on my BARCELONA processor still fail. The attached patch works around the problem, but I don't think it's suitable for inclusion in ports.

Note that the port builds properly on the BARCELONA and BULLDOZER CPUs in i386 (32-bit) mode but SEGFAULTs in amd64 (64-bit) mode. The port builds successfully on the Intel Sandy Bridge CPU in both modes (i386 and amd64).
Comment 1 Cy Schubert freebsd_committer 2016-06-08 23:46:20 UTC
A new comment in the previously mentioned thread suggests increasing the buffer size by 8 at line 245 of interface/ztrmv.c.
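
For readers following along, the relationship between the stack check and the buffer size can be sketched in C. This is only an illustration, assuming the failure comes from a guard word placed after a scratch buffer being overwritten by a kernel that writes more elements than were reserved. It is not the OpenBLAS code, and `GUARD_WORD`, `ARENA_WORDS`, and `guard_survives` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define GUARD_WORD  0xDEADBEEFu  /* hypothetical sentinel value */
#define ARENA_WORDS 64           /* hypothetical arena size for this sketch */

/*
 * Reserve `requested` words of scratch space, place a guard word right
 * after them, let a stand-in "kernel" write `written` words, and report
 * whether the guard word is still intact (1) or was overwritten (0).
 */
static int guard_survives(size_t requested, size_t written)
{
    unsigned arena[ARENA_WORDS];
    size_t i;

    /* Keep every access inside the arena for this sketch. */
    assert(requested < ARENA_WORDS && written < ARENA_WORDS);

    arena[requested] = GUARD_WORD;   /* guard sits just past the buffer */
    for (i = 0; i < written; i++)    /* stand-in for the compute kernel */
        arena[i] = 0;

    return arena[requested] == GUARD_WORD;
}
```

Under these assumptions, `guard_survives(16, 20)` returns 0 (the check trips), while `guard_survives(16 + 8, 20)` returns 1. Disabling the check only hides the overrun, whereas enlarging the buffer removes it.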
Comment 2 Cy Schubert freebsd_committer 2016-06-09 02:08:19 UTC
Created attachment 171221
Fixes stack allocation failure for AMD Bulldozer CPU.

This patch, based on the discussion at https://github.com/xianyi/OpenBLAS/issues/786, fixes this PR. Tested.
Comment 3 commit-hook freebsd_committer 2016-06-09 02:15:00 UTC
A commit references this bug:

Author: cy
Date: Thu Jun  9 02:14:50 UTC 2016
New revision: 416576
URL: https://svnweb.freebsd.org/changeset/ports/416576

Log:
  Fix SEGFAULT during build on AMD Barcelona CPUs. This patch is
  based on discussion at https://github.com/xianyi/OpenBLAS/issues/786.

  PR:		209412

Changes:
  head/math/openblas/Makefile
  head/math/openblas/files/patch-interface__ztrmv.c
Comment 4 Cy Schubert freebsd_committer 2016-06-09 02:16:40 UTC
Fixed.
Comment 5 Natacha Porté 2016-06-09 08:45:55 UTC
Hello,

I'm the one who made the tests on the Github issue (and I found it through here, thank you for posting it).

I would like to point out that my tests were performed on a K8-class CPU, not BARCELONA or BULLDOZER. The important point is that my CPU is single-core, so SMP is disabled at compile-time. On SMP-capable machines (or at least whenever the build system somehow defines the `SMP` C symbol), the buffer size is computed in another code path, which is unchanged.

So I wouldn't consider the issue closed until someone with a multicore AMD CPU does the same work I did to find the correct adjustment to the buffer size.
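
The compile-time split described above can be pictured roughly as follows. This is a hedged sketch, not the contents of interface/ztrmv.c: the sizing formulas, the padding constant, and all function names are made up for illustration. Only the `SMP` symbol comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAD_ELEMS 8  /* hypothetical padding, mirroring the "+8" adjustment */

/* Hypothetical sizing used when SMP is NOT defined (the patched path). */
static size_t scratch_elems_single(size_t n)
{
    assert(n <= (SIZE_MAX - PAD_ELEMS) / 4);  /* avoid overflow */
    return 4 * n + PAD_ELEMS;
}

/* Hypothetical sizing used when SMP is defined (left unchanged). */
static size_t scratch_elems_smp(size_t n)
{
    assert(n <= SIZE_MAX / 4);                /* avoid overflow */
    return 4 * n;
}

/* The preprocessor picks one path at build time; the other never runs. */
static size_t scratch_elems(size_t n)
{
#ifdef SMP
    return scratch_elems_smp(n);
#else
    return scratch_elems_single(n);
#endif
}
```

The point of the sketch is that a fix applied to the non-SMP branch has no effect on a build where `SMP` is defined, so each configuration has to be tested separately.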
Comment 6 Cy Schubert freebsd_committer 2016-06-09 13:34:30 UTC
(In reply to Natacha Porté from comment #5)
Yes, I opened the PR (suggested by pi@ to track the bug) because my poudriere builds of math/openblas failed on my dual-core Barcelona machines but built fine on my hyperthreaded dual-core Ivy Bridge machine. The segfaults occurred only in 64-bit mode on Barcelona, not in 32-bit mode. The workaround was to disable stack checking when building on Barcelona in 64-bit mode (amd64).

As there was confusion on https://github.com/xianyi/OpenBLAS/issues/786, because nobody could reliably reproduce the bug there, I posted that I had confirmed the existence of the bug.

Your solution was tested on one of the three dual core Barcelona machines (X2 machine) I have in my basement.
Comment 7 Natacha Porté 2016-06-09 14:26:42 UTC
I mistakenly understood your earlier "tested" as referring to my tests instead of yours. So thanks for your extra testing and for committing quickly (so I can go back to vanilla ports), and sorry for the extra noise (including the present comment).
Comment 8 Kubilay Kocak freebsd_committer freebsd_triage 2019-09-16 10:24:29 UTC
*** Bug 207287 has been marked as a duplicate of this bug. ***