278417 – Support for the F16C CPU extension should be added to bsd.cpu.mk (Was: The _cvtsh_ss() intrinsic function generates illegal instructions)

Bug 278417 - Support for the F16C CPU extension should be added to bsd.cpu.mk (Was: The _cvtsh_ss() intrinsic function generates illegal instructions)

Summary: Support for the F16C CPU extension should be added to bsd.cpu.mk (Was: The _c...

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	misc (show other bugs)
Version:	14.0-STABLE
Hardware:	amd64 Any

Importance:	--- Affects Some People
Assignee:	freebsd-toolchain (Nobody)

URL:
Keywords:

Depends on:
Blocks:

Reported:	2024-04-17 20:27 UTC by Yuri Victorovich
Modified:	2024-04-21 19:07 UTC (History)
CC List:	3 users (show)

See Also:

Attachments
test.cpp (107 bytes, text/plain) 2024-04-17 20:27 UTC, Yuri Victorovich	no flags	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Yuri Victorovich freebsd_committer

2024-04-17 20:27:23 UTC

Created attachment 250036 [details]
test.cpp

See the attached testcase.

$ c++ test.cpp -mf16c
$ ./a.out 
Illegal instruction

This problem prevents porting of https://github.com/ggerganov/ggml


clang-17 on 14.0-STABLE

Comment 1 Mark Millard 2024-04-17 20:56:02 UTC

For reference:

static __inline float __DEFAULT_FN_ATTRS128 	_cvtsh_ss (unsigned short __a)
 	Converts a 16-bit half-precision float value into a 32-bit float value.

Quoting https://en.wikipedia.org/wiki/Half-precision_floating-point_format :

Support for half precision in the x86 instruction set is specified in the F16C instruction set extension, first introduced in 2009 by AMD and fairly broadly adopted by AMD and Intel CPUs by 2012. This was further extended up the AVX-512_FP16 instruction set extension implemented in the Intel Sapphire Rapids processor.

Quoting https://en.wikipedia.org/wiki/F16C :

The F16C[1] (previously/informally known as CVT16) instruction set is an x86 instruction set architecture extension which provides support for converting between half-precision and standard IEEE single-precision floating-point formats.


Is your hardware missing the F16C instruction set extension?

Comment 2 Yuri Victorovich freebsd_committer

2024-04-17 21:50:11 UTC

(In reply to Mark Millard from comment #1)

> Is your hardware missing the F16C instruction set extension?

No, it's not supported on my system.

But this should be covered by the CPUTYPE setting.
clang should know that this CPU doesn't support such instructions and should emit a generic replacement.
Different CPUTYPEs introduce different extensions like this.

It looks like clang misses this one.

Comment 3 Dimitry Andric freebsd_committer

2024-04-17 21:54:03 UTC

Indeed, this intrinsic is defined in f16cintrin.h, but that header is only included by immintrin.h if __F16C__ is defined, and that macro is normally only available when the CPU features support F16C.

If you force the support by adding -mf16c on the command line, you have to make sure your CPU actually has this instruction, or have some other mechanism to ensure that the function is never called unless the feature is detected at run-time.

Comment 4 Dimitry Andric freebsd_committer

2024-04-17 22:02:26 UTC

(In reply to Yuri Victorovich from comment #2)
> But this should be covered by the CPUTYPE setting.

That seems to be unimplemented at this point. The llvm getHostCPUFeatures() function has support for detecting the feature from CPUID bits, but it looks like that is only used for e.g. -march=native.

I don't maintain share/mk/bsd.cpu.mk, but it seems logical to have some support added there?


> clang should know that this CPU doesn't support such instructions and should emit a generic replacement.

There is a note in f16cintrin.h about this:

/* NOTE: Intel documents the 128-bit versions of these as being in emmintrin.h,
 * but that's because icc can emulate these without f16c using a library call.
 * Since we don't do that let's leave these in f16cintrin.h.
 */

so apparently it is not implemented. I guess you simply have to avoid relying on these intrinsics if your CPU does not support them.

Comment 5 Yuri Victorovich freebsd_committer

2024-04-17 22:14:21 UTC

(In reply to Dimitry Andric from comment #4)

I am not sure what should happen when some project is just using the F16C extension.

According to the rules, the default port/package build should support the basic amd64 CPU with SSE2.

But this software can't even work on such CPU.
And Clang doesn't provide the default implementation.

Comment 6 Dimitry Andric freebsd_committer

2024-04-17 22:14:50 UTC

FWIW, clang defines __F16C__ for the following -march= settings:

alderlake, arrowlake, arrowlake-s, bdver2, bdver3, bdver4, broadwell, btver2, cannonlake, cascadelake, clearwaterforest, cooperlake, core-avx-i, core-avx2, emeraldrapids, gracemont, grandridge, graniterapids, graniterapids-d, haswell, icelake-client, icelake-server, ivybridge, knl, knm, lunarlake, meteorlake, pantherlake, raptorlake, rocketlake, sapphirerapids, sierraforest, skx, skylake, skylake-avx512, tigerlake, x86-64-v3, x86-64-v4, znver1, znver2, znver3, znver4.

I'm unsure if these names correspond exactly to what is in bsd.cpu.mk now. Also, I don't see much use of the bsd.cpu.mk features in the src tree, but maybe it is used more often in the ports tree?

Comment 7 Dimitry Andric freebsd_committer

2024-04-17 22:19:23 UTC

(In reply to Yuri Victorovich from comment #5)
Ideally the program should disable the parts that require f16c, but if that is not possible, you could mark the port as requiring a CPU that supports the feature.

However, to be able to do so, support for f16c should be added to bsd.cpu.mk. Then you could check it in the port with something like:

.if defined(MACHINE_CPU) && ${MACHINE_CPU:Mf16c}

Comment 8 commit-hook freebsd_committer

2024-04-18 18:44:01 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2fd73b7126d7d7e5701e001af929411ce7a0c5f1

commit 2fd73b7126d7d7e5701e001af929411ce7a0c5f1
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2024-04-18 17:46:57 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2024-04-18 18:42:21 +0000

    share/mk/bsd.cpu.mk: add F16C feature for i386 and amd64 architectures

    As discussed in bug 278417, some ports require the F16C instruction set
    to compile, but there is no way yet to detect whether the currently
    chosen CPUTYPE supports this feature.

    Add the feature to the MACHINE_CPU variable, for each processor that
    supports it. The list of processors was extracted from clang 18's -dM
    output, filtered on the __F16C__ define.

    PR:             278417
    Reviewed by:    brooks, emaste
    MFC after:      3 days
    Differential Revision: https://reviews.freebsd.org/D44848

 share/mk/bsd.cpu.mk | 42 ++++++++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 18 deletions(-)

Comment 9 commit-hook freebsd_committer

2024-04-21 19:06:30 UTC

A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=2dff504999f570d45e969f62a3911f5d8310ab21

commit 2dff504999f570d45e969f62a3911f5d8310ab21
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2024-04-18 17:46:57 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2024-04-21 18:44:21 +0000

    share/mk/bsd.cpu.mk: add F16C feature for i386 and amd64 architectures

    As discussed in bug 278417, some ports require the F16C instruction set
    to compile, but there is no way yet to detect whether the currently
    chosen CPUTYPE supports this feature.

    Add the feature to the MACHINE_CPU variable, for each processor that
    supports it. The list of processors was extracted from clang 18's -dM
    output, filtered on the __F16C__ define.

    PR:             278417
    Reviewed by:    brooks, emaste
    MFC after:      3 days
    Differential Revision: https://reviews.freebsd.org/D44848

    (cherry picked from commit 2fd73b7126d7d7e5701e001af929411ce7a0c5f1)

 share/mk/bsd.cpu.mk | 42 ++++++++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 18 deletions(-)

Comment 10 commit-hook freebsd_committer

2024-04-21 19:07:32 UTC

A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b449894978b56651bc8d2a19b03c4af67e4ec59c

commit b449894978b56651bc8d2a19b03c4af67e4ec59c
Author:     Dimitry Andric <dim@FreeBSD.org>
AuthorDate: 2024-04-18 17:46:57 +0000
Commit:     Dimitry Andric <dim@FreeBSD.org>
CommitDate: 2024-04-21 19:02:09 +0000

    share/mk/bsd.cpu.mk: add F16C feature for i386 and amd64 architectures

    As discussed in bug 278417, some ports require the F16C instruction set
    to compile, but there is no way yet to detect whether the currently
    chosen CPUTYPE supports this feature.

    Add the feature to the MACHINE_CPU variable, for each processor that
    supports it. The list of processors was extracted from clang 18's -dM
    output, filtered on the __F16C__ define.

    PR:             278417
    Reviewed by:    brooks, emaste
    MFC after:      3 days
    Differential Revision: https://reviews.freebsd.org/D44848

    (cherry picked from commit 2fd73b7126d7d7e5701e001af929411ce7a0c5f1)

 share/mk/bsd.cpu.mk | 42 ++++++++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 18 deletions(-)