Created attachment 248146 [details] Patch to update math/openblas to 0.3.26 The proposed patch will update math/openblas to the latest version 0.3.26 It will also introduce an option to control AVX512 support. This patch depends on bug #276758
Created attachment 248147 [details] Test build logs for the proposed patch Added archive with the test build logs with different options enabled for the proposed patch.
I refuse to do an exp-run for a patch that depends on another patch
Just to make things clear - the patch proposed in this bug report doesn't include fix from the bug #276758 So we need to get patch from the bug #276758 in to the ports tree first, before attempting an exp-run on the patch proposed in this bug report. Patch from the bug #276758 can be cleanly applied to the both OpenBLAS version 0.3.25 and version 0.3.26 source code 'cause there were no changes to the drivers/other/dynamic.c code between those two OpenBLAS versions.
Created attachment 248175 [details] OpenBLAS WIP https://gitlab.archlinux.org/archlinux/packaging/packages/openblas/-/blob/main/PKGBUILD?ref_type=heads I got a bit inspired by Arch's port of OpenBLAS which seemed a bit cleaner. It's a wip and as I don't know the inner workings of OpenBLAS it's a bit hard to work on making sure everything works as expected. It did get a bit cleaner than before. - Use upstream release tarball - aarch64 and amd64 appears to work as excepted (incl make test) and in Poudriere - I haven't tested the remaining platforms however they should also work from what I can tell - Simplify arch optimizaion(s), I'm not a fan of manually doing this at all since we have CPUTYPE but due to how OpenBLAS builds I guess this is about as good as it gets. Just provade a "baseline" switch or full blast to simplify options in general. Things that needs to worked on, - Test armv7, i386 and powerpc* - I doubt anyone does any serious work on these platforms at least for now (on FreeBSD )so we don't need to do micro-optimization. As long as it builds and runs it should be fine for now at least. - Connect the benchmark build(?) - Add description for POWER8 and POWER9, seems to be the only "real" platforms supported/usable - Remove old code in Makefile
(In reply to Daniel Engberg from comment #4) Hello Daniel! I'm sorry, but I'm totally against a proposed approach. It doesn't really simplify port it and makes it harder to debug and maintain. First of all, combining several AVX options into a single global one makes the port less flexible. There are several reasons why there are three AVX options provided by OpenBLAS developers and I don't see any reason why we should combine them into a single one by ourselves. The second one is that there is no reason why we should follow an Arch Linux approach, since the current math/openblas port structure is almost optimal and provides clean and easy way to a build process. In fact what the current FreeBSD port do is it's setting up an OpenBLAS Makefile.rules standard variables according to PORTOPTIONS and running a generic OpenBLAS build pipeline after that. It doesn't directly set up any custom compiler options nor it interfere in any other ways with the OpenBLAS build process - and it's the way the port should be done 'cause it allows to easily update it to a newer version and report bugs to upstream if we'll encounter some of them. On the other hand I'm not sure why the port fetching several archives with the source code files and uses a GH tag instead of release archive but may be Eijiro Shibusawa can shed some light on it. IMO what this port is really missing is an option to enable FreeBSD CPUTYPE to OpenBLAS TARGET variable mapping. It's presence will allow a more granular control of the resulting optimizations especially during DYNAMIC_ARCH builds. But it seems that it'll be easy to implement it.
Not sure I follow you, OpenBLAS goes at great lengths to figure out things by its own which is troublesome for packaging irregardless of framework (CMake is actually easier to follow). It's also quite a mess and also all distros patch it in some way (you can have a look using repology.org), it would be nice if we could avoid that or at least upstream patches. I would also argue that the current post-patch section is less than elegant but it is what it is. https://github.com/OpenMathLib/OpenBLAS/blob/develop/Makefile.system https://github.com/OpenMathLib/OpenBLAS/blob/develop/cmake/system.cmake There's no point in having multiple options as it will try to detect by itself and/or there's also DYNARCH which will pick whatever code path it finds most suitable for your system.
(In reply to Daniel Engberg from comment #6) It is true that OpenBLAS tries to figure out the most appropriate optimization options by itself but you should consider some things before trying to "enhance" some things in it or patch it somehow or make some changes to it's build process. 1. There are globally two build paths in the OpenBLAS - one for the DYNAMIC_ARCH = 1 and another one for a current host you are building OpenBLAS on (when DYNAMIC_ARCH = 1 isn't set) 2. When DYNAMIC_ARCH is unset, then OpenBLAS auto-detects the current CPU type of a build host and compiles it's library with the code that is optimized only for the CPU of the build host. This leads to a smaller library size, and a faster compilation time. The drawback of a such approach is that the library code will not work on a CPU with a different instructions set or it will perform suboptimal on a newer CPUs (ie with a larger cache sizes). 3. When DYNAMIC_ARCH is set to 1 then OpenBLAS builds all the available CPU modules (except "older" ones) but it will optimize it's common code to the CPU it is being compiled on. This leads to an increased compilation times and a resulting library size increase significantly. But the resulting code supports numerous CPU models with the different instructions sets. But the "bottom line" of the common code optimizations is still tied to the host you are building OpenBLAS on. 4. The "upper bound" for the optimizations of the both build paths (with and without DYNAMIC_ARCH) are controlled with the three AVX options. Though the actual behavior is a bit different when DYNAMIC_ARCH is set to disabled, we can assume this for the sake of simplicity. 5. The "lower bound" of optimizations for the build path with a DYNAMIC_ARCH set to enabled could be controlled with the TARGET OpenBLAS Makefile.rules option. It could (or should) be set to a lowest possible target CPU arch you are planning to run that OpenBLAS library on. 6. The behavior of the TARGET option when DYNAMIC_ARCH is unset is a bit different - it allows you to compile an OpenBLAS library for a specific CPU target family only. Considering an AVX options: Sometimes it is necessary to disable AVX512 or AVX2 instructions due to buggy implementation of an optimized code or due to some other reasons (higher power consumption, or suboptimal performance or performance testing) so it's better to leave all those three PORTOPTIONS in the ports Makefile. Another thing to consider is that when the DYNAMIC_ARCH is set to enabled, an AVX options not only control the "upper bounds" of optimizations but and a "lower bounds" too. You can check the summary section of the build logs attached to this bug to get familiar with this behavior. So you wouldn't be able to compile a SANDYBRIDGE optimized code if you enable all AVX options all together or disable them at all. But you can accomplish this using TARGET option, but since there is no mapping of FreeBSD CPUTYPE make.conf option to an OpenBLAS TARGET Makefile.rules options it is not a trivial task to do. To summarize all of the above - there is no need to patch OpenBLAS or somehow manually interfere with it's build process - there are existing knobs in a Makefile.rules file that could be used to get the needed compilation (and optimization) results. And setting those options via Makefile.rules is the way that is expected to be used by the OpenBLAS developers. Considering the post-patch section in the current ports Makefile - if there is a better way to make changes to an OpenBLAS Makefile.rules file, I guess it can be modified to use it.
Just noticed that it's a Makefile.rule file not a Makefile.rules that I was referencing to in my previous comments...
Created attachment 251527 [details] Patch to update math/openblas to 0.3.27 This is an update patch to upgrade math/openblas to the latest version 0.3.27
Tested build with different options set on amd64 host.
Maybe an exp-run would be useful?
(In reply to Thierry Thomas from comment #11) Yes, I guess an exp-run would be useful in this case.
Asking for an exp-run.
Exp-run looks fine
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/ports/commit/?id=4f7c0ccc218086416510128603e9a1f9bb0241ac commit 4f7c0ccc218086416510128603e9a1f9bb0241ac Author: Artyom Davidov <ard_1@mail.ru> AuthorDate: 2024-06-21 17:18:36 +0000 Commit: Thierry Thomas <thierry@FreeBSD.org> CommitDate: 2024-06-21 17:49:23 +0000 math/openblas: update to 0.3.27 Changelog at <https://github.com/OpenMathLib/OpenBLAS/releases/tag/v0.3.27>. PR: 276789 Reported by: Artyom Davidov Approved by: Eijiro Shibusawa (maintainer) Exp-run by: antoine@ math/openblas/Makefile | 12 ++++++++---- math/openblas/distinfo | 6 +++--- 2 files changed, 11 insertions(+), 7 deletions(-)
Committed, thanks!