Bug 280234 - science/py-tensorflow: update 2.9.1 → 2.13.1
Summary: science/py-tensorflow: update 2.9.1 → 2.13.1
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-ports-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-07-12 09:25 UTC by Lars Herschke
Modified: 2024-11-18 09:35 UTC
CC List: 4 users

See Also:
bugzilla: maintainer-feedback? (amzo1337)
lhersch: maintainer-feedback-


Attachments
update to 2.13.1 (110.92 KB, patch)
2024-07-12 09:25 UTC, Lars Herschke
update to 2.13.1 (127.74 KB, patch)
2024-07-12 09:58 UTC, Lars Herschke
update to 2.13.1 (129.52 KB, patch)
2024-07-19 06:48 UTC, Lars Herschke
update to 2.13.1 (148.76 KB, patch)
2024-09-30 19:43 UTC, Lars Herschke
update to 2.13.1 (151.85 KB, patch)
2024-11-18 08:55 UTC, Lars Herschke

Description Lars Herschke 2024-07-12 09:25:48 UTC
Created attachment 251988
update to 2.13.1
Comment 1 Lars Herschke 2024-07-12 09:58:00 UTC
Created attachment 251990
update to 2.13.1

The first patch was incomplete.
Comment 2 Lars Herschke 2024-07-19 06:48:20 UTC
Created attachment 252159
update to 2.13.1

With the first patch, py-tensorflow could not be built on FreeBSD 14.1. I always got the following error.
Comment 3 Lars Herschke 2024-07-19 06:54:30 UTC
ERROR: /wrkdirs/usr/ports/science/py-tensorflow/work-py39/tensorflow-2.13.1/tensorflow/cc/BUILD:673:22: Executing genrule //tensorflow/cc:image_ops_genrule failed: (Exit 1): bash failed: error executing command
...
ld-elf.so.1: /wrkdirs/usr/ports/science/py-tensorflow/work-py39/bazel_out/510805f3beb273b7d5a810ae984312ce/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/cc/ops/../../../_solib_k8/_U_S_Stensorflow_Scc_Cops_Simage_Uops_Ugen_Ucc___Utensorflow/libtensorflow_framework.so.2: Undefined symbol "_ZN4absl12lts_202301254CordC1INSt3__112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEELi0EEEOT_"

This is probably because devel/abseil is built with llvm 18 on 14.1. If you build py-tensorflow with llvm 18, the error is gone. However, the build then fails on FreeBSD versions older than 14.1.
With the current change, py-tensorflow always builds with the system compiler. So I could build it successfully under 14.1, 14.0, 13.3 and even under 13.2 (with llvm 14!).
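
For reference, the namespace components of the undefined symbol can be read directly from the mangled name, because the Itanium C++ ABI stores nested names as length-prefixed identifiers. The following is a minimal Python sketch under that assumption, not a full demangler; the helper name `leading_names` is made up for illustration. It shows that the missing constructor belongs to `absl::lts_20230125::Cord`.

```python
def leading_names(mangled: str) -> list[str]:
    """Return the length-prefixed identifiers that follow the `_ZN` prefix.

    Not a full demangler: parsing stops at the first component that is not
    a plain <length><identifier> pair (here the constructor marker `C1`).
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested Itanium-mangled name")
    names = []
    pos = 3
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        names.append(mangled[end:end + length])
        pos = end + length
    return names


# Symbol copied from the ld-elf.so.1 error message in this comment.
SYMBOL = ("_ZN4absl12lts_202301254CordC1INSt3__112basic_stringIcNS3_11char_traitsIcEE"
          "NS3_9allocatorIcEEEELi0EEEOT_")

print("::".join(leading_names(SYMBOL)))
```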
Comment 4 Dima Panov freebsd_committer freebsd_triage 2024-09-20 07:33:23 UTC
(In reply to Lars Herschke from comment #3)

Sadly, it no longer builds at all after the commit "devel/abseil: update Abseil C++ to LTS version 20240722.0".

It looks like the bundled benchmark/gtest can no longer be used with recent abseil.
Comment 5 Lars Herschke 2024-09-30 19:43:07 UTC
Created attachment 253911
update to 2.13.1

Here is a new patch. It contains additional adjustments for the new devel/abseil and the new devel/protobuf version.
Comment 6 Zsolt Udvari freebsd_committer freebsd_triage 2024-10-13 18:46:26 UTC
Lars, do you want to update to 2.17?
Comment 7 Dima Panov freebsd_committer freebsd_triage 2024-10-15 08:35:01 UTC
(In reply to Zsolt Udvari from comment #6)
This would need an almost complete rewrite of the rules_python subpackage, at least.
Comment 8 Lars Herschke 2024-10-15 08:42:14 UTC
From 2.14 onwards, the build environment changes massively with regard to Python or Perl; I don't know exactly which. My knowledge of Bazel has reached its limits in this respect. Actually, I just wanted to fix the build error of the current port, and version 2.13.1 turned out to be the version that was easiest to get working.
Comment 9 Dima Panov freebsd_committer freebsd_triage 2024-10-15 20:16:31 UTC
(In reply to Lars Herschke from comment #8)

BTW, 2.13.1 builds but dumps core during initialization on start.


$ python3.11 -vv -c "import tensorflow as tf; print('TensorFlow version:', tf.version())"

[skip]
import 'tensorflow.python.platform.build_info' # <_frozen_importlib_external.SourceFileLoader object at 0x680907f0410>
import 'tensorflow.python.platform.self_check' # <_frozen_importlib_external.SourceFileLoader object at 0x680907dbe90>
# trying /usr/local/lib/python3.11/site-packages/tensorflow/python/platform/_pywrap_cpu_feature_guard.cpython-311.so
# trying /usr/local/lib/python3.11/site-packages/tensorflow/python/platform/_pywrap_cpu_feature_guard.abi3.so
# trying /usr/local/lib/python3.11/site-packages/tensorflow/python/platform/_pywrap_cpu_feature_guard.so
Segmentation fault (core dumped)
$
Comment 10 Zsolt Udvari freebsd_committer freebsd_triage 2024-10-17 18:47:30 UTC
(In reply to Lars Herschke from comment #8)
Maybe you can check Arch Linux's PKGBUILD.
Comment 11 Lars Herschke 2024-10-18 14:15:55 UTC
I also get the segmentation fault. But when I build tensorflow with debug symbols, the error no longer occurs. Therefore it is probably some kind of race condition. Of course, that doesn't make it any easier to find the cause.
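
A crash like this can be checked from a script by running the import in a child interpreter, so that the segmentation fault does not take down the calling process. The following is a rough Python sketch, assuming a POSIX system; the helper name `run_in_child` is hypothetical.

```python
import signal
import subprocess
import sys


def run_in_child(code: str) -> str:
    """Run `code` in a fresh interpreter and describe how it exited."""
    proc = subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True)
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exit status " + str(proc.returncode)


if __name__ == "__main__":
    # Assumes tensorflow is installed for the interpreter running this script.
    print(run_in_child("import tensorflow"))
```

Running this against builds with and without debug symbols gives a quick yes/no answer without reading the full `-vv` import trace.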
Comment 12 Lars Herschke 2024-11-15 11:34:30 UTC
The segmentation fault occurs as soon as _pywrap_tensorflow_internal.so or pywrap_bfloat16.so is stripped. These two files are also larger after stripping than before. If I strip with GNU strip from binutils, with llvm-strip, or directly through the linker, there are no problems.

We may be affected by the following bug.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269568

I will provide a new patch soon, which will use the linker for stripping.
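
The size comparison described above could be repeated with a small script along the following lines. This is only a sketch: the helper name `size_before_after` is made up, and it assumes the strip tool being tested modifies the file it is given in place.

```python
import os
import shutil
import subprocess
import tempfile


def size_before_after(library: str, strip_cmd: list[str]) -> tuple[int, int]:
    """Strip a temporary copy of `library` and return (size before, size after)."""
    with tempfile.TemporaryDirectory() as tmp:
        copy = os.path.join(tmp, os.path.basename(library))
        shutil.copy2(library, copy)
        before = os.path.getsize(copy)
        # The tool is expected to modify the file it is given in place.
        subprocess.run(strip_cmd + [copy], check=True)
        return before, os.path.getsize(copy)
```

For example, `size_before_after("_pywrap_tensorflow_internal.so", ["strip"])` could be compared with the same call using `["llvm-strip"]`, assuming both commands are in PATH. The original file is left untouched because only the copy is stripped.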
Comment 13 Lars Herschke 2024-11-18 08:55:46 UTC
Created attachment 255253
update to 2.13.1

Here is a new patch where the linker is used for stripping. In addition, a newer version of cpuinfo is now used. This now has native FreeBSD support and therefore no longer needs to be patched.
Comment 14 Lars Herschke 2024-11-18 09:35:31 UTC
tf.version() does not work like this. You can display the version number as follows.

python3.11 -c "import tensorflow as tf; print('TensorFlow version:', tf.version.VERSION)"
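
The reason is that `tf.version` is a module, not a function, so calling it raises a TypeError. The following Python sketch illustrates the difference; it uses a stand-in object built with `types.ModuleType` instead of the real package (an assumption made so that it runs without TensorFlow installed), and the version string is simply the one from this bug.

```python
import types

# Stand-in for the real package so the sketch runs without TensorFlow installed.
tf = types.ModuleType("tensorflow")
tf.version = types.ModuleType("tensorflow.version")
tf.version.VERSION = "2.13.1"

error = None
try:
    tf.version()  # a module object cannot be called
except TypeError as exc:
    error = str(exc)
    print("tf.version() failed:", error)

# Reading the attribute works.
print("TensorFlow version:", tf.version.VERSION)
```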