Bug 266303 - science/py-tensorflow Upgrade to 2.9.1
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Yuri Victorovich
Depends on: 266290
Blocks: 266311
Reported: 2022-09-08 20:32 UTC by Anthony Donnelly
Modified: 2022-09-13 20:33 UTC (History)

Flags: amzo1337: maintainer-feedback+


Attachments
Tensorflow 2.9.1 (463.19 KB, text/plain)
2022-09-08 20:32 UTC, Anthony Donnelly
plist-issues.txt (60.85 KB, text/plain)
2022-09-13 06:44 UTC, Yuri Victorovich

Description Anthony Donnelly 2022-09-08 20:32:30 UTC
Created attachment 236442
Tensorflow 2.9.1

I've upgraded TensorFlow to the latest version, 2.9.1.

I tested the build in poudriere and ran a simple neural network, and everything seems to be okay.
Comment 1 Anthony Donnelly 2022-09-08 20:39:32 UTC
Quick note: this is blocked until Bug 266290 is resolved.
Comment 2 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-11 04:08:10 UTC
I am getting the Java breakage:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00003dd78c4cd328, pid=1325, tid=728219
#
# JRE version: OpenJDK Runtime Environment (11.0.16+8) (build 11.0.16+8-1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.16+8-1, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
# Problematic frame:
# V  [libjvm.so+0xecd328]  JVM_RaiseSignal+0x3d3888
#
# Core dump will be written. Default location: /wrkdirs/usr/ports/science/py-tensorflow/work-py39/tensorflow-2.9.1/bazel(tensorflow-2.9.1).core
#
# An error report file with more information is saved as:
# /wrkdirs/usr/ports/science/py-tensorflow/work-py39/tensorflow-2.9.1/hs_err_pid1325.log
Compiled method (c1)   60861 3204   !   3       java.util.concurrent.ConcurrentHashMap::putVal (432 bytes)
 total in heap  [0x00003dd79f4e7390,0x00003dd79f4ea5e0] = 12880
 relocation     [0x00003dd79f4e7508,0x00003dd79f4e7718] = 528
 main code      [0x00003dd79f4e7720,0x00003dd79f4e9ac0] = 9120
 stub code      [0x00003dd79f4e9ac0,0x00003dd79f4e9b90] = 208
 oops           [0x00003dd79f4e9b90,0x00003dd79f4e9b98] = 8
 metadata       [0x00003dd79f4e9b98,0x00003dd79f4e9bd8] = 64
 scopes data    [0x00003dd79f4e9bd8,0x00003dd79f4ea170] = 1432
 scopes pcs     [0x00003dd79f4ea170,0x00003dd79f4ea450] = 736
 dependencies   [0x00003dd79f4ea450,0x00003dd79f4ea458] = 8
 handler table  [0x00003dd79f4ea458,0x00003dd79f4ea5a8] = 336
 nul chk table  [0x00003dd79f4ea5a8,0x00003dd79f4ea5e0] = 56


I think that's Java breaking while running the Bazel code.
Comment 3 Anthony Donnelly 2022-09-11 04:41:11 UTC
Is this reproducible every time? Since Java is only really used by Bazel, I'm wondering whether it's a bug in Bazel or in Java, but I can only guess at this point.

Could you share the core dump, or try building TensorFlow again to see whether it happens again?

I need to find a way to reproduce this if it's a consistent issue with compiling.
Comment 4 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-11 05:04:37 UTC
Anthony,


What system version did you build on?


Yuri
Comment 5 Anthony Donnelly 2022-09-11 05:10:16 UTC
OS: FreeBSD Shiva.home 13.1-RELEASE-p2 FreeBSD 13.1-RELEASE-p2 GENERIC amd64
Java Version: openjdk11-11.0.16+8.1_1
bazel version: bazel-5.3.0


I've done around 5 successful builds so far, and the person who opened the issue on GitHub to update TensorFlow also managed a successful build. However, I've been looking at open bugs regarding Java, and bug 265805 is currently open and seems similar.
Comment 6 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-11 05:17:49 UTC
(In reply to Anthony Donnelly from comment #5)

For me Java failures occurred on 13.1-RELEASE-p2 GENERIC amd64.
Java crashed a few times while building bazel too, but then the bazel build succeeded. So the failure seems to be intermittent.

Now I am 1 hour into the build on 13-STABLE and it is working so far.
Comment 7 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-11 22:11:43 UTC
TF has built successfully on 13.1-STABLE.

-------------------------------------------------------

I tried this example: https://www.tensorflow.org/tutorials/quickstart/advanced and it couldn't find tensorflow.keras

Is it missing?
Comment 8 Anthony Donnelly 2022-09-11 22:30:24 UTC
The tf.keras imports come from py-keras, which is bug 266311. If you build and install that, the tutorial will work. Most distributions treat Keras as a post-install dependency, but I wasn't sure how to handle that in FreeBSD ports, because Keras requires TensorFlow to build.
Comment 9 Anthony Donnelly 2022-09-11 22:31:00 UTC
I created the ports this way to avoid circular dependencies.
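Because the two ports are installed separately, it can help to confirm that both top-level packages are present before running a tutorial that uses tf.keras. The following is a minimal sketch using only the standard library; the helper name module_available is hypothetical and not part of either port.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be located."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for malformed or half-initialised modules;
        # treat those cases as "not available".
        return False


if __name__ == "__main__":
    # Per the comments above, tf.keras needs both packages installed.
    for mod in ("tensorflow", "keras"):
        state = "found" if module_available(mod) else "missing"
        print(f"{mod}: {state}")
```

If tensorflow is reported as found but keras as missing, installing py-keras is the likely fix for the import failure described in comment 7.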
Comment 10 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-11 22:48:37 UTC
(In reply to Anthony Donnelly from comment #9)

No problem.
Comment 11 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-12 03:26:02 UTC
I think that JOBS_ALL should be a default, not JOBS_1. People who would have problems building the port would be able to lower the number of jobs. IMO there's no need to always begin with 1 job. What do you think?
Comment 12 Anthony Donnelly 2022-09-12 07:00:43 UTC
When Bazel jobs were set to all, the build would be killed on the FreeBSD build machines, which is why JOBS_1 is the default. I'm wondering if setting a memory limit would also work.
Comment 13 Anthony Donnelly 2022-09-12 07:15:38 UTC
I should add that it used to be killed for running out of resources when it was built alongside other packages. That resulted in emails being sent about the port being broken, and it ended up being marked as broken.

Bazel has a --local_ram_resources= option to limit memory usage, which would probably fix it, but I don't really want to set an arbitrary limit that might slow the build for people who build from source.
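One way to avoid a hard-coded limit is to derive the job count from whatever memory budget the builder has. The sketch below is illustrative only: the helper names and the 2048 MB per-job estimate are assumptions, not measured values for this port. Only --jobs and --local_ram_resources are actual Bazel options.

```python
import os


def pick_jobs(ram_mb: int, per_job_mb: int = 2048, cpus=None) -> int:
    """Largest job count that fits in ram_mb, capped by CPUs, never below 1.

    per_job_mb is an assumed peak memory per compile job, not a measurement.
    """
    if cpus is None:
        cpus = os.cpu_count() or 1
    return max(1, min(cpus, ram_mb // per_job_mb))


def bazel_resource_flags(ram_mb: int, per_job_mb: int = 2048, cpus=None) -> list:
    """Build the Bazel flags for a given memory budget in megabytes."""
    jobs = pick_jobs(ram_mb, per_job_mb, cpus)
    return [f"--jobs={jobs}", f"--local_ram_resources={ram_mb}"]


if __name__ == "__main__":
    # Example: an 8 GB budget on a 16-core machine.
    print(" ".join(bazel_resource_flags(8192, cpus=16)))
```

With this approach a machine with plenty of memory still uses all of its cores, while a constrained builder falls back toward a single job.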
Comment 14 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-12 07:24:25 UTC
(In reply to Anthony Donnelly from comment #12)

By default poudriere doesn't limit memory, but there is the MAX_MEMORY setting, which is likely enabled on the builders.

Since builders are shared between ports, MAX_MEMORY can't be specified per-port.
poudriere creates jails with a memory limit when MAX_MEMORY is set.
Comment 15 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-12 07:39:23 UTC
I asked for the feature to have MAX_MEMORY per-port: https://github.com/freebsd/poudriere/issues/1016

They do set a memory limit on the jail at creation, but this doesn't seem essential. The limit supplied with jexec is what actually limits the memory.
Comment 16 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-13 06:44:07 UTC
Created attachment 236534
plist-issues.txt

When I rebuilt in poudriere I got plist issues; see the attachment.

Contrary to what the post-install comment says, it installs some headers and libraries.

Should I just add them to pkg-plist?
Comment 17 Anthony Donnelly 2022-09-13 09:05:52 UTC
Yeah, just add them to the plist. I think this is because the package used to provide only the Python bindings; I added libtensorflow* and the headers and forgot to update the plist.
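The kind of mismatch reported in the attachment can be illustrated by comparing the files under a staged prefix with the entries in a plist. This is a simplified sketch, not the check poudriere performs: the function names are hypothetical, and it assumes a plist with plain relative paths, ignoring %%VAR%% substitutions and treating every @keyword line as a non-file entry.

```python
import os


def staged_files(prefix_dir: str) -> set:
    """Relative paths of all non-directory entries under prefix_dir."""
    found = set()
    for root, _dirs, files in os.walk(prefix_dir):
        for name in files:
            full = os.path.join(root, name)
            found.add(os.path.relpath(full, prefix_dir))
    return found


def plist_entries(plist_text: str) -> set:
    """Plain file entries from plist text, skipping blanks and @keyword lines."""
    entries = set()
    for line in plist_text.splitlines():
        line = line.strip()
        if line and not line.startswith("@"):
            entries.add(line)
    return entries


def orphans(prefix_dir: str, plist_text: str) -> list:
    """Staged files that the plist does not list, sorted for stable output."""
    return sorted(staged_files(prefix_dir) - plist_entries(plist_text))
```

Calling orphans() with the staged prefix directory and the contents of pkg-plist returns the paths that still need to be added, such as newly installed headers and libraries.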
Comment 18 Yuri Victorovich freebsd_committer freebsd_triage 2022-09-13 20:33:24 UTC
Committed, thanks!
Comment 19 commit-hook freebsd_committer freebsd_triage 2022-09-13 20:33:33 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=56dc0e449f2a0fdf190cfc1de367b148cbb5e467

commit 56dc0e449f2a0fdf190cfc1de367b148cbb5e467
Author:     Anthony Donnelly <amzo1337@gmail.com>
AuthorDate: 2022-09-13 20:21:29 +0000
Commit:     Yuri Victorovich <yuri@FreeBSD.org>
CommitDate: 2022-09-13 20:32:36 +0000

    science/py-tensorflow: Update 1.15.5 -> 2.9.1

    Big thank you to Anthony Donnelly for updating this difficult port.

    PR:             266303

 science/py-tensorflow/Makefile                     | 182 +++--
 science/py-tensorflow/Makefile.MASTER_SITES        |  94 ++-
 science/py-tensorflow/distinfo                     |  98 ++-
 .../files/bazel/add-default-option.patch (new)     |  37 +
 .../files/bazel/fix-environ.patch (new)            |  42 +
 .../files/bazel/fix_cpuinfo.patch (new)            |  80 ++
 .../files/bazel/freebsd_python_fix.patch (new)     |  11 +
 .../files/bazel/set-c++17.patch (new)              |  11 +
 science/py-tensorflow/files/bazelrc                |  24 +-
 ...absl_base_internal_unscaledcycleclock.cc (gone) |  47 --
 science/py-tensorflow/files/freebsd/BUILD (new)    |  88 ++
 .../files/freebsd/cc_toolchain_config.bzl (new)    | 287 +++++++
 science/py-tensorflow/files/patch-.bazelrc (new)   |  22 +
 science/py-tensorflow/files/patch-WORKSPACE        |  61 +-
 science/py-tensorflow/files/patch-bazelrc (gone)   |  11 -
 science/py-tensorflow/files/patch-configure (new)  |  11 +
 science/py-tensorflow/files/patch-protobuf (gone)  |  22 -
 .../files/patch-tensorflow_BUILD (new)             |  22 +
 ...ch-tensorflow_compiler_mlir_hlo_WORKSPACE (new) |  11 +
 ...atch-tensorflow_compiler_mlir_lite_BUILD (gone) |  10 -
 ...ow_compiler_mlir_lite_quantization_BUILD (gone) |   8 -
 ...mpiler_mlir_quantization_tensorflow_BUILD (new) |  11 +
 ...ensorflow_compiler_mlir_tensorflow_BUILD (gone) |  10 -
 .../patch-tensorflow_contrib_bigtable_BUILD (gone) |  20 -
 ...-tensorflow_contrib_boosted__trees_BUILD (gone) |  24 -
 ...-tensorflow_contrib_ffmpeg_default_BUILD (gone) |  10 -
 ...ls_client_ignite__plain__client__unix.cc (gone) |  13 -
 ...tch-tensorflow_contrib_makefile_Makefile (gone) |  49 --
 .../files/patch-tensorflow_core_BUILD              |  44 +-
 ...flow_core_distributed__runtime_rpc_BUILD (gone) |  10 -
 ...ibuted__runtime_rpc_grpc__server__lib.cc (gone) |  15 -
 .../patch-tensorflow_core_lib_png_BUILD (new)      |  10 +
 .../patch-tensorflow_core_platform_BUILD (new)     |  10 +
 ...tch-tensorflow_core_platform_cloud_BUILD (gone) |  10 -
 ...w_core_platform_cloud_gcs__dns__cache.cc (gone) |  14 -
 ...-tensorflow_core_profiler_internal_BUILD (gone) |  10 -
 ...ensorflow_core_profiler_rpc_client_BUILD (gone) |  19 -
 ...ensorflow_core_protobuf_autotuning.proto (gone) |  21 -
 ...tools_make_targets_freebsd__makefile.inc (gone) |  13 -
 ...h-tensorflow_lite_kernels_internal_BUILD (gone) |  11 -
 ...w_lite_python_interpreter__wrapper_BUILD (gone) |  10 -
 .../files/patch-tensorflow_lite_tools_BUILD (gone) |  10 -
 ...atch-tensorflow_lite_tools_make_Makefile (gone) |  19 -
 ...tools_make_targets_freebsd__makefile.inc (gone) |  19 -
 ...tch-tensorflow_lite_tools_optimize_BUILD (gone) |  20 -
 ...ow_lite_tools_optimize_calibration_BUILD (gone) |  34 -
 ...sorflow_python_eager_pywrap__tfe__src.cc (gone) |  20 -
 ...h-tensorflow_python_lib_core_bfloat16.cc (gone) |  11 -
 ...thon_lib_core_ndarray__tensor__bridge.cc (gone) |  11 -
 ...ream__executor_stream__executor__pimpl.h (gone) |  10 -
 .../files/patch-tensorflow_tensorflow.bzl (gone)   |  65 --
 ...atch-tensorflow_tools_lib__package_BUILD (gone) |  18 -
 ...ch-tensorflow_tools_pip__package_setup.py (new) |  34 +
 .../patch-tensorflow_tools_proto__text_BUILD (new) |  10 +
 .../files/patch-tensorflow_workspace.bzl (gone)    |  18 -
 .../files/patch-tensorflow_workspace2.bzl (new)    |  10 +
 ...ird__party_absl_system.absl.strings.BUILD (new) |  26 +
 .../patch-third__party_aws_BUILD.bazel (gone)      |  13 -
 ...tch-third__party_com__google__absl.BUILD (gone) |  13 -
 .../patch-third__party_cpuinfo_cpuinfo.BUILD (new) |  51 ++
 .../patch-third__party_cpuinfo_workspace.bzl (new) |   9 +
 ...ch-third__party_flatbuffers_BUILD.system (gone) |  18 -
 ...third__party_llvm_macos__build__fix.patch (new) |  11 +
 .../patch-third__party_llvm_workspace.bzl (new)    |  10 +
 .../files/patch-third__party_mlir_BUILD (gone)     |  10 -
 .../files/patch-third__party_py_BUILD.tpl (new)    |  17 +
 ...e__config_remote__platform__configure.bzl (new) |  19 +
 ...hird__party_systemlibs_functools32.BUILD (gone) |  18 -
 ...patch-third__party_systemlibs_grpc.BUILD (gone) |  11 -
 ...ch-third__party_systemlibs_jsoncpp.BUILD (gone) |  21 -
 ...ch-third__party_systemlibs_protobuf.BUILD (new) |  25 +
 ...tch-third__party_systemlibs_protobuf.bzl (gone) |  11 -
 ..._party_systemlibs_syslibs__configure.bzl (gone) |  10 -
 science/py-tensorflow/files/tensorflow.pc (new)    |  11 +
 science/py-tensorflow/files/tensorflow_cc.pc (new) |  11 +
 science/py-tensorflow/pkg-plist (new)              | 908 +++++++++++++++++++++
 76 files changed, 2090 insertions(+), 955 deletions(-)