Summary: | science/py-tensorflow Upgrade to 2.9.1 | |
---|---|---|---
Product: | Ports & Packages | Reporter: | Anthony Donnelly <amzo1337>
Component: | Individual Port(s) | Assignee: | Yuri Victorovich <yuri>
Status: | Closed FIXED | |
Severity: | Affects Many People | CC: | yuri
Priority: | --- | Flags: | amzo1337: maintainer-feedback+
Version: | Latest | |
Hardware: | amd64 | |
OS: | Any | |
Bug Depends on: | 266290 | |
Bug Blocks: | 266311 | |
Attachments: | | |

Quick note, this is blocked until the resolution of Bug 266290

I am getting the Java breakage:

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x00003dd78c4cd328, pid=1325, tid=728219
    #
    # JRE version: OpenJDK Runtime Environment (11.0.16+8) (build 11.0.16+8-1)
    # Java VM: OpenJDK 64-Bit Server VM (11.0.16+8-1, mixed mode, tiered, compressed oops, g1 gc, bsd-amd64)
    # Problematic frame:
    # V [libjvm.so+0xecd328] JVM_RaiseSignal+0x3d3888
    #
    # Core dump will be written. Default location: /wrkdirs/usr/ports/science/py-tensorflow/work-py39/tensorflow-2.9.1/bazel(tensorflow-2.9.1).core
    #
    # An error report file with more information is saved as:
    # /wrkdirs/usr/ports/science/py-tensorflow/work-py39/tensorflow-2.9.1/hs_err_pid1325.log
    Compiled method (c1) 60861 3204 ! 3 java.util.concurrent.ConcurrentHashMap::putVal (432 bytes)
     total in heap [0x00003dd79f4e7390,0x00003dd79f4ea5e0] = 12880
     relocation [0x00003dd79f4e7508,0x00003dd79f4e7718] = 528
     main code [0x00003dd79f4e7720,0x00003dd79f4e9ac0] = 9120
     stub code [0x00003dd79f4e9ac0,0x00003dd79f4e9b90] = 208
     oops [0x00003dd79f4e9b90,0x00003dd79f4e9b98] = 8
     metadata [0x00003dd79f4e9b98,0x00003dd79f4e9bd8] = 64
     scopes data [0x00003dd79f4e9bd8,0x00003dd79f4ea170] = 1432
     scopes pcs [0x00003dd79f4ea170,0x00003dd79f4ea450] = 736
     dependencies [0x00003dd79f4ea450,0x00003dd79f4ea458] = 8
     handler table [0x00003dd79f4ea458,0x00003dd79f4ea5a8] = 336
     nul chk table [0x00003dd79f4ea5a8,0x00003dd79f4ea5e0] = 56

I think that's Java breaking while running the Bazel code. Is this reproducible every time? Java is only really used by bazel, so I'm wondering if it's a bug in bazel or java, but I can only really guess at this point.

Could you share the core dump, or try to build tensorflow again and see if it happens again? I need to find a way to reproduce this if it's a consistent issue with compiling.

Anthony,

What system version did you build on?

Yuri

OS: FreeBSD Shiva.home 13.1-RELEASE-p2 FreeBSD 13.1-RELEASE-p2 GENERIC amd64
Java Version: openjdk11-11.0.16+8.1_1
bazel version: bazel-5.3.0

I've done around 5 successful builds so far, and the person who opened the issue on github to update tensorflow also managed a successful build. However, I've been looking at open bugs regarding java, and bug 265805, which is currently open, seems similar.

(In reply to Anthony Donnelly from comment #5)

For me Java failures occurred on 13.1-RELEASE-p2 GENERIC amd64. Java crashed a few times while building bazel too, but then the bazel build succeeded. So the failure seems to be intermittent.

Now I am 1 hour into the build on 13-STABLE and it is working so far.

TF has built successfully on 13.1-STABLE.

-------------------------------------------------------

I tried this example: https://www.tensorflow.org/tutorials/quickstart/advanced and it couldn't find tensorflow.keras

Is it missing?

The tf.keras imports are from py-keras, which is bug 266311. If you build and install that, the tutorial will work.

I wasn't sure how to handle the dependencies: most distributions have keras as a post dependency, but keras requires tensorflow to build, and I didn't know how to express that with FreeBSD ports. I created the ports this way to avoid circular dependencies.

(In reply to Anthony Donnelly from comment #9)

No problem.

I think that JOBS_ALL should be a default, not JOBS_1. People who would have problems building the port would be able to lower the number of jobs. IMO there's no need to always begin with 1 job. What do you think?

When bazel jobs were set to all, the build would be killed by the FreeBSD build machine, which is why it's there. I'm wondering if setting a memory limit would also work.

I should add, it used to be killed by running out of resources when being built alongside other packages. That resulted in emails being sent about the port being broken, and it ended up being marked as broken.

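For reference, the jobs-versus-memory trade-off in this part of the thread maps onto two Bazel command-line flags: `--jobs` and the `--local_ram_resources` option named in the discussion. The sketch below only assembles those flags from explicit limits. The helper name `bazel_resource_flags` and the example values are hypothetical, and nothing here is taken from the port's Makefile or bazelrc.

```python
def bazel_resource_flags(jobs=None, ram_mb=None):
    """Build a list of Bazel flags that cap parallel jobs and local RAM.

    jobs   -- maximum number of concurrent Bazel jobs, or None for Bazel's default
    ram_mb -- RAM in megabytes that Bazel may assume is available, or None
    Both values are illustrative inputs; the port does not define them.
    """
    flags = []
    if jobs is not None:
        if jobs < 1:
            raise ValueError("jobs must be at least 1")
        flags.append(f"--jobs={jobs}")
    if ram_mb is not None:
        if ram_mb < 1:
            raise ValueError("ram_mb must be at least 1")
        # An integer value is interpreted by Bazel as megabytes.
        flags.append(f"--local_ram_resources={ram_mb}")
    return flags


if __name__ == "__main__":
    # Example: 4 jobs and 8 GB of RAM, both arbitrary numbers.
    print(" ".join(bazel_resource_flags(jobs=4, ram_mb=8192)))
```

Leaving both arguments unset returns an empty list, which corresponds to letting Bazel pick its own defaults, so a cap would only apply where a builder actually needs one.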
There is a --local_ram_resources= option in bazel to limit the memory usage which would probably fix it, but I don't really want to set an arbitrary limit that might slow the build for people who build from source.

(In reply to Anthony Donnelly from comment #12)

By default poudriere doesn't limit memory. But there is the MAX_MEMORY setting that is likely enabled on the builders. Since builders are shared between ports, MAX_MEMORY can't be specified per-port. poudriere creates jails with a memory limit when MAX_MEMORY is set.

I asked for the feature to have MAX_MEMORY per-port: https://github.com/freebsd/poudriere/issues/1016

They do set a memory limit on the jail at creation, but this doesn't seem essential. The limit supplied with jexec is what limits the memory.

Created attachment 236534 [details]
plist-issues.txt
When I've rebuilt in poudriere I got plist issues, see attachment.
Contrary to what the post-install comment says, it installs some headers and libraries.
Should I just add them to pkg-plist?

Yeah, just add them to plist. I think this is because the package used to only provide the python bindings, but I added in the libtensorflow* and the headers and forgot to update the plists.

Committed, thanks!

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=56dc0e449f2a0fdf190cfc1de367b148cbb5e467

commit 56dc0e449f2a0fdf190cfc1de367b148cbb5e467
Author: Anthony Donnelly <amzo1337@gmail.com>
AuthorDate: 2022-09-13 20:21:29 +0000
Commit: Yuri Victorovich <yuri@FreeBSD.org>
CommitDate: 2022-09-13 20:32:36 +0000

    science/py-tensorflow: Update 1.15.5 -> 2.9.1

    Big thank you to Anthony Donnelly for updating this difficult port.

    PR: 266303

    science/py-tensorflow/Makefile | 182 +++--
    science/py-tensorflow/Makefile.MASTER_SITES | 94 ++-
    science/py-tensorflow/distinfo | 98 ++-
    .../files/bazel/add-default-option.patch (new) | 37 +
    .../files/bazel/fix-environ.patch (new) | 42 +
    .../files/bazel/fix_cpuinfo.patch (new) | 80 ++
    .../files/bazel/freebsd_python_fix.patch (new) | 11 +
    .../files/bazel/set-c++17.patch (new) | 11 +
    science/py-tensorflow/files/bazelrc | 24 +-
    ...absl_base_internal_unscaledcycleclock.cc (gone) | 47 --
    science/py-tensorflow/files/freebsd/BUILD (new) | 88 ++
    .../files/freebsd/cc_toolchain_config.bzl (new) | 287 +++++++
    science/py-tensorflow/files/patch-.bazelrc (new) | 22 +
    science/py-tensorflow/files/patch-WORKSPACE | 61 +-
    science/py-tensorflow/files/patch-bazelrc (gone) | 11 -
    science/py-tensorflow/files/patch-configure (new) | 11 +
    science/py-tensorflow/files/patch-protobuf (gone) | 22 -
    .../files/patch-tensorflow_BUILD (new) | 22 +
    ...ch-tensorflow_compiler_mlir_hlo_WORKSPACE (new) | 11 +
    ...atch-tensorflow_compiler_mlir_lite_BUILD (gone) | 10 -
    ...ow_compiler_mlir_lite_quantization_BUILD (gone) | 8 -
    ...mpiler_mlir_quantization_tensorflow_BUILD (new) | 11 +
    ...ensorflow_compiler_mlir_tensorflow_BUILD (gone) | 10 -
    .../patch-tensorflow_contrib_bigtable_BUILD (gone) | 20 -
    ...-tensorflow_contrib_boosted__trees_BUILD (gone) | 24 -
    ...-tensorflow_contrib_ffmpeg_default_BUILD (gone) | 10 -
    ...ls_client_ignite__plain__client__unix.cc (gone) | 13 -
    ...tch-tensorflow_contrib_makefile_Makefile (gone) | 49 --
    .../files/patch-tensorflow_core_BUILD | 44 +-
    ...flow_core_distributed__runtime_rpc_BUILD (gone) | 10 -
    ...ibuted__runtime_rpc_grpc__server__lib.cc (gone) | 15 -
    .../patch-tensorflow_core_lib_png_BUILD (new) | 10 +
    .../patch-tensorflow_core_platform_BUILD (new) | 10 +
    ...tch-tensorflow_core_platform_cloud_BUILD (gone) | 10 -
    ...w_core_platform_cloud_gcs__dns__cache.cc (gone) | 14 -
    ...-tensorflow_core_profiler_internal_BUILD (gone) | 10 -
    ...ensorflow_core_profiler_rpc_client_BUILD (gone) | 19 -
    ...ensorflow_core_protobuf_autotuning.proto (gone) | 21 -
    ...tools_make_targets_freebsd__makefile.inc (gone) | 13 -
    ...h-tensorflow_lite_kernels_internal_BUILD (gone) | 11 -
    ...w_lite_python_interpreter__wrapper_BUILD (gone) | 10 -
    .../files/patch-tensorflow_lite_tools_BUILD (gone) | 10 -
    ...atch-tensorflow_lite_tools_make_Makefile (gone) | 19 -
    ...tools_make_targets_freebsd__makefile.inc (gone) | 19 -
    ...tch-tensorflow_lite_tools_optimize_BUILD (gone) | 20 -
    ...ow_lite_tools_optimize_calibration_BUILD (gone) | 34 -
    ...sorflow_python_eager_pywrap__tfe__src.cc (gone) | 20 -
    ...h-tensorflow_python_lib_core_bfloat16.cc (gone) | 11 -
    ...thon_lib_core_ndarray__tensor__bridge.cc (gone) | 11 -
    ...ream__executor_stream__executor__pimpl.h (gone) | 10 -
    .../files/patch-tensorflow_tensorflow.bzl (gone) | 65 --
    ...atch-tensorflow_tools_lib__package_BUILD (gone) | 18 -
    ...ch-tensorflow_tools_pip__package_setup.py (new) | 34 +
    .../patch-tensorflow_tools_proto__text_BUILD (new) | 10 +
    .../files/patch-tensorflow_workspace.bzl (gone) | 18 -
    .../files/patch-tensorflow_workspace2.bzl (new) | 10 +
    ...ird__party_absl_system.absl.strings.BUILD (new) | 26 +
    .../patch-third__party_aws_BUILD.bazel (gone) | 13 -
    ...tch-third__party_com__google__absl.BUILD (gone) | 13 -
    .../patch-third__party_cpuinfo_cpuinfo.BUILD (new) | 51 ++
    .../patch-third__party_cpuinfo_workspace.bzl (new) | 9 +
    ...ch-third__party_flatbuffers_BUILD.system (gone) | 18 -
    ...third__party_llvm_macos__build__fix.patch (new) | 11 +
    .../patch-third__party_llvm_workspace.bzl (new) | 10 +
    .../files/patch-third__party_mlir_BUILD (gone) | 10 -
    .../files/patch-third__party_py_BUILD.tpl (new) | 17 +
    ...e__config_remote__platform__configure.bzl (new) | 19 +
    ...hird__party_systemlibs_functools32.BUILD (gone) | 18 -
    ...patch-third__party_systemlibs_grpc.BUILD (gone) | 11 -
    ...ch-third__party_systemlibs_jsoncpp.BUILD (gone) | 21 -
    ...ch-third__party_systemlibs_protobuf.BUILD (new) | 25 +
    ...tch-third__party_systemlibs_protobuf.bzl (gone) | 11 -
    ..._party_systemlibs_syslibs__configure.bzl (gone) | 10 -
    science/py-tensorflow/files/tensorflow.pc (new) | 11 +
    science/py-tensorflow/files/tensorflow_cc.pc (new) | 11 +
    science/py-tensorflow/pkg-plist (new) | 908 +++++++++++++++++++++
    76 files changed, 2090 insertions(+), 955 deletions(-)

Created attachment 236442 [details]
Tensorflow 2.9.1

I've upgraded Tensorflow to the latest version, 2.9.1. I built it in poudriere and ran a simple neural network, and all seems to be okay.
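
Since the thread notes that the tf.keras imports come from the separate py-keras port (bug 266311), a quick pre-check before running a test network is to confirm that both modules can be located. This is a minimal sketch, assuming py-keras installs a top-level Python module named `keras`. The helper names `module_available` and `missing_modules` are hypothetical and not part of either port.

```python
import importlib.util


def module_available(name):
    """Return True if the named module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names whose parent package is itself missing.
        return False


def missing_modules(names):
    """Return the subset of names that cannot be located, in input order."""
    return [name for name in names if not module_available(name)]


if __name__ == "__main__":
    # Assumption: py-tensorflow provides "tensorflow" and py-keras provides "keras".
    missing = missing_modules(["tensorflow", "keras"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Both modules were found.")
```

If `keras` shows up as missing while `tensorflow` is found, that would match the symptom reported with the quickstart tutorial in this thread.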