Summary: | lang/mono: [patch] mono-sgen SIGSEGV during build | |
---|---|---|---
Product: | Ports & Packages | Reporter: | Thomas Hurst <tom>
Component: | Individual Port(s) | Assignee: | David Naylor <dbn>
Status: | Closed FIXED | |
Severity: | Affects Only Me | CC: | dbn, emaste, lacey.leanne, pmichel, rafael.espindola
Priority: | --- | Keywords: | patch
Version: | Latest | Flags: | bugzilla: maintainer-feedback? (mono)
Hardware: | Any | |
OS: | Any | |
I tested this patch on

FreeBSD talizorah 10.2-RELEASE-p14 FreeBSD 10.2-RELEASE-p14 #0: Wed Mar 16 20:46:12 UTC 2016 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

And this solution allows me to build mono from ports. =) If needed, I can also build with Poudriere under 10.1 and 9.3 on amd64 to validate that this patch works there as well.

Can I suggest bumping the max slots to a power of 2 (i.e. 32 instead of 17)? Looking through the code, there is no apparent reason to use a power of 2, and the 1024 may already provide any scaling benefits needed; however, I am suspicious.

Created attachment 168803 [details]
Patch to increase _WAPI_PRIVATE_MAX_SLOTS to (1024 * 32)
Slots bumped per suggestion above, tested with `poudriere testport` under 9.3 and 10.2 amd64. The new patch applies cleanly via `patch -p0 <mono_iolayer_filehandles.patch` in the root of the ports tree.
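For reference, the slot limits discussed in this report work out as follows. This is only a small arithmetic sketch; the enum names and the `percent_increase` helper are made up for illustration and do not appear in mono.

```c
/* Slot limits mentioned in this report (names are hypothetical). */
enum {
    SLOTS_ORIGINAL     = 1024 * 16, /* 16384: value in handles-private.h */
    SLOTS_FIRST_PATCH  = 1024 * 17, /* 17408: first proposed patch */
    SLOTS_SECOND_PATCH = 1024 * 32  /* 32768: power-of-2 multiplier */
};

/* Increase of `bumped` over `base` in whole percent (truncated). */
static int percent_increase(int base, int bumped)
{
    return (bumped - base) * 100 / base;
}
```

With these values, the first patch adds roughly 6% more slots, while the second doubles the limit.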
LLD developers have come across this issue again while trying to test a ports tree linked with LLD. Is there some reason this hasn't made it into the ports tree? Has it been proposed upstream in the mono project?

For what it is worth, I still got the crash with 1024 * 17 but it worked with 10224*32.

(In reply to Rafael Avila de Espindola from comment #5)
I assume 10224 is a typo, and you meant 1024*32.

(In reply to Ed Maste from comment #6)
Yes, I meant 1024*32.

It wasn't committed as this error never came up in testing or exp-run. Would you please confirm what environment is required to trigger this issue? It appears related to the amount of RAM?

(In reply to David Naylor from comment #8)
I got the error on an EC2 r3.8xlarge instance.

(In reply to Rafael Avila de Espindola from comment #9)
For reference, from https://aws.amazon.com/ec2/instance-types/
r3.8xlarge: 32 vCPU, 244 GiB, 2 x 320 GB SSD

Any chance we could get this patch merged into the mono port, given it's a tiny change but solves lots of people's problems?

I'm investigating this change as an update to mono-4.8.1.0.

The patch as proposed no longer works for mono-4.8 and higher. However, the behavior is still the same: mono crashes if any more than ~128G of RAM is present :( Fixing this would be much appreciated.

I've found the offending line in 4.8.1.0. It is now in mono/utils/w32handle.c. As before, increasing MAX_SLOTS to something like (1024 * 32) fixes the issue. I've attached a new patch above.

Created attachment 182291 [details]
Patch to increase max slots to (1024*32) in mono 4.8.1.0
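The thread asks which environment triggers the crash but does not establish why memory size matters. One hypothesis, which is an assumption here and not confirmed anywhere in this report, is that the number of slots needed follows a per-process limit that the system scales with RAM, such as the open-file limit. A minimal sketch for comparing an affected and an unaffected machine might look like this; the `sketch_` function names are hypothetical, and `_SC_PHYS_PAGES` is a common extension rather than strict POSIX.

```c
#include <unistd.h>

/* Per-process open-file limit as reported by sysconf();
 * -1 if the limit is indeterminate. */
static long sketch_open_max(void)
{
    return sysconf(_SC_OPEN_MAX);
}

/* Physical memory in bytes, or -1 if either value is unavailable.
 * _SC_PHYS_PAGES is an extension available on FreeBSD and Linux. */
static long long sketch_phys_bytes(void)
{
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);

    if (pages < 0 || page_size < 0)
        return -1;
    return (long long)pages * (long long)page_size;
}
```

Printing both values on a 24GB system and on a high-memory system would show whether the limit differs between them, which would support or rule out the hypothesis.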
A commit references this bug:

Author: dbn
Date: Sat May 13 07:48:29 UTC 2017
New revision: 440759
URL: https://svnweb.freebsd.org/changeset/ports/440759

Log:
  Update mono and related ports

  USES=mono: minor fixes
  - save a copy of the nuget package in the packages directory
  - force linking of directories, allowing nuget-extract to be rerun without `make clean`
  - fix makenuget: nuget requires an equals to identify the version, not a dash

  devel/monodevelop: update to 6.2.1.3
  - update nuget packages:
    - link older System.Collection.Immutable 1.1.37 to newer 1.3.1 (used by C# and F# respectively)
  - update external github repositories
  - allow post-extract target to be run multiple times
  - change MonoDevelop.Packaging to use a newer version of NuGet.Build.Packaging (the older version is no longer fetchable)
  - remove patch integrated upstream
  - moved `nuget restore` patching from post-patch into a patch file (the former broke silently)
  - ChangeLog:
    - https://developer.xamarin.com/releases/studio/xamarin.studio_6.2/xamarin.studio_6.2/

  irc/smartirc4net: update to 1.1
  - add LICENSE

  lang/fsharp: update to 4.1.18
  - add test dependency on libgdiplus
  - update nuget packages
  - update test paths for fsharp assemblies
  - update patches to prevent `nuget restore` from running
  - ChangeLog:
    - Set executable bit correctly on output
    - Integrate visualfsharp
    - Fix regression on binding redirects for System.Collections.Immutable
    - Fix regression in Microsoft.Build.FSharp.targets
    - Fix binding redirects for System.Collections.Immutable
    - Fix version of library going in %PREFIX/lib/mono/fsharp
    - Align fsc task and target file
    - Use install layout that includes mono/fsharp
    - Fix F# Intereactive on Mono 4.9+
    - Update compiler tools
    - Updates to FSharp.Core nuget package for F# 4.1
    - Fix #656: error FS0193: internal error: No access to the given key

  lang/mono: various fixes
  - fix linking with lld [1]
  - double maximum handle size [2]
  - add option to run acceptance tests
  - allow for optional bootstrapping of mono via either installed mcs (if available) or via downloaded "monolite" (default)
  - add python and py-pillow as dependencies for bin/mono-heapviz
  - add armv6 as a supported architecture (untested)
  - switch to github for source code:
    - official tarball does not include tests
  - patches:
    - recognise FreeBSD for AOT suffix
    - change mono-heapviz to use pillow instead of PIL

  multimedia/banshee: tell portscout to ignore this port
  - Portscout was not skipping the 2.9.1 version, and upstream appears to be quiet for the last few years.

  x11-toolkits/gtk-sharp20: update to 2.12.43
  - ChangeLog:
    - fix compilation on mono-4.8.0 (incorrect use of sizeof())
    - correctly set owned=true on custom constructors

  PR: 218885 [1]
  PR: 200937 [2]

Changes:
  head/Mk/Uses/mono.mk
  head/devel/monodevelop/Makefile
  head/devel/monodevelop/distinfo
  head/devel/monodevelop/files/patch-Makefile.am
  head/devel/monodevelop/files/patch-external_fsharpbinding_.paket_paket.targets
  head/devel/monodevelop/files/patch-external_fsharpbinding_MonoDevelop.FSharpBinding_FSharpTextEditorCompletion.fs
  head/devel/monodevelop/pkg-plist
  head/irc/smartirc4net/Makefile
  head/irc/smartirc4net/distinfo
  head/irc/smartirc4net/files/
  head/irc/smartirc4net/pkg-plist
  head/lang/fsharp/Makefile
  head/lang/fsharp/distinfo
  head/lang/fsharp/files/patch-Makefile
  head/lang/fsharp/files/patch-Makefile.in
  head/lang/fsharp/files/patch-src_FSharpSource.targets
  head/lang/fsharp/pkg-plist
  head/lang/mono/Makefile
  head/lang/mono/distinfo
  head/lang/mono/files/patch-configure.ac
  head/lang/mono/files/patch-mono_utils_mono-compiler.h
  head/lang/mono/files/patch-mono_utils_w32handle.c
  head/lang/mono/files/patch-scripts_mono-heapviz
  head/lang/mono/pkg-plist
  head/multimedia/banshee/Makefile
  head/x11-toolkits/gtk-sharp20/Makefile
  head/x11-toolkits/gtk-sharp20/distinfo

Fixed in ports and submitted upstream (https://github.com/mono/mono/pull/4856). Thank you for the report and patches.
Created attachment 157839 [details]
Patch: increase _WAPI_PRIVATE_MAX_SLOTS

System: FreeBSD 10.1-STABLE #0 r283969: Wed Jun 3 22:59:38 BST 2015
Dual hex-core Westmere Xeon, 155GB RAM.

For a while now lang/mono's been segfaulting during build:

------------------------------------------------------------------------
Making all in runtime
gmake[3]: Entering directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1/runtime'
if test -w /usr/obj/usr/ports/lang/mono/work/mono-4.0.1/mcs; then :; else chmod -R +w /usr/obj/usr/ports/lang/mono/work/mono-4.0.1/mcs; fi
cd /usr/obj/usr/ports/lang/mono/work/mono-4.0.1/mcs && gmake --no-print-directory -s NO_DIR_CHECK=1 PROFILES='binary_reference_assemblies net_4_5 xbuild_12 xbuild_14 ' CC='cc' all-profiles
gmake[7]: mcs: Command not found
build/profiles/basic.make:93: recipe for target 'build/deps/basic-profile-check.exe' failed
gmake[7]: *** [build/deps/basic-profile-check.exe] Error 127
*** The compiler 'mcs' doesn't appear to be usable.
*** Trying the 'monolite' directory.
=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================
build/profiles/basic.make:93: recipe for target 'build/deps/basic-profile-check.exe' failed
gmake[9]: *** [build/deps/basic-profile-check.exe] Abort trap (core dumped)
*** The contents of your 'monolite' directory may be out-of-date
*** You may want to try 'make get-monolite-latest'
build/profiles/basic.make:77: recipe for target 'do-profile-check-monolite' failed
gmake[9]: *** [do-profile-check-monolite] Error 1
build/profiles/basic.make:60: recipe for target 'do-profile-check' failed
gmake[8]: *** [do-profile-check] Error 2
build/profiles/basic.make:85: recipe for target 'do-profile-check-monolite' failed
gmake[7]: *** [do-profile-check-monolite] Error 2
build/profiles/basic.make:60: recipe for target 'do-profile-check' failed
gmake[6]: *** [do-profile-check] Error 2
Makefile:44: recipe for target 'profile-do--basic--all' failed
gmake[5]: *** [profile-do--basic--all] Error 2
Makefile:40: recipe for target 'profiles-do--all' failed
gmake[4]: *** [profiles-do--all] Error 2
Makefile:555: recipe for target 'all-local' failed
gmake[3]: *** [all-local] Error 2
gmake[3]: Leaving directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1/runtime'
Makefile:522: recipe for target 'all-recursive' failed
gmake[2]: *** [all-recursive] Error 1
gmake[2]: Leaving directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1'
Makefile:449: recipe for target 'all' failed
gmake[1]: *** [all] Error 2
gmake[1]: Leaving directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1'
*** Error code 1
------------------------------------------------------------------------

I can't reproduce it on a 24GB 10.1-RELEASE system. I'm guessing it's a high-memory/tuning issue.
mono-sgen.core reveals the following backtrace:

#0  0x00000008012eacaa in thr_kill () from /lib/libc.so.7
#1  0x00000008012eac16 in raise () from /lib/libc.so.7
#2  0x00000008012e9409 in abort () from /lib/libc.so.7
#3  0x00000000004b1903 in mono_handle_native_sigsegv (signal=<value optimized out>, ctx=<value optimized out>, info=<value optimized out>) at mini-exceptions.c:2386
#4  0x000000000041eeac in mono_sigsegv_signal_handler (_dummy=11, _info=0x7fffffffd2b0, context=0x7fffffffcf40) at mini.c:6771
#5  0x0000000800f8f997 in pthread_sigmask () from /lib/libthr.so.3
#6  0x0000000800f8f1a8 in pthread_getspecific () from /lib/libthr.so.3
#7  <signal handler called>
#8  0x0000000000607f83 in wapi_init () at handles.c:278
#9  0x00000000005a41d5 in mono_init_internal (filename=0x7fffffffdb82 ".//class/lib/monolite/basic.exe", exe_filename=0x7fffffffdb82 ".//class/lib/monolite/basic.exe", runtime_version=0x0) at domain.c:504
#10 0x000000000041f846 in mini_init (filename=<value optimized out>, runtime_version=<value optimized out>) at mini.c:7404
#11 0x0000000000484b4c in mono_main (argc=8, argv=0x7fffffffd628) at driver.c:1921
#12 0x000000000041600f in _start ()

Turns out this line in mono/io-layer/handles.c immediately before the segfault is failing, returning -1:

  _wapi_global_signal_handle = _wapi_handle_new (WAPI_HANDLE_EVENT, NULL);

Drilling down, I see the array's running out of slots, as defined in mono/io-layer/handles-private.h:25

  #define _WAPI_PRIVATE_MAX_SLOTS (1024 * 16)

Increasing this to 1024 * 17 fixes the build. Patch to drop into files/ attached.
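The failure mode described here, a fixed-capacity table whose allocator returns -1 once every slot is taken, can be sketched in isolation. This is not mono's code: the `sketch_` names, the tiny capacity, and the flat table layout are all assumptions made for illustration. It only shows why a caller that does not check for the sentinel ends up using an invalid handle.

```c
/* Hypothetical sizes, kept small for illustration. In mono the bound is
 * _WAPI_PRIVATE_MAX_SLOTS and the table layout is different. */
#define SKETCH_MAX_SLOTS 4
#define SKETCH_INVALID_HANDLE (-1)

struct sketch_table {
    int used[SKETCH_MAX_SLOTS]; /* 1 if the slot is taken, 0 if free */
};

/* Return the index of a newly claimed slot, or SKETCH_INVALID_HANDLE
 * when every slot is already in use. */
static int sketch_handle_new(struct sketch_table *t)
{
    for (int i = 0; i < SKETCH_MAX_SLOTS; i++) {
        if (!t->used[i]) {
            t->used[i] = 1;
            return i;
        }
    }
    return SKETCH_INVALID_HANDLE;
}

/* Caller that checks the sentinel instead of using the result blindly.
 * Returns 1 and stores the handle in *out on success, 0 on exhaustion;
 * *out is left untouched on failure. */
static int sketch_init_signal_handle(struct sketch_table *t, int *out)
{
    int h = sketch_handle_new(t);

    if (h == SKETCH_INVALID_HANDLE)
        return 0;
    *out = h;
    return 1;
}
```

Raising the capacity, as the attached patch does, avoids hitting the limit; checking the return value would turn exhaustion into a reportable error rather than a segfault.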