Bug 200937 - lang/mono: [patch] mono-sgen SIGSEGV during build
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: David Naylor
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2015-06-17 18:19 UTC by Thomas Hurst
Modified: 2017-05-13 07:52 UTC
CC List: 5 users

See Also:
bugzilla: maintainer-feedback? (mono)


Attachments
Patch: increase _WAPI_PRIVATE_MAX_SLOTS (609 bytes, text/plain)
2015-06-17 18:19 UTC, Thomas Hurst
Patch to increase _WAPI_PRIVATE_MAX_SLOTS to (1024 * 32) (1.25 KB, patch)
2016-03-31 01:46 UTC, Lacey Powers
Patch to increase max slots to (1024*32) in mono 4.8.1.0 (295 bytes, patch)
2017-05-04 01:22 UTC, Phil

Description Thomas Hurst 2015-06-17 18:19:40 UTC
Created attachment 157839
Patch: increase _WAPI_PRIVATE_MAX_SLOTS

System: FreeBSD 10.1-STABLE #0 r283969: Wed Jun  3 22:59:38 BST 2015

Dual hex-core Westmere Xeon, 155GB RAM.

For a while now lang/mono's been segfaulting during build:

------------------------------------------------------------------------
Making all in runtime
gmake[3]: Entering directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1/runtime'
if test -w /usr/obj/usr/ports/lang/mono/work/mono-4.0.1/mcs; then :; else chmod -R +w /usr/obj/usr/ports/lang/mono/work/mono-4.0.1/mcs; fi
cd /usr/obj/usr/ports/lang/mono/work/mono-4.0.1/mcs && gmake --no-print-directory -s NO_DIR_CHECK=1 PROFILES='binary_reference_assemblies net_4_5 xbuild_12 xbuild_14   ' CC='cc' all-profiles
gmake[7]: mcs: Command not found
build/profiles/basic.make:93: recipe for target 'build/deps/basic-profile-check.exe' failed
gmake[7]: *** [build/deps/basic-profile-check.exe] Error 127
*** The compiler 'mcs' doesn't appear to be usable.
*** Trying the 'monolite' directory.

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

build/profiles/basic.make:93: recipe for target 'build/deps/basic-profile-check.exe' failed
gmake[9]: *** [build/deps/basic-profile-check.exe] Abort trap (core dumped)
*** The contents of your 'monolite' directory may be out-of-date
*** You may want to try 'make get-monolite-latest'
build/profiles/basic.make:77: recipe for target 'do-profile-check-monolite' failed
gmake[9]: *** [do-profile-check-monolite] Error 1
build/profiles/basic.make:60: recipe for target 'do-profile-check' failed
gmake[8]: *** [do-profile-check] Error 2
build/profiles/basic.make:85: recipe for target 'do-profile-check-monolite' failed
gmake[7]: *** [do-profile-check-monolite] Error 2
build/profiles/basic.make:60: recipe for target 'do-profile-check' failed
gmake[6]: *** [do-profile-check] Error 2
Makefile:44: recipe for target 'profile-do--basic--all' failed
gmake[5]: *** [profile-do--basic--all] Error 2
Makefile:40: recipe for target 'profiles-do--all' failed
gmake[4]: *** [profiles-do--all] Error 2
Makefile:555: recipe for target 'all-local' failed
gmake[3]: *** [all-local] Error 2
gmake[3]: Leaving directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1/runtime'
Makefile:522: recipe for target 'all-recursive' failed
gmake[2]: *** [all-recursive] Error 1
gmake[2]: Leaving directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1'
Makefile:449: recipe for target 'all' failed
gmake[1]: *** [all] Error 2
gmake[1]: Leaving directory '/usr/obj/usr/ports/lang/mono/work/mono-4.0.1'
*** Error code 1
------------------------------------------------------------------------

I can't reproduce it on a 24GB 10.1-RELEASE system.  I'm guessing it's a high-memory/tuning issue.

mono-sgen.core reveals the following backtrace:

#0  0x00000008012eacaa in thr_kill () from /lib/libc.so.7
#1  0x00000008012eac16 in raise () from /lib/libc.so.7
#2  0x00000008012e9409 in abort () from /lib/libc.so.7
#3  0x00000000004b1903 in mono_handle_native_sigsegv (signal=<value optimized out>, ctx=<value optimized out>,
    info=<value optimized out>) at mini-exceptions.c:2386
#4  0x000000000041eeac in mono_sigsegv_signal_handler (_dummy=11, _info=0x7fffffffd2b0, context=0x7fffffffcf40) at mini.c:6771
#5  0x0000000800f8f997 in pthread_sigmask () from /lib/libthr.so.3
#6  0x0000000800f8f1a8 in pthread_getspecific () from /lib/libthr.so.3
#7  <signal handler called>
#8  0x0000000000607f83 in wapi_init () at handles.c:278
#9  0x00000000005a41d5 in mono_init_internal (filename=0x7fffffffdb82 ".//class/lib/monolite/basic.exe",
    exe_filename=0x7fffffffdb82 ".//class/lib/monolite/basic.exe", runtime_version=0x0) at domain.c:504
#10 0x000000000041f846 in mini_init (filename=<value optimized out>, runtime_version=<value optimized out>) at mini.c:7404
#11 0x0000000000484b4c in mono_main (argc=8, argv=0x7fffffffd628) at driver.c:1921
#12 0x000000000041600f in _start ()


Turns out this line in mono/io-layer/handles.c immediately before the segfault is failing, returning -1:

    _wapi_global_signal_handle = _wapi_handle_new (WAPI_HANDLE_EVENT, NULL);

Drilling down, I see the array's running out of slots, as defined in mono/io-layer/handles-private.h:25

    #define _WAPI_PRIVATE_MAX_SLOTS         (1024 * 16)

Increasing this to 1024 * 17 fixes the build.  Patch to drop into files/ attached.
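
For anyone following along, here is a minimal sketch of the failure mode as I understand it: a fixed-size slot table whose allocator returns a sentinel once every slot is taken. All names here (demo_table, demo_handle_new, DEMO_MAX_SLOTS) are hypothetical stand-ins and not mono's actual code; that the segfault comes from using the failed handle unchecked is an assumption based on the -1 return seen above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT mono's real definitions. */
#define DEMO_MAX_SLOTS 4          /* plays the role of _WAPI_PRIVATE_MAX_SLOTS */
#define DEMO_INVALID_HANDLE (-1)  /* sentinel returned when no slot is free */

struct demo_table {
    int used[DEMO_MAX_SLOTS];     /* 0 = free, 1 = taken */
};

/* Claim the first free slot and return its index, or the sentinel if full. */
static int demo_handle_new(struct demo_table *table)
{
    assert(table != NULL);
    for (int i = 0; i < DEMO_MAX_SLOTS; i++) {
        if (!table->used[i]) {
            table->used[i] = 1;
            return i;
        }
    }
    return DEMO_INVALID_HANDLE;
}

/* Checked wrapper: report exhaustion instead of handing back the sentinel. */
static int demo_handle_new_checked(struct demo_table *table, int *out)
{
    int handle = demo_handle_new(table);
    if (handle == DEMO_INVALID_HANDLE)
        return -1;                /* caller must not index with the sentinel */
    *out = handle;
    return 0;
}
```

Raising the limit only moves the point of exhaustion; a check like the wrapper above would turn the crash into a reportable error, assuming the caller can fail cleanly at that point.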
Comment 1 Lacey Powers 2016-03-28 21:58:05 UTC
I tested this patch on

FreeBSD talizorah 10.2-RELEASE-p14 FreeBSD 10.2-RELEASE-p14 #0: Wed Mar 16 20:46:12 UTC 2016 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

And this solution allows me to build mono from ports. =)

If needed, I can also build with Poudriere under 10.1 and 9.3 on amd64 to validate this patch works there as well.
Comment 2 David Naylor freebsd_committer freebsd_triage 2016-03-29 20:06:55 UTC
Can I suggest bumping the max slots by a power of 2 (i.e. 32 instead of 17)?  Looking through the code, there is no apparent reason to use a power of 2, and the 1024 may already provide any scaling benefits needed; however, I am suspicious.
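
To make the arithmetic concrete, a small sketch of the candidate values. The helpers slot_count and is_power_of_two are hypothetical and exist only for illustration; they are not part of mono.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: slot count for a multiplier, mirroring (1024 * N). */
static unsigned long slot_count(unsigned long multiplier)
{
    assert(multiplier <= ULONG_MAX / 1024UL);  /* guard against overflow */
    return 1024UL * multiplier;
}

/* A nonzero value is a power of two when exactly one bit is set. */
static int is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}
```

With these, slot_count(16) is 16384 (2^14), slot_count(17) is 17408 (not a power of two), and slot_count(32) is 32768 (2^15).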
Comment 3 Lacey Powers 2016-03-31 01:46:04 UTC
Created attachment 168803
Patch to increase _WAPI_PRIVATE_MAX_SLOTS to (1024 * 32)

Slots bumped per suggestion above, tested with the

poudriere testport 

under 9.3 and 10.2 amd64. The new patch applies cleanly via 

patch -p0 <mono_iolayer_filehandles.patch 

in the root of the ports tree.
Comment 4 Ed Maste freebsd_committer freebsd_triage 2016-12-09 21:17:56 UTC
LLD developers have come across this issue again while trying to test a ports tree linked with LLD. Is there some reason this hasn't made it into the ports tree? Has it been proposed upstream in the mono project?
Comment 5 Rafael Avila de Espindola 2016-12-09 22:00:20 UTC
For what it is worth, I still got the crash with 1024 * 17 but it worked with 10224*32.
Comment 6 Ed Maste freebsd_committer freebsd_triage 2016-12-10 02:21:17 UTC
(In reply to Rafael Avila de Espindola from comment #5)

I assume 10224 is a typo, and you meant 1024*32.
Comment 7 Rafael Avila de Espindola 2016-12-10 17:28:53 UTC
(In reply to Ed Maste from comment #6)

Yes, I meant 1024*32.
Comment 8 David Naylor freebsd_committer freebsd_triage 2016-12-10 20:47:51 UTC
It wasn't committed as this error never came up in testing or exp-run.  Would you please confirm what environment is required to trigger this issue?  It appears related to the amount of RAM?
Comment 9 Rafael Avila de Espindola 2016-12-10 21:23:24 UTC
(In reply to David Naylor from comment #8)

I got the error on an EC2 r3.8xlarge instance.
Comment 10 Ed Maste freebsd_committer freebsd_triage 2016-12-18 14:16:02 UTC
(In reply to Rafael Avila de Espindola from comment #9)

For reference from https://aws.amazon.com/ec2/instance-types/
r3.8xlarge: 32 vCPU, 244 GiB, 2 x 320 GB SSD
Comment 11 Phil 2017-01-26 08:12:18 UTC
Any chance we could get this patch merged into the mono port given it's a tiny change but solves lots of people's problems?
Comment 12 David Naylor freebsd_committer freebsd_triage 2017-05-01 16:36:28 UTC
I'm investigating this change as part of an update to mono-4.8.1.0.
Comment 13 Phil 2017-05-03 16:16:06 UTC
The patch as proposed no longer works for mono-4.8 and higher. 

However, the behavior is still the same. Mono crashes if more than ~128 GB of RAM is present :(

Fixing this would be much appreciated.
Comment 14 Phil 2017-05-04 01:21:12 UTC
I've found the offending line in 4.8.1.0. It is now in mono/utils/w32handle.c.

As before, increasing MAX_SLOTS to something like (1024 * 32) fixes the issue.

I've attached a new patch above.
Comment 15 Phil 2017-05-04 01:22:03 UTC
Created attachment 182291
Patch to increase max slots to (1024*32) in mono 4.8.1.0
Comment 16 commit-hook freebsd_committer freebsd_triage 2017-05-13 07:49:39 UTC
A commit references this bug:

Author: dbn
Date: Sat May 13 07:48:29 UTC 2017
New revision: 440759
URL: https://svnweb.freebsd.org/changeset/ports/440759

Log:
  Update mono and related ports

  USES=mono: minor fixes
   - save a copy of the nuget package in the packages directory
   - force linking of directories, allowing nuget-extract to be rerun
     without `make clean`
   - fix makenuget: nuget requires an equals to identify the version, not a dash

  devel/monodevelop: update to 6.2.1.3
   - update nuget packages:
     - link older System.Collection.Immutable 1.1.37 to newer 1.3.1 (used
       by C# and F# respectively)
   - update external github repositories
   - allow post-extract target to be run multiple times
   - change MonoDevelop.Packaging to use a newer version of
     NuGet.Build.Packaging (the older version is no longer fetchable)
   - remove patch integrated upstream
   - moved `nuget restore` patching from post-patch into a patch file (the
     former broke silently)
   - ChangeLog:
     - https://developer.xamarin.com/releases/studio/xamarin.studio_6.2/xamarin.studio_6.2/

  irc/smartirc4net: update to 1.1
   - add LICENSE

  lang/fsharp: update to 4.1.18
   - add test dependency on libgdiplus
   - update nuget packages
   - update test paths for fsharp assemblies
   - update patches to prevent `nuget restore` from running
   - ChangeLog:
     - Set executable bit correctly on output
     - Integrate visualfsharp
     - Fix regression on binding redirects for System.Collections.Immutable
     - Fix regression in Microsoft.Build.FSharp.targets
     - Fix binding redirects for System.Collections.Immutable
     - Fix version of library going in %PREFIX/lib/mono/fsharp
     - Align fsc task and target file
     - Use install layout that includes mono/fsharp
     - Fix F# Interactive on Mono 4.9+
     - Update compiler tools
     - Updates to FSharp.Core nuget package for F# 4.1
     - Fix #656: error FS0193: internal error: No access to the given key

  lang/mono: various fixes
   - fix linking with lld [1]
   - double maximum handle size [2]
   - add option to run acceptance tests
   - allow for optional bootstrapping of mono via either installed mcs (if
     available) or via downloaded "monolite" (default)
   - add python and py-pillow as dependencies for bin/mono-heapviz
   - add armv6 as a supported architecture (untested)
   - switch to github for source code:
     - official tarball does not include tests
   - patches:
     - recognise FreeBSD for AOT suffix
     - change mono-heapviz to use pillow instead of PIL

  multimedia/banshee: tell portscout to ignore this port
   - Portscout was not skipping the 2.9.1 version, and upstream appears to be
     quiet for the last few years.

  x11-toolkits/gtk-sharp20: update to 2.12.43
   - ChangeLog:
     - fix compilation on mono-4.8.0 (incorrect use of sizeof())
     - correctly set owned=true on custom constructors

  PR:		218885 [1]
  PR:		200937 [2]

Changes:
  head/Mk/Uses/mono.mk
  head/devel/monodevelop/Makefile
  head/devel/monodevelop/distinfo
  head/devel/monodevelop/files/patch-Makefile.am
  head/devel/monodevelop/files/patch-external_fsharpbinding_.paket_paket.targets
  head/devel/monodevelop/files/patch-external_fsharpbinding_MonoDevelop.FSharpBinding_FSharpTextEditorCompletion.fs
  head/devel/monodevelop/pkg-plist
  head/irc/smartirc4net/Makefile
  head/irc/smartirc4net/distinfo
  head/irc/smartirc4net/files/
  head/irc/smartirc4net/pkg-plist
  head/lang/fsharp/Makefile
  head/lang/fsharp/distinfo
  head/lang/fsharp/files/patch-Makefile
  head/lang/fsharp/files/patch-Makefile.in
  head/lang/fsharp/files/patch-src_FSharpSource.targets
  head/lang/fsharp/pkg-plist
  head/lang/mono/Makefile
  head/lang/mono/distinfo
  head/lang/mono/files/patch-configure.ac
  head/lang/mono/files/patch-mono_utils_mono-compiler.h
  head/lang/mono/files/patch-mono_utils_w32handle.c
  head/lang/mono/files/patch-scripts_mono-heapviz
  head/lang/mono/pkg-plist
  head/multimedia/banshee/Makefile
  head/x11-toolkits/gtk-sharp20/Makefile
  head/x11-toolkits/gtk-sharp20/distinfo
Comment 17 David Naylor freebsd_committer freebsd_triage 2017-05-13 07:52:38 UTC
Fixed in ports and submitted upstream (https://github.com/mono/mono/pull/4856).  Thank you for the report and patches.