The issue is reproducible with all versions of openjdk (6/7/8) on FreeBSD 10.x amd64 on different Amazon EC2 instances. The JVM is used to run several different Scala server applications, and all of them crash. It can happen after a few minutes or after a few days, with a probability that seems proportional to the traffic on the website. I can provide about 50 hs_err_pidXXXXX.log files to help the diagnosis.
Do you see any pattern in the hs_err_pidXXXX? What does the # Problematic frame: on ~ line 8 say?
C [libc.so.7+0x9cf7b] _pthread_mutex_init_calloc_cb+0x6eb
C [libc.so.7+0x9cf7b] _pthread_mutex_init_calloc_cb+0x6eb
C [libc.so.7+0x9cf7b] _pthread_mutex_init_calloc_cb+0x6eb
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa57c3] short+0x25d3
C [libc.so.7+0xa5d52] short+0x2b62
C [libc.so.7+0xa5d52] short+0x2b62
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xa70d4] short+0x3ee4
C [libc.so.7+0xb28dd] __free+0x7d
C [libc.so.7+0xb28dd] __free+0x7d
C [libc.so.7+0xb28dd] __free+0x7d
C [libc.so.7+0xb28dd] __free+0x7d
C [libc.so.7+0xb28dd] __free+0x7d
C [libc.so.7+0xb28dd] __free+0x7d
C [libzip.so+0x50aa] ZIP_GetEntry+0x11a
C [libzip.so+0x50aa] ZIP_GetEntry+0x11a
C [libzip.so+0x50aa] ZIP_GetEntry+0x11a
C [libzip.so+0x50aa] ZIP_GetEntry+0x11a
C [libzip.so+0x50aa] ZIP_GetEntry+0x11a
C [libzip.so+0x50aa] ZIP_GetEntry+0x11a
C [libzip.so+0x517a] ZIP_GetEntry+0x11a
V [libjvm.so+0x31ce40] +0x12c0c8
V [libjvm.so+0x31ceb0] +0x12c0f0
V [libjvm.so+0x3d87de] +0x1e7a66
V [libjvm.so+0x3d87de] +0x1e7a66
V [libjvm.so+0x3d87de] +0x1e7a66
V [libjvm.so+0x3d885e] +0x1e7a9e
V [libjvm.so+0x3d885e] +0x1e7a9e
V [libjvm.so+0x3f6955] +0x205bdd
V [libjvm.so+0x5b4ac6] AsyncGetCallTrace+0xe1da6
V [libjvm.so+0x5f3230] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x5f3230] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x5f3230] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x5f3230] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x5f3230] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x5f3230] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0
V [libjvm.so+0x664e7f] JVM_FindSignal+0x4ee4f
V [libjvm.so+0x6c9bd8] JVM_FindSignal+0xb6e38
V [libjvm.so+0x6c9ce8] JVM_FindSignal+0xb6e38
V [libjvm.so+0x78d410] JVM_FindSignal+0x17a560
V [libjvm.so+0x7a5071] JVM_FindSignal+0x1922d1
V [libjvm.so+0x7a5071] JVM_FindSignal+0x1922d1
V [libjvm.so+0x7a5071] JVM_FindSignal+0x1922d1
V [libjvm.so+0x7a5071] JVM_FindSignal+0x1922d1
V [libjvm.so+0x7a5071] JVM_FindSignal+0x1922d1
V [libjvm.so+0x7a5331] JVM_FindSignal+0x192481
V [libjvm.so+0x7a5331] JVM_FindSignal+0x192481
V [libjvm.so+0x7a5331] JVM_FindSignal+0x192481
V [libjvm.so+0x804169] JVM_handle_bsd_signal+0x510a9
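For anyone who wants to answer the "any pattern?" question across a whole directory of crash logs, here is a minimal sketch in Python. It assumes the usual HotSpot layout, where the frame sits on the line right after `# Problematic frame:`, and that the logs are named `hs_err_pid*.log`. The function names `problematic_frame` and `tally_frames` are made up for this illustration and are not part of any tool.

```python
from collections import Counter
from pathlib import Path


def problematic_frame(text):
    """Return the frame that follows '# Problematic frame:', or None.

    Assumes the frame is on the very next line and is still prefixed
    with '#', as in a typical hs_err log header.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("# Problematic frame:") and i + 1 < len(lines):
            # Drop the leading '#' and collapse runs of whitespace.
            return " ".join(lines[i + 1].lstrip("#").split())
    return None


def tally_frames(log_dir="."):
    """Count how often each problematic frame occurs in log_dir."""
    counts = Counter()
    for path in sorted(Path(log_dir).glob("hs_err_pid*.log")):
        frame = problematic_frame(path.read_text(errors="replace"))
        if frame is not None:
            counts[frame] += 1
    return counts


if __name__ == "__main__":
    # Most frequent frames first.
    for frame, n in tally_frames().most_common():
        print(f"{n:4d}  {frame}")
```

Run from the directory holding the logs, it prints one line per distinct frame with its count, which makes clusters such as repeated libc or libjvm offsets easier to spot than a raw dump.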
This is a long shot, but it could be related to PR 209599 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209599). Anyway, we see a similar pattern, not as often as you, but still once a day on fairly busy servers. I'm going to apply the patch from 209599 to see if it helps. We see: # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x570156] AsyncGetCallTrace+0x9d496 # V [libjvm.so+0x570156] AsyncGetCallTrace+0x9d496 # V [libjvm.so+0x570156] AsyncGetCallTrace+0x9d496 # V [libjvm.so+0x570156] AsyncGetCallTrace+0x9d496 # V [libjvm.so+0x570156] AsyncGetCallTrace+0x9d496 # V [libjvm.so+0x3f7569] +0x2067a9 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x3f7569] +0x2067a9 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # C [libc.so.7+0xa6e44] short+0x3ee4 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # C [libc.so.7+0xa6e44] short+0x3ee4 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # C [libc.so.7+0xa6e44] short+0x3ee4 # V [libjvm.so+0x456564] +0x2657a4 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # C [libc.so.7+0xa6e44] short+0x3ee4 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x456564] +0x2657a4 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x456564] +0x2657a4 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x451c2e] +0x260e6e # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x451c2e] +0x260e6e # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x451c2e] +0x260e6e # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x3f6c7e] +0x205ebe
# V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x451c2e] +0x260e6e # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x3f6c7e] +0x205ebe # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0 # V [libjvm.so+0x5f3340] JNI_GetCreatedJavaVMs+0x256e0
I already tried; my last three crashes were with that patch applied :-(
Errata corrige: the patch didn't go through. Just deployed again with the patch applied. Let's see, I'll update you.
I can confirm it still crashes :-(
I note that you mention the crash on JDK6 and 7 as well, so this might not be applicable, but I just fixed this problem with JDK8, and it looks like it has fixed my stability issues in saturated network environments. https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210191
If you find that PR 210191 does actually help, then please feel free to change it from 'Affects Only Me' to others ^^
I cannot try it for another week, but as you said, we had the issue when we were using openjdk6, and it didn't disappear when switching to openjdk7 and now openjdk8.
Just out of curiosity, are you using Akka in the Scala apps..? I need to finish my due diligence piece on that so I'm kinda' interested and have heard of some interesting stability concerns re sun.misc.Unsafe and Akka providing their own..?
Yes, we use akka.
This could well be something upstream of the FreeBSD port. "Bug ID: JDK-8143123 Hotspot SIGSEGV using Akka (Scala) under heavy load" is showing up on bugs.java.com at present and this is an OSX related one.
For the record, the same application doesn't crash on Linux with Oracle JDK.
Actually, this could possibly be a double whammy with the sun.misc.Unsafe problem AND the network problem. The JDK6 and 7 ports are also susceptible to the sun.misc.Unsafe problem, and Akka makes use of sun.misc.Unsafe through its own wrapper that grabs the binding. Could you please re-test with OpenJDK8 version 8.92.14_1, as this port has the 2 patches applied? I'd certainly love to hear that you had a stable environment, since it would give me some confidence to proceed with looking at Akka further, which I might have to delay if it's tripping the VM badly in some odd way.
BTW: JDK6 was never enhanced upstream to avoid crashing the VM when using sun.misc.Unsafe which is why there is no patch to fix it on FreeBSD.
I've updated the servers to the new OpenJDK8 version. I'll let you know once I'm reasonably confident in the result.
So far so good: I haven't experienced any crashes after a few days. I'd say the issues have been fixed by these patches with a confidence of 95% (given the non-deterministic nature of the issue I cannot say 100% yet, but I'll update the PR in a week or two).
(In reply to Alex Dupre from comment #17) Super, many thanks for the update and I would really appreciate updates when you have a higher confidence level.
How are things looking Alex..?
A month without any crashes, I'd say the issue has been definitively fixed!