Bug 212150 - java/openjdk8 frequent SIGSEGV due to small ThreadStackSize
Summary: java/openjdk8 frequent SIGSEGV due to small ThreadStackSize
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-java mailing list
URL:
Keywords:
Depends on:
Blocks: 222146
 
Reported: 2016-08-25 15:26 UTC by Palle Girgensohn
Modified: 2017-09-10 12:44 UTC

See Also:
bugzilla: maintainer-feedback? (java)


Attachments
hs_err example (55.83 KB, text/plain)
2016-08-25 15:26 UTC, Palle Girgensohn

Description Palle Girgensohn freebsd_committer 2016-08-25 15:26:24 UTC
Created attachment 174065 [details]
hs_err example

Hi,

We have an application running on Tomcat on several machines. They crash on a more or less daily basis.

They are running openjdk8-8.102.14 or openjdk8-8.92.14_2; both experience similar problems.

10.2-RELEASE-p12


Some of the "Problematic frame"s:

# J 26736 C2 java.util.AbstractMap.equals(Ljava/lang/Object;)Z (145 bytes) @ 0x0000000808907030 [0x0000000808907000+0x30]
# j  org.hibernate.type.descriptor.java.AbstractTypeDescriptor.getJavaTypeClass()Ljava/lang/Class;+0
# J 9985 C1 org.hibernate.type.descriptor.java.AbstractTypeDescriptor.getJavaTypeClass()Ljava/lang/Class; (5 bytes) @ 0x000000080401afe0 [0x000000080401afa0+0x40]
# J 16010 C2 org.hibernate.internal.SessionImpl.getEntityUsingInterceptor(Lorg/hibernate/engine/spi/EntityKey;)Ljava/lang/Object; (51 bytes) @ 0x0000000806392210 [0x00000008063921e0+0x30]
# J 31297 C2 org.hibernate.type.descriptor.java.AbstractTypeDescriptor.areEqual(Ljava/lang/Object;Ljava/lang/Object;)Z (6 bytes) @ 0x00000008082f95d0 [0x00000008082f95a0+0x30]
# J 8453 C1 org.hibernate.engine.spi.CascadeStyle$2.doCascade(Lorg/hibernate/engine/spi/CascadingAction;)Z (2 bytes) @ 0x0000000805237e20 [0x0000000805237de0+0x40]
# J 67410 C2 java.lang.String.hashCode()I (55 bytes) @ 0x0000000806ed9af0 [0x0000000806ed9ae0+0x10]
# J 37130 C2 org.hibernate.proxy.pojo.javassist.JavassistLazyInitializer.invoke(Ljava/lang/Object;Ljava/lang/reflect/Method;Ljava/lang/reflect/Method;[Ljava/lang/Object;)Ljava/lang/Object; (214 bytes) @ 0x00000008082648b0 [0x0000000808264880+0x30]
# J 20663 C2 org.hibernate.engine.spi.CascadeStyle.reallyDoCascade(Lorg/hibernate/engine/spi/CascadingAction;)Z (6 bytes) @ 0x00000008076e5e50 [0x00000008076e5e20+0x30]
# J 6799 C1 org.hibernate.type.EntityType.isAssociationType()Z (2 bytes) @ 0x0000000804e952a0 [0x0000000804e95260+0x40]
# J 24608 C2 org.hibernate.event.internal.DefaultSaveOrUpdateEventListener.reassociateIfUninitializedProxy(Ljava/lang/Object;Lorg/hibernate/engine/spi/SessionImplementor;)Z (13 bytes) @ 0x00000008080937b0 [0x0000000808093780+0x30]
# J 8330 C1 org.hibernate.engine.spi.CascadeStyle$2.doCascade(Lorg/hibernate/engine/spi/CascadingAction;)Z (2 bytes) @ 0x00000008050b4160 [0x00000008050b4120+0x40]

The servers are under high load, and so far the only clear pattern is that systems with higher (though not extreme) load are more likely to crash.
Comment 1 openjdk 2016-08-25 16:39:13 UTC
Hi Palle,
I've been working with some folks over at OpenNMS who were experiencing similar crashes. Increasing the stack size to 8M seems to have fixed their issue; I wonder if it would also resolve yours.

Increase the stack size by adding this startup option:
-Xss8m

I haven't had the time yet to research why increasing the stack size on FreeBSD was necessary, but I hope it helps you nonetheless. Please report back whether it helps or not.
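
For reference, -Xss sets the default stack size for new Java threads. The following is only a minimal sketch, assuming a HotSpot JVM, of how one might compare the usable recursion depth at two stack sizes without restarting the JVM. It relies on the four-argument Thread constructor, whose stackSize argument the JVM is allowed to treat as a hint or ignore entirely. The class name StackSizeProbe and its methods are made up for illustration and are not part of the port or of Tomcat.

```java
public class StackSizeProbe {
    // Number of nested calls reached before the stack ran out.
    private int depth = 0;

    // Deliberately unbounded recursion; each call consumes one stack frame.
    private void recurse() {
        depth++;
        recurse();
    }

    // Runs the recursion on a fresh thread created with the requested stack
    // size (in bytes) and returns the depth reached before StackOverflowError.
    // Assumption: the JVM honors the stackSize hint, which it may not.
    public static int maxDepth(long stackSizeBytes) {
        final StackSizeProbe probe = new StackSizeProbe();
        Thread t = new Thread(null, () -> {
            try {
                probe.recurse();
            } catch (StackOverflowError expected) {
                // The depth counter already holds the result.
            }
        }, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() makes the worker's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("1 MB stack, depth reached: " + maxDepth(1L << 20));
        System.out.println("8 MB stack, depth reached: " + maxDepth(8L << 20));
    }
}
```

The absolute numbers vary between runs and between interpreted and JIT-compiled code, so they are only useful as a rough comparison between the two sizes.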
Comment 2 Palle Girgensohn freebsd_committer 2016-09-07 11:55:05 UTC
(In reply to openjdk from comment #1)

Thank you for the prompt reply!

We tried this the same night you reported it, and two weeks later we are still running with no crashes. Thumbs up! :)

This is obviously a workaround; there is a bug in there somewhere that fails to throw a proper exception or remedy the problem in some way.

-Xss8m does help.

As a side note, we first tried turning off compression of 64-bit pointers to 32 bits (-XX:-UseCompressedOops), but that did not work. It did, however, help jmap create heap dumps from huge (>2^32 bytes) core dumps.
Comment 3 openjdk 2016-09-07 18:47:22 UTC
I'm glad it helped. I agree that this is just a workaround and not a fix. I took a peek at libthr and I don't believe there is a way to detect a native stack overflow. I also looked into the default sizing logic, and it looks like the default is identical to Linux, at 1 MB for 64-bit architectures. It looks like the default non-initial thread stack size is 2 MB for 64-bit architectures, so I wonder if 1 MB is just too low for FreeBSD.

It looks like the stack sizing logic was directly ported from Linux, so maybe the solution would be to bump up the defaults. Does anybody have any clue as to why we would need a bigger native stack on FreeBSD than on Linux?
Comment 4 Palle Girgensohn freebsd_committer 2016-11-29 22:33:39 UTC
Hi,

We are seeing new crashes, not as frequent as before, but occasional crashes anyway. 

We are running with -Xss8m

Is it possible to see from my original report that it was the stack, or was it just a hunch?
Comment 5 Palle Girgensohn freebsd_committer 2017-09-09 19:49:27 UTC
Just for reference, the intermittent crashes we saw were due to infinite loops or recursions. When we bumped the stack size, we got stack traces and could finally track down the root cause.

The bug in OpenJDK is still there, though: even with the larger stack size, we still saw occasional segfaults where Java should really throw a StackOverflowError.
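
To illustrate the behaviour expected here, the sketch below shows runaway recursion surfacing as a catchable StackOverflowError whose stack trace points at the recursing method. It is a hedged illustration only: the class RecursionTrace and the method runaway are hypothetical stand-ins, not the application code that actually failed, and the sketch does not reproduce the segfault described in this report.

```java
public class RecursionTrace {
    // Hypothetical stand-in for an application bug: recursion with no base case.
    static int runaway(int n) {
        return runaway(n + 1) + 1;
    }

    // Triggers the runaway recursion and returns the method name found in the
    // top frame of the resulting stack trace, "unknown" if the trace is empty,
    // or "none" if no overflow occurred.
    public static String findCulprit() {
        try {
            runaway(0);
            return "none";
        } catch (StackOverflowError e) {
            StackTraceElement[] frames = e.getStackTrace();
            return frames.length > 0 ? frames[0].getMethodName() : "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("Top frame at overflow: " + findCulprit());
    }
}
```

When the JVM behaves this way, the repeated frame in the trace identifies the method to investigate, which matches how the root cause was eventually found after the stack size was raised.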
Comment 6 Michael Osipov 2017-09-10 12:44:35 UTC
(In reply to Palle Girgensohn from comment #5)

Can you elaborate on the root cause and whether this has been reported to Oracle?