BZ #54: Non-deterministic crashes in lusearch

Status fields:

creation_ts:2008-03-24 00:16
version:default branch
Running the dacapo lusearch benchmark on dual- and quad-core machines frequently leads
to unwarranted SIGSEGVs or heap corruption detected by glibc. On average, 1 out of 20
runs fails with SIGABRT because of this. Spurious NullPointerExceptions or
ClassCastExceptions also occur from time to time. Putting some extra load on the
machine seems to help trigger the failures.

The most frequent cause of failure is a heap corruption in dumpmemory_release followed
by a crash inside the compiler while another thread does GC. SIGSEGV inside
builtin_arraycopy is also one of the more frequent occurrences.

The version tested was 056edaebc79b with classpath 0.96.1 on four different Linux
machines, all x86_64. Two quad-cores, two dual-cores, two of them running CentOS 5, one
Fedora 5 and one Fedora 8. All four behave essentially the same.
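
Since the failures are non-deterministic, a failure rate such as "1 out of 20" is
easiest to check by running the benchmark in a loop and tallying how each run ended.
The following is only a rough sketch of such a harness, not the script used for the
numbers above. The names classify and tally_runs are made up for illustration, and the
command line passed to tally_runs is an assumption that has to be adapted to the local
VM and DaCapo setup.

```python
import signal
import subprocess
from collections import Counter


def classify(returncode):
    """Map a subprocess return code to a short label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def tally_runs(cmd, runs=20):
    """Run cmd `runs` times and count how each run ended (hypothetical helper)."""
    counts = Counter()
    for _ in range(runs):
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        counts[classify(proc.returncode)] += 1
    return counts
```

A call might look like tally_runs(["<vm>", "-jar", "<dacapo jar>", "lusearch"], runs=20),
where both placeholders stand for local paths. Runs that die from glibc's heap checks
show up as SIGABRT, plain crashes as SIGSEGV. Java-level exceptions such as
ClassCastException are not signals and would only appear as a non-zero exit code, if
at all.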

Comment #1 by on 2008-03-25 09:25:30

Given the ClassCastExceptions, maybe it's related to non-working critical sections?

Comment #2 by on 2008-03-25 09:56:55

Yeah, the reason for the ClassCastExceptions is obvious. The rest isn't, though. I'm
trying the same with GC7 right now, and it seems more stable. Strange things still
happen but less frequently, at least that's my first impression.

Comment #3 by on 2011-01-20 22:26:32

This is still a problem, unfortunately. It is most easily reproduced with dacapo xalan
-s small. I had high hopes that it would vanish with the introduction of memory
barriers for volatiles [1] and aligned patchers (bug #144), but no such luck…


Comment #4 by on 2012-03-13 16:26:47

I’d say this one is taken care of. Aligned patchers, barriers for volatiles, various
other fixes for races and non-races all contributed to it.