BZ #85: possible race with trap instructions and signal handlers

Status fields:

creation_ts:2008-07-02 16:37
version:default branch

Since we are using trap instructions to invoke patcher functions, we can hit a race
condition here in the following situation:

Two threads are running on two CPUs.  The first thread hits the trap and executes the
patcher function.  The second thread hits the same trap instruction, and right after the
kernel has handled the trap for the second thread, the first thread patches back the
original instruction.  The second thread then sees the original instruction in the
signal handler instead of the trap instruction and most likely aborts, since it can't
read the required data from the trap instruction.

A solution would be to check in the signal handler whether there is a trap instruction
at the faulting position (PC) and, if not, to check the patcher list for a patcher trap
at this PC which has already been patched.  If there is one, simply return.
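
The check proposed above can be sketched roughly as follows.  This is only an
illustration, not the actual CACAO code: the names patcher_ref_t, trap_action_t and
classify_trap are hypothetical, and the trap is matched against a single fixed
encoding (the e7f007f0 value from the ARM log in this report), whereas a real port
would decode the trap instruction properly.  Locking of the patcher list is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one fixed trap encoding, taken from the ARM log in this report. */
#define TRAP_INSTRUCTION 0xe7f007f0u

/* Hypothetical patcher list entry: one record per patcher trap site. */
typedef struct {
    uintptr_t pc;       /* address where the trap instruction was placed */
    bool      patched;  /* true once the original instruction is written back */
} patcher_ref_t;

typedef enum {
    TRAP_RUN_PATCHER,   /* trap still in place: run the patcher function */
    TRAP_ALREADY_DONE,  /* another thread already patched this PC: just return */
    TRAP_UNKNOWN        /* neither case: a genuine illegal instruction */
} trap_action_t;

/* Decide what the signal handler should do for the instruction found at pc. */
trap_action_t classify_trap(const patcher_ref_t *list, size_t n,
                            uintptr_t pc, uint32_t insn_at_pc)
{
    assert(list != NULL || n == 0);

    /* Normal case: the trap instruction is still at the faulting PC. */
    if (insn_at_pc == TRAP_INSTRUCTION)
        return TRAP_RUN_PATCHER;

    /* Race case: look for a patcher trap at this PC that is already patched. */
    for (size_t i = 0; i < n; i++) {
        if (list[i].pc == pc && list[i].patched)
            return TRAP_ALREADY_DONE;
    }

    return TRAP_UNKNOWN;
}
```

In the TRAP_ALREADY_DONE case the handler would return without doing anything, so the
thread re-executes the (now original) instruction at the same PC.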

Comment #1 by on 2008-07-23 14:25:53

Cool find!

This could be the reason for most crashes I witnessed in dacapo lusearch and xalan.

Comment #2 by on 2008-08-13 10:19:21

patching in org.apache.axis.encoding.TypeMappingRegistryImpl.<init>(Z)V PUBLIC (mono)
(impl) at 0x44240bf0
        patcher function = -UNKNOWN PATCHER FUNCTION- <0x40239f60>
        machine code before = e7f007f0 at 0x44240bf0 (disassembler disabled)
        machine code after  = e51c002c at 0x44240bf0 (disassembler disabled)
LOG: [0x00004000] md_signal_handler_sigill: Unknown illegal instruction 0xe51c002c at
LOG: [0x00004000] Aborting...

Comment #3 by on 2008-08-13 14:52:09

To fix this bug properly I have to change some of the JIT ports (alpha, mips, maybe
others too) to use a dedicated signal (e.g. SIGILL) for patcher traps.  I'm working on

Comment #4 by on 2008-08-14 16:11:43

Fix for ARM:

Comment #5 by on 2008-09-11 10:30:47

*** Bug 73 has been marked as a duplicate of this bug. ***

Comment #6 by on 2008-09-11 10:32:32

*** Bug 74 has been marked as a duplicate of this bug. ***

Comment #7 by on 2008-09-11 10:34:54

*** Bug 75 has been marked as a duplicate of this bug. ***

Comment #8 by on 2008-09-11 10:46:11

*** Bug 76 has been marked as a duplicate of this bug. ***

Comment #9 by on 2008-09-12 11:31:34

Fix for powerpc64:

Comment #10 by on 2008-09-13 14:28:03

Fix for powerpc:

Comment #11 by on 2008-10-01 16:11:08

Created an attachment (id=52)
make fp-abi hack compatible with neon hardware

Attached is a patch that modifies the hacks and replaces them with instructions that are
available on armv7 with neon.

Comment #12 by on 2008-12-20 15:54:48

Fix for x86_64:

Comment #13 by on 2008-12-29 20:02:20

Fix for i386:

x86_64 & i386 are Linux only so far. Will need to be transferred to the other md-os.c as

Comment #14 by on 2008-12-29 20:13:33

As the comment mentions, this should be moved to the generic patcher handler function.
It would be better to implement it there and ifdef it away for the archs which do not
have the necessary support functions yet. That way it will be less work.

Comment #15 by on 2008-12-30 13:50:22

Agreed, that would be much better. It has not been done that way yet, though.

Comment #16 by on 2008-12-30 18:09:08

The fixes have been done like this because they should be ported to cacao-1.0.x too.

Comment #17 by on 2009-03-10 16:18:21

*** Bug 44 has been marked as a duplicate of this bug. ***

Comment #18 by on 2009-03-12 15:53:09

This is the clean-solution fix for the given problem on the trunk, please review:

Comment #19 by on 2009-04-15 17:20:42

The new-trap-decoding branch was merged back onto default. Bug is hereby closed.

Attachment id=52

date:2008-10-01 16:11
desc:make fp-abi hack compatible with neon hardware