BZ #85: possible race with trap instructions and signal handlers

Status fields:

creation_ts:2008-07-02 16:37
component:jit
version:default branch
rep_platform:All
op_sys:All
bug_status:RESOLVED
resolution:FIXED
reporter:twisti@complang.tuwien.ac.at

Since we are using trap instructions to invoke patcher functions, we can hit a race
condition in the following situation:

Two threads are running on two CPUs.  The first thread hits the trap and executes the
patcher function.  The second thread hits the same trap instruction, and right after the
kernel has handled the trap for the second thread, the first thread patches back the
original instruction.  The second thread then sees the original instruction in the
signal handler instead of the trap instruction and most likely aborts, since it can't
read the required data from the trap instruction.

A solution would be to check in the signal handler whether there is a trap instruction
at the faulting position (PC) and, if not, to check the patcher list for a patcher trap
at this PC which has already been patched.  If there is one, simply return.
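
A minimal sketch of that check in C, to make the three possible outcomes explicit.  It
is not CACAO code: the patch_ref structure, the function names and the exact-match trap
test are assumptions made for illustration, and the trap word is simply the "machine
code before" value from the log in comment #2.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative trap encoding (the "before" word from the log in comment #2).
   A real port would mask out the data bits encoded in the trap instruction. */
#define TRAP_WORD 0xe7f007f0u

typedef enum {
    TRAP_RUN_PATCHER,     /* trap still present: run the patcher as usual    */
    TRAP_ALREADY_PATCHED, /* another thread patched it: just return          */
    TRAP_UNKNOWN          /* neither a trap nor a known patch position       */
} trap_action;

/* Hypothetical patcher list entry; not CACAO's actual data structure. */
typedef struct {
    const uint32_t *pc;   /* position where the patcher trap was emitted     */
    bool            done; /* true once the original instruction is restored  */
} patch_ref;

/* Linear search of the (hypothetical) patcher list for a given PC. */
static const patch_ref *find_patch_ref(const patch_ref *list, size_t n,
                                       const uint32_t *pc)
{
    for (size_t i = 0; i < n; i++) {
        if (list[i].pc == pc)
            return &list[i];
    }
    return NULL;
}

/* Decide what the signal handler should do for a fault at pc. */
trap_action classify_trap(const patch_ref *list, size_t n, const uint32_t *pc)
{
    if (*pc == TRAP_WORD)
        return TRAP_RUN_PATCHER;

    const patch_ref *ref = find_patch_ref(list, n, pc);
    if (ref != NULL && ref->done)
        return TRAP_ALREADY_PATCHED;

    return TRAP_UNKNOWN;
}
```

On TRAP_ALREADY_PATCHED the handler would return without touching the saved context, so
the thread re-executes the now-original instruction.  The sketch leaves out the locking
and memory ordering that a real handler would need around the patcher list and the code
word.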

Comment #1 by stefan@complang.tuwien.ac.at on 2008-07-23 14:25:53

Cool find!

This could be the reason for most crashes I witnessed in dacapo lusearch and xalan.

Comment #2 by twisti@complang.tuwien.ac.at on 2008-08-13 10:19:21

patching in org.apache.axis.encoding.TypeMappingRegistryImpl.<init>(Z)V PUBLIC (mono)
(impl) at 0x44240bf0
        patcher function = -UNKNOWN PATCHER FUNCTION- <0x40239f60>
        machine code before = e7f007f0 at 0x44240bf0 (disassembler disabled)
        machine code after  = e51c002c at 0x44240bf0 (disassembler disabled)
...
LOG: [0x00004000] md_signal_handler_sigill: Unknown illegal instruction 0xe51c002c at
0x44240bf0
LOG: [0x00004000] Aborting...

Comment #3 by twisti@complang.tuwien.ac.at on 2008-08-13 14:52:09

To fix this bug properly I have to change some of the JIT ports (alpha, mips, maybe
others too) to use a dedicated signal (e.g. SIGILL) for patcher traps.  I'm working on
it.

Comment #4 by twisti@complang.tuwien.ac.at on 2008-08-14 16:11:43

Fix for ARM: http://mips.complang.tuwien.ac.at/hg/cacao/rev/cc536a1ac45c

Comment #5 by twisti@complang.tuwien.ac.at on 2008-09-11 10:30:47

*** Bug 73 has been marked as a duplicate of this bug. ***

Comment #6 by twisti@complang.tuwien.ac.at on 2008-09-11 10:32:32

*** Bug 74 has been marked as a duplicate of this bug. ***

Comment #7 by twisti@complang.tuwien.ac.at on 2008-09-11 10:34:54

*** Bug 75 has been marked as a duplicate of this bug. ***

Comment #8 by twisti@complang.tuwien.ac.at on 2008-09-11 10:46:11

*** Bug 76 has been marked as a duplicate of this bug. ***

Comment #9 by twisti@complang.tuwien.ac.at on 2008-09-12 11:31:34

Fix for powerpc64: http://mips.complang.tuwien.ac.at/hg/cacao/rev/a470a2a8360d

Comment #10 by twisti@complang.tuwien.ac.at on 2008-09-13 14:28:03

Fix for powerpc: http://mips.complang.tuwien.ac.at/hg/cacao/rev/05da7a4ba56b

Comment #11 by thebohemian@gmx.net on 2008-10-01 16:11:08

Created an attachment (id=52)
make fp-abi hack compatible with neon hardware

Attached is a patch that modifies the hacks and replaces them with instructions that are
available on ARMv7 with NEON.

Comment #12 by stefan@complang.tuwien.ac.at on 2008-12-20 15:54:48

Fix for x86_64: http://mips.complang.tuwien.ac.at/hg/cacao/rev/7e6eef2b4c94

Comment #13 by stefan@complang.tuwien.ac.at on 2008-12-29 20:02:20

Fix for i386: http://mips.complang.tuwien.ac.at/hg/cacao/rev/2a4bca2e3f35

The x86_64 and i386 fixes are Linux-only so far. They will need to be transferred to the
other md-os.c files as well.

Comment #14 by michi@complang.tuwien.ac.at on 2008-12-29 20:13:33

As the comment mentions, this should be moved to the generic patcher handler function.
It would be better to implement it there and ifdef it away for the archs which do not
have the necessary support functions yet. That way it will be less work.
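
A rough sketch of what "ifdef it away" could look like, assuming a per-port feature
macro and a per-port support function.  ARCH_HAS_TRAP_CHECK, arch_is_trap_word and
generic_patcher_should_run are hypothetical names used only for this illustration; they
are not existing CACAO symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature macro: a port would define it once it provides
   arch_is_trap_word().  Defined here so the sketch compiles as is. */
#define ARCH_HAS_TRAP_CHECK 1

#if defined(ARCH_HAS_TRAP_CHECK)
/* Hypothetical per-architecture support function.  The exact-match test and
   the trap word (from the log in comment #2) are simplifications. */
static bool arch_is_trap_word(uint32_t word)
{
    return word == 0xe7f007f0u;
}
#endif

/* Generic code shared by all ports: returns true if the patcher should run
   for the instruction word found at the faulting PC. */
bool generic_patcher_should_run(uint32_t word_at_pc)
{
#if defined(ARCH_HAS_TRAP_CHECK)
    return arch_is_trap_word(word_at_pc);
#else
    (void) word_at_pc;
    return true; /* port lacks the support function: keep the old behaviour */
#endif
}
```

The check then lives in one place, and a port opts in by supplying its support function
and defining the macro.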

Comment #15 by stefan@complang.tuwien.ac.at on 2008-12-30 13:50:22

Agreed, that would be much better. It has not been done that way yet, though.

Comment #16 by twisti@complang.tuwien.ac.at on 2008-12-30 18:09:08

The fixes have been done like this because they should be ported to cacao-1.0.x too.

Comment #17 by michi@complang.tuwien.ac.at on 2009-03-10 16:18:21

*** Bug 44 has been marked as a duplicate of this bug. ***

Comment #18 by michi@complang.tuwien.ac.at on 2009-03-12 15:53:09

This is the clean fix for the given problem on the trunk; please review:

http://mips.complang.tuwien.ac.at/hg/cacao/rev/96f53095598b

Comment #19 by michi@complang.tuwien.ac.at on 2009-04-15 17:20:42

The new-trap-decoding branch was merged back onto default. Bug is hereby closed.

http://mips.complang.tuwien.ac.at/hg/cacao/rev/e651482cd9e7

Attachment id=52

date:2008-10-01 16:11
desc:make fp-abi hack compatible with neon hardware
type:text/plain
download:neon-compat.diff