BZ #78: Fast subtype checking
I've started an implementation of fast subtype checking (like HotSpot's).
It's x86_64 only at the moment. I won't have time to work on it for at least 2 months,
so I'm making it available here in case somebody wants to pick it up.
- Because of the messy CISC code, I couldn't always find useful instruction mnemonics in
emit.*/codegen.h, so two instructions are emitted as raw machine code in codegen.
- The code is x86_64 only. Needs to be ported to all architectures.
- The overflow array is allocated via malloc and never freed. Not nice.
- The x86_64 code uses the red zone (below the stack pointer) for the array scanning
code. twisti doesn't like that...
- The fields in _vftbl should be reordered. I think I'm wasting some space on alignment
padding. They are also not nicely formatted.
We handle secondary types (interfaces) the way we always have. I've also left in the
check for ACC_INTERFACE, so the new display-based subtype code only needs to handle
primary types. So for us, the algorithm looks like this:
    if S.display[T.depth] == T:      (where T.depth is a constant and encoded
                                      in the instruction itself)
        return true
    if T.offset != &.display[DISPLAY_SIZE]:
        return false
    return array_contains(T, *S.overflow (of length S.overflow_length))
The overflow part is allocated on the heap, and in our case we would not even have to
scan it, because it contains just the entries from the display, and those are in the
correct order. A simple indexed lookup would therefore be sufficient. However, this test
is rare, so I don't mind wasting a few cycles here in the name of future extensibility.
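
The check described above can be sketched in plain C. This is only an illustration under stated assumptions, not the code from the patch: the struct `vftbl_sketch`, its field names, the value `DISPLAY_SIZE = 4`, and the helpers `vftbl_sketch_init` and `overflow_lookup` are all made up for the example. `slot` stands in for `T.offset` and is modelled as an index into the display rather than a byte offset, and the extra slot `display[DISPLAY_SIZE]` is simply left empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DISPLAY_SIZE 4 /* assumed value, for illustration only */

typedef struct vftbl_sketch vftbl_sketch;
struct vftbl_sketch {
    int depth;                                     /* distance from the root class */
    int slot;                                      /* min(depth, DISPLAY_SIZE), stands in for T.offset */
    const vftbl_sketch *display[DISPLAY_SIZE + 1]; /* last slot is never filled in this sketch */
    const vftbl_sketch **overflow;                 /* supertypes at depth >= DISPLAY_SIZE, in depth order */
    int overflow_length;
};

/* Fill in the display and overflow of v from its direct superclass
   (NULL for the root). Like the patch described above, the overflow
   array is malloc'ed and never freed. */
static void vftbl_sketch_init(vftbl_sketch *v, const vftbl_sketch *super)
{
    int depth = super ? super->depth + 1 : 0;

    v->depth = depth;
    v->slot = depth < DISPLAY_SIZE ? depth : DISPLAY_SIZE;
    v->overflow = NULL;
    v->overflow_length = 0;
    for (int i = 0; i <= DISPLAY_SIZE; i++)
        v->display[i] = NULL;
    if (super)
        for (int i = 0; i < DISPLAY_SIZE; i++)
            v->display[i] = super->display[i];

    if (depth < DISPLAY_SIZE) {
        v->display[depth] = v;
    } else {
        int n = depth - DISPLAY_SIZE + 1;
        v->overflow = malloc((size_t) n * sizeof *v->overflow);
        assert(v->overflow != NULL);
        for (int i = 0; i < n - 1; i++)
            v->overflow[i] = super->overflow[i];
        v->overflow[n - 1] = v;
        v->overflow_length = n;
    }
}

/* Display-based check for primary types: is s a subtype of t? */
static bool fast_subtype_check_sketch(const vftbl_sketch *s, const vftbl_sketch *t)
{
    if (s->display[t->slot] == t)
        return true;                /* hit in the display */
    if (t->slot != DISPLAY_SIZE)
        return false;               /* t is shallow, so the display is authoritative */
    for (int i = 0; i < s->overflow_length; i++)
        if (s->overflow[i] == t)
            return true;            /* rare case: t is deep, scan the overflow */
    return false;
}

/* Alternative to the scan: since the overflow is ordered by depth,
   a single indexed lookup gives the same answer for deep t. */
static bool overflow_lookup(const vftbl_sketch *s, const vftbl_sketch *t)
{
    int i = t->depth - DISPLAY_SIZE;
    return i >= 0 && i < s->overflow_length && s->overflow[i] == t;
}
```

For a chain of seven classes with `DISPLAY_SIZE = 4`, the first four land in the display and the remaining three in the overflow array. Checking the deepest class against any of its ancestors succeeds either on the first comparison or in the overflow scan, while a check against an unrelated shallow class fails after the second comparison without touching the overflow.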
- The actual code for ICMD_CHECKCAST and ICMD_INSTANCEOF is almost identical.
- It's basically an inline implementation of fast_subtype_check in builtin.c.
- Almost all of the code can be left out when super is known and its depth is not too
- I used the red zone because of severe register shortage. Maybe someone will find a
more elegant solution.
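
To illustrate why the two instructions can share nearly all of their code, here is a small C sketch in which both are thin wrappers around one subtype check. All names (`vftbl_sketch`, `object_sketch`, `instanceof_sketch`, `checkcast_sketch`, `subtype_check_known_shallow`) and the value of `DISPLAY_SIZE` are assumptions for the example and do not come from the patch. The only difference between the wrappers is the treatment of null, which follows the usual JVM semantics: instanceof yields 0 for null, checkcast lets null through. The last function shows that when the super class and its depth are compile-time constants and the depth is below `DISPLAY_SIZE`, the algorithm reduces to a single load and compare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DISPLAY_SIZE 4 /* assumed value, for illustration only */

typedef struct vftbl_sketch vftbl_sketch;
struct vftbl_sketch {
    int depth;
    const vftbl_sketch *display[DISPLAY_SIZE + 1]; /* last slot left empty here */
    const vftbl_sketch **overflow;
    int overflow_length;
};

/* Hypothetical object header: just a pointer to the class's vftbl. */
typedef struct {
    const vftbl_sketch *vftbl;
} object_sketch;

/* Shared subtype check for primary types. */
static bool subtype_check_sketch(const vftbl_sketch *s, const vftbl_sketch *t)
{
    int slot = t->depth < DISPLAY_SIZE ? t->depth : DISPLAY_SIZE;

    if (s->display[slot] == t)
        return true;
    if (slot != DISPLAY_SIZE)
        return false;
    for (int i = 0; i < s->overflow_length; i++)
        if (s->overflow[i] == t)
            return true;
    return false;
}

/* instanceof: null is never an instance; result is 0 or 1. */
static int instanceof_sketch(const object_sketch *o, const vftbl_sketch *t)
{
    return o != NULL && subtype_check_sketch(o->vftbl, t);
}

/* checkcast: null always passes. A false result marks the place where
   generated code would raise ClassCastException. */
static bool checkcast_sketch(const object_sketch *o, const vftbl_sketch *t)
{
    return o == NULL || subtype_check_sketch(o->vftbl, t);
}

/* Known super class with known depth below DISPLAY_SIZE:
   one load from the display and one compare. */
static bool subtype_check_known_shallow(const vftbl_sketch *s,
                                        const vftbl_sketch *t, int t_depth)
{
    assert(t_depth >= 0 && t_depth < DISPLAY_SIZE);
    return s->display[t_depth] == t;
}
```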
desc: subtype branch export