Technical discussion part 2: constructing constants

One interesting advantage of dynamically generated code is that constants can be embedded into the instruction stream. Several values (usually pointers) are unknown at compile time but behave as constants once a value is assigned to them. These can be embedded into the generated code, which eliminates several load-from-memory operations. WebKit JIT goes one step further: it can also rewrite constants that are not even known at JIT compilation time. Those constants typically hold cached values used by some fast cases.

On x86-based machines, these features are rather easy to implement, since instructions have a 32-bit immediate field, which is enough to hold any immediate value. On ARM, we only have an 8-bit immediate field, which can be rotated by an even number of bits. Therefore, we sometimes need 4 instructions to create a 32-bit number. Fortunately, there is another way to access constants. The program counter (pc) is a regular register on ARM, and reading it yields the address of the current instruction plus 8. Using the pc, we can load constants that are at most 4 kbytes away from the current instruction. The constants are grouped together to form a constant pool, whose entry is protected by a jump instruction so that execution skips over the data. The constant pool is aligned to 32 bytes to aid caching. The drawback of this solution is that memory loads are expensive operations, especially when they cause cache misses.

[jump after the constant pool][constant 0][constant 1]...[constant n]

Constant pool
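
To make the pc-relative addressing concrete, here is a minimal sketch of how the offset of such a load could be computed. The function name and the example addresses are made up for illustration. The +/-4095 byte limit reflects the 12-bit offset field of the ARM load instruction, which is where the "at most 4 kbytes" figure comes from.

```python
def pc_relative_offset(load_addr, const_addr):
    """Offset to encode in a pc-relative load (hypothetical helper).

    load_addr:  address of the load instruction itself
    const_addr: address of the constant inside the pool
    """
    pc = load_addr + 8          # the pc reads as the instruction address plus 8
    offset = const_addr - pc
    if abs(offset) > 4095:      # the offset field is 12 bits wide
        raise ValueError("constant is too far away for a pc-relative load")
    return offset


# Illustrative addresses only: a load at 0x1000 and a pool entry at 0x1028.
print(pc_relative_offset(0x1000, 0x1028))
```

A constant placed directly two instructions after the load has offset 0, which is why the pool has to be flushed before any pending load drifts out of range.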

In our ARM JIT implementation, the constant generator was made as smart as possible. In other words, we tried to find the right balance between speed (a slow algorithm is not suitable for a JIT compiler) and efficiency (using the fewest instructions to generate a specific constant). Furthermore, constants that are not known at JIT compilation time are placed into the constant pool as well.

Finally, let me share some technical details with you. We decided to use the following algorithm:

step 1: it detects whether the constant is an ARM constant (a rotated 8-bit immediate value)
step 2: it detects whether the constant can be encoded by two data processing instructions
step 3: otherwise it bails out, and the constant is simply placed in the constant pool
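
The three steps can be sketched roughly as follows. This is an illustration, not the WebKit code: all function names are hypothetical, the checks use simple loops over the 16 possible rotations for clarity (unlike the loop-avoiding implementation described in this post), and step 2 only considers combining two immediates with OR (for example a mov followed by an orr), while a real assembler may try other instruction pairs too.

```python
MASK32 = 0xFFFFFFFF


def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def is_arm_immediate(value):
    """True if value is an 8-bit number rotated by an even number of bits."""
    return any(rol32(value, 2 * rot) <= 0xFF for rot in range(16))


def split_in_two(value):
    """Brute-force step 2: find (a, b), both ARM immediates, with a | b == value."""
    for rot in range(16):
        rotated = rol32(value, 2 * rot)
        low = rotated & 0xFF
        rest = rotated & ~0xFF & MASK32
        if is_arm_immediate(rest):
            back = 32 - 2 * rot   # undo the rotation
            return rol32(low, back), rol32(rest, back)
    return None


def materialize(value, pool):
    """Choose how to load value into a register; pool is a list of constants."""
    if is_arm_immediate(value):                 # step 1
        return ("mov", value)
    parts = split_in_two(value)
    if parts is not None:                       # step 2
        return ("mov+orr", parts[0], parts[1])
    pool.append(value)                          # step 3
    return ("pool", len(pool) - 1)


pool = []
print(materialize(0xFF000000, pool))  # one rotated 8-bit immediate
print(materialize(0x0000FFFF, pool))  # two immediates
print(materialize(0x12345678, pool))  # falls back to the constant pool
```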

Since one of our primary concerns is speed, we tried to avoid loops: our implementation does not need any loops if the constant itself is an ARM constant. If it is not an ARM constant, we employ a single loop to find a sequence of 8 zero bits (if the constant can be encoded by two data processing instructions, it must have 8 contiguous zero bits somewhere). These 8 bits are rotated to the least significant byte (bits 0-7). The search takes at most 16 iterations. No other loops are needed for our algorithm.
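
A rough sketch of the single-loop search might look like this. It is an assumption-laden illustration rather than the actual implementation: the names are hypothetical, the two halves are assumed to be combined with OR, and the loop-free split of the remaining 24 bits (taking the 8-bit window under the highest set bit, twice) is just one way to finish the job.

```python
MASK32 = 0xFFFFFFFF


def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def top_window(value):
    """Mask of the even-aligned 8-bit window that covers the highest set bit."""
    top = value.bit_length() - 1
    start = top - 7 if top % 2 else top - 6
    return (0xFF << max(start, 0)) & MASK32


def split_two_immediates(value):
    """Return (a, b) with a | b == value, or None if two immediates cannot do it.

    Assumes value is non-zero and is not already a single ARM immediate.
    """
    # The only loop: look for an even rotation that leaves a zero low byte.
    for rot in range(16):
        rotated = rol32(value, 2 * rot)
        if rotated & 0xFF == 0:
            break
    else:
        return None                 # no aligned run of 8 zero bits
    if rotated == 0:
        return None                 # zero needs no splitting

    # Bits 8-31 now hold everything; peel off two windows from the top.
    first = rotated & top_window(rotated)
    rest = rotated & ~first & MASK32
    second = rest & top_window(rest) if rest else 0
    if first | second != rotated:
        return None                 # a third immediate would be needed

    back = 32 - 2 * rot             # undo the rotation
    return rol32(first, back), rol32(second, back)


a, b = split_two_immediates(0x00FF00FF)
print(hex(a), hex(b))
```

Because the zero byte sits in bits 0-7 after the rotation, no 8-bit window has to wrap around, so taking the windows greedily from the top is enough.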

bkil (not verified) - 05/27/2009 - 11:01

This sounds like a neat technique. Could we please get some benchmark results to help place it in context?

zoltan.herczeg - 05/28/2009 - 10:59

The SunSpider results on Nokia N810:

(1) all constants are encoded as load: 44648.0ms
(2) step 2 disabled (see the table above): 44925.0ms
(3) all steps are enabled: 43041.8ms
(4) no loads are used, the last step encoded as 4 data processing instructions: 41505.8ms

The result (4) is better than (3) in this case, because the N810 has a small data cache. However, as the cache grows, this gain tends to vanish. It was a surprise for us, and we decided to keep this version as well for devices with a small cache.

zaheer (not verified) - 11/13/2009 - 18:00

Hi Zoltan, thanks for all this great information. I have one question: is there a way we can see the JavaScript->Bytecode->MacroAssembler->JIT mapping (e.g. in the inspector)? If not, what's the alternative?

zoltan.herczeg - 11/17/2009 - 15:04

Hi Zaheer. Unfortunately there is no such thing in WebKit (at least I am not aware of such a tool). You can get a ByteCode dump using the command line JavaScript execution tool (pass the -d option to jsc).

felixs (not verified) - 04/11/2013 - 13:07

Hi Zaheer, thanks for sharing.

I have a question for you.

I encounter a crash in the sfx when browsing some web pages. Are there any steps or methods for debugging it?

Thanks.
