Boost ARM-JIT engine with Nitro Extreme. Brace for impact.
The Nitro Extreme developers encouraged all port maintainers to support the new representation (JSValue32_64), since maintaining several different JSValue representations is a nightmare, and they will surely get rid of the old representation (JSValue32) at some point in the future. Based on my preliminary results, I was not convinced that the new representation would lift our port to a new speed level. Regardless, it never hurts to give it a try.
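For context, here is a minimal sketch of the difference between the two encodings. The type names and tag values are simplified stand-ins, not the exact ones used in JavaScriptCore; the point is that the old representation packs everything into one 32-bit word (boxing doubles on the heap), while the new one always occupies 64 bits and stores doubles inline.

    #include <stdint.h>
    #include <stdio.h>

    // Old representation (JSValue32): one machine word. The low bit tags
    // a 31-bit immediate integer; everything else, including doubles,
    // is a pointer to a heap-allocated cell.
    struct OldValue {
        uint32_t bits;
        bool isInt() const { return bits & 1; }
        int32_t asInt() const { return (int32_t)bits >> 1; }
        // reading a double costs a pointer dereference into the heap
    };

    // New representation (JSValue32_64): always 64 bits, split into a
    // 32-bit tag and a 32-bit payload. Doubles are stored directly as
    // their IEEE-754 bits; the tags live in the unused NaN space, so
    // any tag below LowestTag means "this is a plain double".
    struct NewValue {
        enum { Int32Tag = 0xffffffff, CellTag = 0xfffffffb,
               LowestTag = 0xfffffff9 };
        union {
            double asDouble;
            struct { uint32_t payload; uint32_t tag; } asBits; // little-endian
        } u;
        bool isInt() const { return u.asBits.tag == Int32Tag; }
        bool isDouble() const { return u.asBits.tag < LowestTag; }
    };

    int main() {
        NewValue v;
        v.u.asDouble = 3.14;                    // stored inline, no heap cell
        printf("isDouble: %d\n", v.isDouble()); // prints 1
        return 0;
    }

Note that every 64-bit value is twice as wide as an old one, so any code that copies values in bulk moves twice the data; this trade-off shows up repeatedly in the measurements below.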
Once the implementation was finished, we decided to measure the gain on different benchmark sets. We built eight binaries of revision 51068, covering every combination of JSValue representation (with and without Nitro Extreme support), execution mode (interpreter and JIT), and compiler optimization level (-O2 and -O3). To our surprise, it turned out that the gain of -O3 over -O2 is less than 1%, so we decided to omit the -O3 results from the figures.
Comparison on SunSpider:
The math benchmarks showed a considerable speedup with the JIT, usually around 50-150%. However, the benchmarks which copy (or move) large amounts of JSValues suffered a performance drop (around 20-50%). In the case of the interpreter, the behaviour is similar to the JIT, but the differences are scaled down (around a 30% gain on math, and a 20% slowdown on the others).
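This split matches what the encoding change predicts: number-crunching code wins because doubles no longer need heap cells, while value-copying code loses because every JSValue now moves 8 bytes instead of 4. A standalone illustration of the traffic difference (not engine code, just the raw byte counts):

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    // Copying N old-style values moves 4*N bytes; copying N new-style
    // values moves 8*N bytes. For memory-bound array code this roughly
    // doubles the traffic, consistent with the 20-50% drops above.
    enum { N = 1 << 20 };
    static uint32_t src32[N], dst32[N]; // JSValue32-sized slots
    static uint64_t src64[N], dst64[N]; // JSValue32_64-sized slots

    int main() {
        memcpy(dst32, src32, sizeof src32); // 4 MB copied
        memcpy(dst64, src64, sizeof src64); // 8 MB copied
        printf("%u vs %u bytes per value\n",
               (unsigned)sizeof *src32, (unsigned)sizeof *src64);
        return 0;
    }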
Comparison on the V8 benchmark:
In the case of V8, only the raytrace benchmark became faster, by about 30% using the JIT. The others suffered a performance loss of around 8-20%. This is true for the interpreter as well, although the values are slightly lower (a 25% gain on raytrace, and a 5-15% loss on the others).
Comparison on WindScorpion:
As for WindScorpion, the two JSValue representations have roughly the same runtime speed when using the JIT, and the new representation shows about a 7% loss when using the interpreter. However, comparing the JIT and the interpreter is a different story. The JIT suffers a great performance loss here, mainly because of one benchmark, called WS-email. The result looks better for the JIT if we omit this particular benchmark:
JSValue32_64 is really effective for DES cryptography (2.34x as fast with the JIT, and 1.62x as fast with the interpreter). On the other hand, array-handling algorithms (like bubble sort and the Floyd-Warshall algorithm) slowed down by about 10-20%.
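My reading of the DES result (an interpretation on my part, not something the measurements state directly) is that it comes down to integer handling: under the old JSValue32, an immediate integer had only 31 bits because one bit served as the tag, so 32-bit bitwise results could overflow the immediate range and end up heap-allocated, while JSValue32_64 keeps a full int32 in its payload. A minimal sketch of the old range check:

    #include <stdint.h>
    #include <stdio.h>

    // Under a 31-bit immediate scheme, only values in [-2^30, 2^30)
    // avoid boxing; the tag bit eats the 32nd bit of precision.
    static bool fitsOldImmediate(int32_t v) {
        return v >= -(1 << 30) && v < (1 << 30);
    }

    int main() {
        int32_t x = 0x7fffffff; // a typical bit-twiddling result
        // prints 0: under JSValue32 this would have been heap-boxed,
        // under JSValue32_64 it sits directly in the 32-bit payload
        printf("fits old immediate: %d\n", (int)fitsOldImmediate(x));
        return 0;
    }

The array slowdowns, by contrast, look like the 8-byte copy cost again: a bubble-sort swap or a Floyd-Warshall row update moves twice as many bytes per element as before.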