Firefox has a JIT for JavaScript now. Whoa.
So, I started at Mozilla by working on Tamarin-Tracing. Tracing is Andreas Gal’s fancy new idea for run-time guided JIT optimization, a powerful new concept that offers huge benefits over whole-method compilation. I talked about this before, perhaps erroneously, but the concepts are there.
The old style of compilation is to perform static analysis on entire methods at a time, compiling them to assembly when necessary. Without running the program, you decide how to compile loops and nested loops efficiently, perhaps even trying to decide if they’re expensive or not. Methods may or may not be inlined, but they are still the fundamental building block of most compilers.
The concept of methods quickly disappears in a tracing compiler. Everything is inlined as the tracer only compiles exactly what low level operations it sees being performed (and any sort of control flow is essentially a no-op). A tracing JIT essentially turns an expensive loop into its own isolated method call, optimized for its run-time properties, regardless of where it is or what the loop has to call into.
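To make the "everything is inlined" point concrete, here is a toy sketch in Python. It is not TraceMonkey code: the `Tracer` and `Traced` classes and the `square`/`loop_body` functions are made up for illustration, and a real tracer hooks interpreter opcodes rather than operator overloads. The idea it shows is that the recorder only sees the primitive operations that actually run, so the call to `square` never appears in the trace.

```python
class Tracer:
    """Hypothetical recorder that logs each primitive operation it observes."""

    def __init__(self):
        self.ops = []
        self.counter = 0

    def emit(self, op, lhs, rhs, value):
        # Give the result a fresh temporary name and log the operation.
        name = f"t{self.counter}"
        self.counter += 1
        self.ops.append(f"{name} = {op} {lhs.name}, {rhs.name}")
        return Traced(self, name, value)


class Traced:
    """A value that reports arithmetic performed on it to the tracer."""

    def __init__(self, tracer, name, value):
        self.tracer, self.name, self.value = tracer, name, value

    def __add__(self, other):
        return self.tracer.emit("add", self, other, self.value + other.value)

    def __mul__(self, other):
        return self.tracer.emit("mul", self, other, self.value * other.value)


def square(x):
    return x * x


def loop_body(x, y):
    # A "method call" from the program's point of view.
    return square(x) + y


tracer = Tracer()
result = loop_body(Traced(tracer, "x", 3), Traced(tracer, "y", 4))
for op in tracer.ops:
    print(op)
```

The recorded trace contains a multiply and an add and nothing else. The function boundary between `loop_body` and `square` has vanished, which is roughly what inlining by construction means here.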
Andreas’s original paper on tracing was targeted toward mobile performance, where whole-method compilation and static analysis are too expensive. For dynamic languages, one instruction can have many decision paths at run-time. Whole-method JITing is a real problem because the code required for each opcode becomes very large, and static analysis is either unfruitful (because of dynamic types) or just too expensive. This is especially problematic for JavaScript, where browser performance is critical and time spent analyzing code is time wasted.
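Here is a rough Python sketch of what "many decision paths" looks like, and what a tracer can do about it. The functions are hypothetical and only loosely modeled on a dynamic `+`; they are not SpiderMonkey's actual semantics. A whole-method compiler has to emit code for every branch of `generic_add`, while a trace that only ever saw integers can keep a single path behind a cheap type guard and bail out (a "side exit") if the guess turns out wrong.

```python
class SideExit(Exception):
    """Raised when a guard fails and control must return to the interpreter."""


def generic_add(a, b):
    # Toy "fat" add: the operand types decide which path runs.
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)          # string concatenation path
    if isinstance(a, float) or isinstance(b, float):
        return float(a) + float(b)      # floating-point path
    return a + b                        # integer path


def specialized_int_add(a, b):
    # Guard: this code is only valid for the types seen while recording.
    if type(a) is not int or type(b) is not int:
        raise SideExit("type guard failed")
    return a + b                        # the only path the trace needs


print(generic_add(1, 2))
print(generic_add("a", 1))
print(specialized_int_add(2, 3))
```

The specialized version is smaller and faster on the hot path, at the cost of needing a fallback whenever the guard fails.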
Thus it’s no surprise that Adobe decided to try tracing in the next generation of their Tamarin project. Adobe’s approach to tracing ActionScript is to create very primitive building blocks and trace those at the lowest level. It does this by converting ActionScript bytecode to a Forth dialect, and tracing the primitive Forth operations.
Mozilla’s JavaScript engine (“SpiderMonkey”) is very different. It has a decade’s worth of optimization hacks and very “fat” opcodes (instructions that have a lot of internal decisions, rather than performing one single operation). Although there were originally plans for Mozilla to switch to Tamarin, throwing out SpiderMonkey had a lot of hurdles, and the TraceMonkey project was started instead.
Luckily Adobe had very nicely separated the tracing backend from their interpreter. Tamarin-Tracing has a “nanojit” component with a simple IR. Interpreters are responsible for emitting the IR, and nanojit can compile straight-line IR blocks into native code. It can also link compiled code fragments together for attaching branches and building trees of traces.
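As a loose analogy for that split between "interpreter emits IR" and "backend compiles a straight-line block", here is a small Python sketch. The tuple-based IR and the `compile_block` function are invented for this example and do not reflect nanojit's real LIR or API; the "native code" here is just a generated Python function.

```python
# Hypothetical three-address IR: (destination, operation, left, right).
PY_OPS = {"add": "+", "sub": "-", "mul": "*"}


def compile_block(params, ir, result):
    """Compile a straight-line IR block into a single Python callable."""
    lines = [f"def fragment({', '.join(params)}):"]
    for dest, op, lhs, rhs in ir:
        # No branches or loops: one IR instruction becomes one statement.
        lines.append(f"    {dest} = {lhs} {PY_OPS[op]} {rhs}")
    lines.append(f"    return {result}")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["fragment"]


# IR that a front end might emit for x * x + y.
ir = [
    ("t0", "mul", "x", "x"),
    ("t1", "add", "t0", "y"),
]
fragment = compile_block(["x", "y"], ir, "t1")
print(fragment(3, 4))
```

Because the block has no internal control flow, the backend can stay simple. Anything that would be a branch becomes a guard or a link to another compiled fragment.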
Using Adobe’s nanojit, Andreas decided to take a top-down approach to tracing SpiderMonkey. The edge of every loop is monitored. If a loop is executed enough times, the tracer is activated. Every opcode is hooked and critical decision points are emitted as nanojit IR where possible. When the control flow reaches the loop edge again, the IR is compiled and the loop will run as native code thereafter.
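The monitor, record, compile cycle described above can be sketched roughly like this in Python. Everything here is an assumption made for illustration: the `LoopMonitor` class, the threshold of 2, and the `interpret`/`compile_trace` callbacks are not TraceMonkey's real names or values, and the "compiled" loop body is just another Python function.

```python
class LoopMonitor:
    """Hypothetical loop-edge monitor: interpret, then record, then run compiled code."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.hit_counts = {}   # loop id -> times the loop edge was crossed
        self.compiled = {}     # loop id -> compiled loop body
        self.log = []          # which mode each iteration ran in

    def run_iteration(self, loop_id, interpret, compile_trace, state):
        if loop_id in self.compiled:
            self.log.append("native")
            return self.compiled[loop_id](state)
        count = self.hit_counts.get(loop_id, 0) + 1
        self.hit_counts[loop_id] = count
        if count > self.threshold:
            # Hot loop: interpret one more iteration while recording it,
            # then compile the trace once the loop edge is reached again.
            self.log.append("record")
            result = interpret(state)
            self.compiled[loop_id] = compile_trace()
            return result
        self.log.append("interpret")
        return interpret(state)


def interpret(total):
    return total + 1               # stand-in for interpreting the loop body


def compile_trace():
    return lambda total: total + 1  # stand-in for the compiled trace


monitor = LoopMonitor(threshold=2)
total = 0
for _ in range(5):
    total = monitor.run_iteration("loop@pc=10", interpret, compile_trace, total)
print(monitor.log, total)
```

Cold loops never pay for compilation, and the recording cost is paid once per hot loop.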
There are some fascinating aspects to Andreas’s work. Type speculation and specialization, the native stack versus the script stack, tree specialization, and his handling of global variables are all intricate and critical to the rapid progress and success he’s had with TraceMonkey. And he (and Mike Shaver and Brendan Eich) did it all in 60 days, which is amazing.
What’s my role in all this? My summer intern project was porting the code generator and tracer to AMD64, which has landed and seems to work in the shell. I’ve also been debugging anything that goes wrong on the 32-bit port. Working with nanojit was a lot of fun - Adobe did a great job making it usable by other projects, and it’s definitely something that could become a generic library for dynamic languages to use for tracing.
The big news today is that TraceMonkey has landed in mozilla-central and will probably be turned on by default for Firefox 3.1 beta 1. Although it was open source and downloadable during development, it is now being officially announced and publicized, and can be used in the official nightly builds. The speed difference is noticeable in sites doing intense JavaScript processing. And though the SunSpider benchmarks can be considered superficial, it’s great to see the improvements we’re getting on them versus the old SpiderMonkey.
This is just the beginning. A lot more is planned for TraceMonkey and for tracing in general. Code that used to be considered too crazy for JavaScript, like graphics and crypto loops, is becoming plausible. We’re already noticing smoother play quality in some 3D JavaScript games on the web (using Canvas) and in other heavy applications. In Mike Shaver’s words, this could change the way people use JavaScript.
In terms of portability, nanojit still needs a bit of work, but it’s been hammered into shape for AMD64 for the time being. It can use the extra eight registers available with REX prefixes, and it will perform 32-bit integer math versus 64-bit pointer math correctly (provided the correct LIR instructions are used). I also took the liberty of prettying up the macros used for code generation. Some of the work remaining that I’d like to do in terms of the overall x86/x64 assembly process:
- Taking more advantage of addressing modes — we can reduce register pressure by combining redundant store/load ALU operations.
- Improving SSE2 logic which currently uses LAHF/PUSHF.
- Improving calling conventions for SSE2 and reducing register spilling.
- Inheriting type information from child instructions, to remove the need for separately typed IR instructions (i.e. no need for add versus fadd versus qadd).
- Enabling 64-bit jitting in the browser (too unstable right now so it’s only on in the shell).
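For readers who haven’t met REX prefixes, here is a small Python sketch of how they give access to the extra eight registers and select 32-bit versus 64-bit operand size. The function names are mine and this is not nanojit’s code generator; it only encodes the register-to-register form of `ADD` (opcode `0x01`) to keep things short.

```python
def rex_prefix(w, r, x, b):
    """Build a REX byte: 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def encode_add_reg_reg(dst, src, is_64bit):
    """Encode ADD dst, src for register numbers 0..15 (0 = rax, 3 = rbx, 9 = r9, ...)."""
    # REX.W selects 64-bit operands; REX.R and REX.B supply the fourth
    # bit of the register numbers, which is how r8..r15 become reachable.
    rex = rex_prefix(int(is_64bit), src >> 3, 0, dst >> 3)
    # ModRM with mod=11 (register direct), reg=src, rm=dst.
    modrm = 0xC0 | ((src & 7) << 3) | (dst & 7)
    code = bytearray()
    if rex != 0x40:
        code.append(rex)  # a bare 0x40 prefix is not needed for these forms
    code += bytes([0x01, modrm])
    return bytes(code)


RAX, RBX, R9, R10 = 0, 3, 9, 10
print(encode_add_reg_reg(RAX, RBX, is_64bit=False).hex())  # add eax, ebx
print(encode_add_reg_reg(RAX, RBX, is_64bit=True).hex())   # add rax, rbx
print(encode_add_reg_reg(R9, R10, is_64bit=True).hex())    # add r9, r10
```

The same opcode serves both widths. Only the prefix changes, which is why the IR has to say whether an add is integer math or pointer math.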
I should thank Edwin Smith at Adobe for putting up with my intense nanojit nagging; Mozilla for giving me the opportunity to work on this project as an intern; and Andreas Gal for coaching me through the tracing concepts every time I got them wrong.
For people who follow this blog from the SourceMod project, will SourcePawn get tracing? It’s something I’m experimenting with and will talk more about later. There are some hurdles to JITing Pawn in that very careful escape analysis is needed to make any of the nice optimizations.