Category Archives: Mozilla (non-PMO)

Graduate School Lost

About six months ago I applied to $UNIVERSITY to enroll in its Computer Science PhD program. I had just crammed a four year undergraduate degree into six years, as my father jokes.

Whether or not to accept admission was a deeply divisive decision for me. I got a lot of opinions from family and peers. I went searching online for advice articles. It was a lot to digest and I ended up accepting the offer in March. The internal struggle didn’t go away, but the voices around me did. In July I paid my first rent check, still uneasy.

Come August, there was a defining day – no, moment – when I realized I did not want to go to graduate school. It was so blinding that I had to get out immediately. The intensity of the feeling subsided the next day (and I cannot “reproduce” it now), but it didn’t matter. I had seen the path I truly wanted to take.

I’m omitting my reasons since I don’t want to unduly influence anyone else approaching any level of education. I’ve done this too much in the past – it’s not my place. Everyone must follow their own path, and only by stewing over the decision alone for five months was I able to get closure.

To the people who gave me the (much needed) confidence and support in getting into school, and who I subsequently inconvenienced, I am deeply appreciative. I’m truly sorry it didn’t work out.

But now that is all behind me, and as of September 1st, I look forward to doing fo’ realz what I love doing.

Very shortly: a much less emo article on tracing recursion.

Trace Compilation at PLDI 2009

To mirror Dave Mandelin’s blog post, we’ve gotten a paper on trace compilation accepted to PLDI, one of the top programming language research conferences.

Andreas Gal is presenting the paper on June 18th in Dublin. You can read it by clicking here (requires PDF reader).

Trace compilation is the new optimization technology appearing in Firefox 3.5. If you want to see how it feels, give 3.5 beta a go (it’s stable and nearing release).

I’ll have a bigger and better post about this and sundry things soon. The short of it is, I’m back into trace/language research after a six-month reprieve. It is good to be back!

A Divine TraceMonkey Bug

I spent the better part of the last week or so on a rather annoying TraceMonkey bug. A few weeks ago I mentioned that we extended multiple type-specialized trees to include specialization on global types. Suddenly a rare but dangerous class of bugs appeared.

The Bug

When a trace exits back to the interpreter, it does so through a guard. These guards contain an “exit typemap,” which tells the interpreter how to take types from the native CPU and box them back into generic containers for the interpreter. The bug was that some global variables were being boxed back as the wrong type. For example, the 32-bits in an integer would be stored in a slot interpreted as a 64-bit double, or an object would be tagged as a string. When the interpreter went to unbox these again, it got garbage. That’s bad. The only way this can happen is if the exit typemap contains wrong type information.

Lazily Specialization

Global variables are tracked lazily. The overarching Trace Monitor keeps track of which global variables have been seen on trace. For example, say we have a type-stable tree, Tree 1. It has no global variables, and thus has empty exit and entry typemaps (for globals). Later, the global variable X is encountered. It’s now being tracked, but Tree 1 doesn’t have a type entry for it.

If Tree 1 wants to execute again, it will lazily update its entry typemap. The exit typemaps on the other hand are fixed and cannot be modified. So now Tree 1 looks like this:

Tree 1 Entry: X = Object
Tree 1 Exit: 

When exiting the tree, we merge the exit and outermost entry types together, giving a complete typemap. More on this later. When entering Tree 1, X is unboxed and boxed as an Object, even if it never gets used. This is because Tree 1 could call into a nested tree that does use X.

Problem #1

Let’s say later we build Tree 2. It’s the same loop as Tree 1, but it’s type-unstable. The typemaps look like this:

Tree 2 Entry: X = String
Tree 2 Exit:  X = Object

TraceMonkey’s multitrees kicks in, and now Tree 2‘s exit will go directly to Tree 1‘s entry, since their types link together. When Tree 1 exits, we combine the outermost typemap with the exit typemap. In this case, Tree 2‘s entry is the outermost typemap, but it contains the wrong type! The type of X is an Object, but now it’s being boxed as a String. Crash.

Note: This is different from normal type instability, because type unstable traces represent a path where a type is actually changed. In this scenario, the type is being mistaken, which is deadly.

Failed Solutions

Solving this was complicated by nested trees. Trees can be deeply nested, and each tree along the way could be missing global types, so it seemed like we needed to recover missing global types incrementally up the call stack. That is:

  1. Start with global types from the innermost exit.
  2. Add missing globals from each tree on the call stack, starting at the deepest, and ending at the outermost tree.

Since the outermost tree is the original tree we started with, it is guaranteed to have all missing types, so it was the last resort. Achieving this ended up with three different implementations as the patch progressed, but the idea was: as we enter trees, push them onto a stack. As we exit trees, pop the stack. If we hit a guard (which exits), immediately reconstruct the full global typemap using that stack of trees. By the time we exit back to the interpreter, a global typemap will have been prepared already.

Alas, this wasn’t enough, because…

Problem #2

Say we have two trees on the same loop, Tree 1 and Tree 2. Their typemaps are:

Tree 1 Entry: X = Object
Tree 1 Exit:  X = String
Tree 2 Entry: X = String
Tree 2 Exit:  X = String

In this situation, Tree 1‘s exit is linked to Tree 2‘s entry. Later, a new global appears, Y. We execute Tree 2 with X = String and Y = Object. Then we execute Tree 1 with X = Object and Y = String. Because of lazy specialization, the typemaps now look like this:

Tree 1 Entry: X = Object, Y = String
Tree 1 Exit:  X = String
Tree 2 Entry: X = String, Y = Object
Tree 2 Exit:  X = String

This is problematic, because Tree 1 is still connected to Tree 2, but their typemaps are incompatible! If we run Tree 1, Y will be unboxed as a String and reboxed as an Object, without the underlying type actually changing. The solution to Problem #1 doesn’t fix this.


Linked trees should never have incompatible typemaps. What counts as a linked tree? Any two trees that are connected via a nested call (nested loops), or any two trees that are connected via type instability (multitrees), are “friendly.”

In pseudocode:

  FOREACH Tree IN FriendlyTrees
    IF MissingGlobals(Tree)

Now when a tree exits, it suffices to use only the exit typemap and the typemap of the innermost entry (that is, the tree the exit immediately came from). This neatly solves Problem #1. If any one tree gets a new global, the entire graph of connected trees is updated, solving Problem #2.

This is probably one of the more difficult TraceMonkey bugs I’ve had to reason through. Nested trees always seems to complicate how you reason about tracing, and in this case it ended up not mattering. And though the problems are conceptually disjoint, they also feed into each other: fixing only #1 led to #2 breaking.

More Type Instability and Specialization in JavaScript

A while back I talked about type instability in JavaScript, and how TraceMonkey optimizes it by connecting type-specialized traces together. This stitching mechanism was only half complete.

TraceMonkey has a global cache called the “trace monitor.” As the original version did not type-specialize any loop (tree) more than once, there was a design decision to only monitor one global object at a time. This is significant because there can be many global objects in JavaScript. Each window in the browser has its own global object as a security measure (windows can’t poison other windows).

Let’s say there’s a window A with global object of shape X. The monitor notices this and compiles a bunch of traces specialized to that global object. Then window B with global object of shape Y starts running an intensive loop. The trace monitor kicks in, notices that the global object is different, and flushes its entire cache. A cache flush invalidates every trace, meaning the JIT’d code is deleted.

The trace monitor also locked itself to the types of every property in the global object it was tracking. If any variable in the global object changed, the cache would flush, even if most of the traces never used this variable. Any type of global instability, either from changed global variables or different global objects, would throw out all JIT’d code.

This would be fine:

Select All Code:
function f() { 
   for (var i in ["1", "1", "1", 1, 1, 1.5])

This would continually flush the cache, as i is global and type unstable:

Select All Code:
for (var i in ["1", "1", "1", 1, 1, 1.5])

Luckily we’re now working on fixing this, and the first major step landed last week. The trace monitor no longer keeps track of global types. This information is tracked in each tree and each guard, and the old principles of “multitrees” apply. If a global variable is not type-stable across a loop, the loop is left unclosed and the dangling edge becomes an “unstable exit.” If there ever exists a tree whose entry types match an unstable exit, the edges are joined together.

See the original post for diagrams.

The only additional complication with global variables is that they are lazily tracked by the trace monitor. This is an optimization. If there are 50,000 globals in play and only 5 ever get used, there’s no need to iterate through every single one. As traces get recorded, every tracked global variable is included in a tree’s specialization. Old trees that don’t include newly tracked globals will be updated automatically. This removes the need for complex dependency tracking for branches and nested trees.

So now that global types are specialized pre-tree, what’s the next step? The plan is to include the actual identity of the global object in per-tree specializations. The easiest way to do this is probably to probably divide up the trace monitor, so that it can maintain separate caches for each global object. This way if a global object’s shape changes, only its cache will be flushed.

Fine-grained specialization continues to be the most interesting and promising aspect of trace compilation for dynamic languages.