A few months ago I made some major changes to the SourceMod JIT, the first major changes since the thing was originally written two years ago.
Pawn suffers from many problems. One of my (least) favorites is the lack of separation between a runtime and a context. Some other language implementations don’t seem to have this separation either. Python, notably, has a global runtime and no notion of a context as far as I can tell. SpiderMonkey abstracts both. Lua seems to have both and combines them into one structure.
Pawn historically shoved both a runtime (VM code+heap) and a context (thread stack+locals) into the same structure, and SourcePawn originally did the same. The problem was that once you created this structure, it was immutable. You couldn’t add code or data. Sharing between scripts meant jumping between runtimes, and marshaling across that boundary is very expensive. Each plugin having its own runtime and context is a waste in every sense of the word.
Unfortunately the design of the VM doesn’t really let us fix that. References to memory are not typed, and references to global memory can’t be easily deduced at verification time. This makes it impossible to put multiple scripts into one structure with a common heap. For example:
const.pri 50
load.pri
This is a load from global data at address BASE+50. SourcePawn is a straight-line JIT, and some form of data flow analysis would be needed to catch and relocate these references. Although we could change the compiler, we have to be backwards compatible with old bytecode.
So, stuck with this crummy design, I went on a spree of changes to try and clean things up. The first was separating the ideas of a context and a runtime. Although runtimes are still per-plugin, the invocation model is now centered around Functions and Contexts. That is, the API for calling a function used to require a runtime and a code address. It now requires a function object and a thread context. The API looks a whole lot cleaner, but it’s just lipstick on a pig: internally, contexts and runtimes are still tied together because of Pawn’s limitations.
The second change fell out of the first: the implementations for contexts and runtimes got moved into the JIT library. Previously they were split, with the JIT living half in its own library and half in SourceMod. SourcePawn is officially standalone now. For testing I was even able to hack up a shell app to load plugins using the DLL and headers from SourceMod snapshots.
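To make the shape of that API change concrete, here is a toy sketch in Python. None of these names (Runtime, Context, Function, function_at, invoke, old_style_call) come from SourcePawn; they are made up for illustration, and Python callables stand in for compiled code.

```python
class Runtime:
    """Per-plugin code and globals (toy stand-in, not the SourcePawn type)."""
    def __init__(self, code):
        self.code = code  # maps a code address to a Python callable

    def function_at(self, address):
        # Hand out a function object instead of exposing the raw address.
        return Function(self, address)


class Context:
    """Per-thread state: here only a stack of pushed arguments."""
    def __init__(self):
        self.stack = []


class Function:
    """A callable handle that remembers which runtime it belongs to."""
    def __init__(self, runtime, address):
        self.runtime = runtime
        self.address = address

    def invoke(self, context, *args):
        # Arguments live on the caller-supplied context for the call's duration.
        context.stack.extend(args)
        try:
            return self.runtime.code[self.address](*args)
        finally:
            del context.stack[len(context.stack) - len(args):]


def old_style_call(runtime, address, *args):
    """Old shape: the caller juggles a runtime plus a raw code address."""
    return runtime.code[address](*args)


runtime = Runtime({0x40: lambda a, b: a + b})
add = runtime.function_at(0x40)
ctx = Context()
result_new = add.invoke(ctx, 2, 3)                 # function object + context
result_old = old_style_call(runtime, 0x40, 2, 3)   # runtime + code address
```

In this sketch the Function still holds a reference to its Runtime, which mirrors the point above: the outside looks cleaner, but the two remain tied together underneath.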
The third big change was to move to whole-method incremental compilation. Previously the entire code blob of a plugin would get JIT’d on load. Now we only JIT functions once they’ve been run for the first time. This speeds up load time noticeably; compilation time becomes spread out, and the hot paths (such as OnGameFrame) are hit quickly, so gameplay is unlikely to be interrupted. It’s faster to compile a smaller chunk of code anyway, especially since SourcePawn does little to no JIT-time analysis.
Functions that call other functions (via the CALL opcode) are special. If the target function is already compiled, a direct call is generated. If the function is not already compiled then a special thunk is generated instead. If the thunk is ever executed it checks to see if the function has been compiled yet. If it hasn’t, it compiles it. The thunk completes by replacing itself with a direct call to the target function.
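The thunk scheme can be sketched as a small simulation. This is not the JIT's actual code: ToyJIT, emit_call, and call_sites are hypothetical names, a list slot stands in for a patchable call instruction, and Python callables stand in for both bytecode and machine code.

```python
class ToyJIT:
    def __init__(self, sources):
        self.sources = sources   # name -> callable standing in for bytecode
        self.compiled = {}       # name -> "machine code" (here, the same callable)
        self.compile_count = 0

    def compile(self, name):
        # Compile at most once per function.
        if name not in self.compiled:
            self.compile_count += 1
            self.compiled[name] = self.sources[name]
        return self.compiled[name]

    def emit_call(self, call_sites, slot, target):
        """Emit a call site: direct if the target is compiled, else a thunk."""
        if target in self.compiled:
            call_sites[slot] = self.compiled[target]
            return

        def thunk(*args):
            code = self.compile(target)  # no-op if someone compiled it already
            call_sites[slot] = code      # patch: the thunk replaces itself
            return code(*args)

        call_sites[slot] = thunk


jit = ToyJIT({"helper": lambda x: x * 2})
call_sites = [None]
jit.emit_call(call_sites, 0, "helper")

was_thunk = call_sites[0] is not jit.sources["helper"]
first = call_sites[0](21)    # runs the thunk, which compiles and patches
patched = call_sites[0] is jit.sources["helper"]
second = call_sites[0](4)    # now a direct call, no compile check
```

After the first call the slot points straight at the compiled function, so later calls pay nothing for the laziness.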
The last big change was a rewrite of our debugging implementation. The previous concept (lifted from AMX Mod X) was pretty bad. Pawn emits BREAK opcodes throughout methods to indicate line changes. The JIT would turn each of these BREAK opcodes into a call to a function known as the “debug hook.” The debug hook would look for changes to the frame pointer to deduce whether the call stack was entering or leaving a function. This was really expensive, so you had to explicitly enable debug mode to get error diagnostics.
The new implementation emits a single instruction for each BREAK opcode:
mov [save_pc], pc
For the CALL opcode, the pc (program counter) is pushed onto a separate stack (the call stack). When a CALL finishes, the top value is popped. If a function throws an error, the complete stack trace can be recovered from a combination of save_pc and the call stack. Each pc can be used to find the line, file, and function in the debug info. This is fast enough that a plugin no longer has to be re-JIT’d in debug mode to get diagnostics.
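Here is a rough simulation of how a trace falls out of save_pc plus the call stack. The DebugState class, its method names, and the line table contents are invented for the example; the real debug info lookup would map a pc to the nearest recorded line, whereas this sketch uses exact matches to stay short.

```python
class DebugState:
    def __init__(self, line_table):
        self.line_table = line_table  # pc -> (function, file, line), made-up data
        self.save_pc = None           # last pc recorded by a BREAK
        self.call_stack = []          # pcs of the CALLs currently in progress

    def on_break(self, pc):
        self.save_pc = pc             # the single store: mov [save_pc], pc

    def on_call(self, pc):
        self.call_stack.append(pc)    # pushed when a CALL starts

    def on_return(self):
        self.call_stack.pop()         # popped when the CALL finishes

    def stack_trace(self):
        # Innermost frame comes from save_pc; outer frames from the call stack.
        pcs = [self.save_pc] + self.call_stack[::-1]
        return [self.line_table[pc] for pc in pcs]


line_table = {
    10: ("main", "test.sp", 3),
    12: ("main", "test.sp", 4),
    30: ("helper", "test.sp", 9),
}

dbg = DebugState(line_table)
dbg.on_break(10)             # main, line 3
dbg.on_break(12)             # main, line 4
dbg.on_call(12)              # main calls helper
dbg.on_break(30)             # helper, line 9: suppose the error happens here
trace = dbg.stack_trace()

dbg.on_return()              # had helper returned normally, the entry is popped
depth_after_return = len(dbg.call_stack)
```

The per-line cost is one store and the per-call cost is one push and one pop, which is why nothing needs to be switched on ahead of time.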
So SourcePawn is a little more mature now. It’s got an isolated runtime, it incrementally JITs, and the design (on the outside) is a lot cleaner. I wouldn’t recommend using it in other projects yet, but it’s at a point where if we wanted to take it further we could do so without more massive rewrites.