Category Archives: Articles

“Official” articles that were pre-written for posting.

Six Years Later, Steam is Still Annoying

It’s been a year or so since I fired up Steam to actually try and play a game. Given that I help maintain multiple mods for Valve games, that might seem surprising.

I love the concept of Steam, I’ve had it installed since the early betas, and I hated the days of scouring for files like “10161100.exe” to update Half-Life 1. But the actual software is so mind-bogglingly annoying. I cringe when I hear Steam fanboys tout it as the shining jewel in content distribution.

To demonstrate, I recently got a desktop and decided to try playing a game I paid for on Steam. I downloaded Steam, installed it, and instantly got bombarded with stuff like this:

Steam Notifications

These three things would slide down, then three more would pop up. Rinse and repeat every few seconds. Really annoying. I have Steam Friends completely pref’d off on my laptop, but I guess Steam doesn’t sync account settings across computers. I’d rather not disable Steam Friends completely, but of the three options to disable notifications, none seem related to the ones up above. So I pref’d Friends off. The little messages kept coming though, until the whole list of notifications was exhausted.

Later I came back and tried to install a game. It said “Your Steam ticket has expired” and asked for my password. I typed it in, and nothing happened. The dialog disappeared but the game wasn’t downloading. I double-clicked the game to try again. It asked for my password again, but no-go.

I tried this a few more times, then restarted Steam. When it started back up, all of the icons were missing:

Steam - No Icons

Okay, that’s weird. I double-clicked the game again, and I still got the expired ticket dialog box. I typed in my password again, but this time selected “Remember my password”. The game didn’t start installing, but the icons appeared.

I tried installing again, and now I got a new dialog box: “The Steam servers are currently too busy to handle your request.” Huh? The next try got me back to the password entry dialog box because my “Steam ticket” had expired again.

I searched Google and found a Steam forum thread describing my problem. Another thread linked from comment #11 said to try deleting “ClientRegistry.blob”, and if that doesn’t work, reinstall Steam.

So I exited Steam, deleted “C:\Program Files (x86)\Steam\ClientRegistry.blob”, and restarted. When I tried installing the game, I actually got a progress bar. By the time it had finished downloading I’d moved on to other things, but at least next time I’m in the mood to play a game on Steam, I know to delete random internal files.

This product… needs polishing.

64-bit TraceMonkey!

As of an hour ago, Mozilla’s JavaScript engine now has 64-bit JIT support turned on by default in the TraceMonkey tree only. We currently don’t ship a 64-bit browser, but we will eventually, and this is an important step in making sure we’re staying optimized on all major platforms.

If you are bleeding-edge and build your own 64-bit Firefox, you can now easily try out the JIT and report any problems.

A nice discovery was that the 64-bit engine as a whole outperforms 32-bit builds. SunSpider ran about 20% faster, comparing both 32-bit and 64-bit builds with and without the JIT.

For more information see this newsgroup post on

Trace Compilation and Recursion, Part 1

TraceMonkey does really well at lots of benchmarks. Specifically, it does well at benchmarks that it can trace compile. If for some reason a loop contains a code path that cannot be traced, that path will never run inside the JIT – instead it stays in the interpreter, which is much slower.

There isn’t much left that we can’t trace, but when that happens, it lights up benchmarks in bad ways. One of the areas we don’t trace at all is recursion.

Recursion isn’t too common on the web but it’s still a big hole in our functionality, so I worked on it quite a bit this summer. Unfortunately it’s non-trivial, and still hasn’t landed, but I’d like to talk about what I do have working.

Quick Introduction

TraceMonkey uses trace trees. If some repetitive code is run, like a loop, a trace recorder kicks in. This records the behavior of the code for one iteration. If any behavior could diverge at runtime – such as an if branch – it becomes a “guard”, and we only record the observed path.

We assemble these observations into statically-typed, native CPU code. If a guard is triggered while running this new, faster code, it means the behavior has changed. We start recording a new path, and attach it to the point of divergence. Thus the code is literally a tree of traces.

Recursion is a Loop, Sort Of

Recursion is a loop, so we’d like to compile as such, using the behavior we observe at runtime. Let’s look at a simple recursive function and identify where it loops.

Select All Code:
function factorial(n) {
  if (n == 0)
    return 1;
  return n * factorial(n - 1);

What happens when you call factorial(10)? It starts off in the interpreter here:

Select All Code:
  if (n == 0)

n isn’t 0, so the indented path gets skipped.

Select All Code:
  return n * factorial(n - 1);

First n - 1 evaluates to 9. Then factorial() is called again, and we’re back at:

Select All Code:
  if (n == 0)

Bam. We’ve hit the same point, and thus identified a repeating pattern. To make sure it’s hot, the tracer recorder waits for one more iteration. Then it kicks in and starts recording the behavior it sees. Is (n == 0)? No, n is 8, so the tracer records:

Select All Code:
1: guard n != 0

Next, the trace compiler sees the n - 1 and function call and records those.

Select All Code:
2: subtract(n, 0x1)
3: add stack space (frame) for factorial call
4: n = result of instruction #2

Next the trace recorder sees if n == 0, which is where it started recording. It adds a “jump to the beginning” instruction and compiles the trace to native code. Visually it looks something like this:

Down recursion

But wait! That’s only “half” of the function. Specifically, it’s the down-recursive half, which dives deeper and deeper. What happens when we run this? The code will build a bunch of stack frames that look like this:

frame # variables
9 n = 0
8 n = 1
7 n = 2
6 n = 3
5 n = 4
4 n = 5
3 n = 6
2 n = 7
1 n = 8

When n == 0, the guard fails. This is really nasty. It turns out the JIT and interpreter have completely different ways of representing stack frames, and when transferring control the frames must be converted from one to the other. Locally, this process is really expensive. As you can imagine it is disastrous for deeply recursive functions. The only way to fully mitigate this is to make sure we spend more time in the JIT. Ideally, we never want to reconstruct any frames.

Upwards Recursion

But back to what happens. n == 0, the guard has triggered, and we’re back in the interpreter. The recorder only wants to observe hot code, so it doesn’t record the path from the guard yet. Now we see this get executed:

Select All Code:
  if (n == 0)
    return 1;

Remember, we’re at frame #9 from the diagram above. The “return” instruction pops us to frame #8, and now we’re at:

Select All Code:
  return n * (result of factorial(n - 1));

This returns 1 * 1 to the calling function, and now we’re at:

Select All Code:
  return n * (result of factorial(n - 1));

We’re back at the same place! That’s a loop. This time 2 * 1 is multiplied to return 2. The trace recorder kicks in and starts recording from frame #6, but hits a snag: it sees a return, but hasn’t recorded a matching function call for this trace. This is where things get weird. How can it observe a return if it doesn’t know anything about frame #5?

It can’t. That’s sad, but we really want to solve this. If the JIT can unwind its own recursive frames, we won’t hit that expensive synthesizing path. The trick is to infer some uniquely identifying information about frame 5: its variable’s types and program counter. We guard on that identity. If that guard fails at runtime, we know that it is illegal to keep popping frames.

The resulting trace looks like this:

Upwards recursion

Linking the Two Halves

Hopefully, the down-recursive side will be run again. When the n == 0 guard hits, it will be “hot”, and the recorder starts up. This time it notices the “return”, looks at the above frame, and recognizes that recursion is unwinding. It links the two traces together like so:

Recursion, linked together

Beautiful! Except for…

Next Time

There are still problems with this approach. The major one is that by the time the up-recursive frame is built, all of the frames exist on the interpreter – not the JIT. When control transfers from the interpreter to the JIT, only one frame is converted because the process is very expensive.

That means the first up-recursive guard will hit, since there are no more JIT frames. We’ll do the expensive reconstruction process on the same frame, transfer it back to the interpreter, then re-enter the interpreter. This will happen for each frame until the recursion is unwound, all to execute a few multiplies!

That’s REALLY bad. To make things worse, this whole process will happen at least twice. The first time is to unwind the initial down recursion. The next time is because the n == 0 guard fails, and all the frames are popped off the JIT and transferred to the interpreter again. Everything only runs smoothly on the third invocation of the recursion function.

Next time I’ll discuss approaches to solving that situation.

Graduate School Lost

About six months ago I applied to $UNIVERSITY to enroll in its Computer Science PhD program. I had just crammed a four year undergraduate degree into six years, as my father jokes.

Whether or not to accept admission was a deeply divisive decision for me. I got a lot of opinions from family and peers. I went searching online for advice articles. It was a lot to digest and I ended up accepting the offer in March. The internal struggle didn’t go away, but the voices around me did. In July I paid my first rent check, still uneasy.

Come August, there was a defining day – no, moment – when I realized I did not want to go to graduate school. It was so blinding that I had to get out immediately. The intensity of the feeling subsided the next day (and I cannot “reproduce” it now), but it didn’t matter. I had seen the path I truly wanted to take.

I’m omitting my reasons since I don’t want to unduly influence anyone else approaching any level of education. I’ve done this too much in the past – it’s not my place. Everyone must follow their own path, and only by stewing over the decision alone for five months was I able to get closure.

To the people who gave me the (much needed) confidence and support in getting into school, and who I subsequently inconvenienced, I am deeply appreciative. I’m truly sorry it didn’t work out.

But now that is all behind me, and as of September 1st, I look forward to doing fo’ realz what I love doing.

Very shortly: a much less emo article on tracing recursion.

Computer Go, Part 3

As my senior project in college, I worked on algorithms for Computer Go. A few weeks ago I talked about the UCT algorithm, the most popular algorithm as of late. UCT plays random games to explore the value of potential game tree paths. It converges to a reasonable answer much faster than normal Monte-Carlo searches.

My project ended up being about experimenting with playout policies. A playout policy is just a fancy way of saying, “How do you play a random game?”

If you play ten random games from a single position, you probably won’t get a very accurate average score. If you play 10,000 you still might not get an accurate score! This inaccuracy means that even though UCT is a good algorithm, it will take longer (since you need more random games) to converge to an acceptable answer. Garbage in, garbage out.

One of the current Computer Go champions, MoGo, was also one of the first to use UCT. Their paper improved random playouts by introducing “local shape features.” (Gelly et al). It looks for “interesting shapes” on the board, and leans towards playing those instead.

What’s an interesting shape? Take this position with black to move:

Ignoring the rest of the board, the position marked a is pretty much the only move for black. It is a strong move that effectively captures territory, attacks the white stone, and deprives it of a liberty. Position b is also potentially interesting for similar reasons.

This kind of knowledge is really vague though. How can you make a quick test about whether something is interesting? MoGo decided to use hand-crafted information about 3×3 shapes. It takes every legal position on the board, and looks at the 8 stones around it. These form a pattern, and can be looked up in a pattern database. To see how this works, let’s take the two a and b positions from earlier, as if we were about to play at the center of a 3×3 area:

These are both patterns for the very favorable hane shape. Patterns are pretty flexible. You can introduce wildcards at intersections (which MoGo does). You can assign weights for a probability distribution. You can use neural networks to train them.

Patterns are also pretty fast as long as you stay under a 3×3 grid. Since the center is always empty, there are only eight stones to check. Of these eight, each intersection has four possibilities: Black, White, Empty, or Edge (edge of the board). That’s 4^8, or 2^{16} possibilities. That means you can encode a pattern with just a 16-bit integer, and you can use that as an index into a 65,536-length array.

As an added bonus, patterns have identical symmetries. If you mirror or rotate a pattern, it is the same. If you flip the colors, you can also flip the score – meaning that you only need to store scores for one color. Combined with the fact that most patterns are not legal, there are under 2,000 unique 3×3 patterns. In fact, the two patterns above are identical.

There are also patterns which are bad, and should probably be discouraged during normal play. For example, this pattern is the empty triangle, and you never want to play it:

Why are patterns useful, despite being local and having no knowledge about the board? They serve as an effective, quick local evaluation that is ideal for generating random games. A purely random game will have many bad moves. Pattern knowledge helps add some potentially good moves into the mix. MoGo claims to have gotten huge gains from their pattern knowledge.

For my project, I was tasked with trying out a few neural-network training algorithms for 3×3 patterns. The goal was to spit out a pattern database and compare the performance against other computer programs. I did all my work off an open-source project called libEGO.

Now, I love open source. I don’t love libEGO. It’s a messy project. The code is near-unreadable, there are almost no useful comments, the whitespacing is peculiar at best, bad at worst, and the formatting is atrocious. But it was small and had high coverage density, which made it ideal for experimentation.

I used a softmax function for choosing the next random move based on a pattern database. Every legal position on the board got a score of e^p, where p is the value of the pattern at that position. The denominator of the probability is the sum of those scores for the entire board. The numerator is the score of a specific location on the board.

I tried a bunch of reinforcement learning functions from a recently published paper. Testing was a pain. Levente was kind enough to lend me access to a 100-node grid. I easily ate up every node for days on end. I hacked up Python scripts to battle-royale dozens of configurations against each other and spit out statistics. It was fun to see the results and analyze the games, but a single mistake could throw out a ton of time, since I’d have to recycle all the tests.

And I did make mistakes. With only 5-6 weeks of actual working time available, I had a few setbacks and I had to rush near the end. I didn’t end up with great results. On the other hand, I learned a lot, and I got to talk with one of the masters in the field every day. So in the end I really enjoyed the project, despite not being very good at AI.

I’m still fascinated by Go, and now UCT. I wrote some other UCT programs in between school ending and returning to Mozilla. One for Shogi, and one for m,n,k games (such as Connect 4 and Tic-Tac-Toe). The Shogi one didn’t do too well – I didn’t have enough domain knowledge to write a good evaluation function, and thus playouts were really slow. Connect 4 fared better, though my implementation would lose to perfect play. Tic-Tac-Toe played perfectly. twisty did an implementation as well and I believe managed to get perfect play on Connect 4.

Alas, I’m not sure if I’ll spend more time on it yet. My real calling is elsewhere.

Quick String Interning Benchmark

While working on a project, I was wondering how to most efficiently implement string internalization. String “interning” is when two strings that compare true always have the same reference. This is quite nice as you can test if two strings are equal by just comparing their pointers.

It seems everyone uses hash tables for this, and indeed when I asked someone with more VM experience, he suggested using hash tables as well. I was still curious though, especially since SourceMod has a fancy-pants compressed “double-array” trie implementation. We knew that insertion time was poor, but I secretly hoped that this cost would be amortized over the much faster retrieval case.

I took a dump of ~5MB of plugin text from the forums and extracted all syntactically valid identifier tokens (including ones found in comments and strings). In total I had 553,640 strings, of which 19,852 were unique. This seems pretty reasonable as people tend to use similar names for identifiers. I set up four benchmarks:

  • std::map<std::string, int> — Typical STL container (red-black tree I think)
  • SourceMod::KTrie<int> — SourceMod’s Trie container
  • SourceHook::THash<const char*, int> – SourceHook’s “TinyHash,” chained hash table
  • SourceHook::THash<std::string, int> – Same, with more allocation involved

As a note, “TinyHash” is one of the first ADTs I tried to write, and it shows. Its stock find() function iterates over the whole table, instead of computing a hash, which is really slow. I filed a bug on this and hacked up my local copy to benchmark.

Each test gets run twice. The first “batch” run goes over all identifiers and inserts them if they are not already in the container. The second “retrieval” run does the same thing, except now there will be no string insertions since the table is populated.

Here’s the results, in milliseconds. I used the Intel C++ Compiler, version 10, on my Macbook Pro (2.4GHz Core 2 Duo, 4GB RAM).

Whoa! Everyone was right. Well, I knew that insertion on our Trie was pants, but I was surprised to see that the cost was in no way amortized. In fact, it wasn’t even faster on retrieval! Why? I don’t know (yet), but if I had a guess, it’s because our Trie has to peek at memory a lot. David Gregg explained to me that the advantage of tries is sorting. He also theorized it might be faster on long strings, which is in line with my original benchmarks ages ago, where I used very long randomized strings. (Tries are also good for compression.)

Well, I has sad. Looks like we’ll be ditching KTrie for a chained hash table. At least for string interning. Someone is welcome to request a hash table ADT for SourceMod scripting as well.

I also benchmarked non-PGO ICC and non-PGO g++-4.2 if you would like to see other compilers.

Note 1: Go article returns next week.
Note 2: Firefox 3.5 comes out tomorrow (June 30th!)

Computer Go, Part 2

Last week I gave an introduction to Computer Go, and left off mentioning that the current hot topic is Monte-Carlo methods. These methods use repeated random gameplay to sample a position’s value.

How do programs use random gameplay to reach a good conclusion? While you can sample specific positions, you only have finite time. How do you balance simulating one interesting position over another?

A few years ago a breakthrough algorithm was published, called UCT (Upper Confidence Bounds applied to Trees). The algorithm attempts to minimize the amount of sampling needed to converge to an acceptable solution. It does this by treating each step in the game as a multi-armed bandit problem.

Bandit Problem

A bandit is a slot machine. Say you have a slot machine with some probability of giving you a reward. Let’s treat this reward as a random variable X. You don’t know the true value of X, but by playing the slot machine repeatedly, you can observe an empirical reward, which is just an average (\mu).

Now consider having k slot machines, each expressed as a random variable X_1 \ldots X_k. Once again, you don’t know the true value of any X_i. How can you choose the right slot machine such that you minimize your losses? That is, you want to minimize the loss you incur from not always playing the most optimal machine.

Well, if you knew the true values of each machine, you could just play on the one with the greatest reward! Unfortunately you don’t know the true values, so instead you conduct trials on each machine, giving you averages \mu_i. You can then use this information to play on the observed best machine, but that might not end up being the most optimal strategy. You really need to balance exploration (discovering \mu_i) with exploitation (taking advantage of the best \mu_i).

A paper on this problem published an algorithm called UCB1, or Upper Confidence Bounds, which attempts to minimize regret in such multi-armed bandit problems. It computes an upper confidence index for each machine, and the optimal strategy is to pick the machine with the highest such index. For more information, see the paper.


Levente Kocsis and Csaba Szepesvarí’s breakthrough idea was to treat the “exploration versus exploitation” dilemma as a multi-armed bandit problem. In this case, “exploration” is experimenting with different game positions, and “exploitation” is performing Monte-Carlo simulations on a given position to approximate its \mu_i. UCT forms a game tree of these positions. Each node in the UCT tree stores \mu_i and a “visit” counter. The confidence index for each node is computed as:

\mu_i + \sqrt{\frac{2\ln(parent.visits)}{child.visits}}

Each run of the algorithm traverses down the tree until it finds a suitable node on which to run a Monte-Carlo simulation. Once a node has received enough simulations, it becomes “mature,” and UCT will start exploring deeper through that position. Once enough overall simulations have been performed, any number of methods can be used to pick the best action from the top of the UCT tree.

How this all worked confused me greatly at first, so I made what is hopefully an easy flow-chart diagram. In my implementations, UCT starts with an empty tree (save for the initial moves the player can make).

UCT Algorithm

UCT converges to an acceptable solution very quickly, can be easily modified to run in parallel, and can be easily interrupted. It has seen massive success; all competitive Go programs now use it in some form.

Levente’s paper: click here

Next week: Monte-Carlo simulations and my project.

Computer Go, Part 1

As a degree requirement, my university requires students to complete a “major qualifying project” (MQP) in their field of study. Usually these projects are done in teams of two to three people, and often they take place abroad. MQPs last seven weeks and they culminate with the group writing a thesis-length paper (80+ pages) and giving a twenty minute presentation.

One of the projects available last quarter was about teaching computers to play Go. It took place in Budapest, Hungary, because one of the foremost researchers in the field (Levente Kocsis) works at the renowned Hungarian Academy of Sciences.

I really enjoy Go. It is an amazingly deep game where complex play is derived from very simple rules. Computers have a very difficult time playing it. Amateurs can defeat the best programs even when they run on supercomputers. Playing Go is very much an open problem in artificial intelligence.

The project was researching Go with Levente Kocsis. I was really eager to get on board! Because of my (bad) academic standing I had problems with the bilious idiots in the registrar and administration, but eventually I got in. In the end, no one else joined. I had to tackle the whole project myself. That’s implementation, research, testing, and writing that paper — all in seven weeks. Well, that’s a lot for one student with absolutely no AI experience.

If I had known this ahead of time, and I had also known that I’d be going into real compiler research, I might not have done this Go project. On the other hand, I loved Budapest. It’s a great city, and I’d go back in a heartbeat. I also learned a lot. So in another one or two blog posts, I will try to share a bit about the project, before it completely slips my mind.


Go is a two-player board game. The board is a 19×19 grid. Players take turns placing black and white stones on the intersections. The object of the game is to surround as many empty intersections as possible. The game ends when both players agree there are no more moves, at which point the player with more “territory” is the winner.

So why is Go really difficult for computers, while a desktop can beat grandmasters at Chess?

The standard Chess algorithm is α-β pruning over minimax trees. Unfortunately Go’s search space is too large for this to work efficiently (see chart). The average number of moves per position in Chess is around 35, whereas in Go it is 250. Even if you play Go on a small board (such as 9×9, for beginners), there are around 40-60 moves per position.

There are additional complications. In Chess there are opening books and end-game solvers. In Go there are opening styles and patterns, but they evolve rapidly over time and are not rote as in Chess. It is also possible to quickly evaluate the value of a Chess position, whereas in Go the game must be played until the end to get an accurate scoring.

So, that’s what doesn’t work work. What does work?

Monte-Carlo Introduction

If there’s one thing a computer can do quickly, it’s picking moves at random until there is a position that can be scored. With enough random sampling, the true value of a position can be approximated within a certain accuracy. This is the basis of all Monte-Carlo algorithms, and it has become the foremost area of research for Go.

The most common application is Monte-Carlo Tree Search. The idea is to play random games from various positions. The results can then be used to explore the most interesting paths in the game tree. The idea seems ridiculous both in its simplicity and stochasticity, but it works.

Next week: Bandit problems.

Hello PM

People reading this blog probably know about PM. He was one of the original SourceMod authors, and forked off his work as the amazing SourceHook, which became the backend to Metamod:Source.

I finally got a chance to meet PM today. He was going from Slovakia to Germany, and stopped in Budapest, Hungary for a few hours. We did a little touring around the Castle District (Várnegyed). There are some great panoramic views of the city, since it is high up on the hills of the Pest side. It’s also a very touristy area, people were speaking English (and according to PM, German).

Picture or it didn’t happen.

It’s rare I get a chance to meet people from the AlliedModders community, and it’s great to put a real person to the handle/avatar on the forums. I can only recall a few other times to do this, I got to meet Freecode, OneEyed, and teame06 at various LAN events.

I first heard from PM in 2003 or so. I was writing a Counter-Strike mod and it was exposing a very bad bug: if you removed a weapon on the ground, Counter-Strike would crash, either immediately or much later. I couldn’t figure out what was going wrong and was at my wits’ end, until I got a random message from PM with a mysterious but working fix.

In early 2004 he and I both ended up being the core developers of AMX Mod X. Back when I was still trying to figure out what pointers were, or what malloc() really did, or what the difference between the stack and heap were (you know, all these beginner things), PM was hacking away at the Small virtual machine and doing big rewrites of some of the most terrible AMX Mod code. (I eventually figured this stuff out, and looking back I wonder why PM let me do anything.)

So, meeting one of the most awesome Half-Life/AlliedModders community developers after six years was really cool. I hope I get more chances like this in the future. And thanks to PM for putting up with our boring traipsing around the city!

It’s Upgrade Week

I spent a good portion of this week upgrading various parts of the AM “infrastructure” (read: server held together with twine and duct tape).

 * 16:32 GMT-5. PHP FastCGI upgraded from 5.2.8 to 5.2.9.
 * 16:44 GMT-5. MySQL upgraded from 5.1.30 to 5.1.32.
 * 17:38 GMT-5. MediaWiki upgraded from 1.10.1 to 1.14.0.
 * 17:44 GMT-5: ViewVC upgraded from 1.0.5 to 1.0.7.
 * 17:55 GMT-5: phpMyAdmin upgraded from 2.11.3 to 3.1.3.
 * 18:55 GMT-5: PHP FastCGI recompiled against MySQL 5.1.32.
 * 19:27 GMT-5: Apache for users upgraded from 2.2.8 to 2.2.11.
 * 22:08 GMT-5: Apache for vhosts upgraded to 2.2.8 to 2.2.11.
 * 00:13 GMT-5: Bugzilla upgraded from 3.1.4 to 3.2.2.
 * 01:16 GMT-5: Buildbot upgraded from 0.7.8 to 0.7.10.
 * 01:59 GMT-5: Mercurial upgraded from 1.0.2 to 1.2.
 * 04:39 GMT-5: Debian upgraded from 4.0 (etch) to 5.0 (lenny).
 * 04:42 GMT-5: Mercurial and Buildbot switched from Python 2.4.4 to 2.5.2.
 * 00:44 GMT-5: vBulletin upgraded from 3.6.7 to 3.8.1-PL1.

DS helped me on the Buildbot/Mercurial side. Hopefully everything still works for the most part. Many of these upgrades were sorely needed, fixing known bugs or security issues.

All of this was delayed so long because, well, no one wanted to do it. We maintain private patches against Apache, Bugzilla, vBulletin, and ViewVC. Keeping track of those changes is difficult.

This time around we started using Mercurial Queues. We keep the official software in a Mercurial repository and our custom changes in a queue (patch series). When it’s time to update, we pop the series, commit the new software, then push the series back.

This workflow is nice and all but Mercurial queues aren’t really a part of the repository. So if someone wants to check out the repository and stage development somewhere else, they have to export the queue manually and re-import manually on the other side. Ideally, these patches would be lightweight development forks off the main repository.

We used to keep separate repositories and do three-way diffs. That was a huge pain. Never again. It still seems like we’re missing something though.