Category Archives: AlliedModders

The Evolution of Building at AlliedModders

AlliedModders recently passed its ten year anniversary. That’s literally crazy. At some point I’d like to do a longer post on its history and the people who made it happen – but today I wanted to talk about the evolution of our build process. It’s quite complicated for your average gritty open-source project – complicated enough that we wrote multiple custom build systems. Right now we’re pretty close to state-of-the-art: AMBuild 2 is a general-purpose, optimal-by-design DAG solver. But we started off in the dregs, and it took 10 years of iteration to get where we are now.

Without further ado…

The Beginning

The first project that would become part of AlliedModders was AMX Mod X. Felix Geyer (SniperBeamer) recruited me to the team, and I provided hosting. It was January of 2004, and there were two compilers you could use: GCC 2.95, and Visual Studio 6.0. The project had a single binary built from about 20 .cpp files.

On Linux it used a single Makefile. On Windows you had to run the Visual Studio IDE. For about six months our release process was entirely hand-driven. Even as we added more binary components, I’d build each one by hand, sort them into packages by hand, drop them into SourceForge FTP, then one by one add each package to a release.

As components became more complicated, I eventually got frustrated by Make’s lack of simple programmatic control. If you’re unfamiliar with GNU Make, it has something resembling the worst scripting language ever made. I couldn’t bring myself to touch it, so in August 2004 I created two Perl scripts:

  • Makefile.pl – A straight translation of the Makefile to Perl, but with more programmatic control over how and what gets built.
  • package.pl (which I no longer have) – after building binaries by hand, this script would build packages and upload them to SourceForge.

Over the next year, AMX Mod X grew rapidly. The development team and community were very active, and the Half-Life 1 modding scene was at its peak. By July 2005, AMX Mod X had over 20 binary components and supported three platforms: Windows, Linux i386, and Linux AMD64. Releasing by hand took hours, and maintaining 20+ random copies of a Perl script was really difficult. Something had to change.

Fixed Pipeline Era

The first step was ditching Makefile.pl. I replaced it with a much cleaner Makefile. Most of the complexity was stripped away and reduced to a few variables and filters.

But I was still pretty new to software engineering, and made a major mistake – I didn’t separate the guts of the Makefile into something reusable. As a result the whole thing got copied and pasted to every component, and to components in other projects. To this day about 50 variations of that Makefile are scattered around, though it’s unclear whether anyone still uses them. Up until 2009 we were still syncing changes across them by hand.

The next step was automating all the component builds and packaging. AMX Mod X had weird packages. There was a “base” package, and then optional addon packages to supplement it. The steps for each package were similar, but they all built different sets of components. The result: I ended up writing something we affectionately called “the C# tool”: AMXXRelease. It had three pieces:

  • A general pipeline for constructing an AMX Mod X package.
  • An abstraction layer that decided whether to build via GNU Make or MSBuild.
  • Classes for each package, usually just supplying filenames but sometimes adding custom steps to the pipeline.

The C# tool survived many years. Though it was a hard-coded tool written in a verbose, rigid framework, it got the job done. AMX Mod X continued to use it until 2014, when we finally ditched it as part of a transition to GitHub.

Interlude: SourceMod

In 2006, Borja Ferrer (faluco) and I secretly began work on the next major AlliedModders project: SourceMod. SourceMod was a rewrite of AMX Mod X for the Half-Life 2 Source engine. By 2007 it was nearly ready to go public as a beta, so I ported the C# tool. Everything was peachy. Until Half-Life 2: Episode One came out.

At the time, it wasn’t conceivable to us that Valve would maintain multiple forks of the Source engine. Half-Life 1 was all we knew. Valve had never changed Half-Life 1 in a way that couldn’t be fudged with some detection and binary hacks. But Source was different – it was vast and complex compared to its predecessor. Valve began changing it frequently – first with Episode 1 – and then again with the Orange Box and Team Fortress. Now there were three versions of the Source engine, all totally binary incompatible. The same code didn’t even build across their SDKs, and linking was too complex for something like a “universal Source binary”.

Suddenly our Makefile and Visual Studio project files were getting ridiculously hairy. Not only did SourceMod have a dozen binary components, but it had to build each one multiple times with different parameters, across two different platforms. By mid-2009 we realized this was crazy and started looking at other options.

AMBuild 1: Alpha-Gen Building

I evaluated two build systems for SourceMod: CMake and SCons. CMake was a contender for “worst scripting language ever”. After a lot of futzing I concluded it was impossible to replicate SourceMod’s build, and even if I could, I didn’t want to see what it would look like. SCons came much closer, but it didn’t have enough flexibility. Source binaries required a precise linking order, and at the time SCons had too much automagic behavior in the way.

We needed something where we could template a single component and repeat its build over and over with small changes. We needed reasonably accurate minimal builds. We needed platform-independent build scripts, but also tight control over how the compiler was invoked. So in August 2009, I dove into Python and wrote our own general-purpose build tool, dubbed "AMBuild 1".

I would describe AMBuild 1 as “rough”. It was definitely extremely flexible, but it was buggy and slow. Each unit of work was encapsulated as a “job”, but jobs did not form dependencies. It had some hardcoded, special logic for C++ to support minimal rebuilds, but since it was special logic, it broke a lot, and because it was recursive in nature (like Make), it was very slow. Since jobs didn’t form dependencies, they couldn’t be parallelized either. Everything ran sequentially. The API was also pretty grody. The root build file for SourceMod was somewhat of a nightmare.

But… it worked! Suddenly, supporting a new Source engine was just a matter of adding a new line in its central build file. It worked so well we began migrating all of our projects to AMBuild.

Over time, as always, things began to break down. By 2013 there were twenty versions of Source, and AMBuild 1 was nigh unusable. Minimal rebuilding was inaccurate; sometimes it rebuilt the entire project for no reason. Sometimes it produced a corrupt build. Builds were also EXTREMELY slow. A complete Windows build could take two hours, and multiple cores did nothing. There were so many headers in the Source SDK, and so many file scans for dependency checks, that merely computing the job set for an update could take 15 seconds.

AMBuild 2: Modern Era

Sometime in 2013 I found Tup and was immediately inspired. Tup “gets it”. Tup knows what’s going on. Tup knows what it is. Tup is what build systems should aspire to be. In short, it has two philosophies:

  • Builds should be 100% accurate. Not 99% accurate, or 37% accurate, or “accurate if you clean or clobber”. They should be 100% accurate all the time.
  • Builds should be fast, because waiting for builds is a waste of time.

The design of Tup is simple on the surface. Rather than recursively walking the dependency graph and comparing timestamps, it finds a root set of changed files, and then propagates a dirty bit throughout the graph. This new “partial graph” is what has to be built. In practice, this is much faster. In addition, Tup caches the graph in between builds, so if your build files change, it can compute which parts of the graph changed.
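To make the idea concrete, here is a minimal sketch of that dirty-bit propagation in Python. The data structures are hypothetical and far simpler than Tup's or AMBuild 2's real ones; this only illustrates the principle of propagating changes forward through a cached graph.

```python
from collections import deque

def compute_dirty(dependents, changed_files):
    """Propagate a dirty bit from changed files through the DAG.

    dependents: dict mapping each node to the nodes that consume it
    changed_files: iterable of nodes whose contents changed
    Returns the set of nodes that must be rebuilt (the partial graph).
    """
    dirty = set()
    queue = deque(changed_files)
    while queue:
        node = queue.popleft()
        for consumer in dependents.get(node, ()):
            if consumer not in dirty:
                dirty.add(consumer)
                queue.append(consumer)
    return dirty

# foo.cpp -> foo.o -> app; bar.cpp -> bar.o -> app
graph = {
    "foo.cpp": ["foo.o"],
    "bar.cpp": ["bar.o"],
    "foo.o": ["app"],
    "bar.o": ["app"],
}
print(sorted(compute_dirty(graph, ["foo.cpp"])))  # ['app', 'foo.o']
```

Touching foo.cpp dirties only foo.o and the final link; bar.o is never scanned at all, which is exactly where the speed comes from.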

Well, that’s awesome. But… I didn’t want to take on Tup as a dependency for AlliedModders. I have a few reasons for this, and I’m still not sure whether they’re valid:

  • Tup requires file system hooks (like FUSE) and I was worried about its portability.
  • Tup is a C project, and not yet in Linux distros, so we’d have to tell our users to install or build it.
  • AMBuild’s API, and of course Python, together make for a much better front-end than Lua.
  • Tup didn’t seem to include external resources (for example, system headers/libraries or external SDKs) in its graph.

So I sat down and began rewriting AMBuild with Tup’s principles in mind. The result was AMBuild 2. With a real dependency graph in place, the core of the build algorithm became much simpler, and parallelizing tasks was easy. The API got a massive overhaul and cleanup. It took about two months to work all the kinks out, but in the end, my SourceMod builds took about a minute locally instead of 15. And with some smarter #include’ing, our automated builds went from 45 minutes to 2 minutes. The build scripts don’t look too bad, either.
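The parallelism win falls out naturally: once dependencies are explicit, every task whose prerequisites have finished can run at the same time. A toy scheduler under that assumption (task names are invented, and the real scheduler dispatches compiler processes rather than Python callables):

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(deps, run_task, workers=4):
    """deps: task -> set of prerequisite tasks; run independent tasks concurrently."""
    remaining = {t: set(d) for t, d in deps.items()}
    finished = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while remaining:
            # Every task whose prerequisites are all done forms one "wave".
            ready = [t for t, d in remaining.items() if d <= finished]
            if not ready:
                raise RuntimeError("dependency cycle")
            list(pool.map(run_task, ready))  # tasks in a wave are independent
            finished.update(ready)
            for t in ready:
                del remaining[t]
    return finished

order = []
run_parallel({"foo.o": set(), "bar.o": set(), "app": {"foo.o", "bar.o"}},
             order.append)
print(order[-1])  # "app" always runs last, after both object files
```

Both compiles run concurrently, and the link step waits on them, with no hardcoded special logic like AMBuild 1 had.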

What’s next?

AMBuild 2 isn’t perfect. Its biggest problems are scalability and internal complexity. I’m a huge fan of Python, and it’s nearly everywhere now. But it’s slow. Dog slow. AMBuild 2 would not scale to a codebase the size of Mozilla or Chromium. In addition, Python’s concurrency support is very bad, even taking into account the multiprocessing module. Most of AMBuild’s complexity is in its IPC implementation.

Early on in its development a common suggestion I got was to separate AMBuild into a front-end (the build API) and back-end (the build algorithm). I heard this theory float around for Mozilla’s build system too. CMake is structured this way. Separating the components out means they are replaceable: AMBuild 2's frontend could generate Makefiles instead of an AMB2 DAG, or it could read Makefiles and parse them into a DAG.

I eventually concluded that didn’t make much sense. The goal of AMBuild 2 is not to provide a general backend or general frontend: it’s to produce correct builds as quickly as possible. In fact, it considers most other build systems to be broken and inferior. Taking an AMBuild project and generating Makefiles defeats its entire purpose.

However, there is something very useful that comes out of this separation, which is the ability to generate IDE project files for casual use or local development. Keeping this in mind, AMBuild 2 did separate its front-end from its back-end, but the API makes too many assumptions that don’t have clean mappings to IDE files. As we experiment with Visual Studio and Xcode generators, we’ll probably have to iterate on the API.

Lastly, it’s worth noting that part of what AMBuild solves is papering over an early design flaw in SourceMod: with proper abstractions, it wouldn’t need 20 different builds. In fact, over time we have moved a large amount of code into single-build components. Unfortunately, due to the complexity of what’s left and the Source SDK in general, it would be nearly impossible to completely eliminate the multiple-build steps.

Ok

There are very few projects I’m truly proud of, where I look back and have very few regrets. (One is IonMonkey, but that’s a story for another day.) AMBuild is one of them. It’s not particularly stellar or original, but we needed to solve a difficult problem, and it does the job very well. I wouldn’t be surprised if it takes another 10 years of iteration to make it perfect – but that seems to be how things work.

Inheritance Without Types

At AlliedModders we dabble in practical language design. Our two major scripting projects, AMX Mod X and SourceMod, iterated on our own fork of a scrappy language called Pawn. Our next iteration is untyped, and we’re adding object support.

What does object support mean? “Well, obviously,” I naively thought, “it means Java-style implementation inheritance!” Whoops. WRONG. It turns out there’s a ton of complexity with inheritance once you take away types. This post is a brief survey of some popular untyped languages and what we ended up deciding.

The Goal
We want our language to have as few pitfalls as possible. I’m not sure if I stole this from Graydon Hoare, but our motto is “No surprises!” We want people to be able to write code that works the way they expect, minimizing random or unforeseen run-time failures.

A great aspect of C++’s inheritance is that information hiding is super easy. A base class’s behavior isn’t trampled by derived classes. For example:

class Base {
    int x;
  public:
    Base() : x(5) { }
    void print() {
        printf("Base: %d\n", x);
    }
};
 
class Derived : public Base {
  public:
    int x;
    Derived() : x(20) { }
};
 
int main() {
    Derived().print();
}

As you’d expect, this prints '5'. The fact that Derived also declares x does not trample on Base's own behavior. Now let's take an untyped language, Python 3:

class Base:
    def __init__(self):
        self.x = 5
    def print(self):
        print(self.x)
 
class Derived(Base):
    def __init__(self):
        super().__init__()
        self.x = 20
 
Derived().print()

This prints '20'. In Python, "self" is really one object with one set of properties and methods, whereas in the C++ model "this" is conceptually two objects, and "this" is statically typed as the object of the containing class. I consider the Python model unappealing for two reasons:

  1. When adding a member to a derived class, you have to know that you’re not colliding with a member of your base class.
  2. When adding a member to a base class, you might be breaking any number of unknown consumers.
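Python does ship a partial mitigation for the first pitfall: double-underscore attributes are name-mangled per class, so a derived class's __x and a base class's __x are stored under different keys. A quick illustration (this example is mine, not from the original comparison):

```python
class Base:
    def __init__(self):
        self.__x = 5          # stored on the instance as _Base__x
    def print_me(self):
        print(self.__x)       # always reads _Base__x

class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__x = 20         # stored as _Derived__x; no collision

Derived().print_me()  # prints 5, matching the C++ behavior
```

It is opt-in and purely conventional, though, so it only helps if the base class's author anticipated the problem.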

JavaScript’s prototype-based inheritance discourages this style, but JS is super flexible, and if you don’t mind a very, very inefficient prototype chain, you can do this:

function Base() {
    var x = 5;
    this.print = function () {
        alert(x);
    }
}
 
function Derived() {
    return Object.create(new Base(), { x: { value: 20 } });
}
 
Derived().print()

This prints "5": the print method reads the local x captured in its closure, so the x property defined on the derived object never comes into play.

PHP’s Answer
For whatever reason, PHP takes Java’s model wholesale. Here’s an example:

class Base {
    private $x;
    public function __construct() {
        $this->x = 5;
    }
    public function printMe() {
        print($this->x . "\n");
    }
}
 
class Derived extends Base {
    public $x;
    public function __construct() {
        parent::__construct();
        $this->x = 20;
    }
}
 
$obj = new Derived();
$obj->printMe();

This prints '5', which is what we want – and makes sense. But is $this statically typed to the Base class, like in C++? No:

class Base {
    public function printMe() {
        print($this->stuff . "\n");
    }
}
 
class Derived extends Base {
    public $stuff;
    public function __construct() {
        $this->stuff = "hello";
    }
}
 
$obj = new Derived();
$obj->printMe();

This prints "hello", so PHP’s $this retains some dynamicism. Let’s up the ante. What should this do?

class Base {
    private $x;
    public function __construct() {
        $this->x = 5;
    }
    public function compareTo($obj) {
        return $this->x == $obj->x;
    }
}
 
class Derived extends Base {
    public $x;
    public function __construct() {
        parent::__construct();
        $this->x = 20;
    }
}
 
$b = new Base();
$d = new Derived();
print($b->compareTo($d) . ", " . $d->compareTo($b) . "\n");

This prints true in both cases! For any property access in a PHP class function, if the accessing class is on the object’s inheritance chain and declares that property, PHP uses the property on that class. Otherwise, it searches from the derived-most class back to the base class like normal. This implicitly hides variables on the derived object. Nonetheless it’s the right choice given the model, especially considering that it’s usually bad practice for a base class’s behavior to depend on an explicit cast to one of its derived classes.
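The lookup rule can be modeled in a few lines. This is a toy simulation of the resolution order just described, not PHP's actual engine; all names here are invented:

```python
class PropModel:
    """Objects keep properties bucketed by the class that declared them."""
    def __init__(self, mro, props):
        self.mro = mro      # class names, derived-most first
        self.props = props  # class name -> {property name: value}

    def lookup(self, accessing_class, name):
        # Rule 1: the accessing class's own declaration wins outright.
        if name in self.props.get(accessing_class, {}):
            return self.props[accessing_class][name]
        # Rule 2: otherwise search derived-most back to base.
        for cls in self.mro:
            if name in self.props.get(cls, {}):
                return self.props[cls][name]
        raise AttributeError(name)

obj = PropModel(["Derived", "Base"],
                {"Base": {"x": 5}, "Derived": {"x": 20}})
print(obj.lookup("Base", "x"))     # code in Base sees 5
print(obj.lookup("Derived", "x"))  # code in Derived sees 20
```

The same object yields different values for x depending on which class's code is asking, which is exactly the hidden-variable behavior in the compareTo example.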

Note that the fact that the inner x is private is actually irrelevant. Even if it were public, the base class should retain its behavior. Banning redeclaration works, though then you run the risk of preventing an unknown consumer from compiling (albeit, that is better than being silently wrong). Similar issues occur with introducing new virtual functions or overloads.

So, what did we do?
After letting this all sink in, I decided to scrap inheritance as a feature of our next language iteration. Having objects alone will be a huge step forward, and we can evaluate use cases from there. I’m satisfied with this for a few reasons.

Inheritance is really complicated with types, and even more complicated without. And in an untyped language, it’s not even clear if implementation inheritance is useful. PHP’s interfaces and abstract classes seem verbose and heavyweight in comparison to JavaScript or Python, in return for a very small amount of static guarantees.

So far Pawn and our derivatives have mainly been embedded in games, and the majority of those games are based on the Half-Life engines (including Source). We want our object model to be compatible with the game’s internal object model. However, I’m not convinced that copying it would be ideal. The Source engine has the “Big-Ass Base Entity Class from which Everything Derives,” a painful and complex architecture continuing Source’s theme of not being scalable. Do we really want to encourage that?

I suspect we’ll end up with something like traits and/or composition, but for now, I’m content with keeping it simple and continuing to evaluate the strengths of other languages’ models.

(Addendum: I don’t know much about Perl or Ruby. I tried quickly evaluating them for this post but the syntax was a little intimidating. I’d really appreciate insight into other languages.)

Addendum 2: SaberUK contributes this Ruby example which prints 20 – exhibiting similar behavior to Python.

class Base
	def initialize
		@x = 5
	end
 
	def print_me
		puts @x
	end
end
class Derived < Base
	attr_accessor :x
	def initialize
		super
		@x = 20
	end
end
obj = Derived.new
obj.print_me

Debugging 500 Errors

For the past two years, the AlliedModders site has been randomly dishing out “HTTP 500 – Internal Server Error” responses. It’s been pretty frustrating.

I actually tried debugging the errors two years ago. I got far enough to attach GDB to PHP, and saw it blocked on mysql_query(), but didn’t know how to proceed. We knew that during these incidents, the server load would spike to obscene levels – like 100-150. Lots of PHP processes were getting spawned. “iowait” time was through the roof, and CPU usage was normal. Something couldn’t read the disk, or the disk was getting hammered. I put MySQL on a separate hard drive, but the problems persisted and I gave up. Keep in mind: I didn’t know anything about running servers, and I still don’t – I just have a vague notion of how to debug things.

Fast forward to a few weeks ago, we had a serious hard drive crash and suffered some painful downtime. We lost a lot of data. I clung to a hope that our years of errors had been caused by a failing disk. Predictably, we weren’t so lucky, and the website problems continued. Someone joked, “Hey, I’m glad to see you got the server back up, even the 500 errors!” Yeah… it was time to fix this. Somehow.

Step One: Apache Logs

There were only two notable Apache errors. They’d appear over and over, but only in large clumps:

[Tue Jan 25 04:15:19 2011] [warn] [client xx.xxx.xxx.xx] (110)Connection timed out: mod_fcgid: ap_pass_brigade failed in handle_request function
[Tue Jan 25 04:15:25 2011] [warn] [client xxx.xxx.xx.xx] mod_fcgid: read data timeout in 60 seconds

It was pretty clear that somehow, communication was breaking down between mod_fcgid (Apache) and PHP processes. I read, I think, every single webpage on the Internet involving these two errors. Twice. They were all dead ends, both for us and for the people reporting their own problems. A little sad. I wanted to blame mod_fcgid, since historically it has been really buggy for us, but I wanted to pin absolute blame.

So the next step was to break out a debugger.

Step Two: Debugging PHP, Apache

AM developer Fyren edited the mod_fcgid source code, such that instead of killing PHP, it would attach gdb in a separate tty. It usually took around three hours to get a debugging session going. The call stacks looked like:

(gdb) bt
...
#4 0x00007f1f7cad804f in net_write_command () from /usr/lib/libmysqlclient.so.15
#5 0x00007f1f7cad4c51 in cli_advanced_command () from /usr/lib/libmysqlclient.so.15
#6 0x00007f1f7cad1751 in mysql_send_query () from /usr/lib/libmysqlclient.so.15
#7 0x00007f1f7cad17b9 in mysql_real_query () from /usr/lib/libmysqlclient.so.15
...
(gdb) fin
Run till exit from #0 0x00007f1f7cad17b9 in mysql_real_query () from /usr/lib/libmysqlclient.so.15

At this point, the session would hang, indicating that MySQL was not responding. I really had no idea what to do, but we got lucky. We noticed during these incidents, Apache’s server-status was stuck at “200 requests being processed,” all in the “W” (“Sending reply”) state. Our MaxClients setting was exactly 200. Now we had a hypothesis: MySQL was intermittently not responding for short bursts of time. This blocked PHP, which in turn made mod_fcgid block and timeout, causing Apache to spawn a bunch of workers that would lock up as well.

Unfortunately, the intermittent failures only happened a few times a day, so debugging went slow. We managed to catch one, and ran iostat:

Device:          r/s       w/s   rsec/s    wsec/s    avgrq-sz  avgqu-sz   await  svctm  %util
sdc              320.80    4.40  3540.80    54.40    11.06     4.32   13.01   3.08 100.00
sdc1             320.80    4.40  3540.80    54.40    11.06     4.32   13.01   3.08 100.00

Someone was reading tons of data off the disk, and iotop confirmed MySQL was the culprit.

Diagnosing MySQL

I managed to find this enlightening presentation by Percona. Case study #3 looked identical to ours, except in their case, MySQL was writing. But their conclusion about bad queries seemed plausible. I made a script that would peek at server-status every second, log a ton of data, and e-mail me if it went above 100 concurrent requests. The data included oprofiling, iotop spew, SHOW FULL PROCESSLIST, SHOW OPEN TABLES, SHOW GLOBAL STATUS, etc.
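The script itself isn't shown here, but the shape of such a watchdog is roughly the following. Names, the URL, and thresholds are illustrative, and it assumes Apache's mod_status is enabled with the machine-readable ?auto format:

```python
import time
import urllib.request

STATUS_URL = "http://localhost/server-status?auto"  # requires mod_status

def busy_workers(status_text):
    # The ?auto format emits lines like "BusyWorkers: 42".
    for line in status_text.splitlines():
        if line.startswith("BusyWorkers:"):
            return int(line.split(":", 1)[1])
    return 0

def dump_diagnostics():
    # In the real script: log iotop output, SHOW FULL PROCESSLIST,
    # SHOW GLOBAL STATUS, oprofile data, then send an e-mail alert.
    pass

def watch(threshold=100):
    while True:
        with urllib.request.urlopen(STATUS_URL) as resp:
            if busy_workers(resp.read().decode()) > threshold:
                dump_diagnostics()
        time.sleep(1)

print(busy_workers("Total Accesses: 10\nBusyWorkers: 150\nIdleWorkers: 5"))  # 150
```

Polling once a second is cheap, and capturing diagnostics at the moment of the spike is what made the table-lock pileup visible at all.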

Oprofile’s report wasn’t too interesting, but the MySQL status showed around 150 queries waiting for a table lock. The active queries, in all cases, were SELECTs. This explained the reads, but not the table locks. This didn’t make sense, so I generated a few core files of MySQL from within GDB, then Fyren and I did some post-mortem analysis.

The analysis revealed that MySQL had about 200 threads, most of which were waiting on locks (pthread_cond_timedwait) with a timeout of one year. The locks were almost all read locks for SELECTs, and the actually succeeding queries were exclusively SELECTs. Fyren tried to track down the thread owning the locks, but it ended up being apparently circular (thread X’s lock was held by thread Y, and thread Y was waiting on a lock held by thread X).

That was all pretty mysterious (and still is), but rather than try to understand exactly what locks are in MyISAM tables, I took a look at the non-deadlocked queries. They were all one of: the AMX Mod X plugin search, the SourceMod plugin search, or the vBulletin member list search.

It turns out all three of these queries were extremely expensive – containing ORs, joins, full table scans of 800MB tables, etc. On average they each took about 500ms to complete. That’s without MySQL freaking out. I turned off vBulletin member searching (most large boards do), and replaced the plugin searches with faster queries off a much smaller table. This brought those page load times down to 20ms. Twelve hours later, no 500 errors had occurred, and there were no mod_fcgid failures in Apache’s logs.

Too Early to Declare Victory

While waiting to see if any more 500 errors would show up, suddenly the server load spiked, and lo and behold – mod_fcgid became overloaded. Our monitoring script kicked in, and this time, MySQL was nowhere in the profile! It turned out some forum users were effectively DoS’ing the site by holding down F5 on their profile pages to bump up the “view” count. OProfile suggested that 20% of CPU time, during this incident, went to PHP parsing and lexing source code. I installed eAccelerator and now that time appears to be gone. (And for good measure, I removed profile view counts.)

Growth and Traffic

It’s unclear what our load and traffic really is. I think we’re pretty good though. CPU utilization is usually under 20%, disk I/O is under 10%, and we rarely see more than 30 concurrent requests. We don’t need to invest in a separate MySQL server yet, and we definitely aren’t growing fast enough to expect to need one anytime soon.

Community Involvement

I would like to thank Fyren for his help in looking into these problems, and asherkin for helping beef up our non-existent monitoring. Also, thanks to AzuiSleet, Drunken_Fool, MatthiasVance, and devicenull for sticking around in IRC and answering random questions.

Diagnose, then Solve

Many people, well-intentioned, said “switch to lighttpd, nginx, IIS, use my favorite host, use a VPS, buy more servers,” etc. I got caught up in this too, and fueled it by initially blaming mod_fcgid. It’s really important to make sure a solution has targeted a specific problem, and I had to keep telling myself to keep a cool head. Sometimes I had to slap myself or wait for Fyren to slap me. They all might be good ideas in their own right, but without diagnosing the actual cause of the errors, there was no guarantee of a fix. By careful debugging (albeit in painful, hour-per-day sprees), we got pretty close to the root cause, whereas throwing our infrastructure into a salad bowl would have delayed the inevitable.

Part of the reason our original drive is still at the recovery lab is because someone didn’t diagnose before “solving.” When it failed, a technician (not one of us) ran fsck on the disk instead of doing a hardware diagnostic first. I have no doubt this caused untold extra damage.

Conclusion

tl;dr, the forums should be much faster now, and it shouldn’t error anymore. I’ve got the system set up to e-mail me and dump diagnostics when our load goes way above normal, so I’ll continue monitoring for potential problems.

I would like to sincerely thank the whole AlliedModders community for the successful donation drive. I won’t lie, the past month has been pretty sucky. Despite our failings, the fact that everyone cares enough to see the whole thing keep going is really awesome. Our seven year anniversary was just a few weeks ago. Here’s to an AlliedModders for 2011!

Login Changes for AlliedModders

SourceMod developer and occasional ITer Scott Ehlert (“DS”) has overhauled the way we do source code access at AlliedModders. If you have any kind of Mercurial, SVN, or CVS access with us, this pertains to you. If not, you probably want to save yourself the boredom and read something else.

Domain Names
All source code services are now on a separate server. You need to use hg.alliedmods.net, svn.alliedmods.net, cvs.alliedmods.net, etc. Using just alliedmods.net will not work.

Secure Accounts
If you used SSH access (“ssh://” or “svn+ssh://”) to access a repository, your account is no longer valid. Our new account system uses e-mail addresses for user names. We’ve made an easy website for converting your old account to an e-mail address based one:

https://www.alliedmods.net/internal/oldaccount

If you had source code access and you want to keep it, you MUST provide a PUBLIC key. You CANNOT push to source code repositories with a password anymore. Keys are required, and of course you should be careful to protect the private half of your key.

When accessing source code repositories, you need to use your full account name. For example, “[email protected]@hg.alliedmods.net”. If you have SSH access to our webserver, you can continue to use your old account name.

Repository URLs
For Mercurial users, you no longer have to type "//hg/" before every path. For example, "//hg/sourcemod-central" is now "/sourcemod-central".

For Subversion (SVN) users, the same applies to "/svnroot". If you had "/svnroot/users/dvander" it is now "/users/dvander".

User Repositories
We now have a fairly easy way for developers to create their own Mercurial repositories without filing bugs and waiting. You can read more here:

http://wiki.alliedmods.net/Mercurial_User_Repositories

This replaces our old, vastly insecure Subversion management tool. We don’t really want to give out new Subversion repositories unless there are extenuating circumstances. If you would like changes to your SVN repo, or a migration to Mercurial, please just file a bug.

Credits
I think we’re in a much better position now to quickly give community developers the resources they need to do secure, versioned, shared development. There were quite a few trials and tribulations getting this thing working. DS worked on and off on this for a while, only taking a “break” to help out with the L4D2 porting effort. Hopefully it’ll all have been worth it – and a huge thanks to him regardless.

Hello PM

People reading this blog probably know about PM. He was one of the original SourceMod authors, and forked off his work as the amazing SourceHook, which became the backend to Metamod:Source.

I finally got a chance to meet PM today. He was going from Slovakia to Germany, and stopped in Budapest, Hungary for a few hours. We did a little touring around the Castle District (Várnegyed). There are some great panoramic views of the city, since it is high up on the hills of the Buda side. It’s also a very touristy area; people were speaking English (and according to PM, German).

Picture or it didn’t happen.

It’s rare that I get a chance to meet people from the AlliedModders community, and it’s great to put a real person to the handle/avatar on the forums. I can only recall a few other times this has happened: I got to meet Freecode, OneEyed, and teame06 at various LAN events.

I first heard from PM in 2003 or so. I was writing a Counter-Strike mod and it was exposing a very bad bug: if you removed a weapon on the ground, Counter-Strike would crash, either immediately or much later. I couldn’t figure out what was going wrong and was at my wits’ end, until I got a random message from PM with a mysterious but working fix.

In early 2004 he and I both ended up being the core developers of AMX Mod X. Back when I was still trying to figure out what pointers were, or what malloc() really did, or what the difference between the stack and the heap was (you know, all these beginner things), PM was hacking away at the Small virtual machine and doing big rewrites of some of the most terrible AMX Mod code. (I eventually figured this stuff out, and looking back I wonder why PM let me do anything.)

So, meeting one of the most awesome Half-Life/AlliedModders community developers after six years was really cool. I hope I get more chances like this in the future. And thanks to PM for putting up with our boring traipsing around the city!

It’s Upgrade Week

I spent a good portion of this week upgrading various parts of the AM “infrastructure” (read: server held together with twine and duct tape).

2009-03-08
 * 16:32 GMT-5. PHP FastCGI upgraded from 5.2.8 to 5.2.9.
 * 16:44 GMT-5. MySQL upgraded from 5.1.30 to 5.1.32.
 * 17:38 GMT-5. MediaWiki upgraded from 1.10.1 to 1.14.0.
 * 17:44 GMT-5: ViewVC upgraded from 1.0.5 to 1.0.7.
 * 17:55 GMT-5: phpMyAdmin upgraded from 2.11.3 to 3.1.3.
 * 18:55 GMT-5: PHP FastCGI recompiled against MySQL 5.1.32.
 * 19:27 GMT-5: Apache for users upgraded from 2.2.8 to 2.2.11.
 * 22:08 GMT-5: Apache for vhosts upgraded from 2.2.8 to 2.2.11.
2009-03-09
 * 00:13 GMT-5: Bugzilla upgraded from 3.1.4 to 3.2.2.
 * 01:16 GMT-5: Buildbot upgraded from 0.7.8 to 0.7.10.
 * 01:59 GMT-5: Mercurial upgraded from 1.0.2 to 1.2.
 * 04:39 GMT-5: Debian upgraded from 4.0 (etch) to 5.0 (lenny).
 * 04:42 GMT-5: Mercurial and Buildbot switched from Python 2.4.4 to 2.5.2.
2009-03-10
 * 00:44 GMT-5: vBulletin upgraded from 3.6.7 to 3.8.1-PL1.

DS helped me on the Buildbot/Mercurial side. Hopefully everything still works for the most part. Many of these upgrades were sorely needed, fixing known bugs or security issues.

All of this was delayed so long because, well, no one wanted to do it. We maintain private patches against Apache, Bugzilla, vBulletin, and ViewVC. Keeping track of those changes is difficult.

This time around we started using Mercurial Queues. We keep the official software in a Mercurial repository and our custom changes in a queue (patch series). When it’s time to update, we pop the series, commit the new software, then push the series back.

This workflow is nice and all, but Mercurial Queues aren’t really part of the repository. So if someone wants to check out the repository and stage development somewhere else, they have to export the queue by hand and re-import it on the other side. Ideally, these patches would be lightweight development forks off the main repository.

We used to keep separate repositories and do three-way diffs. That was a huge pain. Never again. It still seems like we’re missing something though.

What’s next for AlliedModders?

It’s the question no one asked!

SourceMod Motivation

As I’ve too often said, the original SourceMod failed because we tried to do too much without understanding project management, the scope, or even how to code most of what we wanted to do. When we began talking about restarting the project, we did so from a research point of view. We asked the community: should we spend time writing a new language, or just get something to market on Pawn?

The community voted the latter, so we approached the project with the following questions: What would SourcePawn look like if it were modernized a bit? What would AMX Mod X look like if it had a much cleaner and more consistent API?

We wrote a new VM and JIT for Pawn. We added fast strings, basic function pointers, and dynamic arrays to the language. We boxed it into a fairly simple plugin system with a small API and watched it take off, adding more as requests came in. Commands, events, menus, FFI, et cetera. It was fun designing all of this.

What’s left to do that’s interesting? There are a lot of open bugs for features people want. Perhaps the only really big-ticket item left for SourceMod is script-level detouring of virtual functions, which has been in the works for a long time (and is about 50-60% done).

Most of what needs to be done though is hacking away at the indescribably infuriating Source engine, and all of the closed source games that run on it. This is boring. Sure, we’ll continue to do it. But it’s not interesting. It’s rote reverse engineering against the same old stuff. It’s monotonous and tedious, not to mention fragile. If we were developing a full product (game), we’d have access to everything. But we don’t, and we’re stuck in what’s ultimately a script kiddy’s role.

SourcePawn is Limited

SourceMod has stretched Pawn to its limits. It’s a crufty language. There is no heap. Memory management is manual. Strings are byte buffers. There are no types. There is no data integrity or safety. There are no structures or classes, just C-style functions and opaque integer values.

Basically, writing SourceMod plugins feels vaguely reminiscent of the WinAPI, except you don’t have any of the advantages of C.
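To make that concrete, here is a small sketch of what this style looks like in a SourceMod plugin of that era, using the natives from SourceMod’s adt_array API:

```sourcepawn
#include <sourcemod>

public OnPluginStart()
{
    // Everything dynamic lives behind an opaque Handle (really just a
    // tagged integer), and every operation goes through a native.
    new Handle:list = CreateArray();
    PushArrayCell(list, 42);

    new value = GetArrayCell(list, 0);
    PrintToServer("Stored %d", value);

    CloseHandle(list);   // forget this and the Handle leaks
}
```

There is no array object, no method call, no destructor: just C-style functions shuffling integers that happen to name resources inside the core.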

Compiling plugins is asinine. It causes so many problems – arbitrary compatibility issues, secret backdoors, license violations, site and forum maintenance difficulty, et cetera.

Pawn is very fast because it’s so limited, but it’s also hard to squeeze more optimizations out. Everything has to go through natives, even small trivial functions. The bytecode is very low level, and though Pawn as a language has immense JIT potential, our JIT is very basic because the bytecode is too hard to analyze.

So… where do we go from here?

I want to start looking at more involved questions. The type of questions SourceMod might have asked if we had chosen the former path when restarting the project. I’ve boiled this down to two issues:

  • What would a language look like if it were specifically designed with trace compilation, game scripting, and embedding in mind?
  • What would Pawn, a language designed to be simple, look like as a truly modern scripting language, and not something from 1980?

Relatedly, there are two interesting questions for SourceMod:

  • What would SourceMod (as an “AMX Mod X done right”) look like with a more modern, object-oriented language available?
  • Can SourceMod be self-hosted?

I have mentioned this before. Most people say “Just use Python, or JavaScript, or Mono, or X!” where X is some random language that I don’t want to use. That’s not something I’m interested in. If you want to see what your favorite language looks like in Source, go ahead and embed it. This isn’t a case of NIH. It’s something I very much want to pursue to see where it goes.

The Next Project

The next project has been started. It’s a new programming language. I’ve chatted a bit about it in IRC, mostly to get some bikeshedding out of the way. As the project is not official yet, there’s no official name.

I have no idea how far this will go, or if it will go anywhere. But it’s been started. Goals, or rather, properties of the language will be:

  • Dynamically typed.
  • Strongly typed.
  • Object oriented.
  • Garbage collected.
  • Static lexical scoping.
  • Nested block scoping.
  • Explicit variable declaration.

I’ve started work on a garbage collected, register-based bytecode interpreter. I won’t start the actual compiler until I’m pleased enough with backend progress, and I won’t start the JIT until I’m pleased enough with overall maturity.

Everything will be incremental. Parity with Pawn is enough to let people start playing with things, and that would be a huge, huge step.

If anything ever comes of this, it will be interesting to see how it can fit into SourceMod. While we can never get rid of Pawn because of compatibility, there’s lots of room for potential. IRC goer theY4Kman embedded Python in a SourceMod extension. We could try something similar as a starting point, or we could do something like EventScripts, where Python and the “old language” from the older, darker times live side by side.

Who is working on this?

For the most part, just me (and, if they so desire, other SourceMod developers). This is a pet project and I don’t want to be too distracted. I also don’t want to tie SourceMod to its success, so it has no relation to SourceMod’s roadmap in any way. But like SourceMod, it’s something I will attempt to work on incrementally in my free time.

If progress on this is something people are interested in, I can keep talking about it. If you’re really interested in it, feel free to find me on IRC (irc.gamesurge.net, #sourcemod). More than likely I’ll be occasionally throwing random design questions around out loud.