Category Archives: Articles

“Official” articles that were pre-written for posting.

Tamarin Tracing, Intro to Tracing JITs

I’ve been porting Tamarin-Tracing’s code generator to AMD64. The idea behind Tamarin-Tracing is that you have an interpreter for some dynamically typed language, or an environment where a fully optimizing JIT would be too expensive.

As the interpreter runs, it will decide to start tracing sequences of code that could be “hot paths” for performance. For example, loops are good candidates. Once tracing begins, the interpreter captures the state of every instruction and emits it as a “trace.” A trace represents the exact sequence that was taken through a particular control path.

A good example could be a method call. For example, “obj.crab()” in languages such as JavaScript or Python requires a dynamic lookup, since the type of “obj” is only known at run-time. You can emit a trace like this:

LET x = TYPEOF(obj)
TRACE:
   IF TYPEOF(obj) != x
      EXIT
   ELSE
      CALL x::crab ON obj

This trace says, “Assume object is type X. If it’s not, we’ll have to recompute it, but if it is, we can optimize this to a direct method call.” Later, the trace will be analyzed, optimized, and then compiled to assembly. Next time the interpreter needs to call the method, it will see that a compiled trace exists. The compiled trace will first check that the type of “obj” matches the path it originally took. If not, it will exit back to the interpreter. This check is called a “guard” and the exit is called a “side exit.” Otherwise, the optimized method call is invoked. A side exit can later turn into another trace; these branches form a “trace tree.”
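
As a rough illustration of the guard and side exit described above, here is a minimal C sketch. Everything in it (the Object struct, the TYPE_* tags, run_trace) is hypothetical and not taken from Tamarin-Tracing; it only models the control flow of a trace specialized to one type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object model: a run-time type tag plus one field. */
enum TypeTag { TYPE_CRUSTACEAN, TYPE_OTHER };

typedef struct Object {
    enum TypeTag type;
    int steps;
} Object;

/* The method that was resolved while the trace was being recorded. */
static void crustacean_crab(Object *obj)
{
    obj->steps++;
}

enum TraceResult { TRACE_OK, TRACE_SIDE_EXIT };

/*
 * Stand-in for the compiled trace of obj.crab(). The guard re-checks the
 * type assumption made during recording. If it fails, we take the side
 * exit and the interpreter would do the slow dynamic lookup instead.
 */
static enum TraceResult run_trace(Object *obj)
{
    assert(obj != NULL);
    if (obj->type != TYPE_CRUSTACEAN)  /* guard */
        return TRACE_SIDE_EXIT;        /* side exit back to the interpreter */
    crustacean_crab(obj);              /* direct call, no lookup */
    return TRACE_OK;
}
```

A caller would run the trace first and fall back to the interpreter only when it reports TRACE_SIDE_EXIT.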

Anything can be traced. For example, one side of an “if” branch can be traced. The resulting compiled code would be for the “hot” side of the branch. If the “cold” side is detected, it would jump back to the interpreter and a new trace might branch off. Another case is addition for weakly typed languages. For example, “5 + '3'” is valid in JavaScript, and a trace might optimize numeric conversion paths.

One of the most surprising features is that the compilation granularity is much finer. SourceMod compiles entire scripts to native code (known as “ahead of time” compilation). Sun and Microsoft’s JITs compile one method at a time, so compilation is deferred until a method is needed. A tracing JIT, however, is capable of compiling only the pieces of code that are deemed important. It can trace through anything it wants, including method calls, and return control back to the interpreter when the hard work is done.

This is a pretty new concept that seems to be the future of optimizing dynamic languages. JITs for these languages typically hit performance boundaries because the native code must either perform run-time conversion itself or exit back to interpreter functions for resolving types. There is supposedly a LuaJIT in the works for possibly my least favorite language (Lua), and someone on IRC mentioned a tracing JIT in the PyPy project (though I have not looked into it).

Unfortunately, benchmarks have continually shown that Tamarin Tracing is just plain slow. Some people are wondering why we’re even bothering with it. What’s important to recognize is that the speed of the tracer is bound to the speed of the interpreter. Tamarin Tracing is highly experimental, and the interpreter Adobe packaged with it is not optimized. The tracing engine, however, can be easily detached from the interpreter. Since Mozilla already has a reasonably fast interpreter (SpiderMonkey), our new project is to couple the two together, as “TraceMonkey,” so the interpreter can emit traces.

A trace is a set of low-level, abstracted instructions, internally called “LIR.” Each “word” of LIR is 32 bits: an 8-bit opcode and three optional 8-bit operands. For example, encoding a 32-bit immediate value requires an empty LIR_imm32 word and another word for the value. These LIR opcodes are designed to be fairly platform agnostic, but more importantly, they are intrinsically in SSA form. This makes liveness analysis, common sub-expression elimination, and other optimizations much easier.
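
To make the word layout concrete, here is a small sketch of packing and unpacking such a word in C. The opcode numbers, the function names, and the choice to put the opcode in the low byte are assumptions for illustration; the real LIR layout may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode numbers, for illustration only. */
enum { SKETCH_IMM32 = 1, SKETCH_ADD = 2 };

/* Pack an 8-bit opcode and three 8-bit operands into one 32-bit word.
 * Placing the opcode in the low byte is an assumption of this sketch. */
static uint32_t lir_pack(uint8_t op, uint8_t a, uint8_t b, uint8_t c)
{
    return (uint32_t)op
         | ((uint32_t)a << 8)
         | ((uint32_t)b << 16)
         | ((uint32_t)c << 24);
}

static uint8_t lir_opcode(uint32_t word)
{
    return (uint8_t)(word & 0xFF);
}

/* n selects operand 0, 1, or 2. */
static uint8_t lir_operand(uint32_t word, int n)
{
    assert(n >= 0 && n < 3);
    return (uint8_t)(word >> (8 * (n + 1)));
}

/* A 32-bit immediate takes two words: an opcode word with no operands,
 * followed by the raw value. Returns the number of words written. */
static int lir_emit_imm32(uint32_t *out, uint32_t value)
{
    out[0] = lir_pack(SKETCH_IMM32, 0, 0, 0);
    out[1] = value;
    return 2;
}
```

The two-word immediate is why instruction labels in a LIR listing can skip numbers: a wide immediate occupies more than one slot.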

Since TT’s interpreter is not 64-bit safe, I’ve been testing my port by writing programs directly in trace LIR. For my first function I implemented XKCD’s getRandomNumber() { return 4; } function. Each label is the word/instruction number.

1: imm64 4      ; immediate value of 4
4: imm64, &addr ; immediate value, address from C code
7: sti 1, #0(4) ; take the pointer value of instruction 4, store value in instruction 1 at offset 0.
8: x            ; exit the trace

These traces are compiled backwards. In this example, the epilogue is generated first, and the last instruction to be generated is the first instruction of the prologue. Generation is a single pass and thus code size is not computed beforehand. Executable pages are allocated one at a time. If there is not enough space to encode a full instruction in the current page, a new one is allocated and a JMP is generated to bridge the code.
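
Backwards generation is easier to picture with a tiny emitter that fills a buffer from the end toward the start. This is only a sketch under my own naming (BackEmitter, back_emit); it models a single page and reports failure at the point where a real generator would allocate a new page and emit the bridging JMP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct BackEmitter {
    unsigned char *base;    /* lowest address of the page */
    unsigned char *cursor;  /* start of the code emitted so far */
} BackEmitter;

static void back_init(BackEmitter *e, unsigned char *buf, size_t size)
{
    assert(e != NULL && buf != NULL);
    e->base = buf;
    e->cursor = buf + size;  /* nothing emitted yet, so the cursor is at the end */
}

/* Place one instruction immediately before the code emitted so far.
 * Returns 0 on success, or -1 if the page has no room left. */
static int back_emit(BackEmitter *e, const unsigned char *insn, size_t len)
{
    assert(e != NULL && insn != NULL);
    if ((size_t)(e->cursor - e->base) < len)
        return -1;
    e->cursor -= len;
    memcpy(e->cursor, insn, len);
    return 0;
}
```

After the last call, cursor points at the first byte of the finished code, which is why the prologue's first instruction is the last thing generated.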

That’s it for my rough overview. If you’re interested in details of the tracer, including LIR and everything after, I highly recommend Dave Mandelin’s Tamarin Tracing Internals articles (parts 3, 4, and 5). He provides some excellent insight into the optimizations it makes, and I’ve been using the articles as a guide while reading the code.

SourceMod’s JIT Opened, Performance Ideas

Originally, we had a few motivations for keeping SourceMod’s JIT closed source. They are neither here nor there, but we’ve decided to open it. I’d like to talk a bit about what it is and what it isn’t. If you’d like to see the source, it’s in sourcepawn/jit/x86.

A JIT is a technology that, “when it’s needed,” compiles code from one form to another for optimization. For example, the Java JIT compiles Java bytecode to native assembly. There are three levels of granularity in a JIT:

Whole Program. The JIT compiles the entire program in one go (actually an “ahead of time” compiler). No granularity.
Methods. The JIT compiles functions one at a time as they are needed. Coarse granularity.
Tracing. Can compile any path of code anywhere in the program. Fine granularity.

Tracing JITs are new and still seem to be experimental. Traditional JITs, like Microsoft’s and Sun’s, compile functions one at a time. SourceMod’s JIT is a whole program compiler. We chose this route for a few reasons:

It’s easy.
We’re not concerned about loading/compilation time since Half-Life servers take forever to start anyway.
Even so, the cost of computing optimizations on a SourceMod plugin is extremely cheap.
There is no performance gain from tracing because Half-Life runs on good processors and all types are statically known.

Why is optimizing Pawn code so easy? It has a simple, two-register design internally. This has a number of really nice implications:

The generated patterns are simple. Peephole optimization in the source compiler does a good job at reducing expressions.
The two registers that represent machine state can be easily mapped to processor registers, since nearly every processor has two scratch registers.
Types are static and optimal code can be emitted.

Therefore SourceMod’s JIT is “dumb” for the most part. It performs a massive peephole “search and replace” of every Pawn bytecode to native machine code. Where it wins is that the assembly is highly handcrafted to the x86 ISA, rather than piped through a processor abstraction layer. faluco spent a lot of work optimizing moves and stores to reduce things like pipeline stalls. A pipeline stall is when one instruction depends on the result of the instruction before it and has to wait for that result. For example, if x86 sees an address coming up, it tries to pre-fetch the final computed address onto the address pipeline. This is why LEA is called a “free” instruction on x86 (because the computed result is “already there”). If the address computation requires a prior instruction, the pipeline will be stalled.
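
The “search and replace” idea can be sketched as a table of fixed machine-code templates, one per bytecode. The enum values and function names below are made up for the example and this is not SourceMod’s actual code; the three x86 encodings assume the primary register is EAX and the alternate register is EDX.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical numbering for three Pawn-style opcodes. */
enum PawnOp { PAWN_ZERO_PRI, PAWN_ADD, PAWN_XCHG, PAWN_OP_COUNT };

typedef struct Template {
    unsigned char bytes[4];
    size_t len;
} Template;

/* One fixed x86 sequence per opcode (PRI = EAX, ALT = EDX). */
static const Template kTemplates[PAWN_OP_COUNT] = {
    { { 0x31, 0xC0 }, 2 },  /* ZERO.pri -> xor eax, eax  */
    { { 0x01, 0xD0 }, 2 },  /* ADD      -> add eax, edx  */
    { { 0x92 },       1 }   /* XCHG     -> xchg eax, edx */
};

/* Translate a bytecode stream by copying templates back to back.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t jit_translate(const enum PawnOp *code, size_t count,
                            unsigned char *out, size_t out_size)
{
    size_t i, used = 0;
    for (i = 0; i < count; i++) {
        const Template *t;
        assert((int)code[i] >= 0 && (int)code[i] < PAWN_OP_COUNT);
        t = &kTemplates[code[i]];
        if (out_size - used < t->len)
            return 0;
        memcpy(out + used, t->bytes, t->len);
        used += t->len;
    }
    return used;
}
```

A real JIT would also patch jump targets and operands into the copied bytes, which this sketch leaves out.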

SourceMod performs some other more general optimizations. It optimizes for space by emitting short instructions when possible. Certain function intrinsics are implemented in inlined hand-crafted assembly (like the GENARRAY and FILL opcodes). Native dispatch is entirely inlined down to assembly.

A special amount of attention is given to switch cases. If all case values are consecutive they are optimized down to a jump table rather than a series of if statements. The sequence can start at any number and end at any number, as long as no number is skipped.
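
Here is a small C model of that decision and of the resulting dispatch. The function names are hypothetical, and the targets are plain integers standing in for code addresses.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the sorted case values form an unbroken run (e.g. 5,6,7). */
static int cases_are_consecutive(const int *values, size_t count)
{
    size_t i;
    assert(values != NULL);
    if (count == 0)
        return 0;
    for (i = 1; i < count; i++) {
        if (values[i] != values[i - 1] + 1)
            return 0;
    }
    return 1;
}

/* Jump-table dispatch: index by (value - first case value). Anything
 * outside the run goes to the default target. The unsigned subtraction
 * folds the "too small" and "too large" checks into one comparison. */
static int dispatch(int value, int first, const int *targets, size_t count,
                    int default_target)
{
    unsigned int index = (unsigned int)value - (unsigned int)first;
    assert(targets != NULL);
    if (index >= count)
        return default_target;
    return targets[index];
}
```

With a table, the cost of the switch is one subtraction, one compare, and one indexed jump, no matter how many cases there are.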

Since Pawn doesn’t have types, the JIT performs some peephole optimizations on certain natives. For example, if it sees the ‘FloatAdd’ native, it will optimize the code down to FPU instructions. This is a huge bonus because native dispatch is expensive (the VM’s internal state must be cached and then restored). This specialized peephole optimization occurs mainly on float natives.

The JIT maps Pawn’s two registers to EAX and EDX. This ends up being really nice, as they’re the two scratch registers used for lots of important things on x86. For example, a MUL instruction uses both, and many of the ALU instructions have shortened forms for EAX. The unfortunate side effect is that stack spilling can be common, but the Sethi-Ullman Algorithm shows that two registers will suffice for binary expressions.
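
For reference, the Sethi-Ullman labeling can be sketched in a few lines of C. This uses the simplified variant where every leaf needs one register, which is an assumption of the sketch; the Expr type is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* A binary expression tree: leaves have no children. */
typedef struct Expr {
    const struct Expr *left;
    const struct Expr *right;
} Expr;

/* Number of registers needed to evaluate e without spilling, using the
 * simplified rule: a leaf needs 1; a node needs max(l, r) when the two
 * sides differ, and l + 1 when they are equal. */
static int su_registers(const Expr *e)
{
    int l, r;
    assert(e != NULL);
    if (e->left == NULL && e->right == NULL)
        return 1;
    l = su_registers(e->left);
    r = su_registers(e->right);
    if (l == r)
        return l + 1;
    return l > r ? l : r;
}
```

Under this rule, a + b and (a + b) * c both fit in two registers, while (a + b) * (c + d) needs three, so a two-register machine has to spill one side to the stack.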

The SourceMod JIT gets the job done. In terms of design considerations, it’s not perfect. Its optimizations are primitive, and it assumes the source compiler will do most of the hard work. But in the end, it fits the bill well. Pawn doesn’t have much complexity, and the scripts are fairly small.

I do have some future plans for the JIT. Some easy optimizations are that more float natives can be optimized away in the peephole pipeline. Similarly, if SSE/SSE2 is detected, faster instructions could be emitted instead.

With a little help from the compiler, we could completely eliminate a large class of useless performance losses in SourceMod. Pawn scripts use “local addresses” that are relative to a base pointer. If the compiler cooperated, the JIT could eliminate local addresses completely. This would greatly improve generated code performance and turn LocalTo* calls (which extension writers should hate by now) into nops. There are other implications too – calls from plugin to plugin would become direct calls instead of being slowly marshaled through PushCell() and GetNativeCell() and whatnot.
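
To show what a local address costs today, here is a rough model of the translation step in C. The PluginMemory struct and local_to_phys are hypothetical names for this sketch, not the SourceMod API; the point is only that every access pays for a bounds check and an add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t cell_t;  /* one Pawn cell, assumed 32-bit here */

/* Hypothetical view of a plugin's memory block. */
typedef struct PluginMemory {
    unsigned char *base;  /* start of the plugin's data/heap/stack */
    size_t size;          /* size of that block in bytes */
} PluginMemory;

/* Translate a local (base-relative) address into a real pointer.
 * Returns NULL if a whole cell would not fit inside the block. */
static cell_t *local_to_phys(const PluginMemory *mem, uint32_t local_addr)
{
    assert(mem != NULL);
    if (local_addr > mem->size || mem->size - local_addr < sizeof(cell_t))
        return NULL;
    return (cell_t *)(mem->base + local_addr);
}
```

If scripts held physical addresses directly, this function would collapse to returning its argument.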

Lastly, natives are expensive. It takes 28 instructions to jump into native mode. Unfortunately, calls like IsClientConnected() and IsValidEntity() can’t be implemented without native code. A possible solution to this is to introduce a few very specialized “native” opcodes into the JIT. For example, a check for IsClientConnected might look like this in an extended Pawn bytecode:

  proc
  push.s  0xC     ;push the first parameter (the entity index)
  getuseraddr 1   ;offset of &g_Players address in context::user[] array
  vcall 5         ;call function at virtual index N, GetGamePlayer()
  vcall 2         ;call function at virtual index N, IsConnected()
  ret

The JIT would then compile this directly into the plugin, as if it were a stock. This would effectively let anyone write very fast natives in an abstracted assembly language, without incurring any native overhead. It is of course a lazy solution based on Pawn’s limitations, but definitely a decent attempt at gaining performance for very little work.

For now, I will likely back away from other lacking things in Pawn (like CSE, liveness analysis, optimizing register usage, etc). The principal reason is that the other proposed optimizations are likely to get more bang for the buck at this point. The fact that the JIT is hardcoded to assume EAX/EDX as the two storage registers means such extensive optimization would warrant a complete redesign for a dynamic register allocator. We wouldn’t be able to justify that until we freed up some of the registers required by Pawn (like the DAT and FRM registers for the local addressing mode, see earlier idea).

Driving in CA is Dangerous

After a few weeks of driving in CA, I’ve seen more accidents than I see in months of driving on the east coast.

I’m not sure what the reason is. There is certainly a lot more traffic, but the roads are bigger to compensate. It seems people are just more reckless. Examples of weird traffic incidents:

(Morning) A car is completely stopped in the middle of a freeway, causing a big traffic mess as people try to route around it.
(Morning) A car has skidded off the freeway into a steep ditch, becoming engulfed in bushes. Supposedly this car had a person in it and it took police days to notice it, so there were a few police cars and traffic was bad.
(Morning) A truck’s doors flew open and the contents scattered everywhere on the freeway. Unbelievably, the owner of the truck was on the side of the road, making mad dashes into the freeway to collect his dropped stuff.
(Afternoon) A truck crashed badly on a long, curvy exit ramp. Two lanes of traffic had to be shut down and it took almost half an hour to get off the ramp.
(Afternoon) A “generic” accident involving three or four cars held up traffic as we left a movie theater.
(Afternoon) On the same trip as above, we saw an accident in progress – a car swerved from the rightmost lane of the freeway, and headed left horizontally into (I’m assuming) the concrete barrier. I didn’t see the whole thing as we were driving in the opposite direction.
(Afternoon) On “bike to work day,” a biker was pedaling alongside the freeway. He was on the breakdown lane, but when an exit ramp came up, he simply pedaled into the freeway. This was extremely stupid, and I almost hit the guy trying to merge. It’s not easy to merge when there’s someone on a bicycle in front of you traveling a fraction of your speed.

Then again, maybe it’s just me. The family I’m staying with contends they haven’t seen stuff like this very often, and since a few of the incidents happened with us traveling together, I’ve become a bad luck charm for cars.

I wouldn’t care, except the frequency of these incidents means I have to add thirty extra minutes to my driving time. Traffic in San Jose isn’t as bad as Los Angeles, but it’s pretty slow during rush hours.

A Drive Through the Country

As I alluded to in February, I’m going to try and shy away from the gaming scene while I experiment with other fields. That doesn’t mean “leave completely,” but my activity will be hampered. That is partially the reason SourceMod moved to a “rolling release” concept so we can push out incremental feature updates easier.

As part of this endeavor, I’ve accepted an internship at Mozilla. I don’t know what I’ll be working on yet, but my primary interest is in language development, so hopefully it will be ECMAScript 4/JavaScript 2/Tamarin. It requires relocation to Mountain View, California.

I’m really looking forward to this, but the next two weeks I am not. I managed to schedule myself into a rather nasty hole. Specifically, finals last until April 29th. My apartment lease ends on April 30th. I said I’d be in California by May 5th. That means I need to finish school, move all of my belongings to storage, and move, all in the span of a week.

What makes this feat particularly amusing is that, against better judgment, I have decided to drive to my destination. This is not a short road trip. Clocking in at 3,100 miles, I can expect to spend at least a few days on the road. I’ve gotten a friend to tag along to make sure I don’t accidentally drive in a direction other than “West.”

While I’m resigned to the fact that I will probably miss my deadline by a few days, it’s both a long journey and a stressful two weeks ahead. As of May 1st, I will probably not be on IRC for a week or so. You probably won’t see any commits from me for at least two weeks though I will try to read the forums and bug reports as I have time. Blog articles will resume on May 5th (I have a rather scathing, two-part overview of the JVM spooled up).

And so…
Mozilla or Bust

Sayonara, bickering with Valve!

Migration to Vista x64

About a month ago I made the migration to Windows Vista x64. A little back story first: I tried Vista Business soon after it was released. I hated it. Most apps had terrible compatibility (even Microsoft’s own apps), and it didn’t feel very fast.

But recently my Windows XP install was yet again beginning to decay. This happened every time I reformatted. Usually after a few days (or even hours) of setting up my typical XP environment, it started exhibiting strange bugs, slowdowns, and random problems. For example, the entire user interface would randomly stop working and I’d have to reboot. Clicking the ‘Start’ button would freeze the taskbar for about two seconds. Some USB devices wouldn’t work until they got power cycled. Killing things in Task Manager would always have some ridiculous delay (which I think is a general problem in XP).

These problems got so predictable that when my hard drive died in March, I decided to upgrade to Vista again, and take another plunge at the same time – Vista x64. So far, I don’t have many complaints. The user interface is slick and responsive. I’m using Aero, which says a lot: I never used Luna on XP. The organizational changes, for the most part, aren’t really much of a problem for me. I switch between the “Classic” and new control panels frequently to adapt.

UAC is not a problem, and in fact I like it after some tweaks. I disabled “Secure Desktop” (which is an extreme annoyance) and disabled confirmation for Administrators. I run my primary account as LUA (limited user account). Typing in the Administrator password to install apps is no big deal, I’m used to this from administering Linux boxes. On a few occasions I’ve seen the box pop up from suspicious programs that shouldn’t need Administrator privileges — on XP, these would have gracefully executed with no chance to inspect them.

Application compatibility has improved since a year ago. While most apps aren’t x64-native, a few are: 7-Zip, Ventrilo, MySQL, Perl, and a fair number of Microsoft apps. Some aren’t for decent reasons. Firefox, Internet Explorer, media players and the like often depend on third-party libraries which don’t have x64 ports, and you can’t mix 64-bit and 32-bit code in the same process.

I did have a few 16-bit Windows games which no longer run (x64 dropped the 16-bit subsystem). I even had one very esoteric program with a 16-bit installer. I wasn’t too broken up about losing those though, and VMWare will suffice if I ever need to run them.

As for driver compatibility, I did have one issue. My XP system had two GeForce cards, a 5200 and an 8800. On Vista (or at least, Vista x64) the 5xxx series and lower is no longer supported by any of the new nVidia drivers. The 5xxx series is only supported up to ForceWare 96.85 and the 8800 requires much later versions (the latest is 169.35). I tried to install both drivers at once and ended up with a nice blue screen, which is pretty understandable. Out of pure luck I had a GeForce 6 in another machine so the problem was easily resolved.

The one thing I do not like about Vista is its Explorer changes. The new Explorer is pretty bad. I removed the new organizational changes almost instantly (and I’m glad Microsoft allows for those to be reconfigured). However, it changed my favorite hot-keys and left no revert mechanism in. Every time I use backspace I expect it to go up one level, and instead it treats file browsing like web browsing. There’s a new shortcut for this (ALT+Up) but now it’s confusing when I go back and forth between operating systems.

Vista’s Explorer makes it difficult to right-click files that aren’t selected, because it requires hitting the exact name text, rather than the row the file is in. If you miss this, you get the folder’s context menu and you might not realize it right away. The new dialog for “Copy/Replace” is great, except for one major detail – you can’t use the keyboard. I haven’t been able to find a key combination that will automatically select “Replace.” Enter, strangely enough, just cancels the dialog. This is really annoying because I’m a heavy keyboard user and I’m overwriting files a lot while debugging.

It also automatically sorts new files. This seems pretty cool at first until you drop a file in a folder and watch it get sorted into a completely different location than you visually plopped it.

Back to application compatibility — some people still don’t get things totally right. A few apps failed to create Start Menu/Desktop shortcuts. Some did so, but created them with Administrator privileges (which is annoying because then I can’t remove them). Some apps almost get the right idea. For example, mIRC uses “Application Data” instead of “Program Files” now, but it uses “AppData” to store log files. That’s kind of strange, since they’re user-readable documents. “AppData” is a hidden folder and shouldn’t be used for things like that.

My favorite changes in Vista? Task Manager’s “Kill” function seems to actually work. “Documents and Settings” has been changed to the much more palatable “Users”. Symlinks now provide a bit more functionality over XP/2000’s “Junctions.” UAC makes me feel more in control.

That’s enough ranting for today — your mileage may vary.

Edit: Twisty suggested I try TAB+Space which ended up working. Huzzah!

Programmatic Radio Control for Smartphones and Pocket PCs

Lengthy title, yes. I’m working on an application for my Windows Mobile 6.0 Phone, and I needed to toggle the radio (phone receiver) on and off. It’s not very difficult to disable and enable the radio programmatically, but it did take quite a bit of research since I’m new to this platform. I couldn’t find any good, well-written examples on this using Google, so I decided to discuss my solution.

I used Visual Studio’s “Windows Mobile 5.0 Smartphone SDK (ARMV4I)” emulator and architecture for testing. The code is written in straight C. The process is pretty simple:

Initialize with the TAPI DLL.
Enumerate all the line devices. Find the one called “Cellular Line”.
Open the line, change its equipment state, close the line.
Shutdown from the TAPI DLL.

For initializing and shutting down access to TAPI, I made a few helper functions.

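#include <tapi.h>
#include <extapi.h>
#include <stdlib.h>  /* malloc, realloc, free */
#include <string.h>
 
HLINEAPP hLineApp = NULL;
HINSTANCE g_hInstance = NULL;  /* Set from WinMain before InitializeLineUtils() */
DWORD g_NumDevices = 0;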
 
int InitializeLineUtils(LINECALLBACK callback)
{
	int ret;
 
	/* Init TAPI, get device count */
	if ((ret = lineInitialize(&hLineApp,
			g_hInstance,  /* From WinMain */
			callback,
			NULL,
			&g_NumDevices))
		!= 0)
	{
		return ret;
	}
 
	return 0;
}
 
void ShutdownLineUtils()
{
	if (hLineApp != NULL)
	{
		lineShutdown(hLineApp);
		hLineApp = NULL;
	}
}

The next step is to find a device matching a specific name. Device names can be either in Unicode or ASCII, so my helper function takes both to eliminate any conversion. First the device API version is negotiated, then the device capabilities are queried. Note that the structure to do this is pretty low-level — you have to reallocate memory until the device decides it’s big enough to squirrel all its data past the end of the struct definition.

This function finds a matching device and returns its negotiated API version:

DWORD FindLineDeviceByName(LPCWSTR wstr, LPCSTR cstr, DWORD *pAPIVersion)
{
	LONG ret;
	DWORD api_version;
	LINEDEVCAPS *dev_caps;
	LINEEXTENSIONID ext_id;
	DWORD device_id, dev_cap_size;
 
	dev_cap_size = sizeof(LINEDEVCAPS);
	dev_caps = (LINEDEVCAPS *)malloc(dev_cap_size);
	if (dev_caps == NULL)
	{
		return (DWORD)-1;
	}
	dev_caps->dwTotalSize = dev_cap_size;
 
	/* For each device... */
	for (device_id = 0; device_id < g_NumDevices; device_id++)
	{
		/* Negotiate an API version (0x00020000 is my CURRENT) */
		if ((ret = lineNegotiateAPIVersion(hLineApp,
				device_id,
				0x00020000,
				TAPI_CURRENT_VERSION,
				&api_version,
				&ext_id))
			!= 0)
		{
			continue;
		}
 
		while (TRUE)
		{
			if ((ret = lineGetDevCaps(hLineApp, 
					device_id,
					api_version,
					0,
					dev_caps)) == 0
				&& dev_caps->dwNeededSize <= dev_caps->dwTotalSize)
			{
				break;
			}
 
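			if (ret == 0 || ret == LINEERR_STRUCTURETOOSMALL)
			{
				/* Grow the buffer to the size the device asked for */
				LINEDEVCAPS *grown;
				dev_cap_size = dev_caps->dwNeededSize;
				grown = (LINEDEVCAPS *)realloc(dev_caps, dev_cap_size);
				if (grown == NULL)
				{
					free(dev_caps);
					return (DWORD)-1;
				}
				dev_caps = grown;
				dev_caps->dwTotalSize = dev_cap_size;
			}
			else
			{
				break;
			}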
		}
 
		if (ret != 0)
		{
			continue;
		}
 
		if (dev_caps->dwStringFormat == STRINGFORMAT_UNICODE)
		{
			const wchar_t *ptr = (wchar_t *)((BYTE *)dev_caps 
				+ dev_caps->dwLineNameOffset);
			if (wcscmp(ptr, wstr) == 0)
			{
				*pAPIVersion = api_version;
				free(dev_caps);
				return device_id;
			}
		}
		else if (dev_caps->dwStringFormat == STRINGFORMAT_ASCII)
		{
			const char *ptr = (char *)dev_caps + dev_caps->dwLineNameOffset;
			if (strcmp(ptr, cstr) == 0)
			{
				*pAPIVersion = api_version;
				free(dev_caps);
				return device_id;
			}
		}
	}
 
	/* No match: release the caps buffer and report failure */
	free(dev_caps);
	return (DWORD)-1;
}

Lastly, helper functions for opening and closing lines:

LONG OpenLine(DWORD device_id, 
	      DWORD api_version, 
	      HLINE *pLine, 
	      DWORD inst,
	      DWORD privs,
	      DWORD media)
{
	return lineOpen(hLineApp,
		device_id,
		pLine,
		api_version,
		0,
		inst,
		privs,
		media,
		NULL);
}
 
void CloseLine(HLINE line)
{
	lineClose(line);
}

Now we can write a simple program to toggle the radio:

VOID FAR PASCAL lineCallbackFunc(DWORD hDevice, 
				 DWORD dwMsg, 
				 DWORD dwCallbackInstance, 
				 DWORD dwParam1, 
				 DWORD dwParam2, 
				 DWORD dwParam3)
{
}
 
int WINAPI WinMain(HINSTANCE hInstance,
		   HINSTANCE hPrevInstance,
		   LPTSTR lpCmdLine,
		   int nCmdShow)
{
	HLINE hLine;
	DWORD state, radio_support;
	DWORD cell_api, cell_device;
 
	/* InitializeLineUtils() reads the instance handle from this global */
	g_hInstance = hInstance;
 
	if (InitializeLineUtils(lineCallbackFunc) != 0)
	{
		return 1;
	}
 
	if ((cell_device = FindLineDeviceByName(
			L"Cellular Line",
			"Cellular Line",
			&cell_api))
		== -1)
	{
		ShutdownLineUtils();
		return 1;
	}
 
	if (OpenLine(cell_device, cell_api, &hLine, 0, LINECALLPRIVILEGE_NONE, 0) != 0)
	{
		ShutdownLineUtils();
		return 1;
	}
 
	if (lineGetEquipmentState(hLine, &state, &radio_support) == 0)
	{
		if (state != LINEEQUIPSTATE_FULL)
		{
			state = LINEEQUIPSTATE_FULL;
		}
		else
		{
			state = LINEEQUIPSTATE_MINIMUM;
		}
 
		lineSetEquipmentState(hLine, state);
	}
 
	CloseLine(hLine);
	ShutdownLineUtils();
 
	return 0;
}

Side note: LINEEQUIPSTATE_NOTXRX seemed to have no effect for me. My device turned the radio off on “MINIMUM” but refused the other option as unsupported.

As excellent and generally seamless as Visual Studio’s emulator was, it did not respond to lineSetEquipmentState at all. I guess since it’s not an actual device, maybe it doesn’t bother implementing the “equipment.”

Vista Problems (Surprise?)

I recently migrated to Vista (a story and a review for another day), but one thing has been bothering me. The “Visual Effects” settings (from My Computer Properties -> Advanced System Properties -> Performance Settings) do not stick.

If I customize the settings, they get reset on logoff or reboot. This is true no matter which account I use, Administrator or limited user. I tried to monitor the dialog box with procmon to see what registry keys were involved, but there were quite a few and they looked annoying to research.

The strange thing is that the dialog box is an administrator-only feature, which would imply that the settings are system-wide. Yet monitoring the dialog box shows all sorts of per-user settings go by.

I used the classic Windows theme for XP (I hated Luna). Aero is tolerable so I decided to give it a shot, but I don’t like all of the frilly, useless animations. For example, windows “slurping” into the task bar and menus fading in and out feels kitschy to me, and only seems to serve as some sort of visual distraction or delay. So I disabled all of these animations in the Visual Effects dialog, and soon discovered that as soon as I logged out and logged back in, I had to reapply all of the settings.

I couldn’t find any other instances of this problem on Google. Damaged Soul managed to find one but it contained a red herring and no solution. I gave up and solved it programmatically, with a small, very insecure (I was lazy) C program sitting in my Startup folder.

#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <tchar.h>
 
int main()
{
	HANDLE hToken;
	ANIMATIONINFO info;
 
	if (!LogonUser(_T("Administrator"),
		_T("KNIGHT"),
		_T("blahblahblah"),
		LOGON32_LOGON_BATCH,
		LOGON32_PROVIDER_DEFAULT,
		&hToken))
	{
		exit(1);
	}
 
	if (!ImpersonateLoggedOnUser(hToken))
	{
		exit(1);
	}
 
	SystemParametersInfo(SPI_SETDISABLEOVERLAPPEDCONTENT,
		0,
		(PVOID)TRUE,
		SPIF_SENDCHANGE);
 
	SystemParametersInfo(SPI_SETCOMBOBOXANIMATION,
		0,
		(PVOID)FALSE,
		SPIF_SENDCHANGE);
 
	SystemParametersInfo(SPI_SETDRAGFULLWINDOWS,
		0,
		(PVOID)FALSE,
		SPIF_SENDCHANGE);
 
	SystemParametersInfo(SPI_SETSELECTIONFADE,
		0,
		(PVOID)FALSE,
		SPIF_SENDCHANGE);
 
	SystemParametersInfo(SPI_SETCLIENTAREAANIMATION,
		0,
		(PVOID)FALSE,
		SPIF_SENDCHANGE);
 
	SystemParametersInfo(SPI_SETMENUANIMATION,
		0,
		(PVOID)FALSE,
		SPIF_SENDCHANGE);
 
	info.cbSize = sizeof(ANIMATIONINFO);
	info.iMinAnimate = 0;
	SystemParametersInfo(SPI_SETANIMATION, 
		sizeof(ANIMATIONINFO),
		&info, 
		SPIF_SENDCHANGE);
 
	return 0;
}

Two notes from playing with this API:

SPI_SETDISABLEOVERLAPPEDCONTENT appears to do nothing? I thought it would be related to Transparent Glass, but… Transparent Glass is a user-mode (per-user?) setting. You can change it in your display preferences, and oddly enough, that will cause it to flip the switch in the Administrator-only settings! I have no idea what’s going on there. Also, transparent glass is the only “effect” setting not to be reset on logging off.
SPIF_UPDATEINIFILE fails with ERROR_MOD_NOT_FOUND on Vista. Maybe it does that on previous versions too, I have no idea. Maybe I’m forgetting to link to something or maybe I’ve missed a security policy thing somewhere that fixes all of my problems.

Now that I have my hacky fix, I don’t feel like investigating the problem any further. But it’d be nice to know what’s going on here, and why those settings can’t be per-user in the first place.

Turtles and Random Bits

I was brushing up on sorting algorithms and came across this one whose goal is to kill turtles. How uncouth!

Speaking of turtles, I’ve received two comments on dealing with TortoiseSVN’s terrible caching program. sawce replaced its executable with a dummy that exits on startup. Damaged Soul says simply removing the executable does the job as well.

Not speaking of turtles, I think I may have to start a new segment on “disappointing software.” My Motorola Q decided to format itself this week, and I had to reinstall everything. The first thing that I always install after wiping my smartphone is a program to play Go. Go is extremely difficult for computers to play, so I don’t expect much from an embedded processor. The best program I’ve found so far (that works) has been Pocket GNU Go.

I always forget its name though, so while searching for it I found a (relatively new?) program called Go Mobile. It bragged about having good graphics so I tried it out. By the third move or so it seemed to crap out. I don’t have a screenshot but the move looked like this (the triangled white piece):

To make a Chess comparison, that move is like skipping your turn and then giving a random piece to your opponent. If you’re going to make a Go program but you don’t want to bother writing playable AI for it, why not embed one of the existing solutions?

The lack of good pocket programs for Go is disappointing. Even the GNU Go port I use is very old; it’s an entire major release out of date with the PC version. By contrast, about seven years ago I had a Palm 3 (black and white, 2MB of RAM) that had an amazingly complete chess program.

Don’t call me pedantic!

A few weeks ago in a class, the professor mentioned that C does not support nested functions. He wrote an example on the board that he claimed would not compile.

A student quickly typed up the program on his laptop and said, “GCC compiled it just fine.” The professor replied, “Well, I’ve never heard of that. It must be a compiler feature unless they changed the standard.”

The student then said, “Well, I added -ansi and GCC still compiles it.” The professor caved in — “I guess they changed the standard recently.”

I was sure they most certainly didn’t. This incident bothered me since I knew nested functions couldn’t possibly be in C99. When I got home I reproduced the scenario, and sure enough, the student was right. GCC was compiling completely invalid syntax even with the “-ansi” setting. After digging through the documentation, I was able to find two things:

GCC has its own custom extensions to the C language, which include nested functions.
This custom extension is enabled by default even if you choose a language specification where it does not exist.

As if that weren’t strange enough, GCC’s option to flag its custom additions is called “-pedantic,” and even that only produces warnings; rejecting the code outright takes “-pedantic-errors.” Uh, I don’t think it’s pedantic if I want the language to conform to the actual standard. As much as I like GCC and its custom extensions (many of which are pretty cool), they should be opt-in, not opt-out, in compliance modes.

I frequently see people say “Microsoft’s stupid compiler doesn’t conform to ANSI.” Well, after this, I’m not so convinced GCC is innocent either.

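That said, GCC’s nested function feature and its implementation are both very cool. Judging from the disassembly, GCC generates a small piece of code on the stack at run time (a “trampoline”) so that the nested function can be called through an ordinary function pointer. Simple example: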

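#include <stdio.h>

/* Calls g through a plain function pointer. */
int blah(int (*g)(void))
{
    return g();
}

int main(void)
{
    int i = 0;

    /* GCC extension: a nested function that uses main's local 'i'. */
    int g(void)
    {
        return ++i;
    }

    g();
    blah(g);    /* taking g's address is what requires the stack trampoline */

    printf("%d\n", i);    /* prints 2 */
    return 0;
}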

Moving On

It’s been an interesting eight years. I started playing Half-Life Deathmatch in 2000, and picked up Counter-Strike 1.0 in 2001. From 2002 to 2003 I ran a few servers and did a little scripting. The serious projects started in 2004.

What began as a craving to take part in open source software ballooned into something that has consumed the past four years of my life. Certainly I have gained a tremendous amount of experience, and for that reason alone I do not regret the choices I’ve made. Unfortunately I’ve also had to make sacrifices. The details are boring; suffice it to say that it is taking longer for me to graduate, and I don’t have much of a social life. Ah, youthful optimism, blind dedication, and a strong sense of commitment.

With no regrets, I have come to the realization that it is time for me to move on.

I am going to try and slowly drift out of the gaming scene. I’ll keep most of the reasons to myself. They fall out of this discussion’s scope, and I don’t want to burn too many bridges. There is one reason, however, which will strike home to a few people. The world of server-side Half-Life development is just too limited.

Valve is part of the problem. Releases are almost never tested, and there is little warning before frequent compatibility breaks. The API is poorly documented (if at all). Arbitrary limitations in the SDK severely limit creativity. Debugging crashes is difficult; there are no symbols. Trying to coerce Valve into making even minor changes is impossible. After all, Valve wants developers making full games against the SDK. They want the next Counter-Strike. They want a game that can make money. A server-side plugin can’t do that directly.

Although my open-source fanaticism has lightened over the years, I am still very much an open-source appreciator at heart. Valve tends to not be friendly toward open-source. They write shoddy platform code that barely works on Linux. They don’t even allow redistribution of derivative SDK works in source form.

While I like to blame Valve for everything, the fact of the matter is that it’s just the nature of game development. It’s an extremely competitive and proprietary environment. Server-side developers are forced to hack against a black box, fighting an uphill battle against the very games they’re trying to promote.

Alas, what it comes down to is that the game isn’t my product. No matter what I think may be best, it’s someone else’s game. It’s a closed system, run by a closed company, and that’s their decision. I can lament about it, but I can’t fault them for it — they spent their time and money to make it.

By moving on, I can explore areas of software engineering where there are fewer, if any, limitations.

Where do I go from here? I don’t want to be too specific publicly yet, but my next project will involve programming language implementation. You can ask me in private if you’re interested.

In the meantime, I will keep maintaining the projects I’ve started, albeit at a lighter pace. AMX Mod X will get one last planned bug-fix release from me. I will maintain things that break from Valve updates. SourceMod, for the most part, is already released in trickled updates, although I will certainly continue to improve it as I have the time and energy. Other than that, I’ll continue to hang out in IRC and post here. It’s just not my style to completely abandon the community.

Although I have made some negative remarks about server-side game development, I would like to say that it’s still an excellent and often fun way to learn programming. I hope that people have enjoyed, and continue to enjoy, working with AMX Mod X and SourceMod as much as we have enjoyed developing them.
