Monthly Archives: October 2007

Using the Wrong Return Value

One thing I continually see which bugs me is using the wrong return value from a function. For example, take the following function:

Select All Code:
 * Does something.
 * @param error		On failure, a null-terminated error message is printed to this buffer.
 * @param maxlength	Maximum length, in bytes, of the message buffer.
 * @return		True on success, false on failure.
bool DoSomething(char *error, size_t maxlength);

One thing people will start doing is this:

Select All Code:
char error[255];
error[0] = '\0';
DoSomething(error, sizeof(error));
if (error[0] != '\0')

This is a subtle logical fallacy. The documentation says that “failure implies a null-terminated string.” That proposition doesn’t imply that “failure implies a string of length greater than 0” or that “success implies a string of length zero.” It simply says that if the function fails, the buffer will contain a valid string.

Of course, eliminating the zeroing of the first byte would make this code completely wrong. As it stands, the code will only be incorrect when DoSomething() returns false and has no error message available.

Perhaps, the documentation could be clearer. For example:

Select All Code:
 * Does something.
 * @param error		On failure, a null-terminated error message is printed to this buffer.  If 
 *			no error message is available, an empty string is written.  The buffer is 
 *			not modified if DoSomething() returns successfully.
 * @param maxlength	Maximum length, in bytes, of the message buffer.
 * @return		True on success, false on failure.
bool DoSomething(char *error, size_t maxlength);

However, to the astute reader, this is just clarifying the obvious: the function only modifies the input buffer when it says it will, and an empty string is a valid, null-terminated string.

Making this mistake was very common in AMX Mod X, where return values were often ill-defined, non-uniform, and coupled with strange tags. Another case of this is through callbacks. SourceMod has a callback that essentially looks like:

Select All Code:
public SomethingHappened(Handle:hndl1, Handle:hndl2, const String:error[], any:data);

The rules for determining an error depend on the inputs and values of hndl1 and hndl2. These are well-defined. However, if you don’t read these rules, or simply ignore them, it’s tempting to just check the contents of error. Unfortunately, the API does not specify that checking the error buffer is a valid method of checking for an error, and your assumption could potentially lead to a bug.

Conclusion: If a function has a return value that definitively determines success or failure, use it. Don’t rely on testing values that are merely secondary in the failure process.

Backwards Compatibility Means Bugs Too

Sometimes backwards compatibility means preserving bugs. When working on AMX Mod X 1.8.0, we found a very nasty bug in the code for creating new-menus.

For the unfamiliar, new-menus are a replacement for the original menu code (now dubbed “old-menus”). They were designed to be much simpler to use, though the basics are similar:

  • menu_create() binds a new-menu to a new-menu callback.
  • register_menucmd() binds an old-menu to an old-menu callback.

The two callbacks are very different: new-menu callbacks have three parameters (menu, client, item). Old-menu callbacks have two parameters (client, keypress). They are mutually exclusive.

Yet, there was a serious, accidental bug in AMX Mod X 1.7x — menu_create called register_menucmd on the same callback it used for its own callback. Effectively, your new-menu callback could receive random, invalid data because it was being invoked as an old-menu handler and a new-menu handler. This created almost untraceable and inexplicable bugs in a few plugins that had ventured into new-menu territory.

Needless to say, the code was removed very fast. Then when it came time to beta test 1.8.0, we received two reports that a specific plugin had stopped working. One of its menus was broken. After successfully reproducing the problem, I opened the plugin’s source code and found why.

The plugin was calling menu_create on callback X. However, it never called any of the other relevant new-menu functions. Instead, it used old-menus to actually draw and display the menu. Effectively, the author had replaced register_menucmd with menu_create. He probably intended to rewrite the rest of his code as well, but forgot. After all, his functionality continued to work as normal! So when AMX Mod X 1.8.0 removed this bug, his plugin broke.

Deciding whether to insert backward compatibility shims is always tough. Generally, we always want plugins to work on future versions, no matter what. There is a rough checklist I went through.

  • How many plugins are affected by the change? One.
  • How many people are affected by the change? 100-200. Punishing one author punishes all of these users.
  • How much code exists that is affected by the change? Unknown, assumed minimal.
  • Is the compatibility change a design decision, or a bug fix? How severe? Severe bug fix.
  • Is the break occurring from valid API usage or invalid API usage? If invalid, is it a mistake that is pervasive? Invalid usage localized to one line in one plugin.
  • Is backwards compatibility possible? No, the flaw is too severe to revert in any form.
  • Are any of the affected plugins no longer being maintained? No.

Given the above checklist, we decided to allow for this one plugin to receive reverted functionality. menu_create will call register_menucmd if and only if the plugin matches X and the plugin’s version matches Y. We notified the author saying, “if you release version Y+1, it will break on future AMX Mod X versions, so you will want to fix it.”

The conclusion: backwards compatibility sometimes means preserving bugs. Unfortunately, this gives new meaning to bugs being “undocumented features.” In the end, choosing whether to preserve bugs (or indeed any compatibility) depends on how many users you want to affect and how badly you want to affect them.