The Hell that MOVAPS Hath Wrought

One of the most difficult bugs we tracked down in SourceMod was a seemingly random crash bug. It occurred quite often in CS:S DM and GunGame:SM, but only on Linux. The crash usually looked like this, although the exact callstack and final function varied:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1209907520 (LWP 5436)]
0xb763ed87 in CPhysicsTrace::SweepBoxIVP () from bin/vphysics_i486.so
(gdb) bt
#0  0xb763ed87 in CPhysicsTrace::SweepBoxIVP () from bin/vphysics_i486.so
#1  0xb7214329 in CEngineTrace::ClipRayToVPhysics () from bin/engine_i686.so
#2  0xb7214aad in CEngineTrace::ClipRayToCollideable () from bin/engine_i686.so
#3  0xb72156cc in CEngineTrace::TraceRay () from bin/engine_i686.so

This crash occurred quite often in normal plugins as well. Finally, one day we were able to reproduce it by calling TraceRay() directly. However, it would only crash from a plugin. The exact same code worked fine if the callstack was C/C++. But as soon as the call emanated from the SourcePawn JIT, it crashed. Something extremely subtle was going wrong in the JIT.

After scratching our heads for a while, we decided to disassemble the function in question — CPhysicsTrace::SweepBoxIVP(). Here is the relevant crash area, with the arrow pointing toward the crashed EIP:

   0xb7667d7c :        mov    DWORD PTR [esp+8],edi
   0xb7667d80 :        lea    edi,[esp+0x260]
-> 0xb7667d87 :        movaps XMMWORD PTR [esp+48],xmm0
   0xb7667d8c :        mov    DWORD PTR [esp+0x244],edx

We generated a quick crash and checked ESP in case the stack was corrupted. It wasn’t, and the memory was both readable and writable. So what does the NASM manual say about MOVAPS?

When the source or destination operand is a memory location, it must be aligned on a 16-byte boundary. To move data in and out of memory locations that are not known to be on 16-byte boundaries, use the MOVUPS instruction.

Aha! GCC decided that it was safe to use MOVAPS instead of MOVUPS because it assumes the stack is aligned to a 16-byte boundary at every function call, a convention it maintains in all of the code it generates. This is a good example of where a compiler’s assumptions don’t take external callers into account. I don’t know exactly how GCC determines when to make this optimization, but for all intents and purposes, it’s reasonable.
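To make the alignment rule concrete, here is a minimal sketch in C of the check the CPU effectively performs on a MOVAPS memory operand. The helper names and the ESP values are made up for illustration; they are not taken from the actual crash. Since the displacement in the faulting instruction (48) is itself a multiple of 16, the store can only succeed when ESP is 16-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MOVAPS faults unless its memory operand address is a multiple of 16. */
#define XMM_ALIGNMENT 16u

/* Hypothetical helper: would `movaps [esp + disp], xmm0` be legal
 * for the given stack pointer value? */
bool movaps_operand_ok(uintptr_t esp, uintptr_t disp)
{
    return ((esp + disp) % XMM_ALIGNMENT) == 0;
}

/* Illustrative ESP values only; these are assumptions, not real data. */
void movaps_examples(void)
{
    /* 16-byte aligned ESP: [esp+48] is aligned, the store succeeds. */
    assert(movaps_operand_ok((uintptr_t)0xbfffe000u, 48));
    /* ESP off by 4 bytes: [esp+48] is misaligned, MOVAPS would fault. */
    assert(!movaps_operand_ok((uintptr_t)0xbfffe004u, 48));
}
```

This also suggests why checking that the memory was readable and writable told us nothing: the fault comes from the address itself, not from the page protections.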

The SourcePawn JIT, of course, was making no effort to keep the stack 16-byte aligned for GCC. That’s mostly because the JIT is a 1:1 translation of the compiler’s opcodes, which are processor-independent. As a fix, faluco changed the JIT to align the stack before handing control to external functions.
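The arithmetic behind that kind of fix can be sketched as follows. This is not faluco's actual change to the JIT, just an illustration in C under the assumption that the callee expects ESP to be 16-byte aligned at the point of the call, after the arguments have been pushed. The function name and the addresses in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_ALIGNMENT ((uintptr_t)16)

/* Hypothetical helper: returns the value the stack pointer should have
 * *before* pushing arg_bytes of arguments, so that it ends up 16-byte
 * aligned at the call instruction. The stack grows downward, so the
 * result is never above the original sp. */
uintptr_t sp_before_args(uintptr_t sp, size_t arg_bytes)
{
    assert(sp >= arg_bytes);
    /* Where the stack pointer would land after the pushes, rounded down. */
    uintptr_t at_call = (sp - arg_bytes) & ~(STACK_ALIGNMENT - 1);
    /* Step back up by the argument size to get the pre-push value. */
    return at_call + arg_bytes;
}
```

For example, with a stack pointer of 0xbfffe008 and 12 bytes of arguments, the helper returns 0xbfffdffc: the caller drops the stack pointer by 12 bytes of padding, pushes the arguments, and arrives at 0xbfffdff0, which is 16-byte aligned. The original stack pointer has to be saved somewhere (a callee-saved register, for instance) so it can be restored after the call.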

Suddenly, an entire class of bugs disappeared from SourceMod forever. It was a nice feeling, but at least a week of effort was put into tracking it down. The moral of this story is that source-level debugging for “impossible crashes” is usually in vain.

6 thoughts on “The Hell that MOVAPS Hath Wrought”

  2. Tom

    Hah! I’m writing a JIT compiler and after hours of searching I found out that it happened at movaps. After that it still took me a little while, because I assumed it was a corrupted base pointer (because it was moving a register into memory relative to RBP). Then it finally hit me that these were 128-bit registers! After doing a Google search on “movaps segfault” I found out I was not the only one :)

    Quick question, how did you align the stack pointer? I wrote a quick and ugly ‘wrapper’ that does
    sub $0x10, %rsp
    and $0xfffffffffffffff0, %rsp
    before calling external functions.

