IA32/x86 and GCC’s fPIC

Lately Valve has started using GCC’s -fPIC option to compile their Linux binaries, and I remain unconvinced that this is a good idea.

The purpose of fPIC is to generate position-independent code: code that references data without needing its own bytes relocated at load time. Instead of referencing data by its absolute address, the code references it by an offset from the program counter. In and of itself, it’s not a bad idea.

My observation on fPIC is that its usefulness varies depending on the platform. AMD64 has a built-in mechanism for referencing memory as an offset from the program counter (RIP-relative addressing). This makes generating PIC nearly trivial, and can reduce generated code size because you don’t need full 64-bit address references. On the other hand, it can actually complicate relocation. Since the offsets are signed 32-bit values, the data cannot be placed more than 2GB away from the code. That’s a minor problem for loaders, but certainly a nastier problem for people implementing detours and the like.
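To make the 2GB limit concrete, here is a minimal sketch of the reachability check a detour or loader would need. The function names `displacement` and `fits_rel32` are made up for illustration; the only assumption is that the offset is a signed 32-bit value measured from the address of the next instruction.

```cpp
#include <cstdint>

// Signed distance from the end of the referencing instruction to the target.
// Unsigned subtraction wraps, so this also works when target < next_ip.
constexpr std::int64_t displacement(std::uint64_t next_ip, std::uint64_t target) {
    return static_cast<std::int64_t>(target - next_ip);
}

// True if the target is reachable with a signed 32-bit RIP-relative offset.
constexpr bool fits_rel32(std::uint64_t next_ip, std::uint64_t target) {
    return displacement(next_ip, target) >= INT32_MIN &&
           displacement(next_ip, target) <= INT32_MAX;
}
```

For example, assuming the 7-byte instruction at 0x400560 in the AMD64 PIC listing further down ends at 0x400567, `displacement(0x400567, 0x500910)` gives 1049513, which matches the offset GDB prints there.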

So, what about x86? It has no PC-relative addressing for data. In fact, it doesn’t even have an instruction to read the program counter (EIP) directly! Let’s take a simple C++ code snippet, and look at the part of the disassembly that stores to g_something:

int g_something = 0;
int do_something(int x)
{
    g_something = x;
    return ++g_something;
}

With GCC flags “-O3” I get this assembly routine:

0x080483d7 <_Z12do_somethingi+7>:       mov    ds:0x804960c,eax

With GCC flags “-fPIC -O3” I get this:

0x0804849a <__i686.get_pc_thunk.cx+0>:  mov    ecx, [esp]
0x0804849d <__i686.get_pc_thunk.cx+3>:  ret
0x08048441 <_Z12do_somethingi+1>:       call   0x8048496 <__i686.get_pc_thunk.cx>
0x08048446 <_Z12do_somethingi+6>:       add    ecx,0x12b6
0x08048451 <_Z12do_somethingi+17>:      mov    edx,DWORD PTR [ecx-0x8]
0x08048458 <_Z12do_somethingi+24>:      mov    DWORD PTR [edx],eax

The non-PIC version is one instruction. The PIC version is six, counting the two in the thunk. As if that weren’t bad enough, there’s an extra call and return added into the fray! Let’s look at what it’s doing:

  • The call instruction calls a routine which simply returns the value at [esp]. The value at [esp] is the return address. This is a fairly inefficient way to get the program counter, but (as far as I know) the only way on x86 while avoiding relocation.
  • A constant offset is added to EIP. The new address points to the global offset table, or GOT. The GOT is a big table of addresses, each entry being the address of an item in the data section. The entries in this table must be patched (relocated) by the loader; the code, consequently, does not.
  • The actual address of the data is obtained by loading the GOT entry.
  • Finally, the value can be stored in the data’s memory.
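The four steps above can be modeled in plain C++. This is only a sketch under loud assumptions: `FakeModule`, `fake_load`, `get_pc`, `kGotOffset`, and `do_something_pic` are hypothetical names, and the struct layout merely stands in for a real ELF image. It shows the point of the scheme: the “loader” patches only the GOT slot, and the same unmodified code works wherever the module lives.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a loaded module. The GOT sits at a fixed distance
// from the code, but the module as a whole can live at any address.
struct FakeModule {
    unsigned char code[64];  // stand-in for the text section
    int* got[1];             // stand-in for the GOT; slot 0 -> g_something
    int g_something;         // stand-in for the data section
};

// Link-time constant: distance from the "code" to the GOT
// (the analog of the 0x12b6 added to ecx in the listing).
constexpr std::size_t kGotOffset = offsetof(FakeModule, got);

// What the loader does: patch GOT entries and leave the code untouched.
void fake_load(FakeModule& m) {
    m.g_something = 0;
    m.got[0] = &m.g_something;
}

// Step 1 analog: the thunk's job is to report where the code is running.
unsigned char* get_pc(FakeModule& m) {
    return reinterpret_cast<unsigned char*>(&m);
}

// Steps 2 to 4, written against `pc` only; no absolute address appears here.
int do_something_pic(unsigned char* pc, int x) {
    int** got = reinterpret_cast<int**>(pc + kGotOffset);  // pc + constant -> GOT
    int* target = got[0];                                  // load the GOT entry
    assert(target != nullptr && "GOT slot must be patched by the loader first");
    *target = x;                                           // store through it
    return ++*target;
}
```

Loading two `FakeModule` instances at different addresses and calling `do_something_pic(get_pc(m), x)` on each updates each module’s own `g_something`, at the cost of an extra pointer load on every access.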

Meanwhile, let’s look at the AMD64 versions. I apologize for the ugly AT&T syntax; GDB won’t show RIP-relative addressing in Intel mode.

PIC version:

0x0000000000400560 <_Z12do_somethingi+0>:       mov    1049513(%rip),%rdx        # 0x500910 <_DYNAMIC+448>
0x000000000040056a <_Z12do_somethingi+10>:      mov    %eax,(%rdx)

Non-PIC version:

0x0000000000400513 <_Z12do_somethingi+3>:       mov    %eax,1049587(%rip)        # 0x50090c <g_something>

Although there’s still one extra instruction, that’s a lot more reasonable. So, why would anyone generate fPIC code on x86?

Supposedly, without any relocations in the code, the operating system can keep one central, unmodified copy of a library’s code in memory and share it between processes. To me, this seems like a pretty meaningless advantage. Unless you’ve got 4MB of memory, chances are you have plenty of it (especially if you’re running Half-Life 1/2 servers). Also, the cost of relocation should be considered a negligible one-time expense. If it weren’t, it’d mean you were probably doing something silly like loading a shared library quickly and repeatedly.

My thoughts on this matter are shared by the rest of the AlliedModders developers: don’t use GCC’s fPIC. On x86 the generated code is a lot uglier and slower because the processor doesn’t facilitate such addressing. On AMD64 the difference is small, but even so, as far as I know, Microsoft’s compiler never uses such a crazy scheme. Microsoft uses absolute addressing on x86 and RIP-relative addressing on AMD64, and at least on x86 (I’m willing to bet on AMD64 as well), they’ve never used a global offset table for data.

Conclusion: Save yourself the run-time expense. Don’t use GCC’s fPIC on x86. If you’ve got a reason explaining otherwise, I’d love to hear it. This issue has been eating at me for a long time.

(Note: Yes, we told Valve about this. Of course, it was ignored, but that no longer bothers me.)

15 thoughts on “IA32/x86 and GCC’s fPIC”

  1. dvander Post author

    I don’t think so. It seems to be recommended a lot, but you can build shared libraries without -fPIC on x86 and they’ll work fine.

