GCC is Scary

Markus | Tuesday, May 23rd 2017, 19:58

-- Debugging an application which only works in the debug build

Figure 1. The error message I was confronted with

Recently, I had an interesting bug in one of my C applications: it worked in the debug build but not in the release build (-O3 etc.). It turned out to be a really stupid mistake of mine, the kind where you go 'How could that ever work?!' once you find it. However, it opened my eyes to how powerful GCC's optimizer really is.

In this article, I will explain the mistake I made, how I went about solving it and what I learned about gcc in the process.

Finding the Problem

Usually, one would use a debugger to track the issue down to a specific line of code. That, however, is not possible in this case, as the program only fails in the release build, which contains no (or at least less) debug information. Therefore, I added some printfs to the code to roughly locate the error. That led me to a function which returned NULL even though it returned a valid pointer in the debug build.

That function was the following "constructor". It allocates some memory for a struct, initializes it and returns the pointer (can you spot the error already?):

struct mdiary_scanner *mdiary_scanner_new(gboolean gpg_enabled)
{
    struct mdiary_scanner *mdiary_scanner;

    mdiary_scanner = malloc(sizeof(struct mdiary_scanner));
    if (mdiary_scanner) {
        mdiary_scanner->time_earliest     = G_MAXINT64;
        mdiary_scanner->time_latest       = 0;
        mdiary_scanner->entries_added     = 0;
        mdiary_scanner->gpg_enabled       = gpg_enabled;
        mdiary_scanner->entries_failed    = 0;
        mdiary_scanner->entries_encrypted = 0;
    }
}

So, to debug it, I went ahead and looked at its disassembly in both the release build and the debug build using radare2.

Release:

markus@Core-i7:~/projects/c/build-mdiary-Desktop-Release$ r2 mdiary -A
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze len bytes of instructions for references (aar)
[x] Analyze function calls (aac)
[ ] [*] Use -AA or aaaa to perform additional experimental analysis.
[x] Constructing a function name for fcn.* and sym.func.* functions (aan))
 -- ...
[0x00404780]> pdf @ sym.mdiary_scanner_new
/ (fcn) sym.mdiary_scanner_new 2
|   sym.mdiary_scanner_new ();
|              ; CALL XREF from 0x00406df6 (sym.mainWindow_switch_workspace)
\           0x00407960      f3c3           ret
[0x00404780]> 

What, where did my function go?! Only the return statement is left! *Staring intensely at the C-Code* *Adding some debug printfs to print pointer values* *What? NULL?!* *More staring...*

→ Ohh I forgot the return statement!! After adding that, everything worked perfectly fine.
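
For completeness, here is a minimal sketch of the fixed constructor. The struct definition is a stand-in so that the snippet compiles on its own: the field names are the ones used above, but the types and their order are assumptions, and plain C types replace GLib's gboolean and G_MAXINT64.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the real struct. Field names come from the function above;
 * the types and their order are assumptions (the original uses GLib types). */
struct mdiary_scanner {
    int     gpg_enabled;        /* gboolean in the original */
    int64_t time_earliest;
    int64_t time_latest;
    int     entries_added;
    int     entries_failed;
    int     entries_encrypted;
};

struct mdiary_scanner *mdiary_scanner_new(int gpg_enabled)
{
    struct mdiary_scanner *mdiary_scanner;

    mdiary_scanner = malloc(sizeof(struct mdiary_scanner));
    if (mdiary_scanner) {
        mdiary_scanner->time_earliest     = INT64_MAX; /* G_MAXINT64 in the original */
        mdiary_scanner->time_latest       = 0;
        mdiary_scanner->entries_added     = 0;
        mdiary_scanner->gpg_enabled       = gpg_enabled;
        mdiary_scanner->entries_failed    = 0;
        mdiary_scanner->entries_encrypted = 0;
    }
    /* The line that was missing. Yields NULL if malloc failed. */
    return mdiary_scanner;
}
```

The caller still has to check for NULL and eventually free() the struct.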

Why Did It Work in the Debug Build?!

Only one thing left to ask: How could that ever work?! Let's take a look at the debug build assembly for that.

/ (fcn) sym.mdiary_scanner_new 65
|   sym.mdiary_scanner_new ();
|              ; CALL XREF from 0x00406df6 (sym.mainWindow_switch_workspace)
|           0x00407960      53             push rbx
|           0x00407961      89fb           mov ebx, edi
|           0x00407963      bf28000000     mov edi, 0x28               ; '('
|           0x00407968      e823cbffff     call sym.imp.malloc        ;  void *malloc(size_t size)
|           0x0040796d      4885c0         test rax, rax
|       ,=< 0x00407970      742d           je 0x40799f
|       |   0x00407972      48baffffffff.  movabs rdx, 0x7fffffffffffffff ; -1
|       |   0x0040797c      48c740100000.  mov qword [rax + 0x10], 0
|       |   0x00407984      c74018000000.  mov dword [rax + 0x18], 0
|       |   0x0040798b      48895008       mov qword [rax + 8], rdx
|       |   0x0040798f      8918           mov dword [rax], ebx
|       |   0x00407991      c74020000000.  mov dword [rax + 0x20], 0
|       |   0x00407998      c7401c000000.  mov dword [rax + 0x1c], 0
|       |      ; JMP XREF from 0x00407970 (sym.mdiary_scanner_new)
|       `-> 0x0040799f      5b             pop rbx
\           0x004079a0      c3             ret

Okay, first of all, it is not optimized away. Cool. So let's see why this function still returns the pointer. This will by no means be a tutorial on x86_64 assembly, but the important thing to know here is that pointer and integer return values are passed in register rax. This means that the malloc'ed pointer needs to be in rax when ret executes. malloc itself is a function (called at 0x00407968), which means the malloc'ed address is in rax after it returns. The "test rax, rax" then checks whether malloc returned NULL and, if so, skips the value initialization. Otherwise, we init the values (the many mov instructions, none of which touch rax) and then just return. No copying the result to the return register. Just by sheer luck, the malloc'ed pointer is still in rax when returning, which is why this worked in the first place. As far as the C standard is concerned, using the return value of a function that runs off its end without a return is undefined behavior, so the compiler owes me nothing here.

However, the truly interesting thing here is the "power" of GCC. I have this function, which calls malloc (which already changes memory!) and then even writes to that memory before just exiting. Even so, GCC noticed that none of this has any observable effect, since the pointer is never returned or stored anywhere, and removed all of it. An entire function, reduced to a single ret!
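
The same effect can be reproduced without any undefined behavior. The following is a hypothetical sketch (both function names are made up for illustration) with two functions that differ only in whether the pointer escapes. In the first, nothing outside the function can observe the allocation, so an optimizing compiler is allowed to drop the malloc, the stores and the free. In the second, the pointer is returned, so all of it has to stay. I would expect GCC at -O2 or -O3 to reduce the first one to little more than a ret, but that depends on the compiler version and flags, so check the disassembly yourself.

```c
#include <stdlib.h>

/* The buffer never leaves this function: no caller can tell whether the
 * malloc or the stores ever happened, so the optimizer may remove them.
 * (Freed here to keep the sketch leak-free; the function from the article
 * simply dropped the pointer.) */
void scratch_discarded(void)
{
    int *p = malloc(4 * sizeof *p);
    if (p) {
        p[0] = 1;
        p[1] = 2;
    }
    free(p); /* free(NULL) is a no-op, so no extra check is needed */
}

/* Same allocation, but the pointer escapes through the return value,
 * so the allocation and the stores are observable and must be kept. */
int *scratch_returned(void)
{
    int *p = malloc(4 * sizeof *p);
    if (p) {
        p[0] = 1;
        p[1] = 2;
    }
    return p;
}
```

Compiling this with and without optimization and running "pdf" on both symbols in radare2, as done above, should show the difference.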

That was the point at which I checked my CMake file and noticed that I was compiling with only very few warnings enabled. After adding -Wall (which includes -Wreturn-type), sure enough, the compiler told me about the missing return (*cough* and a few more *cough*).

Just for reference, the project in question is mdiary.

TLDR / Lessons learned:

Always compile with -Wall
GCC is scary

Tags: gcc software