GCC is Scary

Markus | Tuesday, May 23rd 2017, 19:58

-- Debugging an application which only works in the debug build

Figure 1. The error message I was confronted with

Recently, I had an interesting bug in one of my C applications where the application would work in the debug build but not in the release build (-O3 etc.). It turned out to be a really stupid mistake of mine, the kind where I went 'How could that ever work?!' after finding it. It did, however, open my eyes to how aggressive GCC's optimizer really is.

In this article, I will explain the mistake I made, how I went about solving it and what I learned about gcc in the process.

Finding the Problem

Usually, one would use a debugger to track the issue down to a specific line of code. That, however, is not possible in this case, as the program only fails in the release build, which contains no (or at least less) debug information. Therefore, I added some printfs to the code to roughly locate the error. That led me to a function which returned NULL even though it returns a valid pointer in the debug build.

That function was the following "constructor". It allocates some memory for a struct, initializes it and returns the pointer (can you spot the error already?):

    struct mdiary_scanner *mdiary_scanner_new(gboolean gpg_enabled)
    {
        struct mdiary_scanner *mdiary_scanner;

        mdiary_scanner = malloc(sizeof(struct mdiary_scanner));
        if (mdiary_scanner) {
            mdiary_scanner->time_earliest = G_MAXINT64;
            mdiary_scanner->time_latest = 0;
            mdiary_scanner->entries_added = 0;
            mdiary_scanner->gpg_enabled = gpg_enabled;
            mdiary_scanner->entries_failed = 0;
            mdiary_scanner->entries_encrypted = 0;
        }
    }

So, to debug it, I went ahead and took a look at the assembly of it for both the release build and the debug build using radare2.

Release:

    markus@Core-i7:~/projects/c/build-mdiary-Desktop-Release$ r2 mdiary -A
    [x] Analyze all flags starting with sym. and entry0 (aa)
    [x] Analyze len bytes of instructions for references (aar)
    [x] Analyze function calls (aac)
    [ ] [*] Use -AA or aaaa to perform additional experimental analysis.
    [x] Constructing a function name for fcn.* and sym.func.* functions (aan))
     -- ...
    [0x00404780]> pdf @ sym.mdiary_scanner_new
    / (fcn) sym.mdiary_scanner_new 2
    |   sym.mdiary_scanner_new ();
    |           ; CALL XREF from 0x00406df6 (sym.mainWindow_switch_workspace)
    \           0x00407960      f3c3           ret
    [0x00404780]>

What, where did my function go?! Only the return statement is left! *Staring intensely at the C-Code* *Adding some debug printfs to print pointer values* *What? NULL?!* *More staring...*

→ Ohh I forgot the return statement!! After adding that, everything worked perfectly fine.
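For reference, the fixed constructor might look roughly like the sketch below. To keep it compilable on its own, it makes a few assumptions that are not from the project: the struct definition is hypothetical (the field names come from the code above, but their types and order are guesses), and the GLib names gboolean and G_MAXINT64 are swapped for plain int and INT64_MAX.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: field names come from the constructor above,
 * but their types and order are assumptions made for this sketch. */
struct mdiary_scanner {
    int     gpg_enabled;        /* gboolean in the original (GLib) */
    int64_t time_earliest;
    int64_t time_latest;
    int     entries_added;
    int     entries_encrypted;
    int     entries_failed;
};

struct mdiary_scanner *mdiary_scanner_new(int gpg_enabled)
{
    struct mdiary_scanner *mdiary_scanner;

    mdiary_scanner = malloc(sizeof(struct mdiary_scanner));
    if (mdiary_scanner) {
        mdiary_scanner->time_earliest = INT64_MAX;  /* G_MAXINT64 in the original */
        mdiary_scanner->time_latest = 0;
        mdiary_scanner->entries_added = 0;
        mdiary_scanner->gpg_enabled = gpg_enabled;
        mdiary_scanner->entries_failed = 0;
        mdiary_scanner->entries_encrypted = 0;
    }

    /* The line that was missing: hand the pointer (or NULL) back to the caller. */
    return mdiary_scanner;
}
```

A caller would check the result for NULL before using it and release it with free() when done.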

Why Did It Work in the Debug Build?!

Only one thing left to ask: How could that ever work?! Let's take a look at the debug build assembly for that.

    / (fcn) sym.mdiary_scanner_new 65
    |   sym.mdiary_scanner_new ();
    |           ; CALL XREF from 0x00406df6 (sym.mainWindow_switch_workspace)
    |           0x00407960      53             push rbx
    |           0x00407961      89fb           mov ebx, edi
    |           0x00407963      bf28000000     mov edi, 0x28               ; '('
    |           0x00407968      e823cbffff     call sym.imp.malloc         ; void *malloc(size_t size)
    |           0x0040796d      4885c0         test rax, rax
    |       ,=< 0x00407970      742d           je 0x40799f
    |       |   0x00407972      48baffffffff.  movabs rdx, 0x7fffffffffffffff ; -1
    |       |   0x0040797c      48c740100000.  mov qword [rax + 0x10], 0
    |       |   0x00407984      c74018000000.  mov dword [rax + 0x18], 0
    |       |   0x0040798b      48895008       mov qword [rax + 8], rdx
    |       |   0x0040798f      8918           mov dword [rax], ebx
    |       |   0x00407991      c74020000000.  mov dword [rax + 0x20], 0
    |       |   0x00407998      c7401c000000.  mov dword [rax + 0x1c], 0
    |       |   ; JMP XREF from 0x00407970 (sym.mdiary_scanner_new)
    |       `-> 0x0040799f      5b             pop rbx
    \           0x004079a0      c3             ret

Okay, first of all, it is not optimized away. Cool. So let's see why this function still returns the pointer. This will by no means be a tutorial on x86_64 assembly, but the important thing to know here is that a function's return value is expected to be in register rax when ret executes. This means that the malloc'ed pointer needs to be in rax when we leave this function. malloc itself is a function (called at 0x00407968), which means the malloc'ed address is in rax after it returns. The "test rax, rax" then checks whether malloc returned NULL and, if so, skips the value initialization. Otherwise, we initialize the values (the many mov instructions) and then just return. No copying of the result to the return register. By sheer luck, none of those instructions writes to rax, so the malloc'ed pointer is still in it when returning, which is why this worked in the first place.

However, the truly interesting thing here is the "power" of GCC. I have this function, which calls malloc (which does already change memory!) and then even writes to that memory before just exiting. Even though all that happens, GCC still noticed that the code has no observable effect, since the pointer never leaves the function, and removed it completely. An entire function!
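The effect can be reproduced in isolation. The sketch below uses two hypothetical functions that are not part of the project. In the first, the allocated pointer never leaves the function, so an optimizing compiler is free to drop both the store and the malloc call. In the second, the pointer is returned, so the work has to stay. Comparing the output of gcc -O3 -S for the two should show the difference, though the exact result depends on the GCC version and flags.

```c
#include <stdlib.h>

/* Hypothetical example: the pointer never escapes, so nothing this
 * function does is observable from outside. */
void alloc_without_effect(void)
{
    int *p = malloc(sizeof *p);
    if (p) {
        *p = 42;   /* dead store: nobody can ever read it */
    }
    /* No return and no free: p is simply lost (and leaks in builds
     * where the allocation is not optimized away). */
}

/* Hypothetical counterpart: returning the pointer makes the allocation
 * and the store visible to the caller, so they cannot be removed. */
int *alloc_with_effect(void)
{
    int *p = malloc(sizeof *p);
    if (p) {
        *p = 42;
    }
    return p;
}
```

The first function mirrors the buggy constructor: once the return is missing, the compiler sees allocation and initialization that nobody can ever observe.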

That was the point at which I checked my cmake file and noticed that I was compiling with only very few warnings enabled. After adding -Wall, sure enough, the compiler told me about the missing return (*cough* and a few more *cough*).

Just for reference, the project is this one.

TLDR / Lessons learned:

  • Always compile with -Wall
  • GCC is scary

Tags: gcc software