Fuzzing Image Parsing in Windows, Part Three: RAW and HEIF
Mandiant
Written by: Dhanesh Kizhakkinan
Automated testing of Windows Image Libraries uncovers 37 security issues, including Zero-Click Code Execution with CVSS score of 7.8. All vulnerabilities have been remediated by Microsoft following the disclosure by Mandiant.
Continuing our discussion of image parsing vulnerabilities in Windows, we take a look at two of the file types supported by Windows: RAW and High Efficiency Image File Format (HEIF).
RAW files have been supported by the Windows Camera Codec Pack since the Windows XP days, and later Windows versions include these codecs by default. The codec ships as WindowsCodecsRaw.dll in the system32 folder. The RAW codec is Windows Imaging Component (WIC) enabled, which means any program using WIC should be able to load and decode RAW images without any additional code. This enables Explorer to display thumbnails and previews and enables Microsoft.Photos.exe to display supported RAW images. Having WIC-enabled codecs helped us avoid writing a new fuzzing harness, because WIC and the Component Object Model (COM) handle the detection of file formats and the loading of the corresponding codecs. Even though this let us start fuzzing quickly, it came with a heavy performance tax (a 10x to 30x slowdown) due to the way COM works.
The codec supports RAW files from multiple camera manufacturers, including Canon, Casio, Epson, Fujifilm, Kodak, Konica Minolta, Leica, Nikon, Olympus, Panasonic, Pentax, Samsung, and Sony, among others. At first glance this looks like a very attractive attack surface given the number of supported camera models and manufacturers; unfortunately, most of the supported file formats are TIFF or based on the TIFF file format.
For fuzzing the RAW codec, we used a generic WIC harness with WindowsCodecsRaw.dll as the code coverage module. The corpus was collected from various public sources, and the file sizes were minimized at the cost of some code coverage. This was necessary because RAW files are large.
In our fuzzing, we found two integer overflow vulnerabilities in the library’s parsing of Canon CR2 format RAW files, and 35 vulnerabilities in the parsing of HEIF files. We present details of the most notable findings in this blog post.
CVE-2020-16968
CVE-2020-16968 is an integer overflow leading to a heap overflow while parsing a specially crafted Canon CR2 file.
Crash details are shown in Figure 1.
0:000> r
rax=0000000009aee38e rbx=000000000000008e rcx=00000203b8ab0580
rdx=0000000000000d07 rsi=0000000000000d34 rdi=00000203b7998f20
rip=00007ffcbd26c4e2 rsp=000000c6b258e370 rbp=000000c6b258e470
r8=000000000000008e r9=0000000000000000 r10=00000203b7a2eff0
r11=00000000000006c0 r12=000000000000036e r13=000000000000bbbb
r14=0000000000000658 r15=0000000000000e0c
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000206
WindowsCodecsRaw!CCanonSRawLoadRaw::lossless_jpeg_load_raw+0x1f2:
00007ffc`bd26c4e2 6644890c41 mov word ptr [rcx+rax*2],r9w ds:00000203`cc08cc9c=???
0:000> !heap -p -a @rcx
address 00000203b8ab0580 found in
_DPH_HEAP_ROOT @ 203a4231000
in busy allocation ( DPH_HEAP_BLOCK: UserAddr UserSize - VirtAddr VirtSize)
203a4295270: 203b8ab0580 135cca80 - 203b8ab0000 135ce000
00007ffd065c73ab ntdll!RtlDebugAllocateHeap+0x000000000000003b
00007ffd064e9745 ntdll!RtlpAllocateHeap+0x00000000000000f5
00007ffd064e73d4 ntdll!RtlpAllocateHeapInternal+0x00000000000006d4
00007ffcbd0ffb6d WindowsCodecsRaw!POBitmapImageRep::init+0x00000000000001f1
00007ffcbd0ff926 WindowsCodecsRaw!POBitmapImageRep::POBitmapImageRep+0x00000000000000aa
00007ffcbd269f88 WindowsCodecsRaw!CCanonSRawLoadRaw::GetBitmapImage+0x00000000000000d8
00007ffcbd269ea3 WindowsCodecsRaw!CCanonSRawLoadRaw::GetBitmapImage+0x0000000000000023
00007ffcbd18e19b WindowsCodecsRaw!CCanon1DMK4ImageRep::GetBitmapImageFromFile+0x00000000000002bb
00007ffcbd10d635 WindowsCodecsRaw!CRawImageRep::GetBitmapImageRep+0x0000000000000055
00007ffcbd0fe4fe WindowsCodecsRaw!RawImageGetBitmap+0x000000000000005e
00007ffcbd0f4b9a WindowsCodecsRaw!FSixFrameDecode::EnsureRawBitmap+0x0000000000000076
00007ffcbd0f4de8 WindowsCodecsRaw!FSixFrameDecode::RunPipelineInternal+0x0000000000000068
00007ffcbd0f4d60 WindowsCodecsRaw!FSixFrameDecode::RunPipeline+0x0000000000000104
00007ffcbd0f664a WindowsCodecsRaw!FSixFrameDecode::CopyPixels+0x00000000000003ca
0:000> k
# Child-SP RetAddr Call Site
00 000000c6`b258e370 00007ffc`bd26a016 WindowsCodecsRaw!CCanonSRawLoadRaw::lossless_jpeg_load_raw+0x1f2
01 000000c6`b258e4d0 00007ffc`bd269ea3 WindowsCodecsRaw!CCanonSRawLoadRaw::GetBitmapImage+0x166
02 000000c6`b258e5c0 00007ffc`bd18e19b WindowsCodecsRaw!CCanonSRawLoadRaw::GetBitmapImage+0x23
03 000000c6`b258e610 00007ffc`bd10d635 WindowsCodecsRaw!CCanon1DMK4ImageRep::GetBitmapImageFromFile+0x2bb
04 000000c6`b258e7a0 00007ffc`bd0fe4fe WindowsCodecsRaw!CRawImageRep::GetBitmapImageRep+0x55
05 000000c6`b258e7d0 00007ffc`bd0f4b9a WindowsCodecsRaw!RawImageGetBitmap+0x5e
06 000000c6`b258e870 00007ffc`bd0f4de8 WindowsCodecsRaw!FSixFrameDecode::EnsureRawBitmap+0x76
07 000000c6`b258e8a0 00007ffc`bd0f4d60 WindowsCodecsRaw!FSixFrameDecode::RunPipelineInternal+0x68
08 000000c6`b258f030 00007ffc`bd0f664a WindowsCodecsRaw!FSixFrameDecode::RunPipeline+0x104
09 000000c6`b258f0d0 00007ffc`fc73ceb3 WindowsCodecsRaw!FSixFrameDecode::CopyPixels+0x3ca
Figure 1: RAW codec crash
The root cause of the bug can be found by looking at the allocation of the object at WindowsCodecsRaw!POBitmapImageRep::init in Figure 2.
void __fastcall POBitmapImageRep::init(
POBitmapImageRep *this,
unsigned int swidth,
unsigned int sheight,
...
...
)
{
...
unsigned int allocSize; // ebx
HANDLE ProcessHeap; // rax
LPVOID allocMem; // rax
...
// validations
if ( swidth > 0x10000
|| sheight > 0x10000
...)
{
ATL::AtlThrowImpl(0x80070057);
}
...
...
swidthX2 = swidth * two_2;
...
...
allocSize = swidthX2 * sheight; // integer overflow
ProcessHeap = GetProcessHeap();
allocMem = HeapAlloc(ProcessHeap, 0, allocSize);
*(this + 12) = allocMem;
...
}
Figure 2: Vulnerable function
POBitmapImageRep::init accepts multiple heights and widths associated with camera sensors and performs only a cursory validation of the values it accepts. These values are retrieved from the image file and are therefore user controllable. The vulnerability arises because, even with the initial validation, a 32-bit multiplication of the sensor height and width can overflow and produce a heap allocation smaller than needed. Additionally, POBitmapImageRep::init is a generic allocation function used by multiple RAW file formats from different manufacturers, so it is very likely that other file formats can trigger the same vulnerability.
Sample values which cause integer overflow are provided in Figure 3.
swidth = 0xbbc0;
sheight = 0xbbbb;
swidthX2 = swidth * 2; // 0x17780
allocSize = swidthX2 * sheight; // 0x17780 * 0xbbbb => 0x(1)135cca80
allocMem = HeapAlloc(ProcessHeap, 0, allocSize); // 0x135cca80 allocated
Figure 3: Integer overflow calculations
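The arithmetic in Figure 3 can be reproduced with a minimal C++ sketch. The function names below are hypothetical and chosen for illustration; only the factor of 2 and the sample width and height come from the figures. The wrapped result equals the UserSize (135cca80) reported by the page heap output in Figure 1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the 32-bit computation from Figure 3.
// Unsigned arithmetic wraps modulo 2^32, so the high bits are silently lost.
std::uint32_t wrapped_alloc_size(std::uint32_t swidth, std::uint32_t sheight) {
    std::uint32_t swidthX2 = swidth * 2u;  // factor of 2 as in Figure 3
    return swidthX2 * sheight;             // may wrap around
}

// Hypothetical helper: the same product computed in 64 bits, without truncation.
std::uint64_t required_alloc_size(std::uint32_t swidth, std::uint32_t sheight) {
    return static_cast<std::uint64_t>(swidth) * 2u * sheight;
}

// Usage with the sample values from Figure 3.
void example_overflow() {
    // The allocation request is truncated to 32 bits...
    assert(wrapped_alloc_size(0xbbc0, 0xbbbb) == 0x135cca80u);
    // ...while the loops that fill the buffer need the full 33-bit size.
    assert(required_alloc_size(0xbbc0, 0xbbbb) == 0x1135cca80ull);
}
```

The gap between the two values is exactly 4 GiB, which is why the write in Figure 1 lands far beyond the end of the allocation.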
The heap memory allocated by POBitmapImageRep::init is used to store data from the image file. The data is written by two nested for loops that iterate over rows and columns. The underlying code in CCanonSRawLoadRaw::lossless_jpeg_load_raw is a C++ port of the corresponding function from the dcraw library.
Variant Analysis
A variant of this vulnerability can be easily found by cross-referencing HeapAlloc usage in other parts of the code. CCanonSRawLoadRaw::GetBitmapImage contained a similar pattern of an integer overflow leading to a heap overflow. Microsoft decided to patch this as a separate vulnerability, CVE-2020-0997.
Patch
Microsoft patched both integer overflows by promoting the multiplication to 64-bit and bailing out if the result does not fit in 32 bits (that is, if it is 4 GiB or larger), as shown in Figure 4.
allocSize64 = swidthX2 * (unsigned __int64)sheight;
if ( allocSize64 > 0xFFFFFFFF )
{
ATL::AtlThrowImpl(0x80070057);
}
allocSize = allocSize64;
ProcessHeap = GetProcessHeap();
allocMem = HeapAlloc(ProcessHeap, 0, allocSize);
Figure 4: CVE-2020-16968 patch
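The pattern in Figure 4 can be sketched as a small standalone helper. This is an illustration under stated assumptions, not Microsoft's code: the function name and the bool-plus-out-parameter interface are hypothetical, and the sketch returns false where the patched code throws through ATL::AtlThrowImpl.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring Figure 4: widen one operand to 64 bits before
// multiplying, then reject any result that does not fit in 32 bits.
bool checked_alloc_size(std::uint32_t swidthX2, std::uint32_t sheight,
                        std::uint32_t* allocSize) {
    const std::uint64_t allocSize64 =
        static_cast<std::uint64_t>(swidthX2) * sheight;  // cannot overflow 64 bits
    if (allocSize64 > 0xFFFFFFFFull) {
        return false;  // the patched code raises 0x80070057 (E_INVALIDARG) here
    }
    *allocSize = static_cast<std::uint32_t>(allocSize64);
    return true;
}

// Usage: the overflowing values are now rejected, small ones pass through.
void example_checked() {
    std::uint32_t size = 0;
    assert(!checked_alloc_size(0x17780, 0xbbbb, &size));
    assert(checked_alloc_size(0x100, 0x100, &size) && size == 0x10000u);
}
```

The cast must be applied to an operand before the multiplication; casting the 32-bit product afterwards would widen a value that has already wrapped.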
HEIF
High Efficiency Image File Format (HEIF) is a newer and more dynamic container format for images and image sequences based on the ISO Base Media File Format (ISOBMFF). Windows 10 supports HEIF through the HEIF Image Extension Windows Store app. The HEIF Image Extension is also WIC enabled, allowing us to reuse the existing harness to reach the HEIF decoders without any changes to our fuzzer.
As HEIF is a container format, it supports multiple compression formats, namely:
- Advanced Video Coding (AVC) in HEIF (AVCI)
- High Efficiency Video Coding (HEVC) in HEIF (HEIC)
- AOMedia Video 1 (AV1) in HEIF (AVIF)
The HEIF Image Extension from the Windows Store supports basic HEIF containers with AVC support. To decode HEVC or AV1, additional extensions must be installed. While the AV1 extension support is based on the open source libavif, HEVC seems to be Microsoft's proprietary code base and requires a payment of $0.99 to install the extension (likely due to complex HEVC licensing terms). All of the extensions are WIC enabled and HEIF loads the corresponding extensions based on the detected file format, eliminating the need for a new harness.
For fuzzing HEIF, I decided to gather three separate corpus sets based on their file headers and fuzz them based on the likelihood of finding vulnerabilities. I spent the least amount of time fuzzing AVIF, because its support is based on the libavif code, which is regularly fuzzed by OSS-Fuzz and other researchers. I spent a moderate amount of time on the AVC extension, given the prevalence of the AVC codec, and most of my time on the HEVC extension, because it is a relatively new codec.
Fuzzing these codecs ended up more fruitful than I could have imagined. After triaging hundreds of crashes, a total of 35 vulnerabilities were reported to the Microsoft Security Response Center (MSRC), resulting in 21 CVEs. Most of the crashes were from the HEVC extension and the fewest were from the AVIF extension. One interesting case is a heap use after free (UAF) that I found in the HEIF extension. An example crash of this UAF vulnerability (CVE-2020-17101) is presented in Figure 5.
0:000> r
rax=0000029dec4dff60 rbx=0000029dec4dff60 rcx=0000029de5b20000
rdx=0000029de5b20000 rsi=0000029de9751e50 rdi=0000029dec4c5e80
rip=00007ffcce89ca2e rsp=000000e19c8ff2f0 rbp=000000e19c8ff479
r8=0000029dec4ddf30 r9=0000000000000001 r10=0000000000000000
r11=000000e19c8ff1f0 r12=0000000000000000 r13=00007ffcce9bd708
r14=0000029dec4c3fd0 r15=000000e19c8ff558
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
msheif_store!DllGetActivationFactory+0x79cbe:
00007ffc`ce89ca2e 488b9890000000 mov rbx,qword ptr [rax+90h] ds:0000029d`ec4dfff0=????????????????
0:000> !heap -p -a @rax
address 0000029dec4dff60 found in
_DPH_HEAP_ROOT @ 29de5b21000
in free-ed allocation ( DPH_HEAP_BLOCK: VirtAddr VirtSize)
29deb61f340: 29dec4df000 2000
00007ffd065c7db4 ntdll!RtlDebugFreeHeap+0x0000000000000038
00007ffd0657c018 ntdll!RtlpFreeHeap+0x0000000000097bc8
00007ffd064e95c4 ntdll!RtlpFreeHeapInternal+0x0000000000000464
00007ffd064e5d21 ntdll!RtlFreeHeap+0x0000000000000051
00007ffd03c2ea0b ucrtbase!_free_base+0x000000000000001b
00007ffcce89c931 msheif_store!DllGetActivationFactory+0x0000000000079bc1
00007ffcce989beb msheif_store!DllGetActivationFactory+0x0000000000166e7b
00007ffcce89ca55 msheif_store!DllGetActivationFactory+0x0000000000079ce5
00007ffcce8f0821 msheif_store!DllGetActivationFactory+0x00000000000cdab1 CHEIFImage::FinalParseAtom
00007ffcce8f0395 msheif_store!DllGetActivationFactory+0x00000000000cd625 CHEIFImage::CreateHEIFImage
00007ffcce8ac562 msheif_store!DllGetActivationFactory+0x00000000000897f2 CQTMovie::CreateMovieFromBuffer
00007ffcce8ee992 msheif_store!DllGetActivationFactory+0x00000000000cbc22
00007ffcce88afb2 msheif_store!DllGetActivationFactory+0x0000000000068242 CHEIFStreamReader::ParseStream
00007ffcce88a884 msheif_store!DllGetActivationFactory+0x0000000000067b14 CHEIFStreamReader::GetHEIFImage
00007ffcce82f2dc msheif_store!DllGetActivationFactory+0x000000000000c56c CHEIFParser::RuntimeClassInitialize
00007ffcce83963c msheif_store!DllGetActivationFactory+0x00000000000168cc
00007ffcce824f73 msheif_store!DllGetActivationFactory+0x0000000000002203
00007ffcce824c08 msheif_store!DllGetActivationFactory+0x0000000000001e98 MFCreateHEIFReaderFromStream
00007ffcce86c28b msheif_store!DllGetActivationFactory+0x000000000004951b CWICHeifDecoder::Initialize
00007ffcfc71334c windowscodecs!CCodecFactory::HrArbitrateDecoderList+0x0000000000000408
00007ffcfc715279 windowscodecs!CCodecFactory::HrCreateDecoderFromStreamInternalNew+0x000000000000030d
00007ffcfc7a918f windowscodecs!CCodecFactory::CreateDecoderFromFileHandle+0x000000000000008f
0:000> u @rip-7
msheif_store!sub_18007CA10+0x17:
00007ffc`ce89ca27 488b8798000000 mov rax,qword ptr [rdi+98h]
00007ffc`ce89ca2e 488b9890000000 mov rbx,qword ptr [rax+90h] ; crash
00007ffc`ce89ca35 4883a08800000000 and qword ptr [rax+88h],0
00007ffc`ce89ca3d 488b8f98000000 mov rcx,qword ptr [rdi+98h] ; same object
00007ffc`ce89ca44 4883c118 add rcx,18h
00007ffc`ce89ca48 488b01 mov rax,qword ptr [rcx] ; vftable
00007ffc`ce89ca4b 488b4010 mov rax,qword ptr [rax+10h] ; vfcall
00007ffc`ce89ca4f ff15b3d21000 call qword ptr [msheif_store!_guard_dispatch_icall_fptr (00007ffc`ce9a9d08)]
Figure 5: HEIF crash
CVE-2020-17101’s use-after-free happens in the very early stage of the ISOBMFF header parsing. The code path can be invoked by thumbnail creation, which essentially makes this a zero-click Remote Code Execution (RCE). What helps enable this RCE is that thumbnails are turned on by default in Windows Explorer.
ISOBMFF uses boxes or atoms to encapsulate data stored in the representative file. Each box contains a length field (4 bytes), box name (fourCC byte sequence), and optional metadata and data. The data can contain other boxes and corresponding metadata.
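The box layout described above can be sketched as a small header reader. This is a minimal illustration, not the parser in msheif_store: the BoxHeader type and read_box_header function are hypothetical names. It relies on two ISOBMFF properties: the length field is stored big-endian, and it counts the whole box including its own 8-byte header. The special length values 0 (box extends to end of file) and 1 (a 64-bit length follows) are simply rejected here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical type for illustration.
struct BoxHeader {
    std::uint32_t size;  // total box length in bytes, header included
    std::string type;    // fourCC, e.g. "ftyp" or "meta"
};

// Reads the 8-byte box header at `offset`. Returns false if the header is
// truncated or the declared length does not fit in the remaining buffer.
bool read_box_header(const std::uint8_t* data, std::size_t len,
                     std::size_t offset, BoxHeader* out) {
    if (offset > len || len - offset < 8) return false;
    const std::uint8_t* p = data + offset;
    const std::uint32_t size = (std::uint32_t(p[0]) << 24) |
                               (std::uint32_t(p[1]) << 16) |
                               (std::uint32_t(p[2]) << 8) |
                                std::uint32_t(p[3]);  // big-endian length
    // This sketch does not handle the special values 0 and 1.
    if (size < 8 || size > len - offset) return false;
    out->size = size;
    out->type.assign(reinterpret_cast<const char*>(p + 4), 4);
    return true;
}

// Usage: a 24-byte "ftyp" box with major brand "mif1" and two compatible brands.
void example_box_header() {
    const std::uint8_t buf[] = {0, 0, 0, 0x18, 'f', 't', 'y', 'p',
                                'm', 'i', 'f', '1', 0, 0, 0, 0,
                                'm', 'i', 'f', '1', 'h', 'e', 'i', 'c'};
    BoxHeader h;
    assert(read_box_header(buf, sizeof buf, 0, &h));
    assert(h.size == 24 && h.type == "ftyp");
}
```

A parser walks sibling boxes by advancing the offset by each box's size, and walks child boxes by recursing into the payload of a container box such as "meta".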
Figure 6: HEIF PoC file
Basic box parsing starts from the bytes "ftyp" (offset 4 in Figure 6) and looks for a supported brand. Here, "mif1" is the brand and "heic" is the image collection brand. In a normal parsing scenario, "mif1" is followed by a "meta" box, which encapsulates other boxes such as "hdlr", "iloc", "iinf", "pitm" etc.
When parsing the Proof of Concept (PoC) file (Figure 6), the "meta" box is followed by an "xxxx" generic box (instead of "hdlr" and "pict"). As "pict" is missing, the parsing code treats the following "pitm" and "iloc" boxes as generic boxes and not as typed boxes. When the parsing reaches the "hdlr" box, the "hdlr" and "pict" boxes are parsed as typed boxes and the "meta" object is updated to mark the found "pict" box. The parsing then reaches the second "pitm" box, which is treated as a typed box, and its pointer is stored in the "meta" object. The next box, "zzzz", is parsed as a generic box and attached to the new "pitm" object. All the initial parsing happens in the CQTAtom::ScanChildren function.
When parsing reaches the end and calls the CHEIFImage::FinalParseAtom function, it checks whether the "iloc", "iinf", and "pitm" boxes were found. If any of those boxes are missing, destructors of the objects are called for cleanup (objects are reference counted), and parsing is restarted by calling the function CQTAtom::ScanChildren. The destructor frees the "zzzz" box that is attached to "pitm", but the pointer to the "zzzz" object stored in the "pitm" object is not nulled.
When the parsing restarts, the "meta" object already has "pict", and then "xxxx" is parsed as a generic box, but "pitm" and "iloc" (offset 0x49 and 0x57 in Figure 6) are parsed as typed boxes. As we already have "pitm", its reference count is incremented (instead of full parsing) and "iloc" is parsed. This parsing fails and the code bails out with an error.
As the parsing has failed, the objects are destroyed once again. The destructor loops over the linked list and calls every object's destructor. As the "pitm" object still has a pointer reference to the freed "zzzz" object, the code tries to free the "zzzz" object a second time by accessing the freed object's virtual function table and calling the destructor as a virtual function. This causes the crash, as we have enabled page heap for the fuzzed process.
Patch
Microsoft patched the vulnerability by nulling the saved pointer reference after freeing the memory.
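The fix pattern can be illustrated with a minimal sketch. ChildBox and ParentBox are hypothetical stand-ins for the "zzzz" and "pitm" objects; the real classes in msheif_store are reference counted and use virtual destructors, which this sketch omits. The point is only that cleanup code which can run more than once must clear the stored pointer after freeing it.

```cpp
#include <cassert>

// Hypothetical stand-in for the generic "zzzz" box object.
struct ChildBox {
    static int live;  // number of ChildBox objects currently allocated
    ChildBox() { ++live; }
    ~ChildBox() { --live; }
};
int ChildBox::live = 0;

// Hypothetical stand-in for the "pitm" object that owns an attached child.
struct ParentBox {
    ChildBox* child = nullptr;

    // Cleanup that may be invoked again after a parser restart.
    void release_child() {
        delete child;     // deleting a null pointer is a no-op
        child = nullptr;  // the fix: do not keep a pointer to freed memory
    }
};

// Usage: releasing twice is harmless once the pointer is nulled.
void example_release_twice() {
    ParentBox pitm;
    pitm.child = new ChildBox();
    pitm.release_child();  // first cleanup frees the child
    pitm.release_child();  // second cleanup sees nullptr and does nothing
    assert(ChildBox::live == 0);
    assert(pitm.child == nullptr);
}
```

Without the assignment to nullptr, the second call would delete the same address again, which matches the double destruction described above.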
Conclusion
Part three of this blog series presents multiple vulnerabilities in Windows' built-in image parsers and introduces newer image formats supported by Windows. A list of all reported vulnerabilities can be found in the following appendix and is referenced in the Mandiant Vulnerability Disclosures.