Uninitialized Memory Disclosures in Web Applications

While we at Silent Signal are strong believers in human creativity when it comes to finding new, or unusual vulnerabilities, we’re also constantly looking for ways to transform our experience into automated tools that can reliably and efficiently detect already known bug classes. The discovery of CVE-2019-6976 – an uninitialized memory disclosure bug in a widely used imaging library – was a particularly interesting finding to me, as it represented a lesser known class of issues in the intersection of web application and memory safety bugs, so it seemed to be a nice topic for my next GWAPT Gold Paper.

While we did some work on investigating the issue, and even developed tooling for detection, writing a paper was a good opportunity to systematize my knowledge, and to properly evaluate the effectiveness of available discovery methods. While going through a process where I had to back all my claims with references and data, a couple of important things quickly became apparent:

There isn’t really a standard way to think about memory safety. While Matt Miller’s work in this area fit my case really well, most papers and writeups just rely on “folklore knowledge”, and I realized that this makes it really hard to logically reason about one’s own way of thinking (even if a particular sentence makes sense at all?).
Some concepts that we throw around in IT-security can be much more complex than most of us probably think.
Our original detection algorithm was suboptimal, and the existing implementation was incorrect…

Fortunately, I managed to fix the problems, and now the tools I created are available for you to verify. Following the Unix philosophy of creating simple tools that can interact once again helped me to test and compare different ideas in a reproducible and automated way. The relevant code repositories for this research are:

TestEnvForEntropyCalc:multi – Improved branch of our Docker test environment with Apache/PHP, Node.js and Python based test applications. You can use this to experiment with new and existing tools.
image-memleak – Test scripts referenced in the paper.
image-memleak-testsuite – Test images to facilitate testing of memory disclosures. PNG’s for now.

As you will see in the paper, detection of memory disclosures can be facilitated by using an appropriately chosen input test suite. Feel free to use the last repo in your tests, and if you feel like messing around with image formats, don’t hesitate to contribute more samples!

I also reached out to Chris Evans, whose work in this area was the original inspiration for the initial bug, and this paper too. He was kind enough to give feedback on my paper, some of which didn’t make it to the released version because of timing issues:

Cloudbleed definitely would’ve worth a mention in the historical overview
Feeding the same input to the parser multiple times and looking for differences in the output seems also like a reliable way to detect parsing problems. This technique can be particularly useful, when the tested edge case doesn’t allow full control over the actual bitmap content of the input image.
As the paper mentions, it’s not just image parsers that can be abused this way, these are just the most common examples one can encounter on the web. This bug of Chris is a nice example of memory disclosure in Flash (still not completely dead as of this writing).

Finally, writing this paper highlighted some areas which would deserve their own papers, such as:

Recovering memory content after lossy compression
Improving pointer identification based on specifics of particular executable loaders

I hope that this paper will serve as a useful foundation to better understand this exciting branch of vulnerabilities, and inspire further research. The full Gold Paper can be downloaded from the website of SANS Institute: