Attackers have long been searching for ways to meddle with the day-to-day operations of an average computer user. It’s no wonder the Microsoft Office suite has been one of the key targets of adversaries to compromise endpoints. What better than to dispatch a seemingly-harmless office document to a rather naive user? It’s mayhem.
Given how popular Office documents are as a way to achieve execution, I’ll be discussing an interesting strategy attackers employ to further increase their chances of evasion: VBA purging. The technique was first brought to light by Didier Stevens in February 2020 and is slowly gaining popularity. Several malicious documents can already be classified as having employed it.
Before we dig into the technique itself — let’s take a closer look at the fundamentals of VBA and the binary file formats powering macros today.
A Quick Intro to VBA and Document File Formats
VBA, short for Visual Basic for Applications, is a programming language used to extend the functionality of Microsoft Office applications. With a small piece of (macro) code, you can automate repetitive and general administration tasks. Owing to the power of the language, it didn’t take long for malicious macros to go mainstream, and they now account for the majority of the malicious attachments sent via email.
Before Microsoft Office 2007, the Compound File Binary Format (CFBF), also known as the Object Linking and Embedding (OLE) file format, was the default file format for the majority of its applications. Today, Office applications use the Office Open XML (OOXML) file format, in which the parts of a document are stored as XML files and then zipped. However, the embedded macros (the VBA project file, vbaProject.bin) are still stored in the CFBF format, and this is precisely why we’ll be studying it next.
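Since a macro-enabled OOXML document is just a ZIP archive, you can spot the embedded CFBF file with nothing more than a ZIP reader. Here’s a minimal sketch of that idea. The helper names (find_vba_projects, build_demo_package) are made up for illustration, and the demo package is a stand-in built in memory, not a real document. The only hard facts it leans on are the conventional part name vbaProject.bin and the eight-byte CFBF signature.

```python
import io
import zipfile

# First eight bytes of every Compound File Binary Format (OLE) file.
CFBF_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def find_vba_projects(ooxml_bytes: bytes) -> dict:
    """Map each vbaProject.bin part in the package to True if it starts with the CFBF signature."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(ooxml_bytes)) as package:
        for name in package.namelist():
            if name.lower().endswith("vbaproject.bin"):
                found[name] = package.read(name).startswith(CFBF_MAGIC)
    return found


def build_demo_package() -> bytes:
    """Hypothetical stand-in for a .docm: a ZIP with one XML part and one fake CFBF part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<w:document/>")
        package.writestr("word/vbaProject.bin", CFBF_MAGIC + b"\x00" * 504)
    return buffer.getvalue()


print(find_vba_projects(build_demo_package()))  # {'word/vbaProject.bin': True}
```

On a real sample you would pass the bytes of the .docm or .xlsm file instead of the demo package.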
We’ll refer to this file format as a compound file. Why is that? Just as file systems offer structured storage for a variety of application-specific data streams, the compound file offers a structured way to store different types of objects in a single file.
Two types of objects are available within a compound file: storage objects and stream objects. You can think of storage objects as directories and stream objects as the files inside them. To avoid recreating the whole file whenever a new object is added, the compound file also maintains an internal allocation structure that identifies the next free location to store objects in.
Just as a generic compound file might hold a “Storage 1” under the Root Storage, macro-enabled documents hold a “VBA” storage (in Word documents, often under the “Macros” storage as well). A VBA project is usually laid out as follows: the VBA storage holds the dir stream, the _VBA_PROJECT stream, the module streams, and any __SRP__ streams.
The ‘Module Streams’ contain the actual code (modules hosting routines or data) that powers the embedded macros. Each module stream is broken down into two parts: PerformanceCache (P-code) and CompressedSourceCode. The PerformanceCache contains the source code in its compiled form, whereas the CompressedSourceCode section contains the actual VBA source code in compressed form. When a new VBA macro is added to a document, the VBA engine is responsible for storing a copy of the compiled code in the PerformanceCache section. The boundary between the two sections is given by an offset called MODULEOFFSET, which is defined in the dir stream.
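To make the layout concrete, here’s a small sketch of how a module stream splits at MODULEOFFSET. It assumes you have already read the raw stream and parsed the offset out of the dir stream (neither step is shown). The function name split_module_stream and the toy bytes are mine, purely for illustration.

```python
def split_module_stream(stream: bytes, module_offset: int) -> tuple:
    """Return (performance_cache, compressed_source) for one module stream."""
    if not 0 <= module_offset <= len(stream):
        raise ValueError("MODULEOFFSET points outside the module stream")
    # Everything before the offset is P-code, everything after is compressed source.
    return stream[:module_offset], stream[module_offset:]


# Toy stream: six placeholder bytes standing in for P-code, followed by a
# tiny compressed container (compressed containers start with the byte 0x01).
toy_stream = b"PCODE!" + bytes.fromhex("0103B002610600")
p_code, compressed = split_module_stream(toy_stream, module_offset=6)

print(len(p_code), compressed[:1] == b"\x01")  # 6 True
```

Keeping the offset outside the stream itself, in the dir stream, is what lets a tool drop the first part and simply point the offset at byte 0.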
Note, however, that the compiled code in PerformanceCache is version- and architecture-dependent. If the version or architecture of the Office installation opening the document doesn’t match the one that compiled the code, the compiled code serves no purpose. In that case, the VBA engine decompresses the CompressedSourceCode section (using Microsoft’s own compression algorithm), compiles it at run time, and runs the result.
Similarly, once the P-code from a macro-enabled document has been executed, a tokenized form of it is stored in the document’s VBA storage in __SRP__ streams (the name is followed by a number). These streams help the code execute much faster. However, they too are Office-version-specific, and the engine falls back to the steps described earlier if the versions don’t match.
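If you’re curious what that decompression step looks like, below is a rough Python sketch based on my reading of the compression scheme documented in Microsoft’s MS-OVBA specification. Treat it as a learning aid under that assumption: it does minimal validation, the name decompress_vba is my own, and the sample container is hand-built rather than pulled from a real document.

```python
def decompress_vba(container: bytes) -> bytes:
    """Decompress an MS-OVBA compressed container (sketch, minimal validation)."""
    if not container or container[0] != 0x01:
        raise ValueError("missing compressed-container signature byte 0x01")
    out = bytearray()
    pos = 1
    while pos + 2 <= len(container):
        # Chunk header: low 12 bits = chunk size - 3, top bit = compressed flag.
        header = int.from_bytes(container[pos:pos + 2], "little")
        chunk_end = min(pos + (header & 0x0FFF) + 3, len(container))
        pos += 2
        if not header & 0x8000:
            out += container[pos:chunk_end]  # raw chunk, stored as-is
            pos = chunk_end
            continue
        chunk_start = len(out)
        while pos < chunk_end:
            flags = container[pos]  # one flag bit per following token
            pos += 1
            for bit in range(8):
                if pos >= chunk_end:
                    break
                if not (flags >> bit) & 1:
                    out.append(container[pos])  # literal byte
                    pos += 1
                    continue
                # Copy token: offset and length share 16 bits; the split
                # depends on how much of this chunk is already decompressed.
                token = int.from_bytes(container[pos:pos + 2], "little")
                pos += 2
                written = len(out) - chunk_start
                bit_count = max((written - 1).bit_length(), 4)
                length = (token & (0xFFFF >> bit_count)) + 3
                offset = (token >> (16 - bit_count)) + 1
                for _ in range(length):
                    out.append(out[-offset])
    return bytes(out)


# Hand-built container: literal "a", then a copy token (offset 1, length 9).
print(decompress_vba(bytes.fromhex("0103B002610600")))  # b'aaaaaaaaaa'
```

For real analysis work you’d want a maintained parser rather than a sketch like this, but it shows why the source survives purging: everything needed to rebuild it sits in the CompressedSourceCode section.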
Is it Stomped or Purged?
VBA stomping and VBA purging both revolve around the aforementioned sections in the module streams. VBA stomping came to light in 2018, when security researchers from Walmart explored the possibility of removing or overwriting the CompressedSourceCode section of the module stream. Doing so doesn’t impact the execution of the macro, but it thwarts defenses that look for strings in the source code. Though this is great in theory, recall that with the macro’s source code gone, the Office application has only the compiled code to run. What happens if the version and architecture don’t match? Nothing!
This is where things get tricky: you need to do some recon on the target host to make this a success. If you’re certain about the version of Office on the system, VBA stomping can be quite effective at evading detection of the malicious document.
VBA purging, on the other hand, revolves around removing the PerformanceCache section from the module streams. To completely remove evidence of the P-code, the MODULEOFFSET of each module is set to 0 in the dir stream, the _VBA_PROJECT stream is stripped of its PerformanceCache data, and all __SRP__ streams, which also contain PerformanceCache data, are removed. Once the compiled code is gone, AV engines and Yara rules that rely on complete string matches break, and the macros bypass them easily, because the source code that is left intact is compressed.
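The purging steps are easier to follow on a toy model. The sketch below works on plain Python dictionaries, not on a real compound file, so every name in it (purge, the dictionary keys, the sample bytes) is hypothetical. The 7-byte _VBA_PROJECT value is also an assumption on my part, based on my recollection of what purging tools write: a bare header whose version field matches no real Office build.

```python
# Assumption: a minimal 7-byte _VBA_PROJECT header with a version field
# that no real Office build uses, so the source is always recompiled.
MINIMAL_VBA_PROJECT = bytes.fromhex("CC61FFFF000000")


def purge(project: dict) -> dict:
    """Return a purged copy of a toy VBA project model."""
    modules = {}
    for name, module in project["modules"].items():
        # Drop everything before MODULEOFFSET (the PerformanceCache) ...
        source_only = module["stream"][module["offset"]:]
        # ... and record that the compressed source now starts at byte 0.
        modules[name] = {"offset": 0, "stream": source_only}
    # Remove every __SRP_ stream and shrink _VBA_PROJECT to its bare header.
    streams = {
        name: data
        for name, data in project["streams"].items()
        if not name.startswith("__SRP_")
    }
    streams["_VBA_PROJECT"] = MINIMAL_VBA_PROJECT
    return {"modules": modules, "streams": streams}


demo_project = {
    "modules": {
        "ThisDocument": {"offset": 6, "stream": b"PCODE!" + bytes.fromhex("0103B002610600")},
    },
    "streams": {
        "_VBA_PROJECT": b"\xcc\x61" + b"\x00" * 5 + b"cached p-code",
        "__SRP_0": b"cache",
        "__SRP_1": b"cache",
    },
}

purged = purge(demo_project)
print(sorted(purged["streams"]), purged["modules"]["ThisDocument"]["offset"])
```

A real tool additionally has to rewrite the dir stream (which is itself compressed) and update the compound file’s directory entries, which this model skips entirely.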
What’s The Next Step for Defense?
VBA purging can make static analysis a tad harder and leave IDS or Yara rules useless, since the static strings they rely on are only available in the P-code section. For example, the name of a function used for shell or object creation, such as CreateObject, won’t appear in the CompressedSourceCode section as a single string but will most probably be broken up.
There are still opportunities to hunt for and detect VBA purging. FireEye has done an excellent job of covering those detections alongside OfficePurge, a purging tool developed by their own Red Team.
One of those detections relies on the minimum size of the _VBA_PROJECT stream: 7 bytes. This follows from the fact that the length of the PerformanceCache data in the _VBA_PROJECT stream is always seven bytes less than the size of the stream itself, so a stream of exactly 7 bytes holds no compiled code. You can view the Yara rule in FireEye’s own coverage of the technique. Keep in mind that techniques for detecting VBA purging can also flag benign documents that have had their PerformanceCache sections removed but aren’t malicious.
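You can see the string-breaking effect for yourself with a deliberately simplified encoder. The sketch below writes a compressed chunk that uses literal tokens only, which (as far as I understand the MS-OVBA format) is still a valid container. The real Office compressor also emits copy tokens, so actual output will look different, but the flag byte inserted before every group of eight tokens is enough to split any string longer than eight bytes. The function name compress_literals_only is hypothetical.

```python
def compress_literals_only(data: bytes) -> bytes:
    """Encode data as one MS-OVBA compressed chunk made of literal tokens only."""
    if not 0 < len(data) <= 1024:
        raise ValueError("this sketch only handles short, non-empty inputs")
    body = bytearray()
    for start in range(0, len(data), 8):
        body.append(0x00)  # flag byte: the next (up to) 8 tokens are literals
        body += data[start:start + 8]
    # Header: compressed flag (0x8000) + signature (0x3000) + (chunk size - 3),
    # where chunk size counts the two header bytes as well.
    header = 0xB000 | (len(body) + 2 - 3)
    return b"\x01" + header.to_bytes(2, "little") + bytes(body)


source = b'Set sh = CreateObject("WScript.Shell")'
compressed = compress_literals_only(source)

print(b"CreateObject" in source)      # True
print(b"CreateObject" in compressed)  # False
print(compressed)
```

A rule that matches the plain string CreateObject hits the first buffer and misses the second, even though nothing about the macro has changed.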
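As a rough illustration of the size check (and not a port of FireEye’s Yara rule), the arithmetic can be sketched in a few lines of Python. It assumes you’ve already extracted the raw _VBA_PROJECT stream with a compound file parser of your choice. The function names are mine, and the two-byte prefix check reflects my understanding that the stream always begins with the reserved value 0x61CC.

```python
VBA_PROJECT_HEADER_SIZE = 7  # fixed header size described in the text
RESERVED1 = b"\xcc\x61"      # 0x61CC stored little-endian at the start of the stream


def performance_cache_length(vba_project_stream: bytes) -> int:
    """Bytes of PerformanceCache data that follow the fixed header."""
    return max(len(vba_project_stream) - VBA_PROJECT_HEADER_SIZE, 0)


def looks_purged(vba_project_stream: bytes) -> bool:
    """Heuristic: a well-formed header followed by no PerformanceCache at all."""
    return (
        vba_project_stream.startswith(RESERVED1)
        and len(vba_project_stream) == VBA_PROJECT_HEADER_SIZE
    )


normal = RESERVED1 + b"\x00" * 5 + b"compiled code lives here"
purged = RESERVED1 + b"\xff\xff\x00\x00\x00"

print(looks_purged(normal), performance_cache_length(normal))  # False 24
print(looks_purged(purged), performance_cache_length(purged))  # True 0
```

Because benign tooling can produce the same 7-byte stream, a hit here is better treated as a reason to look closer than as a verdict.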
To prevent malicious macros from executing, you can employ these strategies:
- Train employees to tell malicious documents apart from harmless ones
- Disable macros if they are not necessary
- Use antimalware products to detect documents containing malicious macros or VBA code
- If macros can’t be disabled, make sure only signed macros or macros in trusted files are allowed to run
- Block macros from running in Office files that come from the Internet (often used for template injections). Microsoft has excellent coverage on using Group Policies to enforce this change across the environment
Office documents and their popularity among the masses will keep attracting malicious actors trying to evade defenses with new techniques. VBA purging is one such technique that breaks or weakens traditional defenses. It is imperative to instruct end users to verify the authenticity of a document before opening it and to stay wary of documents that appear anomalous from the get-go.