capa v4: casting a wider .NET
Mandiant
Written by: Willi Ballenthin, Moritz Raabe, Mike Hunhoff, Anushka Virgaonkar
We are excited to announce version 4.0 of capa with support for analyzing .NET executables. This open-source tool automatically identifies capabilities in programs using an extensible rule set. The tool supports both malware triage and deep-dive reverse engineering. If you have not heard of capa before, or need a refresher, check out our first blog post. You can download capa v4.0 standalone binaries from the project’s release page and check out the source code on GitHub.
capa 4.0 adds major new features that extend its ability to analyze and reason about programs. This blog post covers the following improvements included in capa 4.0:
- A new analysis backend that supports .NET executables, allowing you to analyze malware families such as Dark Crystal RAT and JUNKMAIL/DoubleZero.
- A new analysis scope that restricts rule evaluation to individual instructions, enabling rule authors to inspect the specific mnemonic and operand combinations used throughout programs.
- Clarification of the rule set release process, including major version tagging of rules, so that you can easily update even if you are running an outdated version of capa.
- A collection of breaking changes that enable capa to both run faster and represent more types of results.
.NET Support
capa v4.0 is the first version to support analyzing .NET executables. With this new feature, we updated 49 existing rules and added 22 new rules to capture capabilities in .NET files. Read on to understand how we extended capa with .NET support.
Open-Source Contributions
Adding .NET support to capa provided the FLARE team an opportunity to contribute to open-source .NET analysis projects. We merged new features and updates to dnfile, a Python library that parses metadata found in .NET executables. We also released dncil, a Python library that disassembles Common Intermediate Language (CIL) instructions. dncil parses managed method headers, CIL instructions, and exception handlers, exposing the data through an object-oriented API to help you quickly build CIL analysis tools using the library.
.NET Feature Extraction
.NET is a platform for building managed applications executed by the Common Language Runtime (CLR). These applications can be written in high-level languages including C#, VB.NET, and F# and are compiled to low-level CIL instructions. These CIL instructions are included in the .NET file alongside metadata used by the CLR to execute them. Figure 1 shows an example “Hello World” method compiled from C# to CIL.
Figure 1: Hello World in C# and CIL
capa parses features from the metadata and CIL instructions stored in .NET executables. For example, capa uses dnfile to parse the .NET MethodDef metadata table that describes all the managed methods defined in a .NET file. Each table entry includes the offset of the method’s body containing its CIL instructions. capa then uses dncil to disassemble each method body and extracts features from the CIL instructions. Figure 2 shows a breakdown of the features capa extracts from our example “Hello World” method.
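To make the extraction step concrete, here is a minimal Python sketch that walks the CIL bytes of a method body similar to the “Hello World” example and emits features keyed by instruction offset. It is not capa’s implementation, which relies on dnfile and dncil. The opcode table covers only four opcodes, and the method bytes, the token values, and the two lookup tables are hypothetical stand-ins for data that a real tool would resolve from the file’s metadata.

```python
import struct

# Tiny subset of CIL opcodes: value -> (mnemonic, operand size in bytes).
OPCODES = {
    0x00: ("nop", 0),
    0x2A: ("ret", 0),
    0x28: ("call", 4),   # operand: metadata token of the callee
    0x72: ("ldstr", 4),  # operand: token into the user-string heap
}

# Hypothetical token resolution tables; a real tool reads these from metadata.
USER_STRINGS = {0x70000001: "Hello World!"}
MEMBER_REFS = {0x0A000001: "System.Console::WriteLine"}


def extract_features(code: bytes):
    """Yield (offset, feature_type, value) tuples for a CIL byte string."""
    offset = 0
    while offset < len(code):
        mnemonic, size = OPCODES[code[offset]]
        if size == 4:
            # Operands are stored little-endian right after the opcode byte.
            (token,) = struct.unpack_from("<I", code, offset + 1)
            if mnemonic == "ldstr":
                yield offset, "string", USER_STRINGS[token]
            elif mnemonic == "call":
                yield offset, "api", MEMBER_REFS[token]
        offset += 1 + size


# nop; ldstr 0x70000001; call 0x0A000001; nop; ret
HELLO_WORLD = bytes.fromhex("00" "7201000070" "280100000a" "00" "2a")
features = list(extract_features(HELLO_WORLD))
```

Each feature carries the offset of the instruction that produced it, which is what lets an analyst jump from a match back to the exact location in the method.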
Figure 2: .NET features extracted from Hello World program
capa addresses .NET features using their method token and instruction offset, e.g., token(0x6000001)+0x6. This helps reverse engineers to quickly navigate and inspect interesting methods using .NET analysis tools like dnSpy.
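The address notation can be sketched as a small Python type. The class name and its properties are hypothetical and are not capa’s own address classes. The only outside fact used is that a .NET metadata token stores the table index in its high byte and the row index in the low three bytes, with 0x06 identifying the MethodDef table.

```python
from dataclasses import dataclass

METHODDEF_TABLE = 0x06  # metadata table index for MethodDef


@dataclass(frozen=True)
class DotNetAddress:
    """Hypothetical address type: a method token plus an instruction offset."""

    token: int   # metadata token of the method
    offset: int  # offset of the instruction within that method

    @property
    def table(self) -> int:
        # High byte of the token selects the metadata table.
        return self.token >> 24

    @property
    def row(self) -> int:
        # Low three bytes give the 1-based row in that table.
        return self.token & 0xFFFFFF

    def __str__(self) -> str:
        return f"token({self.token:#x})+{self.offset:#x}"


addr = DotNetAddress(token=0x06000001, offset=0x6)
```

Here `str(addr)` renders in the same form as the example above, and `addr.table` and `addr.row` identify the first row of the MethodDef table, which is the kind of lookup a tool such as dnSpy performs when you navigate to a token.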
Figure 3 shows decompiled C# code from the .NET malware Mandiant tracks as DOTWRAP. This code hides the console window and reads data from a file named config.txt.
Figure 3: Decompiled DOTWRAP .NET malware sample
Figure 4 shows the features capa extracts from the above example code (this output can be obtained via the scripts/show-features.py helper script). The avid reader may recognize the namespace, class, api, string, and number features.
Figure 4: Extracted .NET features for the DOTWRAP malware sample
With a simple addition of the System.IO.File::ReadAllLines API call to an existing rule, capa now detects the read file on Windows capability in the DOTWRAP malware sample (see Figure 5).
Figure 5: Read file on Windows rule match in the DOTWRAP malware sample
As shown in Figure 5, capa also extracts the two API calls kernel32.GetConsoleWindow and user32.ShowWindow. These are native Windows API functions called from the backdoor’s managed code using a technology called Platform Invoke (P/Invoke). The .NET ImplMap metadata table describes the native functions that can be called from managed code using P/Invoke. Each table entry maps a managed method (MemberForwarded) to a native function. The native function can be executed by calling its MemberForwarded method and P/Invoke handles the details.
capa reads the ImplMap table to chain MemberForwarded methods to their native functions. This enables detecting native capabilities implemented in managed code. So, here we can rely on an existing rule to detect window hiding via native Windows functions. Figure 6 shows the capa match for our example code.
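The chaining step can be illustrated with a short Python sketch. It assumes the ImplMap table has already been parsed into a dictionary. The token values, the dictionary, and both helper functions are hypothetical and only show the idea of mapping a called managed method back to the native function it forwards to.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical ImplMap rows: MemberForwarded method token -> (module, native function).
IMPL_MAP: Dict[int, Tuple[str, str]] = {
    0x06000002: ("kernel32.dll", "GetConsoleWindow"),
    0x06000003: ("user32.dll", "ShowWindow"),
}


def native_api_name(module: str, function: str) -> str:
    """Format a native import as module.function without the .dll suffix."""
    name = module.lower()
    if name.endswith(".dll"):
        name = name[: -len(".dll")]
    return f"{name}.{function}"


def resolve_calls(call_targets: List[int]) -> Iterator[str]:
    """Yield native API names for call targets that are P/Invoke stubs."""
    for token in call_targets:
        if token in IMPL_MAP:
            yield native_api_name(*IMPL_MAP[token])


# Tokens called by a hypothetical managed method; 0x06000004 is an ordinary managed method.
apis = list(resolve_calls([0x06000002, 0x06000004, 0x06000003]))
```

Because the resolved names use the same module.function form as API features from native executables, rules written for native code can match without modification.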
Figure 6: Hide graphical window rule match in the DOTWRAP sample
In version 4 capa extracts and analyzes the following types of features from .NET files:
- namespace, e.g., System.IO
- class, e.g., System.IO.File
- api and import, e.g., System.IO.File::Delete
- function-name, e.g., HelloWorld::Main
- number
- string
We have also added two new characteristics to detect .NET executables containing both managed and unmanaged (native) code and calls from managed code to native code, respectively:
- mixed mode
- unmanaged call
Future .NET Work
As we write more .NET-specific rules and perform more research, we expect to expand and enhance the .NET feature support in future versions. If you encounter missing features or have ideas for good additions, please open a discussion in our GitHub repository.
Instruction Scope
We have added a new scope that restricts rule evaluation to individual instructions. With this scope, rule authors can match specific combinations of instruction mnemonics and operands. capa’s updated rule syntax also includes a new operand feature that specifies number and offset values for operands. This lets rule authors describe the flow of data from a source or to a destination, such as moving data from a structure or comparing against a constant.
Figure 7 shows an example of using the new instruction scope and operand feature to more reliably detect Adler32 checksum calculation. Note how it’s much clearer that the shr instruction must use the number 0xF as its source operand. This enables capa to match capabilities more accurately and makes rules easier to understand for human readers. With these changes we’ve removed the previously supported /x32 and /x64 flavors of number and operand features.
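The difference between matching in a wider scope and matching at instruction scope can be sketched in Python. This is an illustration of the idea, not capa’s rule engine. The Instruction type, both matcher functions, the sample instruction lists, and the convention that operand index 1 is the source operand are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instruction:
    mnemonic: str
    operands: Tuple[object, ...]  # register names (str) or immediates (int)


def matches_function_scope(insns: List[Instruction], mnemonic: str, number: int) -> bool:
    """Loose match: the mnemonic and the number appear anywhere in the function."""
    has_mnemonic = any(i.mnemonic == mnemonic for i in insns)
    has_number = any(number in i.operands for i in insns)
    return has_mnemonic and has_number


def matches_instruction_scope(
    insns: List[Instruction], mnemonic: str, operand_index: int, number: int
) -> bool:
    """Strict match: one instruction has the mnemonic and the number at the given operand."""
    return any(
        i.mnemonic == mnemonic
        and operand_index < len(i.operands)
        and i.operands[operand_index] == number
        for i in insns
    )


# Hypothetical function: 0xF is present, but not as the shift amount.
unrelated = [Instruction("shr", ("eax", 0x10)), Instruction("and", ("ecx", 0xF))]
# Hypothetical function in which shr really uses 0xF as its source operand.
target = [Instruction("shr", ("eax", 0xF))]
```

The loose matcher accepts `unrelated` because both features occur somewhere in the function, while the strict matcher accepts only `target`. That false positive is the kind of imprecision the instruction scope is meant to remove.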
Figure 7: Old rule features (left) vs. new instruction scope and operand feature (right)
Rule Release Process and Other Changes
When we introduce new functionality and breaking changes to capa, rules may become incompatible with a particular release or with our current development branch. To address this, we’ve added documentation that helps users identify the correct rules branch for their capa version. In short, use the rules branch that matches the major version of capa you are running: v3 rules for the v3 release of capa and v4 rules for the v4 release.
capa now requires Python 3.7 or newer. If you build on top of capa, also be aware that we’ve updated the freeze format used to store extracted features and the JSON results document used to store and exchange capa results. Moreover, the internal representation of addresses changed so that the tool can now express additional context, e.g., .NET tokens and offsets.
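As a small illustration of the version-matching convention, the following Python helper derives a rules label from a capa version string. The function name is hypothetical, and the "v4"-style label simply mirrors the wording used here. The exact branch or tag names in the rules repository should be confirmed against the project’s documentation.

```python
def rules_label_for(capa_version: str) -> str:
    """Return the rules major-version label (e.g. "v4") for a capa version string."""
    # Accept both "4.0.0" and "v4.0.0" style inputs.
    major = capa_version.lstrip("v").split(".")[0]
    if not major.isdigit():
        raise ValueError(f"unrecognized capa version: {capa_version!r}")
    return f"v{major}"
```

For example, `rules_label_for("4.0.0")` yields the v4 label, so a capa 4.x installation would be paired with the v4 rules.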
Contributing
We look forward to seeing how the new capa functionality further supports the community, and we encourage you to contribute. Feedback, ideas, and pull requests are all welcome. Just open an issue or check out the contributing document to get started.
Rules are the foundation of capa’s identification algorithm. If you have any rule ideas, please open an issue or, even better, submit a pull request to capa-rules. This way, everyone can benefit from the collective knowledge of our malware analysis community.
Conclusion
The newest improvements add .NET executable analysis support to capa and make its rules even more expressive. The 4.0 capa release also includes bug fixes, new features, improvements to the freeze and JSON results serialization formats, and more than 60 new and updated rules. See the capa changelog for all update details.