CARBANAK Week Part Three: Behind the CARBANAK Backdoor
Mandiant
Written by: James T. Bennett, Michael Bailey
We covered a lot of ground in Part One and Part Two of our CARBANAK Week blog series (Part Four is now available). Now let's take a look back at some of our previous analysis and see how it holds up.
In June 2017, we published a blog post sharing novel information about the CARBANAK backdoor, including technical details, intel analysis, and some interesting deductions about its operations we formed from the results of automating analysis of hundreds of CARBANAK samples. Some of these deductions were claims about the toolset and build practices for CARBANAK. Now that we have a snapshot of the source code and toolset, we also have a unique opportunity to revisit these deductions and shine a new light on them.
Was There a Build Tool?
Let’s first take a look at our deduction about a build tool for CARBANAK:
“A build tool is likely being used by these attackers that allows the operator to configure details such as C2 addresses, C2 encryption keys, and a campaign code. This build tool encrypts the binary’s strings with a fresh key for each build.”
We came to this deduction from the following evidence:
“Most of CARBANAK’s strings are encrypted in order to make analysis more difficult. We have observed that the key and the cipher texts for all the encrypted strings are changed for each sample that we have encountered, even amongst samples with the same compile time. The RC2 key used for the HTTP protocol has also been observed to change among samples with the same compile time. These observations paired with the use of campaign codes that must be configured denote the likely existence of a build tool.”
Figure 1 shows three keys used to decode the strings in CARBANAK, each pulled from a different CARBANAK sample.
Figure 1: Decryption keys for strings in CARBANAK are unique for each build
It turns out we were spot-on with this deduction. A build tool was discovered in the CARBANAK source dump, pictured with English translations in Figure 2.
Figure 2: CARBANAK build tool
With this build tool, you specify a set of configuration options along with a template CARBANAK binary, and it bakes the configuration data into the binary to produce the final build for distribution. The Prefix text field allows the operator to specify a campaign code. The Admin host text fields are for specifying C2 addresses, and the Admin password text field is the secret used to derive the RC2 key for encrypting communication over CARBANAK’s pseudo-HTTP protocol. This covers part of our deduction: we now know for a fact that a build tool exists and is used to configure the campaign code and RC2 key for the build, amongst other items. But what about the encoded strings? Since this would be something that happens seamlessly behind the scenes, it makes sense that no evidence of it would be found in the GUI of the build tool. To learn more, we had to go to the source code for both the backdoor and the build tool.
Figure 3 shows a preprocessor identifier named ON_CODE_STRING defined in the CARBANAK backdoor source code that when enabled, defines macros that wrap all strings the programmer wishes to encode in the binary. These functions sandwich the strings to be encoded with the strings “BS” and “ES”. Figure 4 shows a small snippet of code from the header file of the build tool source code defining BEG_ENCODE_STRING as “BS” and END_ENCODE_STRING as “ES”. The build tool searches the template binary for these “BS” and “ES” markers, extracts the strings between them, encodes them with a randomly generated key, and replaces the strings in the binary with the encoded strings. We came across an executable named bot.dll that happened to be one of the template binaries to be used by the build tool. Running strings on this binary revealed that most meaningful strings that were specific to the workings of the CARBANAK backdoor were, in fact, sandwiched between “BS” and “ES”, as shown in Figure 5.
Figure 3: ON_CODE_STRING parameter enables easy string wrapper macros to prepare strings for encoding by build tool
Figure 4: builder.h macros for encoded string markers
Figure 5: Encoded string markers in template CARBANAK binary
Operators’ Access To Source Code
Let’s look at two more related deductions from our blog post:
“Based upon the information we have observed, we believe that at least some of the operators of CARBANAK either have access to the source code directly with knowledge on how to modify it or have a close relationship to the developer(s).”
“Some of the operators may be compiling their own builds of the backdoor independently.”
The first deduction was based on the following evidence:
“Despite the likelihood of a build tool, we have found 57 unique compile times in our sample set, with some of the compile times being quite close in proximity. For example, on May 20, 2014, two builds were compiled approximately four hours apart and were configured to use the same C2 servers. Again, on July 30, 2015, two builds were compiled approximately 12 hours apart.”
To investigate further, we performed a diff of two CARBANAK samples with very close compile times to see what, if anything, was changed in the code. Figure 6 shows one such difference.
Figure 6: Minor differences between two closely compiled CARBANAK samples
The POSLogMonitorThread function is only executed in Sample A, while the blizkoThread function is only executed in Sample B (Blizko is a Russian funds transfer service, similar to PayPal). The POSLogMonitorThread function monitors for changes made to log files for specific point of sale software and sends parsed data to the C2 server. The blizkoThread function determines whether the user of the computer is a Blizko customer by searching for specific values in the registry. With knowledge of these slight differences, we searched the source code and discovered once again that preprocessor parameters were put to use. Figure 7 shows how this function will change depending on which of three compile-time parameters are enabled.
Figure 7: Preprocessor parameters determine which functionality will be included in a template binary
This is not definitive proof that operators had access to the source code, but it certainly makes it much more plausible. The operators would not need to have any programming knowledge in order to fine tune their builds to meet their needs for specific targets, just simple guidance on how to add and remove preprocessor parameters in Visual Studio.
Evidence for the second deduction was found by looking at the binary C2 protocol implementation and how it has evolved over time. From our previous blog post:
“This protocol has undergone several changes over the years, each version building upon the previous version in some way. These changes were likely introduced to render existing network signatures ineffective and to make signature creation more difficult.”
Five versions of the binary C2 protocol were discovered amongst our sample set, as shown in Figure 8. This figure shows the first noted compile time that each protocol version was found amongst our sample set. Each new version improved the security and complexity of the protocol.
Figure 8: Binary C2 protocol evolution shown through binary compilation times
If the CARBANAK project was centrally located and only the template binaries were delivered to the operators, it would be expected that sample compile times should fall in line with the evolution of the binary protocol. Except for one sample that implements what we call “version 3” of the protocol, this is how our timeline looks. A probable explanation for the date not lining up for version 3 is that our sample set was not wide enough to include the first sample of this version. This is not the only case we found of an outdated protocol being implemented in a sample; Figure 9 shows another example of this.
Figure 9: CARBANAK sample using outdated version of binary protocol
In this example, a CARBANAK sample found in the wild was using protocol version 4 when a newer version had already been available for at least two months. This would not be likely to occur if the source code were kept in a single, central location. The rapid-fire fine tuning of template binaries using preprocessor parameters, combined with several samples of CARBANAK in the wild implementing outdated versions of the protocol indicate that the CARBANAK project is distributed to operators and not kept centrally.
Names of Previously Unidentified Commands
The source code revealed the names of commands whose names were previously unidentified. In fact, it also revealed commands that were altogether absent from the samples we previously blogged about because the functionality was disabled. Table 1 shows the commands whose names were newly discovered in the CARBANAK source code, along with a summary of our analysis from the blog post.
Table 1: Command hashes previously not identified by name, along with description from prior FireEye analysis
The msgbox command was commented out altogether in the CARBANAK source code, and is strictly for debugging, so it never appeared in public analyses. Likewise, the ifobs command did not appear in the samples we analyzed and publicly documented, but likely for a different reason. The source code in Figure 10 shows the table of commands that CARBANAK understands, and the ifobs command (0x6FD593) is surrounded by an #ifdef, preventing the ifobs code from being compiled into the backdoor unless the ON_IFOBS preprocessor parameter is enabled.
Figure 10: Table of commands from CARBANAK tasking code
One of the more interesting commands, however, is tinymet, because it illustrates how source code can be both helpful and misleading.
The tinymet Command and Associated Payload
At the time of our initial CARBANAK analysis, we indicated that command 0xB0603B4 (whose name was unknown at the time) could execute shellcode. The source code reveals that the command (whose actual name is tinymet) was intended to execute a very specific piece of shellcode. Figure 12 shows an abbreviated listing of the code for handling the tinymet command, with line numbers in yellow and selected lines hidden (in gray) to show the code in a more compact format.
Figure 11: Abbreviated tinymet code listing
The comment starting on line 1672 indicates:
tinymet command
Command format: tinymet {ip:port | plugin_name} [plugin_name]
Retrieve meterpreter from specified address and launch in memory
On line 1710, the tinymet command handler uses the single-byte XOR key 0x50 to decode the shellcode. Of note, on line 1734 the command handler allocates five extra bytes and line 1739 hard-codes a five-byte mov instruction into that space. It populates the 32-bit immediate operand of the mov instruction with the socket handle number for the server connection that it retrieved the shellcode from. The implied destination operand for this mov instruction is the edi register.
Our analysis of the tinymet command ended here, until the binary file named met.plug was discovered. The hex dump in Figure 12 shows the end of this file.
Figure 12: Hex dump of met.plug
The end of the file is misaligned by five missing bytes, corresponding to the dynamically assembled mov edi preamble in the tasking source code. However, the single-byte XOR key 0x50 that was found in the source code did not succeed in decoding this file. After some confusion and further analysis, it was realized that the first 27 bytes of this file are a shellcode decoder that looked very similar to call4_dword_xor. Figure 13 shows the shellcode decoder and the beginning of the encoded metsrv.dll. The XOR key the shellcode uses is 0xEF47A2D0 which fits with how the five-byte mov edi instruction, decoder, and adjacent metsrv.dll will be laid out in memory.
Figure 13: Shellcode decoder
Decoding yielded a copy of metsrv.dll starting at offset 0x1b. When shellcode execution exits the decoder loop, it executes Metasploit’s executable DOS header.
Ironically, possessing source code biased our binary analysis in the wrong direction, suggesting a single-byte XOR key when really there was a 27-byte decoder preamble using a four-byte XOR key. Furthermore, the name of the command being tinymet suggested that the TinyMet Meterpreter stager was involved. This may have been the case at one point, but the source code comments and binary files suggest that the developers and operators have moved on to simply downloading Meterpreter directly without changing the name of the command.
Conclusion
Having access to the source code and toolset for CARBANAK provided us with a unique opportunity to revisit our previous analysis. We were able to fill in some missing analysis and context, validate our deductions in some cases, and provide further evidence in other cases, strengthening our confidence in them but not completely proving them true. This exercise proves that even without access to the source code, with a large enough sample set and enough analysis, accurate deductions can be reached that go beyond the source code. It also illustrates, such as in the case of the tinymet command, that sometimes, without the proper context, you simply cannot see the full and clear purpose of a given piece of code. But some source code is also inconsistent with the accompanying binaries. If Bruce Lee had been a malware analyst, he might have said that source code is like a finger pointing away to the moon; don’t concentrate on the finger, or you will miss all that binary ground truth. Source code can provide immensely rich context, but analysts must be cautious not to misapply that context to binary or forensic artifacts.
In the next and final blog post, we share details on an interesting tool that is part of the CARBANAK kit: a video player designed to play back desktop recordings captured by the backdoor.