Reversing Malware Command and Control: From Sockets to COM

August 16, 2010

Written by: Nick Harbour

On a Windows host there is more than one way for a program to communicate across the internet. When reverse engineering a piece of malware it is critically important to understand which API is being used and how it works, so that you can understand the data sent and received as well as the command structure and internal protocol, if applicable. The choice of networking API also affects how you craft your indicators (more on this later). I break Windows malware command and control communications into four API categories: Sockets, WinInet, URLMon and COM. The primary focus of this article is COM, since it is the rarest, least understood and most difficult to reverse engineer.


The first group, sockets, is the most widely known, as it is the same basic networking API found on Unix. It provides fairly raw, session-level access to a TCP or UDP session. It is provided primarily by the DLL ws2_32.dll, though it has a storied history. A socket object is created with a call to the function socket(), and it is in this call that the protocol, TCP or UDP, is specified. For a TCP client, like most malware backdoors, the function connect() must be called, which performs the TCP three-way handshake and paves the way for reading and writing data. A server program, on the other hand, will typically call bind() and listen() to wait for connections (although bind() can also be used on the client side to force a socket to a specific egress port). The functions send() and recv() are frequently used to transfer data across the wire for TCP connections, and sendto() and recvfrom() are traditionally used for UDP connections. From a malware analysis perspective this makes things simple: if I need to understand what information a piece of malware expects in response to its beacon packet, I can typically look for calls to the recv() function and see how the code that follows inspects the data it reads from the wire. On Windows, though, a socket can be used in most operations as if it were a file handle. For example, a process (like cmd.exe) could be spawned with a network socket specified as its standard input and standard output, and that process would automatically communicate across the network. To learn the socket API in depth I recommend reading the late W. Richard Stevens' classic book UNIX Network Programming Volume 1.

The drawback to using the socket API is that any higher-level protocol such as HTTP must be implemented manually. This allows a lot of flexibility in how the malware crafts its requests, but everything must be laid out explicitly. Malware authors will frequently introduce subtle differences between the way their HTTP traffic looks on the wire and the way traffic from most web browsers looks. These small differences make great network signatures!
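
To make this concrete, here is a minimal Python sketch of a hand-built HTTP request travelling through send() and recv(), followed by a simple signature check. Everything specific in it is an assumption for illustration: the host, the path, the User-Agent value, and the double space after GET, which stands in for the kind of subtle anomaly an author might introduce. socket.socketpair() is used in place of a real TCP connection so the example runs without a network.

```python
import socket


def build_request(host, path):
    """Hand-build an HTTP request, as a socket-based program must do."""
    lines = [
        # Hypothetical author quirk: two spaces after the method.
        "GET  {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "User-Agent: Mozilla/4.0",  # illustrative value only
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def looks_hand_crafted(raw):
    """Flag a request line containing a doubled space."""
    request_line = raw.split(b"\r\n", 1)[0]
    return b"  " in request_line


# A connected socket pair stands in for the client and the network sensor.
client, sensor = socket.socketpair()
client.sendall(build_request("example.com", "/beacon"))
captured = sensor.recv(4096)
client.close()
sensor.close()

print(looks_hand_crafted(captured))
```

A real signature would of course be written for a network sensor rather than in Python, but the logic is the same: find the byte-level detail that a browser would never produce.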


The WinInet API requires much less work than the socket API. It is a convenience API which simplifies interaction with higher-level protocols such as HTTP, FTP and even Gopher! To begin using the WinInet API to talk to a remote host you first need to call InternetOpen() followed by InternetConnect() or InternetOpenUrl(). Once the session has been opened and a connection established, to perform an HTTP request you can call the function HttpOpenRequest() to make a request handle and HttpSendRequest() to send the request. InternetReadFile() may then be called to read any response from the server. It can also be used to read data from an FTP session. The InternetWriteFile() function can also be used generically in place of any protocol-specific function where data needs to be sent across the wire. The HTTP functions give the programmer the ability to configure most of the options one would expect in an HTTP request header, such as the User-Agent string. Not all of these values have to be specified, though, and the system default will be used for any that are not. The malware programmer doesn't have as much flexibility to introduce subtle anomalies in the request structure as with sockets, and left with default settings the malware's HTTP requests will look virtually identical to legitimate Internet Explorer traffic on the network.


The URL Monikers API, provided by the DLL urlmon.dll, is yet another API for performing internet communications. On the back end it uses COM, but I choose to list it as a separate category from the later discussion of COM because using this API is an abstraction away from the ugly, obscure-to-reverse methods of direct COM interaction. The most popular function in the URLMon arsenal, from a malware perspective, is the one-punch knockout function URLDownloadToFile(). Rarely in the Win32 API does one function do so much work. You provide this function with a URL (for any protocol IE understands) and a filename, and it uses COM to force Internet Explorer to download the resource to the specified filename. This is very popular with dropper malware, which simply needs to download an EXE from a website and launch it. You might also run into the URLDownloadToCacheFile() function, which downloads a specified URL to the browser cache and returns the name of the file it downloaded to. URLOpenStream() and URLOpenPullStream() can be used to download a URL to a buffer in memory, but these functions are rarely used in malware.
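
As a first triage step, the function names mentioned so far can be turned into a small lookup that sorts a sample's import list into the four categories. The Python sketch below is only an illustration: the category table contains just the functions named in this article, the sample import list is invented, and the handling of the ANSI/Unicode "A" and "W" export suffixes is a simplifying assumption. A hit in the COM category only means that the COM library is being used, so the object being created still has to be identified by hand.

```python
# Functions named in this article, grouped by API category.
API_CATEGORIES = {
    "Sockets": {"socket", "connect", "bind", "listen",
                "send", "recv", "sendto", "recvfrom"},
    "WinInet": {"InternetOpen", "InternetConnect", "InternetOpenUrl",
                "HttpOpenRequest", "HttpSendRequest",
                "InternetReadFile", "InternetWriteFile"},
    "URLMon": {"URLDownloadToFile", "URLDownloadToCacheFile",
               "URLOpenStream", "URLOpenPullStream"},
    "COM": {"CoInitialize", "CoCreateInstance"},
}

ALL_NAMES = set().union(*API_CATEGORIES.values())


def base_name(import_name):
    """Drop a trailing A/W suffix when the remainder is a known function."""
    if import_name[-1:] in ("A", "W") and import_name[:-1] in ALL_NAMES:
        return import_name[:-1]
    return import_name


def classify_imports(imports):
    """Return {category: [matching import names]} for a list of imports."""
    hits = {}
    for name in imports:
        base = base_name(name)
        for category, functions in API_CATEGORIES.items():
            if base in functions:
                hits.setdefault(category, []).append(name)
    return hits


# Hypothetical import list, for illustration only.
sample_imports = ["InternetOpenA", "InternetOpenUrlA", "InternetReadFile",
                  "CreateFileA", "URLDownloadToFileW"]
print(classify_imports(sample_imports))
```

Imports resolved at run time will not show up in a static import list, so an empty result is not proof that a sample has no network capability.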

Controlling Internet Explorer with COM

Using COM for malware command and control has a number of advantages for the malware author. From a live response/volatile data/memory analysis perspective it obscures the source of malicious traffic, because all communication with the remote host is performed within an iexplore.exe process instead of the malware process. From a reverse engineering perspective it complicates things, because it is not immediately clear from static analysis that the malware is doing network communications at all. COM has fallen out of fashion with many of today's programmers and is less understood than many other technologies used in modern software. Malware analysts may be particularly unfamiliar with the inner workings of COM, depending on their programming background in the Win32 world.

Before any COM object can be instantiated, the function CoInitialize() must be called to initialize the COM library. Once the library is successfully initialized, a call can be made to CoCreateInstance() to create an instance of a particular COM object, such as Internet Explorer. To understand which object is being instantiated at this point you must inspect the first argument to the CoCreateInstance() function, a CLSID value that uniquely identifies a COM object. Here is the call to CoCreateInstance() found in a malware sample which used COM:



At first glance it's difficult to tell what object this function is instantiating because IDA simply shows you the data type of the first value instead of any type of meaningful interpretation of it. The first argument in this example code is a global variable (a constant actually) that exists at some location in the binary that has been given the label "rclsid" by IDA Pro. If you double click on this first argument IDA Pro will take you to view the variable as it would look in memory:



This CLSID value is recognized as a data structure by IDA and is shown here broken into its components. The human-readable way to represent this CLSID value is "{0002DF01-0000-0000-C000-000000000046}". It would be nice if there were an IDA script to automatically name these values, but in the meantime you can look them up manually in the registry. By opening regedit and browsing to the key HKEY_CLASSES_ROOT\CLSID you can see a rather long list of subkeys, each named with a CLSID value. By finding the subkey whose name matches the hexadecimal values you see in IDA Pro you can identify the name of the object the program is requesting, which is stored in the default value of the key. Here is a screen shot of regedit showing the key for the CLSID shown in the example code above:
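
The conversion from the 16 bytes stored in the binary to the braced string form is easy to get wrong by hand, because the first three fields are stored little-endian while the last eight bytes are stored in order. The following Python sketch shows the conversion using the standard uuid module. The raw bytes are the in-memory form of the CLSID discussed above, and the lookup table is a stand-in for the registry that holds only that one entry, with the object name as identified in this article.

```python
import uuid

# Stand-in for the registry lookup: only the CLSID identified in this article.
KNOWN_CLSIDS = {
    "{0002DF01-0000-0000-C000-000000000046}": "Internet Explorer",
}


def clsid_from_memory(raw):
    """Format 16 raw CLSID bytes (as laid out in the binary) as a string."""
    if len(raw) != 16:
        raise ValueError("a CLSID is exactly 16 bytes")
    # bytes_le handles the little-endian first three fields of the structure.
    return "{" + str(uuid.UUID(bytes_le=bytes(raw))).upper() + "}"


# Data1 (4 bytes), Data2 (2), Data3 (2), Data4 (8), as stored in memory.
raw = bytes.fromhex("01DF0200" "0000" "0000" "C000000000000046")

clsid = clsid_from_memory(raw)
print(clsid, KNOWN_CLSIDS.get(clsid, "unknown"))
```

The same function could be fed bytes read from a disassembler database or a hex editor; only the source of the 16 bytes changes.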



So it appears that our malware sample is making an instance of the Internet Explorer object! In order to use any functionality from this object the malware must make use of the last argument to the CoCreateInstance() function (which was labeled "ppv"). This parameter is a pointer to a pointer, which receives the address of a newly allocated data structure containing all the pertinent data for this object, most notably function pointers. The list of functions available is documented here: http://msdn.microsoft.com/en-us/library/bb159694.aspx. To see the Internet Explorer object in action at the source code level please take a look at this article: http://support.microsoft.com/kb/167658.

The most important function call to this COM object we need to recognize as reverse engineers is the call to the function Navigate() or Navigate2(). This is what actually accomplishes the HTTP request which may also contain POST data or a parametrized GET, and thus transmit data to the remote command and control server. The following screen fragment shows the call to the Navigate() method of the IE COM object as seen from IDA Pro:



In this fragment we see the ppv value being accessed, which is what was populated from the call to CoCreateInstance(). A function is being called from within this data structure.

This function is at offset 2Ch from the beginning of the ppv data structure, and you can tell that the binary is calling this function because the call instruction accesses this specific offset from the beginning of the structure ("call dword ptr [ecx+2Ch]"). There is no clear-cut way to go directly from the Microsoft documentation of this COM interface to the actual internal offset used by each member function, so I made a small test application that simply references each function. I then disassembled my test application and could see the clear mapping of offsets to member functions. The following table is a mapping of the member functions commonly used by malware and their associated offsets as seen in a call instruction:
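
The offset arithmetic can also be sketched in a few lines of Python. The sketch rests on two assumptions: the sample is a 32-bit binary, so each function pointer occupies 4 bytes, and the method order is the standard IUnknown and IDispatch layout followed by the first methods of IWebBrowser. That order is consistent with the 2Ch offset for Navigate() described above, but it should be verified against the SDK headers before being extended. Later methods such as Navigate2() are deliberately left out rather than guessed.

```python
POINTER_SIZE = 4  # assumption: 32-bit binary, 4-byte function pointers

# Assumed declaration order, stopping at Navigate().
METHOD_ORDER = [
    # IUnknown
    "QueryInterface", "AddRef", "Release",
    # IDispatch
    "GetTypeInfoCount", "GetTypeInfo", "GetIDsOfNames", "Invoke",
    # First methods of IWebBrowser
    "GoBack", "GoForward", "GoHome", "GoSearch", "Navigate",
]


def offset_of(method):
    """Offset of a method's function pointer from the start of the table."""
    return METHOD_ORDER.index(method) * POINTER_SIZE


def method_at(offset):
    """Method name for an offset seen in a call instruction, or None."""
    index, remainder = divmod(offset, POINTER_SIZE)
    if remainder or not 0 <= index < len(METHOD_ORDER):
        return None
    return METHOD_ORDER[index]


print(hex(offset_of("Navigate")), method_at(0x2C))
```

With a lookup like this, an offset such as the one in "call dword ptr [ecx+2Ch]" can be translated back into a method name while reading the disassembly.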



Final Thoughts

My goal with this article was to make more malware analysts and incident responders aware of the spectrum of communication techniques available to malware authors and to make analysts recognize and be able to begin reverse engineering the COM technique specifically. There is certainly a need for more tools and techniques in this area but it all begins with awareness and understanding. One thing to take away from this is that as a malware analyst it is not enough to simply know assembly language. You must gain a working knowledge of the APIs and you must be rigorous, if not downright tenacious, in your appetite to learn new facets of the APIs as you encounter them. If a malware analyst disregards calls such as CoCreateInstance() without deeper inspection due to their lack of understanding of it, they potentially will miss a critically important component of the malware's operation.
