Thoughts on cyber threat intelligence, malware analysis, and other things

Detecting (Some) Malicious Office Documents Using Sysmon

Recently I've been collaborating with a friend and colleague, Roberto - @Cyb3rWard0g, to come up with ideas for detecting different attack scenarios using tools like Sysmon. Roberto set up a GitHub repository - The ThreatHunter-Playbook - to consolidate some of those methods and ideas. I highly recommend that anyone interested in threat hunting with open source or freely available tools check it out.

Based on some of the work we've been doing, I thought I'd share some observations on detecting malicious Microsoft Office documents using Sysmon. For a great tutorial on setting up a threat-hunting lab, check out Roberto's blog post Setting up a Pentesting... I mean, a Threat Hunting Lab.

Malicious Office Documents

Even in 2017, malicious Microsoft Office documents are still an extremely common delivery mechanism for malware. The typical scenario involves leveraging a malicious document as a dropper/downloader to install one or more additional pieces of malware. From my experience, these malicious documents can be (generally) divided into three main categories:

  • Documents leveraging malicious macros to launch a command shell - either cmd.exe or PowerShell
  • Documents with malicious scripts embedded as objects
  • Documents that use malicious macros - or possibly leverage an exploited vulnerability - to execute shellcode

Please note, this is not intended to be an all-inclusive list, just some general categories based on what I see most often in my environment.

In quite a few cases, we can detect each of these categories using Sysmon and a little understanding of how each method works. I'll cover the first two categories here and save the third for another post, as it may require a little more explanation.

Macros Launching a Command Shell and Documents with Embedded Scripts

Generally speaking, this is the scenario I see most often in my environment. A user receives an email with an attached Word document that contains either an obfuscated macro or a script embedded as an OLE object, along with some lure enticing the user to click that magical "Enable Content" button. In the case of macros, the goal is usually to decode an obfuscated string containing a command to be invoked using cmd.exe or PowerShell. The embedded scripts I've seen typically attempt to download and execute additional malware and are often obfuscated in much the same way as macro content.

For detection purposes, we can target both of these use cases using Sysmon Event ID 1: Process Creation.

Looking for PowerShell and cmd

For macros attempting to execute commands via cmd or PowerShell, we'll be searching for a cmd.exe or PowerShell related process being spawned by an Office application. To take this a step further, you could search for any non-Office product launched by an Office application to catch some other edge cases that don't use cmd.exe or PowerShell. However, that would require white-listing quite a few applications (things like web browsers) based on your environment to reduce false positives.
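As a rough sketch of that broader "any non-Office child" idea, the check below flags a process-creation event whenever an Office application launches something that is neither another Office application nor on an allowlist. The event is assumed to be a plain dict carrying the Sysmon ParentImage and Image fields; the executable names in OFFICE_APPS and DEFAULT_ALLOWLIST are illustrative assumptions, not a vetted list.

```python
import ntpath

# Assumed Office executable names; extend for the applications you deploy.
OFFICE_APPS = frozenset({"winword.exe", "excel.exe", "powerpnt.exe", "outlook.exe"})

# Hypothetical allowlist of children Office legitimately launches (e.g. browsers).
DEFAULT_ALLOWLIST = frozenset({"chrome.exe", "firefox.exe", "iexplore.exe"})


def exe_name(path):
    """Return the lower-cased file name of a Windows path ('' if missing)."""
    return ntpath.basename(path or "").lower()


def is_unexpected_office_child(event, allowlist=DEFAULT_ALLOWLIST):
    """Return True when an Office parent spawns a child that is not allowlisted.

    `event` is a dict with the Sysmon Event ID 1 fields ParentImage and Image.
    Names are compared case-insensitively, so "cmD.eXE" is treated as cmd.exe.
    """
    parent = exe_name(event.get("ParentImage"))
    child = exe_name(event.get("Image"))
    if parent not in OFFICE_APPS:
        return False
    return child not in OFFICE_APPS and child not in allowlist
```

The allowlist is the part that needs tuning per environment; anything missing from it shows up as a false positive, which is the trade-off described above.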

To test, we'll use a very simple macro proof of concept that simply launches the calculator from the command line. The code is:

Public Sub Document_Open()
    bad_command = "cmD.eXE /c calc.exe"
    Result = CreateObject("Wscript.Shell").Run(bad_command, False)
End Sub

To isolate the events we're interested in, for testing purposes, we'll use a custom Sysmon configuration that will only target Microsoft Word. Again, take a look at Roberto's blog post - Setting up a Pentesting... I mean, a Threat Hunting Lab - for some help in setting up a lab environment to collect and visualize Sysmon events.

<Sysmon schemaversion="3.30">
   <!-- Capture all hashes -->
   <HashAlgorithms>*</HashAlgorithms>
   <EventFiltering>
      <!-- Event ID 1 == Process Creation. -->
      <ProcessCreate onmatch="include">
        <ParentImage condition="end with">winword.exe</ParentImage>
      </ProcessCreate>
      <!-- Event ID 2 == File Creation Time. -->
      <FileCreateTime onmatch="include"/>
      <!-- Event ID 3 == Network Connection. -->
      <NetworkConnect onmatch="include"/>
      <!-- Event ID 5 == Process Terminated. -->
      <ProcessTerminate onmatch="include"/>
      <!-- Event ID 6 == Driver Loaded. -->
      <DriverLoad onmatch="include"/>
      <!-- Event ID 7 == Image Loaded. -->
      <ImageLoad onmatch="include"/>
      <!-- Event ID 8 == CreateRemoteThread. -->
      <CreateRemoteThread onmatch="include"/>
      <!-- Event ID 9 == RawAccessRead. -->
      <RawAccessRead onmatch="include"/>
      <!-- Event ID 10 == ProcessAccess. -->
      <ProcessAccess onmatch="include"/>
      <!-- Event ID 11 == FileCreate. -->
      <FileCreate onmatch="include"/>
      <!-- Event ID 12,13,14 == RegObject added/deleted, RegValue Set, RegObject Renamed. -->
      <RegistryEvent onmatch="include"/>
      <!-- Event ID 15 == FileStream Created. -->
      <FileCreateStreamHash onmatch="include"/>
      <!-- Event ID 17,18 == PipeEvent. -->
      <PipeEvent onmatch="include"/>
   </EventFiltering>
</Sysmon>

We can tell Sysmon to use the new config with the command sysmon.exe -c <filename>.xml.
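Before loading a hand-edited config, it can help to confirm that the XML is at least well-formed and to see which rule groups it defines. The helper below is a minimal sketch of that check using only the Python standard library; SAMPLE is a shortened, assumed stand-in for a real config, and the function does not validate against the Sysmon schema itself.

```python
import xml.etree.ElementTree as ET

# Shortened, assumed example config; not the full configuration from this post.
SAMPLE = """<Sysmon schemaversion="3.30">
  <HashAlgorithms>*</HashAlgorithms>
  <EventFiltering>
    <ProcessCreate onmatch="include">
      <ParentImage condition="end with">winword.exe</ParentImage>
    </ProcessCreate>
    <NetworkConnect onmatch="include"/>
  </EventFiltering>
</Sysmon>"""


def summarize_sysmon_config(xml_text):
    """Parse a Sysmon config and return (rule tag, onmatch) pairs.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed,
    for example when a closing tag is missing.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "Sysmon":
        raise ValueError("root element is not <Sysmon>")
    filtering = root.find("EventFiltering")
    if filtering is None:
        return []
    return [(rule.tag, rule.get("onmatch")) for rule in filtering]
```

Calling summarize_sysmon_config(SAMPLE) lists the ProcessCreate and NetworkConnect rule groups along with their onmatch values, which makes a missing closing tag or a misplaced rule easy to spot before running sysmon.exe -c.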

With that configuration in place, we can open the Word document with the malicious macro content and click "Enable Content". If you're watching your Sysmon logs in the event viewer, you should immediately see a new event for event ID 1.

Event ID 1

Several fields here are valuable as search parameters, namely ParentImage, Image, and CommandLine. For this use case, we're looking for an Office application spawning a shell using cmd.exe or PowerShell. If you're pushing your Sysmon data to an ELK installation, we can build a simple search to find any events where the parent image is an Office application and either the target image is cmd.exe or the command line field contains powershell.

event_data.ParentImage: office AND (event_data.Image: cmd.exe OR event_data.CommandLine: powershell)
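If you'd rather run the search from a script than from Kibana, the same query string can be wrapped in an Elasticsearch query_string request body. The snippet below is only a sketch: the field names come from the search above, while the helper name build_search_body and the size default are my own assumptions, and it stops short of sending the request anywhere.

```python
import json

# The search from this post, kept as a single Lucene query string.
OFFICE_SHELL_QUERY = (
    "event_data.ParentImage: office AND "
    "(event_data.Image: cmd.exe OR event_data.CommandLine: powershell)"
)


def build_search_body(query, size=50):
    """Wrap a Lucene query string in an Elasticsearch search request body."""
    return {
        "size": size,
        "query": {"query_string": {"query": query}},
    }


if __name__ == "__main__":
    # Print the JSON that would be sent to the _search endpoint.
    print(json.dumps(build_search_body(OFFICE_SHELL_QUERY), indent=2))
```

The resulting JSON could be posted to the _search endpoint of whichever index holds your Sysmon events; the index name depends on how your log shipper is configured.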

In my environment, the search results look like the image below.
ELK Search Results

Even in a production environment, this produces a pretty high fidelity result with a very low false positive rate.

Looking for Embedded Script Content

We can identify embedded scripts using a nearly identical method. When a user double-clicks an icon for an embedded VBS file in a Word document, Word writes the embedded file to the temp folder and executes it with wscript.exe or cscript.exe. We just need to modify our search to look for wscript or cscript as the target image.

To test this we'll use the same Sysmon configuration and the following PoC code that we'll embed into a Word document as an object.

dim objShell
set objShell = CreateObject("shell.application")
objShell.ShellExecute "calc.exe", "", "", "open", 1

Now if we open the document and double click the embedded object, we should again see a new Sysmon event.
Event ID 1

We can make a couple of simple changes to our original search to target these events.

event_data.ParentImage: office AND event_data.Image: (wscript.exe OR cscript.exe)

Again, in my environment the result looks like:
ELK Search Results

We can cover both use cases in one search by combining the two:
event_data.ParentImage: office AND (event_data.Image: (wscript.exe OR cscript.exe OR cmd.exe) OR event_data.CommandLine: powershell)
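For events exported out of ELK (or read straight from the event log), the combined search can be approximated offline. The sketch below assumes each event is a dict with the ParentImage, Image, and CommandLine fields and labels the hits by category. Like the search, it relies on "office" appearing somewhere in the parent path, so an Office install outside a "Microsoft Office" directory would be missed; the function names and labels are hypothetical.

```python
import ntpath

SCRIPT_HOSTS = {"wscript.exe", "cscript.exe"}


def label_event(event):
    """Approximate the combined search for one Event ID 1 record (a dict).

    Returns "embedded-script", "command-shell", or None when nothing matches.
    """
    parent = (event.get("ParentImage") or "").lower()
    image = ntpath.basename(event.get("Image") or "").lower()
    command_line = (event.get("CommandLine") or "").lower()
    # Like the search, this depends on "office" appearing in the parent path.
    if "office" not in parent:
        return None
    if image in SCRIPT_HOSTS:
        return "embedded-script"
    if image == "cmd.exe" or "powershell" in command_line:
        return "command-shell"
    return None


def triage(events):
    """Return (label, event) pairs for every event the rule matches."""
    hits = []
    for event in events:
        label = label_event(event)
        if label is not None:
            hits.append((label, event))
    return hits
```

Substring matching here is only an approximation of how Elasticsearch matches analyzed terms, so treat the output as a triage aid and not an exact replica of the search results.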

This returns both of the events in my environment.

Again, we end up with a high fidelity search result that can be used in a production environment to identify a large portion of malicious macro and embedded script activity.

In my next post I'll highlight a few observations from our research on the final use case.