Cybercriminals are always innovative and fast in finding new tricks to bypass security solutions, and sandboxes are no exception. If you look at today's tricks, the majority belong to the group of environment checks. The malware detects that it is not running on the real target system but rather in a sandbox and therefore hides its real behavior.
However, what if the sandbox does not know how to execute the sample at all or if it does not find the payload?
This blog post will outline some advanced attacks which fall into this category and show how Joe Sandbox can handle these evasions.
PDFs have been used for years to deliver malware to endpoints, mostly through exploits. The shellcode inside a PDF is the trigger used to download and install second-stage malware. However, these days PDFs are also often used to just deliver a link.
When the victim clicks on the link, the malware is downloaded via a web browser and then installed.
Given this common scenario, the goal of a sandbox is to precisely simulate this behavior.
Sandbox UI automation 101
To be able to automate the user interaction, the sandbox has to first find the link in the PDF. There are two ways to do so:
- Parse the PDF and search for links
- Click on the link
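To make the first option concrete, here is a minimal sketch of naive link extraction. It assumes the link is stored as a plain, uncompressed /URI action; the function name and the sample bytes are hypothetical and only serve as an illustration, not as the parser of any particular product.

```python
import re
import zlib
from typing import List

# Matches plain-text URI actions such as: /URI (http://example.test/a.exe)
URI_PATTERN = re.compile(rb"/URI\s*\(([^)]*)\)")


def extract_plain_uris(pdf_bytes: bytes) -> List[str]:
    """Return URIs that appear as uncompressed /URI actions in the raw bytes."""
    return [m.decode("latin-1") for m in URI_PATTERN.findall(pdf_bytes)]


# Hypothetical PDF fragments, for illustration only.
plain = b"<< /Type /Action /S /URI /URI (http://example.test/invoice.exe) >>"
# The same action, but stored inside a deflate-compressed stream.
packed = b"stream\n" + zlib.compress(plain) + b"\nendstream"

print(extract_plain_uris(plain))
print(extract_plain_uris(packed))
```

The first call finds the link, while the second one comes back empty because the action is no longer visible in the raw bytes. A real parser would have to decompress, decrypt and de-obfuscate every object before it could rely on such a search.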
As you can observe, link extraction via parsing the PDF is not really the solution. How about clicking on the link? This is also non-trivial because Adobe Reader uses its own UI elements. Windows UI Automation (UIA) does not help here, and the UISpy tool only detects the other PDF page but not the link button:
So how does Joe Sandbox solve this? First, it lays a grid over the PDF page and determines whether each cross point is worth clicking. It then simulates a click on each interesting cross point and watches the Adobe Reader process for any events.
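The grid idea can be sketched in a few lines. This is only an illustration under stated assumptions: the page is simulated, the "interesting" test (a point that lies on non-background content) is a placeholder heuristic, and `click` and `event_seen` are hypothetical callbacks standing in for real input injection and process monitoring.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Point = Tuple[int, int]
Rect = Tuple[int, int, int, int]  # left, top, right, bottom


def grid_points(left: int, top: int, width: int, height: int, step: int) -> Iterator[Point]:
    """Yield the cross points of a grid laid over the page, row by row."""
    for y in range(top + step // 2, top + height, step):
        for x in range(left + step // 2, left + width, step):
            yield (x, y)


def click_until_event(points: Iterable[Point],
                      is_interesting: Callable[[Point], bool],
                      click: Callable[[Point], None],
                      event_seen: Callable[[], bool]) -> Optional[Point]:
    """Click interesting points until the watched process reacts."""
    for point in points:
        if not is_interesting(point):
            continue
        click(point)
        if event_seen():
            return point  # a button was hit, stop the simulation
    return None


def inside(point: Point, rect: Rect) -> bool:
    x, y = point
    left, top, right, bottom = rect
    return left <= x < right and top <= y < bottom


# Simulated 400x300 page: a text block and a link button (hypothetical layout).
TEXT: Rect = (20, 20, 380, 100)
BUTTON: Rect = (120, 200, 280, 240)

clicks = []
hit = click_until_event(
    grid_points(0, 0, 400, 300, 50),
    is_interesting=lambda p: inside(p, TEXT) or inside(p, BUTTON),
    click=clicks.append,
    event_seen=lambda: inside(clicks[-1], BUTTON),
)
print(hit, len(clicks))
```

In this simulation the clicks on the text block trigger nothing, and the loop stops at the first cross point that lands on the button. A smaller grid step finds smaller buttons at the cost of more clicks.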
If a button is reached and clicked successfully, the click simulation is stopped. Right afterwards, our OCR UI engine takes over.
OCR based UI Automation
Using the above-mentioned technique, Joe Sandbox's PDF automation has successfully clicked the link. As a result, the operating system opens the local browser and, since the link points to a file, the file is downloaded:
As a next step, the sandbox needs to execute the downloaded sample. Of course, the most straightforward technique for the "lazy" would be to locate the temporary file on disk and then launch it. However, we have seen some malware which checks if the parent process is the browser and not e.g. Windows Explorer. Therefore, the only way is to continue with UI automation.
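To see why launching the file directly gives the sandbox away, consider a simplified model of such a parent process check. The process table, the names and the browser list below are made up for illustration; a real sample would query the operating system instead.

```python
from typing import Dict, NamedTuple


class Proc(NamedTuple):
    name: str
    ppid: int  # process ID of the parent


# Illustrative list of browser process names (an assumption, not exhaustive).
BROWSERS = {"iexplore.exe", "chrome.exe", "firefox.exe"}


def launched_by_browser(pid: int, table: Dict[int, Proc]) -> bool:
    """Return True if the parent of `pid` is a known browser process."""
    proc = table.get(pid)
    if proc is None:
        return False
    parent = table.get(proc.ppid)
    return parent is not None and parent.name.lower() in BROWSERS


# Sample started through the browser's download UI.
via_ui = {100: Proc("iexplore.exe", 1), 200: Proc("sample.exe", 100)}
# Sample launched from disk, so Windows Explorer becomes the parent.
from_disk = {50: Proc("explorer.exe", 1), 200: Proc("sample.exe", 50)}

print(launched_by_browser(200, via_ui))
print(launched_by_browser(200, from_disk))
```

Only the first process tree passes the check, which is why the sample has to be started through the browser's own download dialog.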
Again, Windows UI Automation and similar techniques do not help. Our guess is that Microsoft protected some of the buttons from being clicked programmatically for security reasons.
Joe Sandbox solves this problem with a unique UI automation approach based on optical character recognition (OCR). The engine works like this:
- Find interesting top level window
- Perform OCR
- Compare detected word with a predefined button list
- For each match click on the word
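The matching and clicking steps above can be sketched as follows. The OCR stage itself is not shown: the sketch assumes some OCR engine has already returned words with bounding boxes, and the `OcrWord` type, the button list and the coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class OcrWord:
    text: str
    left: int
    top: int
    width: int
    height: int


# Hypothetical predefined button list.
BUTTON_WORDS = {"run", "open", "save", "ok", "yes"}


def click_targets(words: Iterable[OcrWord],
                  button_words: Set[str] = BUTTON_WORDS) -> List[Tuple[str, int, int]]:
    """Return (label, x, y) click positions for every word that matches a button."""
    targets = []
    for word in words:
        label = word.text.strip(" .:!?").lower()
        if label in button_words:
            # Click the center of the word's bounding box.
            targets.append((label,
                            word.left + word.width // 2,
                            word.top + word.height // 2))
    return targets


# Made-up OCR result for a browser download bar.
detected = [
    OcrWord("invoice.exe", 300, 700, 110, 20),
    OcrWord("Run", 600, 700, 40, 20),
    OcrWord("Save", 660, 700, 50, 20),
    OcrWord("Cancel", 730, 700, 60, 20),
]

for label, x, y in click_targets(detected):
    print(label, x, y)
```

Because the match works on what is visible on screen rather than on UI framework objects, the same logic applies to any application that draws readable button labels.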
During analysis this looks like this:
The full behavior due to the simulation can be easily seen in the process startup overview:
Please note that this technology is independent of any UI framework used by any application. It is fully generic and clicks on anything which looks interesting. Below you find an example of a URL analysis:
Joe Sandbox does not go the lazy way. In contrast to many other solutions which try to extract links via PDF parsing, Joe Sandbox uses UI automation to extract them, no matter if the link is encrypted, obfuscated or hidden. To trigger the download of resources, it uses a unique, generic OCR-based UI automation approach which precisely simulates a user.