A Tale of a Quick Evening Malware Reversing Session | Version | N/A | |
---|---|---|---|
Updated | |||
Author | Joshua Finley | License | DBE |
This post documents a quick analysis of a heavily obfuscated malware variant I discovered in the wild, featuring multiple layers of packing and encryption. Through a simple but careful unpacking process, the malware was revealed to be PureLogStealer, demonstrating obfuscation techniques including homoglyphs, multi-stage payloads, and encryption. This writeup briefly documents the sample and the reversing session.
1. Introduction
Last night, I stumbled across what appeared to be a suspicious file in the wild. When I happen upon live samples by accident, I often get fixated on understanding them, and this situation was no different (resulting in the rest of the evening escaping me). This post documents the analysis process and unpacking of what turned out to be multiple layers of obfuscated code leading to a credential-stealing payload.
2. Initial Discovery
The sample was first noticed when a friend reached out to me suspecting that a local company’s website had been compromised. The site was displaying a “cloudflare verification” error. When I attempted to access the site over a VPN, the lure was gone, and the website loaded as normal.
Fortunately, my friend had already copied the page’s payload, which was constructed as a prompt for the user to use Win+R to paste in a benign-looking string:
I am not a robot: Cloudflare Verification ID: ...
However, the webpage obscured the actual text, meaning that whenever the text was copied, an mshta command to pull a file from an unknown domain would actually be placed in the clipboard. Combined with the use of Win+R, this would lead to the first stage of the malware being downloaded and executed on the victim’s computer.
3. First Impressions and Initial Analysis
The first thing that caught my attention was that Windows Defender didn’t flag the file - a common indicator of either a novel threat or good obfuscation. Initial inspection showed the file had ID3 magic bytes (typically associated with MP3 files), but actually contained what appeared to be JavaScript code mixed with binary data.
File characteristics:
- ID3 magic byte header
- Contains JavaScript tags
- Binary data interspersed throughout
This mixing of file types is a classic obfuscation technique. Attackers often disguise malicious code as media files or other benign formats to evade detection.
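The mismatch described above (an ID3 header on a file that also carries script tags) is easy to check for mechanically. The following is a minimal triage sketch, not the tooling used in this session; the function name `triage` and the synthetic `sample` buffer are made up for illustration.

```python
import re

ID3_MAGIC = b"ID3"  # tag identifier at the start of an ID3v2-tagged MP3
SCRIPT_TAG = re.compile(rb"<script\b", re.IGNORECASE)


def triage(data: bytes) -> dict:
    """Report whether a buffer claims to be an MP3 yet embeds <script> tags."""
    has_id3 = data.startswith(ID3_MAGIC)
    script_offsets = [m.start() for m in SCRIPT_TAG.finditer(data)]
    return {
        "id3_header": has_id3,
        "script_tag_offsets": script_offsets,
        # A media header combined with script tags is the red flag.
        "suspicious": has_id3 and bool(script_offsets),
    }


# Synthetic stand-in for the real sample: fake header, padding, script, junk.
sample = b"ID3\x04\x00\x00" + b"\x00" * 16 + b"<script>eval('...')</script>" + b"\xff" * 8
print(triage(sample))
```

A check like this only flags the format mismatch; it says nothing about what the embedded script does.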
4. Deobfuscation: Layer 1
Taking a closer look at the file, I noticed two large blocks of junk data surrounding what appeared to be the actual payload. After removing these and the <script> tags, and changing eval to Wscript.echo, I was able to get the script to output a secondary encoded payload.
This is a common technique in modern malware - the first layer is often just a simple wrapper designed to throw off automated analysis and basic signature detection. Additionally, Microsoft, in their unbounded cleverness, designed mshta to ignore bad data and still execute valid script blocks in files. As demonstrated here, this is extremely valuable for malware developers, as they can fill the rest of the file with whatever junk data they please, throwing off antivirus scanning.
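The manual cleanup described above (drop the junk, keep the script body, swap eval for an echo) can be sketched in a few lines. This is an illustrative approximation rather than the exact steps taken; `extract_scripts` and `neuter` are hypothetical helper names, and the regexes assume the script tags and the eval call appear in plain, unobfuscated form.

```python
import re
from typing import List

SCRIPT_BLOCK = re.compile(
    rb"<script\b[^>]*>(.*?)</script\s*>", re.IGNORECASE | re.DOTALL
)
EVAL_CALL = re.compile(r"\beval\s*\(")


def extract_scripts(data: bytes) -> List[str]:
    """Return the bodies of all <script> blocks, ignoring surrounding junk bytes."""
    # latin-1 maps every byte to a character, so decoding never fails on junk.
    return [m.group(1).decode("latin-1") for m in SCRIPT_BLOCK.finditer(data)]


def neuter(script: str) -> str:
    """Replace eval( with WScript.Echo( so the next stage is printed, not run."""
    return EVAL_CALL.sub("WScript.Echo(", script)


# Synthetic example: junk bytes on both sides of a tiny script block.
blob = b"\x00\x01junk<script>eval(a+b)</script>\xff\xfejunk"
for body in extract_scripts(blob):
    print(neuter(body))
```

The rewritten script still has to be run under a script host inside the analysis VM to produce the next stage; the Python side only prepares it.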
5. Deobfuscation: Layers 2 and 3
The secondary payload turned out to be more obfuscated code, this time PowerShell. After decoding it, I found yet another layer of PowerShell obfuscation that used the classic iex (Invoke-Expression) technique to execute dynamically generated code.
Interestingly, the script specifically called for 32-bit PowerShell, which is often done to evade certain security tools or because the payload has 32-bit dependencies.
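The post does not record how this particular PowerShell layer was encoded, so the sketch below shows one common case as an assumption: Base64 over UTF-16LE text, which is the form PowerShell's -EncodedCommand parameter expects. It also shows the usual trick of swapping iex for a print so the generated code is displayed instead of executed. Both helper names are hypothetical.

```python
import base64
import re

IEX_CALL = re.compile(r"\b(?:iex|Invoke-Expression)\b", re.IGNORECASE)


def decode_encoded_command(b64: str) -> str:
    """Decode Base64 text that wraps UTF-16LE PowerShell (assumed encoding)."""
    return base64.b64decode(b64).decode("utf-16-le")


def defang_iex(script: str) -> str:
    """Swap plain iex / Invoke-Expression for Write-Output.

    Real samples often build the string 'iex' dynamically, in which case
    this simple pattern will not match and manual inspection is needed.
    """
    return IEX_CALL.sub("Write-Output", script)


# Harmless round-trip demo with a made-up command.
demo = base64.b64encode("iex 'Get-Date'".encode("utf-16-le")).decode("ascii")
print(defang_iex(decode_encoded_command(demo)))
```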
6. Network Behavior and Isolating the Sample
At this point, I needed to observe the malware’s network behavior. After taking a VM snapshot to ensure I could revert to a clean state, I allowed the script to execute with network access.
The malware promptly downloaded a file from:
hxxps://<redacted>.shop/<random-alpha-numeric-string>.xlt
The long random alphanumeric string in the URL path is typically used for per-victim tracking or as a session ID. The domain itself is clearly suspicious, following a random-word generation pattern.
The downloaded file appeared to be an XLT (Excel template) file, but upon inspection, it was heavily obfuscated. Interestingly, it contained Cyrillic text - # Ключ (which translates to “key”) - and hints of an XOR-based decryption loop.
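For readers unfamiliar with XOR-based decryption loops, the general shape is shown below. The sample's actual key and loop structure are not reproduced in this post, so this sketch assumes the simplest variant, a repeating-key XOR; the key and plaintext in the demo are made up.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# XOR is its own inverse, so the same function encrypts and decrypts.
key = b"example-key"  # placeholder, not the key from the sample
ciphertext = xor_bytes(b"Write-Output 'stage'", key)
print(xor_bytes(ciphertext, key))
```

Because the operation is symmetric, recovering the key (or hooking the loop after it runs) is enough to recover the plaintext.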
7. Deobfuscation: Layers 4 and 5
After deciding to avoid manually reversing the heavy obfuscation in the large XLT file, I wrote a decryption hook to execute and capture the decrypted content. This yielded yet another PowerShell script, but this time with more readable code that included process injection techniques.
This new PowerShell script contained what appeared to be a Base64-encoded .NET assembly. Decoding this in CyberChef confirmed my suspicion - it was indeed an MZ executable (the standard Windows executable format).
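The same check done in CyberChef can be scripted. The sketch below decodes a Base64 string and confirms the MZ and PE signatures; it does not verify that the file is a .NET assembly, which would require parsing the CLR data directory. A handy tell before decoding: Base64 text starting with "TVqQ" usually decodes to the common MZ header bytes 4D 5A 90. The helper names and the fake stub are illustrative only.

```python
import base64
import struct


def looks_like_pe(blob: bytes) -> bool:
    """Check for the MZ signature and a PE signature at the e_lfanew offset."""
    if len(blob) < 0x40 or blob[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit offset to the PE header, stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", blob, 0x3C)
    return blob[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


def decode_candidate(b64_text: str) -> bytes:
    """Decode an embedded Base64 string into raw bytes."""
    return base64.b64decode(b64_text)


# Minimal fake stub for the demo: just enough bytes to pass both checks.
stub = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"
encoded = base64.b64encode(stub).decode("ascii")
print(looks_like_pe(decode_candidate(encoded)))
```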
At this point, the unpacking chain was becoming clear:
- Obfuscated JavaScript in ID3 file, executed by mshta
- Encoded PowerShell payload
- Secondary obfuscated PowerShell with IEX
- Downloaded XLT with XOR encryption
- PowerShell with process injection and Base64 .NET assembly
8. Analyzing the .NET Payload
After decoding the Base64 payload, I loaded the resulting executable into dnSpy for analysis. Windows Defender now recognized the threat, identifying it as “Upsiugyll.exe” - a sign that we’d reached a known malware component.
The .NET assembly was still heavily obfuscated, with signs of:
- More Base64 encoding
- Use of System.Threading, for the next execution stage
- AES encryption routines
- COM visible functions
- Exception handling code that appeared to be used for obfuscation
9. The Final Layer
After some debugging, I identified what appeared to be a key decryption function that was loading something into the current AppDomain. Setting a breakpoint at this location, I was able to capture the decrypted array before it was loaded.
This final layer turned out to be “Tgksfxaml.dll” with numerous junk namespaces - a common obfuscation technique to make static analysis more difficult. The large number of Windows .NET type references suggested this was the actual payload rather than another loader.
At this point, Windows Defender identified the malware as “Trojan:MSIL/PureLogStealer.ANNA!MTB” - a credential stealing malware family.
10. Technical Observations
Throughout this analysis, several sophisticated techniques stood out:
Multiple Layer Obfuscation: At least 7-8 distinct layers of encoding, encryption, and obfuscation.
Format Mixing: Using ID3 headers to disguise JavaScript content.
Homoglyphs: The initial file contained lookalike characters that appeared to be comments but were likely arguments to functions.
Evasion Techniques:
- Targeting 32-bit PowerShell specifically
- Heavy use of string obfuscation
- Junk code insertion
- Using Excel template formats to disguise executable code
Network Behavior: The malware reached out to a seemingly randomized domain to fetch additional payloads.
Multi-Stage Delivery: Breaking the malicious code into multiple stages makes detection more difficult, as no single component contains the full malicious capability.
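Of the techniques above, homoglyphs are the easiest to miss by eye and the easiest to surface with a script. The sketch below simply lists every non-ASCII character in a string with its Unicode name; it is a generic helper under that simple assumption, not a reconstruction of how the sample's lookalike characters were found, and the demo string is made up.

```python
import unicodedata
from typing import List, Tuple


def find_non_ascii(text: str) -> List[Tuple[int, str, str]]:
    """Return (index, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# "\u0430" is CYRILLIC SMALL LETTER A, visually identical to Latin "a".
suspect = "v\u0430r x = 1"
for index, char, name in find_non_ascii(suspect):
    print(index, repr(char), name)
```

Legitimate non-ASCII text (such as the Cyrillic comment mentioned earlier) will also be listed, so the output is a starting point for review rather than a verdict.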
11. Lessons Learned
This analysis reinforces several important principles of malware analysis:
Always Use Proper Isolation: Working in snapshottable VMs without network access except when specifically needed is crucial.
Be Methodical: Documenting each step and creating backups at various stages saved time and prevented having to restart the analysis.
Modern Malware is Dynamic: What initially appeared to be a simple file ended up having 7-8 layers of obfuscation, each requiring different techniques to unpack. However, obfuscations for interpreted languages leave a lot to be desired, enabling the de-obfuscation of the entire payload chain in only an evening.
Defender Catches Known Threats: While initial stages evaded detection, Windows Defender recognized the final payload once it was unpacked to a known signature.
12. Conclusion
This real-world malware analysis demonstrates some of the techniques used by modern threats. The PureLogStealer variant used multiple layers of obfuscation and encryption, as well as file format tricks, to evade detection and complicate analysis.
For those interested in malware analysis, this sample offers a quick case study in modern obfuscation techniques and the patient, methodical approach required to unpack them. As with the sample reviewed here, many modern malware deployments rely on several sequential stages and obfuscation layers to evade detection, but are very easy to unpack with some basic techniques.
Footnotes
- This analysis was performed in an isolated environment with proper safety precautions. The malware has been reported to relevant parties. Remember that analyzing malware without proper isolation can put your systems and data at risk.