T1132 Data Encoding — MITRE ATT&CK Technique Reference

Encoding is everywhere — and that is the problem

Base64 encoding is legitimate: MIME email attachments, JSON Web Tokens (JWTs), HTTP authentication headers, and API payloads all use it. Because of this ubiquity, Base64-encoded C2 traffic blends in with legitimate application data, and signature-based detection cannot distinguish malicious Base64 payloads from legitimate encoded content without contextual analysis. Steganography compounds the challenge by hiding encoded payloads inside image files, audio files, and Unicode text that appear benign. In June 2025, researchers discovered malicious JPEG images with Base64-encoded executables appended after the End-of-Image (EOI) marker, where they are invisible to image viewers but extractable by malware loaders. T1132 has two sub-techniques: Standard Encoding (T1132.001), which uses recognized encoding schemes, and Non-Standard Encoding (T1132.002), which uses custom or modified encoding systems.

T1132 falls under the Command and Control tactic (TA0011). The technique covers any encoding of C2 data that makes it harder to detect or analyze, whether using standard binary-to-text encoding systems (Base64, hexadecimal, ASCII, Unicode, MIME) or non-standard encoding that diverges from protocol specifications (custom XOR schemes, modified Base64 alphabets, proprietary encoding formats). The technique spans ESXi, Linux, Windows, and macOS platforms.

MITRE ATT&CK v18 documents dozens of threat groups and over 70 malware families using T1132, making it one of the most commonly observed C2 techniques. The technique is distinguished from T1573 (Encrypted Channel) in that encoding transforms data into a different representation rather than making it cryptographically unreadable — an analyst who captures encoded traffic can decode it without needing encryption keys, though they must first identify the encoding scheme being used.

The Two Sub-Techniques

T1132.001 — Standard Encoding

Adversaries encode C2 data using recognized encoding systems that adhere to existing protocol specifications. The dominant encoding is Base64, which represents binary data as text drawn from a 64-character ASCII alphabet. Base64 is ubiquitous in web applications (HTTP Basic authentication, data URIs, email MIME attachments, JSON Web Tokens, and API payloads), so Base64-encoded C2 traffic is nearly impossible to flag as suspicious based on encoding alone. Common implementations include Base64 encoding of exfiltrated data in HTTP POST bodies, Base64-encoded commands embedded in cookie values or URL parameters, hexadecimal encoding of shellcode and binary payloads, and gzip, which compresses data as well as re-encoding it and so reduces C2 payload sizes.

HAFNIUM used Base64 encoding in its 2021 exploitation of Microsoft Exchange Server zero-day vulnerabilities, encoding web shell commands to bypass WAF and IDS signatures. LockBit 3.0 uses Base64 encoding for configuration data and command parameters. The Redline infostealer uses a custom Base64 encoding approach documented by researchers in April 2024 for obfuscating stolen credential data before C2 transmission. The Kapeka backdoor (linked to Sandworm/GRU) uses Base64 encoding in its C2 communications. The Springtail group added a new Linux backdoor that encodes C2 traffic using standard Base64. The Gootloader malware encodes its multi-stage payloads using Base64, functioning as an "Initial Access as a Service" platform. Ryuk ransomware's GrimAgent loader uses Base64 encoding in its C2 protocol.

T1132.002 — Non-Standard Encoding

Adversaries encode C2 data using custom encoding schemes that diverge from standard protocol specifications. This includes XOR encoding with hardcoded or derived keys, modified Base64 alphabets (substituting characters to break standard decoders), custom binary-to-text conversions, proprietary encoding formats, steganographic encoding within image or media files, and Unicode steganography using invisible characters.
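As an illustration of the "modified Base64 alphabet" idea, the following minimal Python sketch swaps the standard alphabet for a custom one. The custom alphabet here (the standard one rotated by 13 positions) is a hypothetical example chosen for brevity, not one attributed to any malware family named in this article.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Hypothetical custom alphabet: the standard one rotated by 13 positions.
CUSTOM = STANDARD[13:] + STANDARD[:13]

TO_CUSTOM = str.maketrans(STANDARD, CUSTOM)
TO_STANDARD = str.maketrans(CUSTOM, STANDARD)


def custom_b64encode(data: bytes) -> str:
    """Standard Base64, then substitute every character via the custom alphabet."""
    return base64.b64encode(data).decode("ascii").translate(TO_CUSTOM)


def custom_b64decode(text: str) -> bytes:
    """Map the custom alphabet back to the standard one, then decode normally."""
    return base64.b64decode(text.translate(TO_STANDARD))


encoded = custom_b64encode(b"whoami")
# A stock decoder accepts the string but returns the wrong bytes.
wrong = base64.b64decode(encoded)
right = custom_b64decode(encoded)
```

The output still looks like Base64 to a casual observer, which is why a standard decoder that "succeeds" but yields garbage is a useful hint that the alphabet has been altered.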

The FSB's Snake malware used custom encoding combined with fragmentation protocols — traffic was encoded using proprietary methods layered on top of standard network protocols, making decoding impossible without reverse-engineering the malware. OilRig developed a novel C2 channel using steganography, hiding encoded commands within image files transmitted to compromised systems. The MoustachedBouncer group (targeting foreign diplomats in Belarus) uses non-standard encoding in its C2 communications. The ToddyCat APT uses non-standard encoding in multiple tools across its espionage campaigns. Earth Preta (Mustang Panda) and Hive0154 use non-standard encoding in campaigns targeting the US, Philippines, Pakistan, and Taiwan. DiceLoader (FIN7) uses custom encoding protocols, and the neo-reGeorg webshell tunneling tool implements non-standard encoding for its C2 channel.

A particularly sophisticated example emerged in October 2025: Veracode discovered an npm package (os-info-checker-es6) using Unicode steganography. Invisible Unicode variation selector characters encoded a Base64 string, which the package decoded and used to fetch a second-stage payload from a Google Calendar event. The Calendar event contained another encoded URL in its data-base-title attribute, which downloaded the final Base64-encoded malware. This multi-stage chain (Unicode steganography to Base64 to Google Calendar to Base64 to payload execution) illustrates how adversaries stack encoding techniques to evade each layer of defense.

How Data Encoding Works in Practice

Base64 in C2 Communications

The simplest and most common implementation: a malware implant collects data from the victim (credentials, screenshots, file listings), Base64-encodes it, and sends it as part of an HTTP request, whether in the URL, in a POST body, in a cookie value, or in a custom HTTP header. The C2 server Base64-decodes the data, processes the command results, Base64-encodes its response (new commands, configuration updates), and returns it in the HTTP response body. To defenders inspecting network traffic, the encoded data looks like legitimate Base64 content found in normal web traffic. The PowerShell -EncodedCommand parameter is a notorious vector: attackers pass a Base64-encoded script (the Base64 form of the UTF-16LE script text) to powershell.exe -enc, so command-line logging records only the encoded blob rather than the plaintext.
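For triage, an analyst can reverse a captured -EncodedCommand value offline. The sketch below is a minimal Python example under two assumptions: the command line is available as a single string, and the flag is written as -ec or as a prefix of -EncodedCommand (-e, -enc, and so on). The function name is illustrative.

```python
import base64
import re
from typing import Optional

# A flag name followed by a Base64-looking value.
FLAG_AND_VALUE = re.compile(r"\s-(\w+)\s+([A-Za-z0-9+/]{8,}={0,2})")


def decode_encoded_command(command_line: str) -> Optional[str]:
    """Return the plaintext script passed via -EncodedCommand, if any."""
    for flag, value in FLAG_AND_VALUE.findall(command_line):
        flag = flag.lower()
        # Accept -ec and any prefix of -EncodedCommand, but not e.g. -ExecutionPolicy.
        if flag == "ec" or "encodedcommand".startswith(flag):
            try:
                # PowerShell expects Base64 of the UTF-16LE script text.
                return base64.b64decode(value, validate=True).decode("utf-16-le")
            except ValueError:  # covers binascii.Error and UnicodeDecodeError
                return None
    return None


blob = base64.b64encode("whoami".encode("utf-16-le")).decode("ascii")
plaintext = decode_encoded_command(f"powershell.exe -NoProfile -enc {blob}")
```

Decoding as UTF-8 instead of UTF-16LE is a common mistake and leaves a NUL byte after every ASCII character.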

Steganographic Encoding

Steganography hides encoded data within seemingly benign files. Forcepoint X-Labs documented a Q3 2025 attack chain where obfuscated JavaScript email attachments download PNG images from compromised domains. The PNG files contain Base64-encoded .NET executables delimited by custom markers (BaseStart- and -BaseEnd). PowerShell extracts the encoded content between the markers, Base64-decodes it, and executes the resulting malware. In June 2025, researchers discovered JPEG images with Base64-encoded malicious payloads appended after the End-of-Image (EOI) marker — the image renders normally, but the appended data contains executable code invisible to image viewers and scanners that stop reading at the EOI marker. OilRig used steganographic C2 where commands were hidden in image files, and the StegBaus loader (documented by Unit 42) used steganography combined with AES encryption and multiple encoding layers to deliver DarkComet, LuminosityLink RAT, Pony, and other malware families.
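The marker-based extraction described above can be sketched in a few lines of Python. This is a defensive illustration only: it assumes the markers appear literally as BaseStart- and -BaseEnd somewhere in the file bytes, and the sample blob is synthetic rather than taken from the campaign.

```python
import base64
import re
from typing import List

# Non-greedy match between the two markers; '-' is not in the Base64 alphabet.
MARKED = re.compile(rb"BaseStart-(.*?)-BaseEnd", re.DOTALL)


def extract_marked_payloads(blob: bytes) -> List[bytes]:
    """Return every Base64 payload found between the markers, decoded."""
    payloads = []
    for match in MARKED.finditer(blob):
        try:
            payloads.append(base64.b64decode(match.group(1), validate=True))
        except ValueError:
            continue  # markers present, but the content is not valid Base64
    return payloads


# Synthetic sample: fake image bytes followed by a marked, encoded stub.
sample = b"\x89PNG\r\n\x1a\n...image data..." + b"BaseStart-" \
    + base64.b64encode(b"MZ\x90\x00demo") + b"-BaseEnd"
found = extract_marked_payloads(sample)
```

A decoded payload that starts with the bytes MZ is worth escalating, since that is the signature of a Windows PE file.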

XOR and Custom Encoding

XOR encoding is the most common non-standard encoding scheme. Adversaries XOR plaintext data with a key (single byte, multi-byte, or rotating) to produce encoded output that appears random. The key may be hardcoded in the malware, derived from system properties (machine GUID, hostname, timestamp), or exchanged during the initial C2 handshake. A December 2025 analysis documented a Python RAT distributed via malicious PyPI packages that uses custom encryption and XOR obfuscation for C2 communications, combined with Base64 encoding to bypass static detection tools. The RotaJakiro Linux backdoor uses custom XOR encoding combined with rotation algorithms. The OceanSalt campaign used non-standard encoding derived from source code of Chinese hacker groups. Custom encoding is harder to detect than standard Base64 because there are no standard decoders — analysts must reverse-engineer the malware to understand the encoding scheme.
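A minimal Python sketch of rotating-key XOR shows why this counts as encoding rather than encryption: the same function encodes and decodes, and a few bytes of known plaintext reveal the key. The key and message below are made-up values.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating (rotating) key. Applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"\x5a\xa5\x3c"          # hypothetical hardcoded 3-byte key
message = b"cmd=whoami"
encoded = xor_bytes(message, key)
decoded = xor_bytes(encoded, key)

# Known-plaintext recovery: XOR the encoded bytes with a guessed prefix.
recovered_key = xor_bytes(encoded[:3], b"cmd")
```

This is why analysts look for predictable protocol prefixes in captured traffic: if the first bytes of a message can be guessed, a short repeating key falls out directly.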

Multi-Layer Encoding Chains

Sophisticated adversaries stack multiple encoding layers. A typical chain might be: plaintext command, XOR-encoded with a rotating key, then Base64-encoded, then embedded in a legitimate-looking HTTP header or image file. Each layer must be peeled back in sequence to reach the original data. The Unicode steganography attack documented by Veracode chained four stages (Unicode variation selectors, Base64, a Google Calendar event used as a dead drop, and Base64 again) to hide a malicious payload in what appeared to be a simple npm package. Snake malware combined custom encoding with traffic fragmentation, sending encoded data split across multiple packets that had to be reassembled before decoding could begin.
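The "typical chain" above can be sketched as a pair of Python functions that apply and then peel the layers in reverse order. The header name X-Session-Token and the key are hypothetical; the point is only that decoding must undo each layer in the opposite order to which it was applied.

```python
import base64
from itertools import cycle


def _xor(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def wrap(command: str, key: bytes) -> str:
    """plaintext -> rotating XOR -> Base64 -> value of an HTTP-style header."""
    token = base64.b64encode(_xor(command.encode("utf-8"), key)).decode("ascii")
    return "X-Session-Token: " + token  # hypothetical header name


def unwrap(header_line: str, key: bytes) -> str:
    """Peel the layers in reverse: header value -> Base64 decode -> XOR."""
    _, _, value = header_line.partition(": ")
    return _xor(base64.b64decode(value), key).decode("utf-8")


key = b"\x13\x37\x42"  # hypothetical key
line = wrap("download /tmp/stage2", key)
command = unwrap(line, key)
```

Base64-decoding the header value alone yields bytes that look random, which is what stalls an analyst who has not yet recovered the XOR key.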

Why Data Encoding Matters

Encoding defeats content-based detection

Network IDS signatures that match on plaintext command strings, file headers, or known malicious patterns are rendered useless when C2 traffic is encoded. A Base64-encoded PowerShell download cradle looks nothing like the plaintext command it represents. An XOR-encoded C2 beacon contains no recognizable patterns. A steganographic payload hidden in a PNG image passes every file type validation check. Encoding is the fundamental technique that forces defenders from content-based detection to behavioral analysis.

Standard encoding blends with legitimate traffic

Base64 appears in every modern web application. HTTP headers, API responses, authentication tokens, and file attachments all use Base64 encoding as part of normal operations. An IDS rule that alerts on Base64 content would generate millions of false positives. This means that detecting malicious Base64-encoded C2 requires contextual analysis (where the encoded data appears, what process generated it, and what pattern of communications it represents) rather than detecting the encoding itself.

Non-standard encoding requires reverse engineering

Custom XOR schemes, modified character sets, and proprietary encoding formats cannot be decoded by standard tools. Defenders must first obtain the malware binary, reverse-engineer the encoding algorithm, extract any keys, and then build custom decoders. This creates a significant time delay between initial compromise and the ability to analyze captured C2 traffic, during which the adversary operates with impunity.

Steganography is an emerging delivery and C2 trend

Forcepoint's Q3 2025 analysis documented a rising trend of steganography-based attack chains. The combination of steganographic encoding with legitimate file formats (PNG, JPEG, audio) creates payloads that pass file type validation, bypass content scanning, and appear benign to manual analysis. As defenders improve detection of encoded scripts and encoded command-line parameters, adversaries are moving encoding into media files that receive less scrutiny.

Encoding is the first layer in defense evasion

T1132 frequently appears in combination with T1573 (Encrypted Channel), T1001 (Data Obfuscation), and T1071 (Application Layer Protocol). Adversaries encode data, then encrypt it, then send it over a legitimate protocol, creating multiple layers that each independently complicate detection. Defeating any one layer is insufficient; defenders must address the entire stack.

Real-World Case Studies

Case 1: HAFNIUM / Exchange Server Exploitation (2021)

HAFNIUM used Base64 encoding extensively during the exploitation of Microsoft Exchange Server zero-day vulnerabilities (ProxyLogon). Web shell commands were Base64-encoded to bypass web application firewalls and intrusion detection systems. The encoded commands enabled post-exploitation activities including credential dumping, data collection, and lateral movement across compromised Exchange environments. The campaign affected over 30,000 organizations in the United States alone. This case demonstrated that even simple standard encoding (Base64) is effective against signature-based defenses when combined with zero-day exploitation that bypasses the normal detection window.

Case 2: Steganographic Delivery Chains — Q3 2025

Forcepoint X-Labs documented a rising trend of email campaigns using obfuscated JavaScript attachments that download PNG images containing Base64-encoded .NET payloads. The attack chain begins with phishing emails disguised as invoices, quotes, or shipment alerts. The JavaScript attachments, when opened, use PowerShell to download PNG images from compromised domains. The PNG files contain Base64-encoded DLL or EXE binaries delimited by the markers BaseStart- and -BaseEnd. PowerShell extracts the content between these markers, Base64-decodes it, and executes the resulting .NET RATs and infostealers. The final payloads use process hollowing (injecting into RegASM.exe) and exfiltrate data to dynamic DNS servers via SMTP/FTP. This multi-stage encoding chain illustrates how adversaries combine JavaScript obfuscation, steganographic embedding, and Base64 encoding to bypass each layer of defense.

Case 3: OilRig Steganographic C2 (2020–2025)

OilRig (APT34/Iran) developed a novel C2 channel using steganography to hide encoded commands within image files transmitted between compromised systems and C2 infrastructure. The approach targets Middle Eastern telecommunications organizations and uses non-standard encoding to embed command data within the pixel data or metadata of image files, making the C2 traffic appear as normal image downloads. OilRig's Outer Space and Juicy Mix campaigns continued to use non-standard encoding in 2023-2024, with each campaign iteration introducing new encoding variations to evade detection signatures developed for previous campaigns.

Case 4: Unicode Steganography in npm Supply Chain (October 2025)

Veracode discovered the npm package os-info-checker-es6 using Unicode steganography for malicious payload delivery. The package used invisible Unicode variation selector characters (from the Variation Selectors Supplement block, U+E0100 to U+E01EF) to encode a Base64 string within what appeared to be a simple vertical bar character. A native binary decoded the Unicode steganography to extract the Base64 string, which was then decoded and used to fetch a second-stage payload from a Google Calendar event. The Calendar event contained another encoded URL in its data-base-title attribute, which when fetched and Base64-decoded, delivered the final malware payload. This four-stage chain (Unicode steganography, Base64, Google Calendar C2, Base64) represents the cutting edge of encoding-based evasion.
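To make the variation-selector trick concrete, here is a minimal Python sketch of one way bytes can ride invisibly behind a visible character. The byte-to-selector mapping is an assumption modeled on a publicly described scheme (U+FE00 to U+FE0F for byte values 0 to 15, U+E0100 to U+E01EF for 16 to 255); this article does not state the exact mapping the package used, and the URL is a placeholder.

```python
import base64
from typing import Optional


def byte_to_selector(value: int) -> str:
    """Map one byte to an invisible variation selector (assumed mapping)."""
    return chr(0xFE00 + value) if value < 16 else chr(0xE0100 + value - 16)


def selector_to_byte(ch: str) -> Optional[int]:
    """Inverse mapping; returns None for ordinary characters."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:
        return cp - 0xFE00
    if 0xE0100 <= cp <= 0xE01EF:
        return cp - 0xE0100 + 16
    return None


def hide(payload: bytes, carrier: str = "|") -> str:
    """Append one selector per payload byte after a visible carrier character."""
    return carrier + "".join(byte_to_selector(b) for b in payload)


def reveal(text: str) -> bytes:
    """Collect the bytes carried by any variation selectors in the text."""
    candidates = (selector_to_byte(ch) for ch in text)
    return bytes(b for b in candidates if b is not None)


token = base64.b64encode(b"https://example.invalid/stage2")
stego = hide(token)          # renders as a lone vertical bar in most editors
recovered = reveal(stego)
```

For defenders, the takeaway is simple: a source file whose string literals contain long runs of code points in these ranges deserves a closer look, since legitimate text rarely needs more than one selector per base character.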

Case 5: Snake Malware Custom Encoding — 20 Years of Non-Standard C2

The FSB's Snake malware used custom encoding combined with traffic fragmentation across its nearly 20 years of operation. Snake's C2 protocol encoded data using proprietary methods, then fragmented the encoded data across multiple network packets, then routed the fragments through a global peer-to-peer relay network of infected systems. The encoding was designed specifically to be indistinguishable from legitimate network traffic. CISA's joint advisory described Snake's encoding as making "data payloads impossible to decrypt and interpret without software specifically designed to process the implant's custom protocols." The FBI required eight years of analysis to reverse-engineer Snake's encoding and develop the PERSEUS counter-tool that ultimately disrupted the network in Operation MEDUSA.

Detection Strategies

Detect the context, not the encoding

Alerting on Base64 content produces unmanageable false positives. Detection must focus on contextual indicators: Base64-encoded content in unusual locations (URL parameters, cookie values, custom headers), processes decoding content before network transmission, PowerShell -EncodedCommand usage, high-entropy data in protocols that normally carry low-entropy content, and anomalous data flow patterns where clients send significantly more data than they receive.

Data Source	Component	Detection Focus
Network Traffic	Network Traffic Content	Base64 patterns in unusual HTTP fields (cookies, custom headers); high-entropy data in low-entropy protocols; non-standard character sets
Process	Process Creation	PowerShell -EncodedCommand / -enc; certutil -encode/-decode; use of encoding tools (base64.exe, xxd) by suspicious processes
Network Traffic	Network Traffic Flow	Asymmetric data flows (client sending significantly more than receiving); periodic encoded data bursts consistent with beaconing
File	File Creation	Image files with anomalous sizes or appended data beyond EOF markers; files with high entropy appended to low-entropy content
Script	Script Execution	JavaScript eval() of decoded content; Python exec() with decoded strings; PowerShell Invoke-Expression on decoded data

Splunk / SIEM Detection Queries

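PowerShell Encoded Command Execution: Detect Base64-encoded PowerShell commands, the most common T1132 vector. The base64decode() call is assumed to come from an add-on or custom eval function, as it does not appear to be part of core SPL; substitute whatever decoder your deployment provides, and note that the query assumes the encoded blob is the last token on the command line.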

index=sysmon EventCode=1 Image="*\\powershell.exe"
| where match(CommandLine, "(?i)(-enc\s|-encodedcommand\s|-e\s+[A-Za-z0-9+/=]{20,})")
| eval decoded=base64decode(mvindex(split(CommandLine, " "), -1))
| stats count values(decoded) as decoded_commands values(User) as users
  by ComputerName ParentImage
| sort - count

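High-Entropy HTTP Content Detection: Identify HTTP traffic with unusually high entropy (consistent with encoded payloads). The shannon_entropy() call is assumed to be provided by an add-on or custom eval function, as it does not appear to be part of core SPL; treat the thresholds as starting points to tune against local traffic.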

index=proxy_logs http_method=POST content_length>1000
| eval entropy=shannon_entropy(http_body)
| where entropy > 5.5
| stats count avg(entropy) as avg_entropy sum(content_length) as total_bytes
  by src_ip dest_ip dest_host
| where count > 10 AND avg_entropy > 5.8
| sort - total_bytes
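For readers who want to see what the entropy score in the query above measures, the following is a minimal Python sketch of byte-level Shannon entropy. It is a generic implementation of the standard formula, not the function used by any particular SIEM add-on.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


constant = shannon_entropy(b"aaaaaaaa")
uniform = shannon_entropy(bytes(range(256)))
# Every Base64 alphabet character exactly once: log2(64) = 6 bits per byte.
b64_alphabet = shannon_entropy(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)
```

Because Base64 text draws on only 64 symbols, its entropy tops out at about 6 bits per byte, so thresholds in the 5.5 to 5.8 range sit just under that ceiling. Raw encrypted or compressed bodies score closer to 8.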

Certutil Encoding/Decoding Activity — Detect use of certutil for Base64 encoding or decoding, a common LOLBin for T1132:

index=sysmon EventCode=1 Image="*\\certutil.exe"
| where match(CommandLine, "(?i)(-encode|-decode|-urlcache)")
| stats count values(CommandLine) as commands values(ParentImage) as parents
  by ComputerName User
| sort - count

Anomalous Image File Downloads — Detect image files with suspicious characteristics that may contain steganographic payloads:

index=proxy_logs http_content_type="image/*"
  (http_response_code=200)
| eval size_kb=content_length/1024
| where size_kb > 500 AND (match(url, "(?i)\.(jpg|jpeg|png|gif|bmp)$"))
| stats count sum(size_kb) as total_kb_downloaded dc(url) as unique_images
  by src_ip dest_host
| where total_kb_downloaded > 5000 AND count > 20
| sort - total_kb_downloaded

Known Threat Actors

Standard Encoding (T1132.001)

Threat Actor / Malware	Encoding Method	Notable Detail
HAFNIUM	Base64 web shell commands	Exchange Server zero-day exploitation; 30,000+ orgs affected
LockBit 3.0	Base64 configuration encoding	CISA AA23-075A advisory; encoded parameters in C2
Redline Stealer	Custom Base64 approach	Credential exfiltration with novel Base64 encoding (2024)
Kapeka (Sandworm/GRU)	Base64 C2 encoding	Eastern European backdoor documented by WithSecure (April 2024)
Springtail / Kimsuky	Base64 in Linux backdoor	New toolkit addition with standard encoded C2
Gootloader	Base64 multi-stage payloads	"Initial Access as a Service" platform; encoded delivery chain
GrimAgent (Ryuk)	Base64 C2 protocol	Encoded command exchange in Ryuk deployment chain
Latrodectus / IcedID successor	Base64 C2 encoding	Proofpoint: potential IcedID replacement with encoded C2
HOPLIGHT (Lazarus)	Hex/Base64 encoding	North Korean trojan with encoded C2 communications

Non-Standard Encoding (T1132.002)

Threat Actor / Malware	Encoding Method	Notable Detail
Snake / Turla (FSB)	Custom encoding + fragmentation	20 years of custom C2 encoding; FBI needed 8 years to decode
OilRig / APT34	Steganographic image encoding	Commands hidden in image files; novel C2 channel (2020-2025)
Earth Preta / Mustang Panda	Non-standard encoding	Campaigns targeting US, Philippines, Pakistan, Taiwan (2025)
MoustachedBouncer	Non-standard C2 encoding	Targeting foreign diplomats in Belarus
ToddyCat	Non-standard encoding in multiple tools	APT espionage across Asia with custom encoding
DiceLoader (FIN7)	Custom encoding protocol	Unveiled by Sekoia TDR (February 2024)
RotaJakiro	Custom XOR + rotation	Linux backdoor with 0 VT detections for years
Velvet Ant (China)	Non-standard encoding via F5	Abused F5 load balancers with custom encoded C2 (Sygnia, 2024)

Defensive Recommendations

1. Monitor PowerShell Encoded Commands

Enable PowerShell Script Block Logging (Event ID 4104) and Module Logging to capture decoded script content, bypassing Base64 encoding at the logging level. Alert on use of the -EncodedCommand, -enc, and -e parameters. Script Block Logging records script blocks as the engine executes them, so a script passed through -EncodedCommand is logged in decoded, plaintext form, making it the single most effective detection for Base64-encoded PowerShell C2 and execution.

2. Implement Network Content Analysis

Deploy network detection tools capable of identifying encoded content in unusual protocol locations. Focus on Base64 patterns in HTTP cookies, custom headers, URL parameters, and DNS TXT records where encoded content is not expected. Use entropy analysis to flag high-entropy payloads in low-entropy protocols. Behavioral analysis of data flow patterns (asymmetric transfers, periodic bursts) is more reliable than content inspection for encoded C2.
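As one way to look for Base64 in unexpected places, the Python sketch below pulls Base64-looking values out of a Cookie header and decodes them for review. The length threshold and function name are assumptions, and URL-safe Base64 variants are not handled; many legitimate session cookies are also Base64, so the output is a list of candidates for contextual review, not alerts.

```python
import base64
import re
from typing import Dict

# Standard-alphabet Base64 of at least 24 characters, optional padding.
CANDIDATE = re.compile(r"^[A-Za-z0-9+/]{24,}={0,2}$")


def base64_cookie_values(cookie_header: str) -> Dict[str, bytes]:
    """Return {cookie name: decoded bytes} for values that decode as Base64."""
    findings = {}
    for part in cookie_header.split(";"):
        # Split on the first '=' only, so padding stays with the value.
        name, sep, value = part.strip().partition("=")
        if not sep:
            continue
        if len(value) % 4 == 0 and CANDIDATE.match(value):
            try:
                findings[name] = base64.b64decode(value, validate=True)
            except ValueError:
                pass
    return findings


token = base64.b64encode(b"hostname=WS01;user=alice").decode("ascii")
hits = base64_cookie_values("theme=dark; sid=" + token)
```

Pairing each hit with the entropy and printable-character ratio of the decoded bytes helps separate structured session data from exfiltrated content.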

3. Detect Certutil and LOLBin Encoding

Monitor execution of certutil.exe -encode, certutil.exe -decode, and certutil.exe -urlcache, which are commonly used for Base64 encoding/decoding and file download in malware operations. Similarly, monitor encoding utilities (base64 and xxd on Linux) executed by non-administrative users or called from unusual parent processes.

4. Analyze Image Files for Steganographic Content

Implement file analysis that examines image files for appended data beyond end-of-file markers, unusually large file sizes relative to image dimensions, high-entropy regions within low-entropy image data, and known steganographic markers. Forcepoint's Q3 2025 documentation of the BaseStart-/-BaseEnd markers provides specific detection signatures for that campaign pattern.
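The "appended data beyond end-of-file markers" check can be sketched for PNG files by walking the chunk list until IEND and returning whatever follows. This Python example is a simplified illustration: it does not verify chunk CRCs, and it covers PNG only. JPEG needs a real segment parser, because the EOI byte pair FF D9 can also occur inside embedded thumbnails.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_trailing_data(blob: bytes) -> bytes:
    """Return any bytes that follow the IEND chunk of a PNG file."""
    if not blob.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(blob):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", blob[offset:offset + 8])
        offset += 8 + length + 4
        if chunk_type == b"IEND":
            return blob[offset:]
    raise ValueError("no IEND chunk found")


# Skeleton file for demonstration: signature plus an empty IEND chunk
# with a dummy CRC (acceptable here because CRCs are not checked).
skeleton = PNG_SIGNATURE + b"\x00\x00\x00\x00IEND\x00\x00\x00\x00"
clean = png_trailing_data(skeleton)
dirty = png_trailing_data(skeleton + b"TVqQAAMAAAAEAAAA")
```

Any non-empty result is worth a second look, and trailing bytes that are mostly Base64 characters are a stronger signal still.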

5. Deploy TLS Inspection for Encoded Content

Encoded C2 traffic inside TLS sessions is invisible without decryption. Deploy TLS inspection at network boundaries to expose encoded payloads within encrypted HTTPS traffic. This enables network-level detection of Base64-encoded commands, steganographic downloads, and non-standard encoded data that would otherwise be hidden by the encryption layer.

6. Build Encoding-Aware Detection Rules

Write SIEM rules that detect the behavioral patterns of encoding-based C2 rather than specific encoded content. Focus on: processes that read files then immediately make network connections (steganographic C2), processes that Base64-decode content then execute it (encoded payload delivery), and network sessions with consistent encoded payload sizes (beaconing with encoded data). Combine encoding detection with other C2 indicators for higher-confidence alerts.
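The "consistent encoded payload sizes" pattern can be approximated with a coefficient-of-variation test over a session's request sizes and inter-request intervals. The Python sketch below uses hypothetical thresholds (10% variation, at least 10 events) that would need tuning, and implants that add jitter or padding will evade it.

```python
from statistics import mean, pstdev
from typing import Sequence


def _cv(values: Sequence[float]) -> float:
    """Coefficient of variation: standard deviation relative to the mean."""
    avg = mean(values)
    return pstdev(values) / avg if avg else float("inf")


def looks_like_beacon(
    payload_sizes: Sequence[int],
    intervals_s: Sequence[float],
    max_cv: float = 0.1,      # hypothetical threshold
    min_events: int = 10,     # hypothetical threshold
) -> bool:
    """True when both sizes and timing are unusually regular."""
    if len(payload_sizes) < min_events or len(intervals_s) < min_events - 1:
        return False
    return _cv(payload_sizes) <= max_cv and _cv(intervals_s) <= max_cv


steady = looks_like_beacon(
    [344] * 12,
    [60.0, 61.0, 59.5, 60.2, 60.1, 59.8, 60.4, 59.9, 60.0, 60.3, 59.7],
)
browsing = looks_like_beacon(
    [120, 5400, 88, 23000, 410, 96, 1500, 64, 7800, 300, 45, 990],
    [2, 300, 15, 4, 1200, 30, 7, 60, 1, 900, 45],
)
```

On its own this is a weak signal, since software update checks and telemetry are also regular; it gains value when combined with the encoding indicators described above.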

7. Implement Application Whitelisting for Script Interpreters

Restrict which processes can execute decoded content. Use application control to prevent PowerShell, wscript, cscript, and mshta from executing encoded or decoded content from untrusted locations. Constrained Language Mode in PowerShell limits the ability to execute arbitrary encoded scripts, though determined adversaries can bypass these restrictions.

8. Automate Decoding in Analysis Workflows

Build automated decoding into your SOC analysis workflows. When analysts encounter suspicious traffic, automated tools should attempt Base64 decoding, hexadecimal conversion, common XOR key testing, and entropy analysis. MITRE's detection guidance recommends analyzing packet contents to detect communications that do not follow expected protocol behavior for the port being used.
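A first-pass triage helper along these lines might look like the Python sketch below. It tries Base64 and hexadecimal decoding, keeps results that are mostly printable, and searches single-byte XOR keys for a known string (a "crib"). The function names, the 0.95 printable threshold, and the sample data are all illustrative assumptions.

```python
import base64
import string
from typing import Dict, List

PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII."""
    return sum(b in PRINTABLE for b in data) / len(data) if data else 0.0


def try_standard_decoders(blob: bytes, min_ratio: float = 0.95) -> Dict[str, bytes]:
    """Attempt Base64 and hex decoding; keep outputs that look like text."""
    attempts = {}
    try:
        attempts["base64"] = base64.b64decode(blob, validate=True)
    except ValueError:
        pass
    try:
        attempts["hex"] = bytes.fromhex(blob.decode("ascii"))
    except ValueError:  # also covers UnicodeDecodeError
        pass
    return {k: v for k, v in attempts.items() if printable_ratio(v) >= min_ratio}


def xor_keys_matching(blob: bytes, crib: bytes) -> List[int]:
    """Single-byte XOR keys whose output contains the expected string."""
    return [k for k in range(1, 256) if crib in bytes(b ^ k for b in blob)]


from_b64 = try_standard_decoders(b"d2hvYW1p")
from_hex = try_standard_decoders(b"77686f616d69")
xored = bytes(b ^ 0x41 for b in b"GET http://example.invalid/task")
keys = xor_keys_matching(xored, b"http")
```

The crib approach works because C2 traffic usually contains something predictable (a URL scheme, a command verb, a file signature), which narrows 255 candidate keys to one or two.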

MITRE ATT&CK Mapping

Field	Value
Technique ID	T1132
Name	Data Encoding
Tactic	Command and Control (TA0011)
Sub-Techniques	T1132.001 Standard Encoding, T1132.002 Non-Standard Encoding
Platforms	ESXi, Linux, Windows, macOS
Version	1.3 (Last Modified October 2025)
Data Sources	Network Traffic Content, Network Traffic Flow
Related Techniques	T1573 Encrypted Channel, T1001 Data Obfuscation, T1071 Application Layer Protocol, T1027 Obfuscated Files or Information

Sources and References

This article draws on vendor threat intelligence, academic research, and government advisories. All referenced sources are publicly available.

Forcepoint X-Labs — Q3 2025 Threat Brief: Obfuscated JavaScript and Steganography Enabling Malware Delivery (October 2025): forcepoint.com
Veracode — Sophisticated npm Attack Leveraging Unicode Steganography and Google Calendar C2 (October 2025): veracode.com
CyberSecurityNews — Malicious Payload Uncovered in JPEG Image Using Steganography and Base64 (June 2025): cybersecuritynews.com
OPSWAT — How Base64 Encoding Opens the Door for Malware (December 2024): opswat.com
Unit 42 (Palo Alto Networks) — StegBaus: Steganographic Malware Loader: unit42.paloaltonetworks.com
CISA/FBI/Five Eyes — Hunting Russian Intelligence Snake Malware (May 2023): cisa.gov
Sygnia — Velvet Ant Abuses F5 Load Balancers for Persistence (June 2024): referenced via attack.mitre.org
Sekoia TDR — Unveiling the Intricacies of DiceLoader / FIN7 (February 2024): referenced via attack.mitre.org
CyberPress — Infected PyPI Package Using XOR and Base64 C2 Encoding (November 2025): cyberpress.org
MITRE ATT&CK — T1132 Data Encoding (v18, October 2025): attack.mitre.org