Data Exfiltration Techniques
Definition
Data exfiltration is the unauthorized transfer of data from a target environment to attacker-controlled infrastructure. In MITRE ATT&CK, this is Tactic TA0010. It is typically the final operational stage of an intrusion, the point at which the attacker converts network access into usable value: selling stolen records, using credentials in downstream attacks, encrypting data for ransomware leverage, or publishing sensitive information as extortion.
Understanding exfiltration is not primarily about the data transfer itself. It is about everything that happens before the transfer: staging, compression, encryption, and selection of the exfiltration path. Attackers who achieve persistence in a network do not immediately exfiltrate. They survey, collect, stage, and then move. Each phase leaves behavioral artifacts that detection programs can identify.
The techniques in TA0010 cover the full range of exfiltration paths: over existing C2 channels, over alternative protocols, through cloud storage services, and via scheduled transfers designed to blend with normal traffic. Each technique requires a different detection approach, and no single control covers all of them.
The Sovereign Data Protocol (SDP) within CDA's DPS domain addresses the strategic defense against exfiltration: if data is encrypted with customer-managed keys before it leaves your control, what the attacker exfiltrates is ciphertext. SDP's core assertion, "Your data lives where you decide. Period," means that even a successful exfiltration event against an SDP-compliant environment produces data that the attacker cannot use without the encryption keys, which never left the customer's environment.
How It Works
Exfiltration over C2 Channel (T1041)
The most common exfiltration technique is the simplest: send data out over the same channel the attacker uses for command and control. The C2 session is already established, already encrypted (if using HTTPS or custom encryption), and already evading detection. Adding data transfer to an existing session requires no additional infrastructure.
T1041 is detected primarily through volume anomalies on the C2 channel. The behavior changes when exfiltration begins: a C2 channel that was previously sending small amounts of beacon traffic (a few hundred bytes per check-in) suddenly carries megabytes or gigabytes of outbound data. The destination IP and domain remain the same, but the transfer volume is anomalous.
Detection relies on baselining normal outbound traffic volumes per destination and alerting on deviations. A workstation that normally sends 50 MB/day outbound and suddenly sends 2 GB in a four-hour window is a high-confidence exfiltration indicator, regardless of the destination. This detection requires historical baselines, which means organizations need at least 30 days of proxy and firewall log data before volume anomaly detection is reliable.
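The baselining logic above can be sketched in a few lines. This is a minimal illustration, not a production detector: the function name `volume_anomalies`, the input shapes, the z-score threshold of 3, and the 5% floor on the standard deviation are all assumptions chosen for the example. Only the 30-day minimum baseline comes from the text.

```python
from statistics import mean, pstdev


def volume_anomalies(history, today, min_days=30, z_threshold=3.0):
    """Flag hosts whose outbound volume deviates sharply from their own baseline.

    history: dict mapping host -> list of daily outbound byte totals
    today:   dict mapping host -> outbound bytes seen in the current window
    Returns a list of (host, z_score) sorted by descending z-score.
    """
    flagged = []
    for host, sent in today.items():
        days = history.get(host, [])
        if len(days) < min_days:
            continue  # baseline too short to be trusted
        mu = mean(days)
        # Floor sigma (assumed 5% of the mean) so a flat baseline
        # does not cause a division by zero.
        sigma = max(pstdev(days), 0.05 * mu, 1.0)
        z = (sent - mu) / sigma
        if z >= z_threshold:
            flagged.append((host, round(z, 1)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Fed 30 days of roughly 50 MB/day for a workstation and a current window of 2 GB, the function flags that host. A host with fewer than 30 days of history is skipped rather than guessed at.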
Exfiltration over Alternative Protocol (T1048)
Alternative protocol exfiltration routes data through protocols other than the primary C2 channel, often to evade detection logic targeting the C2 destination. Sub-techniques cover:
- T1048.001 (Exfiltration Over Symmetric Encrypted Non-C2 Protocol): custom encrypted channels, often over common ports to appear as HTTPS without using standard TLS
- T1048.002 (Exfiltration Over Asymmetric Encrypted Non-C2 Protocol): similar but using public-key encryption
- T1048.003 (Exfiltration Over Unencrypted Non-C2 Protocol): plaintext exfiltration over FTP, HTTP, or custom TCP connections
DNS exfiltration (a variant of T1048 overlapping with T1071.004 DNS tunneling) deserves specific attention. DNS payload size is normally small: a few hundred bytes per query-response pair. When an attacker uses DNS to exfiltrate data, query volume rises and individual query names grow to encode data chunks. The total bytes exfiltrated per unit time is low, making DNS exfiltration slow but difficult to detect without DNS-specific monitoring.
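One way DNS-specific monitoring can look at individual query names is sketched below. It is a heuristic only: the 40-character label length and the 3.5 bits-per-character entropy threshold are illustrative assumptions, and the function treats the last two labels as the registered domain, which is wrong for suffixes such as `co.uk`.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of a string in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def suspicious_query(qname, max_label_len=40, entropy_threshold=3.5):
    """Flag query names whose subdomain part is both long and high-entropy."""
    labels = qname.rstrip(".").split(".")
    if len(labels) < 3:
        return False  # no subdomain to carry encoded data
    sub_labels = labels[:-2]  # naive split: last two labels = registered domain
    longest = max(len(label) for label in sub_labels)
    entropy = shannon_entropy("".join(sub_labels))
    return longest >= max_label_len and entropy >= entropy_threshold
```

Combining length with entropy keeps long but repetitive names (some CDN and telemetry hostnames) from matching on length alone. Real deployments would tune both values against their own resolver logs.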
SMTP-based exfiltration (T1048.003 over port 25 or 587) targets mail servers. An attacker with access to an internal mail server can send data to external recipients using the organization's own email infrastructure. Detection: alert on SMTP connections originating from non-mail-server hosts, monitor for large email attachments sent to external addresses from unusual senders, and implement egress filtering that permits SMTP only from designated mail relay servers.
Exfiltration over Web Service (T1567)
Cloud storage services are high-value exfiltration destinations because traffic to them is pre-authorized, HTTPS-encrypted, and generated by legitimate users constantly. Blocking Google Drive or OneDrive outright is not practical for most organizations.
Sub-techniques cover:
- T1567.001 (Exfiltration to Code Repository): data uploaded to GitHub, GitLab, or Bitbucket as files in a repository. Particularly relevant for source code theft. ALPHV/BlackCat operators have been documented using private GitHub repositories as staging and exfiltration destinations.
- T1567.002 (Exfiltration to Cloud Storage): Google Drive, Dropbox, OneDrive, Box, S3 buckets. The attacker creates or compromises an account on the cloud service, uploads stolen data, and retrieves it from outside the target network. The outbound traffic is indistinguishable from a legitimate employee syncing files to cloud storage.
Detection requires Cloud Access Security Broker (CASB) integration. CASB sits in-line with cloud service traffic and can: identify which cloud storage tenants data is being uploaded to (alert when data moves to a personal Google Drive account rather than the corporate tenant), enforce data volume limits per service per user, apply DLP scanning to outbound uploads, and detect uploads from unusual hosts or user accounts.
Without CASB, cloud storage exfiltration is nearly invisible unless the organization inspects TLS traffic at the proxy layer and applies content inspection to uploads.
Transfer Data to Cloud Account (T1537)
T1537 is distinct from T1567 in that the data moves to another account the attacker controls on the same cloud provider, typically by sharing, syncing, or copying storage objects from within the victim's own cloud environment. In AWS environments, this can mean creating a new S3 bucket, loosening its access controls, and copying data to it. In Azure, it may involve modifying storage account access policies. Because the data is transferred within the provider's infrastructure, it avoids the egress controls that monitor outbound internet traffic.
Detection for T1537 focuses on cloud control plane logging: AWS CloudTrail, Azure Activity Log, GCP Cloud Audit Logs. Alert on: new S3 bucket creation by unusual IAM principals, modifications to bucket access control lists removing private restrictions, new cross-account IAM role trust relationships, and data transfer events to buckets not in the organization's known bucket inventory.
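The alert conditions above could be applied to parsed CloudTrail records roughly as follows. This is a sketch under assumptions: the function name `review_s3_events` and the two allowlists are hypothetical, and only three S3 management events are watched. The record fields used (`eventSource`, `eventName`, `userIdentity.arn`, `requestParameters.bucketName`) are standard CloudTrail fields.

```python
WATCHED_EVENTS = {"CreateBucket", "PutBucketAcl", "PutBucketPolicy"}


def review_s3_events(records, known_principals, known_buckets):
    """Return (event_name, principal_arn, bucket, reasons) for events worth review.

    records:          iterable of CloudTrail records already parsed into dicts
    known_principals: set of IAM ARNs expected to manage buckets (assumed input)
    known_buckets:    set of bucket names in the organization's inventory
    """
    findings = []
    for rec in records:
        if rec.get("eventSource") != "s3.amazonaws.com":
            continue
        name = rec.get("eventName")
        if name not in WATCHED_EVENTS:
            continue
        arn = (rec.get("userIdentity") or {}).get("arn", "unknown")
        bucket = (rec.get("requestParameters") or {}).get("bucketName", "unknown")
        reasons = []
        if arn not in known_principals:
            reasons.append("unusual principal")
        if bucket not in known_buckets:
            reasons.append("bucket not in inventory")
        if name != "CreateBucket":
            reasons.append("access control change")
        if reasons:
            findings.append((name, arn, bucket, reasons))
    return findings
```

ACL and policy changes are always surfaced here, even from known principals, on the assumption that such changes are rare enough to review by hand. An environment with frequent legitimate policy updates would need a narrower rule.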
Scheduled Transfer (T1029)
Scheduled transfer is the low-and-slow exfiltration strategy. Instead of exfiltrating a 10 GB dataset in a single session, the attacker sends small batches at regular intervals, during business hours, at volumes that fall within the normal range of outbound traffic. Over days or weeks, the full dataset transfers without triggering volume-based alerts.
This technique is particularly effective against organizations that only alert on total-volume thresholds (alert if more than 1 GB transferred to a single destination per day) rather than behavioral baselines (alert if this specific host is transferring data to a destination it has never contacted before).
Detection requires longitudinal analysis: building a baseline of which hosts connect to which external destinations over 30 or more days, then detecting new destination relationships that persist over time. A host that has never contacted a specific IP address and then begins sending regular 50 MB batches to that IP is exhibiting T1029 behavior even if each individual transfer is within normal volume bounds.
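The longitudinal check can be illustrated with a small sketch that compares recent flows against a set of host-to-destination pairs seen during the baseline period. The function name, the tuple layout, and the three-day persistence threshold are assumptions made for the example.

```python
from collections import defaultdict


def new_persistent_destinations(baseline_pairs, recent_flows, min_days=3):
    """Find host-to-destination pairs absent from the baseline that recur over days.

    baseline_pairs: iterable of (host, dest) pairs seen during the baseline window
    recent_flows:   iterable of (day, host, dest, bytes_out) records
    Returns a dict mapping (host, dest) -> (distinct_days, total_bytes_out).
    """
    known = set(baseline_pairs)
    days_seen = defaultdict(set)
    total_bytes = defaultdict(int)
    for day, host, dest, bytes_out in recent_flows:
        pair = (host, dest)
        if pair in known:
            continue  # established relationship, not of interest here
        days_seen[pair].add(day)
        total_bytes[pair] += bytes_out
    return {
        pair: (len(days), total_bytes[pair])
        for pair, days in days_seen.items()
        if len(days) >= min_days
    }
```

Note that the rule never looks at per-transfer size, which is the point: a host sending 50 MB batches to a destination it has never contacted is surfaced because the relationship is new and persistent, not because any single transfer is large.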
Data Staging and Pre-Exfiltration Preparation (T1074, T1560)
Before exfiltration occurs, attackers typically stage and compress the target data. T1074 covers local data staging (T1074.001: collecting files in a local directory before transfer) and remote data staging (T1074.002: staging on another compromised system inside the network). T1560 covers archiving collected data, typically for compression and encryption before transfer.
Detection for staging and archiving is endpoint-based and is often the highest-fidelity pre-exfiltration signal available:
- Execution of archiving tools (rar.exe, 7z.exe, WinZip, tar) is uncommon on standard workstations and servers. Alert on archive tool execution, particularly when the output archive is large (above 100 MB) or password-protected.
- Mass file copy operations: a process reading hundreds of files across multiple directories in a short time window is staging behavior. EDR tools can detect this through file system access pattern analysis.
- Creation of large files with archive extensions (.zip, .rar, .7z, .tar.gz) in unusual locations (temp directories, desktop, network shares)
- Volume shadow copy deletion (T1490): often a precursor to ransomware but also associated with evidence destruction around exfiltration
MOVEit and Change Healthcare: Real-World Patterns
The Cl0p ransomware group's exploitation of CVE-2023-34362 in MOVEit Transfer illustrates how modern extortion campaigns execute exfiltration at scale. MOVEit Transfer is a managed file transfer platform that organizations use to send large files securely. Cl0p exploited a SQL injection vulnerability to deploy a web shell (LEMURLOOT), which then executed SQL queries to identify and export database contents and file transfer payloads. The exfiltration happened over the same HTTPS port (443) that MOVEit uses for legitimate transfers, to the same IP addresses that customers were already authorized to reach.
The detection failure was upstream: the web shell deployment was the detectable event. By the time exfiltration occurred, the attacker had already achieved privileged access. The lesson is that exfiltration detection must be paired with detection of the compromise steps that precede it.
The Change Healthcare breach involved ALPHV/BlackCat ransomware operators who gained access through a Citrix portal without MFA, spent roughly nine days in the environment, and exfiltrated a reported 6 TB of protected health information before deploying ransomware. The 6 TB volume is the kind of anomaly that volume-based exfiltration detection should catch, but only if baseline thresholds are properly configured and the monitoring coverage includes the relevant network segments.
Detection
Network Log Sources
Web proxy logs: All outbound HTTP/HTTPS must route through a logging proxy. Key fields for exfiltration detection: bytes sent per session, bytes sent per destination per hour, bytes sent per source host per hour, destination domain reputation and categorization, and upload-to-download ratio (exfiltration shows high bytes-sent relative to bytes-received).
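As a rough illustration of the upload-to-download ratio, the sketch below aggregates proxy sessions per source and destination pair and flags pairs that send far more than they receive. The function name, the tuple layout, the 10 MB floor, and the ratio of 5 are assumptions for the example, not values taken from any product.

```python
from collections import defaultdict


def upload_heavy_pairs(sessions, min_bytes_sent=10_000_000, min_ratio=5.0):
    """Flag (source, destination) pairs with a high bytes-sent to bytes-received ratio.

    sessions: iterable of (src_host, dest_domain, bytes_sent, bytes_received)
    Returns a list of ((src, dest), total_sent, ratio) tuples.
    """
    totals = defaultdict(lambda: [0, 0])
    for src, dest, sent, received in sessions:
        totals[(src, dest)][0] += sent
        totals[(src, dest)][1] += received
    flagged = []
    for pair, (sent, received) in totals.items():
        ratio = sent / max(received, 1)  # avoid dividing by zero
        if sent >= min_bytes_sent and ratio >= min_ratio:
            flagged.append((pair, sent, round(ratio, 1)))
    return flagged
```

The byte floor matters as much as the ratio: small API calls and telemetry posts are often upload-heavy in ratio terms but carry too little data to be interesting.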
Firewall logs: Monitor for large outbound transfers on any port. Connections to new or uncategorized external IPs carrying more than 10 MB of outbound data are exfiltration candidates regardless of port.
DNS logs: For DNS exfiltration detection, monitor query frequency per destination domain and the length of the query names sent to each domain. Baseline normal DNS query volume per host and alert on hosts exceeding 3x their historical average query rate to external domains.
Cloud platform audit logs: AWS CloudTrail, Azure Activity Log, and GCP Cloud Audit Logs must be enabled and shipped to the SIEM. Alert thresholds for S3 GetObject and PutObject operations, Azure Blob write operations, and equivalent events should be tuned to baseline normal data access patterns.
Windows Event IDs
- Event ID 4663 (Object Access: File Read): with auditing enabled on sensitive directories, provides visibility into mass file access events. Requires object access auditing configured via Group Policy.
- Event ID 4688 (Process Creation): with command-line logging enabled, captures execution of archiving tools (7z.exe, rar.exe, tar) with their arguments, revealing target paths and archive names.
- Event ID 4648 (Logon using explicit credentials): may indicate attacker using a service account or alternative credential set to access file shares for staging.
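- Sysmon Event ID 11 (File Create): logs file creation events with full path. Alert on creation of files with archive extensions in unusual directories. The event does not record file size, so size thresholds need a second source such as EDR telemetry.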
- Sysmon Event ID 3 (Network Connection): log outbound connections from processes that should not initiate network connections (rar.exe, 7z.exe, robocopy.exe making network connections).
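A minimal sketch of how Event ID 4688 records might be screened for archiving tools is shown below. It assumes events have already been parsed into dicts carrying the `NewProcessName` and `CommandLine` fields from the event data. The tool list and the password-flag regex are illustrative heuristics and will miss renamed binaries.

```python
import re
from pathlib import PureWindowsPath

ARCHIVERS = {"7z.exe", "7za.exe", "rar.exe", "winrar.exe", "tar.exe"}
# Password switches are only checked for tools where -p / -hp mean "password".
PASSWORD_TOOLS = {"7z.exe", "7za.exe", "rar.exe"}
PASSWORD_FLAG = re.compile(r"(?:^|\s)-(?:hp|p)\S*", re.IGNORECASE)


def archive_execution(event):
    """Return a summary dict if the 4688 event is an archiver launch, else None."""
    image = PureWindowsPath(event.get("NewProcessName", "")).name.lower()
    if image not in ARCHIVERS:
        return None
    command_line = event.get("CommandLine", "")
    has_password = image in PASSWORD_TOOLS and bool(PASSWORD_FLAG.search(command_line))
    return {
        "image": image,
        "password_protected": has_password,
        "command_line": command_line,
    }
```

Matching on the image name alone is easy to evade by renaming the binary, so a rule like this works best alongside file-creation events for the resulting archive.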
Behavioral Indicators
- Archive tool (7z.exe, rar.exe, WinZip) executing with password-protect flags (-p for 7z, -hp for rar) combined with large output file size
- Robocopy, xcopy, or rsync copying files from sensitive directories to a temporary staging location
- A process writing more than 500 files in under five minutes (bulk file staging)
- Outbound data volume to a cloud storage service that has not previously received data from the same host
- SMTP connections originating from hosts that are not designated mail servers
- Volume shadow copy service being stopped or shadows being deleted (often paired with exfiltration in ransomware attacks)
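The bulk-staging indicator in the list above (more than 500 files written in under five minutes) maps naturally onto a sliding window. The sketch below assumes file-write events arrive sorted by timestamp as `(timestamp_seconds, process_id, path)` tuples. The function name and input shape are hypothetical, and it counts write events rather than distinct paths.

```python
from collections import defaultdict, deque


def bulk_writers(events, max_files=500, window_seconds=300):
    """Return the set of process IDs that exceed max_files writes inside the window.

    events: iterable of (timestamp_seconds, process_id, path), sorted by timestamp
    """
    recent = defaultdict(deque)  # process_id -> timestamps still inside the window
    flagged = set()
    for ts, pid, _path in events:
        window = recent[pid]
        window.append(ts)
        # Drop writes that are five minutes old or older.
        while window and ts - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_files:
            flagged.add(pid)
    return flagged
```

Backup agents, build tools, and sync clients legitimately write files at this rate, so a real rule would pair the count with an allowlist of expected processes.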
DLP Controls
Data Loss Prevention (DLP) tools inspect outbound data streams for content matching defined patterns (SSNs, credit card numbers, PHI, proprietary document patterns). Effective DLP deployment requires:
- Endpoint DLP agents to capture data at the point of copy/paste and file write operations
- Network DLP inline with the web proxy for outbound content inspection
- Cloud DLP (CASB) for cloud service upload inspection
- Configured policies with real pattern libraries, not just generic regex, to reduce false positives
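The difference between a generic regex and a validated pattern can be shown with payment card numbers. In the sketch below, a loose regex finds candidate digit runs and a Luhn checksum discards most random matches. The regex and function names are illustrative and are not taken from any DLP product.

```python
import re

# Loose candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in text that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            hits.append(digits)
    return hits
```

The regex alone would flag order numbers, tracking IDs, and other long digit runs. The checksum step removes roughly nine in ten of those, which is the kind of false-positive reduction the last bullet refers to.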
DLP is a compensating control, not a primary defense. A motivated attacker will encrypt or obfuscate data before exfiltration, bypassing content-inspection DLP. The primary defense for sensitive data is encryption at rest with customer-managed keys, which converts a successful exfiltration event into a useless ciphertext problem for the attacker.
Why It Matters
Exfiltration is the monetization event of most intrusions. Ransomware groups exfiltrate before encrypting, ensuring they have leverage even if the victim restores from backup. Nation-state actors exfiltrate intellectual property, personnel records, and sensitive communications to support intelligence collection objectives. Criminal actors sell exfiltrated records in bulk on darknet marketplaces.
The business impact of exfiltration is measured in regulatory exposure, customer notification obligations, litigation risk, and reputational damage. The Change Healthcare breach triggered HIPAA breach notification requirements affecting over 100 million individuals, the largest healthcare data breach in US history. The regulatory response to a successful exfiltration event, particularly of regulated data (PHI, PII, financial records, cardholder data), can exceed the direct operational cost of the breach itself.
The MOVEit campaign demonstrated that exfiltration attacks can be executed at industrial scale. Cl0p compromised and exfiltrated from over 2,000 organizations in a matter of weeks using a single vulnerability. The organizations that had encrypted sensitive data at rest with keys outside the MOVEit environment suffered less impact than those whose data was stored in plaintext within the platform.
Exfiltration detection requires visibility at multiple layers simultaneously: network egress, endpoint file system access, cloud control plane, and email. Single-layer detection misses the techniques designed to evade that layer. The organizations with the lowest exfiltration impact are those with network egress visibility, endpoint DLP, and CASB working together against a shared behavioral baseline.
CDA Perspective
DPS: Sovereign Data Protocol
SDP is the strategic defense against exfiltration. When data is encrypted with customer-managed keys before storage, the attacker who exfiltrates it receives ciphertext. They can publish it, sell it, or hold it for ransom, but they cannot read it without keys that were never accessible from the compromised environment. The SDP methodology, "Your data lives where you decide. Period," means implementing encryption that survives a breach.
DPS-B02 (Encryption Standards Deployment) and DPS-B03 (DLP Foundation) are the BUILD-phase missions that implement SDP. DPS-H01 (Advanced DLP Tuning) operationalizes content inspection at the levels required to catch staged data moving toward exfiltration paths. DPS-D01 (Data Exfiltration Drill) is the validation exercise: simulate an exfiltration attempt and verify that detection and response controls actually work before an attacker tests them for you.
DPS-H02 (Data Sovereignty Mapping) and DPS-H03 (Key Management Hardening) address the key management and data residency components of SDP that ensure encrypted data remains protected even after exfiltration.
TID: Predictive Defense Intelligence
The PDI methodology applies to exfiltration detection through threat intelligence integration and behavioral monitoring. TID-B01 (SIEM Deployment and Tuning) is the prerequisite: without centralized log collection covering network egress, endpoint file activity, and cloud platforms, exfiltration detection logic has nothing to operate on. TID-B03 (Threat Intelligence Integration) adds threat feeds covering known exfiltration infrastructure (IPs associated with Cl0p, ALPHV, and other ransomware groups) that can block or alert on connections to known-bad destinations.
TID-H01 (Detection Engineering Program) is where the behavioral detection rules described in this article are built, tested, and maintained: archive tool execution rules, bulk file access alerts, volume anomaly thresholds, and cloud storage upload monitoring. TID-H03 (Threat Hunting Program) proactively hunts for staging and exfiltration activity that automated rules have not yet flagged.
SPH: Autonomous Posture Command
Egress filtering as an APC control limits available exfiltration paths. Organizations that enforce outbound traffic through a proxied, inspected, and logged egress path, blocking direct connections to non-categorized destinations, close the easiest exfiltration channels. SPH-H01 (Automated Compliance Monitoring) ensures these egress controls remain in force as the environment changes. SPH-B03 (Security Awareness Program) addresses the human element: employees who understand why sensitive data should not be copied to personal cloud storage are a meaningful defense layer against both malicious exfiltration and accidental data exposure.
Key Takeaways
- Exfiltration (TA0010) is the tactic covering unauthorized data transfer out of a target environment. It is typically the final operational stage before ransomware deployment or the end goal of espionage operations.
- The most common exfiltration technique is T1041 (Exfiltration over C2 Channel). Detection requires volume anomaly analysis against historical baselines.
- Cloud storage exfiltration (T1567.002) is nearly invisible without CASB. Organizations permitting outbound HTTPS to Google Drive, Dropbox, OneDrive, and S3 without CASB inspection are operating blind to this technique.
- Pre-exfiltration staging (T1074, T1560) is often the highest-fidelity detection opportunity. Archive tool execution with large output files and mass file access events are detectable before the data leaves the network.
- The Change Healthcare breach (6 TB exfiltrated) and the MOVEit campaign (2,000+ organizations compromised) illustrate the operational scale of modern exfiltration operations.
- DLP is a compensating control. Motivated attackers encrypt or obfuscate before exfiltrating. Encryption at rest with customer-managed keys (CDA's SDP methodology) is the strategic defense that makes exfiltration operationally useless.
- Detection requires simultaneous visibility at network egress, endpoint file system, cloud control plane, and email layers. Single-layer detection misses techniques designed to evade that layer.
Sources
- MITRE ATT&CK: Exfiltration (TA0010). https://attack.mitre.org/tactics/TA0010/
- CISA: #StopRansomware: Cl0p Ransomware Gang Exploits MOVEit Vulnerability. https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-158a
- HHS Office for Civil Rights: Change Healthcare Cybersecurity Incident. https://www.hhs.gov/hipaa/for-professionals/special-topics/change-healthcare-cybersecurity-incident/index.html
- Verizon 2024 Data Breach Investigations Report. https://www.verizon.com/business/resources/T1f3/reports/2024-dbir-data-breach-investigations-report.pdf
- MITRE ATT&CK: Data Staged (T1074). https://attack.mitre.org/techniques/T1074/
- CISA: Data Loss Prevention Guide. https://www.cisa.gov/sites/default/files/2023-06/fact-sheet-data-loss-prevention.pdf
- Gartner Magic Quadrant for Security Service Edge (CASB Capabilities). 2024.
Written by Evan Morgan