cloud-forensics: CDA.Wiki (Print)

# Cloud Forensics

Definition

Cloud forensics is the discipline of acquiring, preserving, and analyzing digital evidence from cloud infrastructure in support of security incident investigations, insider threat cases, and litigation holds. It extends traditional digital forensics into environments where the investigator has no physical access to hardware, where compute resources may exist for minutes before self-terminating, and where the cloud provider controls the underlying infrastructure under a shared responsibility model.

The shift to cloud computing did not eliminate the need for forensic investigation. It changed who owns the evidence, how it is accessed, and how long it survives. On-premises forensics has a well-defined playbook: isolate the machine, image the disk, preserve volatile memory, collect logs. Cloud forensics requires a fundamentally different approach because the "machine" may be a container that lived for 90 seconds, the disk may be a managed block volume that auto-deletes on instance termination, and the hypervisor layer is entirely inaccessible to the tenant.

Cloud forensics operates within CDA's TID (Threat Intelligence and Defense) domain. The atmosphere layer detects adversary presence, but when detection fires after a breach, forensics is how defenders reconstruct what happened, what was accessed, what was exfiltrated, and how the attacker moved. PDI (Predictive Defense Intelligence) depends on accurate post-incident reconstruction to improve future detection logic. Without the forensic record, each incident teaches the organization nothing about the adversary's tradecraft.

The discipline draws from established frameworks including the NIST SP 800-86 Guide to Integrating Forensic Techniques into Incident Response and the Cloud Security Alliance's Cloud Forensics Capability Maturity Model, adapting their evidence preservation principles to the specific constraints of AWS, Azure, and Google Cloud Platform.

How It Works

The Core Challenge: Evidence Volatility in Cloud Environments

Traditional forensics operates under the principle that evidence at rest is evidence preserved. This assumption breaks down in cloud environments across four dimensions.

Ephemeral compute. Auto-scaling groups launch and terminate instances automatically based on load. A web tier under attack may spin up 20 instances during the incident and terminate all of them before the investigation begins. Containers in Kubernetes clusters have lifespans measured in seconds to minutes. If no snapshot or log forwarding exists before termination, the instance-level evidence is gone permanently. Evidence preservation in cloud forensics is not a reactive task: it must be built into the architecture before an incident occurs.

Shared hardware. Cloud providers run multi-tenant infrastructure. Tenants share physical hardware with other customers, and the hypervisor layer managing memory and CPU scheduling belongs entirely to the provider. Investigators cannot access hypervisor memory, neighboring tenant data, or physical disk firmware. This is not a gap in cloud forensics capability: it is a boundary the investigator must work within. Evidence collection is limited to the resources the tenant controls under the shared responsibility model.

Distributed storage. An S3 bucket stores objects redundantly across multiple availability zones within its region (for most storage classes), and cross-region replication may copy them to a second region. An RDS instance may have automated backups retained for up to 35 days. Data may be replicated to a disaster recovery region in a different legal jurisdiction. Jurisdiction is not a theoretical concern: cross-border data access during investigations involving European customer data, for example, implicates GDPR's provisions on data transfers, and several cloud regions in China are subject to national security review requirements that may prohibit evidence access without government authorization.

Log retention gaps. Cloud providers do not enable all logging by default. AWS CloudTrail keeps a 90-day event history of management events by default, but long-term retention and delivery to S3 require an explicitly configured trail. VPC Flow Logs must be enabled per VPC, subnet, or network interface. S3 server access logging is off by default. An organization that discovers an attacker exfiltrated data from S3 buckets may find that S3 access logging was never enabled, leaving no record of what objects were accessed or when.

Evidence Sources in Cloud Environments

Cloud forensics substitutes software-defined evidence for physical evidence. The investigator reconstructs activity from control plane logs, data plane logs, storage snapshots, and configuration state rather than from disk images and memory captures.

AWS CloudTrail is the primary control plane evidence source in AWS. CloudTrail records API calls made to AWS services (management events by default, with data events such as S3 object reads available as an opt-in): who made the call (IAM principal, access key, or assumed role), what action was requested, what resource was affected, from which IP address and user agent, and whether the call succeeded or failed. An attacker creating a new IAM user, attaching an AdministratorAccess policy, and exfiltrating S3 objects generates a CloudTrail record of each step, although the object reads appear only if S3 data event logging is enabled. Organizations with CloudTrail enabled, logs delivered to a dedicated forensic S3 bucket with MFA delete enabled, and logs replicated to a separate account controlled by the security team have a tamper-resistant record that survives even if the attacker compromises the primary AWS account.
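As a rough illustration of how these fields support an investigation, the sketch below pulls the who/what/where/outcome fields from CloudTrail records and lists successful IAM changes in time order. The field names (`eventTime`, `eventName`, `eventSource`, `userIdentity`, `sourceIPAddress`, `userAgent`, `errorCode`) follow the CloudTrail record format; the function names and the short list of "suspicious" event names are assumptions made for this example, not a complete detection rule set.

```python
import json
from typing import Any, Dict, Iterable, List

# Event names treated as high-signal in this sketch (an illustrative
# assumption, not an exhaustive list).
SUSPICIOUS_IAM_EVENTS = {"CreateUser", "CreateAccessKey", "AttachUserPolicy"}


def load_log_file(text: str) -> List[Dict[str, Any]]:
    """A CloudTrail log file is a JSON object with a top-level 'Records' list."""
    return json.loads(text)["Records"]


def summarize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Extract who made the call, what was requested, from where, and the outcome."""
    identity = record.get("userIdentity", {})
    return {
        "time": record.get("eventTime"),
        "principal": identity.get("arn") or identity.get("type"),
        "access_key": identity.get("accessKeyId"),
        "service": record.get("eventSource"),
        "action": record.get("eventName"),
        "source_ip": record.get("sourceIPAddress"),
        "user_agent": record.get("userAgent"),
        # errorCode is present only when the call failed.
        "succeeded": "errorCode" not in record,
    }


def flag_iam_changes(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return summaries of successful, high-signal IAM calls, oldest first."""
    hits = [
        summarize_record(r)
        for r in records
        if r.get("eventSource") == "iam.amazonaws.com"
        and r.get("eventName") in SUSPICIOUS_IAM_EVENTS
        and "errorCode" not in r
    ]
    # ISO 8601 UTC timestamps in the same format sort correctly as strings.
    return sorted(hits, key=lambda h: h["time"] or "")
```

Failed calls are excluded here only to keep the output short; in a real investigation, a run of `AccessDenied` errors is often evidence of reconnaissance and worth keeping.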

VPC Flow Logs capture the 5-tuple of network connections (source IP, destination IP, source port, destination port, protocol) plus packets and bytes transferred and whether the traffic was accepted or rejected. Flow logs do not capture payload content: they show that host A connected to host B on port 443 and transferred 14.3 GB, not what that 14.3 GB contained. This is sufficient to identify data exfiltration by volume, identify command-and-control communications by destination IP, and establish lateral movement patterns between instances. Flow logs aggregate traffic over a capture window (10 minutes by default, configurable to 1 minute) before publishing to S3 or CloudWatch Logs, so the network record trails live traffic by minutes rather than being real-time.
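A minimal sketch of exfiltration-by-volume analysis follows. It assumes records in the default (version 2) flow log format, which is space-separated in the field order shown in `FIELDS`; custom formats would need a different field list. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Optional, Tuple

# Field order of the default (version 2) flow log record format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    """Parse one default-format record; return None if it carries no traffic."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    record: Dict[str, Any] = dict(zip(FIELDS, parts))
    # NODATA and SKIPDATA records (and header lines) have no usable counters.
    if record["log_status"] != "OK":
        return None
    record["bytes"] = int(record["bytes"])
    return record


def bytes_by_destination(lines: Iterable[str], source_ip: str) -> List[Tuple[str, int]]:
    """Total accepted bytes sent from source_ip to each destination, largest first."""
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        if record["srcaddr"] == source_ip and record["action"] == "ACCEPT":
            totals[record["dstaddr"]] += record["bytes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Sorting destinations by byte count puts the most likely exfiltration targets first, but what volume counts as abnormal depends on the instance's usual traffic.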

CloudWatch Logs capture application and operating system logs forwarded by the CloudWatch agent or application-level integrations. System logs, authentication events, application errors, and custom log streams all flow through CloudWatch. Retention is configurable: logs that expire before the investigation begins are permanently lost. A forensic logging architecture sets retention at a minimum of 12 months for security-relevant log groups.
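A retention audit along these lines can be sketched as a small check over log group descriptions. The input shape is assumed to match the `logGroups` entries returned by the CloudWatch Logs `DescribeLogGroups` API, where `retentionInDays` is absent for groups that never expire; the function name and the 365-day default are assumptions for this example.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Twelve months, expressed as a retention value CloudWatch Logs accepts.
MIN_RETENTION_DAYS = 365


def retention_gaps(
    log_groups: Iterable[Dict[str, Any]],
    minimum_days: int = MIN_RETENTION_DAYS,
) -> List[Tuple[str, int]]:
    """Return (log group name, configured days) for groups below the minimum."""
    gaps = []
    for group in log_groups:
        days = group.get("retentionInDays")
        # A missing retentionInDays means the group never expires: no gap.
        if days is not None and days < minimum_days:
            gaps.append((group["logGroupName"], days))
    return gaps
```

Running a check like this on a schedule, rather than once, would catch new log groups created with short retention after the initial audit.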

EBS snapshots capture the state of an Elastic Block Store volume at a point in time. When an instance is suspected of compromise, the forensic response procedure includes creating an EBS snapshot of all attached volumes before the instance is terminated. A volume restored from the snapshot is then attached to a separate forensic instance (ideally in an isolated forensic VPC with no internet access) and mounted read-only for analysis using standard Linux forensic tools (The Sleuth Kit, Autopsy, and Volatility if a separate memory capture was taken, since a snapshot contains disk contents only). This approximates the disk imaging step of traditional forensics, with the constraint that the snapshot captures the volume state at the moment of snapshot creation, not at the moment of initial compromise.

S3 access logs and S3 CloudTrail data events distinguish which objects were accessed. Standard CloudTrail captures S3 management events (bucket creation, policy changes) but not data events (object reads, uploads, deletes) unless data event logging is explicitly enabled. S3 server access logs provide per-request object access records but with a different format and delay than CloudTrail. For investigations involving data exfiltration from S3, both sources should be queried and correlated.
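To make the management/data event distinction concrete, the sketch below answers "which objects did each principal read from this bucket?" using CloudTrail S3 data events. It assumes data event logging was enabled and that `GetObject` records carry `requestParameters.bucketName` and `requestParameters.key`; the function name is hypothetical, and correlation with S3 server access logs is left out.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set


def objects_read(records: Iterable[Dict[str, Any]], bucket: str) -> Dict[str, Set[str]]:
    """Map each principal to the set of object keys it successfully read."""
    reads: Dict[str, Set[str]] = defaultdict(set)
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        # Only successful object reads; failed calls carry an errorCode.
        if record.get("eventName") != "GetObject" or "errorCode" in record:
            continue
        # requestParameters can be null in some records.
        params = record.get("requestParameters") or {}
        if params.get("bucketName") != bucket:
            continue
        identity = record.get("userIdentity", {})
        principal = identity.get("arn") or identity.get("principalId") or "unknown"
        reads[principal].add(params.get("key"))
    return dict(reads)
```

An empty result means only that no matching data events were logged. If data event logging was off during the incident window, that absence says nothing about whether objects were read.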

Container logs (EKS, GKE, ECS). Container orchestration platforms generate logs at multiple layers: Kubernetes control plane audit logs (similar in function to CloudTrail for Kubernetes API calls), kubelet logs, container stdout/stderr forwarded to CloudWatch or Cloud Logging, and application-level logs. EKS control plane logging captures authentication decisions, API calls, and scheduler activity. Container forensics must account for the fact that container filesystems are ephemeral by design: without a logging sidecar or log aggregation agent, container logs survive only as long as the container process runs.

Instance metadata. The EC2 instance metadata service (IMDS) exposes role credentials, instance identity documents, network configuration, and user data scripts. An attacker who accessed IMDS may have obtained temporary credentials associated with the instance profile. The instance identity document, cryptographically signed by AWS, provides a tamper-evident record of the instance's AMI, launch time, and instance type, useful for establishing the forensic context of the snapshot being analyzed.

Forensic Procedures: Snapshot Before Termination

The most critical cloud forensic procedure is evidence preservation before instance termination. Auto-scaling and auto-remediation tooling may terminate compromised instances automatically, so incident response runbooks must interrupt this automation when forensic preservation is required.

The preservation sequence for a compromised EC2 instance: (1) remove the instance from automated replacement by detaching it from its Auto Scaling group and load balancer target groups (or placing it in standby), because an isolated instance fails health checks and would otherwise be terminated; (2) isolate the instance by replacing its security groups with a restrictive group that allows no inbound or outbound traffic, preventing further attacker activity; (3) create snapshots of all attached EBS volumes; (4) capture instance metadata and current network interface configuration; (5) export relevant CloudTrail and VPC Flow Log data for the incident time window; (6) tag all resources with incident identifiers for chain-of-custody tracking; (7) only then release the instance for termination and replacement.
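The isolation, snapshot, metadata, and tagging steps of this sequence can be sketched as a single function. This is an outline under stated assumptions, not a tested runbook. The `ec2` argument is assumed to be a boto3 EC2 client (or any object exposing the same `describe_instances`, `modify_instance_attribute`, `create_snapshot`, and `create_tags` methods). The function name, the `IncidentId` tag key, and the pre-created rule-less isolation security group are hypothetical. Auto scaling handling and log export are left out. The sketch records instance state before isolating so the original security groups are part of the evidence.

```python
from typing import Any, Dict, List


def preserve_instance(
    ec2: Any, instance_id: str, isolation_sg_id: str, incident_id: str
) -> Dict[str, Any]:
    """Isolate an instance, snapshot its EBS volumes, and tag the evidence."""
    tags = [{"Key": "IncidentId", "Value": incident_id}]

    # Capture instance metadata first, including pre-isolation security
    # groups and network interface configuration.
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]

    # Isolate: replace every attached security group with the rule-less one.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[isolation_sg_id])

    # Snapshot each attached EBS volume, tagging snapshots at creation time.
    snapshot_ids: List[str] = []
    for mapping in instance.get("BlockDeviceMappings", []):
        volume_id = mapping.get("Ebs", {}).get("VolumeId")
        if not volume_id:
            continue
        snapshot = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"{incident_id} {instance_id} {mapping.get('DeviceName')}",
            TagSpecifications=[{"ResourceType": "snapshot", "Tags": tags}],
        )
        snapshot_ids.append(snapshot["SnapshotId"])

    # Tag the instance itself for chain-of-custody tracking.
    ec2.create_tags(Resources=[instance_id], Tags=tags)
    return {"instance": instance, "snapshots": snapshot_ids}
```

With boto3 installed and credentials configured, the call would look like `preserve_instance(boto3.client("ec2"), "i-...", "sg-...", "IR-...")`. The returned instance description should be written to the evidence store along with the snapshot IDs.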

Timeline Reconstruction from CloudTrail

Timeline reconstruction aggregates log sources into a unified chronological sequence of events. Athena queries against CloudTrail logs in S3 allow analysts to reconstruct the attacker's actions across an entire AWS account: when did the initial access occur, which API calls were made and in what order, which resources were created or accessed, and what lateral movement occurred between services and accounts.

The reconstruction must account for time zone normalization (CloudTrail records timestamps in UTC, while VPC Flow Logs record Unix epoch seconds), log propagation delays (VPC Flow Logs are aggregated over windows of up to 10 minutes before delivery), and gaps where logging was not enabled. An honest forensic timeline documents what is known, what was not captured due to logging gaps, and what can be inferred from indirect evidence.
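A minimal sketch of the merge step follows, assuming each source has already been reduced to events with a `time` and a `summary` (both hypothetical field names). Timestamps may be ISO 8601 strings or Unix epoch seconds. Treating timezone-naive strings as UTC is an assumption that should be checked per source. The gap helper only flags long silences as candidates to compare against the periods when logging is known to have been enabled.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Tuple, Union


def to_utc(timestamp: Union[str, int, float]) -> datetime:
    """Normalize an ISO 8601 string or Unix epoch seconds to an aware UTC datetime."""
    if isinstance(timestamp, (int, float)):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)
    # Older Python versions of fromisoformat() reject a trailing 'Z'.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption for this sketch: naive timestamps are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def build_timeline(events_by_source: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Merge events from all sources into one chronological list."""
    timeline = [
        {"time": to_utc(event["time"]), "source": source, "summary": event["summary"]}
        for source, events in events_by_source.items()
        for event in events
    ]
    return sorted(timeline, key=lambda entry: entry["time"])


def silent_windows(
    timeline: List[Dict[str, Any]], source: str, max_silence: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) pairs where `source` logged nothing for longer than max_silence."""
    times = [entry["time"] for entry in timeline if entry["source"] == source]
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > max_silence]
```

Keeping the `source` label on every entry lets the final report say which log produced each fact, and so which facts would be missing had that source been disabled.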

Why It Matters

Cloud forensics capability determines whether an organization learns from a breach or is left guessing about attacker activity, blast radius, and regulatory obligations.

When a breach involves regulated data (PHI under HIPAA, PII under GDPR, cardholder data under PCI DSS), the organization typically has a legal obligation to determine whether data was accessed or exfiltrated and to report to regulators within defined timeframes. Without forensic logging in place before the incident, the honest answer to "was customer data accessed?" is "we don't know," which in many regulatory frameworks triggers a worst-case assumption: report the breach for all potentially affected individuals, not just those confirmed affected. The financial and reputational difference between a confirmed 50-record breach and a worst-case 500,000-record breach notification is substantial.

Forensics also directly improves future detection. Every reconstructed attacker timeline produces MITRE ATT&CK technique mappings: this is how the attacker gained initial access (T1078, Valid Accounts), this is how they escalated privileges (T1548, Abuse Elevation Control Mechanism), this is how they exfiltrated data (T1537, Transfer Data to Cloud Account). Those techniques become new detection engineering inputs, closing the loop between incident response and proactive threat detection.

Insurance and legal contexts increasingly require demonstrable forensic capability. Cyber insurance underwriters ask whether CloudTrail is enabled, whether logs are retained for 12 months, and whether the organization has an incident response retainer with a forensic firm. Organizations that cannot answer yes to these questions pay higher premiums or face coverage exclusions.

Technical Details

Forensic Account Architecture

Best practice separates the forensic environment from the production AWS account. A dedicated forensic account, accessed only by the security team, receives CloudTrail log replication, stores EBS snapshots copied from production accounts during incidents, and hosts forensic EC2 instances with no internet access. S3 buckets in the forensic account have Object Lock enabled in compliance mode, a WORM (Write Once Read Many) setting that prevents log deletion even by account administrators until the retention period expires.

Multi-account organizations should enable CloudTrail as an organization-level trail, capturing events across all member accounts into a centrally managed S3 bucket in the security account. This prevents an attacker who compromises one account from deleting evidence that already exists in an account they cannot reach.
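The two controls described in this section can be sketched with boto3-style calls. The `s3` and `cloudtrail` arguments are assumed to be boto3 clients (or objects exposing the same `put_object_lock_configuration`, `create_trail`, and `start_logging` methods), and the function names and retention period are hypothetical. The sketch also assumes the bucket already supports Object Lock, carries a bucket policy that lets CloudTrail write to it, and that the caller has permission to create organization trails. Compliance-mode retention cannot be shortened once applied, so it should be tested in a scratch bucket first.

```python
from typing import Any, Dict


def lock_evidence_bucket(s3: Any, bucket: str, retention_days: int) -> Dict[str, Any]:
    """Apply a default compliance-mode (WORM) retention rule to an evidence bucket."""
    config = {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": retention_days}},
    }
    s3.put_object_lock_configuration(Bucket=bucket, ObjectLockConfiguration=config)
    return config


def create_org_trail(cloudtrail: Any, trail_name: str, bucket: str) -> str:
    """Create and start an organization-wide, multi-region trail; return its ARN."""
    response = cloudtrail.create_trail(
        Name=trail_name,
        S3BucketName=bucket,
        IsMultiRegionTrail=True,       # capture events from every region
        IsOrganizationTrail=True,      # capture events from all member accounts
        EnableLogFileValidation=True,  # digest files make tampering detectable
    )
    # A newly created trail does not record events until logging is started.
    cloudtrail.start_logging(Name=response["TrailARN"])
    return response["TrailARN"]
```

Log file validation complements Object Lock: the lock stops deletion, while the digest files let an analyst show later that delivered log files were not altered.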

Tools for Cloud Forensic Analysis

Cado Security is a cloud-native forensic platform that automates evidence acquisition from AWS, Azure, and GCP, producing a unified timeline, filesystem artifacts, and memory analysis results without requiring manual volume mounting.

AWS built-in services cover substantial forensic ground natively: Athena for querying CloudTrail logs via SQL, GuardDuty for automated threat detection, Security Hub for aggregating findings, and Detective for graph-based investigation of CloudTrail, VPC Flow Logs, and GuardDuty findings.

Open-source tools remain relevant for analysis of EBS snapshot disk images: The Sleuth Kit and Autopsy for filesystem artifacts, Volatility for memory analysis, and log2timeline/Plaso for super-timeline construction across multiple log sources.

CDA Perspective

Cloud forensics is where TID's Predictive Defense Intelligence methodology meets operational reality. PDI says "see the threat before it sees you," but before requires a historical record to learn from. Every incident that occurs without adequate logging leaves the defender permanently blind to what the attacker did, which techniques they used, and which detection rules would have caught them earlier.

CDA's approach treats forensic readiness as infrastructure, not incident response improvisation. CloudTrail must be enabled before the incident. Log retention policies must be set before the investigation begins. EBS snapshot procedures must be documented and tested before an analyst is paging through runbooks at 2 AM during an active breach.

Within the PDM's atmosphere layer, cloud forensics serves a dual function: it provides the post-incident evidence needed to understand attacker behavior, and it feeds the intelligence cycle that sharpens future TID detection. An organization that treats forensic data as a byproduct of incident response, rather than a first-class intelligence input, is running a one-directional TID program. The atmosphere detects, responds, and then discards the data. CDA's model closes the loop: detect, respond, reconstruct, refine, and build better detection from the evidence of every attack.

For organizations running multi-cloud environments, the forensic data model must span providers. AWS CloudTrail, Azure Activity Logs, and GCP Cloud Audit Logs each have different schemas and default retention policies. Cross-cloud forensics requires a unified SIEM that normalizes these sources into a common timeline format, a capability grounded in CDA's SPH logging architecture work.

Key Takeaways

Cloud forensics cannot be reactive: forensic readiness (CloudTrail enabled, VPC Flow Logs active, log retention configured, snapshot procedures documented) must be in place before an incident occurs. Logging gaps discovered during an investigation cannot be retroactively filled.

The evidence sources that replace physical disk and memory in cloud investigations are control plane logs (CloudTrail), network logs (VPC Flow Logs), and EBS snapshots. Each source has gaps that must be understood and accounted for in the investigation.

Jurisdiction and shared responsibility boundaries constrain what the investigator can access. Hypervisor-layer and multi-tenant hardware evidence belongs to the cloud provider, not the tenant. Multi-region data storage creates legal exposure that must be planned for before an incident, not discovered during one.

Evidence preservation requires isolating and snapshotting compromised instances before auto-scaling terminates them. Incident response runbooks must explicitly interrupt auto-termination during forensic holds.

Forensic timelines from CloudTrail provide ATT&CK-mapped technique records that feed detection engineering and improve future threat identification.

Sources

Amazon Web Services. AWS CloudTrail Documentation. https://docs.aws.amazon.com/cloudtrail/

Amazon Web Services. Amazon VPC Flow Logs. https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html

Cloud Security Alliance. Cloud Forensics Capability Maturity Model. CSA, 2023. https://cloudsecurityalliance.org/

NIST. SP 800-86: Guide to Integrating Forensic Techniques into Incident Response. NIST, 2006. https://csrc.nist.gov/publications/detail/sp/800-86/final

Cado Security. Cloud Forensics and Incident Response Platform. https://www.cadosecurity.com/

MITRE Corporation. MITRE ATT&CK Cloud Matrix. https://attack.mitre.org/matrices/enterprise/cloud/

CDA, LLC. Planetary Defense Model Master Reference. CDA Canon, 2026.