# Data Breach Investigation
## Definition
A data breach investigation is the structured process of determining the scope, cause, impact, and affected data of a cybersecurity incident that resulted in unauthorized access to or exfiltration of sensitive information. The investigation answers five questions that every subsequent decision depends on: what happened, how did it happen, what data was affected, who was affected, and is the threat still present.
A breach investigation is not the same as incident response, though they overlap. Incident response focuses on containment and recovery: stop the attack, restore operations, and return to normal. Breach investigation focuses on facts and evidence: determine exactly what occurred, quantify the impact, preserve evidence for legal proceedings, and produce the findings that regulatory notification, insurance claims, and legal defense require.
The investigation's findings drive consequential decisions. The number of affected individuals determines which state breach notification laws apply and whether federal notification is required (under HIPAA, a breach affecting 500 or more individuals must be reported to HHS promptly and is posted on the HHS OCR public breach portal, informally called the "wall of shame"; a breach affecting more than 500 residents of a state or jurisdiction also requires notice to prominent media outlets). The type of data affected determines the regulatory framework (payment card data triggers PCI forensic investigation requirements, health data triggers HIPAA breach analysis). The root cause determines the remediation plan. The evidence determines the viability of law enforcement referral, insurance claims, and legal defense. Every decision downstream from the breach depends on the investigation's accuracy.
## How It Works
### Investigation Phases
Phase 1: Initial assessment (Hours 0-24). Determine whether a breach has occurred and initiate the investigation. A breach is distinct from a security incident: a breach involves unauthorized access to or acquisition of protected data. An incident where ransomware encrypted systems but no data was accessed or exfiltrated is an incident, not necessarily a breach (though the forensic investigation must confirm this determination).
Initial assessment actions: review available evidence (SIEM alerts, EDR detections, user reports, threat intelligence), determine whether data was likely accessed or exfiltrated, engage breach counsel (attorney-client privilege over the investigation should be established early), notify the cyber insurance carrier (to activate the IR panel), and preserve evidence before containment actions modify the environment.
The attorney-client privilege decision is critical and time-sensitive. If breach counsel (external legal counsel specializing in data breach response) is engaged before the investigation begins and directs the forensic investigation, the investigation findings may be protected by attorney-client privilege and work product doctrine. This protection can shield the investigation findings from discovery in subsequent litigation. If the investigation is conducted without counsel direction, the findings may be discoverable. The privilege decision should be made within the first hours.
Phase 2: Scoping (Days 1-7). Determine the extent of the compromise. Scoping answers: which systems were compromised, how long the attacker was present (dwell time), what access the attacker had, and which data stores the attacker could reach from the compromised systems.
Scoping techniques: forensic imaging of compromised systems, SIEM log analysis for attacker activity (authentication events, lateral movement indicators, data access patterns), EDR telemetry review (process execution, file access, network connections on compromised endpoints), network traffic analysis (data flows to external destinations, command-and-control communication), and cloud audit log review (if cloud systems are in scope).
The scoping phase frequently reveals that the breach is larger than initially suspected. An alert on one endpoint leads to evidence of lateral movement to five additional systems. Log analysis reveals the attacker was present for weeks before detection. Network traffic analysis shows data exfiltration to external destinations that the initial alert did not identify. Scoping must be thorough because underscoping the breach produces inaccurate notification (notifying fewer individuals than were actually affected) that creates regulatory and legal exposure when the true scope is later discovered.
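The scoping questions above (which systems, and how long the attacker was present) can be sketched as a small summary over events already attributed to the attacker. This is a minimal illustration, not a forensic tool: the `AttackerEvent` schema, the host names, and the choice to measure dwell time from first known activity to detection are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AttackerEvent:
    """One event attributed to the attacker (hypothetical schema)."""
    timestamp: datetime
    host: str
    source: str  # log source, e.g. "siem", "edr", "netflow"


def summarize_scope(events, detected_at):
    """Return earliest/latest attacker activity, dwell time, and affected hosts."""
    if not events:
        raise ValueError("no attacker events to summarize")
    first_seen = min(e.timestamp for e in events)
    last_seen = max(e.timestamp for e in events)
    return {
        "first_seen": first_seen,
        "last_seen": last_seen,
        # Assumption: dwell time = first known attacker activity until detection.
        "dwell_days": (detected_at - first_seen).days,
        "hosts": sorted({e.host for e in events}),
    }


# Illustrative data only; real events would come from SIEM/EDR exports.
events = [
    AttackerEvent(datetime(2024, 3, 10), "ws-014", "edr"),
    AttackerEvent(datetime(2024, 3, 1), "vpn-gw", "siem"),
    AttackerEvent(datetime(2024, 3, 20), "db-02", "siem"),
]
summary = summarize_scope(events, detected_at=datetime(2024, 3, 22))
print(summary["dwell_days"], summary["hosts"])
```

The summary is only as complete as the events fed into it, so each newly discovered host or log source should trigger a re-run.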
Phase 3: Root cause analysis (Days 3-14). Determine how the attacker gained initial access. The root cause is the specific vulnerability, misconfiguration, or human action that the attacker exploited to enter the environment. Common root causes:
- Phishing: an employee clicked a link or opened an attachment that provided the attacker with credentials or malware execution. The investigation identifies the specific phishing email, the employee who interacted with it, and the credential or access that was compromised.
- Credential compromise: the attacker used stolen credentials (from a previous breach, credential stuffing, or password spraying) to authenticate to an internet-facing system (VPN, email, cloud application). The investigation identifies which credentials were used, how they were obtained, and whether MFA was in place (and if not, why not).
- Vulnerability exploitation: the attacker exploited a known vulnerability in an internet-facing system (web application, VPN appliance, email server). The investigation identifies the specific CVE, when the vulnerability was disclosed, when the patch was available, and why the patch was not applied before exploitation.
- Supply chain compromise: the attacker compromised a third-party vendor and used the vendor's legitimate access to reach the organization's environment. The investigation identifies the compromised vendor, the access path, and the data the vendor's access could reach.
- Insider action: an employee or contractor with legitimate access deliberately or accidentally exposed data. The investigation identifies the individual, the action, and the intent (if determinable).
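For the vulnerability exploitation case, the timeline the investigation reconstructs (disclosure, patch availability, exploitation) reduces to simple date arithmetic. The sketch below is illustrative: the function name, field names, and dates are hypothetical, and real dates would come from the vendor advisory and the forensic timeline.

```python
from datetime import date


def patch_exposure(disclosed, patch_released, exploited):
    """Compute how long the vulnerability was known and patchable before exploitation.

    A negative value means exploitation happened before that milestone
    (for example, exploitation before any patch existed).
    """
    return {
        "days_from_disclosure_to_exploit": (exploited - disclosed).days,
        "days_patch_available_before_exploit": (exploited - patch_released).days,
    }


# Hypothetical dates for illustration.
exposure = patch_exposure(
    disclosed=date(2024, 1, 10),
    patch_released=date(2024, 1, 17),
    exploited=date(2024, 2, 16),
)
print(exposure)
```

The second figure is the one that usually matters for the "why was the patch not applied" question, because it measures the window in which remediation was possible.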
Phase 4: Data impact assessment (Days 7-30). Determine which data was actually accessed or exfiltrated. This is the most consequential investigation phase because the data impact determination drives notification obligations.
Data impact assessment requires: identifying every data store the attacker accessed (from the scoping phase), determining what data those stores contain (from the data inventory, if one exists, or through forensic examination of the stores), and determining whether the data was actually viewed, copied, or exfiltrated (as opposed to merely accessible from a system the attacker compromised).
The distinction between data that was merely "accessible," data that was "accessed," and data that was "exfiltrated" matters for both legal and regulatory purposes. If the attacker compromised a server that had access to the customer database but forensic evidence shows the attacker never queried the database, the data was accessible but may not have been breached. If the attacker ran SELECT * FROM customers and transferred the results to an external server, the data was exfiltrated. The evidence must support the determination.
For notification purposes, many state laws define breach as "unauthorized access to" personal information, not just exfiltration. Under these definitions, the attacker's access to a system containing personal information may trigger notification even if exfiltration cannot be confirmed. Breach counsel advises on the legal determination based on the applicable laws and the forensic findings.
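The evidentiary states discussed in this phase (accessible, accessed, exfiltrated) can be recorded per data store so the report states the strongest level the evidence supports. The following is a sketch under assumptions: the `StoreEvidence` fields and store names are hypothetical, and the classification is a forensic label only. Whether a given label triggers notification remains a legal determination for breach counsel.

```python
from dataclasses import dataclass
from enum import Enum


class Exposure(Enum):
    ACCESSIBLE = "accessible"    # reachable from a compromised system, no access evidence
    ACCESSED = "accessed"        # evidence the attacker read or queried the data
    EXFILTRATED = "exfiltrated"  # evidence the data left the environment


@dataclass
class StoreEvidence:
    """Forensic findings for one data store (hypothetical fields)."""
    name: str
    query_evidence: bool     # e.g. audit logs show attacker queries or file reads
    transfer_evidence: bool  # e.g. network logs show the data sent externally


def classify(evidence):
    """Return the strongest exposure level the evidence supports."""
    if evidence.transfer_evidence:
        return Exposure.EXFILTRATED
    if evidence.query_evidence:
        return Exposure.ACCESSED
    return Exposure.ACCESSIBLE


stores = [
    StoreEvidence("customers_db", query_evidence=True, transfer_evidence=True),
    StoreEvidence("hr_share", query_evidence=True, transfer_evidence=False),
    StoreEvidence("archive", query_evidence=False, transfer_evidence=False),
]
for store in stores:
    print(store.name, classify(store).value)
```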
Phase 5: Attribution and intelligence (Days 14-60). Determine who conducted the attack, if possible. Attribution is not always achievable (sophisticated attackers use extensive operational security to prevent attribution) and is not required for notification or remediation. It matters for threat intelligence (understanding the adversary improves future defense), law enforcement referral (attribution enables criminal investigation), and insurance coverage (nation-state attribution may trigger war exclusions in the policy).
Attribution techniques: comparing observed TTPs (tactics, techniques, and procedures) against known threat actor profiles, analyzing malware against known malware families, tracing network infrastructure (C2 servers, exfiltration destinations) to known threat actor infrastructure, and correlating with sector-specific ISAC intelligence about active campaigns.
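One simple way to illustrate the TTP comparison described above is set overlap between the techniques observed in the incident and those recorded in each threat actor profile. The sketch below uses Jaccard similarity, which is an assumption chosen for clarity rather than a method the source prescribes. The technique IDs follow MITRE ATT&CK numbering, and the actor names and profiles are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_actors(observed, profiles):
    """Rank actor profiles by TTP overlap with the observed techniques."""
    scores = {name: jaccard(observed, ttps) for name, ttps in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Techniques observed in the incident (illustrative).
observed = {"T1566", "T1078", "T1041"}

# Hypothetical actor profiles; real ones would come from threat intelligence.
profiles = {
    "ACTOR-A": {"T1566", "T1078", "T1041", "T1486"},
    "ACTOR-B": {"T1190", "T1041"},
}

ranked = rank_actors(observed, profiles)
print(ranked)
```

A high overlap score is a lead, not a conclusion: many actors share common techniques, so the score should be weighed alongside malware and infrastructure evidence.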
Phase 6: Reporting and notification (Days 14-60+). Produce the investigation report and execute notification obligations. The forensic report documents: the investigation scope, methodology, evidence collected, factual findings, root cause, data impact determination, and remediation recommendations. The report supports: regulatory notification (providing the factual basis for what happened and what data was affected), insurance claims (documenting the incident, impact, and costs), legal defense (providing evidence for potential litigation), and organizational improvement (identifying the controls that failed and the remediation needed).
Notification execution follows the determination of which laws apply, which individuals are affected, and what the notification content must include (see Incident Communication and Notification for detailed notification guidance).
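Determining which individuals are affected usually means merging records from several data stores, where the same person can appear more than once. The sketch below deduplicates by an identifier and counts affected individuals per state of residence. The record shape, the identifiers, and the choice to keep the first state seen for each individual are assumptions for illustration; which laws the counts trigger is a question for breach counsel.

```python
from collections import Counter


def count_affected(records):
    """Count unique affected individuals overall and per state.

    records: iterable of (individual_id, state) pairs, possibly containing
    duplicates because one person appears in several data stores.
    """
    unique = {}
    for individual_id, state in records:
        # Assumption: keep the first state seen for each individual.
        unique.setdefault(individual_id, state)
    return len(unique), dict(Counter(unique.values()))


# Illustrative records only.
records = [("p1", "CA"), ("p2", "NY"), ("p1", "CA"), ("p3", "CA")]
total, by_state = count_affected(records)
print(total, by_state)
```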
### Investigation Standards
PCI Forensic Investigator (PFI). If payment card data is involved, the payment card brands (such as Visa and Mastercard) may require a PCI Forensic Investigator to conduct the investigation. PFIs are companies approved and listed by the PCI SSC that follow the investigation procedures the card brands require. The PFI report is submitted to the card brands and the acquirer.
HIPAA breach risk assessment. If PHI is involved, the covered entity (or business associate) must conduct a four-factor risk assessment to determine whether the incident triggers notification: the nature and extent of the PHI involved, the unauthorized person who used the PHI or to whom it was disclosed, whether the PHI was actually acquired or viewed, and the extent to which the risk to the PHI has been mitigated. An impermissible use or disclosure is presumed to be a breach unless the assessment demonstrates a low probability that the PHI was compromised. If the risk assessment reaches that low-probability conclusion, notification may not be required. If the risk assessment is inconclusive or indicates probable compromise, notification is required.
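The four-factor assessment can be captured as a structured record so that each factor has a documented finding and the conclusion follows the rule stated above: notification is required unless a low probability of compromise is affirmatively concluded. This is a documentation sketch with hypothetical field names, not a substitute for the legal analysis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HipaaRiskAssessment:
    """Documented findings for the four factors (hypothetical field names)."""
    nature_and_extent_of_phi: str
    unauthorized_person: str
    phi_acquired_or_viewed: str
    extent_of_mitigation: str
    # True = low probability concluded, False = probable compromise,
    # None = inconclusive.
    low_probability_of_compromise: Optional[bool] = None

    def notification_required(self) -> bool:
        """Required unless low probability of compromise is affirmatively concluded."""
        return self.low_probability_of_compromise is not True


# Illustrative entry; the findings text is invented for the example.
assessment = HipaaRiskAssessment(
    nature_and_extent_of_phi="Names and appointment dates; no clinical notes",
    unauthorized_person="Unknown external actor",
    phi_acquired_or_viewed="Access logs inconclusive",
    extent_of_mitigation="Credentials rotated; no recovery of data confirmed",
)
print(assessment.notification_required())
```

Defaulting the conclusion to inconclusive means an unfinished assessment errs toward notification rather than away from it.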
### Common Investigation Failures
Insufficient evidence preservation. Systems are rebuilt or reimaged before forensic images are captured. Logs expire before they are exported. Volatile evidence (memory) is lost when systems are rebooted during containment. Each evidence loss reduces the investigation's ability to answer the five core questions.
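One common safeguard when preserving evidence is to record a cryptographic hash of each forensic image or exported log at collection time, then re-verify it before analysis. The sketch below uses Python's standard `hashlib`; the function names and the chunk size are assumptions, and the choice of SHA-256 is illustrative rather than something the source mandates.

```python
import hashlib


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large forensic images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, recorded_hash):
    """Return True if the file still matches the hash recorded at collection."""
    return sha256_file(path) == recorded_hash.strip().lower()
```

In practice, `sha256_file` would run once at acquisition, with the result written to the chain-of-custody record, and `verify_image` would run each time the image changes hands or is opened for analysis.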
Underscoping. The investigation focuses on the initially detected systems without searching for evidence of lateral movement, additional compromised systems, or data access beyond the initial detection point. Underscoping produces an incomplete breach determination that must be revised (and re-notified) when additional scope is later discovered.
Delayed counsel engagement. The investigation begins without breach counsel direction, producing findings that may not be protected by attorney-client privilege. Those findings may then be discoverable in litigation, potentially including internal assessments, vulnerability identifications, and candid communications that damage the organization's legal position.
Assuming no exfiltration without evidence. The investigation finds no evidence of data exfiltration and concludes that no exfiltration occurred. The absence of evidence is not evidence of absence: the attacker may have exfiltrated data through an encrypted channel that the organization's monitoring did not capture, or may have used a technique that does not leave artifacts in the available log sources. The determination should state "no evidence of exfiltration was found in the available data sources" rather than "no exfiltration occurred."
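The recommended wording can be enforced by generating the finding from the list of data sources actually reviewed, so the statement never claims more than the evidence supports. This is a small sketch; the function name, parameters, and source names are hypothetical.

```python
def exfiltration_finding(sources_reviewed, sources_with_evidence=(), sources_unavailable=()):
    """Build a finding statement bounded by the data sources that were reviewed."""
    if sources_with_evidence:
        return "Evidence of exfiltration was found in: " + ", ".join(sources_with_evidence) + "."
    finding = (
        "No evidence of exfiltration was found in the available data sources ("
        + ", ".join(sources_reviewed)
        + ")."
    )
    if sources_unavailable:
        # Stating the gaps makes the limits of the conclusion explicit.
        finding += " Not available for review: " + ", ".join(sources_unavailable) + "."
    return finding


statement = exfiltration_finding(
    sources_reviewed=["firewall logs", "EDR telemetry"],
    sources_unavailable=["proxy logs (expired)"],
)
print(statement)
```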
## Why It Matters
### Legal and Financial Consequences
Breach investigation findings determine the organization's legal exposure: which notification laws apply, how many individuals must be notified, what regulatory reports must be filed, and what litigation risk exists. An investigation that underscopes the breach or misidentifies the affected data produces notifications that must be corrected when the true scope is discovered, compounding the legal exposure and reputational damage.
### Insurance Claims
Cyber insurance claims require documented evidence of the incident. The forensic report is the primary evidence artifact. An investigation that follows recognized methodology, maintains chain of custody, and produces a comprehensive report supports the claim. An ad hoc investigation with incomplete documentation invites claim disputes.
### Regulatory Expectations
Regulators evaluate whether the organization conducted a thorough investigation. OCR (HIPAA), state attorneys general, the SEC, and card brands all assess the quality of the investigation as part of their enforcement evaluation. An organization that conducted a thorough, well-documented investigation and notified based on its findings demonstrates good faith. An organization that conducted a minimal investigation and produced vague notifications invites a finding of negligence.
## CDA Perspective
Breach investigation spans TID (forensic analysis), RGA (notification and regulatory compliance), and DPS (data impact assessment) in the Planetary Defense Model. TID provides the technical investigation capability: forensic imaging, log analysis, malware analysis, and timeline reconstruction. RGA provides the governance framework: breach counsel coordination, notification execution, regulatory reporting, and insurance claims management. DPS provides the data context: the data inventory that enables rapid identification of affected data categories and the classification that determines notification requirements.
TID-D03 (Forensic Investigation, variable hours) is the mission CDA deploys for breach investigations. The mission scope varies with incident complexity. CDA's approach integrates the technical investigation with the legal and regulatory dimensions from day one: breach counsel is engaged immediately, evidence is preserved under privilege, and the investigation plan addresses both technical findings and legal requirements simultaneously.
CDA's emphasis: the data inventory determines investigation speed. An organization with a current data inventory can determine "what data was on the compromised system" in hours. An organization without a data inventory must forensically examine every data store the attacker accessed to determine contents, which adds days or weeks to the investigation timeline. DPS-R01 (Data Inventory and Mapping) is the mission that enables rapid breach impact assessment.
## Key Takeaways
- Breach investigation determines scope, cause, data impact, affected individuals, and threat status. Every notification, insurance, and legal decision depends on the investigation's accuracy.
- Six investigation phases: initial assessment, scoping, root cause analysis, data impact assessment, attribution, and reporting/notification.
- Engage breach counsel before the investigation begins to establish attorney-client privilege over the findings.
- Data impact assessment is the most consequential phase: it determines which notification laws apply, how many individuals are affected, and what regulatory reports are required.
- CDA's emphasis: the data inventory determines investigation speed. Organizations with current data inventories can identify the data on a compromised system in hours. Organizations without them add days or weeks to the investigation.
## Related Articles
- Incident Response Lifecycle
- Digital Forensics and Evidence Handling
- Incident Communication and Notification
- Data Governance
- Cyber Insurance
- HIPAA Security Rule
## Sources
- National Institute of Standards and Technology (NIST). "Computer Security Incident Handling Guide: SP 800-61 Rev. 2." U.S. Department of Commerce, 2012.
- U.S. Department of Health and Human Services. "Breach Notification Rule: 45 CFR 164.400-414, Breach Risk Assessment Guidance." HHS OCR, 2024.
- PCI Security Standards Council. "PCI Forensic Investigator (PFI) Program Guide." PCI SSC, 2024.
- Mandiant (Google Cloud). "Breach Investigation Best Practices." Mandiant, 2024.
- Sedona Conference. "Commentary on Legal Holds: The Trigger and The Process, Third Edition." Sedona Conference, 2023. (Evidence preservation and privilege guidance.)
Written by Evan Morgan