# TOP Mission TID-R06: Detection Engineering Program

Detection engineering is the disciplined practice of designing, building, validating, and maintaining the analytics and rules that power a security operations center's ability to identify adversary activity. TID-R06 exists because most organizations accumulate detection content reactively, adding rules after incidents and rarely retiring or tuning what already exists. The result is a detection layer that is bloated, noisy, and full of gaps. This mission establishes a repeatable, structured program that treats detection logic as a software engineering product: versioned, tested, peer-reviewed, and aligned to a threat model. It addresses the operational question every security team must be able to answer: given the threats most likely to target this organization, do we have validated detection coverage for them?

---

## Definition

Detection engineering is the systematic application of engineering principles to the creation and lifecycle management of detection content. That content includes correlation rules, behavioral analytics, threshold alerts, machine learning models, and any other logic that transforms raw security telemetry into actionable alerts.

This definition is deliberately narrow because the term is frequently confused with adjacent disciplines. Detection engineering is not threat hunting. Threat hunting is a proactive, hypothesis-driven investigation of data that has not triggered an alert. Detection engineering takes the findings from a successful hunt and converts them into durable, automated detection logic, so the same adversary behavior triggers an alert the next time it appears. Nor is detection engineering SIEM administration. Managing data pipelines, configuring log sources, and maintaining platform infrastructure are prerequisites for detection engineering, but they are not the engineering work itself.

The scope of TID-R06 covers the complete detection lifecycle: threat modeling that identifies what to detect, content creation that builds the detection logic, validation that ensures the logic performs correctly, deployment that moves tested content into production, and continuous improvement that keeps detection fidelity high over time. This mission does not cover incident response workflows, SIEM platform selection, or threat intelligence collection; those activities are addressed in adjacent TOP missions. TID-R06 assumes that telemetry exists and that a platform capable of executing detection logic is in place.

Detection engineering exists because reactive rule accumulation creates more noise than signal. Organizations that add detections after incidents but never retire obsolete ones build detection libraries that consume analyst attention without improving security outcomes. The engineering discipline applies quality gates, performance metrics, and lifecycle management to ensure detection content serves operational needs rather than becoming an operational burden.

---

## How It Works

Detection engineering follows a five-phase lifecycle that transforms threat intelligence into validated, production-ready detection content. Each phase produces concrete artifacts and has defined entry and exit criteria that prevent premature advancement.

### Phase 1: Threat Modeling and Coverage Analysis

The engineering cycle begins with a scoped threat model that identifies which adversary groups, attack techniques, or malware families are most relevant to the organization. This model considers industry sector, geographic location, infrastructure profile, and historical incident data. The threat model is then mapped to the MITRE ATT&CK framework to produce a coverage matrix: a structured view of which techniques the organization currently detects, which it does not, and which represent the highest-priority gaps.

For example, a regional bank might determine that financially motivated threat groups using business email compromise and credential harvesting represent their primary threat. The coverage matrix might reveal strong detection for known phishing domains but no coverage for techniques like T1114 (Email Collection) or T1552.001 (Credentials in Files). Those gaps become the engineering backlog, prioritized by likelihood and business impact.
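The gap analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the 1-to-5 `likelihood` and `impact` scales, the scores assigned to each technique, and the `validated_coverage` inventory are all hypothetical values chosen to mirror the regional bank example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatTechnique:
    technique_id: str  # ATT&CK technique ID
    name: str
    likelihood: int    # 1 (rare) to 5 (expected); hypothetical scale
    impact: int        # 1 (minor) to 5 (severe); hypothetical scale


# Hypothetical threat model for the regional bank example.
threat_model = [
    ThreatTechnique("T1566", "Phishing", 5, 3),
    ThreatTechnique("T1114", "Email Collection", 4, 4),
    ThreatTechnique("T1552.001", "Credentials In Files", 3, 5),
]

# Technique IDs with at least one validated detection (assumed inventory).
validated_coverage = {"T1566"}


def coverage_gaps(model, covered):
    """Return uncovered techniques, highest likelihood x impact first."""
    gaps = [t for t in model if t.technique_id not in covered]
    return sorted(gaps, key=lambda t: t.likelihood * t.impact, reverse=True)


backlog = coverage_gaps(threat_model, validated_coverage)
for t in backlog:
    print(f"{t.technique_id:10} {t.name:22} priority={t.likelihood * t.impact}")
```

Multiplying likelihood by impact is only one possible prioritization scheme; a program could equally weight the two factors differently or add a data-availability term.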

### Phase 2: Detection Design and Content Creation

Engineers write detection logic against specific ATT&CK techniques, using threat intelligence and adversary simulation data to inform what the logic should identify. Effective detection design requires understanding the data source: what fields are available, what values represent normal activity, and what patterns indicate adversary behavior. A detection for credential access written against Windows Security Event Logs requires different logic than one written against PowerShell Script Block Logging or endpoint detection and response telemetry.

Detection content is documented using structured formats that capture the intent, data dependencies, logic components, expected alert volume, and analyst response guidance. Many organizations adopt the Sigma rule format, which provides a vendor-agnostic YAML schema for expressing detection logic. Sigma rules are converted to platform-specific query languages by backend tooling, which keeps them portable across platform changes and auditable by security teams.
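To make the idea of structured detection content concrete, the sketch below holds a rule as a plain Python dictionary whose keys loosely echo Sigma's (`title`, `tags`, `logsource`, `detection`, `falsepositives`, `level`). It is not a Sigma parser: the `matches` helper, the `expected_daily_alerts` and `response` keys, the filter path, and the exact-equality matching are all simplifying assumptions made for illustration.

```python
# Illustrative rule structure; key names loosely follow Sigma, values are examples.
rule = {
    "title": "Process Access to LSASS (illustrative)",
    "tags": ["attack.credential_access", "attack.t1003.001"],
    "logsource": {"product": "windows", "category": "process_access"},
    "detection": {
        "selection": {"TargetImage": r"C:\Windows\System32\lsass.exe"},
        # Hypothetical allowlisted tool; a real filter comes from tuning.
        "filter": {"SourceImage": r"C:\Tools\approved_agent.exe"},
    },
    "falsepositives": ["Approved endpoint agents that inspect LSASS"],
    "level": "high",
    # The two keys below are not Sigma fields; they are assumed extensions
    # that capture expected volume and analyst guidance.
    "expected_daily_alerts": 2,
    "response": "Confirm the source process, then isolate the host if unknown.",
}


def matches(rule, event):
    """Evaluate 'selection and not filter' using exact field equality."""
    detection = rule["detection"]
    selected = all(event.get(k) == v for k, v in detection["selection"].items())
    filtered = all(event.get(k) == v for k, v in detection["filter"].items())
    return selected and not filtered


suspicious = {
    "TargetImage": r"C:\Windows\System32\lsass.exe",
    "SourceImage": r"C:\Temp\dumper.exe",
}
print(matches(rule, suspicious))
```

Keeping the logic in data rather than in platform queries is what allows the same rule to be reviewed, diffed, and converted for different backends.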

### Phase 3: Validation and Quality Assurance

No detection advances to production without comprehensive testing. Validation has two required components. Functional testing confirms the detection fires when adversary activity occurs. This involves generating synthetic attack activity using frameworks like Atomic Red Team, which provides a library of small, repeatable test procedures organized by ATT&CK technique. Running an Atomic test for T1003.001 (LSASS Memory) and confirming the corresponding detection triggers is a basic but essential quality gate.

False positive testing estimates alert volume against historical production data. This testing identifies normal business activities that might trigger the detection and allows engineers to refine logic before deployment. A detection that would have generated fifty alerts daily against legitimate administrative activity requires tuning before it becomes operationally useful.
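The two validation components can be expressed as a small test harness. The sketch below is an assumption-laden illustration: `detect_lsass_access` is a hypothetical detection function, the event field names are invented, and the events are synthetic stand-ins for what an Atomic Red Team run and a historical log export would supply.

```python
from datetime import datetime


def detect_lsass_access(event):
    """Hypothetical detection: any process access whose target is lsass.exe."""
    return event.get("target_image", "").lower().endswith("\\lsass.exe")


def functional_test(detect, attack_events):
    """Pass only if every synthetic attack event triggers the detection."""
    return bool(attack_events) and all(detect(e) for e in attack_events)


def estimated_daily_alerts(detect, historical_events):
    """Average alerts per day over the days present in the historical sample."""
    days = {e["ts"].date() for e in historical_events}
    hits = sum(1 for e in historical_events if detect(e))
    return hits / len(days) if days else 0.0


# Synthetic stand-in for telemetry produced by an adversary simulation run.
attack_events = [
    {"ts": datetime(2024, 1, 10, 14, 0),
     "target_image": r"C:\Windows\System32\lsass.exe"},
]

# Synthetic stand-in for two days of benign production telemetry.
historical_events = [
    {"ts": datetime(2024, 1, 8, 9, 0), "target_image": r"C:\Windows\System32\lsass.exe"},
    {"ts": datetime(2024, 1, 8, 9, 5), "target_image": r"C:\Windows\notepad.exe"},
    {"ts": datetime(2024, 1, 8, 17, 0), "target_image": r"C:\Windows\System32\lsass.exe"},
    {"ts": datetime(2024, 1, 9, 9, 0), "target_image": r"C:\Windows\System32\lsass.exe"},
    {"ts": datetime(2024, 1, 9, 9, 5), "target_image": r"C:\Windows\explorer.exe"},
]

print("functional test passed:", functional_test(detect_lsass_access, attack_events))
print("estimated alerts/day:", estimated_daily_alerts(detect_lsass_access, historical_events))
```

Note that the estimate only counts days that appear in the sample, so a sparse export would overstate the daily rate; a production version would divide by the calendar span instead.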

### Phase 4: Deployment and Documentation

Validated detections deploy to production through controlled change management. Each detection receives metadata tags indicating the ATT&CK technique it covers, required data sources, severity level, and expected alert frequency. Deployment includes comprehensive analyst documentation that explains the detection purpose, known limitations, common false positive scenarios, and recommended response procedures. This documentation populates a detection knowledge base that analysts reference during alert triage.
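One way to enforce these metadata and documentation requirements is a pre-deployment gate that rejects incomplete content. The following sketch assumes a detection is represented as a dictionary; the field names in `REQUIRED_FIELDS` and `REQUIRED_DOC_SECTIONS` are hypothetical labels for the items listed above, not a standard schema.

```python
# Hypothetical field names for the metadata and documentation listed above.
REQUIRED_FIELDS = ("attack_technique", "data_sources", "severity",
                   "expected_alerts_per_day")
REQUIRED_DOC_SECTIONS = ("purpose", "limitations", "false_positives", "response")


def _is_blank(value):
    """Treat None, empty strings, and empty lists as missing (0 is valid)."""
    return value is None or value == "" or value == []


def missing_metadata(detection):
    """Return the names of required fields or doc sections that are absent."""
    missing = [f for f in REQUIRED_FIELDS if _is_blank(detection.get(f))]
    docs = detection.get("documentation", {})
    missing += [f"documentation.{s}" for s in REQUIRED_DOC_SECTIONS
                if _is_blank(docs.get(s))]
    return missing


candidate = {
    "attack_technique": "T1003.001",
    "data_sources": ["process_access"],
    "severity": "high",
    "expected_alerts_per_day": 2,
    "documentation": {
        "purpose": "Detect credential dumping from LSASS memory.",
        "limitations": "Requires process access telemetry on the host.",
        "false_positives": "Approved endpoint agents.",
        # "response" is intentionally omitted to show the gate failing.
    },
}

problems = missing_metadata(candidate)
print("deployable" if not problems else f"blocked, missing: {problems}")
```

A check like this fits naturally into a change management pipeline as a required step before promotion to production.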

### Phase 5: Performance Monitoring and Lifecycle Management

Detection content requires continuous maintenance. Adversary techniques evolve, infrastructure changes, and business processes shift in ways that affect detection accuracy. The engineering program tracks alert fidelity metrics: true positive rate, false positive rate, and analyst escalation rate for each deployed detection. Detections that consistently produce low-quality alerts undergo tuning or retirement. A detection that has not fired in twelve months against a threat that no longer applies to the organization's current threat model becomes a candidate for archival rather than indefinite maintenance.
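The fidelity metrics named above reduce to simple ratios over triage outcomes. In this sketch each rate is computed as a share of triaged alerts for one detection, which is one common working definition rather than the only one; the `TriageCounts` structure and the 0.8 tuning threshold are hypothetical.

```python
from dataclasses import dataclass

FP_TUNE_THRESHOLD = 0.8  # hypothetical cut-off for flagging a detection for tuning


@dataclass
class TriageCounts:
    true_positive: int   # alerts confirmed as real adversary activity
    false_positive: int  # alerts closed as benign
    escalated: int       # alerts escalated by the triaging analyst


def fidelity(counts):
    """Return per-detection rates as shares of triaged alerts, or None if no alerts."""
    total = counts.true_positive + counts.false_positive
    if total == 0:
        return None
    return {
        "tp_rate": counts.true_positive / total,
        "fp_rate": counts.false_positive / total,
        "escalation_rate": counts.escalated / total,
    }


def recommend(counts):
    """Map metrics to a lifecycle action under the assumed threshold."""
    metrics = fidelity(counts)
    if metrics is None:
        return "review against threat model"  # silent detection: keep or archive?
    if metrics["fp_rate"] >= FP_TUNE_THRESHOLD:
        return "tune"
    return "keep"


print(fidelity(TriageCounts(true_positive=5, false_positive=15, escalated=4)))
print(recommend(TriageCounts(true_positive=1, false_positive=19, escalated=0)))
```

A detection with no alerts is not automatically retired here; it is routed to a threat model review, consistent with the archival criterion described above.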

### Implementation Example: Banking Sector Lateral Movement Detection

A community bank implements TID-R06 and identifies credential theft leading to lateral movement as its highest-priority threat scenario. The engineering team develops detection logic for abnormal authentication patterns: a successful login followed within thirty minutes by access to file shares containing customer data, cross-referenced against an allowlist of authorized administrative accounts. The detection is tested with the Atomic Red Team procedures for T1021.002 (SMB/Windows Admin Shares) and tuned against six months of historical authentication logs to eliminate alerts from legitimate overnight backup processes. It is then deployed with analyst guidance noting that each alert requires an immediate account suspension assessment. Three months after deployment, the detection identifies a compromised teller account being used to access loan documentation outside normal business processes, enabling containment before customer data exfiltration occurs.
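The core of the bank's detection logic can be sketched as a login-then-share-access correlation. Everything specific here is assumed for illustration: the event schema, the `svc_backup` account, the share path, and the event timestamps are hypothetical, and a real implementation would run in the organization's detection platform rather than in standalone Python.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)
ADMIN_ALLOWLIST = {"svc_backup"}                 # hypothetical authorized account
CUSTOMER_SHARE = r"\\fs01\customer_data"         # hypothetical share path
SENSITIVE_SHARES = {CUSTOMER_SHARE}


def correlate(events):
    """Alert when a non-allowlisted account touches a sensitive share
    within WINDOW of its most recent successful login."""
    last_login = {}
    alerts = []
    for event in sorted(events, key=lambda e: e["ts"]):
        account = event["account"]
        if account in ADMIN_ALLOWLIST:
            continue
        if event["type"] == "login_success":
            last_login[account] = event["ts"]
        elif event["type"] == "share_access" and event["share"] in SENSITIVE_SHARES:
            login_ts = last_login.get(account)
            if login_ts is not None and event["ts"] - login_ts <= WINDOW:
                alerts.append((account, event["share"], event["ts"]))
    return alerts


events = [
    {"ts": datetime(2024, 3, 4, 2, 0), "type": "login_success", "account": "svc_backup"},
    {"ts": datetime(2024, 3, 4, 2, 5), "type": "share_access", "account": "svc_backup",
     "share": CUSTOMER_SHARE},
    {"ts": datetime(2024, 3, 4, 9, 0), "type": "login_success", "account": "teller01"},
    {"ts": datetime(2024, 3, 4, 9, 10), "type": "share_access", "account": "teller01",
     "share": CUSTOMER_SHARE},
    {"ts": datetime(2024, 3, 4, 9, 0), "type": "login_success", "account": "teller02"},
    {"ts": datetime(2024, 3, 4, 10, 0), "type": "share_access", "account": "teller02",
     "share": CUSTOMER_SHARE},
]

for account, share, ts in correlate(events):
    print(f"ALERT {ts:%Y-%m-%d %H:%M} {account} accessed {share}")
```

In this sample only `teller01` alerts: the backup account is allowlisted and `teller02` accessed the share an hour after login, outside the window.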

### Advanced Detection Types and Implementation Considerations

Detection engineering encompasses multiple technical approaches beyond simple rule-based logic. Behavioral analytics detect statistical anomalies in user or system activity patterns, such as file access volumes that deviate significantly from historical baselines. These detections require mathematical models, training periods, and threshold calibration that differs substantially from signature-based approaches.
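A baseline-deviation check of this kind can be illustrated with a z-score over a user's historical daily file access counts. This is a deliberately simple sketch: the sample history, the threshold of 3 standard deviations, and the assumption that daily counts are roughly stable are all illustrative choices, and production behavioral analytics typically use more robust models.

```python
from statistics import mean, stdev


def zscore(history, observed):
    """Standard deviations between the observed value and the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs 2+ data points
    if sigma == 0:
        return 0.0 if observed == mu else float("inf")
    return (observed - mu) / sigma


def is_anomalous(history, observed, threshold=3.0):
    """Flag only upward deviations, since the concern is excess access volume."""
    return zscore(history, observed) > threshold


# Hypothetical training period: files accessed per day by one user.
history = [40, 42, 38, 41, 39, 40, 43]

print(is_anomalous(history, 42))   # within the normal range
print(is_anomalous(history, 400))  # far above the baseline
```

The training period and threshold are the calibration knobs mentioned above: a short history or a low threshold raises alert volume, while a high threshold risks missing slow, low-volume collection.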

Correlation-based detection combines multiple low-confidence signals into higher-confidence findings. A user authenticating from an unusual location might generate a low-priority alert, but the same user downloading large volumes of customer data immediately after that authentication represents a higher-confidence threat indicator. Correlation logic requires careful timing windows, state management, and complexity controls to prevent false positive escalation.
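The timing window and state management this paragraph describes can be sketched as a per-user sliding window of signals with an additive risk score. The signal names, scores, one-hour window, and threshold of 70 are hypothetical, and the sketch assumes signals arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(hours=1)   # hypothetical correlation window
THRESHOLD = 70                # hypothetical score needed to raise a finding
SIGNAL_SCORES = {             # hypothetical per-signal confidence scores
    "unusual_location_login": 30,
    "bulk_customer_data_download": 50,
}


class RiskCorrelator:
    """Keeps recent signals per user and scores distinct signal types."""

    def __init__(self):
        self._signals = defaultdict(deque)  # user -> deque of (ts, signal name)

    def observe(self, user, signal, ts):
        """Record a signal; return True if the user's windowed score hits THRESHOLD."""
        window = self._signals[user]
        window.append((ts, signal))
        # Expire state older than the window so memory and scores stay bounded.
        while window and ts - window[0][0] > WINDOW:
            window.popleft()
        # Count each signal type once so a single noisy signal cannot escalate alone.
        distinct = {name for _, name in window}
        return sum(SIGNAL_SCORES.get(name, 0) for name in distinct) >= THRESHOLD


correlator = RiskCorrelator()
t0 = datetime(2024, 5, 1, 8, 0)
print(correlator.observe("alice", "unusual_location_login", t0))
print(correlator.observe("alice", "bulk_customer_data_download", t0 + timedelta(minutes=10)))
```

Scoring distinct signal types rather than raw counts is one simple complexity control against false positive escalation; other designs decay scores over time instead.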

Machine learning-based detection applies algorithms to identify patterns that human analysts might miss. These approaches require training data, model validation, and drift detection to ensure accuracy over time. Organizations implementing ML-based detection must plan for model retraining, feature engineering, and explainability requirements that allow analysts to understand why an alert was generated.
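Drift detection can be illustrated with the Population Stability Index (PSI), a statistic commonly used to compare a feature's training distribution with its current distribution. The sketch below is a minimal pure-Python version under stated assumptions: equal-width bins derived from the training data, a small floor to avoid division by zero, synthetic feature values, and a retraining trigger of 0.2, which is a frequently quoted rule of thumb rather than a fixed standard.

```python
import math

PSI_RETRAIN_THRESHOLD = 0.2  # rule-of-thumb value, treated here as an assumption


def psi(expected, actual, bins=10):
    """Population Stability Index between a training sample and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor each proportion so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Synthetic feature values: the current sample is shifted toward higher values.
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

score = psi(baseline, shifted)
print(f"PSI={score:.2f}", "retrain" if score > PSI_RETRAIN_THRESHOLD else "stable")
```

A check like this would typically run on a schedule for each model input feature, with a sustained high score feeding the retraining plan.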

---

## Why It Matters

Organizations that treat detection as a configuration task rather than an engineering discipline accumulate technical debt that degrades security effectiveness. Rules written for historical threats continue consuming analyst attention long after those threats become irrelevant. High-volume, low-fidelity alerts cause analyst fatigue, creating conditions where legitimate threats receive inadequate investigation because the noise level overwhelms available review capacity.

The operational consequences of poor detection engineering appear clearly in major security incidents. The 2020 SolarWinds supply chain compromise affected organizations that possessed comprehensive security tooling and detection platforms. What many lacked was detection logic specifically calibrated for the techniques the threat actor employed: SAML token manipulation, legitimate cloud administration tool abuse, and command and control traffic that mimicked normal API communication patterns. Organizations with mature detection engineering programs that maintained ATT&CK-mapped coverage and actively incorporated nation-state tradecraft intelligence were better positioned to identify anomalous activity through internal monitoring rather than external notification.

A persistent misconception suggests that more detection rules produce better security outcomes. The relationship is frequently inverse. A smaller collection of high-confidence, well-maintained detections covering techniques likely to target the organization generates more actionable intelligence than an extensive, unmanaged rule library that produces alert noise. Detection engineering provides the analytical framework to make coverage versus noise trade-offs deliberately rather than accidentally.

From a regulatory compliance perspective, auditors increasingly expect organizations to demonstrate that detection capabilities map to documented threat models and undergo regular effectiveness testing. The NIST Cybersecurity Framework's Detect function calls for organizations to maintain detection processes and improve detection capabilities over time. The CIS Controls recommend deploying and tuning detection systems, informed by threat intelligence. Both are frameworks rather than regulations, but assessors commonly use them as benchmarks. An active detection engineering program provides the documentation, metrics, and evidence needed to meet these expectations while demonstrating that security investments produce measurable operational improvements.

Detection engineering also addresses the skills shortage that affects many security operations teams. Junior analysts can more effectively triage high-quality alerts with clear documentation and response guidance than they can manage large volumes of poorly contextualized alerts. Senior analysts can focus on complex investigations and program improvement rather than spending time tuning individual rules reactively. The engineering approach scales human expertise through better tooling and processes.

---

## CDA Perspective

CDA addresses TID-R06 through the Planetary Defense Model, specifically within the Threat Intelligence and Defense domain. The governing methodology is Predictive Defense Intelligence (PDI): see the threat before it sees you. This approach inverts the standard industry sequence of writing detections after incidents occur. Instead, threat intelligence drives detection engineering proactively, creating coverage for attack techniques before they appear in the organization's incident data.

The conventional approach waits for threat actors to succeed, analyzes the incident, and then develops detection logic. CDA's methodology begins with intelligence analysis that identifies techniques associated with threat groups actively targeting clients in the same sector and geographic region. That intelligence becomes a prioritized engineering backlog before any security incident occurs. The result is detection coverage positioned for the next attack rather than optimized for the previous one.

CDA engineers maintain a detection content library organized by ATT&CK technique and tagged to specific client threat profiles. When new threat intelligence identifies technique evolution within relevant adversary groups, the library undergoes review and updates within defined service level agreements. Detection content follows software development practices: version control, peer review, automated testing, and controlled deployment. This ensures that detection quality remains high as content volume scales across multiple client environments.

CDA conducts detection validation through structured adversary emulation exercises, run as purple team engagements, that confirm deployed detections perform correctly under realistic attack conditions. This differs from traditional red team exercises, where the objective is to evade detection. Purple team validation instead has attackers and defenders work together to confirm that detections work as designed and to identify gaps that require engineering attention. These exercises provide concrete evidence that detection investments translate into operational capabilities.

Clients receive quarterly detection coverage reports that map current detection logic to their specific threat models, expressed as validated coverage percentages for high-priority ATT&CK techniques. This metric tracks program maturity over time and provides security leadership with concrete indicators of defensive capability growth. The reports also identify emerging technique gaps based on evolving threat intelligence, ensuring that detection engineering efforts remain aligned with the current threat environment rather than historical incident patterns.

---

## Key Takeaways

• Map detection development to specific threat models using MITRE ATT&CK coverage analysis rather than generic vendor defaults; detection logic must address actual adversary techniques likely to target your organization, not theoretical attack scenarios that ignore your threat profile.

• Implement comprehensive testing for every detection before production deployment: functional validation using adversary simulation frameworks like Atomic Red Team, and false positive testing against historical data to estimate operational alert volume and quality.

• Track quantitative fidelity metrics (true positive rate, false positive rate, analyst escalation rate) for each deployed detection and use that data to drive tuning decisions; rely on measurement rather than subjective analyst feedback as the primary quality improvement mechanism.

• Treat detection content as software with defined lifecycle management: version control, documentation, peer review, scheduled maintenance, and retirement criteria; detection rules that have not undergone review in twelve months represent technical debt rather than security assets.

• Adopt vendor-neutral rule formats such as Sigma to keep detection content portable across platform changes and auditable by security teams; platform-specific detection logic creates vendor lock-in and complicates detection quality assurance.

---

TOP Mission TID-R01: Threat Intelligence Program Foundations
TOP Mission TID-R04: Adversary Emulation and Purple Team Operations
TOP Mission SOC-R02: Alert Triage and Escalation Workflow Design
TOP Mission TID-R08: Threat Hunting Program Establishment
PDM Domain Overview: Threat Intelligence and Defense (TID)

---

## Sources

National Institute of Standards and Technology. "Framework for Improving Critical Infrastructure Cybersecurity, Version 1.1." NIST. April 2018. https://www.nist.gov/cyberframework

MITRE Corporation. "MITRE ATT&CK Framework for Enterprise." https://attack.mitre.org/

Center for Internet Security. "CIS Controls Version 8." https://www.cisecurity.org/controls/v8

MITRE Engenuity. "ATT&CK Evaluations: Methodology for Detection Coverage Assessment." https://attackevals.mitre-engenuity.org/methodology/

Florian Roth, et al. "Sigma: Generic Signature Format for SIEM Systems." SigmaHQ Project. https://github.com/SigmaHQ/sigma