# TOP Mission TID-H03: Purple Team Operations

Purple team operations are a structured methodology for combining offensive security testing with real-time defensive validation, enabling security teams to measure actual detection capability against simulated adversary behavior rather than relying on assumed coverage. Unlike a traditional penetration test that delivers findings after the fact, purple teaming keeps red and blue team members working in parallel, creating a closed feedback loop where each attack technique is immediately evaluated against the organization's actual detection stack. The mission exists because most security programs accumulate tools and controls without ever verifying whether those controls fire when they are supposed to. TID-H03 closes that gap by converting threat intelligence about real adversary tactics into executed test cases that produce measurable, defensible results.

---

## Definition

Purple team operations are collaborative security exercises in which offensive practitioners (the red team) execute specific attack techniques drawn from a structured threat intelligence baseline, while defensive practitioners (the blue team) monitor, detect, and respond in real time. The two teams share situational awareness throughout the exercise, rather than operating in isolation as they do in a traditional red team engagement. In most organizations, "purple" does not denote a permanent team structure; it describes a working mode in which red and blue functions merge temporarily to accelerate learning.

Purple teaming is distinct from penetration testing. A penetration test evaluates whether an attacker can achieve an objective. A purple team exercise evaluates whether defenders can detect and respond to specific techniques, regardless of whether those techniques succeed. The objective is detection fidelity, not breach simulation.

Purple teaming is also distinct from a full red team operation. A red team engagement typically runs covertly, with the blue team unaware. Purple teaming is entirely transparent. Blue team members know that a test is occurring and which technique is being executed, allowing them to trace the precise detection or detection failure in real time.

Purple team operations exist in two primary forms. Atomic purple teaming involves executing individual MITRE ATT&CK techniques one at a time, verifying detection before moving to the next. Campaign-based purple teaming executes a chained sequence of techniques that mirror an actual threat actor's known tradecraft, testing detection at each stage of an attack chain.

Purple teaming does not replace threat intelligence, red team operations, or incident response rehearsals. It is a validation mechanism that depends on all three to be effective. Without current threat intelligence, the techniques tested may not reflect realistic adversary behavior. Without mature blue team processes, the exercise produces noise rather than signal.

---

## How It Works

The mechanics of a purple team operation follow a consistent cycle: select, execute, observe, document, remediate, and retest. Each iteration of this cycle covers one technique or a small cluster of related techniques.

### Selection and Scoping

The exercise begins with threat intelligence triage. The team identifies which adversary groups pose the highest realistic risk to the organization, based on industry targeting data, recent incident reports, and threat actor profiles maintained in the organization's intelligence program. From those profiles, the team extracts specific techniques mapped to MITRE ATT&CK. A financial services organization facing a TA505 campaign would select techniques including T1566 (phishing), T1059.001 (PowerShell execution), T1105 (ingress tool transfer), and T1078 (valid accounts), among others. The scope document lists each technique, the expected detection source (endpoint detection and response, SIEM rule, network sensor), and the acceptance criteria for a passing result.
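
A scope document of this kind can be kept as structured data so that later phases can reference it. The sketch below is one possible shape, not a prescribed format: the `ScopeEntry` fields and the `by_source` helper are hypothetical names, the technique IDs come from the TA505 example above, and the detection sources and acceptance criteria assigned to each technique are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeEntry:
    technique_id: str         # ATT&CK technique ID
    name: str                 # ATT&CK technique name
    expected_source: str      # "EDR", "SIEM rule", or "network sensor"
    acceptance_criteria: str  # what counts as a passing result


# Techniques from the TA505 example; sources and criteria are illustrative only.
SCOPE = [
    ScopeEntry("T1566", "Phishing", "SIEM rule",
               "Alert fires within the observation window"),
    ScopeEntry("T1059.001", "PowerShell", "EDR",
               "Alert includes the full command line"),
    ScopeEntry("T1105", "Ingress Tool Transfer", "network sensor",
               "Alert identifies source and destination hosts"),
    ScopeEntry("T1078", "Valid Accounts", "SIEM rule",
               "Alert identifies the account and logon source"),
]


def by_source(scope):
    """Group technique IDs by the detection source expected to catch them."""
    grouped = {}
    for entry in scope:
        grouped.setdefault(entry.expected_source, []).append(entry.technique_id)
    return grouped


print(by_source(SCOPE))
```

Grouping by expected source gives blue team observers a quick view of which console to watch for each technique during execution.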

The scoping process also defines the operational boundaries. Some organizations limit purple team activities to non-production environments to avoid service disruption. Others conduct exercises during maintenance windows or with pre-approved impact thresholds. The key is establishing rules of engagement that allow realistic technique execution while maintaining operational safety.

### Execution Environment

Purple team exercises run in production-equivalent environments whenever possible. Testing in an isolated lab produces data about whether the detection tool can see an event; testing in production validates whether the detection tool is properly configured, licensed for that endpoint, and integrated with the SIEM at scale. Many organizations conduct the initial execution in a staging environment for safety, then validate detections in production with a reduced-impact variant of the technique.

Environment selection matters because detection tools behave differently under load, with varying log volume, and across different network segments. An EDR agent that reliably detects process injection in a lab environment may fail to report the same technique when competing with high CPU usage from production applications. Similarly, SIEM rules tuned in a quiet test environment may produce excessive false positives when exposed to production log volumes, leading to suppression rules that inadvertently block legitimate alerts.

### Technique Execution

The red team operator executes the selected technique using real tooling, not sanitized simulations. For T1059.001 (PowerShell execution), this means running an encoded PowerShell command that matches the obfuscation pattern used by the targeted threat actor, not a generic test script. Atomic Red Team (maintained by Red Canary) and its PowerShell execution module Invoke-AtomicRedTeam are commonly used to standardize execution, and tracking platforms such as VECTR are commonly used to log test cases and outcomes.
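
To make the encoded-command case concrete: `powershell.exe -EncodedCommand` expects the command as Base64 over UTF-16LE bytes. The sketch below builds such a command line for a harmless payload. The payload string and helper names are illustrative assumptions, and it does not reproduce any specific threat actor's obfuscation pattern.

```python
import base64


def encode_powershell(command: str) -> str:
    """Return the Base64 string that powershell.exe -EncodedCommand expects.

    PowerShell decodes the argument as UTF-16LE, so the command must be
    encoded that way before Base64 is applied.
    """
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")


def decode_powershell(encoded: str) -> str:
    """Reverse of encode_powershell; useful when triaging an alert."""
    return base64.b64decode(encoded).decode("utf-16-le")


# Benign marker payload (hypothetical) so the test is easy to find in logs.
benign = "Write-Output 'purple-team-test T1059.001'"
encoded = encode_powershell(benign)
print(f"powershell.exe -NoProfile -EncodedCommand {encoded}")
```

The decode helper is the blue team's side of the same exercise: an alert that surfaces only the Base64 string is far less actionable than one that shows the decoded command.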

The execution phase requires precise timing coordination. Red team operators announce their execution window to blue team observers, execute the technique, and log the exact timestamp of the action. Some teams use automated execution platforms that timestamp events to the microsecond, eliminating ambiguity about when an action occurred relative to when an alert fired.
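
A minimal sketch of an execution log entry follows, assuming the team records times in UTC and that red and blue systems share a synchronized clock source (without that, execution and alert times cannot be compared meaningfully). The field names, operator ID, and host name are hypothetical.

```python
import json
from datetime import datetime, timezone


def execution_record(technique_id, operator, host, executed_at=None):
    """Build a log entry for one technique execution, timestamped in UTC."""
    if executed_at is None:
        executed_at = datetime.now(timezone.utc)
    return {
        "technique_id": technique_id,
        "operator": operator,
        "host": host,
        # ISO 8601 with microsecond precision and an explicit UTC offset
        "executed_at": executed_at.isoformat(timespec="microseconds"),
    }


# Fixed timestamp used here only to make the example reproducible.
record = execution_record(
    "T1059.001", "red-01", "ws-staging-07",
    datetime(2024, 1, 15, 14, 30, 0, 250000, tzinfo=timezone.utc),
)
print(json.dumps(record))
```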

### Real-Time Blue Team Observation

While the red team executes, the blue team monitors their primary detection surfaces: the SIEM alert queue, endpoint detection and response (EDR) console, network detection tools, and any specialized security platforms deployed in the environment. The blue team records the exact time a technique was executed and the exact time an alert fired, if it fired. If no alert fires within a defined observation window (typically five to ten minutes for high-confidence detections), the technique is recorded as a detection miss.

Blue team observers also evaluate alert quality. An alert that fires but provides insufficient context for analyst response is categorized as a partial detection. For example, a generic "suspicious process execution" alert without command-line parameters, parent process information, or user context may indicate that a log source is working but that log parsing or rule logic needs improvement.
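
The alert-quality check can be approximated by testing each alert for the context fields named above. This is a sketch under the assumption that alerts are available as dictionaries; the field names are hypothetical and will differ between SIEM and EDR products.

```python
# Context an analyst needs in order to act, per the example above.
REQUIRED_CONTEXT = ("command_line", "parent_process", "user")


def missing_context(alert: dict) -> list:
    """Return the required context fields that are absent or empty."""
    return [name for name in REQUIRED_CONTEXT if not alert.get(name)]


generic = {"title": "Suspicious process execution", "user": "jdoe"}
rich = {
    "title": "Encoded PowerShell execution",
    "command_line": "powershell.exe -NoProfile -EncodedCommand ...",
    "parent_process": "winword.exe",
    "user": "jdoe",
}

print(missing_context(generic))  # fields to fix in parsing or rule logic
print(missing_context(rich))     # empty list: alert carries full context
```

An alert with a non-empty result would be recorded as a partial detection, and the missing field names point toward the parsing or rule change needed.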

### Documentation and Gap Analysis

Every technique produces one of four outcomes: detected (alert fired correctly and provided actionable information), partially detected (alert fired but lacked sufficient detail), detected with delay (alert fired outside acceptable time thresholds), or not detected (no alert fired). The team documents the specific log source, the rule or model that fired, and the analyst action taken. Detection misses trigger immediate root cause analysis: Was the log source missing? Was the rule misconfigured? Was the alert suppressed by a filter? Was the detection capability absent entirely?
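
The four outcomes can be expressed as a small classification function. The sketch below makes several assumptions that the text does not specify: a five-minute acceptable threshold, a ten-minute observation window after which a late alert counts as a miss, and a precedence rule in which a delayed alert is recorded as delayed regardless of its content.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Outcome(Enum):
    DETECTED = "detected"
    PARTIAL = "partially detected"
    DELAYED = "detected with delay"
    NOT_DETECTED = "not detected"


def classify(executed_at: datetime,
             alert_at: Optional[datetime],
             actionable: bool,
             threshold: timedelta = timedelta(minutes=5),   # assumed value
             window: timedelta = timedelta(minutes=10)) -> Outcome:
    """Map one technique execution to one of the four documented outcomes."""
    # No alert, or an alert after the observation window closed, is a miss.
    if alert_at is None or alert_at - executed_at > window:
        return Outcome.NOT_DETECTED
    # Inside the window but slower than the acceptable threshold.
    if alert_at - executed_at > threshold:
        return Outcome.DELAYED
    # On time: full detection only if the alert was actionable.
    return Outcome.DETECTED if actionable else Outcome.PARTIAL


t0 = datetime(2024, 1, 15, 14, 30)
print(classify(t0, t0 + timedelta(minutes=2), actionable=False).value)
```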

The documentation phase captures not just whether detection occurred, but how detection occurred. Teams maintain a detection matrix that maps each ATT&CK technique to its detection requirements: required log sources, expected detection methods (signature, behavioral analysis, machine learning), alert routing, and escalation procedures. This matrix becomes the organization's detection blueprint, showing exactly which capabilities are verified and which remain assumptions.
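
One way to hold such a matrix is a mapping keyed by technique ID. The rows below are illustrative assumptions rather than recommended detections, and the `verified` flag is a hypothetical field that marks whether an exercise has actually confirmed the detection.

```python
# Illustrative rows only; real entries come from the exercise documentation.
DETECTION_MATRIX = {
    "T1059.001": {
        "log_sources": ["EDR process telemetry", "PowerShell script block logs"],
        "method": "behavioral analysis",
        "alert_route": "SOC tier 1 queue",
        "verified": True,
    },
    "T1055": {
        "log_sources": ["EDR process telemetry"],
        "method": "behavioral analysis",
        "alert_route": "SOC tier 2 queue",
        "verified": False,
    },
    "T1021.001": {
        "log_sources": ["Windows Security event log"],
        "method": "signature",
        "alert_route": "SOC tier 1 queue",
        "verified": False,
    },
}


def split_by_status(matrix):
    """Separate techniques with verified detection from assumed coverage."""
    verified = sorted(t for t, row in matrix.items() if row["verified"])
    assumed = sorted(t for t, row in matrix.items() if not row["verified"])
    return verified, assumed


verified, assumed = split_by_status(DETECTION_MATRIX)
print("verified:", verified)
print("assumed:", assumed)
```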

### Remediation and Retest

For each detection miss or partial detection, the team implements a remediation action before closing the cycle. Remediation may involve writing a new SIEM rule, adjusting an EDR policy, correcting a log forwarding gap, tuning a suppression filter that was blocking legitimate alerts, or deploying additional log sources. The red team then re-executes the same technique to confirm the fix works. This retest step is the feature that distinguishes purple teaming from a standard gap assessment. The exercise does not end with a finding; it ends with a verified detection.
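
The rule that a gap closes only on a verified retest can be enforced in whatever tracks the findings. The following sketch assumes a hypothetical `Gap` record; the remediation text is illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Gap:
    technique_id: str
    remediation: str
    # One entry per retest; True means the alert fired on re-execution.
    retest_results: list = field(default_factory=list)

    def record_retest(self, detected: bool) -> None:
        self.retest_results.append(detected)

    @property
    def closed(self) -> bool:
        """A gap is closed only when the most recent retest was detected."""
        return bool(self.retest_results) and self.retest_results[-1]


gap = Gap("T1105", "Forward proxy logs to the SIEM")
print(gap.closed)          # remediation alone does not close the gap
gap.record_retest(False)   # first retest: still no alert
gap.record_retest(True)    # second retest: alert fired
print(gap.closed)
```

Keeping the full retest history, rather than a single flag, also preserves how many iterations a fix took.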

The retest cycle often reveals secondary issues. A new SIEM rule may generate the desired alert but also produce false positives that require additional tuning. An EDR policy change may improve detection for one technique while degrading detection for another. Purple teaming captures these interdependencies in real time, rather than discovering them weeks later during routine operations.

### Campaign-Based Purple Teaming

While atomic testing validates individual techniques, campaign-based purple teaming tests detection across complete attack chains. The red team executes a sequence of techniques that mirror a specific threat actor's tradecraft, simulating the progression from initial access through exfiltration or impact. This approach reveals detection gaps that only appear when techniques are chained together.

A campaign-based exercise targeting APT29 techniques might begin with T1566.002 (spearphishing link), progress through T1059.001 (PowerShell execution), T1055 (process injection), T1083 (file and directory discovery), T1021.001 (RDP lateral movement), and conclude with T1041 (exfiltration over C2 channel). Blue team observers track not just whether individual techniques are detected, but whether the sequence of detections provides sufficient context for analysts to recognize the attack progression and respond appropriately.
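
Chain-level results can be summarized by asking where the first detection occurred and how long the adversary could operate unseen. The sketch below uses the technique sequence from the example above; the set of detected techniques is made-up sample data, and `chain_summary` is a hypothetical helper.

```python
# Technique order from the APT29 example above.
CHAIN = ["T1566.002", "T1059.001", "T1055", "T1083", "T1021.001", "T1041"]


def chain_summary(chain, detected):
    """Return (first detected stage, longest run of consecutive misses)."""
    first = next((t for t in chain if t in detected), None)
    longest, current = [], []
    for technique in chain:
        if technique in detected:
            current = []                      # a detection ends the blind run
        else:
            current = current + [technique]   # extend the current blind run
            if len(current) > len(longest):
                longest = current
    return first, longest


# Sample result set: only two stages produced alerts.
first, blind_run = chain_summary(CHAIN, {"T1059.001", "T1041"})
print("first detection:", first)
print("longest undetected run:", blind_run)
```

A long undetected run in the middle of the chain suggests that analysts would see the start and end of an intrusion without the context that links them.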

---

## Why It Matters

Detection programs routinely accumulate debt. Tools are deployed, rules are written, and configurations drift. Security operations center (SOC) teams assume that alerts are firing because they were firing last quarter. Purple team operations interrupt that assumption with evidence.

The business case for this mission rests on a simple premise: a security control that cannot be verified is a security control that cannot be trusted. Organizations that rely on assumed detection coverage are exposed to a class of failures that do not appear in vulnerability scans or compliance audits. They appear in breach investigations.

The 2022 Lapsus$ intrusions illustrate the point. Multiple large organizations with deployed EDR and SIEM tooling were compromised through techniques including SIM swapping, social engineering of help desks, and credential theft. These techniques are individually detectable, but detection that exists in theory rather than in verified practice fails in predictable ways: log sources are not forwarded, rules are written against log formats that changed after a platform upgrade, and analysts are trained to deprioritize certain alert categories. A consistent purple team program is designed to surface gaps of this kind before they are exploited.

Purple teaming also addresses the detection confidence gap that plagues many security programs. SOC analysts receive alerts daily but often lack confidence in alert accuracy and completeness. When analysts do not trust their detection tools, they develop workarounds: manual hunting processes, informal escalation procedures, and shadow security tools. These workarounds introduce operational risk and reduce response efficiency. Purple team exercises that consistently validate detection accuracy build analyst confidence and improve response velocity.

The methodology provides quantifiable metrics that security leaders can use to demonstrate program effectiveness to executive stakeholders. Instead of reporting "we have EDR deployed on 95% of endpoints," security leaders can report "we have verified detection of lateral movement techniques used by threat actors targeting our industry." This shift from deployment metrics to effectiveness metrics aligns security reporting with business risk.

A common misconception is that purple teaming is only relevant for organizations with mature, large security teams. The opposite is true. Smaller organizations with fewer detection resources benefit most from the prioritization discipline that purple teaming forces. When resources are limited, verifying that the highest-priority techniques are detected before investing in additional coverage is exactly the right approach.

Another misconception is that passing a purple team exercise means the organization is secure. It means the organization can detect the techniques tested. Detection capability is necessary but not sufficient. Purple team results must feed into broader threat-informed defense programs to produce durable security improvement.

---

## CDA Perspective

CDA approaches TID-H03 through the Planetary Defense Model (PDM), which organizes security capability into intelligence-driven domains designed to see threats early and respond before damage occurs. Within the TID (Threat Intelligence Domain), purple team operations serve as the empirical validation layer that converts intelligence from theoretical to operational.

CDA's methodology is Predictive Defense Intelligence (PDI): "See the threat before it sees you." Applied to purple teaming, this means exercises are not designed around generic coverage checklists. They are designed around current adversary behavior. Before CDA conducts a purple team exercise with a client, the TID team produces a threat actor profile specific to the client's industry, geography, and technology stack. That profile drives every technique selection decision.

What CDA does differently is integrate the intelligence production cycle with the exercise execution cycle as a single workflow rather than two separate deliverables. Many purple team programs receive a threat brief, then design an exercise weeks later, by which time the intelligence has aged. CDA's approach compresses that gap so that the techniques executed in week one of an engagement reflect intelligence that was produced or validated in the same week.

CDA also applies a detection maturity scoring model within the TID domain that rates detection outcomes not as binary pass/fail but across five dimensions: detection speed, detection confidence (true positive rate), analyst response fidelity, remediation velocity, and retest confirmation. This scoring model produces a Detection Effectiveness Score (DES) that clients can track across quarterly exercises, creating a longitudinal view of detection program maturity rather than a point-in-time snapshot.
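
The article does not describe how the five dimensions are combined into a DES, so the following is purely an illustrative sketch. It assumes each dimension is rated from 0.0 to 1.0, that all dimensions are weighted equally by default, and that the result is scaled to 0-100. The function name, weights, and sample ratings are hypothetical and do not represent CDA's actual scoring model.

```python
# The five dimensions named above.
DIMENSIONS = (
    "detection_speed",
    "detection_confidence",
    "analyst_response_fidelity",
    "remediation_velocity",
    "retest_confirmation",
)


def detection_effectiveness_score(ratings, weights=None):
    """Weighted average of the five dimension ratings, scaled to 0-100.

    Equal weighting is an assumption made for this sketch only.
    """
    if weights is None:
        weights = {name: 1.0 for name in DIMENSIONS}
    missing = [name for name in DIMENSIONS if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    total_weight = sum(weights[name] for name in DIMENSIONS)
    weighted = sum(ratings[name] * weights[name] for name in DIMENSIONS)
    return round(100 * weighted / total_weight, 1)


# Made-up ratings for a single quarterly exercise.
q1 = {
    "detection_speed": 0.8,
    "detection_confidence": 0.9,
    "analyst_response_fidelity": 0.6,
    "remediation_velocity": 0.5,
    "retest_confirmation": 1.0,
}
print(detection_effectiveness_score(q1))
```

Computing the same score each quarter from the same inputs is what makes the longitudinal comparison meaningful, whatever weighting is chosen.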

For organizations in regulated industries, CDA maps purple team results to control frameworks including NIST CSF, CIS Controls v8, and relevant sector-specific requirements, providing documentation that satisfies audit inquiries about whether security controls are operating effectively. The PDM's governance layer ensures that purple team findings feed directly into risk registers and executive reporting, maintaining visibility at the board level.

---

## Key Takeaways

Select techniques from current threat intelligence, not generic frameworks. Map exercises to adversary groups that realistically target your industry and technology stack before selecting ATT&CK techniques.

Test detection in production environments, not only in isolated labs. Configuration drift, suppression rules, and log forwarding gaps appear in production and are invisible in controlled test environments.

Require retest confirmation before closing any detection gap. A remediation that has not been retested is an assumption, not a verified fix. Purple team value is produced in the retest cycle.

Score detection outcomes across multiple dimensions, not as pass/fail. Detection speed, analyst response fidelity, and remediation velocity are each independently important and require separate measurement.

Run purple team exercises quarterly, not annually. Adversary techniques change, platform configurations drift, and detection rules degrade. Annual exercises produce a compliance artifact. Quarterly exercises produce operational assurance.

---

## Related CDA Missions

TOP Mission TID-H01: Threat Actor Profiling and Intelligence Requirements
TOP Mission TID-H04: Detection Engineering and SIEM Rule Validation
TOP Mission TID-H07: Adversary Emulation Planning with MITRE ATT&CK
Predictive Defense Intelligence (PDI): See the Threat First
CDA Planetary Defense Model: Threat Intelligence Domain (TID) Overview

---

## Sources

MITRE Corporation. "ATT&CK for Enterprise." MITRE ATT&CK Framework, version 14. https://attack.mitre.org/

National Institute of Standards and Technology. "Security and Privacy Controls for Information Systems and Organizations." NIST Special Publication 800-53 Rev. 5, September 2020.

Center for Internet Security. "CIS Controls Version 8." CIS, 2021. https://www.cisecurity.org/controls/v8

Scythe Inc. "Purple Team Exercise Framework." SANS Institute, 2019.

Red Canary. "Atomic Red Team: Detection Testing Made Simple." GitHub Repository. https://github.com/redcanaryco/atomic-red-team
