Security Information and Event Management (SIEM) | CDA.Wiki

# Security Information and Event Management (SIEM)

Definition

Security Information and Event Management (SIEM) is the operational backbone of enterprise threat detection: a platform that collects, normalizes, correlates, and analyzes log and event data from across an organization's entire technology stack to surface threats, enable forensic investigation, and produce the audit trails compliance frameworks demand. SIEM exists because no single system sees enough of the environment to detect coordinated attacks on its own. A firewall logs blocked connections. An endpoint detects a suspicious process. An identity provider records a failed authentication. Individually, those events are noise. Correlated across time and systems, they become a recognizable attack pattern. SIEM is the platform that performs that correlation at scale, giving security operations teams a unified view of what is happening across thousands of data sources simultaneously.

SIEM is defined by two converged functions: Security Information Management (SIM) and Security Event Management (SEM). SIM addresses the long-term collection, storage, and compliance reporting of log data. SEM addresses real-time monitoring, alerting, and event correlation. Modern SIEM platforms perform both functions simultaneously, with the addition of User and Entity Behavior Analytics (UEBA), threat intelligence feeds, and, in many implementations, Security Orchestration, Automation, and Response (SOAR) integration.

SIEM is not a firewall, an intrusion detection system, or an endpoint detection and response tool. Those are control layers that generate data. SIEM is the aggregation and analysis layer that consumes data from those controls. It does not prevent attacks by itself. It detects, correlates, alerts, and retains evidence. This distinction is critical because organizations frequently misconfigure SIEMs as passive log archives rather than active detection engines, a pattern that produces alert fatigue and detection gaps simultaneously.

How It Works

Data Collection Architecture

SIEM ingestion begins at the data source through multiple collection mechanisms. Agents installed on endpoints and servers forward logs directly to SIEM collectors via encrypted channels. Agentless collection pulls logs via syslog, SNMP traps, API integrations, or file-based ingestion from network devices, cloud platforms, identity providers, and business applications. Most enterprise SIEMs ingest from hundreds of distinct source types: Windows Event Logs, Linux auditd output, firewall connection logs, DNS query logs, VPN authentication records, email security gateway alerts, and cloud service provider audit trails such as AWS CloudTrail or Azure Activity Log.

The volume is significant and growing. A mid-sized enterprise with 5,000 endpoints and standard network infrastructure generates between 50,000 and 500,000 events per second depending on verbosity settings. Cloud-native organizations running microservices architectures can exceed one million events per second during peak hours. SIEM platforms handle this through distributed collectors, message queues, and tiered storage architectures that separate hot search indexes from cold archival storage.
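To make the storage implications of those rates concrete, the following is a rough sizing sketch. The 500-byte average event size is an assumption chosen for illustration; real event sizes vary widely by source and verbosity, and the result ignores compression and indexing overhead.

```python
# Rough ingest sizing sketch. AVG_EVENT_BYTES is an assumed average,
# not a measured or vendor-published value.
AVG_EVENT_BYTES = 500
SECONDS_PER_DAY = 86_400


def daily_volume_tb(events_per_second: int, avg_event_bytes: int = AVG_EVENT_BYTES) -> float:
    """Return raw (uncompressed) daily ingest in decimal terabytes."""
    return events_per_second * avg_event_bytes * SECONDS_PER_DAY / 1e12


# The event rates below are the ones quoted in the paragraph above.
for eps in (50_000, 500_000, 1_000_000):
    print(f"{eps:>9,} EPS -> {daily_volume_tb(eps):.2f} TB/day")
```

Under that assumption, the quoted rates work out to roughly 2.2, 21.6, and 43.2 TB of raw data per day, which is why hot and cold storage tiers are usually sized separately.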

Modern SIEM deployments use hierarchical collection topologies. Regional collectors aggregate data from local sources, then forward processed events to centralized indexing clusters. This approach reduces bandwidth consumption, provides collection redundancy, and enables geographic data residency compliance. Organizations operating across multiple regulatory jurisdictions often deploy separate SIEM instances per region with federation layers for cross-domain correlation.

Normalization and Parsing

Raw log data arrives in dozens of incompatible formats: JSON, CEF, LEEF, W3C, plain text, and proprietary vendor schemas. The normalization engine parses each format and maps fields to a common schema. Splunk uses the Common Information Model (CIM). Microsoft Sentinel maps to the Advanced Security Information Model (ASIM). IBM QRadar normalizes events through Device Support Modules (DSMs). Regardless of the schema, the goal is identical: a source IP from a Palo Alto firewall log and a source IP from an Okta authentication log should populate the same field so that correlation rules can join them.
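The field-mapping idea can be sketched in a few lines. The vendor field names and the common schema below are hypothetical and do not reproduce the real Palo Alto, Okta, CIM, or ASIM field names; the point is only that two differently shaped events end up with the same `src_ip` key.

```python
import json

# Hypothetical per-source field maps: vendor field name -> common schema field.
FIELD_MAPS = {
    "firewall": {"src": "src_ip", "dst": "dest_ip", "act": "action"},
    "idp": {"client_ip": "src_ip", "actor": "user", "outcome": "action"},
}


def normalize(source_type: str, raw: str) -> dict:
    """Parse a raw JSON event and rename its fields to the common schema."""
    event = json.loads(raw)
    mapping = FIELD_MAPS[source_type]
    normalized = {
        common: event[vendor]
        for vendor, common in mapping.items()
        if vendor in event
    }
    normalized["source_type"] = source_type
    return normalized


fw = normalize("firewall", '{"src": "203.0.113.7", "dst": "10.0.0.5", "act": "deny"}')
idp = normalize("idp", '{"client_ip": "203.0.113.7", "actor": "jdoe", "outcome": "FAILURE"}')

# Both events now expose the address under the same key, so a rule can join them.
print(fw["src_ip"] == idp["src_ip"])
```

Production normalizers also handle non-JSON formats, type coercion, and timestamp standardization, all of which are omitted here.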

Normalization failures are one of the most common operational problems in SIEM deployments. If a critical data source is parsed incorrectly, detection rules that depend on those fields silently fail. Teams that do not validate their parsers regularly operate with blind spots they are unaware of. Best practice involves maintaining test datasets for every critical log format and running synthetic events through parsers weekly to verify field extraction accuracy.
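A minimal sketch of that kind of parser check follows. The regular expression, the `parse_ssh` function, and the test case are illustrative assumptions, not any vendor's parser; a real harness would cover every critical log format and run on a schedule.

```python
import re

# Hypothetical parser for an SSH authentication-failure line.
SSH_FAIL = re.compile(
    r"Failed password for (?P<user>\S+) from (?P<src_ip>\S+) port (?P<src_port>\d+)"
)


def parse_ssh(line: str) -> dict:
    """Return extracted fields, or an empty dict if the line does not match."""
    match = SSH_FAIL.search(line)
    return match.groupdict() if match else {}


# Each case pairs a synthetic raw event with the fields the parser must extract.
TEST_CASES = [
    (
        "sshd[812]: Failed password for root from 198.51.100.23 port 52144 ssh2",
        {"user": "root", "src_ip": "198.51.100.23", "src_port": "52144"},
    ),
]


def validate(parser, cases) -> list:
    """Return (raw_event, wrong_or_missing_fields) for every failing case."""
    failures = []
    for raw, expected in cases:
        parsed = parser(raw)
        bad = {k: v for k, v in expected.items() if parsed.get(k) != v}
        if bad:
            failures.append((raw, bad))
    return failures


print(validate(parse_ssh, TEST_CASES))
```

An empty result means every expected field was extracted. If the source format drifts (for example, a line reading "Failed password for invalid user admin from ..."), this parser returns nothing for that line, and the harness reports the missing fields instead of letting dependent rules fail silently.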

Field enrichment occurs during normalization. IP addresses are mapped to geographic locations, MAC addresses to asset inventory records, and user IDs to Active Directory attributes. DNS names are resolved to IP ranges and classified by category. This enrichment is computationally expensive but essential for effective correlation: a security analyst investigating an alert needs to know immediately whether the source host is a domain controller, kiosk system, or external attacker infrastructure.
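As an illustration, enrichment can be thought of as a lookup step applied to each normalized event. The inventory contents, network range, and field names below are hypothetical; a real deployment would query an asset database and a geolocation service rather than in-memory dictionaries.

```python
import ipaddress

# Hypothetical internal address space and asset inventory.
INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8")]
ASSET_INVENTORY = {
    "10.0.0.10": {"hostname": "dc01", "role": "domain_controller"},
    "10.0.5.44": {"hostname": "lobby-kiosk", "role": "kiosk"},
}


def enrich(event: dict) -> dict:
    """Return a copy of the event with network scope and asset role attached."""
    enriched = dict(event)
    ip = ipaddress.ip_address(event["src_ip"])
    asset = ASSET_INVENTORY.get(event["src_ip"])
    is_internal = any(ip in net for net in INTERNAL_NETS)
    enriched["src_scope"] = "internal" if is_internal else "external"
    enriched["src_role"] = asset["role"] if asset else "unknown"
    return enriched


print(enrich({"src_ip": "10.0.0.10", "action": "login"}))
print(enrich({"src_ip": "203.0.113.7", "action": "login"}))
```

Attaching the role at ingest time means the analyst sees "domain_controller" or "external" on the alert itself instead of looking it up during triage.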

Detection and Correlation Logic

With normalized data flowing into searchable indexes, detection logic runs continuously against the event stream. SIEM detection methods fall into four categories, each with distinct operational characteristics and failure modes.

Rule-based detection uses explicit logic: if a user account fails authentication five times within two minutes and then succeeds, generate a high-severity alert. These rules are deterministic and fast, but they require adversaries to behave in predictable ways. Attackers who spread authentication attempts across longer time windows or multiple user accounts can evade threshold-based rules. Rule-based detection excels at catching known attack patterns and compliance violations but struggles with novel techniques.
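The example rule above (five failures within two minutes, then a success) can be sketched as a sliding window per account. The event tuple layout and outcome labels are assumptions for illustration; real SIEM rule languages express the same logic declaratively.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=2)
THRESHOLD = 5


def detect_bruteforce_success(events):
    """Yield (user, time) when a success follows >= THRESHOLD failures within WINDOW.

    events: iterable of (timestamp, user, outcome) tuples sorted by timestamp,
    where outcome is "failure" or "success".
    """
    failures = defaultdict(deque)
    for ts, user, outcome in events:
        recent = failures[user]
        # Drop failures that have aged out of the sliding window.
        while recent and ts - recent[0] > WINDOW:
            recent.popleft()
        if outcome == "failure":
            recent.append(ts)
        elif outcome == "success":
            if len(recent) >= THRESHOLD:
                yield user, ts
            recent.clear()


start = datetime(2024, 1, 1, 9, 0, 0)
burst = [(start + timedelta(seconds=10 * i), "jdoe", "failure") for i in range(5)]
burst.append((start + timedelta(seconds=60), "jdoe", "success"))
print(list(detect_bruteforce_success(burst)))
```

The same function also shows the evasion described above: if the five failures are spaced a minute apart, no more than three ever sit inside the two-minute window, and the rule never fires.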

Statistical and threshold-based detection compares current behavior against historical baselines. If a database server normally sends 2 GB of outbound data per day and today it has sent 40 GB by noon, that deviation triggers an alert even without a specific rule matching the behavior. Statistical detection catches data exfiltration, credential stuffing, and other volume-based attacks effectively, but it generates false positives during legitimate business changes like software updates or quarterly reporting periods.
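One simple way to express a baseline comparison is a z-score against recent history. This is only one possible technique, chosen here for brevity; the threshold of 3.0 and the sample history are assumptions, and production systems typically account for seasonality and trend as well.

```python
from statistics import mean, stdev


def is_volume_anomaly(history_gb, today_gb, z_threshold=3.0):
    """Flag today's volume if it exceeds the baseline mean by z_threshold standard deviations."""
    mu = mean(history_gb)
    sigma = stdev(history_gb)
    if sigma == 0:
        # A perfectly flat baseline: any increase counts as a deviation.
        return today_gb > mu
    return (today_gb - mu) / sigma > z_threshold


# Hypothetical daily outbound volumes (GB) for a server that averages about 2 GB.
history = [1.8, 2.1, 2.0, 2.3, 1.9, 2.2, 2.0]

print(is_volume_anomaly(history, 40.0))
print(is_volume_anomaly(history, 2.4))
```

With this history, 40 GB is flagged and 2.4 GB is not. A legitimate one-off spike, such as a large software update, would be flagged in the same way, which is the false-positive behavior noted above.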

UEBA and machine learning models build behavioral profiles for individual users, service accounts, hosts, and network segments. A contractor account that has never accessed the HR database attempting to query employee salary records at 2:00 AM is anomalous relative to its own history, even if the access would be permitted by access control policy. UEBA surfaces that anomaly without requiring a rule that anticipated exactly that behavior. However, UEBA models require months of training data and perform poorly in dynamic environments where normal behavior changes frequently.
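The core idea of comparing an entity against its own history can be shown with a deliberately simplified first-seen heuristic. This is not how commercial UEBA engines work internally (they use statistical and machine learning models); the class, account name, and resource names below are hypothetical.

```python
from collections import defaultdict


class AccessProfile:
    """Tracks which resources and hours of day each account has been seen using."""

    def __init__(self):
        self.resources = defaultdict(set)
        self.hours = defaultdict(set)

    def learn(self, account: str, resource: str, hour: int) -> None:
        """Record one observed access as part of the account's baseline."""
        self.resources[account].add(resource)
        self.hours[account].add(hour)

    def anomalies(self, account: str, resource: str, hour: int) -> list:
        """Return the reasons this access deviates from the account's own history."""
        reasons = []
        if resource not in self.resources[account]:
            reasons.append("new_resource")
        if hour not in self.hours[account]:
            reasons.append("unusual_hour")
        return reasons


profile = AccessProfile()
for hour in (9, 10, 14):
    profile.learn("contractor7", "build-server", hour)

print(profile.anomalies("contractor7", "hr-database", 2))
```

The check never consults access control policy: the access is flagged because it is new for this account, not because it is forbidden. The sketch also shows why sparse or shifting history is a problem, since every access looks anomalous until a baseline exists.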

Threat intelligence correlation matches indicators of compromise (IoCs) from commercial feeds, government sources, and internal research against log data. DNS queries to known malware command-and-control domains trigger immediate alerts. File hashes matching ransomware samples generate critical-severity incidents. Threat intelligence correlation provides high-fidelity detection of known threats but offers no protection against zero-day attacks or living-off-the-land techniques that use legitimate tools.
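At its simplest, indicator matching is a set lookup against normalized fields. The indicator values and event field names below are placeholders invented for this sketch; real feeds are typically loaded from a threat intelligence platform and refreshed continuously.

```python
# Hypothetical indicator sets (placeholder values, not real indicators).
BAD_DOMAINS = {"c2.example.net"}
BAD_SHA256 = {"0" * 64}


def match_iocs(event: dict) -> list:
    """Return the indicator types that this event matches."""
    hits = []
    domain = event.get("query", "").lower().rstrip(".")
    labels = domain.split(".")
    # Match the queried name itself or any parent domain on the list.
    if any(".".join(labels[i:]) in BAD_DOMAINS for i in range(len(labels))):
        hits.append("domain")
    if event.get("sha256", "").lower() in BAD_SHA256:
        hits.append("file_hash")
    return hits


print(match_iocs({"query": "beacon.C2.example.net."}))
print(match_iocs({"query": "example.org"}))
```

Exact matching is what makes this method high-fidelity, and it is also its limit: an indicator that is not on the list, or an attack that uses only legitimate tools, produces no hit.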

Operational Scenario: Advanced Persistent Threat Detection

Consider a realistic advanced persistent threat scenario that demonstrates SIEM correlation capabilities. An attacker begins with spearphishing, sending a malicious Office document to the CFO's executive assistant. The document contains a macro that, when enabled, executes a PowerShell command to download a second-stage payload from a compromised legitimate website.

The sequence of events creates multiple detection opportunities across different data sources. The email security gateway logs show that the attachment passed spam filtering and was delivered. Endpoint telemetry shows PowerShell executing with encoded command arguments, a technique documented as MITRE ATT&CK T1059.001. DNS logs capture the query to the compromised website hosting the payload. HTTP proxy logs record the file download. Windows Security Event ID 4688 records process creation for the downloaded executable. Registry monitoring detects persistence mechanisms being installed.

A properly configured SIEM correlates these events through temporal and entity-based relationships. The correlation rule looks for PowerShell execution with encoded commands followed by external DNS resolution and HTTP file downloads from the same host within a 15-minute window. When registry modifications occur on the same host within 30 minutes of the initial PowerShell execution, the SIEM generates a high-severity alert that maps to multiple ATT&CK techniques and provides analysts with a complete attack timeline.
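The correlation described above can be sketched as a per-host sequence match with two time windows. The event type labels and tuple layout are hypothetical names for already-normalized events, and the strict ordering (PowerShell, then DNS, then download, then registry change) is a simplifying assumption of this sketch.

```python
from collections import defaultdict
from datetime import datetime, timedelta

STAGE_WINDOW = timedelta(minutes=15)    # DNS + download after PowerShell
PERSIST_WINDOW = timedelta(minutes=30)  # registry change after PowerShell


def _first(timeline, kind, earliest, latest):
    """Earliest timestamp of an event of this kind within [earliest, latest], or None."""
    return next((t for t, e in timeline if e == kind and earliest <= t <= latest), None)


def correlate(events):
    """Return high-severity alerts for hosts showing the full event sequence.

    events: iterable of (timestamp, host, event_type) tuples.
    """
    by_host = defaultdict(list)
    for ts, host, etype in sorted(events):
        by_host[host].append((ts, etype))

    alerts = []
    for host, timeline in by_host.items():
        for start, etype in timeline:
            if etype != "powershell_encoded":
                continue
            dns = _first(timeline, "dns_external", start, start + STAGE_WINDOW)
            if dns is None:
                continue
            http = _first(timeline, "http_download", dns, start + STAGE_WINDOW)
            if http is None:
                continue
            reg = _first(timeline, "registry_persistence", http, start + PERSIST_WINDOW)
            if reg is None:
                continue
            alerts.append({"host": host, "severity": "high",
                           "timeline": [start, dns, http, reg]})
            break  # one alert per host is enough for this sketch
    return alerts


t = datetime(2024, 1, 1, 9, 0)
sample = [
    (t, "ws-17", "powershell_encoded"),
    (t + timedelta(minutes=1), "ws-17", "dns_external"),
    (t + timedelta(minutes=2), "ws-17", "http_download"),
    (t + timedelta(minutes=20), "ws-17", "registry_persistence"),
]
print([alert["host"] for alert in correlate(sample)])
```

Grouping by host before matching is what implements the entity-based relationship; the two window constants implement the temporal one. The returned timeline gives the analyst the ordered timestamps of each stage.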

Without SIEM correlation, each event appears benign in isolation. PowerShell executes frequently in enterprise environments. DNS queries are constant. HTTP downloads are normal business activity. Registry modifications happen continuously. The SIEM's value lies in connecting these events through common entities (the compromised host) and temporal proximity to reveal the attack pattern that no single security control could detect independently.

Why It Matters

Detection Time and Business Impact

Without SIEM, security operations teams investigate incidents by manually querying individual systems in sequence. An analyst investigating a potential compromise checks firewall logs on the firewall console, then pivots to the endpoint management platform, then checks the identity provider separately. That process takes hours and requires the analyst to mentally correlate events across systems with no common time reference or shared schema. Attackers who move quickly can complete an intrusion before the investigation even begins.

The 2020 SolarWinds supply chain attack demonstrates this operational reality. Organizations with SIEM deployments ingesting DNS telemetry and network traffic data had the technical capability to detect the anomalous SUNBURST beacon traffic. The domain generation algorithm used by the malware produced DNS queries that deviated from normal application behavior. Organizations monitoring DNS at the SIEM level could have detected these patterns months before the public disclosure. Organizations that treated DNS as purely operational infrastructure, or that had SIEM deployments not configured to ingest DNS logs, remained blind to the command-and-control traffic leaving their networks for extended periods.

The business impact of missing an intrusion that SIEM would have detected is documented and severe. IBM's 2023 Cost of a Data Breach Report shows that organizations with fully deployed security platforms, including SIEM, experience average breach costs of $3.05 million compared to $5.09 million for organizations with limited security platform deployment. The difference of approximately $2 million per incident is consistent with the faster detection and containment that centralized monitoring and correlation enable, although the comparison alone does not isolate SIEM as the cause.

Compliance and Regulatory Requirements

Compliance is an equally concrete driver for SIEM adoption. PCI DSS Requirement 10 mandates logging and monitoring across all system components in the cardholder data environment, with specific requirements to record invalid access attempts, privilege changes, and access to audit logs, and to review those logs regularly. The HIPAA Security Rule, at 45 CFR 164.312(b), requires audit controls that record and examine activity in systems containing electronic protected health information. SOC 2 Type II audits assess whether security monitoring controls are operating effectively over time, requiring evidence of consistent log review and incident response.

SIEM provides both the technical logging capability and the reporting artifacts that auditors require to verify compliance. However, regulatory compliance through SIEM requires specific operational practices beyond basic log collection. Audit logs must be protected from modification, retained for mandated periods, and reviewed regularly by qualified personnel. The SIEM platform itself becomes a critical system requiring access controls, backup procedures, and change management processes.

Common Implementation Failures

A persistent misconception is that deploying SIEM is equivalent to having security monitoring. A SIEM that is deployed but not tuned, not monitored, and not integrated into response workflows is an expensive log archive with a management console. The platform enables detection; the operational program built around it determines whether threats are actually detected and contained.

The most common failure mode is alert fatigue caused by poorly tuned detection rules. Organizations that deploy SIEM with vendor-default rule sets typically generate thousands of low-quality alerts daily, overwhelming analyst capacity and creating learned helplessness where high-severity alerts accumulate unworked in queues. Effective SIEM programs begin with conservative detection rules tuned to the specific environment and gradually increase sensitivity as analyst expertise develops and response procedures mature.

CDA Perspective

The Cyber Defense Alliance approaches SIEM through the Planetary Defense Model under the Threat Intelligence and Detection (TID) domain, specifically applying the Predictive Defense Intelligence (PDI) methodology: see the threat before it sees you. Within that framework, SIEM is not treated as a passive log collector or reactive alerting system. It is positioned as an active intelligence production platform that provides the sensor network for threat hunting and predictive analysis.

CDA's operational approach begins with defining detection coverage objectives before deployment rather than after. Using the MITRE ATT&CK framework as the coverage map, CDA analysts identify which techniques are most likely to be used against a client's specific industry vertical and threat actor profile. Detection rules and data source requirements are then mapped backward from those techniques. If ransomware groups targeting the client's sector consistently use living-off-the-land techniques, the SIEM deployment prioritizes Windows Event Log enrichment, PowerShell script block logging, and WMI activity monitoring from day one.
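Mapping backward from techniques to rules lends itself to a simple coverage calculation. The rule names, data source labels, and the choice of priority techniques below are hypothetical examples; only the ATT&CK technique IDs and names are taken from the public framework.

```python
# Hypothetical priority list for a client, keyed by ATT&CK technique ID.
PRIORITY_TECHNIQUES = {
    "T1059.001": "PowerShell",
    "T1047": "Windows Management Instrumentation",
    "T1566.001": "Spearphishing Attachment",
}

# Hypothetical rule inventory with the techniques each rule is meant to detect.
RULES = [
    {"name": "encoded_powershell", "techniques": ["T1059.001"],
     "data_sources": ["script_block_logs"]},
    {"name": "wmi_remote_exec", "techniques": ["T1047"],
     "data_sources": ["wmi_activity_logs"]},
]


def coverage(priority, rules):
    """Return (covered, uncovered) sets of priority technique IDs."""
    detected = {tech for rule in rules for tech in rule["techniques"]}
    covered = set(priority) & detected
    return covered, set(priority) - covered


covered, uncovered = coverage(PRIORITY_TECHNIQUES, RULES)
print(f"covered {len(covered)} of {len(PRIORITY_TECHNIQUES)}")
for tech in sorted(uncovered):
    print("gap:", tech, PRIORITY_TECHNIQUES[tech])
```

Each gap in the output points to a missing rule, a missing data source, or both, which turns the coverage objective into a concrete backlog before deployment.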

CDA distinguishes between detection engineering and alert triage as separate disciplines requiring different skill sets and career paths. Detection engineers write, test, and maintain the rule library using software development practices: version control, testing frameworks, and performance optimization. SOC analysts focus on investigation and response workflow execution. This separation prevents the common failure mode where overburdened analysts tune out noisy rules without fixing them, quietly degrading detection coverage over time.

Under the Security Program Health (SPH) domain, CDA conducts SIEM health assessments that measure data source coverage completeness, parser accuracy rates, rule detection rates against simulated attack scenarios, and mean time to alert across detection categories. These metrics are reported in the same format as CDA's broader Predictive Defense Intelligence reporting cadence, giving leadership a quantitative view of whether the SIEM is performing as a detection system or merely as a compliance logging tool.
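Two of the metrics named above can be computed with very little code. The function names and input shapes are assumptions made for this sketch and do not describe CDA's actual tooling or report format.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_time_to_alert(pairs) -> float:
    """Mean seconds between event occurrence and alert creation.

    pairs: iterable of (event_time, alert_time) datetime tuples.
    """
    return mean((alert - event).total_seconds() for event, alert in pairs)


def detection_rate(simulated: int, detected: int) -> float:
    """Fraction of simulated attack scenarios that produced an alert."""
    return detected / simulated if simulated else 0.0


t = datetime(2024, 1, 1, 9, 0, 0)
observations = [
    (t, t + timedelta(seconds=30)),
    (t, t + timedelta(seconds=90)),
]
print(mean_time_to_alert(observations))
print(detection_rate(simulated=20, detected=15))
```

Tracking these figures per detection category, rather than as one global number, shows which categories are actually operating as detection and which exist only on paper.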

Risk Governance and Assurance (RGA) domain integration ensures that SIEM outputs map directly to regulatory control requirements. CDA validates that log retention policies, access controls on the SIEM platform itself, and audit trail integrity mechanisms satisfy the specific frameworks a client is assessed against. This prevents the situation where an organization has a functioning SIEM but fails an audit because the platform's own access logs were not retained according to policy requirements.

Key Takeaways

Validate parsers weekly through synthetic testing: A SIEM detection rule that references fields populated by a broken parser produces zero alerts silently. Run synthetic test events through every critical data source on a schedule and confirm that expected fields are populated correctly. Parser failures are invisible until tested explicitly.

Map detection rules to ATT&CK techniques before deployment: Start with techniques most associated with threat actors targeting your industry vertical. This produces a measurable coverage map rather than a reactive rule library assembled from vendor defaults. Focus on techniques that span multiple detection categories for maximum correlation value.

Separate log retention from active detection indexes: Storing years of raw logs in hot search indexes is expensive and degrades query performance. Use tiered storage architectures, keeping 30 to 90 days in active indexes and archiving the remainder in lower-cost storage that can be restored for forensic investigations when needed.

Define and measure alert response SLAs: Establish how quickly analysts must respond to each alert severity tier and report actual performance weekly. High-severity alerts that accumulate unworked in queues indicate that detection capability is nominal rather than operational. Measure mean time to acknowledge, investigate, and close alerts by category.

Treat SIEM tuning as continuous engineering work: Allocate dedicated analyst time each week to reviewing false positive rates, retiring obsolete rules, and building new detections based on recent threat intelligence. A SIEM tuned at deployment and never revisited loses effectiveness as both the environment and the threat landscape change.

Sources

National Institute of Standards and Technology. Guide to Computer Security Log Management (SP 800-92). https://csrc.nist.gov/publications/detail/sp/800-92/final

MITRE Corporation. ATT&CK Framework for Enterprise. https://attack.mitre.org/

Payment Card Industry Security Standards Council. PCI DSS Requirements and Security Assessment Procedures Version 4.0. https://www.pcisecuritystandards.org/

IBM Security. Cost of a Data Breach Report 2023. https://www.ibm.com/security/data-breach

Center for Internet Security. CIS Controls Version 8, Control 8: Audit Log Management. https://www.cisecurity.org/controls/audit-log-management
