Log Management and Retention
Definition
Log management is the operational discipline of collecting, aggregating, normalizing, storing, and retaining log data from across an organization's technology environment. Logs are the records that systems generate when events occur: a user authenticates, a firewall permits or blocks traffic, a server starts a process, an application throws an error, a database executes a query, a cloud resource is provisioned or modified.
Logs are the evidence layer that every other security function depends on. SIEM platforms correlate log data to detect threats. Incident responders analyze logs to determine what happened, when, and how far the compromise extended. Threat hunters search logs for evidence of adversary activity. Forensic investigators reconstruct attack timelines from log data. Compliance auditors verify control operation through log evidence.
Without comprehensive, centralized, and retained logs, an organization cannot detect threats (nothing to analyze), cannot investigate incidents (no evidence to reconstruct events), cannot prove compliance (no evidence of control operation), and cannot conduct forensics (no historical data to review). Log management is not a tool or a product. It is the data foundation that every Threat Intelligence and Defense (TID) operation and every compliance evidence requirement depends on.
How It Works
Log Sources
A comprehensive log management program collects logs from every category of system in the environment:
Identity and authentication. Active Directory authentication logs (Event IDs 4624, 4625, 4768, 4769, 4776), Entra ID sign-in logs, RADIUS authentication logs, SSO platform logs (Okta, Ping), VPN authentication logs. Authentication logs are among the highest-value log sources for security analysis because nearly every attack touches authentication at some point: the attacker authenticates with stolen credentials, brute-forces credentials, or bypasses authentication altogether. Authentication anomalies (impossible travel, unusual hours, bursts of failed attempts) are among the most reliable threat indicators.
Endpoint. Windows Security Event Logs, Sysmon (enhanced Windows logging), EDR telemetry (process creation, file modifications, network connections, registry changes), macOS Unified Logging, Linux auditd. Endpoint logs capture what happens on individual systems: which processes ran, which files were modified, which network connections were established. EDR platforms collect and centralize endpoint telemetry at granular detail, providing the data that threat hunting and incident investigation require.
Network. Firewall logs (permit/deny decisions, connection metadata), proxy logs (web traffic URLs, user agents, content types), DNS query logs, IDS/IPS alerts, NetFlow/IPFIX (network traffic metadata). Network logs capture traffic patterns: which systems communicate with which external addresses, what protocols they use, and how much data they transfer. DNS query logs are particularly valuable: nearly every attack involves DNS resolution (the malware resolves its C2 domain, the attacker resolves internal hostnames during discovery).
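Cloud. AWS CloudTrail (API activity), Azure Activity Log (resource management events), GCP Audit Logs (administrative and data access events), Microsoft 365 Unified Audit Log (email, SharePoint, Teams, OneDrive activity). Cloud logs capture the API calls and management actions made in the cloud environment. A user creating an IAM role or modifying a security group generates a cloud log entry. Data access events, such as reads of objects in a storage bucket, are often not logged by default (CloudTrail data events and GCP Data Access audit logs generally have to be enabled explicitly), so confirm they are turned on for sensitive resources.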
Application. Web application access logs (HTTP requests, response codes, source IPs), database audit logs (query execution, privilege changes), SaaS application logs (Salesforce login history, Slack audit logs). Application logs capture business-level activity that infrastructure logs do not: which users accessed which customer records, which queries were executed against the database, which files were shared from the collaboration platform.
Email. Mail server logs (send/receive/relay events), email gateway logs (filter decisions, quarantine actions), DMARC aggregate reports. Email logs provide visibility into phishing attempts, BEC indicators, and data exfiltration through email.
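As one illustration of the authentication anomalies mentioned above, the following sketch flags accounts with a burst of failed Windows logons (Event ID 4625). The event dictionaries, the `find_failed_login_bursts` name, and the threshold of 5 failures in 10 minutes are assumptions made for the example, not values from any particular SIEM.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_failed_login_bursts(events, threshold=5, window=timedelta(minutes=10)):
    """Return accounts with at least `threshold` failed logons inside any `window`."""
    failures = defaultdict(list)
    for event in events:
        if event["event_id"] == 4625:  # Windows failed-logon event
            failures[event["account"]].append(event["timestamp"])

    flagged = set()
    for account, times in failures.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Advance the left edge until the span fits inside the window.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(account)
                break
    return sorted(flagged)


# Hypothetical sample data: one account fails six times in six minutes,
# another fails once an hour, which stays under the threshold.
base = datetime(2024, 1, 1, 3, 0, 0)
sample_events = (
    [{"event_id": 4625, "account": "svc-backup", "timestamp": base + timedelta(minutes=i)}
     for i in range(6)]
    + [{"event_id": 4625, "account": "alice", "timestamp": base + timedelta(hours=i)}
       for i in range(6)]
    + [{"event_id": 4624, "account": "alice", "timestamp": base}]
)

print(find_failed_login_bursts(sample_events))
```

A production rule would normally also consider source address and successful logons that follow a burst. The sliding window here is only the simplest form of the idea.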
Log Pipeline Architecture
Log management involves five stages, from the source through to long-term retention:
Collection. Logs are collected from sources through agents (installed on endpoints and servers), syslog forwarding (network devices and Linux systems), API ingestion (cloud platforms and SaaS applications), and file-based collection (reading log files from shared storage). The collection mechanism must be reliable: lost logs create blind spots that attackers can exploit.
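For syslog forwarding, a collector typically begins by decoding the priority value at the start of each message. The sketch below assumes the standard syslog encoding (priority = facility × 8 + severity). The `parse_syslog_pri` name is hypothetical, and the sample line follows the well-known RFC 3164 example. Lines that fail to parse return `None` so the caller can count them rather than lose them silently.

```python
import re

# Severity names in numeric order 0..7, as defined for syslog.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]
PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_syslog_pri(line):
    """Split a syslog line into facility, severity, and the remaining message."""
    match = PRI_PATTERN.match(line)
    if match is None:
        return None  # no priority header; the caller should track this as a parse failure
    pri = int(match.group(1))
    if pri > 191:  # highest valid value is facility 23 * 8 + severity 7
        return None
    return {
        "facility": pri // 8,
        "severity": SEVERITIES[pri % 8],
        "message": match.group(2),
    }


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
parsed = parse_syslog_pri(sample)
print(parsed["facility"], parsed["severity"])
```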
Aggregation. Logs from distributed sources are forwarded to a central aggregation point. In large environments, log aggregators (syslog-ng, Fluentd, Logstash, Cribl) receive logs from hundreds or thousands of sources, buffer them, and forward them to the SIEM or log storage platform. Aggregation provides a single collection point that simplifies downstream processing.
Normalization. Logs from different sources use different formats, field names, and data structures. A Windows authentication event and a Linux SSH authentication event both represent "user logged in" but use different field names, different data formats, and different severity classifications. Normalization maps these diverse formats to a common schema so that a single search query can find authentication events across all sources. Most SIEM platforms normalize during ingestion, although some apply the schema at search time instead. Common schemas include the Elastic Common Schema (ECS) and the Open Cybersecurity Schema Framework (OCSF).
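A minimal sketch of normalization, assuming simplified inputs: a Windows logon event already parsed into a dictionary, and a raw OpenSSH `sshd` message. The output uses a handful of ECS-style field names (`event.category`, `event.outcome`, `user.name`, `source.ip`). The function names and input shapes are illustrative assumptions, not any SIEM's actual parser.

```python
import re

# Matches sshd messages such as
# "Accepted publickey for alice from 203.0.113.7 port 51234 ssh2".
SSH_PATTERN = re.compile(
    r"(?P<result>Accepted|Failed) \S+ for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def normalize_windows(event):
    """Map a parsed Windows logon event (4624 or 4625) to the common schema."""
    outcomes = {4624: "success", 4625: "failure"}
    return {
        "event.category": "authentication",
        "event.outcome": outcomes[event["EventID"]],
        "user.name": event["TargetUserName"],
        "source.ip": event["IpAddress"],
    }


def normalize_sshd(message):
    """Map an sshd authentication message to the common schema, or None if unrecognized."""
    match = SSH_PATTERN.search(message)
    if match is None:
        return None
    return {
        "event.category": "authentication",
        "event.outcome": "success" if match["result"] == "Accepted" else "failure",
        "user.name": match["user"],
        "source.ip": match["ip"],
    }


windows_event = {"EventID": 4624, "TargetUserName": "alice", "IpAddress": "203.0.113.7"}
ssh_message = "Accepted publickey for alice from 203.0.113.7 port 51234 ssh2"
normalized = [normalize_windows(windows_event), normalize_sshd(ssh_message)]

# One query now covers both sources.
alice_logins = [e for e in normalized
                if e["user.name"] == "alice" and e["event.outcome"] == "success"]
print(len(alice_logins))
```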
Storage. Normalized logs are stored for analysis and retention. Storage architecture involves trade-offs between query speed, storage cost, and retention duration. Hot storage (SSD-backed, fully indexed) provides fast query performance for recent data (typically 30 to 90 days). Warm storage (HDD-backed, partially indexed) provides slower but cheaper storage for older data (90 to 365 days). Cold storage (object storage, archive tier) provides the lowest cost for long-term retention (1 to 7+ years) with the slowest retrieval time.
Most organizations implement tiered storage: recent logs in hot storage for real-time analysis and threat detection, older logs in warm storage for investigations and hunting, and archived logs in cold storage for compliance retention and forensic readiness.
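The tiering decision can be sketched as a function of event age. The 90-day and 365-day boundaries below are assumptions taken from the upper ends of the ranges described above. Real platforms usually express this as index lifecycle or storage-class policies rather than application code.

```python
from datetime import datetime, timedelta, timezone

HOT_DAYS = 90    # assumed upper bound for hot storage
WARM_DAYS = 365  # assumed upper bound for warm storage


def storage_tier(event_time, now):
    """Return the tier ('hot', 'warm', or 'cold') for an event of the given age."""
    age = now - event_time
    if age <= timedelta(days=HOT_DAYS):
        return "hot"
    if age <= timedelta(days=WARM_DAYS):
        return "warm"
    return "cold"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
for days_old in (7, 200, 800):
    print(days_old, storage_tier(now - timedelta(days=days_old), now))
```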
Retention. Retention duration is determined by compliance requirements, forensic needs, and storage budget:
| Requirement Source | Minimum Retention |
|--------------------|-------------------|
| PCI DSS v4.0 (Requirement 10.5.1; Requirement 10.7 in v3.2.1) | 12 months (3 months immediately accessible) |
| HIPAA | 6 years (for audit logs related to PHI access) |
| SOC 2 | Per organization's defined retention policy (typically 12 months) |
| NIST 800-171 | Per organizational policy |
| GDPR | As long as necessary for the stated purpose |
| Forensic best practice | 12 to 24 months minimum for investigation support |
| Threat hunting best practice | 90 days minimum, 365 days recommended |
CDA recommends 12 months of retained, searchable logs (hot + warm storage) and 7 years of archived logs (cold storage) for organizations subject to HIPAA or financial services regulations. For organizations with lower regulatory requirements, 12 months searchable and 3 years archived is a practical baseline.
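The rule "retain for the longest applicable requirement" reduces to taking a maximum. The sketch below encodes only the two fixed minimums from the table (PCI DSS at 12 months, HIPAA at 6 years) and treats the 3-year archive baseline as a default. Frameworks whose retention is set by organizational policy are deliberately left out. The names are hypothetical, and the result is a floor: an organization may choose a longer period, such as the 7-year recommendation above.

```python
# Fixed minimums from the retention table, in months. Policy-defined
# frameworks (SOC 2, NIST 800-171, GDPR) have no fixed number to encode.
FRAMEWORK_MINIMUM_MONTHS = {
    "PCI DSS": 12,
    "HIPAA": 72,
}


def required_retention_months(frameworks, baseline_months=36):
    """Return the longest minimum retention across the baseline and applicable frameworks."""
    minimums = [FRAMEWORK_MINIMUM_MONTHS[name]
                for name in frameworks
                if name in FRAMEWORK_MINIMUM_MONTHS]
    return max([baseline_months] + minimums)


print(required_retention_months(["PCI DSS"]))
print(required_retention_months(["PCI DSS", "HIPAA"]))
```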
Log Integrity
Logs that can be modified by an attacker are unreliable as evidence. A sophisticated attacker who compromises a system may delete or modify the local logs to cover their tracks. Log integrity controls prevent this:
Centralized collection. Forwarding logs to a central SIEM or log management platform in real time means that even if the attacker deletes local logs, the centralized copy exists. The attacker must also compromise the central log platform to eliminate the evidence.
Write-once storage. Storing logs in append-only or immutable storage (S3 Object Lock, Azure Immutable Blob, WORM storage) prevents modification or deletion after ingestion. In the strictest configurations, such as S3 Object Lock in compliance mode, even an administrator cannot alter or delete the logs during the retention period.
Log signing. Cryptographically signing log entries at the source provides tamper evidence: if a log entry is modified after signing, the signature verification fails. Log signing is not widely implemented in commercial environments but is used in high-security contexts.
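Tamper evidence can be illustrated with a keyed hash chain. This sketch uses HMAC-SHA256 from the Python standard library as a stand-in for signing: each tag covers the entry plus the previous tag, so editing or deleting an entry breaks verification from that point on. It is a symmetric-key simplification. A real deployment might use asymmetric signatures and would need to protect the key away from the logging host. The function names are hypothetical.

```python
import hashlib
import hmac


def sign_entries(entries, key):
    """Return (entry, tag) pairs where each tag chains to the previous one."""
    signed = []
    previous = b""
    for entry in entries:
        tag = hmac.new(key, previous + entry.encode("utf-8"), hashlib.sha256).hexdigest()
        signed.append((entry, tag))
        previous = tag.encode("ascii")
    return signed


def verify_entries(signed, key):
    """Return the index of the first entry that fails verification, or None if all pass."""
    previous = b""
    for index, (entry, tag) in enumerate(signed):
        expected = hmac.new(key, previous + entry.encode("utf-8"), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, tag):
            return index
        previous = tag.encode("ascii")
    return None


key = b"demo-key-not-for-production"  # illustrative only
log = sign_entries(["user alice logged in", "alice added to admins", "alice logged out"], key)

# Simulate an attacker rewriting the second entry while keeping its old tag.
tampered = list(log)
tampered[1] = ("alice viewed dashboard", log[1][1])

print(verify_entries(log, key), verify_entries(tampered, key))
```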
Why It Matters
Detection Depends on Data
A SIEM without logs is a database without data. Detection rules, correlation, behavioral analytics, and threat hunting all require log data as input. The breadth and depth of log collection directly determine the breadth and depth of detection capability. An organization that collects only firewall logs and authentication logs has detection coverage for network-level and authentication-level threats. An organization that also collects endpoint telemetry, cloud logs, DNS queries, and application logs has detection coverage across the full attack lifecycle.
CDA's TID-R02 mission (Detection Coverage Assessment, 16 hours) evaluates both detection rule coverage (which ATT&CK techniques do the rules detect?) and data source coverage (which log sources are connected, and do they provide the telemetry that detection rules require?). A detection rule that requires DNS query logs cannot function if DNS query logging is not enabled. Data source gaps are detection gaps.
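The idea that data source gaps are detection gaps can be checked mechanically by comparing what each rule needs against what is connected. In this sketch the rule names and source labels are hypothetical. The ATT&CK technique IDs in the rule names are real identifiers, used only as labels.

```python
def find_blocked_rules(rules, connected_sources):
    """Map each rule that cannot function to the sorted list of log sources it is missing."""
    connected = set(connected_sources)
    blocked = {}
    for name, required in rules.items():
        missing = sorted(set(required) - connected)
        if missing:
            blocked[name] = missing
    return blocked


# Hypothetical detection rules and the log sources each depends on.
rules = {
    "T1071.004 DNS tunneling": ["dns_query"],
    "T1110 Brute force": ["windows_security"],
    "T1059.001 Suspicious PowerShell": ["sysmon", "windows_security"],
}
connected_sources = ["windows_security", "firewall"]

print(find_blocked_rules(rules, connected_sources))
```

With only Windows Security and firewall logs connected, the brute-force rule can run, while the DNS and PowerShell rules are blocked until DNS query logging and Sysmon are collected.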
Incident Investigation
When a security incident occurs, the incident response team's first question is: "What happened?" The answer comes from logs. Authentication logs show which accounts the attacker used. Endpoint logs show which processes ran and which files were modified. Network logs show which external systems the attacker communicated with. Cloud logs show which resources were accessed or modified.
If the relevant logs do not exist (the log source was not configured for collection) or have expired (the retention period was shorter than the attacker's dwell time), the investigation hits a dead end. The team knows something happened but cannot determine the scope, impact, or root cause. This limits the effectiveness of containment (cannot contain what you cannot scope) and recovery (cannot confirm clean-up without evidence of the full compromise).
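The relationship between retention and dwell time is simple date arithmetic. The sketch below estimates how many days of attacker activity fall outside the retention window at the moment of detection. The dates and the 90-day retention figure are made-up inputs, and the function name is hypothetical.

```python
from datetime import date, timedelta


def unrecoverable_days(initial_access, detected_on, retention_days):
    """Return how many days of attacker activity predate the oldest retained log."""
    oldest_retained = detected_on - timedelta(days=retention_days)
    gap = (oldest_retained - initial_access).days
    return max(gap, 0)


# Hypothetical incident: access began 1 Jan 2024, detected 1 Jul 2024.
gap_short = unrecoverable_days(date(2024, 1, 1), date(2024, 7, 1), retention_days=90)
gap_long = unrecoverable_days(date(2024, 1, 1), date(2024, 7, 1), retention_days=365)
print(gap_short, gap_long)
```

A nonzero result means the earliest stages of the intrusion, often including initial access, can no longer be reconstructed from logs.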
Compliance Evidence
Logs serve as compliance evidence for control operation. The auditor asks: "Show me that access reviews were conducted." The log shows the access review completion events. "Show me that vulnerability scans ran weekly." The log shows the scan execution events. "Show me that changes followed the approval process." The change management log shows approval timestamps.
Without logs, compliance evidence depends on screenshots and manual attestations, which are lower quality and more labor-intensive than automated log-based evidence. GRC platforms that pull compliance evidence from log data automate this process.
CDA Perspective
Log management sits in the TID (Threat Intelligence and Defense) domain of the Planetary Defense Model. TID is the atmosphere: the detection and response layer. Logs are the atmospheric data that every TID function processes. Without comprehensive log data, detection rules have no input, threat hunters have no search corpus, and incident responders have no evidence.
CDA's Predictive Defense Intelligence (PDI) methodology treats log management as a prerequisite for every TID capability. "See the threat before it sees you." You cannot see what you are not recording. The first question in any TID assessment is: "What log sources are connected, and what is the retention?" The answer determines the ceiling for every subsequent TID capability.
TID-B01 (SIEM Deployment and Tuning, 40 estimated hours) includes log management as a core component: identifying required log sources, configuring collection agents, deploying aggregation infrastructure, defining normalization schemas, implementing tiered storage, and establishing retention policies. The SIEM and the log management infrastructure are deployed together because the SIEM is only as effective as the log data it receives.
Log management also interacts with adjacent domains. RGA defines the retention requirements (compliance frameworks mandate minimum retention periods). DPS governs the protection of log data itself (logs contain sensitive information including IP addresses, usernames, and activity records that may be subject to privacy regulations). SPH maintains the log collection infrastructure (agent health, syslog forwarding, API connectivity). IAT provides the identity context that makes log data meaningful (correlating authentication events with user identities).
Key Takeaways
- Log management is the data foundation for SIEM, threat detection, incident investigation, threat hunting, and compliance evidence. Without comprehensive logs, none of these functions can operate effectively.
- Comprehensive log collection includes identity/authentication, endpoint, network, cloud, application, and email sources. Missing source categories are detection blind spots.
- Retention should be at least 12 months searchable (hot + warm) for investigation and hunting, with archived retention (cold) meeting the longest applicable compliance requirement.
- Log integrity (centralized collection, write-once storage, log signing) prevents attackers from covering their tracks by modifying or deleting logs.
- CDA treats log management as a TID prerequisite. The first question in any TID assessment: what log sources are connected, and what is the retention?
Related Articles
- SIEM Architecture
- Threat Hunting
- Incident Detection and Behavioral Analytics
- Incident Response Lifecycle
- Compliance Program Design
- Threat Intelligence and Defense (TID): The Atmosphere
Sources
- National Institute of Standards and Technology (NIST). "Guide to Computer Security Log Management: SP 800-92." U.S. Department of Commerce, September 2006 (principles remain applicable; updated guidance in CSF 2.0).
- PCI Security Standards Council. "PCI DSS v4.0: Requirement 10 (Log and Monitor All Access to System Components and Cardholder Data)." PCI SSC, March 2022.
- MITRE Corporation. "ATT&CK Data Sources." attack.mitre.org, updated continuously. (Mapping of data sources to detection coverage.)
- Open Cybersecurity Schema Framework. "OCSF Schema v1.0." ocsf.io, 2024. (Log normalization schema.)
- SANS Institute. "Log Management and SIEM: Best Practices for Security Operations." SANS Reading Room, 2024.
Written by Evan Morgan