Shadow Data Discovery
Finding and eliminating the data you didn't know you had — the hidden enemy of zero possession architecture.
Shadow data is information that exists outside an organization's known, managed, and governed data stores. It includes data in personal drives, unmanaged SaaS applications, exported spreadsheets, email attachments, messaging platforms, and forgotten databases. Within the Empty Fortress doctrine, shadow data is the enemy of zero possession — you cannot minimize what you do not know exists.
Shadow data grows through normal business operations. An employee exports a customer list to Excel for a presentation. A developer spins up a test database with production data. A sales rep uploads contacts to a personal CRM trial. A contractor stores project files in their personal Dropbox. Each of these actions creates a data store that sits outside governance and is not covered by the organization's encryption, access control, or retention policies.
Shadow data discovery operates in four phases. Asset enumeration maps all sanctioned data stores, SaaS subscriptions, cloud accounts, and endpoints. Network analysis identifies data flows to unknown destinations through DNS logs, proxy logs, and CASB telemetry. Endpoint scanning searches local drives, removable media, and cloud sync folders for sensitive data patterns using DLP tools. User interviews surface workflows that involve data movement outside sanctioned systems — often the most revealing phase.
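Two of these phases lend themselves to simple automation. The sketch below is a minimal illustration, not a replacement for a CASB or DLP product: it flags hostnames from DNS or proxy logs that are missing from a sanctioned inventory, and it counts sensitive-looking patterns in text files. The domain list, the two regular expressions, and every function name are assumptions made for the example; a real deployment would use a maintained inventory and a far more thorough rule set.

```python
import re
from pathlib import Path

# Hypothetical inventory of sanctioned destinations (assumed for illustration).
SANCTIONED_DOMAINS = {"corp-crm.example.com", "files.example.com"}

# Illustrative patterns only; production DLP rules are much more thorough.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def unsanctioned_destinations(hostnames, sanctioned=SANCTIONED_DOMAINS):
    """Count requests to hosts that are not in the sanctioned inventory."""
    counts = {}
    for host in hostnames:
        # Normalize case and drop the trailing dot that DNS logs often carry.
        host = host.strip().lower().rstrip(".")
        if host and host not in sanctioned:
            counts[host] = counts.get(host, 0) + 1
    return counts


def scan_text(text):
    """Return the number of matches per sensitive-data pattern in text."""
    hits = {}
    for name, pattern in SENSITIVE_PATTERNS.items():
        matches = len(pattern.findall(text))
        if matches:
            hits[name] = matches
    return hits


def scan_folder(root, suffixes=(".csv", ".txt")):
    """Scan text-like files under root; report only files with matches."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in suffixes:
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
            hits = scan_text(text)
            if hits:
                findings[str(path)] = hits
    return findings
```

Pointing `scan_folder` at a cloud sync folder, or feeding `unsanctioned_destinations` the hostname column of a proxy log, gives a rough first list of candidates. Pattern matching of this kind produces false positives, so treat each hit as a lead to verify, not as a confirmed finding.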
Once discovered, shadow data must be classified by sensitivity and handled accordingly. Data that should not exist is securely deleted. Data that must exist is migrated to governed stores. Workflows that create shadow data are re-engineered to use sanctioned tools. Recurring shadow data audits prevent re-accumulation.
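The handling rules above can be sketched as a small triage routine. This is one possible encoding under stated assumptions: the `Finding` fields, the sensitivity labels, and the choice to rank unclassified data first are all hypothetical, chosen only to make the decision logic concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SECURE_DELETE = "secure_delete"
    MIGRATE = "migrate_to_governed_store"


@dataclass
class Finding:
    location: str
    sensitivity: str      # assumed labels: "confidential", "internal", "public"
    business_need: bool   # does the organization still need this data?
    source_workflow: str  # the workflow that produced the store


# Lower rank is handled first; unknown labels get -1 so they are reviewed first.
SENSITIVITY_RANK = {"confidential": 0, "internal": 1, "public": 2}


def triage(finding):
    """Data that must exist is migrated; everything else is securely deleted."""
    return Action.MIGRATE if finding.business_need else Action.SECURE_DELETE


def remediation_plan(findings):
    """Return (location, action) pairs by sensitivity, plus workflows to re-engineer."""
    ranked = sorted(findings, key=lambda f: SENSITIVITY_RANK.get(f.sensitivity, -1))
    plan = [(f.location, triage(f)) for f in ranked]
    workflows = sorted({f.source_workflow for f in findings})
    return plan, workflows
```

Running `remediation_plan` on the output of each recurring audit keeps the same rules applied every cycle, and the returned workflow list shows which processes keep producing new shadow stores.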
You cannot achieve Empty Fortress if you do not know what is in the fortress. Shadow data discovery is not a one-time project — it is an ongoing discipline. Every shadow data store is a room in your fortress you did not know existed, holding data you did not know you had, with access controls you did not set.
Shadow data undermines every security control because it exists outside governance. Discover it through asset enumeration, network analysis, endpoint scanning, and user interviews. Treat shadow data discovery as a recurring discipline, not a one-time audit.
Written by CDA Editorial