Open Source Intelligence (OSINT) Techniques
Collection and analysis of publicly available information to map attack surfaces and produce actionable security intelligence.
# Open Source Intelligence (OSINT) Techniques
Open Source Intelligence (OSINT) refers to the systematic collection, processing, and analysis of publicly available information to produce actionable intelligence about targets, threats, and attack surfaces. OSINT exists because adversaries conduct extensive pre-attack research before engaging a target network, and defenders who fail to perform the same research operate blind to their own exposure. The problem OSINT solves is asymmetric information: attackers invest significant time understanding a target's digital footprint while defenders often have no clear picture of what information about them is already public. By applying structured OSINT techniques, security teams close that gap, discovering exposed credentials, misconfigured systems, leaked internal documentation, and personnel data before attackers do.
---
Open Source Intelligence is the collection and analysis of information derived exclusively from publicly available sources, without any unauthorized access, covert intrusion, or proprietary data feeds. The "open" in OSINT does not mean free or easy to find. It means the information exists in sources that are legally accessible to anyone: websites, public records, social media platforms, academic publications, government databases, code repositories, and broadcast media.
OSINT is distinct from several adjacent disciplines that are sometimes conflated with it. Signals Intelligence (SIGINT) involves the interception of communications, which requires legal authorization and specialized equipment. Human Intelligence (HUMINT) relies on interpersonal contact and source development. Cyber Threat Intelligence (CTI) may draw on OSINT as one input but also incorporates closed-source feeds, dark web monitoring, and vendor telemetry. Competitive intelligence is a business discipline that overlaps methodologically but serves commercial strategy rather than security operations.
What OSINT is NOT: it is not scanning of a target's infrastructure, however light. Sending packets to a target's IP addresses, running vulnerability scans, or querying a target's authoritative name servers directly are active reconnaissance techniques, not OSINT. OSINT relies entirely on data that was already made public by the data subject or by third parties, with no direct interaction with target systems.
Subtypes of OSINT relevant to security operations include: external attack surface mapping (identifying internet-facing assets), personnel OSINT (mapping employees, roles, and contact information), credential intelligence (identifying leaked usernames and passwords in breach data), supply chain OSINT (researching vendors and technology partners), and geospatial OSINT (using satellite imagery and location data for physical security assessments). Each subtype requires different tools, data sources, and analytical methods, though they share a common methodology grounded in source enumeration, data correlation, and structured analysis.
---
OSINT collection follows a structured process that moves from target definition through data collection, processing, analysis, and reporting. Each phase builds on the previous one, and skipping steps produces incomplete or misleading intelligence.
Phase 1: Target Definition and Scope
Before any collection begins, the analyst defines the target scope precisely. For an organization, this includes the primary domain name, all known subsidiary domains, registered IP ranges, known brand names, key personnel, and relevant technology products. Scope creep is a real problem in OSINT operations: without boundaries, analysts can spend weeks following tangential leads. A clear scope statement, such as "all internet-facing assets associated with acme-corp.com and its confirmed subsidiaries," keeps collection focused and efficient.
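A scope statement is easier to enforce when it is machine-checkable. The following is a minimal sketch, assuming scope is expressed as root domains plus IP ranges; the `Scope` class, its field names, and the 203.0.113.0/24 range (a reserved documentation block) are illustrative and not part of any particular tool.

```python
from __future__ import annotations

from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope statement: root domains plus registered IP ranges."""

    domains: tuple[str, ...]
    networks: tuple[str, ...] = ()

    def covers(self, asset: str) -> bool:
        """Return True if a hostname or IP address falls inside the scope."""
        asset = asset.strip().lower().rstrip(".")
        try:
            addr = ip_address(asset)
        except ValueError:
            # Not an IP literal: treat it as a hostname and match on label
            # boundaries so "notacme-corp.com" does not match "acme-corp.com".
            return any(asset == d or asset.endswith("." + d) for d in self.domains)
        return any(addr in ip_network(n) for n in self.networks)


scope = Scope(
    domains=("acme-corp.com",),
    networks=("203.0.113.0/24",),  # placeholder range, not a real allocation
)

for candidate in ("staging-payments.acme-corp.com", "notacme-corp.com", "203.0.113.7"):
    print(candidate, "in scope" if scope.covers(candidate) else "OUT OF SCOPE")
```

Filtering every collected lead through a check like this is one way to keep tangential findings out of the working set without discarding them entirely.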
Phase 2: Passive DNS and Domain Enumeration
Domain enumeration is typically the first technical step in organizational OSINT. Analysts query WHOIS records to identify domain registration history, registrant contact information (often redacted since GDPR, though historical WHOIS records remain useful), nameserver configurations, and registration patterns that may reveal additional related domains. Certificate transparency logs, accessible through services like crt.sh, record the publicly trusted TLS certificates issued for a domain, including those for subdomains that the organization never intended to publicize. A single query to crt.sh for a large enterprise will often return hundreds of subdomains, many pointing to staging environments, development servers, internal tools exposed accidentally, or decommissioned infrastructure still running vulnerable software.
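As an illustration of the certificate transparency step, the sketch below builds a crt.sh JSON query and extracts unique hostnames from the response. It assumes each returned record carries a `name_value` field holding one or more newline-separated names; the sample records are invented so the parsing can be run offline, and `fetch_records` is shown but not called.

```python
import json
import urllib.parse
import urllib.request


def crtsh_url(domain: str) -> str:
    """Build a crt.sh JSON query for certificates covering any subdomain."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return "https://crt.sh/?" + query


def extract_subdomains(records: list, domain: str) -> set:
    """Collect unique hostnames under `domain` from crt.sh-style records."""
    suffix = "." + domain
    found = set()
    for record in records:
        # A single certificate can list several names, one per line.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # wildcard entry: keep the parent name
            if name == domain or name.endswith(suffix):
                found.add(name)
    return found


def fetch_records(domain: str, timeout: float = 30.0) -> list:
    """Network call (not executed here); the service may be slow or rate-limited."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=timeout) as resp:
        return json.load(resp)


# Offline sample shaped like a response; the values are invented.
sample = [
    {"name_value": "www.acme-corp.com\nacme-corp.com"},
    {"name_value": "*.dev.acme-corp.com"},
    {"name_value": "staging-payments.acme-corp.com"},
]
subdomains = extract_subdomains(sample, "acme-corp.com")
print(sorted(subdomains))
```

Because the lookup goes to a third-party log aggregator rather than to the organization's own servers, it leaves no trace in the target's logs.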
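DNS brute-forcing with tools such as dnsx and amass expands this further by testing common subdomain names from a wordlist. Unlike the lookups above, this is not strictly passive: even when the analyst sends queries through public resolvers, uncached lookups are forwarded to the target's authoritative name servers, where they can be logged. Treat brute-forcing as low-noise active reconnaissance that must be covered by the engagement's authorization, and rely on passive DNS datasets when no interaction with the target is permitted.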
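Wordlist-based enumeration reduces to a simple loop: build a candidate name, try to resolve it, and keep the hits. The sketch below is an assumption-laden illustration rather than how dnsx or amass work internally; the resolver is injectable, and the example runs against a stub dictionary (`fake_zone`, invented) so that no DNS traffic is generated.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def system_resolve(hostname: str) -> Optional[str]:
    """Resolve via the OS resolver; return None when the name does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def probe_subdomains(
    domain: str,
    words: Iterable[str],
    resolve: Callable[[str], Optional[str]] = system_resolve,
) -> Dict[str, str]:
    """Return {hostname: address} for every wordlist candidate that resolves."""
    hits = {}
    for word in words:
        candidate = f"{word}.{domain}"
        address = resolve(candidate)
        if address is not None:
            hits[candidate] = address
    return hits


# Stub resolver so the example runs offline; addresses are placeholders.
fake_zone = {
    "vpn.acme-corp.com": "203.0.113.10",
    "staging.acme-corp.com": "203.0.113.20",
}
hits = probe_subdomains("acme-corp.com", ["www", "vpn", "staging"], resolve=fake_zone.get)
print(hits)
```

A real run would also need to account for wildcard DNS records, which make every candidate appear to resolve, and should only target domains inside the agreed scope.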
Phase 3: Technology Fingerprinting and Asset Profiling
Once subdomains are enumerated, analysts profile each asset without touching it directly. Shodan and Censys maintain continuously updated indexes of internet-facing services, storing banner information, certificate details, open ports, and detected software versions. An analyst can query Shodan for all hosts returning a specific server banner associated with the target's IP ranges and immediately identify running software versions, exposed administrative interfaces, and misconfigured services, all without sending a single packet to the target.
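Once banner data has been exported from an index such as Shodan or Censys, flagging outdated software is mostly a version comparison. The sketch below assumes records are dictionaries with `ip_str`, `port`, `product`, and `version` keys (loosely modeled on what such indexes return, and treated here as an assumption); the minimum-version table is hypothetical and would normally come from vendor advisories.

```python
import re


def parse_version(text: str) -> tuple:
    """Turn the leading dotted number of '1.14.0-ubuntu1' into (1, 14, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    return tuple(int(part) for part in match.group().split(".")) if match else ()


def flag_outdated(records: list, minimums: dict) -> list:
    """List services whose reported version is below the configured floor."""
    findings = []
    for rec in records:
        product = (rec.get("product") or "").lower()
        version = parse_version(rec.get("version") or "")
        floor = minimums.get(product)
        # Skip records with no floor defined or no parseable version.
        if floor and version and version < parse_version(floor):
            findings.append(f"{rec['ip_str']}:{rec['port']} {product} {rec['version']}")
    return findings


# Invented sample records; addresses come from a documentation range.
records = [
    {"ip_str": "203.0.113.20", "port": 443, "product": "nginx", "version": "1.14.0"},
    {"ip_str": "203.0.113.21", "port": 443, "product": "nginx", "version": "1.24.0"},
    {"ip_str": "203.0.113.22", "port": 80, "product": "Apache httpd", "version": None},
]
minimums = {"nginx": "1.20.0"}  # hypothetical policy floor, not a vendor statement

for finding in flag_outdated(records, minimums):
    print(finding)
```

Banner versions can be spoofed or backport-patched by distributions, so a hit here is a lead to verify rather than a confirmed vulnerability.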
Job postings are a frequently underestimated OSINT source for technology stack identification. A posting for a "Senior DevOps Engineer" that lists required experience with specific SIEM platforms, cloud providers, identity providers, and monitoring tools tells an analyst exactly which products the organization is running and, by extension, which CVEs may be relevant.
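Extracting a technology stack from posting text can be approximated with a keyword watchlist. This is a rough sketch: the `WATCHLIST` categories and product names are illustrative choices, the posting text is invented, and a production approach would need a far larger vocabulary.

```python
import re

# Hypothetical watchlist: category -> product names to look for.
WATCHLIST = {
    "siem": ["Splunk", "QRadar"],
    "cloud": ["AWS", "Azure", "GCP"],
    "identity": ["Okta", "Active Directory"],
}


def extract_stack(posting: str, watchlist: dict = WATCHLIST) -> dict:
    """Return the watchlist products mentioned in a posting, grouped by category."""
    found = {}
    for category, products in watchlist.items():
        hits = [
            product
            for product in products
            # Lookarounds keep "AWS" from matching inside a longer token.
            if re.search(rf"(?<!\w){re.escape(product)}(?!\w)", posting, re.IGNORECASE)
        ]
        if hits:
            found[category] = hits
    return found


posting = (
    "Senior DevOps Engineer: 5+ years with AWS, Splunk administration, "
    "and Okta SSO integrations."
)
stack = extract_stack(posting)
print(stack)
```

Run across an organization's posting history, even a crude matcher like this tends to show which products appear repeatedly and are therefore likely to be in production.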
Phase 4: Personnel and Organizational Mapping
LinkedIn, corporate websites, conference speaker bios, and academic publications collectively expose an organization's internal structure to a degree that surprises most security teams when they see it documented. Analysts build org charts, identify IT and security personnel, and map relationships between employees and business units. This is the same intelligence an attacker uses to plan social engineering campaigns, select spear-phishing targets, and stage business email compromise, which is why defenders need to see it first.
Email address harvesting tools such as theHarvester and Hunter.io enumerate corporate email addresses across public data sources. Once an analyst has confirmed the address format (firstname.lastname@acme-corp.com), they can check known and derived addresses against breach data. Have I Been Pwned reports which breaches an address appears in, while services such as Dehashed index the leaked records themselves. Either signal identifies accounts whose credentials may have been exposed in third-party data breaches. A compromised password from a 2019 breach of an unrelated service may still be valid if the employee reused it.
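Inferring an address format from one confirmed address and applying it to other names can be sketched as follows. The `PATTERNS` table, the function names, and the sample people are all hypothetical; the resulting candidates would then be checked against a breach-notification service, which is not shown here.

```python
from typing import List, Optional, Tuple

# Hypothetical set of common local-part conventions.
PATTERNS = {
    "first.last": "{first}.{last}",
    "flast": "{f}{last}",
    "firstl": "{first}{l}",
    "first": "{first}",
}


def render(pattern: str, first: str, last: str) -> str:
    """Fill a pattern with a lower-cased first and last name."""
    first, last = first.lower(), last.lower()
    return pattern.format(first=first, last=last, f=first[0], l=last[0])


def infer_pattern(known_email: str, first: str, last: str) -> Optional[str]:
    """Return the name of the pattern that reproduces a confirmed address."""
    local_part = known_email.split("@", 1)[0].lower()
    for name, pattern in PATTERNS.items():
        if render(pattern, first, last) == local_part:
            return name
    return None


def candidates(names: List[Tuple[str, str]], pattern_name: str, domain: str) -> List[str]:
    """Generate candidate addresses for (first, last) pairs."""
    pattern = PATTERNS[pattern_name]
    return [f"{render(pattern, first, last)}@{domain}" for first, last in names]


pattern_name = infer_pattern("jane.doe@acme-corp.com", "Jane", "Doe")
emails = candidates([("John", "Smith"), ("Ana", "Lee")], pattern_name, "acme-corp.com")
print(pattern_name, emails)
```

Generated addresses are guesses until confirmed by a second source, so they should be tracked separately from addresses actually observed in public data.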
Phase 5: Correlation and Profile Assembly
The highest-value OSINT work is not collection but correlation. Maltego and SpiderFoot automate relationship mapping across disparate data sources, visualizing connections between domains, IP addresses, email addresses, social media profiles, and organizational entities. A single employee's GitHub profile may expose internal hostnames in commit history, AWS S3 bucket names in configuration files, and API keys in code that was meant to be cleaned before publication. None of these findings is obvious in isolation, but combined they reveal significant attack surface.
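Correlation can be modeled as a graph in which each observation links two entities, and a path between two entities is the chain of evidence connecting them. The sketch below is a deliberately small stand-in for what Maltego or SpiderFoot do at scale; the `EntityGraph` class, the `type:value` naming convention, and every entity shown are invented for illustration.

```python
from collections import defaultdict, deque
from typing import List, Optional


class EntityGraph:
    """Undirected graph of OSINT entities such as 'host:...' or 'repo:...'."""

    def __init__(self) -> None:
        self._edges = defaultdict(set)

    def link(self, a: str, b: str) -> None:
        """Record that two entities were observed together."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def path(self, start: str, goal: str) -> Optional[List[str]]:
        """Breadth-first search for the shortest chain connecting two entities."""
        queue = deque([[start]])
        seen = {start}
        while queue:
            route = queue.popleft()
            if route[-1] == goal:
                return route
            for neighbor in sorted(self._edges.get(route[-1], ())):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(route + [neighbor])
        return None


graph = EntityGraph()
graph.link("person:jane.doe", "github:jdoe-dev")
graph.link("github:jdoe-dev", "repo:jdoe-dev/deploy-scripts")
graph.link("repo:jdoe-dev/deploy-scripts", "host:staging-payments.acme-corp.com")
graph.link("host:staging-payments.acme-corp.com", "ip:203.0.113.20")

chain = graph.path("person:jane.doe", "ip:203.0.113.20")
print(" -> ".join(chain))
```

Keeping the source of each link alongside the edge would make the resulting chain reportable, since every hop could then be traced back to a public record.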
Concrete Scenario: During an external attack surface assessment, an analyst queries crt.sh for acme-corp.com and discovers a subdomain staging-payments.acme-corp.com that does not appear in the organization's documented asset inventory. A Shodan query for the associated IP shows an outdated nginx version with a known path traversal vulnerability. A search on GitHub for "acme-corp" in repository names reveals a public repository containing a configuration file with a plaintext database connection string pointing to that same staging server. The organization has no awareness of this exposure. The analyst now has a complete attack chain derived entirely from public data, discovered in under two hours.
---
Organizations that do not conduct regular OSINT assessments against themselves are making security decisions without complete information. Their incident response plans assume adversaries arrive at the network perimeter with no prior knowledge. In practice, sophisticated threat actors invest weeks or months in OSINT before executing an attack, using that intelligence to select the most effective initial access vector, identify high-value personnel to impersonate, and anticipate defensive controls.
The business impact of inadequate OSINT awareness is measurable. Credential exposure in third-party breach databases is one of the most common initial access vectors documented in breach investigations. The 2021 Colonial Pipeline ransomware attack, attributed to DarkSide, reportedly began with a compromised VPN credential that appeared in a leaked password database. The VPN account had no multi-factor authentication. A review of breach data for corporate credentials could have surfaced that exposure before the attackers used it, enabling a password reset and MFA enforcement.
Exposed source code repositories represent a second high-frequency risk category. Development teams routinely publish code to GitHub or GitLab with API keys, cloud credentials, or internal hostnames included inadvertently. These secrets are often harvested within minutes of publication by automated scanning tools operated by both security researchers and threat actors. Organizations that do not monitor their own code repositories for secret exposure are typically unaware of this attack surface until after a breach.
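Secret scanning is largely pattern matching over file contents. The sketch below uses three illustrative rules: the well-known AWS access key ID prefix format, a PEM private key header, and a loose quoted-password heuristic. The rule set is an assumption, not a complete detector, and the sample uses AWS's published example key rather than a real credential.

```python
import re

# Illustrative rules only; dedicated scanners ship far larger rule sets.
RULES = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "password_assignment": re.compile(
        r"(?i)(?:password|passwd|pwd)\s*[=:]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def scan_text(text: str) -> list:
    """Return (line_number, rule_name) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


# Invented configuration file; the key is AWS's documented example value.
sample = '''\
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
db_host = "staging-payments.acme-corp.com"
db_password = "hunter2-example"
'''

findings = scan_text(sample)
for lineno, rule in findings:
    print(f"line {lineno}: {rule}")
```

Scanning only the current tree is not enough in practice, because a secret removed in a later commit usually remains retrievable from repository history.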
A common misconception about OSINT is that privacy measures like GDPR-mandated WHOIS redaction or social media privacy settings meaningfully limit attacker capability. Privacy controls reduce the convenience of data collection but do not eliminate it. Cached data, breach databases, third-party data aggregators, and historical records preserve information long after the original source is restricted. Defenders must treat OSINT as an ongoing, continuous function rather than a one-time assessment, because the internet's memory is long and adversaries are patient.
A second misconception is that OSINT is only useful for red teams. Blue teams, threat intelligence analysts, vulnerability management programs, and executive protection functions all depend on OSINT to do their jobs effectively.
---
The Cyber Defense Alliance approaches OSINT through the Planetary Defense Model (PDM), specifically within the Threat Intelligence Domain (TID). CDA's operational methodology, Predictive Defense Intelligence (PDI), is grounded in the principle: "See the threat before it sees you." OSINT is the mechanism by which that principle is operationalized.
CDA distinguishes between reactive and predictive OSINT postures. A reactive posture waits for indicators of compromise to appear in logs and then attempts to attribute them to known threat actors. A predictive posture continuously monitors the external information environment for signals that precede an attack: newly registered domains that mimic the organization's brand, credential exposure in fresh breach databases, dark web forum discussions referencing the organization by name, and social media profiles impersonating executives. The goal is that, by the time a threat actor engages a target's infrastructure, the predictive OSINT program has already flagged these precursor signals.
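Brand-mimicking registrations can be flagged with a string-similarity check over newly observed domains. The sketch below is one possible heuristic, not CDA's method: it compares only the first label of each domain (assuming inputs of the form `label.tld`), flags names within a small Levenshtein distance of the brand or containing it, and uses invented sample domains.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                  # deletion
                current[j - 1] + 1,               # insertion
                previous[j - 1] + (ch_a != ch_b)  # substitution
            ))
        previous = current
    return previous[-1]


def looks_like(candidate: str, brand_domain: str, max_distance: int = 2) -> bool:
    """Flag a domain whose first label imitates or embeds the brand's label."""
    candidate, brand_domain = candidate.lower(), brand_domain.lower()
    if candidate == brand_domain:
        return False  # the organization's own domain
    label = candidate.split(".")[0]
    brand = brand_domain.split(".")[0]
    return brand in label or edit_distance(label, brand) <= max_distance


# Invented feed of newly observed registrations.
observed = [
    "acme-c0rp.com",
    "acme-corp-login.net",
    "acmecorp.com",
    "example.org",
    "acme-corp.com",
]
flagged = [d for d in observed if looks_like(d, "acme-corp.com")]
print(flagged)
```

Short brand names produce many false positives at a distance of two, so the threshold would need tuning per organization.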
CDA's TID methodology structures OSINT collection around three layers. The surface layer covers indexed web content, public records, and social media. The deep layer covers content that is publicly accessible but not indexed by standard search engines, including paste sites, public cloud storage buckets, and academic databases. The dark layer, while not strictly "open source" in the traditional sense, is monitored through authorized access to closed criminal forums and marketplaces where compromised credentials and internal documents are traded.
What CDA does differently is treat OSINT not as a project but as a persistent intelligence function. Most organizations conduct an OSINT assessment annually as part of a penetration test engagement and then file the report. CDA integrates continuous OSINT monitoring into the security operations cycle, with automated alerting for new certificate issuances on client domains, credential exposure in breach data feeds, and code repository secret scanning. This persistent posture means that the intelligence picture is current, not twelve months stale, and that the security team can act on findings before they become incidents.
CDA also applies OSINT outputs directly to the Vulnerability and Security Domain (VSD) by feeding discovered assets into the vulnerability management program. A subdomain that is missing from the asset inventory cannot be patched, monitored, or decommissioned, because the security team does not know it exists. OSINT closes that gap systematically.
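The inventory gap described here is, at its simplest, a set difference between what OSINT discovered and what the inventory records. The sketch below assumes both sides are plain lists of hostnames; the function name, the returned keys, and the sample data are illustrative.

```python
def normalize(hostname: str) -> str:
    """Lower-case a hostname and drop whitespace and any trailing dot."""
    return hostname.strip().lower().rstrip(".")


def inventory_gap(discovered: list, inventory: list) -> dict:
    """Compare discovered hostnames against the recorded asset inventory."""
    seen = {normalize(h) for h in discovered}
    known = {normalize(h) for h in inventory}
    return {
        # Public but untracked: candidates for onboarding or decommissioning.
        "unknown": sorted(seen - known),
        # Tracked but not observed publicly: possibly internal or retired.
        "unobserved": sorted(known - seen),
    }


# Invented sample data.
discovered = [
    "www.acme-corp.com",
    "Staging-Payments.acme-corp.com.",
    "vpn.acme-corp.com",
]
inventory = ["www.acme-corp.com", "vpn.acme-corp.com", "mail.acme-corp.com"]

gap = inventory_gap(discovered, inventory)
print(gap)
```

The "unknown" list is the part that feeds vulnerability management, since those hosts are exposed but currently outside any patching or monitoring process.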
---
Written by CDA Editorial