# Reconnaissance Techniques
Reconnaissance is the phase of an attack in which the adversary gathers information about the target before taking any direct action against it. In MITRE ATT&CK, Reconnaissance is Tactic TA0043. It is the earliest phase in the attack lifecycle, and it is almost entirely invisible to the target organization.
Unlike most subsequent tactics, reconnaissance predominantly occurs outside the target's network, infrastructure, and security monitoring perimeter. The attacker searches public databases, queries DNS, enumerates certificate transparency logs, reviews LinkedIn for employee details, and scans internet-facing systems from IP addresses that have no prior relationship to the target organization. Apart from the scanning, none of this generates log entries in the target's SIEM, and none of it triggers endpoint detection. The target organization has no direct visibility into what information the adversary has collected about it before the first intrusion attempt begins.
This asymmetry is the central challenge of reconnaissance defense: you cannot detect what you cannot see. The appropriate defense posture shifts from detection to reduction. If the attacker cannot find the information, they cannot use it. If the attack surface they discover is minimal, the options available to them are limited. If the credentials they find in breach databases are already rotated and protected by phishing-resistant MFA, the credential intelligence they collected is useless.
ATT&CK divides pre-compromise activity between Reconnaissance (TA0043) and Resource Development (TA0042). This article focuses on TA0043, covering both active and passive information-gathering techniques that directly target victim organizations. Understanding what an attacker learns about your organization before the first intrusion attempt is a prerequisite for rational defensive prioritization.
The Continuous Surface Reduction (CSR) methodology in CDA's VSD domain is the primary strategic response to adversary reconnaissance: "Every surface you expose is a surface we eliminate." The intelligence an attacker gathers during reconnaissance defines the attack surface they will target. Reducing that surface, before the attacker maps it, is the defensive objective.
Active scanning involves the attacker directly probing internet-facing systems to discover open ports, running services, and vulnerabilities. Unlike passive reconnaissance, active scanning generates network traffic that originates from attacker-controlled infrastructure. In theory, this makes it detectable. In practice, the volume of legitimate scanning activity on the internet (search engine crawlers, security researchers, vulnerability management vendors) makes distinguishing malicious scanning from background noise difficult without significant context.
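Sub-techniques cover scanning IP blocks (T1595.001), vulnerability scanning (T1595.002), and wordlist scanning (T1595.003), which includes directory and file enumeration against web servers.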
The attacker's scanning generates logs on the target side: firewall connection logs, web server access logs, IDS/IPS alerts. Detection is possible but high-noise. More operationally useful is the defensive equivalent: running the same scans against your own infrastructure continuously (Attack Surface Management) and finding your exposure before attackers do.
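As a minimal sketch of checking your own exposure, the following uses a plain TCP connect test from the Python standard library. The function names `check_tcp_port` and `exposed_ports` are illustrative, and the host and port list are assumed to come from your own asset inventory; a commercial ASM platform does far more than this.

```python
import socket


def check_tcp_port(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat as not exposed.
        return False


def exposed_ports(host: str, ports: list[int], timeout: float = 1.0) -> list[int]:
    """Return the subset of ports on host that accept a TCP connection."""
    return [port for port in ports if check_tcp_port(host, port, timeout)]
```

Running `exposed_ports` on a schedule against hosts you own, from a vantage point outside your network, and comparing the result with the services you expect to publish gives a rough version of the attacker's view.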
Passive reconnaissance through public technical databases requires no direct interaction with the target. The attacker queries third-party resources that have already indexed information about the target.
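Sub-techniques and specific data sources include DNS and passive DNS records (T1596.001), WHOIS registration data (T1596.002), digital certificates and Certificate Transparency logs (T1596.003), CDN data (T1596.004), and scan databases such as Shodan and Censys (T1596.005).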
An attacker who queries crt.sh for `%.target-company.com` receives a list of every subdomain for which a publicly logged certificate has been issued, including staging environments, internal tools exposed to the internet, and development systems. This common and highly effective technique reveals much of an organization's subdomain inventory without touching a single system.

Public websites, including job boards, social media platforms, and search engines, provide significant intelligence about an organization's technology stack, personnel, and operational security posture.
Employee profiles on professional networking sites such as LinkedIn reveal the email address format (`firstname.lastname@company.com` is the most common enterprise pattern), names and roles of high-value targets for spear phishing (C-suite executives, finance personnel, IT administrators, developers with cloud access), and technology stack details from job postings and employee skills sections.

Search engine operators surface exposed content directly. `site:company.com filetype:pdf` finds indexed PDF files, which may include internal documents cached by Google. `site:company.com inurl:admin` finds admin panels indexed by the crawler. `site:company.com intext:"password"` finds pages containing credential strings. `site:pastebin.com "company.com"` finds pastes referencing the organization's domain, which may include leaked credentials, API keys, or internal configuration. These searches return results without any network interaction with the target.

Identity intelligence is the highest-value category of reconnaissance data. Credentials and email addresses directly enable the next-phase attacks.
An attacker searching breach databases for @company.com credentials may find hundreds or thousands of current and former employee email/password pairs. Even if the passwords are hashed, weak passwords stored with fast or unsalted hash algorithms are often cracked quickly with GPU-based cracking tools and wordlists such as RockYou2024 (nearly 10 billion entries). The attacker's primary use of these credentials is not necessarily to try the original password, but to derive likely current passwords from the patterns observed (employees tend to use variations of their previous passwords when forced to rotate).

Network infrastructure reconnaissance maps the target's IP address space, autonomous system number (ASN), domain infrastructure, and cloud provider relationships.
Not all phishing seeks to deliver malware. T1598 covers phishing campaigns designed to collect information rather than establish a foothold.
Most reconnaissance happens outside your network. The attacker querying Shodan, browsing crt.sh, or reviewing LinkedIn generates no logs in your environment. This is the core reason that detection-centric security programs struggle with TA0043: there is often nothing to detect inside the perimeter.
The shift in posture from detection to prevention means organizations must ask: what does the attacker see when they look at us from the outside? Answering that question requires running the same reconnaissance tools against your own infrastructure that an attacker would use.
Attack Surface Management (ASM) tools: Tenable.io, Qualys TruRisk, CyCognito, Censys ASM, and similar platforms continuously monitor the internet-facing exposure of your infrastructure, discovering new assets as they are created and alerting when new services are exposed. This gives defenders the same view the attacker sees on Shodan and Censys, but with asset ownership context.
Certificate Transparency monitoring: Subscribe to certificate issuance alerts for your domains via crt.sh, Certspotter, or Facebook's CT log monitor. A new certificate naming your domain is either legitimate (internal IT provisioning a new service) or a sign of unauthorized issuance, and a new certificate for a look-alike domain (a phishing site registered as secure-companyname.com) is adversarial. Alert on every new certificate matching your domain pattern within 24 hours of issuance.
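A minimal sketch of the comparison step behind such alerts follows. It assumes the certificate data has already been fetched as a JSON array whose entries carry a newline-separated `name_value` field, which is how crt.sh's JSON output is commonly shaped; the function names and the inventory set are illustrative, not part of any tool's API.

```python
import json


def names_from_ct_json(payload: str) -> set[str]:
    """Extract unique, lower-cased DNS names from a CT search result in JSON form.

    Assumes each entry has a `name_value` field with one name per line.
    """
    names: set[str] = set()
    for entry in json.loads(payload):
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    return names


def unexpected_names(observed: set[str], inventory: set[str]) -> list[str]:
    """Return names seen in CT logs that are missing from the known inventory."""
    return sorted(observed - inventory)


sample = json.dumps([
    {"name_value": "www.example.com\nstaging.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "WWW.example.com"},
])
observed = names_from_ct_json(sample)
print(unexpected_names(observed, {"www.example.com", "*.example.com"}))
```

Each name returned by `unexpected_names` is either an asset missing from the inventory or a certificate nobody in the organization requested; both cases deserve a ticket.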
Shodan and Censys monitoring: Set up organizational monitoring in Shodan and Censys to receive alerts when new services appear on your IP ranges or when vulnerability scan matches appear for your infrastructure. Both platforms support API-based monitoring integrated into SIEM or ticketing workflows.
Passive DNS monitoring: Services like Farsight DNSDB, RiskIQ (now Microsoft Defender Threat Intelligence), and Cisco Umbrella Investigate alert on new domain registrations and DNS record changes for monitored domains and infrastructure.
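Where a feed of newly registered domains is available, a simple look-alike filter can triage it. The sketch below is one possible heuristic, not how any of the named services work: it flags a domain when its leftmost label contains the brand string or is within a small edit distance of it. The input is assumed to be registrable domains (no subdomains), and `brand`, `owned`, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def suspicious_registrations(new_domains, brand, owned, max_distance=1):
    """Return domains that embed or closely imitate brand and are not owned.

    Assumes each input is a registrable domain such as "secure-companyname.com".
    """
    brand = brand.lower()
    hits = []
    for domain in new_domains:
        name = domain.lower().rstrip(".")
        if name in owned:
            continue
        label = name.split(".")[0]
        if brand in label or edit_distance(label, brand) <= max_distance:
            hits.append(name)
    return hits
```

For example, with `brand="companyname"` and `owned={"companyname.com"}`, both `secure-companyname.com` and `c0mpanyname.com` would be flagged, while `unrelated.org` would not. Short brand names will need a stricter threshold to avoid false positives.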
HaveIBeenPwned Enterprise: Monitor your corporate email domain against new breach disclosures. HIBP's notification API alerts within hours of a new breach dataset appearing that contains your domain's email addresses. This gives IT and security teams the ability to force-rotate specific compromised credentials before attackers can use them.
Dark web monitoring: Services including Recorded Future, Digital Shadows (now part of ReliaQuest), Searchlight Cyber, and Kroll monitor darknet markets and paste sites for credential sets referencing your organization's domains. Credential sets in breach databases are the primary intelligence source attackers use for credential stuffing and password spraying. Knowing which credentials are exposed enables preemptive remediation.
Firewall logs: Detect scanning activity targeting your internet-facing IP ranges. High-rate connection attempts from a single source IP to multiple ports or multiple destination IPs (port sweep, host sweep) are scanning indicators. Alert on source IPs with no prior connection history attempting connections to more than 10 different ports or hosts within a five-minute window.
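The port-sweep and host-sweep rule can be sketched as a sliding-window count of distinct targets per source. The example assumes firewall events have already been parsed into `(timestamp_seconds, src_ip, dst_ip, dst_port)` tuples in time order; that tuple layout and the function name are illustrative. The "no prior connection history" condition is left out and would need a lookup against historical data.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300      # five-minute window
DISTINCT_THRESHOLD = 10   # more than 10 distinct ports or hosts


def find_sweeps(events, window=WINDOW_SECONDS, threshold=DISTINCT_THRESHOLD):
    """Return source IPs that hit more than `threshold` distinct ports or hosts
    within any `window`-second span. Events must be sorted by timestamp."""
    recent = defaultdict(deque)  # src_ip -> deque of (ts, dst_ip, dst_port)
    flagged = set()
    for ts, src, dst, port in events:
        queue = recent[src]
        queue.append((ts, dst, port))
        # Drop events that have fallen out of the window.
        while queue and ts - queue[0][0] > window:
            queue.popleft()
        hosts = {d for _, d, _ in queue}
        ports = {p for _, _, p in queue}
        if len(hosts) > threshold or len(ports) > threshold:
            flagged.add(src)
    return flagged
```

A source touching eleven ports in ten seconds is flagged; the same eleven ports spread one per minute are not, which shows how a slow scan evades a fixed window.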
Web server access logs: Directory enumeration (T1595.003) generates distinctive patterns: rapid sequential requests for paths that return 404, particularly from a single source IP or from a small set of IPs in the same ASN. Alert on source IPs generating more than 50 consecutive 404 responses within a two-minute window.
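The 404-streak rule can be sketched in a similar way. This example assumes access log lines have already been parsed into `(timestamp_seconds, src_ip, status)` tuples in time order; the tuple layout and function name are illustrative. Any non-404 response from a source resets its streak.

```python
from collections import defaultdict, deque


def find_enumerators(requests, streak_threshold=50, window=120):
    """Return source IPs with more than `streak_threshold` consecutive 404s
    inside a `window`-second span. Requests must be sorted by timestamp."""
    streaks = defaultdict(deque)  # src_ip -> timestamps of the current 404 run
    flagged = set()
    for ts, src, status in requests:
        run = streaks[src]
        if status != 404:
            run.clear()  # a successful or other response breaks the streak
            continue
        run.append(ts)
        # Keep only the part of the run that fits inside the window.
        while run and ts - run[0] > window:
            run.popleft()
        if len(run) > streak_threshold:
            flagged.add(src)
    return flagged
```

Tools that interleave valid paths with guesses will break the "consecutive" condition, so a ratio of 404s to total requests is a reasonable companion metric.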
Honeypots and honeytokens: Honeypots (systems that should never receive legitimate traffic) and honeytokens (fake credentials, fake documents, fake API keys that generate alerts when used) are highly effective at detecting active scanning and post-reconnaissance exploitation. A connection attempt to a honeypot IP is a high-confidence indicator because no legitimate traffic should go there.
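A minimal honeytoken sketch: generate a random fake API key, plant it where an attacker might find it, then search logs for any use of it. The `hk_live_` prefix, the function names, and the log line format are all hypothetical; real deployments usually rely on a service that alerts at the moment the token is used.

```python
import secrets


def make_honeytoken(prefix: str = "hk_live_") -> str:
    """Create a random, unguessable fake API key (prefix is hypothetical)."""
    return prefix + secrets.token_hex(16)


def honeytoken_hits(log_lines, tokens):
    """Return (line_number, token) for every log line containing a planted token."""
    hits = []
    for lineno, line in enumerate(log_lines, start=1):
        for token in tokens:
            if token in line:
                hits.append((lineno, token))
    return hits
```

Because the token has no legitimate use, any hit means someone found the planted material and tried it.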
SMTP VRFY and RCPT TO probing: Email address enumeration via SMTP probing generates logs at the mail server. Monitor for sequential RCPT TO attempts from a single source IP cycling through email address permutations (firstname.lastname, f.lastname, firstnamel, etc.). Alert on more than 20 failed RCPT TO attempts from a single source within a ten-minute window.
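The RCPT TO rule reduces to counting rejections per source in a sliding window. The sketch assumes mail server logs have already been reduced to `(timestamp_seconds, src_ip)` pairs, one per rejected recipient, in time order; extracting those pairs depends on the mail server's log format and is not shown.

```python
from collections import defaultdict, deque


def find_rcpt_probers(failures, threshold=20, window=600):
    """Return source IPs with more than `threshold` rejected RCPT TO attempts
    within any `window`-second span. Input must be sorted by timestamp."""
    recent = defaultdict(deque)  # src_ip -> timestamps of recent rejections
    flagged = set()
    for ts, src in failures:
        queue = recent[src]
        queue.append(ts)
        # Discard rejections older than the window.
        while queue and ts - queue[0] > window:
            queue.popleft()
        if len(queue) > threshold:
            flagged.add(src)
    return flagged
```

A legitimate sender with a few mistyped addresses stays well under the threshold, while a permutation sweep crosses it within seconds.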
Reconnaissance is the foundation of every targeted attack. An attacker with no information about the target is limited to opportunistic exploitation: scanning for known vulnerabilities and hoping something is unpatched. An attacker with comprehensive reconnaissance intelligence can select the specific technique most likely to succeed against the specific target, at the time of their choosing.
The value of reconnaissance intelligence compounds. Email addresses enable phishing. Credentials from breach databases enable credential stuffing. Technology stack information from job postings enables targeted exploitation. Subdomain enumeration from Certificate Transparency logs reveals staging environments and internal tools with weaker security posture. Shodan data identifies unpatched, internet-exposed services. A threat actor who has spent two weeks performing reconnaissance on a specific target before attempting initial access has a fundamentally different probability of success than one who scans opportunistically.
Volt Typhoon's documented multi-year persistence in US critical infrastructure networks did not begin with exploitation. It began with years of infrastructure reconnaissance, identifying high-value targets, mapping their network perimeters, and identifying the specific access paths (compromised SOHO routers, VPN appliances, edge devices) that would provide persistent, low-visibility access. The reconnaissance phase was as operationally significant as the access phase that followed.
The Cyber Threat Intelligence (CTI) community's tracking of threat actor infrastructure is directly enabled by the same reconnaissance techniques that attackers use against defenders. Certificate Transparency logs, passive DNS, and Shodan data allow defenders to map attacker infrastructure, link new campaigns to known threat actors, and identify attacker infrastructure before it is used against targets. Defenders who understand how reconnaissance works can apply the same techniques defensively.
The CSR methodology, "Every surface you expose is a surface we eliminate," is the direct strategic response to adversary reconnaissance. The attacker's reconnaissance phase defines the attack surface they will attempt to exploit. Reducing the information available to the adversary and minimizing the internet-exposed surface before the attacker completes their reconnaissance is the VSD objective.
VSD-R01 (External Attack Surface Discovery) is the defensive mirror of adversary reconnaissance. CDA runs the same tools and queries that attackers use, against the client's own infrastructure, to produce an accurate picture of what the attacker sees. This mission is run at the start of every engagement because organizations routinely have internet-exposed services they do not know about: forgotten cloud storage buckets, development servers with public IPs, expired but still-resolving DNS entries pointing to decommissioned infrastructure.
VSD-B03 (Attack Surface Reduction) operationalizes the findings from VSD-R01: decommissioning exposed services that are not required, removing DNS records for retired infrastructure, securing misconfigurations found via Shodan, and eliminating the unnecessary attack surface that reconnaissance would otherwise hand to an adversary. VSD-C01 (Continuous Surface Monitoring) sustains this state by continuously running the same external discovery against the organization's IP ranges and domains, alerting on new exposures as they appear.
The PDI methodology, "See the threat before it sees you," requires understanding what adversaries are collecting about your organization before they act on it. TID-R01 (Threat Landscape Assessment) establishes which threat actors are likely targeting the organization's sector and geography, and what reconnaissance techniques those actors are known to use. This assessment informs which defensive investments are highest priority.
TID-B03 (Threat Intelligence Integration) adds the data feeds that monitor adversary reconnaissance activity: certificate transparency alerts, dark web credential monitoring, domain squatting detection, and infrastructure scanning alerts. These feeds provide the earliest available indicators that a threat actor has moved from passive research to active reconnaissance of your organization.
TID-H03 (Threat Hunting Program) extends PDI into proactive hunting for evidence that reconnaissance has already occurred and preliminary access attempts are underway, including credential stuffing attempts against authentication systems, scanning signatures in web server logs, and subdomain enumeration patterns in DNS logs.
TID-C02 (Threat Intelligence Operations) is the sustained intelligence collection program that monitors adversary reconnaissance activity as an ongoing function, not a one-time assessment.
Credential exposure is the reconnaissance intelligence that most directly converts into successful initial access. The ZPA methodology, "Trust nothing. Possess nothing. Verify everything," addresses the credential exposure problem structurally. Phishing-resistant MFA (hardware security keys implementing FIDO2/WebAuthn) renders credential intelligence collected from breach databases operationally useless: knowing the password is insufficient if authentication requires a hardware token that the attacker does not possess and cannot phish.
IAT-R01 (Identity and Access Baseline) establishes the current credential exposure posture. IAT-B01 (IAM Foundation) implements the identity controls that make reconnaissance-derived credential intelligence ineffective as an attack enabler.