Reconnaissance Techniques
Definition
Reconnaissance is the phase of an attack in which the adversary gathers information about the target before taking any direct action against it. In MITRE ATT&CK, Reconnaissance is Tactic TA0043. It is the earliest phase in the attack lifecycle, and it is almost entirely invisible to the target organization.
Unlike every subsequent tactic, reconnaissance predominantly occurs outside the target's network, infrastructure, and security monitoring perimeter. The attacker searches public databases, queries DNS, enumerates certificate transparency logs, reviews LinkedIn for employee details, and scans internet-facing systems from IP addresses that have no prior relationship to the target organization. None of this generates log entries in the target's SIEM. None of it triggers endpoint detection. The target organization has no direct visibility into what information the adversary has collected about them before the first intrusion attempt begins.
This asymmetry is the central challenge of reconnaissance defense: you cannot detect what you cannot see. The appropriate defense posture shifts from detection to reduction. If the attacker cannot find the information, they cannot use it. If the attack surface they discover is minimal, the options available to them are limited. If the credentials they find in breach databases are already rotated and protected by phishing-resistant MFA, the credential intelligence they collected is useless.
ATT&CK splits pre-compromise activity between Reconnaissance (TA0043) and Resource Development (TA0042). This article focuses on TA0043, covering both active and passive information gathering techniques that directly target victim organizations. Understanding what an attacker learns about your organization before the first intrusion attempt is a prerequisite for rational defensive prioritization.
The Continuous Surface Reduction (CSR) methodology in CDA's VSD domain is the primary strategic response to adversary reconnaissance: "Every surface you expose is a surface we eliminate." The intelligence an attacker gathers during reconnaissance defines the attack surface they will target. Reducing that surface, before the attacker maps it, is the defensive objective.
How It Works
Active Scanning (T1595)
Active scanning involves the attacker directly probing internet-facing systems to discover open ports, running services, and vulnerabilities. Unlike passive reconnaissance, active scanning generates network traffic that originates from attacker-controlled infrastructure. In theory, this makes it detectable. In practice, the volume of legitimate scanning activity on the internet (search engine crawlers, security researchers, vulnerability management vendors) makes distinguishing malicious scanning from background noise difficult without significant context.
Sub-techniques cover:
- T1595.001 (Scanning IP Blocks): systematic scanning across IP address ranges to discover live hosts and open ports. Tools include Nmap, Masscan, and ZMap. Nation-state actors conduct large-scale IP block scans to build inventories of internet-exposed infrastructure before targeting specific organizations.
- T1595.002 (Vulnerability Scanning): targeted scanning for specific vulnerabilities, typically following a disclosure event. When CVE-2023-34362 (MOVEit) was disclosed, automated scanners from threat actors began probing port 443 for MOVEit installations within 48 hours. Organizations with MOVEit exposed to the internet were identified and catalogued before many had applied the patch.
- T1595.003 (Wordlist Scanning): directory and path enumeration against web applications using wordlists of common paths, file names, and parameter names. Tools such as Gobuster, DirBuster, and ffuf are used for this purpose.
The attacker's scanning generates logs on the target side: firewall connection logs, web server access logs, IDS/IPS alerts. Detection is possible but high-noise. More operationally useful is the defensive equivalent: running the same scans against your own infrastructure continuously (Attack Surface Management) and finding your exposure before attackers do.
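As a rough illustration of "running the same scans against your own infrastructure", the sketch below performs a plain TCP connect check against a host you own and compares the result with an approved baseline. The function names and the idea of an "approved" port list are assumptions made for this example; a real Attack Surface Management program would rely on a dedicated scanner or platform, and should only ever be pointed at systems you are authorized to test.

```python
import socket
from typing import Iterable, List


def open_tcp_ports(host: str, ports: Iterable[int], timeout: float = 1.0) -> List[int]:
    """Return the ports on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        try:
            # A completed handshake means something is listening on the port.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            # Refused, filtered, or timed out: treat the port as not exposed.
            continue
    return found


def unexpected_exposure(observed: Iterable[int], approved: Iterable[int]) -> List[int]:
    """Ports that are reachable but missing from the approved baseline."""
    return sorted(set(observed) - set(approved))
```

Feeding the output of `open_tcp_ports` into `unexpected_exposure` yields the list of services that are reachable from outside but that nobody has signed off on, which is the finding an attacker's scan would also produce.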
Search Open Technical Databases (T1596)
Passive reconnaissance through public technical databases requires no direct interaction with the target. The attacker queries third-party resources that have already indexed information about the target.
Sub-techniques and specific data sources:
- T1596.001 (DNS/Passive DNS): DNS records are public by design. A zone transfer (AXFR) against a misconfigured DNS server can return the entire DNS record set for a domain, including all subdomains and internal hostnames that should not be publicly visible. Even without zone transfers, DNS enumeration using brute-force subdomain lists reveals subdomains. Passive DNS databases (Farsight Security, VirusTotal passive DNS) contain historical records of DNS resolutions, revealing infrastructure that has since been decommissioned.
- T1596.002 (WHOIS): WHOIS data reveals domain registrant information, registration and expiration dates, and name server records. Privacy-protected registrations reduce the value of WHOIS for individual attribution, but registration patterns and shared registrar/nameserver infrastructure can link domains to threat actors.
- T1596.003 (Digital Certificates): Certificate Transparency (CT) logs are public records of TLS certificates issued by participating Certificate Authorities. The crt.sh search service indexes these logs and is searchable by domain. An attacker querying crt.sh for %.target-company.com receives a list of every subdomain for which a logged certificate has been issued, including staging environments, internal tools exposed to the internet, and development systems. This is a common and highly effective reconnaissance technique that reveals a large share of an organization's subdomain inventory without touching a single system.
- T1596.005 (Scan Databases): Shodan, Censys, and FOFA continuously scan the internet and index banner information, service versions, and SSL certificates from every responsive IP address. An attacker can query these databases for an organization's IP ranges and immediately see every internet-facing service, its software version, its TLS certificate, and any exposed credentials or misconfigurations visible in service banners. The information is available with no interaction with the target network. Shodan includes specific searches for industrial control systems, SCADA interfaces, VNC sessions, and default-credential devices.
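The crt.sh lookup described above can be reproduced against your own domain. The sketch below assumes crt.sh's JSON output, where each record carries a `name_value` field that may hold several newline-separated names; the function names are illustrative, and `fetch_records` makes a live request that may be slow or rate-limited.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterable, List


def crtsh_url(domain: str) -> str:
    """Build a crt.sh query URL for every name under `domain` (JSON output)."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


def fetch_records(domain: str, timeout: float = 30.0) -> list:
    """Download the matching certificate records (performs a network request)."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=timeout) as resp:
        return json.load(resp)


def subdomains_from_records(records: Iterable[dict], domain: str) -> List[str]:
    """Collect the unique hostnames under `domain` found in crt.sh records."""
    domain = domain.lower()
    names = set()
    for record in records:
        # name_value may list several names separated by newlines.
        for name in str(record.get("name_value", "")).splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # wildcard entry: keep the parent name
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)
```

Comparing the result with your asset inventory shows which hostnames are publicly discoverable but unknown to the team responsible for them.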
Search Open Websites and Domains (T1593)
Public websites, including job boards, social media platforms, and search engines, provide significant intelligence about an organization's technology stack, personnel, and operational security posture.
- T1593.001 (Social Media): LinkedIn is the single most valuable social media platform for adversary reconnaissance. A complete employee list for a target organization is constructable from LinkedIn by searching for current employees. From this list, an attacker derives: the organizational structure and reporting chains, email address patterns (LinkedIn profiles often expose email addresses, and the pattern is inferrable from known addresses: firstname.lastname@company.com is the most common enterprise pattern), names and roles of high-value targets for spear phishing (C-suite executives, finance personnel, IT administrators, developers with cloud access), and technology stack details from job postings and employee skills sections.
- T1593.002 (Search Engines): Google dorking uses advanced search operators to find indexed content that organizations did not intend to expose. These searches return results without any network interaction with the target. Common operators:
  - site:company.com filetype:pdf finds all indexed PDF files, which may contain internal documents cached by Google.
  - site:company.com inurl:admin finds admin panels indexed by the crawler.
  - site:company.com intext:"password" finds pages containing credential strings.
  - site:pastebin.com "company.com" finds pastes referencing the organization's domain, which may include leaked credentials, API keys, or internal configuration.
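For a periodic self-audit, the search operators listed above can be generated for any domain you own. This is only a minimal sketch: the template labels and the function name are made up for the example, and the queries are meant to be pasted into a search engine by a person rather than submitted automatically.

```python
from typing import Dict

# Operator templates taken from the examples in the text; labels are illustrative.
DORK_TEMPLATES = {
    "indexed_pdfs": "site:{domain} filetype:pdf",
    "admin_panels": "site:{domain} inurl:admin",
    "password_strings": 'site:{domain} intext:"password"',
    "pastes": 'site:pastebin.com "{domain}"',
}


def audit_queries(domain: str) -> Dict[str, str]:
    """Return one search query per template for the given domain."""
    return {
        label: template.format(domain=domain)
        for label, template in DORK_TEMPLATES.items()
    }
```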
Gather Victim Identity Information (T1589)
Identity intelligence is the highest-value category of reconnaissance data. Credentials and email addresses directly enable the next-phase attacks.
- T1589.001 (Credentials): breach databases contain billions of credential records from historical data breaches. Dehashed, Snusbase, and darknet markets provide paid access to credential search by email domain. An attacker searching for @company.com credentials in these databases may find hundreds or thousands of current and former employee email/password pairs. Even if the passwords are hashed, weak or unsalted hashes of common passwords fall quickly to GPU-based cracking tools and large wordlists such as RockYou2024 (roughly 10 billion entries). The attacker's primary use of these credentials is not necessarily to try the original password, but to derive likely current passwords from the patterns observed (employees tend to use variations of their previous passwords when forced to rotate).
- T1589.002 (Email Addresses): email address enumeration combines multiple sources. LinkedIn provides names, from which email patterns are inferred. Tools like Hunter.io, Phonebook.cz, and email permutation scripts validate email addresses against mail server responses (SMTP VRFY, RCPT TO probing). A valid email address list enables spear phishing campaigns that reach real employees rather than hitting delivery failures.
- T1589.003 (Employee Names and Personal Information): names, job titles, and professional history enable pretexting. A threat actor who knows that the CFO's executive assistant joined six weeks ago can craft a highly convincing pretext for a business email compromise attempt that would not be plausible without that specific knowledge.
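To see how little an attacker needs in order to build an address list, the sketch below derives candidate addresses from a name and infers which pattern a single known address follows. The pattern labels and function names are assumptions for this example, and the pattern set is deliberately small; it is useful mainly for checking how guessable your own address scheme is.

```python
from typing import Dict, List, Optional


def _local_parts(first: str, last: str) -> Dict[str, str]:
    """Map a pattern label to the local part it produces for this name."""
    f, l = first.strip().lower(), last.strip().lower()
    if not f or not l:
        raise ValueError("first and last name must be non-empty")
    return {
        "first.last": f"{f}.{l}",
        "f.last": f"{f[0]}.{l}",
        "firstl": f"{f}{l[0]}",
        "flast": f"{f[0]}{l}",
        "firstlast": f"{f}{l}",
    }


def candidate_addresses(first: str, last: str, domain: str) -> List[str]:
    """List the addresses an attacker would try for one employee."""
    return [f"{local}@{domain.lower()}" for local in _local_parts(first, last).values()]


def infer_pattern(known_address: str, first: str, last: str) -> Optional[str]:
    """Return the pattern label that a known address follows, if any."""
    local = known_address.split("@", 1)[0].lower()
    for label, candidate in _local_parts(first, last).items():
        if candidate == local:
            return label
    return None
```

Once one address confirms the pattern, every name harvested from social media maps to a probable mailbox, which is why mail servers that confirm or deny recipients make enumeration so cheap.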
Gather Victim Network Information (T1590)
Network infrastructure reconnaissance maps the target's IP address space, autonomous system number (ASN), domain infrastructure, and cloud provider relationships.
- T1590.001 (Domain Properties): WHOIS, passive DNS, and CT logs combine to map the full domain and subdomain inventory.
- T1590.002 (DNS): name server records (NS records) identify the DNS hosting provider and reveal DNS infrastructure. MX records identify the mail provider (Google Workspace vs. Microsoft 365 vs. on-premises Exchange has different attack implications). SPF and DMARC records reveal email security posture.
- T1590.004 (Network Topology): BGP routing data (from the University of Oregon Route Views project and RIPE RIS) maps the ASN and IP prefix ownership. Hurricane Electric's BGP toolkit provides public AS path and prefix data. This reveals cloud provider relationships: when an organization's hostnames resolve to addresses inside prefixes originated by Amazon's ASNs rather than its own, that infrastructure is hosted on AWS.
- T1590.005 (IP Addresses): ARIN, RIPE, and IANA WHOIS data map IP blocks to organizational owners. Combined with Shodan scan data, this provides a complete picture of internet-facing infrastructure.
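The DNS-derived signals mentioned above can be read straight from record contents. The following sketch assumes the MX hostnames and the DMARC TXT value have already been retrieved with a resolver of your choice; the hostname suffixes are a heuristic for two common providers only, and the function names are illustrative.

```python
from typing import Dict, Iterable


def mail_provider(mx_hosts: Iterable[str]) -> str:
    """Guess the mail provider from MX hostnames (heuristic, not exhaustive)."""
    for host in mx_hosts:
        h = host.rstrip(".").lower()
        if h.endswith(".google.com") or h.endswith(".googlemail.com"):
            return "Google Workspace"
        if h.endswith(".mail.protection.outlook.com"):
            return "Microsoft 365"
    return "other or self-hosted"


def parse_dmarc(txt: str) -> Dict[str, str]:
    """Split a DMARC TXT value into its tags; return {} if it is not DMARC."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        return {}
    return tags


def dmarc_enforced(txt: str) -> bool:
    """True when the published policy asks receivers to quarantine or reject."""
    return parse_dmarc(txt).get("p", "none").lower() in {"quarantine", "reject"}
```

A policy of p=none tells an outside observer that spoofed mail from the domain is only being reported, not blocked, which is exactly the kind of posture detail this technique collects.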
Phishing for Information (T1598)
Not all phishing seeks to deliver malware. T1598 covers phishing campaigns designed to collect information rather than establish a foothold.
- T1598.001 (Spearphishing Service): information-seeking messages sent through third-party services such as social media platforms or personal webmail rather than corporate email. The information sought may be the victim's email address, their manager's name, or internal system names that enable subsequent attacks.
- T1598.003 (Spearphishing Link): credential harvesting pages that mimic corporate login portals. The attacker seeks the user's credentials, not the installation of malware. From the victim's perspective, this may be indistinguishable from a legitimate login. From the detection perspective, the credential submission to a non-corporate domain is the indicator.
- T1598.004 (Spearphishing Voice): pretexting calls to help desks, IT support, or customer service. A threat actor posing as an employee calling to reset their password is conducting T1598.004.
Detection
The Fundamental Detection Challenge
Most reconnaissance happens outside your network. The attacker querying Shodan, browsing crt.sh, or reviewing LinkedIn generates no logs in your environment. This is the core reason that detection-centric security programs struggle with TA0043: there is often nothing to detect inside the perimeter.
The shift in posture from detection to prevention means organizations must ask: what does the attacker see when they look at us from the outside? Answering that question requires running the same reconnaissance tools against your own infrastructure that an attacker would use.
External Exposure Monitoring
Attack Surface Management (ASM) tools: Tenable.io, Qualys TruRisk, CyCognito, Censys ASM, and similar platforms continuously monitor the internet-facing exposure of your infrastructure, discovering new assets as they are created and alerting when new services are exposed. This gives defenders the same view the attacker sees on Shodan and Censys, but with asset ownership context.
Certificate Transparency monitoring: Subscribe to certificate issuance alerts for your domains via a CT monitoring service such as Cert Spotter, or poll crt.sh on a schedule. A new certificate for one of your own domains is either legitimate (internal IT provisioning a new service) or an unauthorized issuance that needs investigation. Phishing infrastructure usually appears under lookalike domains instead (for example secure-companyname.com), so monitor for names that contain your brand as well as for your exact domains. Review every new match within 24 hours.
Shodan and Censys monitoring: Set up organizational monitoring in Shodan and Censys to receive alerts when new services appear on your IP ranges or when vulnerability scan matches appear for your infrastructure. Both platforms support API-based monitoring integrated into SIEM or ticketing workflows.
Passive DNS monitoring: Services like Farsight DNSDB, RiskIQ (now Microsoft Defender Threat Intelligence), and Cisco Umbrella Investigate alert on new domain registrations and DNS record changes for monitored domains and infrastructure.
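A minimal triage step for the certificate alerts described above might look like the sketch below. It assumes you already receive newly logged certificate names from a monitoring feed; the category labels, the brand-substring heuristic, and the `expected` inventory set are assumptions for illustration and will produce false positives for short or generic brand strings.

```python
from typing import Iterable, List, Set


def classify_cert_name(name: str, own_domain: str, brand: str) -> str:
    """Classify a certificate name as 'own', 'lookalike', or 'unrelated'."""
    n = name.strip().lower().rstrip(".")
    if n.startswith("*."):
        n = n[2:]  # wildcard entry: judge by the parent name
    own = own_domain.lower()
    if n == own or n.endswith("." + own):
        return "own"
    if brand.lower() in n:
        return "lookalike"
    return "unrelated"


def needs_review(names: Iterable[str], own_domain: str, brand: str,
                 expected: Set[str]) -> List[str]:
    """Names to investigate: lookalikes, plus own names missing from inventory."""
    flagged = []
    for name in names:
        kind = classify_cert_name(name, own_domain, brand)
        if kind == "lookalike":
            flagged.append(name)
        elif kind == "own" and name.lower() not in expected:
            flagged.append(name)
    return flagged
```

The second case matters as much as the first: a certificate for an unknown host under your own domain often points to shadow IT or a forgotten environment rather than an attacker.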
Credential Exposure Monitoring
HaveIBeenPwned Enterprise: Monitor your corporate email domain against new breach disclosures. HIBP's notification API alerts within hours of a new breach dataset appearing that contains your domain's email addresses. This gives IT and security teams the ability to force-rotate specific compromised credentials before attackers can use them.
Dark web monitoring: Services including Recorded Future, Digital Shadows (now part of ReliaQuest), Searchlight Cyber, and Kroll monitor darknet markets and paste sites for credential sets referencing your organization's domains. Credential sets in breach databases are the primary intelligence source attackers use for credential stuffing and password spraying. Knowing which credentials are exposed enables preemptive remediation.
Network-Level Detection (Limited Scope)
Firewall logs: Detect scanning activity targeting your internet-facing IP ranges. High-rate connection attempts from a single source IP to multiple ports or multiple destination IPs (port sweep, host sweep) are scanning indicators. Alert on source IPs with no prior connection history attempting connections to more than 10 different ports or hosts within a five-minute window.
Web server access logs: Directory enumeration (T1595.003) generates distinctive patterns: rapid sequential requests for paths that return 404, particularly from a single source IP or from a small set of IPs in the same ASN. Alert on source IPs generating more than 50 consecutive 404 responses within a two-minute window.
Honeypots and honeytokens: Internal honeypots (systems that should never receive legitimate traffic) and honeytokens (fake credentials, fake documents, fake API keys that generate alerts when used) are highly effective at detecting active scanning and post-reconnaissance exploitation. A connection attempt to a honeynet IP is unambiguously malicious because no legitimate traffic should go there.
SMTP VRFY and RCPT TO probing: Email address enumeration via SMTP probing generates logs at the mail server. Monitor for sequential RCPT TO attempts from a single source IP cycling through email address permutations (firstname.lastname, f.lastname, firstnamel, etc.). Alert on more than 20 failed RCPT TO attempts from a single source within a ten-minute window.
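The thresholds in this section share one shape: count distinct targets per source inside a sliding time window. The sketch below applies that shape to the firewall rule (more than 10 distinct ports or hosts within five minutes). The event tuple layout and function name are assumptions, events must arrive sorted by time, and the "no prior connection history" condition from the text is left out for brevity.

```python
from collections import defaultdict, deque
from typing import Iterable, Set, Tuple

# (epoch seconds, source address, target such as "host:port")
Event = Tuple[float, str, str]


def sweep_sources(events: Iterable[Event], max_targets: int = 10,
                  window_seconds: float = 300) -> Set[str]:
    """Sources that touched more than `max_targets` distinct targets in a window."""
    recent = defaultdict(deque)  # source -> deque of (timestamp, target)
    flagged = set()
    for ts, source, target in events:
        window = recent[source]
        window.append((ts, target))
        # Drop observations that have aged out of the sliding window.
        while ts - window[0][0] > window_seconds:
            window.popleft()
        if len({t for _, t in window}) > max_targets:
            flagged.add(source)
    return flagged
```

The same function can approximate the RCPT TO rule by passing the attempted recipient as the target with `max_targets=20` and `window_seconds=600`; the 404 rule would additionally need to filter events by response status before counting.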
Windows Event IDs
- Event ID 4625 (Failed Logon): large volumes of failed logons from a single source or targeting multiple accounts in a short window indicate credential stuffing (testing breach database credentials) or password spraying. Alert thresholds: more than 10 failed logons for a single account in five minutes, or more than 50 failed logons across 10+ accounts from a single source in 15 minutes.
- Event ID 4648 (Logon with Explicit Credentials): may indicate credential testing or enumeration by an attacker who has obtained credentials and is testing their validity.
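The two Event ID 4625 thresholds above can be expressed as sliding-window checks over failed-logon records. This sketch assumes the events have already been exported from the log pipeline as time-sorted tuples; the tuple layout and function names are illustrative and do not correspond to any particular SIEM's query language.

```python
from collections import defaultdict, deque
from typing import Iterable, Set, Tuple

# (epoch seconds, source address, target account) for each failed logon
Failure = Tuple[float, str, str]


def brute_forced_accounts(failures: Iterable[Failure], max_failures: int = 10,
                          window_seconds: float = 300) -> Set[str]:
    """Accounts with more than `max_failures` failed logons inside the window."""
    recent = defaultdict(deque)  # account -> timestamps
    flagged = set()
    for ts, _source, account in failures:
        window = recent[account]
        window.append(ts)
        while ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_failures:
            flagged.add(account)
    return flagged


def spraying_sources(failures: Iterable[Failure], max_failures: int = 50,
                     min_accounts: int = 10,
                     window_seconds: float = 900) -> Set[str]:
    """Sources with many failures spread across many accounts inside the window."""
    recent = defaultdict(deque)  # source -> (timestamp, account)
    flagged = set()
    for ts, source, account in failures:
        window = recent[source]
        window.append((ts, account))
        while ts - window[0][0] > window_seconds:
            window.popleft()
        accounts = {a for _, a in window}
        if len(window) > max_failures and len(accounts) >= min_accounts:
            flagged.add(source)
    return flagged
```

Keeping the two checks separate reflects the different attacker behavior: credential stuffing and brute force concentrate on individual accounts, while spraying stays under per-account lockout limits by spreading attempts widely.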
Why It Matters
Reconnaissance is the foundation of every targeted attack. An attacker with no information about the target is limited to opportunistic exploitation: scanning for known vulnerabilities and hoping something is unpatched. An attacker with comprehensive reconnaissance intelligence can select the specific technique most likely to succeed against the specific target, at the time of their choosing.
The value of reconnaissance intelligence compounds. Email addresses enable phishing. Credentials from breach databases enable credential stuffing. Technology stack information from job postings enables targeted exploitation. Subdomain enumeration from Certificate Transparency logs reveals staging environments and internal tools with weaker security posture. Shodan data identifies unpatched, internet-exposed services. A threat actor who has spent two weeks performing reconnaissance on a specific target before attempting initial access has a fundamentally different probability of success than one who scans opportunistically.
Volt Typhoon's documented multi-year persistence in US critical infrastructure networks did not begin with exploitation. It began with years of infrastructure reconnaissance, identifying high-value targets, mapping their network perimeters, and identifying the specific access paths (compromised SOHO routers, VPN appliances, edge devices) that would provide persistent, low-visibility access. The reconnaissance phase was as operationally significant as the access phase that followed.
The Cyber Threat Intelligence (CTI) community's tracking of threat actor infrastructure is directly enabled by the same reconnaissance techniques that attackers use against defenders. Certificate Transparency logs, passive DNS, and Shodan data allow defenders to map attacker infrastructure, link new campaigns to known threat actors, and identify attacker infrastructure before it is used against targets. Defenders who understand how reconnaissance works can apply the same techniques defensively.
CDA Perspective
VSD: Continuous Surface Reduction
The CSR methodology, "Every surface you expose is a surface we eliminate," is the direct strategic response to adversary reconnaissance. The attacker's reconnaissance phase defines the attack surface they will attempt to exploit. Reducing the information available to the adversary and minimizing the internet-exposed surface before the attacker completes their reconnaissance is the VSD objective.
VSD-R01 (External Attack Surface Discovery) is the defensive mirror of adversary reconnaissance. CDA runs the same tools and queries that attackers use, against the client's own infrastructure, to produce an accurate picture of what the attacker sees. This mission is run at the start of every engagement because organizations routinely have internet-exposed services they do not know about: forgotten cloud storage buckets, development servers with public IPs, expired but still-resolving DNS entries pointing to decommissioned infrastructure.
VSD-B03 (Attack Surface Reduction) operationalizes the findings from VSD-R01: decommissioning exposed services that are not required, removing DNS records for retired infrastructure, securing misconfigurations found via Shodan, and eliminating the unnecessary attack surface that reconnaissance would otherwise hand to an adversary. VSD-C01 (Continuous Surface Monitoring) sustains this state by continuously running the same external discovery against the organization's IP ranges and domains, alerting on new exposures as they appear.
TID: Predictive Defense Intelligence
The PDI methodology, "See the threat before it sees you," requires understanding what adversaries are collecting about your organization before they act on it. TID-R01 (Threat Landscape Assessment) establishes which threat actors are likely targeting the organization's sector and geography, and what reconnaissance techniques those actors are known to use. This assessment informs which defensive investments are highest priority.
TID-B03 (Threat Intelligence Integration) adds the data feeds that monitor adversary reconnaissance activity: certificate transparency alerts, dark web credential monitoring, domain squatting detection, and infrastructure scanning alerts. These feeds provide the earliest available indicators that a threat actor has moved from passive research to active reconnaissance of your organization.
TID-H03 (Threat Hunting Program) extends PDI into proactive hunting for evidence that reconnaissance has already occurred and preliminary access attempts are underway, including credential stuffing attempts against authentication systems, scanning signatures in web server logs, and subdomain enumeration patterns in DNS logs.
TID-C02 (Threat Intelligence Operations) is the sustained intelligence collection program that monitors adversary reconnaissance activity as an ongoing function, not a one-time assessment.
IAT: Zero Possession Architecture
Credential exposure is the reconnaissance intelligence that most directly converts into successful initial access. The ZPA methodology, "Trust nothing. Possess nothing. Verify everything," addresses the credential exposure problem structurally. Phishing-resistant MFA (hardware security keys implementing FIDO2/WebAuthn) renders credential intelligence collected from breach databases operationally useless: knowing the password is insufficient if authentication requires a hardware token that the attacker does not possess and cannot phish.
IAT-R01 (Identity and Access Baseline) establishes the current credential exposure posture. IAT-B01 (IAM Foundation) implements the identity controls that make reconnaissance-derived credential intelligence ineffective as an attack enabler.
Key Takeaways
- Reconnaissance (TA0043) occurs almost entirely outside the target's network. Traditional detection-centric defenses have minimal visibility into attacker reconnaissance activity.
- Certificate Transparency logs (crt.sh) expose every subdomain with a TLS certificate to any attacker who knows to look. Monitor your own domain for new certificate issuances.
- Shodan and Censys index your internet-facing services and their versions without any interaction from you. Every unpatched internet-facing service is available for attacker targeting before you know it exists.
- LinkedIn is an adversary reconnaissance platform. Job postings reveal technology stack. Organizational charts reveal phishing targets. Email patterns enable address enumeration.
- Breach databases contain credential intelligence that attackers convert into credential stuffing and password spraying attacks. HIBP monitoring and dark web credential tracking are the defensive responses.
- The appropriate primary defense is surface reduction, not detection: reduce what the attacker can find, and the intelligence they collect becomes less useful.
- VSD-R01 (External Attack Surface Discovery) is the defensive equivalent of adversary reconnaissance. Running it before attackers do is the most direct operationalization of the CSR methodology.
- Phishing-resistant MFA eliminates the attack path that credential reconnaissance enables. Credential intelligence from breach databases has no value against an organization where every authentication requires a hardware security key.
Sources
- MITRE ATT&CK: Reconnaissance (TA0043). https://attack.mitre.org/tactics/TA0043/
- CISA Advisory AA23-144A: Volt Typhoon Targeting US Critical Infrastructure. https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-144a
- Certificate Transparency Logs: crt.sh. https://crt.sh/
- Shodan: The Internet of Things Search Engine. https://www.shodan.io/
- MITRE ATT&CK: Gather Victim Identity Information (T1589). https://attack.mitre.org/techniques/T1589/
- Have I Been Pwned: https://haveibeenpwned.com/
- SANS Institute: Open Source Intelligence (OSINT) for Penetration Testing. https://www.sans.org/white-papers/34871/
- CISA: Reducing the Significant Risk of Known Exploited Vulnerabilities. https://www.cisa.gov/known-exploited-vulnerabilities-catalog
Written by Evan Morgan