# Dependency Confusion and Supply Chain Attacks on Package Managers

## Definition

A dependency confusion attack is a supply chain attack technique that exploits how package managers resolve package names when both private and public registries are configured. The attacker registers a malicious package on a public registry (such as npmjs.com or PyPI) using the same name as a private internal package used by a target organization. When the package manager searches for the internal package, it discovers the attacker's public version, often preferring it due to higher version numbers or default registry priority rules, and downloads and executes the malicious code.

The attack requires no credentials, no access to the target's network, and no exploitation of a software vulnerability. It abuses a configuration assumption baked into how modern package managers were designed, making it particularly dangerous. A single registered package name can affect every developer and CI/CD system in a target organization simultaneously.

Dependency confusion sits at the intersection of two broader threat categories: software supply chain attacks (compromising software before it reaches a victim's environment) and configuration abuse (exploiting incorrect or insecure defaults rather than finding new bugs). Organizations that use private package registries for internal code and have not explicitly locked their registry configuration are exposed to this class of attack.

---

## How It Works

### The Attack Mechanism

Modern software development depends on package managers. A Node.js project declares its dependencies in package.json; a Python project uses requirements.txt or pyproject.toml; a .NET project uses packages.config or .csproj references. When a developer runs npm install, pip install, or dotnet restore, the package manager resolves each named dependency by querying one or more package registries.

Organizations frequently maintain private registries to host internal packages not suitable for public distribution. A company might host a private npm registry at registry.internal.company.com containing packages like company-api-client, company-auth-helpers, or company-utilities. These names are not registered anywhere publicly because there was no reason to register them.

The dependency confusion attack proceeds as follows:

1. An attacker identifies internal package names. These names often appear in job postings, open source repositories, accidentally committed configuration files, npm package.json files in public GitHub repositories, or in requirements.txt files left in public repos by developers who forgot to scrub internal dependencies.

2. The attacker registers packages with those exact names on the public registry (npmjs.com, PyPI, RubyGems.org, NuGet.org). The attacker sets the version number higher than whatever the organization's internal version likely is (e.g., version 9.9.9 when the internal package is version 1.2.3).

3. When a developer runs npm install on a machine or in a CI/CD pipeline configured to search both the private and public registry, the package manager finds two candidates: the legitimate internal package at version 1.2.3 on the private registry, and the attacker's malicious package at version 9.9.9 on the public registry. Most package managers prefer the higher version number by default.

4. The malicious package is downloaded and installed. npm, pip, and many other ecosystems execute scripts during installation (npm's preinstall, install, and postinstall scripts; Python's setup.py). This gives the attacker code execution on the developer's machine or CI/CD runner before the developer or any automated system has inspected the package contents.

5. The attacker's payload can exfiltrate environment variables, SSH keys, AWS credentials, build secrets, or source code. Because installation-time scripts run with the same permissions as the build process, the blast radius can be significant.
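The resolution step can be illustrated with a toy model. The Python sketch below is not how npm or pip is implemented: `naive_resolve`, `Candidate`, and the in-memory registry dictionaries are hypothetical names introduced for illustration, and versions are assumed to be plain dotted integers. It only shows why a public 9.9.9 beats a private 1.2.3 once candidates from every source are merged and ranked by version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: str
    registry: str  # label of the registry where this candidate was found


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3). Assumes plain dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(name: str, registries: dict[str, dict[str, list[str]]]) -> Candidate:
    """Pick the highest version of `name` across every configured registry."""
    candidates = [
        Candidate(name, version, label)
        for label, index in registries.items()
        for version in index.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any registry")
    # No registry has priority here: the highest version wins.
    return max(candidates, key=lambda c: version_key(c.version))


# Hypothetical registry contents mirroring the example in the text.
registries = {
    "private": {"company-api-client": ["1.2.3"]},
    "public": {"company-api-client": ["9.9.9"]},
}
chosen = naive_resolve("company-api-client", registries)
print(chosen.registry, chosen.version)  # public 9.9.9
```

Removing the "public" entry, or resolving internal names only against the private registry, is enough to make the legitimate 1.2.3 the only candidate.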

### Alex Birsan's 2021 Research

Security researcher Alex Birsan publicly documented and demonstrated dependency confusion attacks in February 2021 in a widely circulated Medium post. Birsan identified internal package names from public sources for Apple, Microsoft, PayPal, Shopify, Netflix, Yelp, Tesla, and other major organizations, more than 35 in total. He registered packages with those names on npm, PyPI, and RubyGems, each containing a payload that performed basic system fingerprinting and sent the data to a unique subdomain he controlled (allowing him to identify which organization's system had executed the package).

The packages installed successfully on internal systems at multiple major companies. His payload was deliberately non-destructive, and he coordinated disclosure with affected vendors. The bug bounty rewards totaled more than $130,000 across programs, with individual awards reaching $30,000 from companies including Apple. The research triggered industry-wide reassessment of private registry security configurations.

What made Birsan's research particularly striking was the source of the package name intelligence: public GitHub repositories, public npm packages that had accidentally included internal dependencies, and job postings that mentioned internal tooling names. No internal system access was required for reconnaissance.

---

## Why It Matters

### Scale of Exposure

Dependency confusion is not a theoretical risk. Real attacks have exploited this vector against production systems at major organizations. The attack is asymmetric: a single attacker spending a few hours registering package names can simultaneously compromise the build environments of hundreds of developers at a target company.

Build environments are high-value targets. CI/CD runners typically hold cloud provider credentials (AWS access keys, GCP service account tokens), code signing certificates, deployment keys, and access to production infrastructure. A malicious package executed in a CI/CD environment is in an excellent position to exfiltrate all of these.

### Trust Assumptions

The attack works because developers and organizations implicitly trust the installation process. When npm install completes successfully, developers assume they received the packages they asked for. The idea that a routine dependency installation could silently pull a malicious package from a public registry is counterintuitive to developers who have long treated npm install as a safe, mechanical operation.

This trust assumption is a systemic problem, not a user error. Package manager and registry defaults were designed for convenience, not adversarial environments. Two common behaviors are exploitable: falling back to the public registry when a name is not found on the private one, which many private registry proxies do for unscoped names, and merging candidates from every configured source and preferring the highest version, which pip does when extra-index-url is set.

### Persistence After Initial Compromise

Dependency confusion payloads execute on installation. Once the initial exfiltration occurs, the attacker may have obtained credentials sufficient to maintain persistent access through normal authentication channels, meaning the package itself is no longer needed. Detection becomes harder because the malicious activity (credential use from an unusual location) looks like a legitimate, authenticated operation.

---

## Technical Details

### Defense Mechanisms

Namespace scoping (npm). npm supports scoped packages: @company/package-name. Publishing under a scope on npmjs.com requires membership in the user or organization account that owns that scope. If an organization registers @acme on npmjs.com, no attacker can publish @acme/internal-tool to the public registry without controlling the @acme organization account. This is the most complete defense for npm-dependent projects: convert all internal packages to scoped names under a namespace the organization owns on the public registry.
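A migration to scoped names usually starts with finding internal dependencies that are still unscoped. The following Python sketch assumes the organization keeps its own list of internal package names; the function name `unscoped_internal_dependencies` and the sample manifest are hypothetical.

```python
import json

# Manifest sections that can pull packages in at install time.
DEPENDENCY_SECTIONS = ("dependencies", "devDependencies", "optionalDependencies")


def unscoped_internal_dependencies(package_json_text: str, internal_names: set[str]) -> list[str]:
    """Return declared internal dependencies whose names carry no @scope."""
    manifest = json.loads(package_json_text)
    declared: set[str] = set()
    for section in DEPENDENCY_SECTIONS:
        declared.update(manifest.get(section, {}))
    return sorted(
        name for name in declared
        if name in internal_names and not name.startswith("@")
    )


# Hypothetical package.json for illustration.
sample_manifest = json.dumps({
    "name": "billing-service",
    "dependencies": {
        "company-auth-helpers": "^1.2.0",   # internal, unscoped: claimable publicly
        "@acme/internal-tool": "^2.0.0",    # internal, scoped
        "express": "^4.18.0",               # ordinary public package
    },
})
flagged = unscoped_internal_dependencies(
    sample_manifest, {"company-auth-helpers", "company-utilities"}
)
print(flagged)  # ['company-auth-helpers']
```

Each flagged name is a candidate for renaming under the organization's scope.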

Registry configuration lockdown. Package managers allow explicit control over which registry serves which package namespace. In npm, .npmrc configuration supports a scoped registry directive: @company:registry=https://registry.internal.company.com. This tells npm to always fetch @company/* packages from the private registry, never from the public one. For non-scoped internal packages, a broader directive can force all package resolution through the private registry, with the private registry configured to proxy the public registry for legitimate open source packages. This converts the private registry into the sole resolution point, eliminating the ambiguity that dependency confusion exploits.
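Whether a scope is actually pinned can be checked mechanically. The sketch below parses .npmrc text as simple key=value lines, which is an assumption that ignores environment variable expansion and the layering of user, project, and global config files. The helper names `parse_npmrc` and `scope_is_pinned` are hypothetical; the registry URL is the example used above.

```python
def parse_npmrc(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blanks and ';' or '#' comments."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def scope_is_pinned(npmrc_text: str, scope: str, expected_registry: str) -> bool:
    """True if `<scope>:registry` is set to the expected private registry."""
    configured = parse_npmrc(npmrc_text).get(f"{scope}:registry", "")
    # Compare without trailing slashes so both URL spellings match.
    return configured.rstrip("/") == expected_registry.rstrip("/")


example_npmrc = """\
; project-level .npmrc
@company:registry=https://registry.internal.company.com/
"""
print(scope_is_pinned(example_npmrc, "@company", "https://registry.internal.company.com"))
# True
```

A check like this fits naturally into a pre-build step that fails when the scoped registry line is missing or points elsewhere.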

For pip, pip.conf supports index-url and extra-index-url directives. The distinction is easy to misread: index-url replaces the default index (public PyPI), while extra-index-url adds further indexes, and pip gives none of them priority. It gathers candidates from every configured index and installs the best matching version, so a higher version on public PyPI wins over a lower one on the private registry. Any configuration in which pip can see both the private registry and public PyPI, typically through extra-index-url, is exposed. The safer configuration sets index-url to a private registry that proxies public packages and omits extra-index-url, so pip never queries the public PyPI index directly.
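Flagging extra-index-url in a pip configuration file takes only a few lines. This is a minimal sketch using Python's standard configparser module; the function name `find_extra_index_urls` and the private index URL are hypothetical, and it inspects only the file text it is given, not environment variables such as PIP_EXTRA_INDEX_URL or command-line flags.

```python
import configparser


def find_extra_index_urls(pip_conf_text: str) -> list[tuple[str, str]]:
    """Return (section, url) pairs for every extra-index-url entry."""
    # Interpolation is disabled so '%' characters in URLs are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pip_conf_text)
    findings: list[tuple[str, str]] = []
    for section in parser.sections():
        if parser.has_option(section, "extra-index-url"):
            # Several URLs may be listed, separated by whitespace or newlines.
            for url in parser.get(section, "extra-index-url").split():
                findings.append((section, url))
    return findings


example_pip_conf = """\
[global]
index-url = https://pypi.internal.company.com/simple
extra-index-url = https://pypi.org/simple
"""
findings = find_extra_index_urls(example_pip_conf)
print(findings)  # [('global', 'https://pypi.org/simple')]
```

An empty result does not prove the setup is safe, since the same option can arrive through other channels, but a non-empty one is worth reviewing.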

Package name reservation. Organizations can defensively register their internal package names on public registries, publishing intentionally empty or placeholder packages. This prevents an attacker from claiming those names. Combined with monitoring for unauthorized publishes under the organization's account, this removes the public registry registration vector. The NTIA and GitHub both recommend this approach as a complement to registry configuration hardening.
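Before reserving names, it helps to know which internal names are still unclaimed publicly. The sketch below queries PyPI's JSON endpoint (https://pypi.org/pypi/NAME/json) and treats an HTTP 404 as "unclaimed", which is a simplifying assumption. The function names are hypothetical, and the status fetcher is injectable so the example can run offline with a stub.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request (performs real network I/O)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def unclaimed_on_pypi(
    names: Iterable[str],
    fetch_status: Callable[[str], int] = http_status,
) -> list[str]:
    """Return the names for which PyPI reports no project (HTTP 404)."""
    unclaimed = []
    for name in names:
        if fetch_status(f"https://pypi.org/pypi/{name}/json") == 404:
            unclaimed.append(name)
    return unclaimed


def fake_status(url: str) -> int:
    # Offline stand-in for demonstration: pretend only one name is taken.
    return 200 if url.endswith("/company-api-client/json") else 404


print(unclaimed_on_pypi(["company-api-client", "company-utilities"], fake_status))
# ['company-utilities']
```

Calling `unclaimed_on_pypi(names)` without a second argument uses the real HTTP fetcher. A name that already exists publicly and is not owned by the organization deserves immediate investigation.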

Dependency integrity verification. package-lock.json (npm) and yarn.lock (Yarn) record the exact resolved version and integrity hash (SHA-512) for every installed package. When npm ci is used instead of npm install, npm refuses to install any package that does not match the lockfile exactly. An attacker's malicious package at version 9.9.9 would not match a lockfile that recorded version 1.2.3 with a specific hash. Maintaining strict lockfile discipline and using npm ci in CI/CD pipelines is a meaningful mitigation. Python's pip-tools and Poetry provide comparable lockfile functionality.
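The integrity value in a lockfile is a Subresource Integrity style string: the algorithm name, a hyphen, and the base64-encoded digest. The following sketch recomputes that string for a tarball's bytes and compares it with the recorded value. It assumes a single sha512 entry per package, and the function names are hypothetical.

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Build an integrity string of the form 'sha512-<base64 digest>'."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(data: bytes, recorded: str) -> bool:
    """True if the bytes hash to the integrity string stored in the lockfile."""
    return sri_sha512(data) == recorded


# Stand-in bytes; a real check would read the downloaded .tgz file.
legitimate = b"contents of the internal 1.2.3 tarball"
recorded = sri_sha512(legitimate)

print(matches_integrity(legitimate, recorded))                       # True
print(matches_integrity(b"contents of attacker's 9.9.9", recorded))  # False
```

A substituted package differs in both version and content, so it cannot satisfy a hash recorded for the legitimate tarball.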

Private registry access controls. Internal packages should require authentication to download. A CI/CD system configured with a valid token to the private registry, combined with a configuration that routes all internal package names to that registry, substantially reduces the attack surface. Unauthenticated private registries that rely on network isolation (VPN-only access) provide weaker protection because CI/CD systems that have outbound internet access may still contact public registries.

### Real-World Attacks After Birsan

PyTorch-nightly dependency confusion (December 2022). Attackers registered a package named torchtriton on PyPI. The PyTorch project distributed nightly builds from its own package index, and torchtriton was a dependency of the nightly distribution hosted there. Because pip preferred the copy on PyPI, Linux users who installed PyTorch nightly via pip between December 25 and December 30, 2022 received the malicious torchtriton instead of the legitimate package. The malicious package contained a binary that exfiltrated SSH private keys, bash history, and environment variables from affected machines. The PyTorch team responded by renaming the dependency to pytorch-triton and registering a placeholder package under that name on PyPI.

GitHub Actions supply chain attacks. Several documented incidents involved dependency confusion-style attacks against GitHub Actions workflows. Malicious packages published to npm with names similar to internal dependencies of popular CI/CD toolchains were designed to execute during automated build processes.

---

## CDA Perspective

Dependency confusion attacks belong to the VSD domain because they represent an attack surface introduced by software supply chain design: specifically, the way organizations configure package manager resolution in environments that bridge private internal registries and public ecosystems. Each unscoped, unguarded internal package name is a surface that an attacker can claim.

CDA's Continuous Surface Reduction methodology addresses this surface through several controls:

Registry configuration hardening is part of the CSR baseline for any client that operates a private package registry. The audit checks for improper extra-index-url usage in Python projects, absence of scoped namespaces in npm projects, and the presence of integrity-locked lockfiles in CI/CD pipelines.

Supply chain surface mapping, as part of initial engagement scoping under the VSD domain, includes enumerating all package registries in use, identifying internal package names, and assessing whether those names are defensively reserved on public registries.

Pipeline isolation is a complementary control: CI/CD runners should have network egress filtered so that only approved registry endpoints are reachable. A runner that cannot reach npmjs.com directly cannot be confused by a package registered there, regardless of local configuration.
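Egress filtering is enforced at the network layer, but the same allowlist can also be checked statically against a lockfile before a build runs. The sketch below assumes the package-lock.json layout used by lockfileVersion 2 and 3, where each entry under "packages" may carry a "resolved" download URL. The function name `unapproved_sources` and the sample lockfile are hypothetical.

```python
import json
from urllib.parse import urlparse


def unapproved_sources(lockfile_text: str, approved_hosts: set[str]) -> list[tuple[str, str]]:
    """Return (package path, host) for entries resolved from unapproved hosts."""
    lock = json.loads(lockfile_text)
    findings: list[tuple[str, str]] = []
    for path, entry in lock.get("packages", {}).items():
        resolved = entry.get("resolved")
        if not resolved:
            continue  # the root project and local links have no download URL
        host = urlparse(resolved).hostname or ""
        if host not in approved_hosts:
            findings.append((path, host))
    return findings


# Hypothetical lockfile contents for illustration.
sample_lock = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "app", "version": "1.0.0"},
        "node_modules/@company/auth": {
            "version": "1.2.3",
            "resolved": "https://registry.internal.company.com/@company/auth/-/auth-1.2.3.tgz",
        },
        "node_modules/company-utilities": {
            "version": "9.9.9",
            "resolved": "https://registry.npmjs.org/company-utilities/-/company-utilities-9.9.9.tgz",
        },
    },
})
print(unapproved_sources(sample_lock, {"registry.internal.company.com"}))
# [('node_modules/company-utilities', 'registry.npmjs.org')]
```

Any finding means the lockfile would direct the runner to a host outside the approved set, which is the situation egress filtering is meant to block.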

The Birsan research is a canonical reference in CDA training materials because it illustrates a recurring theme in supply chain security: the most dangerous attacks are not technically sophisticated; they exploit implicit trust and configuration defaults that organizations have never examined. Eliminating those implicit trust assumptions is precisely what Continuous Surface Reduction is designed to do.

---

## Key Takeaways

- Dependency confusion exploits how package managers resolve package names when both private and public registries are configured: an attacker registers a higher-versioned public package with the same name as a private internal package, causing the package manager to prefer the attacker's version.
- The technique was demonstrated by Alex Birsan in 2021 against Apple, Microsoft, PayPal, Shopify, Netflix, and others using only public information. He earned more than $130,000 in bug bounties.
- No system access is required for the attack. Internal package names are frequently discoverable from public GitHub repositories, job postings, and npm metadata.
- Malicious packages execute code at installation time via package manager lifecycle hooks, giving attackers code execution in developer environments and CI/CD runners before any inspection occurs.
- Primary defenses include npm namespace scoping (convert internal packages to @organization/package-name scoped names), explicit registry routing via .npmrc and pip.conf, defensive name reservation on public registries, and lockfile integrity enforcement with npm ci.
- The PyTorch-nightly incident (December 2022) is the highest-profile confirmed real-world exploitation of this vector after Birsan's disclosure.
- CDA's CSR methodology includes registry hygiene and dependency supply chain surface mapping as baseline controls for organizations operating private package registries.

---

## Sources

- Birsan, A. (2021). Dependency Confusion: How I Hacked Into Apple, Microsoft, and Dozens of Other Companies. Medium. https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610
- PyTorch. (2022). Compromised PyTorch-nightly dependency chain between December 25th and December 30th, 2022. PyTorch Blog. https://pytorch.org/blog/compromised-nightly-dependency/
- npm. (2021). Avoiding npm substitution attacks. GitHub Blog. https://github.blog/2021-02-12-avoiding-npm-substitution-attacks/
- CISA, NSA, ODNI. (2022). Securing the Software Supply Chain: Recommended Practices Guide for Developers. https://www.cisa.gov/resources-tools/resources/securing-software-supply-chain-recommended-practices-guide
- Microsoft Security Response Center. (2021). 3 Ways to Mitigate Risk When Using Private Package Feeds. https://azure.microsoft.com/en-us/resources/3-ways-to-mitigate-risk-using-private-package-feeds/
- NIST. (2022). Secure Software Development Framework (SSDF) Version 1.1: Recommendations for Mitigating the Risk of Software Vulnerabilities. SP 800-218. https://csrc.nist.gov/publications/detail/sp/800-218/final
- Sonatype. (2022). 2022 Software Supply Chain Report. https://www.sonatype.com/state-of-the-software-supply-chain/
