Stopping dependency hijacking, part 2


On March 26th malware was injected into bootstrap-sass, a popular Ruby gem. The attack was quickly spotted but it highlighted the same security issues as the recent eslint-scope and event-stream incidents.

No Dependabot users were affected by this attack, but Dependabot has a role to play in stopping future attacks. Here's what we're planning.

Helping users detect hijacks

We wrote some notes about stopping dependency hijacking after the event-stream attack in December. Some of the fixes we recommended there were at the registry level, but one area where Dependabot can make a contribution is helping end-users detect hijacks.

The key to detecting hijacks is getting visibility over the changed code. When viewing the diff between versions it's relatively obvious if a piece of code is malicious, and very suspicious if it's obfuscated.

Dependabot pull requests already link to the diff between tagged versions on GitHub. What they don't currently guarantee, however, is that the code linked to is the code that's been packaged. An attacker could push a malicious version to Rubygems, for example, whilst pushing innocuous code to the relevant tag on GitHub.[1]

We have a plan to change that. By doing so, Dependabot will help make dependency hijacks easier to detect.

Making dependency packages verifiable

To verify that the code packaged into a dependency version is the same as the code published on GitHub, we need to:

  1. Find the published source code. Most registries allow dependency authors to point to their source code in package metadata, so this is normally straightforward.
  2. Find the commit SHAs for each release. This is trickier, as most package managers don't provide a way of pointing to a specific SCM commit. Dependabot fetches the repo's tags and looks for any that match the release's version. It's fiddly but not too challenging.
  3. Compare the found source code with the packaged code. This is also tricky, as many languages (including Ruby) don't guarantee that builds are reproducible. Ultimately, making package builds reproducible (following Debian's example) would allow us to simply compare the cryptographic hash of the published package to that of a package we build from an SCM checkout. In the meantime, we can verify individual files and certain deterministically-generated build artifacts, such as transpiled source code.
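Step 2 can be sketched in a few lines of Python. This is a minimal illustration, not Dependabot's actual implementation: the function name `find_tag_for_version` and the matching rule (an optional prefix such as `v` or `release-`, followed by the exact version at the end of the tag) are assumptions made for the example.

```python
import re


def find_tag_for_version(tags, version):
    """Return the first tag that names `version`, or None if none does.

    Hypothetical helper: the matching rule is an assumption for this sketch.
    """
    # The version must end the tag and must not be preceded by a digit or a
    # dot, so "3.4.1" matches "v3.4.1" but not "13.4.1" or "v3.4.10".
    pattern = re.compile(r"(?:^|[^0-9.])" + re.escape(version) + r"$")
    for tag in tags:
        if pattern.search(tag):
            return tag
    return None


tag = find_tag_for_version(["v3.4.0", "v3.4.1", "v3.4.10"], "3.4.1")
```

A real matcher would also need to cope with monorepo tags that name several packages, which is part of what makes this step fiddly.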

Dependabot already performs 1) and 2) when creating its PRs. Now we're planning to start doing 3), working on a language-by-language basis. For languages where Dependabot has a reliable way to verify source code changes, we'll include a "verified" mark on our PR commit diff links. We'll flag PRs where we couldn't verify the diff as "unverified".
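As a rough sketch of the file-by-file check in step 3, the Python below compares every file in an unpacked package against the same path in a checkout of the tagged commit. It assumes both have already been extracted to local directories. The names `verify_package` and `sha256_of` and the "verified" / "unverified" return values are illustrative only, not Dependabot's API.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verify_package(package_dir, checkout_dir):
    """Compare each packaged file with the file at the same path in the checkout.

    Returns ("verified", []) when every packaged file has an identical
    counterpart, otherwise ("unverified", [relative paths that differ or
    are missing from the checkout]).
    """
    package_dir, checkout_dir = Path(package_dir), Path(checkout_dir)
    mismatched = []
    for packaged in sorted(p for p in package_dir.rglob("*") if p.is_file()):
        relative = packaged.relative_to(package_dir)
        source = checkout_dir / relative
        if not source.is_file() or sha256_of(packaged) != sha256_of(source):
            mismatched.append(relative.as_posix())
    status = "verified" if not mismatched else "unverified"
    return status, mismatched
```

In this sketch, files that exist only in the package (generated build artifacts, for example) count as mismatches. A fuller version would rebuild any deterministic artifacts from the checkout before comparing them.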

By highlighting whether dependency version diffs can be verified, Dependabot will make it easier for our users to detect compromised versions. Not only will that help protect Dependabot users, it will help protect the whole open source community from dependency hijacks, as compromised versions are detected faster.


[1]: For some languages / package managers this isn't an issue. For example, Composer uses git repository details provided by Packagist to perform installs, so users can be certain the code in that repository is what they're installing. For most, however, including Rubygems and npm, it is.

Dependabot helps keep your dependencies up-to-date. It's free for personal accounts and open source, and always will be.
