Table of Contents
I am not an expert on either of the package managers. Contrary, until few days ago I didn’t realize that
npm used a local cache. Unaware, I wrote an article titled OMG — NPM clone that finally makes sense and was called out on some of my false assumptions. That feedback forced me to take a step back and re-examine some of the differences in package managers closer.
I have been using
npm full time for the past 5 years. I’ve played around with
yarn when it first came out, and I learned about
pnpm via the “Why should we use pnpm?” article posted about a week ago.
I’ve spent the past week reading up on
pnpm and wanted to summarize and share my findings. My target audience is regular
npm users, like myself, that didn’t invest time to look into how various
npm alternatives stack up. I will only focus on top three (for me) and will not be covering:
npmd etc, because I don’t know anything about them.
It’s also important to point out, that as of writing of this article, none of the competing libraries are aiming to replace NPM the registry (aka place that stores the packages), and rather they are all aim to replace
npm command line client, providing an alternative user interface and behavior, with similar functionality.
npm was there from the day one and is one of the main reasons that Node.js itself is so successful as a project.
npm team has done a great job making sure that
npm remains backward compatible and works consistently across various environments.
npm was designed around the idea of Semantic Versioning (semver), which is a pretty straight forward approach as quoted from their website:
Given a version number MAJOR.MINOR.PATCH, increment the:
- MAJOR version when you make incompatible API changes,
- MINOR version when you add functionality in a backwards-compatible manner
- PATCH version when you make backwards-compatible bug fixes.
npm uses a file called
package.json in which users can store all of the dependencies for the project, via the
npm install --save command.
For example running
npm install --save lodash will add the following entry to the
^ character added before the version number of lodash. This character tells
npm to install any version of the library with the MAJOR version equal to 4. So if I were to run
npm install a year from now,
npm would install the latest version of
lodash with MAJOR value of 4, for example it could be
@ is npm convention to specify version with package name). You can see all supported characters here: https://docs.npmjs.com/misc/semver.
The reason for it, is because bumping MINOR version (in theory) should only include backward compatible changes. Hence installing the latest version of the library should work AND might allow to pull in important bug and security fixes that happened since original version
4.17.4 was installed.
On a flip side, it could result in a situation where various developers have various version of the same library installed on their machine, even though they are sharing the same
package.json file, leading to potentially hard to debug bugs and “works on my machine” situations.
npm libraries rely heavily on other
npm libraries. This results in nested dependencies and increases the chance of version miss-match.
The default behavior of using
^ in front of the library version can be turned off via
npm config set save-exact true command, but this will only lock in top level dependencies. Since every library required has it’s own
package.json file, which could have
^ in front of their dependencies, there is no guarantees provided via the
package.json file for the nested content.
To combat this concern npm provide a shrinkwrap command. This command will generate a
npm-shrinkwrap.json file, specifying exact version to use for all libraries and all nested dependencies.
That being said, even with
npm-shrinkwrap.json file in place, npm will only lock in the library versions, not the library content. Even though npm now prevents users from re-publishing same version of the library more than once,
npm admins still reserve the right to force update some of the libraries.
Here is a quote form shrinkwrap documentation page:
If you wish to lock down the specific bytes included in a package, for example to have 100% confidence in being able to reproduce a deployment or build, then you ought to check your dependencies into source control, or pursue some other mechanism that can verify contents rather than versions.
npm version 2 used to install all dependencies inside of each package that was requiring them. So if we had project, that required project A, that required project B, that required project C, the tree structure for all dependencies will look as follows:
node_modules - package-A -- node_modules --- package-B ----- node_modules ------ package-C -------- some-really-really-really-long-file-name-in-package-c.jsCode language: CSS (css)
This structure could get pretty long. Which was merely an annoyance on Unix based system, but was actually breaking things on Windows, where a lot of utilities were not implemented to handle file paths longer than 260 characters.
npm version 3 solution to this problem was to flatten the dependency tree, so our 3 project structure would now look as follows:
node_modules - package-A - package-B - package-C -- some-file-name-in-package-c.jsCode language: CSS (css)
As a result of that change, a path to some really long file, went from
You can read more on how NPM 3 dependency resolution works here.
A downside of this approach is that now
npm has to go through all of the project dependencies and decide how the flatten the
npm is forced to build a full dependency tree for all modules used, which is a costly operation and one of the leading causes of
npm install slow down. (Please see an update at the end of this post).
Since I didn’t follow the
npm changes carefully, I assumed that slow down came from NPM having to download everything from the Internet every time I ran the
npm install command.
Turns out, I was wrong, and
npm does have a local cache, where it keeps a tarball of each version of the library that it has downloaded. The content of local cache can be seen via the
npm cache ls command. Having a local cache helps to improve the install times.
All in all,
npm is a mature, stable and fun to use package manager.
Yarn was announced in October 2016 and quickly rose to 24K+ starts on Github. For comparison, npm only has 12K+ starts. It is a project with some high profile developers such as Sebastian McKenzie (Babel.js) and Yehuda Katz (Ember.js, Rust, Bundler etc).
From what I could gather, Yarn’s main initial goal was to address
npm installations not being deterministic due to semver related behavior described in the previous section. While predictable dependency tree (if desired) can be achieved with
npm shrinkwrap, it is not the default behavior and relies on all developers to know that such option exists and that it should be used.
Yarn took a different approach. Every
yarn install generates a
yarn.lock which is similar to
npm-shrinkwrap.json, but it is created by default. In addition to regular information,
yarn.lock file contains a checksums for the content to be installed, insuring that the same version of the library is used.
Since yarn was a fresh re-write of
npm client, the developers were able to properly parallelizes all needed operations and add some other improvements, which provided a significant speed up to the overall install time. My guess is that this speed up is the main reason for
yarn uses a local cache. Unlike
yarn does not need to have an internet connection to install dependencies that are already cached locally, providing the
offline mode. A feature that was unsuccessfully requested from
npm since 2012.
Yarn provides some other perks. For example, it allows to aggregate licenses for all packages used in a project and it’s nice to look at.
An interesting side note, is how attitudes from
yarn documentation has changed towards
yarn project become popular.
The initial yarn announcement said that the following about installing
The easiest way to get started is to run:
npm install -g yarn
Here is what Yarn installation page has to say about installing yarn now:
Note: Installation via npm is generally not recommended. npm is non-deterministic, packages are not signed, and npm does not perform any integrity checks other than a basic SHA1 hash, which is a security risk when installing system-wide apps.
For these reasons, it is highly recommended that you install Yarn through the installation method best suited to your operating system.
At this pace, I would not be surprised if
yarn were to announce their own registry, allowing developers to slowly phase out
Looks like thanks to
npm finally realized that they need to look closer into some highly requested issues. NPM’s initial reaction to the release of Yarn read to me along the lines of “it’s cute”. Now, when I was reviewing the highly request “offline” feature that I mentioned earlier, I noticed that a fix for it (and other related issues) is being actively worked on as we speak.
I want to point out that
pnpm is so fast that it outperforms both npm and yarn.
The reason why it’s so fast? Because it uses a clever approach that leverages hardlinks and symlinks to avoid having to copy all of the locally cached source files, which is one of the biggest performance hits for
Using links is not easy, and comes with a list of issues to consider.
At the same time, as indicated by 2K plus starts on Github,
pnpm is able to make linking work for a lot of people.
In addition, as of March 2017 it provides all benefits that
yarn provides, including offline mode and deterministic installs.
pnpm developers have done an amazing jobs. My personal preference is for deterministic installs, since I like the control and I don’t like the surprises.
Whatever the outcome of this race is (which kind of reminds me of io.js fork), I am thankful to
yarn for putting some fire under
npm's feet and providing a reasonable alternative until the dust settles.
I also think that it’s possible that
yarn could have thrown out the idea of hard and soft linking a bit too early. I wonder what
yarn team could do with this idea, considering how much damage a lone developer was able to do with
pnpm and how highly users seem to value the speed of installs.
I do think that
yarn is a safer choice over all, but
pnpm might be better choice for some use cases. For example, it could play well with a small to medium size team that runs a lot of integration tests and wants their dependencies installed as fast as possible.
Last but not least, I think that
npm still provides a very useful solution that supports a wide range of use case. Most developers can do just fine by sticking with pure
In any case, I am thankful for all of the contenders who are working hard to keep the ecosystem healthy. When companies compete, users win.
Update from @ReBeccaOrg via the following Tweet.
FYI, flattening is not where the “walk the entire tree” thing came from. Walking the entire tree was about self-healing.
It’s that guarantee that slowed it down compared to npm@2.
rm -rfa nested transitive dependency and
npm installwill notice this and correct it for you.
Now as it turns out, the npm@1-@npm@4 cache was slow. VERY slow. Far more slow than anyone imagined. Often barely an improvement over downloading when on a fast network. So with npm@5’s cache rewrite (something planned since npm 1.5 😭) it suddenly got much, MUCH faster.
yarn’s use of shasums protects you against hostile http://registry.npmjs.org ‘s but otherwise gives no further guarantees in its current impl.
— Alex Kras (@akras14) May 2, 2017