Introduction
I am not an expert on any of these package managers. On the contrary, until a few days ago I didn’t realize that npm used a local cache. Unaware, I wrote an article titled “OMG — NPM clone that finally makes sense” and was called out on some of my false assumptions. That feedback forced me to take a step back and re-examine some of the differences between package managers more closely.
I have been using npm full time for the past 5 years. I played around with yarn when it first came out, and I learned about pnpm via the “Why should we use pnpm?” article posted about a week ago.
I’ve spent the past week reading up on npm, yarn, and pnpm and wanted to summarize and share my findings. My target audience is regular npm users, like myself, who haven’t invested time in looking into how the various npm alternatives stack up. I will focus only on the top three (for me) and will not cover ied, npm-install, npmd, etc., because I don’t know anything about them.
It’s also important to point out that, as of this writing, none of the competing tools aims to replace NPM the registry (the place that stores the packages). Rather, they all aim to replace the npm command line client, providing an alternative user interface and behavior with similar functionality.
NPM
npm has been there from day one and is one of the main reasons that Node.js itself is so successful as a project. The npm team has done a great job making sure that npm remains backward compatible and works consistently across various environments.
npm was designed around the idea of Semantic Versioning (semver), which is a pretty straightforward approach, as quoted from the semver website:
Given a version number MAJOR.MINOR.PATCH, increment the:
- MAJOR version when you make incompatible API changes,
- MINOR version when you add functionality in a backwards-compatible manner
- PATCH version when you make backwards-compatible bug fixes.
npm uses a file called package.json in which users can store all of the dependencies for the project, via the npm install --save command.
For example, running npm install --save lodash will add the following entry to the package.json file:
"dependencies": {
"lodash": "^4.17.4"
}
Notice the ^ character added before the version number of lodash. This character tells npm to install any version of the library that is at least 4.17.4 and has the MAJOR version equal to 4. So if I were to run npm install a year from now, npm would install the latest version of lodash with a MAJOR value of 4; for example, it could be lodash@4.25.5 (@ is the npm convention for specifying a version together with a package name). You can see all supported characters here: https://docs.npmjs.com/misc/semver.
The reason for this is that bumping the MINOR version (in theory) should only include backward compatible changes. Hence installing the latest version of the library should work AND might pull in important bug and security fixes released since the original version 4.17.4 was installed.
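To make the caret rule concrete, here is a minimal sketch of how a range like ^4.17.4 could be checked against a candidate version. This is not npm's actual implementation: the function names (parseVersion, compareVersions, satisfiesCaret) are made up for illustration, and pre-release tags are ignored.

```javascript
// Simplified illustration of caret ranges; not npm's real code.
// Assumes plain MAJOR.MINOR.PATCH versions without pre-release tags.

// Parse "4.17.4" into [4, 17, 4].
function parseVersion(version) {
  const parts = version.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`Unsupported version: ${version}`);
  }
  return parts;
}

// Negative when a < b, zero when equal, positive when a > b.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// True when `version` falls inside the caret range `range` (e.g. "^4.17.4").
function satisfiesCaret(version, range) {
  const candidate = parseVersion(version);
  const base = parseVersion(range.replace(/^\^/, ""));

  // Lower bound: the candidate must not be older than the base version.
  if (compareVersions(candidate, base) < 0) return false;

  // Upper bound: everything up to and including the left-most
  // non-zero component of the base version must stay unchanged.
  const firstNonZero = base.findIndex((n) => n !== 0);
  const lastFrozen = firstNonZero === -1 ? 2 : firstNonZero;
  for (let i = 0; i <= lastFrozen; i++) {
    if (candidate[i] !== base[i]) return false;
  }
  return true;
}

console.log(satisfiesCaret("4.25.5", "^4.17.4")); // true
console.log(satisfiesCaret("5.0.0", "^4.17.4")); // false
console.log(satisfiesCaret("4.17.3", "^4.17.4")); // false
```

For versions starting with 0 the caret is stricter, which is why the sketch pins the left-most non-zero component rather than always pinning MAJOR.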
On the flip side, it could result in a situation where different developers have different versions of the same library installed on their machines, even though they share the same package.json file, leading to potentially hard-to-debug bugs and “works on my machine” situations.
Most npm libraries rely heavily on other npm libraries. This results in nested dependencies and increases the chance of a version mismatch.
The default behavior of using ^ in front of the library version can be turned off via the npm config set save-exact true command, but this will only lock in top-level dependencies. Since every required library has its own package.json file, which could have ^ in front of its own dependencies, the package.json file provides no guarantees for the nested content.
To combat this concern, npm provides a shrinkwrap command. This command will generate an npm-shrinkwrap.json file, specifying the exact version to use for all libraries and all nested dependencies.
That being said, even with an npm-shrinkwrap.json file in place, npm will only lock in the library versions, not the library content. Even though npm now prevents users from re-publishing the same version of a library more than once, npm admins still reserve the right to force update some of the libraries.
Here is a quote from the shrinkwrap documentation page:
If you wish to lock down the specific bytes included in a package, for example to have 100% confidence in being able to reproduce a deployment or build, then you ought to check your dependencies into source control, or pursue some other mechanism that can verify contents rather than versions.
npm version 2 used to install all dependencies inside of each package that required them. So if we had a project that required project A, that required project B, that required project C, the tree structure for all dependencies would look as follows:
node_modules
- package-A
-- node_modules
--- package-B
----- node_modules
------ package-C
-------- some-really-really-really-long-file-name-in-package-c.js
This structure could get pretty long, which was merely an annoyance on Unix-based systems but was actually breaking things on Windows, where a lot of utilities were not implemented to handle file paths longer than 260 characters.
npm version 3's solution to this problem was to flatten the dependency tree, so our 3 project structure would now look as follows:
node_modules
- package-A
- package-B
- package-C
-- some-file-name-in-package-c.js
As a result of that change, the path to some really long file went from ./node_modules/package-A/node_modules/package-B/node_modules/package-C/some-file-name-in-package-c.js to ./node_modules/package-C/some-file-name-in-package-c.js.
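As a rough illustration of the difference for the package-A, package-B, package-C example, the sketch below builds both paths from the same dependency chain. The helper names (nestedPath, flatPath) are hypothetical, and it assumes the simple case where every package can be hoisted to the top level without version conflicts.

```javascript
const path = require("path");

// Dependency chain from the example: A requires B, which requires C.
const chain = ["package-A", "package-B", "package-C"];
const file = "some-file-name-in-package-c.js";

// npm 2 style: every package lives inside its parent's node_modules folder,
// so the path grows with each level of nesting.
function nestedPath(chain, file) {
  const segments = chain.flatMap((name) => ["node_modules", name]);
  return path.posix.join(...segments, file);
}

// npm 3 style (assuming no version conflicts): the package that owns the
// file is hoisted into the top-level node_modules folder.
function flatPath(chain, file) {
  return path.posix.join("node_modules", chain[chain.length - 1], file);
}

console.log(nestedPath(chain, file));
console.log(flatPath(chain, file));
console.log(nestedPath(chain, file).length - flatPath(chain, file).length);
```

The nested path gains one node_modules/package-name pair per level of nesting, while the flat path stays the same length no matter how deep the chain is.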
You can read more on how NPM 3 dependency resolution works here.
A downside of this approach is that npm now has to go through all of the project dependencies and decide how to flatten the node_modules folder. npm is forced to build a full dependency tree for all modules used, which is a costly operation and one of the leading causes of the npm install slowdown. (Please see the update at the end of this post.)
Since I didn’t follow the npm changes carefully, I assumed that the slowdown came from npm having to download everything from the Internet every time I ran the npm install command.
It turns out I was wrong: npm does have a local cache, where it keeps a tarball of each version of every library that it has downloaded. The contents of the local cache can be seen via the npm cache ls command. Having a local cache helps to improve install times.
All in all, npm is a mature, stable and fun to use package manager.
Yarn
Yarn was announced in October 2016 and quickly rose to 24K+ stars on GitHub. For comparison, npm only has 12K+ stars. It is a project with some high-profile developers such as Sebastian McKenzie (Babel.js) and Yehuda Katz (Ember.js, Rust, Bundler, etc.).
From what I could gather, Yarn’s main initial goal was to address npm installations not being deterministic due to the semver-related behavior described in the previous section. While a predictable dependency tree (if desired) can be achieved with npm shrinkwrap, it is not the default behavior and relies on all developers knowing that such an option exists and that it should be used.
Yarn took a different approach. Every yarn install generates a yarn.lock file, which is similar to npm-shrinkwrap.json but is created by default. In addition to the regular information, the yarn.lock file contains checksums for the content to be installed, ensuring that the same version of the library is used.
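The idea of recording a checksum alongside a version can be sketched in a few lines. This is only an illustration of the principle, not yarn's lockfile format or code: the entry shape, the function names, and the choice of SHA-1 are all assumptions made for the example.

```javascript
const crypto = require("crypto");

// Hex SHA-1 digest of a package's bytes (the algorithm is an assumption).
function checksum(contents) {
  return crypto.createHash("sha1").update(contents).digest("hex");
}

// Hypothetical lock entry: remember the version AND a hash of the content.
function lockEntry(name, version, tarball) {
  return { name, version, checksum: checksum(tarball) };
}

// At install time, accept the downloaded bytes only if the hash matches.
function verify(entry, tarball) {
  return checksum(tarball) === entry.checksum;
}

// Stand-ins for a real tarball: same name and version, different bytes.
const original = Buffer.from("module.exports = 42;\n");
const tampered = Buffer.from("module.exports = 43;\n");
const entry = lockEntry("example-lib", "1.0.0", original);

console.log(verify(entry, original)); // true
console.log(verify(entry, tampered)); // false
```

A version number alone cannot tell these two buffers apart, whereas the recorded hash can.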
Since yarn was a fresh rewrite of the npm client, the developers were able to properly parallelize all needed operations and add some other improvements, which provided a significant speedup in overall install time. My guess is that this speedup is the main reason for yarn’s popularity.
Like npm, yarn uses a local cache. Unlike npm, yarn does not need an internet connection to install dependencies that are already cached locally, which gives it an offline mode, a feature that has been requested from npm, unsuccessfully, since 2012.
Yarn provides some other perks. For example, it can aggregate the licenses for all packages used in a project, which is nice to look at.
An interesting side note is how the attitude of the yarn documentation towards npm has changed since the yarn project became popular.
The initial yarn announcement said the following about installing yarn:
The easiest way to get started is to run:
npm install -g yarn
yarn
Here is what the Yarn installation page has to say about installing yarn now:
Note: Installation via npm is generally not recommended. npm is non-deterministic, packages are not signed, and npm does not perform any integrity checks other than a basic SHA1 hash, which is a security risk when installing system-wide apps.
For these reasons, it is highly recommended that you install Yarn through the installation method best suited to your operating system.
At this pace, I would not be surprised if yarn were to announce their own registry, allowing developers to slowly phase out npm completely.
It looks like, thanks to yarn, npm finally realized that they need to look closer into some highly requested issues. NPM’s initial reaction to the release of Yarn read to me along the lines of “it’s cute”. Now, when I was reviewing the highly requested “offline” feature that I mentioned earlier, I noticed that a fix for it (and other related issues) is being actively worked on as we speak.
pnpm
As I mentioned, I only became aware of pnpm a short while ago via the “Why should we use pnpm?” post by Zoltan Kochan, the author of pnpm.
I am not going to go into too much detail (since this post is getting long), but you can check out my initial post for a lot more context and the discussion on Twitter.
BUT
I want to point out that pnpm is so fast that it outperforms both npm and yarn.
The reason it’s so fast? It uses a clever approach that leverages hardlinks and symlinks to avoid having to copy all of the locally cached source files, which is one of the biggest performance hits for yarn.
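To show what linking instead of copying means at the file system level, here is a small sketch that hard-links a file from a pretend package store into a pretend project. It is not pnpm's code: the directory layout and the example-lib name are invented for the demo, and it assumes a file system that supports hard links.

```javascript
const fs = require("fs");
const os = require("os");
const { join } = require("path");

// Throwaway directory standing in for a package store and a project.
const root = fs.mkdtempSync(join(os.tmpdir(), "link-demo-"));
const storeDir = join(root, "store");
const projectDir = join(root, "project", "node_modules", "example-lib");
fs.mkdirSync(storeDir, { recursive: true });
fs.mkdirSync(projectDir, { recursive: true });

// One real copy of the file lives in the store.
const storeFile = join(storeDir, "index.js");
fs.writeFileSync(storeFile, "module.exports = 42;\n");

// A hard link adds a second directory entry for the same data on disk,
// so no file contents are copied into the project.
const projectFile = join(projectDir, "index.js");
fs.linkSync(storeFile, projectFile);

// Both paths point at the same inode, and the link count reflects that.
const sameInode = fs.statSync(storeFile).ino === fs.statSync(projectFile).ino;
const linkCount = fs.statSync(storeFile).nlink;
const linkedContent = fs.readFileSync(projectFile, "utf8");
console.log({ sameInode, linkCount, linkedContent });

// Clean up the temporary directory.
fs.rmSync(root, { recursive: true, force: true });
```

Creating a link is a metadata operation, so its cost does not grow with the size of the file the way a copy does.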
Using links is not easy, and comes with a list of issues to consider.
As Sebastian pointed out on Twitter, he initially considered using symlinks in yarn but decided against it for a number of reasons.
At the same time, as indicated by 2K plus stars on GitHub, pnpm is able to make linking work for a lot of people.
In addition, as of March 2017 it provides all of the benefits that yarn provides, including offline mode and deterministic installs.
Conclusion
I think the yarn and pnpm developers have done an amazing job. My personal preference is for deterministic installs, since I like the control and I don’t like surprises.
Whatever the outcome of this race is (which kind of reminds me of the io.js fork), I am thankful to yarn for putting some fire under npm's feet and providing a reasonable alternative until the dust settles.
I also think it’s possible that yarn threw out the idea of hard and soft linking a bit too early. I wonder what the yarn team could do with this idea, considering how much damage a lone developer was able to do with pnpm and how highly users seem to value the speed of installs.
I do think that yarn is the safer choice overall, but pnpm might be a better choice for some use cases. For example, it could play well with a small to medium size team that runs a lot of integration tests and wants their dependencies installed as fast as possible.
Last but not least, I think that npm still provides a very useful solution that supports a wide range of use cases. Most developers can do just fine by sticking with the plain npm client.
In any case, I am thankful for all of the contenders who are working hard to keep the ecosystem healthy. When companies compete, users win.
Update from @ReBeccaOrg via the following Tweet.
FYI, flattening is not where the “walk the entire tree” thing came from. Walking the entire tree was about self-healing.
It’s that guarantee that slowed it down compared to npm@2.
If you rm -rf a nested transitive dependency and npm install will notice this and correct it for you.
Now as it turns out, the npm@1-@npm@4 cache was slow. VERY slow. Far more slow than anyone imagined. Often barely an improvement over downloading when on a fast network. So with npm@5’s cache rewrite (something planned since npm 1.5 😭) it suddenly got much, MUCH faster.
yarn’s use of shasums protects you against hostile http://registry.npmjs.org ‘s but otherwise gives no further guarantees in its current impl.
“Overview of differences between npm, yarn and pnpm” @npmjs, @yarnpkg, @pnpm #javascript #npm #yarnpkg #pnpn https://t.co/wJp2PB38vh
— Alex Kras (@akras14) May 2, 2017