Why is local development of npm packages so hard?

Disclaimer: the information in this article is outdated! I will write an update soon

Sometimes you want to split up code into several npm packages. In my case I had a CSS/Vue library called “hop” and a project called “flow” in which I used hop.

Later on I built a ProseMirror-based text editor, but because it's quite large I did not want it to be part of hop. So I moved it to a separate project/package and named it “hop-editor”. Since I wanted to use some functionality of hop inside hop-editor, I required it in the package.json of hop-editor.

The Problem

While developing with multiple packages, you run into a problem: with “real” npm packages, a change only becomes available in dependent packages after you publish a new version. For example, if I tweak a CSS rule in “hop”, I would need to create and publish a new version to npm and then run npm update in flow to get the change there.

While that does work, and thanks to npm version it's quite simple to increment the version number, it is not very elegant.

Solution 1: npm link

Running npm link in the folder of a package (in my case “hop”) creates a symlink to it in a special global folder. After that, running npm link hop in flow symlinks the package from the global folder into the node_modules folder of flow.
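To make the two steps a bit more tangible, here is a small Node.js sketch that imitates them with plain symlinks in a temporary directory. It does not call npm at all; the folder names (global/node_modules, flow/node_modules) and the file index.css are made up for the illustration, and the real global folder location depends on your npm setup.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Throwaway playground; every path below is hypothetical.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'npm-link-sketch-'));
const hopDir = path.join(root, 'hop');
const globalModules = path.join(root, 'global', 'node_modules');
const flowModules = path.join(root, 'flow', 'node_modules');

for (const dir of [hopDir, globalModules, flowModules]) {
  fs.mkdirSync(dir, { recursive: true });
}
fs.writeFileSync(path.join(hopDir, 'index.css'), 'a { color: red; }');
const realHopDir = fs.realpathSync(hopDir);

// Step 1, roughly what "npm link" inside hop does: global folder -> hop
fs.symlinkSync(hopDir, path.join(globalModules, 'hop'), 'dir');

// Step 2, roughly what "npm link hop" inside flow does: flow/node_modules -> global folder
fs.symlinkSync(path.join(globalModules, 'hop'), path.join(flowModules, 'hop'), 'dir');

// Edit the file in hop and read it back through flow's node_modules.
fs.writeFileSync(path.join(hopDir, 'index.css'), 'a { color: blue; }');
const seenFromFlow = fs.readFileSync(path.join(flowModules, 'hop', 'index.css'), 'utf8');

// Following the chain of links ends up in the original hop folder.
const resolved = fs.realpathSync(path.join(flowModules, 'hop'));

console.log(seenFromFlow);
console.log(resolved === realHopDir);

// Clean up the playground.
fs.rmSync(root, { recursive: true, force: true });
```

Creating the links is exactly the part that depends on the file system and on permissions, so this sketch may already fail on some of the setups mentioned further down.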

That's great! File changes in hop would take immediate effect in flow, right?

Sadly, npm link did not work reliably for me because symlinks tend to behave differently depending on the environment and setup (Windows, Linux, WSL, Docker on Windows, Docker in WSL, just to name a few).

Using webpack makes it even harder, since it sometimes freaks out when confronted with symlinked packages.

So because of that, someone built yalc. Its README describes the problem like this:

Though this (npm link) may work in many cases, it often brings nasty constraints and problems with dependency resolution, symlink interoperability between file systems, etc.

yalc acts like a local repository and injects file:/link: dependencies into the package.json.

This could work, but if I now want to commit and push this package.json to the git repo of flow to run tests in its CI/CD pipeline, it will fail because the local packages are not available there (unless I git clone and link all kinds of packages into the pipeline on startup).

Solution 2: npm workspaces

npm workspaces are similar to npm link, except that they automate the process. The folder structure looks roughly like this:

.
+-- package.json
`-- packages
   +-- a
   |   `-- package.json

You can read more about npm workspaces here: https://docs.npmjs.com/cli/v8/using-npm/workspaces

Running npm install in the root directory (.) of the example project above symlinks the packages inside the “packages” folder into the node_modules folder:

.
+-- node_modules
|  `-- a -> ../packages/a
+-- package-lock.json
+-- package.json
`-- packages
   +-- a
   |   `-- package.json

So npm workspaces could also be a valid solution. I could even push the whole project into a repository and it would work in a pipeline. But the main issue is that the packages now only exist in this one project constellation.

Meaning I can’t really use the packages inside the packages folder in any other project.

Solution 3: git dependencies

I ended up having flow, hop and hop-editor each in a separate git project. Then I installed hop in flow and hop-editor like this:

"devDependencies": {
  "hop": "git+https://<link of my git repo>/hop.git",

Great! Now I can use hop from everywhere and get the latest version by simply running npm update, all without constantly using symlinks or pumping out new package versions.

It was too good to be true. Sadly but justifiably, npm does not recursively resolve git dependencies. What does that mean?

Look at the following dependency tree:

flow:
|
-> "hop": "git+https://<link of my git repo>/hop.git",
-> "hop-editor": "git+https://<link of my git repo>/hop-editor.git"
    |
    -> "hop": "git+https://<link of my git repo>/hop.git",

As you can see, both flow and hop-editor depend on hop. The dependencies “hop” and “hop-editor” of “flow” get resolved by npm because they are first-level dependencies.

The nested dependency flow > hop-editor > hop is not resolved because it is a non-first-level git dependency. The actual effect is that webpack does not know what “hop” is from inside “hop-editor”, even though it is installed in flow.

Imagine if npm had to recursively fetch the git repositories of dependencies of dependencies just to read the package.json inside them: a huge performance issue!

Solution 4: fake npm repository

To resolve packages by name and version, npm talks to a registry website that implements the CommonJS Package Registry specification for reading package info.

npm allows you to specify a custom registry, which is then used to resolve packages instead of the default public `https://registry.npmjs.org/`.

For example using a .npmrc file like this, you can specify a custom registry for a package scope:

@codingkiwi:registry=https://mycustomregistry.example.org

Now it seems like an npm registry is just a simple API server that follows this specification: http://wiki.commonjs.org/wiki/Packages/Registry

Basically there are 3 endpoints: /, /<package> and /<package>/<version>

First we point npm to the custom registry server for a certain scope, as mentioned above. Then we set up a little express server which answers on the /<package> endpoint like this:

{
  "_id": "@codingkiwi/hop",
  "name": "@codingkiwi/hop",
  "versions": {
    "1.1.5-1649875607": {
      "name": "@codingkiwi/hop",
      "version": "1.1.5-1649875607",
      "dist": {
        "tarball": "https://<repo>/archive/master.tar.gz"
      }
    },
    "1.1.5": {
      "name": "@codingkiwi/hop",
      "version": "1.1.5",
      "dist": {
        "tarball": "https://<repo>/archive/v1.1.5.tar.gz"
      }
    },
    ...

As you can see, the server simply reads the latest version from the package.json of my git repository (1.1.5 in this case) and appends the timestamp of the repo's last commit to inject a fake latest version called “1.1.5-1649875607”.

This fake version always points to the up-to-date tar.gz archive of the master branch.
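A minimal sketch of such a /<package> endpoint could look like the following. It is not my actual server: it uses Node's built-in http module instead of express to stay dependency-free, and the names buildPackument, createRegistryServer, the repoUrl value and the dist-tags entry are assumptions added for the illustration. How the base version and the commit timestamp are read from the git repository is left out.

```javascript
const http = require('node:http');

// Builds the JSON document for one package.
// All parameter names are hypothetical; repoUrl stands in for "https://<repo>".
function buildPackument({ name, baseVersion, lastCommitTimestamp, repoUrl }) {
  const fakeVersion = `${baseVersion}-${lastCommitTimestamp}`;
  return {
    _id: name,
    name,
    // Assumption: not shown in the excerpt above, but registries usually
    // tell npm which version "latest" is via dist-tags.
    'dist-tags': { latest: fakeVersion },
    versions: {
      // Fake version that always points at the current master archive.
      [fakeVersion]: {
        name,
        version: fakeVersion,
        dist: { tarball: `${repoUrl}/archive/master.tar.gz` },
      },
      // Real version that points at the tagged archive.
      [baseVersion]: {
        name,
        version: baseVersion,
        dist: { tarball: `${repoUrl}/archive/v${baseVersion}.tar.gz` },
      },
    },
  };
}

// packages maps a package name to the arguments of buildPackument.
function createRegistryServer(packages) {
  return http.createServer((req, res) => {
    const { pathname } = new URL(req.url, 'http://localhost');
    // Scoped names arrive URL-encoded, e.g. /@codingkiwi%2fhop
    const name = decodeURIComponent(pathname.slice(1));
    const info = packages[name];
    res.setHeader('Content-Type', 'application/json');
    if (!info) {
      res.statusCode = 404;
      res.end(JSON.stringify({ error: 'not found' }));
      return;
    }
    res.end(JSON.stringify(buildPackument(info)));
  });
}

const example = buildPackument({
  name: '@codingkiwi/hop',
  baseVersion: '1.1.5',
  lastCommitTimestamp: 1649875607,
  repoUrl: 'https://git.example.org/hop', // placeholder URL
});
console.log(Object.keys(example.versions));
```

To try it out you would call createRegistryServer({ ... }).listen(somePort) and point the scope in the .npmrc at that address.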

Why the last timestamp?

npm versions need to follow semver!

At first I appended the commit ref, for example 1.1.5-d73a38f2f, which is valid semver. But npm compares a pre-release part that contains letters (the part after the “-”) lexically in ASCII sort order, and since commit refs are effectively random, you can end up with an installed version whose commit ref belongs to an older commit but sorts higher than the latest version provided by the endpoint.

If that happens, npm is confused because the newest version of the package is older than the one it has installed, so it tries to check if the version that is installed can be found somewhere lower in the “versions” array/object provided by the endpoint.

But since the endpoint does not create a fake version for each commit ref that ever existed (which would be possible but insane for git projects with thousands of commits), the version currently installed is not part of the versions list and npm fails with an error saying that the package does not exist.

Since time does not run backwards, the timestamp works more reliably: it is purely numeric, so it is compared as a number and the latest version is always the highest.
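The difference between the two suffixes can be reproduced with a small, simplified re-implementation of the semver precedence rules. This is a sketch for illustration, not the code npm runs (npm uses the semver package), and the second commit ref 0a1b2c3d4 as well as the second timestamp are made up.

```javascript
// Compare two pre-release identifiers following the semver rules:
// numeric ones are compared as numbers, others in ASCII order,
// and a numeric identifier always sorts below an alphanumeric one.
function compareIdentifiers(a, b) {
  const aNumeric = /^\d+$/.test(a);
  const bNumeric = /^\d+$/.test(b);
  if (aNumeric && bNumeric) return Math.sign(Number(a) - Number(b));
  if (aNumeric) return -1;
  if (bNumeric) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Split "1.1.5-abc+build" into core numbers and pre-release identifiers.
function parse(version) {
  const withoutBuild = version.split('+')[0];
  const dash = withoutBuild.indexOf('-');
  const core = dash === -1 ? withoutBuild : withoutBuild.slice(0, dash);
  const pre = dash === -1 ? [] : withoutBuild.slice(dash + 1).split('.');
  return { core: core.split('.').map(Number), pre };
}

// Returns -1, 0 or 1, like a sort comparator.
function compareVersions(a, b) {
  const va = parse(a);
  const vb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (va.core[i] !== vb.core[i]) return Math.sign(va.core[i] - vb.core[i]);
  }
  // A version without a pre-release part is higher than one with it.
  if (va.pre.length === 0 && vb.pre.length === 0) return 0;
  if (va.pre.length === 0) return 1;
  if (vb.pre.length === 0) return -1;
  const shared = Math.min(va.pre.length, vb.pre.length);
  for (let i = 0; i < shared; i++) {
    const result = compareIdentifiers(va.pre[i], vb.pre[i]);
    if (result !== 0) return result;
  }
  return Math.sign(va.pre.length - vb.pre.length);
}

// Commit refs: the order says nothing about which commit is newer.
console.log(compareVersions('1.1.5-d73a38f2f', '1.1.5-0a1b2c3d4'));
// Timestamps: the later commit is always the higher version.
console.log(compareVersions('1.1.5-1649875607', '1.1.5-1649875999'));
// Any pre-release is lower than the plain version.
console.log(compareVersions('1.1.5-1649875607', '1.1.5'));
```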

The tarpit

If the package.json of flow says: I want “@codingkiwi/hop@1.1.5-1649875607”, npm fetches the version from our registry, but the package.json inside the hop project itself still says “1.1.5”, and that is also the version that npm puts into the package-lock.json.

That isn’t an issue until you run npm ci, which installs exactly what is recorded in the package-lock.json. Since it's “1.1.5” there and we don’t have “1.1.5” in our versions array (and we can’t put it in there because 1.1.5 is greater than 1.1.5-xxxx in semver), npm ci fails.

So it got even crazier: I added a tar.gz endpoint to my fake registry which streams the real tar.gz from the git repo and while streaming modifies the version inside the package.json.

Also the tar.gz from my git repo has one folder in the root of the archive named “hop” after the repository name, but npm needs it to be “package” or otherwise the ssri integrity will fail, so I modify that folder name too.
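The two modifications themselves are tiny. Here is a rough sketch of just those two steps, leaving out the actual tar.gz streaming (which needs a tar library). The function names rewriteManifestVersion and renameRootFolder are made up for this example.

```javascript
// Replace the version inside the text of a package.json file.
function rewriteManifestVersion(manifestJson, fakeVersion) {
  const manifest = JSON.parse(manifestJson);
  manifest.version = fakeVersion;
  return JSON.stringify(manifest, null, 2);
}

// Replace the first path segment of an archive entry with "package",
// e.g. "hop/src/index.js" becomes "package/src/index.js".
function renameRootFolder(entryPath) {
  const segments = entryPath.split('/');
  segments[0] = 'package';
  return segments.join('/');
}

const original = JSON.stringify({ name: '@codingkiwi/hop', version: '1.1.5' });
console.log(rewriteManifestVersion(original, '1.1.5-1649875607'));
console.log(renameRootFolder('hop/package.json'));
```

In a real endpoint both functions would be applied to every entry of the archive while it is being streamed through.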

Branches

A desirable feature would be the ability to install different branches. For this to be possible, we need a way to differentiate the versions by branch. Semver allows adding build metadata after the version and pre-release part, but build metadata is ignored when determining precedence, so it is not possible to install a specific build.

So the only solution would be using a different major or minor version number for each branch, for example:

0.0.<stamp> for master
0.1.<stamp> for branch1
0.2.<stamp> for branch2

For this to work, we would need a deterministic system that chooses a number for each branch and keeps working when branches are added or removed. This may be possible by storing the branch list and incrementing the number whenever a new branch is detected.

The next problem to solve would be how a user finds out which number belongs to which branch.
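One way the stored branch list could work is sketched below. This is only an idea under assumptions, not something my registry implements: the function names are made up, the mapping is kept in a Map that would have to be persisted somewhere, and numbers of deleted branches are never reused so that existing installs keep pointing at the same branch. The same Map could also be exposed as a lookup table for users.

```javascript
// stored: Map of branch name -> minor version number from the last run.
// currentBranches: branch names that exist in the repository right now.
function assignBranchNumbers(stored, currentBranches) {
  const assigned = new Map(stored);
  // Continue counting after the highest number ever handed out.
  let next = 0;
  for (const number of assigned.values()) {
    next = Math.max(next, number + 1);
  }
  for (const branch of currentBranches) {
    if (!assigned.has(branch)) {
      assigned.set(branch, next);
      next += 1;
    }
  }
  return assigned;
}

// Builds the fake version for a branch, e.g. 0.1.<stamp>.
function versionForBranch(assigned, branch, stamp) {
  return `0.${assigned.get(branch)}.${stamp}`;
}

const firstRun = assignBranchNumbers(new Map(), ['master', 'branch1', 'branch2']);
// branch1 was deleted and branch3 was added in the meantime.
const secondRun = assignBranchNumbers(firstRun, ['master', 'branch2', 'branch3']);

console.log([...secondRun.entries()]);
console.log(versionForBranch(secondRun, 'branch3', 1649875607));
```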

Installing is as easy as npm install @prefix/repo@0.0. Because the patch part is left out, npm treats 0.0 as the range >=0.0.0 <0.1.0, which covers every timestamp version of that branch.

Conclusion

Is solution 4 a good solution? Probably not. Being the first person to have a certain solution for a problem sometimes means that the solution might not be the best one.

But does it work? Yes :)