ÆFLASH

Javascript Module Systems

At Fluid, I was recently tasked with updating a legacy Javascript application to use a more modern Javascript module system. The existing app was quite large -- 150 class files -- and quite old -- parts of it dated back to 2006. We were undertaking some major expansions and refactorings, so we decided to explore modernizing its underlying build system.

The Existing System

Our app did have its own module system, but it had several quirks and drawbacks.

  • Heavily inspired by Java. Classes were namespaced into a huge global object, e.g. fluid.domain.product.subproduct.models.Foo and fluid.domain.product.subproduct.views.Bar. Source files were expected to live in a directory hierarchy that mirrored this namespacing.
  • The entire app was concatenated and minified using a custom Ant task. (i.e. using Java). One quirk of how this was implemented is that every class file had to have a unique name.
  • No dependency management. The order of classes was determined by a hand-maintained list, with dependencies listed before their dependents. If a class was used before it was defined in the concatenated file, you would simply get a runtime error.
  • The code was also awash with circular dependencies. These only worked due to the dependency being used asynchronously after the initial module load.
  • The main app was loaded by a preliminary bootstrap manager that dynamically loaded the application's main class using its class path (again relying on the fact that namespaces mapped to directories). In theory, any class could be dynamically loaded like this, but it wasn't used in practice -- just to load the main application class.
  • Classes were typically referred to by their full, global classname. This made minification less efficient. (but gzip misleadingly more efficient!)
  • In order to skip having to type out the full class path every time, some developers assigned the class to a global, e.g. FooModule = fluid.domain.product.subproduct.models.FooModule. In this example, you could simply use FooModule anywhere. A good deal of the classes were leaked into the global window object this way.

All in all, not too bad. It was usable, but not ideal. Most of its problems came from the fact that the original architects of the system were most experienced with Java, and so they tried to make Javascript like Java, with strict object-oriented paradigms and classpaths. The developers who worked on the app in its early days were also most experienced with Java and other strict OO languages, so they followed suit. Some of the issues also counteracted each other: the fact that every class had to have a unique name made problems from global leaks rare, and using full class paths everywhere made dependency problems rare as well. Manually maintaining the order of dependencies was a real pain, however.

The Requirements

Coming into this project, the new lead architect and I had some decent experience with Javascript, both server-side and client-side, and had used a variety of frameworks, module systems, testing platforms, and tools. We drafted a list of requirements for our ideal module system:

  • Modular Javascript. Ideally one export per file, with the capability for private variables and methods that do not get exposed outside of the module's scope.
  • Automatic dependency management.
  • A nice way to build into a single JS file to minimize HTTP requests.
  • The ability to easily test modules with Mocha, both automated unit tests from the CLI and unit/functional tests in the browser. Since a good chunk of the backend for this app was written in NodeJS and tested with Mocha, we wanted to re-use the same test framework for consistency.
  • Be linted, built and tested automatically using a file-system watcher during development. (e.g. be compatible with Grunt's watch)
  • A strategy that could be gradually applied to the codebase. We would not be able to convert the entire codebase in one fell swoop (converting 150 classes is a lot of work!), so it would have to coexist with the existing module system during the transition.
  • Support for circular dependencies, since we would be doing this before refactoring.
  • The ability to use NPM modules. We were in the process of introducing useful utilities like Underscore and Async, but they required some modification to work with our build system. We would like to be able to use them directly from node_modules. It also would be nice to have the option to re-use some of our modules from the NodeJS backend.

The Contenders

Sprockets

Before we came to this project, the tech lead and I had used Sprockets as a build system for another web app. Sprockets is pretty simple: you manually require other .js files, and everything is concatenated together in a later build step.

//= require models/Foo
//= require models/Bar

/* everything in Foo.js and Bar.js is now in the current namespace */

var foo = new Foo(),
  bar = new Bar(foo);
// …

Pretty simple and quick, but too primitive. Unless you manually wrapped each file in a closure function, each source JS file would leak its vars into the current namespace. It also didn't really handle dependencies: if two classes required the same file, it would be included twice. It really is just fancy concatenation rather than a module system -- it allows you to work with smaller JS files rather than a monolithic app.js. Not a real improvement over the existing system.
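For completeness, here is the manual closure wrapping you'd need in every Sprockets file to avoid that leakage (the names are illustrative):

```javascript
// without this wrapper, `secret` and `Foo` would both leak into the
// shared scope once Sprockets concatenates all the files together
(function (global) {
  var secret = "internal state"; // now private to this file

  function Foo() {}
  Foo.prototype.reveal = function () {
    return secret;
  };

  // explicitly publish the one thing we want to expose
  global.Foo = Foo;
}(typeof window !== "undefined" ? window : global));
```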

Require.js (AMD)

At the time I was exploring this, there was a lot of buzz about Asynchronous Module Definition (AMD). The basic idea is that you define your list of dependencies, then provide a factory/callback function that expects each of those dependencies as arguments and returns what you want to export from the module. Example:

define("Foo", ["Dep1", "views/Dep2"], function (Dep1, Dep2) {
  var dep1 = new Dep1(),
    dep2 = new Dep2(),
    Foo;

  // Define Foo...

  return Foo;
});

All your modules are defined this way; you point your AMD loader at your main app.js, and it will dynamically load all your dependencies (the canonical loader is require.js).

<script data-main="scripts/main" src="lib/require.js"></script>

Other dependencies will be loaded by inserting <script> tags. There is also a builder (r.js) that will create a single JS file for you, so you don't have to perform 150 HTTP requests in production. Since you do everything in a factory function, any private variables or methods are scoped to that function. You only return a single export. It is a module system designed for the browser.

There is also a bunch of fancy stuff you can do. You don't have to declare all your dependencies at the beginning of your module; you can asynchronously require them anywhere. You can also require non-Javascript files, such as CSS or JSON.

require(["model/Dep1", "views/Dep2", "config/props.json", "styles/style.css"], function (Dep1, Dep2, props) {
  // style.css doesn't get an argument, but it will have been loaded by now (no FOUC)
  // Dep3 will still be loaded asynchronously -- it will be parsed out of this factory callback and added to the async array
  var Dep3 = require("views/Dep3");

  function someAsyncFunc() {
    require("lib/Dep4", function (Dep4) {
      // Dep4 will be loaded asynchronously in this callback
      //...
    });
  }
});

CommonJS

CommonJS (hereafter referred to as "CJS") predates AMD. It is the module system used by NodeJS and NPM, and it is very simple: you require() your dependencies, they are loaded synchronously through the file system, and you define a single module.exports for each module file, usually a function or object.

// Foo.js
var Dep1 = require("./models/Dep1"),
  Dep2 = require("./views/Dep2"),
  Dep3 = require("Dep3"),
  Foo;

function someAsyncFunc() {
  var Dep4 = require("./lib/Dep4");
  // ...
}

// define Foo
// ...

module.exports = Foo;

In the above example, Dep1, Dep2, and Dep3 are loaded as the module is itself loaded, and Dep4 is loaded when someAsyncFunc() is called (or loaded from the require cache, if another module has previously required it). All the variables defined in Foo.js are only in scope for that file; there is an implied closure function around each CJS module.

CJS works well in NodeJS since each module can be loaded directly from the file system as needed. To make a CJS app work in the browser, you need to convert it using some sort of builder/module loader. There are several CJS builders; I found that Browserify is the best. It is designed specifically to convert code written for NodeJS and compile it all into a single, browser-ready file. It also implements some of NodeJS's built-in modules.

You may also notice that some of the dependencies in the example use relative paths, while others do not. In CJS, every dependency is either relative to the current module or loaded from the node_modules defined in the project's package.json. Dep1, Dep2, and Dep4 are part of the current project and relative to Foo, whereas Dep3 is a third-party module, not part of the project, and described in the package.json. This contrasts with AMD, where all modules are relative to a project root (by default the path of the containing page) or defined by a custom path in a require.config().

The NPM package.json is a really useful construct. Its purpose is to define your project/module/app, state its dependencies, and define various tasks that might be used by your app. Versioned dependencies are loaded from the NPM registry, Git repositories, or even the local filesystem (using npm link), and are copied to a node_modules directory in your project root. Any other setup can be handled by a postinstall script defined in the package.json. Setting up your app for development is as simple as typing npm install at your project root.
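A minimal package.json for an app like ours might look like this (the name, version ranges, and postinstall task are hypothetical, but the fields are the standard ones):

```json
{
  "name": "our-app",
  "version": "0.1.0",
  "dependencies": {
    "lodash": "~1.0.1",
    "async": "~0.2.6"
  },
  "scripts": {
    "postinstall": "grunt setup"
  }
}
```

With this in place, npm install pulls the listed dependencies into node_modules/ and then runs the postinstall script.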

Universal Module Definition (UMD)

UMD is not really a module system, but it is worth a mention. UMD is a format for defining modules in such a way that they will work in either CJS or AMD (or even with browser globals). It requires a significant amount of boilerplate code around each module, however. See the UMD link for more details.

There is also a project called uRequire that will convert any type of module that has been "sensibly" defined into UMD format. It is a bit new, but it will probably become more and more useful as more modules are written in CJS and AMD and interoperability between the two formats is desired. Keep an eye out.

We were not sure we would need modules to work natively in both systems, we didn't want too much boilerplate, and using uRequire would add an additional build step before the r.js or Browserify build, so we decided to disregard UMD for now.


So of all the contenders, we were down to Require.js, and CommonJS using Browserify.

Using Require.js

Require.js appeared to be up to the task of what we wanted -- it met all of our basic requirements. It provided automatic dependency management, one export per module, and supported compiling into a single file. We would define() all our modules using the AMD syntax, then use the r.js optimizer to build everything into a single file, using the almond AMD loader. (almond is a simplified AMD loader implementation that is designed for single-file builds.) I dove right in.

AMD Basics

Here is a contrived app that I'll use as a running example:

  • main.js
  • lib/
    • a.js
    • b.js

main.js

if (typeof define !== 'function') { var define = require('amdefine')(module) }

define(["./lib/a", "lib/b"], function (a, b) {
  console.log(a.foo);
  console.log(b.foo);
});

lib/a.js

if (typeof define !== 'function') { var define = require('amdefine')(module) }

define(function () {
  return { foo: "Module A" };
});

lib/b.js

if (typeof define !== 'function') { var define = require('amdefine')(module) }

define(["lodash"], function (_) {
  var words = ["this", "is", "module B"];
  return {
    foo: _.map(words, function (s) {
      return s.toUpperCase();
    }).join(" ")
  }
})

The if (typeof define ... boilerplate is needed at the top of each module to make things work in NodeJS. It will be parsed out by the r.js optimizer. However, we need amdefine in order to allow mocha to load our modules through NodeJS's module system -- more on this later.

To make this really simple app work in NodeJS, you also need an entry point:

ambootstrap.js

// installed through NPM
var requirejs = require("requirejs");

requirejs.config({
  // project root is the current directory
  baseUrl: __dirname,
  // require.js needs a reference to NodeJS's built-in require
  nodeRequire: require
});

// require our real main module, just let it do its thing
requirejs(["main"], function (main) {});

Run the app, and it works as expected.

$ node ambootstrap.js
Module A
THIS IS MODULE B

Note that the requirejs loader is smart enough to look in the project's node_modules/ to find lodash. It also works with relative requires as well as project-relative requires.

Using Almond

To package for use in the browser, we need an r.js optimizer config. With almond, it is a bit tricky: you have to make the almond loader your main file and manually include your actual main file in the requirejs config. Also, any NPM modules that you use need a manual path defined in the build config.

build.js

({
  baseUrl: ".",
  // main module becomes the almond loader
  name: "node_modules/almond/almond",
  // wrap the entire build in a closure to encapsulate vars
  wrap: true,
  // manually include our real main module
  include: "main",
  // manually insert a require() for our main module;
  // otherwise, it won't be executed
  insertRequire: ["main"],
  // output file
  out: "dist.js",
  // **manually specify the path to lodash**
  paths: {
    "lodash": "node_modules/lodash/lodash"
  },
  // skip uglifyjs so we can read the output
  optimize: "none"
})

An item of note is that lodash uses a UMD-like syntax to make it work as both a CJS and AMD module out of the box. If, say, we were using underscore, which is not compatible with AMD, we would have to add cjsTranslate: true to our config to wrap the module.

Run $ r.js -o build.js to compile everything into a single optimized file. Load dist.js in a dummy HTML page, and you see the expected console output. There is also a grunt plugin that hooks the r.js build into our Grunt toolchain.

Mocha

Now let's define some spec files for Mocha:

test/a.test.js

if (typeof define !== 'function') { var define = require('amdefine')(module) }

var expect = require("expect.js");

define(["../lib/a.js"], function (a) {
  describe("a.js", function () {
    it("should have a foo", function () {
      expect(a.foo).to.equal("Module A");
    });
  });
});

test/b.test.js

if (typeof define !== 'function') { var define = require('amdefine')(module) }

var expect = require("expect.js");

define(["../lib/b.js"], function (b) {
  describe("b.js", function () {
    it("should have a foo", function () {
      expect(b.foo).to.equal("THIS IS MODULE B");
    });
  });
});

Run $ mocha, and the tests pass with no errors. Excellent. Adding a mocha task to our Gruntfile also works as expected. It is a bit odd that we have to use NodeJS's built-in require for our assertion library but amdefine to load our module -- a mash of two module systems -- but it works nonetheless.

Some shortcomings:

  • You have to use relative paths to your modules in your spec files. There probably is a way to configure a proper baseUrl -- it likely just needs the requirejs.config() options used in the bootstrap file.
  • If your included CJS Node modules have their own CJS dependencies you have to manually add a path in your requirejs config for each CJS sub-dependency. The cjsTranslate option doesn't recursively parse requires, nor look in nested node_modules using node's module look-up logic. If you have a complicated dependency, this could get ugly and tedious. Luckily most things we want to include are a single module file, so we don't have to worry about this. I predict that recursive lookup with CJS rules will be supported in later versions of r.js.

So Many Dependencies

Some of our classes have dozens of dependencies. Typically this is a sign of bad design, but we still have to support large numbers of them until we can afford a refactor. Since you have to list all your dependencies in an array, with each member corresponding, in order, to an argument of the factory function, this can be cumbersome. RequireJS offers an alternate sugared syntax that allows you to pretend your requires are synchronous.
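The sugared form looks like the sketch below. To keep it runnable outside an AMD environment, I've stubbed a trivial define() with two made-up modules; a real loader (require.js/r.js) instead scans the factory's source for require("...") calls and converts them into the usual asynchronous dependency array:

```javascript
// minimal stand-in for an AMD loader, just enough to run this sketch;
// module ids and their contents are invented for illustration
function define(factory) {
  var registry = {
    "models/Dep1": function () { this.kind = "Dep1"; },
    "views/Dep2": function () { this.kind = "Dep2"; }
  };
  return factory(function (id) { return registry[id]; });
}

// the sugared "simplified CommonJS" AMD form: no dependency array,
// and the require() calls *look* synchronous inside the factory
var Foo = define(function (require) {
  var Dep1 = require("models/Dep1"),
      Dep2 = require("views/Dep2");

  return { dep1: new Dep1(), dep2: new Dep2() };
});
```

No matter how many dependencies a module has, each one is a single declarative line instead of an array entry plus a positional argument.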

This sugar is a bit dangerous, or at least it filled me with a sense of unease. The body of your factory is converted to a string, the synchronous requires are parsed out, your factory is amended to make them async, and then the factory is eval'ed. If you debug a module that uses this sugared syntax, you will end up in "Program" space, which was a bit shocking when I first discovered it. You get the "this isn't my code" feeling, which is disorienting and can make debugging take longer. r.js does this parsing at compile time, so there won't be a performance hit, but the resulting module will still differ from the source. All in all, it isn't the end of the world, but we took it into consideration.

Circular dependencies are also tricky. Of course, the sanctioned way to deal with circular dependencies is to avoid creating them in the first place, but as mentioned earlier, they are still a problem we need to deal with. There is a workaround mentioned in the docs. It relies on a CJS-style idiom, is pretty verbose and counter-intuitive, and also falls prey to the eval issue mentioned earlier. It is also not supported by the almond loader. I tried playing around with some deferred defines, but couldn't get them to work, at least not in a way that guaranteed all of a module's dependencies would be satisfied.

Require.js Summary

  • Good dependency management with modular javascript
  • Clean support for running AMD modules in NodeJS using amdefine
  • Fairly simple compilation into a single file with r.js and almond
  • Can convert single-file CJS modules to be compatible
  • Works well with mocha
  • Grunt plugin to hook it up with our preferred toolchain
  • Definition syntax is a bit verbose, especially with amdefine
  • Every NPM dependency path has to be manually configured
  • Doesn't support complicated CJS modules elegantly
  • Gets dicey with large numbers of dependencies
  • Circular dependencies aren't well supported

Using CommonJS and Browserify

The Basics

Next on the list was to try CommonJS and Browserify. Let's re-write our contrived app to use CJS modules.

lib/a.js

module.exports = {
  foo: "Module A"
};

lib/b.js

var _ = require("lodash"),
  words = ["this", "is", "module B"];

module.exports = {
  foo: _.map(words, function (s) {
    return s.toUpperCase();
  }).join(" ")
};

main.js

var a = require("./lib/a"),
  b = require("./lib/b");

console.log(a.foo);
console.log(b.foo);

Much simpler. Run main.js, and it works as expected.

$ node main.js 
Module A
THIS IS MODULE B

We get modular javascript and dependency management. Only the module.exports of each module is visible outside its file -- the extra vars are encapsulated in an implied closure. It runs on NodeJS perfectly, because it simply uses the built-in module system! Now let's use grunt-browserify to build it:

in Gruntfile.js

    //...
    browserify: {
      "dist.js": {
        entries: ["main.js"]
      }
    },
    //...

Run $ grunt browserify, load dist.js in an HTML wrapper, and we see the friendly console outputs we were expecting. (Note: to get grunt-browserify working with Grunt 0.4, I'm using a private fork at this time.) Browserify uses NodeJS's module look-up logic, so it automatically found and included lodash for us.

So what does dist.js look like?

  • The entire file is wrapped in an anonymous self-executing function, so all variables are encapsulated -- the window object stays pristine. You have to manually assign something to the window object if you need to make a global.
  • It implements a simple CJS loader -- 400 lines, 10kb unminified, 2.2kb minified/gzip'ed -- so it is about the same size as the AMD almond loader.
  • Then, every dependency is wrapped in a require.define() function. Here is what it did to lib/b.js:
require.define("/lib/b.js",function(require,module,exports,__dirname,__filename,process,global){
var _ = require("lodash"),
  words = ["this", "is", "module B"];

module.exports = {
  foo: _.map(words, function (s) {
    return s.toUpperCase();
  }).join(" ")
};

});

Every module is defined with its original file-system path, relative to the project root. Then it is wrapped in a function that supplies every global provided by NodeJS's module loader, most notably require and module. After that, it is our original module verbatim. Since require.define() adds the module export to the loader's require cache, when require("foo") is called in a later module, the require is synchronous and instant.

  • NPM modules are slightly more complicated. Here's what it did for lodash:

require.define("/node_modules/lodash/package.json",function(require,module,exports,__dirname,__filename,process,global){
module.exports = {"main":"./dist/lodash"}
});

require.define("/node_modules/lodash/dist/lodash.js",function(require,module,exports,__dirname,__filename,process,global){
/**
 * @license
 * Lo-Dash 1.0.1 (Custom Build) <http://lodash.com/>
 * ... */
// rest of lodash module...
// ...
});

First, it includes a condensed version of lodash's package.json. All it does is tell the loader where lodash's main module is located. This is necessary because any non-relative require (require("foo")) is inferred to be an NPM module, so the actual main file will be described in /node_modules/foo/package.json (among other things, but all we need is the main: definition). After this, lodash's main module is included as a normal module. Also note that if lodash require()ed another NPM module itself, you would automatically get definitions for require.define("/node_modules/lodash/node_modules/some_dep/package.json", ... and require.define("/node_modules/lodash/node_modules/some_dep/main.js", ... too. It could get complicated for complicated modules, but composite and recursive dependencies are handled automatically and flawlessly, so all you have to worry about is the resulting file size.

  • Finally, at the end of dist.js, we simply require our entry points. In the case of our contrived app, it's just:
require("/main.js");

This kicks off the dependency look-up and runs our app. It all happens synchronously on the same process tick, since by this point all dependencies are already in the require cache, indexed by their original file paths, so NodeJS's look-up logic can be used synchronously. Very elegant. The boilerplate is automatic!

Mocha

Mocha is designed for testing NodeJS modules, so it is no surprise that it works out of the box. Here's an example spec file:

test/b.js

var expect = require("expect.js"),
  b = require("../lib/b.js");

describe("b.js", function () {
  it("should have a foo", function () {
    expect(b.foo).to.equal("THIS IS MODULE B");
  });
});

A bit simpler than its AMD counterpart. Run $ mocha, and the tests pass with no surprises. To run these tests in the browser, we just have to browserify our spec files:

    //...
    browserify: {
      "dist.js": {
        entries: ["main.js"]
      },
      "browsertests.js": {
        entries: ["test/*.test.js"]
      }
    },
    //...

Run these in an HTML wrapper as described in the mocha docs, omitting the assertion library and replacing the individual spec files with our browserified browsertests.js. It works as you'd expect.

Circular Dependencies

In the NodeJS docs there is a suggestion for dealing with dependency cycles, but it does not work with Browserify's module loader. However, due to the asynchronous nature of our existing circular dependencies, I devised a solution that works in this case: deferred require()s. For example:

var Dep1 = require("./Dep1"),
  Dep2 = require("./Dep2"),
  CircDep3,
  CircDep4;

process.nextTick(function () {
  CircDep3 = require("./CircDep3");
  CircDep4 = require("./CircDep4");
});

//...

process.nextTick() is implemented in Browserify's loader. We could also use _.defer(). Since all modules are defined on the same tick, we are guaranteed that every dependency will be defined by the next tick, regardless of loading order! Another solution is to require the cycle-causing dependencies only as you need them, e.g.:

var Dep1 = require("./Dep1"),
  Dep2 = require("./Dep2");

function someAsyncFunc() {
  var CircDep3 = require("./CircDep3"),
    CircDep4 = require("./CircDep4");
  // ...
}
// ...

Migration Strategy

Our last requirement was to allow a gradual migration process -- this module system would need to be able to coexist for some time with our existing global-classpath-style system. My solution was as follows:

  • Start converting classes starting at the top of the static, hand-maintained dependency list.
  • Create a BrowserifyAdapter.js file. This file will look like this:
global.fluid.domain.long.class.path.Foo = require("./path/to/Foo");
global.fluid.domain.long.class.path.Bar = require("./path/to/Bar");
//...
  • Convert the files at the top to use the CJS module style, making their module.exports the constructor function.
  • Remove them from the old dependency list, and add their old class path to the Browserify adapter. (Foo and Bar above).
  • BrowserifyAdapter.js becomes the single entry point of our browserify build. The browserified code makes converted classes available under their old classpaths, and any converted class uses CJS require()s internally.
  • Make the built Browserify file the first dependency in the old dependency list.
  • As more and more classes are converted, the Browserify build and adapter will grow, and the dependency list will shrink.
  • In the end, only the browserify build will be left. Our old build system will be obsolete and can be discontinued! The adapter file can be discarded, and the "real" main class becomes our entry point.

CommonJS/Browserify Summary

  • Modular javascript with encapsulated variables
  • Simple syntax, implied closure wrapping every file
  • Elegant conversion of modules for use in the browser, everything nicely wrapped
  • A subset of built-in NodeJS packages have been ported to the browser
  • NPM modules work out of the box
  • Excellent Mocha support
  • Grunt plugins for everything
  • Easy strategies for dependency cycles
  • Can be made to coexist with our existing module system
  • All dependencies have to be relative, or in a NPM module
  • All dependencies have to be defined on the same tick, no asynchronous loading

Conclusions

Disclaimer: As mentioned in the opening section, we were evaluating these technologies with respect to our app's requirements. This is not meant to be an authoritative answer that should apply to every project. YMMV.

Browserify!

As you may have guessed by now, we decided to use the CommonJS/Browserify solution. It simply meets our requirements more elegantly, and meets them all completely.

  • Dependency Management/Modular Javascript: Both AMD and CJS handle this well; they basically tie on this aspect. Either will suit you well. CJS works better for modules with large numbers of dependencies, if you have an aversion to syntactic sugar.
  • Simplicity of Syntax: CJS wins on this front. AMD forces you to define your module in an explicit factory function, whereas with CJS, the factory closure is implied and added automatically. Also using amdefine to make AMD modules work in a NodeJS environment adds even more boilerplate.
  • NPM Modules: Browserify handles NPM modules out of the box, as long as they don't depend on native node modules that haven't been ported to the browser. AMD requires special path configuration for each NPM module and all of its dependencies.
  • Mocha Support: Mocha can work well with both systems. With CJS it is slightly simpler, as AMD requires more configuration.
  • Grunt Tooling: Grunt tooling exists for both styles of browser builds. grunt-browserify still needs to be officially updated for Grunt 0.4.
  • Circular Dependencies: Dependency Cycles are much easier to handle in CJS. AMD workarounds do not work in the almond loader.
  • Migration Path: We were able to easily devise a migration strategy using CJS. An AMD solution would probably be similar.

AMD? YAGNI.

Another main point that drove us away from AMD: if you're building your Javascript into a single file, AMD is overkill. It introduces a brand new module format with its own configuration, slightly incompatible with the CJS modules we wanted to use -- requiring adapters and transforms -- and we wouldn't even be taking advantage of AMD's main feature: asynchronous module loading! The AMD module format is designed to work in the browser (as long as the define() and require() functions are defined by a loader beforehand), but in our use case Browserify'ed CJS modules work just as well. AMD isn't worth the extra boilerplate and subtle incompatibilities.

Full AMD using the async require.js loader can be useful during development. The extra HTTP requests don't matter for local testing, and you can simply refresh the browser to update your code. However, hooking up grunt watch to the browserify build (which we also hook up to jshint and our Mocha unit tests) means your build is ready in the time it takes to alt+tab to the browser. Async requiring for development isn't relevant in our preferred workflow.

If in the future we do need to load certain modules dynamically, we will likely use an AMD-like solution. However the core of our app will still be CJS. We still like AMD, we just like CommonJS better.

code javascript modules requirejs commonjs fluid