Production Ready Node: Structure & Packaging

Node is still a pretty young platform, and people are still trying to figure out the best way to build large projects with it. With its highly modular core, and an even more modular ecosystem, it is easy to get lost in the variety of ways to go about anything.

I'm starting this series of articles about my experience in building production ready node projects. We'll be talking about structure, conventions, packages, and all around good things to do. To start, let's talk about one of the less talked about topics in the ecosystem, but in my mind, one of the more important - Project Structure.

Project Structure

The biggest step for new Node.js developers is to think in terms of modules. Not classes, interfaces, services, applications, eggs, gems, or whatever else your alternate language may prefer. Gone are the days of monolithic applications. Node applications are really collections of packages that, when composed, yield complex behavior.

Node applications

Collections of packages that, when composed, yield complex behavior

It is important to stop trying to build an application from the get go. Instead, focus on building tightly focused modules, which together compose a single package that provides additional functionality for use in an application. The project should be responsible for loading packages, each of which provides some unique piece of functionality to the project.

Some examples of common pieces of an application that you would most likely extract out into separate packages might be:

  • email
  • web server
  • api
  • core ( primary project functionality )
  • configuration
  • logging

Each of these packages could most likely stand on its own. But when pulled together in the context of a project, you end up with an application.

Anatomy of a Package

Package

A package is a directory with a package.json file. A package is composed of modules, each with a narrow focus, that together solve a pattern. By building a package that solves a pattern, and not a problem, the package can be used by other projects and applications to solve specific problems you may not have thought about.

Module

A module is a file with a .js, .json, or .node extension. Modules should adhere to the Unix philosophy: do one thing and do it very well. Modules can contain methods, private functions, classes, or any other construct that helps solve a pattern. Modules can, and should, export more than one thing.
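As a rough sketch of what that can look like, here is a small module that keeps one helper private and exports two related functions. The file name and the function names (text.js, collapse, slugify, truncate) are made up for illustration and are not part of any real package.

```javascript
// lib/text.js (hypothetical module name, for illustration only)
'use strict';

// private helper: not exported, only visible inside this module
function collapse( str ){
  return String( str ).trim().replace( /\s+/g, ' ' );
}

// lower cases a string and joins its words with dashes
function slugify( str ){
  return collapse( str )
    .toLowerCase()
    .replace( /[^a-z0-9]+/g, '-' )
    .replace( /^-|-$/g, '' );
}

// shortens a string to at most `max` characters
function truncate( str, max ){
  var clean = collapse( str );
  return clean.length > max ? clean.slice( 0, max ) : clean;
}

// the module still does one thing (text clean up),
// but exports more than one function for doing it
exports.slugify  = slugify;
exports.truncate = truncate;
```

The narrow focus is the point here: everything in the file is about cleaning up text, and the private helper never leaks out of the module.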

Submodule

A module can also be a directory with, at the very least, an index.js. This is also referred to as a submodule. The directory can hold one or more modules to break up logic and keep file size down to something more easily digestible by humans.

Entrypoint ( index.js )

Every directory in a node.js package should have an entry point, typically index.js. Calling require on the top level package should load packages, modules and submodules defined within the package.

.
|-- package-one
|   |-- moduleone
|   |   |-- submodule1.js
|   |   |-- submodule2.js
|   |   `-- index.js
|   |-- moduletwo
|   |   |-- submodule1.js
|   |   |-- submodule2.js
|   |   `-- index.js
|   |-- modulethree.js
|   |-- package.json
|   `-- index.js
`-- package-two

In this example, requiring package-one should give direct access to moduleone and moduletwo:

// package-one/index.js
exports.moduleone = require('./moduleone');  
exports.moduletwo = require('./moduletwo');  

In the example package structure above, a developer would be able to quickly require all of the functionality within the package by requiring it by name, or pull in individual modules and submodules:

var packageone  = require('package-one')
var moduleone   = require('package-one/moduleone')
var submodule1  = require('package-one/moduleone/submodule1')
var moduletwo   = require('package-one/moduletwo')
var modulethree = require('package-one/modulethree')
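To see the entry point behavior in action, the following sketch writes a cut down copy of the package-one layout into a temporary directory and requires it by path. The temporary directory prefix and the file contents are assumptions made up for the demo, and it needs a reasonably recent Node for the recursive mkdir option.

```javascript
var fs   = require('fs')
  , os   = require('os')
  , path = require('path')
  ;

// build a throwaway copy of the layout in the system temp directory
var root = fs.mkdtempSync( path.join( os.tmpdir(), 'package-demo-' ) );
var pkg  = path.join( root, 'package-one' );

fs.mkdirSync( path.join( pkg, 'moduleone' ), { recursive: true } );

fs.writeFileSync(
  path.join( pkg, 'package.json' )
  , JSON.stringify({ name: 'package-one', version: '0.0.1', main: 'index.js' })
);
fs.writeFileSync( path.join( pkg, 'moduleone', 'submodule1.js' ), "exports.name = 'submodule1';\n" );
fs.writeFileSync( path.join( pkg, 'moduleone', 'index.js' ), "exports.submodule1 = require('./submodule1');\n" );
fs.writeFileSync( path.join( pkg, 'index.js' ), "exports.moduleone = require('./moduleone');\n" );

// requiring the directory loads index.js, which pulls in the rest
var packageOne = require( pkg );

// requiring a submodule directly returns the very same cached module
var submodule1 = require( path.join( pkg, 'moduleone', 'submodule1' ) );
```

Both routes end up at the same module instance, because node caches modules by their resolved file name.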

Both directories and files should be a single, lower cased word containing only alpha characters. If you think two words are warranted, make a directory named with word two, and a file named with word one. Above all else, consistency is important. It is overly frustrating and time consuming trying to remember if a file was camel cased, lowercased, capitalized, used dashes, underscores, etc. Save everyone the trouble up front.

# bad
.
`-- SeriousCache.js

# good
.
`-- cache
    |-- serious.js
    `-- index.js
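If you want to enforce the naming rule rather than rely on memory, a check along these lines could run as part of a lint step. This is only a sketch; followsNamingConvention is a hypothetical helper, and conventional files such as README.md or dotfiles would need to be whitelisted separately.

```javascript
var path = require('path');

// true when the name (minus its extension) is a single,
// lower cased word made up of alpha characters only
function followsNamingConvention( filename ){
  var ext  = path.extname( filename );
  var base = path.basename( filename, ext );
  return /^[a-z]+$/.test( base );
}

var checked = [ 'SeriousCache.js', 'serious.js', 'cache', 'serious-cache.js' ]
  .map( function( name ){
    return { name: name, ok: followsNamingConvention( name ) };
  });
```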

While you may name files and directories what you wish, specific names are reserved for package developers to house certain pieces of logic and functionality.

A package should follow these conventions:

Directories

These directories are reserved for a special purpose within the application and are expected to follow these guidelines. If only a single file is expected, the directory should contain a single module named index, or the directory can be omitted in exchange for a module of the same name. Any of the folders can be omitted if they are not needed.

Directory   Purpose
conf        Application / package specific configuration defaults
lib         Contains modules for package specific functionality
test        Location for package specific tests and test data
commands    Houses management commands for the command line interface for the application
startup     Modules, plugins & functions that are to be loaded with the primary app start

Modules ( Files )

It is a good idea to establish some baseline module names that will be reserved for specific use cases. Keep in mind, these are just guidelines. It may make more sense for some files to be directories, or some directories to be files, and that is just fine. The important part here is establishing a little bit of convention.

File           Purpose
index.js       The entry point for the package. This will be loaded when someone executes require('your-package')
README.md      Package specific documentation and examples of usage
package.json   The npm package definition. You should define a name, version and dependencies at the least
resources.js   API resources that define endpoints for a given content type within the system
events.js      Predefined event emitters used to provide hooks into application behavior
.gitkeep       Empty placeholder file that lets git track an otherwise empty folder
.gitignore     Tells git to ignore specific files or patterns of files
.npmignore     Similar to .gitignore, but tells npm which files and directories to ignore when published or packed
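For the events.js convention, one possible shape is a module that exports a single shared emitter, which other packages subscribe to. This is a minimal sketch under that assumption; the 'cache:flush' event name and its payload are invented for the example.

```javascript
// events.js (sketch; the event name used below is hypothetical)
var EventEmitter = require('events').EventEmitter;

// a single shared emitter that other packages can subscribe to
var emitter = new EventEmitter();

module.exports = emitter;

// elsewhere in the application, a listener hooks into the behavior
var received = [];
emitter.on( 'cache:flush', function( key ){
  received.push( key );
});

// and the package announces that something happened
emitter.emit( 'cache:flush', 'sessions' );
```

Because require caches the module, every caller that requires events.js gets the same emitter instance.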

Private Packages

A common stumbling block with building out all of these packages is where to put them. While you can use npm's private packages feature, or stand up your own npm repository, it is often more efficient in terms of development to keep the packages local until they are ready to be extracted. All this really means for the structure of the project is to pick a top level directory where these packages will go. You can choose any name other than node_modules; I tend to go with packages or modules. Once you add that directory to NODE_PATH, node will include it when looking for modules via require, and you don't need to hard code relative paths everywhere. If you are on a unix-y system, that would look something like this:

export NODE_PATH=$NODE_PATH:$PWD/packages  
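A quick way to convince yourself that this works is to spawn a child node process with NODE_PATH set and require a local package by its bare name. The sketch below does that against a throwaway project in the temp directory; the directory prefix and the contents of project-core are made up for the demo.

```javascript
var fs           = require('fs')
  , os           = require('os')
  , path         = require('path')
  , execFileSync = require('child_process').execFileSync
  ;

// throwaway project with a single local package
var project     = fs.mkdtempSync( path.join( os.tmpdir(), 'node-path-demo-' ) );
var packagesDir = path.join( project, 'packages' );
var core        = path.join( packagesDir, 'project-core' );

fs.mkdirSync( core, { recursive: true } );
fs.writeFileSync(
  path.join( core, 'package.json' )
  , JSON.stringify({ name: 'project-core', version: '0.0.1', private: true, main: 'index.js' })
);
fs.writeFileSync( path.join( core, 'index.js' ), "exports.name = 'project-core';\n" );

// append the packages directory to whatever NODE_PATH already holds
var nodePath = [ process.env.NODE_PATH, packagesDir ]
  .filter( Boolean )
  .join( path.delimiter );

var env = Object.assign( {}, process.env, { NODE_PATH: nodePath } );

// the child resolves the package by name, no relative path involved
var output = execFileSync(
  process.execPath
  , [ '-e', "console.log( require('project-core').name )" ]
  , { env: env, cwd: project, encoding: 'utf8' }
).trim();
```

NODE_PATH is read when the process starts, which is why the variable is set on the child's environment rather than changed inside an already running process.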

The only thing this directory should contain is legitimate npm packages that you and your team author. The one recommendation I have here is to give every folder a single, unique prefix, put the real name in the package.json file, and mark it as private. As a convention, the prefix is usually the same name as the project. At work, our project is named alice, so all of the package directories start with alice-.

The key here is to use npm to manage each of the packages. Each package should define its dependencies just as if it were going directly into npm, even if there are duplicate dependencies across your private packages. We'll cover installation next.

// project-core/package.json
{
  "name": "project-core"
  ,"description":"Core functionality for project"
  ,"version": "0.0.1"
  ,"private":true
  ,"dependencies": {
      "debug":"^2.0.0"
  }

  ,"devDependencies": {

  }

  ,"scripts":{

  }
  ,"bin":{

  }
  ,"main":"index.js"
  ,"license":"LGPL 3"
}
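Conventions like the shared prefix and the private flag are easy to forget, so it can help to check them mechanically. The following is a sketch of such a check; checkManifest is a hypothetical helper and the rules it applies are just the ones described above.

```javascript
// returns a list of human readable problems;
// an empty list means the manifest follows the conventions
function checkManifest( manifest, prefix ){
  var problems = [];

  if( typeof manifest.name !== 'string' || manifest.name.indexOf( prefix ) !== 0 ){
    problems.push( 'name should start with "' + prefix + '"' );
  }

  if( !manifest.version ){
    problems.push( 'version is missing' );
  }

  if( manifest.private !== true ){
    problems.push( 'private should be true' );
  }

  if( typeof manifest.dependencies !== 'object' || manifest.dependencies === null ){
    problems.push( 'dependencies block is missing' );
  }

  return problems;
}

var good = checkManifest({
  name: 'project-core'
  , version: '0.0.1'
  , private: true
  , dependencies: { debug: '^2.0.0' }
}, 'project-' );

var bad = checkManifest( { name: 'core' }, 'project-' );
```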

All together, our new node project looks something like this:

# Project
. 
|-- packages/
|   |-- project-core
|   |   |-- lib/
|   |   |-- commands/
|   |   |-- startup/
|   |   |-- conf/
|   |   |-- test/
|   |   |-- package.json
|   |   |-- README.md
|   |   |-- events.js
|   |   |-- .npmignore
|   |   `-- index.js
|-- package.json
`-- index.js

Another nice benefit of doing things this way is that you eliminate nasty, deep relative paths.

var core = require('../../../../packages/project-core');

becomes

var core = require('project-core');  

The key with local packages like this is to always keep in mind that the goal is to eventually pull them out of the project and publish them to npm or a mirror, so they can be reused in other projects. Additionally, because you no longer require relative paths, when you do extract your private packages from the project, you don't have to rewrite any code!

Installation

As you might imagine, having multiple internal packages with their own defined dependencies breaks normal conventions with npm. That means we'll need to give npm a little bit of help. We can create a preinstall script that runs over each of the packages and installs its dependencies as if it were a top level package.

Given that multiple packages may have similar dependencies, we want to make sure to take the latest version of each dependency and not try to install duplicates. We can do that with the semver and npm modules:

// scripts/install.js

var debug         = console.log                        // the debug module isn't installed at this point. so console.log...
  , path          = require( 'path' )                  // node path module to resolve paths
  , fs            = require( 'fs' )                    // node fs module to check for files
  , util          = require('util')                    // util module for string formatting
  , npm           = require('npm')                     // npm module
  , semver        = require('npm/node_modules/semver') // npm's current version of semver
  , packages      = []                                 // final list of packages to be installed
  , latest        = {}                                 // cache to weed out dupes at the latest version
  , projectRoot                                        // root directory of this project
  , packagePath                                        // path to the directory where packages are kept
  ;


projectRoot = path.resolve( path.join( ( __dirname ||'.' ), '..' ) );  
packagePath = path.resolve( projectRoot, 'packages' );


// records the highest version requested for each dependency
// across all of the packages we find in the `latest` cache
function resolveDeps( dependencies ){
  dependencies = dependencies || {};
  var prefix = /\^|~/ // range operator, stripped before comparing versions
    , requested
    , known
    ;

  for( var dep in dependencies ){
    if( latest.hasOwnProperty( dep ) ){
      requested = dependencies[ dep ].replace( prefix, '' );
      known     = latest[ dep ].replace( prefix, '' );
      latest[ dep ] = semver.gt( requested, known ) ? dependencies[ dep ] : latest[ dep ];
    } else {
      latest[ dep ] = dependencies[ dep ];
    }
  }
}

// find all of the package files
fs.readdirSync( packagePath ).forEach( function( dir ){
  var packagefile = path.join( packagePath, dir, 'package.json' )

  debug('reading %s', packagefile )

  // if there is a package.json file
  // require it and record the highest version of each dependency
  if( fs.existsSync( packagefile ) ){
    resolveDeps( require(packagefile).dependencies )
  }
});

// flatten object into an array npm will understand
packages = Object  
      .keys( latest )
      .sort()
      .map( function( key ){
        return util.format( '%s@%s', key, latest[ key ] );
      })

debug( 'installing:\n%s', packages.join('\n') )


// use npm to install all of the deps into the primary project dir
npm.load(function( err ){
  if( err ){
    console.error( err );
    process.exit( 1 );
  }

  npm.prefix = projectRoot;
  npm.commands.install(packages, function(err, packagelist, tree, pretty){
    if( err ){
      console.error( err );
      process.exit( 1 );
    }
    console.log( pretty )
  })
})

Then in the project's package.json you add this to the preinstall script hook.

// project/package.json
{
  "scripts":{
    "preinstall": "node ./scripts/install"
  }
}

That is it! npm install will work the way you want it to without any additional steps or setup. Our little script runs over all of the folders in our packages directory, reads the dependencies block and filters out duplicates, keeping the highest version of each, and installs them into the node_modules directory of the project just as if they were defined project dependencies. Again, when we decide to extract a package, we don't need to change anything except our package.json file.
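The de-duplication step is the heart of the script, so here it is in isolation as a sketch that does not need npm or semver. It assumes every version is a plain x.y.z string with an optional ^ or ~ in front, which is a simplification of what the semver module handles; mergeDependencies and compareVersions are hypothetical names.

```javascript
// strips a leading ^ or ~ so the bare versions can be compared
function strip( range ){
  return range.replace( /^[\^~]/, '' );
}

// compares two plain x.y.z versions; positive when a is newer than b
function compareVersions( a, b ){
  var pa = a.split( '.' ).map( Number );
  var pb = b.split( '.' ).map( Number );

  for( var i = 0; i < 3; i++ ){
    var diff = ( pa[ i ] || 0 ) - ( pb[ i ] || 0 );
    if( diff !== 0 ) return diff;
  }
  return 0;
}

// keeps the highest requested version of every dependency
// and returns an install list in the form <pkg>@<version>
function mergeDependencies( manifests ){
  var latest = {};

  manifests.forEach( function( manifest ){
    var deps = manifest.dependencies || {};

    Object.keys( deps ).forEach( function( name ){
      var seen = Object.prototype.hasOwnProperty.call( latest, name );
      if( !seen || compareVersions( strip( deps[ name ] ), strip( latest[ name ] ) ) > 0 ){
        latest[ name ] = deps[ name ];
      }
    });
  });

  return Object.keys( latest ).sort().map( function( name ){
    return name + '@' + latest[ name ];
  });
}

var installList = mergeDependencies([
  { dependencies: { debug: '^2.0.0', mout: '~0.9.0' } }
  , { dependencies: { debug: '^2.1.0' } }
  , {} // a package without a dependencies block
]);
```

With those three manifests, debug is only listed once, at the higher of the two requested versions.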

The one caveat here is that you have to have npm installed as a dependency of the project. I typically don't recommend keeping packages in node_modules under source control. However, this is one of the situations that warrants it.
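Since a missing npm module would otherwise fail with a fairly cryptic stack trace, a small guard can make the requirement explicit. This is a sketch; requireOrExplain is a hypothetical helper and the wording of the message is up to you.

```javascript
var util = require('util');

// requires a module by name, and replaces the generic
// "Cannot find module" failure with a more helpful message
function requireOrExplain( name ){
  try {
    return require( name );
  } catch( err ){
    // anything other than a missing module is a real error, pass it on
    if( err.code !== 'MODULE_NOT_FOUND' ) throw err;

    throw new Error( util.format(
      '"%s" is not installed; the install script needs it as a dependency of the project'
      , name
    ));
  }
}

// a core module always resolves, so this one is safe to run anywhere
var pathModule = requireOrExplain( 'path' );
```

In the install script, the same helper could wrap the require of npm so a fresh checkout fails with a clear instruction.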

Testing

Similar to installation, we'll want to make sure that our tests for internal packages are included when the main test runner for the project runs. npm test is actually a script just like our preinstall script, which makes it rather trivial to set up. I'm going to illustrate an example using Mocha, but you can use whatever testing framework you want. The process is basically the same.

// scripts/test.js

var child_process = require('child_process')                // child process for spawning mocha
  , clone         = require('mout/lang/clone')              // object clone module
  , fs            = require('fs')                           // fs module
  , path          = require('path')                         // path module
  , util          = require('util')                         // util module
  , production    = (process.env.NODE_ENV === 'production') // boolean flag if we are in production mode
  , env           = clone( process.env )                    // clone of current process env
  , debug         = require('debug')( 'scripts:runner')     // debug logger for this script
  , npath         = ( env.NODE_PATH || '' ).split( path.delimiter ) // cache of node path split into an array
  , mocha                                                   // mocha child process
  , reporter                                                // stream that receives mocha's output
  ;


// add our packages directory to node require path
// so we don't have to require( ../../../../../ )

npath.push(path.resolve(__dirname,'..','packages') )  
npath = npath.join( path.delimiter )

env.NODE_PATH = npath  

if( production ){  
  reporter = fs.createWriteStream('tap.xml',{
    flags:'w'
    ,encoding:'utf8'
  })
} else {
  reporter = process.stdout
}

// spin up mocha configured the way I want it
mocha = child_process.spawn("mocha", [  
  "--harmony"
  , "--growl"
  , "--recursive"
  ,"--timeout=10000"
  , util.format("--reporter=%s", production ? "xunit":"spec")
  , "test"
  , "packages/**/test/*.js"
], { env:env });

mocha.on('exit', function( code, sig ){  
  process.exit( code );
});

mocha.stdout.pipe( reporter )  
mocha.stderr.pipe( reporter )  

This little script does three primary things.

  1. Adds the corrected NODE_PATH to the environment
  2. Spawns a process with the new environment to run tests.
  3. Tells mocha to look for js files in test directories, recursively
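The first step is the part most worth getting right, so here it is pulled out into a small pure function as a sketch. withPackagesOnNodePath is a hypothetical name; unlike the script above it uses Object.assign instead of a clone module and skips empty entries, which are both choices made for this example.

```javascript
var path = require('path');

// returns a copy of `env` with `packagesDir` appended to NODE_PATH;
// the original object is left untouched
function withPackagesOnNodePath( env, packagesDir ){
  var copy    = Object.assign( {}, env );
  var entries = ( copy.NODE_PATH || '' )
    .split( path.delimiter )
    .filter( Boolean ); // drop empty entries from an unset or sloppy NODE_PATH

  if( entries.indexOf( packagesDir ) === -1 ){
    entries.push( packagesDir );
  }

  copy.NODE_PATH = entries.join( path.delimiter );
  return copy;
}

var original = { NODE_PATH: '/usr/lib/node' };
var patched  = withPackagesOnNodePath( original, '/opt/project/packages' );
```

The returned object is what you would hand to child_process.spawn as the env option.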

By putting our packages directory on NODE_PATH before spawning our process, we don't have to worry about things breaking on other people's machines, most importantly a build box. Additionally, if NODE_ENV is set to production, it will output an XUnit compliant xml document with test results for something like Jenkins or Bamboo to parse.
Now we just add this as the test script in our package.json, and running npm test will run all of our tests.

// project/package.json
{
  "scripts":{
    "preinstall": "node ./scripts/install",
    "test": "node ./scripts/test"
  }
}

Docker

Docker is becoming a very popular way to deploy applications giving developers a bit more control over their deployment environment and production stack. Under the assumption that you will most likely deploy more than one node project with similar requirements, it is usually a good idea to create a base image first.

Base Image

The base image is an image that will be used as the starting point for images that will contain other node.js projects. The base image should have the base OS, system dependencies, etc. that will be common across other projects. All you need to do is create a Dockerfile, which is a lot like a Makefile, but for Docker images. I recommend keeping a separate code repository for base images, but it might look something like this:

# company/nodejs

FROM ubuntu:10.04  
WORKDIR /opt/

RUN apt-get update  
RUN apt-get install -y build-essential libssl-dev uuid-dev wget curl git-core vim htop openjdk-6-jdk openjdk-6-jre openjdk-6-jre-lib openssh-client openssh-server libjpeg62 libjpeg62-dev libuuid1 libtiff4 libtiff4-dev libtool libgraphviz4 libgraphviz-dev libevent-dev libevent-core-1.4-2 libevent-extra-1.4-2 libevent-1.4-2 libxml2 libxml2-dev libxslt1.1 libxslt1-dev

ADD ./node-v0.10.30.tar.gz /opt/

# simple node binary install
WORKDIR /opt/node-v0.10.30-linux-x64/

RUN ln -s $PWD/lib/node_modules/npm/cli.js /usr/local/bin/npm  
RUN ln -s $PWD/bin/node /usr/local/bin/  
RUN echo node $(node -v) && echo npm $(npm -v)

## install NVM, grunt & Bower

RUN bash -c 'touch ~/.bashrc && curl https://raw.githubusercontent.com/creationix/nvm/v0.17.1/install.sh | bash && source ~/.nvm/nvm.sh && nvm install 0.8 && nvm install 0.10 && nvm install 0.11 && nvm alias default 0.10 && nvm use default && npm install -g jshint grunt-cli bower'

WORKDIR /opt  

Project Image

The project image, as you might guess, is built from the Dockerfile that creates the environment specific to the project and, by default, runs whatever primary application needs to run.

FROM company/nodejs

VOLUME ["/etc/project"]

ADD . /opt/project/  
WORKDIR /opt/project

ENV NODE_PATH $NODE_PATH:/opt/project/packages

# make sure to install it really hard
RUN rm -rf node_modules  
RUN npm install && node ./scripts/install && npm link

EXPOSE 3000  
CMD <RUN APP>  

This image will use the base image as a starting point, so we don't have to worry about setting up the environment again. We just need to install the project and run it.

Recap

We've covered a lot of information here, but it really boils down to a couple of key points:

  • Establish Package & module conventions
  • Build a project shell / harness ( not an application )
  • Create a directory for private packages
  • Create a custom install script
  • Create a custom test script
  • Optionally make a Dockerfile

Once you have done these couple of things, expanding the project becomes an easier task, as it really comes down to building out modules and packages rather than overhauling large chunks of your project. More importantly, having these internally defined private packages allows larger teams to work on discrete pieces of functionality in the code base in a much more efficient manner, because all of the boilerplate and bootstrapping is done.
