Monday, July 14, 2014

snapJob Part IV : Scaling node.js and ensuring high availability using "cluster"



Part III of this project ("Storing and loading data with Apache CouchDB and node.js") can be found here.

By nature, node.js is single-threaded. Whether you have a 2, 4, 8, 16 (or more) core architecture, a single node.js process will not take full advantage of it.

Here comes cluster!

This is a node.js API that is well described on other websites, such as here and there.

Cluster allows an existing application to be scaled across every core of a single machine, so this is "single-machine" scaling.
It acts as a load balancer, allowing multiple instances of our application to share the same port for handling web requests.

Implementing "cluster" doesn't even require any refactoring at all. In fact, the only thing I did in the snapJob RESTful API was to rename app.js to snapJob.js, and recreate the app.js file with the following content:

var cluster = require('cluster');

if (cluster.isMaster) {
    var cpuCount = require('os').cpus().length;

    // Create a worker for each CPU
    for (var i = 0; i < cpuCount; i += 1) {
        cluster.fork();
    }

    // Replace any worker that dies, so the master never routes requests to a dead child
    cluster.on('exit', function (worker) {
        console.log('Worker ' + worker.id + ' died !');
        cluster.fork();
    });
} else {
    // Worker process: just load the actual application
    require('./snapJob');
}

Note: I also added a reference to the new dependency "cluster" in the package.json file and ran the "npm install" command.
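For reference, a minimal sketch of what that dependency entry might look like in package.json (the version specifier is just an assumption; also note that recent node versions ship cluster as a core module, so require('cluster') loads the built-in one regardless):

{
  "dependencies": {
    "cluster": "*"
  }
}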

Now, if I run this:
/usr/bin/node app.js

The node.js output shows (I have 8 cores on my machine):
snapJob API running on http://localhost:8080 on worker 1
snapJob API running on http://localhost:8080 on worker 8
snapJob API running on http://localhost:8080 on worker 6
snapJob API running on http://localhost:8080 on worker 7
snapJob API running on http://localhost:8080 on worker 2
snapJob API running on http://localhost:8080 on worker 4
snapJob API running on http://localhost:8080 on worker 3
snapJob API running on http://localhost:8080 on worker 5


Cluster first runs a master process. The master gets the number of CPUs on the machine and forks a child process for each of them. If one of those processes exits, it is recreated; if we didn't do that, the master could end up load-balancing requests to dead child processes, and the application would eventually stop responding.
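As a side note, the 'exit' handler also receives the exit code and the signal that terminated the worker, which can help distinguish a crash from a deliberate shutdown. A small variation of the handler above (just a sketch, not what snapJob uses):

cluster.on('exit', function (worker, code, signal) {
    // 'code' is the exit code when the worker exited on its own,
    // 'signal' is the name of the signal that killed it (e.g. 'SIGKILL')
    console.log('Worker ' + worker.id + ' died (code: ' + code + ', signal: ' + signal + ')');
    cluster.fork();
});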

Obviously, cluster.isMaster is true only in the master process. In every child, I just call require('./snapJob'), which simply loads the application as usual.
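The snapJob.js startup code is not reproduced in this article, but the "on worker N" suffix visible in the output above can be produced by checking cluster.worker, which is only defined inside a forked child. Here is a minimal sketch, assuming an Express app (the port and variable names are illustrative, not the actual snapJob code):

var cluster = require('cluster');
var express = require('express');
var app = express();
var port = 8080; // illustrative; the real snapJob port comes from its configuration

// ... routes and middleware would be set up here ...

app.listen(port, function () {
    // cluster.worker is only defined in a forked child;
    // it is undefined when the file is run directly with "node snapJob.js"
    var workerInfo = cluster.worker ? ' on worker ' + cluster.worker.id : '';
    console.log('snapJob API running on http://localhost:' + port + workerInfo);
});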

Now if I run this:
/usr/bin/node snapJob.js

Then I just run the application the way it was before, on a single thread, and the output is:
snapJob API running on http://localhost:3000


Testing cluster
To be sure that requests are correctly load-balanced across all processes, I just added something new in the /util/log.js file:

var cluster = require('cluster');


and
    this.cleanRequest = function (req, callback) {
        if (callback) callback(
            req === undefined ?
                undefined :
                {
                    ip: req.ip,
                    url: req.url,
                    workerId: cluster.worker ? cluster.worker.id : undefined,
                    api_key: req.headers["api_key"]
                });
    };

In other words, if we are running in a clustered context (i.e. the application was launched with "node app.js" rather than "node snapJob.js"), we record which worker the log entry was produced on.
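For illustration, here is a hypothetical logInfo that would use cleanRequest this way; the actual log.js feeds the logs you will see in Kibana below, so its real implementation is different:

    // Hypothetical sketch: not the actual snapJob logger implementation
    this.logInfo = function (message, req) {
        this.cleanRequest(req, function (cleanedRequest) {
            // cleanedRequest contains ip, url, workerId and api_key (or is undefined)
            console.log(JSON.stringify({
                level: 'info',
                message: message,
                request: cleanedRequest
            }));
        });
    };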

Just run the application as described in the previous articles, and go to Kibana to watch the logs:



As you can see, workers 6, 8, 4, 5, and 1 handled requests, proving that load balancing works correctly.

Another test you can see in the application sources consists in explicitly exiting the application when the test method located in models/test.js is called:
exports.dummyTestMethod = {
    'spec': {
        description : "Test method",
        path : "/test/",
        method: "GET",
        summary : "This is a dummy test method",
        type : "void",
        nickname : "dummyTestMethod",
        produces : ["application/json"]
    },
    'action': function (req, res) {
        logger.logInfo('dummyTestMethod called', req);
        // Kill the worker on purpose; note that the next line is never reached,
        // so this test call never actually gets a response
        process.exit(1);
        res.send(JSON.stringify("test is ok"));
    }
};

The goal here is to see whether the child process is correctly recreated when it dies. Call the dummy test method via the swagger interface (http://localhost:8080/#!/test); remember to run the app with "node app.js --masterAPIKey=06426e19-d807-4921-a668-4708287d8878", put the masterAPIKey in the text field at the top right of the swagger interface, and click "Explore" before you try to call the test method. This produces the following output in the node.js console:
snapJob API running on http://localhost:8080 on worker 5
snapJob API running on http://localhost:8080 on worker 2
snapJob API running on http://localhost:8080 on worker 4
snapJob API running on http://localhost:8080 on worker 6
snapJob API running on http://localhost:8080 on worker 1
snapJob API running on http://localhost:8080 on worker 3
snapJob API running on http://localhost:8080 on worker 8
snapJob API running on http://localhost:8080 on worker 7
Worker 3 died !
Worker 8 died !
Worker 7 died !
snapJob API running on http://localhost:8080 on worker 10
snapJob API running on http://localhost:8080 on worker 9
snapJob API running on http://localhost:8080 on worker 11
Worker 11 died !
snapJob API running on http://localhost:8080 on worker 12

Which means dying processes are correctly re-instantiated :)

Pretty nice, huh? That wasn't so hard after all...

Presentation of the project can be found here.
Source code for this application can be downloaded from here.

Next part (snapJob Part V : Serving files with nginx) can be found here
