Mongoose / MongoDB performance enhancements and tweaks (Part 3)

bush_doing_it_wrong_1

Holy crap. MongoDB native drivers are SO much faster than updates through mongoose’s ORM.  Initially, when I set out on this quest to enhance mongoDB performance with node.js, I thought that modifying my queries and limiting the number of results returned would be sufficient to scale. I was wrong (thanks Bush for the imagery). It turns out that the overhead that mongoose adds for wrapping mongoDB documents within mongoose’s ORM is tremendous. Well, I should say tremendous for our use case.

In my previous two posts of tweaking mongoDB / mongoose for performance enhancements (Part 1 and Part 2), I discussed optimization of queries or making simple writes instead of reads. These were worthwhile improvements and the speed difference eventually added up to significant chunks, but I had no idea moving to the native driver would give me these types of improvements (See below).

Example #1: ~400 streams, insertion times.
These numbers are from after I made the initial tweaks after Part 1. Unfortunately, I don’t really have that good of a printout from mongotop, but this kind of gives you an idea. Look at the write times for streams and packets, flowing at a rate of ~400 streams. This is for 400 sources of packets, which all gets written and persisted. Here you can see the write time to streams is @ 193ms / 400 streams or 48.25 ms / 100 streams. Likewise, packet writing is 7.25 ms / 100 streams. (You can mostly ignore read time, these are used for data aggregates and computing analytics). Compare these with the results below:

ns total read write
streams 193ms 0ms 193ms
packets 30ms 1ms 29ms
devices 9ms 9ms 0ms

Example 2: ~1000 streams, insertion times.
You can see here that write time has dropped significantly. Writes to the packets collection is hovering at around 1.7 ms / 1000 streams, and writes to the streams collection hovers at around 7.6 ms / 100 streams. Respectively, that’s a 425% and a 635% improvement in query write times to the packets collection and streams collection. And don’t forget, I had already begun the optimizations to mongoose. Even after the tweaks I made in Part 2, these numbers still represent a better than 100% improvement to query times. Huge, right?

ns total read write
packets 186ms 169ms 17ms
devices 161ms 159ms 2ms
streams 97ms 21ms 76ms

I knew using the mongoDB native drivers would be faster, but I hadn’t guessed that they would be this much faster.  

To make these changes, I updated mongoose to the latest version 3.8.14, which enables queries to be made using the native mongoDB driver released by 10gen (github here: https://github.com/mongodb/node-mongodb-native) via Model.Collection methods.  These in turn call methods defined in node_modules/mongodb/lib/mongodb/collection/core.js, which essentially just execute raw commands in mongo. Using these native commands, one can take advantage of things like bulk inserts.

I still like mongoose, because it helps instantiate the same object whenever you need to create and save something. If something isn’t defined in the mongoose.Schema, that object won’t get persisted to mongoDB either. Furthermore, it can still be tuned to be semi-quick, so it all depends on the use case. It just so happens that when you’re inserting raw json into mongoDB or don’t need the validation and other middleware that mongoose provides, you can use the mongoDB native drivers while still using mongoose for the good stuff. That’s cool.

Here’s what the new improvements look like:

    var Stream = mongoose.model('Stream');
    async.waterfall([
        //Native mongoDB update returns # docs updated, update by default updates 1 document:
        function query1 (callback) {
            Stream.collection.update(query1, query1set, {safe: true}, function(err, writeResult) {
                if (err) throw err;
                if (writeResult == 1) {
                    callback('Found and updated call @ Query1');
                } else {
                    callback(null);
                }
            });
        },
        function(callback) {
            Stream.collection.update(query2, query2set, {safe: true}, function(err, writeResult) {
                if (err) throw err;
                if (writeResult == 1) {
                    callback('Found and updated stream @ Query2');
                } else {
                    pushNewStream(packet, cb);
                    callback('No stream found.  Pushing new stream.');
                }
            });

        }
    ], function(err, results) {});
Advertisements