I am currently using MongoDB with millions of data records. I discovered one thing that's pretty annoying.

When I use 'count()' function with a small number of queried data collection, it's very fast. However, when the queried data collection contains thousand or even millions of data records, the entire system becomes very slow.

I made sure that I have indexed the required fields.

Has anybody encountered an identical thing? How do you do to improve that?

Solution 1

There is now another optimization than create proper index.

db.users.ensureIndex({name:1});
db.users.find({name:"Andrei"}).count();

If you need some counters i suggest to precalculate them whenever it possible. By using atomic $inc operation and not use count({}) at all.

But mongodb guys working hard on mongodb, so, count({}) improvements they are planning in mongodb 2.1 according to jira bug.

Solution 2

You can ensure that the index is really used without any disk access.

Let's say you want to count records with name : "Andrei"

You ensure index on name (as you've done) and

db.users.find({name:"andrei"}, {_id:0, name:1}).count()

you can check that it is the fastest way to count (except with precomputing) by checking if

db.users.find({name:"andrei"}, {_id:0, name:1}).explain() 

displays a index_only field set to true.

This trick will ensure that your query will retrieve records only from ram (index) and not from disk.

Solution 3

You are pretty much out of luck for now, count in mongodb is awful and won't be getting better in the near future. See: https://jira.mongodb.org/browse/SERVER-1752

From experience, you should pretty much never use it unless it's a one time thing, something that occurs very rarely, or your database is pretty small.

As @Andrew Orsich stated, use counters whenever possible (the downfall to counters is the global write lock, but better than count() regardless).

Solution 4

For me the solution was change index to sparse. It depend on specific situation, just give it a try if you can.

db.Account.createIndex( { "date_checked_1": 1 }, { sparse: true } )

db.Account.find({    
     "dateChecked" : { $exists : true }    
}).count()

318 thousands records in collection

  • 0.31 sec - with sparse index
  • 0.79 sec - with non-sparse index

Solution 5

Adding my observations based on latest version of mongodb 4.4. I have 0.80 TB collection size.

I have created an index (UserObject.CountryID) for my collection. and ran this query.

db.users.aggregate([
{
    $match : {
        "UserObject.CountryID" : 3
    }
}]).group({_id: "Count", count: {$sum: 1}})

It took total

  • 06800 ms to fetch count of around 13 million (1.3 crore) records searching 0.80 TB collection size.
  • 16274 ms to fetch count of around 35 million (3.5 crore) records searching 0.80 TB collection size.
  • 41615 ms to fetch count of around 42 million (4.2 crore) records searching 0.80 TB collection size.