mongodb

mongodb-query

aggregation-framework

row-number

The idea is to return a kind of row number to a mongodb aggregate command/ pipeline. Similar to what we've in an RDBM.

It should be a unique number, not important if it matches exactly to a row/number.

For a query like:

[ { $match: { "author" : { $ne: 1 } } }, { $limit: 1000000 } ]

I'd like to return:

{ "rownum" : 0, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "rownum" : 1, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "rownum" : 2, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "rownum" : 3, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "rownum" : 4, "title" : "Iliad", "author" : "Homer", "copies" : 10 }

Is it possible to generate this rownum in mongodb?

Solution 1

Not sure about the performance in big queries, but this is at least an option.

You can add your results to an array by grouping/pushing and then unwind with includeArrayIndex like this:

[
  {$match: {author: {$ne: 1}}},
  {$limit: 10000},
  {$group: {
    _id: 1,
    book: {$push: {title: '$title', author: '$author', copies: '$copies'}}
  }},
  {$unwind: {path: '$book', includeArrayIndex: 'rownum'}},
  {$project: {
    author: '$book.author',
    title: '$book.title',
    copies: '$book.copies',
    rownum: 1
  }}
]

Now, if your database contains a big amount of records, and you intend to paginate, you can use the $skip stage and then $limit 10 or 20 or whatever you want to display per page, and just add the number from the $skip stage to your rownum and you'll get the real position without having to push all your results to enumerate them.

Solution 2

Starting in Mongo 5, it's a perfect use case for the new $setWindowFields aggregation operator and its $documentNumber operation:

// { x: "a" }
// { x: "b" }
// { x: "c" }
// { x: "d" }
db.collection.aggregate([
  { $setWindowFields: {
    sortBy: { _id: 1 },
    output: { rowNumber: { $documentNumber: {} } }
  }}
])
// { x: "a", rowNumber: 1 }
// { x: "b", rowNumber: 2 }
// { x: "c", rowNumber: 3 }
// { x: "d", rowNumber: 4 }

$setWindowFields allows us to work for each document with the knowledge of previous or following documents. Here we just need the information of the place of the document in the whole collection (or aggregation intermediate result), as provided by $documentNumber.

Note that we sort by _id because the sortBy parameter is required, but really, since you don't care about the ordering of your rows, it could be anything you'd like.

Solution 3

Another way would be to keep track of row_number using "$function"

[{ $match: { "author" : { $ne: 1 } }}  , { $limit: 1000000 },
{
    $set: {
      "rownum": {
        "$function": {
          "body": "function() {try {row_number+= 1;} catch (e) {row_number= 0;}return row_number;}",
          "args": [],
          "lang": "js"
        }
      }
    }
  }]

I am not sure if this can mess up something though!