javascript

mongodb

mongodb-query

aggregation-framework

mapreduce

Suppose that I have a series of documents with the following format:

{
    "_id": "3_0",
    "values": ["1", "2"]
}

and I would like to obtain a projection of the array's values concatenated in a single field:

{
    "_id": "3_0",
    "values": "1_2"
}

Is this possible? I have tried $concat but I guess I can't use $values as the array for $concat.

Solution 1

In modern MongoDB releases (3.4 and later) you can. You still cannot apply an array "directly" to $concat, but you can use $reduce to work through the array elements and build the string:

db.collection.aggregate([
  { "$addFields": {
    "values": { 
      "$reduce": {
        "input": "$values",
        "initialValue": "",
        "in": {
          "$cond": {
            "if": { "$eq": [ { "$indexOfArray": [ "$values", "$$this" ] }, 0 ] },
            "then": { "$concat": [ "$$value", "$$this" ] },
            "else": { "$concat": [ "$$value", "_", "$$this" ] }
          }    
        }
      }        
    }
  }}
])

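The $indexOfArray check is there so that the "_" underscore is not prepended when the element is at the "first" index of the array. Note that $indexOfArray returns the index of the first match, so a later element that repeats the first element's value is also treated as "first" and gets no separator.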
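To see what that $reduce expression does step by step, here is a minimal sketch in plain JavaScript rather than MongoDB code. The function name joinLikeReduce is hypothetical; acc plays the role of $$value, item the role of $$this, and Array.prototype.indexOf stands in for $indexOfArray.

```javascript
// Hypothetical helper: mirrors the $reduce / $cond / $indexOfArray expression.
function joinLikeReduce(values) {
  return values.reduce(function (acc, item) {
    // Like $indexOfArray, indexOf returns the index of the FIRST match.
    const isFirst = values.indexOf(item) === 0;
    return isFirst ? acc + item : acc + "_" + item;
  }, ""); // "" corresponds to initialValue
}

console.log(joinLikeReduce(["1", "2"])); // prints 1_2
console.log(joinLikeReduce(["1", "1"])); // prints 11 (the repeated value is treated as "first")
```

If repeated values are possible in your data, one alternative worth considering is to test whether $$value is still an empty string instead of looking up the index.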

My additional "wish" (described further down) has also been answered, since $sum can now be applied directly to an array from the document:

db.collection.aggregate([
  { "$addFields": {
    "total": { "$sum": "$items.value" }
  }}
])

This comes up fairly often with aggregation operators that take an array of items. The distinction is that the operator expects an "array" of "arguments" written out in the pipeline definition, as opposed to an "array element" present in the current document.

In releases without those operators, the only way you can really concatenate the items of an array present in the document is to use some kind of JavaScript option, as with this example in mapReduce:

db.collection.mapReduce(
    function() {
        emit( this._id, { "values": this.values.join("_") } );
    },
    function() {},
    { "out": { "inline": 1 } }
)

Of course, if you are not actually aggregating anything, then possibly the best approach is to simply do that "join" operation in your client code when post-processing the query results. But if the joined value needs to be used for some purpose across documents, then mapReduce is going to be the only place you can do it.
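The client-side approach could look roughly like the sketch below. It assumes the query results are already in memory as plain objects; the joinValues helper and the second sample document are made up for illustration.

```javascript
// Sample documents standing in for query results; "3_1" is invented sample data.
const docs = [
  { _id: "3_0", values: ["1", "2"] },
  { _id: "3_1", values: ["7", "8", "9"] }
];

// Hypothetical helper: returns copies of the documents with "values" joined.
function joinValues(documents, separator) {
  return documents.map(function (doc) {
    // Object.assign builds a shallow copy, so the original document is untouched.
    return Object.assign({}, doc, { values: doc.values.join(separator) });
  });
}

const joined = joinValues(docs, "_");
console.log(joined);
```

Copying each document rather than mutating it keeps the raw query results available for any other processing.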


I could add that "for example" I would love for something like this to work:

{
    "items": [
        { "product": "A", "value": 1 },
        { "product": "B", "value": 2 },
        { "product": "C", "value": 3 }
    ]
}

And in aggregate:

db.collection.aggregate([
    { "$project": {
        "total": { "$add": [
            { "$map": {
                "input": "$items",
                "as": "i",
                "in": "$$i.value"
            }}
        ]}
    }}
])

But it does not work that way, because $add expects arguments as opposed to an array from the document. Sigh! :( Part of the "by design" reasoning could be that, "just because" an array or "list" of singular values is passed in from the result of the transformation, it is not "guaranteed" that those are actually "valid" singular numeric values of the type the operator expects. At least not with the currently implemented methods of "type checking".

That means for now we still have to do this:

db.collection.aggregate([
    { "$unwind": "$items" },
    { "$group": {
        "_id": "$_id",
        "total": { "$sum": "$items.value" }
    }}
])
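For readers less familiar with these two stages, here is a rough plain-JavaScript sketch of what $unwind followed by $group with $sum computes. It is an illustration of the idea only, not how the server implements it, and the _id value of 1 is an assumption since the sample document does not show one.

```javascript
// Sample input; the _id value is assumed, the items come from the example document.
const docs = [
  { _id: 1, items: [
    { product: "A", value: 1 },
    { product: "B", value: 2 },
    { product: "C", value: 3 }
  ]}
];

// "$unwind": emit one document per array element.
const unwound = docs.flatMap(function (doc) {
  return doc.items.map(function (item) {
    return { _id: doc._id, items: item };
  });
});

// "$group" with "$sum": accumulate items.value per _id.
const totals = new Map();
for (const doc of unwound) {
  totals.set(doc._id, (totals.get(doc._id) || 0) + doc.items.value);
}

const result = Array.from(totals, function ([_id, total]) {
  return { _id: _id, total: total };
});
console.log(result); // [ { _id: 1, total: 6 } ]
```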

And also sadly there would be no way to apply such a grouping operator to concatenate strings either.

So you can hope for some sort of change on this, or hope for some change that allows an externally scoped variable to be altered within the scope of a $map operation in some way. Better yet, a new $join operation would be welcome as well. But these do not exist as of writing, and probably will not for some time to come.

Solution 2

You can use the $reduce operator together with the $substr operator. $reduce prefixes every element with "_", and $substr then drops the leading underscore:

db.collection.aggregate([
  {
    $project: {
      values: {
        $reduce: {
          input: '$values',
          initialValue: '',
          in: {
            $concat: ['$$value', '_', '$$this']
          }
        }
      }
    }
  },
  {
    $project: {
      values: { $substr: ['$values', 1, -1] }
    }
  }
])
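The same two-step idea can be sketched in plain JavaScript to make the intermediate value visible. The function name joinThenStrip is hypothetical, and slice(1) is used here as a stand-in for the $substr call that skips the first character and keeps the rest.

```javascript
// Hypothetical helper mirroring the two $project stages.
function joinThenStrip(values) {
  // Stage 1: every element is prefixed with "_", including the first one,
  // so ["1", "2"] becomes "_1_2".
  const prefixed = values.reduce(function (acc, item) {
    return acc + "_" + item;
  }, "");
  // Stage 2: drop the leading separator and keep the remainder.
  return prefixed.slice(1);
}

console.log(joinThenStrip(["1", "2"])); // prints 1_2
console.log(joinThenStrip(["1", "1"])); // prints 1_1
```

Because the separator is always added and then removed once, this variant does not depend on where a value first appears in the array.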

Solution 3

Starting in MongoDB 4.4, the $function aggregation operator allows applying a custom JavaScript function to implement behaviour not supported by the MongoDB Query Language.

For instance, in order to concatenate an array of strings:

// { "_id" : "3_0", "values" : [ "1", "2" ] }
db.collection.aggregate([
  { $set:
    { "values":
      { $function: {
          body: function(values) { return values.join('_'); },
          args: ["$values"],
          lang: "js"
      }}
    }
  }
])
// { "_id" : "3_0", "values" : "1_2" }

$function takes 3 parameters:

  • body, which is the function to apply; its parameter is the array to join.
  • args, which lists the fields from the document that are passed to the body function as arguments. In our case, "$values".
  • lang, which is the language in which the body function is written. Only js is currently available.
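Since body is an ordinary JavaScript function, it can be tried out on its own before being placed in a pipeline. The sketch below is a hypothetical, more defensive variant (the name joinOrNull and the choice of null as a fallback are assumptions, not part of the original answer) that avoids calling join on something that is not an array.

```javascript
// Hypothetical variant of the body function.
function joinOrNull(values) {
  if (!Array.isArray(values)) {
    // Assumption: null is an acceptable placeholder when the field is not an array.
    return null;
  }
  return values.join("_");
}

console.log(joinOrNull(["1", "2"])); // prints 1_2
console.log(joinOrNull("not an array")); // prints null
```

A function like this could be passed as body in place of the inline one, with args and lang unchanged.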