In the following example, "Algorithms in C++" is present twice.

The $unset modifier can remove a particular field but how to remove an entry from a field?

  "_id" : ObjectId("4f6cd3c47156522f4f45b26f"), 
  "favorites" : {
    "books" : [
      "Algorithms in C++",    
      "The Art of Computer Programming", 
      "Graph Theory",      
      "Algorithms in C++"
  "name" : "robert"

Solution 1

As of MongoDB 2.2 you can use the aggregation framework with an $unwind, $group and $project stage to achieve this:

db.users.aggregate([{$unwind: '$favorites.books'},
                    {$group: {_id: '$_id',
                              books: {$addToSet: '$favorites.books'},
                              name: {$first: '$name'}}},
                    {$project: {'favorites.books': '$books', name: '$name'}}

Note the need for the $project to rename the favorites field, since $group aggregate fields cannot be nested.

Solution 2

The easiest solution is to use setUnion (Mongo 2.6+):

    {'$addFields': {'favorites.books': {'$setUnion': ['$favorites.books', []]}}}

Another (more lengthy) version that is based on the idea from @kynan's answer, but preserves all the other fields without explicitly specifying them (Mongo 3.4+):

> db.users.aggregate([
    {'$unwind': {
        'path': '$favorites.books',
        // output the document even if its list of books is empty
        'preserveNullAndEmptyArrays': true
    {'$group': {
        '_id': '$_id',
        'books': {'$addToSet': '$favorites.books'},
        // arbitrary name that doesn't exist on any document
        '_other_fields': {'$first': '$$ROOT'},
      // the field, in the resulting document, has the value from the last document merged for the field. (c) docs
      // so the new deduped array value will be used
      '$replaceRoot': {'newRoot': {'$mergeObjects': ['$_other_fields', "$$ROOT"]}}
    // this stage wouldn't be necessary if the field wasn't nested
    {'$addFields': {'favorites.books': '$books'}},
    {'$project': {'_other_fields': 0, 'books': 0}}

{ "_id" : ObjectId("4f6cd3c47156522f4f45b26f"), "name" : "robert", "favorites" : 
{ "books" : [ "The Art of Computer Programmning", "Graph Theory", "Algorithms in C++" ] } }    

Solution 3

What you have to do is use map reduce to detect and count duplicate tags .. then use $set to replace the entire books based on { "_id" : ObjectId("4f6cd3c47156522f4f45b26f"),

This has been discussed sevel times here .. please seee

Removing duplicate records using MapReduce

Fast way to find duplicates on indexed column in mongodb

How to remove duplicate record in MongoDB by MapReduce?

Solution 4

function unique(arr) {
    var hash = {}, result = [];
    for (var i = 0, l = arr.length; i < l; ++i) {
        if (!hash.hasOwnProperty(arr[i])) {
            hash[arr[i]] = true;
    return result;

db.collection.find({}).forEach(function (doc) {
    db.collection.update({ _id: doc._id }, { $set: { "favorites.books": unique(doc.favorites.books) } });

Solution 5

Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.

For instance, in order to remove duplicates from an array:

// {
//   "favorites" : { "books" : [
//     "Algorithms in C++",
//     "The Art of Computer Programming",
//     "Graph Theory",
//     "Algorithms in C++"
//   ]},
//   "name" : "robert"
// }
  { $set:
    { "favorites.books":
      { $function: {
          body: function(books) { return books.filter((v, i, a) => a.indexOf(v) === i) },
          args: ["$favorites.books"],
          lang: "js"
// {
//   "favorites" : { "books" : [
//     "Algorithms in C++",
//     "The Art of Computer Programming",
//     "Graph Theory"
//   ]},
//   "name" : "robert"
// }

This has the advantages of:

  • keeping the original order of the array (if that's not a requirement, then prefer @Dennis Golomazov's $setUnion answer)
  • being more efficient than a combination of expensive $unwind and $group stages.

$function takes 3 parameters:

  • body, which is the function to apply, whose parameter is the array to modify.
  • args, which contains the fields from the record that the body function takes as parameter. In our case "$favorites.books".
  • lang, which is the language in which the body function is written. Only js is currently available.