I have the following json:

{
  "a1": {"a": "b"},
  "a2": {"a": "c"}
}

How can I request all documents where a1 and a2 are not equal in the same document?

Solution 1

You could use $where:

db.myCollection.find( { $where: "this.a1.a != this.a2.a" } )

However, be aware that this won't be very fast, because it will have to spin up the java script engine and iterate each and every document and check the condition for each.

If you need to do this query for large collections, or very often, it's best to introduce a denormalized flag, like areEqual. Still, such low-selectivity fields don't yield good index performance, because he candidate set is still large.

Solution 2

update

using the new $expr operator available as of mongo 3.6 you can use aggregate expressions in find query like this:

  db.myCollection.find({$expr: {$ne: ["$a1.a", "$a2.a"] } });

Although this comment solves the problem, I think a better match for this use case would be to use $addFields operator available as of version 3.4 instead of $project.

db.myCollection.aggregate([
     {"$match":{"a1":{"$exists":true},"a2":{"$exists":true}}},
     {"$addFields": {
           "aEq": {"$eq":["$a1.a","$a2.a"]}
         }
     },
     {"$match":{"aEq": false}} 
  ]);

Solution 3

To avoid JavaScript use the aggregation framework:

db.myCollection.aggregate([
  {"$match":{"a1":{"$exists":true},"a2":{"$exists":true}}},
  {"$project": {
      "a1":1,
      "a2":1,
      "aCmp": {"$cmp":["$a1.a","$a2.a"]}
    }
  },
  {"$match":{"aCmp":0}}
])

On our development server the equivalent JavaScript query takes 7x longer to complete.

Update (10 May 2017)

I just realized my answer didn't answer the question, which wanted values that are not equal (sometimes I'm really slow). This will work for that:

db.myCollection.aggregate([
  {"$match":{"a1":{"$exists":true},"a2":{"$exists":true}}},
  {"$project": {
      "a1":1,
      "a2":1,
      "aEq": {"$eq":["$a1.a","$a2.a"]}
    }
  },
  {"$match":{"aEq": false}}
])

$ne could be used in place of $eq if the match condition was changed to true but I find using $eq with false to be more intuitive.

Solution 4

MongoDB uses Javascript in the background, so

{"a": "b"} == {"a": "b"}

would be false.

So to compare each you would have to a1.a == a2.a

To do this in MongoDB you would use the $where operator

db.myCollection.find({$where: "this.a1.a != this.a2.a"});

This assumes that each embedded document will have a property "a". If that isn't the case things get more complicated.

Solution 5

Starting in Mongo 4.4, for those that want to compare sub-documents and not only primitive values (since {"a": "b"} == {"a": "b"} is false), we can use the new $function aggregation operator that allows applying a custom javascript function:

// { "a1" : { "x" : 1, "y" : 2 }, "a2" : { "x" : 1, "y" : 2 } }
// { "a1" : { "x" : 1, "y" : 2 }, "a2" : { "x" : 3, "y" : 2 } }
db.collection.aggregate(
  { $match:
    { $expr:
      { $function: {
          body: function(a1, a2) { return JSON.stringify(a1) != JSON.stringify(a2); },
          args: ["$a1", "$a2"],
          lang: "js"
      }}
    }
  }
)
// { "a1" : { "x" : 1, "y" : 2 }, "a2" : { "x" : 3, "y" : 2 } }

$function takes 3 parameters:

  • body, which is the function to apply, whose parameter are the two fields to compare.
  • args, which contains the fields from the record that the body function takes as parameter. In our case, both "$a1" and "$a2".
  • lang, which is the language in which the body function is written. Only js is currently available.

Solution 6

Thanks all for solving my problem -- concerning the answers that use aggregate(), one thing that confused me at first is that $eq (or $in, or lots of other operators) has different meaning depending on where it is used. In a find(), or the $match phase of aggregation, $eq takes a single value, and selects matching documents:

db.items.aggregate([{$match: {_id: {$eq: ObjectId("5be5feb45da16064c88e23d4")}}}])

However, in the $project phase of aggregation, $eq takes an Array of 2 expressions, and makes a new field with value true or false:

db.items.aggregate([{$project: {new_field: {$eq: ["$_id", "$foreignID"]}}}])

In passing, here's the query I used in my project to find all items whose list of linked items (due to a bug) linked to themselves:

db.items.aggregate([{$project: {idIn: {$in: ["$_id","$header.links"]}, "header.links": 1}}, {$match: {idIn: true}}])