I'm working on a query to find the city with the most zip codes in each state:

db.zips.distinct("state", db.zips.aggregate([ 
    { $group:
      { _id: {
           state: "$state", 
           city: "$city" 
         },
        numberOfzipcodes: { 
           $sum: 1
         }
      }
    }, 
    { $sort: {
        numberOfzipcodes: -1
         }
      }
  ])
)

The aggregate part of the query seems to work fine, but when I add the distinct I get an empty result.

Is this because I have state in the _id? Can I do something like distinct("_id.state")?

Solution 1

You can use $addToSet with the aggregation framework to collect the distinct values of a field.

For example:

db.collectionName.aggregate([{
    $group: {_id: null, uniqueValues: {$addToSet: "$fieldName"}}
}])

Or, extended to return each unique value as its own document rather than as an array inside a single document with a null _id:

db.collectionName.aggregate([
    { $group: {_id: null, myFieldName: {$addToSet: "$myFieldName"}}},
    { $unwind: "$myFieldName" },
    { $project: { _id: 0 }},
])

Solution 2

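Distinct and the aggregation framework are not interoperable: distinct() takes a field name and an optional query document, not the result of an aggregate() call.

Instead, group twice: first by city and state to count the zip codes, then, after sorting by that count, by state, keeping the first city in each group: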

db.zips.aggregate([ 
    {$group:{_id:{city:'$city', state:'$state'}, numberOfzipcodes:{$sum:1}}}, 
    {$sort:{numberOfzipcodes:-1}},
    {$group:{_id:'$_id.state', city:{$first:'$_id.city'}, 
              numberOfzipcodes:{$first:'$numberOfzipcodes'}}}
]);
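
To see why the second $group picks the top city per state, here is a plain JavaScript sketch that mimics the three stages on an in-memory array. The sample documents and the helper name topCityPerState are made up for illustration; it runs in Node without MongoDB and is not meant to show how the server actually executes the pipeline.

```javascript
// Hypothetical sample documents shaped like the zips collection (one per zip code).
const zips = [
  { _id: "01001", city: "AGAWAM", state: "MA" },
  { _id: "02108", city: "BOSTON", state: "MA" },
  { _id: "02109", city: "BOSTON", state: "MA" },
  { _id: "10001", city: "NEW YORK", state: "NY" },
  { _id: "10002", city: "NEW YORK", state: "NY" },
  { _id: "10003", city: "NEW YORK", state: "NY" },
  { _id: "12201", city: "ALBANY", state: "NY" },
];

// Illustrative helper (not a MongoDB API): emulates $group, $sort, then $group with $first.
function topCityPerState(docs) {
  // Stage 1: count zip codes per (state, city) pair.
  const counts = new Map();
  for (const { state, city } of docs) {
    const key = JSON.stringify([state, city]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  // Stage 2: sort the (state, city) groups by count, descending.
  const sorted = [...counts.entries()]
    .map(([key, numberOfzipcodes]) => {
      const [state, city] = JSON.parse(key);
      return { state, city, numberOfzipcodes };
    })
    .sort((a, b) => b.numberOfzipcodes - a.numberOfzipcodes);

  // Stage 3: keep only the first row seen for each state, like $first.
  const result = new Map();
  for (const row of sorted) {
    if (!result.has(row.state)) result.set(row.state, row);
  }
  return [...result.values()];
}

console.log(topCityPerState(zips));
```

Because the rows are already sorted by count when the last stage runs, the first row encountered for each state is the city with the most zip codes. If two cities in a state tie, which one comes first is arbitrary unless you add a tie-breaking sort key.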

Solution 3

SQL Query: (group by & count of distinct)

select city,count(distinct(emailId)) from TransactionDetails group by city;

The equivalent MongoDB query would look like this:

db.TransactionDetails.aggregate([ 
{$group:{_id:{"CITY" : "$cityName"},uniqueCount: {$addToSet: "$emailId"}}},
{$project:{_id:0,"CITY":"$_id.CITY",uniqueCustomerCount:{$size:"$uniqueCount"}} } 
]);
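
As a rough illustration of what the two stages compute, the following plain JavaScript sketch builds a set of email addresses per city and then reports each set's size. The sample rows and the helper name uniqueCustomersPerCity are assumptions for the example; it runs in Node without a database.

```javascript
// Hypothetical sample rows; field names follow the pipeline above.
const transactions = [
  { cityName: "Pune", emailId: "a@example.com" },
  { cityName: "Pune", emailId: "a@example.com" }, // repeat customer, counted once
  { cityName: "Pune", emailId: "b@example.com" },
  { cityName: "Delhi", emailId: "c@example.com" },
];

// Illustrative helper (not a MongoDB API): $addToSet per city, then $size.
function uniqueCustomersPerCity(docs) {
  const emailsByCity = new Map();
  for (const { cityName, emailId } of docs) {
    if (!emailsByCity.has(cityName)) emailsByCity.set(cityName, new Set());
    emailsByCity.get(cityName).add(emailId); // a Set ignores duplicates
  }
  return [...emailsByCity].map(([city, emails]) => ({
    CITY: city,
    uniqueCustomerCount: emails.size,
  }));
}

console.log(uniqueCustomersPerCity(transactions));
```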

Solution 4

You can call $setUnion on a single array, which also filters out duplicates (the order of the resulting elements is unspecified):

{ $project: {Package: 1, deps: {'$setUnion': '$deps.Package'}}}