I'm sure MongoDB doesn't officially support "joins". What does this mean?

Does this mean "We cannot connect two collections(tables) together."?

I think if we put the value for _id in collection A to the other_id in collection B, can we simply connect two collections?

If my understanding is correct, MongoDB can connect two tables together, say, when we run a query. This is done by "Reference" written in http://www.mongodb.org/display/DOCS/Schema+Design.

Then what does "joins" really mean?

I'd love to know the answer because this is essential to learn MongoDB schema design. http://www.mongodb.org/display/DOCS/Schema+Design

Solution 1

It's no join since the relationship will only be evaluated when needed. A join (in a SQL database) on the other hand will resolve relationships and return them as if they were a single table (you "join two tables into one").

You can read more about DBRef here: http://docs.mongodb.org/manual/applications/database-references/

There are two possible solutions for resolving references. One is to do it manually, as you have almost described. Just save a document's _id in another document's other_id, then write your own function to resolve the relationship. The other solution is to use DBRefs as described on the manual page above, which will make MongoDB resolve the relationship client-side on demand. Which solution you choose does not matter so much because both methods will resolve the relationship client-side (note that a SQL database resolves joins on the server-side).

Solution 2

The database does not do joins -- or automatic "linking" between documents. However you can do it yourself client side. If you need to do 2, that is ok, but if you had to do 2000, the number of client/server turnarounds would make the operation slow.

In MongoDB a common pattern is embedding. In relational when normalizing things get broken into parts. Often in mongo these pieces end up being a single document, so no join is needed anyway. But when one is needed, one does it client-side.

Consider the classic ORDER, ORDER-LINEITEM example. One order and 8 line items are 9 rows in relational; in MongoDB we would typically just model this as a single BSON document which is an order with an array of embedded line items. So in that case, the join issue does not arise. However the order would have a CUSTOMER which probably is a separate collection - the client could read the cust_id from the order document, and then go fetch it as needed separately.

There are some videos and slides for schema design talks on the mongodb.org web site I belive.

Solution 3

one kind of join a query in mongoDB, is ask at one collection for id that match , put ids in a list (idlist) , and do find using on other (or same) collection with $in : idlist

u = db.friends.find({"friends": something }).toArray()
idlist= []
u.forEach(function(myDoc) { idlist.push(myDoc.id ); } )
db.family.find({"id": {$in : idlist} } )

Solution 4

The first example you link to shows how MongoDB references behave much like lazy loading not like a join. There isn't a query there that's happening on both collections, rather you query one and then you lookup items from another collection by reference.

Solution 5

The fact that mongoDB is not relational have led some people to consider it useless. I think that you should know what you are doing before designing a DB. If you choose to use noSQL DB such as MongoDB, you better implement a schema. This will make your collections - more or less - resemble tables in SQL databases. Also, avoid denormalization (embedding), unless necessary for efficiency reasons.

If you want to design your own noSQL database, I suggest to have a look on Firebase documentation. If you understand how they organize the data for their service, you can easily design a similar pattern for yours.

As others pointed out, you will have to do the joins client-side, except with Meteor (a Javascript framework), you can do your joins server-side with this package (I don't know of other framework which enables you to do so). However, I suggest you read this article before deciding to go with this choice.

Edit 28.04.17: Recently Firebase published this excellent series on designing noSql Databases. They also highlighted in one of the episodes the reasons to avoid joins and how to get around such scenarios by denormalizing your database.

Solution 6

If you use mongoose, you can just use(assuming you're using subdocuments and population):

Profile.findById profileId
  .select 'friends'
  .exec (err, profile) ->
    if err or not profile
      handleError err, profile, res
      Status.find { profile: { $in: profile.friends } }, (err, statuses) ->
        if err
          handleErr err, statuses, res
          res.json createJSON statuses

It retrieves Statuses which belong to one of Profile (profileId) friends. Friends is array of references to other Profiles. Profile schema with friends defined:

schema = new mongoose.Schema
  # ...

  friends: [
    type: mongoose.Schema.Types.ObjectId
    ref: 'Profile'
    unique: true
    index: true

Solution 7

I came across lot of posts searching for the same - "Mongodb Joins" and alternatives or equivalents. So my answer would help many other who are like me. This is the answer I would be looking for.

I am using Mongoose with Express framework. There is a functionality called Population in place of joins.

As mentioned in Mongoose docs.

There are no joins in MongoDB but sometimes we still want references to documents in other collections. This is where population comes in.

This StackOverflow answer shows a simple example on how to use it.