My data layer uses Mongo aggregation a decent amount, and on average, queries are taking 500-650ms to return. I am using mgo.

Below is a sample query function that is representative of most of my queries.

func (r userRepo) GetUserByID(id string) (User, error) {
    info, err := db.Info()
    if err != nil {
        log.Fatal(err)
    }

    session, err := mgo.Dial(info.ConnectionString())
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()

    var user User
    c := session.DB(info.Db()).C("users")
    o1 := bson.M{"$match": bson.M{"_id": id}}
    o2 := bson.M{"$project": bson.M{
        "first":           "$first",
        "last":            "$last",
        "email":           "$email",
        "fb_id":           "$fb_id",
        "groups":          "$groups",
        "fulfillments":    "$fulfillments",
        "denied_requests": "$denied_requests",
        "invites":         "$invites",
        "requests": bson.M{
            "$filter": bson.M{
                "input": "$requests",
                "as":    "item",
                "cond": bson.M{
                    "$eq": []interface{}{"$$item.active", true},
                },
            },
        },
    }}
    pipeline := []bson.M{o1, o2}
    err = c.Pipe(pipeline).One(&user)
    if err != nil {
        return user, err
    }
    return user, nil
}

My User struct looks like this:

type User struct {
    ID             string        `json:"id" bson:"_id,omitempty"`
    First          string        `json:"first" bson:"first"`
    Last           string        `json:"last" bson:"last"`
    Email          string        `json:"email" bson:"email"`
    FacebookID     string        `json:"facebook_id" bson:"fb_id,omitempty"`
    Groups         []UserGroup   `json:"groups" bson:"groups"`
    Requests       []Request     `json:"requests" bson:"requests"`
    Fulfillments   []Fulfillment `json:"fulfillments" bson:"fulfillments"`
    Invites        []GroupInvite `json:"invites" bson:"invites"`
    DeniedRequests []string      `json:"denied_requests" bson:"denied_requests"`
}

Based on what I have provided, is there anything obvious that would suggest why my queries are averaging 500-650ms?

I know that I am probably taking a bit of a performance hit by using the aggregation pipeline, but I wouldn't expect it to be this bad.

Solution 1

... is there anything obvious that would suggest why my queries are averaging 500-650ms?

Yes, there is. You are calling mgo.Dial() before executing each query. mgo.Dial() has to connect to the MongoDB server every time, and you close that connection right after the query. Establishing the connection may well take hundreds of milliseconds, since it includes authentication and allocating resources (on both the server and the client side). This is very wasteful.

Quoting the documentation of mgo.Dial(): "This method is generally called just once for a given cluster. Further sessions to the same cluster are then established using the New or Copy methods on the obtained session. This will make them share the underlying cluster, and manage the pool of connections appropriately."

Create a global session variable, connect once on startup (e.g. in a package init() function), and use that session (or a copy / clone of it, obtained by Session.Copy() or Session.Clone()) for every query. For example:

var session *mgo.Session
var info *db.Inf // Use your type here

func init() {
    var err error
    if info, err = db.Info(); err != nil {
        log.Fatal(err)
    }
    if session, err = mgo.Dial(info.ConnectionString()); err != nil {
        log.Fatal(err)
    }
}

func (r userRepo) GetUserByID(id string) (User, error) {
    sess := session.Clone()
    defer sess.Close()

    // Now we use sess to execute the query:
    var user User
    c := sess.DB(info.Db()).C("users")
    // Rest of the method is unchanged...
}