regex

mongodb

mongodb-query

aggregation-framework

I want to regex search an integer value in MongoDB. Is this possible?

I'm building a CRUD type interface that allows * for wildcards on the various fields. I'm trying to keep the UI consistent for a few fields that are integers.

Consider:

> db.seDemo.insert({ "example" : 1234 });
> db.seDemo.find({ "example" : 1234 });
{ "_id" : ObjectId("4bfc2bfea2004adae015220a"), "example" : 1234 }
> db.seDemo.find({ "example" : /^123.*/ });
> 

As you can see, I insert an object and I'm able to find it by the value. If I try a simple regex, I can't actually find the object.

Thanks!

Solution 1

If you want to do a pattern match on numbers, the way to do it in MongoDB is to use the $where operator and pass in a JavaScript expression that performs the pattern match.

> db.test.find({ $where: "/^123.*/.test(this.example)" })
{ "_id" : ObjectId("4bfc3187fec861325f34b132"), "example" : 1234 }
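
The reason this works is that the $where string is ordinary JavaScript, and a JavaScript regex test converts its argument to a string before matching. Here is a minimal sketch of the same predicate outside MongoDB; `sampleDocs` and `startsWith123` are made-up names for illustration, not part of any API.

```javascript
// Plain JavaScript, no MongoDB required: the same predicate that the
// $where string evaluates, applied to ordinary objects.
// `sampleDocs` is hypothetical sample data, not a real collection.
const sampleDocs = [{ example: 1234 }, { example: 12334 }, { example: 5123 }];

// RegExp.prototype.test converts its argument to a string first,
// so the number 1234 is matched as the text "1234".
const startsWith123 = (doc) => /^123.*/.test(doc.example);

const matched = sampleDocs.filter(startsWith123);
console.log(matched.map((d) => d.example)); // [ 1234, 12334 ]
```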

Solution 2

I am not a big fan of the $where query operator: it evaluates a JavaScript expression for every document, it doesn't use indexes, and it is a security risk if the query is built from user input.

Starting from MongoDB 4.2, you can use $regexMatch, $regexFind, or $regexFindAll (available since MongoDB 4.1.9) together with $expr to do this.

let regex = /123/;
  • $regexMatch and $regexFind

    db.col.find({
        "$expr": {
            "$regexMatch": {
               "input": {"$toString": "$name"}, 
               "regex": /123/ 
            }
        }
    })
    
  • $regexFindAll

    db.col.find({
        "$expr": {
            "$gt": [
                { 
                    "$size": { 
                        "$regexFindAll": { 
                            "input": {"$toString": "$name"}, 
                            "regex": "123" 
                        }
                    }
                }, 
                0
            ]
        }
    })
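
Since the question is about a UI that accepts * as a wildcard, here is one possible way to turn such a pattern into a $regexMatch filter. This is only a sketch: `wildcardToRegexSource` and `buildWildcardFilter` are hypothetical helper names, and it assumes * is the only wildcard character and that the whole value must match.

```javascript
// Escape regex metacharacters, then turn each "*" wildcard into ".*".
// Assumption: "*" is the only wildcard and the pattern is anchored.
function wildcardToRegexSource(pattern) {
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return "^" + escaped.replace(/\\\*/g, ".*") + "$";
}

// Build a filter document that stringifies the field before matching.
function buildWildcardFilter(field, pattern) {
  return {
    $expr: {
      $regexMatch: {
        input: { $toString: "$" + field },
        regex: wildcardToRegexSource(pattern),
      },
    },
  };
}

const filter = buildWildcardFilter("example", "123*");
console.log(JSON.stringify(filter));
```

On MongoDB 4.2+ the resulting object could then be passed to something like db.seDemo.find(filter). Escaping the user input first keeps characters such as . or ( from being interpreted as regex syntax.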
    

From MongoDB 4.0 you can use the $toString operator, which is a wrapper around the $convert operator, to stringify integers.

db.seDemo.aggregate([ 
    { "$redact": { 
        "$cond": [ 
            { "$gt": [ 
                { "$indexOfCP": [ 
                    { "$toString": "$example" }, 
                    "123" 
                ] }, 
                -1 
            ] }, 
            "$$KEEP", 
            "$$PRUNE" 
        ] 
    }}
])

If what you want is to retrieve all the documents that contain a particular substring, then starting from release 3.4 you can use the $redact operator, which allows $cond-based conditional logic, together with the $indexOfCP operator.

db.seDemo.aggregate([ 
    { "$redact": { 
        "$cond": [ 
            { "$gt": [ 
                { "$indexOfCP": [ 
                    { "$toLower": "$example" }, 
                    "123" 
                ] }, 
                -1 
            ] }, 
            "$$KEEP", 
            "$$PRUNE" 
        ] 
    }}
])

which produces:

{ 
    "_id" : ObjectId("579c668c1c52188b56a235b7"), 
    "example" : 1234 
}

{ 
    "_id" : ObjectId("579c66971c52188b56a235b9"), 
    "example" : 12334 
}
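
To make the condition easier to read: $indexOfCP returns -1 when the substring is absent, so the $gt comparison against -1 is just a "contains" check. Here is a rough plain JavaScript equivalent, run against made-up values rather than a real collection (`sampleValues` and `containsDigits` are illustrative names only).

```javascript
// Hypothetical sample values standing in for the "example" field.
const sampleValues = [1234, 12334, 4321];

// Mirrors { $gt: [ { $indexOfCP: [ <string>, "123" ] }, -1 ] }:
// indexOf returns -1 when the substring is absent.
function containsDigits(value, needle) {
  return String(value).indexOf(needle) > -1;
}

// Values that pass the check correspond to $$KEEP, the rest to $$PRUNE.
const kept = sampleValues.filter((v) => containsDigits(v, "123"));
console.log(kept); // [ 1234, 12334 ]
```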

Prior to MongoDB 3.4, you need to $project your document and add another computed field which is the string value of your number.

The $toLower operator and its sibling $toUpper convert a string to lowercase and uppercase respectively, but they have a little-known feature: they can also be used to convert an integer to a string.

The $match operator returns all those documents that match your pattern using the $regex operator.

db.seDemo.aggregate(
    [ 
        { "$project": { 
            "stringifyExample": { "$toLower": "$example" }, 
            "example": 1 
        }}, 
        { "$match": { "stringifyExample": /^123.*/ } }
    ]
)

which yields:

{ 
    "_id" : ObjectId("579c668c1c52188b56a235b7"), 
    "example" : 1234,
    "stringifyExample" : "1234"
}

{ 
    "_id" : ObjectId("579c66971c52188b56a235b9"), 
    "example" : 12334,
    "stringifyExample" : "12334"
}
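
If you need this for more than one field, the two stages can be generated by a small helper. This is a sketch under my own naming: `buildStringMatchPipeline` and the `stringified` field are hypothetical, not something MongoDB provides.

```javascript
// Build the two-stage pipeline for a given numeric field and pattern.
// `buildStringMatchPipeline` and `stringified` are hypothetical names.
function buildStringMatchPipeline(field, regex) {
  return [
    // Stage 1: keep the original field and add its string form.
    { $project: { [field]: 1, stringified: { $toLower: "$" + field } } },
    // Stage 2: apply the regex to the string form.
    { $match: { stringified: regex } },
  ];
}

const pipeline = buildStringMatchPipeline("example", /^123.*/);
console.log(pipeline.length); // 2
```

The result could then be passed to db.seDemo.aggregate(pipeline), which for the "example" field is the same shape as the pipeline written out by hand above.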
