mongodb

spring-data-mongodb

mongodb-indexes

@CompoundIndexes({
    @CompoundIndex(name = "fertilizer_idx",
        unique = true,
        def = "{'name': 1, 'formula': 1, 'type': 1}")
})
public class Fertilizer extends Element implements Serializable {
//class stuff
}

Is it possible to create the index case insensitive? Right now it is differentiating from NAME to NAMe. Saving a second field lowercase (or uppercase) is not a possibility for me.

Thanks, Pedro

Solution 1

Prior of MongoDB version 3.4 we were unable to create index with case insensitive.

In version 3.4 has collation option that allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.

The collation option has the following syntax:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}

where the locale field is mandatory; all other fields are optional.

To create index with case insensitive we need to use mandatory field locale and strength field for string comparison level. strength allows value rage 1 - 5. read more about collation

The strength attribute determines whether accents or case are taken into account when collating or matching text

Example:

if strength=1 then role = Role = rôle

if strength=2 then role = Role < rôle

if strength=3 then role < Role < rôle

Comparison level doc

So we need to use strength=2 to create index. like:

db.collectionName.createIndex(
  { name: 1, formula: 1, type: 1 },
  { 
    name: "fertilizer_idx",
    collation: {locale: "en", strength: 2},
    unique: true
  }
)

N.B: collation option is not available for text indexes.

Solution 2

Yes, it is now available in MongoDB 3.4 with the new collation feature.

you can create a case insensitive index like this:

db.collection.createIndex({
   name:1,
   formula:1,
   type:1
},
{
   collation:{
      locale:"en",
      strength:2
   }
});

where the strength attribute is the comparaison level

you can then get case insensitive match with this query:

db.collection.find({name: "name"}).collation({locale: "en", strength: 2});

see collation for details

if you upgraded to mongodb 3.4 from a previous version, you may need to set compatibility before creating the index like this

db.adminCommand( { setFeatureCompatibilityVersion: "3.4" } )

Solution 3

Spring Data Mongo2.2 provides 'Annotation-based Collation support through @Document and @Query.'

Ref. What's new in Spring Data Mongo2.2

@Document(collection = 'fertilizer', collation = "{'locale':'en', 'strength':2}")
public class Fertilizer extends Element implements Serializable {

    @Indexed(unique = true)
    private String name;
    //class stuff
}

When the application is started, it will create the indexes along with respective collation for every document.

Solution 4

db.collection.createIndex(
{ name: 1, formula: 1, type: 1 },
{ name: "fertilizer_idx", unique: true, collation:{ locale: "en", strength: 2 } }
)

Use collation as an option for db.collection.createIndex()

more info here: https://docs.mongodb.com/manual/reference/method/db.collection.createIndex/

here for locale/language information: https://docs.mongodb.com/manual/reference/collation-locales-defaults/#collation-languages-locales

strength: integer

Optional. The level of comparison to perform. Possible values are:

1: Primary level of comparison. Collation performs comparisons of the base characters only, ignoring other differences such as diacritics and case.

2: Secondary level of comparison. Collation performs comparisons up to secondary differences, such as diacritics. That is, collation performs comparisons of base characters (primary differences) and diacritics (secondary differences). Differences between base characters takes precedence over secondary differences.

3: Tertiary level of comparison. Collation performs comparisons up to tertiary differences, such as case and letter variants. That is, collation performs comparisons of base characters (primary differences), diacritics (secondary differences), and case and variants (tertiary differences). Differences between base characters takes precedence over secondary differences, which takes precedence over tertiary differences. This is the default level.

4: Quaternary Level. Limited for specific use case to consider punctuation when levels 1-3 ignore punctuation or for processing Japanese text.

5: Identical Level. Limited for specific use case of tie breaker.

Mongo 3.4 has collation, which allows users to specify language-specific rules for string comparison

Collation includes:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}

Solution 5

As mentioned above by Shaishab Roy you should use collation.strength

There are no way to define that with annotations of spring data

But you can implement it manually. To implement this behavior with your spring application you should create event listener to listen that your application is ready, inject MongoOperations bean and define index like in example below:

@Configuration
public class MongoConfig {
  @Autowired
  private MongoOperations mongoOperations;

  @EventListener(ApplicationReadyEvent.class)
  public void initMongo() {
    mongoOperations
    .indexOps(YourCollectionClass.class)
    .ensureIndex(
        new Index()
            .on("indexing_field_name", Sort.Direction.ASC)
            .unique()
            .collation(Collation.of("en").strength(2)));
  }
}