DeployU

What is the difference between a sparse and a partial index in MongoDB?

The Scenario

You are a backend engineer at a social media company. You are designing a new service that will store user profiles. Some users have a location field, but others do not.

You want to create an index on the location field to speed up queries that search for users by location. However, you do not want the index to include the users who do not have a location field.

The Challenge

Explain the difference between a sparse and a partial index in MongoDB. What are the pros and cons of each approach, and which one would you choose for this use case?

Wrong Approach

A junior engineer might not be aware of sparse or partial indexes. They might just create a regular index on the `location` field. A regular index contains an entry for every document in the collection: documents that do not have a `location` field are indexed with a null key, so the index grows with every profile whether or not it has a location.
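To make that behavior concrete, here is a minimal sketch in plain JavaScript. It is a simplified model of which entries a regular single-field index would hold, not MongoDB's actual implementation; the sample documents and the `regularIndexEntries` helper are hypothetical.

```javascript
// Simplified model of a regular single-field index (not MongoDB's actual code).
// The sample documents and helper name are hypothetical.
const users = [
  { _id: 1, name: "Ana", location: "Lisbon" },
  { _id: 2, name: "Ben" }, // no location field
  { _id: 3, name: "Chi", location: "Oslo" },
];

// A regular index stores one entry per document.
// A document that lacks the field is stored under a null key.
function regularIndexEntries(docs, field) {
  return docs.map((doc) => ({
    key: field in doc ? doc[field] : null,
    _id: doc._id,
  }));
}

const regularEntries = regularIndexEntries(users, "location");
console.log(regularEntries.length); // 3
console.log(regularEntries[1].key); // null
```

All three documents end up in the index, including the one without a location, which is exactly the overhead the scenario wants to avoid.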

Right Approach

A senior engineer would recognize that a sparse index fits this case. They would also be able to explain how a sparse index differs from a partial index and when each one applies.

Step 1: Understand the Key Differences

| Feature | Sparse Index | Partial Index |
| --- | --- | --- |
| Inclusion | Only includes documents that have the indexed field. | Only includes documents that match a given filter expression. |
| Syntax | `db.my_collection.createIndex({ my_field: 1 }, { sparse: true })` | `db.my_collection.createIndex({ my_field: 1 }, { partialFilterExpression: { my_field: { $exists: true } } })` |
| Use Cases | When you want to index a field that only exists in some of the documents. | When you want to index a subset of the documents in a collection. |
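The inclusion rules above can be sketched as a small plain-JavaScript model. This is an illustration under stated assumptions, not MongoDB's implementation: the sample documents, the `verified` field, and the helpers `inSparseIndex` and `inPartialIndex` are hypothetical, and the partial filter model handles only `{ $exists: true }` and equality on non-null values.

```javascript
// Simplified model of index membership (not MongoDB's implementation).
// Sample documents, the "verified" field, and helper names are hypothetical.
const profiles = [
  { _id: 1, location: "Lisbon", verified: true },
  { _id: 2, verified: true }, // location missing
  { _id: 3, location: null, verified: false }, // location present but null
  { _id: 4, location: "Oslo", verified: false },
];

// Sparse: a document is indexed whenever the field exists, even if it is null.
function inSparseIndex(doc, field) {
  return field in doc;
}

// Partial: a document is indexed whenever it matches the filter expression.
// Only { $exists: true } and equality on non-null values are modelled here.
function inPartialIndex(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => {
    if (cond !== null && typeof cond === "object" && cond.$exists === true) {
      return field in doc;
    }
    return doc[field] === cond;
  });
}

const ids = (docs) => docs.map((doc) => doc._id);

const sparseIds = ids(profiles.filter((d) => inSparseIndex(d, "location")));
const existsIds = ids(
  profiles.filter((d) => inPartialIndex(d, { location: { $exists: true } }))
);
const verifiedIds = ids(
  profiles.filter((d) =>
    inPartialIndex(d, { location: { $exists: true }, verified: true })
  )
);

console.log(sparseIds); // [ 1, 3, 4 ]
console.log(existsIds); // [ 1, 3, 4 ]
console.log(verifiedIds); // [ 1 ]
```

In this model, the sparse rule and the `$exists: true` filter select the same documents, including the one whose `location` is null. Adding a second condition to the filter narrows the set further, which the sparse option has no way to express.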

Step 2: Choose the Right Tool for the Job

For our use case, we should use a sparse index. This is because we want to index the location field, but we only want the index to include the documents that have a location field.

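A partial index with the filter `{ location: { $exists: true } }` would also work, and it would cover exactly the same documents, so neither option produces a smaller index here. The sparse option is simply shorter to write. Note that MongoDB's documentation describes partial indexes as a superset of sparse index functionality and recommends preferring them, so a partial index becomes the better choice as soon as the filter needs to be more specific than "the field exists".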

Practice Question

You want to create an index that only includes the documents in which a field has a certain value. Which of the following would be the most appropriate?