
Let's take the field _id, which is unique across the collection.

Does it make any sense to have a compound index where _id is the prefix, e.g.:

{
  _id: 1,
  A: 1
}

Would the index above be any more efficient than the plain single-field index:

{
  _id: 1
}

If so, can you give an example of a query where that is the case?

    No, both will be equally efficient. Commented Oct 11, 2023 at 9:15

1 Answer


I'm going to disagree with the comment and say that the answer is: sure, there are very specific circumstances where it makes sense. The most notable one that comes to mind is when you want the index to cover the query, which we'll explore below.

In this answer, consider a collection foo with the following indices:

> db.foo.getIndices()
[
  { v: 2, key: { _id: 1 }, name: '_id_' },
  { v: 2, key: { u: 1 }, name: 'u_1', unique: true },
  { v: 2, key: { u: 1, A: 1 }, name: 'u_1_A_1' }
]
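
To reproduce this setup, something along the following lines should work when pasted into mongosh and called as setUpFoo(db). This is only a sketch: the function name setUpFoo and the sample document are hypothetical and not part of the original test, while createIndex and insertOne are the standard collection methods. The _id index is created automatically, so only the other two are built here.

```javascript
// Hypothetical helper: build the two non-default indexes on `foo`
// and insert one illustrative document (its values are made up).
function setUpFoo(db) {
  // Single field unique index, named 'u_1' by default
  db.foo.createIndex({ u: 1 }, { unique: true });
  // Compound index prefixed on the unique field, named 'u_1_A_1' by default
  db.foo.createIndex({ u: 1, A: 1 });
  // Sample document so that the query on u: 123 has something to match
  db.foo.insertOne({ u: 123, A: 'some value' });
}
```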

And the following query:

db.foo.find({u:123},{A:1, _id:0})

Using the single field unique index yields the following plan:

> db.foo.find({u:123},{A:1, _id:0}).hint({u:1}).explain().queryPlanner.winningPlan
{
  stage: 'PROJECTION_SIMPLE',
  transformBy: { A: 1, _id: 0 },
  inputStage: {
    stage: 'FETCH',
    inputStage: {
      stage: 'IXSCAN',
      keyPattern: { u: 1 },
      indexName: 'u_1',
      ...
      indexBounds: { u: [ '[123, 123]' ] }
    }
  }
}

Using the compound index, however, shows something different:

> db.foo.find({u:123},{A:1, _id:0}).hint({u:1, A:1}).explain().queryPlanner.winningPlan
{
  stage: 'PROJECTION_COVERED',
  transformBy: { A: 1, _id: 0 },
  inputStage: {
    stage: 'IXSCAN',
    keyPattern: { u: 1, A: 1 },
    indexName: 'u_1_A_1',
    ...
    indexBounds: { u: [ '[123, 123]' ], A: [ '[MinKey, MaxKey]' ] }
  }
}

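The notable difference here is the absence of the FETCH stage when the compound index is used: the query is covered, so it can be answered from the index keys alone without reading the document. By the way, the compound index is the one that the database chooses on its own, without a hint, in my testing (MongoDB 6.0).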
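
If you want to check for this difference programmatically rather than by eye, one option is to walk the winningPlan and collect its stage names. The following is a minimal sketch in plain JavaScript (it should also run in mongosh). The helper names collectStages and isCovered are hypothetical, and the two plan objects are trimmed copies of the outputs above that keep only the fields the helpers read.

```javascript
// Hypothetical helper: list the stage names of a winningPlan, outermost first.
// Assumption: the plan is a simple chain of `inputStage` links, as in the
// outputs above; plans that branch via `inputStages` are not handled.
function collectStages(plan) {
  const stages = [];
  let node = plan;
  while (node) {
    stages.push(node.stage);
    node = node.inputStage; // undefined at the leaf, which ends the loop
  }
  return stages;
}

// Hypothetical helper: treat a plan as covered when it scans an index
// and never fetches the underlying documents.
function isCovered(plan) {
  const stages = collectStages(plan);
  return stages.includes('IXSCAN') && !stages.includes('FETCH');
}

// Trimmed copies of the two plans shown above.
const singleFieldPlan = {
  stage: 'PROJECTION_SIMPLE',
  inputStage: {
    stage: 'FETCH',
    inputStage: { stage: 'IXSCAN', indexName: 'u_1' }
  }
};
const compoundPlan = {
  stage: 'PROJECTION_COVERED',
  inputStage: { stage: 'IXSCAN', indexName: 'u_1_A_1' }
};

console.log(isCovered(singleFieldPlan)); // false
console.log(isCovered(compoundPlan));    // true
```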

Relatedly, if the query also included a predicate on the A field, the compound index could check that condition during the index scan, whereas the single field index could not (the condition would have to be checked after fetching the document). With an equality condition on the unique field, though, the cost of that FETCH with the single field index is capped: at most one document could be retrieved and then discarded unnecessarily.
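
To see this for yourself, you could compare the plans for a query that adds a condition on A. The sketch below is meant for mongosh; the function name winningPlanFor and the predicate { $gt: 5 } are hypothetical choices for illustration, and I have not included output because it was not part of my testing above.

```javascript
// Hypothetical helper: return the winning plan for a query that filters on
// both u and A, forcing the given index with hint().
function winningPlanFor(coll, hint) {
  return coll
    .find({ u: 123, A: { $gt: 5 } }, { A: 1, _id: 0 }) // { $gt: 5 } is an arbitrary example predicate
    .hint(hint)
    .explain()
    .queryPlanner.winningPlan;
}

// Usage in mongosh:
//   winningPlanFor(db.foo, { u: 1 })        // single field unique index
//   winningPlanFor(db.foo, { u: 1, A: 1 })  // compound index
```

With the single field index, the condition on A would be expected to show up as a filter on the FETCH stage, while with the compound index it should appear in the indexBounds for A.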

So, yes, there are cases where a compound index prefixed on a unique field could provide some value. That said, indexing is often about finding the right balance for the overall workload. So the micro-optimizations that the compound index could provide for a specific query or two might not be worth the tradeoff of forcing the database to maintain a whole new index.

In this answer I've used a u field rather than _id as posed in the initial question. This is because, at the time of writing, MongoDB has some special handling of the _id field. So generalizing and working with a different field/index that is unique makes it easier to demonstrate and reason about.

  • If a fetch stage is needed, is there a way to know exactly how many docs or MB that were fetched from storage vs RAM? stackoverflow.com/questions/77273999/… Commented Oct 11, 2023 at 15:38
  • Also note that once the database gets large enough to require sharding, the only way to ensure a unique field across shards is to have the unique field as a prefix of the shard key index. – Joe Commented Oct 11, 2023 at 20:47
