Mongodb多键索引之数组

时间:2022-07-23
本文章向大家介绍Mongodb多键索引之数组,主要内容包括其使用实例、应用技巧、基本知识点总结和需要注意事项,具有一定的参考价值,需要的朋友可以参考一下。

【背景】

最近有项目需求用到多键索引,Mongodb中字段值支持多键索引主要包括嵌套文档、数组以及数组嵌套文档.例如联系包括手机、固定电话、邮箱、微信、QQ等,对于字段值存储类型不一样,决定创建多键索引也不一样同时性能也存在差异,例如数组值(包括数组以及数组文档),创建多键索引时会为数组中每个元素都创建索引键,如数组中元素特别多,相应索引也会特别大,创建多键索引或者组合索引时最多只支持一个数组值.

嵌套文档:“telephone”:{"cellphone":"0211234567","mobilephone":13888888888}

数组:“telephone”:["0211234567",13888888888]

数组文档:“联系”:[“telephone”:{"cellphone":"0211234567","mobilephone":13888888888}]

【数组值创建并高效使用索引】

1、集合中文档信息

备注:ratings字段值是数组.

xiaoxu:PRIMARY> db.survey.findOne({});
{
  "_id" : ObjectId("5f2ff576eb7de181ebe814f9"),
  "item" : {
    "name" : "Katie",
    "manufactured" : 16
  },
  "ratings" : [
    16,
    116
  ],
  "user_id" : 16,
  "boolean" : true,
  "added_at" : ISODate("2020-08-09T13:09:10.791Z"),
  "number" : 662
}
xiaoxu:PRIMARY>

2、创建多键索引 db.survey.createIndex({"ratings":1},{background:1});

备注:创建多键索引不需要显示关键字,如字段值中包括数组值或者嵌套文档情况,这个存储引擎自动创建为多键盘索引,如4.2版本之前不加background:1,则前台创建索引,加DB级别排他锁(大表后果很严重),4.2开始加不加background:1都是online创建索引.

3、范围查找ratings

逻辑:

1、查询ratings数组中存在至少1个元素同时满足大于等于3且小于等于6【类似and逻辑 ,此时多键索引边界可以合并为【【3,6】】】--

db.survey.find({ ratings:{ $elemMatch: { $gte: 3, $lte: 6}}} )

2、查询ratings数组中存在至少1个元素大于等于3且至少1元素小于等于6或者存在一个元素同时满足大于等于3且小于等于6【类似or逻辑,【【3,+∞】】or【【-∞,6】】,此时执行计划只有使用or的一边,使用大于等于3然后过来另外一个值是否满足小于等于6或者反过来

db.survey.find({ ratings:{ $gte: 3, $lte: 6} )  

【具体执行计划】

db.survey.find({ ratings:{ $elemMatch: { $gte: 3, $lte: 6}}} )
xiaoxu:PRIMARY> db.survey.find( { ratings : { $elemMatch: { $gte: 3, $lte: 6 } } } ).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 4,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 4,
  "totalDocsExamined" : 4,
  "executionStages" : {
    "stage" : "FETCH",
    "filter" : {
      "ratings" : {
        "$elemMatch" : {
          "$lte" : 6,
          "$gte" : 3
        }
      }
    },
    "nReturned" : 4,
    "executionTimeMillisEstimate" : 0,
    "works" : 5,
    "advanced" : 4,
    "needTime" : 0,
    "needYield" : 0,
    "saveState" : 0,
    "restoreState" : 0,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 4,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 4,
      "executionTimeMillisEstimate" : 0,
      "works" : 5,
      "advanced" : 4,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "ratings" : 1
      },
      "indexName" : "ratings_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "ratings" : [
          "ratings"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "ratings" : [
          "[3.0, 6.0]"
        ]
      },
      "keysExamined" : 4,
      "seeks" : 1,
      "dupsTested" : 4,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}

db.survey.find({ ratings:{ $gte: 3, $lte: 6} )  
xiaoxu:PRIMARY> db.survey.find( { ratings :  { $gte: 3, $lte: 6 }  } ).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 7,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 7,
  "totalDocsExamined" : 7,
  "executionStages" : {
    "stage" : "FETCH",
    "filter" : {
      "ratings" : {
        "$gte" : 3
      }
    },
    "nReturned" : 7,
    "executionTimeMillisEstimate" : 0,
    "works" : 9,
    "advanced" : 7,
    "needTime" : 0,
    "needYield" : 0,
    "saveState" : 0,
    "restoreState" : 0,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 7,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 7,
      "executionTimeMillisEstimate" : 0,
      "works" : 8,
      "advanced" : 7,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "ratings" : 1
      },
      "indexName" : "ratings_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "ratings" : [
          "ratings"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "ratings" : [
          "[-inf.0, 6.0]"
        ]
      },
      "keysExamined" : 7,
      "seeks" : 1,
      "dupsTested" : 7,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}
xiaoxu:PRIMARY>

4、等值查询ratings

逻辑:

1、查询数组值完全匹配100,200 2个元素且顺序是相同

备注:先匹配数组内任何元素满足100,然后返回文档判断是否满足整个数组,

如果100值特别多,100存在超过80%,最终结果集只有1个,那么回集合过滤匹配效果特别差

db.survey.find({ ratings:[100,200]}) 
备注:keysExamined=12,实际上等于100只有11个,多扫描一个索引key没有搞定
清楚
xiaoxu:PRIMARY> db.survey.find({ratings:100}).count()
11
xiaoxu:PRIMARY> db.survey.find({ratings:[100,200]}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 5,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 12,
  "totalDocsExamined" : 11,
  "executionStages" : {
    "stage" : "FETCH",
    "filter" : {
      "ratings" : {
        "$eq" : [
          100,
          200
        ]
      }
    },
    "nReturned" : 5,
    "executionTimeMillisEstimate" : 0,
    "works" : 13,
    "advanced" : 5,
    "needTime" : 7,
    "needYield" : 0,
    "saveState" : 0,
    "restoreState" : 0,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 11,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 11,
      "executionTimeMillisEstimate" : 0,
      "works" : 13,
      "advanced" : 11,
      "needTime" : 1,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "ratings" : 1
      },
      "indexName" : "ratings_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "ratings" : [
          "ratings"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "ratings" : [
          "[100.0, 100.0]",
          "[[ 100.0, 200.0 ], [ 100.0, 200.0 ]]"
        ]
      },
      "keysExamined" : 12,
      "seeks" : 2,
      "dupsTested" : 11,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}
如不需要考虑顺序,使用$all操作符
db.survey.find({ ratings:{$all:[100,200]}}) 
这个执行只需要判断另外一个元素是否等于200.而不是整合数组匹配,扫描刚好是11
个索引key
xiaoxu:PRIMARY> db.survey.find({ratings:{$all:[100,200]}}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 5,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 11,
  "totalDocsExamined" : 11,
  "executionStages" : {
    "stage" : "FETCH",
    "filter" : {
      "ratings" : {
        "$eq" : 200
      }
    },

2、查询数组中至少一个元素等于100,效率还可以

db.survey.find({ ratings:100}) 
xiaoxu:PRIMARY> db.survey.find({ratings:100}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 2,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 2,
  "totalDocsExamined" : 2,
  "executionStages" : {
    "stage" : "FETCH",
    "nReturned" : 2,
    "executionTimeMillisEstimate" : 0,
    "works" : 3,
    "advanced" : 2,
    "needTime" : 0,
    "needYield" : 0,
    "saveState" : 0,
    "restoreState" : 0,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 2,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 2,
      "executionTimeMillisEstimate" : 0,
      "works" : 3,
      "advanced" : 2,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "ratings" : 1
      },
      "indexName" : "ratings_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "ratings" : [
          "ratings"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "ratings" : [
          "[100.0, 100.0]"
        ]
      },
      "keysExamined" : 2,
      "seeks" : 1,
      "dupsTested" : 2,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}

5、通过数组长度查询数据

逻辑:

查询数组长度为1的对象,注意这个$size无法使用索引都是collscan.

db.survey.find({ratings:{$size:1}})
xiaoxu:PRIMARY> db.survey.find({ratings:{$size:1}}).
explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 0,
  "executionTimeMillis" : 718,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 1000019,
  "executionStages" : {
    "stage" : "COLLSCAN",
    "filter" : {
      "ratings" : {
        "$size" : 1
      }
    },
    "nReturned" : 0,
    "executionTimeMillisEstimate" : 621,
    "works" : 1000021,
    "advanced" : 0,
    "needTime" : 1000020,
    "needYield" : 0,
    "saveState" : 7813,
    "restoreState" : 7813,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 1000019
  }
}
xiaoxu:PRIMARY>

6、通过数组索引位置来查询数据

逻辑:

1、数组索引位置从0开始,当对数组创建索引时,使用索引位置查询时,是无法使用多键索引,必须创建单独索引,例如第二个元素位置, db.survey.createIndex({"ratings.1":1}).其他位置以此内推方式创建索引.

2、创建数组索引还是按照数组索引位置创建索引,根据业务实际需求,做到创建索引能够提升效率,而不是创建低效或者无用索引。

3、查询单个元素,此时索引则不是多键索引,就是单个标量值,标量表示是字符串或者数字,而不是数组或者嵌套文档.

xiaoxu:PRIMARY> db.survey.find({"ratings.0":100}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 10,
  "executionTimeMillis" : 498,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 1000019,
  "executionStages" : {
    "stage" : "COLLSCAN",
    "filter" : {
      "ratings.0" : {
        "$eq" : 100
      }
    },
    "nReturned" : 10,
    "executionTimeMillisEstimate" : 450,
    "works" : 1000021,
    "advanced" : 10,
    "needTime" : 1000010,
    "needYield" : 0,
    "saveState" : 7812,
    "restoreState" : 7812,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 1000019
  }
}
xiaoxu:PRIMARY> 
xiaoxu:PRIMARY> db.survey.find({"ratings.1":100}).
explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 1,
  "executionTimeMillis" : 505,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 1000019,
  "executionStages" : {
    "stage" : "COLLSCAN",
    "filter" : {
      "ratings.1" : {
        "$eq" : 100
      }
    },
    "nReturned" : 1,
    "executionTimeMillisEstimate" : 410,
    "works" : 1000021,
    "advanced" : 1,
    "needTime" : 1000019,
    "needYield" : 0,
    "saveState" : 7812,
    "restoreState" : 7812,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 1000019
  }
}
xiaoxu:PRIMARY> db.survey.createIndex({"ratings.1":1})
{
  "createdCollectionAutomatically" : false,
  "numIndexesBefore" : 5,
  "numIndexesAfter" : 6,
  "ok" : 1
}
xiaoxu:PRIMARY> db.survey.find({"ratings.1":100}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 1,
  "executionTimeMillis" : 1,
  "totalKeysExamined" : 1,
  "totalDocsExamined" : 1,
  "executionStages" : {
    "stage" : "FETCH",
    "nReturned" : 1,
    "executionTimeMillisEstimate" : 0,
    "works" : 2,
    "advanced" : 1,
    "needTime" : 0,
    "needYield" : 0,
    "saveState" : 0,
    "restoreState" : 0,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 1,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 1,
      "executionTimeMillisEstimate" : 0,
      "works" : 2,
      "advanced" : 1,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "ratings.1" : 1
      },
      "indexName" : "ratings.1_1",
      "isMultiKey" : false,
      "multiKeyPaths" : {
        "ratings.1" : [ ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "ratings.1" : [
          "[100.0, 100.0]"
        ]
      },
      "keysExamined" : 1,
      "seeks" : 1,
      "dupsTested" : 0,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}
  • 【总结】

本次主要介绍数组多键索引,还包括数组嵌套文档、嵌套文档等多键索引,本次内容主要来自官方文档,主要补充执行计划等信息,并没有深入研究,只是作为入门学习了解,希望对大家有帮助。