Mongodb多键索引之数组文档

时间:2022-07-23
本文章向大家介绍Mongodb多键索引之数组文档,主要内容包括其使用实例、应用技巧、基本知识点总结和需要注意事项,具有一定的参考价值,需要的朋友可以参考一下。

接上2篇文档关于多键索引内容,接着学习数组文档,主要实验来验证如何进行高效数据查询,通过对比方式来验证3种多键索引优缺点以及适合场景,具体链接如下:

Mongodb多键索引之数组

Mongodb多键索引之嵌套文档

【数组文档如何使用高效索引查询】

1、集合中随机一条文档信息

备注:instock是数组文档,里面嵌套文档

xiaoxu:PRIMARY> db.inventory.findOne();
{
  "_id" : ObjectId("5f3cd7d026ca128ad929fc38"),
  "name" : "Tracy",
  "item" : {
    "name" : "Tracy",
    "manufactured" : 0
  },
  "instock" : [
    {
      "warehouse" : "Tracy",
      "qty" : 0
    }
  ],
  "ratings" : [
    0,
    100
  ],
  "added_at" : ISODate("2020-08-19T07:42:08.450Z"),
  "number" : 6306
}

2、如何查询等值匹配数组内嵌套文档

备注:匹配数组文档时,嵌套对象field(字段)顺序也必须保持一致,否则结果集为空

db.inventory.find({ "instock": { warehouse: "Tracy", qty: 0 } } ).pretty();
{
  "_id" : ObjectId("5f3cd7d026ca128ad929fc38"),
  "name" : "Tracy",
  "item" : {
    "name" : "Tracy",
    "manufactured" : 0
  },
  "instock" : [
    {
      "warehouse" : "Tracy",
      "qty" : 0
    }
  ],
  "ratings" : [
    0,
    100
  ],
  "added_at" : ISODate("2020-08-19T07:42:08.450Z"),
  "number" : 6306
}
db.inventory.find({ "instock": { qty: 0, warehouse: "Tracy" } } ).pretty();
xiaoxu:PRIMARY>
【执行计划】
xiaoxu:PRIMARY> db.inventory.find({ "instock": { warehouse: "Tracy", qty: 0 } } ).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 1,
  "executionTimeMillis" : 772,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 1100003,
  "executionStages" : {
    "stage" : "COLLSCAN",
    "filter" : {
      "instock" : {
        "$eq" : {
          "warehouse" : "Tracy",
          "qty" : 0
        }
      }
    },
    "nReturned" : 1,
    "executionTimeMillisEstimate" : 680,
    "works" : 1100005,
    "advanced" : 1,
    "needTime" : 1100003,
    "needYield" : 0,
    "saveState" : 8593,
    "restoreState" : 8593,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 1100003
  }
}
xiaoxu:PRIMARY> 
【创建索引--这个是多键索引与嵌套文档存在区别,数组类字段类似】
备注:通常查询整体匹配比较少,查询单列无法使用索引.
xiaoxu:PRIMARY> db.inventory.createIndex({instock:1})
{
  "createdCollectionAutomatically" : false,
  "numIndexesBefore" : 1,
  "numIndexesAfter" : 2,
  "ok" : 1
}
【走IXSCAN】
db.inventory.find({ "instock": { warehouse: "Tracy", qty: 0 } } ).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 1,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 1,
  "totalDocsExamined" : 1,
  "executionStages" : {
    "stage" : "FETCH",
    "nReturned" : 1,
    "executionTimeMillisEstimate" : 0,
    "works" : 2,
    "advanced" : 1,
    "needTime" : 0,
    "needYield" : 0,
    "saveState" : 0,
    "restoreState" : 0,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 1,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 1,
      "executionTimeMillisEstimate" : 0,
      "works" : 2,
      "advanced" : 1,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "instock" : 1
      },
      "indexName" : "instock_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "instock" : [
          "instock"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "instock" : [
          "[{ warehouse: "Tracy", qty: 0.0 }, { warehouse: "Tracy", qty: 0.0 }]"
        ]
      },
      "keysExamined" : 1,
      "seeks" : 1,
      "dupsTested" : 1,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}
xiaoxu:PRIMARY>

3、如何查询满足单个条件列的数据

  • 通过数组字段加点(.)嵌套字段方式--数组内嵌套对象任意字段满足条件
  • 通过数组索引位置来查询嵌套字段--数组指定位置的嵌套字段满足条件

备注:通过数组字段.嵌套字段查询出73443--任意一个对象的warehouse满足即可.

通过数组位置来查询,发现第一个元素中包括warehouse=“xiaoxu”与

第二元素中包括不一样多.只匹配特定位置的warehouse

db.inventory.find({ "instock.warehouse":"xiaoxu"}).count();
73443
注意:数组中可以这么写db.inventory.find({ "instock":"xiaoxu"}),数组文档
中没有必须带是嵌套字段才可以.否则变成匹配整个文档
db.inventory.find({ "instock.0.warehouse":"xiaoxu"}).count();
73443
db.inventory.find({ "instock.1.warehouse":"xiaoxu"}).count();
26828
【无法使用instock:1多键索引】
备注:必须创建instock.warehouse这样索引才可以使用,但是使用数组位置查询
同样无法使用索引.
db.inventory.find({ "instock.warehouse":"xiaoxu"}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 73446,
  "executionTimeMillis" : 891,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 1100003,
  "executionStages" : {
    "stage" : "COLLSCAN",
    "filter" : {
      "instock.warehouse" : {
        "$eq" : "xiaoxu"
      }
    },
    "nReturned" : 73446,
    "executionTimeMillisEstimate" : 790,
    "works" : 1100005,
    "advanced" : 73446,
    "needTime" : 1026558,
    "needYield" : 0,
    "saveState" : 8593,
    "restoreState" : 8593,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 1100003
  }
}
db.inventory.find({ "instock.0.warehouse":"xiaoxu"}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 73443,
  "executionTimeMillis" : 1144,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 1100003,
  "executionStages" : {
    "stage" : "COLLSCAN",
    "filter" : {
      "instock.0.warehouse" : {
        "$eq" : "xiaoxu"
      }
    },
    "nReturned" : 73443,
    "executionTimeMillisEstimate" : 1040,
    "works" : 1100005,
    "advanced" : 73443,
    "needTime" : 1026561,
    "needYield" : 0,
    "saveState" : 8593,
    "restoreState" : 8593,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 1100003
  }
}

【创建使用数组字段点嵌套字段创建索引】
xiaoxu:PRIMARY> db.inventory.createIndex({"instock.warehouse":1})
{
  "createdCollectionAutomatically" : false,
  "numIndexesBefore" : 2,
  "numIndexesAfter" : 3,
  "ok" : 1
}
【可以使用索引】
db.inventory.find({ "instock.warehouse":"xiaoxu"}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 73446,
  "executionTimeMillis" : 123,
  "totalKeysExamined" : 73446,
  "totalDocsExamined" : 73446,
  "executionStages" : {
    "stage" : "FETCH",
    "nReturned" : 73446,
    "executionTimeMillisEstimate" : 100,
    "works" : 73447,
    "advanced" : 73446,
    "needTime" : 0,
    "needYield" : 0,
    "saveState" : 573,
    "restoreState" : 573,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 73446,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 73446,
      "executionTimeMillisEstimate" : 70,
      "works" : 73447,
      "advanced" : 73446,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 573,
      "restoreState" : 573,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "instock.warehouse" : 1
      },
      "indexName" : "instock.warehouse_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "instock.warehouse" : [
          "instock"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "instock.warehouse" : [
          "["xiaoxu", "xiaoxu"]"
        ]
      },
      "keysExamined" : 73446,
      "seeks" : 1,
      "dupsTested" : 73446,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}
【基于位置查询无法使用索引】
db.inventory.find({ "instock.0.warehouse":"xiaoxu"}).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 73443,
  "executionTimeMillis" : 1113,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 1100003,
  "executionStages" : {
    "stage" : "COLLSCAN",
    "filter" : {
      "instock.0.warehouse" : {
        "$eq" : "xiaoxu"
      }
    },
    "nReturned" : 73443,
    "executionTimeMillisEstimate" : 1000,
    "works" : 1100005,
    "advanced" : 73443,
    "needTime" : 1026561,
    "needYield" : 0,
    "saveState" : 8593,
    "restoreState" : 8593,
    "isEOF" : 1,
    "invalidates" : 0,
    "direction" : "forward",
    "docsExamined" : 1100003
  }
}
xiaoxu:PRIMARY> 

4、如何查询满足多个条件列的数据

  • 至少1个嵌套文档同时满足多个条件--需要使用$elemMatch(此时不分区字段顺序)
  • 至少1个嵌套文档满足A条件或者满足B条件--注意没有同时且满足条件的文档 可以跨越多个嵌套文档,这个就是是否使用$elemMatch的区别

执行计划不同:

  • 使用$elemMatch完全匹配2个条件,即索引边界问题
  • 不使用$elemMatch,只能使用前导列进行匹配,剩下列需要回表后过滤

例如MYSQL有索引下推或者索引过滤,Oracle可以直接在索引后过滤(已包括在索引,否则要回表).

【单个文档等值匹配】
db.inventory.find( { "instock": { $elemMatch: 
{ qty: 100061, warehouse: "xiaoxu" } } } ).pretty();
{
  "_id" : ObjectId("5f3cd83626ca128ad92b830b"),
  "name" : "xiaoxu",
  "item" : {
    "name" : "xiaoxu",
    "manufactured" : 100051
  },
  "instock" : [
    {
      "warehouse" : "xiaoxu",
      "qty" : 100051
    },
    {
      "warehouse" : "xiaoxu",
      "qty" : 100061
    }
  ],
  "ratings" : [
    100051,
    100151
  ],
  "added_at" : ISODate("2020-08-19T07:43:50.040Z"),
  "number" : 4062
}
xiaoxu:PRIMARY> 
【数组内嵌套文档只有组合可以满足即可or关系warehouse&qty】
第一个满足的是在同一个嵌套文档内,第二个是分布在1个数组内2个文档
db.inventory.find( {"instock.warehouse": "xiaoxu",
"instock.qty": 10061 } ).pretty();
{
  "_id" : ObjectId("5f3cf38026ca128ad93ac518"),
  "item" : {
    "name" : "xiaoxu",
    "manufactured" : 100051
  },
  "instock" : [
    {
      "warehouse" : "xiaoxing",
      "qty" : 10061
    },
    {
      "warehouse" : "xiaoxu",
      "qty" : 100061
    }
  ],
  "ratings" : [
    100051,
    100151
  ],
  "added_at" : ISODate("2020-08-19T07:43:50.040Z"),
  "number" : 4062
}
{
  "_id" : ObjectId("5f3cf3c226ca128ad93ac519"),
  "item" : {
    "name" : "xiaoxu",
    "manufactured" : 100051
  },
  "instock" : [
    {
      "warehouse" : "xiaoxing",
      "qty" : 10061
    },
    {
      "warehouse" : "xiaoxu",
      "qty" : 100071
    }
  ],
  "ratings" : [
    100051,
    100151
  ],
  "added_at" : ISODate("2020-08-19T07:43:50.040Z"),
  "number" : 4062
}

【执行计划差别】
备注:索引高效:totalKeysExamined=totalKeysExamined=nReturned
db.inventory.find( { "instock": { $elemMatch: { qty: 100061, warehouse: "xiaoxu" } } } ).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 2,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 2,
  "totalDocsExamined" : 2,
  "executionStages" : {
    "stage" : "FETCH",
    "filter" : {
      "instock" : {
        "$elemMatch" : {
          "$and" : [
            {
              "warehouse" : {
                "$eq" : "xiaoxu"
              }
            },
            {
              "qty" : {
                "$eq" : 100061
              }
            }
          ]
        }
      }
    },
    "nReturned" : 2,
    "executionTimeMillisEstimate" : 0,
    "works" : 3,
    "advanced" : 2,
    "needTime" : 0,
    "needYield" : 0,
    "saveState" : 0,
    "restoreState" : 0,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 2,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 2,
      "executionTimeMillisEstimate" : 0,
      "works" : 3,
      "advanced" : 2,
      "needTime" : 0,
      "needYield" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "instock.warehouse" : 1,
        "instock.qty" : 1
      },
      "indexName" : "instock.warehouse_1_instock.qty_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "instock.warehouse" : [
          "instock"
        ],
        "instock.qty" : [
          "instock"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "instock.warehouse" : [
          "["xiaoxu", "xiaoxu"]"
        ],
        "instock.qty" : [
          "[100061.0, 100061.0]"
        ]
      },
      "keysExamined" : 2,
      "seeks" : 1,
      "dupsTested" : 2,
      "dupsDropped" : 0,
      "seenInvalidated" : 0
    }
  }
}

db.inventory.find( {"instock.warehouse": "xiaoxu",
 "instock.qty": 10061 } ).explain("executionStats").executionStats;
{
  "executionSuccess" : true,
  "nReturned" : 3,
  "executionTimeMillis" : 190,
  "totalKeysExamined" : 100274,
  "totalDocsExamined" : 73446,
  "executionStages" : {
    "stage" : "FETCH",
    "filter" : {
      "instock.qty" : {
        "$eq" : 10061
      }
    },
    "nReturned" : 3,
    "executionTimeMillisEstimate" : 180,
    "works" : 100275,
    "advanced" : 3,
    "needTime" : 100271,
    "needYield" : 0,
    "saveState" : 783,
    "restoreState" : 783,
    "isEOF" : 1,
    "invalidates" : 0,
    "docsExamined" : 73446,
    "alreadyHasObj" : 0,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 73446,
      "executionTimeMillisEstimate" : 60,
      "works" : 100275,
      "advanced" : 73446,
      "needTime" : 26828,
      "needYield" : 0,
      "saveState" : 783,
      "restoreState" : 783,
      "isEOF" : 1,
      "invalidates" : 0,
      "keyPattern" : {
        "instock.warehouse" : 1,
        "instock.qty" : 1
      },
      "indexName" : "instock.warehouse_1_instock.qty_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "instock.warehouse" : [
          "instock"
        ],
        "instock.qty" : [
          "instock"
        ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "instock.warehouse" : [
          "["xiaoxu", "xiaoxu"]"
        ],
        "instock.qty" : [
          "[MinKey, MaxKey]"
        ]
      },
      "keysExamined" : 100274,
      "seeks" : 1,
      "dupsTested" : 100274,
      "dupsDropped" : 26828,
      "seenInvalidated" : 0
    }
  }
}

5、分析这个写法执行计划

【具体SQL】

db.inventory.find( {"instock.warehouse": "xiaoxu","instock.qty": 10061 } )

执行效率:totalKeysExamined>totalDocsExamined>nReturned

扫描索引key与返回记录相差3万倍.

  "nReturned" : 3,
  "executionTimeMillis" : 190,
  "totalKeysExamined" : 100274,
  "totalDocsExamined" : 73446
  
  下面这种写法都是相等才2条
  db.inventory.find( { "instock": 
  { $elemMatch: { qty: 100061, warehouse: "xiaoxu" } } } ).explain("executionStats").executionStats;
  "nReturned" : 2,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 2,
  "totalDocsExamined" : 2,

【为什么慢】

  • 效率低只由写法决定,这个逻辑是instock中至少存在一个嵌套文档中包含warehouse等于“xiaoxu”且至少一个文档中qty等10061. 相当于SQL逻辑如下
  • where instock.warehouse="xiaoxu" or "instock.qty"=10061
  • 只能匹配嵌套文档第一个字段,后续字段需要回表匹配.

【执行计划】

db.inventory.find( {"instock.warehouse": "xiaoxu",
 "instock.qty": 10061 } ).explain("executionStats").executionStats;
{
  "nReturned" : 3,
  "executionTimeMillis" : 190,
  "totalKeysExamined" : 100274,
  "totalDocsExamined" : 73446,
  "executionStages" : {
    "stage" : "FETCH",
    "filter" : {
      "instock.qty" : {"$eq" : 10061}},
    "nReturned" : 3,
    "docsExamined" : 73446,
    "inputStage" : {
      "stage" : "IXSCAN",
      "nReturned" : 73446,
      "keyPattern" : {
        "instock.warehouse" : 1,
        "instock.qty" : 1
      },
      "indexName" : "instock.warehouse_1_instock.qty_1",
      "isMultiKey" : true,
      "multiKeyPaths" : {
        "instock.warehouse" : ["instock"],
        "instock.qty" : ["instock"]
      },
      "indexBounds" : {
        "instock.warehouse" : ["["xiaoxu", "xiaoxu"]"],
        "instock.qty" : ["[MinKey, MaxKey]"]},
      "keysExamined" : 100274

【分析keysExamined&docsExamined&nReturned】

keysExamined:由分别统计不同嵌套文档累计加起来的个数,如果嵌套文档越多,

keysExamined越大.

docsExamined:由keysExamined合并回表的记录.因为or存在一条记录多次统计,

去掉key会表总记录数.

nReturned:因为这种写法只能使用匹配前导列,回表之后匹配嵌套文档是否存在qty

等于1061这个值,存在这个返回整个记录.经过过滤7万多文档只有3条满足.

db.inventory.find( {"instock.warehouse": "xiaoxu" } ).count()
73446
db.inventory.find( {"instock.0.warehouse": "xiaoxu" } ).count()
73443
db.inventory.find( {"instock.1.warehouse": "xiaoxu" } ).count()
26831
3443+26831;
100274

【总结】

  • 如果对数组类型字段操作时,需要判断元素同时满足多个条件时,

需要使用elemMatch,类似关系型数据库中and.如果不是使用elemMatch,则逻辑变成关系型中or操作.例如select * from dba_objects where owner='HR' or object_id=100.oracle中12c之前需要改写才能获得好的执行计划.

  • 为什么建议使用elemMatch,如果不使用elemMatch,虽然可以使用索引,但是

只能匹配前导列,后续字段只能回表过滤,无法在索引中过滤.如果能在索引中过滤,类似MYSQL ICP或者ORACLE 索引FILTER。不知道是否多键索引问题

  • 数组与数组嵌套文档创建索引以及用法类似,相当于结合数组、嵌套文档形成数组文档.