MongoDB Cluster Range Sharding

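In an earlier post I built a product-center service. This time I wanted to pull down some of the matching product review data and store it in a sharded MongoDB cluster, so this post records how I used MongoDB sharding. Only range sharding is covered here.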

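The reason to shard is simply that the data has grown too large: to improve query and write speed and bandwidth, the data is split evenly across different nodes, and that is the benefit of a sharded cluster. Having a sharded cluster does not by itself improve your query and read/write bandwidth, though. It also depends on how you configure the shard key, and a badly chosen key wastes the cluster. Note that when using a sharded cluster you must explicitly shard each collection; if you do not, all of its data stays on a single shard.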

Range sharding

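MongoDB splits data into chunks by ranges of the shard key value, and each chunk holds the documents whose keys fall in one contiguous range. This approach suits shard keys whose values vary within a relatively fixed range and are not monotonically increasing or decreasing, and workloads that rely on range queries.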

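  • Pro: mongos can quickly locate the data a request needs and route the request to the shard that holds it
  • Con: data may be distributed unevenly across shards, which easily leads to data skew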
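
The skew risk can be seen in a small simulation. The sketch below is not MongoDB's routing code; it assumes hypothetical split points (`SPLIT_POINTS`) and uses a binary search to place each key in its chunk, comparing random keys with monotonically increasing ones.

```python
import bisect
import random
from collections import Counter

# Hypothetical split points: 4 chunks covering (-inf, 2500), [2500, 5000),
# [5000, 7500) and [7500, +inf). Lower bounds are inclusive, upper bounds exclusive.
SPLIT_POINTS = [2500, 5000, 7500]


def chunk_index(key, split_points=SPLIT_POINTS):
    """Return the index of the chunk whose range contains `key`."""
    return bisect.bisect_right(split_points, key)


def distribution(keys):
    """Count how many keys land in each chunk."""
    counts = Counter(chunk_index(k) for k in keys)
    return [counts.get(i, 0) for i in range(len(SPLIT_POINTS) + 1)]


random.seed(0)
random_keys = [random.randrange(0, 10000) for _ in range(10000)]
# Keys that only ever grow, e.g. an auto-increment id created after the splits.
monotonic_keys = range(10000, 20000)

print(distribution(random_keys))     # roughly even across the four chunks
print(distribution(monotonic_keys))  # [0, 0, 0, 10000]
```

Every monotonically increasing key falls into the last chunk, so a single shard would absorb all new writes until that chunk is split and migrated.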

Generating some test data with Python's Faker

[root@mongodb-server ~]# cat comment.py
from faker import Faker
from pymongo import MongoClient

# Create a Faker instance
fake = Faker()

# Create the MongoDB client
client = MongoClient('mongodb://127.0.0.1:38017/')

# Get the database and collection
db = client['mydatabase']
collection = db['productReviews']

# Generate random review documents and insert them one by one
for i in range(10000000):

    review = {
        'reviewId': i + 1,
        'spu': fake.random_int(min=100000, max=999999),
        'sku': fake.random_int(min=1000000000, max=9999999999),
        'userName': fake.name(),
        'rating': fake.random_int(min=1, max=5),
        'title': fake.sentence(nb_words=6),
        'content': fake.paragraph(nb_sentences=3),
        'createDate': fake.date_time_between(start_date='-30d', end_date='now'),
        'updateTime': fake.date_time_between(start_date='-30d', end_date='now')
    }
    collection.insert_one(review)
    print(review)
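
Inserting ten million documents one at a time costs one round trip per document. A possible variation, sketched below, groups documents and sends each group with `insert_many`. It assumes `collection` is a pymongo collection like the one above; the helper names `batches` and `load` and the batch size of 1000 are made up for illustration.

```python
from itertools import islice


def batches(docs, size):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(collection, docs, size=1000):
    """Insert `docs` in batches and return how many were sent.

    Assumption: `collection` exposes pymongo's insert_many(documents, ordered=...).
    ordered=False lets the server continue past an individual failed document.
    """
    total = 0
    for batch in batches(docs, size):
        collection.insert_many(batch, ordered=False)
        total += len(batch)
    return total
```

With the script above, the documents could be produced by a generator that yields one `review` dict per iteration and passed to `load(collection, reviews)`.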

Enabling sharding on the database

mongos> use mydatabase
switched to db mydatabase
mongos> sh.enableSharding('mydatabase')
{
        "ok" : 1,
        "operationTime" : Timestamp(1678692768, 2),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1678692768, 2),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

# Check the db stats at this point: all the data is on shard03
mongos> db.stats()
{
        "raw" : {
                "shard03/mongodb-server:38025,mongodb-server:38026,mongodb-server:38027" : {
                        "db" : "mydatabase",
                        "collections" : 1,
                        "views" : 0,
                        "objects" : 1443422,
                        "avgObjSize" : 291.4877062979503,
                        "dataSize" : 420739768,
                        "storageSize" : 263249920,
                        "indexes" : 1,
                        "indexSize" : 24645632,
                        "totalSize" : 287895552,
                        "scaleFactor" : 1,
                        "fsUsedSize" : 9247023104,
                        "fsTotalSize" : 39700664320,
                        "ok" : 1
                },
                "shard01/mongodb-server:38019,mongodb-server:38020,mongodb-server:38021" : {
                        "db" : "mydatabase",
                        "collections" : 0,
                        "views" : 0,
                        "objects" : 0,
                        "avgObjSize" : 0,
                        "dataSize" : 0,
                        "storageSize" : 0,
                        "totalSize" : 0,
                        "indexes" : 0,
                        "indexSize" : 0,
                        "scaleFactor" : 1,
                        "fileSize" : 0,
                        "fsUsedSize" : 0,
                        "fsTotalSize" : 0,
                        "ok" : 1
                },
                "shard02/mongodb-server:38022,mongodb-server:38023,mongodb-server:38024" : {
                        "db" : "mydatabase",
                        "collections" : 0,
                        "views" : 0,
                        "objects" : 0,
                        "avgObjSize" : 0,
                        "dataSize" : 0,
                        "storageSize" : 0,
                        "totalSize" : 0,
                        "indexes" : 0,
                        "indexSize" : 0,
                        "scaleFactor" : 1,
                        "fileSize" : 0,
                        "fsUsedSize" : 0,
                        "fsTotalSize" : 0,
                        "ok" : 1
                }
        },
        "objects" : 1443422,
        "avgObjSize" : 291,
        "dataSize" : 420739768,
        "storageSize" : 263249920,
        "totalSize" : 287895552,
        "indexes" : 1,
        "indexSize" : 24645632,
        "scaleFactor" : 1,
        "fileSize" : 0,
        "ok" : 1,
        "operationTime" : Timestamp(1678692775, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1678692777, 3),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

Creating an index on the sku field

mongos> db.productReviews.findOne()
{
        "_id" : ObjectId("640eba227572fb6e8d819d9c"),
        "reviewId" : 1,
        "spu" : 196170,
        "sku" : NumberLong("6291367729"),
        "userName" : "Jeffrey Durham",
        "rating" : 5,
        "title" : "For well exactly sound perform hotel sell.",
        "content" : "Answer candidate hit. Determine interesting society. Include science evidence begin data wish vote.",
        "createDate" : ISODate("2023-02-12T07:42:50Z"),
        "updateTime" : ISODate("2023-02-15T18:24:33Z")
}
mongos> db.productReviews.createIndex({"sku": 1})
{
        "raw" : {
                "shard03/mongodb-server:38025,mongodb-server:38026,mongodb-server:38027" : {
                        "createdCollectionAutomatically" : false,
                        "numIndexesBefore" : 1,
                        "numIndexesAfter" : 2,
                        "commitQuorum" : "votingMembers",
                        "ok" : 1
                }
        },
        "ok" : 1,
        "operationTime" : Timestamp(1678692925, 5),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1678692925, 5),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

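Sharding the collection on the sku field

Note: choose the shard key carefully. On the 4.4 cluster used here, an existing shard key cannot be replaced with a different one; it can only be refined by appending suffix fields with refineCollectionShardKey. Full resharding with reshardCollection only arrived in MongoDB 5.0.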

mongos> use config
switched to db config
mongos> sh.shardCollection('mydatabase.productReviews', {"sku": 1})
{
        "collectionsharded" : "mydatabase.productReviews",
        "collectionUUID" : UUID("b75a543a-4fd7-4295-bf6c-dbff2dfa6ac4"),
        "ok" : 1,
        "operationTime" : Timestamp(1678693048, 13),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1678693048, 13),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
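
The same three steps (enable sharding, index the key, shard the collection) can also be driven from Python. This is a minimal sketch, assuming `client` is a pymongo `MongoClient` connected to a mongos; the function name `shard_collection` is hypothetical, while `enableSharding` and `shardCollection` are the server commands behind the shell helpers used above.

```python
def shard_collection(client, db_name, coll_name, key_field):
    """Enable sharding on a database and range-shard one collection.

    Assumption: `client` is a pymongo MongoClient connected to a mongos.
    """
    admin = client.admin
    admin.command("enableSharding", db_name)
    # A non-empty collection needs an index that starts with the shard key.
    client[db_name][coll_name].create_index([(key_field, 1)])
    # {key_field: 1} requests range sharding on that field.
    return admin.command(
        "shardCollection", f"{db_name}.{coll_name}", key={key_field: 1}
    )
```

For this post the call would be `shard_collection(client, "mydatabase", "productReviews", "sku")`.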

Now check the shard status again

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("640eb4e77a504d88d33c6581")
  }
  shards:
        {  "_id" : "shard01",  "host" : "shard01/mongodb-server:38019,mongodb-server:38020,mongodb-server:38021",  "state" : 1 }
        {  "_id" : "shard02",  "host" : "shard02/mongodb-server:38022,mongodb-server:38023,mongodb-server:38024",  "state" : 1 }
        {  "_id" : "shard03",  "host" : "shard03/mongodb-server:38025,mongodb-server:38026,mongodb-server:38027",  "state" : 1 }
  active mongoses:
        "4.4.19" : 2
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                686 : Success
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard01 342
                                shard02 341
                                shard03 341
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "mydatabase",  "primary" : "shard03",  "partitioned" : true,  "version" : {  "uuid" : UUID("d257af67-e4ae-4368-b313-0e6eea87a8f2"),  "lastMod" : 1 } }
                mydatabase.productReviews
                        shard key: { "sku" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard01 2
                                shard02 2
                                shard03 3
                        { "sku" : { "$minKey" : 1 } } -->> { "sku" : NumberLong("2439394823") } on : shard01 Timestamp(2, 0)
                        { "sku" : NumberLong("2439394823") } -->> { "sku" : NumberLong("3880973244") } on : shard02 Timestamp(3, 0)
                        { "sku" : NumberLong("3880973244") } -->> { "sku" : NumberLong("5313913946") } on : shard02 Timestamp(4, 0)
                        { "sku" : NumberLong("5313913946") } -->> { "sku" : NumberLong("6483173253") } on : shard01 Timestamp(5, 0)
                        { "sku" : NumberLong("6483173253") } -->> { "sku" : NumberLong("7656267505") } on : shard03 Timestamp(5, 1)
                        { "sku" : NumberLong("7656267505") } -->> { "sku" : NumberLong("8825674039") } on : shard03 Timestamp(1, 5)
                        { "sku" : NumberLong("8825674039") } -->> { "sku" : { "$maxKey" : 1 } } on : shard03 Timestamp(1, 6)
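
The chunk table above is all that is needed to work out where a given sku lives. The sketch below copies the boundaries and owners from that output and looks them up with a binary search. It only illustrates the idea of range routing, not how mongos is implemented, and the function names are made up.

```python
import bisect

# Chunk boundaries and owners copied from the sh.status() output above.
# Chunk i covers [BOUNDS[i-1], BOUNDS[i]); the first and last chunks are open-ended.
BOUNDS = [2439394823, 3880973244, 5313913946, 6483173253, 7656267505, 8825674039]
OWNERS = ["shard01", "shard02", "shard02", "shard01", "shard03", "shard03", "shard03"]


def shard_for(sku):
    """Return the shard that owns the chunk containing `sku`."""
    return OWNERS[bisect.bisect_right(BOUNDS, sku)]


def shards_for_range(low, high):
    """Return the shards that a query on low <= sku <= high has to visit."""
    first = bisect.bisect_right(BOUNDS, low)
    last = bisect.bisect_right(BOUNDS, high)
    return sorted(set(OWNERS[first:last + 1]))


print(shard_for(6291367729))                       # shard01
print(shards_for_range(3000000000, 5000000000))    # ['shard02']
print(shards_for_range(1000000000, 9000000000))    # ['shard01', 'shard02', 'shard03']
```

A narrow range on the shard key touches only the shards that own the matching chunks, which is the routing advantage listed for range sharding.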

Check the db stats at this point

mongos> db.stats()
{
        "raw" : {
                "shard01/mongodb-server:38019,mongodb-server:38020,mongodb-server:38021" : {
                        "db" : "mydatabase",
                        "collections" : 1,
                        "views" : 0,
                        "objects" : 418507,
                        "avgObjSize" : 290.25706380060547,
                        "dataSize" : 121474613,
                        "storageSize" : 78262272,
                        "indexes" : 2,
                        "indexSize" : 25698304,
                        "totalSize" : 103960576,
                        "scaleFactor" : 1,
                        "fsUsedSize" : 11241979904,
                        "fsTotalSize" : 39700664320,
                        "ok" : 1
                },
                "shard02/mongodb-server:38022,mongodb-server:38023,mongodb-server:38024" : {
                        "db" : "mydatabase",
                        "collections" : 1,
                        "views" : 0,
                        "objects" : 461228,
                        "avgObjSize" : 292.03945987667703,
                        "dataSize" : 134696776,
                        "storageSize" : 86073344,
                        "indexes" : 2,
                        "indexSize" : 22794240,
                        "totalSize" : 108867584,
                        "scaleFactor" : 1,
                        "fsUsedSize" : 11241979904,
                        "fsTotalSize" : 39700664320,
                        "ok" : 1
                },
                "shard03/mongodb-server:38025,mongodb-server:38026,mongodb-server:38027" : {
                        "db" : "mydatabase",
                        "collections" : 2,
                        "views" : 0,
                        "objects" : 1443422,
                        "avgObjSize" : 291.4877062979503,
                        "dataSize" : 420739768,
                        "storageSize" : 263254016,
                        "indexes" : 4,
                        "indexSize" : 42885120,
                        "totalSize" : 306139136,
                        "scaleFactor" : 1,
                        "fsUsedSize" : 11241979904,
                        "fsTotalSize" : 39700664320,
                        "ok" : 1
                }
        },
        "objects" : 2323157,
        "avgObjSize" : 291.0183892005577,
        "dataSize" : 676911157,
        "storageSize" : 427589632,
        "totalSize" : 518967296,
        "indexes" : 8,
        "indexSize" : 91377664,
        "scaleFactor" : 1,
        "fileSize" : 0,
        "ok" : 1,
        "operationTime" : Timestamp(1678693218, 2),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1678693224, 3),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

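Enabling the balancer

As the output above shows, the data is not yet evenly distributed across the shards, so we want the balancer to even it out. When balancing is enabled, MongoDB's balancer automatically migrates chunks of sharded collections between shards until the data is spread evenly, which balances the load and gets the most out of the cluster. Note that sh.enableBalancing() expects a collection namespace such as 'mydatabase.productReviews' and only toggles balancing for that one collection. Passing true matches no collection, which is why the result below reports nMatched: 0. The cluster-wide switch is sh.startBalancer() or sh.setBalancerState(true), and the earlier sh.status() output shows the balancer was already enabled.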

mongos> sh.enableBalancing(true)
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })

After a while, the data is distributed much more evenly than before

mongos> db.stats()
{
        "raw" : {
                "shard02/mongodb-server:38022,mongodb-server:38023,mongodb-server:38024" : {
                        "db" : "mydatabase",
                        "collections" : 1,
                        "views" : 0,
                        "objects" : 461228,
                        "avgObjSize" : 292.03945987667703,
                        "dataSize" : 134696776,
                        "storageSize" : 86073344,
                        "indexes" : 2,
                        "indexSize" : 22794240,
                        "totalSize" : 108867584,
                        "scaleFactor" : 1,
                        "fsUsedSize" : 11576156160,
                        "fsTotalSize" : 39700664320,
                        "ok" : 1
                },
                "shard01/mongodb-server:38019,mongodb-server:38020,mongodb-server:38021" : {
                        "db" : "mydatabase",
                        "collections" : 1,
                        "views" : 0,
                        "objects" : 418507,
                        "avgObjSize" : 290.25706380060547,
                        "dataSize" : 121474613,
                        "storageSize" : 78262272,
                        "indexes" : 2,
                        "indexSize" : 25698304,
                        "totalSize" : 103960576,
                        "scaleFactor" : 1,
                        "fsUsedSize" : 11576156160,
                        "fsTotalSize" : 39700664320,
                        "ok" : 1
                },
                "shard03/mongodb-server:38025,mongodb-server:38026,mongodb-server:38027" : {
                        "db" : "mydatabase",
                        "collections" : 2,
                        "views" : 0,
                        "objects" : 563687,
                        "avgObjSize" : 291.94992788551093,
                        "dataSize" : 164568379,
                        "storageSize" : 543461376,
                        "indexes" : 4,
                        "indexSize" : 85598208,
                        "totalSize" : 629059584,
                        "scaleFactor" : 1,
                        "fsUsedSize" : 11576156160,
                        "fsTotalSize" : 39700664320,
                        "ok" : 1
                }
        },
        "objects" : 1443422,
        "avgObjSize" : 291.0295970270648,
        "dataSize" : 420739768,
        "storageSize" : 707796992,
        "totalSize" : 841887744,
        "indexes" : 8,
        "indexSize" : 134090752,
        "scaleFactor" : 1,
        "fileSize" : 0,
        "ok" : 1,
        "operationTime" : Timestamp(1678694656, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1678694656, 2),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
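
To put a number on "more evenly", the per-shard object counts from the output above can be turned into shares of the total. This is just arithmetic on figures already shown; the helper name `shares` is made up.

```python
# Per-shard document counts taken from the db.stats() output above.
objects = {"shard01": 418507, "shard02": 461228, "shard03": 563687}


def shares(counts):
    """Return each shard's fraction of the total document count."""
    total = sum(counts.values())
    return {shard: n / total for shard, n in counts.items()}


for shard, share in shares(objects).items():
    print(f"{shard}: {share:.1%}")
```

The three shards now hold roughly 29%, 32% and 39% of the 1,443,422 documents, compared with 100% on shard03 before the collection was sharded.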

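Disabling the automatic balancer

Note: the balancer runs in between data modifications (inserts, updates, deletes) to keep data distributed across the whole cluster, so chunk migrations can add performance overhead while those operations are executing. For that reason we do not leave it enabled all the time.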

mongos> sh.enableBalancing(false)
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })

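If the cluster is serving heavy business reads and writes while the balancer is on, the migrations add to the load, so the balancer is usually kept off at those times. The call above reported nMatched: 0 because sh.enableBalancing() takes a collection namespace rather than a boolean; balancing for a single collection is turned off with sh.disableBalancing('<namespace>'). A running balancer can also be stopped manually for the whole cluster: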

sh.stopBalancer()

Scheduling a balancing window

use config
db.settings.update(
   { _id: "balancer" },
   { $set: { activeWindow : { start : "<start-time>", stop : "<stop-time>" } } },
   { upsert: true }
)
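  • <start-time>: start time in HH:MM format (local time of the region where the instance runs); HH ranges from 00 to 23 and MM from 00 to 59.
  • <stop-time>: stop time in HH:MM format (local time of the region where the instance runs); HH ranges from 00 to 23 and MM from 00 to 59.
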
mongos> use config
switched to db config
mongos> db.settings.update(
...    { _id: "balancer" },
...    { $set: { activeWindow : { start : "03:00", stop : "06:30" } } },
...    { upsert: true }
... )
WriteResult({ "nMatched" : 0, "nUpserted" : 1, "nModified" : 0, "_id" : "balancer" })
mongos>
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("640eb4e77a504d88d33c6581")
  }
  shards:
        {  "_id" : "shard01",  "host" : "shard01/mongodb-server:38019,mongodb-server:38020,mongodb-server:38021",  "state" : 1 }
        {  "_id" : "shard02",  "host" : "shard02/mongodb-server:38022,mongodb-server:38023,mongodb-server:38024",  "state" : 1 }
        {  "_id" : "shard03",  "host" : "shard03/mongodb-server:38025,mongodb-server:38026,mongodb-server:38027",  "state" : 1 }
  active mongoses:
        "4.4.19" : 2
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
                Balancer active window is set between 03:00 and 06:30 server local time
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                686 : Success
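
To reason about whether a given moment falls inside such a window, including windows that cross midnight, a small check like the one below can help. It is only a sketch of the HH:MM comparison. Treating the start as inclusive and the stop as exclusive is an assumption, and the function names are made up.

```python
from datetime import time


def parse_hhmm(value):
    """Parse an 'HH:MM' string; time() raises ValueError if HH or MM is out of range."""
    hours, minutes = value.split(":")
    return time(int(hours), int(minutes))


def in_window(now, start, stop):
    """Return True if `now` falls inside the window from start to stop.

    Assumption: start is inclusive and stop is exclusive.
    A window such as 23:00 to 02:00 wraps past midnight.
    """
    s, e = parse_hhmm(start), parse_hhmm(stop)
    if s <= e:
        return s <= now < e
    return now >= s or now < e


print(in_window(time(4, 0), "03:00", "06:30"))   # True
print(in_window(time(7, 0), "03:00", "06:30"))   # False
print(in_window(time(1, 0), "23:00", "02:00"))   # True
```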

Removing the balancing window

db.settings.update({ _id : "balancer" }, { $unset : { activeWindow : true } })                

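xadocker
Copyright notice: original article from this site, published by xadocker on 2023-03-13.
Reposting: unless otherwise stated, articles on this site are released under the CC-4.0 license; please credit the source when reposting.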