Ceph Cluster Common Commands Reference

Date: 2022-07-22
This article introduces the commands commonly used to administer a Ceph cluster, including usage examples, practical tips, key points, and caveats.

First, change into the Ceph cluster's working directory.

Interactive mode

Type ceph by itself to enter interactive mode:
$ ceph

View the cluster status
ceph> status

Check the cluster's health
ceph> health

View the monitors' status
ceph> mon_status
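
A sample session might look like this (a minimal sketch; the HEALTH_OK output is illustrative and will differ on your cluster, and you can leave the shell with quit or Ctrl-D):

$ ceph
ceph> health
HEALTH_OK
ceph> quit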

Check cluster status

ceph status or ceph -s

Check OSD status

ceph osd stat
or
ceph osd dump

$ ceph osd stat
     osdmap e30: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds

OSD states

in: the OSD is in the cluster

out: the OSD is out of the cluster

up: the OSD is running

down: the OSD is not running

An OSD that is up can be either in or out of the cluster. If an OSD that was up and in becomes up and out, Ceph migrates its PGs to other OSDs. Once an OSD becomes out, CRUSH stops assigning PGs to it. A down OSD will eventually become out as well: by default, Ceph marks an OSD out 300 seconds after it goes down.
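
During planned maintenance you usually want to suppress this automatic out-marking so Ceph does not start rebalancing. A minimal sketch using the standard noout flag (mon.ceph-admin is the monitor name on this cluster; mon_osd_down_out_interval is the option behind the 300 s timeout, but verify the default on your release):

# Keep down OSDs from being marked out while you work
ceph osd set noout
# ... perform maintenance, restart OSDs, etc. ...
ceph osd unset noout

# Inspect the down-to-out timeout via the monitor's admin socket
ceph daemon mon.ceph-admin config get mon_osd_down_out_interval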

If an OSD is down and in, something is definitely wrong, and the cluster will not be in a healthy state.

We can also view OSD status in more detail:

$ ceph osd tree
ID WEIGHT  TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 0.04376 root default                                          
-2 0.01459     host ceph-node1                                   
 0 0.01459         osd.0            up  1.00000          1.00000 
-3 0.01459     host ceph-node2                                   
 1 0.01459         osd.1            up  1.00000          1.00000 
-4 0.01459     host ceph-node3                                   
 2 0.01459         osd.2            up  1.00000          1.00000

So how do you start an OSD that is down?

sudo systemctl start ceph-osd@{id}

Here {id} is the OSD ID shown in the tree above (0, 1, or 2).
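
For example, to start osd.1 and have it come up at boot (the enable step assumes the standard ceph-osd@.service unit shown below):

sudo systemctl start ceph-osd@1
sudo systemctl enable ceph-osd@1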

Check the status of any one of them:

# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
   Active: active (running) since Mon 2019-05-13 16:36:41 CST; 3 days ago
 Main PID: 2277 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
           └─2277 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

May 17 09:31:11 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:11.726254 7f52e5898700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.155:6806 osd.1 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:51.726252)
May 17 09:31:11 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:11.726264 7f52e5898700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.156:6806 osd.2 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:51.726252)
May 17 09:31:12 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:12.538595 7f53030d3700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.155:6806 osd.1 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:52.538592)
May 17 09:31:12 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:12.538653 7f53030d3700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.156:6806 osd.2 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:52.538592)
May 17 09:31:13 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:13.427236 7f52e5898700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.155:6806 osd.1 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:53.427231)
May 17 09:31:13 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:13.427241 7f52e5898700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.156:6806 osd.2 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:53.427231)
May 17 09:31:13 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:13.539323 7f53030d3700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.155:6806 osd.1 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:53.539302)
May 17 09:31:13 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:13.539388 7f53030d3700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.156:6806 osd.2 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:53.539302)
May 17 09:31:14 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:14.541007 7f53030d3700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.155:6806 osd.1 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:54.541004)
May 17 09:31:14 ceph-node1 ceph-osd[2277]: 2019-05-17 09:31:14.541025 7f53030d3700 -1 osd.0 23 heartbeat_check: no reply from 192.168.152.156:6806 osd.2 since back 2019-05-17 09:30:19.920217 front 2019-05-17 09:30:...09:30:54.541004)
Hint: Some lines were ellipsized, use -l to show in full.

Check monitor status

ceph mon stat
or
ceph mon dump
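
On this cluster the output would look roughly like the following (reconstructed from the monmap shown below; treat it as illustrative):

$ ceph mon stat
e1: 1 mons at {ceph-admin=192.168.152.153:6789/0}, election epoch 3, quorum 0 ceph-admin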

Check the monitors' quorum status

$ ceph quorum_status -f json-pretty

{
    "election_epoch": 3,
    "quorum": [
        0
    ],
    "quorum_names": [
        "ceph-admin"
    ],
    "quorum_leader_name": "ceph-admin",
    "monmap": {
        "epoch": 1,
        "fsid": "dd6219bd-db70-46dc-82fb-f5ea31cfa727",
        "modified": "2019-05-13 16:33:58.012658",
        "created": "2019-05-13 16:33:58.012658",
        "mons": [
            {
                "rank": 0,
                "name": "ceph-admin",
                "addr": "192.168.152.153:6789/0"
            }
        ]
    }
}
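
When scripting, the JSON output is easy to post-process. A small sketch, assuming jq is installed (jq is not part of Ceph):

# Print just the quorum leader's name
ceph quorum_status -f json | jq -r '.quorum_leader_name'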

Check MDS status

ceph mds stat
or
ceph mds dump
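
Note that on releases newer than the one used here, ceph mds dump has been deprecated in favor of the filesystem-oriented equivalent (verify against your version):

ceph fs dump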

Check PG status

ceph pg stat
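
For more than the one-line summary, per-PG detail is available (both commands are standard, though their output is verbose):

# Full per-PG table
ceph pg dump
# Only PGs stuck in problem states (inactive, unclean, stale, ...)
ceph pg dump_stuck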

View where the OSDs sit in the CRUSH map

$ ceph osd tree
(The output is the same tree shown above: each OSD appears under its host bucket beneath the default root.)

Watch events happening in the cluster

$ ceph -w
    cluster dd6219bd-db70-46dc-82fb-f5ea31cfa727      # cluster ID
     health HEALTH_OK                                 # cluster health status
     monmap e1: 1 mons at {ceph-admin=192.168.152.153:6789/0}
            election epoch 3, quorum 0 ceph-admin
      fsmap e7: 1/1/1 up {0=ceph-admin=up:active}
     osdmap e30: 3 osds: 3 up, 3 in
            flags sortbitwise,require_jewel_osds
      pgmap v123: 84 pgs, 3 pools, 4516 bytes data, 20 objects
            323 MB used, 45723 MB / 46046 MB avail
                  84 active+clean

2019-05-17 09:31:52.670689 mon.0 [INF] pgmap v123: 84 pgs: 84 active+clean; 4516 bytes data, 323 MB used, 45723 MB / 46046 MB avail
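
ceph -w follows the cluster log indefinitely; stop it with Ctrl-C. The CLI also accepts per-level watch flags (these existed in releases of this era, but confirm with ceph --help on yours):

# Only show warning-level or error-level cluster log entries
ceph -w --watch-warn
ceph -w --watch-error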

Check cluster usage

$ ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED 
    46046M     45723M         323M          0.70 
POOLS:
    NAME                ID     USED     %USED     MAX AVAIL     OBJECTS 
    rbd                 0         0         0        14473M           0 
    cephfs_data         1         0         0        14473M           0 
    cephfs_metadata     2      4516         0        14473M          20 


SIZE: total capacity of the cluster
AVAIL: total free space available in the cluster
RAW USED: total raw storage consumed
%RAW USED: percentage of raw storage consumed
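
For per-pool specifics beyond the table above, the detail variant of the same command can be used (standard, though its columns vary by release):

ceph df detail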