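Building a Highly Available k8s Cluster with kubeadm

Preparing the Environment

Servers

I am using five CentOS 7.7 virtual machines for this walkthrough; the details are in the table below:

OS version	IP address	Node role	CPU	Memory	Hostname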
CentOS-7.7	192.168.243.138	master	>=2	>=2G	m1
CentOS-7.7	192.168.243.136	master	>=2	>=2G	m2
CentOS-7.7	192.168.243.141	master	>=2	>=2G	m3
CentOS-7.7	192.168.243.139	worker	>=2	>=2G	s1
CentOS-7.7	192.168.243.140	worker	>=2	>=2G	s2

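Docker must be installed on all five machines beforehand. The installation is straightforward and is not covered here; see the official documentation: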

https://docs.docker.com/engine/install/centos/

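System Settings (All Nodes)

1. Every node must have a unique hostname, and all nodes must be able to reach each other by hostname. Set the hostname:

# Show the current hostname
$ hostname
# Change the hostname
$ hostnamectl set-hostname <your_hostname>

Edit /etc/hosts so that all nodes can reach each other by hostname: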

$ vim /etc/hosts
192.168.243.138 m1
192.168.243.136 m2
192.168.243.141 m3
192.168.243.139 s1
192.168.243.140 s2
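
Editing /etc/hosts by hand on five machines is easy to get wrong. The following is a minimal sketch, not part of the original setup, of a helper that appends an entry only when the hostname is not mapped yet. The function name add_host_entry and the HOSTS_FILE variable are hypothetical; by default the sketch writes to a scratch file rather than the real /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is already mapped.
# HOSTS_FILE defaults to a scratch file so the sketch can be tried without root;
# set HOSTS_FILE=/etc/hosts (as root) to apply it for real.
HOSTS_FILE="${HOSTS_FILE:-/tmp/hosts.sketch}"

add_host_entry() {
    ip="$1"
    name="$2"
    touch "$HOSTS_FILE"
    # Look for NAME as a whole field after the address column, ignoring comment lines.
    if awk -v n="$name" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$HOSTS_FILE"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
}

# The five nodes from the server table.
add_host_entry 192.168.243.138 m1
add_host_entry 192.168.243.136 m2
add_host_entry 192.168.243.141 m3
add_host_entry 192.168.243.139 s1
add_host_entry 192.168.243.140 s2

cat "$HOSTS_FILE"
```

Because existing names are skipped, the script can be re-run on every node without producing duplicate lines.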

2. Install the dependencies:

# Update installed packages
$ yum update
# Install the dependencies
$ yum install -y conntrack ipvsadm ipset jq sysstat curl iptables libseccomp

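3. Disable the firewall, swap, SELinux and dnsmasq, and reset iptables:

# Stop and disable the firewall
$ systemctl stop firewalld && systemctl disable firewalld
# Reset iptables
$ iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT
# Turn swap off now, and comment out the swap entries in /etc/fstab so it stays off after a reboot
$ swapoff -a
$ sed -i '/swap/s/^\(.*\)$/#\1/g' /etc/fstab
# Put SELinux into permissive mode (lasts until the next reboot)
$ setenforce 0
# Stop dnsmasq (otherwise Docker containers may fail to resolve domain names)
$ service dnsmasq stop && systemctl disable dnsmasq
# Restart the Docker service
$ systemctl restart docker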
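
The intent of the sed step is to comment out every swap entry in /etc/fstab so that swap stays disabled after a reboot. Before touching the real file, the expression can be tried on a scratch copy. This is only a sketch: the helper name comment_swap_lines and the two sample entries are invented for illustration, and the capture group is written in escaped basic-regex form.

```shell
#!/bin/sh
# comment_swap_lines FILE: print FILE with every line that mentions "swap" prefixed by '#'.
# Writes to stdout only, so the input file is left untouched.
comment_swap_lines() {
    sed '/swap/s/^\(.*\)$/#\1/' "$1"
}

# Scratch copy with two invented entries, standing in for /etc/fstab.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
/dev/mapper/centos-root /    xfs  defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
EOF

comment_swap_lines "$sample"
```

Note that the expression matches any line containing "swap", including one that is already commented out, so running it twice adds a second '#'.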

4. Set the kernel parameters:

# Create the configuration file
$ cat > /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
EOF
# Apply the settings
$ sysctl -p /etc/sysctl.d/kubernetes.conf
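
After applying the file it can be worth confirming that the kernel actually accepted each value. The sketch below is an assumption on my part rather than a step from the original procedure; the function name verify_sysctl is hypothetical. It reads a key=value file and compares every key with the live value reported by sysctl -n.

```shell
#!/bin/sh
# verify_sysctl FILE: compare each "key=value" line in FILE with the live kernel value.
# A key the kernel does not know (or a missing sysctl binary) is reported as <missing>.
verify_sysctl() {
    while IFS='=' read -r key expected; do
        [ -n "$key" ] || continue
        actual="$(sysctl -n "$key" 2>/dev/null)"
        if [ "$actual" = "$expected" ]; then
            echo "ok       $key=$actual"
        else
            echo "MISMATCH $key: expected $expected, got ${actual:-<missing>}"
        fi
    done < "$1"
}

# Usage on a node (not run here):
#   verify_sysctl /etc/sysctl.d/kubernetes.conf
```

If the net.bridge.* keys come back as missing, the br_netfilter kernel module is probably not loaded; modprobe br_netfilter loads it.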

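Installing the Required Tools (All Nodes)

About the tools:

kubeadm: the command used to deploy the cluster
kubelet: a component that must run on every machine in the cluster; it manages the lifecycle of pods and containers
kubectl: the cluster management CLI (optional; only needed on the nodes from which you manage the cluster)

1. First, add the k8s yum repository: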

$ bash -c 'cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF'

2. Install the k8s components:

$ yum install -y kubelet kubeadm kubectl
$ systemctl enable --now kubelet.service

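Configuring kubectl Command Completion

kubectl is the command-line tool for interacting with a k8s cluster. Almost every k8s operation goes through it, so it supports a large number of commands. Fortunately kubectl can set up shell completion; kubectl completion -h shows setup examples for each platform. Taking Linux as the example, the steps below enable completion, after which commands can be completed with the Tab key: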

[root@m1 ~]# yum install bash-completion -y
[root@m1 ~]# source /usr/share/bash-completion/bash_completion
[root@m1 ~]# source <(kubectl completion bash)
[root@m1 ~]# mkdir -p ~/.kube
[root@m1 ~]# kubectl completion bash > ~/.kube/completion.bash.inc
[root@m1 ~]# printf "  
# Kubectl shell completion  
source '$HOME/.kube/completion.bash.inc'  
" >> $HOME/.bash_profile
[root@m1 ~]# source $HOME/.bash_profile

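Deploying the Highly Available Cluster

Deploying keepalived for apiserver High Availability (Any Two Master Nodes)

1. Install keepalived on two master nodes (one master, one backup) with the command below. I chose m1 and m2:

$ yum install -y keepalived

2. On both machines, create the directory that holds the keepalived configuration file:

$ mkdir -p /etc/keepalived

3. On m1 (the MASTER role), create the following configuration file: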

[root@m1 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
 router_id keepalive-master
}

vrrp_script check_apiserver {
 # Path to the check script
 script "/etc/keepalived/check-apiserver.sh"
 # Run the check every 3 seconds
 interval 3
 # Lower the priority by 2 while the check is failing
 weight -2
}

vrrp_instance VI-kube-master {
   state MASTER  # Role of this node
   interface ens32  # Network interface name
   virtual_router_id 68
   priority 100
   dont_track_primary
   advert_int 3
   virtual_ipaddress {
     # The virtual IP of your choice
     192.168.243.100
   }
   track_script {
       check_apiserver
   }
}

4. On m2 (the BACKUP role), create the following configuration file:

[root@m2 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
 router_id keepalive-backup
}

vrrp_script check_apiserver {
 script "/etc/keepalived/check-apiserver.sh"
 interval 3
 weight -2
}

vrrp_instance VI-kube-master {
   state BACKUP
   interface ens32
   virtual_router_id 68
   priority 99
   dont_track_primary
   advert_int 3
   virtual_ipaddress {
     192.168.243.100
   }
   track_script {
       check_apiserver
   }
}
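
The failover decision follows from simple arithmetic on the two priorities. With a negative weight, keepalived subtracts that amount from a node's priority while the tracked script is failing. The sketch below models this rule with the values from the two configuration files; the function name effective_priority is hypothetical and the code is an illustration, not keepalived's own logic.

```shell
#!/bin/sh
# effective_priority BASE WEIGHT CHECK_OK
# Models the rule for a negative weight: the weight is applied only
# while the tracked script is failing (CHECK_OK = 0).
effective_priority() {
    base="$1"
    weight="$2"
    ok="$3"
    if [ "$ok" -eq 1 ]; then
        echo "$base"
    else
        echo $((base + weight))
    fi
}

echo "m1, apiserver up:   $(effective_priority 100 -2 1)"
echo "m1, apiserver down: $(effective_priority 100 -2 0)"
echo "m2, apiserver up:   $(effective_priority 99 -2 1)"
```

When the check fails on m1 its priority drops to 98, below m2's 99, so m2 takes over the virtual IP. If the check fails on both nodes (98 versus 97), m1 keeps it.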

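5. Create the keepalived check script on both m1 and m2. The script is deliberately simple; extend it as your needs require. keepalived has to be able to execute it, so make it executable afterwards (chmod +x /etc/keepalived/check-apiserver.sh):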

$ vim /etc/keepalived/check-apiserver.sh
#!/bin/sh
netstat -ntlp |grep 6443 || exit 1
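
One way to tighten the check: grep 6443 matches any line that happens to contain those digits (a PID of 16443, for instance). The sketch below looks only at the local-address column instead. The helper name has_listener is hypothetical; it reads the output of ss -ntl or netstat -ntl from standard input, so it can be tried with canned lines.

```shell
#!/bin/sh
# has_listener PORT: succeed if stdin (output of `ss -ntl` or `netstat -ntl`)
# contains a socket whose local address ends in ":PORT".
has_listener() {
    port="$1"
    awk -v p=":$port" '
        # Column 4 is the local address in both `ss -ntl` and `netstat -ntl`.
        { n = length($4); m = length(p) }
        n >= m && substr($4, n - m + 1) == p { found = 1 }
        END { exit !found }
    '
}

# Canned example line; prints the message because the local address ends in :6443.
printf '%s\n' 'LISTEN 0 128 [::]:6443 [::]:*' | has_listener 6443 && echo "6443 is listening"

# Inside check-apiserver.sh this could be used as:
#   ss -ntl | has_listener 6443 || exit 1
```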

6. With the steps above complete, start keepalived:

# Start the keepalived service on both the master and the backup
$ systemctl enable keepalived && service keepalived start

# Check the status
$ service keepalived status

# Follow the logs
$ journalctl -f -u keepalived

# Look for the virtual IP
$ ip a

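Deploying the First k8s Master Node

In a cluster created with kubeadm, most components run as Docker containers, so kubeadm has to pull the corresponding component images while initializing the master node. By default kubeadm pulls them from Google's k8s.gcr.io, which cannot be reached from mainland China, so the pull fails there.

There are two ways around this: either ***, or manually pull the equivalent images from a domestic mirror and retag them locally. I chose the latter. First, list the images kubeadm needs: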

[root@m1 ~]# kubeadm config images list
W0830 19:17:13.056761   81487 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
k8s.gcr.io/kube-apiserver:v1.19.0
k8s.gcr.io/kube-controller-manager:v1.19.0
k8s.gcr.io/kube-scheduler:v1.19.0
k8s.gcr.io/kube-proxy:v1.19.0
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.9-1
k8s.gcr.io/coredns:1.7.0
[root@m1 ~]#

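I pull from the Alibaba Cloud container registry. One catch is that the version tags there may not match the ones kubeadm expects, so you have to look them up in the registry yourself:

https://cr.console.aliyun.com/cn-hangzhou/instances/images

For example, kubeadm lists v1.19.0 here, but the Alibaba Cloud registry has v1.19.0-rc.1. Once the matching tags were known, I wrote a shell script that pulls the images and retags them, to avoid repeating the work by hand: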

[root@m1 ~]# vim pullk8s.sh
#!/bin/bash
ALIYUN_KUBE_VERSION=v1.19.0-rc.1
KUBE_VERSION=v1.19.0
KUBE_PAUSE_VERSION=3.2
ETCD_VERSION=3.4.9-1
DNS_VERSION=1.7.0
username=registry.cn-hangzhou.aliyuncs.com/google_containers

images=(
    kube-proxy-amd64:${ALIYUN_KUBE_VERSION}
    kube-scheduler-amd64:${ALIYUN_KUBE_VERSION}
    kube-controller-manager-amd64:${ALIYUN_KUBE_VERSION}
    kube-apiserver-amd64:${ALIYUN_KUBE_VERSION}
    pause:${KUBE_PAUSE_VERSION}
    etcd-amd64:${ETCD_VERSION}
    coredns:${DNS_VERSION}
)

for image in ${images[@]}
do
    docker pull ${username}/${image}
    # Strip "-amd64" here, otherwise kubeadm still will not recognize the local image
    new_image=`echo $image|sed 's/-amd64//g'`
    if [[ $new_image == *$ALIYUN_KUBE_VERSION* ]]
    then
        new_kube_image=`echo $new_image|sed "s/$ALIYUN_KUBE_VERSION//g"`
        docker tag ${username}/${image} k8s.gcr.io/${new_kube_image}$KUBE_VERSION
    else
        docker tag ${username}/${image} k8s.gcr.io/${new_image}
    fi
    docker rmi ${username}/${image}
done
[root@m1 ~]# sh pullk8s.sh
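
Before pulling anything, it can help to dry-run the name mapping the script performs. The sketch below reimplements just that mapping as a pure function; the name target_image is hypothetical and this is an illustration of the renaming rule, not a replacement for the script.

```shell
#!/bin/sh
# target_image NAME:TAG ALIYUN_VERSION KUBE_VERSION
# Print the k8s.gcr.io name that a mirror image should be retagged to:
# drop the "-amd64" suffix and swap the mirror's version tag for kubeadm's.
target_image() {
    image="$1"
    aliyun_ver="$2"
    kube_ver="$3"
    name="${image%%:*}"      # part before the first colon
    tag="${image#*:}"        # part after the first colon
    name="${name%-amd64}"    # drop the architecture suffix, if any
    if [ "$tag" = "$aliyun_ver" ]; then
        tag="$kube_ver"
    fi
    echo "k8s.gcr.io/${name}:${tag}"
}

for image in kube-proxy-amd64:v1.19.0-rc.1 etcd-amd64:3.4.9-1 coredns:1.7.0 pause:3.2; do
    echo "$image -> $(target_image "$image" v1.19.0-rc.1 v1.19.0)"
done
```

The printed targets should match the list from kubeadm config images list exactly; any difference means kubeadm will try to pull that image again.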

Once the script has finished, the Docker image list should look like this:

[root@m1 ~]# docker images
REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/kube-proxy                v1.19.0             b2d80fe68e4f        6 weeks ago         120MB
k8s.gcr.io/kube-controller-manager   v1.19.0             a7cd7b6717e8        6 weeks ago         116MB
k8s.gcr.io/kube-apiserver            v1.19.0             1861e5423d80        6 weeks ago         126MB
k8s.gcr.io/kube-scheduler            v1.19.0             6d4fe43fdd0d        6 weeks ago         48.4MB
k8s.gcr.io/etcd                      3.4.9-1             d4ca8726196c        2 months ago        253MB
k8s.gcr.io/coredns                   1.7.0               bfe3a36ebd25        2 months ago        45.2MB
k8s.gcr.io/pause                     3.2                 80d28bedfe5d        6 months ago        683kB
[root@m1 ~]#

Create the configuration file kubeadm will use to initialize the master node:

[root@m1 ~]# vim kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.0
# Control-plane endpoint; the IP here is the keepalived virtual IP
controlPlaneEndpoint: "192.168.243.100:6443"
networking:
    # Pod CIDR handed to the CNI plugin (Calico here); it must not overlap the node network.
    podSubnet: "172.22.0.0/16"  # Address range used by Pods

Then run the following command to initialize the node:

[root@m1 ~]# kubeadm init --config=kubeadm-config.yaml --upload-certs
W0830 20:05:29.447773   88394 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.19.0
[preflight] Running pre-flight checks
    [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local m1] and IPs [10.96.0.1 192.168.243.138 192.168.243.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost m1] and IPs [192.168.243.138 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost m1] and IPs [192.168.243.138 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 173.517640 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.19" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
a455fb8227dd15882b57b11f3587187181b972d95524bb3ef43e78f76360121e
[mark-control-plane] Marking the node m1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node m1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 5l7pv5.5iiq4atzlazq0b7x
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.243.100:6443 --token 5l7pv5.5iiq4atzlazq0b7x \
    --discovery-token-ca-cert-hash sha256:0fdc9947984a1c655861349dbd251d581bd6ec336c1ab8d9013cf302412b2140 \
    --control-plane --certificate-key a455fb8227dd15882b57b11f3587187181b972d95524bb3ef43e78f76360121e

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.243.100:6443 --token 5l7pv5.5iiq4atzlazq0b7x \
    --discovery-token-ca-cert-hash sha256:0fdc9947984a1c655861349dbd251d581bd6ec336c1ab8d9013cf302412b2140
[root@m1 ~]#

Copy the two kubeadm join commands printed here; they are needed later when adding the other master nodes and the worker nodes.
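
The two join commands differ only in their trailing flags. The sketch below assembles either variant from its parts, which makes the difference explicit; the function name join_command and the placeholder values are hypothetical. As far as I know the bootstrap token expires after 24 hours by default, and the init output above states that the uploaded certificates are deleted after two hours, so on a live cluster the pieces may need to be regenerated with the commands shown in the comments.

```shell
#!/bin/sh
# join_command ENDPOINT TOKEN CA_HASH [CERT_KEY]
# Print a kubeadm join command; passing CERT_KEY produces the control-plane variant.
join_command() {
    cmd="kubeadm join $1 --token $2 --discovery-token-ca-cert-hash $3"
    if [ -n "${4:-}" ]; then
        cmd="$cmd --control-plane --certificate-key $4"
    fi
    echo "$cmd"
}

# Placeholder values; substitute the ones printed by kubeadm init.
echo "worker: $(join_command 192.168.243.100:6443 '<token>' 'sha256:<hash>')"
echo "master: $(join_command 192.168.243.100:6443 '<token>' 'sha256:<hash>' '<certificate-key>')"

# To regenerate the pieces on an existing control-plane node (not run here):
#   kubeadm token create --print-join-command
#   kubeadm init phase upload-certs --upload-certs
```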

Then copy the kubeconfig file on the master node with the following commands:

[root@m1 ~]# mkdir -p $HOME/.kube
[root@m1 ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@m1 ~]# chown $(id -u):$(id -g) $HOME/.kube/config

Check the current Pods:

[root@m1 ~]# kubectl get pod --all-namespaces
NAMESPACE     NAME                         READY   STATUS    RESTARTS   AGE
kube-system   coredns-f9fd979d6-kg4lf      0/1     Pending   0          9m9s
kube-system   coredns-f9fd979d6-t8xzj      0/1     Pending   0          9m9s
kube-system   etcd-m1                      1/1     Running   0          9m22s
kube-system   kube-apiserver-m1            1/1     Running   1          9m22s
kube-system   kube-controller-manager-m1   1/1     Running   1          9m22s
kube-system   kube-proxy-rjgnw             1/1     Running   0          9m9s
kube-system   kube-scheduler-m1            1/1     Running   1          9m22s
[root@m1 ~]#

Query the health-check endpoint with curl; a response of ok means everything is fine:

[root@m1 ~]# curl -k https://192.168.243.100:6443/healthz
ok
[root@m1 ~]#

Deploying the Network Plugin: calico

Create a directory for the configuration files:

[root@m1 ~]# mkdir -p /etc/kubernetes/addons

Create the calico-rbac-kdd.yaml configuration file in that directory:

[root@m1 ~]# vi /etc/kubernetes/addons/calico-rbac-kdd.yaml
# Calico Version v3.1.3
# https://docs.projectcalico.org/v3.1/releases#v3.1.3
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-node
rules:
  - apiGroups: [""]
    resources:
      - namespaces
    verbs:
      - get
      - list
      - watch
  - apiGroups: [""]
    resources:
      - pods/status
    verbs:
      - update
  - apiGroups: [""]
    resources:
      - pods
    verbs:
      - get
      - list
      - watch
      - patch
  - apiGroups: [""]
    resources:
      - services
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - endpoints
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - nodes
    verbs:
      - get
      - list
      - update
      - watch
  - apiGroups: ["extensions"]
    resources:
      - networkpolicies
    verbs:
      - get
      - list
      - watch
  - apiGroups: ["networking.k8s.io"]
    resources:
      - networkpolicies
    verbs:
      - watch
      - list
  - apiGroups: ["crd.projectcalico.org"]
    resources:
      - globalfelixconfigs
      - felixconfigurations
      - bgppeers
      - globalbgpconfigs
      - bgpconfigurations
      - ippools
      - globalnetworkpolicies
      - globalnetworksets
      - networkpolicies
      - clusterinformations
      - hostendpoints
    verbs:
      - create
      - get
      - list
      - update
      - watch

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: calico-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-node
subjects:
- kind: ServiceAccount
  name: calico-node
  namespace: kube-system

Then run the following commands in turn to install calico:

[root@m1 ~]# kubectl apply -f /etc/kubernetes/addons/calico-rbac-kdd.yaml
[root@m1 ~]# kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

Check the status:

[root@m1 ~]# kubectl get pod --all-namespaces 
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5bc4fc6f5f-pdjls   1/1     Running   0          2m47s
kube-system   calico-node-tkdmv                          1/1     Running   0          2m47s
kube-system   coredns-f9fd979d6-kg4lf                    1/1     Running   0          23h
kube-system   coredns-f9fd979d6-t8xzj                    1/1     Running   0          23h
kube-system   etcd-m1                                    1/1     Running   1          23h
kube-system   kube-apiserver-m1                          1/1     Running   2          23h
kube-system   kube-controller-manager-m1                 1/1     Running   2          23h
kube-system   kube-proxy-rjgnw                           1/1     Running   1          23h
kube-system   kube-scheduler-m1                          1/1     Running   2          23h
[root@m1 ~]#

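Joining the Other Master Nodes to the Cluster

Use the kubeadm join command saved earlier to join the cluster. Note that the join commands for masters and workers are different, so do not mix them up. Run the following on m2 and m3:

$ kubeadm join 192.168.243.100:6443 --token 5l7pv5.5iiq4atzlazq0b7x \
    --discovery-token-ca-cert-hash sha256:0fdc9947984a1c655861349dbd251d581bd6ec336c1ab8d9013cf302412b2140 \
    --control-plane --certificate-key a455fb8227dd15882b57b11f3587187181b972d95524bb3ef43e78f76360121e

Tip: the join command for master nodes carries the --control-plane --certificate-key parameters.

Wait a moment; when the command succeeds it prints the following: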

[preflight] Running pre-flight checks
    [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local m3] and IPs [10.96.0.1 192.168.243.141 192.168.243.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost m3] and IPs [192.168.243.141 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost m3] and IPs [192.168.243.141 127.0.0.1 ::1]
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node m3 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node m3 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

Then copy the kubectl configuration file as the output suggests:

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

Port 6443 should also be listening at this point:

[root@m2 ~]# netstat -lntp |grep 6443
tcp6       0      0 :::6443                 :::*                    LISTEN      31910/kube-apiserve 
[root@m2 ~]#

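A successful join command does not necessarily mean the node has joined the cluster successfully, though. Go back to m1 and check whether the nodes are in the Ready state: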

[root@m1 ~]# kubectl get nodes
NAME   STATUS     ROLES    AGE     VERSION
m1     Ready      master   24h     v1.19.0
m2     NotReady   master   3m47s   v1.19.0
m3     NotReady   master   3m31s   v1.19.0
[root@m1 ~]#

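As you can see, m2 and m3 are both NotReady, which means they have not joined the cluster successfully. I used the following command to inspect the logs:

$ journalctl -f

It turned out to be the dreaded network problem (the firewall) again: the pause image could not be pulled: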

8月 31 20:09:11 m2 kubelet[10122]: W0831 20:09:11.713935   10122 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
8月 31 20:09:12 m2 kubelet[10122]: E0831 20:09:12.442430   10122 kubelet.go:2103] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
8月 31 20:09:17 m2 kubelet[10122]: E0831 20:09:17.657880   10122 kuberuntime_manager.go:730] createPodSandbox for pod "calico-node-jksvg_kube-system(5b76b6d7-0bd9-4454-a674-2d2fa4f6f35e)" failed: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.2": Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

So on m2 and m3, copy over the script used earlier on m1 to pull the images from the domestic mirror, and run it:

$ scp -r m1:/root/pullk8s.sh /root/pullk8s.sh
$ sh /root/pullk8s.sh

After it finishes, wait a few minutes and check the nodes again from m1. This time they are all Ready:

[root@m1 ~]# kubectl get nodes
NAME   STATUS   ROLES    AGE   VERSION
m1     Ready    master   24h   v1.19.0
m2     Ready    master   14m   v1.19.0
m3     Ready    master   13m   v1.19.0
[root@m1 ~]#
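
When waiting for nodes to become Ready, it is handy to list only the ones that are not there yet. The sketch below filters the tabular output of kubectl get nodes; the helper name not_ready_nodes is hypothetical, and the sample rows are taken from the listing earlier in this section so the snippet can be tried without a cluster.

```shell
#!/bin/sh
# not_ready_nodes: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    # NR > 1 skips the header row.
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Canned input from the earlier listing; prints m2 and m3.
printf '%s\n' \
    'NAME   STATUS     ROLES    AGE     VERSION' \
    'm1     Ready      master   24h     v1.19.0' \
    'm2     NotReady   master   3m47s   v1.19.0' \
    'm3     NotReady   master   3m31s   v1.19.0' | not_ready_nodes

# On a live cluster (not run here):
#   kubectl get nodes | not_ready_nodes
```

A cordoned node shows a status such as Ready,SchedulingDisabled and would also be listed by this filter.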

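Joining the Worker Nodes to the Cluster

The steps are essentially the same as in the previous section, except that they are run on s1 and s2. Just be sure to use the worker variant of the kubeadm join command. I will only sketch it here:

# Join the cluster with the join command saved earlier
$ kubeadm join 192.168.243.100:6443 --token 5l7pv5.5iiq4atzlazq0b7x \
    --discovery-token-ca-cert-hash sha256:0fdc9947984a1c655861349dbd251d581bd6ec336c1ab8d9013cf302412b2140

# Be patient; you can watch the logs in the meantime
$ journalctl -f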

Once all worker nodes have joined, the highly available k8s cluster is complete. The node list now looks like this:

[root@m1 ~]# kubectl get nodes 
NAME   STATUS   ROLES    AGE     VERSION
m1     Ready    master   24h     v1.19.0
m2     Ready    master   60m     v1.19.0
m3     Ready    master   60m     v1.19.0
s1     Ready    <none>   9m45s   v1.19.0
s2     Ready    <none>   119s    v1.19.0
[root@m1 ~]#

And the Pod list:

[root@m1 ~]# kubectl get pod --all-namespaces 
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5bc4fc6f5f-pdjls   1/1     Running   0          73m
kube-system   calico-node-8m8lz                          1/1     Running   0          9m43s
kube-system   calico-node-99xps                          1/1     Running   0          60m
kube-system   calico-node-f48zw                          1/1     Running   0          117s
kube-system   calico-node-jksvg                          1/1     Running   0          60m
kube-system   calico-node-tkdmv                          1/1     Running   0          73m
kube-system   coredns-f9fd979d6-kg4lf                    1/1     Running   0          24h
kube-system   coredns-f9fd979d6-t8xzj                    1/1     Running   0          24h
kube-system   etcd-m1                                    1/1     Running   1          24h
kube-system   kube-apiserver-m1                          1/1     Running   2          24h
kube-system   kube-controller-manager-m1                 1/1     Running   2          24h
kube-system   kube-proxy-22h6p                           1/1     Running   0          9m43s
kube-system   kube-proxy-khskm                           1/1     Running   0          60m
kube-system   kube-proxy-pkrgm                           1/1     Running   0          60m
kube-system   kube-proxy-rjgnw                           1/1     Running   1          24h
kube-system   kube-proxy-t4pxl                           1/1     Running   0          117s
kube-system   kube-scheduler-m1                          1/1     Running   2          24h
[root@m1 ~]#

Testing Cluster Availability

Creating an nginx DaemonSet

On m1, create a configuration file named nginx-ds.yml with the following content:

apiVersion: v1
kind: Service
metadata:
  name: nginx-ds
  labels:
    app: nginx-ds
spec:
  type: NodePort
  selector:
    app: nginx-ds
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx-ds
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      app: nginx-ds
  template:
    metadata:
      labels:
        app: nginx-ds
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

Then create the nginx DaemonSet with the following command:

[root@m1 ~]# kubectl create -f nginx-ds.yml
service/nginx-ds created
daemonset.apps/nginx-ds created
[root@m1 ~]#

Checking IP Connectivity

After a short wait, check that the Pods are running normally:

[root@m1 ~]# kubectl get pods -o wide
NAME             READY   STATUS    RESTARTS   AGE     IP               NODE   NOMINATED NODE   READINESS GATES
nginx-ds-6nnpm   1/1     Running   0          2m32s   172.22.152.193   s1     <none>           <none>
nginx-ds-bvpqj   1/1     Running   0          2m32s   172.22.78.129    s2     <none>           <none>
[root@m1 ~]#

Try pinging the Pod IPs from every node:

[root@s1 ~]# ping 172.22.152.193
PING 172.22.152.193 (172.22.152.193) 56(84) bytes of data.
64 bytes from 172.22.152.193: icmp_seq=1 ttl=63 time=0.269 ms
64 bytes from 172.22.152.193: icmp_seq=2 ttl=63 time=0.240 ms
64 bytes from 172.22.152.193: icmp_seq=3 ttl=63 time=0.228 ms
64 bytes from 172.22.152.193: icmp_seq=4 ttl=63 time=0.229 ms
^C
--- 172.22.152.193 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.228/0.241/0.269/0.022 ms
[root@s1 ~]#

Then check the status of the Service:

[root@m1 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP        2d1h
nginx-ds     NodePort    10.105.139.228   <none>        80:31145/TCP   3m21s
[root@m1 ~]#

Try accessing the service from every node; a normal response means the Service IP is reachable as well:

[root@m1 ~]# curl 10.105.139.228:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@m1 ~]#

Then check the NodePort from every node. The NodePort of nginx-ds is 31145; a normal response like the one below means the NodePort works too:

[root@m3 ~]# curl 192.168.243.140:31145
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@m3 ~]#
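
Checking the NodePort on every node by hand gets tedious. The sketch below extracts the node port from the PORT(S) value shown by kubectl get svc and prints one curl command per node as a dry run. The helper name node_port_of is hypothetical, and the IP list is copied from the server table.

```shell
#!/bin/sh
# node_port_of PORTS: extract the node port from a PORT(S) value such as "80:31145/TCP".
node_port_of() {
    spec="${1#*:}"        # strip the service port and colon: "31145/TCP"
    echo "${spec%%/*}"    # strip the protocol: "31145"
}

port="$(node_port_of '80:31145/TCP')"

# Dry run: print the command for each node instead of executing it.
for ip in 192.168.243.138 192.168.243.136 192.168.243.141 192.168.243.139 192.168.243.140; do
    printf '%s\n' "curl -s -o /dev/null -w '%{http_code}' http://$ip:$port"
done

# On a live cluster the port can also be read directly (not run here):
#   kubectl get svc nginx-ds -o jsonpath='{.spec.ports[0].nodePort}'
```

Each printed command reports only the HTTP status code, so a row of 200s across all five nodes confirms the NodePort is reachable everywhere.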

Checking DNS

This requires an Nginx Pod. First define a pod-nginx.yaml configuration file with the following content:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80

Then create the Pod from that configuration:

[root@m1 ~]# kubectl create -f pod-nginx.yaml
pod/nginx created
[root@m1 ~]#

Enter the Pod with the following command:

[root@m1 ~]# kubectl exec nginx -i -t -- /bin/bash

Look at the DNS configuration:

root@nginx:/# cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local localdomain
options ndots:5
root@nginx:/#

Next, test whether Service names resolve correctly. Below, the name nginx-ds resolves to its IP, 10.105.139.228, which shows that DNS is working as well:

root@nginx:/# ping nginx-ds
PING nginx-ds.default.svc.cluster.local (10.105.139.228): 48 data bytes

Testing High Availability

Shut down m1 by running the following command on it:

[root@m1 ~]# init 0

Then check whether the virtual IP has moved over to m2:

[root@m2 ~]# ip a |grep 192.168.243.100
    inet 192.168.243.100/32 scope global ens32
[root@m2 ~]#
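
For scripted checks, note that a plain grep on the address treats the dots as wildcards and also matches longer addresses that merely start with the same digits. The sketch below matches the address literally, up to the prefix-length slash. The helper name holds_vip is hypothetical; it reads ip a output from standard input, so it can be tried with the line shown above.

```shell
#!/bin/sh
# holds_vip ADDRESS: succeed if stdin (output of `ip a`) lists ADDRESS
# as an IPv4 address on some interface.
holds_vip() {
    # -F matches the string literally; the trailing "/" anchors the end of the address.
    grep -qF "inet $1/"
}

# Canned line from the output above.
if printf '%s\n' '    inet 192.168.243.100/32 scope global ens32' | holds_vip 192.168.243.100; then
    echo "this node holds the virtual IP"
fi

# On a node (not run here):
#   ip a | holds_vip 192.168.243.100 && echo "VIP is here"
```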

Next, test whether kubectl on m2 or m3 can still talk to the cluster. If it can, the cluster has a reasonable degree of high availability:

[root@m2 ~]# kubectl get nodes
NAME   STATUS     ROLES    AGE   VERSION
m1     NotReady   master   3d    v1.19.0
m2     Ready      master   16m   v1.19.0
m3     Ready      master   13m   v1.19.0
s1     Ready      <none>   2d    v1.19.0
s2     Ready      <none>   47h   v1.19.0
[root@m2 ~]#

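Deploying the Dashboard

The dashboard is a web UI provided by k8s that simplifies operating and managing the cluster. From it you can conveniently view all kinds of information, operate on resources such as Pods and Services, and create new resources. The dashboard repository is here:

https://github.com/kubernetes/dashboard

Deploying the dashboard is also fairly simple. First define a dashboard-all.yaml configuration file with the following content: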

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30005
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
    # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
    # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.0.3
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --auto-generate-certificates
            - --namespace=kubernetes-dashboard
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
              # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              scheme: HTTPS
              path: /
              port: 8443
            initialDelaySeconds: 30
            timeoutSeconds: 30
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: dashboard-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: dashboard-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: dashboard-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: dashboard-metrics-scraper
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'runtime/default'
    spec:
      containers:
        - name: dashboard-metrics-scraper
          image: kubernetesui/metrics-scraper:v1.0.4
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
          - mountPath: /tmp
            name: tmp-volume
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsUser: 1001
            runAsGroup: 2001
      serviceAccountName: kubernetes-dashboard
      nodeSelector:
        "kubernetes.io/os": linux
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      volumes:
        - name: tmp-volume
          emptyDir: {}

Create the dashboard resources:

[root@m1 ~]# kubectl create -f dashboard-all.yaml 
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
[root@m1 ~]#

Check the status of the deployment:

[root@m1 ~]# kubectl get deployment kubernetes-dashboard -n kubernetes-dashboard
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
kubernetes-dashboard   1/1     1            1           29s
[root@m1 ~]#
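
When this check is part of a script, the READY column can be evaluated automatically. The helper below is a minimal sketch (deployment_ready is a hypothetical name) that assumes the default output format for a single deployment, as shown above.

```shell
# deployment_ready
# Reads `kubectl get deployment NAME` output on stdin and succeeds when
# the READY column (e.g. "1/1") shows every desired replica as ready.
deployment_ready() {
  awk 'NR == 2 { split($2, r, "/"); ok = (r[1] == r[2] && r[2] > 0) } END { exit !ok }'
}

# Example on a real master (not executed here):
#   kubectl get deployment kubernetes-dashboard -n kubernetes-dashboard | deployment_ready \
#     && echo "dashboard is ready"
```

kubectl also ships a built-in way to wait for a rollout, `kubectl rollout status deployment/kubernetes-dashboard -n kubernetes-dashboard`, which is usually the simpler choice when no parsing is required.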

Check the status of the dashboard pods:

[root@m1 ~]# kubectl --namespace kubernetes-dashboard get pods -o wide |grep dashboard
dashboard-metrics-scraper-7b59f7d4df-q4jqj   1/1     Running   0          5m27s   172.22.152.198   s1     <none>           <none>
kubernetes-dashboard-5dbf55bd9d-nqvjz        1/1     Running   0          5m27s   172.22.202.17    m1     <none>           <none>
[root@m1 ~]#

Check the status of the dashboard service:

[root@m1 ~]# kubectl get services kubernetes-dashboard -n kubernetes-dashboard
NAME                   TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
kubernetes-dashboard   NodePort   10.104.217.178   <none>        443:30005/TCP   5m57s
[root@m1 ~]#

Check that port 30005 is being listened on:

[root@m1 ~]# netstat -ntlp |grep 30005
tcp        0      0 0.0.0.0:30005      0.0.0.0:*     LISTEN      4085/kube-proxy     
[root@m1 ~]#

Accessing the Dashboard

For cluster security, since version 1.7 the dashboard only allows access over HTTPS. Because we expose the service via NodePort, it can be reached at https://NodeIP:NodePort. For example, using curl:

[root@m1 ~]# curl https://192.168.243.138:30005 -k
<!--
Copyright 2017 The Kubernetes Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<!doctype html>
<html lang="en">

<head>
  <meta charset="utf-8">
  <title>Kubernetes Dashboard</title>
  <link rel="icon"
        type="image/png"
        href="assets/images/kubernetes-logo.png" />
  <meta name="viewport"
        content="width=device-width">
<link rel="stylesheet" href="styles.988f26601cdcb14da469.css"></head>

<body>
  <kd-root></kd-root>
<script src="runtime.ddfec48137b0abfd678a.js" defer></script><script src="polyfills-es5.d57fe778f4588e63cc5c.js" nomodule defer></script><script src="polyfills.49104fe38e0ae7955ebb.js" defer></script><script src="scripts.391d299173602e261418.js" defer></script><script src="main.b94e335c0d02b12e3a7b.js" defer></script></body>

</html>
[root@m1 ~]#

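Because the dashboard's certificate is self-signed, the -k flag is needed here so that curl makes the HTTPS request without verifying the certificate.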

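About Custom Certificates

By default the dashboard's certificate is auto-generated and self-signed, so clients will not trust it. If you have a domain name and a matching trusted certificate, you can swap it in and access the dashboard securely through that domain.

To do so, add startup arguments for the dashboard in dashboard-all.yaml that specify the certificate files; the files themselves are injected through a secret.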

- --tls-cert-file
- dashboard.cer
- --tls-key-file
- dashboard.key
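
The manifest above creates kubernetes-dashboard-certs as an empty secret, so the certificate files still have to be loaded into it. One possible way is sketched below; it assumes the files are named dashboard.cer and dashboard.key as in the arguments above, and replace_dashboard_certs and the KUBECTL variable are hypothetical names introduced for illustration.

```shell
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the
# commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# replace_dashboard_certs CERT_FILE KEY_FILE
# Recreates the kubernetes-dashboard-certs secret from the given files.
# Each file is stored under its own file name as the key in the secret.
replace_dashboard_certs() {
  "$KUBECTL" delete secret kubernetes-dashboard-certs -n kubernetes-dashboard --ignore-not-found
  "$KUBECTL" create secret generic kubernetes-dashboard-certs -n kubernetes-dashboard \
    --from-file="$1" --from-file="$2"
}

# Example on a real master (not executed here):
#   replace_dashboard_certs dashboard.cer dashboard.key
```

After replacing the secret, the dashboard pod most likely needs to be restarted to pick up the new files, for instance with `kubectl rollout restart deployment kubernetes-dashboard -n kubernetes-dashboard`.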

Logging In to the Dashboard

By default the Dashboard only supports token authentication, so if you use a KubeConfig file, the token must be specified in that file. Here we log in with a token.

First create a service account:

[root@m1 ~]# kubectl create sa dashboard-admin -n kube-system
serviceaccount/dashboard-admin created
[root@m1 ~]#

Create the cluster role binding:

[root@m1 ~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
[root@m1 ~]#

Look up the name of the dashboard-admin secret:

[root@m1 ~]# kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}'
dashboard-admin-token-ph7h2
[root@m1 ~]#

Print the token from the secret:

[root@m1 ~]# ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}')
[root@m1 ~]# kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}'
eyJhbGciOiJSUzI1NiIsImtpZCI6IkVnaDRYQXgySkFDOGdDMnhXYXJWbkY2WVczSDVKeVJRaE5vQ0ozOG5PanMifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tcGg3aDIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiNjA1ZWY3OTAtOWY3OC00NDQzLTgwMDgtOWRiMjU1MjU0MThkIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.xAO3njShhTRkgNdq45nO7XNy242f8XVs-W4WBMui-Ts6ahdZECoNegvWjLDCEamB0UW72JeG67f2yjcWohANwfDCHobRYPkOhzrVghkdULbrCCGai_fe60Svwf_apSmlKP3UUdu16M4GxopaTlINZpJY_z5KJ4kLq66Y1rjAA6j9TI4Ue4EazJKKv0dciv6NsP28l7-nvUmhj93QZpKqY3PQ7vvcPXk_sB-jjSSNJ5ObWuGeDBGHgQMRI4F1XTWXJBYClIucsbu6MzDA8yop9S7Ci8D00QSa0u3M_rqw-3UHtSxQee41uVVjIASfnCEVayKDIbJzG3gc2AjqGqJhkQ
[root@m1 ~]#
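
The token is a JWT, so its middle segment can be decoded to confirm which service account it belongs to before pasting it into the browser. The helper below is a minimal sketch; jwt_payload is a hypothetical name, it relies on the GNU coreutils `base64 -d`, and it does not verify the signature.

```shell
# jwt_payload TOKEN
# Prints the decoded JSON payload (second dot-separated segment) of a JWT.
# JWTs use unpadded base64url, so the alphabet is mapped back to standard
# base64 and "=" padding is restored before decoding.
jwt_payload() {
  jp_seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  while [ $(( ${#jp_seg} % 4 )) -ne 0 ]; do
    jp_seg="${jp_seg}="
  done
  printf '%s' "$jp_seg" | base64 -d
}

# Example on a real master (not executed here), reusing ADMIN_SECRET from above:
#   TOKEN=$(kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}')
#   jwt_payload "$TOKEN"
```

For the token above, the payload should contain a "sub" field naming system:serviceaccount:kube-system:dashboard-admin.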

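Once you have the token, open https://192.168.243.138:30005 in a browser. Because the dashboard uses a self-signed certificate, the browser will show a warning. Ignore it and click "Advanced" -> "Proceed":

Then enter the token:

After a successful login, the home page looks like this:

There is not much more to say about the UI itself, so it is not covered further here; feel free to explore it on your own.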