Kubernetes (15): Deploying a Multi-Master K8s Cluster with Nginx Layer-4 Load Balancing + Keepalived High Availability + Supervisor Process Management

 

Deploy a complete, enterprise-grade, highly available K8s cluster.

A Master node runs three main services: kube-apiserver, kube-controller-manager, and kube-scheduler. The kube-controller-manager and kube-scheduler components already achieve high availability through their own leader-election mechanism, so Master HA mainly concerns kube-apiserver. Since that component serves an HTTP API, making it highly available works much like a web server: put a load balancer in front of it, and it scales horizontally. Two problems remain, though: when Master nodes are added or removed, the load balancer that each Node points at must be reconfigured dynamically, and the load balancer itself is a single point of failure. That is why keepalived is used here.


Environment

3 Masters (192.168.2.58, 192.168.2.59, 192.168.2.60) and 3 Nodes (192.168.2.158, 192.168.2.159, 192.168.2.160). Master minimum hardware: 2 CPU cores, 2 GB RAM, 20 GB disk. Node minimum: 1 CPU core, 1 GB RAM, 20 GB disk.

Deployment Options

  • In a production environment you can deploy everything in one shot with an Ansible Playbook, but this is not recommended for personal study: it is too hands-off to teach you the internal mechanics. Official project address / Installation tutorial
  • Deploy manually with Kubeadm, which provides kubeadm init and kubeadm join for quickly standing up a Kubernetes cluster. This article uses this method.
  • Binary packages: download the release binaries from the official site and deploy every component by hand to form a Kubernetes cluster. The process is more tedious, but you gain a fuller understanding of Kubernetes.

Manual Deployment with Kubeadm

Initialize all nodes

# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

Set hostnames

On master1:
hostnamectl set-hostname master1 && bash

On master2:
hostnamectl set-hostname master2 && bash

On master3:
hostnamectl set-hostname master3 && bash

On node1:
hostnamectl set-hostname node1 && bash

On node2:
hostnamectl set-hostname node2 && bash

On node3:
hostnamectl set-hostname node3 && bash

All of the following steps must be performed on every host, so it is convenient to connect to all 6 hosts in Xshell and enable the "send input to all sessions" tool. You can of course use Ansible instead.


On all hosts, add the following entries to /etc/hosts so that hostnames resolve, and add resolution entries for GitHub to speed up access from inside China; this will come in handy later.

cat <<EOF >>/etc/hosts
192.168.2.58 master1
192.168.2.59 master2
192.168.2.60 master3
192.168.2.158 node1
192.168.2.159 node2
192.168.2.160 node3
52.69.186.44 github.com
185.199.110.133 raw.githubusercontent.com
EOF

Set up passwordless SSH between all hosts

ssh-keygen
ssh-copy-id -p22 -i /root/.ssh/id_rsa.pub [email protected]
ssh-copy-id -p22 -i /root/.ssh/id_rsa.pub [email protected]
ssh-copy-id -p22 -i /root/.ssh/id_rsa.pub [email protected]
ssh-copy-id -p22 -i /root/.ssh/id_rsa.pub [email protected]
ssh-copy-id -p22 -i /root/.ssh/id_rsa.pub [email protected]
ssh-copy-id -p22 -i /root/.ssh/id_rsa.pub [email protected]

Verify connectivity

ping -c 1 master1
ping -c 1 master2
ping -c 1 master3
ping -c 1 node1
ping -c 1 node2
ping -c 1 node3
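The six checks above can also run as one loop that flags any unreachable host, so a bad /etc/hosts entry stands out immediately; a small sketch:

```shell
# Ping each host once with a 1-second timeout; print OK or FAIL per host.
result=$(for h in master1 master2 master3 node1 node2 node3; do
  if ping -c 1 -W 1 "$h" >/dev/null 2>&1; then
    echo "$h OK"
  else
    echo "$h FAIL"
  fi
done)
echo "$result"
```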

Configure the Aliyun yum repositories

rm -f /etc/yum.repos.d/*.repo
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
sed -i "/mirrors.aliyuncs.com/d" /etc/yum.repos.d/CentOS-Base.repo
sed -i "/mirrors.cloud.aliyuncs.com/d" /etc/yum.repos.d/CentOS-Base.repo
yum clean all

Install Docker

yum install -y yum-utils

yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo 

yum install docker-ce docker-ce-cli containerd.io -y

systemctl start docker

systemctl enable docker

docker version

Install Kubernetes components

# Disable swap
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab   

# Switch the cgroup driver to systemd
cat <<EOF> /etc/docker/daemon.json
{"exec-opts":["native.cgroupdriver=systemd"]}
EOF

systemctl restart docker
# Verify the driver is now systemd
docker info | grep Cgroup

If docker info prints the following warnings, run the commands below:

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

cat <<EOF> /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
EOF

sysctl -p
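If sysctl -p complains that the net.bridge.bridge-nf-call-* keys do not exist, the br_netfilter kernel module is not loaded yet. A sketch of loading it and making that survive reboots:

```shell
# The bridge-nf-call sysctls only appear once br_netfilter is loaded.
modprobe br_netfilter
# Load the module automatically on every boot, then re-apply the sysctls.
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
sysctl -p
```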

Configure the Kubernetes yum repository

cat <<EOF >/etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF

Note: be sure to install a version below 1.24.0, since dockershim was removed in 1.24.

yum install -y kubelet-1.23.6 kubeadm-1.23.6 kubectl-1.23.6
systemctl enable kubelet

Configure the master nodes

The following steps apply only to the masters, so disconnect the sessions to the 3 nodes.


Install Nginx

Use either the script or the yum commands:

bash <(curl -s -L https://cdn.jsdelivr.net/gh/yutao517/code@main/bash/one-key-nginx-install.sh)
#The script installs into /usr/local/nginx; type nginx to start it
yum install epel-release -y
yum install nginx nginx-mod-stream -y
systemctl enable nginx 
#The yum route needs nginx enabled at boot (above); the script install already configures this.

Configure Nginx Layer-4 load balancing with health checks

worker_processes  2;
events {
    worker_connections  2048;
}
stream {
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log /usr/local/nginx/logs/k8s-access.log main;
    upstream k8s-apiserver {
        server 192.168.2.58:6443 max_fails=1 fail_timeout=10s; # Master1 APISERVER IP:PORT
        server 192.168.2.59:6443 max_fails=1 fail_timeout=10s; # Master2 APISERVER IP:PORT
        server 192.168.2.60:6443 max_fails=1 fail_timeout=10s; # Master3 APISERVER IP:PORT
    }
    server {
        listen 16443; 
        proxy_pass k8s-apiserver;
    } 
}
http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile        on;
    keepalive_timeout  65;
    server {
        listen  80;
        server_name www.yutao.co;
        location / {
            root   html;
            index  index.html index.htm;
        }
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}

Without health checks, the load balancer will still forward a request to an unhealthy backend first and only retry on another node after the failure, wasting one forward. Worse, when a backend restart takes a long time to complete, requests can get held up and the whole load balancer can slip into a half-dead state in which none of its nodes responds normally. max_fails=1 with fail_timeout=10s means: if 1 connection failure occurs within a 10-second window, the node is marked unavailable and is not retried until the next fail_timeout window, when a new request probes whether the connection succeeds again.

nginx -s reload

Install keepalived

yum install keepalived -y

Configure keepalived for high availability. Turn off "send to all sessions" here, since the keepalived configuration differs on each master. Non-preemptive (nopreempt) mode is recommended.

master1:

cat <<EOF >/etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id NGINX_MASTER
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    #non-preemptive mode
    interface ens33
    virtual_router_id 51	 
    priority 100
    advert_int 1 
    mcast_src_ip 192.168.2.58
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    #virtual IP
    virtual_ipaddress {
        192.168.2.68
    }
    track_script {
        check_nginx
    }
}
EOF

master2:

cat <<EOF >/etc/keepalived/keepalived.conf 
! Configuration File for keepalived
global_defs {
    router_id NGINX_BACKUP
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface ens33     
    virtual_router_id 51         
    priority 90       
    advert_int 1
    mcast_src_ip 192.168.2.59   
    authentication {    
        auth_type PASS
        auth_pass 1111
    }   
    virtual_ipaddress {
        192.168.2.68
    }
     track_script {
        check_nginx
    }    
}
EOF

master3:

cat <<EOF >/etc/keepalived/keepalived.conf 
! Configuration File for keepalived
global_defs {
    router_id NGINX_BACKUP
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface ens33  
    virtual_router_id 51     
    priority 80     
    advert_int 1       
    mcast_src_ip 192.168.2.60
    authentication {
        auth_type PASS
        auth_pass 1111
    }   
    virtual_ipaddress {
        192.168.2.68
    } 
     track_script {
        check_nginx
    }   
} 
EOF
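The three files above differ only in router_id, priority, and mcast_src_ip, so you can also generate them from a single template. A sketch; the helper name gen_keepalived is made up for illustration, and on a real master you would redirect the output to /etc/keepalived/keepalived.conf:

```shell
# Emit a keepalived.conf for a given router_id, priority, and source IP.
gen_keepalived() {
  local rid=$1 prio=$2 src=$3
  cat <<EOF
! Configuration File for keepalived
global_defs {
    router_id $rid
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface ens33
    virtual_router_id 51
    priority $prio
    advert_int 1
    mcast_src_ip $src
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.2.68
    }
    track_script {
        check_nginx
    }
}
EOF
}

# master1 (use NGINX_BACKUP 90 192.168.2.59 on master2, NGINX_BACKUP 80 192.168.2.60 on master3):
gen_keepalived NGINX_MASTER 100 192.168.2.58 > /tmp/keepalived.conf
```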

vrrp_script: specifies the script that checks nginx's state (keepalived uses its result to decide whether to fail over).

cat <<'EOF' > /etc/keepalived/check_nginx.sh
#!/bin/bash
# Stop keepalived when nginx is down, so the VIP fails over to another master.
count=$(pidof nginx | wc -l)
if (( count == 0 )); then
    systemctl stop keepalived
else
    systemctl start keepalived
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh
systemctl start keepalived
systemctl enable keepalived
systemctl status keepalived
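One pitfall worth knowing when writing such scripts through a heredoc: unless the delimiter is quoted (<<'EOF'), the shell expands $(pidof nginx|wc -l) and $count while writing the file, so the file would contain the expanded results instead of the script text. A runnable demonstration of the difference:

```shell
x=outer
# Unquoted delimiter: $x is expanded while the file is written.
cat <<EOF > /tmp/unquoted.txt
value: $x
EOF
# Quoted delimiter: the text is written literally; expansion happens at run time.
cat <<'EOF' > /tmp/quoted.txt
value: $x
EOF
cat /tmp/unquoted.txt   # value: outer
cat /tmp/quoted.txt     # value: $x
```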

Configure Supervisor to manage the processes. With the script above, keepalived is never restarted when nginx recovers: the script only runs while keepalived itself is running, so it can stop keepalived but not bring it back. Instead, configure Supervisor on the three masters to manage the nginx and check_nginx.sh processes. When nginx is killed unexpectedly, Supervisor restarts it automatically, and when nginx recovers, Supervisor restarts keepalived through the script. This gives automatic process recovery without hand-rolling shell scripts to manage processes. Detailed steps
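A minimal sketch of what the Supervisor side could look like; the program names, paths, and polling interval here are illustrative assumptions, not taken from the linked steps:

```shell
# Hypothetical supervisord entries: run nginx in the foreground under
# Supervisor, and run the keepalived check script in a loop.
cat <<'EOF' > /etc/supervisord.d/nginx.ini
[program:nginx]
command=/usr/local/nginx/sbin/nginx -g 'daemon off;'
autostart=true
autorestart=true

[program:check_nginx]
command=/bin/bash -c 'while true; do /etc/keepalived/check_nginx.sh; sleep 5; done'
autostart=true
autorestart=true
EOF
supervisorctl update
```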

Test VIP failover

Stop nginx on master1:

nginx -s stop

Check whether the virtual IP has floated to master2.
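You can see which master currently holds the VIP by checking the interface (ens33, as configured above):

```shell
# The VIP appears as a secondary address on ens33 of exactly one master.
ip addr show ens33 | grep 192.168.2.68
```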


Stop nginx on master2 and check whether the virtual IP floats to master3.


Initialize K8s on master1. controlPlaneEndpoint: fill in your VIP and its listening port. apiServer certSANs: fill in the addresses of all 6 hosts plus the VIP.

cat <<EOF >kubeadm-config.yaml 
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.23.6
controlPlaneEndpoint: 192.168.2.68:16443
imageRepository: registry.aliyuncs.com/google_containers
apiServer:
  certSANs:
  - 192.168.2.58
  - 192.168.2.59
  - 192.168.2.60
  - 192.168.2.158
  - 192.168.2.159
  - 192.168.2.160
  - 192.168.2.68
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.10.0.0/16
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF
kubeadm init --config kubeadm-config.yaml

After initialization succeeds, set up the kubectl config file. This effectively authorizes kubectl, so that kubectl commands can use the certificate to manage the k8s cluster.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Add the remaining masters

Copy the certificates from master1 to master2 and master3:

ssh master2 "cd /root && mkdir -p /etc/kubernetes/pki/etcd &&mkdir -p ~/.kube/"
scp /etc/kubernetes/pki/ca.crt master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key master2:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt master2:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key master2:/etc/kubernetes/pki/etcd/
ssh master3 "cd /root && mkdir -p /etc/kubernetes/pki/etcd &&mkdir -p ~/.kube/"
scp /etc/kubernetes/pki/ca.crt master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key master3:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt master3:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key master3:/etc/kubernetes/pki/etcd/
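The sixteen scp lines above can be generated with a short loop; this sketch prints the commands so you can review them first, then pipe the output to bash to execute:

```shell
# Print the mkdir/scp commands for master2 and master3; pipe to bash to run them.
cmds=$(for m in master2 master3; do
  echo "ssh $m 'mkdir -p /etc/kubernetes/pki/etcd ~/.kube'"
  for f in ca.crt ca.key sa.key sa.pub front-proxy-ca.crt front-proxy-ca.key; do
    echo "scp /etc/kubernetes/pki/$f $m:/etc/kubernetes/pki/"
  done
  for f in etcd/ca.crt etcd/ca.key; do
    echo "scp /etc/kubernetes/pki/$f $m:/etc/kubernetes/pki/etcd/"
  done
done)
echo "$cmds"           # review first
# echo "$cmds" | bash  # then execute
```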

On master1, print the join command (copy your own output):

kubeadm token create --print-join-command


Note that when a master joins, --control-plane is appended, so run the following on master2 and master3:

kubeadm join 192.168.2.68:16443 --token 8d1s8m.04a2o7as7uyvtg1h --discovery-token-ca-cert-hash sha256:747df164201cc98d4aa11d5fe0796cbe41d2a53ff84d3227781c8efdf076a47c --control-plane

Add the nodes

On node1, node2, and node3:

kubeadm join 192.168.2.68:16443 --token 8d1s8m.04a2o7as7uyvtg1h --discovery-token-ca-cert-hash sha256:747df164201cc98d4aa11d5fe0796cbe41d2a53ff84d3227781c8efdf076a47c

Install the flannel network plugin on master1. At this point the cluster nodes show NotReady, because the cluster still needs the flannel network plugin.

wget https://raw.githubusercontent.com/yutao517/mirror/main/profile/kube-flannel.yml
#Use the mirror if the download is slow
wget https://download.yutao.co/mirror/kube-flannel.yml

kubectl apply -f kube-flannel.yml

Verify the cluster

kubectl get node

Because the host machine's resources were limited, only node1 and node2 were actually joined here.


Query the cluster version through the load balancer

curl -k https://192.168.2.68:16443/version


Install a graphical UI

Install Dashboard


Personally, I find Dashboard less useful than Kuboard.

Install Kuboard

Deploy Metrics-Server first, otherwise no metrics can be collected.

kubectl apply -f https://download.yutao.co/mirror/components.yaml


Prometheus: Alertmanager + Grafana for Enterprise WeChat alerting and webhook robots (DingTalk and WeChat bots)