Kubernetes Study Notes from Scratch (Part 3)

后知后觉

Using Kubernetes to orchestrate Docker containers quickly has become the prevailing deployment practice of the microservices era.

Images

Images in the official registry k8s.gcr.io cannot be pulled reliably on machines in mainland China, so Alibaba Cloud's Container Registry (ACR) can be used as a relay.

Create a namespace

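First create a namespace in the Alibaba Cloud image registry. The namespace uniquely identifies a user. Any name works, but a short one is recommended.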

Create an image repository

Next, create an image repository. Each repository corresponds to one image name.

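Tip: name the repository exactly after the image you need, for example kube-apiserver. Choose the repository type to suit your needs; a private repository requires logging in before images can be pulled.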

Upload images from an overseas server

First find a server that can reach the k8s.gcr.io registry, for example an Alibaba Cloud Simple Application Server (SAS) in the Hong Kong or Singapore region.

Then use the following command to list the required image names and versions:

$ kubeadm config images list
I0830 02:30:40.287681    3425 version.go:252] remote version is much newer: v1.19.0; falling back to: stable-1.18
W0830 02:30:43.217917    3425 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
k8s.gcr.io/kube-apiserver:v1.18.8
k8s.gcr.io/kube-controller-manager:v1.18.8
k8s.gcr.io/kube-scheduler:v1.18.8
k8s.gcr.io/kube-proxy:v1.18.8
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7

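Pull the images listed above as usual, then upload them to the private Alibaba Cloud repository. The steps below use one image as an example.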

$ sudo docker image pull k8s.gcr.io/kube-apiserver:v1.18.8

Retag the image

$ sudo docker tag k8s.gcr.io/kube-apiserver:v1.18.8 registry.cn-beijing.aliyuncs.com/vndroid/kube-apiserver:v1.18.8

Log in to the private registry

$ sudo docker login --username=example@outlook.com registry.cn-beijing.aliyuncs.com
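Tip: the exact command is shown on the details page of any repository in the console. The login username is your full Alibaba Cloud account name, and the password is the one you set when enabling the service.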

Push to the private registry

$ sudo docker push registry.cn-beijing.aliyuncs.com/vndroid/kube-apiserver:v1.18.8
Tip: an error such as "host not found" points to a DNS problem. Adding the Alibaba Cloud DNS server 223.5.5.5 to /etc/resolv.conf resolves it.

After the push, the uploaded version is visible in the repository's version list.
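
The pull, tag and push steps above can be repeated for every image in one loop. The following is a minimal sketch rather than the author's own script: the function names mirror_ref and mirror_all are made up for illustration, and MIRROR is the registry and namespace used in this post, which you would replace with your own.

```shell
#!/bin/sh
# Registry and namespace used in this post; replace with your own.
MIRROR="registry.cn-beijing.aliyuncs.com/vndroid"

# Map an upstream reference such as k8s.gcr.io/NAME:TAG to MIRROR/NAME:TAG.
# Hypothetical helper: it keeps only the last path component of the image.
mirror_ref() {
    printf '%s/%s\n' "$MIRROR" "${1##*/}"
}

# Pull, retag and push every image that kubeadm lists.
# grep keeps only the image lines and drops kubeadm's log messages.
mirror_all() {
    for image in $(kubeadm config images list 2>/dev/null | grep '^k8s\.gcr\.io/'); do
        sudo docker image pull "$image" &&
        sudo docker tag "$image" "$(mirror_ref "$image")" &&
        sudo docker push "$(mirror_ref "$image")" || return 1
    done
}
```

Run mirror_all on the overseas server after logging in with docker login.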

Deployment

Private images

Pull all the images from the Alibaba Cloud repository. If the repository is private, log in first.

docker image pull registry.cn-beijing.aliyuncs.com/vndroid/kube-apiserver:v1.18.8
docker image pull registry.cn-beijing.aliyuncs.com/vndroid/kube-controller-manager:v1.18.8
docker image pull registry.cn-beijing.aliyuncs.com/vndroid/kube-scheduler:v1.18.8
docker image pull registry.cn-beijing.aliyuncs.com/vndroid/kube-proxy:v1.18.8
docker image pull registry.cn-beijing.aliyuncs.com/vndroid/etcd:3.4.3-0
docker image pull registry.cn-beijing.aliyuncs.com/vndroid/coredns:1.6.7
docker image pull registry.cn-beijing.aliyuncs.com/vndroid/pause:3.2

Retag the private images

docker image tag registry.cn-beijing.aliyuncs.com/vndroid/kube-apiserver:v1.18.8 k8s.gcr.io/kube-apiserver:v1.18.8
docker image tag registry.cn-beijing.aliyuncs.com/vndroid/kube-controller-manager:v1.18.8 k8s.gcr.io/kube-controller-manager:v1.18.8
docker image tag registry.cn-beijing.aliyuncs.com/vndroid/kube-scheduler:v1.18.8 k8s.gcr.io/kube-scheduler:v1.18.8
docker image tag registry.cn-beijing.aliyuncs.com/vndroid/kube-proxy:v1.18.8 k8s.gcr.io/kube-proxy:v1.18.8
docker image tag registry.cn-beijing.aliyuncs.com/vndroid/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0
docker image tag registry.cn-beijing.aliyuncs.com/vndroid/coredns:1.6.7 k8s.gcr.io/coredns:1.6.7
docker image tag registry.cn-beijing.aliyuncs.com/vndroid/pause:3.2 k8s.gcr.io/pause:3.2

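Remove the redundant tags (optional). This does not actually free any disk space, because the k8s.gcr.io tags still reference the same image data; it only tidies up the image list.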

docker image rm registry.cn-beijing.aliyuncs.com/vndroid/kube-apiserver:v1.18.8
docker image rm registry.cn-beijing.aliyuncs.com/vndroid/kube-controller-manager:v1.18.8
docker image rm registry.cn-beijing.aliyuncs.com/vndroid/kube-scheduler:v1.18.8
docker image rm registry.cn-beijing.aliyuncs.com/vndroid/kube-proxy:v1.18.8
docker image rm registry.cn-beijing.aliyuncs.com/vndroid/etcd:3.4.3-0
docker image rm registry.cn-beijing.aliyuncs.com/vndroid/coredns:1.6.7
docker image rm registry.cn-beijing.aliyuncs.com/vndroid/pause:3.2
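
The three lists above (pull, retag, remove) follow the same pattern for each image, so they can be collapsed into one loop. This is a sketch under assumptions: restore_images is a made-up name, and the DOCKER variable is a convenience introduced here for dry runs, not something docker itself defines.

```shell
#!/bin/sh
# Registry and namespace used in this post; replace with your own.
MIRROR="registry.cn-beijing.aliyuncs.com/vndroid"

# Image names and tags exactly as listed in this post.
IMAGES="kube-apiserver:v1.18.8
kube-controller-manager:v1.18.8
kube-scheduler:v1.18.8
kube-proxy:v1.18.8
etcd:3.4.3-0
coredns:1.6.7
pause:3.2"

# Pull each image from the mirror, give it the k8s.gcr.io name that
# kubeadm expects, then drop the mirror tag (the optional cleanup step).
# DOCKER is a hypothetical override: DOCKER=echo prints the commands
# instead of running them, DOCKER="sudo docker" runs them through sudo.
restore_images() {
    for image in $IMAGES; do
        ${DOCKER:-docker} image pull "$MIRROR/$image" &&
        ${DOCKER:-docker} image tag "$MIRROR/$image" "k8s.gcr.io/$image" &&
        ${DOCKER:-docker} image rm "$MIRROR/$image" || return 1
    done
}
```

Calling DOCKER=echo restore_images first shows what would be executed. kubeadm also has an --image-repository flag that points it at another registry directly; the post does not use it, so it is mentioned here only as a possible alternative to retagging.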

Then pull the network add-on (flannel) and dashboard images:

docker image pull quay.io/coreos/flannel:v0.12.0-amd64
docker image pull kubernetesui/dashboard:v2.0.3
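
Before initializing the cluster it may help to confirm that every image kubeadm expects is already present locally under its k8s.gcr.io name. The helper below is a hypothetical sketch (missing_images is not part of kubeadm or docker): it compares two newline-separated lists and prints the entries of the first that are absent from the second.

```shell
#!/bin/sh
# Print every required image that does not appear in the local image list.
# $1: required images, one per line; $2: local images, one per line.
missing_images() {
    printf '%s\n' "$1" | while read -r image; do
        [ -n "$image" ] || continue
        # -x matches whole lines only, -F treats the name as a fixed string.
        printf '%s\n' "$2" | grep -qxF -- "$image" || printf '%s\n' "$image"
    done
}

# Example call on the control-plane host:
#   missing_images \
#     "$(kubeadm config images list 2>/dev/null | grep '^k8s\.gcr\.io/')" \
#     "$(docker image ls --format '{{.Repository}}:{{.Tag}}')"
```

An empty result means nothing is missing; any printed name still needs to be pulled and retagged.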

Initialize the cluster

$ sudo kubeadm init --kubernetes-version v1.18.8 --pod-network-cidr=10.244.0.0/16

For reference, the full output of one run follows (this capture was made without --pod-network-cidr):

# kubeadm init --kubernetes-version v1.18.8
W0818 22:49:38.623515    2987 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.8
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [debian kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.16.16.5]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [debian localhost] and IPs [172.16.16.5 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [debian localhost] and IPs [172.16.16.5 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0818 22:49:49.636154    2987 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0818 22:49:49.639931    2987 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 27.511628 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node debian as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node debian as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: zb5t92.ggq885b0bw1ni7m8
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.16.5:6443 --token zb5t92.ggq885b0bw1ni7m8 \
    --discovery-token-ca-cert-hash sha256:0e49919c783ef0b917819e04ee0777fdc7b35ba8c1de6535315f8c63281155f2

Apply the network add-on

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
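Tip: apply the network add-on before joining any nodes to the cluster, otherwise network errors will occur. If the URL above is unreachable, a mirror hosted in China is available at https://wave.md/kubernetes/kube-flannel.yml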

Add worker nodes

On each of the other nodes, run the join command printed at the end of kubeadm init:

kubeadm join 172.16.16.5:6443 --token zb5t92.ggq885b0bw1ni7m8 \
    --discovery-token-ca-cert-hash sha256:0e49919c783ef0b917819e04ee0777fdc7b35ba8c1de6535315f8c63281155f2

Then list the nodes:

$ kubectl get node
NAME   STATUS   ROLES    AGE    VERSION
k8s1   Ready    master   63m    v1.18.8
k8s2   Ready    <none>   62m    v1.18.8
k8s3   Ready    <none>   61m    v1.18.8

Assign node roles

Label the k8s2 and k8s3 nodes as worker nodes (role "node"):

$ kubectl label nodes k8s2 node-role.kubernetes.io/node=node
node/k8s2 labeled
$ kubectl get nodes
NAME   STATUS     ROLES    AGE   VERSION
k8s1   NotReady   master   64m   v1.18.8
k8s2   NotReady   node     65m   v1.18.8
k8s3   NotReady   <none>   66m   v1.18.8
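
The transcript above labels only k8s2. To label several workers in one go, a small loop is enough. The sketch below is illustrative: label_workers and the KUBECTL override are names introduced here, not part of kubectl.

```shell
#!/bin/sh
# Label each named node with the role "node".
# kubectl fills the ROLES column from the part of the label key after
# node-role.kubernetes.io/, so the label value is only informational.
# KUBECTL is a hypothetical override; KUBECTL=echo gives a dry run.
label_workers() {
    for node in "$@"; do
        ${KUBECTL:-kubectl} label nodes "$node" node-role.kubernetes.io/node=node || return 1
    done
}
```

For the cluster in this post the call would be label_workers k8s2 k8s3.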

Inspecting problems

View the kubelet log, which is mostly useful when troubleshooting network failures:

# journalctl -f -u kubelet
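
The kubelet log is verbose, so it can help to keep only the error-level entries. The filter below is a sketch that assumes the log line layout seen elsewhere in this post (a severity letter followed by the date, time and process ID); klog_errors is a made-up name.

```shell
#!/bin/sh
# Keep only lines whose severity field is E (error) or F (fatal),
# e.g. "... kubelet[32203]: E0819 03:36:21.438073   32203 kubelet.go:2188] ...".
klog_errors() {
    grep -E ' [EF][0-9]{4} [0-9:.]+ +[0-9]+ '
}

# Example (as root):
#   journalctl -u kubelet --no-pager --since "10 minutes ago" | klog_errors
```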

Troubleshooting

localhost:8080 was refused

If you see the following error:

$ kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?

it is because the steps printed by kubeadm init were not followed. Run:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
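
The localhost:8080 error is consistent with kubectl finding no kubeconfig file. A quick way to check is sketched below; check_kubeconfig is a hypothetical helper, and it simplifies by treating KUBECONFIG as a single path, although kubectl also accepts a colon-separated list.

```shell
#!/bin/sh
# Report which kubeconfig file would be used and whether it is readable.
check_kubeconfig() {
    cfg="${KUBECONFIG:-$HOME/.kube/config}"
    if [ -r "$cfg" ]; then
        echo "kubeconfig found: $cfg"
    else
        echo "kubeconfig missing or unreadable: $cfg"
        return 1
    fi
}
```

If the file is reported missing, the three commands above create it for the current user.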

flannel interface on a node has no CIDR address

If the flannel virtual interface has no address, as in the output below:

# ip a
4: flannel.1: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default 
    link/ether 5e:88:cf:b0:81:ea brd ff:ff:ff:ff:ff:ff

the likely cause is that no pod CIDR was specified when the cluster was initialized. Re-initialize the cluster with the flag --pod-network-cidr=10.244.0.0/16.

The worker nodes' logs will also contain many entries like these:

Aug 19 03:53:17 k8s2 kubelet[32203]: E0819 03:53:17.698669   32203 pod_workers.go:191] Error syncing pod 310713fc-e1b5-451a-9816-bfba8a40c985 ("kube-flannel-ds-amd64-2bvqs_kube-system(310713fc-e1b5-451a-9816-bfba8a40c985)"), skipping: failed to "StartContainer" for "kube-flannel" with CrashLoopBackOff: "back-off 5m0s restarting failed container=kube-flannel pod=kube-flannel-ds-amd64-2bvqs_kube-system(310713fc-e1b5-451a-9816-bfba8a40c985)"
Aug 19 03:36:21 k8s2 kubelet[32203]: E0819 03:36:21.438073   32203 kubelet.go:2188] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

All nodes are NotReady

If every node in the cluster is NotReady, as shown below:

root@k8s1:~# kubectl get nodes
NAME   STATUS     ROLES    AGE   VERSION
k8s1   NotReady   master   65m   v1.18.8
k8s2   NotReady   <none>   61m   v1.18.8
k8s3   NotReady   <none>   60m   v1.18.8

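the likely cause is that nodes were joined before the network add-on was deployed. Remove the nodes, deploy the network add-on, and then join the nodes again.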

already has an IP address different

If the network add-on is deployed but all nodes still report NotReady, check the logs. An error like the following may appear:

  Warning  FailedCreatePodSandBox  15m                    kubelet, k8s1      Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d8113436aa24e2c32d07cb15b7997d66709456892cbe62625710a655f91814be" network for pod "coredns-66bff467f8-fvczt": networkPlugin cni failed to set up pod "coredns-66bff467f8-fvczt_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.244.0.1/24

Inspect the current cni virtual interface:

1731: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 7a:71:cf:43:aa:17 brd ff:ff:ff:ff:ff:ff
    inet 10.10.0.1/24 brd 10.244.0.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::7871:cfff:fe43:aa17/64 scope link 
       valid_lft forever preferred_lft forever

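This shows that the CIDR passed at cluster initialization does not match the address on the existing interface. When re-initializing a cluster, delete the old interface first: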

ip link delete cni0

Then bring the node back up.
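
The mismatch can also be checked mechanically by testing whether the address on cni0 falls inside the pod network. The following is a sketch of plain IPv4 CIDR arithmetic in shell; ip_to_int and in_cidr are hypothetical helpers, and 10.244.0.0/16 is the pod CIDR used in this post.

```shell
#!/bin/sh
# Convert a dotted IPv4 address to a single integer.
# The function body runs in a subshell so the IFS change does not leak.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 lies inside the network $2 (form A.B.C.D/N).
in_cidr() (
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
)

# Example: read the address of cni0 and compare it with the pod CIDR.
#   addr=$(ip -4 -o addr show cni0 | awk '{print $4}' | cut -d/ -f1)
#   in_cidr "$addr" 10.244.0.0/16 || echo "cni0 is outside the pod CIDR"
```

With the interface shown above, 10.10.0.1 is outside 10.244.0.0/16, which matches the error message.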

Trace[456471162]: [30.001441123s] [30.001441123s] END
E0820 05:33:15.672472       1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/ready: Still waiting on: "kubernetes"
