Troubleshooting common Kubernetes (k8s) issues
ControlPlane
Check that kubelet is running
Run systemctl status kubelet. Example output:
[root@instance-master-1 bootsman]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; disabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sun 2023-10-15 20:02:26 MSK; 3 weeks 4 days ago
Docs: https://kubernetes.io/docs/
Main PID: 7034 (kubelet)
Tasks: 12 (limit: 9495)
Memory: 67.9M
CPU: 9h 9min 2.203s
CGroup: /system.slice/kubelet.service
└─7034 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock
kubelet must be running: the Active line should read active (running).
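For scripted checks, the same condition can be tested without reading the full status output. The sketch below relies on systemctl is-active, which exits with 0 only when the unit is active. The function name check_unit is hypothetical and not part of the original procedure.

```shell
#!/bin/sh
# Report whether a systemd unit is active. Defaults to kubelet.
# check_unit is an illustrative helper name, not an existing tool.
check_unit() {
  unit="${1:-kubelet}"
  if systemctl is-active --quiet "$unit"; then
    echo "$unit: active"
  else
    echo "$unit: NOT active"
    return 1
  fi
}
```

Calling check_unit with no arguments checks kubelet. A non-zero return code can be used to stop a larger diagnostic script early.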
Check the health of the ControlPlane components
Run crictl ps on the control plane node. Example output:
[root@instance-master-1 bootsman]# crictl ps
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
7b4104c3d04cc 770cd897072cf 3 weeks ago Running promtail 0 365e5927d3842 loki-promtail-msv9x
0413f403d2696 1dbe0e9319764 3 weeks ago Running node-exporter 0 842b0138b9938 rancher-monitoring-prometheus-node-exporter-qwb4k
64b0a16b34cd2 bbd91fd54b288 3 weeks ago Running pushprox-client 0 fc63dbef28fa5 pushprox-kube-scheduler-client-b7nf5
fbf4f748fd336 bbd91fd54b288 3 weeks ago Running pushprox-client 0 e2fbef708b9f1 pushprox-kube-etcd-client-sjthv
2a5cb5aea628f bbd91fd54b288 3 weeks ago Running pushprox-client 0 28e0982488730 pushprox-kube-controller-manager-client-s65nb
d7ac2a9bf51e0 ead0a4a53df89 3 weeks ago Running coredns 0 af506bca5c1a6 coredns-b4bf48566-rz4xn
8096b0477c64e ead0a4a53df89 3 weeks ago Running coredns 0 cffa19f90e6cf coredns-b4bf48566-5mhh4
fc2248a358ac6 d00a7abfa71a6 3 weeks ago Running cilium-agent 0 4a72e9f6420a0 cilium-lkhwc
7566d5669fcb5 88429d3e5d05e 3 weeks ago Running cilium-operator 0 2a3b675d452e5 cilium-operator-777ddbc998-fprsq
4a9093098d9c3 09067696476ff 3 weeks ago Running kube-vip 0 1096be86add48 kube-vip-instance-master-1.kobik-personal
1f7b3c72ca6ed 86b6af7dd652c 3 weeks ago Running etcd 0 aaaaaa125e89e etcd-instance-master-1.kobik-personal
4b2c95aeaf6f6 f466468864b7a 3 weeks ago Running kube-controller-manager 0 153da6ce2ed91 kube-controller-manager-instance-master-1.kobik-personal
9f444dde5cfc6 98ef2570f3cde 3 weeks ago Running kube-scheduler 0 74180ee0dd901 kube-scheduler-instance-master-1.kobik-personal
7b3cbbd00bea6 e7972205b6614 3 weeks ago Running kube-apiserver 0 4c2d9a7e758e1 kube-apiserver-instance-master-1.kobik-personal
The listing must include the following containers in the Running state:
- kube-apiserver
- kube-vip
- cilium-agent
- kube-scheduler
- kube-controller-manager
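Comparing the list above against the crictl ps output by eye is error-prone, so the comparison can be scripted. The following is a minimal sketch, assuming the column layout shown in the example output. The function name missing_components is hypothetical. It reads the listing from standard input and prints every expected container that has no Running row.

```shell
#!/bin/sh
# Read `crictl ps` output on stdin and print each expected container
# name that does not appear on a line containing "Running".
# The name list mirrors the component list in this section.
missing_components() {
  listing="$(cat)"
  status=0
  for name in kube-apiserver kube-vip cilium-agent kube-scheduler kube-controller-manager; do
    # Match the NAME column exactly; pod names such as
    # kube-apiserver-<node> are longer and therefore do not match.
    if ! printf '%s\n' "$listing" | awk -v n="$name" '
        /Running/ { for (i = 1; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }'; then
      echo "missing: $name"
      status=1
    fi
  done
  return "$status"
}
```

Typical use would be crictl ps | missing_components. An empty output with exit code 0 means that all five containers are listed as Running.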
Worker
Check that kubelet is running
Run systemctl status kubelet. Example output:
[root@instance-master-1 bootsman]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; disabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sun 2023-10-15 20:02:26 MSK; 3 weeks 4 days ago
Docs: https://kubernetes.io/docs/
Main PID: 7034 (kubelet)
Tasks: 12 (limit: 9495)
Memory: 67.9M
CPU: 9h 9min 2.203s
CGroup: /system.slice/kubelet.service
└─7034 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock
kubelet must be running: the Active line should read active (running).
View kubelet logs
Run the command
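The command for this step did not survive in this text. Since kubelet runs as a systemd unit here, one common way to read its logs is journalctl. The sketch below is an assumption along those lines, not necessarily the command the original page showed. The wrapper name kubelet_logs is hypothetical.

```shell
#!/bin/sh
# Print the last N lines of the kubelet journal (default 100),
# without a pager so the output can be piped or saved.
# kubelet_logs is an illustrative helper name.
kubelet_logs() {
  journalctl -u kubelet --no-pager -n "${1:-100}"
}
```

To follow the log live while reproducing a problem, journalctl -u kubelet -f can be used instead.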
ETCD
Check that the container is running
Run crictl ps --name etcd. Example output:
[root@instance-master-1 bootsman]# crictl ps --name etcd
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
1f7b3c72ca6ed 86b6af7dd652c 3 weeks ago Running etcd 0 aaaaaa125e89e etcd-instance-master-1.kobik-personal
View etcd logs
Get the container ID. It is in the first column (CONTAINER):
[root@instance-master-1 bootsman]# crictl ps --name etcd
CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
1f7b3c72ca6ed 86b6af7dd652c 3 weeks ago Running etcd 0 aaaaaa125e89e etcd-instance-master-1.kobik-personal
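The step that uses the ID is not shown in this text. With crictl, container logs are read with crictl logs followed by the container ID. The sketch below combines both steps. The helper names first_container_id and etcd_logs are hypothetical, and the default of 100 lines is an arbitrary choice.

```shell
#!/bin/sh
# Print the first column (CONTAINER) of the first data row
# of `crictl ps` output read from stdin.
first_container_id() {
  awk 'NR > 1 { print $1; exit }'
}

# Show the last N log lines (default 100) of the running etcd container.
etcd_logs() {
  id="$(crictl ps --name etcd | first_container_id)"
  if [ -z "$id" ]; then
    echo "no running etcd container found" >&2
    return 1
  fi
  crictl logs --tail "${1:-100}" "$id"
}
```

With the example output above, this amounts to running crictl logs --tail 100 1f7b3c72ca6ed.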
Longhorn
Check the deployment
Run kubectl get po -n longhorn-system. All pods must be in the Running state. Example output:
[root@instance-worker-1 bootsman]# kubectl get po -n longhorn-system
NAME READY STATUS RESTARTS AGE
csi-attacher-76cfbcc684-7mcs9 1/1 Running 0 25d
csi-attacher-76cfbcc684-82wc9 1/1 Running 0 25d
csi-attacher-76cfbcc684-j6xgg 1/1 Running 0 25d
csi-provisioner-7fdb5f4c6c-5m6gb 1/1 Running 0 25d
csi-provisioner-7fdb5f4c6c-ns42z 1/1 Running 0 25d
csi-provisioner-7fdb5f4c6c-qkn6l 1/1 Running 0 25d
csi-resizer-7c4fc545-5c7d2 1/1 Running 0 25d
csi-resizer-7c4fc545-6l8pp 1/1 Running 0 25d
csi-resizer-7c4fc545-zhvjt 1/1 Running 0 25d
csi-snapshotter-595bc9d4c7-5g8c7 1/1 Running 0 25d
csi-snapshotter-595bc9d4c7-ghzzs 1/1 Running 0 25d
csi-snapshotter-595bc9d4c7-l7wzt 1/1 Running 0 25d
engine-image-ei-9619d2ae-6snnx 1/1 Running 0 25d
engine-image-ei-9619d2ae-kk4x8 1/1 Running 0 25d
engine-image-ei-9619d2ae-tz7vg 1/1 Running 0 25d
instance-manager-bb4a3b2ff0ded1fa189e5c3f2f3aea72 1/1 Running 0 25d
instance-manager-e33c4122c6efaf2a01cfd66cad4bf6eb 1/1 Running 0 25d
instance-manager-f8931f593fcb7288c3f4777469cd7901 1/1 Running 0 25d
longhorn-csi-plugin-9z96f 3/3 Running 0 25d
longhorn-csi-plugin-clgjg 3/3 Running 0 25d
longhorn-csi-plugin-r6949 3/3 Running 0 25d
longhorn-driver-deployer-77fbb76899-xgt6j 1/1 Running 0 25d
longhorn-manager-8wssz 1/1 Running 0 25d
longhorn-manager-q5pvv 1/1 Running 0 25d
longhorn-manager-v6s4k 1/1 Running 0 25d
longhorn-ui-799696dd6c-54wrn 1/1 Running 0 25d
longhorn-ui-799696dd6c-rjphr 1/1 Running 0 25d
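With this many pods, a filter that prints only the problematic ones can be easier to read than the full listing. The following is a minimal sketch, assuming the five-column layout shown above. The function name unhealthy_pods is hypothetical. It flags a pod when STATUS is not Running or when the two numbers in READY differ.

```shell
#!/bin/sh
# Read `kubectl get po` output on stdin and print the names of pods
# that are not Running or not fully ready (READY x/y with x != y).
# Returns 1 if at least one such pod was found.
unhealthy_pods() {
  awk '
    NF == 0 || $1 == "NAME" { next }   # skip blank lines and the header
    {
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) { print $1; bad = 1 }
    }
    END { exit (bad ? 1 : 0) }'
}
```

Typical use would be kubectl get po -n longhorn-system | unhealthy_pods. Note that this simple rule would also flag pods in the Completed state, which may be normal for one-off jobs.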
Check the servers
On each worker, check that the iscsid daemon is installed and running with systemctl status iscsid.
Example output with the daemon running:
[root@instance-worker-0 bootsman]# systemctl status iscsid
● iscsid.service - Open-iSCSI
Loaded: loaded (/usr/lib/systemd/system/iscsid.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2023-10-15 19:55:08 MSK; 3 weeks 4 days ago
TriggeredBy: ● iscsid.socket
Docs: man:iscsid(8)
man:iscsiuio(8)
man:iscsiadm(8)
Main PID: 4022 (iscsid)
Status: "Ready to process requests"
Tasks: 1 (limit: 9495)
Memory: 2.4M
CPU: 8ms
CGroup: /system.slice/iscsid.service
└─4022 /usr/sbin/iscsid -f
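The two conditions, installed and running, can also be checked separately in a script. The sketch below assumes that the iscsiadm utility (referenced in the Docs lines above) ships together with the daemon, so its absence is treated as "not installed". The function name check_iscsid is hypothetical.

```shell
#!/bin/sh
# Check that the iSCSI tooling is present and that iscsid is active.
# Return codes: 0 = active, 1 = installed but not active, 2 = not installed.
check_iscsid() {
  if ! command -v iscsiadm >/dev/null 2>&1; then
    echo "iscsiadm: not installed"
    return 2
  fi
  if systemctl is-active --quiet iscsid; then
    echo "iscsid: active"
  else
    echo "iscsid: not active"
    return 1
  fi
}
```

Running check_iscsid on every worker gives a one-line answer per node, which is convenient when the cluster has many of them.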