Abnormal Pod Status

CrashLoopBackOff

Error scenario

The Pod status shows CrashLoopBackOff

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
test-centos7-7cc5dc6987-jz486 0/1 CrashLoopBackOff 8 (111s ago) 17m

Check the Pod details

$ kubectl describe pod test-centos7-7cc5dc6987-jz486
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18m default-scheduler Successfully assigned default/test-centos7-7cc5dc6987-jz486 to ops-kubernetes3
Normal Pulled 16m (x5 over 18m) kubelet Container image "centos:centos7.9.2009" already present on machine
Normal Created 16m (x5 over 18m) kubelet Created container centos7
Normal Started 16m (x5 over 18m) kubelet Started container centos7
Warning BackOff 3m3s (x71 over 18m) kubelet Back-off restarting failed container

The output shows the event Reason is BackOff and the Message is Back-off restarting failed container

Possible causes

Back-off restarting failed container usually means that the process with PID 1 inside the container has exited (the program started by the image's CMD normally runs as PID 1) [1]

When the container's main process exits (the command finishes or the process terminates abnormally), the container's lifecycle ends. The Kubernetes controller detects the exit and keeps restarting the container. In this situation, check whether the image lacks a long-running (resident) process, or whether that process is failing.

To narrow this down, you can run the image on its own with the docker client and observe its behavior: if the process in the container finishes or exits right after start, the container terminates with it.
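
For example, a minimal check with the docker client (the image name is taken from the example above, the container name is arbitrary):

# Run the image directly; with no resident process it exits immediately
$ docker run --name crashloop-test centos:centos7.9.2009
# Inspect how and when the container terminated
$ docker inspect crashloop-test --format '{{.State.ExitCode}} {{.State.FinishedAt}}'
$ docker rm crashloop-test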

During troubleshooting you can also use kubectl describe pod to check the Pod's exit code. In Kubernetes, the ExitCode is the status code a container returns when it exits; it indicates the result of the container's execution so that Kubernetes and related tooling can act on it (see the sketch after the list). Common ExitCode values:

  • ExitCode 0: the container exited normally without errors. This is usually the expected result.
  • ExitCode 1: the container exited abnormally, typically due to an internal application error or exception; usually the process with PID 1 in the container failed.
  • Non-zero ExitCode: any non-zero value means the container exited with an error. The specific value is usually application-defined; the application inside the container may return different codes for different error conditions. Check the application's documentation or logs for the exact meaning.
  • ExitCode 137: the container was terminated by the operating system (for example by the OOM killer) rather than exiting normally, often due to resource problems such as insufficient memory.
  • ExitCode 139: the container exited abnormally after receiving a signal, usually SIGSEGV (segmentation fault), meaning the application tried to access invalid memory.
  • ExitCode 143: the container exited after receiving SIGTERM. This is the signal Kubernetes sends when deleting a Pod; the container is expected to do its cleanup and then exit.
  • ExitCode 130: the container exited after receiving SIGINT, the signal sent when a user presses Ctrl+C on the command line.
  • ExitCode 255: usually an unknown error, or the container failed to start. This often points to a container-runtime problem, such as a missing image or a broken start command.
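
A minimal sketch of reading the last exit code with kubectl (Pod name taken from the example above):

# Show the last terminated state (reason, exit code, signal) of the container
$ kubectl describe pod test-centos7-7cc5dc6987-jz486 | grep -A 5 'Last State'
# Or query the exit code directly
$ kubectl get pod test-centos7-7cc5dc6987-jz486 \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
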
ImagePullBackOff

After the Harbor certificate expired it was renewed (see the related notes on issues after renewing the certificate). Updating Pods in Kubernetes then failed on a node that uses containerd as the CRI, with the following errors

    # kubectl get pods -n ops
    NAME READY STATUS RESTARTS AGE
    get-cloud-cdn-statistics-pjfsc-6jnzl 0/1 Init:ImagePullBackOff 0 2m28s
    get-cloud-cdn-statistics-r67s2-qs8kj 0/1 Init:ImagePullBackOff 0 81m


    # kubectl describe pod -n ops get-cloud-cdn-statistics-pjfsc-x9mh7
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Normal Scheduled 17s default-scheduler Successfully assigned ops/get-cloud-cdn-statistics-pjfsc-x9mh7 to k8s-worker1
    Normal BackOff 17s kubelet Back-off pulling image "harbor1.mydomain.com/ops/all/cloud-server-cdn-statistics-code:master-0.0-20230207143540"
    Warning Failed 17s kubelet Error: ImagePullBackOff
    Normal Pulling 2s (x2 over 18s) kubelet Pulling image "harbor1.mydomain.com/ops/all/cloud-server-cdn-statistics-code:master-0.0-20230207143540"
    Warning Failed 2s (x2 over 18s) kubelet Failed to pull image "harbor1.mydomain.com/ops/all/cloud-server-cdn-statistics-code:master-0.0-20230207143540": rpc error: code = Unknown desc = failed to pull and unpack image "harbor1.mydomain.com/ops/all/cloud-server-cdn-statistics-code:master-0.0-20230207143540": failed to resolve reference "harbor1.mydomain.com/ops/all/cloud-server-cdn-statistics-code:master-0.0-20230207143540": failed to do request: Head "https://harbor1.mydomain.com/v2/ops/all/cloud-server-cdn-statistics-code/manifests/master-0.0-20230207143540": x509: certificate signed by unknown authority
    Warning Failed 2s (x2 over 18s) kubelet Error: ErrImagePull

Testing access to the Harbor domain from node k8s-worker1 with both docker and curl worked fine, so the problem was narrowed down to containerd not recognizing the certificate.
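
A sketch of the kind of checks run on the node (the exact commands are illustrative):

# TLS verification against Harbor succeeds using the OS trust store
$ curl -v https://harbor1.mydomain.com/v2/ 2>&1 | grep -i 'certificate verify'
# dockerd can pull the image, so only containerd is missing the CA
$ docker pull harbor1.mydomain.com/ops/all/cloud-server-cdn-statistics-code:master-0.0-20230207143540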

When Kubernetes uses containerd as the container runtime, trusting an extra certificate (such as a self-signed Harbor registry certificate) may require editing or creating containerd's configuration file, typically /etc/containerd/config.toml, and adding a certificate entry. In the example below, ca_file should point to the Harbor CA certificate; make sure the path is correct and the certificate is in PEM format.

    /etc/containerd/config.toml
    [plugins."io.containerd.grpc.v1.cri".registry]
    [plugins."io.containerd.grpc.v1.cri".registry.configs."harbor1.mydomain.com".tls]
    ca_file = "/etc/docker/certs.d/harbor1.mydomain.com/ca.crt"

Restart the containerd service, then redeploy the Pod

    systemctl restart containerd
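
Optionally, verify from the node that containerd itself can now pull the image (using crictl is an assumption about the tooling installed on the node):

$ crictl pull harbor1.mydomain.com/ops/all/cloud-server-cdn-statistics-code:master-0.0-20230207143540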

Pod status is InvalidImageName

Error scenario

The Pod status shows InvalidImageName

    kubectl get pods -n cs
    NAME READY STATUS RESTARTS AGE
    54fdc56754-qrlt6 0/2 InvalidImageName 0 14s
    8486f49b89-zp25b 0/2 Init:ErrImagePull 0 7s

Possible causes

The image URL starts with http:// or https://. The image reference in the configuration must not include a protocol scheme (http:// or https://).
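
A minimal illustration in a Pod/Deployment spec (the image name here is hypothetical):

containers:
- name: app
  # wrong, causes InvalidImageName:
  #   image: https://harbor1.mydomain.com/ops/app:1.0.0
  # correct, registry host with no scheme:
  image: harbor1.mydomain.com/ops/app:1.0.0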

Pod status is Error

    The node was low on resource: ephemeral-storage

Error scenario

The Pod status shows Error

    $ kubectl get pods
    NAME READY STATUS RESTARTS AGE
    front-7df8ccc4c7-xhp6s 0/1 Error 0 5h42m

Check the Pod details

    $ kubectl describe pod front-7df8ccc4c7-xhp6s
    ...
    Status: Failed
    Reason: Evicted
    Message: The node was low on resource: ephemeral-storage. Container php was using 394, which exceeds its request of 0.
    ...

The key information is Status: Failed and Reason: Evicted; the specific reason is The node was low on resource: ephemeral-storage

Checking the kubelet logs on the node and searching for the keywords evict or disk also shows that filesystem usage on the node exceeded the threshold

    $ journalctl -u kubelet  | grep -i -e disk -e evict
    image_gc_manager.go:310] "Dis usage on image filesystem is over the high threshold, trying to free bytes down to the low threshold" usage=85 highThreshold=85 amountToFree=5122092236 lowThreshold=80
    eviction_manager.go:349] "Eviction manager: must evict pod(s) to reclaim" resourceName="ephemeral-storage"
    eviction_manager.go:338] "Eviction manager: attempting to reclaim" resourceName="ephemeral-storage"

Possible causes

From the information above, the Pod failure is caused by The node was low on resource: ephemeral-storage: the node ran short of ephemeral storage, was tainted, and the Pods on it were evicted.

See the Kubernetes documentation on local ephemeral storage for details.

In this situation, if a Pod's ephemeral storage usage exceeds the amount you allow, the kubelet sends it an eviction signal and the Pod is evicted from the node.

If the filesystem backing writable container layers, node-level logs, or emptyDir volumes runs low on free space, the node taints itself as low on local storage. This taint triggers eviction of Pods that cannot tolerate it.
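
If the usage comes from the workload itself, declaring ephemeral-storage requests and limits lets the scheduler and the kubelet account for it; a minimal sketch (the image and the values are illustrative):

containers:
- name: php
  image: php:7.4-fpm
  resources:
    requests:
      ephemeral-storage: "1Gi"
    limits:
      ephemeral-storage: "2Gi"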

Solutions

  • Increase disk space

  • Adjust the kubelet nodefs.available eviction threshold

    Edit the kubelet startup configuration on the node, /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf: define the environment variable KUBELET_EVICT_NODEFS_THRESHOLD_ARGS and append it to the startup parameters

    Environment="KUBELET_EVICT_NODEFS_THRESHOLD_ARGS=--eviction-hard=nodefs.available<5%"
    ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS $KUBELET_EVICT_NODEFS_THRESHOLD_ARGS

    After the change, restart the kubelet service and check the logs to confirm the new nodefs.available value has taken effect

    $ systemctl daemon-reload
    $ systemctl restart kubelet

    $ journalctl -u kubelet | grep -i nodefs
    17604 container_manager_linux.go:267] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}

    The log shows Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.05}, which confirms the change took effect. [2]

Pod status is Init

    Unable to attach or mount volumes

The Pod fails to start and its status shows Init:0/1

    $ kubectl get pods
    NAME READY STATUS RESTARTS AGE
    admin-cbb479556-j9qg2 0/1 Init:0/1 0 3m37s

Check the Pod's detailed description

    $ kubectl describe pod admin-cbb479556-j9qg2
    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Normal Scheduled 3m41s default-scheduler Successfully assigned admin-cbb479556-j9qg2 to k8s-work2
    Warning FailedMount 99s kubelet Unable to attach or mount volumes: unmounted volumes=[logs], unattached volumes=[wwwroot kube-api-access-z8745 logs]: timed out waiting for the condition
    Warning FailedMount 42s kubelet MountVolume.SetUp failed for volume "uat-nfs-pv" : mount failed: exit status 32
    Mounting command: mount
    Mounting arguments: -t nfs 34.230.1.1:/data/NFSDataHome /var/lib/kubelet/pods/9d9a4807-706c-4369-b8be-b5727ee6aa8f/volumes/kubernetes.io~nfs/uat-nfs-pv
    Output: mount.nfs: Connection timed out

According to the Events output, MountVolume.SetUp failed for volume "uat-nfs-pv" : mount failed: exit status 32 indicates the volume mount failed. The output includes the mount command and arguments (mount -t nfs 34.230.1.1:/data/NFSDataHome /var/lib/kubelet/pods/9d9a4807-706c-4369-b8be-b5727ee6aa8f/volumes/kubernetes.io~nfs/uat-nfs-pv) and the command's result (mount.nfs: Connection timed out)

Based on the Events, the volume is an NFS-backed PV. In this case the root cause was a wrong NFS server address; after updating the NFS server address in the PV configuration, the Pod started normally.
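
To verify the NFS server address from the node itself, you can test it manually (requires nfs-utils; the IP and export path are taken from the event above):

# List the exports offered by the NFS server
$ showmount -e 34.230.1.1
# Attempt a manual mount with a short timeout
$ mount -t nfs -o timeo=30,retry=0 34.230.1.1:/data/NFSDataHome /mnt
$ umount /mnt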

    ContainerCreating

    dbus: connection closed by user

After updating the DaemonSet-managed node_exporter, the Pod on one node failed to be created and stayed in ContainerCreating. Check the Pod's detailed description

    $ kubectl get pods -n prometheus -o wide
    NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
    node-exporter-glnk5 1/1 Running 0 28h 172.31.8.197 work2 <none> <none>
    node-exporter-kzs2r 1/1 Running 1 28h 172.31.100.86 work1 <none> <none>
    node-exporter-nxz9v 0/1 ContainerCreating 0 5m30s 172.31.100.38 master <none> <none>
    node-exporter-vpkwt 1/1 Running 0 31m 172.31.100.69 work4 <none> <none>
    node-exporter-wft7v 1/1 Running 0 14m 172.31.14.7 work3 <none> <none>
    prometheus-67ccbbd78-zqw9x 1/1 Running 0 46h 10.244.14.75 work2 <none> <none>

    $ kubectl describe pod -n prometheus node-exporter-nxz9v
    Name: node-exporter-nxz9v
    Namespace: prometheus
    Priority: 0

    Annotations: <none>
    Status: Pending
    IP: 172.31.100.38
    IPs:
    IP: 172.31.100.38
    Controlled By: DaemonSet/node-exporter
    Containers:
    node-exporter:
    Container ID:
    Image: prom/node-exporter
    Image ID:
    Port: 9100/TCP
    Host Port: 9100/TCP
    Args:
    --path.procfs
    /host/proc
    --path.sysfs
    /host/sys
    --collector.disable-defaults
    --collector.cpu
    --collector.cpufreq
    --collector.meminfo
    --collector.diskstats
    --collector.filesystem
    --collector.filefd
    --collector.loadavg
    --collector.netdev
    --collector.netstat
    --collector.nfs
    --collector.os
    --collector.stat
    --collector.time
    --collector.udp_queues
    --collector.uname
    --collector.xfs
    --collector.netclass
    --collector.vmstat
    --collector.systemd
    --collector.systemd.unit-include
    (sshd|crond|iptables|systemd-journald|kubelet|containerd).service
    State: Waiting
    Reason: ContainerCreating
    Ready: False


    Events:
    Type Reason Age From Message
    ---- ------ ---- ---- -------
    Normal Scheduled 5m24s default-scheduler Successfully assigned prometheus/node-exporter-nxz9v to master
    Warning FailedCreatePodContainer 1s (x26 over 5m24s) kubelet unable to ensure pod container exists: failed to create container for [kubepods besteffort pode526f19a-57d6-417c-ba5a-fb0f232d31c6] : dbus: connection closed by user

The error shows unable to ensure pod container exists: failed to create container for [kubepods besteffort pode526f19a-57d6-417c-ba5a-fb0f232d31c6] : dbus: connection closed by user

The kubelet log shows the same error

    $ journalctl -u kubelet
    master kubelet[1160]: E0707 14:40:55.036424 1160 qos_container_manager_linux.go:328] "Failed to update QoS cgroup configuration" err="dbus: connection closed by user"
    master kubelet[1160]: I0707 14:40:55.036455 1160 qos_container_manager_linux.go:138] "Failed to reserve QoS requests" err="dbus: connection closed by user"
    master kubelet[1160]: E0707 14:41:00.263041 1160 qos_container_manager_linux.go:328] "Failed to update QoS cgroup configuration" err="dbus: connection closed by user"
    master kubelet[1160]: E0707 14:41:00.263152 1160 pod_workers.go:190] "Error syncing pod, skipping" err="failed to ensure that the pod: 0cdaf660-bb6a-40ee-99ae-21dff3b55411 cgroups exist and are correctly applied: failed to create container for [kubepods besteffort pod0cdaf660-bb6a-40ee-99ae-21dff3b55411] : dbus: connection closed by user" pod="prometheus/node-exporter-rcd8x" podUID=0cdaf660-bb6a-40ee-99ae-21dff3b55411

Based on these logs, the cause is broken communication between the kubelet and the system dbus service; restarting the kubelet service resolves the problem.
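
As noted above, restarting the kubelet on the affected node re-establishes the connection:

$ systemctl restart kubelet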

    “cni0” already has an IP address different from

A newly created Pod in the cluster stays in ContainerCreating. Check the Pod details

    # kubectl describe pod -n cattle-system cattle-cluster-agent-7d766b5476-hsq45
    ...
    FailedCreatePodSandBox 82s (x4 over 85s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "2d58156e838349a79da91e0a6d8bccdec0e62c5f5c9ca6a1c30af6186d6253b1" network for pod "cattle-cluster-agent-7d766b5476-hsq45": networkPlugin cni failed to set up pod "cattle-cluster-agent-7d766b5476-hsq45_cattle-system" network: failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.2.1/24

The key information is failed to set bridge addr: "cni0" already has an IP address different from 10.244.2.1/24

Checking the IP addresses on the node shows that the flannel.1 subnet and the cni0 subnet do not match, possibly because flannel read an incorrect configuration. The node recovered after a reboot.

    # ip add
    4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8951 qdisc noqueue state UNKNOWN group default
    link/ether b2:b1:12:2d:8c:66 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.0/32 scope global flannel.1
    valid_lft forever preferred_lft forever
    inet6 fe80::b0b1:12ff:fe2d:8c66/64 scope link
    valid_lft forever preferred_lft forever
    5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8951 qdisc noqueue state UP group default qlen 1000
    link/ether ca:88:b1:51:0f:02 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.1/24 brd 10.244.2.255 scope global cni0
    valid_lft forever preferred_lft forever
    inet6 fe80::c888:b1ff:fe51:f02/64 scope link
    valid_lft forever preferred_lft forever
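
Besides rebooting, a commonly used alternative (an assumption, not verified in this case) is to remove the stale cni0 bridge so that the CNI plugin recreates it with the correct subnet:

# Remove the bridge carrying the wrong address
$ ip link set cni0 down
$ ip link delete cni0
# Restart kubelet so the CNI plugin recreates the bridge
$ systemctl restart kubelet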

    PodInitializing

A newly deployed Pod stays in PodInitializing

    # kubectl get pods
    ops ops-admin-5656d7bb64-mpqz5 0/2 PodInitializing 0 2m55s

Log in to the node where the Pod is scheduled and check the kubelet service logs

    # journalctl -f -u kubelet
    Sep 11 17:09:19 ops-k8s-admin kubelet[22700]: {"cniVersion":"0.3.1","hairpinMode":true,"ipMasq":false,"ipam":{"ranges":[[{"subnet":"10.244.3.0/24"}]],"routes":[{"dst":"10.244.0.0/16"}],"type":"host-local"},"isDefaultGateway":true,"isGateway":true,"mtu":8951,"name":"cbr0","type":"bridge"}E0911 17:09:19.245283 22700 kuberuntime_manager.go:864] container &Container{Name:php,Image:54.236.67.117:5000/comm/ops-php:20221205093123-,Command:[],Args:[],WorkingDir:,Ports:[]ContainerPort{ContainerPort{Name:,HostPort:0,ContainerPort:9000,Protocol:TCP,HostIP:,},},Env:[]EnvVar{},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:wwwroot,ReadOnly:false,MountPath:/home/www,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:uploads,ReadOnly:false,MountPath:/home/www/public/uploads,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:log-code,ReadOnly:false,MountPath:/home/www/storage/logs,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:kube-api-access-z4cxp,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:&Lifecycle{PostStart:&Handler{Exec:&ExecAction{Command:[/bin/sh -c php /home/www/artisan command:apollo.sync >> apollo.log;php /home/www/artisan queue:restart],},HTTPGet:nil,TCPSocket:nil,},PreStop:nil,},TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,} start failed in pod ops-admin-5656d7bb64-kvvmx_ops(a44af28c-3a39-439b-97c1-7e78b03ccd91): PostStartHookError: command '/bin/sh -c php /home/www/artisan command:apollo.sync >> apollo.log;php /home/www/artisan queue:restart' exited with 137: : Exec lifecycle hook ([/bin/sh -c php /home/www/artisan command:apollo.sync >> apollo.log;php /home/www/artisan queue:restart]) for Container "php" in Pod "ops-admin-5656d7bb64-kvvmx_ops(a44af28c-3a39-439b-97c1-7e78b03ccd91)" failed - error: command '/bin/sh -c php /home/www/artisan command:apollo.sync >> apollo.log;php /home/www/artisan queue:restart' exited with 137: , message: "队列延迟启动,因为.env配置不完善,rows=7,等待Apollo获取配置或手动完善

The key error in the log is: start failed in pod ops-admin-5656d7bb64-kvvmx_ops exited with 137: : Exec lifecycle hook ([/bin/sh -c php /home/www/artisan command:apollo.sync >> apollo.log;php /home/www/artisan queue:restart]) for Container "php" in Pod "ops-admin-5656d7bb64-kvvmx_ops(a44af28c-3a39-439b-97c1-7e78b03ccd91)" failed - error: command '/bin/sh -c php /home/www/artisan command:apollo.sync >> apollo.log;php /home/www/artisan queue:restart' exited with 137

From this we can conclude that the Pod fails to start because a command executed inside its container (the postStart lifecycle hook) exits with an error.
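
To see the failing hook without scrolling through the kubelet log, you can read it from the Pod spec (the Pod and container names are taken from the listing and log above):

$ kubectl get pod -n ops ops-admin-5656d7bb64-mpqz5 \
    -o jsonpath='{.spec.containers[?(@.name=="php")].lifecycle.postStart}'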

    Failed to update QoS cgroup configuration

Pods on one particular node in the cluster show a status of Init or PodInitializing while Pods on other nodes are fine. Log in to the affected node and check the kubelet service logs

    # journalctl -f -u kubelet
    kubelet[26451]: E1109 13:32:04.385251 26451 qos_container_manager_linux.go:328] "Failed to update QoS cgroup configuration" err="dbus: connection closed by user"
    kubelet[26451]: E1109 13:32:04.385307 26451 pod_workers.go:190] "Error syncing pod, skipping" err="failed to ensure that the pod: e31980b5-849b-4a95-b93d-983c1df31034 cgroups exist and are correctly applied: failed to create container for [kubepods besteffort pode31980b5-849b-4a95-b93d-983c1df31034] : dbus: connection closed by user" pod="6fd86565c6-4wn7k" podUID=e31980b5-849b-4a95-b93d-983c1df31034
    kubelet[26451]: E1109 13:32:04.385416 26451 pod_workers.go:190] "Error syncing pod, skipping" err="failed to ensure that the pod: f9e342d5-9f69-41bc-bb5e-df46c37b7bcd cgroups exist and are correctly applied: failed to create container for [kubepods besteffort podf9e342d5-9f69-41bc-bb5e-df46c37b7bcd] : dbus: connection closed by user" pod="5bfffd564f-sn82t" podUID=f9e342d5-9f69-41bc-bb5e-df46c37b7bcd
    kubelet[26451]: E1109 13:32:04.385777 26451 qos_container_manager_linux.go:328] "Failed to update QoS cgroup configuration" err="dbus: connection closed by user"
    kubelet[26451]: E1109 13:32:04.385962 26451 pod_workers.go:190] "Error syncing pod, skipping" err="failed to ensure that the pod: 541c88a3-cf05-40ce-b0db-80bf07f542b6 cgroups exist and are correctly applied: failed to create container for [kubepods besteffort pod541c88a3-cf05-40ce-b0db-80bf07f542b6] : dbus: connection closed by user" pod="5994c65989-zkn2w" podUID=541c88a3-cf05-40ce-b0db-80bf07f542b6
    kubelet[26451]: E1109 13:32:08.385429 26451 qos_container_manager_linux.go:328] "Failed to update QoS cgroup configuration" err="dbus: connection closed by user"
    kubelet[26451]: E1109 13:32:08.385657 26451 pod_workers.go:190] "Error syncing pod, skipping" err="failed to ensure that the pod: 255ce122-804c-4bcc-9f12-0a3abce77db5 cgroups exist and are correctly applied: failed to create container for [kubepods besteffort pod255ce122-804c-4bcc-9f12-0a3abce77db5] : dbus: connection closed by user" pod="67d89cf47f-x4wp7" podUID=255ce122-804c-4bcc-9f12-0a3abce77db5

    关键日志 "Failed to update QoS cgroup configuration" err="dbus: connection closed by user" ,根据此信息,可能是因为要与 DBus 服务通信更新容器的 QoS cgroup 配置失败。具体来说, kubelet 在尝试更新容器的 QoS cgroup 配置时遇到了 dbus: connection closed by user 错误,并且无法正确创建容器。

This is usually caused by an abnormal DBus service on the node. In this example the problem was resolved after restarting the dbus service and then the kubelet service.

    systemctl restart dbus

    systemctl restart kubelet

Pod status is Pending

The Kubernetes cluster nodes are as follows

    # kubectl get nodes
    NAME STATUS ROLES AGE VERSION
    test-k8s-master1 Ready control-plane 366d v1.24.7
    test-k8s-master2 Ready control-plane 366d v1.24.7
    test-k8s-master3 Ready control-plane 366d v1.24.7
    test-k8s-worker1 Ready <none> 366d v1.24.7
    test-k8s-worker2 Ready <none> 366d v1.24.7

The coredns Pods in the cluster are stuck in Pending. In general, a Pod in Pending means the Kubernetes scheduler has not been able to assign it to any node. The Pod status is as follows

    # kubectl get pods -n kube-system -o wide
    NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
    coredns-6d4b75cb6d-7np4c 0/1 Pending 0 68m <none> <none> <none> <none>
    coredns-6d4b75cb6d-ckl6f 0/1 Pending 0 68m <none> <none> <none> <none>
    etcd-k8s-master1 1/1 Running 1 (65d ago) 68m 172.31.26.116 k8s-master1 <none> <none>
    etcd-k8s-master2 1/1 Running 0 68m 172.31.19.164 k8s-master2 <none> <none>
    etcd-k8s-master3 1/1 Running 0 68m 172.31.21.3 k8s-master3 <none> <none>
    kube-apiserver-k8s-master1 1/1 Running 2 (65d ago) 68m 172.31.26.116 k8s-master1 <none> <none>
    kube-apiserver-k8s-master2 1/1 Running 4 (65d ago) 68m 172.31.19.164 k8s-master2 <none> <none>
    kube-apiserver-k8s-master3 1/1 Running 4 (65d ago) 68m 172.31.21.3 k8s-master3 <none> <none>
    kube-controller-manager-k8s-master1 1/1 Running 1 (41h ago) 68m 172.31.26.116 k8s-master1 <none> <none>
    kube-controller-manager-k8s-master2 1/1 Running 0 68m 172.31.19.164 k8s-master2 <none> <none>
    kube-controller-manager-k8s-master3 1/1 Running 1 (41h ago) 68m 172.31.21.3 k8s-master3 <none> <none>
    kube-proxy-84l4v 0/1 Pending 0 68m <none> <none> <none> <none>
    kube-proxy-pfwd5 0/1 Pending 0 68m <none> <none> <none> <none>
    kube-proxy-qbzq8 0/1 Pending 0 68m <none> <none> <none> <none>
    kube-proxy-qfplm 0/1 Pending 0 68m <none> <none> <none> <none>
    kube-proxy-w4t62 0/1 Pending 0 68m <none> <none> <none> <none>
    kube-scheduler-k8s-master1 1/1 Running 0 68m 172.31.26.116 k8s-master1 <none> <none>
    kube-scheduler-k8s-master2 1/1 Running 0 68m 172.31.19.164 k8s-master2 <none> <none>
    kube-scheduler-k8s-master3 1/1 Running 1 (41h ago) 68m 172.31.21.3 k8s-master3 <none> <none>
    kube-state-metrics-6d44cbdb56-kv8bm 0/1 Pending 0 68m <none> <none> <none> <none>
    metrics-server-6cd9f9f4cf-rqlzf 0/2 Pending 0 68m <none> <none> <none> <none>

Apart from kube-controller-manager, kube-scheduler, etcd, and kube-apiserver, all Pods are Pending and their Node column shows <none>, which means the cluster is not scheduling newly created Pods onto any node. Taking coredns-6d4b75cb6d-7np4c as an example, its description contains no events at all.

    # kubectl describe pod -n kube-system coredns-6d4b75cb6d-7np4c
    Name: coredns-6d4b75cb6d-7np4c
    Namespace: kube-system
    Priority: 2000000000
    Priority Class Name: system-cluster-critical
    Node: <none>
    Labels: k8s-app=kube-dns
    pod-template-hash=6d4b75cb6d
    Annotations: <none>
    Status: Pending
    IP:
    IPs: <none>
    Controlled By: ReplicaSet/coredns-6d4b75cb6d
    Containers:
    coredns:
    Image: k8s.gcr.io/coredns/coredns:v1.8.6
    Ports: 53/UDP, 53/TCP, 9153/TCP
    Host Ports: 0/UDP, 0/TCP, 0/TCP
    Args:
    -conf
    /etc/coredns/Corefile
    Limits:
    memory: 170Mi
    Requests:
    cpu: 100m
    memory: 70Mi
    Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment: <none>
    Mounts:
    /etc/coredns from config-volume (ro)
    /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4hm48 (ro)
    Volumes:
    config-volume:
    Type: ConfigMap (a volume populated by a ConfigMap)
    Name: coredns
    Optional: false
    kube-api-access-4hm48:
    Type: Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds: 3607
    ConfigMapName: kube-root-ca.crt
    ConfigMapOptional: <nil>
    DownwardAPI: true
    QoS Class: Burstable
    Node-Selectors: kubernetes.io/os=linux
    Tolerations: CriticalAddonsOnly op=Exists
    node-role.kubernetes.io/control-plane:NoSchedule
    node-role.kubernetes.io/master:NoSchedule
    node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
    node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events: <none>

Given this, the problem looks cluster-wide. Pod scheduling is handled by kube-scheduler, so check the kube-scheduler logs first

    # kubectl logs -n kube-system kube-scheduler-fm-k8s-c1-master1 | tail -n 20
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Pod: failed to list *v1.Pod: Unauthorized
    reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.PersistentVolumeClaim: Unauthorized
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolumeClaim: failed to list *v1.PersistentVolumeClaim: Unauthorized
    leaderelection.go:330] error retrieving resource lock kube-system/kube-scheduler: Unauthorized
    reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.StatefulSet: Unauthorized
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: Unauthorized
    reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.ReplicaSet: Unauthorized
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicaSet: failed to list *v1.ReplicaSet: Unauthorized
    leaderelection.go:330] error retrieving resource lock kube-system/kube-scheduler: Unauthorized
    reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.CSINode: Unauthorized
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSINode: failed to list *v1.CSINode: Unauthorized
    reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.StorageClass: Unauthorized
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: Unauthorized
    leaderelection.go:330] error retrieving resource lock kube-system/kube-scheduler: Unauthorized
    leaderelection.go:330] error retrieving resource lock kube-system/kube-scheduler: Unauthorized
    reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.Service: Unauthorized
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Unauthorized
    reflector.go:324] vendor/k8s.io/client-go/informers/factory.go:134: failed to list *v1.PersistentVolume: Unauthorized
    reflector.go:138] vendor/k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.PersistentVolume: failed to list *v1.PersistentVolume: Unauthorized
    leaderelection.go:330] error retrieving resource lock kube-system/kube-scheduler: Unauthorized

The logs show that kube-scheduler cannot fetch cluster resources, the reason being Unauthorized

Unauthorized errors are typically caused by RBAC or by certificates. Check RBAC first: kube-scheduler runs as the user system:kube-scheduler by default, so inspect the roles bound to that user and their permissions

    # kubectl describe clusterrole system:kube-scheduler
    Name: system:kube-scheduler
    Labels: kubernetes.io/bootstrapping=rbac-defaults
    Annotations: rbac.authorization.kubernetes.io/autoupdate: true
    PolicyRule:
    Resources Non-Resource URLs Resource Names Verbs
    --------- ----------------- -------------- -----
    events [] [] [create patch update]
    events.events.k8s.io [] [] [create patch update]
    bindings [] [] [create]
    endpoints [] [] [create]
    pods/binding [] [] [create]
    tokenreviews.authentication.k8s.io [] [] [create]
    subjectaccessreviews.authorization.k8s.io [] [] [create]
    leases.coordination.k8s.io [] [] [create]
    pods [] [] [delete get list watch]
    namespaces [] [] [get list watch]
    nodes [] [] [get list watch]
    persistentvolumeclaims [] [] [get list watch]
    persistentvolumes [] [] [get list watch]
    replicationcontrollers [] [] [get list watch]
    services [] [] [get list watch]
    replicasets.apps [] [] [get list watch]
    statefulsets.apps [] [] [get list watch]
    replicasets.extensions [] [] [get list watch]
    poddisruptionbudgets.policy [] [] [get list watch]
    csidrivers.storage.k8s.io [] [] [get list watch]
    csinodes.storage.k8s.io [] [] [get list watch]
    csistoragecapacities.storage.k8s.io [] [] [get list watch]
    endpoints [] [kube-scheduler] [get update]
    leases.coordination.k8s.io [] [kube-scheduler] [get update]
    pods/status [] [] [patch update]

    # kubectl describe clusterrolebinding system:kube-scheduler
    Name: system:kube-scheduler
    Labels: kubernetes.io/bootstrapping=rbac-defaults
    Annotations: rbac.authorization.kubernetes.io/autoupdate: true
    Role:
    Kind: ClusterRole
    Name: system:kube-scheduler
    Subjects:
    Kind Name Namespace
    ---- ---- ---------
    User system:kube-scheduler

The RBAC permissions show nothing unusual. Checking the cluster certificates: they have already been renewed and look fine.

    # kubeadm certs check-expiration
    [check-expiration] Reading configuration from the cluster...
    [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'

    CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
    admin.conf Dec 07, 2024 06:05 UTC 364d ca no
    apiserver Dec 07, 2024 07:17 UTC 364d ca no
    apiserver-etcd-client Dec 07, 2024 07:15 UTC 364d etcd-ca no
    apiserver-kubelet-client Dec 07, 2024 07:15 UTC 364d ca no
    controller-manager.conf Dec 07, 2024 06:05 UTC 364d ca no
    etcd-healthcheck-client Dec 07, 2024 07:15 UTC 364d etcd-ca no
    etcd-peer Dec 07, 2024 07:15 UTC 364d etcd-ca no
    etcd-server Dec 07, 2024 07:15 UTC 364d etcd-ca no
    front-proxy-client Dec 07, 2024 07:15 UTC 364d front-proxy-ca no
    scheduler.conf Dec 07, 2024 06:05 UTC 364d ca no

    CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
    ca Dec 03, 2032 09:50 UTC 8y no
    etcd-ca Dec 05, 2033 07:15 UTC 9y no
    front-proxy-ca Dec 05, 2033 07:15 UTC 9y no

Check the kube-apiserver logs: they show failures connecting to etcd (127.0.0.1:2379), mainly certificate problems, and they indicate that the certificate kube-apiserver is using has not been updated.

    clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 127.0.0.1 <nil> 0 <nil>}. Err: connection error: desc = "transport: authentication handshake failed: remote error: tls: internal error". Reconnecting...
    authentication.go:63] "Unable to authenticate the request" err="[x509: certificate has expired or is not yet valid: current time 2023-12-08T08:40:26Z is after 2023-12-06T09:58:58Z, verifying certificate SN=4790061324473323615, SKID=, AKID=08:39:2B:D0:14:00:F4:7F:3F:58:26:36:32:BA:F8:0E:0E:B4:D4:83 failed: x509: certificate has expired or is not yet valid: current time 2023-12-08T08:40:26Z is after 2023-12-06T09:58:58Z]"
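
One way to confirm the mismatch is to compare the renewed certificate files on disk with what the running kube-apiserver actually presents (a sketch; the paths follow kubeadm defaults):

# Expiry of the renewed certificate on disk
$ openssl x509 -noout -dates -in /etc/kubernetes/pki/apiserver.crt
# Expiry of the certificate the running kube-apiserver serves
$ echo | openssl s_client -connect 127.0.0.1:6443 2>/dev/null | openssl x509 -noout -dates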

Check the etcd logs: they show that a certificate file cannot be found: open /etc/kubernetes/pki/etcd/peer.crt: no such file or directory

    {"level":"warn","ts":"2023-12-08T07:54:23.780Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"172.31.21.3:30426","server-name":"","error":"open /etc/kubernetes/pki/etcd/peer.crt: no such file or directory"}
    {"level":"warn","ts":"2023-12-08T07:54:24.195Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"172.31.19.164:28650","server-name":"","error":"open /etc/kubernetes/pki/etcd/peer.crt: no such file or directory"}

In stacked high-availability mode, etcd mounts the master node's /etc/kubernetes/pki/etcd/ directory for its certificate files at startup; see the static Pod manifests under /etc/kubernetes/manifests/ for the exact configuration

    /etc/kubernetes/manifests/etcd.yaml
    apiVersion: v1
    kind: Pod
    metadata:
    labels:
    component: etcd
    tier: control-plane
    name: etcd
    namespace: kube-system
    spec:
    containers:
    - command:
    - etcd
    - --advertise-client-urls=https://172.31.19.164:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --initial-advertise-peer-urls=https://172.31.19.164:2380
    - --initial-cluster=k8s-master1=https://172.31.26.116:2380,k8s-master3=https://172.31.21.3:2380,k8s-master2=https://172.31.19.164:2380
    - --initial-cluster-state=existing
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://172.31.19.164:2379
    - --listen-metrics-urls=http://0.0.0.0:2381
    - --listen-peer-urls=https://172.31.19.164:2380
    - --name=k8s-master2
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.5.3-0
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - mountPath: /var/lib/etcd
    name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
    name: etcd-certs
    hostNetwork: true
    priorityClassName: system-node-critical
    securityContext:
    seccompProfile:
    type: RuntimeDefault
    volumes:
    - hostPath:
    path: /etc/kubernetes/pki/etcd
    type: DirectoryOrCreate
    name: etcd-certs
    - hostPath:
    path: /var/lib/etcd
    type: DirectoryOrCreate
    name: etcd-data

Logging in to the etcd container and checking the /etc/kubernetes/pki/etcd directory showed it was empty, with no files in it. The root cause was not found; after rebooting the node the mount was back to normal.
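
A sketch of that check from the node using crictl (the container id lookup is illustrative):

$ crictl ps --name etcd -q
$ crictl exec -it <container-id> ls -l /etc/kubernetes/pki/etcd
# Compare with the host path that should be mounted into the container
$ ls -l /etc/kubernetes/pki/etcd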

Network issues

Pods on the same node cannot communicate with each other

Symptom

Pods on the same node cannot communicate with each other

Troubleshooting approach

  • Check whether IP forwarding is enabled in the kernel
    $ sysctl -a | grep net.ipv4.ip_forward
    net.ipv4.ip_forward = 1
  • Check whether iptables blocks forwarding (see the iptables firewall configuration reference)
  • To determine whether iptables is the culprit, temporarily stop iptables and test again. If communication works with the firewall off, the firewall rules are the cause and need to be reviewed.
  • For deeper investigation, deploy a netshoot container and capture packets, as in the sketch below
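
A minimal netshoot sketch (the node name and peer Pod IP are placeholders):

# Run a throwaway netshoot Pod pinned to the node under investigation
$ kubectl run netshoot --rm -it --image=nicolaka/netshoot \
    --overrides='{"spec":{"nodeName":"k8s-worker1"}}' -- bash
# Inside the container, capture the traffic of interest, e.g.
#   tcpdump -i eth0 host <peer-pod-ip> and not port 22
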
Pod cannot access the external Internet

On one node, Pods cannot reach a service on an external host (port 6603/tcp). Packets were captured on the Pod, on the node's cni0 interface, on the node's egress interface eth0, and on the target service's interface. In this example the Pod IP is 10.244.4.173 and the target service IP is 50.18.6.225

Looking at the capture taken inside the Pod

The request with the Pod IP as source and the service IP's port 6603/tcp as destination is sent out, but no response to establish the TCP connection comes back. Looking at the capture on the node's cni0 interface

The same is visible here: the request leaves with the Pod IP as source and gets no TCP handshake response. Looking at the capture on the node's egress interface eth0.

Here the source IP is still the Pod's IP address, and this is where the problem lies. On a cloud VM, if a packet goes out in this form, the Internet gateway will reject it, because the gateway's NAT (which translates the VM's IP to a public IP) only knows about the IP addresses attached to the VM.

Normally, before Pod traffic reaches the node's egress interface it should be source-NATed by iptables, rewriting the packet source so it appears to come from the VM rather than the Pod. Only with the correct source IP can the packet leave the VM and reach the Internet

In this case the packet does leave the node's egress interface but is dropped at the Internet gateway, so the target service never receives the request; the capture on the target server indeed shows no requests arriving from this Pod.

The source NAT here is performed by iptables. Packets reaching the node's egress interface without being properly source-NATed may be caused by incorrect rules maintained by kube-proxy or by a misconfigured iptables rule set; see the checks sketched below. You can try to recover by restarting kube-proxy (managed here via the kubelet service) and the iptables service.
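
Before restarting anything, you can check whether the expected SNAT (MASQUERADE) rules are still present in the nat table (a sketch; the Pod subnet is taken from this example):

# Rules flannel / kube-proxy normally install for Pod egress
$ iptables -t nat -S POSTROUTING | grep -i -e masquerade -e 10.244.
# Per-rule packet counters show whether traffic actually hits the SNAT rule
$ iptables -t nat -L POSTROUTING -n -v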

    systemctl restart kubelet
    systemctl restart iptables

In this example, the Pod's connectivity was restored after restarting these two services.

Pods intermittently fail to connect to an external database

Pods in the cluster frequently time out when connecting to a database service outside the cluster

Pods cannot communicate across nodes

Environment information