kubectl logs cannot show Pod logs and reports NotFound

1. Symptoms

The Pod's information can be retrieved

kubectl -n my-testns get pod my-testpod

NAME         READY   STATUS    RESTARTS   AGE
my-testpod   1/1     Running   0          2d13h

The Pod's logs cannot be viewed

kubectl -n my-testns logs my-testpod -f

Error from server (NotFound): the server could not find the requested resource ( pods/log my-testpod)

On the host where the Pod runs, the container logs can still be viewed with docker logs.

The Kubelet health check returns OK

curl -k https://x.x.x.x:10250/healthz

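Use the host's IP address here. A kubectl logs request ends up at the Kubelet API on the Pod's node (the kube-apiserver forwards it there), and that is where the container logs are read from.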
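To avoid typing the host IP by hand, the check can be wrapped in a small helper. This is only a sketch: the function name kubelet_healthz is made up for illustration, and it assumes the Kubelet listens on the default port 10250 and that the Pod's status.hostIP is the address to probe.

```shell
# Probe the Kubelet health endpoint on the node that runs a given Pod.
# kubelet_healthz is a hypothetical helper name, not part of kubectl.
kubelet_healthz() {
  local ns="$1" pod="$2" host_ip
  # status.hostIP is the IP of the node the Pod was scheduled on.
  host_ip=$(kubectl -n "$ns" get pod "$pod" -o jsonpath='{.status.hostIP}') || return 1
  # -k skips TLS verification, as in the curl command above.
  curl -sk "https://${host_ip}:10250/healthz"
}

# Usage:
#   kubelet_healthz my-testns my-testpod
```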

Cannot exec into the Pod

kubectl -n my-testns exec -it my-testpod -- bash

Error from server:

No further error details are printed.

The kube-apiserver logs contain no related errors

kubectl -n kube-system logs -l component=kube-apiserver -f

kubectl top cannot show node metrics

kubectl top node node-3

Error from server (NotFound): nodemetrics.metrics.k8s.io "node-3" not found

kubectl top node

node-03     246m         0%          4921Mi          7%
node-11     1929m        1%          152378Mi        14%
node-26     <unknown>    <unknown>   <unknown>       <unknown>
node-28     <unknown>    <unknown>   <unknown>       <unknown>
node-01     <unknown>    <unknown>   <unknown>       <unknown>

Metrics for the node that runs the Metrics Server Pod cannot be retrieved, although some of them occasionally show up.
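On a larger cluster it helps to list only the affected nodes. A minimal sketch, assuming the kubectl top node column layout shown above (node name first, CPU second); the helper name unknown_metric_nodes is hypothetical:

```shell
# Read `kubectl top node` output on stdin and print the names of nodes
# whose CPU column is reported as <unknown>.
unknown_metric_nodes() {
  awk '$2 == "<unknown>" { print $1 }'
}

# Usage:
#   kubectl top node | unknown_metric_nodes
```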

The Metrics Server container logs errors

E0221 06:27:28.020225       1 scraper.go:149] "Failed to scrape node" err="request failed, status: \"403 Forbidden\"" node="node-01"
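The nodes that Metrics Server fails to scrape can be pulled out of these log lines. The sketch below assumes the log format shown above; the helper name failed_scrape_nodes is hypothetical, and the namespace and label in the usage line are the ones used by the upstream Metrics Server manifests, which may differ in a customized install.

```shell
# Read Metrics Server logs on stdin and print each node named in a
# "Failed to scrape node" line, once per node.
failed_scrape_nodes() {
  sed -n 's/.*"Failed to scrape node".*node="\([^"]*\)".*/\1/p' | sort -u
}

# Usage (namespace and label are assumptions, see above):
#   kubectl -n kube-system logs -l k8s-app=metrics-server | failed_scrape_nodes
```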

2. Solutions

Add node-ip to the Kubelet

vim /var/lib/kubelet/kubeadm-flags.env

Add --node-ip=x.x.x.x, where x.x.x.x is the host's IP address, then restart kubelet.

This is the fix suggested in the GitHub issue, but it did not work for us.
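For anyone who still wants to try the node-ip change on several nodes, the edit can be scripted instead of done in vim. This is a sketch under assumptions: it expects the file to contain a single KUBELET_KUBEADM_ARGS="..." line, as kubeadm normally writes it, and relies on GNU sed; add_node_ip is a hypothetical helper name.

```shell
# Append --node-ip=<ip> to KUBELET_KUBEADM_ARGS in a kubeadm flags file.
# Leaves the file untouched if a --node-ip flag is already present.
add_node_ip() {
  local file="$1" ip="$2"
  if grep -q -- '--node-ip=' "$file"; then
    return 0
  fi
  # Insert the flag just before the closing quote of the variable's value.
  sed -i -E "s|^(KUBELET_KUBEADM_ARGS=\"[^\"]*)\"|\1 --node-ip=${ip}\"|" "$file"
}

# Usage (as root on the affected node):
#   add_node_ip /var/lib/kubelet/kubeadm-flags.env x.x.x.x
#   systemctl restart kubelet
```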

Delete the Metrics Server on the affected node

This fixes the problem temporarily, but Metrics Server is rescheduled onto another node, where a similar problem appears.
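The deletion can be limited to the Metrics Server Pod on one node. A sketch, assuming Metrics Server runs in kube-system with the label k8s-app=metrics-server as in the upstream manifests; adjust both if the deployment was customized. The helper name is hypothetical.

```shell
# Delete the Metrics Server Pod(s) scheduled on the given node.
# Namespace and label are assumptions taken from the upstream manifests.
delete_metrics_server_on_node() {
  local node="$1"
  kubectl -n kube-system delete pod -l k8s-app=metrics-server \
    --field-selector "spec.nodeName=${node}"
}

# Usage:
#   delete_metrics_server_on_node node-01
```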

Change the Metrics Server version

On Kubernetes 1.25.6 with Docker 20.10.7, downgrading Metrics Server from 0.7.1 to 0.6.2 restored normal behavior. This is probably a compatibility problem between newer Metrics Server versions and Docker.
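One possible way to do the downgrade is to re-apply the release manifest for the older version. This is a sketch, not necessarily how the downgrade was done here: it assumes Metrics Server was installed from the upstream components.yaml, and applying the stock manifest overwrites any custom arguments. For a Helm or otherwise customized install, change the version there instead. The helper name is hypothetical.

```shell
# Print the URL of the upstream Metrics Server release manifest for a version
# given without the leading "v", for example 0.6.2.
metrics_server_manifest_url() {
  printf 'https://github.com/kubernetes-sigs/metrics-server/releases/download/v%s/components.yaml\n' "$1"
}

# Usage:
#   kubectl apply -f "$(metrics_server_manifest_url 0.6.2)"
#   kubectl -n kube-system rollout status deployment/metrics-server
```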

3. References

https://github.com/kubernetes/kubernetes/issues/125783
