1. Calico
1.1 BIRD is not ready
    kubectl -n kube-system get pod
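The calico-node-xxx pod stays at 0/1 and never becomes ready. It reports: calico/node is not ready: BIRD is not ready: BGP not established with
Solution:
By default, Calico uses the first-found method, which takes the node IP from the first valid interface it finds. Interfaces such as lo and docker0 are excluded, but detection can still land on the wrong interface. In that case, specify the interface manually. First list the node's interfaces: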
    ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:92:3a:20 brd ff:ff:ff:ff:ff:ff
        inet 10.13.5.65/23 brd 10.13.5.255 scope global ens160
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fe92:3a20/64 scope link
           valid_lft forever preferred_lft forever
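To narrow the interface list down to candidates for Calico, a small filter can help. This is a minimal sketch, assuming iproute2's one-line output format (`ip -o -4 addr show`); the function name `list_candidate_ifaces` is hypothetical and not part of Calico or iproute2.

```shell
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and print
# "<interface> <IPv4/prefix>" for every IPv4 address that is not on loopback.
list_candidate_ifaces() {
  awk '$2 != "lo" && $3 == "inet" { print $2, $4 }'
}

# Usage on a node (not executed here):
#   ip -o -4 addr show | list_candidate_ifaces
```

The filter only shortens the list. Choosing which interface carries the node's cluster traffic is still a manual decision.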
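Then edit the calico-node DaemonSet:
    kubectl -n kube-system edit ds calico-node
Set interface in IP_AUTODETECTION_METHOD to the name of the NIC. The value is treated as a regular expression, so one pattern can cover several interface names.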
    spec:
      containers:
      - env:
        - name: IP_AUTODETECTION_METHOD
          value: interface=ens160
        image: docker.io/calico/node:v3.18.2
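The same change can also be made without opening an editor. This is a minimal sketch, assuming the DaemonSet is named calico-node in the kube-system namespace as in the edit command above. The helper names `set_calico_iface` and `wait_calico_rollout` are hypothetical; `kubectl set env` and `kubectl rollout status` are standard kubectl subcommands.

```shell
# Hypothetical helper: point Calico's IP autodetection at one interface pattern.
# $1 is the interface name or pattern, e.g. ens160.
set_calico_iface() {
  kubectl -n kube-system set env daemonset/calico-node \
    "IP_AUTODETECTION_METHOD=interface=$1"
}

# Hypothetical helper: block until the DaemonSet has rolled out the new pods.
wait_calico_rollout() {
  kubectl -n kube-system rollout status daemonset/calico-node
}

# Usage (not executed here):
#   set_calico_iface ens160 && wait_calico_rollout
```

Once the rollout finishes, the calico-node pods should report 1/1 in `kubectl -n kube-system get pod` if the interface was the cause.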
2. Metrics Server
2.1 Cannot access Metrics Server
The Metrics Server service cannot be reached.
Solution:
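Edit the Deployment:
    kubectl -n kube-system edit deploy metrics-server
Modify the startup arguments:
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP
These flags skip verification of the kubelet certificate and make Metrics Server talk to each kubelet through the node's IP.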
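For scripted setups, the two arguments can be appended with a JSON patch instead of an interactive edit. This is a sketch under stated assumptions: the metrics-server container is the first container in the pod template (index 0) and already has an `args` list. The helper name `patch_metrics_server_args` is hypothetical. The patch appends rather than replaces, so check whether the Deployment already carries a `--kubelet-preferred-address-types` flag before running it.

```shell
# Hypothetical helper: append both kubelet flags to the first container's args.
# Assumes container index 0 is metrics-server and that .args already exists.
patch_metrics_server_args() {
  kubectl -n kube-system patch deployment metrics-server --type=json -p '[
    {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"},
    {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-preferred-address-types=InternalIP"}
  ]'
}

# Usage (not executed here):
#   patch_metrics_server_args
#   kubectl top node    # should return node metrics once the new pod is ready
```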
3. NFS Storage
3.1 selfLink was empty
The NFS provisioner Pod logs show errors like the following:
    I0916 06:12:44.587396 1 leaderelection.go:185] attempting to acquire leader lease default/cluster.local-nfs-client-nfs-client-provisioner...
    E0916 06:12:44.597222 1 event.go:259] Could not construct reference to: '&v1.Endpoints{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"cluster.local-nfs-client-nfs-client-provisioner", GenerateName:"", Namespace:"default", SelfLink:"", UID:"bd270086-5338-464b-b50d-b3ec110fc6d1", ResourceVersion:"4413847", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63767369564, loc:(*time.Location)(0x1956800)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"nfs-client-nfs-client-provisioner-5d4fd84f8b-pnwkv_1b60c33d-16b5-11ec-9098-dae3d1cb24d6\",\"leaseDurationSeconds\":15,\"acquireTime\":\"2021-09-16T06:12:44Z\",\"renewTime\":\"2021-09-16T06:12:44Z\",\"leaderTransitions\":0}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Subsets:[]v1.EndpointSubset(nil)}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'LeaderElection' 'nfs-client-nfs-client-provisioner-5d4fd84f8b-pnwkv_1b60c33d-16b5-11ec-9098-dae3d1cb24d6 became leader'
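Solution:
Starting with Kubernetes 1.20, the metadata.selfLink field is removed by default, but nfs-client-provisioner still relies on it. The field therefore has to be re-enabled in kube-apiserver.
Edit /etc/kubernetes/manifests/kube-apiserver.yaml and add the line - --feature-gates=RemoveSelfLink=false to the startup arguments. From Kubernetes 1.24 onward this feature gate can no longer be turned off, so the workaround applies to versions 1.20 through 1.23 only.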
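A quick check can confirm whether the manifest already carries the setting before or after the edit. This is a minimal sketch; the helper name `selflink_gate_disabled` is hypothetical, and the default path is the one given above.

```shell
# Hypothetical helper: succeed (exit 0) if the given kube-apiserver manifest
# sets RemoveSelfLink=false, alone or inside a comma-separated gate list.
selflink_gate_disabled() {
  manifest="${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}"
  grep -Eq -e '--feature-gates=([^[:space:]]*,)?RemoveSelfLink=false' "$manifest"
}

# Usage (not executed here):
#   selflink_gate_disabled && echo "selfLink re-enabled" || echo "flag missing"
```

Because kube-apiserver runs as a static pod from that directory, the kubelet recreates it after the file is saved, so no separate restart command is needed.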