
Kubernetes Installation Issues Q&A


1. Calico

1.1 BIRD is not ready

kubectl -n kube-system get pod 

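calico-node-xxx stays at 0/1 and never becomes ready, reporting the error: calico/node is not ready: BIRD is not ready: BGP not established with

Solution:

By default, Calico uses first-found, which takes the NodeIP from the first interface it finds. Although interfaces such as lo and docker0 are excluded, detection can still pick the wrong interface in some cases. When that happens, specify the interface manually.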

  • List the network interfaces on the host
ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:92:3a:20 brd ff:ff:ff:ff:ff:ff
    inet 10.13.5.65/23 brd 10.13.5.255 scope global ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe92:3a20/64 scope link 
       valid_lft forever preferred_lft forever
  • Edit the Calico DaemonSet
kubectl -n kube-system edit ds calico-node

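Set interface in IP_AUTODETECTION_METHOD to the name of the network interface. The value is matched as a regular expression, so a pattern such as ens.* also works.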

spec:
  template:
    spec:
      containers:
      - env:
        - name: IP_AUTODETECTION_METHOD
          value: interface=ens160
        image: docker.io/calico/node:v3.18.2
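
The same change can be made without opening an editor. The following is a minimal sketch, assuming the DaemonSet is named calico-node in kube-system as above; the function names iface_for_ip and set_calico_interface are hypothetical helpers, not part of Calico or kubectl.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, not part of Calico or kubectl.

# Print the interface that carries a given IPv4 address.
# Reads `ip -o -4 addr show` output on stdin (one address per line).
iface_for_ip() {
  awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) print $2 }'
}

# Point Calico's IP autodetection at one interface and wait for the rollout.
set_calico_interface() {
  kubectl -n kube-system set env daemonset/calico-node \
    "IP_AUTODETECTION_METHOD=interface=$1" &&
  kubectl -n kube-system rollout status daemonset/calico-node
}

# Example (commented out so that sourcing this file changes nothing):
# iface=$(ip -o -4 addr show | iface_for_ip 10.13.5.65)
# set_calico_interface "$iface"
```

Looking the interface up by IP avoids typing the name by hand, but it only helps if every node uses the same interface name; otherwise a pattern is likely the better choice.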

2. Metrics Server

2.1 Cannot access Metrics Server

The Metrics Server service cannot be reached.

Solution:

kubectl -n kube-system edit deploy metrics-server

Add the following startup arguments:

 - --kubelet-insecure-tls
 - --kubelet-preferred-address-types=InternalIP

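These flags skip verification of the kubelet's serving certificate and make Metrics Server talk to each node through its InternalIP. Skipping certificate verification is convenient for test clusters but weakens security.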
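
The two arguments can also be appended with a JSON patch instead of an interactive edit. This is a sketch under assumptions: it takes the Metrics Server container to be the first container in the Deployment and to already have an args list, and patch_metrics_server is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: patch_metrics_server is a hypothetical helper name.
# Assumes containers[0] is metrics-server and already has an "args" list.
# Running it twice appends the flags twice, so check the Deployment first.
patch_metrics_server() {
  kubectl -n kube-system patch deployment metrics-server --type=json -p='[
    {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
     "value": "--kubelet-insecure-tls"},
    {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
     "value": "--kubelet-preferred-address-types=InternalIP"}
  ]' &&
  kubectl -n kube-system rollout status deployment/metrics-server
}

# Example (commented out so that sourcing this file changes nothing):
# patch_metrics_server
# kubectl top nodes   # should list CPU and memory once metrics are scraped
```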

3. NFS Storage

The NFS provisioner Pod logs show errors like the following:

I0916 06:12:44.587396       1 leaderelection.go:185] attempting to acquire leader lease  default/cluster.local-nfs-client-nfs-client-provisioner...
E0916 06:12:44.597222       1 event.go:259] Could not construct reference to: '&v1.Endpoints{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"cluster.local-nfs-client-nfs-client-provisioner", GenerateName:"", Namespace:"default", SelfLink:"", UID:"bd270086-5338-464b-b50d-b3ec110fc6d1", ResourceVersion:"4413847", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63767369564, loc:(*time.Location)(0x1956800)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"nfs-client-nfs-client-provisioner-5d4fd84f8b-pnwkv_1b60c33d-16b5-11ec-9098-dae3d1cb24d6\",\"leaseDurationSeconds\":15,\"acquireTime\":\"2021-09-16T06:12:44Z\",\"renewTime\":\"2021-09-16T06:12:44Z\",\"leaderTransitions\":0}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Subsets:[]v1.EndpointSubset(nil)}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'LeaderElection' 'nfs-client-nfs-client-provisioner-5d4fd84f8b-pnwkv_1b60c33d-16b5-11ec-9098-dae3d1cb24d6 became leader'

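Solution:

Starting with Kubernetes 1.20, the metadata.selfLink field is no longer populated by default, but nfs-client-provisioner still relies on it. The field therefore has to be re-enabled in kube-apiserver.

Edit /etc/kubernetes/manifests/kube-apiserver.yaml and add the line - --feature-gates=RemoveSelfLink=false to the startup arguments. Note that the RemoveSelfLink feature gate can no longer be disabled as of Kubernetes 1.24, so this workaround applies to versions 1.20 through 1.23 only.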
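
To preview the manifest change before touching the live file, something like the following can be used. It is a sketch that assumes a kubeadm-style manifest in which the command list contains a line reading exactly - kube-apiserver; the function names are hypothetical, and the script does not merge with an existing --feature-gates flag.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# Succeeds if the manifest already disables the RemoveSelfLink gate.
has_selflink_gate() {
  grep -q -- '--feature-gates=.*RemoveSelfLink=false' "$1"
}

# Print the manifest with the flag inserted right after "- kube-apiserver",
# reusing that line's indentation. Writes to stdout; the file is not modified.
# If a --feature-gates flag already exists, merge the value by hand instead.
add_selflink_gate() {
  awk '
    { print }
    /^[[:space:]]*- kube-apiserver[[:space:]]*$/ {
      sub(/- kube-apiserver.*/, "- --feature-gates=RemoveSelfLink=false")
      print
    }
  ' "$1"
}

# Example (commented out so that sourcing this file changes nothing):
# f=/etc/kubernetes/manifests/kube-apiserver.yaml
# has_selflink_gate "$f" || add_selflink_gate "$f" > /tmp/kube-apiserver.yaml
# diff "$f" /tmp/kube-apiserver.yaml
```

Once the edited manifest is saved in place, the kubelet restarts the kube-apiserver static Pod on its own.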

