Monitoring
kubectl top node does not work. Looks like a Heapster problem
I have a new k8s cluster on GKE.
Whenever I run
kubectl top node gke-data-custom-vm-6-25-0cbae9b9-hrkc
I get this error:
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
At the same time, I do have this service:
> kubectl -n kube-system get services
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
default-http-backend   NodePort    10.11.241.20    <none>        80:32688/TCP    59d
heapster               ClusterIP   10.11.245.182   <none>        80/TCP          59d
kube-dns               ClusterIP   10.11.240.10    <none>        53/UDP,53/TCP   59d
metrics-server         ClusterIP   10.11.249.26    <none>        443/TCP         59d
A pod with Heapster is also running (I can see that it has been restarted many times):
kubectl -n kube-system get pods
NAME                                     READY   STATUS    RESTARTS   AGE
event-exporter-v0.2.3-85644fcdf-kwd6g    2/2     Running   0          16d
fluentd-gcp-scaler-8b674f786-dbrcr       1/1     Running   0          16d
fluentd-gcp-v3.2.0-2fqgl                 2/2     Running   0          17d
fluentd-gcp-v3.2.0-47586                 2/2     Running   0          17d
fluentd-gcp-v3.2.0-552xm                 2/2     Running   0          16d
heapster-v1.6.0-beta.1-fdc7fd478-8s998   3/3     Running   73         16d
However, I can see some errors in the logs of the heapster-nanny container:
> kubectl logs -n kube-system --tail 10 -f po/heapster-v1.6.0-beta.1-fdc7fd478-8s998 -c heapster-nanny
ERROR: logging before flag.Parse: E0418 23:30:10.075539 1 nanny_lib.go:95] Error while querying apiserver for resources: Get https://10.11.240.1:443/api/v1/namespaces/kube-system/pods/heapster-v1.6.0-beta.1-fdc7fd478-8s998: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:10.971230 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:11.972337 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:12.973637 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:13.975024 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:14.976582 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:16.063760 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: getsockopt: connection refused
ERROR: logging before flag.Parse: E0418 23:30:27.065693 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: net/http: TLS handshake timeout
ERROR: logging before flag.Parse: E0418 23:30:30.077159 1 nanny_lib.go:95] Error while querying apiserver for resources: Get https://10.11.240.1:443/api/v1/namespaces/kube-system/pods/heapster-v1.6.0-beta.1-fdc7fd478-8s998: net/http: TLS handshake timeout
ERROR: logging before flag.Parse: E0418 23:30:59.778560 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.11.240.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.11.240.1:443: i/o timeout
and in the heapster container:
I0423 07:02:10.765134 1 heapster.go:113] Starting heapster on port 8082
W0423 07:16:27.975467 1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:16:43.064110 1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time
W0423 07:20:36.875359 1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:20:44.383790 1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time
W0423 07:22:29.683060 1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:22:40.278962 1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time
W0423 07:31:27.072711 1 manager.go:152] Failed to get all responses in time (got 2/3)
W0423 07:31:54.580031 1 manager.go:107] Failed to get kubelet_summary:10.128.0.49:10255 response in time
How can I solve this problem?
Is there any other information I should provide?
Heapster deprecation
Heapster is a deprecated project, and you may run into problems when running it on recent Kubernetes versions.
See the Heapster deprecation timeline:
| Kubernetes Release | Action              | Policy/Support                                                              |
|--------------------|---------------------|-----------------------------------------------------------------------------|
| Kubernetes 1.11    | Initial Deprecation | No new features or sinks are added. Bugfixes may be made.                   |
| Kubernetes 1.12    | Setup Removal       | The option to install Heapster via the Kubernetes setup script is removed.  |
| Kubernetes 1.13    | Removal             | No new bugfixes will be made. Move to kubernetes-retired organization.      |
Starting with Kubernetes v1.10, kubectl top relies on metrics-server by default.
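Since kubectl top depends on the metrics API, it can help to confirm that the cluster actually serves it. The following is a minimal sketch, not a definitive check: the helper name metrics_api_available is made up for illustration, and it assumes the APIService is registered under the name v1beta1.metrics.k8s.io, which may differ on your cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the aggregated metrics API is
# registered and reports the "Available" condition as True.
# Assumption: the APIService is named v1beta1.metrics.k8s.io.
metrics_api_available() {
    status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
        -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null)
    [ "$status" = "True" ]
}

# Usage (requires access to a cluster):
#   if metrics_api_available; then
#       kubectl top node
#   else
#       echo "metrics.k8s.io is not available"
#   fi
```

If the helper reports the API as unavailable, the metrics-server pod in kube-system is probably the first place to look.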
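Support metrics API in kubectl top commands. (#56206, @brancz) This PR implements support for the kubectl top commands to use the metrics-server as an aggregated API, instead of requesting the metrics from heapster directly. If the metrics.k8s.io API is not served by the apiserver, then this still falls back to the previous behavior.
What you should do: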
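Since Heapster is deprecated and you already have metrics-server deployed, the best option is to use kubectl version v1.10 or later, because it fetches metrics from metrics-server. However, keep the kubectl version skew policy in mind: kubectl is supported within one minor version (older or newer) of kube-apiserver. Check your kube-apiserver version before choosing your kubectl version.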
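The one-minor-version rule can be sketched as a small shell check. The function names minor_of and skew_ok and the version strings below are hypothetical, chosen only for illustration, and the sketch assumes both versions share the same major version.

```shell
#!/bin/sh
# Hypothetical helper: print the minor version of a string such as
# "v1.12.7-gke.10" (prints 12).
minor_of() {
    v=${1#v}          # drop a leading "v", if any
    rest=${v#*.}      # drop the major version and the first dot
    printf '%s\n' "${rest%%.*}" | tr -cd '0-9'   # keep only the digits of the minor part
}

# Hypothetical helper: succeeds (exit 0) if the client minor version is
# within one minor version of the server's.
# Assumption: client and server have the same major version.
skew_ok() {
    client_minor=$(minor_of "$1")
    server_minor=$(minor_of "$2")
    diff=$((client_minor - server_minor))
    [ "$diff" -ge -1 ] && [ "$diff" -le 1 ]
}

# Example with made-up versions: client v1.13.4 against server v1.12.7-gke.10.
if skew_ok "v1.13.4" "v1.12.7-gke.10"; then
    echo "supported"
else
    echo "outside the supported skew"
fi
```

Running kubectl version against the cluster reports both the client and the server version, which can be passed to skew_ok as the two arguments.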