Prometheus: connection refused when scraping metrics from a monitored target



2023-04-13 10:17 | Source: compiled from the web

Exposing port 10251 on the scheduler

Prometheus fails to scrape the scheduler:

Get "http://172.21.44.238:10251/metrics": dial tcp 172.21.44.238:10251: connect: connection refused
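The failure can be reproduced outside Prometheus by probing the port directly. The following is a minimal sketch, assuming bash and the coreutils timeout command are installed; probe_port is a hypothetical helper name used only for illustration.

```shell
# probe_port HOST PORT: exit 0 only if a TCP connection can be opened.
# Hypothetical helper; it relies on bash's /dev/tcp pseudo-device, so the
# connection attempt is run through an explicit "bash -c".
probe_port() {
    # Give up after 3 seconds so a filtered port does not hang the check.
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
```

With the address from the error above, `probe_port 172.21.44.238 10251 || echo "not reachable"` should report the same condition Prometheus sees, without involving the scrape configuration.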

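The article "Kubernetes 监控平面组件 scheduler controller-manager proxy kubelet etcd" (monitoring the Kubernetes control-plane components) summarizes this problem well. In addition, "Kubectl get componentstatus fails for scheduler and controller-manager" suggests a useful approach: check componentstatus (abbreviated cs) first. Checking kube-apiserver also helps.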

Check componentstatus (cs). Running kubectl get cs shows why monitoring fails:

    kubectl get cs

The output:

    NAME                 STATUS      MESSAGE                                                                                     ERROR
    controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
    scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
    etcd-0               Healthy     {"health":"true"}
    etcd-1               Healthy     {"health":"true"}
    etcd-2               Healthy     {"health":"true"}

Neither controller-manager nor scheduler in this K8s cluster reports a health status: both health checks fail with connection refused.
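On a cluster with many components, the unhealthy ones can be picked out of that output mechanically. This is a small sketch, assuming the column layout shown above (NAME first, STATUS second); unhealthy_components is a hypothetical helper name.

```shell
# unhealthy_components: read `kubectl get cs` output on stdin and print the
# name of every component whose STATUS column is not "Healthy".
# Hypothetical helper; it skips the header row and looks only at column 2.
unhealthy_components() {
    awk 'NR > 1 && $2 != "Healthy" { print $1 }'
}
```

It would be used as `kubectl get cs | unhealthy_components`, which for the output above prints controller-manager and scheduler.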

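In newer Kubernetes versions (the project I worked on ran 1.18.10), the default component configuration lets apiserver serve metrics but turns off the plain-HTTP metrics endpoint for controller-manager and scheduler. The settings live in the following files under /etc/kubernetes/manifests on the control-plane server: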

    kube-apiserver.yaml
    kube-controller-manager.yaml
    kube-scheduler.yaml

These three files determine how the three core control-plane components run:

The HTTP port of kube-scheduler is set by --port int, which defaults to 10251. When it is set to 0, no HTTP is served at all:

With the --port=0 flag, kube-scheduler disables HTTP access, so the default metrics endpoint cannot be reached:

    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: null
      labels:
        component: kube-scheduler
        tier: control-plane
      name: kube-scheduler
      namespace: kube-system
    spec:
      containers:
      - command:
        - kube-scheduler
        - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
        - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
        - --bind-address=0.0.0.0
        - --kubeconfig=/etc/kubernetes/scheduler.conf
        - --leader-elect=true
        - --port=0
        image: lank8s.cn/kube-scheduler:v1.18.10
        imagePullPolicy: IfNotPresent
    ...
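To see at a glance which value a manifest sets, the flag can be extracted from the file. This is a minimal sketch, assuming the flag is written as a list item of the form "- --port=N" as in the manifest above; manifest_port is a hypothetical helper name.

```shell
# manifest_port FILE: print the value of the --port flag found in a static pod
# manifest, or nothing if the flag is absent. Hypothetical helper.
# The pattern anchors on "- --port=" so that --secure-port is not matched.
manifest_port() {
    sed -n 's/^[[:space:]]*-[[:space:]]*--port=\([0-9][0-9]*\).*$/\1/p' "$1"
}
```

For the manifest above, `manifest_port /etc/kubernetes/manifests/kube-scheduler.yaml` prints 0, which means the HTTP port is disabled.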

I then edited /etc/kubernetes/manifests/kube-scheduler.yaml directly on the control-plane server:

    ...
        - --port=10251
    ...

After recreating the kube-scheduler pod, however, the running flag was still --port=0. The pod cannot be changed directly with kubectl -n kube-system edit pods kube-scheduler-control001 either:

Attempting to set --port=10251 with kubectl -n kube-system edit pods kube-scheduler-control001:

    ...
    spec:
      containers:
      - command:
        - kube-scheduler
        - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
        - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
        - --bind-address=0.0.0.0
        - --kubeconfig=/etc/kubernetes/scheduler.conf
        - --leader-elect=true
        - --port=0
        image: lank8s.cn/kube-scheduler:v1.18.10
        imagePullPolicy: IfNotPresent
        livenessProbe:
    ...

The edit is rejected with the following error:

    # pods "kube-scheduler-control001" was not valid:
    # * spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
    #   core.PodSpec{
    #     Volumes: []core.Volume{{Name: "kubeconfig", VolumeSource: core.VolumeSource{HostPath: &core.HostPathVolumeSource{Path: "/etc/kubernetes/scheduler.conf", Type: &"FileOrCreate"}}}},
    #     InitContainers: nil,
    #     Containers: []core.Container{
    #       {
    #         Name: "kube-scheduler",
    #         Image: "lank8s.cn/kube-scheduler:v1.18.10",
    #         Command: []string{
    #           ... // 4 identical elements
    #           "--kubeconfig=/etc/kubernetes/scheduler.conf",
    #           "--leader-elect=true",
    # +         "--port=0",
    #         },
    #         Args: nil,
    #         WorkingDir: "",
    #         ... // 17 identical fields
    #       },
    #     },
    #     EphemeralContainers: nil,
    #     RestartPolicy: "Always",
    #     ... // 24 identical fields
    #   }
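To check which --port value the running pod actually reports after a change, its command line can be read back from the API server. This is a sketch, assuming kubectl access to the cluster and that the scheduler is the first container of the pod; running_pod_port is a hypothetical helper name.

```shell
# running_pod_port POD: print the --port value from the command line of the
# first container of a pod in the kube-system namespace.
# Hypothetical helper; the jsonpath expression emits one argument per line,
# and sed keeps only the numeric value of the --port flag.
running_pod_port() {
    kubectl -n kube-system get pod "$1" \
        -o jsonpath='{range .spec.containers[0].command[*]}{@}{"\n"}{end}' |
        sed -n 's/^--port=\([0-9][0-9]*\)$/\1/p'
}
```

Calling `running_pod_port kube-scheduler-control001` and comparing the result with the value in /etc/kubernetes/manifests/kube-scheduler.yaml shows whether the manifest change has reached the running pod.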

