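Monitoring vSphere with vmware_exporter

Background

My local environment needs monitoring of the underlying hardware's health, so that a storage fault does not end in data loss. The data returned by the IPMI exporter is incomplete. The SNMP exporter needs a matching MIB for every hardware vendor, which I cannot control at the hardware level. Since virtualization in my environment runs on vSphere, I chose vmware_exporter to monitor vSphere.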

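Deployment

I deployed it in a k8s cluster. The monitoring stack was installed with Rancher, so the namespace is cattle-monitoring-system; change it to match your environment.

Deployment and Service

You can also create a Secret to hold the password and a ConfigMap to hold the configuration. See the official documentation: https://github.com/pryorda/vmware_exporter/tree/main/kubernetes

The YAML below puts them directly in environment variables.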

cat vmware-exporter.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: vmware-exporter
    name: vmware-exporter
  name: vmware-exporter
  namespace: cattle-monitoring-system
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: vmware-exporter
      name: vmware-exporter
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      annotations:
      labels:
        app: vmware-exporter
        name: vmware-exporter
    spec:
      containers:
      - env:
        - name: VSPHERE_USER
          value: administrator@admin
        - name: VSPHERE_PASSWORD
          value: admin
        - name: VSPHERE_HOST
          value: "192.168.1.1"
        - name: VSPHERE_IGNORE_SSL
          value: "True"
        - name: VSPHERE_SPECS_SIZE
          value: "2000"
        - name: VSPHERE_FETCH_ALARMS
          value: "True"
        image: pryorda/vmware_exporter
        imagePullPolicy: IfNotPresent
        name: vmware-exporter
        ports:
        - containerPort: 9272
          name: tcp
          protocol: TCP
        resources: {}
        securityContext:
          allowPrivilegeEscalation: true
          capabilities: {}
          privileged: false
          readOnlyRootFilesystem: false
          runAsNonRoot: false
        stdin: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        tty: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: vmware-exporter
    name: vmware-exporter
  name: vmware-exporter
  namespace: cattle-monitoring-system
spec:
  ports:
  - name: http
    port: 9272
    protocol: TCP
    targetPort: 9272
  selector:
    app: vmware-exporter
    name: vmware-exporter
  sessionAffinity: None
  type: ClusterIP
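
For the Secret and ConfigMap variant mentioned earlier, a minimal sketch could look like the following. The object names vmware-exporter-credentials and vmware-exporter-config are made up for illustration, the values are the same placeholders used in the Deployment above, and the layout may differ from the manifests in the official repository.

```yaml
# Sketch only: both object names are hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: vmware-exporter-credentials
  namespace: cattle-monitoring-system
type: Opaque
stringData:
  VSPHERE_USER: administrator@admin
  VSPHERE_PASSWORD: admin
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: vmware-exporter-config
  namespace: cattle-monitoring-system
data:
  VSPHERE_HOST: "192.168.1.1"
  VSPHERE_IGNORE_SSL: "True"
  VSPHERE_SPECS_SIZE: "2000"
  VSPHERE_FETCH_ALARMS: "True"
---
# Fragment, not a complete object: in the Deployment's container spec,
# replace the whole "env:" list with these references.
envFrom:
- secretRef:
    name: vmware-exporter-credentials
- configMapRef:
    name: vmware-exporter-config
```

With envFrom, every key in the Secret and the ConfigMap becomes an environment variable of the same name, so the exporter sees the same variables as before while the password stays out of the Deployment manifest.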

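Add the setting that collects alarm information (VSPHERE_FETCH_ALARMS in the Deployment above); the alert rule later on relies on it.

ServiceMonitor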

cat vmware-exporter-servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: vmware-exporter
    name: vmware-exporter
  name: vmware-exporter
  namespace: cattle-monitoring-system
spec:
  jobLabel: vmware_vcenter
  endpoints:
  - interval: 60s
    port: http
    path: /metrics
    # Adjust the scrape timeout as needed so that a large volume of data does not make the scrape time out and fail.
    scrapeTimeout: 30s
    scheme: http
  namespaceSelector:
    matchNames:
    - cattle-monitoring-system
  selector:
    matchLabels:
      app: vmware-exporter
      name: vmware-exporter

Testing the metrics

curl http://10.43.161.37:9272/metrics

Adding the Grafana dashboard

Import dashboard template 11243. You can also use the official dashboards: https://github.com/pryorda/vmware_exporter/tree/main/dashboards

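Adding alerts

I first tried to alert on specific metrics. For example, after unplugging one power supply (PS1), Zabbix showed the alarm immediately, while the metrics fetched by vmware_exporter still had not updated after a long wait. So the only workable approach was to use the alarm metrics, which carry vSphere's own alarm information.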


apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app: rancher-monitoring
    release: rancher-monitoring
  name: vmware-alert
  namespace: cattle-monitoring-system
spec:
  groups:
  - name: vmware-alert
    rules:
    - alert: vmware-alert
      expr: vmware_host_red_alarms == 1.0
      for: 5m
      labels:
        alert_name: vmware-alert
        alert_type: metric
        severity: critical
        comparison: equal
        expression: vmware_host_red_alarms == 1.0
      annotations:
        description: "Triggered alarms: {{ $labels.alarms }}; host: {{ $labels.host_name }}"
        summary: "Triggered alarms: {{ $labels.alarms }}; host: {{ $labels.host_name }}"
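
The same pattern could be extended to lower-severity alarms. The sketch below assumes the exporter also exposes a vmware_host_yellow_alarms gauge with the same alarms and host_name labels as vmware_host_red_alarms; confirm that on the /metrics output before relying on it. The rule name is hypothetical.

```yaml
# Sketch only: assumes vmware_host_yellow_alarms exists and carries the
# same labels as vmware_host_red_alarms. Check /metrics first.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app: rancher-monitoring
    release: rancher-monitoring
  name: vmware-alert-warning        # hypothetical name
  namespace: cattle-monitoring-system
spec:
  groups:
  - name: vmware-alert-warning
    rules:
    - alert: vmware-alert-warning
      # "> 0" also matches if the gauge ever reports more than one alarm.
      expr: vmware_host_yellow_alarms > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Yellow alarms {{ $labels.alarms }} on host {{ $labels.host_name }}"
```

The comparison here is "> 0" rather than "== 1.0" so that the rule would still match if the gauge reports a count above one; whether it actually does is something to verify against your own metrics.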

Viewing the alert
