Steps
Use clickhouse-exporter to collect metrics from ClickHouse, then scrape and monitor them with Prometheus.
clickhouse-exporter documentation: https://github.com/ClickHouse/clickhouse_exporter
Build the image
git clone https://github.com/ClickHouse/clickhouse_exporter
cd clickhouse_exporter
vim Dockerfile
# Change the Go version to 1.24; one dependency requires Go 1.24 or later
FROM golang:1.24 AS BUILDER
# Add an environment variable to set GOPROXY
ENV GOPROXY=https://goproxy.io/
RUN make init
docker build . -t clickhouse-exporter:latest

Start the container
docker run -itd --name clickhouse-exporter -p 9116:9116 clickhouse-exporter:latest -scrape_uri=http://127.0.0.1:8123/
Check the metrics
curl http://127.0.0.1:9116/metrics
No metrics are returned.

The exporter reports an authentication failure.

Stop and remove the container, then start it again with a username and password:
docker run -itd --name clickhouse-exporter -p 9116:9116 -e CLICKHOUSE_USER=clickhouse -e CLICKHOUSE_PASSWORD=clickhouse clickhouse-exporter:latest -clickhouse_only -scrape_uri=http://10.0.5.140:8123/

Check the metrics again with the same curl request; the exporter should now return them.
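A quick way to confirm that the exporter can reach ClickHouse is to look for the clickhouse_up series (the one the alert rule in this note relies on) in the scrape output. The helper below is only a sketch: check_clickhouse_up is a hypothetical name, and it assumes the metric appears as a plain "clickhouse_up <value>" line, optionally with labels that contain no spaces.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# exits 0 only if a clickhouse_up sample with value 1 is present.
check_clickhouse_up() {
  awk '
    $1 ~ /^clickhouse_up(\{.*\})?$/ { if ($2 + 0 == 1) up = 1 }
    END { exit !up }
  '
}

# Example usage (the exporter address is taken from the curl command in this note):
# curl -s http://127.0.0.1:9116/metrics | check_clickhouse_up && echo "ClickHouse is reachable"
```

A non-zero exit status means the series is missing or reports 0, which is the same condition the ClickHouseDown alert fires on.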

Add monitoring
Add a Prometheus job under scrape_configs:
vim prometheus.yml
  - job_name: 'clickhouse'
    scrape_interval: 30s
    metrics_path: /metrics
    static_configs:
      - targets: ['1.2.3.4:9116']
        labels:
          group: 'clickhouse'
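Because YAML indentation mistakes are easy to make here, it may help to validate the file before reloading Prometheus. The following is a minimal sketch, assuming promtool (shipped with Prometheus) is on PATH; the wrapper name validate_prometheus_config and the localhost:9090 address are illustrative, and the reload endpoint only responds if Prometheus was started with --web.enable-lifecycle.

```shell
# Hypothetical wrapper: validate a Prometheus config file with promtool.
validate_prometheus_config() {
  config_file="${1:-prometheus.yml}"
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found in PATH" >&2
    return 127
  fi
  promtool check config "$config_file"
}

# Example usage (localhost:9090 is an assumed Prometheus address):
# validate_prometheus_config prometheus.yml && curl -X POST http://localhost:9090/-/reload
```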

Add the Grafana dashboard with ID 882.

Reference for the monitored metrics: https://developer.aliyun.com/article/1285360
Add alert rules
- name: Clickhouse_Down
  rules:
    - alert: ClickHouseDown
      expr: clickhouse_up == 0
      for: 2m
      labels:
        severity: critical
      annotations:
        summary: "ClickHouse instance {{ $labels.instance }} is down"
        description: |
          ClickHouse instance {{ $labels.instance }} (job={{ $labels.job }}) has been unreachable for more than 2 minutes.
          clickhouse_up = {{ $value | printf `%.0f` }} (0 means unreachable)
        value: "{{ $value }}"
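The rule group can be linted before Prometheus loads it. This is a sketch under a few assumptions: promtool is on PATH, the group above is saved under a top-level groups: key in a rules file (clickhouse_rules.yml is a made-up name), that file is listed under rule_files in prometheus.yml, and validate_alert_rules is a hypothetical wrapper.

```shell
# Hypothetical wrapper: lint an alerting rules file with promtool.
validate_alert_rules() {
  rules_file="$1"
  if [ -z "$rules_file" ]; then
    echo "usage: validate_alert_rules <rules-file>" >&2
    return 2
  fi
  promtool check rules "$rules_file"
}

# Example usage (clickhouse_rules.yml is an assumed file name):
# validate_alert_rules clickhouse_rules.yml
```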
