Prometheus
2.2. Deploying monitoring components with Docker
## Deploying node-exporter

```bash
docker run -d --name node-exporter --restart always \
  --net="host" \
  --pid="host" \
  -v "/:/host:ro,rslave" \
  quay.io/prometheus/node-exporter:v1.7.0 \
  --path.rootfs=/host
```

## Deploying Alertmanager

- /opt/alertmanager/alertmanager.yml

```yaml
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://127.0.0.1:5001/'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
```

- Create the container

```bash
mkdir -p /opt/alertmanager
cd /opt/alertmanager
# alertmanager.yml (shown above) must already exist in this directory
docker run -d -p 9093:9093 --name alertmanager \
  -v "$PWD/alertmanager.yml:/etc/alertmanager/alertmanager.yml" \
  --restart always prom/alertmanager:latest
```

## Deploying Prometheus

- /opt/prometheus/prometheus.yml

```yaml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.10.10:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node-export"
    static_configs:
      - targets: ["192.168.10.10:9100"]
  - job_name: "alertmanager"
    static_configs:
      - targets: ["192.168.10.10:9093"]
```

- Create the container

```bash
mkdir -p /opt/prometheus/data /opt/prometheus/rules
cd /opt/prometheus
# Mount the rules directory as a subdirectory so it does not shadow /etc/prometheus itself
docker run -d -p 9090:9090 --name prometheus \
  -v "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml" \
  -v "$PWD/data:/prometheus" \
  -v "$PWD/rules:/etc/prometheus/rules" \
  --restart always prom/prometheus:latest

# If the log shows "open /prometheus/queries.active: permission denied",
# check which user the container runs as:
docker exec prometheus cat /etc/passwd
#   nobody:x:65534:65534:nobody:/home:/bin/false
# then give that UID/GID ownership of the host directory:
chown -R 65534:65534 /opt/prometheus
```

- `prometheus.yml` is the Prometheus configuration file.
- The `data` directory stores the data that Prometheus persists.
- The `rules` directory holds rule files; reference them under `rule_files` in `prometheus.yml`.

## Deploying Grafana

```bash
docker run -d -p 3000:3000 \
  -v /opt/grafana/data:/var/lib/grafana \
  -v /opt/grafana/grafana.ini:/etc/grafana/grafana.ini \
  --name=grafana --restart always grafana/grafana:latest
```
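Once all four containers are running, a quick way to confirm they are reachable is to request each component's health or metrics endpoint on its published port. The script below is a minimal sketch, not part of the original setup: the helper names `health_url` and `check_all` are hypothetical, and the default `HOST` value is an assumption taken from the scrape targets in `prometheus.yml`. It relies on the `/-/healthy` endpoints of Prometheus and Alertmanager, the `/metrics` endpoint of node-exporter, and Grafana's `/api/health`.

```bash
#!/usr/bin/env bash
# Assumption: HOST is the Docker host address used in prometheus.yml; override as needed.
HOST="${HOST:-192.168.10.10}"

# Print the health (or metrics) URL for a component.
# $1 = component name, $2 = host. Returns 1 for an unknown component.
health_url() {
  case "$1" in
    prometheus)    echo "http://$2:9090/-/healthy" ;;
    alertmanager)  echo "http://$2:9093/-/healthy" ;;
    node-exporter) echo "http://$2:9100/metrics" ;;
    grafana)       echo "http://$2:3000/api/health" ;;
    *)             return 1 ;;
  esac
}

# Probe every component with curl and report OK/FAIL per component.
# Returns non-zero if at least one component did not answer successfully.
check_all() {
  local name url failed=0
  for name in prometheus alertmanager node-exporter grafana; do
    url="$(health_url "$name" "$HOST")"
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "OK   $name ($url)"
    else
      echo "FAIL $name ($url)"
      failed=1
    fi
  done
  return "$failed"
}
```

To use it, save the functions to a file, `source` it, and run `check_all` (or `HOST=127.0.0.1 check_all` when testing on the Docker host itself). A `FAIL` line usually means the container is not running or the port is not published; `docker logs <name>` is the next place to look.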
Nathan
Aug. 24, 2024, 3:08 p.m.