
Configuring DingTalk Alerts for Prometheus

July 29, 2020 | 移动技术网, 网络运营

1. Upload the installation packages

1. Upload the latest binary release packages and extract them
tar xf alertmanager-0.20.0-rc.0.linux-amd64.tar.gz
tar xf prometheus-webhook-dingtalk-0.3.0.linux-amd64.tar.gz
2. Rename the extracted directories
mv alertmanager-0.20.0-rc.0.linux-amd64 alertmanager
mv prometheus-webhook-dingtalk-0.3.0.linux-amd64 prometheus-webhook-dingtalk

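2. Start the DingTalk plugin

Create a custom robot in a DingTalk group and copy its webhook URL; there are plenty of guides online for this step.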

nohup ./prometheus-webhook-dingtalk --ding.profile="ops_dingding=<your DingTalk webhook URL>" &
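
The profile name passed to --ding.profile reappears in the URL that Alertmanager calls in section 3 (http://10.10.9.200:8060/dingtalk/ops_dingding/send). The sketch below only illustrates that relationship; it is not the plugin's own code. The helper names parse_profile and plugin_send_url are hypothetical, port 8060 is taken from the URL used later in this article, and the access_token value is a placeholder.

```python
from typing import Tuple


def parse_profile(profile: str) -> Tuple[str, str]:
    """Split a 'name=webhook_url' string at the FIRST '='.

    The DingTalk webhook URL itself contains '=' (access_token=...),
    so only the first one separates the profile name from the URL.
    """
    name, sep, url = profile.partition("=")
    if not sep or not name or not url:
        raise ValueError(f"expected name=url, got {profile!r}")
    return name, url


def plugin_send_url(host: str, profile_name: str, port: int = 8060) -> str:
    """Build the URL to put into Alertmanager's webhook_configs (assumed layout)."""
    return f"http://{host}:{port}/dingtalk/{profile_name}/send"


if __name__ == "__main__":
    name, _ = parse_profile(
        "ops_dingding=https://oapi.dingtalk.com/robot/send?access_token=XXXX"
    )
    print(plugin_send_url("10.10.9.200", name))
```

If the profile name and the name in the Alertmanager URL differ, the plugin has no matching profile to forward to, so the two must be kept in sync.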

3. Configure Alertmanager

# 1. Configuration file
vim alertmanager.yml
global:
  resolve_timeout: 5m
route:
  receiver: webhook
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [alertname]
  routes:
  - receiver: webhook
    group_wait: 10s
    match:
      team: node
receivers:
- name: webhook
  webhook_configs:
  # DingTalk plugin address; "ops_dingding" must match the profile name given when starting the plugin
  - url: http://10.10.9.200:8060/dingtalk/ops_dingding/send
    send_resolved: true
  
# 2. Start Alertmanager
nohup ./alertmanager --config.file=alertmanager.yml &
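
To make the routing tree in alertmanager.yml easier to follow, here is a minimal Python sketch of how a route is selected for an alert. It is a simplified model under stated assumptions, not Alertmanager's implementation: it supports only exact match blocks, treats the first matching child as final (the config sets no continue option), and lets a child inherit any setting it does not override. The ROUTE dictionary mirrors the config above; the function names are hypothetical.

```python
from typing import Dict, Optional

# Simplified copy of the `route` block from alertmanager.yml.
ROUTE = {
    "receiver": "webhook",
    "group_wait": "30s",
    "group_interval": "5m",
    "repeat_interval": "4h",
    "routes": [
        {"receiver": "webhook", "group_wait": "10s", "match": {"team": "node"}},
    ],
}


def matches(route: dict, labels: Dict[str, str]) -> bool:
    """A route matches when every key/value in its `match` block equals the alert's label."""
    return all(labels.get(k) == v for k, v in route.get("match", {}).items())


def resolve(route: dict, labels: Dict[str, str], inherited: Optional[dict] = None) -> dict:
    """Return the effective settings of the deepest matching route."""
    settings = dict(inherited or {})
    settings.update({k: v for k, v in route.items() if k not in ("routes", "match")})
    for child in route.get("routes", []):
        if matches(child, labels):
            return resolve(child, labels, settings)  # first matching child wins
    return settings


if __name__ == "__main__":
    print(resolve(ROUTE, {"alertname": "主机状态", "status": "warning"}))
    print(resolve(ROUTE, {"alertname": "主机状态", "team": "node"}))
```

In this model, an alert labeled team: node takes the child route and waits 10s before its first notification; any other alert stays on the root route and waits 30s. Both end up at the same webhook receiver.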

4. Configure Prometheus alerting rules

# 1. Define the alerting rule
vim rules.yml
groups:
    - name: test-rule
      rules:
      - alert: 主机状态   # alert name, "host status"
        expr: up == 0
        for: 2m
        labels:
          status: warning
        annotations:
          # "服务器关闭" means "server is down"
          summary: "{{$labels.instance}}:服务器关闭"
          description: "{{$labels.instance}}:服务器关闭"

# 2. Update the Prometheus configuration so that alerting takes effect
vim prometheus.yml
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["10.10.9.200:9093"] # Alertmanager address
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules.yml" # alerting rules file
  # - "second_rules.yml"

# 3. Restart Prometheus
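
After the restart, it can help to confirm that Prometheus actually loaded rules.yml before testing anything else. The following is a minimal sketch that reads the rule list from the Prometheus HTTP API endpoint /api/v1/rules. The base URL http://localhost:9090 is an assumption (the article does not state where Prometheus listens), and the function names are hypothetical.

```python
import json
import urllib.request


def alerting_rules(payload: dict) -> list:
    """Extract (group name, alert name, expression) from a /api/v1/rules response."""
    found = []
    for group in payload.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("type") == "alerting":
                found.append((group["name"], rule["name"], rule["query"]))
    return found


def fetch_rules(base_url: str) -> dict:
    """GET the currently loaded rules from a running Prometheus server."""
    with urllib.request.urlopen(f"{base_url}/api/v1/rules", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Assumed address; replace with the host and port of your Prometheus server.
    for group, name, expr in alerting_rules(fetch_rules("http://localhost:9090")):
        print(f"{group}: {name} -> {expr}")
```

If the rule from rules.yml does not appear in the output, the rule_files entry or the rule file itself is the first place to look.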

5. Verify that the configuration works

1. Stop the monitored node exporter
2. Alert message received in DingTalk
[FIRING:1] 主机状态
Labels

alertname: 主机状态
instance: linux
job: node_export
status: warning
Annotations

description: linux:服务器关闭
summary: linux:服务器关闭
Source: http://test:9090/graph?g0.expr=up+%3D%3D+0&g0.tab=1
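
Stopping an exporter tests the whole chain, but it also means waiting for the rule's for duration. As an alternative check of just the Alertmanager-to-DingTalk leg, a synthetic alert can be posted to Alertmanager's v2 API (POST /api/v2/alerts). This is a minimal sketch: the alert name DingTalkTest, the annotation text, and the function names are made up for illustration, and the Alertmanager address is the one used in prometheus.yml above.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamp in UTC


def build_test_alert(instance: str, minutes: int = 5) -> list:
    """Build a one-element alert list that expires after `minutes`."""
    now = datetime.now(timezone.utc)
    return [{
        "labels": {
            "alertname": "DingTalkTest",  # hypothetical name, not a rule from rules.yml
            "instance": instance,
            "status": "warning",
        },
        "annotations": {
            "summary": f"{instance}: test alert",
            "description": f"{instance}: synthetic alert to verify DingTalk delivery",
        },
        "startsAt": now.strftime(TIME_FORMAT),
        "endsAt": (now + timedelta(minutes=minutes)).strftime(TIME_FORMAT),
    }]


def send_alerts(alertmanager_url: str, alerts: list) -> int:
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        f"{alertmanager_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


if __name__ == "__main__":
    print(send_alerts("http://10.10.9.200:9093", build_test_alert("linux")))
```

With the routing shown earlier, the DingTalk message should show up after the root route's group_wait of 30s, since this test alert carries no team: node label.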

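Prometheus alert states
· Inactive: nothing is happening; the rule expression returns no result.
· Pending: the threshold has been crossed, but the alert has not yet been active for the required duration (the for field of the rule).
· Firing: the threshold has been crossed and the for duration has been met. The alert is sent to the notification pipeline, processed, and delivered to the receivers. Requiring several consecutive failed evaluations before alerting cuts down on the number of notifications.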
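
The three states can be modeled with a few lines of Python. This is a simplified sketch, not Prometheus code: it assumes a fixed evaluation interval of 60 seconds (the article does not state the interval in use) and represents each evaluation as a boolean saying whether the expression up == 0 returned a result. The function name alert_states is hypothetical.

```python
from typing import Iterable, List


def alert_states(samples: Iterable[bool], for_seconds: int, interval_seconds: int) -> List[str]:
    """Return the alert state after each rule evaluation.

    samples: one boolean per evaluation, True if the expression matched.
    """
    states = []
    active_since = None  # time of the first evaluation in the current active streak
    for i, active in enumerate(samples):
        now = i * interval_seconds
        if not active:
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        # The alert fires once it has been continuously active for the `for` duration.
        if now - active_since >= for_seconds:
            states.append("firing")
        else:
            states.append("pending")
    return states


if __name__ == "__main__":
    # for: 2m, evaluated every 60 s (assumed interval)
    print(alert_states([False, True, True, True, False, True], 120, 60))
    # prints ['inactive', 'pending', 'pending', 'firing', 'inactive', 'pending']
```

A single recovered evaluation resets the streak, which is why a target that flaps faster than the for duration never reaches firing in this model.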

Original article: https://blog.csdn.net/weixin_43999932/article/details/107608046
