route:
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 5m
  receiver: default-receiver
  routes:
  - match:
      alertname: Watchdog
    receiver: ops-team
  - match_re:
      severity: '^(warning|critical)$'
    receiver: ops-team
    routes:
    - match:
        product: foo
      receiver: foo-team
    - match:
        product: bar
        severity: critical
      receiver: bar-team
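receivers:
- name: default-receiver
  webhook_configs:
  - url: 'http://localhost:5001/alert'
    send_resolved: true
    http_config:
      # Validate this file with `amtool check-config`: `timeout` may not be
      # accepted under http_config by your Alertmanager version.
      timeout: 30s
      bearer_token_file: /etc/token
# For webhook_configs, send_resolved defaults to true; some other integrations
# (email, for example) default to false, in which case no resolved notification
# is sent. Setting it to true explicitly makes the intent clear.
- name: ops-team
  webhook_configs:
  - url: 'http://localhost:5001/alert'
    send_resolved: true
- name: foo-team
  webhook_configs:
  - url: 'http://localhost:5001/alert'
    send_resolved: true
- name: bar-team
  webhook_configs:
  - url: 'http://localhost:5001/alert'
    send_resolved: true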
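
Resolved notifications go to the same receiver that got the firing notification, so it helps to know which receiver a given alert lands on. The following is only a rough sketch of first-match routing over the route tree above, assuming no route sets `continue: true` and every route names a receiver. The `ROUTE` dict and the helper names `node_matches` and `pick_receiver` are hypothetical; they are not part of Alertmanager.

```python
import re

# The route tree from the configuration above, rewritten as plain dicts.
ROUTE = {
    "receiver": "default-receiver",
    "routes": [
        {"match": {"alertname": "Watchdog"}, "receiver": "ops-team"},
        {
            "match_re": {"severity": "^(warning|critical)$"},
            "receiver": "ops-team",
            "routes": [
                {"match": {"product": "foo"}, "receiver": "foo-team"},
                {
                    "match": {"product": "bar", "severity": "critical"},
                    "receiver": "bar-team",
                },
            ],
        },
    ],
}


def node_matches(node, labels):
    """Return True if every match / match_re entry of the node is satisfied."""
    for name, value in node.get("match", {}).items():
        if labels.get(name, "") != value:
            return False
    for name, pattern in node.get("match_re", {}).items():
        # fullmatch is used because Alertmanager anchors regex matchers.
        if re.fullmatch(pattern, labels.get(name, "")) is None:
            return False
    return True


def pick_receiver(node, labels):
    """Follow the first matching child at each level; fall back to the node itself."""
    for child in node.get("routes", []):
        if node_matches(child, labels):
            return pick_receiver(child, labels)
    return node["receiver"]
```

For example, `pick_receiver(ROUTE, {"severity": "warning", "product": "bar"})` falls back to `ops-team`, because the `bar-team` route also requires `severity: critical`.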

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: webhook

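Check the receiver configuration and make sure send_resolved is set as intended on every receiver, so that notifications go out both when an alert fires and when it resolves. In the example above, every receiver sets send_resolved to true, so a notification is also sent when an alert resolves.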
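
The receiver check can be automated once the configuration has been loaded into a dict (with a YAML parser of your choice, for example). This is a minimal sketch; the function name `receivers_without_resolved` is hypothetical, and it assumes that a webhook entry without a `send_resolved` key behaves as if it were set to true.

```python
def receivers_without_resolved(config):
    """Return the names of receivers with a webhook that suppresses resolved notifications."""
    silent = []
    for receiver in config.get("receivers", []):
        for hook in receiver.get("webhook_configs", []):
            # Assumption: a missing key means send_resolved is true for webhooks.
            if not hook.get("send_resolved", True):
                silent.append(receiver["name"])
                break  # one offending webhook is enough to flag the receiver
    return silent
```

An empty result means no webhook receiver in the loaded configuration disables resolved notifications.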
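Check that the alert endpoint is configured correctly. If the receiver uses a webhook, make sure the webhook URL is correct and that the endpoint receiving the alerts actually handles resolved notifications.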
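
To see what handling resolved notifications involves on the receiving side, here is a minimal sketch of a webhook endpoint built on Python's standard library. It relies on the Alertmanager webhook payload carrying an `alerts` list in which each entry has a `status` of `firing` or `resolved`. The names `split_by_status`, `AlertHandler` and `serve` are hypothetical, and the port is taken from the URL in the configuration above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def split_by_status(payload):
    """Group alert names from a webhook payload by their status."""
    grouped = {"firing": [], "resolved": []}
    for alert in payload.get("alerts", []):
        status = alert.get("status", "firing")
        name = alert.get("labels", {}).get("alertname", "<unnamed>")
        grouped.setdefault(status, []).append(name)
    return grouped


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the JSON body that Alertmanager posts.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        grouped = split_by_status(payload)
        print("firing:", grouped["firing"], "resolved:", grouped["resolved"])
        # Reply with 200 so the notification is not treated as failed.
        self.send_response(200)
        self.end_headers()


def serve(host="localhost", port=5001):
    """Start the endpoint; blocks until interrupted."""
    HTTPServer((host, port), AlertHandler).serve_forever()
```

Calling `serve()` and then letting an alert fire and clear should print the alert name first under `firing` and later under `resolved`. If the second line never appears, the resolved notification is not reaching the endpoint.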
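Check that the alert's resolved state is being reported correctly. Alertmanager identifies an alert by its label set, and Prometheus signals resolution through the alert's end time rather than through a dedicated label. If a label value differs between the firing alert and the resolved one, Alertmanager may fail to recognize the alert's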