Questions tagged [prometheus-alertmanager]
The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.
771
questions
0
votes
1
answer
63
views
How to configure Prometheus Alertmanager for alerting when any container goes down?
I am using Prometheus Alertmanager to monitor dozens of hosts and hundreds of containers on these hosts. I need to receive notifications when any container goes down. I understand from the ...
0
votes
0
answers
25
views
Slack alerts into Webex Team
I'm trying to migrate Prometheus Alertmanager alerting from slack_configs (ref1) to webex_configs (ref2). Is there any way to adjust the message sent to Webex with the structure used ...
0
votes
1
answer
58
views
Adding node exporter info to a Prometheus query
I am running multiple Docker stacks, all providing the infrastructure for a product.
Internally, those Docker stacks are monitored using Prometheus and alert to a Teams channel.
I want all Docker stacks ...
0
votes
1
answer
147
views
PromQL query to find 99th percentile request latency
I'm looking at a PromQL query we're using to fire an alert when the latency of a certain service goes beyond 400ms, but I'm unable to understand how it works, or even if it is correct at all -
The ...
0
votes
1
answer
157
views
Custom alerts sent to Alertmanager not showing up in Grafana
I am currently using a version of the kube-prometheus-stack Helm chart to deploy Grafana, Prometheus and Alertmanager.
I made a Go program to send custom alerts to Alertmanager (this feature will be ...
0
votes
1
answer
70
views
Alertmanager fails to start on kube-prometheus-9.0.3 [closed]
kubectl logs alertmanager-kube-prometheus-alertmanager-x -n monitoring
Error from server (BadRequest): container "alertmanager" in pod "alertmanager-kube-prometheus-alertmanager-x" ...
1
vote
0
answers
68
views
In Prometheus, is it possible that at the same timestamp the defined PromQL expression evaluates to true but the alert does not fire?
I recently encountered a problem that I have struggled with for several days...
I've defined a PromQL expression in expr in a rule YAML file, and for is set to 0s (firing immediately). The global evaluation_interval is 1m and query....
0
votes
0
answers
87
views
Kafka too many messages alert
I have an issue with Kafka alerts.
We're receiving alerts for KafkaTooManyMessages; the threshold is 50K messages.
When the threshold is breached, we receive the alerts properly; however, we get a ...
0
votes
1
answer
45
views
Compare a metric to itself over the past hour for the past x days?
I want to compare a metric to its own average over the last x days, for the same time range.
Example:
Current time: 14:00
I want to compare today's average for 13:00 - 14:00 for the same range ...
0
votes
0
answers
226
views
Unable to fetch alert rules. Is the Prometheus data source properly configured?
Failed to load the data source configuration for Prometheus. Unable to fetch alert rules. Is the Prometheus data source properly configured?
Prometheus data source is working fine.
I am using AWS ...
0
votes
0
answers
104
views
How can I exclude hosts in Prometheus alerting rules?
Here is my alert_rules.yml file:
groups:
- name: http_alerts
  rules:
  - alert: ssl expire in 30 days
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30
    for: 5m
    labels:
...
1
vote
1
answer
186
views
Resolved alerts reset repeat_interval in Alertmanager
I have the following Alertmanager configuration:
config:
  global:
    resolve_timeout: 5m
  route:
    group_by: ['alert_manager_group_by']
    group_wait: 30s
    group_interval: 15m
    repeat_interval: 30m
...
0
votes
0
answers
112
views
Detect systemd service restart using Prometheus
I want to trigger an alert if a systemd service restarts. If I can capture the time between a service's inactive and active states, I think I can create such an alert. We know we can delay ...
1
vote
0
answers
103
views
Alertmanager template dynamic URL
I'm trying to create a template that will dynamically pass the correct URL to the Slack receiver action button.
In my rules, I have dashboard_url annotations, but not for all rules, only for specific
...
1
vote
1
answer
157
views
Prometheus alerts for a true expr to be triggered only after 4 hours, except the first time
I am trying to control a Prometheus alert so that, for the same expr, it is triggered again only after 4 hours if it has already triggered once. I have appname and operation configured in the configuration with ...