Questions tagged [prometheus-alertmanager]
The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.
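As a rough illustration of the grouping, routing, and inhibition described above, here is a minimal configuration sketch. It assumes Alertmanager 0.22 or later (for the `matchers` syntax). Every receiver name, address, label value, and URL in it is a hypothetical placeholder, not taken from any question on this page.

```yaml
# Hypothetical Alertmanager configuration sketch. All names, addresses,
# label values, and the webhook URL are placeholders.
global:
  resolve_timeout: 5m
  smtp_smarthost: "smtp.example.com:587"   # assumed SMTP relay
  smtp_from: "alertmanager@example.com"

route:
  receiver: team-email                     # default receiver if no child route matches
  group_by: ["alertname", "cluster"]       # alerts sharing these labels form one notification
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - matchers:
        - severity="critical"
      receiver: team-slack
      continue: true                       # keep evaluating sibling routes after this match
    - matchers:
        - severity="critical"
      receiver: team-email

receivers:
  - name: team-email
    email_configs:
      - to: "oncall@example.com"
  - name: team-slack
    slack_configs:
      - api_url: "https://hooks.slack.com/services/XXX/YYY/ZZZ"
        channel: "#alerts"

inhibit_rules:
  # Mute warning-level alerts for an instance while a (hypothetical)
  # ServerOffline alert is firing for the same instance.
  - source_matchers:
      - alertname="ServerOffline"
    target_matchers:
      - severity="warning"
    equal: ["instance"]
```

Without `continue: true`, routing stops at the first matching child route, so only one receiver would be notified. Listing both `email_configs` and `slack_configs` under a single receiver is another common way to reach two integrations.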
771
questions
1
vote
0
answers
1k
views
How to send alerts to multiple receivers
I want to add multiple receivers in Alertmanager: both Slack and email. But only one works at a time.
Here is my current attempt at the configuration:
config:
global:
resolve_timeout: 5m
...
0
votes
0
answers
90
views
Prometheus query to visualize "Cluster Node CPU Utilization" - many-to-many matching not allowed
I am trying to query the Cluster Node CPU Utilization for my Kubernetes cluster using Prometheus. I am using the following query:
sum by (kubernetes_node) (rate(node_cpu_seconds_total{mode!~"...
0
votes
0
answers
197
views
How can I get high availability for an AlertManager that is behind Google Private Service Connect on GKE?
I have two Kubernetes GCP clusters, and I am using Prometheus operator.
I am using Google Private Service Connect (PSC) to connect both clusters, and this part is working. Prometheus, which is running ...
0
votes
0
answers
402
views
AlertManager: Loading configuration file failed
I'm trying to start up Prometheus with Alertmanager using docker-compose.
I've tried all the examples I've found, but Alertmanager fails to start:
docker-compose up
. . . . . .
"alertmanager_1 ...
0
votes
0
answers
168
views
How to solve an inhibited alert still alerting due to race condition
I am currently running into an issue with two alerts, where one will always fire if the other is firing.
First Alert (ServerOffline): Detects, based on a ping exporter, whether a server has gone ...
0
votes
1
answer
124
views
How can I add metric labels to a Prometheus Alertmanager description?
I want to access the labels of the metric that matches the expression, to inform the team about exactly which queue exceeds the limit:
An example of the metric:
job_duration_bucket{resource_server="rs1...
1
vote
0
answers
75
views
Alert not triggered in Prometheus
I have this alert in Prometheus:
- alert: Error_pods
expr: sum by (namespace) (kube_pod_status_ready{namespace="gradl-enterprise", condition="false"}) > 0
for: 5m
And I can see ...
0
votes
0
answers
81
views
How to fire alerts on the basis of historical data in Prometheus using Alertmanager?
I have a small use case where we have alerts firing daily on the basis of high load, CPU, memory, etc.
I would like a regression-based (dynamic) alerting system in Prometheus for the scenario below:
...
2
votes
0
answers
172
views
Prometheus alerting when Kubernetes services are down
I have a few services in my Kubernetes cluster. I need to get an alert when any of the services is down.
Right now, I'm getting alerts for other scenarios such as pod not ready, CrashLoopBackOff, etc.
I'm using ...
0
votes
0
answers
221
views
Prometheus/Alertmanager: routing alerts to two different emails
I have the following Alertmanager configuration:
global:
resolve_timeout: 1m
route:
# A default receiver
receiver: "EmailNotifications-EPG"
routes:
- receiver: "...
0
votes
1
answer
49
views
Limit the number of emails fired by Alertmanager
Can someone suggest how to limit the alert emails so that only one email is sent, instead of 100 duplicates, when a particular metric condition is triggered and the alert starts to ...
0
votes
1
answer
142
views
How to implement a query which uses metrics only from the currently live pod
I have a quite common scenario where a pod was just redeployed, so for a given metric I see series from both the dead pod and the new one.
Now I would like to implement an alert, for example on the number of ...
1
vote
1
answer
498
views
<.Subcharts.alertmanager>: nil pointer evaluating interface {}.alertmanager error while installing Prometheus on a Kubernetes cluster
helm upgrade -i prometheus prometheus-community/prometheus --namespace prometheus --set alertmanager.persistentVolume.storageClass="gp2",server.persistentVolume.storageClass="gp2...
0
votes
0
answers
161
views
Prometheus query calculation and negative values for Alertmanager
I'm writing an alert rule comparing a new value with old values to get an alert if the result is more than 10%.
Something like this:
(( increase(old value[10m] offset 10m) - increase(new value[10m]) )/...
0
votes
1
answer
490
views
Out-of-date Prometheus alerts keep firing
We have an alert in our Alertmanager config:
- alert: NoMessageForTooLong
expr: >
changes(kafka_topic_partition_current_offset{
topic!="__consumer_offsets...