Questions tagged [prometheus-alertmanager]

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

0 votes
0 answers
23 views

Multi-tenant Loki Ruler not sending alerts to Mimir AlertManager

I have an AlertRule set up in my Loki-Distributed instance via the Loki ruler as below: ruler: directories: fake: rules.txt: | groups: - name: mimir_loki_test ...
Golide's user avatar
  • 961
-1 votes
0 answers
19 views

Script to Deploy Grafana Dashboard with PromQL and Alert Queries [closed]

I'm looking to build a script that can deploy a Grafana dashboard with PromQL queries and alert queries all in one go. The idea is to automate the deployment process so that I can simply call the ...
Samantha V's user avatar
0 votes
0 answers
26 views

Why is Alertmanager re-sending notifications after (group_interval + repeat_interval) duration?

Alertmanager notifications are delayed. Alertmanager re-sends notifications after (group_interval + repeat_interval) time. We are expecting the notifications to be re-sent after repeat_interval time ...
Aruna Thuse's user avatar
0 votes
0 answers
24 views

how to configure Grafana alertmanager datasource with kube-prometheus-stack helm deployment

I am using the Prometheus Alertmanager that gets deployed by the kube-prometheus-stack helm deployment. Now I want to connect Grafana and tried to configure an Alertmanager datasource, which fails with ...
Herr Hempel's user avatar
0 votes
0 answers
12 views

How does repeat_interval work in alertmanager?

I am having trouble making sense of some alerting behavior I'm seeing and it seems connected to repeat_interval. Can't seem to find a simple answer to my question. Question: Will repeat_interval fire ...
shek's user avatar
  • 215
0 votes
0 answers
44 views

Add a template under alert manager receivers in Kube-prometheus-stack values.yaml

I am trying to define a template to use without having to create a full ConfigMap within the values.yaml for alert manager since it is such a small template. I saw in a few examples people using ...
Sh3perd's user avatar
  • 69
0 votes
0 answers
32 views

Spring Boot Prometheus PushGatewayManager: Unable to push metrics due to SocketTimeoutException

I'm encountering an issue with my Spring Boot application where it fails to push metrics to Prometheus Pushgateway. The issue is happening intermittently. The error message indicates a ...
Dhruv's user avatar
  • 43
0 votes
0 answers
20 views

Prometheus systemd expression based on a list of services

I have several VMs each exporting (using node-exporter) the status of its systemd services. I would like to have an alerting rule that looks something like: - alert: service_down expr: ...
Corel's user avatar
  • 623
0 votes
0 answers
21 views

Managing Time-to-Live (TTL) for Alerts in Prometheus

We're collecting EUR/USD prices every minute on Prometheus and setting up various alerts, such as percentage changes over the last 10 minutes and the last hour. The data flow works seamlessly; however,...
more's user avatar
  • 133
0 votes
0 answers
17 views

Notes "New" or "Continuing" alert for Alertmanager

I am doing a fairly simple monitoring setup based on Prometheus and Alertmanager (main branch). Receiver: Telegram chat. As Alertmanager supports Telegram as a receiver since v0.21, I would like to ...
neversure's user avatar
0 votes
0 answers
27 views

show individual container as Prometheus target

I am running a server with several Docker containers (unfortunately I cannot use podman because it is mailcow, which does not fully support podman yet) in one docker-compose file. I have added cadvisor to the ...
LeifSec's user avatar
  • 85
0 votes
0 answers
19 views

Is there a way to suppress 'resolve' messages sent to pager duty

Our PagerDuty is integrated with Alertmanager, and we want to stop a specific alert from auto-resolving. To my knowledge, the best way to do that would be changing the Alertmanager configuration so ...
Jose Antonio Vidal Sanchez's user avatar
0 votes
0 answers
16 views

How to detect and create an alarm for a Hudi job failure using hoodie metrics via Prometheus

Problem: While using multi delta streamer for Kafka ingestion, out of many tables, if one of the table ingestions fails, the job still succeeds. There is no way to check for success/failure for a particular ...
Roobal Jindal's user avatar
0 votes
1 answer
23 views

Categorisation of alerts in Prometheus and grafana

I want to categorise alerts in Prometheus and Grafana: if CPU usage is between 70 and 75 then medium, if CPU usage is between 75 and 80 then medium, and if usage is greater than 80 then critical ...
Akash Kotkar's user avatar
0 votes
0 answers
17 views

Alertmanager does not fire alarm

I use the following config for alertmanager of AWS Prometheus, and it works: alertmanager_config: | global: resolve_timeout: 60s route: receiver: default group_by: ['alertname'] ...
dn2024's user avatar
  • 1
0 votes
1 answer
63 views

How to configure Prometheus Alertmanager for alerting when any container goes down?

I am using Prometheus Alertmanager to monitor dozens of hosts and hundreds of containers on these hosts. I need to receive notifications when any container goes down. I understand from the ...
alis's user avatar
  • 1
0 votes
0 answers
25 views

Slack alerts into Webex Team

I'm trying to migrate Prometheus Alertmanager alerting from using slack_configs (ref1) into the webex_configs (ref2). Is there any possibility to adjust the message sent to Webex with structure used ...
ddano's user avatar
  • 1
0 votes
1 answer
58 views

adding node exporter info to prometheus query

I am running multiple docker stacks all providing the infrastructure for a product. Internally those docker stacks are monitored using Prometheus and alert to a teams channel. I want all docker stacks ...
jonathan-dev's user avatar
0 votes
1 answer
147 views

PromQL query to find 99th percentile request latency

I'm looking at a PromQL query we're using to fire an alert when the latency of a certain service goes beyond 400ms, but I'm unable to understand how it works, or even if it is correct at all - The ...
Soham Dixit's user avatar
0 votes
1 answer
157 views

Custom alerts sent to Alertmanager not showing up in Grafana

I am currently using a version of the kube-prometheus-stack Helm chart to deploy Grafana, Prometheus and Alertmanager. I made a Go program to send custom alerts to Alertmanager (this feature will be ...
quenting's user avatar
0 votes
1 answer
70 views

AlertManager fails to start up on kube-prometheus-9.0.3 [closed]

kubectl logs alertmanager-kube-prometheus-alertmanager-x -n monitoring Error from server (BadRequest): container "alertmanager" in pod "alertmanager-kube-prometheus-alertmanager-x" ...
Adedamola's user avatar
  • 135
1 vote
0 answers
68 views

In Prometheus, is it possible that at the same timestamp the defined PromQL is evaluated as true but the alert is not firing?

I recently encountered a problem that I have struggled with for several days... I've defined a PromQL expression in expr in the rule YAML, and for is set to 0s (firing immediately). Global evaluation_interval is 1m and query....
Conifers's user avatar
  • 370
0 votes
0 answers
87 views

Kafka too many messages alert

I have an issue with Kafka alerts. We're receiving alerts for KafkaTooManyMessages, the threshold is 50K messages. When the threshold is breached, we're receiving the alerts properly, however we get a ...
Moshiko Siyahu's user avatar
0 votes
1 answer
45 views

Compare metric to itself of past 1 hour for the past x days?

I want to compare a metric to itself, but against the average of the last x days over the same time range. Example: the current time is 14:00 and I want to compare today's average for 13:00 - 14:00 for the same range ...
Farhan Jailani's user avatar
0 votes
0 answers
226 views

Unable to fetch alert rules. Is the Prometheus data source properly configured?

Failed to load the data source configuration for Prometheus. Unable to fetch alert rules. Is the Prometheus data source properly configured? Prometheus data source is working fine. I am using AWS ...
Eranda Peiris's user avatar
0 votes
0 answers
104 views

How can I exclude any hosts in prometheus alerting rules

Here is my alert_rules.yml file groups: - name: http_alerts rules: - alert: ssl expire in 30 days expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 30 for: 5m labels: ...
Shubham Gupta's user avatar
1 vote
1 answer
186 views

Resolved Alerts resets repeat_interval in Alertmanager

I have alertmanager configuration as config: global: resolve_timeout: 5m route: group_by: ['alert_manager_group_by'] group_wait: 30s group_interval: 15m repeat_interval: 30m ...
Mahnoor Fatima's user avatar
0 votes
0 answers
112 views

Detect systemd service restart using prometheus

I want to trigger an alert if a systemd service restarts. If I am able to capture the time between a service's inactive and active states, then I think I can create such an alert. We know we can delay ...
bobs's user avatar
  • 55
1 vote
0 answers
103 views

Alertmanager template dynamic url

I'm trying to create a template that will dynamically pass the correct URL to the Slack receiver action button. In my rules, I have dashboard_url annotations, but not for all rules, only for specific ...
Disbalance's user avatar
1 vote
1 answer
157 views

prometheus alerts for true expr to be triggered after 4 hours except first time

I am trying to control the Prometheus alert so that, for the same expr, it is triggered again only after 4 hours if it has already triggered the first time. I have appname and operation configured in the configuration with ...
DeadPool's user avatar
0 votes
1 answer
177 views

Exporting Prometheus Alertmanager Alerts to CSV Using Python, Filtering by Specific Timestamps

I'm working with a monitoring platform provisioned on AWS, utilising EKS, and configured with a Prometheus data source and Alertmanager. Currently, alerts triggered by Alertmanager are sent to a Slack ...
Emmanuel Spencer Egbuniwe's user avatar
2 votes
0 answers
86 views

Doc for setting up Prometheus Alertmanager in Go App?

Functionality is broken and client signatures are vastly different after upgrading Alertmanager from 0.21.0 to 0.24.0. There seems to be no good documentation for setting it up with v0.24.0 either. Is there a doc ...
Sanjay Nag's user avatar
1 vote
0 answers
201 views

Prometheus-Alertmanager, Slack messages not fully showing

I have my Prometheus/Alertmanager (0.26.0) in Docker Compose on a VM. My problem is that most of the time alerts are not fully showing in Slack. Let's say I get one good alert and then when I run docker-compose ...
Ronald's user avatar
  • 31
0 votes
0 answers
160 views

Dynamic variables in Prometheus alert rule

I have a rule that monitors the memory usage of a container compared to the limit set to it: - alert: high_memory_usage_resource_limits expr: > ( sum by (container, pod, namespace) (...
Daniel's user avatar
  • 673
0 votes
2 answers
438 views

Is there a way to set up alert with Prometheus storage retention size

Is there a way to set a Prometheus alert on storage.tsdb.retention.size? Let's say the retention size has been hit; I want an alert sent out.
floormind's user avatar
  • 1,966
0 votes
0 answers
220 views

Configure custom Prometheus Alertmanager templates in docker-compose

Could you help me configure the Prometheus Alertmanager custom templates in docker-compose? I tried to write down the path to the templates, but it seems I am doing something wrong. Here is the code of my ...
rzs's user avatar
  • 1
2 votes
1 answer
291 views

Promql use metric value as a new label

I have this metric: task_code{pod="foobar"} 9 I am trying to write an expression to check the value of the mentioned metric to raise an alert as the value of the above metric can be any ...
Ashwin's user avatar
  • 2,905
0 votes
0 answers
85 views

Prometheus AlertManager alerts not being forwarded to Slack channels

Prometheus ver. 2.47.0 / AlertManager ver. 0.26.0 I'm trying to route AlertManager alerts to 2 different Slack channels. Prometheus config file - job_name: 'node_exporter_metrics_staging' ...
Roberto Jobet's user avatar
0 votes
0 answers
97 views

Prometheus Alerts. How to set a regular expression in the configuration?

I have configuration: config: global: resolve_timeout: 5m inhibit_rules: - equal: - namespace - alertname source_matchers: - severity = critical ...
Maksim's user avatar
  • 339
0 votes
0 answers
150 views

How to set the no_proxy params in Alertmanager?

Problem: I'm currently deploying the kube-prom-stack Helm chart (from the prometheus-community), setting up Alertmanager, and got an issue while setting the no_proxy parameter inside a receiver conf. The ...
Ottobus's user avatar
  • 75
0 votes
0 answers
138 views

How to query Kubernetes node CPU levels

I have the following PromQL query to get the CPU percentages of the nodes in my Kubernetes cluster. 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode!="idle|wait|iowait"}[5m])) * ...
Jananath Banuka's user avatar
0 votes
0 answers
119 views

Prometheus alertmanager to send alerts based on alertname to relevant receivers

Here is the alertmanager.yml file I have, it has 2 routes with routing taken care via alertname winA and winB. I am able to get alerts from winA to winA receiver but for some reason when I try to ...
Jagadish Kumar's user avatar
0 votes
0 answers
127 views

Tag Prometheus alerts with labels

I want to create a PrometheusRule which generates alerts for "pod restarts" only for namespaces that are tagged with a specific label. How can I do that? For e.g. if I mark k label namespace ...
bvnbhati's user avatar
  • 382
1 vote
0 answers
133 views

SNMP Exporter Not Displaying Correct Interface Details but when I use the snmpwalk command it does

My goal is to obtain metrics from the SNMP device, and I have successfully configured my snmp.yaml file. However, I would prefer to add numerous targets directly to snmp.yaml rather than through the ...
Goutham's user avatar
  • 17
1 vote
0 answers
391 views

Monitoring host traffic with node-exporter running in container not in local network

I'm using docker swarm and a public network for my containers. But I need to monitor traffic usage on the machine hosting the container (whichever that is). node_network_receive_bytes_total seems to ...
Mefitico's user avatar
  • 1,066
1 vote
0 answers
1k views

How to send alerts to multiple receivers

I want to add multiple receivers in Alertmanager: both Slack and mail. But only one is working at a time. Here is my current attempt at the configuration: config: global: resolve_timeout: 5m ...
AnmolDevops's user avatar
0 votes
0 answers
90 views

Prometheus query to visualize "Cluster Node CPU Utilization" - many-to-many matching not allowed

I am trying to query the Cluster Node CPU Utilization for my Kubernetes cluster using Prometheus. I am using the following query: sum by (kubernetes_node) (rate(node_cpu_seconds_total{mode!~"...
Jananath Banuka's user avatar
0 votes
0 answers
197 views

How can I get high availability for an AlertManager that is behind Google Private Service Connect on GKE?

I have two Kubernetes GCP clusters, and I am using Prometheus operator. I am using Google Private Service Connect (PSC) to connect both clusters, and this part is working. Prometheus, which is running ...
MTG's user avatar
  • 571
0 votes
0 answers
402 views

AlertManager: Loading configuration file failed

I'm trying to start up Prometheus with Alert Manager using docker-compose. I've tried all examples I've found, but the Alertmanager fails to start: docker-compose up . . . . . . "alertmanager_1 ...
Carla's user avatar
  • 3,278
0 votes
0 answers
168 views

How to solve an inhibited alert still alerting due to race condition

I currently am running into an issue where I have two alerts where one will always fire if the other is also firing. First Alert (ServerOffline): Detects based on a ping exporter if a server has gone ...
LordSherman's user avatar
