All Questions
Tagged with prometheus-alertmanager metrics
13 questions
0 votes, 1 answer, 147 views
PromQL query to find 99th percentile request latency
I'm looking at a PromQL query we're using to fire an alert when the latency of a certain service goes beyond 400ms, but I'm unable to understand how it works, or even if it is correct at all -
The ...
0 votes, 1 answer, 598 views
sum(increase(my_metrics(...)[30m])).... not working in prometheus
I have configured a custom Prometheus metric named "my_metrics" in my code, which simply captures a special failure condition of my API.
After deployment, if I want to check what the ...
0 votes, 0 answers, 88 views
Spring boot : Prometheus query to get Number of Major garbage collections per min in last 10 mins
I need to get the number of major garbage collections per minute over the last 10 minutes, and if that number exceeds 2, I need to raise an Alertmanager alert.
Based on an online search, I think I need to ...
0 votes, 0 answers, 379 views
http_client_requests_seconds_count metric giving random value after each refresh of /actuator/prometheus url
In my REST API, I am calling another API from another application. To capture the total number of calls made from my API to that API, I am using the http_client_requests_seconds_count metric under /...
0 votes, 0 answers, 46 views
Prometheus with rebalancing not firing
Prometheus is running, but my rule shows as inactive: the rule's condition is not firing.
Actual situation:
A metric is being used as an example for my lab.
...
0 votes, 0 answers, 59 views
Alert with Prometheus
I have a cluster where the metrics from each namespace come with a rash, and I want to optimize a query with a regex so that the alert covers all namespaces.
This is my job configuration:
- job: my_job
---...
0 votes, 1 answer, 2k views
Prometheus - calculate percentage of 503 error count, per API using PromQL
Let's say I have the following time series, each followed by the total count for a status code -
app_interface_statusCode{instance="localhost:5555", job="prometheus", metricType="Count", ...
0 votes, 1 answer, 1k views
Prometheus alerts - Absence of succeeded metrics having a date label and a timestamp value
I have a gauge metric that, for each action (label step) and a given date (label day, in the form 'yyyy-mm-dd'), stores the timestamp of the last successful event.
Something like:
{step_name="...
0 votes, 1 answer, 185 views
Identify metrics names that are exceeding their limits
I want to catch when AWS limits are exceeded.
Currently I export metrics using https://github.com/jantman/awslimitchecker that look like:
# HELP vpc_vpcs
# TYPE vpc_vpcs gauge
vpc_vpcs{region="us-...
1 vote, 0 answers, 183 views
Prometheus metrics alerting
I have a running Prometheus instance which has its own /metrics endpoint.
This /metrics endpoint provides 700+ metrics. I'd like to monitor these metrics with the Prometheus monitoring stack.
My ...
0 votes, 1 answer, 2k views
Define absence of the alert Prometheus
I have an alert for Prometheus set up in such a way that it depends on the absence of a value for another alert:
- alert: Some_Alert
expr: |
round(some_expr) > 24
AND ALERTS{alertname=...
1 vote, 1 answer, 3k views
Firing Alerts for an activity which is supposed to happen during a particular time interval (using Prometheus Metrics and AlertManager)
I am fairly new to Prometheus Alertmanager and have a question about firing alerts only during a particular period.
I have a microservice which receives a file and does some processing on it, which is ...
4 votes, 1 answer, 14k views
route matching multiple labels
I can't get Alertmanager to send alerts based on multiple labels.
In general, sending e-mails on alerts works, but only if there is a simple match on one label. E.g. the teamB route is working. ...