All Questions
Tagged with prometheus-alertmanager promql
70
questions
0
votes
1
answer
23
views
Categorisation of alerts in Prometheus and Grafana
I want to categorise alerts in Prometheus and Grafana: if CPU usage is between 70 and 75 then medium, if CPU usage is between 75 and 80 then medium, and if usage is greater than 80 then critical ...
0
votes
1
answer
147
views
PromQL query to find 99th percentile request latency
I'm looking at a PromQL query we're using to fire an alert when the latency of a certain service goes beyond 400ms, but I'm unable to understand how it works, or even if it is correct at all -
The ...
1
vote
0
answers
68
views
In Prometheus, is it possible that at the same timestamp the defined PromQL is evaluated as true but the alert is not firing?
I recently encountered a problem that I have been struggling with for several days...
I've defined a PromQL expression in expr in a rule YAML file, with for set to 0s (firing immediately). The global evaluation_interval is 1m and query....
0
votes
1
answer
45
views
Compare metric to itself of past 1 hour for the past x days?
I want to compare a metric to itself, but against its average over the last x days for the same time range.
Example:
Current time: 14:00
I want to compare today's average for 13:00 - 14:00 for the same range ...
2
votes
1
answer
291
views
Promql use metric value as a new label
I have this metric: task_code{pod="foobar"} 9
I am trying to write an expression to check the value of the mentioned metric to raise an alert as the value of the above metric can be any ...
1
vote
1
answer
221
views
Combining alert rules for all metrics
I have an alert.rules.yml file which looks like this:
groups:
  - name: my-alert-rules
    rules:
      - alert: FileCountRANRRCTooHigh
        expr: ((file_count_RAN_RRC - file_count_RAN_RRC offset 24h)...
1
vote
1
answer
87
views
Get file count in the last day
I have a Prometheus metric file_count_RAN_RRC{folder="RAN_RRC", instance="jobserver:9669", job="files_monitor"} which gives me the count of files that were created in a ...
1
vote
1
answer
422
views
promql: how to do comparison of two vectors with different labels and values number?
Real-life example: I'm trying to write a query for Alertmanager that checks whether a Kafka consumer group is lagging, e.g. when the lag metric is bigger than a certain threshold taken from another metric.
Right now my ...
0
votes
0
answers
825
views
Prometheus query alert if average memory/CPU utilization is trending up
The actual requirement: alert if CPU/memory usage increases (trends up) continuously due to a memory leak, since eventually the container may crash and be recreated once memory/CPU is exhausted.
Before ...
1
vote
0
answers
421
views
Prometheus alerting on same rule with different labels
Please consider this:
Let's say I have this rule:
- alert: EndpointIsDown
  expr:
    probe_success == 0
  for: 7m
  labels:
    severity: critical
  annotations:
    summary: "{{ $labels....
0
votes
1
answer
598
views
sum(increase(my_metrics(...)[30m])).... not working in prometheus
I have configured a custom Prometheus metric named "my_metrics" in my code, which simply captures a special failure condition of my API.
After deployment, if I want to check what the ...
0
votes
0
answers
88
views
Spring boot : Prometheus query to get Number of Major garbage collections per min in last 10 mins
I need to get the number of major garbage collections per minute over the last 10 minutes, and if that number exceeds 2, I need to generate an Alertmanager alert.
Based on online search, I think I need to ...
0
votes
1
answer
2k
views
join prometheus queries while keeping data from the left side
I want to build a PromQL query that joins two vectors: one with some metrics and another that is informational. The caveat is that the info vector doesn't have all the information for the "joining label", ...
1
vote
2
answers
989
views
PromQL : compare metric value with its label's value
I have this metric: my_metric{expected_value="123"} 123
Using Prometheus, how can I create an alert that triggers when the value differs from the value of the expected_value label?
0
votes
0
answers
255
views
Prometheus - Alert when count increases steadily
I'm trying to create an alert for the cases below.
A. The metric is named docker_push_messages_api_failed_total and is of type counter.
Alert when the counter rises continuously or steadily over an hour.
Came up ...