
Can someone please explain how the series_outliers() Kusto function calculates the anomaly scores? I understand that it applies Tukey fences, with a min percentile and a max percentile, to a numeric array, but I would like to know the steps of the algorithm in more detail.

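For example, given this table:

let T = datatable(val:real)
[
   -3, 2.4, 15, 3.9, 5, 6, 4.5, 5.2, 3, 4, 5, 16, 7, 5, 5, 4
];
T
| summarize vals = make_list(val)        // series_outliers() expects a dynamic array
| extend scores = series_outliers(vals)  // default 10/90 percentile range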

I found Q1 = 2.4, Q3 = 15, and IQR = 12.6 with a 10%/90% quantile range. So how did it derive these anomaly scores? [-1.9040785483608571, -0.10021466044004519, 1.3361954725339347, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.6702443406674186, 0.0, 0.0, 0.0, 0.0]

1 Answer


In that function, the 10th and 90th percentiles are calculated with linear interpolation, so p10 = 2.7 and p90 = 11, which gives IQR = 8.3.
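As a rough illustration of the interpolation step, here is a minimal Python sketch. The helper `percentile_linear` is hypothetical, and the rank convention it uses (fractional position `(n - 1) * pct / 100` in the sorted data) is an assumption; it was chosen because it reproduces the p10 and p90 values quoted above, not because it is confirmed to be what Kusto does internally.

```python
def percentile_linear(values, pct):
    """Percentile via linear interpolation between the two nearest sorted values.

    Assumed convention: fractional 0-based rank = (n - 1) * pct / 100.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)                          # index of the value just below pos
    hi = min(lo + 1, len(ordered) - 1)     # index of the value just above pos
    frac = pos - lo                        # how far pos sits between lo and hi
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


# Sample data from the question
vals = [-3, 2.4, 15, 3.9, 5, 6, 4.5, 5.2, 3, 4, 5, 16, 7, 5, 5, 4]

p10 = percentile_linear(vals, 10)   # halfway between 2.4 and 3
p90 = percentile_linear(vals, 90)   # halfway between 7 and 15
iqr = p90 - p10

print(round(p10, 4), round(p90, 4), round(iqr, 4))
```

With 16 points, the 10th percentile falls at rank 1.5 and the 90th at rank 13.5, so each is the midpoint of two neighbouring sorted values.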
In addition, we normalize the score so that it is comparable to the standard Tukey's test (which uses the 25th and 75th percentiles), regardless of the specific percentiles used to calculate the IQR.
The normalization assumes a normal distribution and looks at the score k=1.5 (the common threshold for mild anomalies) obtained with p25 and p75. When p10 and p90 are used instead, we need to multiply the score by 2.772 so that the same point still gets k=1.5.
Let's see how it works for -3.0, the first point in your sample data: k=(-3-2.7)/(11-2.7)*2.772=-1.904.
I hope it's clear now.
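The calculation can be sketched in Python for the whole sample. The function `tukey_score` is a hypothetical name, and p10, p90 and the factor 2.772 are hard-coded from the answer. Two parts are assumptions inferred from the scores posted in the question rather than stated in the answer: points above p90 are measured from p90, and points between p10 and p90 score 0.

```python
P10, P90 = 2.7, 11.0   # percentiles quoted in the answer
FACTOR = 2.772         # normalization factor quoted in the answer


def tukey_score(x, p_low=P10, p_high=P90, factor=FACTOR):
    """Normalized distance of x from the nearest fence, in IQR units."""
    iqr = p_high - p_low
    if x < p_low:
        return (x - p_low) / iqr * factor    # negative score below the low fence
    if x > p_high:
        return (x - p_high) / iqr * factor   # positive score above the high fence
    return 0.0                               # assumed: no score inside the range


vals = [-3, 2.4, 15, 3.9, 5, 6, 4.5, 5.2, 3, 4, 5, 16, 7, 5, 5, 4]
scores = [tukey_score(v) for v in vals]

for v, s in zip(vals, scores):
    print(v, round(s, 3))
```

Because the factor is rounded to 2.772, the results agree with the scores in the question only to about three decimal places.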

  • Thank you!!! Could you also share the steps of how you got 2.772?
    Commented Jan 30, 2023 at 1:27
  • The score calculation is: K = (X - X(P-high)) / (X(P-high) - X(P-low)). Assuming a normal distribution, we can replace the percentile values with Z-scores. That gives two equations:
      K = (X - Z(P-high)) / (Z(P-high) - Z(P-low)) * NormalizationFactor = (X - Z(P75)) / (Z(P75) - Z(P25))
      K = (X - Z(P75)) / (Z(P75) - Z(P25)) = 1.5
    Solving them with P-low = 10th and P-high = 90th, so that Z(P-low) = -1.2968 and Z(P-high) = 1.2968, gives
      NormalizationFactor = (2.704 - 0.676) * 2 * 1.2968 / ((2.704 - 1.2968) * 2 * 0.676) = 2.764639
    Here 0.676 is Z(P75), and 2.704 = 0.676 + 1.5 * 2 * 0.676 is the X for which K = 1.5 with P25 and P75.
    – Adi E
    Commented Jan 31, 2023 at 7:29
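The arithmetic in the comment above can be replayed in a few lines of Python. This is only a sketch: the Z values are copied verbatim from the comment rather than recomputed, and the variable names are illustrative.

```python
# Constants copied from the comment (assumed as given, not recomputed)
Z75 = 0.676      # Z(P75); by symmetry Z(P25) = -Z75
Z90 = 1.2968     # Z(P-high); by symmetry Z(P-low) = -Z90
K = 1.5          # mild-outlier threshold of the standard Tukey test

# X at which the standard p25/p75 score equals K:
#   (X - Z75) / (Z75 - Z25) = K  ->  X = Z75 + K * 2 * Z75
x_at_k = Z75 + K * (2 * Z75)

# Raw score of that same X when the fences are p10/p90
raw_score = (x_at_k - Z90) / (2 * Z90)

# Factor that scales the raw p10/p90 score back up to K
normalization_factor = K / raw_score

print(round(x_at_k, 3), round(normalization_factor, 6))
```

With these inputs the factor comes out at about 2.7646, which is close to, but not identical with, the 2.772 quoted in the answer.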
