
Questions tagged [outliers]

An outlier is an observation that appears to be unusual or not well described relative to a simple characterization of a dataset.

0 votes
1 answer
21 views

Fast way of detecting outliers in 2D space

I have hundreds of millions of point clouds like the following: I want to remove outliers 1, 2, 4, 5, 6, 7. The safest bet is to build a minimum spanning tree connecting all the points and remove ...
user2961927's user avatar
  • 1,650
0 votes
1 answer
57 views

Calculation of outlier score in series_outlier method

I want to implement the series_outlier method in Python and used the following code: import pandas as pd import numpy as np from scipy.stats import norm # Load the data into a DataFrame data = { ...
New2015's user avatar
  • 29
0 votes
0 answers
19 views

Determining the p-value of a test statistic, which is not distributed according to a commonly known distribution under the null hypothesis [migrated]

Currently I am working in R on a project that aims to identify Dragon King events (massive outliers) in large datasets. These outliers appear for example in the city sizes in England, where London is ...
user25936873's user avatar
0 votes
1 answer
37 views

Differences between manually and ggplot-calculated boxplot statistics / outlier elimination

Note: Question edited to simplify the data. I have a few datasets combined in a dataframe, which I want to eliminate outliers from. When trying different ways to calculate upper and lower thresholds I ...
Sulfatide's user avatar
1 vote
1 answer
29 views

How to remove y-outliers from x-y Scatter plot in Python?

I am plotting a dataframe, df, containing x and y in a scatter plot. Clearly, in many cases, for each x value, y-values may be scattered. I want to remove y outliers for each x. This is different from ...
vivek777's user avatar
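
One possible reading of the question above is a per-x filter: group the y-values by x and drop those outside Tukey-style IQR fences computed within each group. The sketch below is only an illustration under that assumption; the function name `remove_y_outliers`, the multiplier `k=1.5`, the rule of keeping groups with fewer than four values, and the sample points are all hypothetical and do not come from the question. It uses only the standard library; with a pandas DataFrame the same idea would normally be expressed with a group-by on x.

```python
from collections import defaultdict
from statistics import quantiles


def remove_y_outliers(points, k=1.5):
    """Drop (x, y) pairs whose y lies outside the IQR fences of its own x group."""
    # Collect the y-values observed for each x.
    groups = defaultdict(list)
    for x, y in points:
        groups[x].append(y)

    # Compute lower/upper fences per x group.
    fences = {}
    for x, ys in groups.items():
        if len(ys) < 4:
            # Assumption: too few values to estimate quartiles, so keep them all.
            fences[x] = (float("-inf"), float("inf"))
            continue
        q1, _, q3 = quantiles(ys, n=4, method="inclusive")
        iqr = q3 - q1
        fences[x] = (q1 - k * iqr, q3 + k * iqr)

    # Keep only points that fall inside the fences of their group.
    return [(x, y) for x, y in points if fences[x][0] <= y <= fences[x][1]]


# Made-up data: one obvious y-outlier for each x value.
points = [
    (1, 10), (1, 11), (1, 12), (1, 11), (1, 50),
    (2, 5), (2, 6), (2, 5), (2, 6), (2, -40),
]
cleaned = remove_y_outliers(points)
```

Whether 1.5 is a sensible multiplier depends on how scattered the y-values are for each x, so it is worth treating as a tuning parameter.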
0 votes
2 answers
60 views

Rendering outliers in Gnuplot's box-and-whiskers

A boxplot in Gnuplot v5.4.2 renders as shown below. Is there a way to "project" all outliers belonging to the same box to the same x-position above / below the box? The drawing is somewhat ...
emacs drives me nuts's user avatar
0 votes
0 answers
31 views

Asset price jumps: robust ARMA-GARCH estimation

I want to replicate the article "Testing for jumps in conditionally Gaussian ARMA–GARCH models, a robust approach". To estimate the ARMA-GARCH model, the article follows these steps ...
Dav00d Darigh's user avatar
2 votes
1 answer
100 views

Confused with Isolation Forest

Let's say I have an anomaly detection (unsupervised learning) dataset with 10 observations (two features). The dataset is shown below: After executing the model, the results are as follows (anomalies ...
Bits's user avatar
  • 309
0 votes
0 answers
39 views

How to identify outliers for two categorical values?

I am a beginner working on a data analysis project and would like to know how to identify outliers in a dataset like this. I have a "Satisfaction" column which refers to the overall experience ...
heyhey asea's user avatar
0 votes
1 answer
50 views

Python data filtering to remove outliers around a density plot

Referring to the plot below, I would like to remove all the outliers outside the density region marked by the black oval. I can use simple horizontal filters, like -4 < data < 4. But ...
Mainland's user avatar
  • 4,514
0 votes
1 answer
74 views

Ignore outliers in box-violin plot in ggplot2

I'm trying to plot a box-violin plot in ggplot2 but I can't seem to find a way to ignore outliers in geom_violin which in geom_boxplot is taken care of by outlier.shape = NA. As a result the tails of ...
accibio's user avatar
  • 533
2 votes
2 answers
111 views

Intensity outliers in 2D plot (max or min local peaks with high intensity)

I wonder what kind of method is best for spotting outliers in the z value of a 2D plot. For example, I have measurements of x and y values, both in the range of 1 to 16 with a step of 1. Next I calculate how many ...
Zoomman's user avatar
  • 57
1 vote
1 answer
32 views

Finding outliers in a small sized vector

Let us say I have an n element vector consisting of certain measurements with spikes that need to be located (n is small, say 5-7). My task is to locate all elements in the vector that are "much ...
user2751530's user avatar
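
For very small vectors like the one described above, the mean and standard deviation are themselves distorted by a spike, so a median-based score is one commonly used alternative. The sketch below applies the modified z-score (median and median absolute deviation, with the conventional 0.6745 scaling and 3.5 cutoff). It is an illustration only, not the asker's method: the function name `mad_outliers`, the handling of a zero MAD, and the sample values are hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that a single spike barely moves.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Assumption: when at least half the values equal the median,
        # flag everything that differs from it.
        return [i for i, v in enumerate(values) if v != med]
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Made-up measurements with a single spike at index 3.
values = [1.0, 1.2, 0.9, 9.5, 1.1, 1.0]
spikes = mad_outliers(values)
```

With n around 5 to 7, any cutoff is a rough heuristic, so the threshold is better treated as something to tune against known examples than as a fixed rule.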
0 votes
0 answers
37 views

Error: can't extract column with 'col', subscript 'col' must be size 1, not 253

I am running the following code to remove outliers from my data: detect_outlier <- function(x) { Quantile1 <- quantile(x, probs=.25) Quantile3 <- quantile(x, probs=.75) x > Quantile3 + (...
Captain Beaky's user avatar
0 votes
1 answer
48 views

Need to modify identify_outliers function in rstatix, but modified function is throwing a strange error

I am trying to modify the identify_outliers function in rstatix package to allow for any coefficient when determining outliers in the is_outlier function. Here is the code for identify_outliers: ...
Phaaltu Waaltu's user avatar
