Questions tagged [imputation]

Missing data imputation is the process of replacing missing data with substituted 'best guess' values. Because missing data can complicate analysis and can lead to missing-data bias, imputation is seen as a way to avoid the problems associated with listwise deletion (dropping every observation that has any missing value).
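
As a rough illustration of the contrast drawn in the tag description, the sketch below compares listwise deletion with a very simple form of imputation (column means). The toy `rows` data and the helper names `listwise_delete` and `mean_impute` are hypothetical, made up for this example; mean imputation is only one of many possible 'best guess' strategies and is not necessarily a good one.

```python
from statistics import mean

# Hypothetical toy data: None marks a missing value.
rows = [
    {"age": 30, "income": 50000},
    {"age": None, "income": 62000},
    {"age": 40, "income": None},
    {"age": 50, "income": 58000},
]


def listwise_delete(rows):
    """Drop every row that has at least one missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def mean_impute(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    columns = list(rows[0].keys())
    col_means = {
        c: mean(r[c] for r in rows if r[c] is not None) for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else col_means[c]) for c in columns}
        for r in rows
    ]


complete_cases = listwise_delete(rows)  # keeps only the fully observed rows
imputed = mean_impute(rows)             # keeps every row, fills the gaps

print(len(rows), len(complete_cases), len(imputed))
```

Listwise deletion discards half of this toy dataset, while imputation keeps every row at the cost of inserting guessed values.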

0 votes
0 answers
26 views

Is there a way to model a partial predictor in a classification problem with an unbalanced target?

I would like to share a classification issue I faced during the modelling process. I have to build a model for an unbalanced binary target from 4 predictors, one of which has 45% of wrong ...
rambo17
0 votes
0 answers
19 views

My IV summary in R reports as NA after imputing with mice and matching with Amelia

After imputing and matching, my IV of interest returns NAs. I have a dataset that is mostly complete but for a couple of variables - coord1D and cinc. I used the following code to create my ...
Dylan Irons
0 votes
0 answers
25 views

Python sklearn Iterative Imputer - How to impute with mixed numerical and categorical features and keep the format of categorical columns intact?

Say we've got a dataframe with missing values and a mixture of categorical and numerical features, which will be used for binary classification. import pandas as pd import numpy as np from sklearn....
GM_3
  • 11
0 votes
1 answer
44 views

Interpolate zero values only if one zero and surrounding values are bigger than zero

I want to interpolate zero values in a time series dataframe but only if: 1) there is only one missing value, so the preceding and subsequent values are non-zero, 2) the surrounding non-zero values are ...
Dove_pigeon
0 votes
0 answers
34 views

Using fine-gray regression on mids object created with mice()

I am trying to fit a Fine-Gray regression model on a multiple imputed dataset created with mice() and was wondering how to do it with the finegray() function. I used code found in cant get crr() Fine-...
ccalle
  • 51
0 votes
0 answers
16 views

Differences Between IterativeImputer with RandomForestRegressor and the MissForest Imputer

If I use IterativeImputer with the estimator "RandomForestRegressor()" and, on the other side, the MissForest imputer, what is the difference? IterativeImputer will use tree-based methods to ...
NoTisan
  • 85
1 vote
0 answers
28 views

After using ga.lasso from the miselect package how do I pool results?

I've been running multiple imputation on a dataset using the mice package, creating 5 mids objects. Using those objects I've performed variable selection using the cv.galasso function from the miselect ...
intern5
  • 11
1 vote
0 answers
36 views

How do I use multiple imputation only for intermittent missing values?

I have a dataset with time-ordered variables where I distinguish between a continuous series of missing values including the final value (monotone missing) and missing values where at least one non-...
Esben Mølgaard
0 votes
0 answers
18 views

Error with parallelize='variables' using "missForest" in R

I've started using missForest to potentially replace rfImpute and while doing some testing with both synthetic and real data and the different flavours of parallelization strategies offered by ...
MarkH
  • 320
0 votes
0 answers
21 views

Pooling Levene’s Test in R: Why is D1 method not working? [duplicate]

I want to perform a Levene's Test on multiply imputed datasets (m=5) using the pool_leventest function in R. First, I followed the example code to understand the procedure: imp_data <- mice(...
Abby
  • 1
0 votes
0 answers
18 views

Imputation Strategy on Boston Housing Dataset Delivers Same Results

I'm following some tutorials on data engineering and feature engineering using the Boston dataset sample, and here is an example where I'm trying the different imputation strategies with cross validation ...
joesan
  • 14.9k
0 votes
0 answers
14 views

How do I solve the "module 'numpy' has no attribute 'float'" error while using MICE?

Here's my code. Here's some data. Here's the KNN imputation result. Hi, I'm a beginner in machine learning. While filling in the missing values in the data using MICE, ...
2113원준혁
0 votes
0 answers
39 views

How to create my own custom imputer to impute constant values seamlessly in pyspark.ml pipelines

I would like to optimize the imputation of missing values on my dataset through a CV search. This is trivial to do in sklearn, with which I am familiar -- however, I am for the first time working with ...
GaloisFan
  • 111
1 vote
1 answer
52 views

Number of observations changing significantly after imputing using mice() in R

My data has a significant number of missing values, so I can't use the na.omit() default in order to conduct downstream analysis on my dataset, as this removes the whole row if there is even one value ...
user24943575
0 votes
1 answer
34 views

Using a for loop to run multiple imputation in R

I suspect that there are parallels with other questions but I haven't been able to find a combination which works in this situation. In essence I am trying to use a for loop to do multiple imputation (...
beanie42