GitHub - xxl4tomxu98/Big5-kmeans-clustering: K-means clustering is an unsupervised machine learning algorithm, used here to partition a large 2018 dataset of over 1 million survey responses, each answering 50 questions on a scale of 1 through 5 about the Big Five personality traits. The predicted raw score is further normalized to show percentiles of relative importance.

Cluster-personality-kmeans (Kaggle dataset: IPIP-FFM-data-8Nov2018)

The dataset has no labels, so we cluster it with k-means to generate 10 (an arbitrary choice) personality categories.

The clustered category numbers are normalized, and bar charts are created in a Jupyter notebook.

The data is based on the 2018 updated version, which includes metadata that can be used for further ML study.
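The clustering and normalization steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the notebook's code: the survey is replaced by made-up random answers, the helper names `kmeans` and `cluster_shares` are hypothetical, and "normalized" is read here as converting per-cluster counts into fractions of all respondents.

```python
import random
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two answer vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20, seed=0):
    """Minimal Lloyd's algorithm (hypothetical helper): returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k randomly chosen respondents.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def cluster_shares(labels, k):
    """Fraction of respondents assigned to each cluster (hypothetical helper)."""
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(k)]


# Made-up stand-in for the survey: 300 respondents, 50 answers from 1 to 5.
data_rng = random.Random(42)
answers = [[data_rng.randint(1, 5) for _ in range(50)] for _ in range(300)]

K = 10  # arbitrary number of categories, as in the README
centroids, labels = kmeans(answers, K)
shares = cluster_shares(labels, K)

# Text bar chart of the normalized cluster sizes.
for c, share in enumerate(shares):
    print(f"cluster {c}: {'#' * round(share * 100)} {share:.1%}")
```

On the full dataset of about one million rows, a vectorized library implementation would be far faster than this loop-based version; the sketch only shows the two steps and how the shares feed a bar chart.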

Introduction

The Big Five personality traits, also known as the five-factor model (FFM) and the OCEAN model, is a taxonomy, or grouping, for personality traits. When factor analysis (a statistical technique) is applied to personality survey data, some words used to describe aspects of personality are often applied to the same person. For example, someone described as conscientious is more likely to be described as "always prepared" than as "messy". The theory is therefore based on associations between words rather than on neuropsychological experiments. It uses descriptors from common language and suggests five broad dimensions commonly used to describe the human personality and psyche.

The Dataset

This dataset contains 1,015,342 questionnaire answers collected online by Open Psychometrics.
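A minimal sketch of reading the 50 item answers from such a file follows. The layout is an assumption, not something this README states: the sketch assumes a tab-separated file whose item columns are named `EXT1` to `EXT10`, `EST1` to `EST10`, `AGR1` to `AGR10`, `CSN1` to `CSN10` and `OPN1` to `OPN10`; check `codebook.txt` before relying on it. The function name `read_answers` and the in-memory sample rows are hypothetical.

```python
import csv
import io

# Assumed column prefixes, one per trait; verify against codebook.txt.
TRAITS = ("EXT", "EST", "AGR", "CSN", "OPN")
ITEM_COLUMNS = [f"{trait}{i}" for trait in TRAITS for i in range(1, 11)]


def read_answers(fileobj):
    """Yield one list of 50 integer answers per complete row (hypothetical helper)."""
    reader = csv.DictReader(fileobj, delimiter="\t")  # assumes tab-separated data
    for row in reader:
        try:
            # int(float(...)) accepts both "4" and "4.0".
            answers = [int(float(row[col])) for col in ITEM_COLUMNS]
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric answers
        yield answers


# Tiny in-memory sample: one complete row and one unusable row.
# The extra "country" column stands in for the metadata columns.
header = "\t".join(ITEM_COLUMNS + ["country"])
good = "\t".join(["3"] * 50 + ["US"])
bad = "\t".join(["NULL"] * 50 + ["NONE"])
sample = io.StringIO("\n".join([header, good, bad]) + "\n")

rows = list(read_answers(sample))
print(len(rows), "usable row(s)")
```

With the real data, the same function would be called on an opened file object, for example `open(path, newline="")`, where `path` points at the downloaded Kaggle file.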

Source:

"Possible Questionnaire Format for Administering the 50-Item Set of IPIP Big-Five Factor Markers". International Personality Item Pool.

References:

Goldberg, Lewis R. "The development of markers for the Big-Five factor structure." Psychological Assessment 4.1 (1992): 26.

Repository files:

.DS_Store
README.md
big5-traits.py
codebook.txt
kmeans-model.ipynb
