Explore audio classification with SimCLR-UrbanSound8K. This repository applies SimCLR, a contrastive self-supervised learning framework, to urban sound categorization on the UrbanSound8K dataset.

pranavgupta2603/SimCLR-UrbanSound8K

SimCLR Implementation for UrbanSound8K Classification 🎵🏙️🤖

Introduction

This project implements the SimCLR (Simple Framework for Contrastive Learning of Visual Representations) framework for classifying urban sounds from the UrbanSound8K dataset. The goal is to classify urban sounds such as sirens and car horns using representations learned through contrastive self-supervised training.

Dataset

This project uses the UrbanSound8K dataset: 8,732 labeled audio excerpts of up to 4 seconds each, covering 10 urban sound classes and pre-sorted into 10 folds. The audio is converted to Mel-spectrogram images, a visual time-frequency representation, which serve as the input to the SimCLR model.

Code for Audio to Spectrogram Conversion 🔄

The audio clips are converted to Mel-spectrogram images with code from this GitHub repository: UrbanSound8k-MelSpectrogram. This step puts the dataset into an image format that the model can process.
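
For readers who want to see what such a conversion involves, here is a minimal NumPy-only sketch of a log-Mel spectrogram. It is not the code from the UrbanSound8k-MelSpectrogram repository; the function names and the `n_fft`, `hop`, and `n_mels` values are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style Mel scale formula.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, mid, hi = hz_edges[m], hz_edges[m + 1], hz_edges[m + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram_db(signal, sr, n_fft=1024, hop=512, n_mels=64):
    """Log-power Mel spectrogram in dB, shape (n_mels, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        # Zero-pad very short clips so at least one frame exists.
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T     # (n_mels, n_frames)
    return 10.0 * np.log10(np.maximum(mel, 1e-10))

# Usage on a synthetic one-second 440 Hz tone (stand-in for a real clip).
sr = 22050
t = np.arange(sr) / sr
spec = mel_spectrogram_db(np.sin(2 * np.pi * 440.0 * t), sr)
print(spec.shape)  # (64, 42)
```

In practice a dedicated audio library would likely handle loading, resampling, and the Mel transform; the sketch only makes the individual steps (framing, windowing, FFT, Mel filtering, log scaling) visible.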

Architecture 🏗️

  • SimCLR Framework: A self-supervised learning model used to learn representations of audio data.
  • Classifier: A neural network that classifies audio based on the representations learned by SimCLR.
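
SimCLR learns representations by comparing two randomly augmented views of the same input. The README does not say which augmentations this project uses, so the sketch below assumes simple frequency and time masking on a spectrogram array; `random_mask`, `two_views`, and the mask widths are hypothetical names and values.

```python
import numpy as np

def random_mask(spec, rng, max_freq_width=8, max_time_width=16):
    """Return a copy with one random frequency band and one random time span masked."""
    out = spec.copy()
    n_mels, n_frames = out.shape
    fill = out.min()  # masked cells take the quietest value in the spectrogram

    f_width = int(rng.integers(0, min(max_freq_width, n_mels) + 1))
    f_start = int(rng.integers(0, n_mels - f_width + 1))
    out[f_start:f_start + f_width, :] = fill

    t_width = int(rng.integers(0, min(max_time_width, n_frames) + 1))
    t_start = int(rng.integers(0, n_frames - t_width + 1))
    out[:, t_start:t_start + t_width] = fill
    return out

def two_views(spec, rng):
    """Two independently augmented views of one spectrogram (a positive pair)."""
    return random_mask(spec, rng), random_mask(spec, rng)

# Usage with a random array standing in for a real Mel spectrogram.
rng = np.random.default_rng(0)
spec = rng.normal(size=(64, 42))
view_a, view_b = two_views(spec, rng)
```

An encoder would map both views to embeddings, and the contrastive loss would pull the two embeddings of the same clip together while pushing apart embeddings of different clips in the batch.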

Hyperparameters ⚙️

  • Epochs: 15
  • Number of Folds: 10 (Cross-validation approach)
  • Batch Size: 32
  • Learning Rate: 0.001
  • Weight Decay: 1e-6
  • Optimizer: Adam
  • Loss Function: NTXentLoss (Contrastive Loss)
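
To make the contrastive loss named above concrete, here is a from-scratch NumPy sketch of the NT-Xent (normalized temperature-scaled cross-entropy) loss. It is not the repository's `NTXentLoss` implementation, and the `temperature` value is an assumption, since the README does not list one.

```python
import numpy as np

def nt_xent_loss(z_i, z_j, temperature=0.5):
    """NT-Xent loss for N positive pairs; z_i and z_j have shape (N, d)."""
    n = z_i.shape[0]
    z = np.concatenate([z_i, z_j], axis=0).astype(float)   # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)       # unit length -> cosine similarity
    sim = z @ z.T / temperature                            # (2N, 2N)
    np.fill_diagonal(sim, -np.inf)                         # a sample is never its own negative

    # Row k's positive is the other view of the same sample.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob_pos = sim[np.arange(2 * n), pos] - log_denominator
    return float(-log_prob_pos.mean())

# Usage: embeddings of matching views should give a lower loss than unrelated ones.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_aligned = nt_xent_loss(z, z + 0.01 * rng.normal(size=z.shape))
loss_unrelated = nt_xent_loss(z, rng.normal(size=z.shape))
```

With the batch size of 32 listed above, each embedding would be contrasted against 62 negatives (the other 2N - 2 embeddings in the batch).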

Outputs 📊

The model was trained and validated across multiple folds, with accuracy improving over the course of training. Highlights:

  • Validation Accuracy: Roughly 65% to 81%, depending on the epoch and fold.
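
The fold-wise evaluation can be organized as in the following sketch. It assumes the 10 folds are the ones that ship with UrbanSound8K and that each fold is held out once for validation; the README does not confirm either detail, and `leave_one_fold_out` and `summarize` are hypothetical helper names.

```python
from statistics import mean, pstdev

def leave_one_fold_out(fold_ids):
    """Yield (held_out_fold, train_indices, val_indices) for every distinct fold."""
    for held_out in sorted(set(fold_ids)):
        train_idx = [i for i, f in enumerate(fold_ids) if f != held_out]
        val_idx = [i for i, f in enumerate(fold_ids) if f == held_out]
        yield held_out, train_idx, val_idx

def summarize(per_fold_accuracy):
    """Aggregate a {fold: accuracy} mapping into mean, spread, and range."""
    values = list(per_fold_accuracy.values())
    return {
        "mean": mean(values),
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
    }

# Usage: fold_ids holds one fold label per clip (toy labels here, three clips per fold).
fold_ids = [fold for fold in range(1, 11) for _ in range(3)]
for held_out, train_idx, val_idx in leave_one_fold_out(fold_ids):
    pass  # train on train_idx, validate on val_idx, record the accuracy for held_out
```

Reporting the mean and standard deviation from `summarize` alongside the range gives a fuller picture than a single best-fold number.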

Conclusion 🎉

This implementation applies SimCLR outside its original image domain, to urban sound classification. With validation accuracy between roughly 65% and 81%, the results suggest that contrastive self-supervised learning transfers to audio when clips are represented as Mel-spectrograms.

