Lab to familiarize yourself with Amazon Elastic MapReduce (EMR)
Updated Mar 10, 2019
Completed a big data project using Hadoop, HBase, and Sqoop to ingest, process, and analyze a large taxi ride dataset on an AWS EMR cluster. Developed MapReduce jobs to perform a variety of tasks. Exported the results of each MapReduce job to an RDS instance for visualization and analysis.
Utilize Apache Spark for ETL processes to prepare data, followed by the construction of a Machine Learning model for Natural Language Processing (NLP) classification. Subsequently, deploy the model within a Gradio web application for seamless interaction.
This project analyzes Amazon reviews provided by AWS.
This repository contains the projects that I did for the Data Engineering Nanodegree by Udacity.
Covid Detection via CT Scan Image Analysis
This project aims to perform feature extraction, followed by PCA, on large-scale data using Spark in the cloud.
ETL pipeline with PySpark on EMR for data lake on S3
Big Data and Cloud Computing Mini Project 2 - March 07, 2022
Run Snowplow's enrichments on Amazon Elastic MapReduce with minimum fuss
Creating Your Big Data Ecosystem in the Cloud
Analyzing Spark Cluster Performance in Amazon EMR