MapReduce Analysis on Amazon Food Review Dataset (Big-Data)
Updated Aug 6, 2017
Assignments belonging to the course Supercomputing for Big Data (ET4310) at TU Delft
Lab to familiarize yourself with Amazon Elastic MapReduce (EMR)
My AWS Playground
Completed a big data project using Hadoop, HBase, and Sqoop to ingest, process, and analyze a large dataset of taxi ride data on an AWS EMR cluster. Developed MapReduce jobs to perform a variety of tasks, and exported the results of each job to an RDS instance for visualization and analysis.
Generic Python library for provisioning EMR clusters from YAML config files (Configuration as Code)
Code and documentation for the demonstration of real-time bushfire alerting with Complex Event Processing (CEP) in Apache Flink on Amazon EMR and a simulated IoT sensor network, as described on the AWS Big Data Blog: Real-time bushfire alerting with Complex Event Processing in Apache Flink on Amazon EMR and IoT sensor network.
Goal: Develop a Machine Learning application in a distributed environment using AWS services with Spark.
Data Science and Engineering project - Programming for Big Data @ Simon Fraser University (SFU)
Utilize Apache Spark for ETL processes to prepare data, followed by the construction of a Machine Learning model for Natural Language Processing (NLP) classification. Subsequently, deploy the model within a Gradio web application for seamless interaction.
Apache Spark sandbox on GCP and Amazon EMR.
This project aims to perform feature extraction followed by PCA on large-scale data using Spark in the cloud.
Building a Data Lake with Spark
This Big Data project collects data on vehicle theft in the city of São Paulo and consolidates it into counts and a heat map to show the areas with the highest rates of this type of crime. Everything runs on AWS resources.