Introduction To Big Data Ecosystems: Description
(Group Project)
Web: http://www.mytechnospeak.com
Description
As part of this course, you and your team will demonstrate your learning by developing an
analytical solution. Given the limited time and programming background, your task is to pick any topic under
“Challenges related to Big Data Analytics” and write a detailed report or case study. You are welcome to write
code to demonstrate your findings, but it is not compulsory.
You can select the topic from any domain (HR, marketing, finance, etc.). If two teams happen to choose
the same domain, they should avoid the same topic.
Assignment
Your project report should clearly demonstrate the process that you have followed. You should explain
each step of the analysis and demonstrate your understanding of the subject. I encourage all of you to
publish this report in an open community for feedback.
1. Project report (50 points + 15 points for report writing, neatness, visual representations, etc.):
This is your final project deliverable (12-20 pages, 1.5 line spacing, 11-point font). At minimum, the
report should include the following sections:
3. Introduction
Submit the above items (1-2) on the course portal as a single zipped file. One submission per group is
sufficient. The zip file should be named Section_X_Group_Y.zip, where X is your section and Y is your
group number (e.g. Section_A_Group_10.zip).
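For teams that prefer to script the packaging step, the naming rule above can be sketched in Python as follows. This is only an illustrative sketch: the function names `submission_name` and `build_submission` and the pattern `NAME_PATTERN` are hypothetical helpers, not part of the course portal, and the sketch assumes the section is a short alphanumeric label and the group is an integer.

```python
import re
import zipfile
from pathlib import Path

# Assumed reading of the naming rule: Section_X_Group_Y.zip,
# with X an alphanumeric section label and Y an integer group number.
NAME_PATTERN = re.compile(r"^Section_[A-Za-z0-9]+_Group_\d+\.zip$")


def submission_name(section: str, group: int) -> str:
    """Build the archive name, e.g. Section_A_Group_10.zip."""
    return f"Section_{section}_Group_{group}.zip"


def build_submission(section: str, group: int, files: list[Path], out_dir: Path) -> Path:
    """Zip the given files under the required name and return the archive path."""
    name = submission_name(section, group)
    if not NAME_PATTERN.match(name):
        raise ValueError(f"unexpected archive name: {name}")
    archive = out_dir / name
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            # Store each file at the top level of the archive.
            zf.write(f, arcname=f.name)
    return archive
```

For example, `build_submission("A", 10, [Path("report.pdf")], Path("."))` would write Section_A_Group_10.zip to the current folder, assuming a file named report.pdf exists there.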
Sample Topics:
1. Big Data Tools and Technology
NOTE: Please read carefully how each data source may be used and how it should be referenced, and
follow its data policy with proper referencing.
2 Gapminder https://www.gapminder.org/data/
5 R-datasets http://vincentarelbundock.github.io/Rdatasets/datasets.html
8 Airbnb http://insideairbnb.com/get-the-data.html
9 Dow Jones Index Data Set http://archive.ics.uci.edu/ml/datasets/Dow+Jones+Index
10 Yelp https://www.yelp.com/dataset/challenge
11 May 2015 Reddit Comments https://www.kaggle.com/reddit/reddit-comments-may-2015
21 YAHOO https://webscope.sandbox.yahoo.com/
22 FiveThirtyEight https://data.fivethirtyeight.com/
Other References:
Big Data and AI Applications in Finance Industry:
http://www.ee.columbia.edu/~cylin/course/bigdata/EECS6893-BigDataAnalytics-Lecture11.pdf
Sapphirine Big Data Analytics Open Source Applications:
http://www.ee.columbia.edu/~cylin/course/bigdata/projects/