Data Mining With Bigdata
Presented By:
Sandip B. Tipayle Patil
Under the Guidance of
Prof. Y.N.Patil
DEPARTMENT OF COMPUTER ENGINEERING
DR. BABASAHEB AMBEDKAR TECHNOLOGICAL UNIVERSITY
Lonere.
Outline
Introduction
Literature Review
Proposed System
System Architecture
Hadoop Framework
Conclusion
Introduction
Interesting Facts
The volume of business data worldwide, across all companies, doubles every
1.2 years (previously every 1.5 years).
Every day, 2.5 quintillion (2,500 quadrillion) bytes of data are produced, and
more than 90 percent of all data was produced within the past two years.
An average person today processes more data in a single day than a
16th-century individual did in an entire lifetime.
In recent years, the cost of storage and processing power has dropped significantly.
Bad data or poor data quality costs US businesses $600 billion annually.
In 2012, Google had over 3 million servers processing over 2 trillion searches
per year (only 22 million in 2000).
What is Big Data?
Big data is the data characterized by 3 attributes: volume, variety and
velocity.
-- IBM
What is Big Data?
Data Mining
Big Data
The term Big data is used to describe a massive
volume of both structured and unstructured data
that is so large that it's difficult to process using
traditional database and software techniques.
Forms of Data
Literature Review
Most people don't know what to do with all the data that they
already have.
Giant Elephant
4 Vs of Big Data
Volume: data quantity
Velocity: data speed
Variety: data types
Veracity: authenticity
Proposed System:
The system will process this data within reasonable cost and time limits.
System Architecture:
Hadoop Framework:
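Hadoop's programming model is MapReduce. The sketch below is only a plain-Python stand-in that imitates the map, shuffle and reduce steps on one machine with a word-count job; it does not use the Hadoop API, and all function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map -> shuffle -> reduce in-process (an assumption: real Hadoop
    distributes these steps across the nodes of a cluster)."""
    mapped = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    lines = ["big data needs big tools", "data mining with big data"]
    print(run_job(lines))
```

On a real cluster the framework itself handles splitting the input, moving intermediate pairs between nodes, and retrying failed tasks; only the map and reduce logic is written by the user.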
Challenges
Location
Volume
Hardware
Privacy
Having domain knowledge
Getting meaningful information
Solutions
Parallel computing programming
An
Restricting
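As a rough illustration of the parallel computing idea listed above, the sketch below splits a dataset into chunks, processes the chunks concurrently, and combines the partial results. It is a minimal sketch under stated assumptions: the helper names (split_into_chunks, chunk_total, parallel_total) are hypothetical, and a thread pool is used only to keep the example self-contained; for CPU-bound work in Python, a process pool or a cluster framework would normally be needed to get real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split a list into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_total(chunk):
    """Work done independently on each chunk (here: a simple sum)."""
    return sum(chunk)


def parallel_total(data, n_workers=4):
    """Process the chunks concurrently, then combine the partial results."""
    chunks = split_into_chunks(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(chunk_total, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_total(list(range(1, 101))))
```

The same split, process, combine pattern is what lets a workload scale beyond one machine: each chunk can be handled where its data is stored.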
Advantages:
Fast response
Conclusion
We have entered an era of Big Data. Through better analysis of the large
volumes of data that are becoming available, there is the potential for
making faster advances in many scientific disciplines and improving the
profitability and success of many enterprises by using technologies such as
Hadoop and Pig. Furthermore, this system will provide fully transformative
solutions and will naturally serve the next generation of industrial
applications. We must support and encourage this proposed framework in
addressing the technical challenges of unstructured data if we are to achieve
the promised benefits of Big Data.