1. Introduction to Data Science
Dr. M. Dhurgadevi
Associate Professor
Sri Krishna College of Technology
Coimbatore
Outline
Data, Big Data, and Challenges
Data Science
◦ Introduction
◦ Why Data Science
Data Scientists
◦ What do they do?
Major/Concentration in Data Science
◦ What courses to take.
Data All Around
Lots of data is being collected and warehoused:
◦ Web data, e-commerce
◦ Financial transactions, bank/credit transactions
◦ Online trading and purchasing
◦ Social networks
How Much Data Do We Have?
Google processes 20 PB a day (2008)
Facebook has 60 TB of daily logs
eBay has 6.5 PB of user data + 50 TB/day (5/2009)
1000 genomes project: 200 TB
Cost of 1 TB of disk: $35
Time to read a 1 TB disk: about 3 hrs (at 100 MB/s)
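The read-time figure follows from dividing size by throughput. Here is a minimal sketch of that arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant sequential read rate. The helper name read_time_hours is made up for this illustration, and the single-disk comparison for the 20 PB figure is an added thought experiment, not a claim from the slide.

```python
# Back-of-the-envelope check of the disk read time quoted on the slide.
# Assumption: decimal units and a constant sequential throughput.

MB = 10**6
TB = 10**12
PB = 10**15


def read_time_hours(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to read size_bytes sequentially at the given throughput."""
    return size_bytes / throughput_bytes_per_s / 3600


# 1 TB at 100 MB/s: 10,000 seconds, i.e. a little under 3 hours.
one_tb_hours = read_time_hours(1 * TB, 100 * MB)
print(f"1 TB at 100 MB/s: {one_tb_hours:.2f} hours")  # 2.78 hours

# Hypothetical comparison: reading 20 PB through one such disk.
google_days = read_time_hours(20 * PB, 100 * MB) / 24
print(f"20 PB at 100 MB/s on one disk: {google_days:.0f} days")  # 2315 days
```

At this rate a single disk would need more than six years to read 20 PB once, which is why data at this scale is spread across many machines and read in parallel.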
Introduction to Big Data
What is Data?
Data are the quantities, characters, or symbols on which a computer performs
operations. They may be stored and transmitted as electrical signals and
recorded on magnetic, optical, or mechanical media.
What is Big Data?
Big Data is data of very large size. The term describes a collection of data
that is already huge in volume and keeps growing exponentially over time.
In short, such data is so large and complex that traditional data management
tools cannot store or process it efficiently.