Lecture 1 Introduction To Advanced Data Analytics
Stewart Muchuchuti
MSc Comp, BSc Eng, ML Eng, AI Eng.
stewartm@bac.ac.bw
Here is wisdom
• "In God we trust, all others must bring data." - W. Edwards Deming,
statistician and quality control expert
• "Without big data analytics, companies are blind and deaf, wandering out
onto the web like deer on a freeway." - Geoffrey Moore, author and
consultant
• "Data is a precious thing and will last longer than the systems themselves." -
Tim Berners-Lee, inventor of the World Wide Web
What is Data Analytics?
• Process of analyzing and interpreting large sets of data to identify patterns,
draw conclusions, and inform decision-making
• Involves using a variety of techniques and tools, such as statistical analysis,
data mining, machine learning, and data visualization, to extract insights
from complex data sets.
• Businesses:
• identify trends in consumer behavior
• optimize marketing campaigns
• improve operational efficiency, and reduce costs.
• Healthcare providers: improve patient outcomes, reduce medical errors, and
identify opportunities for preventive care
• Governments: improve public services, monitor social trends, and inform
policy decisions.
Key definitions
What is a Data Analyst?
• Mathematics
• Statistics
• Excel
• Python/R
• Databases
Example – Data Analytics at play
• Say the Operations Director of a multinational tyre
company wants to do a detailed analysis of defects
during tyre production at its various
manufacturing plants across the globe. Every time
a defect occurs on a tyre during the
manufacturing process, it is stored with a
predefined defect code.
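A first pass at this analysis could simply count how often each defect code occurs, overall and per plant. The sketch below is a minimal illustration: the plant names, the defect codes and the `count_defects` helper are all hypothetical, not taken from any real production system.

```python
from collections import Counter

# Hypothetical defect log: one (plant, defect_code) pair per recorded defect.
# Plant names and codes are invented for illustration only.
defect_log = [
    ("Plant A", "D01"),
    ("Plant A", "D02"),
    ("Plant B", "D01"),
    ("Plant A", "D01"),
    ("Plant C", "D03"),
    ("Plant B", "D01"),
]


def count_defects(log):
    """Return defect counts per (plant, code) pair and per code overall."""
    per_plant_code = Counter(log)
    per_code = Counter(code for _, code in log)
    return per_plant_code, per_code


per_plant_code, per_code = count_defects(defect_log)

# The most frequent defect code across all plants.
most_common_code, occurrences = per_code.most_common(1)[0]
print(most_common_code, occurrences)
```

In practice the log would probably come from a database or a spreadsheet export rather than a hard-coded list, and the same counts could be broken down further by date or production line.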
Data sources
• Sources of data for a business can range from customer feedback to sales
figures to product or service demands.
• A few sources of data a business may use:
• Social media: LinkedIn, Twitter, and Facebook can provide insight into
the kind of customer traffic your web page receives. These platforms
also provide cost-effective ways to conduct surveys about customer
satisfaction with products or services and customer preferences.
• Online Engagement Reporting: Using tools such as Google Analytics can
provide you with data about how customers interact with your website.
• Transactional Data: This kind of data will include information collected
from sales reports, ledgers, and web payment transactions. With a
customer relationship management system, you will also be able to
collect data about how customers spend their money on your products.
• So now, data has become so big!
How data can improve your business
What’s Big Data?
[Figure labels: Processor or Virtual Storage; Disk Storage]
Volume (Scale)
• Data Volume
• 44x increase from 2009 to 2020
• From 0.8 zettabytes (ZB) to 35 ZB
• Data volume is increasing exponentially
[Figure: exponential increase in collected/generated data]
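The "44x" figure can be checked from the two volumes quoted on the slide, and the same numbers give an implied yearly growth rate. The sketch below assumes steady compound growth over the whole period, which is a simplification; the variable names are illustrative.

```python
# Figures quoted on the slide.
start_zb = 0.8    # data volume in 2009, in zettabytes
end_zb = 35.0     # data volume in 2020, in zettabytes
years = 2020 - 2009

# Overall growth factor: about 44x.
growth_factor = end_zb / start_zb

# Implied compound annual growth rate, assuming constant yearly growth.
annual_rate = growth_factor ** (1 / years) - 1

print(f"growth factor: {growth_factor:.2f}x")
print(f"implied annual growth: {annual_rate:.1%}")
```

Under this assumption the volume grows by roughly 40% per year, which is what "increasing exponentially" means here: a roughly constant percentage increase each year rather than a constant absolute increase.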
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones world wide
• 100s of millions of GPS enabled devices sold annually
• 2+ billion people on the Web by end 2011
• 76 million smart meters in 2009… 200M by 2014
• 25+ TBs of log data every day
• ? TBs of data every day
Variety (Complexity)
• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
• Social Network, Semantic Web (RDF), …
• Streaming Data
• You can only scan the data once
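The single-scan constraint on streaming data means that statistics have to be updated incrementally as each value arrives, without storing the stream. One common way to do this is Welford's online algorithm for the mean and variance; the sketch below is an illustration of that technique, and the `RunningStats` class and `summarize` function are names chosen for this example.

```python
class RunningStats:
    """Mean and population variance computed in one pass (Welford's method)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new value into the running statistics."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far."""
        return self._m2 / self.count if self.count else 0.0


def summarize(stream):
    """Consume an iterable exactly once and return its running statistics."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats


# An iterator can only be read once, like a data stream.
stats = summarize(iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
print(stats.count, stats.mean, stats.variance)
```

Only three numbers are kept in memory regardless of how long the stream is, which is the property that makes this approach suitable for data that cannot be rescanned.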
Real-time/Fast Data
• Mobile devices (tracking all objects all the time)
• Progress and innovation are no longer hindered by the ability to collect data
• Instead, they are limited by the ability to manage, analyze, summarize, visualize,
and discover knowledge from the collected data in a timely manner and in a scalable fashion
Real-Time Analytics/Decision Requirement
• Goal: influence customer behavior
• Product recommendations that are relevant and compelling
• Learning why customers switch to competitors and their offers, in time to counter
• Friend invitations to join a game or activity that expands business
• Improving the marketing effectiveness of a promotion while it is still in play
• Preventing fraud as it is occurring, and preventing more proactively
Some Make it 4V’s
Harnessing Big Data
Big Data Technology