Introduction to Data Science Module 2
DATA SCIENCE
Instructor
Abubakar Yussuf
DATA COLLECTION AND STORAGE
What is data collection?
• Data collection is the process of gathering data for use in
business decision-making, strategic planning, research and
other purposes.
• Effective data collection provides the information that's
needed to answer questions, analyze business performance or
other outcomes, and predict future trends, actions and
scenarios.
Consequences of improperly collected data include:
Characteristics of Big Data
• Variety: Big data comes in a wide range of formats, not just the
structured data (rows and columns) found in traditional databases. It
can include unstructured data like social media posts, images, and
videos, as well as semi-structured data like emails and logs.
• Velocity: The speed at which data is generated and processed is
another key characteristic. Big data can be generated in real-time or
near real-time, requiring fast processing and analysis tools to keep up
with the data flow.
• Veracity: This refers to the accuracy and quality of the data. With
diverse sources and formats, ensuring the trustworthiness and
consistency of big data can be a challenge. Data cleaning and
validation techniques are crucial for reliable analysis.
• Value: Extracting meaningful insights and value from vast amounts of
data is the ultimate goal. Big data analytics techniques help us uncover
hidden patterns, trends, and correlations that can inform better
decision-making, optimize processes, and create new opportunities.
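The veracity point above mentions data cleaning and validation. The following is a minimal sketch of what that can look like in Python, not a definitive implementation. The `Reading` type, the field names `sensor_id` and `temperature_c`, the plausibility range, and the sample records are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    sensor_id: str
    temperature_c: float


def clean_record(raw: dict) -> Optional[Reading]:
    """Return a validated Reading, or None if the record is unusable."""
    sensor_id = str(raw.get("sensor_id") or "").strip()
    if not sensor_id:
        return None  # missing identifier
    try:
        temperature = float(raw.get("temperature_c"))
    except (TypeError, ValueError):
        return None  # missing or non-numeric value
    # The plausibility range is an assumption chosen for this sketch.
    if not -90.0 <= temperature <= 60.0:
        return None
    return Reading(sensor_id, temperature)


def clean_all(raw_records: List[dict]) -> List[Reading]:
    """Validate every record and drop exact duplicates."""
    seen = set()
    cleaned = []
    for raw in raw_records:
        reading = clean_record(raw)
        if reading is None:
            continue
        key = (reading.sensor_id, reading.temperature_c)
        if key in seen:
            continue  # same sensor and value already kept
        seen.add(key)
        cleaned.append(reading)
    return cleaned


# Hypothetical raw input mixing good and bad records.
raw_records = [
    {"sensor_id": "a1", "temperature_c": "21.5"},
    {"sensor_id": "a1", "temperature_c": "21.5"},  # exact duplicate
    {"sensor_id": "", "temperature_c": "19.0"},    # missing id
    {"sensor_id": "b2", "temperature_c": "n/a"},   # non-numeric value
    {"sensor_id": "c3", "temperature_c": "999"},   # implausible value
    {"sensor_id": "d4", "temperature_c": 18},
]
cleaned = clean_all(raw_records)
print(len(cleaned), "of", len(raw_records), "records kept")
```

In this sketch only two of the six records survive. Real pipelines usually also log why each record was rejected, so that problems at the source can be fixed rather than silently filtered out.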
Drivers of Data Growth
• More Data-Generating Devices: The proliferation of smartphones, tablets,
laptops, and other internet-connected devices contributes significantly. Each
device captures and stores data, from photos and videos to browsing history
and app activity.
• Internet of Things (IoT): The increasing number of internet-connected
sensors and devices embedded in everyday objects is another major
contributor. These devices constantly generate data on everything from
weather conditions to traffic patterns to appliance usage.
• Social Media and User-Generated Content: The rise of social media
platforms and online communities has led to a surge in user-generated content
like posts, comments, images, and videos. This data adds significantly to the
overall volume.
The growing volume of data holds significant potential benefits across many
sectors. Here are some of the ways data growth can be harnessed for
positive change:
• Improved resource allocation: Data can reveal areas where resources are
underutilized or overspent, enabling better allocation for optimal results.
• Data-driven marketing: Businesses can personalize marketing
campaigns based on customer behavior and preferences, leading to
higher engagement and sales.
• Scientific advancements: Researchers can analyze vast datasets to
accelerate discoveries in medicine, materials science, and other fields.
• Environmental monitoring: Data from sensors can be used to monitor
environmental changes, track pollution levels, and inform sustainable
practices.
• Personalized experiences: Data allows for customization, tailoring
experiences to individual preferences. This can range from
personalized learning platforms to recommendation systems for
products and content.
2. Innovation and Development: Data is the fuel for innovation. Here's
how big data can drive progress:
• Developing new products and services: Companies can use data to
identify customer needs and preferences, informing the development of
innovative products and services that cater to those needs.
• Optimizing existing processes: Data analysis can help identify
inefficiencies in operations, leading to process improvements and cost
reductions.
3. Societal Progress: Data can play a crucial role in tackling global
challenges:
• Public health management: Data analysis can be used to track disease
outbreaks, predict epidemics, and optimize healthcare resource
allocation.
• Smart cities: Urban planning can leverage data to improve traffic flow,
optimize energy use, and enhance public safety.
4. Improved Customer Service: Businesses can leverage data to
personalize customer interactions and provide a more positive customer
experience:
• Proactive customer support: By analyzing customer data, companies
can anticipate potential issues and provide proactive support, reducing
frustration.
• Faster issue resolution: Data can help identify root causes of customer
problems, leading to quicker and more effective solutions.
• Sentiment analysis: Businesses can use data to understand customer
sentiment and feedback, allowing them to improve products and services.
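To make the sentiment analysis point above concrete, here is a minimal sketch that labels customer feedback by counting positive and negative words. The word lists and the feedback strings are invented for illustration. Production systems typically rely on trained language models rather than fixed word lists, so treat this only as a picture of the idea.

```python
import re
from collections import Counter

# Tiny hand-made word lists, assumed for illustration only.
POSITIVE = {"great", "fast", "helpful", "love", "good"}
NEGATIVE = {"slow", "broken", "bad", "rude", "late"}


def sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical customer feedback.
feedback = [
    "Great product and fast delivery",
    "Support was slow and rude",
    "The package arrived on Tuesday",
]
summary = Counter(sentiment(f) for f in feedback)
print(summary)
```

Counting labels across many messages, as `summary` does here, gives a rough view of overall customer mood that can be tracked over time.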
5. Scientific Discovery: Big data opens doors to groundbreaking research
in various scientific fields:
• Genomics and personalized medicine: Analyzing vast amounts of
genetic data can lead to personalized healthcare approaches and
accelerate drug discovery.
• Climate change research: Data analysis from weather stations,
satellites, and other sources helps us understand climate patterns and
predict future trends.
• Social science research: Studying social media data and online
interactions can provide insights into human behavior and social trends.