Practical Data Science with Jupyter: Explore Data Cleaning, Pre-processing, Data Wrangling, Feature Engineering and Machine Learning using Python and Jupyter (English Edition)

Ebook, 586 pages, 4 hours

Author: Prateek Gupta
ISBN: 9789389898071

About this ebook

This book begins with an introduction to data science, followed by the Python concepts you need. You will learn how to interact with various databases and how to implement statistics concepts in Python. You will learn how to import various types of data into Python, which is the first step of the data analysis process. Once you are comfortable importing data, you will clean the dataset and then explore various visualization charts. The book focuses on applying feature engineering techniques to make your data more valuable to an algorithm. You will also get to know various machine learning algorithms and concepts, time series data, and a few real-world case studies. Finally, the book presents some best practices that will help you become industry-ready.

This book focuses on practicing data science techniques while learning their concepts using Python and Jupyter. It answers the most common question, how to get started with data science, through hands-on practice rather than by explaining the mathematics and statistics behind machine learning algorithms.

Category: Intelligence (AI) & Semantics

Language: English

Publisher: BPB Online LLP

Release date: Mar 1, 2021

Book preview

Practical Data Science with Jupyter - Prateek Gupta

CHAPTER 1

Data Science Fundamentals

Learning from data is virtually universally useful. Master it and you will be welcomed anywhere.

– John Elder, founder of Elder Research

Elder Research is America’s largest and most experienced analytics consultancy. John founded the company in 1995 with a vision for what data could do, and since then extracting information from data has become one of the most in-demand skills of the 21st century. Today, data science is everywhere.

The explosive growth of the digital world requires professionals with not just strong skills, but also adaptability and a passion for staying at the forefront of technology. A recent study shows that demand for data scientists and analysts is projected to grow by 28 percent by 2021. This is on top of the current market need. According to the U.S. Bureau of Labor Statistics, data science jobs are projected to grow about 28% through 2026. Unless something changes, these skill gaps will continue to widen. In this first chapter, you will become familiar with data, learn about your role as an aspiring data scientist, and see why the Python programming language matters in data science.

Structure

What is data?

What is data science?

What does a data scientist do?

Real-world use cases of data science

Why Python for data science?

Objective

After studying this chapter, you should understand the types of data, the amount of data generated daily, and the need for data scientists, as illustrated by current real-world use cases.

What is data?

The best way to describe data is to understand the types of data. Data is divided into the following three categories.

Structured data

Well-organized data in the form of tables that can easily be operated on is known as structured data. Searching and accessing information in this type of data is very easy. For example, data stored in a relational (SQL) database takes the form of tables with multiple rows and columns. A spreadsheet is another good example of structured data. Structured data represents only 5% to 10% of all data in the world. The following figure 1.1 is an example of SQL data, where an SQL table holds merchant-related data:

Figure 1.1: Sample SQL Data
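
As a rough illustration of why structured data is easy to search, the following minimal sketch builds a small in-memory SQL table with Python's built-in sqlite3 module. The table name, column names, and rows are hypothetical and are not taken from figure 1.1.

```python
import sqlite3

# Hypothetical merchant table: the schema and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE merchants (merchant_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO merchants (merchant_id, name, city) VALUES (?, ?, ?)",
    [
        (1, "Acme Traders", "Pune"),
        (2, "Blue Mart", "Delhi"),
        (3, "City Foods", "Pune"),
    ],
)
conn.commit()

# Because the data is organized into rows and columns, a lookup is a single query.
pune_merchants = conn.execute(
    "SELECT name FROM merchants WHERE city = ? ORDER BY merchant_id", ("Pune",)
).fetchall()
print(pune_merchants)
conn.close()
```

Each row comes back as a tuple with one entry per selected column, which is what makes structured data straightforward to filter and aggregate.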

Unstructured data

Unstructured data requires advanced tools and software to access its information. For example, images and graphics, PDF files, Word documents, audio, video, emails, PowerPoint presentations, webpages and web content, wikis, streaming data, location coordinates, etc., fall under the unstructured data category. Unstructured data represents around 80% of all data. The following figure 1.2 shows various unstructured data types:

Figure 1.2: Unstructured data types

Semi-structured data

Semi-structured data carries some organizing markers, such as tags or key-value pairs, but does not fit the rigid rows and columns of a table. JSON (JavaScript Object Notation) files, BibTeX files, CSV files, tab-delimited text files, XML, and other markup languages are examples of semi-structured data found on the web. Semi-structured data represents only 5% to 10% of all data in the world. The following figure 1.3 shows an example of JSON data:

Figure 1.3: JSON data
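
To make the idea concrete, here is a small sketch that parses a JSON record with Python's built-in json module. The record and its field names are hypothetical and do not reproduce figure 1.3. The point is that fields can be nested or missing, unlike the columns of a table.

```python
import json

# Hypothetical JSON record: the field names are illustrative only.
raw = """
{
  "merchant": "Acme Traders",
  "contact": {"email": "info@example.com"},
  "tags": ["grocery", "wholesale"]
}
"""

record = json.loads(raw)

merchant_name = record["merchant"]       # top-level field
email = record["contact"]["email"]       # nested field
phone = record["contact"].get("phone")   # absent field, so .get() returns None
tag_count = len(record["tags"])          # list-valued field

print(merchant_name, email, phone, tag_count)
```

Using `.get()` for optional keys is one common way to handle records that do not all share the same fields.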

What is data science?

It’s become a universal truth that modern businesses are awash with data. Last year, McKinsey estimated that Big Data initiatives in the US healthcare system could account for $300 billion to $450 billion in reduced healthcare spending, or 12 to 17 percent of the $2.6 trillion baseline in US healthcare costs. On the other hand, bad or unstructured data is estimated to cost the US roughly $3.1 trillion a year.

Data-driven decision making is growing in popularity. Finding and accessing information in unstructured data is complex and cannot be done easily with BI tools alone; this is where data science comes into the picture.

Data science is a field that extracts knowledge and insights from raw data. To do so, it draws on mathematics, statistics, computer science, and programming. A person who has all these skills is known as a data scientist. Being a data scientist is all about being curious, self-driven, and passionate about finding answers. The following figure 1.4 shows the skills that a modern data scientist should have:

Figure 1.4: Skills of a modern data scientist

What does a data scientist do?

Most data scientists in the industry have advanced training in statistics, math, and computer science. Their experience is a vast horizon that also extends to data visualization, data mining, and information management. The primary job of a data scientist is to ask the right question. It’s about surfacing hidden insight that can help enable companies to make smarter business decisions.

The job of a data scientist is not bound to a particular domain. Apart from scientific research, data scientists work in domains including shipping, healthcare, e-commerce, aviation, finance, and education. They start by understanding the business problem and then proceed through data collection, reading the data, transforming it into the required format, visualizing, modeling, evaluating the model, and finally deployment. You can picture their work cycle as shown in the following figure 1.5:

Figure 1.5: Work cycle of a data scientist

Eighty percent of a data scientist’s time is spent in simply finding, cleansing, and organizing data, leaving only 20 percent to perform analysis. These processes can be time-consuming and tedious. But it’s crucial to get them right since a model is only as good as the data that is used to build it. And because models generally improve as they are exposed to increasing amounts of data, it’s in the data scientists’ interests to include as much data as they can in their analysis.
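To give a feel for what this cleansing work looks like, here is a minimal sketch using only Python's standard library. The sample data, the column names, and the cleaning rules (trim spaces, drop incomplete rows, drop duplicates) are assumptions chosen for illustration; real projects usually need many more steps.

```python
import csv
import io

# Hypothetical raw data with stray spaces, a missing value, and a duplicate row.
raw_csv = """name,age
 Asha ,34
Ravi,
Meera,29
Meera,29
"""

clean_rows = []
seen = set()
for row in csv.DictReader(io.StringIO(raw_csv)):
    name = row["name"].strip()
    age = row["age"].strip()
    if not name or not age:      # drop rows with missing values
        continue
    key = (name, age)
    if key in seen:              # drop exact duplicates
        continue
    seen.add(key)
    clean_rows.append({"name": name, "age": int(age)})

print(clean_rows)
```

Only two of the four rows survive, and each value now has a consistent type that a model could use.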

In the later chapters of this book, you will learn all the above-required skills to be a data scientist.

Real-world use cases of data science

Information is the oil of the 21st century, and analytics is the combustion engine. Whether you are uploading a picture on Facebook, posting a tweet, emailing somebody, or shopping on an e-commerce site, data science plays a role. In the modern workplace, data science is applied to many problems to predict and calculate outcomes that would have taken several times more human hours to process. The following are some real-world examples where data scientists are playing a key role:

Google’s AI research arm is taking the help of data scientists to build the best performing algorithm for automatically detecting objects.

Amazon has built a product recommendation system to personalize their product.

The Santander banking group has built a model with the help of data scientists to identify the value of transactions for each potential customer.

Airbus in the maritime industry is taking the help of data scientists to build a model that detects all ships in satellite images as quickly as possible to increase knowledge, anticipate threats, trigger alerts, and improve efficiency at sea.

YouTube is using an automated video classification model that runs within a limited memory budget.

Data scientists at the Chinese internet giant Baidu Inc. released details of a new deep learning algorithm that they claim can help pathologists identify tumors more accurately.

The Radiological Society of North America (RSNA®) is using an algorithm to detect a visual signal for pneumonia in medical images by automatically locating lung opacities on chest radiographs.

The Inter-American Development Bank is using an algorithm that considers a family’s observable household attributes like the material of their walls and ceiling, or the assets found in the home to classify them and predict their level of need.

Netflix applies data science to movie viewing patterns to understand what drives user interest, and uses those findings to decide which Netflix original series to produce.

Why Python for data science?

Python is very beginner friendly. The syntax (words and structure) is extremely simple to read and follow, most of which can be understood even if you do not know any programming. Python is a multi-paradigm programming language – a sort of Swiss Army knife for the coding world. It supports object-oriented programming, structured programming, and functional programming patterns, among others. There’s a joke in the Python community that Python is generally the second-best language for everything.
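As a rough illustration of what multi-paradigm means, the sketch below computes the same result (the sum of squares of a few numbers) in structured, functional, and object-oriented style. The names used here, such as SquareSummer, are invented for this example.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Structured (procedural) style: an explicit loop and accumulator.
total_loop = 0
for n in numbers:
    total_loop += n * n

# Functional style: map and reduce without mutable state.
total_functional = reduce(lambda acc, sq: acc + sq, map(lambda n: n * n, numbers), 0)

# Object-oriented style: data and behaviour bundled in a class.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)

total_oop = SquareSummer(numbers).total()
print(total_loop, total_functional, total_oop)
```

All three approaches give the same answer, so you can pick whichever style reads best for the task at hand.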

Python is free, open-source software, and consequently anyone can write a library package to extend its functionality. Data science has been an early beneficiary of these extensions, particularly Pandas, the big daddy of them all.

Python’s inherent readability and simplicity make it relatively easy to pick up, and the number of dedicated analytical libraries available today means that data scientists in almost every sector will find packages already tailored to their needs, freely available for download.

The following survey (figure 1.6), conducted by KDnuggets, a leading site on business analytics, Big Data, data mining, data science, and machine learning, clearly shows that Python is the preferred choice for data science and machine learning:

Figure 1.6: Survey by KDnuggets

Conclusion

Most people think that it is very difficult to become a data scientist. But let me be clear: it is not tough!

If you love making discoveries about the world, and if you are fascinated by machine learning, then you can break into the data science industry no matter what your situation is. This book will push you to learn, improve, and master data science skills on your own. There is only one thing you need to keep doing: LEARN-APPLY-REPEAT. In the next chapter, we will set up our machine and get ready for our data science journey.

CHAPTER 2

Installing Software and System Setup

In the last chapter, we covered the data science fundamentals, and now we are ready to move ahead and prepare our system for data science. In this chapter, we will learn about the most popular Python data science platform – Anaconda. With this platform, you don't need to install Python explicitly – just one installation in your system (Windows, macOS, or Linux) and you are ready to use the industry-standard platform for developing, testing, and training.

Structure

System requirements

Downloading Anaconda

Installing Anaconda on Windows

Installing Anaconda on Linux

How to install a new Python library in Anaconda

Open your notebook – Jupyter

Know your notebook

Objective

After studying this chapter, you should be able to install Anaconda in your system successfully and use the Jupyter notebook. You will also run your first Python program in your notebook.

System requirements

System architecture: 64-bit x86, 32-bit x86 with Windows or Linux, Power8, or Power9

Operating system: Windows Vista or newer, 64-bit macOS 10.10+, or Linux, including Ubuntu, RedHat, CentOS 6+

Minimum 3 GB disk space to download and install

Downloading Anaconda

You can download the Anaconda Distribution from the following link:

https://www.anaconda.com/download/

Once you click on the preceding link, you will see the following screen (as shown in figure 2.1):

Figure 2.1: Anaconda Distribution download page

Anaconda Distribution shows different OS options: Windows, macOS, and Linux. Select the option that matches your OS. For this example, I have selected the 64-Bit Graphical Installer (457 MB) for Windows, as shown in the following figure 2.2:

Figure 2.2: Anaconda Distribution installer versions for Windows OS

The Python community has ended support for Python 2.x, so it is highly recommended that you use Python 3.x. We are going to use Python 3.8 throughout this book, so I recommend downloading this version only. To download the distribution, see the two links just below the Download button; they show the Graphical Installer for each system architecture type: 64-bit or 32-bit. Click on the appropriate link, and the download will start. The download process is the same for macOS and Linux.

Installing Anaconda on Windows

Once the downloading is complete, double click on the installer to launch (the recommended way is to run the installer with admin privileges).

Click Next, accept the terms, select the users – Just Me or All Users and click Next.

Select the default destination folder or add a custom location to install Anaconda, copy this path for later use, and click Next.

Install Anaconda to a directory path that does not contain spaces or Unicode characters.

Uncheck the first option, Add Anaconda to my PATH environment variable, if it is already checked. Then click Install and wait until the installation is completed.

Click Next, click Skip, and then click Finish.

Now open the Advanced system settings in your machine and add the following two values in your PATH environment variable:

C:\Users\prateek\Anaconda3

C:\Users\prateek\Anaconda3\Scripts

Here, replace C:\Users\prateek\Anaconda3 with the actual path of your Anaconda installation folder that you copied earlier.

Save the settings and restart your system.

Verify your installation by clicking the Windows icon in the taskbar or by typing Anaconda in the search bar. You will see the Anaconda Navigator option; click it, and the following screen will appear (as shown in figure 2.3):

Figure 2.3: Anaconda Navigator
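If you also want to confirm that the PATH entries took effect, one possible check (a sketch, not part of the official installation steps) is to ask Python where it finds the conda executable. It uses only the standard library; the messages printed are my own wording.

```python
import os
import shutil

# shutil.which searches the PATH directories for an executable, like the shell does.
conda_path = shutil.which("conda")
if conda_path is None:
    print("conda was not found on PATH; re-check the environment variable settings.")
else:
    print("conda found at:", conda_path)

# List the PATH entries that mention Anaconda (case-insensitive), if any.
anaconda_entries = [
    p for p in os.environ.get("PATH", "").split(os.pathsep)
    if "anaconda" in p.lower()
]
print(anaconda_entries)
```

If conda is not found, the two folders added to PATH earlier are the first thing to re-check.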

Installing Anaconda with the Graphical Installer on macOS follows the same steps as described above for Windows.

Installing Anaconda on Linux

After downloading the 64-bit (x86) installer, run the following two commands to check the data integrity:

md5sum /path/filename

sha256sum /path/filename

Replace /path/filename with the actual path and filename of the file you downloaded.
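If those commands are not available on your system, the same SHA-256 check can be sketched in Python with the standard hashlib module. The helper name file_sha256 is my own, and /path/filename is the same placeholder as above; compare the printed value with the hash published for your installer.

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Example with the placeholder path; uncomment and adjust before running.
# print(file_sha256("/path/filename"))
```

Reading the file in chunks matters here because the installer is several hundred megabytes.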

Enter the following command to install Anaconda for Python 3.8, replacing ~/Downloads/ with the path to the file you downloaded:

Figure 2.4: Installing Anaconda in Linux

Choose Install Anaconda as a user unless root privileges are required. The installer prompts – In order to continue the installation process, please review the license agreement. Press Enter to view the license terms.

Scroll to the bottom of the license terms and enter Yes to agree. The installer prompts you to press Enter to accept the default install location, press CTRL + C to cancel the installation, or specify an alternate installation directory. If you accept the default install location, the installer displays PREFIX=/home//anaconda<3> and continues the installation. It may take a few minutes to complete.

The installer prompts – Do you wish the installer to prepend the Anaconda<3> install location to PATH in your /home//.bashrc? Enter Yes.

If you enter No, you must manually add the path to Anaconda or conda will not work.

The installer describes Microsoft VS Code and asks if you would like to install the VS Code. Enter yes or no. If you select yes, follow the instructions on the screen to complete the VS Code installation.

Installing VS Code with the Anaconda installer requires an internet connection. Offline users may be able to find an offline VS Code installer from Microsoft.

The installer finishes and displays – Thank you for installing Anaconda<3>! Close and open your terminal window for the installation to take effect, or you can enter the command source ~/.bashrc.

After your installation is complete, verify it by opening Anaconda Navigator, a program that is included with Anaconda – open a Terminal window and type anaconda-navigator. If Navigator opens, you have successfully installed Anaconda.

You can find some known issues while installing Anaconda and their solutions in the following link: https://docs.anaconda.com/anaconda/user-guide/troubleshooting/

How to install a new Python library in Anaconda?

Most of the Python libraries/packages are preinstalled with the Anaconda Distribution, which you can verify by typing the following command in an Anaconda Prompt:

conda list

Figure 2.5: Anaconda Prompt

If you need a Python package that is not in the preceding list, follow these steps. In the same Anaconda Prompt terminal, type conda install followed by the name of the package.

For example, to install the scipy package, type conda install scipy, press Enter, and then enter y to continue.

A second recommended approach is to search Google for conda install followed by the package name and then open the first search result, as follows:

For example, to find the imageio package, I search Google for conda install imageio.

Go to the first search result; this will open the Anaconda official site showing the installers of the searched package. In our example, it is like https://anaconda.org/menpo/imageio

Now copy the text under To install this package with conda run: and paste it in the Anaconda Prompt. In our case, the text is: conda install -c menpo imageio
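After installing a package, you may want to confirm that Python can actually see it. One way to do that, sketched here with the standard importlib module, is shown below; the helper name is_installed is my own, and it checks the import name of the package, which can differ from the name used with conda install.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the top-level package can be imported in the current environment."""
    return importlib.util.find_spec(package_name) is not None

print(is_installed("json"))    # part of the standard library
print(is_installed("scipy"))   # depends on your environment
```

Run it from the same environment where you installed the package, for example inside a Jupyter notebook launched from Anaconda.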

Open your notebook – Jupyter

After installing Anaconda, the next step is to open the Jupyter Notebook, an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text. To do so, open Anaconda Navigator and click the Launch button under the Jupyter Notebook icon, or type Jupyter Notebook in the Windows search bar and select it, as shown in the following figure 2.6:

Figure 2.6: Windows search bar

Once you select it, a browser window (your system's default browser) will open showing the notebook, as shown in the following figure 2.7:

Figure 2.7: Browser window

Know your notebook

Once your notebook is opened in the browser, click on the New dropdown and select the default first option – Python 3 as shown in the following figure 2.8:

Figure 2.8: Dropdown menu

After clicking on Python 3 option, a new tab will be opened containing the new untitled notebook, as shown in the following figure 2.9:

Figure 2.9: New tab

Rename your notebook with a proper name by double-clicking on the Untitled text and then enter any new name (I have named it MyFirstNotebook) and click Rename (refer to the following figure 2.10):

Figure 2.10: Rename

The preceding step will rename your notebook. Now it's time to run your first Python program in your first notebook. We will print a greeting message in Python for this purpose. In the cell (text bar) just type any welcome message inside the print block as shown in the following figure 2.11:

Figure 2.11: Welcome message

In the above cell, we are printing a string in Python 3. To run this program, press the Shift + Enter keys together or click the Play button just below the cell column (refer to the following figure 2.12):

Figure 2.12: Play button

Once you run the cell, your program will run and give you the output, as shown just below the cell in the following figure 2.13:

Figure 2.13: Output
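In case the figures are hard to read, a cell along these lines does the job. The greeting text is just an example; any message of your choice works.

```python
message = "Welcome to the world of data science!"  # any greeting text works here
print(message)
```

When you run the cell, the message appears directly below it.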

Congrats! You have successfully run your first program in Python 3. This is just one line of code using simple plain English text. Let's explore the simplicity of Python some more by doing some mathematical calculations.

Let's add two numbers by entering the FirstNumber + SecondNumber and then run it as shown in the following figure 2.14:

Figure 2.14: Simple calculation
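A cell for this calculation might look like the following sketch. The two values and the variable names are arbitrary choices for illustration.

```python
first_number = 5
second_number = 7
result = first_number + second_number
print(result)
```

In a notebook you can also type first_number + second_number on the last line of a cell, and Jupyter will display the result without an explicit print.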

Quite interesting, right! Let's move ahead and ask the user to input numbers and let Python do the homework. In the following example, you need



Book preview

Practical Data Science with Jupyter - Prateek Gupta

CHAPTER 1

Data Science Fundamentals

Structure

Objective

What is data?

Structured data

Unstructured data