Hands-on Data Analysis and Visualization with Pandas: Engineer, Analyse and Visualize Data, Using Powerful Python Libraries

Ebook, 542 pages, 3 hours

By PURNA CHANDER RAO. KATHULA

About this ebook

The book starts with quick introductions to Python and its ecosystem libraries for data science, such as JupyterLab, NumPy, Pandas, SciPy, Matplotlib, and Seaborn.

This book will help you learn Python data structures and essential concepts required for data engineering, such as functions, lambdas, list comprehensions, and datetime objects. It also gives an in-depth understanding of the Python data science packages: JupyterLab is used as an IDE for writing, documenting, and executing Python code; NumPy is used for numerical computation; and Pandas is used for cleaning and reorganizing data, handling large datasets, and merging dataframes to get meaningful insights. You will go through statistics to understand the relationships between variables using SciPy, and build visualization charts using the Matplotlib and Seaborn libraries.

Language: English

Publisher: BPB Online LLP

Release date: Aug 13, 2020

ISBN: 9789389845655

CHAPTER 1

Introduction to Data Analysis

Data analysis is both an art and a science: the practice of extracting insights from silos of data. This chapter introduces you to data and its ecosystem components, the different stages of the data analysis process, why Python is useful for data analysis, and the main data science libraries and modules along with their installation process.

Structure

Inspiration for data analysis

What is data science?

Domain expertise

Maths and statistics

Artificial intelligence

Machine learning

Data infrastructure

Data analysis process

Business requirements

Data collection

Data cleansing

Data exploration and visualization

Data modeling

Model validation and testing

Deployment

Why Python for data analysis?

Python libraries for data analysis

Objective

This chapter guides you through the data analysis process and the concepts, such as maths and statistics, that make up this discipline. The concepts covered here prepare you for the coming chapters, where they are applied as Python code using different data-related libraries.

Inspiration for data analysis

In this section, we cover the factors and trends that drive data analysis. In today's digitalized world, huge amounts of data are produced by IoT devices such as sensors, by diagnostic reports from the healthcare and wellness industry, by social networks such as Facebook, YouTube, LinkedIn, and Instagram, and by e-commerce sites such as Alibaba, Amazon, and Flipkart. You generate data whenever you upload audio or video, post a comment, add a like or an emoji, make a bank transaction online, withdraw money at an ATM kiosk, or buy something on an e-commerce site.

This data is not exactly useful information. It is the result of processing, which takes into account a certain set of data that extracts some set of conclusions that can be used in different ways. This process of extracting information from the raw data is data analysis. This analysis of the data becomes the foundation for building predictive models or drawing data visualization charts around the data.

Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.

- Geoffrey Moore, author and consultant

What is data science?

Data science is the study of data. It is a multidisciplinary field that combines maths, statistics, algorithms, domain expertise, processes, and systems to extract insights from data. This data may be structured, semi-structured, or unstructured. Figure 1.1 shows the different structures of data:

Figure 1.1

Structured data

Tabular rows and columns (Databases)

Data warehouses (DWH), such as Teradata systems, and BI systems

Delimited text files, such as comma-separated (.csv) and tab-separated (.tsv) files.

Semi-structured data

Excel, XML, JSON, Logs.

Unstructured data

Audio, Video, Images.
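To make the difference between structured and semi-structured data concrete, here is a minimal sketch that uses only Python's standard library. The sample orders, field names, and values are hypothetical and exist only for illustration. The CSV text has the same columns in every row, while the JSON records nest values and may omit fields, so they have to be flattened before they fit a table.

```python
import csv
import io
import json

# Structured data: every row has the same columns (hypothetical sample).
csv_text = "order_id,customer,amount\n1,Asha,250.0\n2,Ravi,99.5\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured data: records may nest values or leave fields out.
json_text = (
    '[{"order_id": 1, "customer": {"name": "Asha"}, "tags": ["gift"]},'
    ' {"order_id": 2, "customer": {"name": "Ravi"}}]'
)
records = json.loads(json_text)

# Flatten the nested records into the same tabular shape as the CSV rows.
flat = [
    {
        "order_id": rec["order_id"],
        "customer": rec["customer"]["name"],
        "tags": ",".join(rec.get("tags", [])),  # a missing field becomes ""
    }
    for rec in records
]

print(rows)
print(flat)
```

Note that the CSV reader returns every value as a string, which is one reason a cleansing step is usually needed later in the process.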

Domain expertise

Domain expertise, or domain knowledge, is expertise in a particular field such as healthcare, insurance, or banking. A domain expert may or may not have a technology background but has in-depth knowledge of a particular industry and of the trends and practices that affect it. Data analysis requires not only skill with tools and computational techniques but also a good understanding of the data. In short, the data analyst must know how to search not only for data but also for information, and how to treat that information to get valid insights from it.

For example, suppose you are asked to build an application for the e-commerce, banking, or insurance domain. The application has to fit the industry and its various dimensions. The technical team may not know the industry norms or the features the application needs; this is where a domain expert and domain knowledge come into the picture.

Maths and statistics

Mathematical statistics is the study of statistics from a mathematical point of view. Data analysis requires a good amount of maths, and a good knowledge of statistics as well, because statistical methods are applied to analyze and interpret the data. Python provides many libraries for solving mathematical and statistical problems, but you should still have a good idea of how those libraries work.
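As a small illustration of knowing what a library does under the hood, the sketch below computes a mean and a population standard deviation twice: once with Python's built-in statistics module and once by hand from the definitions. The sample values are made up for the example.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

# Results from the standard library.
mean = statistics.mean(values)
pstdev = statistics.pstdev(values)  # population standard deviation

# The same quantities computed directly from their definitions.
manual_mean = sum(values) / len(values)
manual_pstdev = math.sqrt(
    sum((v - manual_mean) ** 2 for v in values) / len(values)
)

print(mean, pstdev)
print(manual_mean, manual_pstdev)
```

Both approaches should agree; the manual version simply makes the formula visible.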

Artificial intelligence

Artificial intelligence is the intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans. It is often treated as a superset of data science and is one of the advanced concepts in data analysis. It is the study of training computers to do jobs that are normally done by humans. The term combines two words: artificial, meaning something that is not natural, that is, human-made, and intelligence, meaning the ability to think or understand.

AI is already widespread, and you interact with it on a daily basis. Here are a few examples of artificial intelligence:

Search engines like Google use large-scale algorithms internally to return better search results.

Self-driving cars, where the vehicle can navigate its way from one point to another entirely on its own.

Chatbots act as online messengers that assist customers immediately and effectively.

Voice searches on smartphones use AI to determine the best result for long-tail keywords and conversational queries.

Online ads use AI to target specific customers based on past behavior, interests, and search queries.

Machine learning

Machine learning is an algorithm-driven field of study that makes computers capable of learning from their own previous experience and improving their performance on a task. It is a subset of artificial intelligence in which machines learn by themselves without being explicitly programmed. Suppose you are asked to write a speech recognition program that converts speech to text while accounting for accent, grammar, pronunciation, and vocabulary. Hand-coding rules for every case would be a gigantic task, but it is one that machine learning handles well.

Machine learning is commonly divided into three types, explained as follows:

Supervised learning

In supervised learning, we ask the machine questions, compare its answers with the actual answers, and instruct it to minimize the errors. Supervised machine learning is used for tasks such as:

Weather forecasting.

Detecting online fraud.

Market forecasting.

Image classification.
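The idea of comparing answers with the actual answers and minimizing the error can be sketched with the simplest supervised model, a straight line fitted by ordinary least squares. This is only an illustration in plain Python; the function name fit_line and the hours/scores data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    The returned line minimizes the sum of squared errors between
    the predicted and the actual y values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled examples: hours studied -> exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # prediction for an unseen input
print(slope, intercept, predicted)
```

The labeled pairs play the role of the "actual answers"; once the line is fitted, it can answer questions it has not seen, such as the score for six hours.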

Unsupervised learning

In unsupervised learning, you give the machine large amounts of unlabeled data and instruct it to find patterns; based on these patterns, the machine accomplishes certain tasks. Unsupervised machine learning is used for tasks such as:

Building recommendation engines

Targeted marketing

Customer segmentation
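As a rough sketch of finding patterns without labels, the following plain-Python example runs a tiny one-dimensional k-means to segment customers by spend. The helper kmeans_1d, the starting centers, and the spend figures are all assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center,
    then move each center to the average of its assigned values."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer; no labels are provided.
spend = [10, 12, 11, 95, 100, 105]
centers, clusters = kmeans_1d(spend, centers=[0.0, 50.0])
print(centers)
print(clusters)
```

Nobody told the algorithm which customers are low or high spenders; the two groups emerge from the data alone.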

Reinforcement learning

In reinforcement learning, the machine is placed in an environment where it receives a reward when it does what we want and a penalty when it performs incorrectly. We instruct the machine to maximize the reward, and eventually it learns to do what we want. Reinforcement learning is applied to:

Games

Bidding and advertising

Training self-driving cars
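The reward-and-penalty loop can be sketched with a very small epsilon-greedy agent. Everything here is a toy assumption: the environment has two actions, "right" earns a reward of +1, and "left" earns a penalty of -1. The agent is never told this rule; it only sees the rewards.

```python
import random


def train(steps=50, epsilon=0.2, alpha=0.5, seed=0):
    """Learn action values from rewards with an epsilon-greedy policy."""
    rng = random.Random(seed)
    actions = ["left", "right"]
    q = {a: 0.0 for a in actions}  # estimated value of each action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)      # explore a random action
        else:
            action = max(actions, key=q.get)  # exploit the best estimate
        # Hypothetical environment: reward for "right", penalty for "left".
        reward = 1.0 if action == "right" else -1.0
        # Move the estimate a step toward the observed reward.
        q[action] += alpha * (reward - q[action])
    return q


q_values = train()
best_action = max(q_values, key=q_values.get)
print(q_values, best_action)
```

After training, the action with the highest estimated value is the one the environment rewards, which is the behavior we wanted the machine to learn.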

Data infrastructure

People generally use the word infrastructure for the things that support what they do at work. For example, roads, bridges, and sewage systems are all considered infrastructure. The role of data infrastructure is to protect, preserve, process, move, secure, and serve data and the applications that deliver information services. Data infrastructure includes software, hardware, cloud or managed services, servers, storage, and so on.

The big data world generates a humongous amount of information that needs to be processed. Sometimes ordinary desktop systems or servers don't have enough computing power to read, process, or analyze it. We need systems with plenty of RAM and enough disk space to hold the data. Cloud platforms such as Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure help us meet these challenges through resource allocation and virtualization.

Data analysis process

Data analysis is a series of steps in which raw data is transformed and processed to produce insights and to make predictions. The processing includes mathematical and statistical approaches, along with charts or graphs for data visualization. Data analysis can therefore be represented as a process chain consisting of the sequence of stages shown in Figure 1.2:

Figure 1.2

Let's discuss these processes in detail.

Business requirements

Data analysis starts with a problem to be solved, and that problem needs to be defined clearly: for example, predicting the stock price of a company, identifying fraudulent credit card transactions, or detecting tumors from health data.

Data collection

The data must be chosen with the basic purpose of building a predictive model. This is often the most tedious task, because to analyze anything we first need data. Clients mostly share data as comma-separated, tab-delimited, or pipe-delimited files. Not all data is available in files or databases, however; some of it exists only as HTML pages, and the process of collecting data from web pages is called web scraping. Python libraries such as Scrapy, Beautiful Soup, and Requests help in scraping data from web pages.
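To show what scraping involves without depending on a network connection or a third-party package, here is a minimal sketch that uses the standard library's html.parser as a stand-in for a library such as Beautiful Soup. The HTML snippet, the product names, and the CellCollector class are hypothetical; a real scraper would first download the page.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])   # start a new row
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell and self.rows:
            self.rows[-1].append(data.strip())


# Hypothetical page content standing in for a downloaded web page.
html = (
    "<table>"
    "<tr><td>Widget</td><td>19.5</td></tr>"
    "<tr><td>Gadget</td><td>42.0</td></tr>"
    "</table>"
)

parser = CellCollector()
parser.feed(html)
prices = {name: float(price) for name, price in parser.rows}
print(prices)
```

The result is a plain dictionary that can be handed to the later stages of the analysis like any other collected data.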

Data cleansing

This stage seems less problematic than the others but often requires more resources and time to complete. The collected data may come from different sources, such as Excel, CSV, JSON, or Parquet files, or pages scraped from the web, and each source may represent data differently: a date field might arrive as a string, or an integer might be read as a float. All of this data needs to be cleaned before analysis. Cleansing includes handling invalid data, ambiguous or missing values, and outliers.
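The kinds of fixes described above can be sketched in a few lines of plain Python. The records, the two date formats, and the helper names parse_date and parse_number are assumptions chosen for the example; real data will need its own rules.

```python
from datetime import datetime

# Hypothetical raw records collected from different sources.
raw = [
    {"date": "2020-08-13", "qty": "3.0", "price": "19.99"},
    {"date": "13/08/2020", "qty": "2", "price": ""},
    {"date": "2020-08-14", "qty": "n/a", "price": "5.00"},
]


def parse_date(text):
    """Try each known date format; return None if none of them match."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def parse_number(text, cast=float):
    """Convert text to a number; return None for missing or invalid values."""
    try:
        return cast(float(text))
    except ValueError:
        return None


clean = [
    {
        "date": parse_date(r["date"]),
        "qty": parse_number(r["qty"], int),  # "3.0" read as a float becomes 3
        "price": parse_number(r["price"]),
    }
    for r in raw
]

# Keep only the records that have no missing values.
complete = [r for r in clean if None not in r.values()]
print(clean)
print(complete)
```

Dropping incomplete records is only one option; depending on the problem, missing values might instead be filled in or flagged.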

Data exploration and visualization

Exploration is the process of representing the data graphically and statistically to find patterns, connections, and relationships between variables. Python libraries such as Matplotlib and Seaborn help us visualize the data. Chart types such as heatmaps, box plots, violin plots, and scatter plots help us understand patterns, outliers, and relationships better. Exploration also includes one or more of the following activities:

Grouping the data

Summarizing the data

Constructing regression models to find deviations in the data
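Grouping and summarizing can be illustrated with nothing more than the standard library. The sales figures below are invented for the example; in pandas the same idea is typically expressed with a group-by operation on a DataFrame.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (region, sale amount) observations.
sales = [
    ("north", 120),
    ("south", 80),
    ("north", 100),
    ("south", 90),
    ("east", 50),
]

# Grouping: collect the amounts that belong to each region.
groups = defaultdict(list)
for region, amount in sales:
    groups[region].append(amount)

# Summarizing: reduce each group to a few descriptive numbers.
summary = {
    region: {"count": len(v), "total": sum(v), "mean": mean(v)}
    for region, v in groups.items()
}
print(summary)
```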

Data modeling

Data modeling is the process of choosing a suitable statistical model to predict the result. After data exploration, we need to develop a mathematical model that encodes the relationships in the data. These models are divided according to the result they produce:

Classification: If the result obtained by the model is categorical.

Regression: If the result obtained by the model is numerical.

Clustering: If the model groups the data points to reveal valuable insights.

Python's scikit-learn library provides methods such as linear regression, logistic regression, classification trees, SVM, AdaBoost, and k-nearest neighbors to generate these models.
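To show what a model with a categorical result looks like, here is a hand-written k-nearest neighbors classifier in plain Python. It is a sketch of the technique only, not scikit-learn's implementation, and the function knn_predict and the measurements are hypothetical.

```python
from collections import Counter


def knn_predict(train, point, k=3):
    """Classify `point` by majority vote among its k nearest examples."""
    by_distance = sorted(
        train,
        # Squared Euclidean distance is enough for ranking neighbors.
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: (height_cm, weight_kg) -> size label.
train = [
    ((150, 50), "small"), ((152, 53), "small"), ((155, 55), "small"),
    ((180, 85), "large"), ((183, 90), "large"), ((178, 82), "large"),
]

label_a = knn_predict(train, (181, 88))
label_b = knn_predict(train, (151, 52))
print(label_a, label_b)
```

Because the output is one of a fixed set of labels, this is a classification model; a model that returned a number, such as a predicted weight, would be a regression model.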

Model validation and testing

Model validation is divided into a training phase and a testing phase. Typically the data is randomly split, for example 70 percent for training and 30 percent for testing. The model is trained on the 70 percent, and its predictions are then compared with the remaining 30 percent of test data. There are several techniques to validate the effectiveness of the model; the most popular is k-Fold
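The 70/30 split and the idea of k folds can be sketched in plain Python as follows. The helper names train_test_split and k_fold_indices, the seed, and the ten-item dataset are assumptions for illustration; libraries such as scikit-learn ship their own, more complete utilities for this.

```python
import random


def train_test_split(items, test_ratio=0.3, seed=42):
    """Shuffle a copy of the data and split it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


data = list(range(10))  # hypothetical dataset of ten observations

train_part, test_part = train_test_split(data)
folds = list(k_fold_indices(len(data), 5))
print(len(train_part), len(test_part))
print(folds[0])
```

With k folds, every observation is used for testing exactly once and for training in the remaining rounds, which gives a steadier estimate than a single split.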
