Team knowledge-sharing presentation covering classical statistics vs modern machine learning, including linear regression, logistic regression, neural networks, and deep learning, using Python and R
This document discusses and provides examples of supervised and unsupervised learning. Supervised learning involves using labeled training data to learn relationships between inputs and outputs and make predictions. An example is using data on patients' attributes to predict the likelihood of a heart attack. Unsupervised learning involves discovering hidden patterns in unlabeled data by grouping or clustering items with similar attributes, like grouping fruits by color without labels. The goal of supervised learning is to build models that can make predictions when new examples are presented.
The document summarizes key concepts in machine learning, including defining learning, types of learning (induction vs discovery, guided learning vs learning from raw data, etc.), generalisation and specialisation, and some simple learning algorithms like Find-S and the candidate elimination algorithm. It discusses how learning can be viewed as searching a generalisation hierarchy to find a hypothesis that covers the examples. The candidate elimination algorithm maintains the version space - the set of hypotheses consistent with the training examples - by updating the general and specific boundaries as new examples are processed.
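Since Find-S is simple enough to state in a few lines, here is a minimal Python sketch of it; the representation (tuples of attribute values with a boolean label) and the toy "enjoy sport" style data are illustrative assumptions, not the document's own example.

```python
def find_s(examples):
    """Return the most specific hypothesis covering all positive examples.

    A hypothesis is a tuple where a concrete value must match exactly
    and '?' matches any value.
    """
    hypothesis = None
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # start maximally specific
        else:
            # Generalize each attribute that disagrees with the example
            hypothesis = [h if h == a else '?'
                          for h, a in zip(hypothesis, attributes)]
    return tuple(hypothesis) if hypothesis else None

# Toy data: (sky, temp, humidity), label
examples = [
    (('sunny', 'warm', 'normal'), True),
    (('sunny', 'warm', 'high'),   True),
    (('rainy', 'cold', 'high'),   False),
]
print(find_s(examples))  # ('sunny', 'warm', '?')
```

The candidate elimination algorithm generalizes this idea by also maintaining a general boundary, so the full version space is tracked rather than a single most-specific hypothesis.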
Anomaly detection (or outlier analysis) is the identification of items, events or observations which do not conform to an expected pattern or to other items in a dataset. It is used in applications such as intrusion detection, fraud detection, fault detection and process monitoring in various domains including energy, healthcare and finance.
In this workshop, we will cover the core techniques in anomaly detection and discuss advances in deep learning in this field.
Through case studies, we will discuss how anomaly detection techniques can be applied to various business problems. We will also demonstrate examples using R, Python, Keras and TensorFlow to help reinforce concepts in anomaly detection and best practices in analyzing and reviewing results.
What you will learn:
Anomaly Detection: An introduction
Graphical and Exploratory analysis techniques
Statistical techniques in Anomaly Detection
Machine learning methods for Outlier analysis
Evaluating performance in Anomaly detection techniques
Detecting anomalies in time series data
Case study 1: Anomalies in Freddie Mac mortgage data
Case study 2: Auto-encoder based Anomaly Detection for Credit risk with Keras and TensorFlow (a minimal sketch of the idea follows below)
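To make the auto-encoder idea concrete, here is a minimal Keras sketch of reconstruction-error-based anomaly detection. The synthetic data, network sizes, and 99th-percentile threshold are placeholder assumptions, not the case study's actual setup.

```python
import numpy as np
from tensorflow import keras

# Train on "normal" records only; anomalies should reconstruct poorly.
rng = np.random.default_rng(42)
x_train = rng.normal(0, 1, size=(1000, 20)).astype("float32")

model = keras.Sequential([
    keras.layers.Dense(8, activation="relu", input_shape=(20,)),
    keras.layers.Dense(3, activation="relu"),   # bottleneck
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(20, activation="linear"),
])
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, x_train, epochs=10, batch_size=32, verbose=0)

# Flag records whose reconstruction error is far above the training norm.
recon = model.predict(x_train, verbose=0)
errors = np.mean((x_train - recon) ** 2, axis=1)
threshold = np.percentile(errors, 99)  # assumed cutoff for illustration
anomalies = errors > threshold
```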
Explainable AI (XAI) is becoming a must-have non-functional requirement (NFR) for most AI-enabled product or solution deployments. Keen to hear viewpoints and explore collaboration opportunities.
Active learning is a machine learning technique where the learner is able to interactively query the oracle (e.g. a human) to obtain labels for new data points in an effort to learn more accurately from fewer labeled examples. The learner selects the most informative samples to be labeled by the oracle, such as samples closest to the decision boundary or where models disagree most. This allows the learner to minimize the number of labeled samples needed, thus reducing the cost of training an accurate model. Suggested improvements include querying batches of samples instead of single samples and accounting for varying labeling costs.
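As a concrete illustration of the query strategy described (picking the samples the model is least certain about, i.e. closest to the decision boundary), here is a minimal uncertainty-sampling loop with scikit-learn; the synthetic dataset, seed size, and query budget are assumptions for the sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
labeled = list(range(10))              # start with a few labeled points
unlabeled = list(range(10, len(X)))

model = LogisticRegression(max_iter=1000)
for _ in range(20):                    # query budget of 20 labels
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[unlabeled])
    # Query the sample the model is least certain about.
    uncertainty = 1 - proba.max(axis=1)
    pick = unlabeled[int(np.argmax(uncertainty))]
    labeled.append(pick)               # the "oracle" reveals y[pick]
    unlabeled.remove(pick)
```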
This document discusses machine learning and various applications of machine learning. It provides an introduction to machine learning, describing how machine learning programs can automatically improve with experience. It discusses several successful machine learning applications and outlines the goals and multidisciplinary nature of the machine learning field. The document also provides examples of specific machine learning achievements in areas like speech recognition, credit card fraud detection, and game playing.
Abstract: This PDSG workshop introduces basic concepts of splitting a dataset for training a model in machine learning. Concepts covered are training, test and validation data, serial and random splitting, data imbalance and k-fold cross validation.
Level: Fundamental
Requirements: No prior programming or statistics knowledge required.
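A minimal scikit-learn sketch of the splitting concepts this workshop covers (train/test split and k-fold cross validation); the synthetic dataset and split sizes are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, train_test_split

X, y = make_classification(n_samples=200, random_state=0)

# Random split into training and test sets (here 80/20).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 5-fold cross validation: each sample is used for validation exactly once.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in kf.split(X):
    pass  # fit on X[train_idx], validate on X[val_idx]
```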
Machine learning works by processing data to discover patterns that can be used to analyze new data. Popular programming languages for machine learning include Python, R, and SQL. There are several types of machine learning including supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and deep learning. Common machine learning tasks involve classification, regression, clustering, dimensionality reduction, and model selection. Machine learning is widely used for applications such as spam filtering, recommendations, speech recognition, and machine translation.
- Naive Bayes is a classification technique based on Bayes' theorem that uses "naive" independence assumptions. It is easy to build and can perform well even with large datasets.
- It works by calculating the posterior probability for each class given predictor values using the Bayes theorem and independence assumptions between predictors. The class with the highest posterior probability is predicted.
- It is commonly used for text classification, spam filtering, and sentiment analysis due to its fast performance and high success rates compared to other algorithms.
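A minimal text-classification sketch of the idea with scikit-learn; the tiny spam/ham corpus is made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["win a free prize now", "meeting at noon tomorrow",
         "free cash offer", "lunch with the team"]
labels = ["spam", "ham", "spam", "ham"]

# Posterior P(class | words) is proportional to
# P(class) * product of P(word | class), under the "naive"
# assumption that words are independent given the class.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)
print(clf.predict(["free prize tomorrow"]))  # -> ['spam']
```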
This document provides an overview of key mathematical concepts relevant to machine learning, including linear algebra (vectors, matrices, tensors), linear models and hyperplanes, dot and outer products, probability and statistics (distributions, samples vs populations), and resampling methods. It also discusses solving systems of linear equations and the statistical analysis of training data distributions.
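A short NumPy sketch of two of the listed concepts (dot and outer products, solving a system of linear equations); the numbers are arbitrary.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.dot(u, v))     # scalar: 32.0
print(np.outer(u, v))   # 3x3 matrix of all pairwise products

# Solve the linear system Ax = b for x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
print(np.linalg.solve(A, b))  # [2., 3.]
```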
Machine Learning With Logistic Regression, by Knoldus Inc.
Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed. Logistic regression is a classification algorithm that builds on linear regression, transforming its output to estimate class probabilities and minimize error.
Machine learning involves developing systems that can learn from data and experience. The document discusses several machine learning techniques including decision tree learning, rule induction, case-based reasoning, supervised and unsupervised learning. It also covers representations, learners, critics and applications of machine learning such as improving search engines and developing intelligent tutoring systems.
Machine learning involves programming computers to optimize performance using example data or past experience. It is used when human expertise does not exist, humans cannot explain their expertise, solutions change over time, or solutions need to be adapted to particular cases. Learning builds general models from data to approximate real-world examples. There are several types of machine learning including supervised learning (classification, regression), unsupervised learning (clustering), and reinforcement learning. Machine learning has applications in many domains including retail, finance, manufacturing, medicine, web mining, and more.
Intro to modelling - supervised learning, by Justin Sebok
This document provides an introduction to machine learning concepts. It defines machine learning as allowing computers to learn without being explicitly programmed. Two main types are described: supervised learning, where the goal is to predict known outputs from inputs, and unsupervised learning, where patterns in unknown data are identified. Supervised learning is further divided into classification and regression problems. Example algorithms covered include k-nearest neighbors, decision trees, and linear regression. Key concepts like bias, variance, and dimensionality are also introduced.
Machine learning helps predict behavior and recognize patterns that humans cannot by learning from data without relying on programmed rules. It is an algorithmic approach that differs from statistical modeling which formalizes relationships through mathematical equations. Machine learning is a part of the broader field of artificial intelligence which aims to develop systems that can act and respond intelligently like humans. The machine learning workflow involves collecting and preprocessing data, selecting algorithms, training models, and evaluating performance. Common machine learning algorithms include supervised learning, unsupervised learning, reinforcement learning, and deep learning. Popular tools for machine learning include Python, R, TensorFlow, and Spark.
The document discusses sources and approaches to handling uncertainty in artificial intelligence. It provides examples of uncertain inputs, knowledge, and outputs in AI systems. Common methods for representing and reasoning with uncertain data include probability, Bayesian belief networks, hidden Markov models, and temporal models. Effectively handling uncertainty through probability and inference allows AI to make rational decisions with imperfect knowledge.
Pattern recognition and Machine Learning, by Rohit Kumar
Machine learning involves using examples to generate a program or model that can classify new examples. It is useful for tasks like recognizing patterns, generating patterns, and predicting outcomes. Some common applications of machine learning include optical character recognition, biometrics, medical diagnosis, and information retrieval. The goal of machine learning is to build models that can recognize patterns in data and make predictions.
This document provides an overview of getting started with data science using Python. It discusses what data science is, why it is in high demand, and the typical skills and backgrounds of data scientists. It then covers popular Python libraries for data science like NumPy, Pandas, Scikit-Learn, TensorFlow, and Keras. Common data science steps are outlined including data gathering, preparation, exploration, model building, validation, and deployment. Example applications and case studies are discussed along with resources for learning including podcasts, websites, communities, books, and TV shows.
Team knowledge-sharing presentation covering decision trees, XGBoost, logistic regression, neural networks, and deep learning using scikit-learn, statsmodels, and Keras over TensorFlow in Python, within PowerBI, Azure Notebooks, AWS SageMaker notebooks, and Google Colab notebooks
IIPGH Webinar 1: Getting Started With Data Science, by ds4good
In this webinar for ICT Professionals Ghana, we explore the concepts of data science and its motivations as a recent specialization, creating the background for how Artificial Intelligence relates to Machine Learning and to Deep Learning. We further discuss the data science technology stack and the opportunities that exist in the space.
The document provides guidance for days 90-100 of a 100 Days of Data Science Challenge, suggesting focusing on reviewing and revising one's progress during this time. It recommends reviewing goals and objectives, reflecting on strengths and challenges, reviewing completed project work, seeking feedback, and revising one's project plan based on learnings and feedback, both to stay motivated and on track and to identify areas for continued focus and improvement.
Data Science at Scale - The DevOps Approach, by Mihai Criveti
DevOps Practices for Data Scientists and Engineers
1 Data Science Landscape
2 Process and Flow
3 The Data
4 Data Science Toolkit
5 Cloud Computing Solutions
6 The rise of DevOps
7 Reusable Assets and Practices
8 Skills Development
This document discusses using Perl and Raku for data science. It begins by noting the growth of data science jobs and examines common programming languages used, including Perl, Python, and R. While there were no Raku modules for statistics at the time, basic statistics functions can be written easily in Raku. Examples are provided demonstrating calculating statistics and creating graphs using Perl modules. The future of data science is seen to include areas like data mining, artificial intelligence, and machine learning.
1. Introduction and how to get into Data
2. Data Engineering and skills needed
3. Comparison of Data Analytics for static and real-time streaming data
4. Bayesian Reasoning for Data
This document provides an overview of how to prepare for a career in data science. It discusses the author's own career path, which included degrees in bioinformatics and machine learning as well as jobs as a data scientist. It then outlines the typical data science workflow, including identifying problems, accessing and cleaning data, exploratory analysis, modeling, and deploying results. It emphasizes that data science is an iterative process and stresses the importance of communication skills. Finally, it discusses how data science fits within business contexts and the value of working on teams with complementary skills.
Data Science - An emerging Stream of Science with its Spreading Reach & Impact, by Dr. Sunil Kr. Pandey
This is my presentation on the topic "Data Science - An emerging Stream of Science with its Spreading Reach & Impact". I have compiled statistics and data from different sources. This may be useful for students and those who might be interested in this field of study.
The document provides a general introduction to artificial intelligence (AI), machine learning (ML), deep learning (DL), and data science (DS). It defines each term and describes their relationships. Key points include:
- AI is the ability of computers to mimic human cognition and intelligence.
- ML is an approach to achieve AI by having computers learn from data without being explicitly programmed.
- DL uses neural networks for ML, especially with unstructured data like images and text.
- DS involves extracting insights from data through scientific methods. It is a multidisciplinary field that uses techniques from ML, DL, and statistics.
A Comprehensive Learning Path to Become a Data Scientist in 2021, by RajSingh512965
The 2021 data science learning path provides a comprehensive curriculum to become a data scientist. It includes extended skills in storytelling, model deployment, unsupervised learning, exercises, and projects. The path covers key skills and tools like Python, R, machine learning algorithms, deep learning, natural language processing, and model deployment. It consists of monthly modules that progress from the data science toolkit to advanced topics, with hands-on training and real-world projects.
Big data and artificial intelligence have developed through an iterative process where increased data leads to improved infrastructure which then enables the collection of even more data. This virtuous cycle began with the rise of the internet and web data in the 1990s. Modern frameworks like Hadoop and algorithms like MapReduce established the infrastructure needed to analyze large, distributed datasets and fuel machine learning applications. Deep learning techniques are now widely used for tasks involving images, text, video and other complex data types, with many companies seeking to gain advantages by leveraging proprietary datasets.
Huge amounts of data are being collected everywhere - when we browse the web, go to the doctor's clinic, visit the supermarket, tweet or watch a movie. This plethora of data is dealt with under a new realm called Data Science. Data Science is now recognized as a highly critical, growing area with impact across many sectors including science, government, finance, health care, social networks, manufacturing, advertising, retail, and others. This colloquium will try to provide an overview as well as clarify bits and pieces about this emerging field.
1) The document discusses a self-study approach to learning data science through project-based learning using various online resources.
2) It recommends breaking down projects into 5 steps: defining problems/solutions, data extraction/preprocessing, exploration/engineering, model implementation, and evaluation.
3) Each step requires different skillsets from domains like statistics, programming, SQL, visualization, mathematics, and business knowledge.
Artificial intelligence: Simulation of Intelligence, by Abhishek Upadhyay
1. The document discusses the history and development of artificial intelligence and machine learning, from early concepts in probability and statistics in the 18th century to modern algorithms and applications.
2. It outlines important early milestones like the McCulloch-Pitts neural network model from 1943 and the Turing Test in 1950. Major algorithms like perceptron and modern frameworks like TensorFlow are also mentioned.
3. The text advocates for applying machine learning to solve real-world business problems by understanding the problem domain, acquiring relevant data, selecting an appropriate algorithm, and iterating through the problem solving process.
Bringing Machine Learning and Knowledge Graphs Together
Six Core Aspects of Semantic AI:
- Hybrid Approach
- Data Quality
- Data as a Service
- Structured Data Meets Text
- No Black-box
- Towards Self-optimizing Machines
This document provides an introduction to data science, including definitions, key concepts, and applications. It discusses what data science is, the differences between data science, big data, and artificial intelligence. It also outlines several applications of data science like internet search, recommendation systems, image/speech recognition, gaming, and price comparison websites. Finally, it discusses the data science life cycle and some popular tools used in data science like Python, NumPy, Pandas, Matplotlib, and Scikit-learn.
Hadoop vs Snowflake, by dewsharon760
Explore the key differences between Hadoop and Snowflake. Understand their unique features, use cases, and how to choose the right data platform for your needs.
Data analytics is a powerful tool that can transform business decision-making across industries. Contact District 11 Solutions, which specializes in data analytics, to make informed decisions and achieve your business goals.
Graph Machine Learning - Past, Present, and Future, by kashipong
Graph machine learning, despite its many commonalities with graph signal processing, has developed as a relatively independent field.
This presentation will trace the historical progression from graph data mining in the 1990s, through graph kernel methods in the 2000s, to graph neural networks in the 2010s, highlighting the key ideas and advancements of each era. Additionally, recent significant developments, such as the integration with causal inference, will be discussed.
NYC Meetup 07-25-2024: Unstructured Data Processing From Cloud to Edge, by Timothy Spann
https://www.meetup.com/unstructured-data-meetup-new-york/
https://www.meetup.com/unstructured-data-meetup-new-york/events/301720478/
Details
This is an in-person event! Registration is required to get in.
Topic: Connecting your unstructured data with Generative LLMs
What we’ll do:
Have some food and refreshments. Hear three exciting talks about unstructured data and generative AI.
5:30 - 6:00 - Welcome/Networking/Registration
6:05 - 6:30 - Tim Spann, Principal DevRel, Zilliz
6:35 - 7:00 - Chris Joynt, Senior PMM, Cloudera
7:05 - 7:30 - Lisa N Cao, Product Manager, Datastrato
7:30 - 8:30 - Networking
Tech talk 1: Unstructured Data Processing From Cloud to Edge
Speaker: Tim Spann, Principal Dev Advocate, Zilliz
In this talk, Tim will present why you should add a cloud-native vector database to your Data and AI platform. He will also cover a quick introduction to Milvus, vector databases, and unstructured data processing. By adding Milvus to your architecture, you can scale out and improve your AI use cases through RAG, real-time search, multimodal search, recommendation engines, fraud detection and many more emerging use cases.
He will also show that edge devices, even ones as small and inexpensive as a Raspberry Pi 5, can handle machine learning, deep learning and AI use cases and be enhanced with a vector database.
Tech talk 2: RAG Pipelines with Apache NiFi
Speaker: Chris Joynt, Senior PMM, Cloudera
Executing on RAG architecture is not a set-it-and-forget-it endeavor. Unstructured or multimodal data must be cleansed, parsed, processed, chunked and vectorized before being loaded into knowledge stores and vector DBs. That needs to happen efficiently to keep our GenAI up to date with fresh contextual data. But not only that, changes will have to be made on an ongoing basis: new data sources must be added, and experimentation will be necessary to find the ideal chunking strategy. Apache NiFi is the perfect tool to build RAG pipelines that stream proprietary and external data into your RAG architectures. Come learn how to use this scalable and incredibly versatile tool to quickly build pipelines to activate your GenAI use case.
Tech Talk 3: Metadata Lakes for Next-Gen AI/ML
Speaker: Lisa N Cao, Datastrato
Abstract: As data catalogs evolve to meet the growing and new demands of high-velocity, unstructured data, we see them taking a new shape as an emergent and flexible way to activate metadata for multiple uses. This talk discusses modern uses of metadata at the infrastructure level for AI-enablement in RAG pipelines in response to the new demands of the ecosystem. We will also discuss Apache (incubating) Gravitino and its open source-first approach to data cataloging across multi-cloud and geo-distributed architectures.
Who Should attend:
Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.
When:
July 25, 2024
5:30PM
Docker has revolutionized the way we develop, deploy, and run applications. It's a powerful platform that allows you to package your software into standardized units called containers. These containers are self-contained environments that include everything an application needs to run: code, libraries, system tools, and settings.
Here's a breakdown of what Docker offers:
Faster Development and Deployment:
Spin up new environments quickly: Forget about compatibility issues and dependency management. With Docker, you can create consistent environments for development, testing, and production with ease.
Share and reuse code: Build reusable Docker images and share them with your team or the wider community on Docker Hub, a public registry for Docker images.
Reliable and Consistent Applications:
Cross-platform compatibility: Docker containers run the same way on any system with Docker installed, eliminating compatibility headaches. Your code runs consistently across Linux, Windows, and macOS.
Isolation and security: Each container runs in isolation, sharing only the resources it needs.
3. Definition of Data
data
noun
1. a plural of datum
datum
noun
1. a single piece of information, as a fact, statistic, or code; an item of data
2. Philosophy
a. any fact assumed to be a matter of direct observation
b. any proposition assumed or given, from which conclusions may be drawn
c. Also called sense datum. Epistemology. The object of knowledge as presented to the mind
4. What do you think data really *is* though?
Me thinks:
● Data are inert fragments, or shards, of information
● Logical building blocks capable of leading to stories
● “True” data (information) requires consciousness to exist (b/c contextual)
● Even when data are auto-generated or never directly seen by a human,
consciousness is needed for “design”, “interpretation” (aka, meaning), etc.
● In everyday use, we think of data as representing quantities, characters,
or symbols on which operations are performed by a computer [1]
● Data can be organized into many different data structures (e.g. lists,
tuples, arrays, data frames) and data types (e.g. integers, dates, strings)
[1] https://en.wikipedia.org/wiki/Data_(computing)
6. What is Statistics? It’s Applied Math
● Two kinds - descriptive vs inferential
○ Descriptive statistics: motivation is to accurately reflect the past
○ Inferential statistics: motivation is to accurately predict the future
● Want to draw (infer) valid conclusions from samples and subsets
○ To save time, energy, $$$
○ Census counts, Agriculture crop yields, genetics, drug efficacy, baseball, …
○ Requires many assumptions be made about underlying data for results to be valid
○ Goal: “simple”, human-understandable model, or formula, that explains most variability
● Deeply rooted in rigorous mathematical theory, especially probability
and matrix algebra; here, “bell shaped curves” matter
7. And Machine Learning? Computer Science
● Two kinds - supervised vs unsupervised
○ Supervised: “learns” rules for mapping inputs (aka, features) to an output (aka, label)
○ Unsupervised: “learns” patterns without any outputs involved
● Want to derive rules that provide the maximum accuracy possible
○ To save time, energy, $$$
○ Census counts, Agriculture crop yields, genetics, drug efficacy, baseball, …
○ Requires virtually no prior assumptions about the input data (the proof is in the pudding)
○ “Learned” rules are often difficult to interpret; you might not even look at them directly
● Motivated by artificial intelligence (AI), minimizing “cost functions” while
not “over-fitting” model based on training data is what matters
9. What about “Deep Learning”? Fact or Fiction?
- Gartner's hype cycle places deep learning and machine learning at what it calls the "peak of inflated expectations", but just two to five years away from mainstream adoption. Cognitive computing is also at peak hype, but up to 10 years away, while general artificial intelligence remains more than a decade away and is still at the stage of early innovation.
- Effective machine learning is difficult because finding patterns is hard and often not enough training data is available; as a result, machine-learning programs often fail to deliver.
10. Upper limit. When does it all “end”? 2045?
● Nobody (and nothing) can predict how life will be in 30 years - it’s not possible
● Technology is always neutral, good or bad, depending on intent and application
● Forecasts for the future generally reflect our own human hopes and fears more
than anything else
● Human intelligence is of a different kind than machine “intelligence” (right?)
18. Statistics: linear regression example (via RStudio)
> # Career HR and RBI for the nine MLB players with 600+ home runs
> df = data.frame(
    hr=c(762, 755, 714, 696, 660, 630, 614, 612, 609),
    rbi=c(1996, 2297, 2214, 2086, 1903, 1836, 1918, 1699, 1667))
> # Fit ordinary least squares: RBI as a linear function of HR
> fit = lm(rbi ~ hr, data=df)
> summary(fit)
> # Build a "RBI = <slope>HR + <intercept>" label for the plot title
> eq = paste("RBI = ", round(fit$coefficients[2],1), "HR + ",
    round(fit$coefficients[1],1), sep="")
> plot(df$rbi ~ df$hr, ylab="RBI", xlab="HR", pch=19, main=eq)
> # Overlay the fitted regression line (dashed, red)
> abline(fit, col="red", lty=2)
Career RBI vs HR, for MLB players with 600+ home runs
1) Matrix form
2) Best-fit coefficients are “deterministic”
3) Thus, we have a formula for estimating RBI from HR
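To spell out points 1 and 2 above: the least-squares coefficients come from a closed-form (deterministic) formula, the normal equations. A minimal NumPy sketch, reusing the same data as the R example:

```python
import numpy as np

hr  = np.array([762, 755, 714, 696, 660, 630, 614, 612, 609], dtype=float)
rbi = np.array([1996, 2297, 2214, 2086, 1903, 1836, 1918, 1699, 1667],
               dtype=float)

# Matrix form: X has an intercept column plus the HR column.
X = np.column_stack([np.ones_like(hr), hr])
# Normal equations: beta = (X'X)^-1 X'y, solved without explicit inversion.
beta = np.linalg.solve(X.T @ X, X.T @ rbi)
print(beta)  # [intercept, slope], matching R's lm(rbi ~ hr)
```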
19. ML: logistic regression (Python via Jupyter on AWS)
Career HR and RBI vs HOF=Y/N, for MLB players 1950-2010
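The slide itself shows a notebook screenshot; below is a minimal scikit-learn sketch of the same idea. The hof.csv file and its column names (hr, rbi, hof) are assumptions for illustration, not the original dataset.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("hof.csv")  # assumed: career HR, RBI, and a HOF label
X, y = df[["hr", "rbi"]], df["hof"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out players
# Estimated P(HOF) for a hypothetical 600 HR / 1900 RBI career:
print(clf.predict_proba(pd.DataFrame([[600, 1900]], columns=["hr", "rbi"])))
```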
20. ML: neural network (Python via Visual Studio)
Career HR and RBI vs HOF=Y/N, for MLB players 1950-2010
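Again the slide is a screenshot; this is a minimal Keras sketch of a comparable binary classifier, with the same assumed hof.csv columns as above (and a Y/N label assumed for the hof column).

```python
import pandas as pd
from tensorflow import keras

df = pd.read_csv("hof.csv")  # same assumed columns as the sketch above
X = df[["hr", "rbi"]].values.astype("float32")
y = (df["hof"] == "Y").astype("float32").values  # Y/N label to 0/1

model = keras.Sequential([
    keras.layers.Dense(8, activation="relu", input_shape=(2,)),
    keras.layers.Dense(1, activation="sigmoid"),  # outputs P(HOF)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=50, batch_size=16, verbose=0)
```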
21. “Big Data” ML with Spark and Scala (via Docker Zeppelin)
This demo covers the following (plus fetching data stored in an AWS S3 bucket)
● Hadoop = distributed I/O; Spark = distributed I/O and RAM
● Scala (more Java than JavaScript) = default language for Spark
● Zeppelin (Scala) ~ Jupyter (Python)
(p.s. there are others out there too, e.g. Beaker, Sage)
● Docker = pre-configured software & services bundled into “containers”
(note: mostly Linux-based and non-GUI programs)
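The demo itself runs Scala in Zeppelin; for readers following along in Python, roughly the same steps look like this in PySpark. The S3 path and column name are placeholders, and S3 credentials are assumed to be configured.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hof-demo").getOrCreate()

# Spark distributes both the file I/O and, once cached, the data in RAM.
df = spark.read.csv("s3a://my-bucket/hof.csv", header=True, inferSchema=True)
df.cache()
df.groupBy("hof").count().show()  # a simple distributed aggregation
```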
22. “Old School” BI + “New Age” Data Science
Career HR and RBI vs HOF=Y/N, for MLB players 1950-2010
Data from AWS RDS joined to local text file data
w/ slice & dice interactivity + R statistical graphing
23. GUI-based Machine Learning with Orange (Python)
● Orange (used to be called “Orange Canvas”) is an open-source Python library
● I first came across it circa 2013 and really liked its potential, but it was a bit buggy on Windows
● Works best with “small data”, but it keeps improving
24. Julia: invented at MIT in 2012* and built for speed
● R is based on the S language from Bell Labs in the mid-1970s; built for single workstations
● Python has had rebirth of sorts in recent years thanks to Anaconda “data science” distro
● Julia designed from scratch to be best of all modern numerical computing languages and constructs
Pros:
- New and modern (designed for parallelism, etc.)
- Fast: 5x faster than Python and 10x faster than R
- Supports unicode and math symbols; 1-based arrays :-)
- Can directly invoke existing Python and R modules
- Attracting lots of attention (Apple, Amazon, Facebook, IBM, Intel hiring Julia programmers)
Cons:
- Still very early and immature (not even to 1.0 yet)
- Packages/modules buggy; not as stable or proven as Python and R
- Can be very hard to find help and working examples
- No acceptable native graphics library; must call Python or R
- Yet another language to learn?! Besides, there are a lot of dependencies on Python, why not just learn Python?
26. Links
Historical Timeline of Computable Knowledge
http://www.wolframalpha.com/docs/timeline
Data (Computing)
https://en.wikipedia.org/wiki/Data_(computing)
Electronic Statistics Textbook
http://www.statsoft.com/Textbook
Wikipedia: Statistics
https://en.wikipedia.org/wiki/Statistics
Wikipedia: Machine Learning
https://en.wikipedia.org/wiki/Machine_learning
Dr. Andrew Ng’s World-Famous Machine Learning Course
https://www.youtube.com/playlist?list=PLA89DCFA6ADACE599
Data Science Concepts
http://www.saedsayad.com/data_mining_map.htm
26 Hilariously Inaccurate Predictions About the Future (from 2014)
http://www.cracked.com/pictofacts-101-26-hilariously-inaccurate-predictions-about-future/
27. Python: where to even begin?
Step 1/3: Download the free Windows 64-bit Anaconda “Data Science” distro:
https://www.anaconda.com/download/#windows
28. Python: where to even begin?
Step 2/3: Open “Anaconda Navigator” and launch “jupyter notebook”
Note: it may be slow to launch (especially the first time), but it will open a new browser window at
http://localhost:8888/tree (or something close; the port number can vary)
29. Python: where to even begin?
Step 3/3: Create a new Python 3 notebook and start learning at your own pace
e.g. open “Python for Data Analysis” in a second tab: https://github.com/wesm/pydata-book