Machine Learning Introduction
Korris Chung
Dept of Computing
COMP4432
Roadmap
o Basic Concepts
o Data vs Feature vs Model
o Theoretical/Mathematical aspect
o Machine Learning - Why? What? and Where?
o Related Disciplines
o ML Models
o Supervised Learning
o Unsupervised Learning
o Issues & Resources
o Take-home messages!
Basic Concepts
Really Important Views/Concepts
o Data vs Feature
Let’s go a bit deeper!
Feature space and decision boundary
Supervised Learning (Classification) Space
Unsupervised Learning (Clustering) Space
Example:
Face Classification
After all, how are data samples
represented?
Iris dataset: sepal length and width, petal length
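One common answer is a fixed-length numeric feature vector paired with a label. The sketch below uses the three Iris measurements named above. The `Sample` type, the `features` helper, and the numeric values are assumptions made for illustration, not rows quoted from the dataset.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    """One data sample: a point in 3-D feature space plus its class label."""
    sepal_length: float  # cm
    sepal_width: float   # cm
    petal_length: float  # cm
    label: str


# Illustrative values only (assumed for this sketch).
samples = [
    Sample(5.1, 3.5, 1.4, "setosa"),
    Sample(6.4, 3.2, 4.5, "versicolor"),
]


def features(s: Sample) -> tuple:
    """Drop the label and keep the feature vector (the input to a model)."""
    return (s.sepal_length, s.sepal_width, s.petal_length)


print([features(s) for s in samples])
```

Each sample is then a point in feature space, which is what the decision-boundary and clustering pictures are drawn over.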
Face geometry?
Social Network Geometry?
After all, can data representation be learned?
So, what are the data, feature and output in this application?
Yet more advanced applications…
It’s just so powerful…
Image Completion
ML Concept: Theoretical/Mathematical Aspect
Regularization
Bias-Variance Trade-off
Why?
What?
Where?
Why “Learn”?
§ Machine learning is programming computers to
optimize a performance criterion using example data or
past experience.
§ There is no need to “learn” to calculate payroll (which
is deterministic)
§ Learning is used when:
  § Human expertise does not exist (navigating on Mars)
  § Humans are unable to explain their expertise (speech
    recognition, painting style)
  § The solution changes over time (routing on a computer network,
    stock market)
  § The solution needs to be adapted to particular cases (user
    biometrics, Chinese chatbot)
What We Talk About When We
Talk About “Learning”
§ Learning general models from data consisting of particular
examples
§ Data is cheap and abundant (data warehouses, data
marts); knowledge is expensive and scarce.
§ Build a model that is a good and useful
approximation to the data.
What is Machine Learning?
Definition Stuff:
What is Machine Learning?
§ Machine Learning
  § Study of algorithms that improve their performance at some task with
    experience
  § Optimize a performance criterion using example data or past
    experience.
§ Role of Statistics: Inference from a sample
§ Role of Computer Science: Efficient algorithms to
  § Solve the optimization problem, e.g. via stochastic gradient descent (SGD)
  § Represent and evaluate the model for inference
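To make "optimize a performance criterion using example data" concrete, here is a minimal SGD sketch. It assumes a one-feature linear model and a squared-error criterion. The function name `sgd_fit_line`, the learning rate, and the toy data are all assumptions for illustration, not part of the course material.

```python
import random


def sgd_fit_line(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b by stochastic gradient descent on 0.5 * error**2."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)            # visit the examples in random order
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]     # d/dw of 0.5 * err**2
            b -= lr * err             # d/db of 0.5 * err**2
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (assumed for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_fit_line(xs, ys)
print(round(w, 2), round(b, 2))
```

The statistics side asks how well `w` and `b` estimated from this sample generalize. The computing side asks how cheaply each update can be made, which is why SGD touches one example at a time.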
Why Study Machine Learning?
- Developing Better Computing Systems
§ Develop systems that are too difficult/expensive to
  construct manually because they require specific detailed
  skills or knowledge tuned to a specific task (feature
  engineering bottleneck).
§ Develop systems that can automatically adapt and
  customize themselves to individual users (self-learning).
  § Personalized news or email filter
  § Personalized tutoring
§ Discover new knowledge from large databases (data
  mining).
  § Market basket analysis (e.g. diapers and beer)
  § Medical text mining
Where is Machine Learning?
Related Disciplines
§ Artificial Intelligence
§ Data Mining
§ Probability and Statistics
§ Information theory
§ Numerical optimization
§ Computational complexity theory
§ Control theory (adaptive)
§ Psychology (developmental, cognitive)
§ Neurobiology
§ Linguistics
§ Philosophy
ML Hierarchy
Growth of Machine Learning
§ Machine learning is the preferred approach to
  § Speech recognition, natural language processing
  § Computer vision
  § Medical outcome analysis
  § Robot control
  § Computational biology
§ This trend is accelerating
  § Improved machine learning algorithms
  § Improved data capture, networking, faster computers
  § Software too complex to write by hand
  § New sensors / IO devices
  § Internet of Things (IoT) and 5G networks
  § Demand for self-customization to user, environment
  § It turns out to be difficult to extract knowledge from human
    experts → failure of expert systems in the 1980s.
Models
§ Association Analysis (in Data Mining)
§ Supervised Learning
  § Classification
  § Regression/Prediction
§ Unsupervised Learning
§ Reinforcement Learning (in AI)
§ Also, semi-supervised learning (in ML)
Learning Associations (DM stuff)
Famous quote: 60% of the customers who buy diapers also buy beer
§ Market Basket Analysis (MBA):
P(Y | X): the probability that somebody who buys X also buys Y, where
X and Y are products/services.
Example (from the transactions below): P(Diaper | Beer) = 100%; P(Beer | Diaper) = 75%
Market-Basket transactions
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
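The two conditional probabilities in the example can be checked directly against the five transactions. This is a minimal sketch; the helper name `confidence` is an assumption (it borrows the usual association-rule term for an estimate of P(Y | X)).

```python
# The five market-basket transactions from the table above.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def confidence(x, y, baskets):
    """Estimate P(y | x): among baskets containing x, the fraction that also contain y."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        raise ValueError(f"no basket contains {x!r}")
    return sum(1 for b in with_x if y in b) / len(with_x)


print(confidence("Beer", "Diaper", transactions))  # 1.0  -> P(Diaper | Beer) = 100%
print(confidence("Diaper", "Beer", transactions))  # 0.75 -> P(Beer | Diaper) = 75%
```

Beer appears in 3 baskets and all 3 contain Diaper. Diaper appears in 4 baskets and 3 of them contain Beer. The rule is therefore not symmetric.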
Classification
§ Example: Credit scoring
  § Differentiating between low-risk and high-risk customers
    from their income and savings
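One simple form such a classifier could take is a pair of thresholds in the (income, savings) feature space. The slide does not specify the model, so the rule shape, the function name `credit_risk`, and both threshold values are assumptions. In practice the thresholds would be learned from labelled customer data rather than set by hand.

```python
def credit_risk(income, savings, income_threshold=50_000, savings_threshold=10_000):
    """Axis-aligned decision rule: low-risk only if both features exceed their thresholds."""
    # Threshold defaults are hypothetical placeholders, not learned values.
    if income > income_threshold and savings > savings_threshold:
        return "low-risk"
    return "high-risk"


print(credit_risk(80_000, 20_000))
print(credit_risk(80_000, 5_000))
```

The two thresholds carve the feature space into a rectangular low-risk region, and the edge of that region is the decision boundary.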
Face Recognition
Training examples of a person
Test images
Prediction: Regression
Supervised Learning: Classification
§ Example: Cancer diagnosis
Training Set:
Patient ID  # of Tumors  Avg Area  Avg Density  Diagnosis
1           5            20        118          Malignant
2           3            15        130          Benign
3           7            10        52           Benign
4           2            30        100          Malignant
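As a sketch of how a labelled training set like this is used, the code below applies a 1-nearest-neighbour classifier to the four patients. The choice of 1-NN, the name `predict_1nn`, and the unseen patient's values are assumptions for illustration. The slide does not say which classifier is intended.

```python
import math

# (number of tumors, avg area, avg density) -> diagnosis, from the training set above.
training_set = [
    ((5, 20, 118), "Malignant"),
    ((3, 15, 130), "Benign"),
    ((7, 10, 52), "Benign"),
    ((2, 30, 100), "Malignant"),
]


def predict_1nn(x, train):
    """Return the label of the training sample closest to x (Euclidean distance)."""
    nearest = min(train, key=lambda item: math.dist(item[0], x))
    return nearest[1]


new_patient = (4, 25, 110)  # hypothetical unseen patient
print(predict_1nn(new_patient, training_set))
```

With raw, unscaled features the density column dominates the distance, so a real pipeline would usually normalize the features first.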
Unsupervised Learning
§ Learning “what normally happens”
§ No output or label
§ Clustering: Grouping similar instances
§ Other applications: Summarization, Association Analysis
§ Example applications
  § Customer segmentation in CRM
  § Image compression: Color quantization
  § Bioinformatics: Learning motifs
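"Grouping similar instances" without labels can be sketched with plain k-means (Lloyd's algorithm). The toy 2-D points and the use of the first k points as initial centroids are simplifying assumptions. Practical implementations usually rely on random restarts or a smarter initialization.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means (Lloyd's algorithm); the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


# Toy data (assumed): two well-separated groups, no labels given.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.8, 1.1), (4.8, 5.1)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

For color quantization the same loop would run on pixel colors. Each pixel is then replaced by its cluster's centroid color.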
Notationally,
Unsupervised Learning: Clustering
Pictorially,
Unsupervised Learning: Clustering
What’s happening in the feature space?
Issues in Machine Learning
§ What algorithms can approximate functions well and
when?
§ How does the number of training examples influence
accuracy?
§ Problem representation / feature extraction
§ Integrating learning with systems
§ What are the theoretical limits of learnability?
§ Continuous (life-long) learning
§ Transfer learning
§ Few-shot learning
§ Interpretable ML (or explainable AI)
§ Many others…
Measuring Performance
• Generalization accuracy
• Solution correctness
• Solution quality (length, efficiency)
• Speed of performance (scalability)
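Generalization accuracy is typically estimated on examples the model did not see during training. Below is a minimal sketch of the metric itself. The held-out labels and predictions are made-up values for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical held-out test labels and a model's predictions for them.
test_labels = ["Benign", "Malignant", "Benign", "Malignant"]
predictions = ["Benign", "Malignant", "Malignant", "Malignant"]
print(accuracy(test_labels, predictions))  # 0.75
```

Computing the same number on the training set would measure memorization rather than generalization.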
Scaling issues in ML
• Number of
  • Inputs
  • Outputs (e.g. extreme classification, with a very large number of labels)
• Batch vs real-time
• Training vs testing
Resources: Datasets
§ UCI Repository:
http://www.ics.uci.edu/~mlearn/MLRepository.html
§ UCI KDD Archive:
http://kdd.ics.uci.edu/summary.data.application.html
§ Kaggle: https://www.kaggle.com/datasets
§ Tianchi (天池): https://tianchi.aliyun.com/dataset/
and many others…
Resources: Journals
§ Journal of Machine Learning Research
www.jmlr.org
§ Machine Learning
§ IEEE Transactions on Neural Networks
§ IEEE Transactions on Pattern Analysis and Machine
Intelligence
§ Annals of Statistics
§ Journal of the American Statistical Association
§ ...
Resources: Conferences
§ International Conference on Machine Learning (ICML)
§ European Conference on Machine Learning (ECML)
§ Neural Information Processing Systems (NeurIPS, formerly NIPS)
§ Computational Learning
§ International Joint Conference on Artificial Intelligence (IJCAI)
§ ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
§ IEEE Int. Conf. on Data Mining (ICDM)
§ Etc.
Take-home Messages
About ML and Data Analytics (DA)
§ DA is more concerned with the data and features
§ ML is more concerned with the model
Acknowledgement
§ Slides of
§ E. Alpaydin, Introduction to Machine Learning. 2nd Ed.
MIT Press, 2010.
§ C.F. Eick, U of Houston.
§ Photos from the Internet