Lecture 1
Introduction to ML
Mehryar Mohri
Courant Institute and Google Research
mohri@cims.nyu.edu
Probability tools.
Algorithms:
• learning guarantees.
• analysis of algorithms.
Applications:
Data:
Queries:
Loss function: L : Y × Y → R+.
• depends on the features.
• in the deterministic case, the Bayes error R* = 0.
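As a concrete illustration (the examples and function names below are mine, not the lecture's), two standard loss functions L : Y × Y → R+:

```python
# Illustrative loss functions L: Y x Y -> R+ (assumed examples).

def zero_one_loss(y_pred, y_true):
    """0-1 loss for classification: 1 if the labels differ, else 0."""
    return 0.0 if y_pred == y_true else 1.0

def squared_loss(y_pred, y_true):
    """Squared loss for regression: (y' - y)^2, always non-negative."""
    return (y_pred - y_true) ** 2

print(zero_one_loss(1, -1))    # 1.0
print(squared_loss(2.5, 2.0))  # 0.25
```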
Notion of simplicity/complexity.
How do we define complexity?
Empirical risk minimization (ERM):
ĥ = argmin_{h∈H} R̂(h).
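A minimal sketch of ERM over a finite hypothesis set, assuming a 0-1 loss; the hypotheses and sample below are made up for illustration:

```python
# Empirical risk minimization over a small finite hypothesis set H:
# return the hypothesis with the smallest average 0-1 loss on the sample.

def empirical_risk(h, sample):
    """R_hat(h): average 0-1 loss of h on labeled pairs (x, y)."""
    return sum(h(x) != y for x, y in sample) / len(sample)

def erm(H, sample):
    """argmin over h in H of R_hat(h)."""
    return min(H, key=lambda h: empirical_risk(h, sample))

# Threshold classifiers h_t(x) = 1 if x >= t else 0.
H = [lambda x, t=t: 1 if x >= t else 0 for t in (0.0, 0.5, 1.0)]
sample = [(0.2, 0), (0.4, 0), (0.7, 1), (0.9, 1)]
h_hat = erm(H, sample)
print(empirical_risk(h_hat, sample))  # 0.0: threshold 0.5 fits the sample
```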
[Figure: decomposition of the excess error into approximation and estimation errors, with an upper bound.]
H = ∪_{n=1}^{∞} H_n, with H_1 ⊂ H_2 ⊂ ··· ⊂ H_n ⊂ ···
Structural risk minimization (SRM):
ĥ = argmin_{h∈H_n, n∈ℕ} R̂(h) + penalty(H_n, m).
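A sketch of SRM over nested classes, assuming the same 0-1 loss as before; the penalty term sqrt(log|H_n| / m) used here is an illustrative complexity measure, not the one the lecture has in mind:

```python
import math

# Structural risk minimization sketch: over nested classes H_1 ⊂ H_2 ⊂ ...,
# pick the hypothesis minimizing R_hat(h) + penalty(H_n, m).

def empirical_risk(h, sample):
    return sum(h(x) != y for x, y in sample) / len(sample)

def srm(nested_classes, sample):
    m = len(sample)
    best, best_obj = None, float("inf")
    for Hn in nested_classes:
        penalty = math.sqrt(math.log(len(Hn)) / m)  # assumed penalty form
        for h in Hn:
            obj = empirical_risk(h, sample) + penalty
            if obj < best_obj:
                best, best_obj = h, obj
    return best

H1 = [lambda x: 0, lambda x: 1]             # constant predictors
H2 = H1 + [lambda x: 1 if x >= 0.5 else 0]  # add a threshold rule
sample = [(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)]
h_hat = srm([H1, H2], sample)
print(h_hat(0.1), h_hat(0.9))  # 0 1: the richer class wins despite its penalty
```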
ERM: ĥ = argmin_{h∈H} R̂(h).
SRM: ĥ = argmin_{h∈H_n, n∈ℕ} R̂(h) + penalty(H_n, m).
Regularization-based algorithms: for λ ≥ 0,
ĥ = argmin_{h∈H} R̂(h) + λ‖h‖².
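In its most familiar concrete instance, this objective with a squared loss and squared-norm penalty is ridge regression; the one-dimensional closed-form solver below is an illustration of the idea, not the lecture's algorithm:

```python
# Ridge regression in one dimension:
#   argmin_w  sum_i (w * x_i - y_i)^2 + lam * w^2,   lam >= 0.
# Setting the derivative to zero gives the closed form used below.

def ridge_1d(xs, ys, lam):
    """Closed-form minimizer w = (sum x_i y_i) / (sum x_i^2 + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(ridge_1d(xs, ys, 0.0))   # 2.0: unregularized least squares
print(ridge_1d(xs, ys, 14.0))  # 1.0: larger lam shrinks w toward 0
```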
Probability tools.
Markov's inequality: for a non-negative random variable X and ε > 0,
Pr[X ≥ ε] ≤ E[X]/ε.
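A quick numerical sanity check of Markov's inequality; the choice of distribution and parameters is mine, for illustration only:

```python
import random

# Check Pr[X >= eps] <= E[X]/eps empirically for a non-negative
# X ~ Exponential(1), where E[X] = 1.

random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]
eps = 3.0
empirical = sum(x >= eps for x in samples) / len(samples)
bound = (sum(samples) / len(samples)) / eps
print(empirical <= bound)  # True: Markov's bound is loose here but valid
```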
Weak law of large numbers: for every ε > 0,
lim_{n→∞} Pr[|X̄_n − μ| ≥ ε] = 0.
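The law can be seen numerically by estimating the deviation probability for growing sample sizes; the coin-flip setup and parameters below are illustrative choices, not from the lecture:

```python
import random

# Weak LLN: Pr[|Xbar_n - mu| >= eps] shrinks as n grows.
# Estimate that probability for fair coin flips (mu = 0.5).

random.seed(0)

def deviation_prob(n, eps=0.05, trials=1000):
    hits = 0
    for _ in range(trials):
        mean = sum(random.random() < 0.5 for _ in range(n)) / n
        if abs(mean - 0.5) >= eps:
            hits += 1
    return hits / trials

# The estimates decrease sharply as n grows from 10 to 1000.
print(deviation_prob(10), deviation_prob(100), deviation_prob(1000))
```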
By convexity of x ↦ e^{tx}, for all x ∈ [a, b],
e^{tx} ≤ ((b - x)/(b - a)) e^{ta} + ((x - a)/(b - a)) e^{tb}.
Thus, taking expectations and using E[X] = 0,
E[e^{tX}] ≤ E[((b - X)/(b - a)) e^{ta} + ((X - a)/(b - a)) e^{tb}]
= (b/(b - a)) e^{ta} - (a/(b - a)) e^{tb} = e^{φ(t)},
with
φ(t) = log((b/(b - a)) e^{ta} - (a/(b - a)) e^{tb}) = ta + log(b/(b - a) - (a/(b - a)) e^{t(b - a)}).
Hoeffding's inequality:
Pr[S_m - E[S_m] ≥ ε] ≤ e^{-2ε² / Σ_{i=1}^m (b_i - a_i)²}.
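The bound can be checked empirically; the coin-flip instance and parameter values below are my illustrative choices:

```python
import math
import random

# Empirical check of Hoeffding's inequality for S_m = sum of m fair coin
# flips in {0, 1}, so each (b_i - a_i)^2 = 1 and the bound is exp(-2 eps^2 / m).

random.seed(0)
m, eps, trials = 100, 10.0, 10_000
exceed = 0
for _ in range(trials):
    s = sum(random.random() < 0.5 for _ in range(m))
    if s - m * 0.5 >= eps:
        exceed += 1
empirical = exceed / trials
bound = math.exp(-2 * eps ** 2 / m)  # exp(-2) ~ 0.135
print(empirical <= bound)  # True: the empirical tail stays below the bound
```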