Lecture 3 - MDPs and Dynamic Programming

This lecture introduces Markov decision processes (MDPs) and dynamic programming. MDPs provide a mathematical framework for modeling sequential decision making problems. The key components of an MDP are states, actions, transition probabilities, and rewards. Dynamic programming methods can find optimal policies by iteratively computing value functions, given a true model of the environment. The next lectures will discuss how to apply similar ideas to problems where the true model is unknown.


Lecture 3:

Markov Decision Processes and Dynamic Programming

Diana Borsa

January 15, 2021


Background

Sutton & Barto 2018, Chapter 3 + 4


Recap

I Reinforcement learning is the science of learning to make decisions


I Agents can learn a policy, value function and/or a model
I The general problem involves taking into account time and consequences
I Decisions affect the reward, the agent state, and environment state
This Lecture

I Last lecture: multiple actions, but only one state—no model


I This lecture:
I Formalise the problem with full sequential structure
I Discuss first class of solution methods which assume true model is given
I These methods are called dynamic programming
I Next lectures: use similar ideas, but use sampling instead of true model
Formalising the RL interaction
Formalising the RL interface

I We will discuss a mathematical formulation of the agent-environment interaction


I This is called a Markov Decision Process (MDP)
I Enables us to talk clearly about the objective and how to achieve it
MDPs: A simplifying assumption

I For now, assume the environment is fully observable:


⇒ the current observation contains all relevant information

I Note: Almost all RL problems can be formalised as MDPs, e.g.,


I Optimal control primarily deals with continuous MDPs
I Partially observable problems can be converted into MDPs
I Bandits are MDPs with one state
Markov Decision Process
Definition (Markov Decision Process - Sutton & Barto 2018 )
A Markov Decision Process is a tuple (S, A, p, γ), where
I S is the set of all possible states
I A is the set of all possible actions (e.g., motor controls)
I p(r, s′ | s, a) is the joint probability of a reward r and next state s′, given a state s and action a
I γ ∈ [0, 1] is a discount factor that trades off the value of later rewards against earlier ones

Observations:
I p defines the dynamics of the problem
I Sometimes it is useful to marginalise out the state transitions or expected reward:
p(s′ | s, a) = Σ_r p(s′, r | s, a)        E[R | s, a] = Σ_r Σ_{s′} r p(r, s′ | s, a)
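For concreteness, here is a minimal Python/NumPy sketch of this marginalisation for a small tabular MDP (not part of the original slides; the array layout p_joint[s, a, s′, r_index] and the reward support `rewards` are assumptions made for the example).

```python
# Marginalising a tabular joint model p_joint[s, a, s_next, r_idx] = p(r, s' | s, a).
import numpy as np

n_states, n_actions, n_rewards = 3, 2, 2
rewards = np.array([0.0, 1.0])                       # hypothetical reward support
p_joint = np.random.dirichlet(np.ones(n_states * n_rewards),
                              size=(n_states, n_actions))
p_joint = p_joint.reshape(n_states, n_actions, n_states, n_rewards)

p_next = p_joint.sum(axis=3)                         # p(s' | s, a): sum over rewards
exp_r = (p_joint.sum(axis=2) * rewards).sum(axis=2)  # E[R | s, a]: sum over s', then weight rewards

assert np.allclose(p_next.sum(axis=2), 1.0)          # each (s, a) row is a distribution over s'
```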
Markov Decision Process: Alternative Definition

Definition (Markov Decision Process)


A Markov Decision Process is a tuple (S, A, p, r, γ), where
I S is the set of all possible states
I A is the set of all possible actions (e.g., motor controls)
I p(s′ | s, a) is the probability of transitioning to s′, given a state s and action a
I r : S × A → ℝ is the expected reward on a transition starting in (s, a):

r(s, a) = E[R | s, a]

I γ ∈ [0, 1] is a discount factor that trades off the value of later rewards against earlier ones

Note: This formulation is equivalent to the previous one; it makes no additional assumptions.
Markov Property: The future is independent of the past given the present

Definition (Markov Property)


Consider a sequence of random variables {St}t∈ℕ, indexed by time. A state s has the Markov property when, for all states s′ ∈ S,

p(St+1 = s′ | St = s) = p(St+1 = s′ | ht−1, St = s)

for all possible histories ht−1 = {S1, ..., St−1, A1, ..., At−1, R1, ..., Rt−1}.

In a Markov Decision Process all states are assumed to have the Markov property.
I The state captures all relevant information from the history.
I Once the state is known, the history may be thrown away.
I The state is a sufficient statistic of the past.
Markov Property in an MDP: Test your understanding

In a Markov Decision Process all states are assumed to have the Markov property.

Q: In an MDP this property implies: (Which of the following statements are true?)

p(St+1 = s′ | St = s, At = a) = p(St+1 = s′ | S1, ..., St−1, A1, ..., At, St = s)    (1)

p(St+1 = s′ | St = s, At = a) = p(St+1 = s′ | S1, ..., St−1, St = s, At = a)    (2)

p(St+1 = s′ | St = s, At = a) = p(St+1 = s′ | S1, ..., St−1, St = s)    (3)

p(Rt+1 = r, St+1 = s′ | St = s) = p(Rt+1 = r, St+1 = s′ | S1, ..., St−1, St = s)    (4)
Example: cleaning robot

I Consider a robot that cleans soda cans


I Two states: high battery charge or low battery charge
I Actions: {wait, search} in high, {wait, search, recharge} in low
I Dynamics may be stochastic
I p(St+1 = high | St = high, At = search) = α
I p(St+1 = low | St = high, At = search) = 1 − α
I Reward could be expected number of collected cans (deterministic), or actual
number of collected cans (stochastic)

Reference: Sutton and Barto, Chapter 3, pg 52-53.


Example: robot MDP
Example: robot MDP
Formalising the objective
Returns
I Acting in an MDP results in immediate rewards Rt, which lead to returns Gt:
I Undiscounted return (episodic / finite-horizon problems)

Gt = Rt+1 + Rt+2 + ... + RT = Σ_{k=0}^{T−t−1} Rt+k+1

I Discounted return (finite- or infinite-horizon problems)

Gt = Rt+1 + γRt+2 + ... + γ^{T−t−1} RT = Σ_{k=0}^{T−t−1} γ^k Rt+k+1

I Average return (continuing, infinite-horizon problems)

Gt = (1 / (T−t−1)) (Rt+1 + Rt+2 + ... + RT) = (1 / (T−t−1)) Σ_{k=0}^{T−t−1} Rt+k+1

Note: These are random variables that depend on the MDP and the policy
Discounted Return

I Discounted return Gt for infinite horizon T → ∞:

Gt = Rt+1 + γRt+2 + ... = Σ_{k=0}^{∞} γ^k Rt+k+1

I The discount γ ∈ [0, 1] is the present value of future rewards


I The marginal value of receiving reward R after k + 1 time-steps is γ^k R
I For γ < 1, immediate rewards are more important than delayed rewards
I γ close to 0 leads to "myopic" evaluation
I γ close to 1 leads to "far-sighted" evaluation
Why discount?

Most Markov decision processes are discounted. Why?


I Problem specification:
I Immediate rewards may actually be more valuable (e.g., consider earning interest)
I Animal/human behaviour shows preference for immediate reward
I Solution side:
I Mathematically convenient to discount rewards
I Avoids infinite returns in cyclic Markov processes
I The way to think about it: reward and discount together determine the goal
Policies

Goal of an RL agent
To find a behaviour policy that maximises the (expected) return Gt

I A policy is a mapping π : S × A → [0, 1] that, for every state s and action a ∈ A, gives the probability π(a | s) of taking action a in state s.
I For deterministic policies, we sometimes write at = π(st) for the action taken by the policy.
Value Functions

I The value function v (s) gives the long-term value of state s

vπ (s) = E [Gt | St = s, π]

I We can define (state-)action values:

qπ (s, a) = E [Gt | St = s, At = a, π]

I (Connection between them) Note that:


vπ(s) = Σ_a π(a | s) qπ(s, a) = E[qπ(St, At) | St = s, π],   ∀s
Optimal Value Function

Definition (Optimal value functions)


The optimal state-value function v∗(s) is the maximum value function over all policies

v∗(s) = max_π vπ(s)

The optimal action-value function q∗(s, a) is the maximum action-value function over all policies

q∗(s, a) = max_π qπ(s, a)

I The optimal value function specifies the best possible performance in the MDP
I An MDP is “solved” when we know the optimal value function
Optimal Policy

Define a partial ordering over policies

π ≥ π′ ⇐⇒ vπ(s) ≥ vπ′(s), ∀s

Theorem (Optimal Policies)


For any Markov decision process
I There exists an optimal policy π∗ that is better than or equal to all other policies, π∗ ≥ π, ∀π
(There can be more than one such optimal policy.)

I All optimal policies achieve the optimal value function, vπ∗(s) = v∗(s)

I All optimal policies achieve the optimal action-value function, qπ∗(s, a) = q∗(s, a)
Finding an Optimal Policy

An optimal policy can be found by maximising over q∗(s, a):

π∗(s, a) = 1 if a = argmax_{a∈A} q∗(s, a), and 0 otherwise
Observations:
I There is always a deterministic optimal policy for any MDP
I If we know q ∗ (s, a), we immediately have the optimal policy
I There can be multiple optimal policies
I If multiple actions maximise q∗(s, ·), we can just pick any of them (including stochastically)
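As a small illustration (not from the slides), greedy policy extraction from a tabular q∗ is essentially one line of NumPy; the array shape (n_states, n_actions) is an assumption of the sketch.

```python
import numpy as np

def greedy_policy(q: np.ndarray) -> np.ndarray:
    """Given q[s, a] ≈ q*(s, a), return a deterministic policy pi[s, a] ∈ {0, 1}."""
    pi = np.zeros_like(q)
    pi[np.arange(q.shape[0]), q.argmax(axis=1)] = 1.0   # one maximising action per state
    return pi
```

argmax breaks ties by picking the first maximising action; as noted above, any other tie-breaking (including a stochastic one) is also optimal.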
Bellman Equations
Value Function

I The value function v (s) gives the long-term value of state s

vπ (s) = E [Gt | St = s, π]

I It can be defined recursively:

vπ(s) = E[Rt+1 + γ Gt+1 | St = s, π]
      = E[Rt+1 + γ vπ(St+1) | St = s, At ∼ π(St)]
      = Σ_a π(a | s) Σ_r Σ_{s′} p(r, s′ | s, a) (r + γ vπ(s′))

I The final step writes out the expectation explicitly


Action values
I We can define state-action values

qπ (s, a) = E [Gt | St = s, At = a, π]

I This implies

qπ(s, a) = E[Rt+1 + γ vπ(St+1) | St = s, At = a]
         = E[Rt+1 + γ qπ(St+1, At+1) | St = s, At = a]
         = Σ_r Σ_{s′} p(r, s′ | s, a) (r + γ Σ_{a′} π(a′ | s′) qπ(s′, a′))

I Note that
vπ(s) = Σ_a π(a | s) qπ(s, a) = E[qπ(St, At) | St = s, π],   ∀s
Bellman Equations

Theorem (Bellman Expectation Equations)


Given an MDP, M = ⟨S, A, p, r, γ⟩, for any policy π, the value functions obey the following expectation equations:

vπ(s) = Σ_a π(a | s) [ r(s, a) + γ Σ_{s′} p(s′ | s, a) vπ(s′) ]    (5)

qπ(s, a) = r(s, a) + γ Σ_{s′} p(s′ | s, a) Σ_{a′∈A} π(a′ | s′) qπ(s′, a′)    (6)
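A minimal sketch (assumed tabular arrays, not from the slides) of evaluating the right-hand sides of equations (5) and (6) once, with r[s, a], p[s, a, s′] and pi[s, a] stored as NumPy arrays:

```python
import numpy as np

def bellman_expectation(r, p, pi, v, gamma):
    """One application of eqs. (5)-(6): returns (new_v, new_q) given a current estimate v."""
    q = r + gamma * p @ v                 # eq. (6): q(s, a) = r(s, a) + gamma * sum_s' p(s'|s, a) v(s')
    new_v = (pi * q).sum(axis=1)          # eq. (5): v(s) = sum_a pi(a|s) q(s, a)
    return new_v, q
```

Iterating this update is exactly the iterative policy evaluation algorithm discussed later in the lecture.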
The Bellman Optimality Equations

Theorem (Bellman Optimality Equations)


Given an MDP, M = ⟨S, A, p, r, γ⟩, the optimal value functions obey the following optimality equations:

v∗(s) = max_a [ r(s, a) + γ Σ_{s′} p(s′ | s, a) v∗(s′) ]    (7)

q∗(s, a) = r(s, a) + γ Σ_{s′} p(s′ | s, a) max_{a′∈A} q∗(s′, a′)    (8)

There can be no policy with a higher value than v∗(s) = max_π vπ(s), ∀s.
Some intuition
(Reminder) Greedy on v ∗ = Optimal Policy
I An optimal policy can be found by maximising over q ∗ (s, a),

1 if a = argmax q ∗ (s, a)
(

π (s, a) = a∈A
0 otherwise

I Apply the Bellman Expectation Eq. (6):


X X
qπ∗ (s, a) = r (s, a) + γ p(s 0 |a, s) π ∗ (a0 |s 0 )qπ∗ (s 0 , a0 )
s0 a0 ∈A
| {z }
maxa0 q ∗ (s 0 ,a0 )
X
= r (s, a) + γ p(s 0 |a, s) max
0
q ∗ (s 0 , a0 )
a ∈A
s0
Solving RL problems using the Bellman Equations
Problems in RL

I Pb1: Estimating vπ or qπ is called policy evaluation or, simply, prediction


I Given a policy, what is my expected return under that behaviour?
I Given this treatment protocol/trading strategy, what is my expected return?

I Pb2 : Estimating v∗ or q∗ is sometimes called control, because these can be used


for policy optimisation
I What is the optimal way of behaving? What is the optimal value function?
I What is the optimal treatment? What is the optimal control policy to minimise
time, fuel consumption, etc?
Exercise:

I Consider the following MDP:

I The actions have a 0.9 probability of success, and with probability 0.1 we remain in the same state
I Rt = 0 for all transitions that end up in S0 , and Rt = −1 for all other transitions
Exercise: (pause to work this out)
I Consider the following MDP:

I The actions have a 0.9 probability of success, and with probability 0.1 we remain in the same state
I Rt = 0 for all transitions that end up in S0 , and Rt = −1 for all other transitions

I Q: Evaluation problems (Consider a discount γ = 0.9)


I What is vπ for π(s) = a1 (→), ∀s?
I What is vπ for the uniformly random policy?
I Same policy evaluation problems for γ = 0.0? (What do you notice?)
A solution
Bellman Equation in Matrix Form

I The Bellman value equation, for given π, can be expressed using matrices,

v = rπ + γPπ v

where

vi = vπ(si)
r^π_i = E[Rt+1 | St = si, At ∼ π(St)]
P^π_ij = p(sj | si) = Σ_a π(a | si) p(sj | si, a)
Bellman Equation in Matrix Form
I The Bellman equation, for a given policy π, can be expressed using matrices,

v = rπ + γPπ v

I This is a linear equation that can be solved directly:

v = rπ + γPπ v
(I − γPπ) v = rπ
v = (I − γPπ)⁻¹ rπ

I Computational complexity is O(|S|³) — only possible for small problems (see the numerical sketch after this slide)


I There are iterative methods for larger problems
I Dynamic programming
I Monte-Carlo evaluation
I Temporal-Difference learning
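A minimal numerical sketch of the direct solve above (not from the slides; how r_pi and P_pi are built from r, p and pi is an assumption of the example):

```python
import numpy as np

def evaluate_policy_directly(r, p, pi, gamma):
    """Solve (I - gamma * P_pi) v = r_pi for a tabular MDP with r[s, a], p[s, a, s'], pi[s, a]."""
    r_pi = (pi * r).sum(axis=1)                    # r_pi[s]     = sum_a pi(a|s) r(s, a)
    P_pi = np.einsum('sa,sat->st', pi, p)          # P_pi[s, s'] = sum_a pi(a|s) p(s'|s, a)
    n = r_pi.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)   # O(|S|^3) direct solution
```

Using np.linalg.solve avoids forming the inverse explicitly, but the cost is still cubic in |S|, which is why the iterative methods listed above matter for large problems.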
Solving the Bellman Optimality Equation

I The Bellman optimality equation is non-linear


I Cannot use the same direct matrix solution as for policy evaluation (in general)

I Many iterative solution methods:


I Using models / dynamic programming
I Value iteration
I Policy iteration
I Using samples
I Monte Carlo
I Q-learning
I Sarsa
Dynamic Programming
Dynamic Programming
The 1950s were not good years for mathematical research. I felt I had to shield
the Air Force from the fact that I was really doing mathematics. What title,
what name, could I choose? I was interested in planning, in decision making,
in thinking. But planning is not a good word for various reasons. I decided
to use the word ‘programming.’ I wanted to get across the idea that this was
dynamic, this was time-varying—I thought, let’s kill two birds with one stone.
Let’s take a word that has a precise meaning, namely dynamic, in the classical
physical sense. It also is impossible to use the word, dynamic, in a pejorative
sense. Try thinking of some combination that will possibly give it a pejorative
meaning. It’s impossible. Thus, I thought dynamic programming was a good
name. It was something not even a Congressman could object to. So I used it
as an umbrella for my activities.
– Richard Bellman
(slightly paraphrased for conciseness)
Dynamic programming

Dynamic programming refers to a collection of algorithms that can be used


to compute optimal policies given a perfect model of the environment as a
Markov decision process (MDP).

Sutton & Barto 2018


I We will discuss several dynamic programming methods to solve MDPs
I All such methods consist of two important parts:
policy evaluation and policy improvement
Policy evaluation

I We start by discussing how to estimate

vπ (s) = E [Rt+1 + γvπ (St+1 ) | s, π]

I Idea: turn this equality into an update

Algorithm
I First, initialise v0 , e.g., to zero
I Then, iterate
∀s : vk+1 (s) ← E [Rt+1 + γvk (St+1 ) | s, π]
I Stopping: whenever vk+1 (s) = vk (s), for all s, we must have found vπ

I Q: Does this algorithm always converge?


Answer : Yes, under appropriate conditions (e.g., γ < 1). More next lecture!
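A minimal sketch of this iterative policy evaluation loop (assumed tabular arrays r[s, a], p[s, a, s′], pi[s, a]; a tolerance-based stopping rule stands in for exact equality):

```python
import numpy as np

def policy_evaluation(r, p, pi, gamma, tol=1e-8, max_iters=10_000):
    v = np.zeros(r.shape[0])                       # initialise v_0 to zero
    for _ in range(max_iters):
        q = r + gamma * p @ v                      # q_k(s, a) = r(s, a) + gamma sum_s' p(s'|s, a) v_k(s')
        v_new = (pi * q).sum(axis=1)               # v_{k+1}(s) = sum_a pi(a|s) q_k(s, a)
        if np.max(np.abs(v_new - v)) < tol:        # stop when the update barely changes v
            return v_new
        v = v_new
    return v
```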
Example: Policy evaluation
Policy evaluation
Policy evaluation
Policy evaluation + Greedy Improvement
Policy evaluation + Greedy Improvement
Policy Improvement
I The example already shows we can use evaluation to then improve our policy
I In fact, just being greedy with respect to the values of the random policy sufficed!
(That is not true in general)

Algorithm
Iterate, using

∀s : πnew(s) = argmax_a qπ(s, a)
             = argmax_a E[Rt+1 + γ vπ(St+1) | St = s, At = a]

Then, evaluate πnew and repeat

I Claim: One can show that vπnew (s) ≥ vπ (s), for all s
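A matching sketch of the greedy improvement step (same assumed arrays as above; the one-hot output encodes a deterministic πnew):

```python
import numpy as np

def greedy_improvement(r, p, v, gamma):
    """pi_new(s) = argmax_a E[R + gamma * v(S') | s, a], returned as a one-hot array."""
    q = r + gamma * p @ v                          # q(s, a) under the current value estimate
    pi_new = np.zeros_like(q)
    pi_new[np.arange(q.shape[0]), q.argmax(axis=1)] = 1.0
    return pi_new
```

Alternating policy_evaluation and greedy_improvement until the policy stops changing is the policy iteration scheme summarised on the next slide.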
Policy Improvement: qπnew (s, a) ≥ qπ (s, a)
Policy Iteration

Policy evaluation: estimate vπ
Policy improvement: generate π′ ≥ π
(These two steps are alternated until the policy no longer improves.)
Example: Jack’s Car Rental

I States: Two locations, maximum of 20 cars at each


I Actions: Move up to 5 cars overnight (-$2 each)
I Reward: $10 for each available car rented, γ = 0.9
I Transitions: Cars returned and requested randomly
I Poisson distribution: n returns/requests with probability (λ^n / n!) e^(−λ) (a small numerical sketch follows this slide)
I 1st location: average requests = 3, average returns = 3
I 2nd location: average requests = 4, average returns = 2
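A tiny sketch of the Poisson probabilities used above (the helper name is hypothetical, not from the slides):

```python
# p(n) = lambda^n / n! * exp(-lambda)
import math

def poisson_pmf(n: int, lam: float) -> float:
    return lam ** n / math.factorial(n) * math.exp(-lam)

# e.g. probability of exactly 3 rental requests at location 1 (average 3):
p3 = poisson_pmf(3, 3.0)   # ≈ 0.224
```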
Example: Jack’s Car Rental – Policy Iteration
Policy Iteration

I Does policy evaluation need to converge to vπ?

I Or should we stop when we are ‘close’ ?


(E.g., with a threshold on the change to the values)
I Or simply stop after k iterations of iterative policy evaluation?
I In the small gridworld, k = 3 was sufficient to achieve the optimal policy

I Extreme: Why not update policy every iteration — i.e. stop after k = 1?
I This is equivalent to value iteration
Value Iteration

I We could take the Bellman optimality equation, and turn that into an update

∀s : vk+1(s) ← max_a E[Rt+1 + γ vk(St+1) | St = s, At = a]

I This is equivalent to policy iteration, with k = 1 step of policy evaluation between


each two (greedy) policy improvement steps

Algorithm: Value Iteration

I Initialise v0
I Update: vk+1(s) ← max_a E[Rt+1 + γ vk(St+1) | St = s, At = a]
I Stopping: whenever vk+1(s) = vk(s), for all s, we must have found v∗
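A minimal sketch of value iteration for the same assumed tabular arrays, again with a tolerance in place of exact equality:

```python
import numpy as np

def value_iteration(r, p, gamma, tol=1e-8, max_iters=10_000):
    v = np.zeros(r.shape[0])                       # initialise v_0
    for _ in range(max_iters):
        v_new = (r + gamma * p @ v).max(axis=1)    # v_{k+1}(s) = max_a E[R + gamma v_k(S') | s, a]
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    q = r + gamma * p @ v
    return v, q.argmax(axis=1)                     # optimal values and a greedy (optimal) policy
```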
Example: Shortest Path
(Figure: value iteration on a 4×4 shortest-path gridworld with goal state g in the top-left corner. The panels show the problem and the successive value estimates V1 through V7; the values spread out from the goal by −1 per step until each state holds the negative of its shortest-path distance to g, e.g. −6 in the opposite corner.)
Synchronous Dynamic Programming Algorithms

Problem       Bellman Equation                              Algorithm
Prediction    Bellman Expectation Equation                  Iterative Policy Evaluation
Control       Bellman Expectation Equation
              + (Greedy) Policy Improvement                 Policy Iteration
Control       Bellman Optimality Equation                   Value Iteration

Observations:
I Algorithms are based on the state-value function vπ(s) or v∗(s) ⇒ complexity O(|A||S|²) per iteration, for |A| actions and |S| states
I Could also apply to the action-value function qπ(s, a) or q∗(s, a) ⇒ complexity O(|A|²|S|²) per iteration
Extensions to Dynamic Programming
Asynchronous Dynamic Programming

I DP methods described so far used synchronous updates (all states in parallel)

I Asynchronous DP
I backs up states individually, in any order
I can significantly reduce computation
I guaranteed to converge if all states continue to be selected
Asynchronous Dynamic Programming

Three simple ideas for asynchronous dynamic programming:


I In-place dynamic programming
I Prioritised sweeping
I Real-time dynamic programming
In-Place Dynamic Programming

I Synchronous value iteration stores two copies of the value function:

for all s ∈ S : vnew(s) ← max_a E[Rt+1 + γ vold(St+1) | St = s, At = a]
vold ← vnew

I In-place value iteration only stores one copy of the value function:

for all s ∈ S : v(s) ← max_a E[Rt+1 + γ v(St+1) | St = s, At = a]
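A minimal sketch of one in-place sweep (assumed arrays as before); states visited later in the sweep already use the freshly updated values:

```python
import numpy as np

def in_place_sweep(v, r, p, gamma):
    for s in range(v.shape[0]):                    # later states in the sweep see the new values
        v[s] = np.max(r[s] + gamma * p[s] @ v)     # max_a E[R + gamma v(S') | S_t = s, A_t = a]
    return v
```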
Prioritised Sweeping

I Use the magnitude of the Bellman error to guide state selection, e.g.

| max_a E[Rt+1 + γ v(St+1) | St = s, At = a] − v(s) |

I Backup the state with the largest remaining Bellman error


I Update Bellman error of affected states after each backup
I Requires knowledge of reverse dynamics (predecessor states)
I Can be implemented efficiently by maintaining a priority queue
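A rough sketch of prioritised sweeping with a priority queue (assumed tabular arrays; priorities are recomputed lazily and queue entries may be stale, which a production implementation would handle more carefully):

```python
import heapq
import numpy as np

def prioritised_value_iteration(r, p, gamma, n_backups=1_000):
    n_states = r.shape[0]
    v = np.zeros(n_states)

    def bellman_error(s):
        return abs(np.max(r[s] + gamma * p[s] @ v) - v[s])

    # Max-heap via negated priorities, keyed by the current Bellman error.
    queue = [(-bellman_error(s), s) for s in range(n_states)]
    heapq.heapify(queue)
    for _ in range(n_backups):
        if not queue:
            break
        _, s = heapq.heappop(queue)                    # state with the (possibly stale) largest error
        v[s] = np.max(r[s] + gamma * p[s] @ v)         # back up that state
        for pred in np.nonzero(p[:, :, s].sum(axis=1) > 0)[0]:   # predecessors of s (reverse dynamics)
            heapq.heappush(queue, (-bellman_error(pred), pred))  # refresh their priorities
    return v
```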
Real-Time Dynamic Programming

I Idea: only update states that are relevant to agent


I E.g., if the agent is in state St , update that state value, or states that it expects
to be in soon
Full-Width Backups

I Standard DP uses full-width backups
I For each backup (sync or async):
I Every successor state and action is considered
I Using the true model of transitions and the reward function
I DP is effective for medium-sized problems (millions of states)
I For large problems DP suffers from the curse of dimensionality
I The number of states n = |S| grows exponentially with the number of state variables
I Even one full backup can be too expensive

(Figure: full-width backup diagram relating vk+1(s) to vk(s′) over all actions a, rewards r, and successor states s′.)
Sample Backups

I In subsequent lectures we will consider sample backups
I Using sample rewards and sample transitions ⟨s, a, r, s′⟩
(instead of the reward function r and transition dynamics p)
I Advantages:
I Model-free: no advance knowledge of the MDP required
I Breaks the curse of dimensionality through sampling
I Cost of a backup is constant, independent of n = |S|

(Figure: sample backup diagram, a single sampled path s → a → r → s′ relating vk+1(s) to vk(s′).)


Summary
What have we covered today?

I Markov Decision Processes


I Objectives in an MDP: different notions of return
I Value functions - expected returns, condition on state (and action)
I Optimality principles in MDPs: optimal value functions and optimal policies
I Bellman Equations
I Two classes of problems in RL: evaluation and control
I How to compute vπ (aka solve an evaluation/prediction problem)
I How to compute the optimal value function via dynamic programming:
I Policy Iteration
I Value Iteration
Questions?

The only stupid question is the one you were afraid to ask but never did.
-Rich Sutton

For questions that may arise during this lecture please use Moodle and/or the next
Q&A session.
