Bandit Algorithms
Multi-armed bandits are one of the fundamental problems in reinforcement learning. The
problem can be described as follows: there are multiple slot machines (arms), each
with a different reward distribution, and the objective is to identify the machine with the
highest expected reward by playing them sequentially. Multi-armed bandits are widely
studied in the literature and arise in many real-world applications, such as advertising,
healthcare, and finance, where an agent must repeatedly choose the best option from a
set of alternatives.
Epsilon-Greedy: Epsilon-greedy is a simple algorithm that selects the arm with the
highest estimated reward with probability 1 - epsilon and selects a random arm with
probability epsilon. The value of epsilon is chosen to balance exploration and
exploitation. If epsilon is set to zero, the algorithm always selects the arm with the
highest estimated reward, which can lead to suboptimal choices if the estimates are
inaccurate. On the other hand, if epsilon is set to one, the algorithm always selects a
random arm, so it never exploits what it has learned; in practice, a small constant or a
schedule that decays epsilon over time is common.
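As an illustration, here is a minimal Python sketch of epsilon-greedy on simulated Bernoulli arms. The pull function and its reward probabilities are hypothetical stand-ins for a real environment:

import random

def pull(arm, means=(0.2, 0.5, 0.7)):
    # Hypothetical simulator: Bernoulli reward with a fixed mean per arm.
    return 1.0 if random.random() < means[arm] else 0.0

def epsilon_greedy(n_arms=3, epsilon=0.1, n_steps=10_000):
    counts = [0] * n_arms          # number of times each arm was played
    estimates = [0.0] * n_arms     # running mean reward of each arm
    for _ in range(n_steps):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates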
UCB: UCB (Upper Confidence Bound) is an algorithm that balances exploration and
exploitation by selecting the arm with the highest upper confidence bound on its
expected reward. The UCB score consists of two terms: the estimated reward and a
confidence bonus. In the classic UCB1 variant, the bonus is proportional to the square
root of the logarithm of the total number of plays divided by the number of times the
arm itself has been played, so rarely tried arms receive a larger bonus. UCB1 has been
shown to achieve logarithmic regret in stochastic environments; adversarial settings
are typically handled by related algorithms such as EXP3.
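The same simulated environment can be reused for a sketch of UCB1, where each arm's score is its estimated reward plus the bonus sqrt(2 ln t / n_a) described above; the pull simulator is again hypothetical:

import math
import random

def pull(arm, means=(0.2, 0.5, 0.7)):
    # Hypothetical simulator: Bernoulli reward with a fixed mean per arm.
    return 1.0 if random.random() < means[arm] else 0.0

def ucb1(n_arms=3, n_steps=10_000):
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for t in range(1, n_steps + 1):
        if t <= n_arms:
            arm = t - 1  # play each arm once so every count is nonzero
        else:
            # Estimated reward plus confidence bonus favors under-played arms.
            arm = max(range(n_arms),
                      key=lambda a: estimates[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates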
Thompson Sampling: Thompson Sampling is a Bayesian algorithm that maintains a
posterior distribution over each arm's expected reward and updates its beliefs about
the reward distribution of each arm after each play. At each step, the algorithm
samples a value from each arm's posterior and selects the arm with the highest
sample. Thompson Sampling has been shown to have good empirical performance
and strong regret guarantees, often competitive with UCB in practice.
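For Bernoulli rewards, Thompson Sampling takes a particularly clean form with Beta posteriors. The sketch below assumes that conjugate setup and reuses the same hypothetical pull simulator:

import random

def pull(arm, means=(0.2, 0.5, 0.7)):
    # Hypothetical simulator: Bernoulli reward with a fixed mean per arm.
    return 1.0 if random.random() < means[arm] else 0.0

def thompson_sampling(n_arms=3, n_steps=10_000):
    # Beta(1, 1) prior per arm: alpha counts successes, beta counts failures.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    for _ in range(n_steps):
        # Sample a plausible mean from each arm's posterior, play the best one.
        samples = [random.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        if pull(arm) == 1.0:
            alpha[arm] += 1
        else:
            beta[arm] += 1
    # Posterior mean reward estimate for each arm.
    return [a / (a + b) for a, b in zip(alpha, beta)]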
There are many extensions of the multi-armed bandit problem, such as contextual
bandits, collaborative filtering, and dynamic pricing. In contextual multi-armed bandits,
the rewards of the arms depend not only on the arm played but also on a context
vector that is observed by the agent. In collaborative filtering, the agent has to
recommend items to users based on their preferences. In dynamic pricing, the agent
has to set prices over time, learning how demand responds in order to maximize
revenue.
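As one concrete example of the contextual setting, the sketch below follows the disjoint LinUCB approach (a linear reward model per arm with an uncertainty bonus); the dimension, the alpha parameter, and the choose/update split here are illustrative assumptions, not a prescription:

import numpy as np

d, n_arms, alpha = 5, 3, 1.0
A = [np.eye(d) for _ in range(n_arms)]    # per-arm ridge statistics: I + sum of x x^T
b = [np.zeros(d) for _ in range(n_arms)]  # per-arm reward-weighted context sums

def choose(x):
    # Score each arm: predicted reward for context x plus an uncertainty bonus.
    scores = []
    for A_a, b_a in zip(A, b):
        A_inv = np.linalg.inv(A_a)
        theta = A_inv @ b_a  # ridge-regression coefficient estimate for this arm
        scores.append(theta @ x + alpha * np.sqrt(x @ A_inv @ x))
    return int(np.argmax(scores))

def update(arm, x, reward):
    # Fold the observed (context, reward) pair into the played arm's statistics.
    A[arm] += np.outer(x, x)
    b[arm] += reward * x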