The Fragility of Optimized Bandits



Much of the literature on optimal design of bandit algorithms is based on minimization of expected regret. It is well known that algorithms that are optimal over certain exponential families can achieve expected regret that grows logarithmically in the number of trials, at a rate specified by the Lai-Robbins lower bound. In this talk, we point out that when one uses such optimized algorithms, the resulting regret distribution necessarily has a very heavy tail, specifically, that of a truncated Cauchy distribution. We show that optimized UCB algorithms are also fragile in an additional sense, namely when the problem is even slightly mis-specified, the regret can grow much faster than the conventional theory suggests. To alleviate the fragility issues exposed, we show that UCB algorithms can be modified so as to ensure a desired degree of robustness to mis-specification. This is joint work with Lin Fan.



Peter W. Glynn is the Thomas Ford Professor in the Department of Management Science and Engineering (MS&E) at Stanford University, and also holds a courtesy appointment in the Department of Electrical Engineering. He received his Ph.D in Operations Research from Stanford University in 1982. He then joined the faculty of the University of Wisconsin at Madison, where he held a joint appointment between the Industrial Engineering Department and Mathematics Research Center, and courtesy appointments in Computer Science and Mathematics. In 1987, he returned to Stanford, where he joined the Department of Operations Research. From 1999 to 2005, he served as Deputy Chair of the Department of Management Science and Engineering, and was Director of Stanford's Institute for Computational and Mathematical Engineering from 2006 until 2010. He served as Chair of MS&E from 2011 through 2015. He is a Fellow of INFORMS and a Fellow of the Institute of Mathematical Statistics, and was an IMS Medallion Lecturer in 1995, a Lunteren Lecturer in 2007, the INFORMS Markov Lecturer in 2014, an Infosys-ICTS Turing Lecturer in 2019, and gave a Titan of Simulation talk at the 2019 Winter Simulation Conference. He was co-winner of the Outstanding Publication Awards from the INFORMS Simulation Society in 1993, 2008, and 2016, was a co-winner of the Best (Biannual) Publication Award from the INFORMS Applied Probability Society in 2009, was the co-winner of the John von Neumann Theory Prize from INFORMS in 2010, and gave the INFORMS Philip McCord Morse Lecture in 2020. In 2012, he was elected to the National Academy of Engineering, and in 2021 he received the Lifetime Professional Achievement Award of the INFORMS Simulation Society. He was Founding Editor-in-Chief of Stochastic Systems and served as Editor-in-Chief of Journal of Applied Probability and Advances in Applied Probability from 2016 to 2018. His research interests lie in simulation, computational probability, queueing theory, statistical inference for stochastic processes, and stochastic modeling.