Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 
WelshClear Cookie - decide language by browser settings

Stochastic scheduling: A short history of index policies and some key recent developments

Glazebrook, K. D., Kirkbride, C., Hodge, D. and Minty, John 2011. Stochastic scheduling: A short history of index policies and some key recent developments. Presented at: 5th Multidisciplinary International Conference on Scheduling: Theory and Applications (MISTA 2011), Phoenix, AZ, USA, 9-11 August 2011. Proceedings of the 5th Multidisciplinary International Conference on Scheduling: Theory and Applications (MISTA 2011). MISTA, pp. 94-112.

Full text not available from this repository.

Abstract

A multi-armed bandit problem classically concerns N >= 2 populations of rewards whose statistical properties are unknown (or at least only partly known). A decision-maker secures rewards by sampling sequentially from the populations, using past sampled values to make inferences about the populations and so guide the choice of which population to sample next. The goal is to make these choices in such a way as to maximise some measure of total reward secured. Such problems embody, in a particularly simple form, the dichotomy present in many decision problems between making decisions with a view to securing information which can improve future decision-making (exploration) and those which exploit the information already available (exploitation). In the 1970's John Gittins discovered that important classes of such multi-armed bandit problems have solutions of a particularly simple form: at each stage of the sampling compute an index (the Gittins index ) for each of the N populations, namely a function of the rewards already sampled from the population concerned. Always sample next from the population with the largest current index. Moreover, the index concerned has a simple interpretation as an equivalent known reward for the population concerned. It emerges that many problems involving the sequential allocation of effort, some of quite different character to the above multi-armed bandit problems, have index solutions. Since the 1970's, Gittins' index result together with a range of developments and reformulations of it have constituted an influential stream of ideas and results contributing to research into the scheduling of stochastic objects. Application areas to which these ideas have contributed include approximate dynamic programming, control of queueing systems, fast fashion retail, machine maintenance, military logistics, optimal search, research planning, sensor management, communication channel usage and website morphing. The talk will give an overview of some key ideas and some recent developments.

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Mathematics
Subjects: Q Science > QA Mathematics
Publisher: MISTA
Last Modified: 25 Oct 2016 02:52
URI: https://orca.cardiff.ac.uk/id/eprint/38792

Actions (repository staff only)

Edit Item Edit Item