Regret Bounds for Gaussian Process Bandit Problems

Steffen Grünewälder; Jean-Yves Audibert; Manfred Opper; John Shawe-Taylor

Conference paper. Year: 2010

Steffen Grünewälder

Role: Author

Department of Computer science [University College of London]

Jean-Yves Audibert

Role: Author
PersonId: 931557

imagine [Marne-la-Vallée]

Models of visual object recognition and scene understanding

Laboratoire d'Informatique Gaspard-Monge

Manfred Opper

Role: Author

Department of Artificial Intelligence

John Shawe-Taylor

Role: Author

Department of Computer science [University College of London]

Abstract

Bandit algorithms are concerned with trading exploration with exploitation where a number of options are available but we can only learn their quality by experimenting with them. We consider the scenario in which the reward distribution for arms is modelled by a Gaussian process and there is no noise in the observed reward. Our main result is to bound the regret experienced by algorithms relative to the a posteriori optimal strategy of playing the best arm throughout based on benign assumptions about the covariance function defining the Gaussian process. We further complement these upper bounds with corresponding lower bounds for particular covariance functions demonstrating that in general there is at most a logarithmic looseness in our upper bounds.
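
The notion of regret used in the abstract can be illustrated with a minimal sketch. It assumes a finite set of arms whose noise-free rewards are fixed values (a stand-in for a single draw of the Gaussian process evaluated at those arms) and measures cumulative regret against the strategy of playing the best arm in every round. The function name `cumulative_regret` and the example values are hypothetical and are not taken from the paper.

```python
def cumulative_regret(arm_rewards, played_arms):
    """Cumulative regret relative to always playing the best arm.

    arm_rewards: noise-free reward of each arm (assumed fixed per arm).
    played_arms: indices of the arms chosen in each round.
    """
    best = max(arm_rewards)  # reward of the a posteriori best arm
    # Each round loses the gap between the best arm and the arm played.
    return sum(best - arm_rewards[a] for a in played_arms)


# Hypothetical example: three arms, four rounds.
rewards = [0.2, 0.9, 0.5]
history = [0, 2, 1, 1]  # explore arms 0 and 2, then settle on arm 1
total = cumulative_regret(rewards, history)  # roughly 1.1
```

Because the rewards are noise-free in this setting, a single pull reveals an arm's value exactly, so regret accrues only in rounds where a suboptimal arm is played.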

Domains

Machine Learning [cs.LG]; Other [stat.ML]

Main file

AISTATS10.pdf (701.79 KB)

Origin: Files produced by the author(s)

Contributor: Jean-Yves Audibert

https://enpc.hal.science/hal-00654517

Submitted on: Thursday, 22 December 2011, 10:06:06

Last modified on: Friday, 19 April 2024, 16:18:57

Long-term archiving on: Friday, 23 March 2012, 02:21:32

Dates and versions

hal-00654517, version 1 (22-12-2011)

Identifiers

HAL Id: hal-00654517, version 1

Cite

Steffen Grünewälder, Jean-Yves Audibert, Manfred Opper, John Shawe-Taylor. Regret Bounds for Gaussian Process Bandit Problems. AISTATS 2010 - Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010, Chia Laguna Resort, Sardinia, Italy. pp.273-280. ⟨hal-00654517⟩

Collections

ENS-PARIS ENPC CNRS INRIA UNIV-MLV LIGM_A3SI PARISTECH LIGM IMAGINE INRIA2 PSL ESIEE-PARIS UNIV-EIFFEL JSE2024

876 views

265 downloads
