P. Auer, N. Cesa-bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

S. Bubeck, R. Munos, and G. Stoltz, Pure Exploration in Multi-armed Bandits Problems, Proc. of the 20th International Conference on Algorithmic Learning Theory, 2009.
DOI : 10.1090/S0002-9904-1952-09620-8

C. Domingo, R. Gavaldà, and O. Watanabe, Adaptive sampling methods for scaling up knowledge discovery algorithms, Data Mining and Knowledge Discovery, vol.6, issue.2, pp.131-152, 2002.
DOI : 10.1023/A:1014091514039

E. Even-dar, S. Mannor, and Y. Mansour, Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems, The Journal of Machine Learning Research, vol.7, pp.1079-1105, 2006.

O. Maron and A. W. Moore, Hoeffding races: Accelerating model selection search for classification and function approximation, NIPS, pp.59-66, 1993.

V. Mnih, C. Szepesvári, and J. Audibert, Empirical Bernstein stopping, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.672-679, 2008.
DOI : 10.1145/1390156.1390241

URL : https://hal.archives-ouvertes.fr/hal-00834983

H. Robbins, Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8