C. Allenberg, P. Auer, L. Györfi, and G. Ottucsák, Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, ALT, pp.229-243, 2006.
DOI : 10.1007/11894841_20

J. Audibert, R. Munos, and C. Szepesvári, Exploration–exploitation tradeoff using variance estimates in multi-armed bandits, Theoretical Computer Science, vol.410, issue.19, pp.1876-1902, 2009.
DOI : 10.1016/j.tcs.2009.01.016

URL : https://hal.archives-ouvertes.fr/hal-00711069

P. Auer, Using confidence bounds for exploitation-exploration trade-offs, Journal of Machine Learning Research, vol.3, pp.397-422, 2002.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire, Gambling in a rigged casino: The adversarial multi-armed bandit problem, Proceedings of IEEE 36th Annual Foundations of Computer Science, pp.322-331, 1995.
DOI : 10.1109/SFCS.1995.492488

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375

K. Azuma, Weighted sums of certain dependent random variables, Tohoku Mathematical Journal, vol.19, issue.3, pp.357-367, 1967.
DOI : 10.2748/tmj/1178243286

N. Cesa-Bianchi, Analysis of two gradient-based algorithms for on-line regression, Proceedings of the tenth annual conference on Computational learning theory, COLT '97, pp.392-411, 1999.
DOI : 10.1145/267460.267492

N. Cesa-Bianchi, Y. Freund, D. P. Helmbold, D. Haussler, R. E. Schapire et al., How to use expert advice, Journal of the ACM, vol.44, issue.3, pp.427-485, 1997.
DOI : 10.1145/258128.258179

N. Cesa-Bianchi, G. Lugosi, and G. Stoltz, Minimizing Regret With Label Efficient Prediction, IEEE Transactions on Information Theory, vol.51, issue.6, pp.2152-2162, 2005.
DOI : 10.1109/TIT.2005.847729

URL : https://hal.archives-ouvertes.fr/hal-00007537

D. A. Freedman, On tail probabilities for martingales, The Annals of Probability, pp.100-118, 1975.

A. György and G. Ottucsák, Adaptive Routing Using Expert Advice, The Computer Journal, vol.49, issue.2, pp.180-189, 2006.
DOI : 10.1093/comjnl/bxh168

D. Helmbold and S. Panizza, Some label efficient learning results, Proceedings of the tenth annual conference on Computational learning theory, COLT '97, pp.218-230, 1997.
DOI : 10.1145/267460.267502

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.179.437

W. Hoeffding, Probability Inequalities for Sums of Bounded Random Variables, Journal of the American Statistical Association, vol.58, issue.301, pp.13-30, 1963.
DOI : 10.1214/aoms/1177730491

H. Robbins, Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8