D. P. Bertsekas, The method of multipliers for equality constraints, Constrained Optimization and Lagrange Multiplier Methods. Athena Scientific, 1982.

D. P. Bertsekas, Nonlinear programming, 1999.

M. Collins, A. Globerson, T. Koo, X. Carreras, and P. L. Bartlett, Exponentiated gradient algorithms for conditional random fields and max-margin Markov networks, JMLR, vol.9, pp.1775-1822, 2008.

S. Furuichi, Information theoretical properties of Tsallis entropies, Journal of Mathematical Physics, vol.47, issue.2, p.023302, 2006.

T. Hazan and R. Urtasun, A primal-dual message-passing algorithm for approximated large scale structured prediction, NIPS, pp.838-846, 2010.

M. Hong and Z. Luo, On the linear convergence of the alternating direction method of multipliers, Mathematical Programming, vol.162, pp.165-199, 2017.

T. Koo, A. Globerson, X. Carreras, and M. Collins, Structured prediction models via the matrix-tree theorem, Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp.141-150, 2007.

S. Lacoste-Julien, M. Jaggi, M. Schmidt, and P. Pletscher, Block-coordinate Frank-Wolfe optimization for structural SVMs, ICML, pp.53-61, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00720158

B. London, B. Huang, and L. Getoor, The benefits of learning with strongly convex approximate inference, ICML, pp.410-418, 2015.

O. Meshi, D. Sontag, A. Globerson, and T. S. Jaakkola, Learning efficiently with approximate inference via dual losses, ICML, pp.783-790, 2010.

O. Meshi, N. Srebro, and T. Hazan, Efficient training of structured SVMs via soft constraints, AISTATS, pp.699-707, 2015.

Y. Nesterov, Introductory lectures on convex optimization: A basic course, 2013.
DOI : 10.1007/978-1-4419-8853-9