A. Argyriou, R. Foygel, and N. Srebro, Sparse prediction with the k-support norm, Advances in Neural Information Processing Systems 25, pp.1466-1474, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00858954

F. Bach, Structured sparsity-inducing norms through submodular functions, Adv. NIPS, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00511310

F. Bach, Learning with Submodular Functions: A Convex Optimization Perspective, Foundations and Trends® in Machine Learning, vol.6, issue.2-3, pp.145-373, 2013.
DOI : 10.1561/2200000039
URL : https://hal.archives-ouvertes.fr/hal-00645271

F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Optimization with sparsity-inducing penalties, Foundations and Trends® in Machine Learning, vol.4, issue.1, pp.1-106, 2012.
DOI : 10.1561/2200000015
URL : https://hal.archives-ouvertes.fr/hal-00613125

R. Baraniuk, V. Cevher, M. Duarte, and C. Hegde, Model-Based Compressive Sensing, IEEE Transactions on Information Theory, vol.56, issue.4, pp.1982-2001, 2010.
DOI : 10.1109/TIT.2010.2040894

R. Barlow and H. Brunk, The Isotonic Regression Problem and its Dual, Journal of the American Statistical Association, vol.67, issue.337, pp.140-147, 1972.
DOI : 10.1080/01621459.1972.10481216

F. Bauer, J. Stoer, and C. Witzgall, Absolute and monotonic norms, Numerische Mathematik, vol.3, issue.1, pp.257-264, 1961.
DOI : 10.1007/BF01386026

M. Best and N. Chakravarti, Active set algorithms for isotonic regression; A unifying framework, Mathematical Programming, pp.425-439, 1990.
DOI : 10.1007/BF01580873

P. Bickel, Y. Ritov, and A. Tsybakov, Simultaneous analysis of Lasso and Dantzig selector, The Annals of Statistics, vol.37, issue.4, pp.1705-1732, 2009.
DOI : 10.1214/08-AOS620
URL : https://hal.archives-ouvertes.fr/hal-00401585

J. Bien, J. Taylor, and R. Tibshirani, A lasso for hierarchical interactions, The Annals of Statistics, vol.41, issue.3, pp.1111-1141, 2013.
DOI : 10.1214/13-aos1096
URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4527358

M. Bogdan, E. van den Berg, C. Sabatti, W. Su, and E. J. Candès, SLOPE - Adaptive variable selection via convex optimization, The Annals of Applied Statistics, vol.9, issue.3, pp.1103-1140, 2015.
DOI : 10.1214/15-AOAS842
URL : http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4689150

H. D. Bondell and B. J. Reich, Simultaneous Regression Shrinkage, Variable Selection, and Supervised Clustering of Predictors with OSCAR, Biometrics, vol.64, issue.1, pp.115-123, 2008.
DOI : 10.1111/j.1541-0420.2007.00843.x

S. P. Boyd and L. Vandenberghe, Convex Optimization, 2004.

A. Chambolle and J. Darbon, On Total Variation Minimization and Surface Evolution Using Parametric Maximum Flows, International Journal of Computer Vision, vol.84, issue.3, pp.288-307, 2009.
DOI : 10.1007/s11263-009-0238-9
URL : http://escholarship.org/uc/item/5sd211v1.pdf

V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky, The Convex Geometry of Linear Inverse Problems, Foundations of Computational Mathematics, vol.12, issue.6, pp.805-849, 2012.
DOI : 10.1007/s10208-012-9135-7

W. Dinkelbach, On Nonlinear Fractional Programming, Management Science, vol.13, issue.7, pp.492-498, 1967.
DOI : 10.1287/mnsc.13.7.492

J. Edmonds, Submodular Functions, Matroids, and Certain Polyhedra, Combinatorial Optimization - Eureka, You Shrink!, pp.11-26, 2003.
DOI : 10.1007/3-540-36478-1_2

M. Figueiredo and R. D. Nowak, Sparse estimation with strongly correlated variables using ordered weighted $\ell_1$ regularization, 2014.

S. Fujishige, Submodular Functions and Optimization, 2005.

G. Gallo, M. D. Grigoriadis, and R. E. Tarjan, A Fast Parametric Maximum Flow Algorithm and Applications, SIAM Journal on Computing, vol.18, issue.1, pp.30-55, 1989.
DOI : 10.1137/0218003

H. Groenevelt, Two algorithms for maximizing a separable concave function over a polymatroid feasible region, European Journal of Operational Research, vol.54, issue.2, pp.227-236, 1991.
DOI : 10.1016/0377-2217(91)90300-K

L. He and L. Carin, Exploiting structure in wavelet-based Bayesian compressive sensing, IEEE Transactions on Signal Processing, vol.57, pp.3488-3497, 2009.

D. S. Hochbaum and S. Hong, About strongly polynomial time algorithms for quadratic optimization over submodular constraints, Mathematical Programming, vol.69, issue.1-3, pp.269-309, 1995.
DOI : 10.1007/BF01585561

J. Huang, T. Zhang, and D. Metaxas, Learning with structured sparsity, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553429
URL : http://arxiv.org/abs/0903.3002

L. Jacob, G. Obozinski, and J. Vert, Group lasso with overlap and graph lasso, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553431

R. Jenatton, J. Audibert, and F. Bach, Structured variable selection with sparsity-inducing norms, JMLR, vol.12, pp.2777-2824, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00377732

R. Jenatton, J. Mairal, G. Obozinski, and F. Bach, Proximal methods for sparse hierarchical dictionary learning, Proc. ICML, 2010.

R. Jenatton, J. Mairal, G. Obozinski, and F. Bach, Proximal methods for hierarchical sparse coding, JMLR, vol.12, pp.2297-2334, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00516723

S. Kim and E. P. Xing, Tree-guided group lasso for multi-task regression with structured sparsity, Proc. ICML, 2010.
DOI : 10.1214/12-aoas549
URL : http://arxiv.org/abs/0909.1373

L. Lovász, On the ratio of optimal integral and fractional covers, Discrete Mathematics, vol.13, issue.4, pp.383-390, 1975.
DOI : 10.1016/0012-365X(75)90058-8

R. Luss and S. Rosset, Generalized Isotonic Regression, Journal of Computational and Graphical Statistics, vol.58, issue.1, pp.192-210, 2014.
DOI : 10.1016/0022-0000(83)90006-5

J. Mairal, R. Jenatton, G. Obozinski, and F. Bach, Convex and network flow optimization for structured sparsity, JMLR, vol.12, pp.2681-2720, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00584817

A. M. McDonald, M. Pontil, and D. Stamos, New perspectives on k-support and cluster norms, arXiv preprint, 2015.

C. A. Micchelli, J. M. Morales, and M. Pontil, Regularizers for structured sparsity, Advances in Computational Mathematics, vol.38, issue.3, pp.455-489, 2013.
DOI : 10.1007/s10444-011-9245-9

S. Negahban, P. Ravikumar, M. Wainwright, and B. Yu, A Unified Framework for High-Dimensional Analysis of $M$-Estimators with Decomposable Regularizers, Statistical Science, vol.27, issue.4, pp.538-557, 2012.
DOI : 10.1214/12-STS400

S. Negahban and M. J. Wainwright, Joint support recovery under high-dimensional scaling: Benefits and perils of $\ell_1$-$\ell_\infty$-regularization, Adv. NIPS, 2008.

G. Obozinski, L. Jacob, and J. Vert, Group Lasso with overlaps: the Latent Group Lasso approach, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00628498

P. M. Pardalos and G. Xue, Algorithms for a Class of Isotonic Regression Problems, Algorithmica, vol.23, issue.3, pp.211-222, 1999.
DOI : 10.1007/PL00009258

J. Rissanen, Modeling by shortest data description, Automatica, vol.14, issue.5, pp.465-471, 1978.
DOI : 10.1016/0005-1098(78)90005-5

R. Rockafellar, Convex Analysis, 1970.
DOI : 10.1515/9781400873173

G. W. Stewart and J. Sun, Matrix Perturbation Theory, 1990.

Q. F. Stout, Isotonic Regression via Partitioning, Algorithmica, vol.66, issue.1, pp.93-112, 2013.
DOI : 10.1007/s00453-012-9628-4
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.310.1384

S. van de Geer, Weakly decomposable regularization penalties and structured sparsity, Scandinavian Journal of Statistics, vol.41, issue.1, pp.72-86, 2014.
DOI : 10.1111/sjos.12032

X. Yan and J. Bien, Hierarchical sparse modeling: A choice of two regularizers, arXiv preprint, 2015.

M. Yuan, V. R. Joseph, and H. Zou, Structured variable selection and estimation, The Annals of Applied Statistics, vol.3, issue.4, pp.1738-1757, 2009.
DOI : 10.1214/09-AOAS254

M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.68, issue.1, pp.49-67, 2006.
DOI : 10.1111/j.1467-9868.2005.00532.x

P. Zhao, G. Rocha, and B. Yu, The composite absolute penalties family for grouped and hierarchical variable selection, The Annals of Statistics, vol.37, issue.6A, pp.3468-3497, 2009.
DOI : 10.1214/07-AOS584

P. Zhao and B. Yu, On model selection consistency of Lasso, JMLR, vol.7, pp.2541-2563, 2006.

L. W. Zhong and J. T. Kwok, Efficient Sparse Modeling With Automatic Feature Grouping, IEEE Transactions on Neural Networks and Learning Systems, vol.23, issue.9, pp.1436-1447, 2012.
DOI : 10.1109/TNNLS.2012.2200262
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.224.2577

Y. Zhou, R. Jin, and S. C. Hoi, Exclusive lasso for multi-task feature selection, AISTATS, 2010.