P. Agrawal, J. Carreira, and J. Malik, Learning to See by Moving, 2015 IEEE International Conference on Computer Vision (ICCV), pp.37-45, 2015.
DOI : 10.1109/ICCV.2015.13

Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, Greedy layer-wise training of deep networks, Advances in neural information processing systems, pp.153-160, 2007.

P. Bojanowski and A. Joulin, Unsupervised learning by predicting noise. arXiv preprint, 2017.

C. Doersch and A. Zisserman, Multi-task self-supervised visual learning. arXiv preprint, 2017.

C. Doersch, A. Gupta, and A. A. Efros, Unsupervised Visual Representation Learning by Context Prediction, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1422-1430, 2015.
DOI : 10.1109/ICCV.2015.167

J. Donahue, P. Krähenbühl, and T. Darrell, Adversarial feature learning. arXiv preprint, 2016.

A. Dosovitskiy, J. T. Springenberg, M. Riedmiller, and T. Brox, Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks, Advances in Neural Information Processing Systems, pp.766-774, 2014.
DOI : 10.1109/TPAMI.2015.2496141

M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, The Pascal Visual Object Classes (VOC) Challenge, International Journal of Computer Vision, vol.88, issue.2, pp.303-338, 2010.
DOI : 10.1007/s11263-009-0275-4

R. Girshick, Fast R-CNN, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1440-1448, 2015.
DOI : 10.1109/ICCV.2015.169

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley et al., Generative adversarial nets, Advances in neural information processing systems, pp.2672-2680, 2014.

P. Goyal, P. Dollár, R. Girshick, P. Noordhuis, L. Wesolowski et al., Accurate, large minibatch SGD: Training ImageNet in 1 hour. arXiv preprint, 2017.

M. Ranzato, F. J. Huang, Y.-L. Boureau, and Y. LeCun, Unsupervised learning of invariant feature hierarchies with applications to object recognition, 2007 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1-8, 2007.

A. Karpathy and L. Fei-Fei, Deep visual-semantic alignments for generating image descriptions, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3128-3137, 2015.

P. Krähenbühl, C. Doersch, J. Donahue, and T. Darrell, Data-dependent initializations of convolutional neural networks, 2015.

A. Krizhevsky and G. Hinton, Learning multiple layers of features from tiny images, 2009.

G. Larsson, M. Maire, and G. Shakhnarovich, Learning Representations for Automatic Colorization, European Conference on Computer Vision, pp.577-593, 2016.
DOI : 10.1007/978-3-319-46493-0_35

G. Larsson, M. Maire, and G. Shakhnarovich, Colorization as a proxy task for visual understanding. arXiv preprint, 2017.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

R. Liao, A. Schwing, R. Zemel, and R. Urtasun, Learning deep parsimonious representations, Advances in Neural Information Processing Systems, pp.5076-5084, 2016.

M. Lin, Q. Chen, and S. Yan, Network in network. arXiv preprint, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01950552

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298965

J. Masci, U. Meier, D. Cireşan, and J. Schmidhuber, Stacked convolutional autoencoders for hierarchical feature extraction, Artificial Neural Networks and Machine Learning – ICANN 2011, pp.52-59, 2011.
DOI : 10.1007/978-3-642-21735-7_7
URL : http://www.idsia.ch/~juergen/icann2011stack.pdf

M. Noroozi and P. Favaro, Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, European Conference on Computer Vision, pp.69-84, 2016.
DOI : 10.1007/978-3-319-46466-4_5

M. Noroozi, H. Pirsiavash, and P. Favaro, Representation learning by learning to count. arXiv preprint, 2017.

E. Oyallon and S. Mallat, Deep roto-translation scattering for object classification, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.2865-2873, 2015.
DOI : 10.1109/CVPR.2015.7298904

E. Oyallon, E. Belilovsky, and S. Zagoruyko, Scaling the Scattering Transform: Deep Hybrid Networks, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.599
URL : https://hal.archives-ouvertes.fr/hal-01495734

D. Pathak, R. Girshick, P. Dollár, T. Darrell, and B. Hariharan, Learning Features by Watching Objects Move, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.638

D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, and A. A. Efros, Context Encoders: Feature Learning by Inpainting, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.2536-2544, 2016.
DOI : 10.1109/CVPR.2016.278
URL : http://arxiv.org/pdf/1604.07379

A. Radford, L. Metz, and S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, 2015.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.
DOI : 10.1007/s11263-015-0816-y
URL : http://arxiv.org/pdf/1409.0575

X. Wang and A. Gupta, Unsupervised Learning of Visual Representations Using Videos, 2015 IEEE International Conference on Computer Vision (ICCV), pp.2794-2802, 2015.
DOI : 10.1109/ICCV.2015.320

J. Yang, D. Parikh, and D. Batra, Joint Unsupervised Learning of Deep Representations and Image Clusters, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.5147-5156, 2016.
DOI : 10.1109/CVPR.2016.556

R. Zhang, P. Isola, and A. A. Efros, Colorful Image Colorization, European Conference on Computer Vision, pp.649-666, 2016.
DOI : 10.1007/978-3-319-46487-9_40
URL : http://arxiv.org/pdf/1603.08511

R. Zhang, P. Isola, and A. A. Efros, Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.76
URL : http://arxiv.org/pdf/1611.09842

B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, Learning deep features for scene recognition using places database, Advances in Neural Information Processing Systems 27, pp.487-495, 2014.