Probabilistic Approach to One-Class Support Vector Machine

Vincent Leclère (CERMICS - Centre d'Enseignement et de Recherche en Mathématiques et Calcul Scientifique, ENPC - École des Ponts ParisTech), Edouard Grave (LBNL - Lawrence Berkeley National Laboratory), Laurent El Ghaoui (LBNL - Lawrence Berkeley National Laboratory)

HAL preprint, 2016. https://hal-enpc.archives-ouvertes.fr/hal-01404973
Subjects: [STAT.ML] Statistics [stat]/Machine Learning [stat.ML]; [MATH.MATH-OC] Mathematics [math]/Optimization and Control [math.OC]

Abstract. Classification is one of the main problems addressed by machine learning algorithms. Among these, the Support Vector Machine (SVM) has attracted considerable interest and shown success over the past decades. SVMs were originally tailored for binary classification; when only a few negative examples are available, one can turn to the one-class SVM. In this paper we propose a probabilistic interpretation of the one-class SVM and an extension especially adapted to highly imbalanced datasets. Specifically, we consider a binary classification problem in which the negative class is represented by its first two moments, while the positive class is still modeled by individual examples. The resulting optimization problem is shown to be equivalent to a one-class SVM applied to the positive dataset after a suitable preprocessing; the usual one-class SVM corresponds to the case where the negative class has zero mean and identity covariance. We show empirically, on a protein classification task and a text classification task, that our approach achieves statistical performance similar to that of the two mainstream approaches to imbalanced classification problems, while being more computationally efficient.
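The abstract's equivalence can be illustrated with a short sketch: whiten the positive examples using the negative class's mean and covariance, then fit a standard one-class SVM. The exact preprocessing used in the paper is not given here, so the whitening map below (and the toy data, `mu_neg`, `Sigma_neg`) is an assumption chosen so that a zero mean and identity covariance recover the usual one-class SVM, as the abstract states.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Toy positive examples; the negative class is summarized only by its
# first two moments (mean and covariance), as in the paper's setting.
X_pos = rng.normal(loc=2.0, scale=1.0, size=(200, 2))
mu_neg = np.zeros(2)      # assumed negative-class mean
Sigma_neg = np.eye(2)     # assumed negative-class covariance

# Hypothetical preprocessing: whiten with respect to the negative class,
# x -> L^{-1} (x - mu) where Sigma = L L^T. With mu = 0 and Sigma = I this
# is the identity map, recovering the usual one-class SVM.
L = np.linalg.cholesky(Sigma_neg)
X_white = np.linalg.solve(L, (X_pos - mu_neg).T).T

# Fit a one-class SVM on the preprocessed positive data.
clf = OneClassSVM(nu=0.1, kernel="rbf").fit(X_white)
scores = clf.decision_function(X_white)  # one score per positive example
```

With a nonzero `mu_neg` or a non-identity `Sigma_neg`, the same code shifts and rescales the positive data so that the negative class is implicitly centered and whitened, which is the intuition behind the extension described above.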