
A semi-supervised Learning Approach to find equivalent long-string Organization Names

Abstract: Background: A platform called Opalia has been built to provide free access to all publications of a laboratory for a given range of years. The platform indexes the corpus of scientific articles of a given lab. In the French research system, however, a lab brings together researchers from different organizations in a single unit, generally called a UMR, and authors write their laboratory's name in different ways. Aim: Sorting a set of noisy labels can be seen as a binary classification that keeps the positive strings and leaves out the negative ones. We propose a cascade process that relies on tagging some positive strings to build a relevant feature space, which helps classify the labels as good ones.
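
The abstract does not detail the cascade, so the following is only a minimal sketch in Python of one way such a process could look, not the authors' implementation. It assumes two stages: an exact match after normalization, then a character trigram overlap against a feature space built from the tagged positive strings. The function names, the trigram features, the 0.6 threshold, and the sample laboratory names are all illustrative assumptions that do not come from the paper.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(cleaned.split())


def char_ngrams(text, n=3):
    """Return the multiset of character n-grams of the normalized string."""
    s = normalize(text)
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def build_feature_space(positives, n=3):
    """Collect every n-gram seen in the hand-tagged positive strings."""
    vocab = set()
    for p in positives:
        vocab.update(char_ngrams(p, n))
    return vocab


def overlap_score(label, vocab, n=3):
    """Fraction of the label's n-grams that belong to the positive feature space."""
    grams = char_ngrams(label, n)
    total = sum(grams.values())
    if total == 0:
        return 0.0
    hits = sum(count for gram, count in grams.items() if gram in vocab)
    return hits / total


def classify(labels, positives, threshold=0.6, n=3):
    """Two-stage cascade (assumed): exact normalized match, then n-gram overlap."""
    known = {normalize(p) for p in positives}
    vocab = build_feature_space(positives, n)
    result = {}
    for label in labels:
        if normalize(label) in known:
            result[label] = True  # stage 1: same string up to case and punctuation
        else:
            result[label] = overlap_score(label, vocab, n) >= threshold  # stage 2
    return result


# Hypothetical tagged positive and noisy labels, for illustration only.
positives = ["Laboratoire Exemple de Chimie, UMR 0000"]
labels = [
    "laboratoire exemple de chimie UMR 0000",
    "Lab. Exemple de Chimie (UMR 0000)",
    "Department of Physics, Some University",
]
decisions = classify(labels, positives)
```

In this sketch the first two labels are accepted as variants of the tagged name and the third is rejected. A real system would likely need richer features and a tuned threshold.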
Contributor: Frédérique Bordignon
Submitted on: Thursday, October 10, 2019 - 09:31:40
Last modified on: Friday, October 15, 2021 - 10:56:03




  • HAL Id: hal-02310298, version 1



Frédérique Bordignon, Nicolas Turenne, Yann Feugueur. A semi-supervised Learning Approach to find equivalent long-string Organization Names. Colloque- Forum PEPS EXIA, Oct 2016, Champs sur Marne, France. 2016. ⟨hal-02310298⟩


