Poster

A semi-supervised Learning Approach to find equivalent long-string Organization Names

Abstract: Background: A platform called Opalia has been built to offer free access to all publications of a laboratory for a given range of years. The platform indexes the corpus of scientific articles of a given lab. In the French research system, however, a lab brings together researchers from different organizations in a single unit, generally called a UMR, so authors can write their laboratory's name in different ways. Aim: Sorting a noisy set of labels can be seen as a binary classification that keeps the positive strings and leaves out the negative ones. We propose a cascade process in which some positive strings are tagged to build a relevant feature space that helps classify labels as good ones.
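The abstract does not say which features or classifier the poster uses. As an illustration only, the idea of tagging a few positive strings and using them to sort the remaining labels could be sketched as below, assuming character trigram features and a cosine-similarity threshold. Both choices, the function names, and the sample strings are hypothetical and do not come from the source.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count character n-grams of a lowercased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def classify(labels, seeds, threshold=0.5):
    """Two-stage sketch (an assumption, not the authors' method).

    Stage 1: pool the n-grams of the hand-tagged positive strings (seeds)
             into one profile that defines the feature space.
    Stage 2: mark each label as positive when its similarity to that
             profile reaches the threshold.
    """
    profile = Counter()
    for seed in seeds:
        profile.update(char_ngrams(seed))
    return {label: cosine(char_ngrams(label), profile) >= threshold
            for label in labels}


if __name__ == "__main__":
    # Hypothetical affiliation strings, invented for this example.
    seeds = ["Laboratoire Exemple UMR 0000"]
    labels = ["laboratoire  exemple umr 0000", "xyz qwv"]
    for label, is_positive in classify(labels, seeds).items():
        print(label, "->", is_positive)
```

The threshold of 0.5 is arbitrary; in practice it would need to be tuned on labels whose status is already known.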

Cited literature [5 references]

https://hal-enpc.archives-ouvertes.fr/hal-02310298
Contributor: Frédérique Bordignon
Submitted on: Thursday, October 10, 2019 - 09:31:40
Last modified on: Tuesday, March 17, 2020 - 01:33:39

File

poster_EXIA_v1.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02310298, version 1

Collections

Citation

Frédérique Bordignon, Nicolas Turenne, Yann Feugueur. A semi-supervised Learning Approach to find equivalent long-string Organization Names. Colloque-Forum PEPS EXIA, Oct 2016, Champs-sur-Marne, France. 2016. ⟨hal-02310298⟩


Metrics

Record views

64

File downloads

28