Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Approche statistique pour le filtrage terminologique des occurrences de candidats termes en texte intégral

Camacho Collados, Jose, Billami, Mokhtar Boumedyen, Jacquey, Evelyne and Kister, Laurence 2014. Approche statistique pour le filtrage terminologique des occurrences de candidats termes en texte intégral. Presented at: JADT 2014, Paris, 3-6 June 2014. Proceedings of the 12th International Conference on the Statistical Analysis of Textual Data.

Full text not available from this repository.

Abstract

Following (L'Homme, 2004), this paper focuses on term variation in French full text. More precisely, it highlights the semantic ambiguity of term occurrences with regard to a very high-level distinction between terminological and general uses. This issue is especially common in the Humanities. For instance, we are interested in distinguishing the terminological meaning of the term "sujet (subject)" in the phrase "le sujet de la phrase (the subject of the sentence)" (Linguistics) or "les réponses du sujet (subject's answers)" (Psychology) from the general meaning of the noun "sujet (topic)" found in a phrase like "le sujet de cet article (the topic of this article)". To solve this problem, we assume that the textual contexts around term occurrences give us relevant information on the kind of use we face, terminological or general. Our research is based on a statistical approach to textual contexts. The proposed metrics are based on the hypergeometric distribution and the lexical specificity calculation described in (Lafon, 1980). Using a manually annotated corpus as the training set, we build lexical profiles for each high-level meaning of the term candidates. We use two methods, which are compared to a baseline metric based on term frequency. The results are analyzed from both a quantitative and a qualitative point of view.
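
The abstract names the hypergeometric distribution and lexical specificity but does not give the exact formulation used in the paper. As an illustration only, the following minimal Python sketch shows one common way to compute a specificity score for a context word. The function names, the parameter names (T, F, t, f) and the signed log10 scoring convention are assumptions made for this example and are not taken from the paper.

```python
from math import comb, log10


def hypergeom_pmf(k, T, F, t):
    """P(X = k): probability that a word with corpus frequency F occurs
    exactly k times in a part of t tokens drawn from a corpus of T tokens."""
    return comb(F, k) * comb(T - F, t - k) / comb(T, t)


def specificity(f, T, F, t):
    """Signed specificity score (an assumed convention, not the paper's).

    f: observed frequency of the word in the part (e.g. contexts of
       terminological uses); F: its frequency in the whole corpus;
    t: size of the part; T: size of the whole corpus.
    Positive score: the word is over-represented in the part.
    Negative score: the word is under-represented in the part.
    """
    expected = F * t / T
    if f >= expected:
        # Upper tail: probability of seeing f or more occurrences by chance.
        p = sum(hypergeom_pmf(k, T, F, t) for k in range(f, min(F, t) + 1))
        return -log10(p)
    # Lower tail: probability of seeing f or fewer occurrences by chance.
    lowest = max(0, t - (T - F))
    p = sum(hypergeom_pmf(k, T, F, t) for k in range(lowest, f + 1))
    return log10(p)


if __name__ == "__main__":
    # Toy numbers, invented for illustration: a 1000-token corpus, a
    # 100-token part, and a word occurring 20 times in the corpus.
    print(specificity(10, 1000, 20, 100))  # over-represented: positive
    print(specificity(0, 1000, 20, 100))   # under-represented: negative
```

Under these assumptions, a lexical profile for one kind of use could be the set of context words with the highest positive scores in the contexts annotated with that use. For very large corpora the tail probability can underflow to zero, so a production version would need to work with log-probabilities.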

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Language other than English: French
ISBN: 9782954778112
Last Modified: 13 Jul 2018 15:01
URI: http://orca.cf.ac.uk/id/eprint/113069
