An improved automatic term recognition method for Spanish

Alberto Barrón-Cedeño, Gerardo Sierra, Patrick Drouin, Sophia Ananiadou

    Research output: Chapter in Book/Report/Conference proceeding > Chapter

    Abstract

    The C-value/NC-value algorithm, a hybrid approach to automatic term recognition, was originally developed to extract multiword term candidates from specialised documents written in English. Here, we present three main modifications to this algorithm that affect how the obtained output is refined. The first modification aims to maximise the number of real terms in the list of candidates with a new approach to the stop-list application process. The second modification adapts the C-value calculation formula to consider single-word terms. The third modification changes how the term candidates are grouped, exploiting a lemmatised version of the input corpus. Additionally, the size of a candidate's context window is variable. We also show the linguistic modifications necessary to apply this algorithm to the recognition of term candidates in Spanish. © Springer-Verlag Berlin Heidelberg 2009.
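The abstract does not give the adapted formula, so the following is only a minimal sketch of the idea behind the second modification. It assumes the commonly published form of C-value, in which a candidate's frequency is weighted by log2 of its length in words and reduced by the mean frequency of the longer candidates that contain it. Under that form a single-word candidate always scores zero, because log2(1) = 0. The `single_word_offset` parameter, the helper `is_nested`, and the toy frequencies are hypothetical, introduced here for illustration; they are not taken from the chapter.

```python
import math
from collections import defaultdict


def is_nested(short, long):
    """Return True if `short` occurs as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def c_value(freqs, single_word_offset=1):
    """Score candidates with a C-value-style measure.

    freqs maps each candidate (a tuple of lemmas) to its corpus frequency.
    single_word_offset is an ASSUMED adjustment added inside the logarithm
    so that one-word candidates do not score zero; 0 gives the standard form.
    """
    # For each candidate, collect the longer candidates that contain it.
    containers = defaultdict(list)
    for a in freqs:
        for b in freqs:
            if len(b) > len(a) and is_nested(a, b):
                containers[a].append(b)

    scores = {}
    for a, freq in freqs.items():
        weight = math.log2(len(a) + single_word_offset)
        longer = containers[a]
        if longer:
            # Discount occurrences that are explained by longer candidates.
            freq = freq - sum(freqs[b] for b in longer) / len(longer)
        scores[a] = weight * freq
    return scores


# Toy, invented frequencies over lemmatised Spanish candidates.
freqs = {
    ("sistema",): 10,
    ("sistema", "operativo"): 6,
    ("sistema", "operativo", "libre"): 2,
}
scores = c_value(freqs)
standard = c_value(freqs, single_word_offset=0)
```

With `single_word_offset=0` the single-word candidate `("sistema",)` receives a score of zero, which is the limitation the second modification addresses; any positive offset lets it compete with multiword candidates. Keying the frequencies on lemma tuples loosely mirrors the third modification, since inflected variants of a candidate would share one entry.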
    Original language: English
    Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); abbreviated Lect. Notes Comput. Sci.
    Place of Publication: Proceedings of the 10th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2009)
    Publisher: Springer Nature
    Pages: 125-136
    Number of pages: 11
    Volume: 5449
    Publication status: Published - 2009
    Event: 10th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2009 - Mexico City
    Duration: 1 Jul 2009 → …

