An analogy-based method for semantic clustering

Gerardo Sierra, John McNaught

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    Abstract

    An analogy-based clustering method is proposed, based on the alignment of
    definitions from two different sources. The method relies on the assumption that
    two authors use different words to express the same definition. The algorithm
    starts by calculating the Levenshtein distance, a form of edit distance, which
    allows us to align the definitions. As a measure of similarity, the concept of
    longest collocation couple is introduced, which forms the basis for clustering
    similar words. The process iterates, replacing similar pairs of words in the
    definitions until no new clusters are found.
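
    The alignment step can be illustrated with a minimal sketch. It assumes
    word-level tokens, unit edit costs, and whitespace tokenisation, none of which
    the abstract specifies. The function name align_words and the sample
    definitions are hypothetical. This is not the authors' implementation, and it
    does not cover the longest collocation couple measure or the iterative
    replacement step; it only shows how a Levenshtein alignment of two definitions
    can surface candidate pairs of differing words.

```python
from typing import List, Tuple


def align_words(a: List[str], b: List[str]) -> Tuple[int, List[Tuple[str, str]]]:
    """Word-level Levenshtein distance and the word pairs aligned by substitution.

    Assumption: every insertion, deletion and substitution costs 1.
    """
    n, m = len(a), len(b)
    # dist[i][j] is the edit distance between a[:i] and b[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a[i-1]
                dist[i][j - 1] + 1,         # insert b[j-1]
                dist[i - 1][j - 1] + cost,  # match or substitute
            )

    # Walk back through the table, collecting substituted word pairs.
    pairs: List[Tuple[str, str]] = []
    i, j = n, m
    while i > 0 and j > 0:
        cost = 0 if a[i - 1] == b[j - 1] else 1
        if dist[i][j] == dist[i - 1][j - 1] + cost:
            if cost:
                pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dist[n][m], pairs


# Two hypothetical definitions of the same term from different sources.
def1 = "a device for measuring temperature".split()
def2 = "an instrument for measuring temperature".split()
distance, candidates = align_words(def1, def2)
print(distance)    # 2
print(candidates)  # [('a', 'an'), ('device', 'instrument')]
```

    In this sketch the substituted pairs are only candidates; deciding which of
    them belong in the same cluster is the role of the similarity measure
    described in the abstract.
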
    Original language: English
    Title of host publication: Proceedings of the International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2000)
    Editors: Alexander Gelbukh
    Publisher: Instituto Politécnico Nacional
    Number of pages: 14
    Publication status: Published - 2000

    Keywords

    • clustering
    • alignment of definitions
    • computational linguistics
    • natural language processing
