User needs for textual corpora in natural language processing

    Research output: Contribution to journal, Article, peer-review

    Abstract

    We discuss the needs of natural language processing (NLP)
    researchers in relation to corpora. Reasons for the growing
    interest in corpora by NLP researchers are given. Their
    needs are quite different to those of theoretical linguists, as
    end-users of NLP systems require robust systems for 'real
    language'. Monolithic general language descriptions are
    contrasted with sublanguage descriptions and found to be
    wanting. Ideal needs of NLP are contrasted with realistic
    needs. Ideal needs cannot be satisfied without first having
    solved problems whose solution requires accurately tagged
    and analysed corpora. Currently, partial skeletal analysis of
    corpora can yield useful patterns and structures. Various
    computational linguistic tools and probabilistic or
    statistically based tools are required to allow further
    exploration of corpora, especially sublanguage corpora.
    Original language: English
    Pages (from-to): 227-234
    Journal: Literary and Linguistic Computing
    Volume: 8
    Issue number: 4
    Publication status: Published, 1994
