The Document Components Ontology (DoCO)

Alexandru Constantin, Silvio Peroni, Steve Pettifer, David Shotton, Fabio Vitali

    Research output: Contribution to journal (article, peer-reviewed)

    Abstract

    The availability in machine-readable form of descriptions of the structure of documents, as well as of the document discourse (e.g. the scientific discourse within scholarly articles), is crucial for facilitating semantic publishing and the overall comprehension of documents by both users and machines. In this paper we introduce DoCO, the Document Components Ontology, an OWL 2 DL ontology that provides a general-purpose structured vocabulary of document elements to describe both structural and rhetorical document components in RDF. In addition to giving the formal description of the ontology, this paper showcases its practical utility in a variety of our own applications and in other activities of the Semantic Publishing community that rely on DoCO to annotate and retrieve document components of scholarly articles.
    Original language: English
    Journal: Semantic Web
    Publication status: Published - 2015

    Keywords

    • DEO, DoCO, PDFX, SPAR ontologies, Utopia Documents, document components, rhetoric, structural patterns.
