Disentangling the Structure of Tables in Scientific Literature

Nikola Milosevic, Goran Nenadic, Cassie Gregson, Robert Hernandez

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    Abstract

    Within the scientific literature, tables are commonly used to
    present factual and statistical information in a compact way that is easy
    for readers to digest. The ability to “understand” the structure of tables is
    key for information extraction in many domains. However, the complexity
    and variety of presentation layouts and value formats make it difficult to
    automatically extract the roles and relationships of table cells. In this paper,
    we present a model that structures tables in a machine-readable way and
    a methodology to automatically disentangle and transform tables into the
    modelled data structure. The method was tested in the domain of clinical
    trials: it achieved an F-score of 94.26% for cell function identification and
    94.84% for identification of inter-cell relationships.
    Original language: English
    Title of host publication: Natural Language Processing and Information Systems
    Subtitle of host publication: 21st International Conference on Applications of Natural Language to Information Systems, NLDB 2016, Salford, UK, June 22-24, 2016, Proceedings
    Place of Publication: Switzerland
    Publisher: Springer Nature
    Pages: 162-174
    Number of pages: 13
    Volume: 9612
    ISBN (Electronic): 978-3-319-41754-7
    ISBN (Print): 978-3-319-41753-0
    Publication status: Published - 17 Jun 2016
    Event: 21st International Conference on Applications of Natural Language to Information Systems - Media City, Salford, United Kingdom
    Duration: 22 Jun 2016 to 24 Jun 2016
    Conference number: 21
    http://www.salford.ac.uk/conferencing-at-salford/conference-management/current-conference/nldb-conference

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer
    Volume: 9612

    Conference

    Conference: 21st International Conference on Applications of Natural Language to Information Systems
    Abbreviated title: NLDB 2016
    Country/Territory: United Kingdom
    City: Salford
    Period: 22/06/16 to 24/06/16

    Keywords

    • Table mining
    • Text mining
    • Data management
    • Data modelling
    • Natural language processing
