Abstract
PubMed contains nearly 800,000 clinical trial citations, which report detailed trial planning, execution and results, including descriptions of study arms, demographic data, inclusion/exclusion criteria, the protocols followed and specific outcomes. So far, medical text mining has mostly focused on extracting information from the body text, with some success. Processing of tables is often limited to their textual captions, whereas the data presented in the tables themselves are typically ignored in large-scale automated processing. Here we report on a methodology developed to support semi-automated data curation and integration from clinical trial reports that relies on processing both the main text and tables. In a case study on extracting the body mass index and/or weight of patients involved in clinical trials, we achieved an F-measure of 85% for body mass index extraction.
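
As a rough illustration of the kind of extraction and evaluation described in the abstract, the sketch below pulls candidate body mass index values out of free text and scores them against a gold standard with the F-measure (the harmonic mean of precision and recall). It is not the authors' method: the pattern `BMI_PATTERN` and the helpers `extract_bmi` and `f_measure` are hypothetical names introduced here, the regular expression is deliberately simplistic, and table processing is not covered.

```python
import re

# Hypothetical pattern (an assumption, not the authors' rule set): a BMI
# keyword followed within 20 non-digit characters by a one- or two-digit
# number with optional decimals, e.g. "BMI was 27.4".
BMI_PATTERN = re.compile(
    r"(?:BMI|body mass index)\D{0,20}?(\d{1,2}(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_bmi(text):
    """Return all candidate BMI values found in a piece of text."""
    return [float(value) for value in BMI_PATTERN.findall(text)]


def f_measure(predicted, gold):
    """Balanced F-measure of predicted values against gold-standard values."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, `extract_bmi("Mean BMI was 27.4 kg/m2 in the control arm.")` yields one candidate value, which can then be compared with manually curated values using `f_measure`. A realistic system would also need to handle units, ranges and values stored in table cells.
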
Original language | English |
---|---|
Publication status | Published - Jun 2015 |
Event | Postgraduate Summer Research Showcase 2015, Whitworth Hall, The University of Manchester. Duration: 5 Jun 2015 → 5 Jun 2015 |
Conference
Conference | Postgraduate Summer Research Showcase 2015 |
---|---|
Abbreviated title | PSRS 2015 |
City | The University of Manchester |
Period | 5/06/15 → 5/06/15 |