Narrative
Evidence-based public health (EBPH) reviews are central to public health policy, practice, and guidance. EBPH reviews require dynamic and multidimensional views of relevant information from the literature, without relying on a priori research questions. The large and growing number of published studies therefore makes the task of identifying relevant studies without bias both complex and time-consuming. Because crucial information can be difficult to locate and understand given the complex nature of EBPH problems, the multiple causes and interrelations between interventions, diseases, populations, and outcomes can remain hidden. The global economic impact of preventable ill health (WHO) will continue to increase at an alarming rate, and improved awareness of diseases at different levels (societal, financial, clinical, psychological, etc.) is much needed. Thus, methods that provide cost-effective approaches to understanding interconnections between topics and better coverage of EBPH contribute towards mitigating the cost of public health.
To address these limitations, the Supporting Evidence-based Public Health Interventions using Text Mining project, led by the National Centre for Text Mining (NaCTeM) at the University of Manchester, combined text mining and machine learning to produce novel search methods and screening tools for public health reviews. Text mining methods can automatically discover knowledge from unstructured data, and machine learning can support the prioritisation and ranking of the extracted information into meaningful topics. The combination of the two can minimise the impact of publication bias in reviews and extract more accurate and pertinent information from the literature, thus meeting policy and practice timescales. The increased cost efficiency contributes towards transforming EBPH and influencing the development of guidelines at a national and international level via NICE.
The project collaborated with Machine Learning and Data Analytics (MaLDA) at the University of Liverpool and the National Institute for Health and Care Excellence (NICE), and was funded by the Medical Research Council (MRC) and the Biotechnology and Biological Sciences Research Council (BBSRC). Follow-up funding has been awarded by the Alan Turing Institute (the UK’s national institute for data science and artificial intelligence), the National Institute for Health Research (NIHR), the Engineering and Physical Sciences Research Council (EPSRC), and JISC.
One specific body of the work has been fundamental research into, and subsequent development of, a tool that can minimise the human workload involved in the study identification phase of systematic reviews. RobotAnalyst, the culmination of this work, is a web-based software tool that contains several research innovations, including document prioritisation, topic detection, and descriptive document clustering, to improve the prioritisation accuracy of the screening process.
RobotAnalyst combines text mining and machine learning algorithms to organise articles by their content, and it is equipped with active learning prioritisation, which other text-mining tools for systematic reviews do not offer. It substantially decreases human workload, by between 40% and 85%, and reduces the risk of bias, whilst increasing the consistency of findings. As of July 2020, RobotAnalyst is used by over 200 teams across 25 countries, of which at least 80 are non-academic, mainly within sectors working in evidence-based medicine. It has to date helped NICE, the Observatory Evidence Service (OES) at Public Health Wales, and many other hospitals, national public health organisations, and policymakers to undertake systematic reviews, make better evidence-based decisions, cut costs, and improve the efficiency and robustness of key policy decisions. Using the GBP13,000 per review metric, this is currently benefitting clinical (non-academic) guideline activity valued in the region of GBP1,040,000, a figure equivalent to one review for each of the 80 non-academic teams. Moreover, given the national and international importance of EBPH reviewing, the project has developed a 'multistrand pathways to impact' document to engage with a variety of key EBPH stakeholders both in the UK and internationally.
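As an illustration of what active learning prioritisation means in a screening workflow, the following is a minimal sketch and not RobotAnalyst's actual implementation. It assumes a simple word-count naive Bayes scorer in place of the project's models, and every name in it (`train`, `relevance_score`, `screen`, `ask_reviewer`) and all of the example abstracts are hypothetical. After each reviewer decision the scorer is retrained and the unscreened reference it currently ranks as most likely relevant is shown next.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenisation (a deliberate simplification)."""
    return text.lower().split()


def train(labelled):
    """Count word frequencies per class from (text, is_relevant) pairs."""
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in labelled:
        counts[is_relevant].update(tokenize(text))
    return counts


def relevance_score(counts, text):
    """Naive Bayes log-odds that a text is relevant, with add-one smoothing."""
    vocab = set(counts[True]) | set(counts[False])
    total_pos = sum(counts[True].values()) + len(vocab)
    total_neg = sum(counts[False].values()) + len(vocab)
    score = 0.0
    for word in tokenize(text):
        if word in vocab:  # words never seen in training carry no evidence
            p_pos = (counts[True][word] + 1) / total_pos
            p_neg = (counts[False][word] + 1) / total_neg
            score += math.log(p_pos / p_neg)
    return score


def screen(seed, unlabelled, ask_reviewer, budget):
    """Active learning loop: retrain, show the top-ranked reference, record the decision."""
    labelled = list(seed)
    pool = list(unlabelled)
    order = []
    for _ in range(min(budget, len(pool))):
        counts = train(labelled)
        best = max(pool, key=lambda t: relevance_score(counts, t))
        pool.remove(best)
        labelled.append((best, ask_reviewer(best)))
        order.append(best)
    return order


# Hypothetical toy data: two seed decisions and four unscreened abstracts.
seed = [("smoking cessation intervention trial", True),
        ("protein folding simulation", False)]
pool = ["community smoking intervention outcomes",
        "molecular protein dynamics",
        "school based cessation trial",
        "folding kinetics simulation study"]


def truth(text):
    """Stand-in for the human reviewer's include/exclude decision."""
    return "smoking" in text or "cessation" in text


order = screen(seed, pool, truth, budget=4)
print(order)
```

In this toy run the two relevant abstracts are presented before the irrelevant ones, which is the mechanism by which prioritisation lets reviewers stop screening earlier. Real systems would use richer document representations and a principled stopping rule.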
This research also contributes to improving the UK’s competitive position in a digital market through better language technology products and services. The project brings together a mixture of unsupervised techniques that are reusable and retargetable, supporting and enabling language technology-based access to information (via semantic search). These advanced search and screening techniques will therefore be applicable in almost any other domain, such as energy, security, national libraries, and institutional repositories.
Impact date | 2016 → 2020 |
---|---|
Category of impact | Health and wellbeing, Economic, Technological |
Impact level | Adoption |
Research Beacons, Institutes and Platforms
- Biotechnology
- Digital Futures
- Institute for Data Science and AI
- Manchester Institute of Biotechnology
Related content
Research output
- Data Visualization with Structural Control of Global Cohort and Local Data Neighborhoods
  Research output: Contribution to journal › Article › peer-review
- Reducing systematic review workload through certainty-based screening
  Research output: Contribution to journal › Article › peer-review
- Topic Detection Using Paragraph Vectors to Support Active Learning in Systematic Reviews
  Research output: Contribution to journal › Article › peer-review
- Mapping Phenotypic Information in Heterogeneous Textual Sources to a Domain-Specific Terminological Resource
  Research output: Contribution to journal › Article › peer-review
- Distributed Document and Phrase Co-embeddings for Descriptive Clustering
  Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review
- Self-Tuned Descriptive Document Clustering using a Predictive Network
  Research output: Contribution to journal › Article › peer-review
- Prioritising references for systematic reviews with RobotAnalyst: a user study
  Research output: Contribution to journal › Article › peer-review
- Adaptable, High Recall, Event Extraction System with Minimal Configuration
  Research output: Contribution to journal › Article › peer-review
- Using text mining techniques to extract phenotypic information from the PhenoCHF corpus
  Research output: Contribution to journal › Article › peer-review
- A semi-supervised approach using label propagation to support citation screening
  Research output: Contribution to journal › Article › peer-review
- Comparable Study of Event Extraction in Newswire and Biomedical Domains
  Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review
- Event-based text mining for biology and functional genomics
  Research output: Contribution to journal › Article › peer-review
- Bilingual term alignment from comparable corpora in English discharge summary and Chinese discharge summary
  Research output: Contribution to journal › Article › peer-review
- Descriptive Document Clustering via Discriminant Learning in a Co-Embedded Space of Multilevel Similarities
  Research output: Contribution to journal › Article › peer-review
- Using text mining for study identification in systematic reviews: a systematic review of current approaches
  Research output: Contribution to journal › Article › peer-review
- Anatomical entity recognition with a hierarchical framework augmented by external resource
  Research output: Contribution to journal › Article › peer-review
- Supporting Systematic Reviews Using LDA-based Document Representations
  Research output: Contribution to journal › Article › peer-review
Prizes
- Next Big Thing
  Prize: Prize (including medals and awards)