The articulation of mathematical arguments is a fundamental part of scientific reasoning and communication. Across many disciplines, expressing relations and interdependencies between quantities (usually in equational form) is at the centre of scientific argumentation, and examples of mathematical discourse are easy to find across scientific contributions and textbooks. Nevertheless, despite its importance, the application of contemporary NLP models to inference over mathematical text remains underexplored, especially when compared with other advances in natural language processing and domain-specific text mining (e.g. biomedical text). In this work, we contribute to the area of Mathematical Language Processing, which addresses problems at the intersection of Natural Language and Mathematics. While several aspects of mathematical discourse are still unexplored in this field, we have opted to focus on three main dimensions: (i) defining an evaluation framework for mathematical natural language inference; (ii) learning sentence-level representations of mathematical statements; (iii) leveraging argumentation-level premise-claim discourse relations between mathematical statements. The discovery of supporting evidence for addressing complex mathematical problems is a semantically challenging task, which is still unexplored in the field of natural language processing for mathematical text. We propose the Natural Language Premise Selection task, together with a new dataset; the task consists of using conjectures written in both natural language and mathematical formulae to recommend the premises most likely to be helpful in proving a particular statement. Another fundamental requirement for mathematical language understanding is the creation of models able to represent variables meaningfully.
We propose different deep learning based techniques to address these issues, identifying the challenges associated with such tasks and paving the way for future work in this field.
Date of Award  1 Aug 2022 

Original language  English 

Awarding Institution   The University of Manchester


Supervisor  Uli Sattler (Supervisor) & Andre Freitas (Supervisor) 

 natural language processing
 information retrieval
 mathematical language
MATHEMATICAL LANGUAGE PROCESSING: DEEP LEARNING REPRESENTATIONS AND INFERENCE OVER MATHEMATICAL TEXT
Mendes Ferreira, D. (Author). 1 Aug 2022
Student thesis: PhD