MATHEMATICAL LANGUAGE PROCESSING: DEEP LEARNING REPRESENTATIONS AND INFERENCE OVER MATHEMATICAL TEXT

  • Deborah Mendes Ferreira

Student thesis: PhD

Abstract

The articulation of mathematical arguments is a fundamental part of scientific reasoning and communication. Across many disciplines, expressing relations and interdependencies between quantities (usually in equational form) is at the centre of scientific argumentation, and examples of mathematical discourse are easy to find across scientific contributions and textbooks. Nevertheless, despite its importance, the application of contemporary NLP models to inference over mathematical text remains under-explored, especially when compared with other advances in natural language processing and domain-specific text mining (e.g. biomedical text). In this work, we contribute to the area of Mathematical Language Processing, which addresses problems at the intersection of Natural Language and Mathematics. While several aspects of mathematical discourse are still unexplored in this field, we have opted to focus on three main dimensions: (i) defining an evaluation framework for mathematical natural language inference; (ii) learning sentence-level representations of mathematical statements; (iii) leveraging argumentation-level premise-claim discourse relations between mathematical statements. The discovery of supporting evidence for addressing complex mathematical problems is a semantically challenging task that is still unexplored in the field of natural language processing for mathematical text. We propose the Natural Language Premise Selection task, together with a new dataset; the task consists of using conjectures written in both natural language and mathematical formulae to recommend the premises most likely to be helpful in proving a particular statement. Another fundamental requirement for mathematical language understanding is the creation of models able to represent variables meaningfully.
We propose different deep-learning-based techniques to address these issues, identify the challenges associated with such tasks, and pave the way for future work in this field.
Date of Award: 1 Aug 2022
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Uli Sattler (Supervisor) & Andre Freitas (Supervisor)

Keywords

  • natural language processing
  • information retrieval
  • mathematical language
