Are Machine Reading Comprehension Systems Robust to Context Paraphrasing?

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

Investigating the behaviour of Machine Reading Comprehension (MRC) models under various types of test-time perturbations can shed light on the enhancement of their robustness and generalisation capability, despite the superhuman performance they have achieved on existing benchmark datasets. In this paper, we study the robustness of contemporary MRC systems to context paraphrasing, i.e., whether these models are still able to correctly answer the questions once the reading passages have been paraphrased. To this end, we systematically design a pipeline to semi-automatically generate perturbed MRC instances which ultimately lead to the creation of a paraphrased test set. We conduct experiments on this data set with six state-of-the-art neural MRC models and we find that even the minimum performance drop of all these models exceeds 41%, whereas human performance remains high. Retraining models with augmented perturbed examples results in improved robustness, though the performance remains lower than on the original dataset. These results demonstrate that the existing high-performing MRC systems are still far away from real language understanding.
Original language: English
Title of host publication: Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Subtitle of host publication: Volume 2: Short Papers
Publisher: Association for Computational Linguistics
Pages: 184-196
Publication status: Published - 4 Nov 2023
Event: Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 2: Short Papers) - Nusa Dua, Bali
Duration: 1 Nov 2023 → 1 Nov 2023

Conference

Conference: Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)
Period: 1/11/23 → 1/11/23
