Multiword units lead to errors of commission in children’s spontaneous production: “What corpus data can tell us?*”

Stewart M McCauley, Colin Bannard, Anna Theakston, Michelle Davis, Thea Cameron-Faulkner, Ben Ambridge

Research output: Contribution to journal, Article, peer-review


Psycholinguistic research over the past decade has suggested that children’s linguistic knowledge includes dedicated representations for frequently-encountered multiword sequences. Important evidence for this comes from studies of children’s production: it has been repeatedly demonstrated that children’s rate of speech errors is greater for word sequences that are infrequent and thus unfamiliar to them than for those that are frequent. In this study, we investigate whether children’s knowledge of multiword sequences can explain a phenomenon that has long represented a key theoretical fault line in the study of language development: errors of subject-auxiliary non-inversion in question production (e.g., “why we can’t go outside?*”). In doing so we consider a type of error that has been ignored in discussion of multiword sequences to date. Previous work has focused on errors of omission – an absence of accurate productions for infrequent phrases. However, if children make use of dedicated representations for frequent sequences of words in their productions, we might also expect to see errors of commission – the appearance of frequent phrases in children’s speech even when such phrases are not appropriate. Through a series of corpus analyses, we provide the first evidence that the global input frequency of multiword sequences (e.g., “she is going” as it appears in declarative utterances) is a valuable predictor of their errorful appearance (e.g., the uninverted question “what she is going to do?*”) in naturalistic speech. This finding, we argue, constitutes powerful evidence that multiword sequences can be represented as linguistic units in their own right.
Original language: English
Article number: e13125
Journal: Developmental Science
Issue number: 6
Early online date: 1 Jun 2021
Publication status: Published - 1 Nov 2021


  • Child; Female; Humans; Language; Language Development; Linguistics; Psycholinguistics; Speech; chunking; corpus analysis; language acquisition; questions

