Abstract
In this paper, I introduce methodologies to tap corpora for exploring aggregate linguistic distances between dialects or varieties as a function of properties of geographic space. The paper describes the different steps necessary to obtain an appropriate corpus-based dataset (a so-called 'distance matrix'), and subsequently discusses several cartographic visualisation techniques - network maps, continuum maps and cluster maps - to project aggregate linguistic relationships to geography. In addition, the paper sketches some statistical methods to quantify these relationships. By way of example, a case study draws on the Freiburg Corpus of English Dialects - a major dialect corpus in which more than thirty traditional dialects of English from all over Great Britain are sampled. With a focus on regional variation in morphosyntax and on the basis of text frequencies of several dozen features, the study probes joint linguistic variability between the dialects sampled in the corpus. © Edinburgh University Press.
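As a rough illustration of what a corpus-based 'distance matrix' might look like, the sketch below computes pairwise distances between dialects from per-feature text frequencies. The abstract does not state which distance measure is used, so Euclidean distance is an assumption here. The dialect names and frequency values are invented placeholders, not data from the Freiburg Corpus of English Dialects, and `euclidean` and `distance_matrix` are hypothetical helper names.

```python
import math

# Hypothetical normalised text frequencies of three morphosyntactic
# features in three made-up dialects (placeholder values, not corpus data).
frequencies = {
    "dialect_a": [12.0, 3.0, 0.5],
    "dialect_b": [10.0, 4.0, 0.5],
    "dialect_c": [2.0, 9.0, 6.0],
}


def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(freqs):
    """Return sorted dialect names and a symmetric matrix of pairwise distances."""
    names = sorted(freqs)
    matrix = [[euclidean(freqs[r], freqs[c]) for c in names] for r in names]
    return names, matrix


names, matrix = distance_matrix(frequencies)
for name, row in zip(names, matrix):
    # One row per dialect: its aggregate distance to every dialect.
    print(name, [round(d, 2) for d in row])
```

A matrix of this shape (dialects by dialects, zero diagonal, symmetric) is the kind of input that clustering or multidimensional scaling routines typically expect before results are projected onto a map.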
Original language | English |
---|---|
Pages (from-to) | 45-76 |
Number of pages | 31 |
Journal | Corpora |
Volume | 6 |
Issue number | 1 |
Publication status | Published - May 2011 |