Abstract
“Higher-order” linguistic features such as lexis, grammar and morphology are frequently analysed in forensic speaker comparison (FSC) cases (Gold and French, 2011), but assessment of these features arguably lacks a formalised comparison framework. Furthermore, the question of how the assessment of “higher-order” features could be combined with other phonetic features in FSC assessments is yet to be comprehensively addressed.
Using transcribed speech data from the West Yorkshire Regional English Database (Gold et al., 2018), we applied two well-known authorship analysis methods within the likelihood ratio framework, Cosine Delta (Ishihara, 2021) and Phi n-gram tracing (Nini, 2023), to assess the speaker-discriminatory value of frequent words (Sergidou et al., 2023) and word-level n-grams. We also applied both methods to transcripts in which vocalised hesitation markers had been coded with phonetic information, to assess whether “higher-order” linguistic features can be combined with segmental phonetic analysis to achieve greater speaker-discriminatory power.
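As a rough illustration of the Cosine Delta idea mentioned above, the sketch below z-scores the relative frequencies of a fixed word list against a background set of transcripts and then takes the cosine distance between two samples. It is a minimal sketch under stated assumptions, not the implementation used in the study: the function names, the toy transcripts, the word list, and the use of the population standard deviation are all illustrative choices, and the later step of converting such a score into a likelihood ratio is not shown. Phonetically coded hesitation markers would simply be additional distinct token strings in `vocab`.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary item in one transcript."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total for w in vocab]


def cosine_delta(sample_a, sample_b, background, vocab):
    """Cosine distance between z-scored frequency profiles (0 = same direction)."""
    # Per-feature mean and standard deviation come from the background transcripts.
    profiles = [relative_freqs(t, vocab) for t in background]
    mus = [mean(col) for col in zip(*profiles)]
    sds = [pstdev(col) for col in zip(*profiles)]

    def z_scores(tokens):
        rf = relative_freqs(tokens, vocab)
        # Features with no variation in the background contribute nothing.
        return [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(rf, mus, sds)]

    za, zb = z_scores(sample_a), z_scores(sample_b)
    dot = sum(x * y for x, y in zip(za, zb))
    norm_a = math.sqrt(sum(x * x for x in za))
    norm_b = math.sqrt(sum(y * y for y in zb))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # undefined angle; treated as uninformative in this sketch
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical toy data: frequent words plus two hesitation markers.
vocab = ["the", "and", "um", "uh"]
background = [
    "the cat and the dog um".split(),
    "uh the and and um um".split(),
    "the the the uh and".split(),
]
sample_a = "um the um and the".split()
sample_b = "uh the uh and and".split()

same = cosine_delta(sample_a, sample_a, background, vocab)
diff = cosine_delta(sample_a, sample_b, background, vocab)
```

Smaller distances indicate more similar usage profiles. In a score-based likelihood ratio setting, a distance like this would be compared against distributions of same-speaker and different-speaker scores.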
Findings support previous research showing that methods used to discriminate between authors can be usefully applied to transcribed speech data. We also find that these methods can be enhanced by embedding segmental phonetic information within transcripts. For Delta, hesitation markers perform as well as function words. For Phi n-gram tracing, an analysis restricted to n-grams containing only hesitation markers outperforms a classic word n-gram analysis.
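For readers unfamiliar with n-gram tracing, the following sketch shows one way a phi coefficient can be computed from the presence or absence of word n-gram types in a questioned and a known transcript. It is a simplified illustration, not the procedure used in the study: the function names, the toy transcripts, and the choice to define the set of possible n-grams as the union of both samples and a reference transcript are assumptions made for this example.

```python
import math


def word_ngrams(tokens, n):
    """Set of word n-gram types occurring in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def phi_coefficient(q_tokens, k_tokens, reference_tokens, n=2):
    """Phi correlation between n-gram presence in Q and presence in K."""
    q = word_ngrams(q_tokens, n)
    k = word_ngrams(k_tokens, n)
    # Assumption: the universe of n-gram types is Q, K and a reference sample.
    universe = q | k | word_ngrams(reference_tokens, n)

    a = len(q & k)                # present in both
    b = len(q - k)                # present in Q only
    c = len(k - q)                # present in K only
    d = len(universe) - a - b - c  # absent from both

    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # degenerate table; treated as no association in this sketch
    return (a * d - b * c) / denom


# Hypothetical toy transcripts with hesitation markers as ordinary tokens.
questioned = "um i think so".split()
known_same = "um i think so".split()
known_other = "er you know".split()
reference = "well you know it is fine".split()

phi_same = phi_coefficient(questioned, known_same, reference)
phi_other = phi_coefficient(questioned, known_other, reference)
```

Higher values mean the two transcripts share more n-gram types than the reference set would lead one to expect. Restricting `word_ngrams` to n-grams made up of hesitation markers would correspond to the restricted analysis described above.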
References
Gold, E. (2020). WYRED - West Yorkshire Regional English Database 2016-2019. [data collection]. UK Data Service. SN: 854354, DOI: 10.5255/UKDA-SN-854354
Ishihara, S. (2021). Score-based likelihood ratios for linguistic text evidence with a bag-of-words model. Forensic Science International, 327, 110980.
Nini, A. (2023). A Theory of Linguistic Individuality for Authorship Analysis. Elements in Forensic Linguistics. Cambridge University Press.
Sergidou, E. K., Scheijen, N., Leegwater, J., Cambier-Langeveld, T., & Bosma, W. (2023). Frequent-words analysis for forensic speaker comparison. Speech Communication, 150, 1-8.
Original language | English |
---|---|
Publication status | Published - 26 Jun 2024 |
Event | 5th European Conference of the IAFLL – International Association for Forensic and Legal Linguists, Aston University, Birmingham, United Kingdom. Duration: 24 Jun 2024 → 27 Jun 2024. Conference number: 5. https://www.aston.ac.uk/research/forensic-linguistics/iafll-regional-conference |
Fingerprint
Dive into the research topics of 'Evaluating the usefulness of embedding phonetic representations into an authorship analysis-based framework for the comparison of spoken data'. Together they form a unique fingerprint.
Impacts
- Forensic linguistic authorship analysis of disputed texts
  Nini, A. (Participant)
  Impact: Legal impacts, Societal impacts
Prizes
- IAFPA Research Grant
  Tompkinson, J. (Recipient) & Nini, A. (Recipient), 28 Feb 2024
  Prize: Prize (including medals and awards)