Word frequencies: A comparison of Pareto type distributions

Martin Wiegand, Saralees Nadarajah*, Yuancheng Si

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Mehri and Jamaati (2017) used Zipf's law to model word frequencies in Holy Bible translations for one hundred live languages. We compare the fit of Zipf's law to a number of Pareto type distributions. The latter distributions are shown to provide the best fit, as judged by a number of comparative plots and error measures. The fit of Zipf's law appears generally poor.
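The kind of comparison the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' procedure. The word counts are synthetic, the Pareto type II (Lomax) form is only one assumed example of a "Pareto type" distribution, and the parameter values (`s`, `alpha`, `sigma`) are fixed by hand rather than fitted. The two error measures follow the keywords listed for the article: squared error and a Kolmogorov-Smirnov type statistic.

```python
def zipf_pmf(n_ranks, s):
    """Zipf's law over ranks 1..n_ranks: p(r) proportional to r**(-s)."""
    weights = [r ** (-s) for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def lomax_pmf(n_ranks, alpha, sigma):
    """Assumed Pareto type II (Lomax) shape on ranks:
    p(r) proportional to (1 + r / sigma)**(-alpha)."""
    weights = [(1.0 + r / sigma) ** (-alpha) for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def squared_error(observed, model):
    """Sum of squared differences between observed and model probabilities."""
    return sum((o - m) ** 2 for o, m in zip(observed, model))


def ks_statistic(observed, model):
    """Largest absolute gap between the two cumulative distributions."""
    gap = cum_obs = cum_mod = 0.0
    for o, m in zip(observed, model):
        cum_obs += o
        cum_mod += m
        gap = max(gap, abs(cum_obs - cum_mod))
    return gap


# Synthetic word counts ordered by rank (made up for illustration only).
counts = [1200, 640, 450, 330, 270, 220, 190, 165, 150, 135]
total_count = sum(counts)
observed = [c / total_count for c in counts]

# Hand-picked parameters; a real comparison would estimate them from data.
zipf = zipf_pmf(len(counts), s=1.0)
lomax = lomax_pmf(len(counts), alpha=1.5, sigma=2.0)

for name, model in (("Zipf", zipf), ("Lomax", lomax)):
    print(name, squared_error(observed, model), ks_statistic(observed, model))
```

Smaller values of either measure indicate a closer fit. With fitted parameters and real corpus counts, the same two functions could be used to rank candidate distributions against each other.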

    Original language: English
    Pages (from-to): 621-632
    Number of pages: 11
    Journal: Physics Letters, Section A: General, Atomic and Solid State Physics
    Volume: 382
    Issue number: 9
    Early online date: 5 Jan 2018
    Publication status: Published - 9 Mar 2018

    Keywords

    • Kolmogorov-Smirnov test statistic
    • Squared error
    • Zipf's law
