iClassifier, a digital research tool for corpus-based classifier networks in complex writing systems

Haleli Harel, Orly Goldwasser, Dmitry Nikolaev

Research output: Contribution to journal › Article › peer-review

Abstract

This article presents the method applied by the iClassifier (©Goldwasser/Harel/Nikolaev) digital research tool for the study of the linguistic phenomenon of classifiers. The tool was created in 2019 with the objective of curating corpus-based, data-driven documentation of classifier systems. The record of classifiers comprises millions of tokens’ worth of “big data” analysis. By tagging classifiers in various corpora, a topography of categories emerges, visualized as complex, multilayered networks. This article offers an overview of how classifier-based networks are created and how network analysis methods can be applied to analyze knowledge organization. We present the data structure and annotation scheme of the iClassifier research tool, demonstrating how one can plot classifier networks and generate reports of lemma and classifier repertoires in each corpus. The iClassifier tool provides quantitative reports, including classifier frequency, variation, and co-occurrence statistics. Each data subset, such as a certain part of speech, timespan, geographical location, or textual genre, can be queried and visualized. The tool is meant to let users move between a macro-overview of all categories in a corpus and a micro-analysis of the individual categories and lemmas that make up that corpus. Each classifier is seen as a category head, and the categories are drawn in their multilayered and multidimensional relationships. The strength of this tool lies in documenting the phenomenon in large corpora of texts and expanding our knowledge of the rules and functions of classifier systems, leading to a more refined mind-mapping of ancient cultures. To date, very little systematic analysis has been done on this ancient record of emic information.
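
As a rough illustration of the kind of report the abstract describes, the sketch below counts classifier frequency, lemma-to-classifier links, and classifier co-occurrence from a handful of annotated tokens. It is not the iClassifier implementation or its data model: the token layout, the function names, and the placeholder labels (`lemma_a`, `CLF_1`, and so on) are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotated tokens: (lemma, classifiers written on that token).
# The labels are placeholders, not data from any real corpus.
tokens = [
    ("lemma_a", ("CLF_1",)),
    ("lemma_a", ("CLF_1", "CLF_2")),
    ("lemma_b", ("CLF_2",)),
    ("lemma_c", ("CLF_1", "CLF_2")),
    ("lemma_c", ()),  # a token written without any classifier
]


def classifier_frequency(tokens):
    """Count how many token occurrences carry each classifier."""
    freq = Counter()
    for _, classifiers in tokens:
        freq.update(classifiers)
    return freq


def lemma_classifier_edges(tokens):
    """Weighted lemma-classifier edges: one possible basis for a network plot."""
    edges = Counter()
    for lemma, classifiers in tokens:
        for clf in classifiers:
            edges[(lemma, clf)] += 1
    return edges


def classifier_cooccurrence(tokens):
    """Count unordered classifier pairs that appear on the same token."""
    pairs = Counter()
    for _, classifiers in tokens:
        for a, b in combinations(sorted(set(classifiers)), 2):
            pairs[(a, b)] += 1
    return pairs


freq = classifier_frequency(tokens)
edges = lemma_classifier_edges(tokens)
cooc = classifier_cooccurrence(tokens)
```

Under these assumptions, each classifier acts as a category head whose members are the lemmas linked to it in `edges`, and restricting `tokens` to one part of speech, period, or genre before calling the functions would give the corresponding subset report.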
Original language: English
Journal: Journal of Chinese Writing Systems
Publication status: Accepted/In press - 1 Jan 2024

Keywords

  • classifier studies
  • digital humanities
  • network analysis
  • lexical semantics
