Specificity: A graph-based estimator of divergence

Carole J. Twining, Christopher J. Taylor

    Research output: Contribution to journal › Article › peer-review

    Abstract

    In statistical modeling, there are various techniques used to build models from training data. Quantitative comparison of modeling techniques requires a method for evaluating the quality of the fit between the model probability density function (pdf) and the training data. One graph-based measure that has been used for this purpose is the specificity. We consider the large-numbers limit of the specificity, and derive expressions which show that it can be considered as an estimator of the divergence between the unknown pdf from which the training data was drawn and the model pdf built from the training data. Experiments using artificial data enable us to show that these limiting large-number relations enable us to obtain good quantitative and qualitative predictions of the behavior of the measured specificity, even for small numbers of training examples and in some extreme cases. We demonstrate that specificity can provide a more sensitive measure of difference between various modeling methods than some previous graph-based techniques. Key points are illustrated using real data sets. We thus establish a proper theoretical basis for the previously ad hoc concept of specificity, and obtain useful insights into the application of specificity in the analysis of real data. © 2011 IEEE.
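The abstract does not define specificity, so the following is only a minimal sketch under an assumption: that specificity is computed, as is common in the statistical shape modeling literature, by drawing samples from the model pdf and averaging the distance from each sample to its nearest training example (optionally raised to a power). The function names, the `gamma` and `inflate` parameters, and the axis-aligned Gaussian model are all hypothetical and are not taken from the paper.

```python
import math
import random
import statistics


def specificity(model_samples, training_data, gamma=1.0):
    """Average of (distance to the nearest training example) ** gamma,
    taken over samples drawn from the model. Assumed definition."""
    total = 0.0
    for sample in model_samples:
        nearest = min(math.dist(sample, t) for t in training_data)
        total += nearest ** gamma
    return total / len(model_samples)


def sample_axis_gaussian(training_data, n_samples, rng, inflate=1.0):
    """Draw samples from an axis-aligned Gaussian fitted to the training data.
    inflate > 1 deliberately widens the model to mimic a poor fit."""
    dims = list(zip(*training_data))
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) * inflate for d in dims]
    return [
        tuple(rng.gauss(m, s) for m, s in zip(means, stds))
        for _ in range(n_samples)
    ]


# Artificial data: training examples drawn from a 2D standard normal.
rng = random.Random(0)
training = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]

# One model fitted to the data, and one that is deliberately too wide.
good = sample_axis_gaussian(training, 1000, rng)
wide = sample_axis_gaussian(training, 1000, rng, inflate=3.0)

spec_good = specificity(good, training)
spec_wide = specificity(wide, training)
print(spec_good, spec_wide)
```

Under this assumed definition, the over-wide model places many samples far from any training example, so its measured specificity is larger. That behavior is consistent with reading specificity as a proxy for how far the model pdf diverges from the pdf that generated the data.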
    Original language: English
    Article number: 5765997
    Pages (from-to): 2492-2505
    Number of pages: 13
    Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
    Volume: 33
    Issue number: 12
    Publication status: Published - 2011

    Keywords

    • assessment of modeling
    • cross entropy
    • entropy estimation
    • estimation of divergence
    • estimation of statistical distance
    • generalization
    • graph-based estimators
    • Kullback-Leibler divergence
    • nearest-neighbor estimators
    • Specificity
