Disclosure Risk Measurement with Entropy in Two-Dimensional Sample Based Frequency Tables

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review



We extend a disclosure risk measure defined for population-based frequency tables to sample-based frequency tables. The disclosure risk measure is based on information-theoretic expressions, such as entropy and conditional entropy, that reflect the properties of attribute disclosure. To estimate the disclosure risk of a sample-based frequency table, we need to take into account the underlying population and therefore need both the population and sample frequencies. However, population frequencies might not be known, in which case they must be estimated from the sample. We consider two probabilistic models, a log-linear model and a so-called Polya urn model, to estimate the population frequencies. Numerical results suggest that the Polya urn model may be a feasible alternative to the log-linear model for estimating population frequencies and the disclosure risk measure.
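The abstract names entropy and conditional entropy as the building blocks of the risk measure but does not state the measure itself. As a rough illustration only, the following Python sketch computes those two quantities for a two-dimensional frequency table. The function names and the example table are hypothetical and not taken from the paper, and this is not the authors' disclosure risk measure.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Zero cells contribute nothing, following the convention 0 * log(0) = 0.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def conditional_entropy(table):
    """H(column | row) in bits for a two-way frequency table given as a list of rows."""
    grand_total = sum(sum(row) for row in table)
    if grand_total == 0:
        return 0.0
    # Weight each row's entropy by the share of units falling in that row.
    return sum((sum(row) / grand_total) * entropy(row) for row in table)


# Hypothetical 2x3 frequency table (rows: one variable, columns: another).
example_table = [
    [4, 0, 0],  # every unit in this row falls in a single column
    [2, 2, 0],  # units in this row are split evenly over two columns
]

row_totals = [sum(row) for row in example_table]
h_rows = entropy(row_totals)
h_cols_given_rows = conditional_entropy(example_table)

print(f"H(row) = {h_rows:.3f} bits")
print(f"H(column | row) = {h_cols_given_rows:.3f} bits")
```

In this made-up table the first row has zero entropy: knowing that a unit belongs to that row reveals its column category, which is the situation attribute disclosure is concerned with. For a sample-based table, the same functions could be applied to estimated population frequencies, however those estimates are obtained.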
Original language: English
Title of host publication: Proceedings of UNECE worksession on statistical confidentiality
Subtitle of host publication: Helsinki, 5-7 October 2015
Publication status: Published - Mar 2015
Event: UNECE worksession on Statistical Confidentiality - Tarragona
Duration: 1 Jan 1824 → …


Conference: UNECE worksession on Statistical Confidentiality
Period: 1/01/24 → …


