## Abstract

We extend a disclosure risk measure defined for population-based frequency tables to sample-based frequency tables. The measure is built from information-theoretic quantities, such as entropy and conditional entropy, that reflect the properties of attribute disclosure. Estimating the disclosure risk of a sample-based frequency table requires taking the underlying population into account, and therefore requires both the population and the sample frequencies. Population frequencies, however, may not be known, in which case they must be estimated from the sample. We consider two probabilistic models for estimating them: a log-linear model and a so-called Polya urn model. Numerical results suggest that the Polya urn model may be a feasible alternative to the log-linear model for estimating population frequencies and the disclosure risk measure.
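For readers unfamiliar with the information-theoretic quantities named in the abstract, the following is a minimal sketch of how entropy and conditional entropy can be computed from the cell counts of a frequency table. It is not the paper's disclosure risk measure, whose exact form is not given here. The function names (`entropy`, `normalised_entropy`, `conditional_entropy`) and the choice of natural logarithms are assumptions made for illustration only.

```python
import math


def entropy(freqs):
    """Shannon entropy (in nats) of the distribution implied by cell counts."""
    total = sum(freqs)
    if total == 0:
        return 0.0
    # Empty cells contribute nothing, since p * log(p) tends to 0 as p tends to 0.
    return -sum((f / total) * math.log(f / total) for f in freqs if f > 0)


def normalised_entropy(freqs):
    """Entropy divided by log(K), where K is the number of cells.

    Illustrative helper (an assumption, not taken from the paper):
    1.0 for a uniform table, 0.0 when all counts sit in one cell.
    """
    k = len(freqs)
    if k < 2:
        return 0.0
    return entropy(freqs) / math.log(k)


def conditional_entropy(joint):
    """H(X | Y) for a two-way table joint[x][y], computed as H(X, Y) - H(Y)."""
    flat = [count for row in joint for count in row]
    column_totals = [sum(column) for column in zip(*joint)]
    return entropy(flat) - entropy(column_totals)


if __name__ == "__main__":
    # A table concentrated in one cell has low entropy; a flat one has high entropy.
    print(normalised_entropy([10, 0, 0]))
    print(normalised_entropy([5, 5, 5, 5]))
    # Knowing Y determines X completely in a diagonal table.
    print(conditional_entropy([[2, 0], [0, 2]]))
```

Intuitively, a table whose counts concentrate in a few cells has low entropy, which is the situation in which attribute disclosure is more likely; how these quantities are combined into a single risk measure is left to the paper itself.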

Original language | English |
---|---|
Title of host publication | Proceedings of UNECE worksession on statistical confidentiality |
Subtitle of host publication | Helsinki, 5-7 October 2015 |
Pages | 1-10 |
Publication status | Published - Mar 2015 |
Event | UNECE worksession on Statistical Confidentiality - Tarragona Duration: 1 Jan 1824 → … |

### Conference

Conference | UNECE worksession on Statistical Confidentiality |
---|---|
City | Tarragona |
Period | 1/01/24 → … |