Differential correct attribution probability for synthetic data: An exploration

Jennifer Taub, Mark Elliot, Maria Pampaka, Duncan Smith

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Synthetic data generation has been proposed as a flexible alternative to more traditional statistical disclosure control (SDC) methods for limiting disclosure risk. It is functionally distinct from standard SDC methods in that it breaks the link between the data subjects and the data, such that reidentification is no longer meaningful. Orthodox measures of disclosure risk, which are based on reidentification, are therefore not applicable. Research into developing disclosure assessment measures specifically for synthetic data has been relatively limited. In this paper, we develop a method called Differential Correct Attribution Probability (DCAP). Using DCAP, we explore the effect of multiple imputation on the disclosure risk of synthetic data.

Original language: English
Title of host publication: Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2018, Proceedings
Editors: Francisco Montes, Josep Domingo-Ferrer
Publisher: Springer Nature
Pages: 122-137
Number of pages: 16
ISBN (Print): 9783319997704
Publication status: Published - 2018
Event: International Conference on Privacy in Statistical Databases, PSD 2018 - Valencia, Spain
Duration: 26 Sept 2018 to 28 Sept 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11126 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Privacy in Statistical Databases, PSD 2018
Country/Territory: Spain
City: Valencia
Period: 26/09/18 to 28/09/18

Keywords

  • CART
  • Disclosure risk
  • Synthetic data

Research Beacons, Institutes and Platforms

  • Cathie Marsh Institute
