Comparison of remote analysis with statistical disclosure control for protecting the confidentiality of business data

Christine M. O'Keefe, Natalie Shlomo

Research output: Contribution to journal › Article › peer-review


This paper is concerned with the challenge of allowing statistical analysis of confidential business data while maintaining confidentiality. The most widely used approach to date is statistical disclosure control, which involves modifying or confidentialising data before releasing it to users. Newer proposed approaches include the release of multiply imputed synthetic data in place of the original data, and the use of a remote analysis system enabling users to submit statistical queries and receive output without direct access to data. Most implementations of statistical disclosure control methods to date involve census or survey microdata on individual persons, because existing methods are generally acknowledged to provide inadequate confidentiality protection to business (or enterprise) data. In this paper we seek to compare the statistical disclosure control approach with the remote analysis approach, in the context of protecting the confidentiality of business data in statistical analysis. We provide an example which enables a side-by-side comparison of the outputs of exploratory data analysis and linear regression analysis conducted on a sample business dataset under these two approaches, and provide traditional unconfidentialised results as a standard for comparison. There are certainly advantages and disadvantages in the remote analysis approach, and it is unlikely that remote analysis will replace statistical disclosure control methods in all applications. If the disadvantages are judged too serious in a given situation, the analyst may have to seek access to the unconfidentialised dataset. However, our example supports the conclusion that the advantages may outweigh the disadvantages in some cases, including for some analyses of unconfidentialised business data, provided the analyst is aware of the output confidentialisation methods and their potential impact.
Original language: English
Pages (from-to): 403-432
Number of pages: 29
Journal: Transactions on Data Privacy
Issue number: 2
Publication status: Published - Aug 2012


  • Attribute disclosure
  • Confidentialised output
  • Data utility
  • Noise addition
  • Output checking

