Abstract
Linking administrative data to produce more informative data for subsequent analysis has become an increasingly common practice.
However, linking can carry concomitant risks of disclosing sensitive information about individuals. One practice that reduces these risks
is data synthesis. In data synthesis, the data are used to fit a model from which synthetic data are then generated. The synthetic data
are then released to end users. In some scenarios, an end user might have the option of either using linked data or accepting
synthesized data. However, both linkage and synthesis are susceptible to errors that could limit their usefulness. Here, we investigate the
problem of comparing the quality of linked data to that of synthesized data and demonstrate through simulations how the problem might be
approached. These comparisons are important when considering how an end user can be supplied with the highest-quality data, and
in situations where one must consider risk-utility trade-offs.
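The kind of comparison the abstract describes can be illustrated with a toy simulation. The sketch below is not the authors' simulation design; it is a minimal Python example under stated assumptions: the data follow a simple linear relationship, linkage error is modelled as a fraction of false matches, and the synthesizer is a linear model fitted to error-free data (an assumption that favours synthesis). The function names `slope` and `simulate`, the 20% error rate, and the use of a regression slope as the quality measure are all hypothetical choices made for illustration.

```python
import random
import statistics


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=2000, error_rate=0.2, seed=0):
    """Compare a slope estimate from linked data and from synthetic data."""
    rng = random.Random(seed)

    # Assumed "true" data: y depends linearly on x with slope 2.0.
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0, 1) for x in xs]

    # Linked data: a fraction of records are false matches, modelled here
    # by shuffling the y values among the mislinked records.
    linked_ys = ys[:]
    wrong = rng.sample(range(n), int(error_rate * n))
    shuffled = [ys[i] for i in wrong]
    rng.shuffle(shuffled)
    for i, y in zip(wrong, shuffled):
        linked_ys[i] = y

    # Synthetic data: fit a linear model to the error-free data (assumption),
    # then generate entirely new records from the fitted model.
    b = slope(xs, ys)
    a = statistics.fmean(ys) - b * statistics.fmean(xs)
    resid_sd = statistics.pstdev([y - (a + b * x) for x, y in zip(xs, ys)])
    x_mean, x_sd = statistics.fmean(xs), statistics.pstdev(xs)
    synth_xs = [rng.gauss(x_mean, x_sd) for _ in range(n)]
    synth_ys = [a + b * x + rng.gauss(0, resid_sd) for x in synth_xs]

    return {
        "true": slope(xs, ys),
        "linked": slope(xs, linked_ys),
        "synthetic": slope(synth_xs, synth_ys),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>9}: slope = {value:.3f}")
```

In this toy setting, false matches pull the slope estimated from linked data towards zero, while the synthetic data reproduce the fitted relationship. A synthesizer fitted to data that already contain linkage errors, or a misspecified synthesis model, could reverse that ordering, which is why such comparisons have to be made case by case.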
Original language | English |
---|---|
Journal | Journal of Data and Information Quality |
Publication status | Published - 2023 |
Projects associated with 'To Link or Synthesize? An Approach to Data Quality Comparison'
2 finished:
- Data Skills and Training Research Group
  Price, D., King-Hele, S., Mackey, E., Williams, L., Higgins, V., Smyth, P., Kapadia, D., Buckley, J., Morales-Gómez, A., Carter, J., Shlomo, N., Meadows, G., Brown, M., Pampaka, M., Elliot, M., Spencer, C., Swift, L. & Gosling, Z.
  1/08/19 → 30/09/22
  Project: Research
- National Centre for Research Methods 2014-2019
  Elliot, M., Chandola, T. & Shlomo, N.
  1/10/14 → 30/06/20
  Project: Research