Calibrated hot-deck donor imputation subject to edit restrictions

Wieger Coutinho, Ton de Waal, Natalie Shlomo

Research output: Contribution to journal › Article › peer-review

Abstract

A major challenge faced by basically all institutes that collect statistical data on persons, households or enterprises is that data may be missing in the observed data sets. The most common solution for handling missing data is imputation. Imputation is complicated owing to the existence of constraints in the form of edit restrictions that have to be satisfied by the data. Examples of such edit restrictions are that someone who is less than 16 years old cannot be married in the Netherlands, and that someone whose marital status is unmarried cannot be the spouse of the head of household. Records that do not satisfy these edits are inconsistent, and are hence considered incorrect. A further complication when imputing categorical data is that the frequencies of certain categories are sometimes known from other sources or have previously been estimated. In this article we develop imputation methods for imputing missing values in categorical data that take both the edit restrictions and known frequencies into account. © Statistics Sweden.
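
As an illustration of the edit restrictions described in the abstract, the following is a minimal sketch of hot-deck donor imputation that only accepts a donor if the completed record satisfies every edit. It is not the authors' method: the field names (`age`, `marital_status`, `position`), the two rules in `EDITS`, and the helper names are hypothetical, chosen to mirror the two examples in the abstract, and the sketch does not attempt the calibration to known category frequencies that the article addresses.

```python
import random

# Hypothetical edit rules based on the two examples in the abstract.
# Each rule returns True when a record VIOLATES the edit.
EDITS = [
    # Someone younger than 16 cannot be married.
    lambda r: r["age"] < 16 and r["marital_status"] == "married",
    # Someone who is unmarried cannot be the spouse of the head of household.
    lambda r: r["marital_status"] == "unmarried" and r["position"] == "spouse",
]


def satisfies_edits(record):
    """Return True if the record violates none of the edit rules."""
    return not any(edit(record) for edit in EDITS)


def hot_deck_impute(recipient, donors, rng):
    """Fill the missing (None) fields of `recipient` from a randomly
    ordered donor, accepting the first donor for which the completed
    record satisfies all edits. Returns None if no donor qualifies."""
    missing = [key for key, value in recipient.items() if value is None]
    candidates = list(donors)
    rng.shuffle(candidates)
    for donor in candidates:
        completed = dict(recipient)
        for key in missing:
            completed[key] = donor[key]
        if satisfies_edits(completed):
            return completed
    return None


donors = [
    {"age": 40, "marital_status": "married", "position": "spouse"},
    {"age": 12, "marital_status": "unmarried", "position": "child"},
]
recipient = {"age": 12, "marital_status": None, "position": "child"}
result = hot_deck_impute(recipient, donors, random.Random(0))
```

In this example the first donor would make the 12-year-old recipient married, which violates the age edit, so only the second donor can be used. A calibrated method would additionally have to steer the imputed categories toward the known population frequencies, which this sketch leaves out.
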
Original language: English
Pages (from-to): 299-321
Number of pages: 22
Journal: Journal of Official Statistics
Volume: 29
Issue number: 2
Publication status: Published - 2013

Keywords

  • Categorical data
  • Edit rules
  • Imputation
  • Population frequencies
