TY - JOUR
T1 - Dealing with Under-reported Variables: An Information Theoretic Solution
AU - Sechidis, Konstantinos
AU - Sperrin, Matthew
AU - Petherick, Emily S
AU - Lujan, Mikel
AU - Brown, Gavin
PY - 2017/6
Y1 - 2017/6
N2 - Under-reporting occurs in survey data when there is a reason for participants to give a false negative response to a question, e.g., maternal smoking in epidemiological studies. Failing to correct this misreporting introduces biases and may lead to misinformed decision making. Our work provides methods of correcting for this bias by reinterpreting it as a missing data problem, in particular as one of learning from positive and unlabelled data. Focusing on information theoretic approaches, we make three key contributions: (1) we provide a method to perform valid independence tests with known power by incorporating prior knowledge over misreporting; (2) we derive corrections for point/interval estimates of the mutual information that capture both relevance and redundancy; and finally, (3) we derive different ways of ranking under-reported risk factors. Furthermore, we show how to use our results in real-world problems and machine learning tasks.
AB - Under-reporting occurs in survey data when there is a reason for participants to give a false negative response to a question, e.g., maternal smoking in epidemiological studies. Failing to correct this misreporting introduces biases and may lead to misinformed decision making. Our work provides methods of correcting for this bias by reinterpreting it as a missing data problem, in particular as one of learning from positive and unlabelled data. Focusing on information theoretic approaches, we make three key contributions: (1) we provide a method to perform valid independence tests with known power by incorporating prior knowledge over misreporting; (2) we derive corrections for point/interval estimates of the mutual information that capture both relevance and redundancy; and finally, (3) we derive different ways of ranking under-reported risk factors. Furthermore, we show how to use our results in real-world problems and machine learning tasks.
U2 - 10.1016/j.ijar.2017.04.002
DO - 10.1016/j.ijar.2017.04.002
M3 - Article
VL - 85
SP - 159
EP - 177
JO - International Journal of Approximate Reasoning
JF - International Journal of Approximate Reasoning
SN - 0888-613X
ER -