Abstract
Many variables exhibit within-group homogeneity: the values for the individual units that make up a group tend to be similar. Measures of within-group homogeneity are useful for the sample design and statistical analysis of datasets for populations that contain groups, such as individuals in geographical areas. Homogeneity measures can easily be defined for continuous or dichotomous variables. Here, we propose a homogeneity measure for a multi-category variable and show how it can be calculated without access to individual-level data. We apply the measure to data from the UK census, show how it relates to the homogeneity of particular linear combinations of the categories, called Canonical Grouping Variables (CGVs), and explain how these CGVs are interpreted. © 2011 Copyright Taylor and Francis Group, LLC.
Original language | English |
---|---|
Pages (from-to) | 649-658 |
Number of pages | 9 |
Journal | Journal of Statistical Theory and Practice |
Volume | 5 |
Issue number | 4 |
DOIs | |
Publication status | Published - 1 Dec 2011 |
Keywords
- Aggregate data
- Canonical grouping variables
- Categorical variables
- Census area data
- Clustering
- Groups
- Homogeneity
- Intra-class correlation