Predictor variables in regression models are often measured with error. This is known as an errors-in-variables (EIV) problem. A statistical analysis of the data that ignores the EIV is called a naive analysis. In a naive analysis the variance of the errors is underestimated, which affects any statistical inference subsequently made about the model parameter estimates or the response prediction. In some cases (e.g. quadratic polynomial models) the parameter estimates and the model prediction are also biased. The errors can arise in different ways and are mainly classified as classical (i.e. those occurring in observational studies) or Berkson type (i.e. those occurring in designed experiments). This thesis addresses Berkson EIV and their effect on the statistical analysis of data fitted using linear and nonlinear models. In particular, the case where the errors are dependent and have heterogeneous variance is studied. Both analytical and empirical tools are used to develop new approaches for dealing with this type of error. Two scenarios are considered: mixture experiments, where the model to be estimated is linear in the parameters and the EIV are correlated; and bioassay dose-response studies, where the model to be estimated is nonlinear. EIV following a Gaussian distribution, as well as the much less investigated non-Gaussian case, are examined. When the errors occur in mixture experiments, both analytical and empirical results show that the naive analysis produces biased and inefficient estimators of the model parameters. The magnitude of the bias depends on the variances of the EIV for the mixture components, on the model, and on its parameters. First- and second-degree Scheffé polynomials are used to fit the response. To adjust for the EIV, four different correction approaches are proposed. The statistical properties of the resulting estimators are investigated and compared with those of the naive analysis estimators.
Analytical and empirical weighted regression calibration methods are found to give the most accurate and efficient results. These approaches require the error variance to be known prior to the analysis; their robustness to a misspecified variance is also examined. Different scenarios of EIV in the concentration settings of bioassay dose-response studies are studied (i.e. dependent and independent errors). The scenarios are motivated by real-life examples. The effects of the errors are compared using the 4-parameter Hill model. The results show that when the errors are non-Gaussian, the nonlinear least squares approach produces biased and inefficient estimators. An extension of the well-known simulation-extrapolation (SIMEX) method, called Berkson simulation-extrapolation (BSIMEX), is developed for the case when the EIV lead to biased estimators of the model parameters. BSIMEX requires the error variance to be known, and its robustness to a misspecified variance is examined. Moreover, it is shown that BSIMEX performs better than the regression calibration methods when the EIV are dependent, while the regression calibration methods are preferable when the EIV are independent.
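The remark that Berkson errors can bias a naive fit of a quadratic polynomial model can be illustrated with a small simulation. The sketch below is not one of the methods developed in the thesis: it assumes the simplest possible setting (a single predictor, independent Gaussian Berkson errors with a known, constant variance), and all names and numerical values (`beta`, `sigma_u`, `design`, and so on) are hypothetical choices made for illustration. Under these assumptions the true setting is x = w + u, where w is the nominal setting, so E[y | w] = (b0 + b2·σ_u²) + b1·w + b2·w². The naive intercept is therefore shifted by b2·σ_u², while the other two coefficients are not.

```python
import random


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_quadratic(w, y):
    """Ordinary least squares for y = b0 + b1*w + b2*w^2 via the normal equations."""
    s = [sum(wi ** k for wi in w) for k in range(5)]
    t = [sum(yi * wi ** k for wi, yi in zip(w, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve3(a, t)


def simulate(beta, sigma_u, sigma_e, design, reps, seed=0):
    """Generate responses when the true setting deviates from the nominal one."""
    rng = random.Random(seed)
    w, y = [], []
    for target in design:
        for _ in range(reps):
            # Berkson structure: true value = nominal (target) value + error
            x = target + rng.gauss(0.0, sigma_u)
            w.append(target)
            y.append(beta[0] + beta[1] * x + beta[2] * x * x
                     + rng.gauss(0.0, sigma_e))
    return w, y


# Hypothetical values chosen only for illustration
beta = (1.0, 2.0, 3.0)           # true (b0, b1, b2)
sigma_u = 0.5                    # Berkson error s.d., assumed known
design = [0.0, 0.5, 1.0, 1.5, 2.0]

w, y = simulate(beta, sigma_u, sigma_e=0.1, design=design, reps=20000)
naive = fit_quadratic(w, y)      # regress on nominal settings, ignoring the EIV

# Simple moment-based adjustment, valid only under the assumptions above
corrected_b0 = naive[0] - naive[2] * sigma_u ** 2

print("naive estimates (b0, b1, b2):", [round(v, 3) for v in naive])
print("intercept after adjustment  :", round(corrected_b0, 3))
```

With these settings the naive intercept should settle near b0 + b2·σ_u² = 1.75 rather than 1.0, and the adjusted intercept near 1.0. The adjustment relies on σ_u being known, which mirrors the requirement stated above for the correction approaches, but it does not cover the dependent or heteroscedastic errors that the thesis focuses on.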
Date of Award: 1 Aug 2011
Awarding Institution: The University of Manchester
Supervisor: Alexander Donev
Keywords: Errors-in-Variables, Berkson Errors, Classical Errors, Mixing Errors, Mixture Experiment, Dose-Response Studies, Bioassay, Scheffé Polynomial, Hill Equation, Regression Calibration, Simulation-Extrapolation.