Smoothing methodology with applications to nonparametric statistics

  • Andreas Vrahimis

    Student thesis: PhD

    Abstract

    This thesis is based on kernel smoothing techniques, with applications to nonparametric statistical methods, especially kernel density estimation and nonparametric regression. We examine a bootstrap iterative method of choosing the smoothing parameter in univariate kernel density estimation and propose an empirical smoothness correction that generally improves the method for the small to medium sample sizes tested. In a simulation study, the corrected bootstrap iterative method shows consistent overall performance and can compete with other widely used methods. The theoretical asymptotic properties of the smoothed bootstrap method in univariate kernel density estimation are examined, and an adaptive data-based choice of the fixed pilot smoothing parameter is formed that provides a good performance trade-off among distributions of various shapes, with a fast relative rate of convergence to the optimal. The asymptotic and practical differences between the versions of the smoothed bootstrap method that include and omit the diagonal terms of the error criterion are also examined. Excluding the diagonal terms yields faster relative rates of convergence of the smoothing parameter to the optimal, but a simulation study shows that, for smaller sample sizes, including the diagonal terms can be favourable. In a real data set application, both methods produced similar smoothing parameters, and the resulting kernel density estimates were of reasonable smoothness.

    Existing methods of kernel density estimation in two dimensions are discussed, and the corrected bootstrap iterative method is adapted to bivariate kernel density estimation with considerable success. Additionally, the theoretical asymptotic properties of the smoothed bootstrap method in bivariate kernel density estimation are examined, and adaptive data-based choices for the fixed pilot smoothing parameters are formed that provide fast relative rates of convergence to the optimal compared to other popular methods. The smoothed bootstrap method with the diagonal terms of the error criterion omitted exhibits slightly faster relative rates of convergence than the method that includes the diagonal terms, and in a simulation study both versions performed well compared to other methods. We also find that a scaling transformation of the data before applying the method leads to poor results for distributions of various shapes and should generally be avoided. In an application using the iris flowers data set, both suggested versions of the smoothed bootstrap method produce reasonable kernel density estimates.

    We also look at various methods of estimating the variance of the errors in nonparametric regression and suggest a simple, robust method of estimating the error variance for the homoscedastic fixed design. The method is based on a multiplicative correction of the variance of the residuals, and a comparison with popular difference-based methods shows favourable results, especially when the local linear estimator is employed.
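
    The abstract does not say which difference-based methods were used in the comparison. As an illustration only, the following minimal sketch implements the first-order difference-based estimator usually attributed to Rice (1984), which estimates the error variance in a homoscedastic fixed design by half the mean squared difference of successive responses. The function name, the simulated regression function, and the noise level are assumptions made for this example; it is not the multiplicative-correction method proposed in the thesis.

```python
import numpy as np


def rice_variance(y):
    """First-order difference-based estimate of the error variance.

    Assumes the responses are ordered by a fixed design and the errors
    are homoscedastic. Illustrative helper, not taken from the thesis.
    """
    y = np.asarray(y, dtype=float)
    if y.size < 2:
        raise ValueError("need at least two observations")
    d = np.diff(y)  # successive differences y[i+1] - y[i]
    # Each squared difference has expectation close to 2 * sigma^2
    # when the regression function is smooth, hence the factor 2.
    return float(np.sum(d ** 2) / (2.0 * (y.size - 1)))


# Simulated example (all values here are assumptions for illustration).
rng = np.random.default_rng(0)
n = 500
x = np.linspace(0.0, 1.0, n)          # equally spaced fixed design
sigma = 0.3                           # true error standard deviation
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, sigma, n)

estimate = rice_variance(y)
print(f"true variance: {sigma ** 2:.3f}, estimate: {estimate:.3f}")
```

    The estimator needs no smoothing parameter, which is why difference-based methods are a common baseline for residual-based approaches.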
    Date of Award: 1 Aug 2011
    Original language: English
    Awarding Institution
    • The University of Manchester
    Supervisor: Peter Foster (Supervisor) & Jianxin Pan (Supervisor)
