Functional Data Analysis (FDA) provides information about curves that vary over a continuum. In this thesis, we propose two novel methodologies for classifying functional data using supervised learning. The first methodology is based on nearest neighbours methods for functional data and classifies using ranks of the functional signed depth. The proposed classifier combines the simplicity and practical efficiency of the k-Ranked Nearest Neighbours (k-RNN) method with the fact that k-RNN provides conditional probabilities that the depth of an observed curve belongs to a particular group. Using this, we develop a probabilistic classifier and construct point-wise confidence intervals using a bootstrap approach. Following a generalized additive model, we propose a classifier based on the signed depth and the distance to the mode for functional observations. By means of a simulation study, we compare the performance of the proposed classifier against other nearest neighbours and depth classifiers. We also investigate its performance under different types of outliers common to these kinds of problems, and we find that the proposed method works well under these scenarios. The second methodology is based on log ratios of density estimates using Bayes' theorem. We propose a nonparametric adaptive density Bayesian classifier based on log ratios of density estimates of functional principal component scores, combined with different semimetrics. We study some of the main properties of the density estimator in a finite dimensional space and conduct a simulation study to investigate the performance of the proposed classifier under two semimetrics: the semimetric based on principal component scores and the semimetric based on partial least squares. We also compare the performance of the proposed classifier against different methods on simulated and real datasets.
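The log-ratio rule behind the second methodology can be illustrated with a minimal sketch. This is not the estimator developed in the thesis. It assumes the principal component scores have already been computed, it replaces the adaptive density estimator with a fixed-bandwidth Gaussian product kernel, and it estimates the prior probabilities from the group sample sizes. The function names and the bandwidth `h` are hypothetical.

```python
import math


def kde_log_density(x, sample, h):
    """Log of a Gaussian product-kernel density estimate at score vector x.

    Assumption: a single fixed bandwidth h for every coordinate
    (the thesis describes an adaptive estimator, which is not reproduced here).
    """
    d = len(x)
    log_terms = []
    for s in sample:
        sq = sum(((xi - si) / h) ** 2 for xi, si in zip(x, s))
        log_terms.append(-0.5 * sq)
    m = max(log_terms)  # log-sum-exp trick for numerical stability
    log_sum = m + math.log(sum(math.exp(t - m) for t in log_terms))
    log_norm = (math.log(len(sample)) + d * math.log(h)
                + 0.5 * d * math.log(2 * math.pi))
    return log_sum - log_norm


def log_ratio_classify(x, scores0, scores1, h=0.5):
    """Assign x to group 1 when the estimated log posterior ratio is positive.

    Bayes' theorem gives
    log P(1|x) - log P(0|x) = log f1(x) - log f0(x) + log(pi1 / pi0),
    with priors pi0, pi1 taken here as sample proportions (assumption).
    """
    log_prior_ratio = math.log(len(scores1) / len(scores0))
    log_ratio = (kde_log_density(x, scores1, h)
                 - kde_log_density(x, scores0, h)
                 + log_prior_ratio)
    return (1 if log_ratio > 0 else 0), log_ratio
```

Calling `log_ratio_classify(x, scores0, scores1)` returns the predicted group label together with the estimated log posterior ratio, whose magnitude indicates how decisive the assignment is.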
Imbalanced sample sizes appear frequently in classification problems and present multiple issues. We propose a method to strengthen observations that lie at the boundary. We study different sampling methods to strengthen the observations that are most susceptible to misclassification, and we generate new curves by taking linear combinations of the observations on the border and the observations closest to them in depth.
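The curve-generation step can be sketched as follows. This is an illustration only. It assumes the curves are observed on a common grid, that a depth value is already available for each curve, and that the linear combination is a convex one with a uniformly drawn weight. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import random


def closest_in_depth(i, depths):
    """Index j != i whose depth value is nearest to depths[i]."""
    candidates = (j for j in range(len(depths)) if j != i)
    return min(candidates, key=lambda j: abs(depths[j] - depths[i]))


def synthesize_curve(curves, depths, i, rng):
    """New curve as a combination of border curve i and its depth neighbour.

    Assumption: weight lam is drawn uniformly from [0, 1), so the new curve
    lies pointwise between the two parent curves.
    """
    j = closest_in_depth(i, depths)
    lam = rng.random()
    return [lam * a + (1 - lam) * b for a, b in zip(curves[i], curves[j])]
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes the generated curves reproducible.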
Date of Award | 1 Aug 2019
Original language | English
Awarding Institution | The University of Manchester
Supervisor | Peter Foster (Supervisor) & Christiana Charalambous (Supervisor)
- Functional Data
- Nonparametric statistics
- Kernel density estimation
- k-ranked nearest neighbours
- Bayesian classifier
- Principal component analysis
- Functional Depth
- Simulations
Discriminant Analysis: A functional perspective
Perez Ruiz, D. (Author). 1 Aug 2019
Student thesis: PhD