In this thesis we provide a unifying framework for two decades of work in an area of Machine Learning known as cost-sensitive Boosting algorithms. This area is concerned with the fact that most real-world prediction problems are asymmetric, in the sense that different types of errors incur different costs.

Adaptive Boosting (AdaBoost) is one of the most well-studied and widely utilised algorithms in the field of Machine Learning, with a rich theoretical depth as well as practical uptake across numerous industries. However, its inability to handle asymmetric tasks has been the subject of much criticism. As a result, numerous cost-sensitive modifications of the original algorithm have been proposed, each with its own motivation and its own claims to superiority.

Through a thorough analysis of the literature from 1997 to 2016, we identify 15 distinct cost-sensitive Boosting variants, discounting minor variations. We critique the literature using four theoretical frameworks: Bayesian decision theory, the functional gradient descent view, margin theory, and probabilistic modelling. From each framework, we derive a set of properties which must be obeyed by Boosting algorithms. We find that only three of the published AdaBoost variants are consistent with the rules of all four frameworks, and even these require their outputs to be calibrated to achieve this.

Experiments on 18 datasets, across 21 degrees of cost asymmetry, support this analysis: once calibrated, the three variants perform equivalently and outperform all others.

Our final recommendation, based on theoretical soundness, simplicity, flexibility and performance, is to use the original AdaBoost algorithm, albeit with a shifted decision threshold and calibrated probability estimates. The conclusion is that novel cost-sensitive Boosting algorithms are unnecessary if proper calibration is applied to the original.
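The recommended recipe can be illustrated with a short sketch. The code below is not taken from the thesis; the dataset, the cost values, and the use of scikit-learn's `AdaBoostClassifier` and `CalibratedClassifierCV` are illustrative assumptions. It follows the standard Bayesian decision-theoretic rule of predicting the positive class when the calibrated probability exceeds c_FP / (c_FP + c_FN), where c_FP and c_FN are the costs of false positives and false negatives respectively.

```python
# A minimal sketch (not from the thesis) of the recommended recipe:
# plain AdaBoost + calibrated probabilities + a cost-shifted decision threshold.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import train_test_split

# Illustrative asymmetric costs: a false negative is 5x as costly as a false positive.
c_fp, c_fn = 1.0, 5.0
threshold = c_fp / (c_fp + c_fn)  # Bayes-optimal threshold on P(y=1 | x)

# Synthetic, mildly imbalanced data purely for demonstration.
X, y = make_classification(n_samples=2000, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standard, cost-insensitive AdaBoost wrapped in a probability calibrator.
model = CalibratedClassifierCV(AdaBoostClassifier(n_estimators=200),
                               method="isotonic", cv=5)
model.fit(X_train, y_train)

# Cost-sensitive decision: shift the threshold rather than modify the algorithm.
proba = model.predict_proba(X_test)[:, 1]
y_pred = (proba >= threshold).astype(int)
```

The design choice here mirrors the thesis conclusion: the asymmetry is handled entirely at decision time (via the threshold) and at probability-estimation time (via calibration), leaving the learning algorithm itself unchanged.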
| Date of Award | 1 Aug 2017 |
|---|---|
| Original language | English |
| Awarding Institution | The University of Manchester |
| Supervisor | Gavin Brown (Supervisor) & Jonathan Shapiro (Supervisor) |
- decision theory
- ensemble learning
- product of experts
- margin theory
- functional gradient descent
- cost-sensitive
- risk minimization
- imbalanced classes
- probability estimation
- AdaBoost
- Boosting
- classifier calibration
Cost-sensitive boosting: A unified approach
Nikolaou, N. (Author). 1 Aug 2017
Student thesis: PhD