The University of Manchester, Faculty of Engineering and Physical Sciences

ABSTRACT OF THESIS submitted by Yun Yang for the degree of Doctor of Philosophy and entitled: Unsupervised ensemble learning and its application to temporal data clustering.

Date of Submission: 02/11/2011

Temporal data clustering can provide underpinning techniques for the discovery of intrinsic structures and can condense or summarize the information contained in temporal data, both of which are in demand in fields ranging from time series analysis to the understanding of sequential data. With respect to how they treat data dependency in temporal data, existing temporal data clustering algorithms can be classified into three categories: model-based, temporal-proximity and feature-based clustering. However, unlike static data, temporal data have many distinct characteristics, including high dimensionality, complex time dependency, and large volume, all of which make the clustering of temporal data more challenging than conventional static data clustering. A large number of recent studies have shown that unsupervised ensemble approaches improve clustering quality by combining multiple clustering solutions into a single consolidated clustering ensemble that has the best performance among the given clustering solutions.

This thesis systematically reviews existing temporal clustering and unsupervised ensemble learning techniques and proposes three unsupervised ensemble learning approaches for temporal data clustering. The first approach is based on an ensemble of HMM k-models clustering, combined with agglomerative clustering refinement, and addresses three problems that exist in most forms of model-based clustering: finding the intrinsic number of clusters, sensitivity to model initialization, and computational cost.

Secondly, we propose a sampling-based clustering ensemble approach, namely the iteratively constructed clustering ensemble.
Our approach iteratively constructs multiple partitions on a subset of the input instances selected by a smart weighting scheme, combining the strengths of both boosting and bagging approaches whilst attempting to avoid their drawbacks.

Finally, we propose a weighted ensemble learning approach to temporal data clustering which combines partitions obtained from different representations of temporal data. As a result, this approach is able to capture the properties of temporal data and to exploit the synergy created by reconciling the diverse partitions that arise from combining different representations. The proposed weighted function has an outstanding ability to perform automatic model selection and appropriate grouping for complex temporal data.
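The general idea of combining multiple clustering solutions into one consolidated partition can be illustrated with a minimal sketch. It is not the method proposed in the thesis: it assumes a plain co-association (evidence accumulation) consensus, in which two items are linked when they share a cluster in more than half of the input partitions. The function names `co_association` and `consensus_labels`, the `threshold` parameter, and the toy partitions are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_labels(partitions, threshold=0.5):
    """Link items co-clustered in more than `threshold` of the partitions,
    then return the connected components as consensus clusters.
    (Illustrative assumption, not the thesis's consensus function.)"""
    n = len(partitions[0])
    ca = co_association(partitions)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if ca[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three toy partitions of six items (hypothetical data).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],  # label names permuted, item 2 placed differently
    [0, 0, 0, 1, 1, 2],  # over-segmented: item 5 split off
]
consensus = consensus_labels(partitions)
print(consensus)
```

Because the co-association matrix depends only on whether two items share a label, the sketch is unaffected by how each input partition names its clusters, which is why the permuted labels in the second partition cause no difficulty.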
|Date of Award||31 Dec 2011|
- The University of Manchester
|Supervisor||Ke Chen (Supervisor)|