ANCHOR-GUIDED DIMENSION REDUCTION METHODS FOR COHORT DATA VISUALISATION

  • Rui Qin

Student thesis: PhD

Abstract

Data visualisation is the first stage of analysing data and developing data-driven solutions. A visual understanding of the data helps analysts quickly identify key patterns and intrinsic structural information, which benefits data scientists across many domains and applications. An effective way of "looking at" high-dimensional data is to embed its patterns into 2-D or 3-D spaces using dimension reduction techniques. Cohort datasets are datasets that contain natural clusters or can be partitioned by a clustering technique. The main goal of generating a meaningful visualisation for such datasets is to preserve the essential aspects of the intrinsic data information, e.g., the local neighbourhood of data points, the internal structure of each cohort, and the positioning and separation of cohorts. How best to balance all these aspects remains an open question, and cohort positioning is mostly evaluated qualitatively by plotting the embeddings and assessing the plots manually. This thesis focuses on improving cohort data visualisation and its evaluation. The first contribution is the ANchor GuidEd Local (ANGEL) algorithm and its variations, proposed to balance local neighbourhood preservation, cohort positioning, and cohort separation. The second contribution is a new evaluation approach designed to measure all three of these aspects quantitatively. ANGEL is evaluated and compared with a series of state-of-the-art approaches on several benchmark datasets. Results show that it can effectively generate more informative visualisations. The third contribution is the incremental extension of ANGEL. The simple but intuitive p-ANGEL and i-ANGEL algorithms are proposed to incrementally embed new data points in a batch embedding setup. These extensions also reduce memory cost and speed up computation, making the approach more applicable to large-scale real-world cohort data visualisation.
Overall, this PhD research pushes the state of the art of cohort data visualisation forward.
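
As an illustration of what measuring local neighbourhood preservation quantitatively can look like, the sketch below computes a simple k-nearest-neighbour overlap score between a high-dimensional dataset and a 2-D embedding of it. This is not the evaluation approach or the ANGEL algorithm from the thesis: the metric, the function names (`knn_indices`, `neighbourhood_preservation`, `pca_embed`), the synthetic three-cohort data, and the use of PCA as the embedding are all assumptions made for the example.

```python
import numpy as np


def knn_indices(points, k):
    """Indices of the k nearest neighbours of every row (the point itself excluded)."""
    sq = np.sum(points ** 2, axis=1)
    # Squared Euclidean distances via the expansion |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    dist = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def neighbourhood_preservation(high, low, k=10):
    """Mean fraction of each point's k nearest neighbours in `high` kept in `low`."""
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap)) / k


def pca_embed(data, dim=2):
    """Project onto the top `dim` principal components (a stand-in embedding)."""
    centred = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:dim].T


# Hypothetical cohort data: three Gaussian clusters in 20 dimensions.
rng = np.random.default_rng(0)
centres = rng.normal(scale=5.0, size=(3, 20))
data = np.vstack([c + rng.normal(size=(50, 20)) for c in centres])

embedding = pca_embed(data)
score = neighbourhood_preservation(data, embedding, k=10)
print(f"k-NN preservation at k=10: {score:.3f}")
```

A score of 1 means every point keeps all of its k nearest neighbours after embedding. A score like this covers only the local-neighbourhood aspect; cohort positioning and separation would need separate measures.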
Date of Award: 1 Aug 2023
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Gavin Brown (Supervisor) & Tingting Mu (Supervisor)

Keywords

  • data visualisation
  • dimension reduction
  • ordinal embedding
  • multi-objective optimisation
