Differential Geometry Inspired Machine Learning Solutions

  • Yian Deng

Student thesis: PhD

Abstract

This thesis studies the role of differential geometry in optimization and classification. The objective of this research is to develop machine learning models that outperform existing methods across various metrics and diverse datasets.

The first contribution addresses constrained large-scale non-convex optimization where the constraint set implies a manifold structure. Solving such problems is important in a multitude of fundamental machine learning tasks. Recent advances in Riemannian optimization have enabled the convenient recovery of solutions by adapting unconstrained optimization algorithms to manifolds. However, it remains challenging to scale these methods up while maintaining stable convergence rates and handling saddle points. We propose a new second-order Riemannian optimization algorithm that aims to improve the convergence rate and reduce the computational cost. It enhances the Riemannian trust-region algorithm, which exploits curvature information to escape saddle points, through a combination of subsampling and cubic regularization. We conduct a rigorous analysis of the convergence behavior of the proposed algorithm, and we evaluate it with extensive experiments on two general machine learning tasks across multiple datasets. The proposed algorithm exhibits improved computational speed and convergence behavior compared to a large set of state-of-the-art Riemannian optimization algorithms.

Beyond applying differential geometry to optimization algorithms for classical machine learning problems, we also investigate a specific differential-geometric quantity, namely curvature, in classification models. The ensemble strategy, which trains multiple base classifiers to defend against adversarial attacks cooperatively, has become popular in adversarial defense and has achieved empirical success. However, it remains theoretically unclear why an ensemble of adversarially trained classifiers is more robust than a single one, and why single-branch classifiers exhibit a trade-off between adversarial robustness and classification accuracy. To fill this gap, we develop a new error theory dedicated to understanding ensemble adversarial defense from the perspective of loss curvature, demonstrating the provable impact of loss curvature on a model's predicted probabilities, and a provable 0-1 loss reduction achieved by ensemble models on example sets that are challenging in adversarial defense scenarios. Guided by the theory, we propose an effective approach to further improve ensemble adversarial defense, referred to as interactive global adversarial training (iGAT). The proposal includes (1) a probabilistic distributing rule that selectively allocates to different base classifiers adversarial examples that are globally challenging to the ensemble, and (2) a regularization term for rescuing the most severe weaknesses of the base classifiers. When combined with existing ensemble adversarial defense techniques, iGAT boosts their performance by between 1% and 17% on the CIFAR10 and CIFAR100 datasets under both white-box and black-box attacks.
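To make the first contribution concrete, the following is a minimal NumPy sketch of the two ingredients the abstract names, subsampling and cubic regularization, applied to a toy Riemannian problem: maximizing a Rayleigh quotient over the unit sphere. This is an illustration of the general recipe only, not the algorithm developed in the thesis; the leading-eigenvector objective, the crude inner solver, and all step sizes are assumptions made for the example.

```python
import numpy as np

def tangent_proj(x, v):
    """Project v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def retract(x, v):
    """Map a tangent step back onto the sphere (projection retraction)."""
    y = x + v
    return y / np.linalg.norm(y)

def subsampled_cubic_step(x, rows, batch, sigma, inner_iters=50, lr=0.1):
    """One cubic-regularized second-order step for f(x) = -x'Ax/2 on the
    sphere, where A = (1/n) sum_i a_i a_i' is estimated from a row
    subsample. sigma weights the cubic regularizer; values illustrative."""
    A_hat = rows[batch].T @ rows[batch] / len(batch)  # subsampled curvature
    egrad = -A_hat @ x                                # Euclidean gradient
    g = tangent_proj(x, egrad)                        # Riemannian gradient

    def hess(v):  # Riemannian Hessian-vector product on the sphere
        return tangent_proj(x, -A_hat @ v) - np.dot(x, egrad) * v

    # Crude gradient descent on the cubic model
    # m(s) = <g, s> + <s, H s>/2 + (sigma/3) ||s||^3.
    s = np.zeros_like(x)
    for _ in range(inner_iters):
        s -= lr * (g + hess(s) + sigma * np.linalg.norm(s) * s)
    return retract(x, s)

# Toy run: drive x toward the leading eigenvector of (1/n) R'R.
rng = np.random.default_rng(0)
R = rng.normal(size=(500, 20))
x = rng.normal(size=20); x /= np.linalg.norm(x)
for _ in range(100):
    batch = rng.choice(len(R), size=100, replace=False)
    x = subsampled_cubic_step(x, R, batch, sigma=1.0)

w, V = np.linalg.eigh(R.T @ R / len(R))
print("alignment with top eigenvector:", abs(V[:, -1] @ x))  # ~1 if converged
```

The subsample keeps the cost of each curvature estimate proportional to the batch size rather than the full dataset, while the cubic term bounds the model from below so the step remains well defined even when the Hessian estimate is indefinite near a saddle point.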
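For the second contribution, the "probabilistic distributing rule" of iGAT can likewise be illustrated with a small sketch. The softmax-over-losses routing below is an assumption made for illustration, not the exact rule derived in the thesis: it stochastically allocates each globally challenging adversarial example to one base classifier, with routing probabilities shaped by how each base classifier currently handles the example.

```python
import numpy as np

def distribute_examples(losses, rng, temperature=1.0):
    """Hypothetical distributing rule, for illustration only.

    losses[i, k] is base classifier k's loss on adversarial example i,
    where every example i is globally challenging to the ensemble.
    Each example is routed to one base classifier, sampled from a
    softmax over per-classifier losses, so classifiers currently
    struggling with an example are more likely to receive it."""
    shifted = losses - losses.max(axis=1, keepdims=True)  # numerical stability
    scores = np.exp(shifted / temperature)
    probs = scores / scores.sum(axis=1, keepdims=True)
    cum = probs.cumsum(axis=1)                  # per-row CDF
    u = rng.random((losses.shape[0], 1))        # one uniform draw per example
    return (u < cum).argmax(axis=1)             # categorical sample per row

rng = np.random.default_rng(0)
losses = rng.gamma(2.0, 1.0, size=(8, 3))       # 8 examples, 3 base classifiers
print(distribute_examples(losses, rng))         # classifier index per example
```

In a full training loop, each routed example would contribute to its assigned classifier's adversarial training loss, alongside the regularization term the abstract describes for rescuing each base classifier's most severe weaknesses.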
Date of Award: 1 Aug 2024
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Xiaojun Zeng & Tingting Mu

Keywords

  • Adversarial robustness
  • Subsampling
  • Riemannian optimization
  • Ensemble
  • Cubic regularization
  • Curvature
