Estimating driver state from facial features

  • Farshid Rayhan

Student thesis: PhD

Abstract

The study of detecting and monitoring faces has been a focal point of research activity since the early 1990s. This project investigated whether a person's driving behaviour can be deduced from their facial features. To achieve this, we investigated computer vision methods including face detection, facial landmark alignment, and transfer learning.

For facial feature extraction we introduced ChoiceNet, a new convolutional neural network architecture that matched the accuracy of ResNet and DenseNet with fewer parameters across object recognition (ImageNet, CIFAR-10/100, SVHN), semantic segmentation (CamVid), and facial landmark alignment (300W) benchmarks, demonstrating encouraging generalisability across all three task families.

We also presented the Anisotropic loss, a new cost function for training CNN models on 2D landmark localisation that is particularly effective on datasets with substantial pose variation. Tests on state-of-the-art facial landmark alignment datasets with a range of CNN architectures showed that it produced better-performing models than competing loss functions. We further demonstrated a pose balancing scheme for large datasets that improves training efficiency for general-purpose CNNs such as ResNet and ChoiceNet on facial landmark alignment.

To study driver behaviour, we worked with a dataset provided by Toyota containing video recordings of drivers performing manoeuvres of varying difficulty and stress. We give a comprehensive description of the dataset and of the system we developed to analyse facial features and assess the correlations between facial movements and the executed manoeuvres. CNN models, including ChoiceNet, were trained with state-of-the-art loss functions as well as the Anisotropic loss to extract facial features. The resulting modular system combines pre-trained CNNs for face detection and feature extraction with conventional tools such as data balancing techniques and classifiers such as adaptive boosting and random forests, and achieved 84% accuracy in estimating driver actions from their facial features.
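The abstract names the Anisotropic loss without giving its form. Purely as an illustration of a direction-weighted landmark penalty, the sketch below weights the error component along a per-landmark direction differently from the perpendicular component; the direction vectors, weights, and function are hypothetical stand-ins, not the thesis's actual definition.

```python
# Illustrative sketch only: the thesis's Anisotropic loss is not specified
# in this abstract. This assumes a per-landmark penalty that weights the
# error along a given unit direction differently from the error across it
# (e.g. to reflect pose-dependent uncertainty). All names are hypothetical.
import torch

def anisotropic_loss(pred, target, directions, w_along=1.0, w_across=2.0):
    """pred, target: (B, N, 2) landmark coordinates.
    directions: (B, N, 2) unit vectors, one per landmark (assumed given)."""
    err = pred - target                                  # (B, N, 2)
    along = (err * directions).sum(dim=-1)               # signed component along the direction
    across_vec = err - along.unsqueeze(-1) * directions  # perpendicular residual
    across = across_vec.norm(dim=-1)
    return (w_along * along ** 2 + w_across * across ** 2).mean()
```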
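The pose balancing scheme is likewise only named here. One common reading is to bin training samples by head yaw and resample so each pose range contributes equally; a minimal sketch under that assumption (bin count, yaw range, and function name are placeholders):

```python
# Hypothetical pose balancing by resampling: bin samples by head yaw and
# oversample smaller bins so every pose range is equally represented.
# The thesis's actual scheme may differ; this only illustrates the idea.
import numpy as np

def pose_balanced_indices(yaw_degrees, n_bins=7, rng=None):
    rng = rng or np.random.default_rng(0)
    edges = np.linspace(-90, 90, n_bins + 1)[1:-1]          # internal bin edges
    bins = np.digitize(yaw_degrees, edges)                  # bin id per sample
    per_bin = [np.flatnonzero(bins == b) for b in range(n_bins)]
    per_bin = [idx for idx in per_bin if len(idx)]          # drop empty bins
    target = max(len(idx) for idx in per_bin)               # size of largest bin
    out = np.concatenate([rng.choice(idx, size=target, replace=True)
                          for idx in per_bin])
    rng.shuffle(out)
    return out  # training-set indices, balanced across pose bins
```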
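The final driver-state system is described as pre-trained CNNs feeding conventional classifiers. A skeletal sketch of that last stage, assuming per-frame CNN feature vectors have already been extracted (the feature dimensions, class count, and random data are placeholders):

```python
# Skeletal version of the described modular pipeline: CNN-derived facial
# features classified by random forest / adaptive boosting ensembles.
# Feature extraction is stubbed out with random data; shapes are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Stand-in for features from a face detector + CNN (e.g. ChoiceNet):
# one 128-D vector per video frame, labelled with the manoeuvre performed.
X = np.random.rand(1000, 128)
y = np.random.randint(0, 5, size=1000)   # five hypothetical manoeuvre classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for clf in (RandomForestClassifier(n_estimators=200, random_state=0),
            AdaBoostClassifier(n_estimators=200, random_state=0)):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, "accuracy:", clf.score(X_te, y_te))
```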
Date of Award: 1 Aug 2024

Original language: English

Awarding Institution

  • The University of Manchester

Supervisors: Timothy Cootes & Aphrodite Galata

Keywords

  • Computer Vision
  • Convolutional Neural Network
  • Face Tracking
  • Loss function
  • Autonomous Vehicle
