Guiding a representation towards capturing temporally coherent aspects present in video improves object identity encoding. Existing models apply temporal coherence uniformly over all features, based on the assumption that optimal encoding of object identity requires only temporally stable components. We test the validity of this assumption by exploring the effects of applying a mixture of temporally coherent invariant features, alongside variable features, in a single 'mixed' representation. Applying temporal coherence to different proportions of the available features, we evaluate a range of models on a supervised object classification task. This series of experiments was conducted on three video datasets, each with a different complexity of object shape and motion. We also investigated whether a mixed representation improves the capture of information components associated with object position, alongside object identity, in a single representation. Tests were initially applied using a single-layer autoencoder as a test bed, followed by subsequent tests investigating whether similar behaviour occurred in the more abstract features learned by a deep network. A representation applying temporal coherence in some fashion produced the best results in all tests, on both single-layer and deep networks. The majority of tests favoured a mixed representation, especially in cases where the quantity of labelled data available to the supervised task was plentiful. This work is the first investigation of a mixed representation, and demonstrates its use as a method for representation learning.
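The abstract describes splitting an autoencoder's code into a temporally coherent block and an unconstrained block, with the coherent proportion varied across models. The exact loss is not stated here, so the following is only a minimal sketch, assuming a standard L1 temporal-coherence penalty on consecutive frames applied to the first fraction of the hidden units; the names `MixedAutoencoder`, `mixed_loss`, and `coherent_fraction` are hypothetical, not taken from the thesis.

```python
import torch
import torch.nn as nn

class MixedAutoencoder(nn.Module):
    """Single-layer autoencoder whose code is split into a temporally
    coherent block (the first n_coherent units) and a free block."""
    def __init__(self, n_input, n_hidden, coherent_fraction=0.5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_input, n_hidden), nn.Sigmoid())
        self.decoder = nn.Linear(n_hidden, n_input)
        self.n_coherent = int(coherent_fraction * n_hidden)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def mixed_loss(model, frame_t, frame_t1, coherence_weight=1.0):
    # Reconstruct both frames; penalise change over time only in the
    # "coherent" slice of the code, leaving the remaining units free
    # to encode variable aspects such as position.
    recon_t, z_t = model(frame_t)
    recon_t1, z_t1 = model(frame_t1)
    mse = nn.functional.mse_loss
    recon = mse(recon_t, frame_t) + mse(recon_t1, frame_t1)
    k = model.n_coherent
    coherence = (z_t[:, :k] - z_t1[:, :k]).abs().mean()
    return recon + coherence_weight * coherence

# Hypothetical usage: flattened frames of shape (batch, pixels),
# paired so frame_t1 immediately follows frame_t in the video.
model = MixedAutoencoder(n_input=1024, n_hidden=256, coherent_fraction=0.5)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
frame_t, frame_t1 = torch.rand(32, 1024), torch.rand(32, 1024)
loss = mixed_loss(model, frame_t, frame_t1)
opt.zero_grad(); loss.backward(); opt.step()
```

Setting `coherent_fraction` to 1.0 recovers the fully coherent baseline the abstract contrasts against, while 0.0 gives a plain autoencoder, so a single parameter sweeps the proportion of features under the temporal-coherence constraint.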
| Date of Award | 1 Aug 2017 |
| --- | --- |
| Original language | English |
| Awarding Institution | The University of Manchester |
| Supervisor | Ke Chen (Supervisor) & Jonathan Shapiro (Supervisor) |
- Unsupervised learning
- Autoencoders
- Computer vision
- Neural networks
- Representation learning
- Temporal coherence
Representation learning with a temporally coherent mixed-representation
Parkinson, J. (Author). 1 Aug 2017
Student thesis: PhD