In image classification, deriving robust image representations is a key step that determines the performance of vision systems. Numerous image features and descriptors have been designed by hand over the years. As an alternative, deep neural networks, in particular convolutional neural networks (CNNs), have become popular for learning image features or representations from data and have demonstrated remarkable performance in many real-world applications. However, CNNs often require huge amounts of labelled data, which may be prohibitive in many applications, as well as long training times. This paper considers an alternative, data-independent means of obtaining features for CNNs. The proposed framework uses the Markov random field (MRF) and the self-organising map (SOM) to generate basic features and to model both intra- and inter-image dependencies. Various MRF textures are first synthesised and then clustered by a convolutional, translation-invariant SOM to form generic image features. These features can be applied directly as the early convolutional filters of a CNN, leading to a new way of deriving effective features for image classification. The MRF framework also offers a theoretical and transparent way to examine and determine the influence of image features on the performance of CNNs. Comprehensive experiments were conducted on the MNIST, rotated MNIST, CIFAR-10 and CIFAR-100 datasets, with results outperforming most state-of-the-art models of similar complexity.
- convolutional neural networks
- image representation
- Markov random field
- image classification
- self-organising maps
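
The pipeline summarised in the abstract (synthesise MRF textures, cluster them with a SOM, reuse the cluster prototypes as early convolutional filters) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the MRF is a simple 4-neighbour Ising model sampled by Gibbs sweeps, the SOM is a plain patch-based map rather than the convolutional translation-invariant variant, and every function name and parameter value (`gibbs_ising_texture`, `train_som`, `beta`, grid size, and so on) is hypothetical.

```python
import numpy as np


def gibbs_ising_texture(size=16, beta=0.8, sweeps=5, rng=None):
    """Sample a binary texture from a 4-neighbour Ising MRF (toroidal boundary)."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.choice([-1.0, 1.0], size=(size, size))
    for _ in range(sweeps):
        for i in range(size):
            for j in range(size):
                s = (x[(i - 1) % size, j] + x[(i + 1) % size, j]
                     + x[i, (j - 1) % size] + x[i, (j + 1) % size])
                # Conditional probability that this site takes the value +1.
                p = 1.0 / (1.0 + np.exp(-2.0 * beta * s))
                x[i, j] = 1.0 if rng.random() < p else -1.0
    return x


def extract_patches(img, k=5, n=50, rng=None):
    """Draw n random k-by-k patches and flatten each into a vector."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape
    rows = rng.integers(0, h - k + 1, size=n)
    cols = rng.integers(0, w - k + 1, size=n)
    return np.stack([img[r:r + k, c:c + k].ravel() for r, c in zip(rows, cols)])


def train_som(data, grid=(4, 4), epochs=5, lr0=0.5, sigma0=1.5, rng=None):
    """Train a plain 2-D SOM; returns one prototype vector per map unit."""
    rng = np.random.default_rng() if rng is None else rng
    n_units = grid[0] * grid[1]
    weights = data[rng.integers(0, len(data), size=n_units)].copy()
    coords = np.array([(r, c) for r in range(grid[0]) for c in range(grid[1])],
                      dtype=float)
    total, t = epochs * len(data), 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            frac = t / total
            lr = lr0 * (1.0 - frac)                     # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # shrinking neighbourhood
            v = data[idx]
            bmu = np.argmin(((weights - v) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (v - weights)
            t += 1
    return weights


def mrf_som_filters(n_textures=4, k=5, grid=(4, 4), seed=0):
    """Synthesise textures, cluster their patches, return zero-mean k-by-k filters."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.2, 1.0, n_textures)  # assumed range of coupling strengths
    patches = np.concatenate([
        extract_patches(gibbs_ising_texture(beta=b, rng=rng), k=k, rng=rng)
        for b in betas
    ])
    filters = train_som(patches, grid=grid, rng=rng).reshape(-1, k, k)
    filters -= filters.mean(axis=(1, 2), keepdims=True)
    return filters
```

Under these assumptions, `mrf_som_filters()` returns an array of shape `(16, 5, 5)` that could be copied into the weights of a first convolutional layer with 16 single-channel 5×5 kernels. No labelled data is involved at any point, which is the sense in which such features are data-independent.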