Abstract

Techniques in computer vision have evolved over the years, with recent breakthroughs driven by fully data-driven approaches such as deep neural networks. Although these approaches have shown impressive capabilities when detecting objects in images, they suffer from several shortcomings, such as high data dependency and a lack of transparency. This is a problem in applications where data is scarce or transparency is critical to building trust in the decision-making process, for example vision applications in healthcare or self-driving technology, where the direct impact on human life is high. The aim of this thesis is to study the role of background knowledge in overcoming these limitations in neural network-based vision models. I focus on ontologies as the source of background knowledge because of their strengths in ensuring the consistency of information and in inferring new information from existing information. The downstream task chosen for experimentation is few-shot image classification, which is used to evaluate the influence of background knowledge when classifying visual objects from only a few examples. I propose a framework that integrates ontology-based background knowledge with a vision model and has two major components: (1) concept embeddings that are learnt by capturing symbolic knowledge from an ontology in a continuous vector space; this study investigates methods to represent different properties of an ontology with embeddings, and designs and applies techniques to measure how well the embeddings represent that knowledge. (2) A vision model that is guided by the learnt embeddings during the training and inference stages. Experiments evaluate the informed vision models on several few-shot image classification benchmarks, where they achieve superior performance compared to existing approaches.
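One simple way to picture the second component, a vision model guided by concept embeddings, is nearest-neighbour classification in a shared vector space: image features are compared against the concept embeddings of the candidate classes. The toy embeddings and the `classify` helper below are illustrative assumptions, not the thesis's actual method:

```python
import numpy as np

# Hypothetical toy concept embeddings; in the framework described above these
# would be learnt from an ontology rather than hand-written.
concept_emb = {
    "cat":   np.array([1.0, 0.0]),
    "dog":   np.array([0.9, 0.3]),
    "truck": np.array([0.0, 1.0]),
}

def classify(image_feature, embeddings):
    """Assign the class whose concept embedding has the highest cosine
    similarity to the (projected) image feature."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(embeddings, key=lambda c: cos(image_feature, embeddings[c]))

# A feature close to the "truck" embedding is labelled accordingly.
print(classify(np.array([0.1, 0.9]), concept_emb))  # -> truck
```

Because class assignment is driven by proximity in the embedding space, adding a new class with few labelled examples amounts to adding its concept embedding, which is one intuition behind the few-shot setting.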
The improvement in the few-shot learning capabilities of the vision models achieved through the integration of background knowledge demonstrates a way to overcome the challenge of high data dependency. Moreover, I argue that the use of learnt concept embeddings enhances the transparency of the vision model's behaviour, since the distribution of the extracted image features is determined by the embedding space. I further introduce a framework to measure the degree of error in predictions based on the background knowledge used. This study also discusses the design and construction of suitable ontologies based on the image labels of the datasets used for the vision tasks.
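One plausible reading of such an error-degree measure is graph distance in the ontology's subsumption hierarchy: confusing a dog for a cat is a smaller error than confusing a truck for a cat. The hierarchy and the `error_degree` helper below are illustrative assumptions, not the framework actually defined in the thesis:

```python
from collections import deque

# Hypothetical subsumption hierarchy (child -> parent), as might be
# extracted from an ontology over the image labels.
parents = {
    "cat": "mammal", "dog": "mammal",
    "mammal": "animal", "bird": "animal",
    "truck": "vehicle", "car": "vehicle",
    "animal": "thing", "vehicle": "thing",
}

def error_degree(predicted, actual):
    """Undirected shortest-path distance between two classes in the
    hierarchy: 0 for a correct prediction, larger for semantically
    distant mistakes, None if the classes are not connected."""
    edges = {}
    for child, parent in parents.items():
        edges.setdefault(child, set()).add(parent)
        edges.setdefault(parent, set()).add(child)
    seen, queue = {predicted}, deque([(predicted, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == actual:
            return dist
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

print(error_degree("dog", "cat"))    # -> 2 (siblings under "mammal")
print(error_degree("truck", "cat"))  # -> 5 (via "thing")
```

A graded measure like this rewards predictions that are wrong but semantically close, which is exactly the kind of behaviour background knowledge makes it possible to quantify.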
Date of Award: 1 Aug 2022
Supervisors: Uli Sattler & Tingting Mu
- Computer Vision
- Image Classification
- Background Knowledge