Manual semantic tagging of data is too labour-intensive for practical use, and the increasing reliance on data for decision making has led researchers to explore techniques for automatic knowledge acquisition that automate this process. One such technique is Formal Concept Analysis (FCA). FCA takes a formal context, a table of incidence relations between sampled data instances and their properties, and constructs a lattice of partial order relationships between the instance sets and between the property sets. This lattice is mapped onto a semantic knowledge structure comprising domain concepts with their instances and properties. However, automatically extracting structure from a large number of instances usually yields a lattice that is too complicated and noisy for practical semantic analysis of real-world datasets. Algorithms for reducing the lattice exist, but they rely mainly on the lattice structure (using mathematical measures of relevancy) and ignore any prior knowledge of the domain of interest. In contrast, our work uses existing domain knowledge encoded in a semantic ontology to inform the reduction process. The main contribution of the research is the proposed Ontology-informed Lattice Reduction Approach, which leverages an existing domain ontology to reduce and streamline the lattices created when applying FCA to real-world data. The approach assumes a partial overlap between the sampled instances and those in the domain ontology, and its value lies in providing a semantic structure that captures all sampled instances. The approach relies on a new relevancy metric, the Discrimination Power Index (DPI), which is used to automatically classify sampled instances and align them with the domain ontology. The DPI measures the commonality between concepts in the domain ontology and those arising from the sampled formal context (a sample representation of a dataset).
This index is calculated from two relevancy criteria, described in full in the thesis: (1) the number of instances shared between the domain ontology and the sampled formal context, and (2) the overall importance of a property within the formal context, based on the partial order relationships between the sets of instances tagged with that property. The utility of the proposed approach is demonstrated using three case studies constructed from real datasets. The results demonstrate a significant reduction in lattice nodes, even when the overlap between the ontology and the sampled instances is minimal.
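To make the step from a formal context to a concept lattice more concrete, the following is a minimal sketch of standard FCA concept derivation. It is not the thesis's reduction approach, and it does not compute the DPI. The toy context and the function names (`extent`, `intent`, `formal_concepts`, `is_subconcept`) are hypothetical and chosen only for illustration. The enumeration is deliberately naive (it closes every subset of properties), so it suits only very small contexts.

```python
from itertools import combinations

# Hypothetical toy formal context: each instance maps to its set of properties.
context = {
    "sparrow": {"flies", "has_feathers"},
    "penguin": {"swims", "has_feathers"},
    "salmon": {"swims"},
}


def extent(properties, context):
    """Return the instances that have every property in `properties`."""
    return frozenset(g for g, props in context.items() if properties <= props)


def intent(instances, context):
    """Return the properties shared by every instance in `instances`."""
    shared = set().union(*context.values())  # start from all properties
    for g in instances:
        shared &= context[g]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every property subset."""
    all_props = sorted(set().union(*context.values()))
    concepts = set()
    for r in range(len(all_props) + 1):
        for combo in combinations(all_props, r):
            ext = extent(set(combo), context)
            concepts.add((ext, intent(ext, context)))
    return concepts


def is_subconcept(a, b):
    """Lattice order: concept `a` lies below `b` when its extent is a subset."""
    return a[0] <= b[0]


concepts = formal_concepts(context)
for ext, itt in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

Even this three-instance context yields six concepts, which gives a sense of why lattices built from large real-world samples grow too quickly to analyse without some form of reduction.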
| Field | Value |
|---|---|
| Date of Award | 1 Aug 2020 |
| Original language | English |
| Awarding Institution | The University of Manchester |
| Supervisor | Nikolay Mehandjiev (Supervisor) & Azar Shahgholian (Supervisor) |
- Lattice reduction
- Semantic structures
- FCA
Ontology-informed Lattice Reduction Using the Discrimination Power Index
Quboa, Q. (Author). 1 Aug 2020
Student thesis: PhD