Linking Information Resources with Automatic Semantic Extraction

  • Daniel Joseph

Student thesis: PhD


Knowledge is a critical dimension in the problem-solving processes of human intelligence. Consequently, enabling intelligent systems to provide advanced services requires that their artificial intelligence routines have access to knowledge of relevant domains. Ontologies are often utilised as the formal conceptualisation of domains, in that they identify and model the concepts and relationships of the targeted domain. However, complexities inherent in ontology development and maintenance have limited their availability. Separate from the conceptualisation component, domain knowledge also encompasses the concept membership of object instances within the domain. The need to capture both the domain model and the current state of instances within the domain has motivated the import of Formal Concept Analysis into intelligent systems research. Formal Concept Analysis, which provides a simplified model of a domain, has the advantage that it not only defines concepts in terms of their attribute descriptions but also simultaneously ascribes object instances to their appropriate concepts. Nonetheless, a significant drawback of Formal Concept Analysis is that, when applied to a large dataset, the lattice with which it models a domain is often composed of a very large number of concepts, many of which are arguably unnecessary or invalid. In this research a novel measure is introduced which assigns a relevance value to concepts in the lattice. This measure is termed the Collapse Index and is based on the minimum number of object instances that need to be removed from a domain in order for a concept to be expunged from the lattice.
The mathematics underpinning its origin and behaviour are detailed in the thesis, showing that, if the relevance of a concept is defined by the Collapse Index, a concept will eventually lose relevance if one of its immediate subconcepts increasingly acquires object instance support, and a concept has its highest relevance when its immediate subconcepts have equal or near-equal object instance support.
In addition, experimental evaluation is provided in which the Collapse Index demonstrated comparable or better performance than the current prominent alternatives in: being consistent across samples; the ability to recall concepts in noisy lattices; and efficiency of calculation. It is also demonstrated that the Collapse Index affords concepts with low object instance support the opportunity to have a higher relevance than those with high support.
The second contribution to knowledge is an approach to semantic extraction from a dataset in which the Collapse Index is included as a method of selecting concepts for inclusion in a final concept hierarchy. The utility of the approach is demonstrated by reviewing its inclusion in the implementation of a recommender system. This recommender system serves as the final contribution, featuring a unique design in which lattices represent user profiles and concepts in these profiles are pruned using the Collapse Index. Results showed that pruning of profile lattices enabled by the Collapse Index improved the success levels of movie recommendations, provided that appropriate thresholds were set.
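As an illustration only, the following Python sketch enumerates the formal concepts of a tiny context by brute force and then counts, directly from the wording above, the minimum number of object instances whose removal expunges a concept from the lattice. It is not the thesis's implementation: the function names, the toy movie/genre data, and the choice to report a raw count (the thesis may normalise or otherwise transform this quantity to obtain the Collapse Index) are all assumptions made for the example, and the exhaustive search is only practical for very small contexts.

```python
from itertools import combinations

def common_attributes(context, objects, all_attributes):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)

def extent_of(context, attributes):
    """Objects in `context` that carry every attribute in `attributes`."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)

def concepts(context):
    """Brute-force enumeration of formal concepts as (extent, intent) pairs."""
    all_attributes = frozenset().union(*context.values())
    objs = list(context)
    found = set()
    for k in range(len(objs) + 1):
        for subset in combinations(objs, k):
            intent = common_attributes(context, subset, all_attributes)
            found.add((extent_of(context, intent), intent))
    return found

def min_removals_to_expunge(context, intent):
    """Smallest number of objects whose removal makes `intent` stop being the
    intent of a concept (hypothetical helper; raw count, not normalised).
    Returns None if no removal can expunge the concept."""
    all_attributes = frozenset().union(*context.values())
    intent = frozenset(intent)
    objs = list(context)
    for k in range(1, len(objs) + 1):
        for removed in combinations(objs, k):
            reduced = {o: a for o, a in context.items() if o not in removed}
            remaining = extent_of(reduced, intent)
            # The concept is gone once the remaining objects share extra attributes.
            if common_attributes(reduced, remaining, all_attributes) != intent:
                return k
    return None

# Toy context (assumed data): movies described by genre attributes.
context = {
    "m1": {"action"},
    "m2": {"action", "comedy"},
    "m3": {"action", "comedy"},
    "m4": {"action", "drama"},
    "m5": {"action", "drama"},
    "m6": {"comedy"},
}

print(len(concepts(context)))                         # 6
print(min_removals_to_expunge(context, {"action"}))   # 3
print(min_removals_to_expunge(context, {"comedy"}))   # 1
```

In this toy context the concept with intent {action} has two immediate subconcepts of equal support (two movies each), so three removals are needed to expunge it, whereas {comedy} is dominated by a single subconcept and collapses after one removal. This is consistent with the behaviour described above, although the exact measure defined in the thesis may differ from this raw count.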
Date of Award: 1 Aug 2016
Original language: English
Awarding Institution
  • The University of Manchester
Supervisor: Babis Theodoulidis (Supervisor) & Nikolay Mehandjiev (Supervisor)


Keywords
  • Recommender System
  • Semantics
  • Formal Concept Analysis
  • Concept Relevancy
  • Taxonomy
