Tree view self-organisation of web content

Richard T. Freeman, Hujun Yin

    Research output: Contribution to journal, Article, peer-review

    Abstract

    When browsing a large set of unstructured documents, it is advantageous if the documents have been organised and presented in a way that makes navigation efficient, makes the underlying concepts easy to understand and makes related information quick to locate. This paper proposes a new method, termed Treeview self-organising maps (Treeview SOMs), for clustering and organising text documents by means of a series of independently and automatically created, hierarchical one-dimensional SOMs. The method generates a topological taxonomy tree for presenting and visualising a set of unstructured text documents. The documents are organised in a hierarchy of dynamically generated and automatically validated topics extracted from the corpus. The results, presented in a labelled tree view, clearly show the underlying contents of the documents and can help users browse the document set more efficiently than the results of previous work using SOMs or hierarchical clustering methods. A brief overview of general document clustering and a review of SOM-based document analysis methods are also provided, together with a comparison among them. © 2004 Elsevier B.V. All rights reserved.
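
The idea of a hierarchy of independently trained one-dimensional SOMs can be illustrated with a minimal sketch. This is not the authors' Treeview SOM: the abstract does not specify the training schedule, the topic validation step or the labelling, so the function names (`train_som_1d`, `best_matching_unit`, `build_tree`), the decay schedule, the fixed node count and the stopping rule below are all assumptions made for illustration. Documents are assumed to be already encoded as numeric term-weight vectors.

```python
import math
import random


def best_matching_unit(weights, v):
    """Return the index of the node whose weight vector is closest to v."""
    return min(
        range(len(weights)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(weights[i], v)),
    )


def train_som_1d(vectors, n_nodes, epochs=50, seed=0):
    """Train a one-dimensional SOM (a chain of n_nodes) on the vectors."""
    rng = random.Random(seed)
    # Assumption: initialise each node from a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)  # learning rate decays linearly
        radius = max(n_nodes / 2.0 * (1.0 - frac), 0.5)  # neighbourhood shrinks
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, v)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood measured along the 1D chain.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (v[k] - w[k])
    return weights


def build_tree(vectors, ids, n_nodes=2, min_size=3, depth=0, max_depth=3):
    """Recursively split the documents in ids, one independent 1D SOM per level."""
    if len(ids) < min_size or depth >= max_depth:
        return {"docs": ids, "children": []}
    weights = train_som_1d([vectors[i] for i in ids], n_nodes)
    groups = [[] for _ in range(n_nodes)]
    for i in ids:
        groups[best_matching_unit(weights, vectors[i])].append(i)
    non_empty = [g for g in groups if g]
    if len(non_empty) < 2:  # the SOM did not separate the documents
        return {"docs": ids, "children": []}
    children = [
        build_tree(vectors, g, n_nodes, min_size, depth + 1, max_depth)
        for g in non_empty
    ]
    return {"docs": ids, "children": children}
```

Calling `build_tree(vectors, list(range(len(vectors))))` returns a nested dictionary in which every document index ends up in exactly one leaf. A fuller treatment would replace the fixed `n_nodes` and `max_depth` with the dynamic topic generation and validation that the abstract mentions.
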
    Original language: English
    Pages (from-to): 415-446
    Number of pages: 31
    Journal: Neurocomputing
    Volume: 63
    Publication status: Published - Jan 2005

    Keywords

    • Browsing and navigation
    • Document clustering
    • Information retrieval
    • Knowledge management
    • Self-organising maps
