Proximity-based graph embeddings for multi-label classification

Tingting Mu, Sophia Ananiadou

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    In many real applications of text mining, information retrieval and natural language processing, large-scale features are frequently used, which often makes the employed machine learning algorithms intractable, a problem well known as the "curse of dimensionality". Aiming not only to remove redundant information from the original features but also to improve their discriminating ability, we present a novel approach to the supervised generation of low-dimensional, proximity-based graph embeddings to facilitate multi-label classification. The optimal embeddings are computed from a supervised adjacency graph, called the multi-label graph, which simultaneously preserves the proximity structures between samples constructed from feature information and from multi-label class information. We propose different ways to obtain this multi-label graph, working either in a binary label space or in a projected real label space. To reduce the training cost that large-scale features impose on the dimensionality reduction procedure, a smaller set of relation features between each sample and a set of representative prototypes is employed. The effectiveness of the proposed method is demonstrated on two document collections for text categorization based on the "bag of words" model.
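The pipeline outlined in the abstract (relation features to prototypes, a graph mixing feature and label proximity, then a low-dimensional embedding) can be illustrated with a minimal sketch. This is not the authors' formulation: the Euclidean relation measure, the k-nearest-neighbour graph, the Jaccard label similarity in the binary label space, and the Laplacian-eigenmap-style embedding are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import numpy as np

def relation_features(X, prototypes):
    """Distance from each sample to each prototype (Euclidean is an assumption)."""
    diff = X[:, None, :] - prototypes[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def multilabel_graph(R, Y, k=3):
    """Symmetric kNN graph on R, with edges weighted by label-set overlap.

    Y is a binary (n_samples, n_labels) matrix; the Jaccard weighting is an
    illustrative choice, not the weighting used in the paper.
    """
    n = R.shape[0]
    d = np.sqrt(((R[:, None, :] - R[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(d, np.inf)                  # exclude self-neighbours
    idx = np.argsort(d, axis=1)[:, :k]           # k nearest neighbours per row
    knn = np.zeros((n, n), dtype=bool)
    knn[np.repeat(np.arange(n), k), idx.ravel()] = True
    knn = knn | knn.T                            # symmetrise the neighbourhood
    inter = Y @ Y.T                              # shared labels per pair
    union = Y.sum(axis=1)[:, None] + Y.sum(axis=1)[None, :] - inter
    jaccard = np.where(union > 0, inter / np.maximum(union, 1), 0.0)
    return knn * jaccard

def spectral_embedding(W, dim=2):
    """Embedding from the normalised graph Laplacian (Laplacian-eigenmap style)."""
    deg = W.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    L = np.diag(deg) - W
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_norm)             # eigenvalues in ascending order
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]   # skip the first eigenvector

# Toy data standing in for bag-of-words vectors and binary label sets.
rng = np.random.default_rng(0)
X = rng.random((12, 50))
Y = (rng.random((12, 3)) > 0.5).astype(int)
prototypes = X[:4]                               # assumed: first 4 samples as prototypes

R = relation_features(X, prototypes)             # 12 x 4 instead of 12 x 50
W = multilabel_graph(R, Y, k=3)
Z = spectral_embedding(W, dim=2)
```

Here the relation features shrink each sample from 50 dimensions to 4 before the graph is built, which is the cost-saving step the abstract describes. The sketch only embeds the training samples; mapping unseen samples into the embedded space would need an additional step that is not shown.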
    Original language: English
    Title of host publication: KDIR 2010 - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval | KDIR - Proc. Int. Conf. Knowl. Discov. Inf. Retr.
    Pages: 74-84
    Number of pages: 10
    Publication status: Published - 2010
    Event: International Conference on Knowledge Discovery and Information Retrieval, KDIR 2010 - Valencia
    Duration: 1 Jul 2010 → …

    Conference

    Conference: International Conference on Knowledge Discovery and Information Retrieval, KDIR 2010
    City: Valencia
    Period: 1/07/10 → …

    Keywords

    • Adjacency graph
    • Dimensionality reduction
    • Embedding
    • Multi-label classification
    • Supervised
