Using text mining to analyze quality aspects of unstructured data: A case study for "stock-touting" spam emails

Mohamed Zaki, David Diaz, Babis Theodoulidis

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    The growth in the use of text mining tools and techniques over the last decade has been driven primarily by the sheer increase in the volume of unstructured text and the need to extract useful and, more importantly, high-quality information from it. The impetus to analyse unstructured data efficiently and effectively as part of an organization's decision-making processes has further motivated the need to better understand how to use text mining tools and techniques. This paper describes a case study of a stock spam e-mail architecture that demonstrates the process of refining linguistic resources to extract relevant, high-quality information from stock spam e-mails, including stock profiles, financial keywords, stock and company news (positive/negative), and compound phrases. The aim of the study is to identify high-quality information patterns that can support the relevant authorities in detecting and analyzing fraudulent activities.
    Original language: English
    Title of host publication: 16th Americas Conference on Information Systems 2010, AMCIS 2010 | Amer. Conf. Inf. Sys., AMCIS
    Subtitle of host publication: http://aisel.aisnet.org/amcis2010/364/
    Publisher: Curran Associates Incorporated
    Pages: 4949-4958
    Number of pages: 9
    Volume: 7
    ISBN (Print): 9781617389528
    Publication status: Published - 2010
    Event: 16th Americas Conference on Information Systems 2010, AMCIS 2010 - Lima
    Duration: 1 Jul 2010 → …

    Conference

    Conference: 16th Americas Conference on Information Systems 2010, AMCIS 2010
    City: Lima
    Period: 1/07/10 → …

    Keywords

    • Data and Information Quality
    • Spam Emails
    • Text Mining
    • Unstructured Data
