Data quality support to on-the-fly data integration using adaptive query processing

Paolo Missier, Roald Lengu, Alvaro A A Fernandes, Giovanna Guerrini, Marco Mesiti

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    In dynamic, on-the-fly relational data integration settings, such as data mashups, there is a need to reconcile value heterogeneity across sources in order to ensure consistency and completeness of the integrated data. In this scenario, using exact joins to match records across sources may lead to incomplete integration, while similarity joins, often advocated as a solution to this problem, are computationally expensive. In this paper we explore the use of adaptive query processing (AQP) techniques to combine exact (fast) and approximate (accurate) joins when performing dynamic integration. The adaptive algorithm uses an a priori expectation of the join result size, combined with the monitoring of join progress, to determine statistically, at various points during query execution, which join operator should be used. Depending on its configuration, the algorithm can achieve various tradeoffs between completeness of the join result and query execution time. Our experimental results show that appreciable savings in join execution time can be achieved in practice, at the expense of a modest reduction in result completeness.
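
The switching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the one-sided z-test on the match count, the `difflib` similarity measure, and the parameters `expected_rate`, `check_every`, `threshold`, and `z_crit` are all assumptions chosen for illustration. The sketch starts with the exact join, periodically compares the number of matched records with the number expected a priori, and uses the approximate join while the shortfall is statistically significant.

```python
import math
from difflib import SequenceMatcher


def exact_match(key, right_set):
    """Fast path: return the right-hand key identical to `key`, if any."""
    return [key] if key in right_set else []


def approx_match(key, right_keys, threshold=0.8):
    """Slow path: return right-hand keys whose similarity ratio reaches `threshold`."""
    return [r for r in right_keys
            if SequenceMatcher(None, key, r).ratio() >= threshold]


def shortfall_is_significant(matched, seen, expected_rate, z_crit=1.645):
    """One-sided z-test (an assumed criterion): is the match count below expectation?"""
    if seen == 0 or not 0.0 < expected_rate < 1.0:
        return False
    mean = seen * expected_rate
    std = math.sqrt(seen * expected_rate * (1.0 - expected_rate))
    return (mean - matched) / std > z_crit


def adaptive_join(left_keys, right_keys, expected_rate, check_every=5):
    """Join two key lists, choosing the operator again every `check_every` records."""
    right_list = list(right_keys)   # stable order for the approximate scan
    right_set = set(right_list)     # constant-time lookup for the exact probe
    use_approx = False              # always start with the cheap operator
    matched = 0                     # left records that found at least one partner
    result = []
    for seen, key in enumerate(left_keys, start=1):
        if use_approx:
            hits = approx_match(key, right_list)
        else:
            hits = exact_match(key, right_set)
        if hits:
            matched += 1
            result.extend((key, h) for h in hits)
        # Monitoring point: re-evaluate which operator to use from here on.
        if seen % check_every == 0:
            use_approx = shortfall_is_significant(matched, seen, expected_rate)
    return result


if __name__ == "__main__":
    left = ["genova", "milano", "torino", "napoli", "venezia", "firenze"]
    right = ["Genova", "Milano", "Torino", "Napoli", "Venezia", "Firenze"]
    for pair in adaptive_join(left, right, expected_rate=0.9, check_every=2):
        print(pair)
```

In this toy run the first two records are processed with the exact join before the first monitoring point and find no partner, so they are missing from the result; the remaining records are matched approximately. This mirrors, on a very small scale, the trade-off between completeness and execution time that the abstract describes.
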
    Original language: English
    Title of host publication: 17th Italian Symposium on Advanced Database Systems, SEBD 2009 (abbreviated: Ital. Symp. Adv. Databases Syst., SEBD)
    Pages: 213-220
    Number of pages: 7
    Publication status: Published - 2009
    Event: 17th Italian Symposium on Advanced Database Systems, SEBD 2009 - Camogli, Genova
    Duration: 1 Jul 2009 → …
