Abstract
Information integration systems allow users to express queries over high-level conceptual models. However, such queries must subsequently be evaluated over collections of sources, some of which are likely to be expensive to use or subject to periods of unavailability. It would therefore be useful if information integration systems could provide users with estimates of the consequences of omitting certain sources from query execution plans. Such omissions can affect both the soundness (the fraction of returned answers which are correct) and the completeness (the fraction of correct answers which are returned) of the answer set returned by a plan. Many recent information integration systems have used conceptual models expressed in description logics (DLs). This paper presents an approach to estimating the soundness and completeness of queries expressed in the ALCQI DL. Our estimation techniques are based on estimating the cardinalities of query answers. We have conducted a statistical evaluation of our techniques, the results of which are presented here. We also offer some suggestions as to how estimates for cardinalities of subqueries can be used to aid users in improving the soundness and completeness of query plans. © 2003 Elsevier B.V. All rights reserved.
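
The two measures defined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's estimation technique: the paper estimates these quantities from cardinalities, whereas the sketch assumes both answer sets are fully known. The function names, the sample data, and the convention of returning 1.0 for an empty denominator are assumptions made for illustration only.

```python
def soundness(returned: set, correct: set) -> float:
    """Fraction of returned answers that are correct."""
    if not returned:
        return 1.0  # assumed convention: an empty answer set contains no wrong answers
    return len(returned & correct) / len(returned)


def completeness(returned: set, correct: set) -> float:
    """Fraction of correct answers that are returned."""
    if not correct:
        return 1.0  # assumed convention: nothing could have been missed
    return len(returned & correct) / len(correct)


# Hypothetical data: the plan returns three answers, two of which are
# among the four correct ones.
correct_answers = {"a", "b", "c", "d"}
plan_answers = {"a", "b", "e"}

print(f"soundness    = {soundness(plan_answers, correct_answers):.2f}")     # 0.67
print(f"completeness = {completeness(plan_answers, correct_answers):.2f}")  # 0.50
```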
Original language | English |
---|---|
Pages (from-to) | 105-129 |
Number of pages | 24 |
Journal | Data and Knowledge Engineering |
Volume | 47 |
Issue number | 1 |
Publication status | Published - Oct 2003 |
Keywords
- Cardinality estimation
- Data quality
- Description logics
- Distributed query processing
- Information integration