Abstract
Dataspaces aim to remove the up-front costs of information integration by gathering the needed domain information through targeted interactions with the end-user throughout the lifetime of the integration. State-of-the-art tools are used to rapidly construct an initial (incorrect) integration, which is then refined in a pay-as-you-go manner by asking end-users to supply feedback on the resulting data. The idea is that end-users will choose to put effort into providing feedback on the areas of the integration where the quality is important to them, while other less well-used areas will receive a smaller share of user attention. This approach is promising but open problems remain. One issue is that the end-user loses control over the process. Their contribution is to specify their query requirements and to provide feedback on the results, as directed by the dataspace. But what feedback should the user supply to get the data they want? We propose a new approach to data integration in which the end-user and the dataspace work as equal partners to meet the integration goal. Both are able to perform data integration tasks directly, and both request and provide feedback on the results. In addition, the dataspace observes the actions of the end-user when carrying out integration, with the aim of automating that part of the work in future integration tasks. In this paper, we explore this idea by examining how a dataspace can observe an end-user at work, correcting errors in query results, to gather feedback needed to refine the mappings used for integration. We propose an algorithm for converting manual corrections to feedback, and present the results of a preliminary evaluation comparing this approach with seeking explicit feedback from end-users.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, HILDA@SIGMOD 2017, Chicago, IL, USA |
Publication status | Published - May 2017 |
Event | SIGMOD/PODS'17 International Conference on Management of Data - Chicago, United States; Duration: 14 May 2017 → 19 Jun 2017 |
Conference
Conference | SIGMOD/PODS'17 International Conference on Management of Data |
---|---|
Country/Territory | United States |
City | Chicago |
Period | 14/05/17 → 19/06/17 |
Keywords
- Information integration
- dataspaces
- Pay-as-you-go
- Implicit Feedback
- manual data correction