Observing the Data Scientist: Using Manual Corrections as Implicit Feedback

Nurzety Binti Ahmad Azuan, Suzanne Embury, Norman Paton

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Dataspaces aim to remove the up-front costs of information integration by gathering the needed domain information through targeted interactions with the end-user throughout the life-time of the integration. State-of-the-art tools are used to rapidly construct an initial (incorrect) integration, which is then refined in a pay-as-you-go manner by asking end-users to supply feedback on the resulting data. The idea is that end-users will choose to put effort into providing feedback on the areas of the integration where the quality is important to them, while other less well-used areas will receive a smaller share of user attention. This approach is promising but open problems remain. One issue is that the end-user loses control over the process. Their contribution is to specify their query requirements and to provide feedback on the results, as directed by the dataspace. But what feedback should the user supply to get the data they want? We propose a new approach to data integration in which the end-user and the dataspace work as equal partners to meet the integration goal. Both are able to perform data integration tasks directly, and both request and provide feedback on the results. In addition, the dataspace observes the actions of the end-user when carrying out integration, with the aim of automating that part of the work in future integration tasks. In this paper, we explore this idea by examining how a dataspace can observe an end-user at work, correcting errors in query results, to gather feedback needed to refine the mappings used for integration. We propose an algorithm for converting manual corrections to feedback, and present the results of a preliminary evaluation comparing this approach with seeking explicit feedback from end-users.
Original language: English
Title of host publication: Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, HILDA@SIGMOD 2017, Chicago, IL, USA
Publication status: Published - May 2017
Event: SIGMOD/PODS'17 International Conference on Management of Data - Chicago, United States
Duration: 14 May 2017 → 19 Jun 2017


Conference: SIGMOD/PODS'17 International Conference on Management of Data
Country/Territory: United States


  • Information integration
  • dataspaces
  • Pay-as-you-go
  • Implicit Feedback
  • manual data correction

