TY - GEN
T1 - Schema Mapping Generation in the Wild: A Demonstration with Open Government Data
AU - Mazilu, Mihaela
AU - Konstantinou, Nikolaos
AU - Paton, Norman
AU - Fernandes, Alvaro A.A.
N1 - Publisher Copyright: © 2020 Copyright held by the owner/author(s).
PY - 2020
Y1 - 2020
AB - Schema mapping generation identifies how data sets can be combined to create views that are relevant to an application. Where the data sets to be combined lack declared relationships, such as foreign keys, schema mapping generation can be considered to be in the wild. In this paper, we describe an approach to schema mapping generation in the context of open government data, in particular, the London Datastore. Mapping generation is informed by inferred profiling data about the data sets and their relationships, where the data sets are made available as CSV files. We outline the mapping generation algorithm, and describe a demonstration of the approach, in which the user can: (i) specify the target to be populated by the generated mappings over a collection of sources from the London Datastore; (ii) browse the generated candidate mappings and the evidence that informed their creation; and (iii) steer the mapping generation process, to make use of preferred sources and dependable profiling results.
UR - http://www.scopus.com/inward/record.url?scp=85084174286&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2020.77
DO - 10.5441/002/edbt.2020.77
M3 - Conference contribution
SN - 9783893180837
T3 - Advances in Database Technology - EDBT
SP - 615
EP - 618
BT - Advances in Database Technology - EDBT 2020
A2 - Bonifati, Angela
A2 - Zhou, Yongluan
A2 - Vaz Salles, Marcos Antonio
A2 - Bohm, Alexander
A2 - Olteanu, Dan
A2 - Fletcher, George
A2 - Khan, Arijit
A2 - Yang, Bin
T2 - 23rd International Conference on Extending Database Technology
Y2 - 30 March 2020 through 2 April 2020
ER -