Helena Galhardas,

R – INESC-ID Lisboa

Abstract:

Data cleaning and ETL processes are usually modeled as graphs of data transformations. The users responsible for executing these graphs over real data need to be involved, both to tune the data transformations and to manually correct data items that cannot be handled automatically. This talk will describe recent research that better supports user involvement in data cleaning processes by equipping data cleaning graphs with two mechanisms: data quality constraints, which help users identify the points of the graph and the records that need their attention, and manual data repairs, which represent the ways users can provide the feedback required to manually clean data items. Some preliminary experimental results will be presented, showing the significant gains obtained with the use of data cleaning graphs.

 

Date: 2011-Sep-26     Time: 16:00:00     Room: 336

