Mapper: An Efficient Data Transformation Operator
Paulo Jorge Fernandes Carreira,
Faculdade de Ciências da Universidade de Lisboa
Application scenarios such as legacy data migration, Extract-Transform-Load (ETL) processes, and data cleaning require transforming input tuples into output tuples. Traditional approaches implement these data transformations either as Persistent Stored Modules (PSM) executed by an RDBMS or as transformation code written with a commercial ETL tool. Neither is easy to maintain or optimize. A third approach combines SQL queries with external code written in a programming language. However, this solution is not expressive enough to specify an important class of data transformations: those that produce several output tuples for a single input tuple. In my PhD thesis, I propose the data mapper operator, an extension to the relational algebra, to address this class of data transformations. The thesis also presents a set of algebraic rewriting rules for optimizing expressions that combine standard relational operators with mappers. Experimental results confirm the benefits of some of the proposed semantic optimizations.
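To make the idea of a one-to-many transformation concrete, here is a minimal Python sketch. It is an illustration only, not the thesis's formal definition of the mapper operator: the names `mapper` and `unroll_installments`, the loan data, and the attribute names are all hypothetical. The last lines illustrate the flavour of rewriting such an operator might allow, assuming a selection that only reads an attribute the mapper copies unchanged.

```python
from typing import Callable, Iterable, Iterator

Row = dict  # a tuple, represented as an attribute -> value mapping


def mapper(rows: Iterable[Row], fn: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply fn to each input row and yield every output row it produces."""
    for row in rows:
        yield from fn(row)


def unroll_installments(loan: Row) -> Iterator[Row]:
    """Hypothetical one-to-many function: one loan row -> one row per installment."""
    n = loan["installments"]
    for seq in range(1, n + 1):
        yield {"account": loan["account"], "seq": seq, "amount": loan["total"] / n}


# Hypothetical input relation.
loans = [
    {"account": "A1", "total": 300.0, "installments": 3},
    {"account": "B7", "total": 100.0, "installments": 1},
]

# One input tuple may yield several output tuples.
payments = list(mapper(loans, unroll_installments))

# Illustrative rewrite (an assumption, not a rule quoted from the thesis):
# a selection on "account", which the mapper copies unchanged, can be
# applied before the mapper instead of after it.
filtered_after = [p for p in payments if p["account"] == "A1"]
filtered_before = list(
    mapper((l for l in loans if l["account"] == "A1"), unroll_installments)
)
```

In this sketch both orderings give the same rows, but filtering first avoids expanding loans that would be discarded anyway, which is the kind of saving an algebraic optimizer could exploit.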
Date: 2008-Apr-16 Time: 15:00:00 Room: N7.1