Data Search using Deep Learning
The Web is a rich source for tables:
- Many tables describe the same real world entities
- Most tables contain only partial information
- Tables are scattered across different websites

Movie                 Premiere      Director   Actors
Biutiful              05/17/2010    ?          ?
True Grit             12/22/2010    ?          ?
The Social Network    10/01/2010    ?          ?
The King's Speech     1/07/2010     ?          ?
127 Hours             09/04/2010    ?          ?
The Fighter           12/10/2010    ?          ?
?                     ?             ?          ?

WDC Schema.org Table Corpus
Table Augmentation
- Augment an input table with information from a table corpus
- Task not trivial due to heterogeneity and size of table corpus

Actor                   Movie
Jesse Eisenberg         Social Network, The
Andrew Garfield         Social Network, The
Colin Firth             King's Speech, The
Helena Bonham Carter    King's Speech, The
Geoffrey Rush           King's Speech, The

Movie                 Director
Biutiful
True Grit             Joel Coen, Ethan Coen
The Social Network    David Fincher
The King's Speech     Tom Hooper
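To make the augmentation task concrete, here is a minimal sketch in Python that fills the "?" cells of an input table from two small corpus tables. It is a simple key-based lookup, not a transformer-based approach and not the project's method. The function names `normalize_title` and `augment`, the `ATTRIBUTE_ALIASES` mapping, and the row layout (one corpus row per actor or director, `None` for missing cells) are assumptions made for illustration. The sample values are taken from the slide.

```python
from collections import defaultdict

# Hypothetical mapping from corpus column names to the input table's schema.
ATTRIBUTE_ALIASES = {"Actor": "Actors"}


def normalize_title(title: str) -> str:
    """Map 'Social Network, The' and 'The Social Network' to the same key."""
    title = title.strip()
    if title.endswith(", The"):
        title = "The " + title[: -len(", The")]
    return title.lower()


def augment(input_rows, corpus_tables, key="Movie"):
    """Fill None cells of input_rows with values found in corpus_tables."""
    # Index corpus values by normalized entity name and attribute.
    index = defaultdict(lambda: defaultdict(list))
    for table in corpus_tables:
        for row in table:
            entity = normalize_title(row[key])
            for attribute, value in row.items():
                if attribute == key:
                    continue
                attribute = ATTRIBUTE_ALIASES.get(attribute, attribute)
                if value not in index[entity][attribute]:
                    index[entity][attribute].append(value)

    augmented_rows = []
    for row in input_rows:
        new_row = dict(row)
        found = index.get(normalize_title(row[key]), {})
        for attribute in row:
            # Only fill cells that are missing in the input table.
            if row[attribute] is None and attribute in found:
                new_row[attribute] = list(found[attribute])
        augmented_rows.append(new_row)
    return augmented_rows


input_table = [
    {"Movie": "The Social Network", "Premiere": "10/01/2010",
     "Director": None, "Actors": None},
    {"Movie": "The King's Speech", "Premiere": "1/07/2010",
     "Director": None, "Actors": None},
    {"Movie": "127 Hours", "Premiere": "09/04/2010",
     "Director": None, "Actors": None},
]

actor_table = [
    {"Actor": "Jesse Eisenberg", "Movie": "Social Network, The"},
    {"Actor": "Andrew Garfield", "Movie": "Social Network, The"},
    {"Actor": "Colin Firth", "Movie": "King's Speech, The"},
    {"Actor": "Helena Bonham Carter", "Movie": "King's Speech, The"},
    {"Actor": "Geoffrey Rush", "Movie": "King's Speech, The"},
]

director_table = [
    {"Movie": "The Social Network", "Director": "David Fincher"},
    {"Movie": "The King's Speech", "Director": "Tom Hooper"},
]

augmented = augment(input_table, [actor_table, director_table])
for row in augmented:
    print(row)
```

The sketch only handles one kind of heterogeneity (the "Title, The" spelling and one renamed column) and scans every corpus table in full. Rows without a match in the corpus, such as 127 Hours here, keep their missing values. This illustrates why the task is not trivial for a large and heterogeneous table corpus.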
How can we use transformer-based models for Table Augmentation?
Universität Mannheim – Bizer/Brinkmann: Team Project FSS2021 – Slide 1
Project Goal
Experiment with State-of-the-Art NLP Transformer Models and use them to search for Tabular Data

Involves
− Data Profiling, Data Preprocessing
− Model Training, Evaluation, Selection

Learning Targets
− Gain technical experience with State-of-the-Art Data Search Technologies
− Gain work experience as Data Scientist

Requirements
− Data Science & Engineering Skills, Programming Experience (Python)
− Relevant Courses: Web Data Integration, Data Mining I & II, Information Retrieval & Web Search

Organization: 4-6 people, 6 months, work as a complete team and in subgroups
Instructors: Alexander Brinkmann, Christian Bizer