Fast Foreign-Key Detection in Microsoft SQL Server PowerPivot for Excel Zhimin Chen Vivek Narasayya Surajit Chaudhuri Microsoft Research Microsoft Research Microsoft Research
[email protected] [email protected] [email protected] ABSTRACT stored in a relational database, which they can import into Excel. Microsoft SQL Server PowerPivot for Excel, or PowerPivot for Other sources of data are text files, web data feeds or in general any short, is an in-memory business intelligence (BI) engine that tabular data range imported into Excel. enables Excel users to interactively create pivot tables over large data sets imported from sources such as relational databases, text files and web data feeds. Unlike traditional pivot tables in Excel that are defined on a single table, PowerPivot allows analysis over multiple tables connected via foreign-key joins. In many cases however, these foreign-key relationships are not known a priori, and information workers are often not be sophisticated enough to define these relationships. Therefore, the ability to automatically discover foreign-key relationships in PowerPivot is valuable, if not essential. The key challenge is to perform this detection interactively and with high precision even when data sets scale to hundreds of millions of rows and the schema contains tens of tables and hundreds of columns. In this paper, we describe techniques for fast foreign-key detection in PowerPivot and experimentally evaluate its accuracy, performance and scale on both synthetic benchmarks and real-world data sets. These techniques have been incorporated into PowerPivot for Excel. Figure 1. Example of pivot table in Excel. It enables multi- dimensional analysis over a single table.