A Cover Sheet

Title: Data Intensive Computing: Scalable, Social Data Analysis

PI: Maneesh Agrawala, Associate Professor
Phone: 510-643-8220
E-Mail: [email protected]

Co-PI: Jeffrey Heer, Assistant Professor
E-Mail: [email protected]

Co-PI: Joseph M. Hellerstein, Professor
E-Mail: [email protected]

Contents

A Cover Sheet
B Table of Contents and List of Figures
C Project Summary
D Project Description
    1 Introduction
    2 Motivation and Objectives
        2.1 Social Data Analysis
        2.2 Interacting with Big Data
        2.3 Surfacing Social Context and Activity
    3 Data Model for Scalable Collaborative Analysis
    4 Surfacing Social Context
    5 Collaborating with Big Data
        5.1 Interacting with Scalable Statistics
        5.2 Modeling and Discussing Data Transformations and Provenance
    6 Applications
        6.1 CommentSpace
        6.2 Using E-mail to Bootstrap Analysis of Social Context
        6.3 Bellhop: Interactive, Collaborative Data Profiling
    7 Evaluation: Metrics and Methodology
    8 Results from Prior NSF Funding
E Collaboration Plan
    E.1 PI Roles and Responsibilities
    E.2 Project Management Across Investigators, Institutions and Disciplines
    E.3 Specific Coordination Mechanisms
    E.4 Budget Line Items Supporting Coordination Mechanisms
References Cited

List of Figures

1 Collaborative sensemaking in Sense.us
2 Online aggregation interface
3 Enron e-mail corpus viewer