Halifax, Nova Scotia - Canada
August 13 - 17, 2017
23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Contents
KDD 2017 Agenda at a Glance
KDD 2017 Chairs’ Welcome Message
Program Highlights
  Keynote Talks
  Research and Applied Data Science Tracks
  Applied Data Science Track Invited Talks
  Applied Data Science Panel
  KDD Panel
  Tutorials
  Hands-On Tutorials
  Workshops
KDD 2017 Tutorial Program
KDD 2017 Workshop Program
  Full-Day Workshops - Monday August 14, 8:00am - 5:00pm
  Half Day Workshops - Monday August 14, 8:00am - 12:00pm
  Half Day Workshops - Monday August 14, 1:00pm - 5:00pm
  KDD Cup Workshop - Wednesday August 16, 1:30pm - 5:00pm
KDD 2017 Hands-On Tutorial Program
  Tuesday August 15, 2017
  Wednesday August 16, 2017
  Thursday August 17, 2017
KDD 2017 Conference Program
  Monday August 14, 2017 Detailed Program
    Monday August 14, 2017 5:15pm – 7:00pm, KDD 2017 Opening Session - Scotiabank Centre
  Tuesday August 15, 2017 Detailed Program
  Wednesday August 16, 2017 Detailed Program
  Thursday August 17, 2017 Detailed Program
KDD 2017 Conference Organization
  KDD 2017 Organizing Committee
  Research Track Senior Program Committee
  Applied Data Science Track Senior Program Committee
  Research Track Program Committee
  Applied Data Science Track Program Committee
KDD 2017 Sponsors & Supporters
Halifax, Points of Interest
Useful Links and Emergency Contacts

KDD 2017 Agenda at a Glance

Saturday, August 12th
● 8:00AM - 5:00PM: Workshop: Broadening Participation in Data Mining (BPDM) - Day 1 [Level 8 - Summit Suite/Meeting Room 5]
● 4:00PM - 6:00PM: KDD 2017 Registration [Level 1 Atrium]

Sunday, August 13th (TUTORIAL DAY)
● 7:30AM - 5:00PM: KDD 2017 Registration [Level 1 - Atrium]
● 8:00AM - 5:00PM: Workshop: Broadening Participation in Data Mining (BPDM) - Day 2 [Level 8 - Summit Suite/Meeting Room 5]
● 8:00AM - 12:00PM: Tutorial 4: Network Embedding - Enabling Network Analytics and Inference in Vector Space [Room 200C1]
● 8:00AM - 12:00PM: Tutorial 1: Mining Entity-Relation-Attribute Structures from Massive Text Data [Room 200C2]
● 8:00AM - 12:00PM: Tutorial 9: Deep Learning for Personalized Search and Recommender Systems [Room 200D]
● 8:00AM - 12:00PM: Tutorial 6: Time Series Data Mining Using Matrix Profiling [Room 200E]
● 8:00AM - 12:00PM: Tutorial 8: Data Mining in Unusual Domains with Information-rich Knowledge Graph Construction, Inference and Search [Suite 202-203]
● 8:00AM - 12:00PM: Tutorial 10: Safe Data Analytics: Theory, Algorithms, and Applications [Suite 204-205]
● 8:00AM - 12:00PM: Tutorial 2: Urban Computing: Enabling Intelligent Cities with AI and Big Data [Suite 301]
● 8:00AM - 12:00PM: Tutorial 3: Non-IID Learning [Suite 302]
● 8:00AM - 12:00PM: Tutorial 5: From Theory to Data Product - Applying Data Science Methods to Effect Business Change [Suite 303]
● 8:00AM - 12:00PM: Tutorial 11: IoT in Practice: Case Studies in Data Analytics, with Issues in Privacy and Security [Suite 304]
● 8:00AM - 12:00PM: Tutorial 20: Athlytics: Data Mining and Machine Learning for Sports Analytics [Suite 305-306]
● 10:00AM - 10:30AM: KDD Coffee Break [Level 2 & 3 Foyer]
● 12:00PM - 1:00PM: Lunch (On Own)
● 1:00PM - 5:00PM: Tutorial 22: Smart Analytics for Big Time-series Data [Room 200C1]
● 1:00PM - 5:00PM: Tutorial 13: Machine Learning for Survival Analysis: Theory, Algorithms and Applications [Room 200C2]
● 1:00PM - 5:00PM: Tutorial 21: Learning Representations of Large-scale Networks [Room 200D]
● 1:00PM - 5:00PM: Tutorial 7: Recent Advances in Feature Selection [Room 200E]
● 1:00PM - 5:00PM: Tutorial 19: System Event Mining - Algorithms and Applications [Suite 202-203]
● 1:00PM - 5:00PM: Tutorial 16: A Critical Review of Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries [Suite 204-205]
● 1:00PM - 5:00PM: Tutorial 14: Large Scale Hierarchical Classification: Foundations, Algorithms and Applications [Suite 301]
● 1:00PM - 5:00PM: Tutorial 15: A/B Testing at Scale: Accelerating Software Innovation [Suite 302]
● 1:00PM - 5:00PM: Tutorial 17: Data-Driven Approaches towards Malicious Behavior Modeling [Suite 303]
● 1:00PM - 5:00PM: Tutorial 12: Making Better Use of the Crowd [Suite 304]
● 1:00PM - 5:00PM: Tutorial 18: Context-Rich Recommendation: Integrating Links, Text, and Spatio-Temporal Dimensions [Suite 305-306]
● 3:00PM - 3:30PM: KDD Coffee Break [Level 2 & 3 Foyer]

Monday, August 14th (WORKSHOP DAY/POSTER RECEPTION 1)
● 7:30AM - 5:00PM: KDD 2017 Registration [Level 1 - Atrium]
● 8:00AM - 12:00PM: Workshop 14: Machine Learning for Creativity [Level 8 - Summit Suite/Meeting Room 5]
● 8:00AM - 12:00PM: Workshop 11: Workshop on Causal Discovery [Room 200C1]
● 8:00AM - 12:00PM: Workshop 2: Big Data, IoT Streams and Heterogeneous Source Mining (Full-Day) [Room 200C2]
● 8:00AM - 12:00PM: Workshop 1: Mining and Learning from Time Series (Full-Day) [Room 200D]
● 8:00AM - 12:00PM: Workshop 4: Mining and Learning with Graphs (Full-Day) [Room 200E]
● 8:00AM - 12:00PM: Workshop 5: Fairness, Accountability, Transparency in Machine Learning (FATML) (Full-Day) [Suite 202]
● 8:00AM - 12:00PM: Workshop 6: International Workshop on Data Mining in Bioinformatics [Suite 203]
● 8:00AM - 12:00PM: Workshop 15: Advancing Education with Data (Full-Day) [Suite 204]
● 8:00AM - 12:00PM: Workshop 7: Urban Computing (Full-Day) [Suite 205]
● 8:00AM - 12:00PM: Workshop 3: Interactive Data Exploration and Analytics (IDEA) (Full-Day) [Suite 301]
● 8:00AM - 12:00PM: Workshop 12: Medical Informatics and Healthcare [Suite 302]
● 8:00AM - 12:00PM: Workshop 13: Big Data Analytics-as-a-Service [Suite 303]
● 8:00AM - 12:00PM: Workshop 9: Data Science for Intelligent Food, Energy and Water [Suite 304]
● 8:00AM - 12:00PM: Workshop 8: Data Science & Journalism (Full-Day) [Suite 305]
● 8:00AM - 12:00PM: Workshop 10: 2017 Edition of AdKDD and TargetAd (Full-Day) [Suite 306]
● 10:00AM - 10:30AM: KDD Coffee Break [Level 2 & 3 Foyer]
● 12:00PM - 1:00PM: Lunch (On Own)
● 1:00PM - 5:00PM: Workshop 18: Data Driven Discovery [Level 8 - Summit Suite/Meeting Room 5]
● 1:00PM - 5:00PM: Workshop 19: Anomaly Detection in Finance [Room 200C1]
● 1:00PM - 5:00PM: Workshop 2: Big Data, IoT Streams and Heterogeneous Source Mining (Full-Day) [Room 200C2]
● 1:00PM - 5:00PM: Workshop 1: Mining and Learning from Time Series (Full-Day) [Room 200D]
● 1:00PM - 5:00PM: Workshop 4: Mining and Learning with Graphs (Full-Day) [Room 200E]
● 1:00PM - 5:00PM: Workshop 5: Fairness, Accountability, Transparency in Machine Learning (FATML) (Full-Day) [Suite 202]
● 1:00PM - 5:00PM: Workshop 20: Workshop on Issues of Sentiment Discovery and Opinion (WISDOM) [Suite 203]
● 1:00PM - 5:00PM: Workshop 15: Advancing Education with Data (Full-Day) [Suite 204]
● 1:00PM - 5:00PM: Workshop 7: Urban Computing (Full-Day) [Suite 205]
● 1:00PM - 5:00PM: Workshop 3: Interactive Data Exploration and Analytics (IDEA) (Full-Day) [Suite 301]
● 1:00PM - 5:00PM: Workshop 16: Machine Learning Meets Fashion [Suite 302]
● 1:00PM - 5:00PM: Workshop 17: Machine Learning for Prognostics and Health Management [Suite 303]
● 1:00PM - 5:00PM: Workshop 9: Data Science for Intelligent Food, Energy and Water [Suite 304]
● 1:00PM - 5:00PM: Workshop 8: Data Science & Journalism (Full-Day) [Suite 305]
● 1:00PM - 5:00PM: Workshop 10: 2017 Edition of AdKDD and TargetAd (Full-Day) [Suite 306]
● 3:00PM - 3:30PM: KDD Coffee Break [Level 2 & 3 Foyer]
● 4:00PM - 5:00PM: Poster Reception Presenter Set-Up [WTCC Room 100]
● 5:15PM - 7:00PM: KDD 2017 Opening Session [Scotiabank Centre]
● 7:00PM - 10:00PM: Poster Reception: Group 1 [WTCC Room 100]

Tuesday, August 15th (MAIN CONFERENCE DAY)
● 7:00AM - 5:00PM: KDD 2017 Registration [Level 1 - Atrium]
● 9:30AM - 6:00PM: Sponsor Room [Level 8 - Meeting Room 4]
● 9:30AM - 6:00PM: KDD Exhibit Hall [Scotiabank Centre]
● 7:00AM - 8:00AM: KDD Breakfast [Level 3 Foyer]
● 8:00AM - 9:30AM: Keynote: Bin Yu - Three Principles of Data Science: Predictability, Stability, and Computability [Scotiabank Centre]
● 8:30AM - 12:00PM: Hands-On Tutorial: Amazon Web Services & MxNET [Suite 301-303]
● 8:30AM - 12:00PM: Hands-On Tutorial: Anomaly Detection in Networks [Suite 304-306]
● 10:00AM - 1:00PM: India Chapter Meeting - Data Science in India [Level 8 - Summit Suite/Meeting Room 5]
● 9:30AM - 10:00AM: KDD Coffee Break [Level 2 & 3 Foyer / Exhibit Hall]
● 10:00AM - 12:00PM: AT1: Platforms and Infrastructure [Room 200C]
● 10:00AM - 12:00PM: AI1: Understanding Behavior with Data Science [Room 200D]
● 10:00AM - 12:00PM: RT1: Kernels and Sketches [Room 200E]
● 10:00AM - 12:00PM: RT2: Temporal Analysis [Suite 202-205]
● 12:00PM - 1:30PM: KDD Lunch [Scotiabank Centre]
● 12:00PM - 1:30PM: KDD Women's Lunch (Ticket Required) [Prince George Hotel - Windsor Room]
● 1:30PM - 5:30PM: Hands-On Tutorial: Amazon Web Services & MxNET [Suite 301-303]
● 1:30PM - 5:30PM: Hands-On Tutorial: Massive Online Analytics [Suite 304-306]
● 1:30PM - 5:00PM: Chapter Meeting [Level 8 - Summit Suite/Meeting Room 5]
● 1:30PM - 3:30PM: AT2: Novel Applications 1 [Room 200C]
● 1:30PM - 3:30PM: AI2: Applied Machine Learning [Room 200D]
● 1:30PM - 3:30PM: RT3: Graphs I [Room 200E]
● 1:30PM - 3:30PM: RT4: Supervised Learning I [Suite 202-205]
● 3:30PM - 4:00PM: KDD Coffee Break [Level 2 & 3 Foyer / Exhibit Hall]
● 4:00PM - 6:00PM: AT3: Medical Data [Room 200C]
● 4:00PM - 6:00PM: RT5: Deep Learning [Room 200DE]
● 4:00PM - 6:00PM: Dissertation Award [Suite 202-205]
● 6:30PM - 10:00PM: KDD 2017 Banquet [Cunard Centre]

Wednesday, August 16th (MAIN CONFERENCE DAY/POSTER RECEPTION 2)
● 8:00AM - 5:00PM: KDD 2017 Registration [Level 1 Atrium]
● 9:30AM - 6:00PM: Sponsor Room [Level 8 - Meeting Room 4]
● 9:30AM - 12:00PM / 1:30PM - 6:00PM: KDD Exhibit Hall (please note, hall closed during KDD Business Lunch) [Scotiabank Centre]
● 7:00AM - 8:00AM: KDD Breakfast [Level 3 Foyer]
● 8:00AM - 9:30AM: Keynote: Cynthia Dwork - What's Fair? [Scotiabank Centre]
● 8:30AM - 12:00PM: Hands-On Tutorial: META: A Unifying Framework for the Management and Analysis of Text Data [Suite 301-303]
● 10:00AM - 12:00PM: Meet the Editors Panel [Level 8 - Summit Suite/Meeting Room 5]
● 9:30AM - 10:00AM: KDD Coffee Break [Level 2 & 3 Foyer / Exhibit Hall]
● 10:00AM - 12:00PM: AT4: Networks and Graphs [Room 200C]
● 10:00AM - 12:00PM: AI3: Intelligent Systems and Data Science [Room 200D]
● 10:00AM - 12:00PM: RT6: Graphs II [Room 200E]
● 10:00AM - 12:00PM: RT7: Methodology [Suite 202-205]
● 12:00PM - 1:30PM: KDD Business Lunch [Scotiabank Centre]
● 1:30PM - 5:30PM: Hands-On Tutorial: Using R for Scalable Data Science: Single Machines to Hadoop Spark Clusters [Suite 301-303]
● 1:30PM - 5:30PM: Hands-On Tutorial: Declarative, Large-Scale Machine Learning with Apache SystemML [Suite 304-306]
● 1:30PM - 5:30PM: KDD Cup Workshop [Level 8 - Summit Suite/Meeting Room 5]
● 1:30PM - 3:30PM: AT5: Novel Applications 2 [Room 200C]
● 1:30PM - 3:30PM: AI4: Management and Benchmarks [Room 200D]
● 1:30PM - 3:30PM: RT8: Representations [Room 200E]
● 1:30PM - 3:30PM: KDD Panel: The Future of Artificially Intelligent Assistants [Suite 202-205]
● 3:30PM - 4:00PM: KDD Coffee Break [Level 2 & 3 Foyer / Exhibit Hall]
● 4:00PM - 6:00PM: AT6: Urban Planning [Room 200C]
● 4:00PM - 6:00PM: AI5: Panel on Benchmarks and Process Management in Data Science: Will We Ever Get Over this Mess? (Moderator: Usama Fayyad) [Room 200D]
● 4:00PM - 6:00PM: RT9: Matrices [Room 200E]
● 4:00PM - 6:00PM: RT10: Clustering [Suite 202-205]
● 6:00PM - 10:00PM: Poster Reception: Group 2 [WTCC Room 100]

Thursday, August 17th (MAIN CONFERENCE DAY)
● 8:30AM - 12:00PM: KDD 2017 Registration [Level 1 Atrium]
● 9:30AM - 12:00PM: Sponsor Room [Level 8 - Meeting Room 4]
● 9:30AM - 12:30PM: KDD Exhibit Hall [Scotiabank Centre]
● 7:00AM - 8:00AM: KDD Breakfast [Level 3 Foyer]
● 8:00AM - 9:30AM: Keynote: Renée Miller - The Future of Data Integration [Scotiabank Centre]
● 8:30AM - 12:00PM: Hands-On Tutorial: TensorFlow [Suite 301-303]
● 8:30AM - 12:00PM: Hands-On Tutorial: Cloud based Data Mining Tools for Storage, Distributed Processing, and Machine Learning Systems for Scientific Data [Suite 304-306]
● 9:30AM - 10:00AM: KDD Coffee Break [Level 2 & 3 Foyer / Exhibit Hall]
● 10:00AM - 12:00PM: AT7: Web Applications [Room 200C]
● 10:00AM - 12:00PM: RT13: Recommenders [Room 200D]
● 10:00AM - 12:00PM: RT11: Supervised Learning II [Room 200E]
● 10:00AM - 12:00PM: RT12: Humans and Crowds [Suite 202-205]
● 12:00PM - 12:30PM: KDD Coffee Break [Level 2 Foyer]
● 12:30PM - 1:00PM: KDD 2017 Closing Session [Room 200CD]
● 2:00PM - 5:00PM: KDD Ocean Data Analytics Industry Connector [Level 8 - Summit Suite/Meeting Room 5]

KDD 2017 Chairs’ Welcome Message

It is our great pleasure to welcome you to the 2017 ACM Conference on Knowledge Discovery and Data Mining – KDD 2017. We hope that the content and the professional networking opportunities at KDD 2017 will help you to succeed professionally by enabling you to: identify new technology trends; learn from contributed papers, presentations, and posters; discover new tools, processes and practices; identify new job opportunities; and hire new team members.

The terms “Data Science”, “Data Mining” and “Big Data” have, in the last few years, grown out of research labs and gained a presence in the media and in everyday conversations. We also hear these terms on social media and from decision makers at various levels of government and in corporations. The impact of these technologies is felt in almost every walk of life. Importantly, the current rapid progress in data science is facilitated by the timely sharing of newly discovered and developed representations and algorithms between those working in research and those interested in industrial deployment. It has been the hallmark of past KDD conferences that they bridge theory and practice, serving as the great facilitator and catalyst for this exchange. Researchers and practitioners meet in person and interact in a meaningful way over several days. The conference program, with its three parallel tracks (the Research Track, the Applied Data Science Track and the Applied Invited Speakers Track), brings the two groups together. Participants are welcome to attend any track freely, as well as the events common to all tracks.

The conference this year continues its tradition of a strong tutorial and workshop program on leading-edge issues in data mining during the first two days of the program. The last three days are devoted to contributed technical papers describing both novel, important research contributions and deployed, innovative solutions. Three keynote talks, by Cynthia Dwork, Bin Yu, and Renée J. Miller, touch on some of the hard, emerging issues before the field of data mining. With a growing industry around AI assistants, our KDD Panel brings together industry experts in this field to spark discussion and an exchange of ideas. We have an outstanding lineup of industry speakers sharing their experience and expertise in deploying industrial data mining solutions. We continue a strong hands-on tutorial program, in which participants will learn how to use practical data science tools. In order to broaden the impact of KDD and to increase the participation of attendees who would greatly benefit from the conference but would otherwise have found it financially challenging to attend, we reserved a substantial budget for travel grants. KDD 2017 awarded a record USD 145k for student travel and also set aside USD 25k to enable smaller startups to attend. With the new “Meet the Experts” sessions, KDD 2017 also gives researchers and practitioners a unique opportunity to form professional networks and to share their perspectives with others interested in the various aspects of data science. We hope that the KDD 2017 conference will serve as a meeting ground for researchers, practitioners, funding agencies and investors to help create new algorithms and commercial products.

The table below summarizes numerically different elements of the conference program and provides acceptance rates, whenever applicable.

Venue or Track                      Reviewed   Accepted    Acceptance Rate (%)
Research Track Papers               748        64*, 66+    8.6*, 8.8+
Applied Data Science Track Papers   390        36*, 50+    9.2*, 12.8+
Workshops                           36         22          61.1
Tutorials                           35         22          62.8
Hands-on Tutorials                  8          8           NA
Applied Data Science Talks          Invited    11          NA
Regular Keynotes                    Invited    3           NA
Panels                              Invited    2           NA
Papers: * oral, + poster

Putting together KDD 2017 has been a wonderful team effort by the members of the organizing committee. We thank the authors and the speakers for providing the content of the program. We are grateful to the program committee and the senior program committee, who worked very hard in reviewing papers and providing feedback for authors. Finally, we thank the numerous sponsors and ACM SIGKDD for their support in hosting the conference. We hope that you will find this program interesting and thought-provoking, and that the conference will provide you with a valuable opportunity to share ideas with other researchers and practitioners from institutions around the world. We also hope that the conference facilities, and the host city of Halifax, Nova Scotia, Canada, will provide a friendly environment and many opportunities for such fruitful discussions and exchanges.

KDD’17 Chairs
Stan Matwin (General Chair)
Shipeng Yu (General Chair)
Faisal Farooq (Associate General Chair)

Program Highlights

Keynote Talks
● Bin Yu, Professor, University of California at Berkeley. Three Principles of Data Science: Predictability, Stability, and Computability
● Cynthia Dwork, Distinguished Scientist, / Harvard University. What’s Fair?
● Renée J. Miller, Professor, University of Toronto. The Future of Data Integration

Research and Applied Data Science Tracks
● 130 Research Track Papers
● 86 Applied Data Science Track Papers

Applied Data Science Track Invited Talks
● Andy Berglund, Professor, University of . Mining Big Data in NeuroGenetics to Understand Muscular Dystrophy
● Mainak Mazumdar, EVP - Chief Research Officer, Nielsen. Addressing challenges with Big Data for Media Measurement
● Paritosh Desai, SVP Enterprise Data and Analytics, Target. It Takes More than Math and Science to Hit the Bullseye with Data
● Rajesh Parekh, Director of Analytics, Facebook. Designing AI at Scale to Power Everyday Life
● Josh Bloom, VP Data and Analytics, GE. Industrial Machine Learning
● Longbing Cao, Professor, University of Technology Sydney. Behavior Informatics to Discover Behavior Insight for Active and Tailored Client Management
● Vipin Kumar, Professor, University of Minnesota. Big Data in Climate: Opportunities and Challenges for Machine Learning
● David Potere, Professor, Tellus Labs. Spaceborne Data Enters the Mainstream
● Jonathan How, Professor, MIT. Planning and Learning under Uncertainty: Theory and Practice
● Eduardo Ariño de la Rubia, Chief Data Scientist, Domino Data Lab. More than the Sum of its Parts: Building Domino Data Lab
● Szilard Pafka, Chief Scientist, Epoch. Machine Learning Software in Practice: Quo Vadis?

Applied Data Science Panel
● Moderator: Usama M. Fayyad, CEO, Open Insights. Benchmarks and Process Management in Data Science: Will We Ever Get Over the Mess?

KDD Panel
● Moderators: Muthu Muthukrishnan, Professor, Rutgers University and Andrew Tomkins, Engineering Director, Google. The Future of Artificially Intelligent Assistants

Tutorials
● Mining Entity-Relation-Attribute Structures from Massive Text Data. Jingbo Shang (University of Illinois, Urbana-Champaign), Xiang Ren (University of Illinois, Urbana-Champaign), Meng Jiang (University of Illinois, Urbana-Champaign), Jiawei Han (University of Illinois, Urbana-Champaign)
● Urban Computing: Enabling Intelligent Cities with AI and Big Data. Yu Zheng (Microsoft Research)
● Non-IID Learning. Longbing Cao (UTS), Philip Yu (UIC), Guangsong Pang (UTS), Chengzhang Zhu (UTS)
● Making Better Use of the Crowd. Jennifer Wortman Vaughan (Microsoft Research)
● Machine Learning for Survival Analysis: Theory, Algorithms and Applications. Chandan K. Reddy (Virginia Tech), Yan Li (Univ. of Michigan)
● Large Scale Hierarchical Classification: Foundations, Algorithms and Applications. Huzefa Rangwala (George Mason University), Azad Naik (Microsoft)
● A/B Testing at Scale: Accelerating Software Innovation. Alex Deng (Microsoft), Pavel Dmitriev (Microsoft), Somit Gupta (Microsoft), Ron Kohavi (Microsoft), Paul Raff (Microsoft), Lukas Vermeer (Booking.com)
● Network Embedding - Enabling Network Analytics and Inference in Vector Space. Peng Cui (Tsinghua University), Jian Pei (Simon Fraser University), Wenwu Zhu (Tsinghua University)
● From Theory to Data Product - Applying Data Science Methods to Effect Business Change. Danielle Leighton (T4G Limited), Lindsay Brin (T4G Limited), Janet Forbes (T4G Limited)
● A Critical Review of Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries. Alexandra Olteanu (IBM Research), Emre Kiciman (Microsoft Research), Carlos Castillo (Eurecat, Spain), Fernando Diaz (Spotify, US)
● Time Series Data Mining Using Matrix Profiling: A Unifying View of Motif Discovery, Anomaly Detection, Segmentation, Classification, and Similarity Joins. Abdullah Mueen (University of New Mexico), Eamonn Keogh (University of California, Riverside)
● Data-Driven Approaches towards Malicious Behavior Modeling. Meng Jiang (University of Notre Dame), Srijan Kumar (Stanford University), VS Subrahmanian (University of Maryland, College Park), Christos Faloutsos (CMU)
● Recent Advances in Feature Selection: A Data Perspective. Jundong Li (ASU), Jiliang Tang (MSU), Huan Liu (ASU)
● Context-Rich Recommendation: Integrating Links, Text, and Spatio-Temporal Dimensions. Yizhou Sun (UCLA), Xiang Ren (UIUC), Hongzhi Yin (The University of Queensland)
● System Event Mining: Algorithms and Applications. Tao Li (Florida International University / Nanjing University of Posts and Telecommunications), Larisa Shwartz (IBM Research), Genady Ya Grabarnik (St. John's University)
● Data Mining in Unusual Domains with Information-rich Knowledge Graph Construction, Inference and Search. Mayank Kejriwal (USC-ISI), Pedro Szekely (USC-ISI)
● Deep Learning for Personalized Search and Recommender Systems. Ganesh Venkataraman, Nadia Fawaz, Saurabh Kataria, Benjamin Le, Liang Zhang (LinkedIn Corp.)
● Safe Data Analytics: Theory, Algorithms, and Applications. Jun Huan (KU), Chao Lan (KU), Xiaoli Li (KU)
● Athlytics: Data Mining and Machine Learning for Sports Analytics. Konstantinos Pelechrinis (University of Pittsburgh), Benjamin Alamar (ESPN), Evangelos Papalexakis (University of California, Riverside)
● Learning representations of large-scale networks. Jian Tang (Umich), Cheng Li (Umich), Qiaozhu Mei (Umich)
● IoT in Practice: Case Studies in Data Analytics, with Issues in Privacy and Security. Albert Bifet (Telecom ParisTech), Latifur Khan (University of Texas at Dallas), Joao Gama (University of Porto), Wei Fan (Baidu Research Big Data Lab)
● Smart Analytics for Big Time-series Data. Yasushi Sakurai (Kumamoto University), Yasuko Matsubara (Kumamoto University), Christos Faloutsos (Carnegie Mellon University)

Hands-On Tutorials
● Amazon Web Services & MxNET. Alex Smola (Amazon), Joseph Spisak (Amazon), Mu Li (Amazon)
● Anomaly Detection in Networks. Veena B. Mendiratta (Nokia Bell Labs)
● Massive Online Analytics. Bernhard Pfahringer (University of Waikato), Albert Bifet (Telecom-ParisTech)
● META: A Unifying Framework for the Management and Analysis of Text Data. Chase Geigle (UIUC), ChengXiang Zhai (UIUC)
● Using R for Scalable Data Science: Single Machines to Hadoop Spark Clusters. Robert Horton (Microsoft), Mario Inchiosa (Microsoft), Vanja Paunic (Microsoft), Hang Zhang (Microsoft)
● Declarative, Large-Scale Machine Learning with Apache SystemML. Matthias Boehm (IBM Research), Alexandre Evfimievski (IBM Research), Niketan Pansare (IBM Research), Berthold Reinwald (IBM Research), Prithvi Sen (IBM Research)
● TensorFlow. Rajat Monga (Google), Martin Wicke (Google), Daniel ‘Wolff’ Dobson (Google), Joshua Gordon (Google)
● Cloud based Data Mining Tools for Storage, Distributed Processing, and Machine Learning Systems for Scientific Data. Vani Mandava (Microsoft Research), Dennis Gannon (Indiana University)

Workshops

Full-day Workshops:
● Workshop 1: Mining and Learning from Time Series
● Workshop 2: Big Data, IoT Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
● Workshop 3: Interactive Data Exploration and Analytics (IDEA)
● Workshop 4: Mining and Learning with Graphs
● Workshop 5: Fairness, Accountability, Transparency in Machine Learning (FATML)
● Workshop 7: Urban Computing
● Workshop 8: Data Science + Journalism
● Workshop 9: Data Science for Intelligent Food, Energy and Water
● Workshop 10: 2017 Edition of AdKDD and TargetAd
● Workshop 15: Advancing Education With Data

Half-day, Morning Workshops:
● Workshop 6: International Workshop on Data Mining in Bioinformatics
● Workshop 11: Workshop on Causal Discovery
● Workshop 12: Medical Informatics and Healthcare
● Workshop 13: Big data analytics-as-a-Service: Architecture, Algorithms, and Application in Health Informatics
● Workshop 14: Machine Learning for Creativity

Half-day, Afternoon Workshops:
● Workshop 16: Machine Learning Meets Fashion: Data, algorithms and analytics for the fashion industry
● Workshop 17: Machine Learning for Prognostics and Health Management
● Workshop 18: Data-Driven Discovery
● Workshop 19: Anomaly Detection in Finance
● Workshop 20: Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM)

KDD 2017 Tutorial Program

Sunday August 13, 2017

8:00am-12:00pm (Morning)

Tutorial 1 – Room 200C2
Title: Mining Entity-Relation-Attribute Structures from Massive Text Data
Instructors: Jingbo Shang (University of Illinois, Urbana-Champaign), Xiang Ren (University of Illinois, Urbana-Champaign), Meng Jiang (University of Illinois, Urbana-Champaign), Jiawei Han (University of Illinois, Urbana-Champaign)

Tutorial 2 – Suite 301
Title: Urban Computing: Enabling Intelligent Cities with AI and Big Data
Instructors: Yu Zheng (Microsoft Research)

Tutorial 3 – Suite 302
Title: Non-IID Learning
Instructors: Longbing Cao (UTS), Philip Yu (UIC), Guangsong Pang (UTS), Chengzhang Zhu (UTS)

Tutorial 4 – Room 200C1
Title: Network Embedding - Enabling Network Analytics and Inference in Vector Space
Instructors: Peng Cui (Tsinghua University), Jian Pei (Simon Fraser University), Wenwu Zhu (Tsinghua University)

Tutorial 5 – Suite 303
Title: From Theory to Data Product - Applying Data Science Methods to Effect Business Change
Instructors: Danielle Leighton (T4G Limited), Lindsay Brin (T4G Limited), Janet Forbes (T4G Limited)

Tutorial 6 – Room 200E
Title: Time Series Data Mining Using Matrix Profiling: A Unifying View of Motif Discovery, Anomaly Detection, Segmentation, Classification, and Similarity Joins
Instructors: Abdullah Mueen (University of New Mexico), Eamonn Keogh (University of California, Riverside)

Tutorial 7 – Suite 202-203
Title: Data Mining in Unusual Domains with Information-rich Knowledge Graph Construction, Inference and Search
Instructors: Mayank Kejriwal (USC-ISI), Pedro Szekely (USC-ISI)

Tutorial 8 – Room 200D
Title: Deep Learning for Personalized Search and Recommender Systems
Instructors: Ganesh Venkataraman (LinkedIn Corp.), Nadia Fawaz (LinkedIn Corp.), Saurabh Kataria (LinkedIn Corp.), Benjamin Le (LinkedIn Corp.), Liang Zhang (LinkedIn Corp.)

Tutorial 9 – Suite 204-205
Title: Safe Data Analytics: Theory, Algorithms, and Applications
Instructors: Jun Huan (KU), Chao Lan (KU), Xiaoli Li (KU)

Tutorial 10 – Suite 305-306
Title: Athlytics: Data Mining and Machine Learning for Sports Analytics
Instructors: Konstantinos Pelechrinis (University of Pittsburgh), Benjamin Alamar (ESPN), Evangelos Papalexakis (University of California, Riverside)

Tutorial 11 – Suite 304
Title: IoT in Practice: Case Studies in Data Analytics, with Issues in Privacy and Security
Instructors: Albert Bifet (Telecom ParisTech), Latifur Khan (University of Texas at Dallas), Joao Gama (University of Porto), Wei Fan (Baidu Research Big Data Lab)

1:00pm-5:00pm (Afternoon)

Tutorial 12 – Suite 304
Title: Making Better Use of the Crowd
Instructors: Jennifer Wortman Vaughan (Microsoft Research)

Tutorial 13 – Room 200C2
Title: Machine Learning for Survival Analysis: Theory, Algorithms and Applications
Instructors: Chandan K. Reddy (Virginia Tech), Yan Li (Univ. of Michigan)

Tutorial 14 – Suite 301
Title: Large Scale Hierarchical Classification: Foundations, Algorithms and Applications
Instructors: Huzefa Rangwala (George Mason University), Azad Naik (Microsoft)

Tutorial 15 – Suite 302
Title: A/B Testing at Scale: Accelerating Software Innovation
Instructors: Alex Deng (Microsoft), Pavel Dmitriev (Microsoft), Somit Gupta (Microsoft), Ron Kohavi (Microsoft), Paul Raff (Microsoft), Lukas Vermeer (Booking.com)

Tutorial 16 – Suite 204-205
Title: A Critical Review of Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries
Instructors: Alexandra Olteanu (IBM Research), Emre Kiciman (Microsoft Research), Carlos Castillo (Eurecat, Spain), Fernando Diaz (Spotify, US)

Tutorial 17 – Suite 303
Title: Data-Driven Approaches towards Malicious Behavior Modeling
Instructors: Meng Jiang (University of Notre Dame), Srijan Kumar (Stanford University), VS Subrahmanian (University of Maryland, College Park), Christos Faloutsos (CMU)

Tutorial 18 – Room 200E
Title: Recent Advances in Feature Selection: A Data Perspective
Instructors: Jundong Li (ASU), Jiliang Tang (MSU), Huan Liu (ASU)

Tutorial 19 – Suite 305-306
Title: Context-Rich Recommendation: Integrating Links, Text, and Spatio-Temporal Dimensions
Instructors: Yizhou Sun (UCLA), Xiang Ren (UIUC), Hongzhi Yin (The University of Queensland)

Tutorial 20 – Suite 202-203
Title: System Event Mining: Algorithms and Applications
Instructors: Tao Li (Florida International University / Nanjing University of Posts and Telecommunications), Larisa Shwartz (IBM Research), Genady Ya Grabarnik (St. John's University)

Tutorial 21 – Room 200D
Title: Learning representations of large-scale networks
Instructors: Jian Tang (Umich), Cheng Li (Umich), Qiaozhu Mei (Umich)

Tutorial 22 – Room 200C1
Title: Smart Analytics for Big Time-series Data
Instructors: Yasushi Sakurai (Kumamoto University), Yasuko Matsubara (Kumamoto University), Christos Faloutsos (Carnegie Mellon University)

KDD 2017 Workshop Program

Full-Day Workshops - Monday August 14, 8:00am - 5:00pm

***Please check the workshop web pages for the latest schedules***

Workshop 1: Mining and Learning from Time Series
Room 200D
Http: http://www-bcf.usc.edu/~liu32/milets17/
Organizers: Eamonn Keogh (University of California Riverside), Yan Liu (University of Southern California), Abdullah Mueen (University of New Mexico), Sanjay Purushotham (University of Southern California), Vijay Manikandan Janakiraman (NASA Ames Research Center)

Agenda: ● 8:00 ­ 8:10: Opening Remarks ● 8:10 ­ 9:00: I nvited Talk by Albert Bifet ● 9:00 ­ 10:00: Paper Session 1: D eep Learning and Nonparametric Solutions for time series o Reading the Tea Leaves: A Neural Network Perspective on Technical Trading. Sid Ghoshal and Stephen Roberts o DECADE: A Deep Metric Learning Model for Multivariate Time Series. Zhengping Che, Xinran He, Ke Xu and Yan Liu o Robust Parameter­Free Season Length Detection in Time Series. Maximilian Toller and Roman Kern

● 10:00 ­ 10:30: Coffee Break and Poster Sessions

● 10:30 ­ 11:30: I nvited Talk by Jian Pei o Tracking in Dynamic Networks ● 11:30 ­ 12:00: P oster Spotlights o Temporal Lag estimation and Granger Causality on time series. Indranil Bhattacharya, Arnab Bhattacharyya and Praveen Pankajakshan o Utilizing Artificial Neural Networks to Detect Compound Events in Spatio­Temporal Soccer Data. Keven Richly, Florian Moritz and Christian Schwarz o Automatic Singular Spectrum Analysis and Forecasting. Michele Trovero, Michael Leonard and Bruce Elsheimer o Short­Term Wind Energy Forecasting with Temporally Dependent Neural Network Models. Rui Li, Pu Wang, Jingrui Xie, Alex Chien and Mustafa Kabul o Time Series Classification for Scrap Rate Prediction in Transfer Molding. Anna Mandli, Robert Palovics, Matyas Susits and Andras A. Benczur

● 12:00 ­ 13:00: Lunch Break

● 13:00 ­ 14:00: I nvited Talk b y B enjamin Marlin o Learning With Temporally Uncertain Labels ● 14:00 ­ 15:00: P aper Session 2: S treaming and Trajectory data analysis o Online Thinning for High Volume Streaming Data. Xin Hunt and Rebecca Willett o Sub­string/Pattern Matching in Sub­linear Time Using a Sparse Fourier Transform Approach. Nagaraj Thenkarai Janakiraman, Avinash Vem, Krishna Narayanan and Jean­Francois Chamberland o Coordination Event Detection and Initiator Identification in Time Series Data. Chainarong Amornbunchornvej, Ivan Brugere, Ariana Strandburg­Peshkin, Damien Farine, Margaret Crofoot and Tanya Berger­Wolf

● 15:00 ­ 15:30: Coffee Break and Poster Session

● 15:30 - 16:00: Poster Session
● 16:00 - 16:30: Panel Discussion and Concluding Remarks

Workshop 2: Big Data, IoT Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
Http: https://bigmine.github.io/bigmine17/
Room 200C2
Organizers:
Wei Fan (Baidu Research Big Data Lab)
Albert Bifet (Telecom-ParisTech)
Jesse Read (École Polytechnique)
Qiang Yang (Hong Kong University of Science and Technology)
Philip Yu (University of Illinois at Chicago)

Agenda:
● 8:00 - 8:05: Opening Remarks
● 8:05 - 9:00: Invited Talk by Kai Chen (Hong Kong Univ. of Science & Technology)
  o Towards Datacenter-Scale Deep Learning with Efficient Networking
● 9:00 - 10:00: Paper Session 1

● 10:00 ­ 10:30: Coffee Break

● 10:30 - 11:30: Invited Talk by Jure Leskovec (Stanford University)
  o Mining Online Networks and Communities
● 11:30 - 12:00: Paper Session 2

● 12:00 ­ 13:00: Lunch Break

● 13:00 - 14:00: Invited Talk by Martin Wicke (Google)
  o Learning from Real-World Data with TensorFlow
● 14:00 - 15:00: Paper Session 3

● 15:00 ­ 15:30: Coffee Break

● 15:30 - 16:25: Invited Talk by Huan Liu (Arizona State University)
  o Big Data + Deep Learning = A Universal Solution?
● 16:25 - 16:55: Paper Session 4
● 16:55 - 17:00: Final Remarks

Workshop 3: Interactive Data Exploration and Analytics (IDEA)
Http: http://poloclub.gatech.edu/idea2017/
Suite 301
Organizers:
Jefrey Lijffijt (Ghent University)
Polo Chau (Georgia Tech)
Jilles Vreeken (Max Planck Institute for Informatics, and Saarland University)
Matthijs van Leeuwen (Universiteit Leiden)
Dafna Shahaf (The Hebrew University of Jerusalem)
Christos Faloutsos (Carnegie Mellon)

Agenda:
● 8:15–8:30: Welcome to IDEA'17
● 8:30–9:20: Invited speaker: Rich Caruana (Microsoft Research)
● 9:20–9:40: A Game-theoretic Approach to Data Interaction: A Progress Report. Ben McCamish, Arash Termehchy, and Behrouz Touri
● 9:40–10:00: Exploring Dimensionality Reductions with Forward and Backward Projections. Marco Cavallo and Cagatay Demiralp

● 10:00–10:30: Coffee break

● 10:30–10:50: Foresight: Recommending Visual Insights. Cagatay Demiralp, Peter J. Haas, Srinivasan Parthasarathy, and Tejaswini Pedapati
● 10:50–11:40: Invited speaker: Nathalie Henry Riche (Microsoft Research)

● 12:00–13:00: Lunch break

● 13:00–13:10: Re-welcome
● 13:10–13:30: Interactive Unsupervised Clustering with Clustervision. Bum Chul Kwon, Ben Eysenbach, Janu Verma, Kenney Ng, and Adam Perer
● 13:30–13:50: Incorporating Feedback into Tree-based Anomaly Detection. Shubhomoy Das, Weng-Keen Wong, Alan Fern, Thomas Dietterich, and Md. Amran Siddiqui

● 13:50–14:10: Portable In-Browser Data Cube Exploration. Kareem El Gebaly, Lukasz Golab, and Jimmy Lin
● 14:10–15:00: Invited speaker: Samuel Kaski (Aalto University & Helsinki Institute for Information Technology, HIIT)
● 15:00–15:05: Closing words

● 15:05–15:30: Coffee break (+ poster & demo session)

● 15:30–17:00: Poster & demo session (contributed papers listed above, plus the following)
  ○ Visualizing Wikipedia for Interactive Exploration. Ron Bekkerman and Olga Donin
  ○ DycomDetector: Discover topics using automatic community detections in dynamic networks. Tommy Dang and Vinh Nguyen
  ○ ECOviz: Comparative Visualization of Time-Evolving Network Summaries. Lisa Jin and Danai Koutra
  ○ Clipped Projections for More Informative Visualizations [A Work-in-Progress Report]. Bo Kang, Junning Deng, Jefrey Lijffijt, and Tijl De Bie
  ○ Towards an Interactive Learning-to-Rank System for Economic Competitiveness Understanding. Caitlin Kuhlman and Elke Rundensteiner
  ○ Data Sketches for Disaggregated Subset Sum Estimation. Daniel Ting

Workshop 4: Mining and Learning with Graphs
Http: http://www.mlgworkshop.org/2017/
Room 200E
Organizers:
Michele Catasta (EPFL / Stanford)
Shobeir Fakhraei (University of Maryland)
Danai Koutra ()
Silvio Lattanzi (Google Research)
Julian McAuley (UC San Diego)
Jennifer Neville (Purdue University)

Agenda:
● 8:50-9:00: Opening Remarks
● 9:00-9:40: Keynote: Nitesh Chawla, University of Notre Dame
  ○ "Representing, Modeling, and Visualizing Higher Order Networks"
● 9:40-10:00: Poster Spotlights 1

● 10:00­10:30: Coffee Break

● 10:30-10:50: Poster Spotlights 2
● 10:50-11:30: Keynote: Vahab Mirrokni, Google Research, NY
  ○ "Distributed Graph Mining: Theory and Practice"
● 11:30-12:00: Poster Session

● 12:00­13:00: Lunch (+ Poster Session)

● 13:00-13:40: Keynote: Elena Zheleva, University of Illinois at Chicago
  ○ "Sharing and Gifting Online"
● 13:40-14:00: Contributed Talks Session 1
● 14:00-14:40: Keynote: Yan Liu, University of Southern California
  ○ "Robust Diffusion Network Inference"
● 14:40-15:00: Contributed Talks Session 2

● 15:00­15:30: Coffee Break

● 15:30-16:10: Keynote: Jiliang Tang, Michigan State University
  ○ "Node Relevance in Signed Networks: Measurements and Applications"
● 16:10-16:50: Keynote: TBD
● 16:50-17:00: Closing Remarks

Workshop 5: Fairness, Accountability, Transparency in Machine Learning (FATML)
Http: http://www.fatml.org/schedule/2017/page/call-for-papers-2017
Suite 202
Organizers:
Solon Barocas (Microsoft Research)
Sorelle Friedler (Haverford College)
Joshua Kroll (Cloudflare)
Suresh Venkatasubramanian (University of Utah)
Hanna Wallach (Microsoft Research and University of Massachusetts Amherst)

Agenda:
● 8:50-9:00: Opening Remarks
● 9:00-10:00: Invited talk: Margaret Mitchell (Google)
  ○ The Seen and Unseen Factors Influencing Knowledge in AI Systems

● 10:00­10:30: Coffee break

● 10:30-12:00: Contributed talks
  ○ Su Lin Blodgett and Brendan O'Connor. "Racial Disparity in Natural Language Processing: A Case Study of Social Media African-American English"
  ○ Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi and Sergei Vassilvitskii. "Fair Clustering Through Fairlets"
  ○ Yang Liu, Goran Radanovic, Christos Dimitrakakis, David Parkes and Debmalya Mandal. "Calibrated fairness in bandits"
  ○ Michael Skirpan and Micha Gorelick. "The Authority of 'Fair' in Machine Learning"

● 12:00­14:00: Lunch and Poster Session

● 14:00-15:00: Invited talk: Rich Caruana (Microsoft)
  ○ Friends Don’t Let Friends Deploy Black-Box Models: Preventing Bias via Transparent Machine Learning

● 15:00­15:30: Coffee Break

● 15:30-17:00: Contributed talks
  ○ Danielle Ensign, Sorelle A. Friedler, Scott Neville, Carlos Scheidegger and Suresh Venkatasubramanian. "Runaway Feedback Loops in Predictive Policing"
  ○ Lily Hu and Yiling Chen. "Fairness at Equilibrium in the Labor Market"
  ○ Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi and Adrian Weller. "Preference vs. Parity-based Notions of Fairness"
  ○ Michael Veale. "Logics and practices of transparency and opacity in real-world applications of public sector machine learning"

Workshop 7: Urban Computing
Http: http://urbcomp.ist.psu.edu/2017/
Suite 205
Organizers:
Yu Zheng (Microsoft Research)
Jieping Ye (DiDi Chuxing)
Qiang Yang (Hong Kong University of Science and Technology)
Philip Yu (University of Illinois at Chicago)
Ouri Wolfson (University of Illinois at Chicago)
Zhenhui Li (Penn State University)

Agenda:
● 8:00-8:10: Opening
● 8:10-9:10: Session 1 - Mobility
● 9:10-10:00: Invited talk 1 by Dr. Yu Zheng, Microsoft Research
  ○ Urban Computing: Enabling Intelligent Cities with AI and Big Data

● 10:00­10:30: Coffee Break

● 10:30-11:20: Invited talk 2 by Dr. Zhenhui (Jessie) Li, Penn State University
  ○ Toward Semantic Understanding of Spatial Trajectories
● 11:20-12:00: Session 2 - Energy

● 12:00­13:30: Lunch

● 13:30-14:20: Invited talk 3 by Dr. Jieping Ye, Didi Chuxing & University of Michigan
  ○ Big Data at Didi Chuxing
● 14:20-15:00: Session 3 - Transportation

● 15:00­15:30: Coffee Break

● 15:30­17:00: Session 4 ­ Traffic

Workshop 8: Data Science + Journalism
Http: https://sites.google.com/view/dsandj2017
Suite 305
Organizers:
Elena Erdmann (TU Dortmund University)
Kristian Kersting (TU Dortmund University)
Amanda Stent (Bloomberg LP)
Elena Zheleva (University of Illinois at Chicago)

Agenda:
● 9:00-9:05: Welcome
● 9:05-10:05: Invited talk: Nick Diakopoulos

● 10:05-10:30: Coffee Break

● 10:30­12:00: 5 accepted paper presentations (15 minutes each)

● 12:00­13:15: Lunch

● 13:15-14:15: Invited talk (Hannes Munzinger)
● 14:15-15:00: 3 accepted paper presentations (15 minutes each)

● 15:00­15:30: Coffee Break

● 15:30-16:30: Invited talk: TBD
● 16:30-17:00: Breakout session to discuss emerging challenges and interesting problems to solve

Workshop 9: Data Science for Intelligent Food, Energy and Water
Http: http://ai4good.org/few17/
Suite 304
Organizers:
Naoki Abe (IBM Research)
James Hodson (AI for Good Foundation)
Chid Apte (IBM Research)
Joseph Byrum (Syngenta)
Rayid Ghani (University of Chicago)
Marko Grobelnik (Jozef Stefan Institute)
Estevam Rafael Hruschka (Federal University of Sao Carlos DC-UFSCar)
Vipin Kumar (University of Minnesota)
Shashi Shekhar (University of Minnesota)
Mitch Tuinstra (Purdue University)
Ranga Raju Vatsavai (North Carolina State University)

Agenda:
● 8:00-8:15: Greetings and workshop introduction
● 8:15-9:00: Keynote talk 1
● 9:00-10:00: Paper presentations 1

● 10:00­10:30: Coffee Break

● 10:30-12:00: AI Challenge award announcement & entry presentations

● 12:00­13:00: Lunch

● 13:00-13:45: Keynote talk 2
● 13:45-15:00: Paper presentations 2

● 15:00­15:30: Coffee Break

● 15:30-16:45: Panel discussion
● 16:45-17:00: Closing

Workshop 10: 2017 Edition of AdKDD and TargetAd
Http: https://adkdd17.wixsite.com/adkddtargetad2017
Suite 306
Organizers:
Abraham Bagherjeiran (A9)
Nemanja Djuric (UBER)
Mihajlo Grbovic (AirBnB)
Kuang-Chih Lee (Yahoo Research)
Kun Liu (LinkedIn)
Vladan Radosavljevic (UBER)
Suju Rajan (Criteo Research)

Agenda:
● 8:00-8:20: Optimal Reserve Price for Online Ads Trading Based on Inventory Identification
● 8:20-8:40: MM2RTB: Bring Multimedia Metrics to Real-Time Bidding
● 8:40-9:00: Data-Driven Reserve Prices for Social Advertising Auctions at LinkedIn
● 9:00-9:20: Cost-sensitive Learning for Utility Optimization in Auctions
● 9:20-10:00: Invited talk: Randall Lewis (Netflix)

● 10:00­10:30: Coffee Break

● 10:30-11:20: Invited talk: Susan Athey (Stanford)
● 11:20-11:40: Blacklisting the Blacklist in Online Advertising
● 11:40-12:00: Anti- Strategy: Measuring its True Impact

● 12:00­13:00: Lunch

● 13:00-13:20: An Ensemble-based Approach to Click-Through Rate Prediction for Promoted Listings at Etsy
● 13:20-13:40: A Practical Framework of Conversion Rate Prediction for Online Display Advertising
● 13:40-14:00: Deep & Cross Network for Ad Click Predictions
● 14:00-14:20: Ranking and Calibrating Click-Attributed Purchases in Performance Display Advertising
● 14:20-15:00: Invited talk: Alex Smola (Amazon)

● 15:00­15:30: Coffee Break

● 15:30-16:10: Invited talk: Thorsten Joachims (Cornell)
● 16:10-16:30: Attribution Modeling Increases Efficiency of Bidding in Display Advertising
● 16:30-16:50: Profit Maximization for Online Advertising Demand-Side Platforms
● 16:50-17:00: Best paper award

Workshop 15: Advancing Education With Data
Http: http://ml4ed.cc/KDD-Call-For-Papers/
Suite 204
Organizers:
Andrew Lan ()
Christopher G. Brinton (Zoomi)
Jiquan Ngiam (Coursera)
Mung Chiang (Princeton University)
Richard Baraniuk (Rice University)
Roshan Sumbaly (Coursera)
Shivani Rao (LinkedIn)

Agenda:
● 8:20 - 8:30: Opening remarks and logistics
● Session 1: Assessments
● 8:30 - 9:00: Invited talk 1: Neil Heffernan, Worcester Polytechnic Institute
● 9:00 - 9:20: Talk 2: Large-Scale and Interpretable Collaborative Filtering for Educational Data
● 9:20 - 9:40: Talk 3: Enriching Course-Specific Regression Models with Content Features for Grade Prediction
● 9:40 - 10:00: Talk 4: Clustering LaTeX Solutions to Machine Learning Assignments for Rapid Assessment

● 10:00 ­ 10:30 Coffee Break

● Session 2: Learning Analytics and Personalization
● 10:30 - 11:00: Invited talk 5: Peter Brusilovsky, University of Pittsburgh
● 11:00 - 11:20: Organizer talk 6: Christopher Brinton, Zoomi Inc.
● 11:20 - 11:40: Talk 7: A Latent Factor Model for Instructor Content Preference Analysis
● 11:40 - 12:00: Talk 8: Transfer Learning for Education Data

● 12:00 ­ 13:30 Lunch break

● Session 3: Infrastructure for Personalized Learning
● 13:30 - 14:00: Invited talk 9: Elena Glassman, University of California, Berkeley
● 14:00 - 14:20: Organizer talk 10: Shivani Rao, LinkedIn
● 14:20 - 14:40: Talk 11: CollectiveTeach: A Platform for Integrating Web-based Educational Content into Lesson Plans
● 14:40 - 15:00: Talk 12: Using Probabilistic Tag Modeling to Improve Recommendations

● 15:00 ­ 15:30 Coffee break

● Session 4: Lifelong Learning
● 15:30 - 15:50: Organizer talk 13: Jiquan Ngiam, Coursera
● 15:50 - 16:10: Talk 14: Analyzing Game-Based Collaborative Problem Solving with Computational Psychometrics
● 16:10 - 16:30: Talk 15: STEM-ming the Tide: Predicting STEM Attrition using Student Transcript Data
● 16:30 - 17:00: Concluding remarks

Half Day Workshops ‐ Monday August 14, 8:00am ‐ 12:00pm

Workshop 6: International Workshop on Data Mining in Bioinformatics
Http: http://home.biokdd.org/
Suite 203
Organizers:
Jake Y. Chen (University of Alabama at Birmingham)
Mohammed J. Zaki (Rensselaer Polytechnic Institute)
Xin Gao (KAUST)
Le Song (Georgia Institute of Technology)

Agenda:
● 8:00 - 8:10: Opening Remarks
● 8:10 - 9:10: Keynote Talk by Wei Wang (UCLA)
  ○ TBD
● 9:10 - 9:35: Paper presentation 1
  ○ ANTENNA, a Multi-Rank, Multi-Layered Recommender System for Inferring Reliable Drug-Gene-Disease Associations
● 9:35 - 10:00: Paper presentation 2
  ○ Cost-sensitive Deep Learning for Early Readmission Prediction at A Major Hospital

● 10:00 ­ 10:30 Coffee Break

● 10:30 - 10:55: Paper presentation 3
  ○ Improving the Prediction of Functional Outcome in Ischemic Stroke Patients
● 10:55 - 11:20: Paper presentation 4
  ○ Ontology-based workflow extraction from texts using word sense disambiguation
● 11:20 - 11:45: Paper presentation 5
  ○ Clustering Genetic Data: A Random Forest Cluster Ensemble Approach
● 11:45 - 12:00: Discussion and Final Remarks

Workshop 11: Workshop on Causal Discovery
Http: http://nugget.unisa.edu.au/CD2017/
Room 200C1
Organizers:
Lin Liu (University of South Australia)
Jiuyong Li (University of South Australia)
Kun Zhang (Carnegie Mellon University)
Emre Kıcıman (Microsoft Research)
Negar Kiyavash (University of Illinois at Urbana-Champaign)

Agenda:
● 8:00-8:10: Opening and welcome
● 8:10-9:10: Invited talk by Ioannis Tsamardinos
  ○ Advances in Causal-Based Feature Selection
● 9:10-9:30: Fast Causal Inference with Non-Random Missingness by Test-Wise Deletion, Eric Strobl, Shyam Visweswaran, and Peter Spirtes
● 9:30-9:50: Constraint-based Causal Discovery with Mixed Data, Michail Tsagris, Giorgos Borboudakis, Vincenzo Lagani, and Ioannis Tsamardinos
● 9:50-10:00: Discussion

● 10:00­10:30: Coffee Break

● 10:30-10:50: On Causal Analysis for Heterogeneous Networks, Katerina Marazopoulou, David Arbour, and David Jensen
● 10:50-11:10: Scoring Bayesian Networks of Mixed Variables, Bryan Andrews, Joseph Ramsey, and Greg Cooper
● 11:10-11:30: Comparison of Strategies for Scalable Causal Discovery of Latent Variable Models from Mixed Data, Vineet Raghu, Joseph Ramsey, Alison Morris, Dimitrios Manatakis, Peter Spirtes, Panos K. Chrysanthis, Clark Glymour, and Panayiotis V. Benos
● 11:30-11:50: On Scoring Maximal Ancestral Graphs with the Max-Min Hill Climbing Algorithm, Konstantinos Tsirlis, Vincenzo Lagani, Sofia Triantafillou, and Ioannis Tsamardinos
● 11:50-12:00: Discussion, closing

Workshop 12: Medical Informatics and Healthcare
Http: http://datasys.cs.iit.edu/events/MIH17/index.html#Dates
Suite 302
Organizers:
Samah Jamal Fodeh (Yale University)
Daniela Stan Raicu (DePaul University)

Agenda:
● 8:00-8:15: Opening remarks
● 8:15-9:00: Keynote speaker: Dr. Sameer Antani (NIH/NLM)
● 9:00-10:00: Session 1

● 10:00­10:30: Coffee Break

● 10:30-11:45: Session 2
● 11:45-12:00: Best paper / closing remarks

Workshop 13: Big data analytics-as-a-Service: Architecture, Algorithms, and Application in Health Informatics
Http: http://bigdas.org/
Suite 303
Organizers:
Peggy Peissig (Marshfield Clinic Research Institute)
David Page (University of Wisconsin-Madison)
Reza Zadeh (Stanford University)
Ahmad P. Tafti (Marshfield Clinic Research Institute)
Eric LaRose (Marshfield Clinic Research Institute)
Philippe Cudre-Mauroux (University of Fribourg)
Richard Segall (Arkansas State University)
Angus Roberts (University of Sheffield)

Agenda:
● 8:00-8:10: Opening and welcome
● 8:10-8:50: Invited Talk by Vagelis Hristidis (UC Riverside)
  ○ Analysis of Online Health-Related User-Generated Content
● 8:50-9:30: Invited Talk by Yuanyuan Tian (IBM Research)
  ○ Building Systems for Big Data Analytics: From SQL to Machine Learning and Graph Analysis
● 9:30-10:00: Poster presentation

● 10:00­10:30: Coffee Break

● 10:30-11:10: Paper Presentation
● 11:10-11:50: Panel
  ○ Challenges and Future Directions in Big Data Analytics and its Application in Health Informatics
● 11:50-12:00: Closing remarks

Workshop 14: Machine Learning for Creativity
Http: https://creativeai.mybluemix.net/
Level 8 - Summit Suite / Meeting Room 5
Organizers:
Karthik Sankaranarayanan (IBM Research)
Lav R. Varshney (University of Illinois at Urbana-Champaign)
Francois Pachet (Sony CSL)
Douglas Eck (Google Brain - Project Magenta)
Kush R. Varshney (IBM Research)
Anush Sankaran (IBM Research)
Priyanka Agrawal (IBM Research)
Disha Shrivastava (IBM Research)

Agenda:
● 8:00-8:05: Opening remarks
● 8:05-8:35: Invited Talk by Mark Riedl (Georgia Institute of Technology)
● 8:35-9:05: Special session on innovative applications of creativity
● 9:05-10:00: Oral paper presentations (4 papers)

● 10:00­11:00: Coffee Break and poster session

● 11:00-11:30: Invited talk by Flavio du Pin Calmon (Harvard University)
● 11:30-12:00: Invited talk by Nick Montfort (MIT)

Half Day Workshops ‐ Monday August 14, 1:00pm ‐ 5:00pm

Workshop 16: Machine Learning Meets Fashion: Data, algorithms and analytics for the fashion industry
Http: https://kddfashion2017.mybluemix.net/
Suite 302
Organizers:
Vikas C. Raykar (IBM Research)
Soo-Min Pantel (Amazon)
Raghavendra Singh (IBM Research)
Julian McAuley (University of California, San Diego)

Agenda:
● 13:00 - 13:10: Opening Remarks
● 13:10 - 13:40: Invited Talk by Kavita Bala
  ○ Fashion and Style Discovery: object and material recognition from online photo collections
● 13:40 - 15:00: Oral Paper Presentations
  ○ Size Recommendation System for Fashion E-commerce
  ○ Learning Fashion Traits with Label Uncertainty
  ○ Style Potential: Modeling Sellability of Fashion Product
  ○ Cross-modal Search for Fashion Attributes

● 15:00 ­ 15:40: Coffee break & Poster Session

● 15:40 - 16:10: Invited talk by Madhu Kurup
  ○ Making fashion recommendations in cold start situations
● 16:10 - 16:50: Oral Paper Presentations
  ○ Algorithmic clothing: hybrid recommendation, from street-style-to-shop
  ○ Deciphering Fashion Sensibility Using Community Detection
● 16:50 - 17:00: Open house (5 min spot-light talks/demos inviting interested participants to present)

Workshop 17: Machine Learning for Prognostics and Health Management
Http: https://sites.google.com/site/mlforphm2017/
Suite 303
Organizers:
Michael Giering (United Technologies Research Center)
Kishore Reddy (United Technologies Research Center)
Soumalya Sarkar (United Technologies Research Center)
Soumik Sarkar (Iowa State University)
Madhu Shashanka (Charles Schwab)
Abhishek Srivastav (GE Global Research)

Agenda:
● 13:00 - 13:10: Opening remarks
● 13:10 - 14:00: Keynote talk
  ○ Raj Bharadwaj, Honeywell
● 14:00 - 14:20: Regular Session 1, Paper 1
● 14:20 - 14:40: Regular Session 1, Paper 2
● 14:40 - 15:00: Regular Session 1, Paper 3

● 15:00 ­ 15:30: Coffee Break

● 15:30 - 15:50: Keynote talk
● 15:50 - 16:10: Regular Session 2, Paper 1
● 16:10 - 16:30: Regular Session 2, Paper 2
● 16:30 - 16:50: Regular Session 2, Paper 3
● 16:50 - 17:00: Final Remarks

Workshop 18: Data-Driven Discovery
Http: http://datainnovation.soic.indiana.edu:8080/kdd2017_workshop/index.html
Level 8 - Summit Suite / Meeting Room 5
Organizers:
Ying Ding (Indiana University)
James A. Evans (University of Chicago)
Scott Spangler (IBM Almaden Research Center)
Lav R. Varshney (University of Illinois at Urbana-Champaign)
Dashun Wang (Northwestern University)

Agenda:
● 13:00 - 13:05: Opening Remarks
● 13:05 - 13:35: Invited Talk by Jie Tang (Tsinghua University)
● 13:35 - 14:05: Invited Talk by Hyejin Youn (Santa Fe Institute)
● 14:05 - 14:35: Invited Talk by Jure Leskovec (Stanford)
● 14:35 - 15:00: Poster Spotlight
  ○ Aditya Vempaty, Lav Varshney and Pramod Varshney. A Coupon-Collector Model of Machine-Aided Discovery
  ○ Kyle Hundman and Chris Mattmann. Measurement Context Extraction from Text: Discovering Opportunities and Gaps in Earth Science
  ○ Aminata Kane. Feature Selection and Grouping for Multivariate Time Series Using PCA
  ○ Charalampos Chelmis, Daphney-Stavroula Zois and Mengfan Yao. If Networks Could Talk: Understanding the Patterns and Characteristics of Cyberbullying
  ○ Setu Shah and Xiao Luo. Biomedical Document Clustering and Visualization based on the Concepts of Diseases
  ○ Kalyani Selvarajah, Pooya Moradian Zadeh, Mehdi Kargar and Ziad Kobti. A Knowledge-based Computational Algorithm for Discovering a Team of Experts Social Networks

● 15:00 ­ 15:30: Coffee Break and Poster Session

● 15:30 - 16:00: Invited Talk by Scott Spangler (IBM)
● 16:00 - 16:50: Panel Discussion: Data-Driven Discovery: Challenges and Opportunities
● 16:50 - 17:00: Final Remarks

Workshop 19: Anomaly Detection in Finance
Http: https://sites.google.com/view/kdd-adf-2017
Room 200C1
Organizers:
Senthil Kumar (Capital One)
Alexander Statnikov (American Express)
Tanveer Faruquie (Capital One)
Di W. Xu (American Express)

Agenda:
● 13:00-13:10: Opening Remarks
  ○ Senthil Kumar, Director of Data Science, Capital One Labs
  ○ Alexander Statnikov, VP, Machine Learning Solutions and Global Line Modeling, American Express
● 13:10-13:35: Keynote 1
  ○ Amir Averbuch, Tel Aviv University and Theta Ray, “Uncovering Unknown Unknowns in Financial Services Big Data by Unsupervised Methodologies”
● 13:35-14:00: Keynote 2
  ○ Nong Ye, Arizona State University, “Targeted Profiling and Partial-Value Association for Anomaly Detection”
● 14:00-14:45: Full Paper Session 1
  ○ Jaroslav Kuchař and Vojtěch Svátek. Spotlighting Anomalies using Frequent Patterns
  ○ Rasha Kashef. Ensemble-based Anomaly Detection Using Cooperative Agreement
  ○ Ira Cohen, Meir Toledano, Yonatan Ben Simhon and Inbal Tadeski. Real-time anomaly detection system for time series at scale
● 14:45-15:00: Industry Perspective
  ○ Brian Surette, VP, Enterprise Model Risk, Analytics, and Data, Capital One

● 15:00­15:30: Coffee Break

● 15:30-15:45: Industry Perspective
  ○ Chao Yuan, SVP, Head of Decision Sciences, American Express

● 15:45-16:15: Full Paper Session 2
  ○ Bokai Cao, Mia Mao, Siim Viidu and Philip S. Yu. Collective Fraud Detection Capturing Inter-Transaction Dependency
  ○ Youngjoon Ki and Ji Won Yoon. PD-FDS: Purchase Density based Online Credit Card Fraud Detection System
● 16:15-16:55: Spotlight Talks
  ○ Parikshit Ram and Alexander Gray. Anomaly detection with density estimation trees
  ○ Nalin Aggarwal, Alexander Statnikov and Chao Yuan. Automated System for Data Attribute Anomaly Detection
  ○ Matthew van Adelsberg and Christian Schwantes. Binned Kernels for Anomaly Detection in Multi-timescale Data using Gaussian Processes
  ○ Daniel Lasaga and Prakash Santhana. Deep Learning to Detect Treatment Fraud amongst Healthcare Providers
  ○ Michelle Miller and Robert Cezeaux. Sleuthing for adverse outcomes using anomaly detection
  ○ Sen Tian and Panos Ipeirotis. Large-Scale Anomaly Detection
  ○ Jagan Kuntipuram, Vamsi Patil, Marco Jorge and Pedro Bizarro. Fraud-Trips: Detecting fraudsters that try to lay low
  ○ Nian Yan, Xiaohang Zhang and Sriram Tirunellayi. Identity Fraud Detection with Distributed Graph Mining
● 16:55-17:00: Closing Remarks
  ○ Senthil Kumar, Director of Data Science, Capital One Labs
  ○ Alexander Statnikov, VP, Machine Learning Solutions and Global Line Modeling, American Express

Workshop 20: Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM)
Http: http://sentic.net/wisdom/#wisdom2017
Suite 203
Organizers:
Yongzheng Zhang (LinkedIn Inc.)
Erik Cambria (Nanyang Technological University)
Bing Liu (University of Illinois at Chicago)
Yunqing Xia ()

Agenda:
● 13:00 - 13:10: Opening remarks
● 13:10 - 14:10: Keynote talk by Dr. Xiaodan Zhu, NRC Canada and University of Ottawa
  ○ Deep Learning for Contrasting Meaning Representation and Composition
● 14:10 - 15:00: Paper presentation session I

● 15:00 ­ 15:30: Coffee Break

● 15:30 - 16:50: Paper presentation session II
● 16:50 - 17:00: Closing remarks

KDD Cup Workshop ‐ Wednesday August 16, 1:30pm ‐ 5:00pm

KDD Cup
Http: http://tb.am/wr8dg
Level 8 - Summit Suite/Meeting Room 5
Organizers: Yu Liu, Wanli Min, Kuan Song, Xichun Tian, Jiawei Wang, Shuang Wu, Yiting Wang, Liang Yu, and Xinfa Yan (Alibaba Inc.)

Agenda:
● 13:30 - 13:50: Opening and introduction to the competition problems
● 13:50 - 14:20: Invited Talk (Speaker TBD)
● 14:20 - 14:40: Coffee Break
● 14:40 - 15:25: Travel Time Prediction: Team Convolution (Team lead: Ke Hu); Team Longing for a teammate (Team lead: Huang Yide); Team Warriors (Team lead: Hengxing Cai)
● 15:25 - 16:10: Volume Prediction: Team Convolution (Team lead: Ke Hu); Team Black-Swan (Team lead: Yitian Chen); Team CarTrailBlazer (Team lead: Suiqian Luo)
● 16:10 - 16:30: Closing Remarks

KDD 2017 Hands-On Tutorial Program

Tuesday August 15, 2017

8:30AM-12:00PM, 1:30PM-5:30PM Hands-On Tutorial 1 - Suite 301-303
Title: Amazon Web Services & MXNet
Instructors: Alex Smola (Amazon), Joseph Spisak (Amazon), Mu Li (Amazon)

8:30AM-12:00PM Hands-On Tutorial 2 - Suite 304-306
Title: Anomaly Detection in Networks
Instructors: Veena B. Mendiratta (Nokia Bell Labs)
https://veena-mendiratta.blog/tutorial-anomaly-detection-in-networks/

1:30PM-5:30PM Hands-On Tutorial 3 - Suite 304-306
Title: Massive Online Analytics
Instructors: Bernhard Pfahringer (University of Waikato), Albert Bifet (Telecom-ParisTech)

Wednesday August 16, 2017

8:30AM-12:00PM Hands-On Tutorial 4 - Suite 301-303
Title: META: A Unifying Framework for the Management and Analysis of Text Data
Instructors: Chase Geigle (UIUC), ChengXiang Zhai (UIUC)

1:30PM-5:30PM Hands-On Tutorial 5 - Suite 301-303
Title: Using R for Scalable Data Science: Single Machines to Hadoop Spark Clusters
Instructors: Robert Horton (Microsoft), Mario Inchiosa (Microsoft), Vanja Paunic (Microsoft), Hang Zhang (Microsoft)
https://github.com/Azure/KDD2017R

1:30PM-5:30PM Hands-On Tutorial 6 - Suite 304-306
Title: Declarative, Large-Scale Machine Learning with Apache SystemML
Instructors: Matthias Boehm (IBM Research), Alexandre Evfimievski (IBM Research), Niketan Pansare (IBM Research), Berthold Reinwald (IBM Research), Prithvi Sen (IBM Research)
http://systemml.apache.org/tutorial-kdd2017.html

Thursday August 17, 2017

8:30AM-12:00PM Hands-On Tutorial 7 - Suite 301-303
Title: TensorFlow
Instructors: Rajat Monga (Google), Martin Wicke (Google), Daniel ‘Wolff’ Dobson (Google), Joshua Gordon (Google)

1:30PM-5:30PM Hands-On Tutorial 8 - Suite 304-306
Title: Cloud based Data Mining Tools for Storage, Distributed Processing, and Machine Learning Systems for Scientific Data
Instructors: Vani Mandava (Microsoft Research), Dennis Gannon (Indiana University)

KDD 2017 Conference Program

Monday August 14 2017 Detailed Program

Monday August 14, 4:00pm - 5:00pm, Poster Presenter Setup (Poster Group 1) – WTCC Room 100

Monday August 14, 2017 5:15pm – 7:00pm, KDD 2017 Opening Session - Scotiabank Centre

Monday August 14, 7:00pm ­ 10:00pm, Poster Reception (Poster Group 1) – WTCC Room 100

Poster 1: On Sampling Strategies for Neural Network­based Collaborative Filtering Ting Chen (University of California, Los Angeles), Yizhou Sun (University of California, Los Angeles), Yue Shi (Yahoo! Research), Liangjie Hong (Etsy Inc.)

Poster 2: Recurrent Poisson Factorization for Temporal Recommendation Seyed Abbas Hosseini (Sharif University of Technology), Keivan Alizadeh (Sharif University of Technology), Ali Khodadadi (Sharif University of Technology), Ali Arabzadeh (Sharif University of Technology), Mehrdad Farajtabar (Georgia Institute of Technology), Hongyuan Zha (Georgia Institute of Technology), Hamid R. Rabiee (Sharif University of Technology)

Poster 3: Aspect Based Recommendations: Recommending Items with the Most Valuable Aspects Based on User Reviews Konstantin Bauman (New York University), Bing Liu (University of Illinois at Chicago (UIC)), Alexander Tuzhilin (New York University)

Poster 4: Bridging Collaborative Filtering and Semi­Supervised Learning: A Neural Approach for POI Recommendation Carl Yang (University of Illinois, Urbana Champaign), Lanxiao Bai (University of Illinois, Urbana Champaign), Chao Zhang (University of Illinois, Urbana Champaign), Quan Yuan (University of Illinois, Urbana Champaign), Jiawei Han (University of Illinois, Urbana Champaign)

Poster 5: A Location­Sentiment­Aware Recommender System for Both Home­Town and Out­of­Town Users Hao Wang (Qihoo 360 Search Lab), Yanmei Fu (Chinese Academy of Sciences & University of Chinese Academy of Sciences), Qinyong Wang (University of Queensland), Hongzhi Yin (University of Queensland), Changying Du (Chinese Academy of Sciences), Hui Xiong (Rutgers University)

Poster 6: Post Processing Recommender Systems for Diversity Arda Antikacioglu (Carnegie Mellon University), R Ravi (Carnegie Mellon University)

Poster 7: A Data­driven Process Recommender Framework Sen Yang (Rutgers University), Xin Dong (Rutgers University), Leilei Sun (Tsinghua University), Yichen Zhou (Rutgers University), Richard A. Farneth (Children's National Medical Center), Hui Xiong (Rutgers University), Randall S. Burd (Children's National Medical Center), Ivan Marsic (Rutgers University)

Poster 8: Dynamic Attention Deep Model for Article Recommendation by Learning Human Editors' Demonstration Xuejian Wang (Shanghai Jiao Tong University), Lantao Yu (Shanghai Jiao Tong University), Kan Ren (Shanghai Jiao Tong University), Guanyu Tao (ULU Technologies Inc.), Weinan Zhang (Shanghai Jiao Tong University), Yong Yu (Shanghai Jiao Tong University), Jun Wang (University College )

Poster 9: Embedding­based News Recommendation for Millions of Users Shumpei Okura (Yahoo Japan Corporation), Yukihiro Tagami (Yahoo Japan Corporation), Shingo Ono (Yahoo Japan Corporation), Akira Tajima (Yahoo Japan Corporation)

Poster 10: Anomaly Detection in Streams with Extreme Value Theory Alban Siffer (IRISA), Pierre­Alain Fouque (IRISA), Alexandre Termier (IRISA), Christine Largouet (IRISA)

Poster 11: Finding Precursors to Anomalous Drop in Airspeed During a Flight's Takeoff Vijay Manikandan Janakiraman (USRA/ NASA Ames Research Center), Bryan Matthews (USRA/ NASA Ames Research Center), Nikunj Oza (NASA Ames Research Center)

Poster 12: Machine Learning for Encrypted Malware Traffic Classification: Accounting for Noisy Labels and Non­Stationarity Blake Anderson (Cisco Systems, Inc.), David McGrew (Cisco Systems, Inc.)

Poster 13: Let's See Your Digits: Anomalous­State Detection using Benford's Law Samuel Maurus (Technical University of ), Claudia Plant (University of Vienna)

Poster 14: Distributed Local Outlier Detection in Big Data Yizhou Yan (Worcester Polytechnic Institute), Lei Cao (Massachusetts Institute of Technology), Caitlin Kuhlman (Worcester Polytechnic Institute), Elke A Rundensteiner (Worcester Polytechnic Institute)

Poster 15: Scalable Top­n Local Outlier Detection Yizhou Yan (Worcester Polytechnic Institute), Lei Cao (MIT), Elke A Rundensteiner (Worcester Polytechnic Institute)

Poster 16: REMIX: Automated Exploration for Interactive Outlier Detection Yanjie Fu (Missouri University of Science & Technology), Charu Aggarwal (IBM T. J. Watson Research Center), Srinivasan Parthasarathy (IBM T. J. Watson Research Center), Deepak S. Turaga (IBM T. J. Watson Research Center), Hui Xiong (Rutgers University)

Poster 17: Contextual Spatial Outlier Detection with Metric Learning Guanjie Zheng (Pennsylvania State University), Susan L. Brantley (Pennsylvania State University), Thomas Lauvaux (Pennsylvania State University), Zhenhui Li (Pennsylvania State University)

Poster 18: Inferring the Strength of Social Ties: A Community­Driven Approach Polina Rozenshtein (Aalto University), Nikolaj Tatti (Aalto University), Aristides Gionis (Aalto University)

Poster 19: A Temporally Heterogeneous Survival Framework with Application to Social Behavior Dynamics Linyun Yu (Tsinghua University), Peng Cui (Tsinghua University), Chaoming Song (University of Miami), Tianyang Zhang (Tsinghua University), Shiqiang Yang (Tsinghua University)

Poster 20: Unsupervised Feature Selection in Signed Social Networks Kewei Cheng (Arizona State University), Jundong Li (Arizona State University), Huan Liu (Arizona State University)

Poster 21: Detecting Network Effects: Randomizing Over Randomized Experiments Martin Saveski (Massachusetts Institute of Technology), Jean Pouget­Abadie (Harvard University), Guillaume Saint­Jacques (Massachusetts Institute of Technology), Weitao Duan (LinkedIn), Souvik Ghosh (LinkedIn), Ya Xu (LinkedIn), Edoardo M Airoldi (Harvard University)

Poster 22: Revisiting Power-law Distributions in Spectra of Real World Networks Nicole Eikmeier (Purdue University), David F. Gleich (Purdue University)

Poster 23: Structural Diversity and Homophily: A Study Across More Than One Hundred Big Networks Yuxiao Dong (Microsoft Research & University of Notre Dame), Reid A Johnson (University of Notre Dame), Jian Xu (University of Notre Dame), Nitesh V Chawla (University of Notre Dame)

Poster 24: When is a Network a Network? Multi­Order Graphical Model Selection in Pathways and Temporal Networks Ingo Scholtes (ETH Zürich)

Poster 25: Learning from Labeled and Unlabeled Vertices in Networks Wei Ye (Ludwig-Maximilians-Universität München), Linfei Zhou (Ludwig-Maximilians-Universität München), Dominik Mautz (Ludwig-Maximilians-Universität München), Claudia Plant (University of Vienna), Christian Böhm (Ludwig-Maximilians-Universität München)

Poster 26: Relay­Linking Models for Prominence and Obsolescence in Evolving Networks Mayank Singh (IIT Kharagpur), Rajdeep Sarkar (IIT Kharagpur), Pawan Goyal (IIT Kharagpur), Animesh Mukherjee (IIT Kharagpur), Soumen Chakrabarti (IIT Bombay)

Poster 27: Learning Tree­Structured Detection Cascades for Heterogeneous Networks of Embedded Devices Hamid Dadkhahi (University of Massachusetts Amherst), Benjamin M. Marlin (University of Massachusetts Amherst)

Poster 28: An Intelligent Customer Care Assistant System for Large­Scale Cellular Network Diagnosis Lujia Pan (Huawei Technologies), Jianfeng Zhang (Huawei Technologies), Patrick P. C. Lee (Chinese University of Hong Kong), Hong Cheng (Chinese University of Hong Kong), Cheng He (Huawei Technologies), Caifeng He (Huawei Technologies), Keli Zhang (Huawei Technologies)

Poster 29: DeepProbe: Information Directed Sequence Understanding and Chatbot Design via Recurrent Neural Networks Zi Yin (Stanford University & Microsoft), Keng­hao Chang (Microsoft), Ruofei Zhang (Microsoft)

Poster 30: Adversary Resistant Deep Neural Networks with an Application to Malware Detection Qinglong Wang (Pennsylvania State University & McGill University), Wenbo Guo (Pennsylvania State University), Kaixuan Zhang (Pennsylvania State University), Alexander G. Ororbia II (Pennsylvania State University), Xinyu Xing (Pennsylvania State University), Xue Liu (McGill University), C. Lee Giles (Pennsylvania State University)

Poster 31: Learning from Multiple Teacher Networks Shan You (Peking University), Chang Xu (University of Sydney), Chao Xu (Peking University), Dacheng Tao (University of Sydney)

Poster 32: Incremental Dual­memory LSTM in Land Cover Prediction Xiaowei Jia (University of Minnesota), Ankush Khandelwal (University of Minnesota), Guruprasad Nayak (University of Minnesota), James Gerber (University of Minnesota), Kimberly Carson (University of Hawaii Manoa), Paul West (University of Minnesota), Vipin Kumar (University of Minnesota)

Poster 33: Learning Temporal State of Diabetes Patients via Combining Behavioral and Demographic Data Houping Xiao (SUNY Buffalo & T.J. Watson Research Center), Jing Gao (SUNY Buffalo), Long Vu (IBM T.J. Watson Research Center), Deepak S Turaga (IBM T.J. Watson Research Center)

Poster 34: Unsupervised Discovery of Drug Side­Effects from Heterogeneous Data Sources Fenglong Ma (SUNY Buffalo), Chuishi Meng (SUNY Buffalo), Houping Xiao (SUNY Buffalo), Qi Li (SUNY Buffalo), Jing Gao (SUNY Buffalo), Lu Su (SUNY Buffalo), Aidong Zhang (SUNY Buffalo)

Poster 35: Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks Fenglong Ma (SUNY Buffalo & Xerox), Radha Chitta (Conduent Labs US), Jing Zhou (Conduent Labs US), Quanzeng You (University of Rochester), Tong Sun (United Technologies Research Center & Xerox), Jing Gao (SUNY Buffalo)

Poster 36: Multi­Modality Disease Modeling via Collective Deep Matrix Factorization Qi Wang (Michigan State University), Mengying Sun (Michigan State University), Liang Zhan (University of Wisconsin­Stout), Paul Thompson (University of Southern California), Shuiwang Ji (Washington State University), Jiayu Zhou (Michigan State University)

Poster 37: Federated Tensor Factorization for Computational Phenotyping Yejin Kim (Pohang University of Science and Technology & University of California, San Diego), Jimeng Sun (Georgia Institute of Technology), Hwanjo Yu (Pohang University of Science and Technology), Xiaoqian Jiang (University of California, San Diego)

Poster 38: GRAM: Graph­based Attention Model for Healthcare Representation Learning Edward Choi (Georgia Institute of Technology), Mohammad Taha Bahadori (Georgia Institute of Technology), Le Song (Georgia Institute of Technology), Walter F. Stewart (Sutter Health), Jimeng Sun (Georgia Institute of Technology)

Poster 39: Resolving the Bias in Electronic Medical Records Kaiping Zheng (National University of ), Jinyang Gao (National University of Singapore), Kee Yuan Ngiam (National University Health System), Beng Chin Ooi (National University of Singapore), Wei Luen James Yip (National University Health System)

Poster 40: Collecting and Analyzing Millions of mHealth Data Streams Tom Quisel (Evidation Health, Inc.), Luca Foschini (Evidation Health, Inc.), Alessio Signorini (Evidation Health, Inc.), David C. Kale (USC Information Sciences Institute)

Poster 41: LEAP: Learning to Prescribe Effective and Safe Treatment Combinations for Multimorbidity Yutao Zhang (Tsinghua University), Robert Chen (Georgia Institute of Technology), Jie Tang (Tsinghua University), Walter F. Stewart (Sutter Health), Jimeng Sun (Georgia Institute of Technology)

Poster 42: Learning to Count Mosquitoes for the Sterile Insect Technique Yaniv Ovadia (Google Inc.), Yoni Halpern (Google Inc.), Dilip Krishnan (Google Inc.), Josh Livni (Verily Inc.), Daniel Newburger (Verily Inc.), Ryan Poplin (Verily Inc.), Tiantian Zha (Verily Inc.), D. Sculley (Google Inc.)

Poster 43: DeepMood: Modeling Mobile Phone Typing Dynamics for Mood Detection Bokai Cao (University of Illinois at Chicago), Lei Zheng (University of Illinois at Chicago), Chenwei Zhang (University of Illinois at Chicago), Philip S. Yu (Tsinghua University & University of Illinois at Chicago), Andrea Piscitello (University of Illinois at Chicago), John Zulueta (University of Illinois at Chicago), Olu Ajilore (University of Illinois at Chicago), Kelly Ryan (University of Michigan), Alex D. Leow (University of Illinois at Chicago)

Poster 44: Distributed Multi­Task Relationship Learning Sulin Liu (Nanyang Technological University, Singapore), Sinno Jialin Pan (Nanyang Technological University, Singapore), Qirong Ho (Petuum, Inc.)

Poster 45: Multi­task Function­on­function Regression with Co­grouping Structured Sparsity Pei Yang (South China University of Technology & Arizona State University), Qi Tan (South China Normal University), Jingrui He (Arizona State University)

Poster 46: Multi­view Learning over Retinal Thickness and Visual Sensitivity on Glaucomatous Eyes Toshimitsu Uesaka (The University of Tokyo), Kai Morino (The University of Tokyo), Hiroki Sugiura (The University of Tokyo), Taichi Kiwaki (The University of Tokyo), Hiroshi Murata (The University of Tokyo), Ryo Asaoka (The University of Tokyo), Kenji Yamanishi (The University of Tokyo)

Poster 47: Privacy­Preserving Distributed Multi­Task Learning with Asynchronous Updates Liyang Xie (Michigan State University), Inci M Baytas (Michigan State University), Kaixiang Lin (Michigan State University), Jiayu Zhou (Michigan State University)

Poster 48: Achieving Non­Discrimination in Data Release Lu Zhang (University of Arkansas), Yongkai Wu (University of Arkansas), Xintao Wu (University of Arkansas)

Poster 49: Prospecting the Career Development of Talents: A Survival Analysis Perspective Huayu Li (University of North Carolina at Charlotte & Baidu Talent Intelligence Center), Yong Ge (), Hengshu Zhu (Baidu Talent Intelligence Center), Hui Xiong (Rutgers University), Hongke Zhao (University of Sci. and Tech. of China)

Poster 50: A Context­aware Attention Network for Interactive Question Answering Huayu Li (University of North Carolina, Charlotte), Martin Renqiang Min (NEC Laboratories America), Yong Ge (University of Arizona), Asim Kadav (NEC Laboratories America)

Poster 51: Structural Event Detection from Log Messages Fei Wu (Pennsylvania State University), Pranay Anchuri (NEC Laboratories America), Zhenhui Li (Pennsylvania State University)

Poster 52: A Hybrid Framework for Text Modeling with Convolutional RNN Chenglong Wang (Alibaba Group), Feijun Jiang (Alibaba Group), Hongxia Yang (Alibaba Group)

Poster 53: Automatic Synonym Discovery with Knowledge Bases Meng Qu (University of Illinois at Urbana­Champaign), Xiang Ren (University of Illinois at Urbana­Champaign), Jiawei Han (University of Illinois at Urbana­Champaign)

Poster 54: Discovering Enterprise Concepts Using Spreadsheet Tables Keqian Li (University of California, Santa Barbara & Microsoft Research), Yeye He (Microsoft Research), Kris Ganjam (Microsoft Research)

Poster 55: End­to­end Learning for Short Text Expansion Jian Tang (University of Michigan), Yue Wang (University of Michigan), Kai Zheng (University of California, Irvine), Qiaozhu Mei (University of Michigan)

Poster 56: MetaPAD: Meta Pattern Discovery from Massive Text Corpora Meng Jiang (University of Illinois at Urbana­Champaign), Jingbo Shang (University of Illinois at Urbana­Champaign), Taylor Cassidy (Army Research Laboratory), Xiang Ren (University of Illinois at Urbana­Champaign), Lance M. Kaplan (Army Research Laboratory), Timothy P. Hanratty (Army Research Laboratory), Jiawei Han (University of Illinois at Urbana­Champaign)

Poster 57: ReasoNet: Learning to Stop Reading in Machine Comprehension Yelong Shen (Microsoft Research), Po­Sen Huang (Microsoft Research), Jianfeng Gao (Microsoft Research), Weizhu Chen (Microsoft Research)

Poster 58: Semi­Supervised Techniques for Mining Learning Outcomes and Prerequisites Igor Labutov (Carnegie Mellon University), Yun Huang (University of Pittsburgh), Peter Brusilovsky (University of Pittsburgh), Daqing He (University of Pittsburgh)

Poster 59: Formative Essay Feedback Using Predictive Scoring Models Bronwyn Woods (Turnitin), David Adamson (Turnitin), Shayne Miel (Turnitin), Elijah Mayfield (Turnitin)

Poster 60: Point-of-Interest Demand Modeling with Human Mobility Patterns Yanchi Liu (Rutgers University), Chuanren Liu (Drexel University), Xinjiang Lu (Northwestern Polytechnical University), Mingfei Teng (Rutgers University), Hengshu Zhu (Baidu Talent Intelligence Center), Hui Xiong (Rutgers University)

Poster 61: Predicting Optimal Facility Location without Customer Locations Emre Yilmaz (Bilkent University), Sanem Elbasi (Bilkent University), Hakan Ferhatosmanoglu (Bilkent University)

Poster 62: Discovering Pollution Sources and Propagation Patterns in Urban Area Xiucheng Li (Nanyang Technological University), Yun Cheng (Air Scientific), Gao Cong (Nanyang Technological University), Lisi Chen (Hong Kong Baptist University)

Poster 63: Functional Zone Based Hierarchical Demand Prediction For Bike System Expansion Junming Liu (Rutgers University), Leilei Sun (Tsinghua University), Qiao Li (Rutgers University), Jingci Ming (Rutgers University), Yanchi Liu (Rutgers University), Hui Xiong (Rutgers University)

Poster 64: A Taxi Order Dispatch Model based On Combinatorial Optimization Lingyu Zhang (Didi Research Institute, Didi Chuxing), Tao Hu (Didi Research Institute, Didi Chuxing), Yue Min (Didi Research Institute, Didi Chuxing), Guobin Wu (Didi Research Institute, Didi Chuxing), Junying Zhang (Didi Research Institute, Didi Chuxing), Pengcheng Feng (Didi Research Institute, Didi Chuxing), Pinghua Gong (Didi Research Institute, Didi Chuxing), Jieping Ye (Didi Research Institute, Didi Chuxing)

Poster 65: Visual Search at eBay Fan Yang (eBay Inc.), Ajinkya Kale (eBay Inc.), Yury Bubnov (eBay Inc.), Leon Stein (eBay Inc.), Qiaosong Wang (eBay Inc.), Hadi Kiapour (eBay Inc.), Robinson Piramuthu (eBay Inc.)

Poster 66: Visualizing Attributed Graphs via Terrain Metaphor Yang Zhang (Ohio State University), Yusu Wang (Ohio State University), Srinivasan Parthasarathy (Ohio State University)

Poster 67: Device Graphs Matthew Malloy (comScore), Paul Barford (comScore & University of Wisconsin), Enis Ceyhun Alp (comScore & University of Wisconsin), Jonathan Koller (comScore), Adria Jewell (comScore)

Poster 68: Construction of Directed 2K Graphs Bálint Tillman (University of California, Irvine), Athina Markopoulou (University of California, Irvine), Carter T. Butts (University of California, Irvine), Minas Gjoka (University of California, Irvine)

Poster 69: Anarchists, Unite: Practical Entropy Approximation for Distributed Streams Moshe Gabel (Technion ­ Israel Institute of Technology), Daniel Keren (Haifa University), Assaf Schuster (Technion ­ Israel Institute of Technology)

Poster 70: PAMAE: Parallel k­Medoids Clustering with High Accuracy and Efficiency Hwanjun Song (Korea Advanced Institute of Science and Technology), Jae­Gil Lee (Korea Advanced Institute of Science and Technology), Wook­Shin Han (POSTECH)

Poster 71: Extremely Fast Decision Tree Mining for Evolving Data Streams Albert Bifet (Telecom ParisTech), Jiajin Zhang (Huawei), Wei Fan (Baidu Research Big Data Lab), Cheng He (Huawei), Jianfeng Zhang (Huawei), Jianfeng Qian (), Geoff Holmes (University of Waikato), Bernhard Pfahringer (University of Waikato)

Poster 72: Mixture Factorized Ornstein­Uhlenbeck Processes for Time­Series Forecasting Guo­Jun Qi (University of Central Florida), Jiliang Tang (Michigan State University), Jingdong Wang (Microsoft Research Asia and Hefei University of Technology), Jiebo Luo (University of Rochester)

Poster 73: Tripoles: A New Class of Relationships in Time Series Data Saurabh Agrawal (University of Minnesota), Gowtham Atluri (University of Cincinnati), Anuj Karpatne (University of Minnesota), William Haltom (University of Minnesota), Stefan Liess (University of Minnesota), Snigdhansu Chatterjee (University of Minnesota), Vipin Kumar (University of Minnesota)

Poster 74: Algorithmic Decision Making and the Cost of Fairness Sam Corbett­Davies (Stanford University), Emma Pierson (Stanford University), Avi Feller (University of California, Berkeley), Sharad Goel (Stanford University), Aziz Huq (University of Chicago)

Poster 75: Decomposed Normalized Maximum Likelihood Codelength Criterion for Selecting Hierarchical Latent Variable Models Tianyi Wu (University of Tokyo), Shinya Sugawara (University of Tokyo), Kenji Yamanishi (University of Tokyo)

Poster 76: Statistical Emerging Pattern Mining with Multiple Testing Correction Junpei Komiyama (University of Tokyo), Masakazu Ishihata (Hokkaido University), Hiroki Arimura (Hokkaido University), Takashi Nishibayashi (VOYAGE GROUP, Inc.), Shin­ichi Minato (Hokkaido University)

Poster 77: An Alternative to NCD for Large Sequences, Lempel­Ziv Jaccard Distance Edward Raff (Laboratory for Physical Sciences), Charles Nicholas (University of Maryland, Baltimore County)

Poster 78: An Efficient Bandit Algorithm for Realtime Multivariate Optimization Daniel N Hill (Amazon.com, Inc.), Houssam Nassif (Amazon.com, Inc.), Yi Liu (Amazon.com, Inc.), Anand Iyer (Amazon.com, Inc.), S.V.N. Vishwanathan (Amazon.com, Inc. & University of California, Santa Cruz)

Poster 79: Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers Aman Agarwal (Cornell University), Soumya Basu (Cornell University), Tobias Schnabel (Cornell University), Thorsten Joachims (Cornell University)

Poster 80: Convex Factorization Machine for Toxicogenomics Prediction Makoto Yamada (RIKEN AIP, JST PRESTO), Wenzhao Lian (Vicarious), Amit Goyal (Yahoo Research), Jianhui Chen (Microsoft), Kishan Wimalawarne (Kyoto University), Suleiman A Khan (University of Helsinki), Samuel Kaski (Aalto University), Hiroshi Mamitsuka (Kyoto University & Aalto University), Yi Chang (Huawei Research America)

Poster 81: DenseAlert: Incremental Dense­Subtensor Detection in Tensor Streams Kijung Shin (Carnegie Mellon University), Bryan Hooi (Carnegie Mellon University), Jisu Kim (Carnegie Mellon University), Christos Faloutsos (Carnegie Mellon University)

Poster 82: Fast Newton Hard Thresholding Pursuit for Sparsity Constrained Nonconvex Optimization Jinghui Chen (University of Virginia), Quanquan Gu (University of Virginia)

Poster 83: Robust Spectral Clustering for Noisy Data Aleksandar Bojchevski (Technical University of Munich), Yves Matkovic (Technical University of Munich), Stephan Günnemann (Technical University of Munich)

Poster 84: Optimized Risk Scores Berk Ustun (Massachusetts Institute of Technology), Cynthia Rudin (Duke University)

Poster 85: Retrospective Higher­Order Markov Processes for User Trails Tao Wu (Purdue University), David F. Gleich (Purdue University)

Poster 86: Small Batch or Large Batch? Gaussian Walk with Rebound Can Teach Peifeng Yin (IBM Almaden Research Center), Ping Luo (Chinese Academy of Sciences & University of Chinese Academy of Sciences), Taiga Nakamura (IBM Almaden Research Center)

Poster 87: Sparse Compositional Local Metric Learning Joseph St.Amand (University of Kansas), Jun Huan (University of Kansas)

Poster 88: Evaluating U.S. Electoral Representation with a Joint Statistical Model of Congressional Roll­Calls, Legislative Text, and Voter Registration Data Zhengming Xing (Criteo Labs), Sunshine Hillygus (Duke University), Lawrence Carin (Duke University)

Poster 89: SPOT: Sparse Optimal Transformations for High Dimensional Variable Selection and Exploratory Regression Analysis Qiming Huang (Purdue University), Michael Zhu (Purdue University & Tsinghua University)

Poster 90: Bolt: Accelerated Data Mining with Fast Vector Compression Davis W Blalock (Massachusetts Institute of Technology), John V Guttag (Massachusetts Institute of Technology)

Poster 91: Inductive Semi­supervised Multi­Label Learning with Co­Training Wang Zhan (Southeast University & Ministry of Education), Min­Ling Zhang (Southeast University & Collaborative Innovation Center of Wireless Communications Technology)

Poster 92: Large Scale Sentiment Learning with Limited Labels Vasileios Iosifidis (Leibniz University Hanover & L3S Research Center), Eirini Ntoutsi (Leibniz University Hanover & L3S Research Center)

Poster 93: Customer Lifetime Value Prediction Using Embeddings Benjamin Paul Chamberlain (Imperial College London), Ângelo Cardoso (ASOS.com), C.H. Bryan Liu (ASOS.com), Roberto Pagliari (ASOS.com), Marc Peter Deisenroth (Imperial College London)

Poster 94: Real­Time Optimization of Web Publisher RTB Revenues Pedro Chahuara (XRCE), Nicolas Grislain (AlephD), Gregoire Jauvion (AlephD), Jean­Michel Renders (XRCE)

Poster 95: Deep Design: Product Aesthetics for Heterogeneous Markets Yanxin Pan (University of Michigan), Alexander Burnap (University of Michigan), Jeffrey L. Hartley (General Motors), Richard Gonzalez (University of Michigan), Panos Y. Papalambros (University of Michigan)

Poster 96: Ad Serving with Multiple KPIs Brendan Kitts (PrecisionDemand), Michael Krishnan (Adap.tv), Ishadutta Yadav (Adap.tv), Yongbo Zeng (Adap.tv), Garrett Badeau (Adap.tv), Andrew Potter (AOL Platforms), Sergey Tolkachov (AOL Platforms), Ethan Thornburg (AOL Platforms), Satyanarayana Reddy Janga (AOL Platforms)

Poster 97: Local Algorithm for User Action Prediction Towards Display Ads Hongxia Yang (Alibaba Group), Yada Zhu (IBM Research), Jingrui He (Arizona State University)

Poster 98: Optimization Beyond Prediction: Prescriptive Price Optimization Shinji Ito (NEC Corporation), Ryohei Fujimaki (NEC Corporation)

Poster 99: Optimized Cost per Click in Taobao Display Advertising Han Zhu (Alibaba Group), Junqi Jin (Alibaba Group), Chang Tan (Alibaba Group), Fei Pan (Alibaba Group), Yifan Zeng (Alibaba Group), Han Li (Alibaba Group), Kun Gai (Alibaba Group)

Poster 100: RUSH! Targeted Time­limited Coupons via Purchase Forecasts Emaad Manzoor (Carnegie Mellon University), Leman Akoglu (Carnegie Mellon University)

Poster 101: Stock Price Prediction via Discovering Multi-Frequency Trading Patterns Liheng Zhang (University of Central Florida), Charu Aggarwal (IBM T. J. Watson Research Center), Guo-Jun Qi (University of Central Florida)

Poster 102: Learning to Generate Rock Descriptions from Multivariate Well Logs with Hierarchical Attention Bin Tong (Hitachi, Ltd.), Martin Klinkigt (Hitachi, Ltd.), Makoto Iwayama (Hitachi, Ltd.), Toshihiko Yanase (Hitachi, Ltd.), Yoshiyuki Kobayashi (Hitachi, Ltd.), Anshuman Sahu (Hitachi America, Ltd.), Ravigopal Vennelakanti (Hitachi America, Ltd.)

Poster 103: “The Leicester City Fairytale?”: Utilizing New Soccer Analytics Tools to Compare Performance in the 15/16 & 16/17 EPL Seasons Héctor Ruiz (STATS), Paul Power (STATS), Xinyu Wei (STATS), Patrick Lucey (STATS)

Poster 104: A Practical Algorithm for Solving the Incoherence Problem of Topic Models In Industrial Applications Amr Ahmed (Google Research), James Long (Google Research), Daniel Silva (Google Research), Yuan Wang (Google Research)

Poster 105: AESOP: Automatic Policy Learning for Predicting and Mitigating Network Service Impairments Supratim Deb (AT&T Labs), Zihui Ge (AT&T Labs), Sastry Isukapalli (AT&T Labs), Sarat Puthenpura (AT&T Labs), Shobha Venkataraman (AT&T Labs), He Yan (AT&T Labs), Jennifer Yates (AT&T Labs)

Poster 106: Automated Categorization of Onion Sites for Analyzing the Darkweb Ecosystem Shalini Ghosh (SRI), Ariyam Das (UCLA), Phil Porras (SRI), Vinod Yegneswaran (SRI), Ashish Gehani (SRI)

Poster 107: Automatic Application Identification from Billions of Files Kyle Soska (Carnegie Mellon University), Chris Gates (Symantec), Kevin A Roundy (Symantec), Nicolas Christin (Carnegie Mellon University)

Poster 108: BDT: Gradient Boosted Decision Tables for High Accuracy and Scoring Efficiency Yin Lou (Airbnb Incorporation), Mikhail Obukhov (LinkedIn Corporation)

Poster 109: Dispatch with Confidence: Integration of Machine Learning, Optimization and Simulation for Open Pit Mines Kosta Ristovski (Hitachi America Ltd), Chetan Gupta (Hitachi America Ltd), Kunihiko Harada (Hitachi America Ltd), Hsiu­Khuern Tang (Hitachi America Ltd)

Poster 110: Matching Restaurant Menus to Crowdsourced Food Data Hesam Salehian (Under Armour Connected Fitness), Patrick Howell (Under Armour Connected Fitness), Chul Lee (Under Armour Connected Fitness)

Poster 111: STAR: A System for Ticket Analysis and Resolution Wubai Zhou (Florida International University), Wei Xue (Florida International University), Ramesh Baral (Florida International University), Qing Wang (Florida International University), Chunqiu Zeng (Florida International University), Tao Li (Florida International University), Jian Xu (Nanjing University of Science and Technology), Zheng Liu (Nanjing University of Posts and Telecommunications), Larisa Shwartz (IBM T.J. Watson Research Center), Genady Ya. Grabarnik (St. John’s University, Queens)

Poster 112: Supporting Employer Name Normalization at both Entity and Cluster Level Qiaoling Liu (CareerBuilder LLC), Faizan Javed (CareerBuilder LLC), Vachik S Dave (Indiana University ­ Purdue University Indianapolis & CareerBuilder LLC), Ankita Joshi (University of Georgia & CareerBuilder LLC)

Poster 113: The Fake vs Real Goods Problem: Microscopy and Machine Learning to the Rescue Ashlesh Sharma (Entrupy Inc), Vidyuth Srinivasan (Entrupy Inc), Vishal Kanchan (Entrupy Inc), Lakshminarayanan Subramanian (Entrupy Inc and New York University)

Poster 114: Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by ClaimBuster Naeemul Hassan (University of Mississippi), Fatma Arslan (University of Texas at Arlington), Chengkai Li (University of Texas at Arlington), Mark Tremayne (University of Texas at Arlington)

Poster 115: TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks Heng-Tze Cheng (Google, Inc.), Zakaria Haque (Google, Inc.), Lichan Hong (Google, Inc.), Mustafa Ispir (Google, Inc.), Clemens Mewald (Google, Inc.), Illia Polosukhin (Google, Inc.), Georgios Roumpos (Google, Inc.), D Sculley (Google, Inc.), Jamie Smith (Google, Inc.), David Soergel (Google, Inc.), Yuan Tang (Uptake Technologies, Inc.), Philipp Tucker (Google, Inc.), Martin Wicke (Google, Inc.), Cassandra Xia (Google, Inc.), Jianwei Xie (Google, Inc.)

Tuesday August 15, 2017 Detailed Program

Tuesday 7:00am ­ 5:00pm, Registration, Registration Desk ­ Level 1 Atrium

Tuesday 7:00am ­ 8:00am, KDD Breakfast ­ Level 3 Foyer

Tuesday 9:30am ­ 6:00pm, KDD Exhibit Hall ­ Exhibit Hall

Tuesday 9:30am ­ 6:00pm, KDD Sponsor Room ­ Level 8 Meeting Room 4

Tuesday 8:00am - 9:30am, Scotiabank Centre Keynote Session 1: Three Principles of Data Science: Predictability, Stability, and Computability. Speaker: Bin Yu, Professor - University of California at Berkeley, chaired by Tina Eliassi-Rad

Abstract: In this talk, I will discuss the intertwining importance and connections of the three principles of data science in the title in data-driven decisions. The ultimate importance of prediction lies in the fact that the future holds the unique and possibly the only purpose of all human activities, in business, education, research, and government alike. Making prediction its central task and embracing computation as its core, machine learning has enabled wide-ranging data-driven successes. Prediction is a useful way to check with reality. Good prediction implicitly assumes stability between past and future. Stability (relative to data and model perturbations) is also a minimum requirement for interpretability and reproducibility of data-driven results. It is closely related to uncertainty assessment. Obviously, both prediction and stability principles cannot be employed without feasible computational algorithms, hence the importance of computability. The three principles will be demonstrated through analytical connections, and in the context of two on-going projects, for which “data wisdom” is also indispensable. Specifically, the first project employs deep learning networks (CNNs) to understand pattern selectivities of neurons in the difficult visual cortex V4; and the second project predicts partisanship and tone of political TV ads by employing and comparing different latent variable models with a Lasso-based model.

Tuesday 9:30am ­ 10:00am, Coffee Break, Level 2 and 3 Foyer & Exhibit Hall

Tuesday 10:00am - 12:00pm

Applied Data Science Session AT1: Platforms and Infrastructure, Room 200C Chair: Tilmann Bruckhaus

Google Vizier: A Service for Black-Box Optimization Daniel Golovin (Google Research), Benjamin Solnik (Google Research), Subhodeep Moitra (Google Research), Greg Kochanski (Google Research), John Karro (Google Research), D. Sculley (Google Research)

TFX: A TensorFlow-Based Production-Scale Machine Learning Platform Denis Baylor (Google Inc.), Eric Breck (Google Inc.), Heng-Tze Cheng (Google Inc.), Noah Fiedel (Google Inc.), Chuan Yu Foo (Google Inc.), Zakaria Haque (Google Inc.), Salem Haykal (Google Inc.), Mustafa Ispir (Google Inc.), Vihan Jain (Google Inc.), Levent Koc (Google Inc.), Chiu Yuen Koo (Google Inc.), Lukasz Lew (Google Inc.), Clemens Mewald (Google Inc.), Akshay Naresh Modi (Google Inc.), Neoklis Polyzotis (Google Inc.), Sukriti Ramesh (Google Inc.), Sudip Roy (Google Inc.), Steven Euijong Whang (Google Inc.), Martin Wicke (Google Inc.), Jarek Wilkiewicz (Google Inc.), Xin Zhang (Google Inc.), Martin Zinkevich (Google Inc.)

KunPeng: Parameter Server based Distributed Learning Systems and Its Applications in Alibaba and Ant Financial Jun Zhou (Ant Financial Services Group), Xiaolong Li (Ant Financial Services Group), Peilin Zhao (Ant Financial Services Group), Chaochao Chen (Ant Financial Services Group), Longfei Li (Ant Financial Services Group), Xinxing Yang (Ant Financial Services Group), Qing Cui (Alibaba Cloud), Jin Yu (Alibaba Cloud), Xu Chen (Alibaba Cloud), Yi Ding (Alibaba Cloud), Yuan Alan Qi (Ant Financial Services Group)

FLAP: An End-to-End Event Log Analysis Platform for System Management Tao Li (Nanjing University of Posts and Telecommunications), Yexi Jiang (Florida International University), Chunqiu Zeng (Florida International University), Bin Xia (Nanjing University of Posts and Telecommunications), Zheng Liu (Nanjing University of Posts and Telecommunications), Wubai Zhou (Florida International University), Xiaolong Zhu (Florida International University), Wentao Wang (Florida International University), Liang Zhang (Huawei Nanjing Research and Development Center), Jun Wu (Huawei Nanjing Research and Development Center), Li Xue (Huawei Nanjing Research and Development Center), Dewei Bao (Huawei Nanjing Research and Development Center)

Developing a Comprehensive Framework for Multimodal Feature Extraction Quinten McNamara (University of Texas at Austin), Alejandro De La Vega (University of Texas at Austin), Tal Yarkoni (University of Texas at Austin)

Applied Data Science Invited Session AI1: Understanding Behavior with Data Science, Room 200D Chair: Evangelos Simoudis

Mining Big Data in NeuroGenetics to Understand Muscular Dystrophy Andy Berglund (University of Florida)

Addressing challenges with Big Data for Media Measurement Mainak Mazumdar (Nielsen)

It Takes More than Math and Engineering to Hit the Bullseye with Data Paritosh Desai (Target)

Research Track Session RT1: Kernels and Sketches, Room 200E Chair: Stratis Ioannidis

A Minimal Variance Estimator for the Cardinality of Big Data Set Intersection Reuven Cohen (Technion), Liran Katzir (Technion), Aviv Yehezkel (Technion)

Coresets for Kernel Regression Yan Zheng (University of Utah), Jeff M. Phillips (University of Utah)

HyperLogLog Hyperextended: Sketches for Concave Sublinear Frequency Statistics Edith Cohen (Google Research)

Communication-Efficient Distributed Block Minimization for Nonlinear Kernel Machines Cho-Jui Hsieh (University of California, Davis), Si Si (Google Inc. & Google Research), Inderjit S. Dhillon (University of Texas at Austin)

Linearized GMM Kernels and Normalized Random Fourier Features Ping Li (Rutgers University)

Research Track Session RT2: Temporal Analysis, Suite 202­205 Chair: Jie Tang

Matrix Profile V: A Generic Technique to Incorporate Domain Knowledge into Motif Discovery Hoang Anh Dau (University of California Riverside), Eamonn Keogh (University of California Riverside)

TrioVecEvent: Embedding-Based Online Local Event Detection in Geo-Tagged Tweet Streams Chao Zhang (University of Illinois at Urbana-Champaign), Liyuan Liu (University of Illinois at Urbana-Champaign), Dongming Lei (University of Illinois at Urbana-Champaign), Quan Yuan (University of Illinois at Urbana-Champaign), Honglei Zhuang (University of Illinois at Urbana-Champaign), Tim Hanratty (U.S. Army Research Lab), Jiawei Han (University of Illinois at Urbana-Champaign)

Effective and Real-time In-App Activity Analysis in Encrypted Internet Traffic Streams Junming Liu (Rutgers University), Yanjie Fu (Missouri University of Science and Technology), Jingci Ming (Rutgers University), Yong Ren (Futurewei Tech. Inc), Leilei Sun (Tsinghua University), Hui Xiong (Rutgers University)

Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data David Hallac (Stanford University), Sagar Vare (Stanford University), Stephen Boyd (Stanford University), Jure Leskovec (Stanford University)

Contextual Motifs Ian Fox (University of Michigan), Lynn Ang (University of Michigan), Mamta Jaiswal (University of Michigan), Rodica Pop-Busui (University of Michigan), Jenna Wiens (University of Michigan)

12:00PM ­ 1:30PM, KDD Lunch, Scotiabank Centre

12:00PM ­ 1:30PM, KDD Women's Lunch (Ticket Required), Prince George Hotel ­ Windsor Room

1:30PM ­ 5:00PM, China Chapter Meeting, Level 8 ­ Summit Suite/Meeting Room 5

Tuesday 1:30pm ­ 3:30pm

Applied Data Science Track Session AT2: Novel Applications 1, Room 200C Chair: Shalini Ghosh

"Not All Passes Are Created Equal:" Objectively Measuring The Risk and Reward of Passes in Soccer from Tracking Data Paul Power (STATS), Hector Ruiz (STATS), Xinyu Wei (STATS), Patrick Lucey (STATS) Luck is Hard to Beat: The Difficulty of Sports Prediction Raquel Y S Aoki (Universidade Federal de Minas Gerais), Renato M Assuncao (Universidade Federal de Minas Gerais), Pedro O S Vaz de Melo (Universidade Federal de Minas Gerais)

PNP: Fast Path Ensemble Method for Movie Design Danai Koutra (University of Michigan), Abhilash Dighe (University of Michigan), Smriti Bhagat (Facebook & Technicolor), Udi Weinsberg (Facebook & Technicolor), Stratis Ioannidis (Northeastern University), Christos Faloutsos (Carnegie Mellon University), Jean Bolot (Technicolor) DeepSD: Generating High Resolution Climate Change Projections through Single Image Super­Resolution Thomas Vandal (Northeastern University), Evan Kodra (risQ Inc.), Sangram Ganguly (Bay Area Environmental Research Institute / NASA Ames Research Center), Andrew Michaelis (University Corporation, Monterey Bay), Ramakrishna Nemani (NASA Advanced Supercomputing Division / NASA Ames Research Center), Auroop R Ganguly (Northeastern University) Deep Choice Model Using Pointer Networks for Airline Itinerary Prediction Alejandro Mottini (Amadeus SAS), Rodrigo Acuna­Agost (Amadeus SAS)

Applied Data Science Invited Session AI2: Applied Machine Learning, Room 200D Chair: Evangelos Simoudis

Designing AI at Scale to Power Everyday Life Rajesh Parekh (Facebook) Industrial Machine Learning Josh Bloom (GE Digital) Behavior Informatics to Discover Behavior Insight for Active and Tailored Client Management Longbing Cao (University of Technology Sydney)

Research Track Session RT3: Graphs I, Room 200E Chair: Matteo Riondato

On Finding Socially Tenuous Groups for Online Social Networks Chih-Ya Shen (National Tsing Hua University), Liang-Hao Huang (Academia Sinica), De-Nian Yang (Academia Sinica), Hong-Han Shuai (National Chiao Tung University), Wang-Chien Lee (Pennsylvania State University), Ming-Syan Chen (National Taiwan University) Improved Degree Bounds and Full Spectrum Power Laws in Preferential Attachment Networks Chen Avin (Ben Gurion University of the Negev), Zvi Lotker (Ben Gurion University of the Negev), Yinon Nahum (Weizmann Institute of Science), David Peleg (Weizmann Institute of Science) A Local Algorithm for Structure-Preserving Graph Cut Dawei Zhou (Arizona State University), Si Zhang (Arizona State University), Mehmet Yigit Yildirim (Arizona State University), Scott Alcorn (Early Warnings LLC.), Hanghang Tong (Arizona State University), Hasan Davulcu (Arizona State University), Jingrui He (Arizona State University) Fast Enumeration of Large k-Plexes Alessio Conte (University of Pisa), Donatella Firmani (Roma Tre University), Caterina Mordente (Be Think Solve Execute), Maurizio Patrignani (Roma Tre University), Riccardo Torlone (Roma Tre University) Graph Edge Partitioning via Neighborhood Heuristic Chenzi Zhang (University of Hong Kong & Noah’s Ark Lab), Fan Wei (Stanford University), Qin Liu (Huawei Noah’s Ark Lab & Chinese University of Hong Kong), Zhihao Gavin Tang (University of Hong Kong), Zhenguo Li (Huawei Noah’s Ark Lab)

Research Track Session RT4: Supervised Learning I, Suite 202­205 Chair: Naoki Abe

AnnexML: Approximate Nearest Neighbor Search for Extreme Multi­label Classification Yukihiro Tagami (Yahoo Japan Corporation & Kyoto University) Interpretable Predictions of Tree­based Ensembles via Actionable Feature Tweaking Gabriele Tolomei (Yahoo Research), Fabrizio Silvestri (Facebook), Andrew Haines (Yahoo Research), Mounia Lalmas (Yahoo Research)

Similarity Forests Saket Sathe (IBM T. J. Watson Research Center), Charu C Aggarwal (IBM T. J. Watson Research Center) Learning Certifiably Optimal Rule Lists Elaine Angelino (University of California, Berkeley), Nicholas Larus­Stone (Harvard University), Daniel Alabi (Harvard University), Margo Seltzer (Harvard University), Cynthia Rudin (Duke University) PPDsparse: A Parallel Primal­Dual Sparse Method for Extreme Classification Ian E.H. Yen (Carnegie Mellon University), Xiangru Huang (University of Texas at Austin), Wei Dai (Carnegie Mellon University & Petuum Inc.), Pradeep Ravikumar (Carnegie Mellon University), Inderjit Dhillon (University of Texas at Austin), Eric Xing (Carnegie Mellon University & Petuum Inc.)

3:30PM - 4:00PM, KDD Coffee Break, Level 2 & 3 Foyer, Exhibit Hall

Tuesday 4:00pm - 6:00pm

Applied Data Science Track Session AT3: Medical Data, Room 200C Chair: Luca Foschini

MARAS: Signaling Multi­Drug Adverse Reactions Xiao Qin (Worcester Polytechnic Institute), Tabassum Kakar (Worcester Polytechnic Institute), Susmitha Wunnava (Worcester Polytechnic Institute), Elke A Rundensteiner (Worcester Polytechnic Institute), Lei Cao (Massachusetts Institute of Technology) Pharmacovigilance via Baseline Regularization with Large­Scale Longitudinal Observational Data Zhaobin Kuang (University of Wisconsin­Madison), Peggy Peissig (Marshfield Clinic), Vitor Santos Costa (Universidade do Porto), Richard Maclin (University of Minnesota­Duluth), David Page (University of Wisconsin­Madison) Predicting Clinical Outcomes Across Changing Electronic Health Record Systems Jen J Gong (Massachusetts Institute of Technology), Tristan Naumann (Massachusetts Institute of Technology), Peter Szolovits (Massachusetts Institute of Technology), John V. Guttag (Massachusetts Institute of Technology) Prognosis and Diagnosis of Parkinson's Disease Using Multi­Task Learning Saba Emrani (SAS Institute Inc.), Anya McGuirk (SAS Institute Inc.), Wei Xiao (SAS Institute Inc.) GELL: Automatic Extraction of Epidemiological Line Lists from Open Sources Saurav Ghosh (Virginia Tech), Prithwish Chakraborty (Virginia Tech), Bryan L. Lewis (Virginia Tech), Maimuna S. Majumder (Massachusetts Institute of Technology & Boston Children’s Hospital), Emily Cohn (Boston Children's Hospital), John S. Brownstein (Boston Children’s Hospital), Madhav V. Marathe (Virginia Tech), Naren Ramakrishnan (Virginia Tech)

Research Track Session RT5: Deep Learning, Room 200DE Chair: Lei Li

Patient Subtyping via Time­Aware LSTM Networks Inci M Baytas (Michigan State University), Cao Xiao (IBM T. J. Watson Research Center), Xi Zhang (Cornell University), Fei Wang (Cornell University), Anil K Jain (Michigan State University), Jiayu Zhou (Michigan State University) KATE: K­Competitive Autoencoder for Text Yu Chen (Rensselaer Polytechnic Institute), Mohammed J Zaki (Rensselaer Polytechnic Institute) Scalable and Sustainable Deep Learning via Randomized Hashing Ryan Spring (Rice University), Anshumali Shrivastava (Rice University) Anomaly Detection with Robust Deep Autoencoders Chong Zhou (Worcester Polytechnic Institute), Randy C. Paffenroth (Worcester Polytechnic Institute) Collaborative Variational Autoencoder for Recommender Systems Xiaopeng Li (Hong Kong University of Science and Technology), James She (Hong Kong University of Science and Technology)

6:30pm­10:00pm, KDD 2017 Banquet, Cunard Centre

Wednesday August 16, 2017 Detailed Program

Wednesday 8:00am ­ 5:00pm, Registration, Registration Desk ­ Level 1 Atrium

Wednesday 7:00am ­ 8:00am, KDD Breakfast ­ Level 3 Foyer

Wednesday 9:30am ­ 6:00pm, KDD Exhibit Hall ­ Exhibit Hall (closed for KDD Business Lunch)

Wednesday 9:30am ­ 6:00pm, KDD Sponsor Room ­ Level 8 Meeting Room 4

Wednesday 8:00am - 9:30am, Scotiabank Centre

Keynote Session 2: What's Fair?

Speaker: Cynthia Dwork, Distinguished Scientist, Microsoft Research / Harvard University Chair: Ravi Kumar

Abstract: Data, algorithms, and systems have biases embedded within them reflecting designers’ explicit and implicit choices, historical biases, and societal priorities. They form, literally and inexorably, a codification of values. “Unfairness” of algorithms – for tasks ranging from advertising to recidivism prediction – has attracted considerable attention in the popular press. The talk will discuss the nascent mathematically rigorous study of fairness in classification and scoring.

Wednesday 9:30am ­ 10:00am, Coffee Break, Level 2 and 3 Foyer / Exhibit Hall

Wednesday 10:00am - 12:00pm, Meet the Editors Panel, Level 8 - Summit Suite/Meeting Room 5

Wednesday 10:00am ­ 12:00pm

Applied Data Science Session AT4: Networks and Graphs, Room 200C Chair: Jennifer Neville

FIRST: Fast Interactive Attributed Subgraph Matching Boxin Du (Arizona State University), Si Zhang (Arizona State University), Nan Cao (Tongji University), Hanghang Tong (Arizona State University) HinDroid: An Intelligent Android Malware Detection System Based on Structured Heterogeneous Information Network Shifu Hou (West Virginia University), Yanfang Ye (West Virginia University), Yangqiu Song (HKUST), Melih Abdulhayoglu (Comodo Security Solutions, Inc.)

MOLIERE: Automatic Biomedical Hypothesis Generation System Justin Sybrandt (Clemson University), Michael Shtutman (University of South Carolina), Ilya Safro (Clemson University) A Century of Science: Globalization of Scientific Collaborations, Citations, and Innovations Yuxiao Dong (Microsoft Research), Hao Ma (Microsoft Research), Zhihong Shen (Microsoft Research), Kuansan Wang (Microsoft Research) Estimation of Recent Ancestral Origins of Individuals on a Large Scale Ross E Curtis (AncestryDNA), Ahna R Girshick (AncestryDNA)

Applied Data Science Invited Session AI3: Intelligent Systems and Data Science, Room 200D Chair: Evangelos Simoudis

Big Data in Climate: Opportunities and Challenges for Machine Learning Vipin Kumar (University of Minnesota) Spaceborne Data Enters the Mainstream David Potere (Tellus Laboratories) Planning and Learning under Uncertainty: Theory and Practice Jonathan P. How (Massachusetts Institute of Technology)

Research Track Session RT6: Graphs II, Room 200E Chair: Tanya Berger­Wolf

PReP: Path­Based Relevance from a Probabilistic Perspective in Heterogeneous Information Networks Yu Shi (University of Illinois at Urbana­Champaign), Po­Wei Chan (University of Illinois at Urbana­Champaign), Honglei Zhuang (University of Illinois at Urbana­Champaign), Huan Gui (University of Illinois at Urbana­Champaign), Jiawei Han (University of Illinois at Urbana­Champaign) Weisfeiler­Lehman Neural Machine for Link Prediction Muhan Zhang (Washington University in St. Louis), Yixin Chen (Washington University in St. Louis) Network Inference via the Time­Varying Graphical Lasso David Hallac (Stanford University), Youngsuk Park (Stanford University), Stephen Boyd (Stanford University), Jure Leskovec (Stanford University) Long Short Memory Process: Modeling Growth Dynamics of Microscopic Social Connectivity Chengxi Zang (Tsinghua University), Peng Cui (Tsinghua University), Christos Faloutsos (Carnegie Mellon University), Wenwu Zhu (Tsinghua University) FORA: Simple and Effective Approximate Single­Source Personalized PageRank Sibo Wang (University of Queensland & Nanyang Technological University), Renchi Yang (Nanyang Technological University), Xiaokui Xiao (Nanyang Technological University), Zhewei Wei (Renmin University of China & Nanyang Technological University), Yin Yang (Hamad Bin Khalifa University)

Research Track Session RT7: Methodology, Suite 202­205 Chair: B. Aditya Prakash

Constructivism Learning: A Learning Paradigm for Transparent Predictive Analytics Xiaoli Li (University of Kansas), Jun Huan (University of Kansas) Randomized Feature Engineering as a Fast and Accurate Alternative to Kernel Methods Suhang Wang (Arizona State University), Charu Aggarwal (IBM T. J. Watson Research Center), Huan Liu (Arizona State University) Groups­Keeping Solution Path Algorithm for Sparse Regression with Automatic Feature Grouping Bin Gu (University of Texas at Arlington), Guodong Liu (University of Texas at Arlington), Heng Huang (University of Texas at Arlington) Discovering Reliable Approximate Functional Dependencies Panagiotis Mandros (Max Planck Institute for Informatics & Saarland University), Mario Boley (Max Planck Institute for Informatics & Saarland University), Jilles Vreeken (Max Planck Institute for Informatics & Saarland University) The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables Himabindu Lakkaraju (Stanford University), Jon Kleinberg (Cornell University), Jure Leskovec (Stanford University), Jens Ludwig (University of Chicago), Sendhil Mullainathan (Harvard University)

Wednesday 1:30pm ­ 3:30pm

Applied Data Science Track Session AT5: Novel Applications 2, Room 200C Chair: Byron Galbraith

Compass: Spatio Temporal Sentiment Analysis of US Election Debjyoti Paul (University of Utah), Feifei Li (University of Utah), Murali Krishna Teja (University of Utah), Xin Yu (University of Utah), Richie Frost (University of Utah) A Data Mining Framework for Valuing Large Portfolios of Variable Annuities Guojun Gan (University of Connecticut), Jimmy Xiangji Huang (York University) Backpage and Bitcoin: Uncovering Human Traffickers Rebecca S Portnoff (University of California, Berkeley), Danny Yuxing Huang (University of California, San Diego), Periwinkle Doerfler (New York University), Sadia Afroz (ICSI), Damon McCoy (New York University) A Data Science Approach to Understanding Residential Water Contamination in Flint Alex Chojnacki (University of Michigan), Chengyu Dai (University of Michigan), Arya Farahi (University of Michigan), Guangsha Shi (University of Michigan), Jared Webb (Brigham Young University), Daniel T. Zhang (University of Michigan), Jacob Abernethy (University of Michigan), Eric Schwartz (University of Michigan) Quick Access: Building a Smart Experience for Google Drive Sandeep Tata (Google USA), Alexandrin Popescul (Google USA), Marc Najork (Google USA), Mike Colagrosso (Google USA), Julian Gibbons (Google Australia), Alan Green (Google Australia), Alexandre Mah (Google Australia), Michael Smith (Google Australia), Divanshu Garg (Google Australia), Cayden Meyer (Google Australia), Reuben Kan (Google Australia)

Applied Data Science Invited Session AI4: Management and Benchmarks, Room 200D Chair: Usama Fayyad

More than the Sum of its Parts: Building Domino Data Lab Eduardo Ariño de la Rubia (Domino Data Lab.) Machine Learning Software in Practice: Quo Vadis? Szilárd Pafka (Epoch)

Research Track Session RT8: Representations, Room 200E Chair: Yizhou Sun

EmbedJoin: Efficient Edit Similarity Joins via Embeddings Haoyu Zhang (Indiana University Bloomington), Qin Zhang (Indiana University Bloomington) Collaboratively Improving Topic Discovery and Word Embeddings by Coordinating Global and Local Contexts Guangxu Xun (SUNY at Buffalo), Yaliang Li (SUNY at Buffalo & Baidu Research Big Data Lab), Jing Gao (SUNY at Buffalo), Aidong Zhang (SUNY at Buffalo) metapath2vec: Scalable Representation Learning for Heterogeneous Networks Yuxiao Dong (Microsoft Research & University of Notre Dame), Nitesh V Chawla (University of Notre Dame), Ananthram Swami (Army Research Laboratory) struc2vec: Learning Node Representations from Structural Identity Leonardo F. R. Ribeiro (Federal University of Rio de Janeiro), Pedro H. P. Saverese (Federal University of Rio de Janeiro), Daniel R. Figueiredo (Federal University of Rio de Janeiro) Efficient Correlated Topic Modeling with Topic Embedding Junxian He (Carnegie Mellon University & Shanghai Jiao Tong University), Zhiting Hu (Carnegie Mellon University & Petuum Inc.), Taylor Berg-Kirkpatrick (Carnegie Mellon University), Ying Huang (Shanghai Jiao Tong University), Eric P Xing (Carnegie Mellon University & Petuum Inc.)

KDD Panel, Suite 202­205 Moderators: Muthu Muthukrishnan (Rutgers University) and Andrew Tomkins (Google)

KDD Panel: The Future of Artificially Intelligent Assistants Muthu Muthukrishnan (Rutgers), Andrew Tomkins (Google), Larry Heck (Google), Alborz Geramifard (Amazon), Deepak Agarwal (LinkedIn)

Wednesday 3:30pm ­ 4:00pm, Coffee Break, Level 2 and 3 Foyer / Exhibit Hall

Wednesday 4:00pm ­ 6:00pm

Applied Data Science Track Session AT6: Urban Planning, Room 200C Chair: Karl Ni

No Longer Sleeping with a Bomb: A Duet System for Protecting Urban Safety from Dangerous Goods Jingyuan Wang (Beihang University), Chao Chen (Beihang University), Junjie Wu (Beihang University), Zhang Xiong (Beihang University) Planning Bike Lanes based on Sharing-Bikes' Trajectories Jie Bao (Microsoft Research), Tianfu He (Harbin Institute of Technology), Sijie Ruan (Xidian University), Yanhua Li (WPI), Yu Zheng (Microsoft Research)

The Simpler The Better: A Unified Approach to Predicting Original Taxi Demands based on Large­Scale Online Platforms Yongxin Tong (Beihang University), Yuqiang Chen (4Paradigm Inc.), Zimu Zhou (ETH Zurich), Lei Chen (Hong Kong University of Science and Technology), Jie Wang (Didi Research Institute), Qiang Yang (4Paradigm Inc. & Hong Kong University of Science and Technology), Jieping Ye (Didi Research Institute), Weifeng Lv (Beihang University)

A Quasi­experimental Estimate of the Impact of P2P Transportation Platforms on Urban Consumer Patterns Zhe Zhang (Carnegie Mellon University), Beibei Li (Carnegie Mellon University) Using Convolutional Networks and Satellite Imagery to Identify Patterns in Urban Environments at a Large Scale Adrian Albert (Massachusetts Institute of Technology), Jasleen Kaur (Philips Lighting Research), Marta C. Gonzalez (Massachusetts Institute of Technology)

Applied Data Science Invited Session AI5: Panel on Benchmarks and Process Management in Data Science, Room 200D Moderator: Usama Fayyad

Benchmarks and Process Management in Data Science: Will We Ever Get Over this Mess? Usama M. Fayyad (Open Insights), Arno Candel (H2O.ai, Inc.), Eduardo Ariño de la Rubia (Domino Data Lab.), Szilárd Pafka (Epoch), Anthony Chong (IKASI), Jeong­Yoon Lee (Microsoft)

Research Track Session RT9: Matrices, Room 200E Chair: Austin Benson

Multi-Aspect Streaming Tensor Completion Qingquan Song (Texas A&M University), Xiao Huang (Texas A&M University), Hancheng Ge (Texas A&M University), James Caverlee (Texas A&M University), Xia Hu (Texas A&M University & Texas A&M Engineering Experiment Station) Discrete Content-aware Matrix Factorization Defu Lian (University of Electronic Science and Technology of China), Rui Liu (University of Electronic Science and Technology of China), Yong Ge (University of Arizona), Kai Zheng (University of Electronic Science and Technology of China), Xing Xie (Microsoft Research), Longbing Cao (University of Technology Sydney) SPARTan: Scalable PARAFAC2 for Large & Sparse Data Ioakeim Perros (Georgia Institute of Technology), Evangelos E. Papalexakis (University of California, Riverside), Fei Wang (Weill Cornell Medicine), Richard Vuduc (Georgia Institute of Technology), Elizabeth Searles (Children's Healthcare Of Atlanta), Michael Thompson (Children's Healthcare Of Atlanta), Jimeng Sun (Georgia Institute of Technology) Unsupervised Network Discovery for Brain Imaging Data Zilong Bai (University of California, Davis), Peter Walker (Naval Medical Research Center), Anna Tschiffely (Naval Medical Research Center), Fei Wang (Cornell University), Ian Davidson (University of California, Davis) Randomization or Condensation? Linear-Cost Matrix Sketching Via Cascaded Compression Sampling Kai Zhang (Temple University), Chuanren Liu (Drexel University), Jie Zhang (Fudan University), Hui Xiong (Rutgers University), Eric Xing (Carnegie Mellon University), Jieping Ye (University of Michigan, Ann Arbor)

Research Track Session RT10: Clustering, Suite 202­205 Chair: Martin Ester

Towards an Optimal Subspace for K­Means Dominik Mautz (Ludwig­Maximilians­Universität München), Wei Ye (Ludwig­Maximilians­Universität München), Claudia Plant (University of Vienna), Christian Böhm (Ludwig­Maximilians­Universität München) Clustering Individual Transactional Data for Masses of Users Riccardo Guidotti (ISTI­CNR & University of Pisa), Anna Monreale (University of Pisa), Mirco Nanni (ISTI­CNR), Fosca Giannotti (ISTI­CNR), Dino Pedreschi (University of Pisa) Ego­Splitting Framework: from Non­Overlapping to Overlapping Clusters Alessandro Epasto (Google Research), Silvio Lattanzi (Google Research Zurich), Renato Paes Leme (Google Research) Local Higher­Order Graph Clustering Hao Yin (Stanford University), Austin R. Benson (Stanford University), Jure Leskovec (Stanford University), David F. Gleich (Purdue University) A Hierarchical Algorithm for Extreme Clustering Ari Kobren (University of Massachusetts Amherst), Nicholas Monath (University of Massachusetts Amherst), Akshay Krishnamurthy (University of Massachusetts Amherst), Andrew McCallum (University of Massachusetts Amherst)

Wednesday August 16, 4:00pm ­ 5:00pm, Poster Presenter Setup (Poster Group 2) – WTCC Room 100

Wednesday August 16, 7:00pm ­ 10:00pm, Poster Reception (Poster Group 2) – WTCC Room 100

Poster 1: Peeking at A/B Tests Ramesh Johari (Stanford University), Pete Koomen (Optimizely, Inc.), Leonid Pekelis (Optimizely, Inc.), David Walsh (Stanford University)

Poster 2: KATE: K­Competitive Autoencoder for Text Yu Chen (Rensselaer Polytechnic Institute), Mohammed J Zaki (Rensselaer Polytechnic Institute)

Poster 3: PPDsparse: A Parallel Primal­Dual Sparse Method for Extreme Classification Ian E.H. Yen (Carnegie Mellon University), Xiangru Huang (University of Texas at Austin), Wei Dai (Carnegie Mellon University & Petuum Inc.), Pradeep Ravikumar (Carnegie Mellon University), Inderjit Dhillon (University of Texas at Austin), Eric Xing (Carnegie Mellon University & Petuum Inc.)

Poster 4: A Hierarchical Algorithm for Extreme Clustering Ari Kobren (University of Massachusetts Amherst), Nicholas Monath (University of Massachusetts Amherst), Akshay Krishnamurthy (University of Massachusetts Amherst), Andrew McCallum (University of Massachusetts Amherst)

Poster 5: Toeplitz Inverse Covariance­Based Clustering of Multivariate Time Series Data David Hallac (Stanford University), Sagar Vare (Stanford University), Stephen Boyd (Stanford University), Jure Leskovec (Stanford University)

Poster 6: Ego­Splitting Framework: from Non­Overlapping to Overlapping Clusters Alessandro Epasto (Google Research), Silvio Lattanzi (Google Research Zurich), Renato Paes Leme (Google Research)

Poster 7: Clustering Individual Transactional Data for Masses of Users Riccardo Guidotti (ISTI­CNR & University of Pisa), Anna Monreale (University of Pisa), Mirco Nanni (ISTI­CNR), Fosca Giannotti (ISTI­CNR), Dino Pedreschi (University of Pisa)

Poster 8: Towards an Optimal Subspace for K­Means Dominik Mautz (Ludwig­Maximilians­Universität München), Wei Ye (Ludwig­Maximilians­Universität München), Claudia Plant (University of Vienna), Christian Böhm (Ludwig­Maximilians­Universität München)

Poster 9: MARAS: Signaling Multi­Drug Adverse Reactions Xiao Qin (Worcester Polytechnic Institute), Tabassum Kakar (Worcester Polytechnic Institute), Susmitha Wunnava (Worcester Polytechnic Institute), Elke A Rundensteiner (Worcester Polytechnic Institute), Lei Cao (Massachusetts Institute of Technology)

Poster 10: A Century of Science: Globalization of Scientific Collaborations, Citations, and Innovations Yuxiao Dong (Microsoft Research), Hao Ma (Microsoft Research), Zhihong Shen (Microsoft Research), Kuansan Wang (Microsoft Research)

Poster 11: Discovering Reliable Approximate Functional Dependencies Panagiotis Mandros (Max Planck Institute for Informatics & Saarland University), Mario Boley (Max Planck Institute for Informatics & Saarland University), Jilles Vreeken (Max Planck Institute for Informatics & Saarland University)

Poster 12: A Minimal Variance Estimator for the Cardinality of Big Data Set Intersection Reuven Cohen (Technion), Liran Katzir (Technion), Aviv Yehezkel (Technion)

Poster 13: The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables Himabindu Lakkaraju (Stanford University), Jon Kleinberg (Cornell University), Jure Leskovec (Stanford University), Jens Ludwig (University of Chicago), Sendhil Mullainathan (Harvard University)

Poster 14: Tracking the Dynamics in Crowdfunding Hongke Zhao (University of Science and Technology of China), Hefu Zhang (University of Science and Technology of China), Yong Ge (University of Arizona), Qi Liu (University of Science and Technology of China), Enhong Chen (University of Science and Technology of China), Huayu Li (University of North Carolina at Charlotte), Le Wu (Hefei University of Technology)

Poster 15: Human Mobility Synchronization and Trip Purpose Detection with Mixture of Hawkes Processes Pengfei Wang (Chinese Academy of Sciences), Yanjie Fu (Missouri University of Science and Technology), Guannan Liu (Beihang University), Wenqing Hu (Missouri University of Science and Technology), Charu Aggarwal (IBM T. J. Watson Research Center)

Poster 16: Matrix Profile V: A Generic Technique to Incorporate Domain Knowledge into Motif Discovery Hoang Anh Dau (University of California Riverside), Eamonn Keogh (University of California Riverside)

Poster 17: Constructivism Learning: A Learning Paradigm for Transparent Predictive Analytics Xiaoli Li (University of Kansas), Jun Huan (University of Kansas)

Poster 18: Scalable and Sustainable Deep Learning via Randomized Hashing Ryan Spring (Rice University), Anshumali Shrivastava (Rice University)

Poster 19: Anomaly Detection with Robust Deep Autoencoders Chong Zhou (Worcester Polytechnic Institute), Randy C. Paffenroth (Worcester Polytechnic Institute)

Poster 20: Structural Deep Brain Network Mining Shen Wang (University of Illinois at Chicago), Lifang He (Shenzhen University), Bokai Cao (University of Illinois at Chicago), Chun­Ta Lu (University of Illinois at Chicago), Philip S Yu (University of Illinois at Chicago), Ann B. Ragin (Northwestern University)

Poster 21: Deep Embedding Forest: Forest-based Serving with Deep Embedding Features Jie Zhu (Microsoft Corporation), Ying Shan (Microsoft Corporation), JC Mao (Microsoft Corporation), Dong Yu (Microsoft Corporation), Holakou Rahmanian (University of California, Santa Cruz), Yi Zhang (Microsoft Corporation)

Poster 22: Accelerating Innovation Through Analogy Mining Tom Hope (Hebrew University of Jerusalem), Joel Chan (Carnegie Mellon University), Aniket Kittur (Carnegie Mellon University), Dafna Shahaf (Hebrew University of Jerusalem)

Poster 23: DeepSD: Generating High Resolution Climate Change Projections through Single Image Super­Resolution Thomas Vandal (Northeastern University), Evan Kodra (risQ Inc.), Sangram Ganguly (Bay Area Environmental Research Institute / NASA Ames Research Center), Andrew Michaelis (University Corporation, Monterey Bay), Ramakrishna Nemani (NASA Advanced Supercomputing Division / NASA Ames Research Center), Auroop R Ganguly (Northeastern University)

Poster 24: Using Convolutional Networks and Satellite Imagery to Identify Patterns in Urban Environments at a Large Scale Adrian Albert (Massachusetts Institute of Technology), Jasleen Kaur (Philips Lighting Research), Marta C. Gonzalez (Massachusetts Institute of Technology)

Poster 25: Patient Subtyping via Time­Aware LSTM Networks Inci M Baytas (Michigan State University), Cao Xiao (IBM T. J. Watson Research Center), Xi Zhang (Cornell University), Fei Wang (Cornell University), Anil K Jain (Michigan State University), Jiayu Zhou (Michigan State University)

Poster 26: TrioVecEvent: Embedding­Based Online Local Event Detection in Geo­Tagged Tweet Streams Chao Zhang (University of Illinois at Urbana­Champaign), Liyuan Liu (University of Illinois at Urbana­Champaign), Dongming Lei (University of Illinois at Urbana­Champaign), Quan Yuan (University of Illinois at Urbana­Champaign), Honglei Zhuang (University of Illinois at Urbana­Champaign), Tim Hanratty (U.S. Army Research Lab), Jiawei Han (University of Illinois at Urbana­Champaign)

Poster 27: Luck is Hard to Beat: The Difficulty of Sports Prediction Raquel Y S Aoki (Universidade Federal de Minas Gerais), Renato M Assuncao (Universidade Federal de Minas Gerais), Pedro O S Vaz de Melo (Universidade Federal de Minas Gerais)

Poster 28: Effective and Real­time In­App Activity Analysis in Encrypted Internet Traffic Streams Junming Liu (Rutgers University), Yanjie Fu (Missouri University of Science and Technology), Jingci Ming (Rutgers University), Yong Ren (Futurewei Tech. Inc), Leilei Sun (Tsinghua University), Hui Xiong (Rutgers University)

Poster 29: Developing a Comprehensive Framework for Multimodal Feature Extraction Quinten McNamara (University of Texas at Austin), Alejandro De La Vega (University of Texas at Austin), Tal Yarkoni (University of Texas at Austin)

Poster 30: Randomized Feature Engineering as a Fast and Accurate Alternative to Kernel Methods Suhang Wang (Arizona State University), Charu Aggarwal (IBM T. J. Watson Research Center), Huan Liu (Arizona State University)

Poster 31: Estimating Treatment Effect in the Wild via Differentiated Confounder Balancing Kun Kuang (Tsinghua University), Peng Cui (Tsinghua University), Bo Li (Tsinghua University), Meng Jiang (University of Notre Dame), Shiqiang Yang (Tsinghua University)

Poster 32: Groups­Keeping Solution Path Algorithm for Sparse Regression with Automatic Feature Grouping Bin Gu (University of Texas at Arlington), Guodong Liu (University of Texas at Arlington), Heng Huang (University of Texas at Arlington)

Poster 33: SPARTan: Scalable PARAFAC2 for Large & Sparse Data Ioakeim Perros (Georgia Institute of Technology), Evangelos E. Papalexakis (University of California, Riverside), Fei Wang (Weill Cornell Medicine), Richard Vuduc (Georgia Institute of Technology), Elizabeth Searles (Children's Healthcare Of Atlanta), Michael Thompson (Children's Healthcare Of Atlanta), Jimeng Sun (Georgia Institute of Technology)

Poster 34: Interpretable Predictions of Tree­based Ensembles via Actionable Feature Tweaking Gabriele Tolomei (Yahoo Research), Fabrizio Silvestri (Facebook), Andrew Haines (Yahoo Research), Mounia Lalmas (Yahoo Research)

Poster 35: KunPeng: Parameter Server based Distributed Learning Systems and Its Applications in Alibaba and Ant Financial Jun Zhou (Ant Financial Services Group), Xiaolong Li (Ant Financial Services Group), Peilin Zhao (Ant Financial Services Group), Chaochao Chen (Ant Financial Services Group), Longfei Li (Ant Financial Services Group), Xinxing Yang (Ant Financial Services Group), Qing Cui (Alibaba Cloud), Jin Yu (Alibaba Cloud), Xu Chen (Alibaba Cloud), Yi Ding (Alibaba Cloud), Yuan Alan Qi (Ant Financial Services Group)

Poster 36: HinDroid: An Intelligent Android Malware Detection System Based on Structured Heterogeneous Information Network Shifu Hou (West Virginia University), Yanfang Ye (West Virginia University), Yangqiu Song (HKUST), Melih Abdulhayoglu (Comodo Security Solutions, Inc.)

Poster 37: A Dirty Dozen: Twelve Common Metric Interpretation Pitfalls in Online Controlled Experiments Pavel Dmitriev (Microsoft Corporation), Somit Gupta (Microsoft Corporation), Dong Woo Kim (Microsoft Corporation), Garnet Vaz (Microsoft Corporation)

Poster 38: A Data Mining Framework for Valuing Large Portfolios of Variable Annuities Guojun Gan (University of Connecticut), Jimmy Xiangji Huang (York University)

Poster 39: FORA: Simple and Effective Approximate Single­Source Personalized PageRank Sibo Wang (University of Queensland & Nanyang Technological University), Renchi Yang (Nanyang Technological University), Xiaokui Xiao (Nanyang Technological University), Zhewei Wei (Renmin University of China & Nanyang Technological University), Yin Yang (Hamad Bin Khalifa University)

Poster 40: Graph Edge Partitioning via Neighborhood Heuristic Chenzi Zhang (University of Hong Kong & Noah’s Ark Lab), Fan Wei (Stanford University), Qin Liu (Huawei Noah’s Ark Lab & Chinese University of Hong Kong), Zhihao Gavin Tang (University of Hong Kong), Zhenguo Li (Huawei Noah’s Ark Lab)

Poster 41: metapath2vec: Scalable Representation Learning for Heterogeneous Networks Yuxiao Dong (Microsoft Research & University of Notre Dame), Nitesh V Chawla (University of Notre Dame), Ananthram Swami (Army Research Laboratory)

Poster 42: Weisfeiler­Lehman Neural Machine for Link Prediction Muhan Zhang (Washington University in St. Louis), Yixin Chen (Washington University in St. Louis)

Poster 43: A Local Algorithm for Structure­Preserving Graph Cut Dawei Zhou (Arizona State University), Si Zhang (Arizona State University), Mehmet Yigit Yildirim (Arizona State University), Scott Alcorn (Early Warnings LLC.), Hanghang Tong (Arizona State University), Hasan Davulcu (Arizona State University), Jingrui He (Arizona State University)

Poster 44: struc2vec: Learning Node Representations from Structural Identity Leonardo F. R. Ribeiro (Federal University of Rio de Janeiro), Pedro H. P. Saverese (Federal University of Rio de Janeiro), Daniel R. Figueiredo (Federal University of Rio de Janeiro)

Poster 45: Network Inference via the Time­Varying Graphical Lasso David Hallac (Stanford University), Youngsuk Park (Stanford University), Stephen Boyd (Stanford University), Jure Leskovec (Stanford University)

Poster 46: PReP: Path­Based Relevance from a Probabilistic Perspective in Heterogeneous Information Networks Yu Shi (University of Illinois at Urbana­Champaign), Po­Wei Chan (University of Illinois at Urbana­Champaign), Honglei Zhuang (University of Illinois at Urbana­Champaign), Huan Gui (University of Illinois at Urbana­Champaign), Jiawei Han (University of Illinois at Urbana­Champaign)

Poster 47: Improved Degree Bounds and Full Spectrum Power Laws in Preferential Attachment Networks Chen Avin (Ben Gurion University of the Negev), Zvi Lotker (Ben Gurion University of the Negev), Yinon Nahum (Weizmann Institute of Science), David Peleg (Weizmann Institute of Science)

Poster 48: Fast Enumeration of Large k­Plexes Alessio Conte (University of Pisa), Donatella Firmani (Roma Tre University), Caterina Mordente (Be Think Solve Execute), Maurizio Patrignani (Roma Tre University), Riccardo Torlone (Roma Tre University)

Poster 49: PNP: Fast Path Ensemble Method for Movie Design Danai Koutra (University of Michigan), Abhilash Dighe (University of Michigan), Smriti Bhagat (Facebook & Technicolor), Udi Weinsberg (Facebook & Technicolor), Stratis Ioannidis (Northeastern University), Christos Faloutsos (Carnegie Mellon University), Jean Bolot (Technicolor)

Poster 50: FIRST: Fast Interactive Attributed Subgraph Matching Boxin Du (Arizona State University), Si Zhang (Arizona State University), Nan Cao (Tongji University), Hanghang Tong (Arizona State University)

Poster 51: Estimation of Recent Ancestral Origins of Individuals on a Large Scale Ross E Curtis (AncestryDNA), Ahna R Girshick (AncestryDNA)

Poster 52: A Practical Exploration System for Search Advertising Parikshit Shah (Yahoo Research), Ming Yang (Yahoo), Sachidanand Alle (Yahoo), Adwait Ratnaparkhi (Yahoo Research), Ben Shahshahani (Yahoo Research), Rohit Chandra (Yahoo)

Poster 53: Local Higher­Order Graph Clustering Hao Yin (Stanford University), Austin R. Benson (Stanford University), Jure Leskovec (Stanford University), David F. Gleich (Purdue University)

Poster 54: Meta­Graph Based Recommendation Fusion over Heterogeneous Information Networks Huan Zhao (Hong Kong University of Science and Technology), Quanming Yao (Hong Kong University of Science and Technology), Jianda Li (Hong Kong University of Science and Technology), Yangqiu Song (Hong Kong University of Science and Technology), Dik Lun Lee (Hong Kong University of Science and Technology)

Poster 55: AnnexML: Approximate Nearest Neighbor Search for Extreme Multi­label Classification Yukihiro Tagami (Yahoo Japan Corporation & Kyoto University)

Poster 56: MOLIERE: Automatic Biomedical Hypothesis Generation System Justin Sybrandt (Clemson University), Michael Shtutman (University of South Carolina), Ilya Safro (Clemson University)

Poster 57: Pharmacovigilance via Baseline Regularization with Large­Scale Longitudinal Observational Data Zhaobin Kuang (University of Wisconsin­Madison), Peggy Peissig (Marshfield Clinic), Vitor Santos Costa (Universidade do Porto), Richard Maclin (University of Minnesota­Duluth), David Page (University of Wisconsin­Madison)

Poster 58: Predicting Clinical Outcomes Across Changing Electronic Health Record Systems Jen J Gong (Massachusetts Institute of Technology), Tristan Naumann (Massachusetts Institute of Technology), Peter Szolovits (Massachusetts Institute of Technology), John V. Guttag (Massachusetts Institute of Technology)

Poster 59: Prognosis and Diagnosis of Parkinson's Disease Using Multi­Task Learning Saba Emrani (SAS Institute Inc.), Anya McGuirk (SAS Institute Inc.), Wei Xiao (SAS Institute Inc.)

Poster 60: GELL: Automatic Extraction of Epidemiological Line Lists from Open Sources Saurav Ghosh (Virginia Tech), Prithwish Chakraborty (Virginia Tech), Bryan L. Lewis (Virginia Tech), Maimuna S. Majumder (Massachusetts Institute of Technology & Boston Children’s Hospital), Emily Cohn (Boston Children's Hospital), John S. Brownstein (Boston Children’s Hospital), Madhav V. Marathe (Virginia Tech), Naren Ramakrishnan (Virginia Tech)

Poster 61: Long Short Memory Process: Modeling Growth Dynamics of Microscopic Social Connectivity Chengxi Zang (Tsinghua University), Peng Cui (Tsinghua University), Christos Faloutsos (Carnegie Mellon University), Wenwu Zhu (Tsinghua University)

Poster 62: Unsupervised Network Discovery for Brain Imaging Data Zilong Bai (University of California, Davis), Peter Walker (Naval Medical Research Center), Anna Tschiffely (Naval Medical Research Center), Fei Wang (Cornell University), Ian Davidson (University of California, Davis)

Poster 63: Coresets for Kernel Regression Yan Zheng (University of Utah), Jeff M. Phillips (University of Utah)

Poster 64: Linearized GMM Kernels and Normalized Random Fourier Features Ping Li (Rutgers University)

Poster 65: Communication­Efficient Distributed Block Minimization for Nonlinear Kernel Machines Cho­Jui Hsieh (University of California, Davis), Si Si (Google Inc. & Google Research), Inderjit S. Dhillon (University of Texas at Austin)

Poster 66: Randomization or Condensation? Linear-Cost Matrix Sketching Via Cascaded Compression Sampling Kai Zhang (Temple University), Chuanren Liu (Drexel University), Jie Zhang (Fudan University), Hui Xiong (Rutgers University), Eric Xing (Carnegie Mellon University), Jieping Ye (University of Michigan, Ann Arbor)

Poster 67: Discrete Content­aware Matrix Factorization Defu Lian (University of Electronic Science and Technology of China), Rui Liu (University of Electronic Science and Technology of China), Yong Ge (University of Arizona), Kai Zheng (University of Electronic Science and Technology of China), Xing Xie (Microsoft Research), Longbing Cao (University of Technology Sydney)

Poster 68: Functional Annotation of Human Protein Coding Isoforms via Non­convex Multi­Instance Learning Tingjin Luo (National University of Defense Technology), Weizhong Zhang (Zhejiang University), Shang Qiu (University of Michigan), Yang Yang (Beihang University), Dongyun Yi (National University of Defense Technology), Guangtao Wang (University of Michigan), Jieping Ye (University of Michigan), Jie Wang (University of Michigan)

Poster 69: Collaboratively Improving Topic Discovery and Word Embeddings by Coordinating Global and Local Contexts Guangxu Xun (SUNY at Buffalo), Yaliang Li (SUNY at Buffalo & Baidu Research Big Data Lab), Jing Gao (SUNY at Buffalo), Aidong Zhang (SUNY at Buffalo)

Poster 70: Efficient Correlated Topic Modeling with Topic Embedding Junxian He (Carnegie Mellon University & Shanghai Jiao Tong University), Zhiting Hu (Carnegie Mellon University & Petuum Inc.), Taylor Berg-Kirkpatrick (Carnegie Mellon University), Ying Huang (Shanghai Jiao Tong University), Eric P Xing (Carnegie Mellon University & Petuum Inc.)

Poster 71: Contextual Motifs Ian Fox (University of Michigan), Lynn Ang (University of Michigan), Mamta Jaiswal (University of Michigan), Rodica Pop­Busui (University of Michigan), Jenna Wiens (University of Michigan)

Poster 72: EmbedJoin: Efficient Edit Similarity Joins via Embeddings Haoyu Zhang (Indiana University Bloomington), Qin Zhang (Indiana University Bloomington)

Poster 73: Learning Certifiably Optimal Rule Lists Elaine Angelino (University of California, Berkeley), Nicholas Larus­Stone (Harvard University), Daniel Alabi (Harvard University), Margo Seltzer (Harvard University), Cynthia Rudin (Duke University)

Poster 74: Is the Whole Greater Than the Sum of Its Parts? Liangyue Li (Arizona State University), Hanghang Tong (Arizona State University), Yong Wang (Hong Kong University of Science and Technology), Conglei Shi (IBM Research), Nan Cao (Tongji University), Norbou Buchler (US Army Research Laboratory)

Poster 75: Google Vizier: A Service for Black­Box Optimization Daniel Golovin (Google Research), Benjamin Solnik (Google Research), Subhodeep Moitra (Google Research), Greg Kochanski (Google Research), John Karro (Google Research), D. Sculley (Google Research)

Poster 76: Quick Access: Building a Smart Experience for Google Drive Sandeep Tata (Google USA), Alexandrin Popescul (Google USA), Marc Najork (Google USA), Mike Colagrosso (Google USA), Julian Gibbons (Google Australia), Alan Green (Google Australia), Alexandre Mah (Google Australia), Michael Smith (Google Australia), Divanshu Garg (Google Australia), Cayden Meyer (Google Australia), Reuben Kan (Google Australia)

Poster 77: TFX: A TensorFlow­Based Production­Scale Machine Learning Platform Denis Baylor (Google Inc.), Eric Breck (Google Inc.), Heng­Tze Cheng (Google Inc.), Noah Fiedel (Google Inc.), Chuan Yu Foo (Google Inc.), Zakaria Haque (Google Inc.), Salem Haykal (Google Inc.), Mustafa Ispir (Google Inc.), Vihan Jain (Google Inc.), Levent Koc (Google Inc.), Chiu Yuen Koo (Google Inc.), Lukasz Lew (Google Inc.), Clemens Mewald (Google Inc.), Akshay Naresh Modi (Google Inc.), Neoklis Polyzotis (Google Inc.), Sukriti Ramesh (Google Inc.), Sudip Roy (Google Inc.), Steven Euijong Whang (Google Inc.), Martin Wicke (Google Inc.), Jarek Wilkiewicz (Google Inc.), Xin Zhang (Google Inc.), Martin Zinkevich (Google Inc.)

Poster 78: A Quasi­experimental Estimate of the Impact of P2P Transportation Platforms on Urban Consumer Patterns Zhe Zhang (Carnegie Mellon University), Beibei Li (Carnegie Mellon University)

Poster 79: HoORaYs: High­order Optimization of Rating Distance for Recommender Systems Jingwei Xu (Nanjing University), Yuan Yao (Nanjing University), Hanghang Tong (Arizona State University), Xianping Tao (Nanjing University), Jian Lu (Nanjing University)

Poster 80: Large­scale Collaborative Ranking in Near­Linear Time Liwei Wu (University of California, Davis), Cho­Jui Hsieh (University of California, Davis), James Sharpnack (University of California, Davis)

Poster 81: Collaborative Variational Autoencoder for Recommender Systems Xiaopeng Li (Hong Kong University of Science and Technology), James She (Hong Kong University of Science and Technology)

Poster 82: LiJAR: A System for Job Application Redistribution towards Efficient Career Marketplace Fedor Borisyuk (LinkedIn Corporation), Liang Zhang (LinkedIn Corporation), Krishnaram Kenthapadi (LinkedIn Corporation)

Poster 83: Cascade Ranking for Operational E­commerce Search Shichen Liu (Alibaba Group), Fei Xiao (Alibaba Group), Wenwu Ou (Alibaba Group), Luo Si (Alibaba Group)

Poster 84: Unsupervised P2P Rental Recommendations via Integer Programming Yanjie Fu (Missouri University of Science and Technology), Guannan Liu (Beihang University), Mingfei Teng (Rutgers University), Charu Aggarwal (IBM T. J. Watson Research Center)

Poster 85: The Simpler The Better: A Unified Approach to Predicting Original Taxi Demands based on Large­Scale Online Platforms Yongxin Tong (Beihang University), Yuqiang Chen (4Paradigm Inc.), Zimu Zhou (ETH Zurich), Lei Chen (Hong Kong University of Science and Technology), Jie Wang (Didi Research Institute), Qiang Yang (4Paradigm Inc. & Hong Kong University of Science and Technology), Jieping Ye (Didi Research Institute), Weifeng Lv (Beihang University)

Poster 86: Deep Choice Model Using Pointer Networks for Airline Itinerary Prediction Alejandro Mottini (Amadeus SAS), Rodrigo Acuna­Agost (Amadeus SAS)

Poster 87: Backpage and Bitcoin: Uncovering Human Traffickers Rebecca S Portnoff (University of California, Berkeley), Danny Yuxing Huang (University of California, San Diego), Periwinkle Doerfler (New York University), Sadia Afroz (ICSI), Damon McCoy (New York University)

Poster 88: Planning Bike Lanes based on Sharing­Bikes' Trajectories Jie Bao (Microsoft Research), Tianfu He (Harbin Institution of Technology), Sijie Ruan (Xidian University), Yanhua Li (WPI), Yu Zheng (Microsoft Research)

Poster 89: The Co­Evolution Model for Social Network Evolving and Opinion Migration Yupeng Gu (University of California, Los Angeles), Yizhou Sun (University of California, Los Angeles), Jianxi Gao (Northeastern University)

Poster 90: On Finding Socially Tenuous Groups for Online Social Networks Chih­Ya Shen (National Tsing Hua University), Liang­Hao Huang (Academia Sinica), De­Nian Yang (Academia Sinica), Hong­Han Shuai (National Chiao Tung University), Wang­Chien Lee (Pennsylvania State University), Ming­Syan Chen (National Taiwan University)

Poster 91: Compass: Spatio Temporal Sentiment Analysis of US Election Debjyoti Paul (University of Utah), Feifei Li (University of Utah), Murali Krishna Teja (University of Utah), Xin Yu (University of Utah), Richie Frost (University of Utah)

Poster 92: HyperLogLog Hyperextended: Sketches for Concave Sublinear Frequency Statistics Edith Cohen (Google Research)

Poster 93: Multi­Aspect Streaming Tensor Completion Qingquan Song (Texas A&M University), Xiao Huang (Texas A&M University), Hancheng Ge (Texas A&M University), James Caverlee (Texas A&M University), Xia Hu (Texas A&M University & Texas A&M Engineering Experiment Station)

Poster 94: Robust Top­k Multiclass SVM for Visual Category Recognition Xiaojun Chang (Carnegie Mellon University), Yao­Liang Yu (University of Waterloo), Yi Yang (University of Technology Sydney)

Poster 95: Not All Passes Are Created Equal: Objectively Measuring The Risk and Reward of Passes in Soccer from Tracking Data Paul Power (STATS), Hector Ruiz (STATS), Xinyu Wei (STATS), Patrick Lucey (STATS)

Poster 96: No Longer Sleeping with a Bomb: A Duet System for Protecting Urban Safety from Dangerous Goods Jingyuan Wang (Beihang University), Chao Chen (Beihang University), Junjie Wu (Beihang University), Zhang Xiong (Beihang University)

Poster 97: Similarity Forests Saket Sathe (IBM T. J. Watson Research Center), Charu C Aggarwal (IBM T. J. Watson Research Center)

Poster 98: A Data Science Approach to Understanding Residential Water Contamination in Flint Alex Chojnacki (University of Michigan), Chengyu Dai (University of Michigan), Arya Farahi (University of Michigan), Guangsha Shi (University of Michigan), Jared Webb (Brigham Young University), Daniel T. Zhang (University of Michigan), Jacob Abernethy (University of Michigan), Eric Schwartz (University of Michigan)

Poster 99: Online Ranking with Constraints: A Primal­Dual Algorithm and Applications to Web Traffic­Shaping Parikshit Shah (Yahoo Research), Akshay Soni (Yahoo Research), Troy Chevalier (Yahoo Research)

Poster 100: FLAP: An End­to­End Event Log Analysis Platform for System Management Tao Li (Nanjing University of Posts and Telecommunications), Yexi Jiang (Florida International University), Chunqiu Zeng (Florida International University), Bin Xia (Nanjing University of Posts and Telecommunications), Zheng Liu (Nanjing University of Posts and Telecommunications), Wubai Zhou (Florida International University), Xiaolong Zhu (Florida International University), Wentao Wang (Florida International University), Liang Zhang (Huawei Nanjing Research and Development Center), Jun Wu (Huawei Nanjing Research and Development Center), Li Xue (Huawei Nanjing Research and Development Center), Dewei Bao (Huawei Nanjing Research and Development Center)

Poster 101: KDD Cup: Travel Time Prediction by team “Convolution” (1st place) Ke Hu (Microsoft China Co., Ltd.), Huan Chen (Beihang University), Pan Huang (Microsoft China Co., Ltd.), Peng Yan (Meituan-Dianping Co., Ltd.)

Poster 102: KDD Cup: Travel Time Prediction by team “Longing for a Teammate” (2nd place) Huang Yide (Zhejiang University)

Poster 103: KDD Cup: Travel Time Prediction by team “Warriors” (3rd place) Hengxing Cai (Sun Yat-sen University & Guangdong Key Laboratory of Intelligent Transportation Systems), Runxing Zhong (Beihang University), Chaohe Wang (Southwest Jiaotong University & Intel), Ruihuan Zhou (Southwest Jiaotong University), Kejie Zhou (University of Chinese Academy of Sciences), Hongyun Lee (University of Chinese Academy of Sciences), Kele Xu (National University of Defense Technology), Zhifeng Gao (Peking University), Renxin Zhong (Sun Yat-sen University & Guangdong Key Laboratory of Intelligent Transportation Systems), Jiachen Luo (Sun Yat-sen University & Guangdong Key Laboratory of Intelligent Transportation Systems), Yao Zhou (Chongqing University of Posts and Telecommunications & Tencent), Ming Ding (China-Power Information Technology Co. Ltd), Lang Li (ChinaTelecom Bestpay Co. Ltd), Qiang Li (Fudan University), Da Li (Beihang University), Nan Jiang (Beihang University), Xu Cheng (China Mobile Communications Corporation), Shiwen Cui (Harbin Engineering University), Hongfei Ye (Shanghai Jiao Tong University), Jiawei Shen (Shanghai China-Cubee Information Technology Co. Ltd)

Poster 104: KDD Cup: Volume Prediction by team “Convolution” (1st place) Ke Hu (Microsoft China Co., Ltd.), Huan Chen (Beihang University), Pan Huang (Microsoft China Co., Ltd.), Peng Yan (Meituan­Dianping Co., Ltd.)

Poster 105: KDD Cup: Travel Time Prediction by team “Black­Swan” (2nd place) Yitian Chen (JD.COM), Jie Zhou (East China Normal University), Jie Lin (Nanjing University of Science and Technology), Hao Lin (Tencent), Yang Guo (JD.COM)

Poster 106: KDD Cup: Travel Time Prediction by team “CarTrailBlazer” (3rd place) Suiqian Luo (Guazi.com)

Thursday August 17, 2017 Detailed Program

Thursday 8:30am - 3:00pm, Registration, Registration Desk - Level 1 Atrium

Thursday 7:00am - 8:00am, KDD Breakfast - Level 3 Foyer

Thursday 9:30am - 12:30pm, KDD Exhibit Hall - Exhibit Hall

Thursday 9:30am - 12:00pm, KDD Sponsor Room - Level 8 Meeting Room 4

Thursday 8:00am - 9:30am, Scotiabank Centre

Keynote Session 3: The Future of Data Integration

Speaker: Renée Miller, Professor - University of Toronto

Chaired by Stan Matwin

Abstract: The value of data explodes when it is integrated. In this talk, I present some important innovations in data integration over the last two decades. These include data exchange, which provides a foundation for reasoning about the correctness of transformed data, and the use of declarative mappings in integration. I discuss how data mining has been used to facilitate data integration and present some important new data integration challenges that arise in data science.

Thursday 9:30am - 10:00am, Coffee Break, Level 2 and 3 Foyer / Exhibit Hall

Thursday 10:00am - 12:00pm

Applied Data Science Session AT7: Web Applications, Room 200C Chair: D. Sculley

Peeking at A/B Tests Ramesh Johari (Stanford University), Pete Koomen (Optimizely, Inc.), Leonid Pekelis (Optimizely, Inc.), David Walsh (Stanford University)

A Dirty Dozen: Twelve Common Metric Interpretation Pitfalls in Online Controlled Experiments Pavel Dmitriev (Microsoft Corporation), Somit Gupta (Microsoft Corporation), Dong Woo Kim (Microsoft Corporation), Garnet Vaz (Microsoft Corporation)

Cascade Ranking for Operational E-commerce Search Shichen Liu (Alibaba Group), Fei Xiao (Alibaba Group), Wenwu Ou (Alibaba Group), Luo Si (Alibaba Group)

A Practical Exploration System for Search Advertising Parikshit Shah (Yahoo Research), Ming Yang (Yahoo), Sachidanand Alle (Yahoo), Adwait Ratnaparkhi (Yahoo Research), Ben Shahshahani (Yahoo Research), Rohit Chandra (Yahoo)

LiJAR: A System for Job Application Redistribution towards Efficient Career Marketplace Fedor Borisyuk (LinkedIn Corporation), Liang Zhang (LinkedIn Corporation), Krishnaram Kenthapadi (LinkedIn Corporation)

Research Track Session RT11: Supervised Learning II, Room 200E Chair: Johannes Fürnkranz

Functional Annotation of Human Protein Coding Isoforms via Non-convex Multi-Instance Learning Tingjin Luo (National University of Defense Technology), Weizhong Zhang (Zhejiang University), Shang Qiu (University of Michigan), Yang Yang (Beihang University), Dongyun Yi (National University of Defense Technology), Guangtao Wang (University of Michigan), Jieping Ye (University of Michigan), Jie Wang (University of Michigan)

Structural Deep Brain Network Mining Shen Wang (University of Illinois at Chicago), Lifang He (Shenzhen University), Bokai Cao (University of Illinois at Chicago), Chun-Ta Lu (University of Illinois at Chicago), Philip S Yu (University of Illinois at Chicago), Ann B. Ragin (Northwestern University)

Is the Whole Greater Than the Sum of Its Parts? Liangyue Li (Arizona State University), Hanghang Tong (Arizona State University), Yong Wang (Hong Kong University of Science and Technology), Conglei Shi (IBM Research), Nan Cao (Tongji University), Norbou Buchler (US Army Research Laboratory)

Estimating Treatment Effect in the Wild via Differentiated Confounder Balancing Kun Kuang (Tsinghua University), Peng Cui (Tsinghua University), Bo Li (Tsinghua University), Meng Jiang (University of Notre Dame), Shiqiang Yang (Tsinghua University)

Deep Embedding Forest: Forest-based Serving with Deep Embedding Features Jie Zhu (Microsoft Corporation), Ying Shan (Microsoft Corporation), JC Mao (Microsoft Corporation), Dong Yu (Microsoft Corporation), Holakou Rahmanian (University of California, Santa Cruz), Yi Zhang (Microsoft Corporation)

Research Track Session RT12: Humans and Crowds, Suite 202-205 Chair: Tim Weninger

Robust Top-k Multiclass SVM for Visual Category Recognition Xiaojun Chang (Carnegie Mellon University), Yao-Liang Yu (University of Waterloo), Yi Yang (University of Technology Sydney)

The Co-Evolution Model for Social Network Evolving and Opinion Migration Yupeng Gu (University of California, Los Angeles), Yizhou Sun (University of California, Los Angeles), Jianxi Gao (Northeastern University)

Tracking the Dynamics in Crowdfunding Hongke Zhao (University of Science and Technology of China), Hefu Zhang (University of Science and Technology of China), Yong Ge (University of Arizona), Qi Liu (University of Science and Technology of China), Enhong Chen (University of Science and Technology of China), Huayu Li (University of North Carolina at Charlotte), Le Wu (Hefei University of Technology)

Accelerating Innovation Through Analogy Mining Tom Hope (Hebrew University of Jerusalem), Joel Chan (Carnegie Mellon University), Aniket Kittur (Carnegie Mellon University), Dafna Shahaf (Hebrew University of Jerusalem)

Human Mobility Synchronization and Trip Purpose Detection with Mixture of Hawkes Processes Pengfei Wang (Chinese Academy of Sciences), Yanjie Fu (Missouri University of Science and Technology), Guannan Liu (Beihang University), Wenqing Hu (Missouri University of Science and Technology), Charu Aggarwal (IBM T. J. Watson Research Center)

Research Track Session RT13: Recommenders, Room 200D Chair: Leman Akoglu

HoORaYs: High-order Optimization of Rating Distance for Recommender Systems Jingwei Xu (Nanjing University), Yuan Yao (Nanjing University), Hanghang Tong (Arizona State University), Xianping Tao (Nanjing University), Jian Lu (Nanjing University)

Online Ranking with Constraints: A Primal-Dual Algorithm and Applications to Web Traffic-Shaping Parikshit Shah (Yahoo Research), Akshay Soni (Yahoo Research), Troy Chevalier (Yahoo Research)

Unsupervised P2P Rental Recommendations via Integer Programming Yanjie Fu (Missouri University of Science and Technology), Guannan Liu (Beihang University), Mingfei Teng (Rutgers University), Charu Aggarwal (IBM T. J. Watson Research Center)

Meta-Graph Based Recommendation Fusion over Heterogeneous Information Networks Huan Zhao (Hong Kong University of Science and Technology), Quanming Yao (Hong Kong University of Science and Technology), Jianda Li (Hong Kong University of Science and Technology), Yangqiu Song (Hong Kong University of Science and Technology), Dik Lun Lee (Hong Kong University of Science and Technology)

Large-scale Collaborative Ranking in Near-Linear Time Liwei Wu (University of California, Davis), Cho-Jui Hsieh (University of California, Davis), James Sharpnack (University of California, Davis)

Thursday 12:00pm - 12:30pm, Coffee Break, Level 2 Foyer

Thursday 12:30pm - 1:00pm, KDD 2017 Closing Session, Room 200CD

Thursday 2:00pm - 5:00pm, KDD Ocean Data Analytics Industry Connector, Level 8 - Summit Suite / Meeting Room 5

KDD 2017 Conference Organization

KDD 2017 Organizing Committee

General Chairs ● Stan Matwin (Dalhousie University) ● Shipeng Yu (LinkedIn)
Associate General Chair ● Faisal Farooq (IBM)
Research Track Program Chairs ● Ravi Kumar (Google) ● Tina Eliassi-Rad (Northeastern University)
Applied Data Science Track Program Chairs ● Roberto J. Bayardo (Google) ● Charles Elkan (Amazon & UCSD)
Applied Data Science Invited Talks Chairs ● Usama Fayyad (Open Insights) ● Evangelos Simoudis (Synapse Partners) ● Ashok Srivastava (Verizon)
Workshop Chairs ● Leman Akoglu (CMU) ● Tijl De Bie (Ghent University & University of Bristol)
Tutorial Chairs ● Yan Liu (University of Southern California) ● Jianyong Wang (Tsinghua University)
Hands-on Tutorial Chairs ● Hanghang Tong (Arizona State University) ● Ritwik Kumar (Apple)
Publications Chair ● Rómer Rosales (LinkedIn)
Poster Chairs ● Jing Gao (University at Buffalo) ● Xiaoguang Wang (Alibaba)
Local Arrangements Chair ● Evangelos Milios (Dalhousie University)
KDD Cup Chairs ● Lei Li (Toutiao Lab and Byte Dance) ● Danai Koutra (University of Michigan)
Panel Chairs ● Andrew Tomkins (Google) ● Muthu Muthukrishnan (Rutgers)
Media and Publicity Chairs ● Tiger Zhang (LinkedIn) ● Aijun An (York University)
Sponsorship Chairs ● Claudia Perlich (Dstillery) ● Diana Inkpen (University of Ottawa)
Best Paper Chairs ● Deepak Agarwal (LinkedIn) ● Jennifer Neville (Purdue University)
Doctoral Dissertation Award Chair ● Tanya Berger-Wolf (UIC)
Web Chair ● Derek Young (Amazon)
Innovation and Service Award Chair ● Wei Wang (UCLA)
Test-of-Time Paper Award Chair ● Qiang Yang (HKUST)
Social Network Chairs ● Jingrui He (Arizona State University) ● Xia “Ben” Hu (Texas A&M University)
Student Travel Awards Chair ● Qiaozhu Mei (University of Michigan)
Treasurer ● Ying Li (DataSpark)
Video and Streaming Chair ● Marko Grobelnik (IJS)
Job Matching Chair ● Jiliang Tang (Michigan State University)
Social Impact Chairs ● Mohak Shah (Bosch Research) ● Rayid Ghani (University of Chicago)

Research Track Senior Program Committee

Naoki Abe (IBM Research) Arindam Banerjee (University of Minnesota) Paul Bennett (Microsoft Research) Tanya Berger­Wolf (Department of Computer Science, University of Illinois at Chicago) Wray Buntine (Monash University) James Caverlee (Texas A&M University) Hong Cheng (The Chinese University of Hong Kong) Flavio Chierichetti (Sapienza University) Diane Cook (Washington State University) Cristian Danescu­Niculescu­Mizil (Cornell University) Anirban Dasgupta (IIT Gandhinagar) Ian Davidson (U.C. Davis) Martin Ester (SFU) Peter Flach (University of Bristol) Benjamin C. M. Fung (McGill University) Johannes Fürnkranz (TU Darmstadt) Thomas Gaertner (University of Nottingham) Aristides Gionis (Aalto University) David Gleich (Purdue University) Dimitrios Gunopulos (University of Athens) Stratis Ioannidis (Northeastern University) Panos Ipeirotis (New York University) George Karypis (University of Minnesota) Eamonn Keogh (UC Riverside) Tamara Kolda (Sandia National Laboratories) Jure Leskovec (Stanford) Edo Liberty (Yahoo Labs) Chih­Jen Lin (National Taiwan University) Tie­Yan Liu (Microsoft Research Asia)

Qiaozhu Mei (University of Michigan) Katharina Morik (TU Dortmund) Benjamin Moseley (Washington University in St. Louis) Siegfried Nijssen (Katholieke Universiteit Leuven) Srinivasan Parthasarathy (Ohio State University) B. Aditya Prakash (Virginia Tech) Matteo Riondato (Two Sigma Investments, LP) Arno Siebes (Universiteit Utrecht) Ambuj Singh (University of California, Santa Barbara) Yizhou Sun (UCLA) Jie Tang (Tsinghua University) Dacheng Tao (The Hong Kong Polytechnic University) Evimaria Terzi (Boston University) Charalampos Tsourakakis (Harvard University) Sergei Vassilvitskii (Google) Tim Weninger (University of Notre Dame) David P. Woodruff (IBM Almaden) Elad Yom-Tov (Microsoft Research) Jeffery Xu Yu (Dpt. of Systems Engineering & Engineering Management, Chinese Univ. of Hong Kong) Yisong Yue (California Institute of Technology) Mohammed Zaki (RPI)

Applied Data Science Track Senior Program Committee

Kamal Ali (Apple Inc.) Shlomo Berkovsky (CSIRO) Chiranjib Bhattacharyya (Indian Institute of Science) Albert Bifet (Telecom ParisTech) Kofi Boakye (Yahoo) Alexey Borisov (University of Amsterdam & Yandex) Tilmann Bruckhaus (Intuit) Edith Cohen (Google Research) Brian d’Alessandro (Zocdoc) Jesse Davis (KU Leuven) Dmitriy Fradkin (Siemens) Glenn Fung (American Family Insurance) Ariel Fuxman (Google) Byron Galbraith (Talla) Rayid Ghani (University of Chicago) Moises Goldszmidt (Apple) Jingrui He (Arizona State University) Diane Hu (Etsy) Vanja Josifovski (Pinterest) Pallika Kanani (Oracle Labs) Nick Koudas (University of Toronto) Hang Li (Huawei Technologies) Tie-Yan Liu (Microsoft Research Asia) Yoelle Maarek (Yahoo Research) Jérémie Mary (Univ. Lille / Inria Lille Nord Europe) Julian McAuley (UC San Diego) Gabor Melli (Sony Interactive Entertainment) Nina Mishra (Amazon) S Muthukrishnan (Rutgers University) Vijay Narayanan (Microsoft) Jennifer Neville (Purdue University) Nikunj Oza (NASA) Alessandro Panconesi (Sapienza University of Rome)

Hanchuan Peng (Allen Inst) Florian Pinel (IBM) Naren Ramakrishnan (Virginia Tech) Nikhil Rao (Technicolor Research) D. Sculley (Google) Srinivasan Sengamedu (Amazon) Pavel Serdyukov (Yandex) Dou Shen (Microsoft Adcenter Labs) Gautam Shroff (Tata Consultancy Services) Le Song (Georgia Tech) Isabelle Stanton (Google) V.S. Subrahmanian (University of Maryland) Jimeng Sun (Georgia Tech) Vincent S. Tseng (National Chiao Tung University) Peter van der Putten (LIACS Leiden University & Pegasystems) Fei Wang (Cornell University) Graham Williams (Microsoft) Josh Wills (Rui Yan) Rui Yan (Peking University) Martin Zinkevich (Google)

Research Track Program Committee

Bruno Abrahao (Stanford University) Tim Althoff (Stanford University) Aris Anagnostopoulos (Sapienza University of Rome) Ira Assent (Aarhus University) Renato Assuncao (Universidade Federal de Minas Gerais) Stephen Bach (Stanford University) Mohammad Taha Bahadori (Georgia Institute of Technology) Eric Balkanski (Harvard University) Richard Baraniuk (Rice University) Nicola Barbieri (Tumblr) Senjuti Basu Roy (New Jersey Institute of Technology) Gustavo Batista (University of São Paulo) Andras A. Benczur (Institute for Computer Science and Control, Hungarian Academy of Sciences) Austin Benson (Stanford University) Klaus Berberich (Max Planck Institute for Informatics) Michael Berthold (University of Konstanz) Alex Beutel (Google) Sahely Bhadra (Network Science Institute, Northeastern University) Smriti Bhagat (Facebook) Aditya Bhaskara (University of Utah) Siddhartha Bhattacharyya (Information and Decision Sciences, College of Business, University of Illinois, Chicago) Jinbo Bi (University of Connecticut) Jiang Bian (Microsoft Research) Albert Bifet (Telecom ParisTech) Mustafa Bilgic (Illinois Institute of Technology) Christian Böhm (Ludwig­Maximilians­Universität München) Klemens Böhm (Karlsruhe Institute of Technology (KIT)) Francesco Bonchi (The ISI Foundation, Turin, Italy) Rajesh Bordawekar (IBM research) Alessandro Bozzon (Delft University of Technology) Ulf Brefeld (Leuphana Universität Lüneburg) David Buttler (Lawrence Livermore National Laboratory) Jose Cadena (Virginia Tech)

Deng Cai (Zhejiang University) B. Barla Cambazoglu (NTENT Inc.) Jian Cao (Department of Computer Science and Engineering, Shanghai Jiao Tong University) Longbing Cao (Faculty of IT, University of Technology Sydney) Michele Catasta (Ecole Polytechnique Fédérale de Lausanne) Jeffrey Chan (Department of Computer Science and Software Engineering, University of Melbourne) Vineet Chaoji (Amazon) Duen Horng Chau (Georgia Tech) Sanjay Chawla (Qatar Computing Research Institute (QCRI), HBKU, Doha, Qatar) Bee-Chung Chen (LinkedIn) Enhong Chen (University of Science & Technology of China) Ming-Syan Chen (National Taiwan University) Ping Chen (UMB) Rui Chen (Samsung Research America) Shu-Ching Chen (Florida International University) Shuo Chen (Blizzard Entertainment) Songqing Chen (George Mason University) Wei Chen (Microsoft Research Asia) Yixin Chen (Washington University) James Cheng (Chinese University of Hong Kong) Ingemar Cox (University College London & University of Copenhagen) Mark Crovella (Boston University) Philippe Cudre-Mauroux (University of Fribourg) Bin Cui (Peking University) Abhimanyu Das (Google) Mahashweta Das (Visa Research) Gianluca Demartini (University of Sheffield) Ayhan Demiriz (Gebze Technical University) Alex Deng (Microsoft) Amit Dhurandhar (IBM TJ Watson) Djellel Eddine Difallah (University of Fribourg) Bolin Ding (Microsoft Research) Nemanja Djuric (Uber ATG) Guozhu Dong (Wright State University) Yuxiao Dong (University of Notre Dame) Petros Drineas (Purdue) Nan Du (Google) Saso Dzeroski (Jozef Stefan Institute) Alina Ene (Boston University) Alessandro Epasto (Google Research) Alex Fabrikant (Google Research) Boi Faltings (Ecole Polytechnique Federale de Lausanne) Liyue Fan (State University of New York Albany) Ad Feelders (Universiteit Utrecht) Yun Fu (Northeastern University) David Fuhry (The Ohio State University) Aram Galstyan (USC/ISI) Joao Gama (University Porto) Bin Gao (Microsoft Research) Byron Gao (Texas State University) Jing Gao (University at Buffalo) David Garcia (ETH Zurich) Wolfgang Gatterbauer (Carnegie Mellon University) Yong Ge (The University of North Carolina at Charlotte) Rumi Ghosh (University of Southern California) Amol Ghoting (LinkedIn Corporation) Fosca Giannotti (Istituto di Scienza e Tecnologie dell’Informazione, National Research Council of Italy) Derek Greene (University College Dublin)

Quanquan Gu (University of Virginia) Francesco Gullo (UniCredit) Krishna Gummadi (MPI-SWS) Stephan Günnemann (Technical University of Munich) Abhishek Gupta (LinkedIn) Wook-Shin Han (POSTECH) Mohammad Hasan (Indiana University Purdue University, Indianapolis) Xiaofeng He (East China Normal University) Andreas Hotho (University of Wuerzburg) Cho-Jui Hsieh (UC Davis) Hsun-Ping Hsieh (National Taiwan University) Chun-Nan Hsu (UC San Diego) Xia Ben Hu (Texas A&M University) Jun Huan (University of Kansas) Sheng-Jun Huang (Nanjing University of Aeronautics and Astronautics) Ian E. H. Yen (Carnegie Mellon University) Tomoharu Iwata (NTT) Szymon Jaroszewicz (Institute of Computer Science, Polish Academy of Sciences) David Jensen (University of Massachusetts Amherst) Shuiwang Ji (WSU) Yexi Jiang (Florida International University) Wen Jin (Independent Researcher) Panos Kalnis (King Abdullah University of Science and Technology) U Kang (Seoul National University) Murat Kantarcioglu (University of Texas at Dallas) Ben Kao (The University of Hong Kong) Komal Kapoor (University of Minnesota Twin Cities) Amin Karbasi (Yale University) Panagiotis Karras (Aalborg University) Hisashi Kashima (Kyoto University) Latifur Khan (UTD) Emre Kiciman (Microsoft Research) Deguang Kong (Yahoo Research) Xiangnan Kong (Worcester Polytechnic Institute) Nicolas Kourtellis (Telefonica Research) Ioannis Koutis (University of Puerto Rico-Rio Piedras) Tim Kraska (Brown University) Srijan Kumar (University of Maryland, College Park) Himabindu Lakkaraju (Stanford University) Laks Lakshmanan (University of British Columbia) Theodoros Lappas (Stevens Institute of Technology) Longin Jan Latecki (Temple University) Neal Lathia (Skyscanner) Silvio Lattanzi (Google) Kyumin Lee (Utah State University) Kristen Lefevre (Google) Kristina Lerman (University of Southern California) Cheng Li (University of Michigan) Cheng-Te Li (National Cheng Kung University) Chengkai Li (University of Texas at Arlington) Jiuyong Li (University of South Australia) Kai Li (University of Central Florida) Shuai Li (University of Cambridge) Yanen Li (University of Illinois at Urbana-Champaign) Zhenhui Li (Penn State University) Hsuan-Tien Lin (National Taiwan University) Xuemin Lin (The University of New South Wales) Chuanren Liu (Rutgers Business School)

Guimei Liu (Institute for Infocomm Research) Huan Liu (Arizona State University) Jun Liu (SAS Institute Inc.) Xuanzhe Liu (Peking University) David Lo (Singapore Management University) Peter Lofgren (Stanford) Felipe Llinares Lopez (Swiss Federal Institute of Technology, Zürich) Bo Long (Yahoo! Labs) Yin Lou (Airbnb) Chang­Tien Lu (Virginia Tech) Patrick Lucey (STATS) Jiebo Luo (University of Rochester) Ping Luo (Institute of Computing Technology, CAS) Hao Ma (Microsoft Research) Shuai Ma (Beihang University) Richard Maclin (University of Minnesota Duluth) Malik Magdon­Ismail (Rensselaer Polytechnic Institute) Anirban Majumder (Amazon) Hiroshi Mamitsuka (Kyoto University) Nikos Mamoulis (University of Hong Kong) Amin Mantrach (Yahoo!) Michael Mathioudakis (Aalto University) Julian Mcauley (UC San Diego) Wagner Meira Jr. (Universidade Federal de Minas Gerais) Weiyi Meng (Department of Computer Science, Binghamton University) Pauli Miettinen (Max Planck Institute for Informatics) Baharan Mirzasoleiman (Swiss Federal Institute of Technology, Zürich) Dunja Mladenic (Jozef Stefan Institute) Joshua Moore (Cornell) Sebastian Moreno (Universidad Adolfo Ibañez) Abdullah Mueen (University of New Mexico) Kamesh Munagala (Duke University) Emmanuel Müller (Hasso­Plattner­Institute) Mirco Nanni (KDD­Lab ISTI­CNR Pisa) Olfa Nasraoui (University of Louisville) Wilfred Ng (The Hong Kong University of Science and Technology) Feiping Nie (Northwestern Polytechnical University, China) Zaiqing Nie (Microsoft Research Asia) Eirini Ntoutsi (Gottfried Wilhelm Leibniz Universität Hannover) Tim Oates (University of Maryland Baltimore County) Zoran Obradovic (Temple University) Rasmus Pagh (IT University of Copenhagen) Themis Palpanas (Paris Descartes University) Rong Pan (Sun Yat­sen University) Rina Panigrahy (Microsoft Research) Evangelos Papalexakis (University of California Riverside) Jian Pei (Simon Fraser University) Bryan Perozzi (Google Research) Francois Petitjean (Monash 
University) Bernhard Pfahringer (University of Waikato) Dinh Phung (Deakin University) Claudia Plant (University of Vienna) Tao Qin (Microsoft Research Asia) Vladan Radosavljevic (Uber Advanced Technology Group) Chedy Raïssi (INRIA) Karthik Raman (Google) Jan Ramon (INRIA) Huzefa Rangwala (George Mason University)

Sayan Ranu (IIT Delhi) Chotirat Ratanamahatana (Chulalongkorn University) Xiang Ren (University of Illinois at Urbana­Champaign) Steffen Rendle (Google) Matthias Renz (Ludwig­Maximilians University) Bruno Ribeiro (Purdue University) Manuel Gomez Rodriguez (MPI for Software Systems) Oleg Rokhlenko (Amazon) Salvatore Ruggieri (Dipartimento di Informatica, Università di Pisa) Joerg Sander (University of Alberta) Raul Santos­Rodriguez (University of Bristol) Nishanth Sastry (King’s College London) Assaf Schuster (Technion) C. Seshadhri (University of California, Santa Cruz) Aneesh Sharma (Twitter Inc) Pannaga Shivaswamy (Netflix Inc.) Milad Shokouhi (Microsoft) Vikas Sindhwani (IBM Research) Yaron Singer (Harvard University) Kaushik Sinha (Wichita State University) Kevin Small (Amazon) Yangqiu Song (The Hong Kong University of Science and Technology) Anna Squicciarini (The Pennsylvania State University) Mudhakar Srivatsa (IBM T.J. Watson Research Center) Michael Steinbach (University of Minnesota) Torsten Suel (New York University) Mahito Sugiyama (Osaka University) Masashi Sugiyama (RIKEN/The University of Tokyo) Huan Sun (The Ohio State University) Jimeng Sun (Georgia Tech) Einoshin Suzuki (Kyushu University) Adith Swaminathan (Cornell University) Ichiro Takeuchi (Nagoya Institute of Technology) Partha Talukdar (Indian Institute of Science) Chenhao Tan (Cornell University) Jian Tang (University of Michigan) Yufei Tao (The University of Queensland) Nikolaj Tatti (Aalto University) Kai­Ming Ting (Federation University Australia) Volker Tresp (Siemens AG) Panayiotis Tsaparas (University of Ioannina) Vincent S. Tseng (National Chiao Tung University) Antti Ukkonen (Finnish Institute of Occupational Health) Jaideep Vaidya (Rutgers University) Kush Varshney (IBM Thomas J. 
Watson Research Center) Michail Vlachos (IBM Research) Jilles Vreeken (Max-Planck Institute for Informatics and Saarland University) Slobodan Vucetic (Temple University) Anil Kumar Vullikanti (Virginia Tech) Can Wang (Zhejiang University) Dashun Wang (Kellogg School of Management, Northwestern University) Dong Wang (University of Notre Dame) Guan Wang (Nio Automotive) Hongning Wang (University of Virginia) Jian Wang (LinkedIn Corporation) Ke Wang (Simon Fraser University) Ting Wang (Lehigh University) Wei Wang (UCLA)

Xiang Wang (IBM Research) Takashi Washio (ISIR, Osaka Univ.) Ingmar Weber (Qatar Computing Research Institute) Robert West (École Polytechnique Fédérale de Lausanne) Anthony Wirth (The University of Melbourne) Ran Wolff (Yahoo Labs) Raymond Chi-Wing Wong (The Hong Kong University of Science and Technology) Weng-Keen Wong (Oregon State University) Junjie Wu (Beihang University) Xindong Wu (University of Louisiana at Lafayette) Xintao Wu (University of Arkansas) Keli Xiao (Stony Brook University) Xiaokui Xiao (Nanyang Technological University) Hui Xiong (Rutgers University) Chang Xu (University of Technology Sydney) Jiajing Xu (Pinterest) Xiaowei Xu (University of Arkansas at Little Rock) Takeshi Yamada (NTT Communication Science Laboratories) Xifeng Yan (University of California at Santa Barbara) Zhijun Yin (Twitter) Hsiang-Fu Yu (Amazon) Kui Yu (Hefei University of Technology) Lei Yu (Binghamton University) Xiao Yu (Google) Yang Yu (Nanjing University) Chunqiu Zeng (Florida International University) Duo Zhang (Kunlun Inc.) Kun Zhang (Carnegie Mellon University) Min-Ling Zhang (School of Computer Science and Engineering, Southeast University) Peng Zhang (Fictitious Economics & Data Technology Research Centre) Ping Zhang (IBM Thomas J. Watson Research Center) Rui Zhang (IBM Research - Almaden) Shichao Zhang (Guangxi Normal University) Weinan Zhang (Shanghai Jiao Tong University) Wenlu Zhang (Old Dominion University) Xiang Zhang (The Pennsylvania State University) Ya Zhang (Shanghai Jiao Tong University) Yu Zhang (HKUST) Zhenjie Zhang (Advanced Digital Sciences Center) Bo Zhao (Pinterest) Peixiang Zhao (Florida State University) Yuchen Zhao (AppDynamics) Vincent W. Zheng (Advanced Digital Sciences Center) Yu Zheng (Microsoft Research) Jiayu Zhou (Michigan State University) Mianwei Zhou (Yahoo Lab) Wenjun Zhou () Feida Zhu (Singapore Management University) Xingquan Zhu (Florida Atlantic University)

Applied Data Science Track Program Committee

Talel Abdessalem (Télécom ParisTech, Université Paris-Saclay) Puneet Agarwal (Tata Consultancy Services) John-Mark Agosta (Microsoft) Amr Ahmed (Google Research) Nesreen Ahmed (Intel Labs)

Jae­Wook Ahn (University of Pittsburgh) Ioannis Akrotirianakis (Siemens) Bo An (Nanyang Technological University) Azin Ashkan (Google) Kun Bai (MIG Tencent Company) Gagan Bansal (Microsoft) Rohan Baxter (ATO) Michael Bendersky (Google) Charlie Berger (Oracle) Debarun Bhattacharjya (IBM T.J. Watson Research Center) Sourangshu Bhattacharya (IIT KGP) Jiang Bian (Microsoft Research) Joseph Bockhorst (American Family Insurance) Alessandro Bria (University of Cassino) Donna Byron (IBM) Diane Chang (Intuit) Abon Chaudhuri (Walmart Labs) Daizhuo Chen (Dstillery & Columbia Business School) Feng Chen (SUNY Albany) Jie Chen (University of South Australia) Kuan­Ta Chen (Academia Sinica) Zhengzhang Chen (NEC Laboratories America, Inc.) Zhitang Chen (Huawei Technologies) Juan Colmenares (Samsung Research America) Prakash Mandayam Comar (Amazon) James Cook (Google) Peng Cui (Tsinghua University) Bo Dai (Georgia Tech) Hanjun Dai (Georgia Institute of Technology) Kamalika Das (NASA Ames Research Center) Danielle Dean (Microsoft) Cheng Deng (Xidian University) Lipika Dey (TCS Innovation Lab Delhi) Sauptik Dhar (Robert Bosch) Weicong Ding (Amazon) Yuxiao Dong (University of Notre Dame) Zhenhua Dong (Huawei Technologies) Pinar Donmez (Microsoft) Alexey Drutsa (Yandex & Lomonosov Moscow State University) Nan Du (Google Research) Hoda Eldardiry (PARC) Thomas Emerson (Freddie Mac) Theodoros Evgeniou (INSEAD) Mehrdad Farajtabar (Georgia Institute of Technology) Luca Foschini (Evidation Health) Sorelle Friedler (Haverford College) Ravi Ganti (Walmart Labs, San Bruno) Pierre Garrigues (Yahoo) Mina Ghashami (University of Utah) Shalini Ghosh (SRI) Balaji Gopalan (Google) Mihajlo Grbovic (Airbnb) Tamara Greasby (Oracle Data Cloud) Elena Grewal (Airbnb) Sudipto Guha (University Of Pennsylvania) Gleb Gusev (Yandex) Eui­Hong (Sam) Han (The Washington Post) Tj Hazen (Microsoft)

Arjen Hommersom (Open University of the Netherlands) Mahmud Shahriar Hossain (University of Texas at El Paso) Joshua Huang (Shenzhen University) Samuel Huston (Google) Hsun­Ping Hsieh (National Cheng Kung University) Monica Hsu (Intuit) Alex Jaimes (Columbia University) Vijay Manikandan Janakiraman (USRA/NASA AMES Research Center) Warren Jin (CSIRO) Xiaoming Jin (Tsinghua University) Purushottam Kar (Indian Institute of Technology Kanpur) Alexandros Karatzoglou (Telefonica Research) Tom Kenter (University of Amsterdam) Krishnaram Kenthapadi (LinkedIn) Brendan Kitts (AOL) Noam Koenigstein (Microsoft) Yun Sing Koh (University of Auckland) Alek Kolcz (Pushd) Konstantinos Kollias (Google Research) James Kunz (Google) Henry Leung (University of Calgary) Cheng­Te Li (National Cheng Kung University) Shuai Li (University of Cambridge) Yun Li (Nanjing University of Posts and Telecommunications) Xiaoli Li (Institute for Infocomm Research) Shou­De Lin (National Taiwan University) Lin Liu (University of South Australia) Yiqun Liu (Tsinghua University) Yuchen Liu (UCLA) Sean Lorenz (Senter) Kun Lui (LinkedIn) Qiang Ma (Yahoo! Inc.) Amogh Mahapatra (Apple) Anirban Majumder (Amazon) Gideon Mann (Bloomberg LP) Manish Marwah (Hewlett Packard Labs) Charalampos Mavroforakis (Boston University) Wannes Meert (KU Leuven) Sam Michalka (Olin College) Boriana Milenova (Oracle) Avishkar Misra (Oracle) Gianmarco De Francisci Morales (Qatar Computing Research Institute) Robert Moskovitch (Deutsche Telekom Laboratories at Ben­Gurion University) Abdullah Mueen (University of New Mexico) Saikat Mukherjee (Intuit) Shubha Nabar (Salesforce) Nagarajan Natarajan (University of Texas Austin) Richi Nayak (Queensland University of Technology ­ Brisbane, Australia) See­Kiong Ng (National University of Singapore) Karl Ni (In­Q­Tel) Scott Nicholson (Consultant) Alexandros Ntoulas (LinkedIn) John O’Donovan (University of California Santa Barbara) Neil O’Hare (Yahoo! 
Research) Kok-Leong Ong (La Trobe University) Girish Palshikar (Tata Consultancy Services Limited) Debprakash Patnaik (Amazon.com) Wen-Chih Peng (National Chiao Tung University)

Luisa Polania (AMFAM) Foster Provost (NYU) Vladan Radosavljevic (Uber Advanced Technology Group) Balaraman Ravindran (Indian Institute of Technology Madras) Bruno Ribeiro (Purdue University) Marco Tulio Ribeiro (University of Washington) Alex Sadovsky (Oracle Data Cloud) Nima Sarshar (Apple) Ed Seabolt (IBM) Bin Shao (Microsoft Research Asia) Li Shen (Indiana University School of Medicine) Conglei Shi (IBM T.J. Watson Research Center) Larisa Shwartz (IBM T.J. Watson Research Center) Mauro Sozio (Télécom ParisTech) Francesca Spezzano (Boise State University) Jacob Spoelstra (Microsoft) Ori Stitelman (dstillery) Yifan Sun (Technicolor Research) Can Ozan Tan (Harvard Medical School) Wei Tan (IBM) Liang Tang (Google) Dan Tecuci (Ernst & Young (EY)) Marko Tkalcic (Free University of Bozen­Bolzano) Yongxin Tong (Beihang University) Tanguy Urvoy (France Telecom R&D) Hamed Valizadegan (NASA Ames Research Center) Christophe Van Gysel (University of Amsterdam) Jan Van Haaren (KU Leuven) Lav Varshney (University of Illinois at Urbana­Champaign) Lovekesh Vig (TCS Research, New Delhi) Chenguang Wang (IBM Research) Dingding Wang (Florida Atlantic University) John Wang (Facebook) Pengyuan (Penelope) Wang (University of Georgia) Quan Wang (Chinese Academy of Sciences) Taifeng Wang (Microsoft Research Asia) Renato Werneck (Amazon) Melinda Han Williams (Dstillery) Yinglong Xia (Huawei Research America) Cao Xiao (IBM T.J. 
Watson Research Center) Bo Xie (Georgia Tech) Jun Xu (Chinese Academy of Sciences) De-Nian Yang (Academia Sinica) Hongxia Yang (Alibaba Group) Pei Yang (South China University of Technology) Yuan Yao (Nanjing University) Yanfang Ye (West Virginia University) Shi Yu (American Family Insurance) Mingxuan Yuan (Huawei Noah Arklab) Morteza Zadimoghaddam (Google Research) Chunqiu Zeng (FIU) Yiye Zhang (Weill Cornell Medical College) Yuyu Zhang (Georgia Institute of Technology) Xi Zhang (Weill Cornell Medical College) Yu Zheng (Microsoft Research) Zhi Zhou (Allen Institute for Brain Science) Yada Zhu (IBM T.J. Watson Research Center) Albrecht Zimmermann (Université Caen Normandie)

KDD 2017 Sponsors & Supporters

Sponsor: ACM SIGKDD

Supporters: Diamond

Supporters: Platinum

Supporters: Gold

Supporters: Silver

Supporters: Bronze

Supporters: Lanyard Sponsor

Supporters: KDD CUP Sponsor

Supporters: Research Track Best Paper Award

Supporters: Industry/Government Track Best Paper Award

Supporters: Best Student Paper Award

Supporters: Dissertation Award

Supporters: Student Travel Awards

Exhibitors:

WiFi Sponsor:

Conference Venue and WiFi Information

Conference Venue:

World Trade and Convention Centre, 1800 Argyle St., Halifax, Nova Scotia, B3J 3N8, tel: (902) 421-8686
Scotiabank Centre, 1800 Argyle Street, Halifax, NS B3J 2V9
*Exhibit Hall, Keynote Talks, Lunch each day provided

Banquet Venue:

Pier 23, Cunard Center

WiFi Information in the conference areas

SSID: KDD2017 Access Code: SIGKDD2017

Halifax, Points of Interest

1: KDD Venue
2: KDD Banquet Venue
3: Citadel Hill Historic Site
4: Public Gardens
5: Dalhousie University
6: St Mary’s University
7: Halifax Farmers’ Market
8: Pier 21 Immigration Museum
9: Maritime Museum of the Atlantic
10: City Hall
11: Museum of Natural History
12: Halifax Public Library
13: Waterfront Boardwalk (14 to 7)
14: Dartmouth Ferry Terminal
15: VIA Rail Station

Useful Links and Emergency Contacts

About Halifax and Nova Scotia:
● City of Halifax: https://www.halifax.ca/
● Halifax Wikipedia page: https://en.wikipedia.org/wiki/Halifax,_Nova_Scotia
● Halifax History: https://en.wikipedia.org/wiki/History_of_Halifax_(former_city)
● Halifax News and Events: https://www.thecoast.ca/
● Halifax Transit: https://www.halifax.ca/transportation/halifax-transit
● Destination Halifax: http://www.destinationhalifax.com/
● Tourism Nova Scotia: http://www.novascotia.com/
● Dalhousie University (1818-2018): https://www.dal.ca/
● Saint Mary’s University: http://www.smu.ca/
● Nova Scotia College of Art & Design University: http://nscad.ca

For emergencies and help:
● In an emergency, call 911 to reach the Police Department and Medical Care
● Stop one of the volunteers in a KDD T-shirt for help