December 11 to 14, 2011

Conference Program

TABLE OF CONTENTS

MESSAGE FROM CONFERENCE GENERAL CHAIRS
MESSAGE FROM THE PROGRAM CO-CHAIRS
SPONSORS
GENERAL INFORMATION
INFORMATION FOR SESSION CHAIRS AND PRESENTERS
CONFERENCE PROGRAM AT-A-GLANCE
    WORKSHOPS AT-A-GLANCE
    TECHNICAL PROGRAM AT-A-GLANCE
CONFERENCE PROGRAM
    WORKSHOP PROGRAM
    TECHNICAL CONFERENCE PROGRAM
    DEMOS AND EXHIBITIONS PROGRAM
    TUTORIALS PROGRAM
ICDM 2011 KEYNOTE SPEECHES
SOCIAL PROGRAM
ORGANISING COMMITTEE
    CONFERENCE CHAIRS
    PROGRAM COMMITTEE
    PROGRAM COMMITTEE MEMBERS
    EXTERNAL REVIEWERS
    VOLUNTEERS
    VOLUNTEERS (STUDENT TRAVEL AWARD RECIPIENTS)
USEFUL LINKS
ABOUT VANCOUVER
CONFERENCE VENUE LOCATIONS
    CONFERENCE ROOM FLOOR PLANS


MESSAGE FROM CONFERENCE GENERAL CHAIRS

On behalf of the organizing committee for the IEEE International Conference on Data Mining 2011, we would like to welcome you to the conference, its numerous workshops, and the other activities in the program. We extend our warmest welcome to all attendees from around the globe to this vibrant and world-class city of Vancouver, Canada.

The lively and animated city of Vancouver, surrounded by ocean, mountains, and a majestic river valley, is reputed to be among the best cities in the world, with endless indoor and outdoor activities, outstanding international restaurants, and first-rate entertainment. The city of more than 2 million people (metro), barely 125 years old, houses a large number of IT companies, among them many world leaders in business intelligence, analytics, video games, and software development in general.

We wish you a memorable time in Vancouver, experiencing this multicultural city celebrating the heritage of its inhabitants and showcasing the important cultures of the region’s First Nations.

This is the eleventh time ICDM has been organized, each time in a different place in the world. It has grown to a respectable size and is considered today the premier international research conference on data mining. After San Jose, USA (2001), Maebashi City, Japan (2002), Melbourne, USA (2003), Brighton, UK (2004), Houston, USA (2005), Hong Kong, China (2006), Omaha, USA (2007), Pisa, Italy (2008), Miami, USA (2009), and Sydney, Australia (2010), this is the 5th time ICDM comes back to North America.

Organizing an international conference of the magnitude of ICDM is not an easy task. It requires the coordination of a multitude of individuals and a tremendous effort by an army of volunteers. Without these volunteers, many of them working on the organization for a full year, such a large conference cannot take place. The individuals involved range from the students helping with conference registration, packing bags, and setting up audio-visual equipment, to selection committee members and scientific reviewers, as well as the organizers of the program, logistics, publicity, sponsorship, and finance. We would like to officially and wholeheartedly thank all volunteers for their hard work; the success of this conference is attributed to them.

In particular, we would like to extend our gratitude for an excellent technical program to: Diane Cook and Jian Pei (Program Co-Chairs), Myra Spiliopoulou and Haixun Wang (Workshops Co-Chairs), Evimaria Terzi and Jure Leskovec (Tutorials Co-Chairs), Ming Hua and Alex Thomo (Exhibits and Demos Co-Chairs), Ashok Srivastava and Larry Holder (Contest Co-Chairs), Rosa Meo and Alfredo Cuzzocrea (PhD Forum Co-Chairs), and George Karypis (Panel Chair). Of course, the program in itself is not enough to have a successful and memorable conference, and we owe the accomplishment of this fine organization to Xindong Wu (ICDM Steering Committee Chair), Charles X. Ling (Finance Chair), Carson Leung (Local Arrangements Chair), Olfa Nasraoui, Latifur Khan and Jie Tang (Publicity Co-Chairs), Wei Ding and Gabor Melli (Sponsorship Co-Chairs), Justin Fagnan (Webmaster), and Juzhen Dong (Cyberchair). We would like to highlight their tremendous contributions.

At its 10th anniversary last year, ICDM started the ICDM highest impact paper award to recognize the paper from the ICDM proceedings of 10 years prior that has had the most impact (methodology, applications, products) over the intervening decade. This year, the award goes to Xifeng Yan and Jiawei Han for "gSpan: Graph-Based Substructure Pattern Mining".

ICDM has always been an innovator among top-tier conferences in data mining in terms of improving the quality of its program. ICDM was the first to introduce the double-blind review process, in 2007, in which the identity of authors is concealed from reviewers. This was demonstrated to improve the chances for newcomers to publish peer-reviewed papers, and it reduced the bias towards known names. This year, to avoid bias during the discussions about papers among committee members, the identities of the reviewers were also concealed from one another.


We also introduced for the first time the PhD Forum, a meeting in the format of a workshop allowing PhD students to present and discuss their research strategies and the new trends in data mining research.

Our gratitude also goes to our corporate sponsors (listed in the program) for their important support. Last but not least, we would like to thank the many authors who submitted research papers to the conference and all the attendees whose contributions resulted in this enriching conference.

We wish you a productive conference with new discoveries, new relationships and networking and a most enjoyable time in Vancouver.

Wei Wang and Osmar R. Zaïane
ICDM 2011 Conference General Chairs


MESSAGE FROM THE PROGRAM CO-CHAIRS

Welcome to the Eleventh IEEE International Conference on Data Mining! The ICDM conference is held in varying locations throughout the world, and the 2011 ICDM conference will be held for the first time in Canada, in the metropolitan city of Vancouver, British Columbia. ICDM is established as the world’s premier research conference in data mining. The conference provides an opportunity to present original research results, to relay practical development experiences, and to spark ideas for new research directions.

The ICDM conference is truly an international forum. During its eleven-year history, the conference has been held in eight countries around the world. This year’s conference continues this global trend: our organizing and program committee members represent 36 countries and authors submitted papers from 47 different countries.

This year’s conference was extremely competitive. A total of 786 papers were submitted for review. Each paper was reviewed by at least three program committee members and the selection was made on the basis of discussion among the reviewers, a vice chair, and the program co-chairs. This year, 101 regular papers and 47 short papers were accepted for presentation, representing an acceptance rate of 18.83%. In keeping with the goal of advancing the state-of-the-art in data mining, paper topics span numerous active and emerging topic areas including feature analysis, classification, privacy, anomaly detection, semi-supervised learning, clustering, recommendations, time series mining, sparse representations, data summarization, and mining data found in graphs, video, images, and text.

Reviewing and selecting papers from such a large set of research groups required the coordinated effort of many individuals. We want to thank the 36 Vice Chairs and 273 members of the Program Committee who provided insightful feedback to the authors and helped with this selection process. Of the papers that were submitted, 139 had student first authors. These authors represent the future of our field and we want to thank the National Science Foundation for sponsoring awards that funded student travel to attend the conference. The awards committee selected two papers that were particularly strong. The ICDM 2011 Best Paper Award is given to Qi Liu, Yong Ge, Zhongmou Li, EnHong Chen, and Hui Xiong for their paper "Personalized Travel Package Recommendation". The Best Student Paper Award is given to Hongliang Fei and Jun Huan for their paper "Structured Feature Selection and Task Relationship Inference for Multi-Task Learning". We congratulate these authors for their outstanding contribution to the field.

For the first time, this year we implemented a triple-blind review process. This process, designed by the ICDM steering committee, ensures that the reviewers do not know the identity of the authors or of the other individuals reviewing the same submission. This process is intended to remove bias during the paper review discussions. We invite you to share with the ICDM steering committee your experience and suggestions regarding this new review model.

In addition to the technical presentations, our program also highlights three invited speakers. Cynthia Dwork (Microsoft Research) details differential privacy mechanisms to make confidential data available for mining, C. Lee Giles (Pennsylvania State University) addresses the topic of data mining of large information sources such as CiteSeer, and Renée Miller (University of Toronto) presents her work on schema discovery. Four tutorials are offered this year on the topics of “Sentiment Analysis in Practice”, “Mining Social Networks for Recommendation”, “Mining Sets of Patterns: Next Generation Pattern Mining”, and “Anomaly Detection”. In addition, a panel is organized around the challenges in developing data mining solutions that can run on exascale-class machines. A set of nineteen workshops, seven demonstrations, a PhD forum, and an ICDM contest complete the program.


Organizing the ICDM 2011 program required the time and expertise of numerous contributors. We are grateful for the tremendous help of Myra Spiliopoulou and Haixun Wang who served as Workshop Co-Chairs, Evimaria Terzi and Jure Leskovec who were this year’s Tutorial Co-Chairs, Ming Hua and Alex Thomo who served as Exhibits and Demos Co-Chairs, Rosa Meo and Alfredo Cuzzocrea who organized the PhD Forum, and George Karypis who led the panel discussion and organized the best paper selection. In addition, our appreciation goes to Larry Holder and Ashok Srivastava for chairing the ICDM Contest and to Howie Fung, Diederik van Liere, and the Wikimedia Foundation for sponsoring this contest, which focused on developing methods for predicting future Wikipedia editing activity. The guidance of the steering committee chair, Xindong Wu, was valuable throughout each step of the conference organization and we thank him for his tireless efforts as well. We also want to give a special thank-you to Juzhen Dong for the many hours she put in to maintain and enhance the Cyberchair web system in support of the conference.

Finally, we thank the ICDM community for their support of this premier conference. We hope you enjoy the ICDM conference and that you are inspired by the ideas found in these papers.

Diane Cook and Jian Pei
ICDM 2011 Program Co-Chairs


SPONSORS

We would like to thank all of our sponsors for their support and contributions.

PLATINUM SPONSOR

GOLD SPONSOR

SILVER SPONSORS

BRONZE SPONSOR


GENERAL INFORMATION

CONFERENCE IDENTIFICATION BADGE
Each badge carries the name and affiliation of the badge holder. Admission to the conference and workshop sessions is by badge only. If you lose your badge, please go to the Registration Desk located in the foyer of the Marriott Pinnacle for a replacement.

CONFERENCE REGISTRATION
Sunday, December 11, 07:30 to 17:00, Foyer, Marriott Pinnacle
Monday, December 12, 07:30 to 17:00, Foyer, Marriott Pinnacle
Tuesday, December 13, 07:30 to 17:00, Foyer, Marriott Pinnacle
Wednesday, December 14, 07:30 to 17:00, Foyer, Marriott Pinnacle

CONFERENCE RECEPTION
Monday, December 12, 18:00 to 20:00, Harbourside Ballroom, Renaissance Vancouver Harbourside Hotel

LUNCHES
Lunch breaks will take place daily from 12:00 to 13:30. Lunch will be provided to conference participants on December 14 only; see the December 14 program schedule for details.

CONFERENCE BANQUET/AWARD CEREMONY
Tuesday, December 13, 18:00 to 21:00, Harbourside Ballroom, Renaissance Vancouver Harbourside Hotel

BREAKS
In addition to lunch breaks, there will also be one morning and one afternoon break. See the daily schedules for exact break times.

VOLUNTEERS
Volunteers wearing red ICDM t-shirts will be on hand to answer your questions and assist you.

LANGUAGE
The conference and all related activities will be conducted in English.

INTERNET ACCESS
The hotel has Wi-Fi access. Individual log-on information will be provided upon registration.

SMOKING POLICY
Smoking is not permitted inside any of the conference facilities, including hotel rooms.

GENERAL INQUIRIES

If you have any queries regarding the conference organization, please contact the conference Local Arrangements Chair Carson Leung at [email protected] or contact the registration desk.


INFORMATION FOR SESSION CHAIRS AND PRESENTERS

PRESENTATION ROOM FACILITIES
All rooms are equipped with a digital projector.

PRESENTATION TIME
The presentation time allocated to each regular paper is 20 minutes, and to each short paper 10 minutes, including questions and answers.

SESSION CHAIRS
If you cannot fulfill your duties as a session chair, please ensure that you have arranged for someone else to take your place. If you are unable to find someone to fill in for you, please contact the Program Chairs to arrange a back-up.

Session chairs are kindly requested to help with the following:

• Note the time allocated for each paper in your session. Each regular paper is allocated 20 minutes (18 minutes for the presentation plus 2 minutes for discussion). Each short paper is allocated 10 minutes (8 minutes for the presentation plus 2 minutes for discussion).
• Arrive at the room of the session 5 minutes before the session starts and identify each of the speakers for the session.
• Remind each speaker to keep track of their time to ensure enough time is left for discussions (questions and answers), and for transition to the next presentation. If a presentation extends into the time for discussions, please shorten the discussions accordingly, or postpone the discussions until after the session.
• Do not allow presentations or the discussion periods to run over the starting time of the next presentation.
• If the presenter of a paper is absent ("no-show"), please continue to the next presentation. Please check after the end of the last presentation if the "no-show" has arrived. Best efforts have been made to reduce the number of no-shows; however, they may not be eliminated.
• Each presentation room is equipped with a digital projector. If something is not working properly, please contact a volunteer for help.


PRESENTERS
Please check your presentation time and room. Please go to the room 5 minutes before the session starts and inform the session chairs that you have arrived.

Presenters please note the following:

• Note that the time allocated for each regular paper is 20 minutes (18 minutes for your presentation plus 2 minutes for discussion) and for each short paper 10 minutes (8 minutes for your presentation plus 2 minutes for discussion).
• When it is your turn to present, please leave corresponding time for discussion (questions and answers), and for transition to the next presentation. If your presentation extends into the time for discussions, discussions on your paper will be shortened by the session chair or postponed until after the session.
• Please do not exceed your allocated time. Please follow the instructions of the Session Chairs.

If you cannot find your name in sessions or your information is incorrect in the Conference Program, please contact the Program Chairs.


CONFERENCE PROGRAM AT-A-GLANCE

WORKSHOPS AT-A-GLANCE

SUNDAY, DECEMBER 11

08:30  DaMNET, SENTIRE, OEDM, DDDM, MMIS, PADM, BioDM, HaCDAIS, DMCS, LSVA, SSTDM, DPM, PhD forum

10:00  BREAK

10:30  DaMNET, SENTIRE, OEDM, DDDM, MMIS, PADM, BioDM, HaCDAIS, DMCS, LEMIR, LSVA, SSTDM, DPM, PhD forum

12:30  LUNCH (not provided)

14:00  DaMNET, SENTIRE, OEDM, DDDM, MMIS, PADM, BioDM, HaCDAIS, KDCloud, DMCCI, ClimKD, DMS, ContrastDM, COMMPER

16:00  BREAK

16:30  DaMNET, SENTIRE, OEDM, DDDM, MMIS, PADM, BioDM, KDCloud, DMCCI, ClimKD, DMS, ContrastDM, COMMPER


TECHNICAL PROGRAM AT-A-GLANCE

MONDAY, DECEMBER 12

08:30  Opening - Pinnacle Ballroom

08:40  Keynote by Renée Miller

09:50  BREAK

10:10  S1A Graph mining
       S1B Features
       S1C Frequent patterns
       Tutorial T1 (part 1)

12:10  LUNCH (not provided)

13:30  S2A Graphs
       S2B Similarity and distribution
       S2C Sequences and patterns
       Tutorial T1 (part 2)
       Demos & Exhibits

14:30  Tutorial T2 (part 1)

15:30  BREAK

15:45  S3A Groups, influences, and
       S3B Learning
       S3C Time series and time sequences
       Tutorial T2 (part 2)
       Demos & Exhibits

17:45  Reception - Harbourside Ballroom

TUESDAY, DECEMBER 13

08:30  Keynote by Lee Giles - Pinnacle Ballroom

09:45  BREAK

10:00  S4A Topic analysis and text (Pinnacle I Ballroom)
       S4B Machine learning and applications (Pinnacle II Ballroom)
       S4C Video, images, and bioinformatics (Pinnacle III Ballroom)
       Panel: Data mining meets exascale and many-core computing (Point Grey Room)

12:00  LUNCH (not provided)

14:00  Excursion

18:00  Awards Banquet - Harbourside Ballroom


WEDNESDAY, DECEMBER 14

08:30  Keynote by Dr. Cynthia Dwork - Pinnacle Ballroom

09:45  BREAK

10:00  S5A Intrusion and anomaly detection (Pinnacle I)
       S5B Classification (Pinnacle II)
       S5C Patterns and applications (Pinnacle III)
       Tutorial T3 (part 1) (Point Grey)

12:00  Community Meeting (with Lunch, open to all ICDM '11 participants)

13:30  S6A Communities and privacy (Pinnacle I)
       S6B Clustering (Pinnacle II)
       S6C Reduction, summarization and simplification (Pinnacle III)
       Tutorial T3 (part 2) (Point Grey)

14:30  Tutorial T4 (part 1) (Point Grey)

15:30  BREAK

15:45  S7A Recommendation (Pinnacle I)
       S7B Semi-supervised learning (Pinnacle II)
       S7C Matrix, tensor, and sparse representation (Pinnacle III)
       Tutorial T4 (part 2) (Point Grey)


CONFERENCE PROGRAM

WORKSHOP PROGRAM

GENERAL INFORMATION FOR ALL WORKSHOPS

SUNDAY, DECEMBER 11

Start Time 08:30

Morning Break 10:00 to 10:30

Lunch Break 12:30 to 14:00

Afternoon Break 16:00 to 16:30

End Time 18:00

Please note that individual workshops may alter the break times to better fit the workshop schedule; please check each workshop's detailed schedule to confirm start, finish, and break times.

FULL DAY WORKSHOPS – 08:30 TO 18:00

THE SECOND WORKSHOP ON BIOLOGICAL DATA MINING AND ITS APPLICATIONS IN HEALTHCARE (BIODM 2011)

Workshop Organizers: Xiao-Li Li, See-Kiong Ng, and Jason T. L. Wang
Location: Marriott Pinnacle – Pinnacle I

Schedule

08:40 Opening
    Xiao-Li Li, See-Kiong Ng, and Jason T. L. Wang

08:50 Keynote I: Non-conventional approach to stem cell image classification
    Prof. Ming Li

09:35 Medical Data Mining for Early Deterioration Warning in General Hospital Wards
    Yi Mao, Yixin Chen, Gregory Hackermann, Minmin Chen, Chenyang Lu, Marin Kollef, and Thomas Bailey


10:00 BREAK

10:30 Identifying HotSpots in Lung Cancer Data Using Association Rule Mining
    Ankit Agrawal and Alok Choudhary
10:55 Mining protein sequence databases for remote homologues that can display considerable domain length variations
    Eshita Mutt and Ramanathan Sowdhamini
11:20 FTCluster: Efficient Mining Fault-Tolerant Biclusters in Microarray Dataset
    Miao Wang, Xuequn Shang, Miao Miao, Zhanhuai Li, and Wenbin Liu
11:45 From sequences to Papers: an Information Retrieval Exercise
    Celia Goncalves, Rui Camacho, and Eugenio Oliveira
12:10 A three-step validation procedure in genome-wide data mining for myosin family members improves search efficiency
    Divya Syamaladevi, Naseer Pasha, and Ramanathan Sowdhamini

12:35 LUNCH BREAK

14:00 Keynote II: Combinatorial Biomarker Discovery
    Prof. Raymond Ng

14:45 In Vivo and In Silico Evidence: Hippocampal Cholesterol Metabolism Decreases with Aging and Increases with Alzheimer's Disease
    Clyde Phelix, Richard LeBaron, George Perry, Rosa Villanueva, Greg Villareal, Sandra Siedlak, and Xiongwei Zhu
15:10 Graph Based Classification of MRI Data Based on the Ventricular System
    Seth Long and Lawrence Holder
15:35 Transduction of Semi-Supervised Regression Targets in Survival Analysis for Medical Prognosis
    Faisal Khan and Qiuhua Liu

16:00 BREAK

16:30 Brain tumor pathological area delimitation through Non-negative Matrix Factorization
    Sandra Ortega-Martorell, Paulo J.G. Lisboa, Alfredo Vellido, Rui V. Simoes, Margarida Julia-Sape, and Carles Arus
16:55 Confident Surgical Decision Making in Temporal Lobe Epilepsy by Heterogeneous Classifier Ensembles
    Shobeir Fakhraei, Hamid Soltanian-Zadeh, Kourosh Jafari-Khouzani, Kost Elisevich, and Farshad Fotouhi
17:20 Analyzing trends of hospital length of stay using Phase-type distributions
    Truc-Viet Le, Chee-Keong Kwoh, Eng-Soon Teo, and Kheng-Hock Lee


INTERNATIONAL WORKSHOP ON DATA MINING IN NETWORKS (DAMNET 2011)

Workshop Organizer: Giuseppe Di Fatta
Location: Renaissance – Port of Hong Kong Room

Schedule

08:40 Welcome and Opening Remarks
    Dr. Giuseppe Di Fatta

08:45 Poll: A Citation-Text-Based System for Identifying High-Impact Contributions of an Article
    Lalith Polepeddi, Ankit Agrawal, and Alok Choudhary
09:10 Stochastic Network Motif Detection in Social Media
    Kai Liu, William K. Cheung, and Jiming Liu
09:35 Collaborative Filtering for Coordinated Monitoring in Sensor Networks
    Janne Toivola and Jaakko Hollmén

10:00 BREAK

10:30 Temporal Scale of Processes in Dynamic Networks
    Rajmonda Caceres, Tanya Berger-Wolf, and Robert Grossman
10:55 Imputation of Missing Links and Attributes in Longitudinal Social Surveys
    Vladimir Ouzienko and Zoran Obradovic
11:20 Prophet - a link-predictor to learn new rules on NELL
    Ana Paula Appel and Estevam Rafael Hruschka Junior
11:45 Scalable Link Prediction on Multidimensional Networks
    Michele Berlingerio, Giulio Rossetti, and Fosca Giannotti

12:30 LUNCH BREAK

14:00 to 16:00 (no presentations)

16:00 BREAK

16:30 Automatically Spotting Information-rich Nodes in Graphs
    Xiao He, Jing Feng, and Claudia Plant
16:55 Working for influence: effect of network density and modularity on diffusion in networks
    Habiba Habiba and Tanya Berger-Wolf
17:20 Diffusion in Networks With Overlapping Community Structure
    Fergal Reid and Neil Hurley

17:45 Closing Remarks: Dr. Giuseppe Di Fatta


WORKSHOP ON DOMAIN DRIVEN DATA MINING (DDDM 2011)

Workshop General Chair: Prof. Philips Yu
Program Chairs: Dr. Fei Wang, Prof. Wolfgang Nejdl, and Dr. Ling Chen
Location: Renaissance – Port of Singapore Room

Schedule

09:00 Opening Address by Workshop Co-Chairs

09:05 Keynote Speech: Mining Patterns in Social Media - A New Frontier
    Prof. Huan Liu

09:35 Session I (1) – Ubiquitous Intelligence

Certainty-Enhanced Active Learning for Improving Imbalanced Data Classification
    JuiHsi Fu and SingLing Lee

10:00 BREAK

10:30 Session I (2) – Ubiquitous Intelligence

Automatic Training Data Cleaning for Text Classification
    Hassan Malik and Vikas Bhardwaj
FAARM: Frequent association action rules mining using FP-Tree
    Djellel Difallah, Ryan Benton, Tom Johnsten and Vijay Raghavan
Novel Knowledge-Based Twin Support Vector Machines
    YingJie Tian and XuChan Ju
Agent Assignment for Process Management: Competency-driven Dynamic Resource Management Methodology
    Ramzan Talib, Bernhard Volz and Stefan Jablonski
Domain-specific adaptation of a partial least squares regression model for loan defaults prediction
    Balaji Vasan Srinivasan, Nathan Gnanasambandam, Shi Zhao and Raj Minhas

12:30 LUNCH

14:00 Keynote Speech: Towards Context Aware On-demand Data Mining
    Prof. Jian Pei

14:30 Session II: Mining Complex Data and Applications

CAPRE: A New Methodology for Product Recommendation Based on Customer Actionability and Profitability
    Thomas Piton and Julien Blanchard
Improving Energy Use Forecast for Campus Micro-grids using Indirect Indicators
    Saima Aman, Yogesh Simmhan, Viktor Prasanna
Revealing cluster formation over huge volatile robotic data
    Nikos Mitsou, Irene Ntoutsi, Dirk Wollherr, Costas Tzafestas and Hans-Peter Kriegel
Automatic Cleaning and Linking of historical Census Data Using Household Information
    Zhichun Fu, Peter Christen and Mac Boot

16:00 BREAK

16:30 Session III: Domain-specific Applications

Mining Infrequent Causal Associations in Electronic Health Databases
    Yanqing Ji, Hao Ying, John Tran, Peter Dews, Ayman Mansour, Richard Miller and Michael Massanari
Semi-supervised Failure Prediction for Oil Production Wells
    Yintao Liu and Ke-Thia Yao
Domain Driven Data Mining in Human Resource Management - A Review
    Franca Piazza and Stefan Strohmeier

17:40 Closing remarks by Workshop Co-chairs

HANDLING CONCEPT DRIFT AND REOCCURRING CONTEXTS IN ADAPTIVE INFORMATION SYSTEMS (HACDAIS 2011)

Workshop Chairs: Latifur Khan, Mykola Pechenizkiy, and Indrė Žliobaitė
Location: Renaissance – Port of San Francisco Room

Schedule

Session I

08:30 Introduction to the workshop from the organizers

08:35 Unifying Change - Towards a Framework for Detecting the Unexpected
    Iris Ada and Michael R. Berthold
08:55 Change Mining of Customer Profiles based on Transactional Data
    Edward Apeh and Bogdan Gabrys
09:21 Drift Detection using Uncertainty Distribution Divergence
    Patrick Lindstrom, Brian Mac Namee, and Sarah Jane Delany
09:41 Pool and Precision Based Stream Classification: A new ensemble algorithm on data stream classification using recurring concept detection
    Mohammad Javad Hosseini, Zahra Ahmadi, and Hamid Beigy

10:00 - 10:30 BREAK and Discussions

Session II

10:30 Invited Talk: To adapt, or not to adapt: adapt but with caution
    Petr Kadlec

11:17 Detecting Mean Changes in Data Streams
    Murad Badarna and Ran Wolff
11:37 What's your current stress level? Detection of stress patterns from GSR sensor data
    Jorn Bakker, Mykola Pechenizkiy, and Natalia Sidorova
12:03 Interpretability of Sudden Concept Drift in Medical Informatics Domain
    Gregor Stiglic and Peter Kokol

12:30 LUNCH BREAK and Discussions

Session III

14:00 Invited Talk: Streaming Models and Systems for Smarter Transportation Systems: Challenges and Solutions
    Wei Fan

14:47 Classification in Presence of Drift and Latency
    Georg Krempl and Vera Hofer
15:09 Interpretable, Online Soft-Sensors for Process Control
    Mark Eastwood and Petr Kadlec

15:29 Discussion and closing

THE 5TH INTERNATIONAL WORKSHOP ON MINING MULTIPLE INFORMATION SOURCES (MMIS 2011)

Workshop Organizers: Bin Li, Xingquan (Hill) Zhu, and Qiang Yang
Location: Marriott Pinnacle – Shaughnessy II

Schedule

08:50 Welcome & Opening Remarks

09:00 Keynote Speech: On the Usefulness of Multiple Views in Unlabeled Data Exploitation
    Prof. Zhi-Hua Zhou

10:00 BREAK

10:30 Session I (20 min each)

Greedy Regularized Least-Squares for Multi-Task Learning
    Pekka Naula, Tapio Pahikkala, Antti Airola, Tapio Salakoski
Multitask Multiclass Support Vector Machines
    You Ji, Shiliang Sun
Relational Fuzzy Clustering with Multiple Kernels
    Naouel Baili, Hichem Frigui
Transportability of Causal and Statistical Relations: A Formal Approach
    Judea Pearl, Elias Bareinboim


12:30 LUNCH

14:00 Session II (20 min each)

Cross-Domain Recommender Systems
    Roberto Turrin, Paolo Cremonesi, Antonio Tripodi
A Tag-based Hybrid Music Recommendation System Using Semantic Relations and Multi-domain Information
    Ipek Tatli, Aysenur Birturk
Classification of Patients Using Novel Multivariate Time Series Representations of Physiological Data
    Patricia Ordonez Rozo, Tom Armstrong, Tim Oates, Jim Fackler
Indexing Faces in Broadcast News Video Archives
    Duy-Dinh Le, Shin'ichi Satoh
Approximate Record Matching Using Hash Grams
    Mohammed Gollapalli, Xue Li, Ian Wood, Guido Governatori

16:00 BREAK

THE 6TH WORKSHOP ON OPTIMIZATION BASED TECHNIQUES FOR EMERGING DATA MINING PROBLEMS (OEDM 2011)

General Co-Chairs: Chris Ding, Tao Li, and Yong Shi
Program Co-Chairs: Jing He and Fei Wang
Location: Marriott Pinnacle – Pinnacle III

Schedule

08:30 Opening Remarks

08:40 Entity resolution with attribute and connection graph
    Lingfeng Niu
09:00 Boosting Unsupervised Additive Clustering Using Cluster-Wise Optimization and Multi-Label Learning
    Stephen L. France
09:20 A Novel Co-clustering method with Intra-Similarities
    Jian-Sheng Wu
09:40 Efficient Iterative Semi-Supervised Classification on Manifold
    Mehrdad Farajtaba

10:00 BREAK

10:30 Invited Talk: Social Network Analysis by Compression: Not Only Space Saving, but also Insight Gaining
    Prof. Jian Pei

11:30 Stream Prediction Using Representative Episode Rules
    Huisheng Zhu, Peng Wang, Wei Wang, and Baile Shi
11:50 Learning to Group Web Text Incorporating Prior Information
    Yu Cheng
12:10 Active Learning from Positive and Unlabeled Data
    Alireza Ghasemi

12:30 LUNCH BREAK

13:30 Invited Talk: Tractable Convex Formulations of Feature Discovery
    Prof. Dale Schuurman

14:30 Parametric Characterization of Multimodal Distributions with Non-Gaussian Modes
    Ashutosh Tewari
14:50 Integrating pairwise constraints into clustering algorithms: optimization-based approaches
    Sublemontier Jacques-Henri
15:10 Temporal Cross-Sell Optimization Using Action Proxy-Driven Reinforcement Learning
    Nan Li
15:30 Meta-Learning for Selecting a Multi-Label Classification Algorithm
    Lena Chekina
15:50 Kernel-Based Clustering with Automatic Cluster Number Selection
    Chang-Dong Wang

16:10 BREAK

16:30 Twitter Trending Topic Classification
    Kathy Lee
16:50 Classification for Orange Varieties Using Near Infrared Spectroscopy
    Warawut Suphamitmongkol
17:10 Post Mining of Multiple Criteria Linear Programming Classification Model in Credit Card Churning Management
    Yibing Chen

17:30 Closing Remarks

THIRD INTERNATIONAL WORKSHOP ON PRIVACY ASPECTS OF DATA MINING (PADM 2011)

Workshop Chairs: Raghav Bhaskar, Aris Gkoulalas-Divanis, Dan Kifer, and Srivatsan Laxman
Location: Marriott Pinnacle – Shaughnessy I

Schedule

08:45 Opening Session

09:00 Invited Talk: Privacy and Confidentiality and the Release of Large Sparse Contingency Table Data
    Stephen E. Fienberg

10:00 BREAK

10:30 Session I: Empirical Valuation of Utility of Various Privacy Mechanisms

Applicability of regression-tree-based synthetic data methods for business data
    Joo Ho Lee, In Yong Kim, and Christine O'Keefe
An Application of Differentially Private Linear Mixed Modeling
    John Abowd and Matthew Schneider
Privacy Preserving GWAS Data Sharing
    Stephen E. Fienberg, Aleksandra Slavkovic, and Caroline Uhler

12:00 LUNCH BREAK

13:00 Invited Talk: Privacy-Preserving Data Mining at 10: What's Next?
    Christopher W. Clifton

14:00 Session II: Novel Applications

enList: Automatically Simplifying Privacy Policies
    Rajesh Bejugam and Kristen LeFevre
Fairness-aware Learning through Regularization Approach
    Toshihiro Kamishima, Shotaro Akaho, and Jun Sakuma
Data Mining and Privacy of Personal Behaviour Types in Smart Grid
    Georgios Kalogridis and Stojan Denic

15:30 BREAK

16:00 Session III: New Privacy Algorithms

Rating: Privacy Preservation for Multiple Attributes with Different Sensitivity Requirements
    Jinfei Liu, Jun Luo, and Joshua Zhexue Huang
Privacy Preserving Outlier Detection using Locality Sensitive Hashing
    Nisarg Raval, Madhuchand Rushi Pillutla, Piyush Bansal, Kannan Srinathan, and C.V. Jawahar
Preventing Identity Disclosure with Community Preservation in Hypergraphs
    Yidong Li and Hong Shen

17:30 Concluding Remarks


SENTIMENT ELICITATION FROM NATURAL TEXT FOR INFORMATION RETRIEVAL AND EXTRACTION (SENTIRE 2011)

Workshop Organizers: Erik Cambria; Yangqiu Song; Catherine Havasi; and Amir Hussain. Location: Marriott Pinnacle – Pinnacle II

Schedule: Session I

08:30 Opening Remarks

08:40 Keynote by Bing Liu

09:35 STARLET: Multi-document Summarization of Service and Product Reviews with Balanced Rating Distributions Giuseppe Di Fabbrizio

10:00 BREAK

Session II
10:30 Multilingual Sentiment Analysis Using Latent Semantic Indexing and Machine Learning Philip Kegelmeyer
11:00 Deriving Insights from National Happiness Indices Daniel Archambault
11:30 Multi-aspect Sentiment Analysis with Topic Models Myle Ott
12:00 AQA: Aspect-based Opinion Question Answering Samaneh Moghaddam

12:30 LUNCH

Session III
14:00 A Method for Improving Sentiment Classification Using Feature Highlighting and Bagging Lin Dai
14:20 Machine Reading for Notion-Based Sentiment Mining Roula Hobeica
14:40 Mining Opinion Attributes From Texts using Multiple Kernel Learning Aleksander Wawer
15:00 Longitudinal Sales Responses with Online Reviews Wenyin Liu
15:20 Fine-Grained Opinion Mining Using Conditional Random Fields Shabnam Shariaty
15:40 Discourse Structure and Sentiment Martin van den Berg

16:00 BREAK

Session IV
16:30 Detecting General Opinions from Customer Surveys Evgeny Stepanov
17:00 SES: Sentiment Elicitation System for Social Media Data Yusheng Xie
17:30 Concluding Remarks

MORNING WORKSHOPS – 08:30 TO 12:30

THE FOURTH WORKSHOP ON DATA MINING CASE STUDIES (DMCS 2011)

Workshop Chair: Brendan Kitts. Location: Marriott Pinnacle – Ambleside Room

Schedule 08:30 Introduction to Special Session on Data Mining Case Studies

08:45 Analysis of Treatment Patterns: A Case Study on Carpal Tunnel Syndrome Ching Lien and Silvia Figueira
09:00 Evaluation & extension to Duckworth Lewis method: A dual application of data mining techniques Viraj Phanse and Sourabh Deorah
09:15 Dynamic Loan Service Monitoring using Segmented Hidden Markov Models Haengju Lee, Nathan Gnanasambandam, Raj Minhas, Shi Zhao
09:30 "The 100 Most Influential Persons in History": A Data Mining Perspective Noora Al-Naimi and Khaled Shaban
09:45 Crime Forecasting Using Data Mining Techniques Chung-Hsien Yu, Max Ward, Melissa Morabito, and Wei Ding

10:00 BREAK

10:30 Invited talk Television Ad Targeting Brendan Kitts, Dyng Au, Ryan Brooks

10:55 A/B Testing at SweetIM: The Importance of Proper Statistical Analysis Slava Borodovsky and Saharon Rosset
11:10 Evolutionary algorithms for selecting the architecture of a MLP Neural Network: A Credit Scoring Case Alejandro Correa Bahnsen and Andres F. Gonzalez
11:25 Integrating Collaborative Filtering and Search-based Techniques for Personalized Online Product Recommendation Yue Xu, Noraswaliza Abdullah, and Shlomo Geva
11:40 Detection of Anomalous Particles from Deepwater Horizon Oil Spill Using SIPPER3 Underwater Imaging Platform Sergiy Fefilatyev, Kurt Kramer, Lawrence Hall, Dmitry Goldgof, Rangachar Kasturi, Andrew Remsen, Kendra Daly

12:05 An Augmented Vector Space Information Retrieval for Recovering Requirement Traceability Yoshihisa Udagawa

12:20 Closing Remarks and Prize winners

INTERNATIONAL WORKSHOP ON LEARNING AND DATA MINING FOR ROBOTICS (LEMIR 2011)

Workshop Chairs: Einoshin Suzuki and Michèle Sebag Location: Renaissance – Port of New York

Schedule 10:25 Opening by Einoshin Suzuki

Presentations (Chair: Shin Ando)
10:30 Characterizing Anomalous Behaviors and Revising Robotic Controllers David Meunier, Michele Sebag, and Shin Ando
11:00 Adaptive Windowing for Online Learning from Multiple Inter-Related Data Streams Elena Ikonomovska, Kurt Driessens, Saso Dzeroski, and Joao Gama
11:30 Lifted-Rollout for Approximate Policy Iteration of Markov Decision Process Wang-Zhou Dai, Yang Yu, and Zhi-Hua Zhou
12:00 Implementing Camshift on a Mobile Robot for Person Tracking and Pursuit Somar Boubou, Asuki Kouno, and Einoshin Suzuki

LARGE SCALE VISUAL ANALYTICS (LSVA 2011)

Workshop Co-Chairs: Dacheng Tao; Zhu Li; Jun Li; and Aggelos Katsaggelos. Location: Marriott Pinnacle – Coal Harbour

Schedule 08:30 Oral Session (20 min. each)

Clustering Based Fast Low-Rank Approximation for Large-Scale Graph Wei Chen, Ming Shao, and Yun Fu Visual Analysis for Bipartite Networks Supaporn Spanurattana and Tsuyoshi Murata Scalable Visualization of Frequent Patterns Carson Leung and Fan Jiang

09:30 Spotlight (5 min. per entry)

Efficient Indexing for Mobile Image Retrieval Deying Feng, Jie Yang, and Cheng Yang Online Heterogeneous Feature Fusion for Visual Recognition Shuangping Huang and Lianwen Jin

Boosting Highly discriminative and low Redundant Graphlets for Large-scale Outdoor Scene Classification Luming Zhang, Mingli Song, Xiaoyu Deng, Jiajun Bu, and Chun Chen Wisdom of Crowds: Single Image Super-resolution from the Web Jun Li and Dacheng Tao Large Scale Content based Image Retrieval with Learning to Rank Yangxi Li, Bo Geng, Chao Zhou, and Chao Xu Incremental Support Vector Clustering Chang-Dong Wang, Jian-Huang Lai, and Dong Huang From videos to places: Geolocating the world's videos Jasper Snoek, Luciano Sbaiz, and Hrishi Aradhye

10:05 BREAK

10:30 to 12:30 Poster session

INTERNATIONAL WORKSHOP ON SPATIAL AND SPATIO-TEMPORAL DATA MINING (SSTDM 2011)

General Chairs: Shashi Shekhar and Peggy Agouris. Program Chairs: Ranga Raju Vatsavai and Anthony Stefanidis. Location: Renaissance – Hastings Room

Schedule 08:30 Welcome and Opening Remarks

08:35 Invited Talk Prof. Sanjay Chawla

09:15 Incremental Maintenance of Topological Patterns in Spatial-Temporal Database Yi-Cheng Chen, Chao-Ying Wu, and Suh-Yin Lee.
09:35 Efficiently Mining Dynamic Zonal Co-location Patterns Based on Maximal Co-locations Bi-Ru Dai and Meng-Yan Lin.

10:10 BREAK

10:30 High-resolution Urban Image Classification Using Extended Features Raju Vatsavai.
10:50 Trended DTW Based On Piecewise Linear Approximation for Time Series Mining Lei Sun, Yujiu Yang, and Wenhuang Liu.
11:10 Data mining cancer registries: a method for retrospective surveillance of small area time trends in cancer incidence using Bayesian posterior model probabilities Guangquan Li, Sylvia Richardson, Mireille Toledano, Lea Fortunato, and Nicky Best.
11:30 Origin/Destination-estimation using cellular network data Erik Mellergård, Simon Moritz, and Mohamed Zahoor.
11:50 Linearly-Combined Web Sensors for Spatio-Temporal Data Extraction from the Web Shun Hattori

12:05 Spatial and Temporal Analysis of Planet Scale Vehicular Imagery Data Gautam Kumar
12:20 Unknown Object Detection for On-board Robot Vision by Lifting Complex Wavelet Transforms Shigeru Takano and Einoshin Suzuki

DECLARATIVE PATTERN MINING (DPM 2011)

Workshop Organizers: Hiroki Arimura; Raoul Medina; and Jean-Marc Petit. Location: Marriott Pinnacle – Dundarave Room

Schedule 08:15 Foreword by Hiroki Arimura, Raoul Médina, Jean-Marc Petit

08:30 Declarative Heuristic Search for Pattern Set Mining Tias Guns, Siegfried Nijssen, Albrecht Zimmermann, and Luc De Raedt
09:00 A Constraint-based Language for Declarative Pattern Discovery Jean-Philippe Métivier, Patrice Boizumault, Bruno Crémilleux, Mehdi Khiari, and Samir Loudni
09:30 A Constraint Programming Approach for Enumerating Motifs in a Sequence Emmanuel Coquery, Saïd Jabbour, and Lakhdar Saïs

10:00 BREAK

10:30 Constraint-based Pattern Mining in Multi-Relational Databases Siegfried Nijssen, Tias Guns, and Aida Jimenez
11:00 Mining of EL-GCIs Daniel Borchmann and Felix Distel
11:30 CLIM: CLosed Inclusion dependency Mining in databases Fabien De Marchi

12:00 Concluding remarks and opportunities

12:30 LUNCH BREAK

PHD FORUM

Program Chairs: Rosa Meo and Alfredo Cuzzocrea. Location: Marriott Pinnacle – Point Grey Room

Schedule 08:30 Introduction and Welcome Rosa Meo, Co-chair

08:40 Community Evolution in Dynamic Social Networks - Challenges and Problems Mansoureh Takaffoli
09:15 Towards a Framework for Detecting and Managing Opinion Contradictions Mikalai Tsytsarau

09:50 BREAK

10:20 Understanding and Exploiting the Connections between NMF and SVM Vamsi Potluru
10:55 Nearest Neighbor Voting in High-Dimensional Data: Learning from Past Occurrences Nenad Tomasev and Dunja Mladenic
11:30 On Efficient Distance-based Similarity Search Jianquan Liu, Hanxiong Chen, Kazutaka Furuse, Hiroyuki Kitagawa, Jeffrey Xu Yu
12:05 Active Learning of Transfer Relationships for Multiple Related Bayesian Network Structures Diane Oyen

AFTERNOON WORKSHOPS – 14:00 TO 18:00

WORKSHOP ON KNOWLEDGE DISCOVERY FROM CLIMATE DATA: PREDICTION, EXTREMES, AND IMPACTS (CLIMKD 2011)

Workshop Organizers: Nitesh V. Chawla; Auroop R. Ganguly; Vipin Kumar; Michael Steinbach; and Karsten Steinhaeuser. Location: Renaissance – Hastings Room

Schedule 14:00 Welcome & Opening Remarks

14:10 Invited Talk William Hsieh, University of British Columbia

Paper Presentations
15:00 S1208: Sparse Group Lasso for Regression on Land Climate Variables. S. Chatterjee, A. Banerjee, S. Chatterjee, A. Ganguly.
15:20 S1203: Spatio-Temporal Mining of Core Regions: Study of Rainfall Patterns in Monsoonal India. S. Kollukuduru, K. S. Rajan.
15:40 S1207: Evaluating Hurricane Intensity Prediction Techniques in Real Time: Work in Progress. V. Jovanovic, M. Dunham, M. Hahsler, Y. Su.

16:00 BREAK

16:30 S1201: Parallel Kriging Analysis for Large Spatial Datasets. W. Zhuo, Prabhat P., C. Paciorek, C. Kaufman, W. Bethel.
16:50 S1205: Community Dynamics and Analysis of Decadal Trends in Climate Data. W. Hendrix, I. Tetteh, A. Agrawal, F. Semazzi, W.-k. Liao, A. Choudhary.
17:10 S1202: Modeling Unreliable Data and Sensors: Using F-measure Attribute Performance with Test Samples from Low-Cost Sensors. V. Iyer, S. S. Iyengar.

17:30 Open Discussion

18:00 Closing Remarks

Participants in this workshop are encouraged to attend the morning session of the Workshop on Spatial and Spatio-Temporal Data Mining, which features several presentations relevant to the climate domain.

WORKSHOP ON MINING COMMUNITIES AND PEOPLE RECOMMENDERS (COMMPER 2011)

Workshop organizers: Panagiotis Papapetrou; Luiz Augusto Pizzato; Aristides Gionis; and Xiongcai Cai. Location: Renaissance - Port of New York

Schedule 14:00 Opening Remarks

14:10 Invited Talk Algorithmic problems in review-management systems Evimaria Terzi.

15:00 A Diffusion of Innovation-Based Closeness Measure for Network Associations Reihaneh Rabbany Khorasgani and Osmar Zaiane.
15:30 An Effective Expertise Team Formation in Social Networks Based on Skill Grading Farnoush Farhadi, Maryam Sorkhi, Sattar Hashemi, and Ali Hamzeh.

16:00 BREAK

16:30 Employing Team Composition Strategies for Recommending Teams Michele Brocco and Yonata Andrelo Asikin.
17:00 Random Walk based Resource Allocation: Predicting and Recommending Links in Cross-Operator Mobile Communication Networks Yuxiao Dong.
17:30 Time-aware Ranking in Dynamic Citation Networks Rumi Ghosh, Tsung-Ting Kuo, Chun-Nan Hsu, Shou-De Lin, and Kristina Lerman.

CONTRAST DATA MINING AND APPLICATIONS (CONTRASTDM 2011)

Workshop Co-Chairs: Guozhu Dong and James Bailey. Location: Marriott Pinnacle – Ambleside Room

Schedule 14:00
Summarizing Contrasts by Recursive Pattern Mining. Arnaud Soulet, Bruno Crémilleux, and Marc Plantevit
Visually Contrast Two Collections of Frequent Patterns. Carson Leung and Chris Carmichael
Correlation and Contrast Link Formation Patterns in a Time Evolving Graph. Tomonobu Ozaki and Minoru Etoh
A New Algorithm Based on Shared Pattern-tree to Mine Shared Emerging Patterns. Xiangtao Chen and Lijuan Lu
Overview of Contrast Data Mining as a Field and Preview of Upcoming Book. Guozhu Dong and James Bailey

DATA MINING TECHNOLOGIES FOR COMPUTATIONAL COLLECTIVE INTELLIGENCE (DMCCI 2011)

General Co-Chairs: Charu Aggarwal; Tina Eliassi-Rad; and Philip S. Yu. Program Co-Chairs: Hanghang Tong; Fei Wang; and Hong Cheng. Location: Marriott Pinnacle – Point Grey Room

Schedule 14:00 Welcome & Opening Remarks

14:05 Invited Talk Mining Useful Patterns Dr. Jilles Vreeken

15:05 DM313: Exploring Support Vector Machines and Random Forests to Detect Advanced Fraud Activities on Internet Abiodun Modupe.
15:30 DM967: Isanette: A Common and Common Sense Knowledge Base for Opinion Mining Erik Cambria

16:00 BREAK

16:30 S5202: Retweet Modeling Using Conditional Random Fields Huan-Kai Peng.
16:55 S5201: Mobile Phone Graph Evolution: Findings, Model and Interpretation Siyuan Liu.
17:20 S5203: SLPA: Uncovering Overlapping Communities in Social Networks via A Speaker-listener Interaction Dynamic Process Jierui Xie.

17:45 Open Discussion

18:00 Closing Remarks

DATA MINING FOR SERVICE (DMS 2011)

Workshop Chair: Katsutoshi Yada. Location: Marriott Pinnacle – Dundarave Room

Schedule: Session I

14:00 Invited Talk Knowledge discovery in city planning and office building management Naoki Katoh

Discussion

RFM variables revisited using quantile regression Michel Ballings, Dries Benoit, Dirk Van den Poel.
Estimating Post-Event Seller Productivity Profiles in Dynamic Sales Organizations Kush R. Varshney, Moninder Singh, Mayank Sharma, Aleksandra Mojsilovic.
Analysis of Residence Time in Shopping using RFID Data -An Application of the Kernel density estimation to RFID- Shinya Miyazaki, Takashi Washio, Katsutoshi Yada

Discussion

16:00 BREAK

Session II

16:30 Invited Talk Human Information Mining as Netizen Satoshi Kurihara

Discussion

Visualization of Hospital Services using Data Mining Methods Shusaku Tsumoto, Shoji Hirano, Yuko Tsumoto.
An Information Extraction Method from Different Structural Web Sites by Word Distances Between a User Instantiated Label and Similar Entity Daisuke Nakajima, Yuki Mitsui, Masaki Samejima, Masanori Akiyoshi

Discussion

INTERNATIONAL WORKSHOP ON KNOWLEDGE DISCOVERY USING CLOUD AND DISTRIBUTED COMPUTING PLATFORMS (KDCLOUD 2011)

Workshop Chairs: Ranga Raju Vatsavai and Nitesh Chawla. Location: Marriott Pinnacle – Coal Harbour Room

Schedule 14:00 Welcome and Opening Remarks: Dr. Varun Chandola

14:05 Invited Talk

14:45 Epidemic K-Means Clustering Giuseppe Di Fatta, Francesco Blasa, Simone Cafiero, and Giancarlo Fortino.
15:05 A Large Scale URL Verification Pipeline Using Hadoop Songtao Guo and Jianxiong Dong.
15:25 Temporal Distributed Learning with Heterogeneous Data Using Gaussian Mixtures Dean Teffer, Amanda Hutton, and Joydeep Ghosh.
15:45 Finding the Needle: Locating Interesting Nodes Using the K-Shortest Paths Algorithm in MapReduce Christopher McCubbin, Andrew Levine, Bryan Perozzi, and Abdul Rahman.

16:00 BREAK

16:30 The Deployment of MML for Data Analytics over the Cloud Jonathan Tancer and Aparna Varde
16:50 Detecting Abnormal Machine Characteristics in Cloud Infrastructures Kanishka Bhaduri, Kamalika Das, and Bryan Matthews
17:10 Latency Minimizing Tasking for Information Processing Systems James Horey and Brent Lagesse
17:25 Frequent Pairs in Data Streams: Exploiting Parallelism and Skew Andrea Campagna, Konstantin Kutzkov, and Rasmus Pagh
17:40 LI-MR: A Local Iteration Map/Reduce Model and Its Application to Mine Community Structure in Large-scale Networks Qiuhong Li, Zhihui Wang, and Wei Wang

17:55 Closing Remarks: Dr. Varun Chandola

TECHNICAL CONFERENCE PROGRAM

MONDAY, DECEMBER 12

08:30 Opening - Marriott Pinnacle - Pinnacle Ballroom

08:40 Keynote On Schema Discovery Dr. Renee J. Miller Pinnacle Ballroom

09:50 BREAK

10:10 Sessions start

SESSION 1A: GRAPH MINING SESSION CHAIR: JAMES BAILEY LOCATION: Pinnacle I

Regular Beyond 'Caveman Communities': Hubs and Spokes for Graph Compression and Mining. U Kang and Christos Faloutsos An In-Depth Study of Stochastic Kronecker Graphs C. Seshadhri, Ali Pinar, and Tamara Kolda Infrastructure Pattern Discovery in Configuration Management Databases via Large Sparse Graph Mining Pranay Anchuri; Mohammed Zaki; Omer Barkol; Ruth Bergman; Yifat Felder; Shahar Golan; and Arik Sityon Fast and Robust Graph-based Transductive Learning via Minimum Tree Cut Yan-Ming Zhang, Kaizhu Huang, and Cheng-Lin Liu Mining Heavy Subgraphs in Time-Evolving Networks Petko Bogdanov; Misael Mongiovi; and Ambuj K. Singh

Short Scalable Diversified Ranking on Graphs Ronghua Li and Jeffrey Xu Yu Entropy-Based Graph Mining: Application to Biological and Social Networks Young-Rae Cho and Edward Casey Kenley

SESSION 1B: FEATURES SESSION CHAIR: LARRY HALL LOCATION: Pinnacle II

Regular Random Forest Based Feature Induction Celine Vens and Fabrizio Costa Structured Feature Selection and Task Relationship Inference for Multi-Task Learning Hongliang Fei and Jun Huan Feature Selection via $\ell_{2,1}$-Norm Support Vector Machine Xiao Cai; Feiping Nie; Heng Huang; and Chris Ding An Efficient Greedy Method for Unsupervised Feature Selection Ahmed Farahat; Ali Ghodsi; and Mohamed S. Kamel Partitionable kernels for mapping kernels Kilho Shin

Short Optimizing Performance Measures for Feature Selection QI MAO and Ivor Wai Hung TSANG Calculating Feature Weights in Naive Bayes with Kullback-Leibler Measure Chang-Hwan Lee; Fernando Gutierrez; and Dejing Dou

SESSION 1C: FREQUENT PATTERNS SESSION CHAIR: BART GOETHALS LOCATION: Pinnacle III

Regular Causal Associative Classification Kui Yu; Xindong Wu; Wei Ding; and Hao Wang Efficient Mining of a Concise and Lossless Representation of High Utility Itemsets Cheng-Wei Wu; Philippe Fournier-Viger; Philip S. Yu; and Vincent S. Tseng CEMiner - An Efficient Algorithm for Mining Closed Patterns from Interval-based Data Yi-Cheng Chen; Wen-Chih Peng; and Suh-Yin Lee Interesting Multi-Relational Patterns Eirini Spyropoulou and Tijl De Bie Finding Robust Itemsets under Subsampling Nikolaj Tatti and Fabian Moerchen

Short Efficient Incremental Mining of Closed Sequential Patterns and its Application in Query Completion over Structural Data Chuancong Gao; Jianyong Wang; and Qingyan Yang Using frequent closed itemsets for data dimensionality reduction Petr Krajca; Jan Outrata; and Vilem Vychodil

12:10 to 1:30 LUNCH (not provided)

SESSION 2A: GRAPHS SESSION CHAIR: HONG CHENG

LOCATION: Pinnacle I

Regular Tag Refinement on Semantic Unity Graph Yueting Zhuang; Yang Liu; Yin Zhang; Jian Shao; and Fei Wu D-cores: measuring cohesion and collaboration of directed graphs Christos Giatsidis; Michalis Vazirgiannis; and Dimitrios Thilikos Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks B. Aditya Prakash; Deepayan Chakrabarti; Michalis Faloutsos; Nicholas Valler; and Christos Faloutsos BibClus: A Clustering Algorithm of Bibliographic Networks by Message Passing on Center Linkage Structure. Xiaoran Xu and Zhi-Hong Deng Algorithms for Mining the Evolution of Conserved Relational States in Dynamic Networks Rezwan Ahmed and George Karypis

Short A Study of Laplacian Spectra of Graph for Subgraph Queries Lei Zhu and Qinbao Song DIGRank: Using Global Degree to Facilitate Ranking in an Incomplete Graph Xiang Niu; Lusong Li; Xiaobing Xiong; Daniel Tkach; He Li; and Ke Xu

SESSION 2B: SIMILARITY AND DISTRIBUTION SESSION CHAIR: ZHI-HUA ZHOU LOCATION: Pinnacle II

Regular Multi-Instance Metric Learning Ye Xu and Wei Ping Sparse Domain Adaptation in Projection Spaces based on Good Similarity Functions Emilie Morvant; Amaury Habrard; and Stéphane Ayache Density Estimation based on Mass Kai Ming Ting; Takashi Washio; Jonathan Wells; and Tony Liu Word Cloud Model for Text Categorization Tam T. Nguyen; Kuiyu Chang; and Siu Cheung Hui Learning Dirichlet Processes from Partially Observed Groups Avinava Dubey; Indrajit Bhattacharya; Mrinal Das; Tanveer Faruquie; and Chiranjib Bhattacharyya

Short Modeling High-Level Behavior Patterns for Precise Similarity Analysis of Software Taeho Kwon and Zhendong Su Mixture of softmax sLDA Xiao Xu LI

SESSION 2C: SEQUENCES AND PATTERNS SESSION CHAIR: ARIS GKOULALAS-DIVANIS LOCATION: Pinnacle III

Regular Role-behavior Analysis from Trajectory Data by Cross-domain Learning Shin Ando and Einoshin Suzuki A Taxi Driving Fraud Detection System Yong Ge; Hui Xiong; Chuanren Liu; and Zhi-Hua Zhou Finding Novel Diagnostic Gene Patterns based on Interesting Non-redundant Contrast Sequence Rules Yuhai Zhao and Guoren Wang Efficiently Mining Unordered Trees Mostafa Haghir Chehreghani A Generalized Fast Subset Sums Framework for Bayesian Event Detection Kan Shao; Yandong Liu; and Daniel Neill

Short Online Multi-Task Learning for Personalized Continuous Activity Recognition Xu Sun; Hisashi Kashima; Ryota Tomioka; and Naonori Ueda Helix: Unsupervised Grammar Induction for Structured Human Activity Recognition Huan-Kai Peng; Pang Wu; Jiang Zhu; and Ying Zhang

SESSION 3A: GROUPS, INFLUENCES, AND TIME SESSION CHAIR: DEJING DOU LOCATION: Pinnacle I

Regular Exploiting False Discoveries – Statistical Validation of Patterns and Quality Measures in Subgroup Discovery Wouter Duivesteijn and Arno Knobbe Local Models for Expectation-Driven Subgroup Discovery Florian Lemmerich and Frank Puppe SIMPATH: An Efficient Algorithm for Influence Maximization under the Linear Threshold Model Amit Goyal, Wei Lu, and Laks Lakshmanan Minimizing Seed Set for Viral Marketing Long Cheng and Raymond Chi-Wing WONG

Short Discovering Emerging Topics in Social Streams via Link Anomaly Detection Toshimitsu Takahashi, Ryota Tomioka, and Kenji Yamanishi Characterizing Inverse Time Dependency in Multi-class Learning Danqi Chen, Weizhu Chen, and Qiang Yang Discovery of Versatile Temporal Subspace Patterns in 3-D Datasets Zhen Hu and Raj Bhatnagar

SESSION 3B: LEARNING SESSION CHAIR: WEI DING LOCATION: Pinnacle II

Regular Simple Multiple Noisy Label Utilization Strategies Victor Sheng Clusterability Analysis and Incremental Sampling for Nystrom Extension Based Spectral Clustering Xianchao Zhang and Quanzeng You Multi-Task Learning with Task Relations Zhao Xu and Kristian Kersting Understanding Propagation Error and Its Effect on Collective Classification Rongjing Xiang and Jennifer Neville Learning classification with auxiliary probabilistic information Quang Nguyen, Hamed Valizadegan, and Milos Hauskrecht

Short Manifold Learning and Missing Data Recovery through Unsupervised Regression Miguel Carreira-Perpinan and Zhengdong Lu Co-clustering for Binary and categorical data with maximum Modularity Lazhar Labiod and Mohamed Nadif

SESSION 3C: TIME SERIES AND SEQUENCES SESSION CHAIR: KAI-MING TING

LOCATION: Pinnacle III

Regular Using Bayesian Network Learning Algorithm to Discover Causal Relations in Multivariate Time Series Zhenxing Wang and Laiwan Chan Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data Thanawin Rakthanmanon, Eamonn Keogh, Stefano Lonardi, and Scott Evans Recursive multi-step time series forecasting by perturbing data Souhaib Ben Taieb and Gianluca Bontempi Enabling Fast Lazy Learning for Data Streams Peng Zhang, Byron Gao, Xingquan Zhu, and Li Guo A New Markov Model for Clustering Categorical Sequences Tengke Xiong, Shengrui Wang, Qingshan Jiang, and Joshua Zhexue Huang

Short Discovering the Intrinsic Cardinality and Dimensionality of Time Series using MDL Bing Hu, Thanawin Rakthanmanon, Yuan Hao, Scott Evans, Stefano Lonardi, and Eamonn Keogh SPO: Structure Preserving Oversampling for Imbalanced Time Series Classification Hong Cao, Xiao-Li Li, Yew-Kwong Woon, and See-Kiong Ng

TUESDAY, DECEMBER 13

08:30 Keynote Marriott Pinnacle - Pinnacle Ballroom Data Mining and Information Extraction for CiteSeerX and Friends Dr. Lee Giles

09:45 BREAK

10:00 Sessions Start

PANEL: DATA MINING MEETS EXASCALE MANY-CORE COMPUTING

Location: Point Grey Room

Time: 10:00 a.m. to 12:00

Panelists: Barbara Chapman (University of Houston); Edwin Pednault (IBM); Chandrika Kamath (Lawrence Livermore National Laboratory); and Srinivasan Parthasarathy (Ohio State University)

SESSION 4A: TOPIC ANALYSIS AND TEXT SESSION CHAIR: KYUSEOK SHIM

LOCATION: Pinnacle I

Regular LPTA: A Probabilistic Model for Latent Periodic Topic Analysis Zhijun Yin, Liangliang Cao, Jiawei Han, Chengxiang Zhai, and Thomas Huang Generating Breakpoint-based Timeline Overview for News Topic Retrospection Po Hu, Minlie Huang, Peng Xu, Weichang Li, Adam Usadi, and Xiaoyan Zhu SolarMap: Multifaceted Visual Analytics for Topic Exploration Jimeng Sun, Nan Cao, David Gotz, Yu-Ru Lin, and Huamin Qu Analysis of Textual Variation by Latent Tree Structures Teemu Roos and Yuan Zou

Short Text Clustering via Constrained Nonnegative Matrix Factorization Yan Zhu, Liping Jing, and Jian Yu Tracking and Connecting Topics via Incremental Hierarchical Dirichlet Processes Zekai Gao, Yangqiu Song, Shixia Liu, and Haixun Wang Cross-temporal Link Prediction Satoshi Oyama, Kohei Hayashi, and Hisashi Kashima Supervised Lazy Random Walk for Topic-Focused Multi-Document Summarization Pan Du, Jiafeng Guo, and Xue-Qi Cheng

SESSION 4B: MACHINE LEARNING AND APPLICATIONS SESSION CHAIR: CHARLES LING LOCATION: Pinnacle II

Regular COMET: A Recipe for Learning and Using Large Ensembles on Massive Data Justin Basilico, Arthur Munson, Tamara Kolda, Kevin Dixon, and Philip Kegelmeyer Confidence in Predictions from Random Tree Ensembles Siddhartha Bhattacharyya Ranking Web-based Partial Orders by Significance Using a Markov Reference Model Michel Speiser, Gianluca Antonini, and Abderrahim Labbi Class Imbalance, Redux Byron Wallace, Kevin Small, Carla E. Brodley, and Thomas Trikalinos Learning Markov Logic Networks via Functional Gradient Boosting Tushar Khot, Sriraam Natarajan, Kristian Kersting, and Jude Shavlik

Short Learning from Negative Examples in Set-Expansion Prateek Jindal and Dan Roth Low Rank Metric Learning with Manifold Regularization Guoqiang Zhong, Kaizhu Huang, and Cheng-Lin Liu

SESSION 4C: VIDEO, IMAGES AND BIOINFORMATICS SESSION CHAIR: GUOZHU DONG LOCATION: Pinnacle III

Regular Mining Historical Archives for Near-Duplicate Figures Thanawin Rakthanmanon, Qiang Zhu, and Eamonn Keogh Combining feature context and spatial context for image data mining Hongxing Wang, Junsong Yuan, and Yap-Peng Tan Improving Product Classification Using Images Anitha Kannan, Partha Talukdar, Nikhil Rasiwasia, and Qifa Ke Learning Tags from Unsegmented Videos of Multiple Human Actions Timothy Hospedales, Shaogang Gong, and Tao Xiang

Short Learning Protein Folding Energy Functions Wei Guan, Arkadas Ozakin, Alexander Gray, Jose Borreguero, Shashi Pandit, and Jeffrey Skolnick Web Horror Image Recognition Based on Context-Aware Multi-Instance Learning Bing Li, Weihua Xiong, and Weiming Hu Identifying Differentially-Expressed Genes via Weighted Rank Aggregation Qiong Fang, Jianlin Feng, and Wilfred Ng Discovering Thematic Patterns in Videos via Cohesive Sub-graph Mining Gangqiang Zhao and Junsong Yuan

WEDNESDAY, DECEMBER 14

08:30 Keynote Marriott Pinnacle - Pinnacle Ballroom The Promise of Differential Privacy Dr. Cynthia Dwork

09:45 BREAK

10:00 Sessions Start

SESSION 5A: INTRUSION AND ANOMALY DETECTION SESSION CHAIR: CHARLES PERNG LOCATION: Pinnacle I

Regular Detection of Cross-Channel Anomalies From Multiple Data Channels Duc-Son Pham, Saha Budhaditya, Dinh Phung, and Svetha Venkatesh Direct Robust Matrix Factorization for Anomaly Detection Liang Xiong, Xi Chen, and Jeff Schneider Conditional Anomaly Detection with Soft Harmonic Functions Michal Valko, Branislav Kveton, Hamed Valizadegan, Gregory Cooper, and Milos Hauskrecht Incremental Elliptical Boundary Estimation for Anomaly Detection in Wireless Sensor Networks Masud Moshtaghi, Christopher Leckie, Shanika Karunasekera, James Bezdek, Sutharshan Rajasegarar, and Marimuthu Palaniswami

Short Identities Anonymization in Dynamic Social Networks Chih-Hua Tai, Peng-Jui Tseng, Philip S. Yu, and Ming-Syan Chen Review Graph based Online Store Review Spammer Detection Guan Wang, Sihong Xie, Bing Liu, and Philip S. Yu Unsupervised Anomaly Intrusion Detection Via Localized Bayesian Feature Selection Wentao Fan, Nizar Bouguila, and Djemel Ziou A Spectral Framework for Detecting Inconsistency across Multi-Source Object Relationships Jing Gao, Wei Fan, Deepak Turaga, Srinivasan Parthasarathy, and Jiawei Han

SESSION 5B: CLASSIFICATION SESSION CHAIR: ROSA MEO LOCATION: Pinnacle II

Regular Towards Optimal Discriminating Order for Multiclass Classification Dong Liu, Shuicheng Yan, Yadong Mu, Xian-Sheng Hua, and Hong-Jiang Zhang On Generating all Optimal Monotone Classifications Luite Stegeman and Ad Feelders Healing Sample Selection Bias by Source Classifier Selection Chun Wei Seah, Ivor Wai Hung TSANG, and Yew-Soon Ong Positive and Unlabeled Learning for Graph Classification Yuchen Zhao, Xiangnan Kong, and Philip S. Yu An Analysis of Performance Measures for Binary Classifiers Charles Parker

Short Classifying Categorical Data by Rule-based Neighbors Jiabing Wang, Pei Zhang, Guihua Wen, and Jia Wei Twin Gaussian Processes for Binary Classification Jianjun He, Hong Gu, and Zhelong Wang

SESSION 5C: PATTERNS AND APPLICATIONS SESSION CHAIR: JIAN HUANG

LOCATION: Pinnacle III

Regular Signature Pattern Covering via Local Greedy Algorithm and Pattern Shrink Hyungsul Kim, David Sheridan, Sungjin Im, Shobha Vasudevan, Tarek Abdelzaher, and Jiawei Han Diverse Dimension Decomposition of an Itemsets Space Mikalai Tsytsarau, Themis Palpanas, Francesco Bonchi, and Aristides Gionis Mining Dominant Patterns in the Sky Arnaud Soulet, Chedy Raïssi, Marc Plantevit, and Bruno Crémilleux A Hypergraph-based Method for Discovering Semantically Associated Itemsets Haishan Liu, Paea LePendu, Ruoming Jin, and Dejing Dou Cross Domain Random Walk for Query Intent Pattern Mining from Search Engine Log Siyu Gu, Jun Yan, and Shuicheng Yan ADANA: Actively Disambiguating Person Names with User Interaction Xuezhi Wang, Jie Tang, Hong Cheng, and Philip S. Yu

12:00 to 1:30 COMMUNITY MEETING with Lunch Open to all ICDM participants Renaissance Harbourside Ballroom

SESSION 6A: COMMUNITIES AND PRIVACY SESSION CHAIR: TANYA BERGER-WOLF

LOCATION: Pinnacle I

Regular Detecting Community Kernels in Large Social Networks Liaoruo Wang, Tiancheng Lou, Jie Tang, and John Hopcroft Inferring the Diffusion and Evolution of Topics in Social Communities Xide Lin, Qiaozhu Mei, Jiawei Han, Yunliang Jiang, and Marina Danilevsky LinkBoost: A Novel Cost-Sensitive Boosting Framework for Community-Level Network Link Prediction Prakash Mandayam Comar, Pang-Ning Tan, and Anil Jain Privacy Risk In Graph Stream Publishing For Social Network Applications Nigel Medforth and Ke Wang Secure Clustering in Private Networks Bin Yang, Issei Sato, and Hiroshi Nakagawa

Short Finding Communities in Dynamic Social Networks Chayant Tantipathananandh and Tanya Berger-Wolf On the Hardness of Graph Anonymization Charu Aggarwal, Yao Li, and Philip S. Yu

SESSION 6B: CLUSTERING SESSION CHAIR: LATIFUR R. KHAN

LOCATION: Pinnacle II

Regular Nonnegative Matrix Tri-Factorization Based Simultaneous Clustering of Large-Scale Multi-Type Related Data Hua Wang, Feiping Nie, Heng Huang, and Chris Ding A Robust Clustering Algorithm Based on Aggregated Heat Kernel Mapping Hao Huang, Shinjae Yoo, Hong Qin, and Dantong Yu Flexible Fault Tolerant Subspace Clustering Stephan Günnemann, Emmanuel Müller, Sebastian Raubach, and Thomas Seidl Detection of Arbitrarily Oriented Synchronized Clusters in High-dimensional Data Junming Shao, Claudia Plant, Qinli Yang, and Christian Boehm Overlapping Correlation Clustering Francesco Bonchi, Aristides Gionis, and Antti Ukkonen

Short Clustering with Attribute-Level Constraints Jana Schmidt, Elisabeth Braendle, and Stefan Kramer A Fast and Flexible Clustering Algorithm Using Binary Discretization Mahito Sugiyama and Akihiro Yamamoto

SESSION 6C: REDUCTION, SUMMARIZATION AND SIMPLIFICATION SESSION CHAIR: XIAOWEI XU LOCATION: Pinnacle III

Regular S-preconditioner for Multi-fold Data Reduction with Guaranteed User-controlled Accuracy Ye Jin, Sriram Lakshminarasimhan, Neil Shah, Zhenhuan Gong, and Nagiza Samatova Handling Conditional Discrimination Indre Zliobaite, Faisal Kamiran, and Toon Calders Learning to Rank for Query-focused Multi-Document Summarization Chao Shen and Tao Li Heuristic Updatable Weighted Random Subspaces for Nonstationary Environments Thomas Hoens, Robi Polikar, and Nitesh Chawla Maximum Entropy Modelling for Assessing Results on Real-Valued Data Kleanthis-Nikolaos Kontonasios, Jilles Vreeken, and Tijl De Bie

Short Distance Preserving Graph Simplification Ning Ruan, Ruoming Jin, and Yan Huang How Does Research Evolve? Pattern Mining for Research Meme Cycles Dan He, Xingquan Zhu, and Douglas Parker

SESSION 7A: RECOMMENDATION SESSION CHAIR: ANKUR TEREDESAI

LOCATION: Pinnacle I

Regular

Personalized Travel Package Recommendation
Qi Liu, Yong Ge, Zhongmou Li, EnHong Chen, and Hui Xiong

Novel Recommendation based on Personal Popularity Tendency
Jinoh Oh, Sun Park, Hwanjo Yu, Min Song, and Seung-Taek Park

Patent Maintenance Recommendation with Patent Information Network Model
Xin Jin, Scott Spangler, Ying Chen, Keke Cai, Rui Ma, Li Zhang, Xian Wu, and Jiawei Han

TWITOBI: A Recommendation System for Twitter using Probabilistic Modeling
Younghoon Kim and Kyuseok Shim

SLIM: Sparse Linear Methods for Top-N Recommender Systems
Xia Ning and George Karypis

Short

Detecting Recurring and Novel Classes in Concept-Drifting Data Streams
Mohammad Masud, Tahseen Al-Khateeb, Latifur Khan, Charu Aggarwal, Jing Gao, and Jiawei Han

ASAP: A Self-Adaptive Prediction System for Instant Cloud Resource Demand Provisioning
Yexi Jiang, Chang-shing Perng, Tao Li, and Rong Chang


SESSION 7B: SEMI-SUPERVISED LEARNING SESSION CHAIR: NITIN INDURKHYA

LOCATION: Pinnacle II

Regular

Semi-supervised Hierarchical Clustering
Li Zheng and Tao Li

Learning Spectral Embedding for Semi-supervised Clustering
Fanhua Shang, L.C. Jiao, YuanYuan Liu, and Fei Wang

Semi-Supervised Feature Importance Evaluation with Ensemble Learning
Barkia Hasna, Elghazel Haytham, and Aussem Alex

Learning with Minimum Supervision: A General Framework for Transductive Transfer Learning
Mohammad Taha Bahadori, Yan Liu, and Dan Zhang

Isograph: Neighbourhood Graph Construction Based On Geodesic Distance For Semi-Supervised Learning
Marjan Ghazvininejad, Mostafa Mahdieh, Hamid R. Rabiee, Parisa Khanipour Roshan, and Mohammad Hossein Rohban

Short

Semi-Supervised Discriminant Hashing
Saehoon Kim and Seungjin Choi

Constraint selection based semi-supervised feature selection
Mohammed Hindawi, Kais Allab, and Khalid Benabdeslem

SESSION 7C: MATRIX, TENSOR AND SPARSE REPRESENTATION SESSION CHAIR: SIEGFRIED NIJSSEN

LOCATION: Pinnacle III

Regular

Multi-Task Learning for Bayesian Matrix Factorization
Chao Yuan

Document Clustering via Matrix Representation
Xufei Wang, Jiliang Tang, and Huan Liu

Boolean Tensor Factorizations
Pauli Miettinen

Context-Aware Multi-Instance Learning based on Hierarchical Sparse Representation
Bing Li, Weihua Xiong, and Weiming Hu

Short

Tensor Fold-in Algorithms for Social Tagging Prediction
Miao Zhang and Chris Ding

A Fixed Parameter Tractable Integer Program for Finding the Maximum Order Preserving Submatrix
Jens Humrich, Thomas Gärtner, and Gemma Garriga


DEMOS AND EXHIBITIONS PROGRAM

GENERAL INFORMATION FOR EXHIBITS AND DEMOS

Date: Monday, December 12.

Time: 13:30 to 17:45

BREAK: 15:30 to 15:45

Location: Marriott Pinnacle - Shaughnessy Salon

ACCEPTED ICDM 2011 DEMOS

SO001 BeTracker: A System for Finding Behavioral Patterns from Contextual Sensor and Social Data
Hsun-Ping Hsieh (National Taiwan University), Cheng-Te Li (National Taiwan University)

SO002 TeamExp: Top-k Team Formation in Social Networks
Mehdi Kargar (York University), Aijun An (York University)

SO003 SentiCorr: Multilingual Sentiment Analysis of Personal Correspondence
Erik Tromp (Eindhoven University of Technology), Mykola Pechenizkiy (Eindhoven University of Technology)

SO004 Route Discovery from Mining Uncertain Trajectories
Hechen Liu (University of Florida), Ling-Yin Wei (National Chiao Tung University), Yu Zheng (Microsoft Research), Markus Schneider (University of Florida), Wen-Chih Peng (National Chiao Tung University)

SO005 NetDriller: A Powerful Social Network Analysis Tool
Negar Koochakzadeh (University of Calgary), Atieh Sarraf (University of Calgary), Keivan Kianmehr (University of Western Ontario), Jon Rokne (University of Calgary), Reda Alhajj (University of Calgary, Global University)

SO006 AskUs: an Opinion Search Engine
Shamita Pisal (Aruba Networks), Japinder Singh (Google Inc), Magdalini Eirinaki (San Jose State University)

SO007 BioGraph: Knowledge Discovery and Exploration in the Biomedical domain
Jeroen De Knijf (Antwerp University), Anthony Liekens (Antwerp University), Walter Daelemans (Antwerp University), Peter De Rijk (Antwerp University), Jurgen Del-Favero (Antwerp University), Bart Goethals (Antwerp University)


EXHIBITIONS

SoftProEuro Lucian Hancu Romania

Tableau Software Sophia Kan United States

Conrady Applied Science, LLC Stefan Conrady United States


TUTORIALS PROGRAM

MONDAY, DECEMBER 12

TUTORIAL 1: SENTIMENT ANALYSIS IN PRACTICE Location: Point Grey Room

Times: 10:10 to 12:10 LUNCH – 12:10 to 13:30 13:30 to 14:30

Presenters: Yongzheng (Tiger) Zhang; Dan Shen; and Catherine Baudin – eBay Research Labs

TUTORIAL 2: MINING SOCIAL NETWORKS FOR RECOMMENDATION Location: Point Grey Room

Times: 14:30 to 15:30 BREAK – 15:30 to 15:45 15:45 to 17:45

Presenters: Mohsen Jamali and Martin Ester – Simon Fraser University

WEDNESDAY, DECEMBER 14

TUTORIAL 3: MINING SETS OF PATTERNS: NEXT GENERATION PATTERN MINING Location: Point Grey Room

Times: 10:10 to 12:10 LUNCH – 12:10 to 13:30 13:30 to 14:30

Presenters: Bjorn Bringmann; Siegfried Nijssen; Nikolaj Tatti; Jilles Vreeken; and Albrecht Zimmermann - Katholieke Universiteit Leuven

TUTORIAL 4: ANOMALY DETECTION: A TUTORIAL Location: Point Grey Room

Times: 14:30 to 15:30 BREAK – 15:30 to 15:45 15:45 to 17:45

Presenters: Sanjay Chawla and Varun Chandola – University of Sydney


ICDM 2011 KEYNOTE SPEECHES

TITLE: THE PROMISE OF DIFFERENTIAL PRIVACY Dr. Cynthia Dwork, Distinguished Scientist, Microsoft Research

Abstract: "Differential privacy" describes a promise, made by a data curator to a data subject: you will not be affected, adversely or otherwise, by allowing your data to be used in any study, no matter what other studies, data sets, or information from other sources is available. At their best, differentially private mechanisms can make confidential data widely available for accurate data mining, without resorting to data clean rooms, institutional review boards, restricted views, or data protection plans. Nonetheless, a fundamental limit exists: overly accurate answers to too many questions will destroy privacy. Differentially private access provides a measure of cumulative privacy risk, permitting data access to be interdicted before privacy is destroyed. The goal of algorithmic research on differential privacy is to postpone this inevitability as long as possible. This talk surveys the current state of the art.

Speaker Bio Cynthia Dwork, Distinguished Scientist at Microsoft Research, is the world's foremost expert on placing privacy-preserving data analysis on a mathematically rigorous foundation. A cornerstone of this work is differential privacy, a strong privacy guarantee permitting highly accurate data analysis. Dr. Dwork has also made seminal contributions in cryptography and distributed computing, and is a recipient of the Edsger W. Dijkstra Prize, recognizing some of her earliest work establishing the pillars on which every fault-tolerant system has been built for decades. She is a member of the US National Academy of Engineering and a Fellow of the American Academy of Arts and Sciences.

TITLE: DATA MINING AND INFORMATION EXTRACTION FOR CITESEERX AND FRIENDS Dr. C. Lee Giles

Abstract: Cyberinfrastructure or e-science has become crucial in many areas of science where data access often defines scientific progress. Open source (OS) systems have greatly facilitated the design and implementation of supporting cyberinfrastructure, permitting the design of specialized integrated search engines and digital libraries. These offer many opportunities for domain-relevant information and knowledge extraction, such as citation extraction, automated indexing and ranking, chemical formulae search, and table indexing. We describe the open source SeerSuite architecture, a modular, extensible system built on successful OS projects such as Lucene/Solr, and discuss issues in building domain-specific enterprise search and cyberinfrastructure for the sciences and academia. Because of the large amount of information crawled and/or searched, there are many scale problems in information extraction and data mining, such as author and entity disambiguation and data extraction and ranking. We highlight application domains with examples from computer science (CiteSeerX), chemistry (ChemXSeer), and related problem areas.

Because such enterprise systems require unique information extraction approaches, several different machine learning methods, such as conditional random fields, support vector machines, mutual information based feature selection, and sequence mining, are critical for performance. We draw lessons for other e-science and cyberinfrastructure systems in terms of design, implementation and research, and discuss future directions for systems and research.


Speaker Bio Dr. C. Lee Giles holds the David Reese Professorship at the Pennsylvania State University, University Park, PA, with appointments in the College of Information Sciences and Technology, Computer Science and Engineering, and Supply Chain and Information Systems. He is or has been on the following program committees: KDD, SIGIR, WWW, JCDL, ICML, NIPS, AAAI. He is a Fellow of the ACM, IEEE and INNS. He is probably best known for his work on estimating the size of the web and for the search engine and digital library CiteSeer, which he co-created, developed and maintained. He has published over 300 refereed articles.

TITLE: ON SCHEMA DISCOVERY Dr. Renée J. Miller

Abstract: Structured data is distinguished from unstructured data by the presence of a schema describing the logical structure and semantics of the data. The schema is the means through which we understand and query the underlying data. Schemas enable data independence. In this talk, I consider new challenges in the old problem of schema discovery. I'll discuss the changing role of schemas from prescriptive to descriptive. I'll use examples from Web data publishing and from Business Analytics to motivate the automation of schema discovery and maintenance.

Speaker Bio Renée J. Miller received BS degrees in Mathematics and in Cognitive Science from the Massachusetts Institute of Technology. She received her MS and PhD degrees in Computer Science from the University of Wisconsin in Madison, WI. She received the Presidential Early Career Award for Scientists and Engineers (PECASE), the highest honor bestowed by the United States government on outstanding scientists and engineers beginning their careers. She received the National Science Foundation Early Career Award. She is a Fellow of the ACM, the President of the VLDB Endowment, and the Program Chair for ACM SIGMOD 2011 in Athens, Greece. Her research interests are in the efficient, effective use of large volumes of complex, heterogeneous data. This interest spans data integration, data exchange, knowledge curation and data cleaning. She is a Professor and the Bell Canada Chair of Information Systems at the University of Toronto. In 2011, she was elected to the Fellowship of the Royal Society of Canada (FRSC), Canada's National Academy.


SOCIAL PROGRAM

CONFERENCE RECEPTION Monday, December 12 18:00 to 20:00 Marriott Renaissance - Harbourside Ballroom

EXCURSION Our excursion this year will be weather dependent. If the weather allows we will be visiting The Capilano Suspension Bridge located on Vancouver’s North Shore. If the weather is poor, we will be visiting the Vancouver Aquarium located within Vancouver’s famous Stanley Park. Either way, pick-up and drop-off times will be the same.

Tuesday, December 13 Pick-up at the Marriott Pinnacle front entrance 14:00

CONFERENCE BANQUET/AWARD CEREMONY Tuesday, December 13 18:00 to 21:00 Renaissance – Harbourside Ballroom

COMMUNITY MEETING WITH LUNCH Wednesday, December 14 12:00 to 13:30 Renaissance - Harbourside Ballroom


ORGANISING COMMITTEE

CONFERENCE CHAIRS

GENERAL CO-CHAIRS
Osmar Zaïane, University of Alberta, Canada
Wei Wang, University of North Carolina at Chapel Hill, USA

PROGRAM CO-CHAIRS
Jian Pei, Simon Fraser University, Canada
Diane Cook, Washington State University, USA

ICDM STEERING COMMITTEE CHAIR
Xindong Wu, University of Vermont, USA

PHD FORUM CO-CHAIRS
Rosa Meo, University of Torino
Alfredo Cuzzocrea, Institute of High Performance Computing and Networking, Italy

PUBLICITY CO-CHAIRS
Olfa Nasraoui, University of Louisville
Latifur Khan, University of Texas at Dallas
Jie Tang, Tsinghua University

WORKSHOPS CO-CHAIRS
Myra Spiliopoulou, University of Magdeburg
Haixun Wang, Microsoft Research

TUTORIALS CO-CHAIRS
Evimaria Terzi, Boston University
Jure Leskovec, Stanford

EXHIBITS AND DEMOS CO-CHAIRS
Ming Hua, Facebook
Alex Thomo, University of Victoria, Canada

PANEL CHAIR
George Karypis, University of Minnesota

DOCUMENTATION CHAIR
Gabor Melli, Prediction Works

WEBMASTER
Justin Fagnan, University of Alberta, Canada

ICDM 2011 PANELISTS
Barbara Chapman, University of Houston
Edwin Pednault, IBM
Chandrika Kamath, Lawrence Livermore National Laboratory
Srinivasan Parthasarathy, Ohio State University

CONTEST CO-CHAIRS
Ashok Srivastava, NASA Ames
Larry Holder, Washington State University

Howie Fung, Wikimedia Foundation Diederik van Liere, Wikimedia Foundation

LOCAL ARRANGEMENTS CHAIR Carson Leung, University of Manitoba

FINANCE CHAIR Charles X. Ling, University of Western Ontario

SPONSORSHIP CO-CHAIRS

Wei Ding, University of Massachusetts Boston Gabor Melli, Prediction Works


PROGRAM COMMITTEE

VICE CHAIRS

• Deepak Agarwal, Yahoo! Research
• Charu C. Aggarwal, IBM T. J. Watson Research Center
• James Bailey, Department of Computer Science and Software Engineering, The University of Melbourne
• Arindam Banerjee, Department of Computer Science and Engineering, University of Minnesota, Twin Cities
• Michael R. Berthold, KNIME.com, University of Konstanz
• Nitesh V. Chawla, Computer Science and Engineering Department, University of Notre Dame, USA
• Gautam Das, Database Exploration Laboratory (DBXLAB), Computer Science and Engineering Department, University of Texas at Arlington
• Ian Davidson, Department of Computer Science, University of California - Davis
• Evgeniy Gabrilovich, Yahoo! Research
• Joao Gama, Laboratory of Artificial Intelligence and Decision Support, University of Porto
• Bart Goethals, Dept. of Mathematics and Computer Science, University of Antwerp
• Dimitrios Gunopulos, Department of Informatics and Telecommunications, University of Athens
• Yike Guo, Technical Director of ICPC, Department of Computing, Imperial College of Science, Technology and Medicine
• Xiaofei He, Computer Science, Zhejiang University, China
• Larry Holder, School of Electrical Engineering and Computer Science, Washington State University
• Daxin Jiang, Microsoft Research Asia, China
• Hillol Kargupta, Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County
• Eamonn J. Keogh, Computer Science & Engineering Department, University of California – Riverside
• Balaji Krishnapuram, Siemens Medical Solutions USA, Image & Knowledge Management, CAD group (Malvern, PA)
• Ravi Kumar, Yahoo! Research
• Laks Lakshmanan, Department of Computer Science, University of British Columbia
• Ee-Peng Lim, School of Information Systems, Singapore Management University
• Bing Liu, Department of Computer Science, University of Illinois at Chicago (UIC)
• Michael Mahoney, The Math department at Stanford University
• Srinivasan Parthasarathy, Department of Computer Science and Engineering and Department of Biomedical Informatics, Ohio State University, USA
• Dacheng Tao, Centre for Quantum Computation and Intelligent Systems, University of Technology, Sydney
• Evimaria Terzi, Computer Science Department MCS 280, Boston University
• Suresh Venkatasubramanian, School of Computing at the University of Utah
• Haixun Wang, Microsoft Research Asia
• Jianyong Wang, Department of Computer Science and Technology, Tsinghua University
• Geoff Webb, Faculty of Information Technology, Monash University
• Xifeng Yan, Computer Science Department, University of California Santa Barbara, USA
• Qiang Yang, Hong Kong University of Science and Technology
• Jeffrey Xu Yu, The Chinese University of Hong Kong
• Philip S. Yu, Department of Computer Science, University of Illinois at Chicago
• Zhi-Hua Zhou, Department of Computer Science & Technology, Nanjing University
• Xingquan Zhu, University of Technology Sydney

PROGRAM COMMITTEE MEMBERS

• Osman Abul, Department of Computer Engineering, TOBB University of Economics and Technology
• Reda Alhajj, Computer Science Department, University of Calgary
• Aijun An, Department of Computer Science and Engineering, York University
• Gennady L. Andrienko, Fraunhofer IAIS, University of Bonn
• Periklis Andritsos, Ontario Cancer Institute and the University of Toronto, Department of Computer Science
• Maria-Luiza Antonie, Department of Economics at University of Guelph
• Lars Asker, Department of Computer and System Sciences, Stockholm University
• Ira Assent, Aarhus University
• Vassilis Athitsos, Computer Science and Engineering Department, University of Texas at Arlington

• R. Backofen, Institute of Computer Science, University of Freiburg, Germany
• Tony Bagnall, University of East Anglia
• Rohan A. Baxter, ATO, Australia
• Bettina Berendt, Department of Computer Science, Katholieke Universiteit Leuven
• Tanya Y. Berger-Wolf, Department of Computer Science, University of Illinois at Chicago
• Kanishka Bhaduri, Mission Critical Technologies, NASA Ames Research Center
• C. Bhattacharyya, Department of Computer Science and Automation, Indian Institute of Science
• Albert Bifet, University of Waikato Computer Science Department
• Marco Botta, Computer Science, University of Torino, Italy
• Karl Branting, Information Discovery and Understanding Department, The MITRE Corporation
• Bjorn Bringmann, Katholieke Universiteit Leuven, Belgium
• Maurice Bruynooghe, Katholieke Universiteit Leuven, Departement Computerwetenschappen
• Mary Elaine Califf, School of Information Technology, Illinois State University
• Longbing Cao, School of Software, Faculty of Engineering and IT, University of Technology Sydney
• Luca Cazzanti, Applied Physics Laboratory, University of Washington
• Michelangelo Ceci, University of Bari, Italy
• N. Cercone, Faculty of Science and Engineering, York University, Canada
• G. Cervone, Department of Geography and Geoinformation Science (GGS), George Mason University, USA
• Kaushik Chakrabarti, Data Management Exploration and Mining Group, Microsoft Research
• Philip Chan, Department of Computer Sciences, Florida Institute of Technology
• Georgios Chatzimilioudis, University of Cyprus
• Arbee L. P. Chen, Department of Computer Science, National Chengchi University
• Lei Chen, Department of Computer Science and Engineering, Hong Kong University of Science and Technology
• Xue-wen Chen, University of Kansas
• Yixin Chen, Department of Computer Science and Engineering, Washington University in St Louis
• Zheng Chen, Microsoft Research Asia
• Zhengxin Chen, College of Information Science and Technology, University of Nebraska at Omaha
• Hong Cheng, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong
• David Wai-Lok Cheung, Department of Computer Science, University of Hong Kong
• Yun Chi, NEC Laboratories America, Inc.
• Silvia Chiusano, Department of Control and Computer Engineering, Politecnico University
• Wei Chu, Yahoo! Labs, Audience Science
• Fu-Lai Chung, Department of Computing, Hong Kong Polytechnic University, Hong Kong
• Amanda Clare, Department of Computer Science, Aberystwyth University
• S. Craw, School of Computing, Robert Gordon University
• Tom Croonenborghs, Biosciences and Technology Department, K.H.Kempen University College
• Alfredo Cuzzocrea, Institute of High Performance Computing and Networking, Italian National Research Council, University of Calabria
• Honghua Dai, School of Computing and Mathematics, Deakin University
• Nilesh Dalvi, Yahoo! Research Silicon Valley
• Kamalika Das, Intelligent Data Understanding group at NASA Ames Research Center
• Kaustav Das, Microsoft
• Anne Denton, Department of Computer Science, North Dakota State University
• Wei Ding, Department of Computer Science, University of Massachusetts Boston
• Zhongli Ding, Google Inc.
• Carlotta Domeniconi, Department of Computer Science, George Mason University
• Josep Domingo-Ferrer, Univ. Rovira i Virgili E.T.S.E.
• Guozhu Dong, Data Mining Research Lab, Dept of Computer Sci & Engr, Wright State University
• Dejing Dou, Computer and Information Science Department, the University of Oregon
• Lian Duan, Management Sciences Department, Tippie College of Business at the University of Iowa
• Haimonti Dutta, Center for Computational Learning Systems (CCLS)
• William Eberle, Department of Computer Science at Tennessee Technological University
• K. Efe, Computer Science, University of Louisiana at Lafayette
• C. Eick, UH Data Mining & Machine Learning Group, Department of Computer Science, University of Houston
• Tina Eliassi-Rad, Department of Computer Science, Rutgers University
• Vladimir Estivill-Castro, Griffith University
• Fazel Famili, National Research Council Canada (NRC)
• Wei Fan, IBM T.J. Watson Research Center
• Yi Fang, Department of Computer Science, Purdue University
• Nicola Fanizzi, Computer Science Department, University of Bari, Italy

• Fabio Fassetti, University of Calabria
• Ling Feng, Database Group, Department of Computer Science & Technology, Tsinghua University
• Qinyuan Feng, Peking University
• H. Ferhatosmanoglu, Computer Science and Engineering, Ohio State University
• Stefano Ferilli, Dipartimento di Informatica, Universita' di Bari
• Eibe Frank, Department of Computer Science, University of Waikato
• Lawrence D. Fu, Department of Medicine, Division of Clinical Pharmacology, New York University
• Zhouyu Fu, Monash University
• Takeshi Fukuda, IBM Tokyo Research Laboratory
• Gabriel Pui Cheong Fung, Arizona State University
• Johannes Fürnkranz, Knowledge Engineering at the TU Darmstadt
• Gammerman, Computer Science, Royal Holloway, University of London
• Matjaz Gams, Jozef Stefan Institute, Department of Intelligent Systems
• Aryya Gangopadhyay, Academic Affairs -- Information Systems
• Byron J. Gao, Department of Computer Science, Texas State University - San Marcos
• Minos Garofalakis, Department of Electronic & Computer Engineering, Technical University of Crete, University Campus -- Kounoupidiana
• Gemma C. Garriga, INRIA Lille Nord Europe, France
• Tingjian Ge, Department of Computer Science at the University of Kentucky
• Rayid Ghani, Accenture Technology Labs
• Amol Ghoting, Data Mining Systems Group, IBM T. J. Watson Research Center
• Chris Giannella, The MITRE Corporation, USA
• Phillip B. Gibbons, Intel Labs Pittsburgh
• Aristides Gionis, Yahoo! Research
• Attilio Giordana, Computer Science, University of Torino
• Aris Gkoulalas-Divanis, IBM Research - Zurich
• Shantanu Godbole, IBM Research
• Robert Grossman, University of Illinois at Chicago (UIC)
• Fabrice Guillet, Polytech Nantes - University of Nantes, LINA CNRS 6241
• Diansheng Guo, University of South Carolina
• Habiba, Department of Computer Science, University of Illinois at Chicago
• Maria Halkidi, Dept of Digital Systems, University of Piraeus
• Lawrence Hall, Univ. of South Florida, USA
• Xiaofei He, Computer Science, Zhejiang University, China
• H. Hirsh, Department of Computer Science, Rutgers University
• Shen-Shyang Ho, University of Maryland, College Park, USA
• Tu Bao Ho, School of Knowledge Science (JAIST)
• Se June Hong, IBM Research, Emeritus, USA
• Andreas Hotho, University of Kassel
• Vagelis Hristidis, School of Computing and Information Sciences, Florida International University
• Minqing Hu, Teradata
• Xiaohua Hu, College of Information Science and Technology, Drexel University
• K. A. Hua, School of Computer Science, University of Central Florida
• Ming Hua, Facebook Inc.
• Jun Huan, Information and Telecommunication Technology Center (ITTC), Department of Electrical Engineering and Computer Science, University of Kansas
• Jian Huang, Google Pittsburgh
• Dino Ienco, Dipartimento di Informatica, Università di Torino
• Nitin Indurkhya, eBay Research Laboratories, USA
• Hasan Jamil, Department of Computer Science, College of Science, Wayne State University
• Szymon Jaroszewicz, Institute of Computer Science, Polish Academy of Sciences, Poland
• Shuiwang Ji, Computer Science Department, Old Dominion University
• Bin Jiang, School of Computing Science, Simon Fraser University, Canada
• Ruoming Jin, Computer Science Department, Kent State University
• Tamer Kahveci, Department of Computer and Information Science and Engineering, University of Florida
• Ananth Kalyanaraman
• Konstantinos Kalpakis, Computer Science and Electrical Engineering Department, University of Maryland Baltimore County
• Toshihiro Kamishima, National Institute of Advanced Industrial Science and Technology (AIST), Japan
• M. Kantarcioglu, Computer Science Department, University of Texas at Dallas
• George Karypis, Department of Computer Science & Engineering, University of Minnesota
• Hiroyuki Kawano, Nanzan University, Japan
• John Keane, Computing Science, University of Manchester
• Latifur Khan, Computer Science department, University of Texas at Dallas
• Arno Knobbe, LIACS, Leiden University, the Netherlands
• Yehuda Koren, Yahoo! Research
• Walter A. Kosters, Computer Science department, Leiden University

• Raghu Krishnapuram, IBM India Research Lab
• Deept Kumar, Computer Science, Virginia Tech, USA
• Wai Lam, The Chinese University of Hong Kong, Hong Kong
• Longin Jan Latecki, CIS Dept., Temple University
• Hady Wirawan Lauw, Institute for Infocomm Research
• Chengkai Li, Department of Computer Science and Engineering, University of Texas at Arlington
• Fei-Fei Li, School of Computing, University of Utah
• Hang Li, Information Retrieval and Mining Group at Microsoft Research Asia
• Ming Li, LAMDA, Department of Computer Science and Technology, Nanjing University, China
• Ping Li, Cornell University, Ithaca, NY
• Qi Li, Department of Computer Science, Western Kentucky University
• Li Xiaoli, Institute for Infocomm Research
• Xuelong Li, Computer Science and Information Systems, Birkbeck (University of London)
• Yu-Feng Li, Department of Computer Science and Technology, Nanjing University, China
• Yuefeng Li, Faculty of Science and Technology, Queensland University of Technology
• Jessica Lin, Department of Computer Science, George Mason University
• Song Lin, Google, USA
• Charles X. Ling, Department of Computer Science, University of Western Ontario (UWO)
• Dan Liu, Google Inc. USA
• Fei Tony Liu, Monash University
• Huan Liu, Ira A. Fulton Schools of Engineering, Arizona State University
• Jinze Liu, Department of Computer Science, College of Engineering, University of Kentucky
• Jun Liu, Siemens Corporate Research
• Kun Liu, Yahoo! Labs
• Nathan Nan Liu, Hong Kong University of Science and Technology
• Tie-Yan Liu, Microsoft Research Asia
• Xiao Hui Liu, School of Information Systems, Computing and Mathematics, Brunel University
• Yan Liu, Computer Science Department, Viterbi School of Engineering, University of Southern California
• Rohit Lotlikar, Business Analytics and Optimization, India Research Lab, IBM
• Grigorios Loukides, Dept of Biomedical Informatics, Vanderbilt University
• Chang-Tien Lu, Department of Computer Science, Northern Virginia Graduate Center, Virginia Tech
• Zhenyu Lu, Morphology, Evolution & Cognition Laboratory, Department of Computer Science, University of Vermont
• Claudio Lucchese, Istituto di Scienza e Tecnologia dell'Informazione
• Richard Maclin, Department of Computer Science, University of Minnesota-Duluth
• Sanjay K. Madria, Department of Computer Science, University of Missouri-Rolla, USA
• Arun S. Maiya, Institute for Defense Analyses - Alexandria
• Masoud Makrehchi, Thomson Reuters Corp.
• Donato Malerba, Computer Science, University of Bari
• Bradley Malin, Dept of Biomedical Informatics, School of Medicine, Vanderbilt University
• Marcus A. Maloof, Department of Computer Science at Georgetown University
• Mark Manasse, Microsoft Research - Silicon Valley
• Giuseppe Manco, ICAR-CNR, Italy
• Pedro Jose Marron, University of Duisburg-Essen
• F. Masseglia, INRIA Sophia Antipolis AxIS
• Sameep Mehta, IBM Research - India
• Shicong Meng, Computer Science at Georgia Institute of Technology
• Rosa Meo, Department of Computer Science, University of Torino, Italy
• Taneli Mielikainen, Nokia Research Center, Palo Alto
• S. Minton, University of Southern California, USA
• Dunja Mladenic, Department of Knowledge Technologies, Jozef Stefan Institute
• Bamshad Mobasher, School of Computing, College of Computing and Digital Media, DePaul University
• Fabian Morchen, Siemens Corporate Research
• Noman Mohammed, Department of Computer Science and Software Engineering, Concordia University
• Shinichi Morishita, Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo
• Tsuyoshi Murata, Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology
• Olfa Nasraoui, Dept. of Computer Science and Computer Engineering, Speed School of Engineering, University of Louisville
• Sumit Negi, IBM Research - India
• Benjamin Negrevergne, Grenoble University, France
• Jennifer Neville, Computer Science and Statistics, Purdue University
• See-Kiong Ng, Institute for Infocomm Research, Singapore
• Vinh Nguyen, Gippsland School of IT, Monash University
• Siegfried Nijssen, Department of Computer Science, Katholieke Universiteit Leuven

Conference program

• T. Oates, Computer Science & Electrical Engineering, University of Maryland Baltimore County
• Salvatore Orlando, Università Ca' Foscari di Venezia, Dipartimento di Informatica
• Riccardo Ortale, ICAR-CNR, Italy
• Gerhard Paass, Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS
• Spiros Papadimitriou, Google Research
• Rajesh Parekh, Groupon
• D. S. Parker, UCLA Computer Science Dept.
• Ruggero G. Pensa, Italian National Research Council, IRPI-CNR, Research Institute for Geo-hydrological Protection of Torino (Italy)
• Chang-Shing Perng, IBM T.J. Watson Research Center, USA
• Luigi Pontieri, ICAR-CNR institute
• Giuseppe Psaila, University of Bergamo, ITALY
• Kunal Punera, Yahoo! Research
• Bo Qin, Universitat Rovira i Virgili, Spain
• Davood Rafiei, Department of Computing Science, University of Alberta
• Chedy Raissi, Institut National de Recherche en Informatique et en Automatique (INRIA)
• Aruna Rajan, IBM Research Lab, Bangalore, India
• Naren Ramakrishnan, Department of Computer Science, Virginia Tech
• G. Ramesh, Yahoo! Labs., USA
• Sanjay Ranka, Computer and Information Science and Engineering, University of Florida
• Z. W. Ras, Department of Computer Science, University of North Carolina
• S.S. Ravi, University at Albany, State University of New York
• Umaa Rebbapragada, Machine Learning and Instrument Autonomy Group at JPL
• Christophe Rigotti, INSA Lyon (France)
• John Roddick, Flinders University
• Celine Rouveirol, Institut Galilee - Universite Paris-Nord
• Stefan Rueping, Fraunhofer Institute IAIS, Knowledge Discovery Group, Integrated Data Mining Team
• S. Ruggieri, Computer Science Department
• Sourav Bhowmick, Division of Information Systems, School of Computer Engineering, Nanyang Technological University, Singapore
• Lorenza Saitta, University of Piemonte Orientale, Italy
• Sampath Kameshwaran, IBM Research - India, Bangalore
• Tamas Sarlos, Yahoo! Research
• Yucel Saygin, Faculty of Engineering and Natural Sciences, Sabanci University
• Martin Scholz, HP Labs, USA
• Shashi Shekhar, Department of Computer Science, Institute of Technology, University of Minnesota
• Shirish Shevade, Indian Institute of Science, INDIA
• Kyuseok Shim, School of Electrical Engineering and Computer Science, Seoul National University
• Fabrizio Silvestri, Information Science and Technology Institute (ISTI) of the Italian National Research Council
• D. A. Simovici, University of Massachusetts Boston, Department of Computer Science
• Ambuj K. Singh, Department of Computer Science, Department of Biomolecular Science and Engineering, University of California at Santa Barbara
• Lisa Singh, Georgetown University
• Krishnamoorthy Sivakumar, Washington State University
• Skowron, Computer Science and Mechanics, Warsaw University
• Ashok N. Srivastava, NASA Ames
• Michael Steinbach, Department of Computer Science and Engineering, University of Minnesota
• W. Nick Street, University of Iowa
• Keith M. Sullivan, George Mason University, USA
• Jimeng Sun, IBM TJ Watson Research Center, USA
• Wojciech Szpankowski, Purdue University
• Atsuhiro Takasu, National Institute of Informatics, Japan
• Domenico Talia, Computer Engineering, Faculty of Engineering, University of Calabria
• Ah-Hwee Tan, School of Computer Engineering, Nanyang Technological University
• Pang-Ning Tan, Department of Computer Science & Engineering, Michigan State University
• Jie Tang, Department of Computer Science and Technology, Tsinghua University, China
• David Taniar, Monash University
• Chayant Tantipathananandh, University of Illinois at Chicago
• Shirish Tatikonda, IBM Almaden Research Center
• Masahiro Terabe, Mitsubishi Research Institute, Inc.
• Ankur Teredesai, Institute of Technology
• Alexandre Termier, Universite Joseph Fourier/Grenoble University, LIG, France
• Kai Ming Ting, Gippsland School of Information Technology, Monash University
• Masashi Toyoda, Kitsuregawa Laboratory, Institute of Industrial Science, University of Tokyo
• Vincent S. Tseng, Department of Computer Science and Information, National Cheng Kung University
• K. Unnikrishnan, Center for Computational Medicine and Bioinformatics, University of Michigan
• Aparna Varde, Department of Computer Science at Montclair State University in Montclair, New Jersey


• Vassilios S. Verykios, Computer and Communications Engineering Department, University of Thessaly
• Fei Wang, IBM T. J. Watson Research Lab, Healthcare Transformation Research Group
• J. T. L. Wang, Department of Computer Science, New Jersey Institute of Technology, University Heights
• Xufei Wang, Arizona State University
• Ran Wolff, Management Information Systems, University of Haifa
• Wensheng Wu, Department of Computer Science, The University of North Carolina at Charlotte
• Xindong Wu, Department of Computer Science, University of Vermont
• Yanghua Xiao, School of Computer Science, Fudan University
• Zhengzheng Xing, Amazon Inc.
• Xiaowei Xu, University of Arkansas at Little Rock
• Hui Yang, The Ohio State University, USA
• Xingwei Yang, Temple University, USA
• Xiaoxin Yin, Internet Services Research Center, Microsoft Research
• Bianca Zadrozny, IBM Research Brazil
• Chengqi Zhang, Centre for Quantum Computation and Intelligent Systems, University of Technology, Sydney
• Harry Zhang, Faculty of Computer Science, University of New Brunswick
• Kun Zhang, Department of Computer Science, Xavier University of Louisiana
• Yanchun Zhang, Centre for Applied Informatics, School of Engineering and Science, Victoria University
• Ying Zhao, Department of Computer Science and Technology, Tsinghua University
• Zijian Zheng, Microsoft, USA
• Bin Zhou, Simon Fraser University


EXTERNAL REVIEWERS

Artur Abdullin Rob Cooke Stephanus Daniel Handoko Jiye Li Zubin Abraham Fabrizio Costa Naeemul Hassan Peipei Li Ibrahim Adeyanju Gianni Costa Basheer Hawwash Ping Li Nagesh Adluru Christoph Csallner Jianshan He Qiang Li Muhaimenul Adnan Puja Das Ai H. Ho Ronghua Li Sara Aghakhani Santanu Das Ryan Hoens Xiao Li Rezwan Ahmed Jesse Davis Ben Horsburgh Yuan Li Reza Akbarinia Martine De Cock Yunhua Hu Zhao Li Esra Akbas Dennis DeCoste Guangyan Huang Huizhi Liang Abdulmohsen Algarni Engin Demir Xin Huang Chen-Yi Lin Nawaf Alkharoush Kevin DeRonne Farkhund Iqbal Yi-Wen Lin Xiangdong An Sanjoy Dey Andrzej Janusz Yun Lin Periklis Andritsos Martin Dimkovski Nandish Jayaram Huang Ling-wei Fabrizio Angiulli Wenkui Ding Ming Ji Fei Tony Liu Yindalon Aphinyanaphongs Pavel Dmitriev Yi Jia Haishan Liu Annalisa Appice Stephan Doerfel Shangpu Jiang Lei Liu Gowtham Atluri Jun Du Wenhao Jiang Lin Liu Alex Aved Wouter Duivesteijn Yexi Jiang Qiaoling Liu Ferhat Ay Seyda Ertekin Zhe Jiang Xin Liu Nirmalya Bandyopadhyay Roberto Esposito Sachindra Joshi Henry Lo Nicola Barbieri James Faghmous Santosh Kabbur Corrado Loglisci Satrajit Basu Hongliang Fei Nobuhiro Kaji Robert Lothian Montserrat Batet Sergey Feldman Amol Kapila Nicholas Loulloudes Kedar Bellare Mengling Feng Alexandros Karakasidis Jiangang Ma Dominik Benz Francesco Folino Mehdi Kargar Tianyang Ma Indrajit Bhattacharya Neil Fore Peter Karsmakers Lucrezia Macchia Jiang Bian Dmitriy Fradkin Abhijith Kashyap Kathy Macropol Wei Bian Antonino Freno Chris Kauffman Prakash MandayamComar Hamad Binsalleeh Natalja Friesen Gokhan Kaya Alex Marin julien Blanchard Qiang Fu Enver Kayaaslan Claudia Marinica Petko Bogdanov Tak-chung Fu Mehdi Kaytou Kyle Martin Bo Cao Zhouyu Fu Mehdi Kaytoue Elio Masciari Chen Cao Dave Fuhry Mikaela Keller Stewart Massie Hong Cao Eric Garcia Rohan Khade Carlo Mastroianni Tianyu Cao Vikaskumar Garg Vaibhav Khadilkar Pasquale Minervini Ruben Cavazos Tingjian Ge Mohammad Khoshneshin Pradeep Mohan Diego Ceccarelli 
Jonathan Gemmell Younghoon Kim Misael Mongiovi Eugenio Cesario Sean Gilpin Peter Kluegl Michael Lind Mortensen Soumyadeep Chatterjee David Gleich Suzan Koknar-Tezel Yang Mu Sriram Chellappan Robby Goetschalckx Giogos Kollias Abdullah Mueen Alan Chen Siddharth Gopal Giorgos Kollias Smruthi Mukund Chia Ching Chen Amit Goyal Negar Koockakzadeh Maybin Muyeba Chun-Sheng Chen Yuhua Gu John Korecki Mirco Nanni Ling Chen Zhenmei Gu Da Kuang Krishnasuri Narayanam Rui Chen Gunhan Gulsoy Vimal Kumar Ramasuri Narayanam You Chen Tias Guns Sofiane Lagraa Franco Maria Nardini Shiwen Cheng Fernando Gutierrez Nicholas Larusso Xuan Bach Ngo Eng Yeow Cheu Bernd Gutmann Ngoc Tu Le Canh Hao Nguyen Si-Chi Chin Sébastien Guérif Si Quang Le Minh Nhut Nguyen Shu-i Chiu Robert Gwadera Pei Lee Eileen Ni Yongwook Choi David Haglin Remi Lehn Razieh Niazi Pirooz Chubak Sara Hajian Florian Lemmerich Xia Ning Chun Kit Chui Huseyin Hakkoymaz Ho-bun Leung Ilia Nouretdinov Joseph Cohen Bing Han Gang Li Stefan Novak Carmela Comito Qian Han Hui Li Greg Okopal


Nikunj Oza Li Rong-Hua Liang Sun Yuchen Wu Deepak P Olaf Ronneberger Wojciech Swieboda Wei Xiang Clint P. George Ning Ruan Gabriel Synnaeve Chenyan Xiong Sinno Pan Shuhua Ruan andrea tagarelli Jianpeng Xu Weike Pan Cynthia Rudin Amir Taheri Mohamed Yakout Vinayaka Pandit Eduardo Ruiz Liang-Kuang Tai Ning Yan Evangelos Papalexakis Brian Ruttenberg Frank Takes Chen Yang Brandon Parker Reza Sadoddin Swee Chuan Tan Jing Yang Jonathon Parker Peter Sadowski Mingwang Tang Linji Yang Nathan Parrish Tanwistha Saha Nikolaj Tatti Zhenglu Yang Vandanaben Patel Esin Saka giorgio terracina Bin Yao Puntip Pattaraintakorn Kameshwaran Sampath Quang Khoat Than Hengshuai Yao Antoniya Petkova Kameswaran Sampath Marc Tommasi Senay Yasar Saglam Ilias Petrounias Atieh Sarraf Victor Tong Jun Ye Ngoc Khanh Pham James Schaffer Yongxin Tong Vincent Yip Fabien Picarougne Tom Schimoler Massimo Torquati Naoki Yoshinaga Anja Pilz Christoph Scholz Gianluca Torta Kui Yu Gianvito Pio Sundararajan Daniel Trabold Mingxuan Yuan Luepol Pipanmaekaporn Sellamanickam Rolando Trujillo Junyuan Zeng Miriam Pirra Omair Shafiq Paolo Trunfio Xiao-jun Zeng Marc Plantevit Ghada Shalaby Hossein Vahabi Chi Zhang Nayot Poolsappasit Leonid Shamis Joaquin Vanschoren Lei Zhang Aditya Prakash Hanhuai Shan Virendra Varshneya Libiao Zhang Anita Prinzie Rajesh Shenoy Mathias Verbeke Nan Zhang Andrea Pugliese Xiaoxiao Shi Alessia Visconti Ying Zhang Alan Qi Feng Shu Kong-wah WAN Chuan Zhao zhi qiao Claudio Silvestri Dawei Wang Wenbo Zhao Bo Qin Vishwakarma Singh Huahua Wang Yanchang Zhao Lu Qin Sajid Siraj Qi Wang Yuchen Zhao Brian Quanz Lek Sirikunya Ting Wang Ling Zhong Ali Rahmani Nicholaus Skapura Wen Wang Tingting Zhong Piyush Rai Alex Smola Xinggang Wang Xujuan Zhou Parasaran Raman Jordi Soria Comas Zhuang Wang Yan Zhou Jan Ramon Damon Sotoudeh Jonathan Wells Yuanyuan Zhu Sayan Ranu Emmanouil Spanakis Dongrong Wen Wei Zhuo Mike Rechenthin Irena Spasic Sebastian Will Albrecht Zimmermann Yongli Ren Junilda Spirollari 
Chris Wu Ettore Ritacco Alexander Statnikov Qianhong Wu Simona Rombo John Stutz Tianyi Wu Andrea Romei Ilija Subasic Wei Wu


VOLUNTEERS

• Shailendra Agarwal, University of British Columbia, Canada
• Pirooz Chubak, University of Alberta, Canada
• Shewangizaw (Shewan) Digefe, University of British Columbia, Canada
• Chao Han, Simon Fraser University, Canada
• Bo Hu, Simon Fraser University, Canada
• Qiang Jiang, Simon Fraser University, Canada
• Da Kuang, University of Western Ontario, Canada
• Pei Lee, University of British Columbia, Canada
• TianYu Li, University of British Columbia, Canada
• Yun Lou, University of British Columbia, Canada
• Xiao Meng, Simon Fraser University, Canada
• ZhenSong Qian, Simon Fraser University, Canada
• Mohammad Khabbazhaye Tajer, University of British Columbia, Canada
• GuanTing Tang, Simon Fraser University, Canada
• Andrew Hung Yao Tjia, University of British Columbia, Canada
• Peng Wang, Simon Fraser University, Canada
• Lan Wei, University of British Columbia, Canada
• Min Xie, University of British Columbia, Canada
• YongMin Yan, Simon Fraser University, Canada
• Guang-Tong Zhou, Simon Fraser University, Canada

VOLUNTEERS (STUDENT TRAVEL AWARD RECIPIENTS)

• Ahmed Farahat, University of Waterloo
• Bing Hu, UC Riverside
• Diane Oyen, University of New Mexico
• Eirini Spyropoulou, University of Bristol
• Farnoush Farhadi, Shiraz University
• Gautam S. Thakur, University of Florida
• Guan Wang, Univ. of Illinois at Chicago
• Haishan Liu, University of Oregon
• Hao Huang, Stony Brook University
• Hongliang Fei, University of Kansas
• Hongxing Wang, Nanyang Technological University
• Huan-Kai Peng, Carnegie Mellon University
• Jana Schmidt, TU München
• Jierui Xie, Rensselaer Polytechnic Insti
• Jinoh Oh, POSTECH
• Junming Shao, University of Munich
• Kleanthis-Nikolaos Kontonasios, University of Bristol
• Li Qiuhong, FuDan University
• Liaoruo Wang, Cornell University
• Mansoureh Takaffoli, University of Alberta
• Mikalai Tsytsarau, University of Trento
• Mostafa Haghir Chehreghani, Katholieke Universiteit Leuven
• Nenad Tomašev, Institute Jožef Stefan
• Paea LePendu, Stanford University
• Petko Bogdanov, UCSB
• Po Hu, Tsinghua University
• Prakash Manayam Comar, Michigan State Univ
• Pranay Anchuri, Rensselaer Polytechnic Insti
• Qi Liu, USTC & Rutgers University
• Qiong Fang, Hong Kong Univ of Sci & Tech
• Rajmonda Caceres, Univ. of Illinois at Chicago
• Reihaneh Rabbany khorasgani, University of Alberta
• Saima Aman, USC
• Siyu Gu, Beijing Institute of Technology
• Souhaib Ben Taieb, Machine Learning Group, ULB
• Tam Thanh Nguyen, Nanyang Technological University
• Thanawin Rakthanmanon, UC Riverside
• Tias Guns, Katholieke Universiteit Leuven
• Vamsi K Potluru, University of New Mexico
• Vasanth Iyer, International Institute of IT
• Vladimir Jovanovic, Southern Methodist University
• Vladimir Ouzienko, Temple University
• Wang-Zhou Dai, Nanjing University
• Wei Chen, SUNY at Buffalo
• Wei Ping, UCI
• Wei Zhuo, Georgia Institute of Technology
• Wouter Duivesteijn, LIACS, Leiden University
• Xia Ning, UMN
• Xiaoran Xu, Peking University
• Xiaoxu Li, BUPt
• Xide Lin, UIUC
• Xuezhi Wang, Carnegie Mellon University
• Ye Xu, Dartmouth College
• Yong Ge, Rutgers University
• Yuchen Zhao, BUPt


USEFUL LINKS

ABOUT ICDM 2011:
• ICDM 2011 website: http://icdm2011.cs.ualberta.ca/index.php

ABOUT VANCOUVER:
• City of Vancouver: http://vancouver.ca/
• About Vancouver: http://vancouver.ca/aboutvan.htm
• About British Columbia: http://www.hellobc.com/en-CA/AboutBC/BritishColumbia.htm
• Canadian government: http://www.canada.gc.ca/home.html
• Current events: http://www.vancouver-bc.com/events
• Vancouver map: http://www.tourismvancouver.com/visitors/vancouver/tools/maps
• Visitor guide and information: http://www.tourismvancouver.com/visitors/
• Public transportation: http://www.translink.ca/
• BC ferries: http://www.bcferries.com/

ABOUT VANCOUVER

FACTS ABOUT VANCOUVER

At any time of the year, Vancouver is sure to surprise and delight. Surrounded by water on three sides and nestled alongside the Coast Mountain Range, Vancouver is the largest city in the province of British Columbia, with over half a million residents and more than 2.1 million people in the metropolitan area. It has one of the mildest climates in Canada. Vancouver, home to spectacular natural scenery and a bustling metropolitan core, is well known as an urban centre surrounded by nature, making tourism its second largest industry.

It is also the third-largest film production centre in North America, after Los Angeles and New York City, earning it the nickname Hollywood North. Vancouver has ranked highly in worldwide "livable city" rankings for more than a decade, according to business magazine assessments. It has hosted many international conferences and events, including the 1954 British Empire and Commonwealth Games, the 1976 United Nations Conference on Human Settlements and the 1986 World Exposition on Transportation and Communication. The 2010 Winter Olympics and 2010 Winter Paralympics were held in Vancouver and nearby Whistler, a resort community 125 km (78 miles) north of the city.

Vancouver's climate is temperate by Canadian standards. Winters in Greater Vancouver are among the mildest of Canadian cities. Although it often rains in December, on average only 4.5 days a year have temperatures that stay below freezing. It is said that Vancouverites "tamed the snow to remain on the nearby mountains."

TRANSPORTATION AROUND THE CITY

Vancouver boasts a variety of transportation options, including light rail (SkyTrain), buses, ferries (SeaBus), taxis, and car rental agencies. Please note that the SkyTrain system provides an easy connection from the airport to downtown Vancouver and beyond.

Taxi service: The estimated taxi fare from the Vancouver International Airport to downtown is $25 to $30 CAD.

Car rental: Many local and international car rental companies are available on-site at the airport or downtown, including Hertz, AVIS, Budget, and National.

CONFERENCE VENUE LOCATIONS

CONFERENCE ROOM FLOOR PLANS

MARRIOTT PINNACLE

[Floor plans from renaissancevancouver.com; room names and areas as labelled on the plans]

SECOND LEVEL - Meeting Rooms
• Harbourside Ballroom: 7920 sq. ft.
• Ballroom 1: 3300 sq. ft.
• Ballroom 2: 2310 sq. ft.
• Ballroom 3: 2310 sq. ft.
• Port of Vancouver: 1040 sq. ft.
• Also shown: stage, patio, North Shore Mountain waterfront view, foyer, elevators to all floors, stairs to hotel lobby, women's and men's washrooms, fire exit

THIRD LEVEL - Meeting Rooms
• Port of Singapore: 1040 sq. ft.
• Port of Hong Kong: 700 sq. ft.
• Port of New York: 1036 sq. ft.
• Port of San Francisco: 672 sq. ft.
• Port of Macau: 506 sq. ft.
• Port of Shanghai: 420 sq. ft.
• Port of Sydney: 420 sq. ft.
• Also shown: patio, sliding floor-to-ceiling windows, North Shore Mountain waterfront view, foyer, elevators to all floors, women's and men's washrooms, west and east stairwells, fire exit