Conference Registration Brochure

The Third International Conference on Knowledge Discovery and Data Mining August 14–17 1997, Newport Beach, California Sponsored by the American Association for Artificial Intelligence KDD-97: A Preview The rapid growth of data and information has created a need and an opportunity for extracting knowledge from databases, and both researchers and application developers have been responding to that need. Knowledge discovery in databases (KDD), also referred to as data mining, is an area of common interest to researchers in machine discovery, statistics, databases, knowledge acquisition, machine learning, data visualization, high performance computing, and knowledge-based systems. KDD applications have been developed for astronomy, biology, finance, insurance, marketing, medicine, and many other fields. The Third International Conference on Knowledge Discovery and Data Mining (KDD-97) will follow up the success of KDD-95 and KDD-96 by bringing together researchers and application developers from different areas focusing on unifying themes. Keynote Address: From Large to Huge. A Statistician’s Reactions to KDD and DM Peter Huber, Universität Bayreuth, Germany The statistics and AI communities are confronted by the same challenge, the onslaught of ever larger data collections, but the two communities have reacted independently and differently. What could they learn from each other if they looked over the fence? What is amiss on either side? KDD-97 Organization General Conference Chair Gregory Cooper, University of Pittsburgh Eric Mjolsness, University of California at Robert Cowell, City University, UK San Diego Ramasamy Uthurusamy, General Motors Bruce Croft, University of Massachusetts Sally Morton, Rand Corporation Corporation at Amherst Richard Muntz, University of California at William Eddy, Carnegie Mellon University Los Angeles Program Cochairs Charles Elkan, University of California at Raymond Ng, University of British David Heckerman, Microsoft Research San Diego Columbia Heikki Mannila, University of Helsinki, Usama Fayyad, Microsoft Research Steve Omohundro, NEC Research Finland Ronen Feldman, Bar-Ilan University, Israel Gregory Piatetsky-Shapiro, Daryl Pregibon, AT&T Laboratories Jerry Friedman, Stanford University GTE Laboratories Dan Geiger, Technion, Israel Daryl Pregibon, Bell Laboratories Publicity Chair Clark Glymour, Carnegie-Mellon Raghu Ramakrishnan, University of University Wisconsin, Madison Paul Stolorz, Jet Propulsion Laboratory Moises Goldszmidt, SRI International Patricia Riddle, Boeing Computer Services Georges Grinstein, University of Ted Senator, NASD Regulation Inc. Tutorial Chair Massachusetts, Lowell Jude Shavlik, University of Wisconsin at Padhraic Smyth, University of California, Jiawei Han, Simon Fraser University, Madison Irvine Canada Wei-Min Shen, University of Southern David Hand, Open University, UK California Demo and Trevor Hastie, Stanford University Arno Siebes, CWI, Netherlands David Heckerman, Microsoft Corporation Avi Silberschatz, Bell Laboratories Poster Sessions Chair Haym Hirsh, Rutgers University Evangelos Simoudis, IBM Almaden Tej Anand, NCR Corporation Jim Hodges, University of Minnesota Research Center Se June Hong, IBM T .J. Watson Research Andrzej Skowron, University of Warsaw Awards Chair Center Padhraic Smyth, University of California Tomasz Imielinski, Rutgers University at Irvine Gregory Piatetsky-Shapiro, Yannis Ioannidis, University of Wisconsin Ramakrishnan Srikant, IBM Almaden GTE Laboratories Lawrence D. Jackel, AT&T Laboratories Research Center David Jensen, University of Massachusetts John Stasko, Georgia Institute of Technology Program Committee Members Michael Jordan, Massachusetts Institute of Salvatore J. Stolfo, Columbia University Helena Ahonen, University of Helsinki Technology Paul Stolorz, Jet Propulsion Laboratory Tej Anand, NCR Daniel Keim, University of Munich Hanna Toivonen, University of Helsinki Ron Brachman, AT&T Laboratories Willi Kloesgen, GMD, Germany Alexander Tuzhilin, New York University, Carla Brodley, Purdue University Ronny Kohavi, Silicon Graphics Stern School Dan Carr, George Mason University David D. Lewis, AT&T Laboratories – Ramasamy Uthurusamy, General Motors Peter Cheeseman, Research R&D Center NASA AMES Research Center David Madigan, University of Washington Graham Wills, Bell Laboratories David Cheung, University of Hong Kong Heikki Mannila, University of Helsinki David Wolpert, IBM Almaden Research Wesley Chu, University of California at Brij Masand, GTE Laboratories Center Los Angeles Gary McDonald, General Motors Research Jan Zytkow, Wichita State University 2 KDD–97 KDD-97 Tutorial Program All tutorials will be presented on Thursday, August 14, 1997. The times listed below are tentative. Admission to the tutorials is included in your conference registration fee. Registrants can attend up to four consecutive tutorials, including four tutorial syllabi. TIME SESSION 1 SESSION 2 8:00 to 10:00 AM T1 – Fayyad and Simoudis (single session) 10:30 AM to 12:30 PM T2 – Hand T3 – Feldman 1:30 to 3:30 PM T4 – Swayne and Cook T5 – Chaudhuri and Dayal 4:00 to 6:00 PM T6 – Keim T7 – DuMouchel Tutorial 1: 8:00–10:00 AM Tutorial 2: 10:30 AM–12:30 PM Tutorial 3: 10:30 AM–12:30 PM Data Mining and KDD: Modeling Data and Text Mining— An Overview Discovering Knowledge Theory and Practice Usama Fayyad, Microsoft Research and Evangelos David Hand, Open University, UK Ronen Feldman, Bar-Ilan University, Israel Simoudis, IBM ur aim is to extract knowledge from nowledge discovery in databases e present a basic tutorial of this new Olarge bodies of data. The size of these K(KDD) focuses on the computerized Wand emerging area and emphasize bodies mean that we cannot do it unaid- exploration of large amounts of data and relations to constituent communities, in- ed, but must use fast computers, applying on the discovery of interesting patterns cluding statistics, databases, pattern recog- sophisticated statistical tools. Attempts to within them. While most work on KDD nition, learning, and visualization. The tu- automate the process of knowledge ex- has been concerned with structured torial provides a basic overview of the traction date from at least the early 1980s, databases, there has been little work on KDD process for extracting knowledge with the work on statistical expert sys- handling the huge amount of information from databases and covers the basics of tems. We examine this work, noting its that is available only in unstructured textu- each step in the process including: data successes and failures and especially what al form. In this tutorial we will present the warehousing, selection and cleaning, data researchers in data mining and knowledge general theory of text mining and will transformation, data mining, evaluation, discovery can learn from those efforts. We demonstrate several systems that use these and visualization. We also cover a sam- examine what data are, what information principles to enable interactive exploration pling of successful applications and out- is, and what knowledge is. We contrast of large textual collections. We will de- line challenges and issues to be addressed. modeling with discovery, especially in the scribe generic techniques for text catego- Usama Fayyad is a senior researcher at context of large data sets. We examine rization and information extraction that Microsoft Research, the Decision Theory high level modeling issues, such as over- are used by these systems. The systems that & Adaptive Systems Group. His research fitting, generalisability, overmodeling, will be presented are KDT which is the sys- interests include knowledge discovery in and model evaluation. And we examine tem for knowledge discovery in texts; large databases, data mining, machine high level exploration issues such as the FACT, which discovers associations among learning, statistical pattern recognition, discovery of accidental artifacts. The con- keywords labeling the items in a collection and clustering. After receiving a Ph.D. de- fluence of computing and statistics in of textual documents; and the Text Explor- gree in 1991, he joined the Jet Propulsion some areas provides a nice backdrop er, which is a system that provides a high Laboratory (JPL), California Institute of against which to examine these issues, level language for interactive exploration Technology (until 1996). At JPL, he head- and we briefly discuss neural networks of textual collections. We will present a ed the Machine Learning Systems Group and classification trees from these two general architecture for text mining and where he developed data mining systems perspectives. will outline the algorithms and data struc- for analysis of large scientific databases. David Hand is a professor of statistics tures behind the systems. We will give spe- Evangelos Simoudis is Vice President, at the Open University. His research inter- cial emphasis to incremental algorithms Global Business Intelligence Solutions – ests include the foundations of statistics, and to efficient data structures. IBM North America, where he is respon- statistical computing, and multivariate Ronen Feldman is a lecturer at the sible for the development and deploy- statistics, the latter especially as applied to Mathematics and Computer Science De- ment of data mining and decision sup- classification problems. His applications partment of Bar-Ilan University in Israel. port solutions to IBM’s customers interests include medicine, finance, and He received his B.Sc. in math, physics and worldwide.

Conference Registration Brochure

Big Data Everywhere, and No SQL in Sight Editorial: from SIGKDD Chairman Usama M

Organization List

Conference Program

Thèses Et HDR 2015

Ensemble Learning for Ranking Interesting Attributes

Session at AAAS Annual Meeting

London, United Kingdom August 19 ‑ 23, 2018 24Th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Partners for a New Beginning Status Report | January 2013 Table of Contents

Science Without Borders 17–21 February, Washington, D.C

The KDD Process for Extracting Useful Knowledge from Volumes of Data

Course Summary

A Data Scientist's Guide to Start-Ups