Mining Relational and Nonrelational Data with IBM Intelligent Miner for Data Using Oracle, SPSS, and SAS As Sample Data Sources Joerg Reinschmidt, Rahul Bhattacharya, Paul Harris, Athanasios Karanasos

International Technical Support Organization

http://www.redbooks.ibm.com

SG24-5278-00


December 1998

Take Note!
Before using this information and the product it supports, be sure to read the general information in Appendix C, “Special Notices” on page 197.

First Edition (December 1998)

This edition applies to Version 2.1.2 of IBM Intelligent Miner for Data for use with the AIX Version 4.3.1 or Windows NT operating systems and to Version 2.1.2 of IBM DataJoiner for use with the AIX or Windows NT operating systems.

Comments may be addressed to:
IBM Corporation, International Technical Support Organization
Dept. QXXE, Building 80-E2
650 Harry Road
San Jose, California 95120-6099

When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.

© Copyright International Business Machines Corporation 1998. All rights reserved.
Note to U.S. Government Users – Documentation related to restricted rights – Use, duplication, or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

Contents

Figures ...... vii

Preface ...... xi
How This Book Is Organized ...... xi
The Team That Wrote This Redbook ...... xii
Comments Welcome ...... xiii

Part 1. Introduction to Data Mining and the Products Used ...... 1

Chapter 1. An Analytical Approach to Data Mining ...... 3
1.1 The Challenge of the Information Age ...... 3
1.2 Data Mining ...... 3
1.3 Data Mining Tools ...... 4
1.4 From Data to Knowledge ...... 6
1.4.1 Setting Goals ...... 7
1.4.2 Data Collection ...... 8
1.4.3 Sampling ...... 8
1.4.4 Data Preparation ...... 8
1.4.5 Dimension Reduction ...... 9
1.4.6 Data Modeling ...... 10
1.4.7 Decision Making ...... 10
1.4.8 Measuring the Results ...... 11

Chapter 2. Introduction to Intelligent Miner for Data and DataJoiner ...... 13
2.1 Intelligent Miner ...... 13
2.1.1 Overview of the Intelligent Miner ...... 13
2.1.2 Working with Databases ...... 13
2.1.3 The User Interface ...... 14
2.1.4 Data Preparation Functions ...... 15
2.1.5 Statistical and Mining Functions ...... 16
2.1.6 Processing IM Functions ...... 17
2.1.7 Creating and Visualizing the Results ...... 17
2.1.8 Creating Data Mining Operations ...... 18
2.2 DataJoiner ...... 19
2.2.1 The Multidatabase Solution ...... 20
2.2.2 Heterogeneous Data Access ...... 20
2.2.3 Database Integration ...... 22
2.2.4 Heterogeneous Replication ...... 24
2.2.5 Global Query Optimization ...... 25
2.2.6 Web Support ...... 26

Part 2. Accessing Relational Data ...... 27

Chapter 3. System Architecture for Mining Oracle Databases ...... 29
3.1 The Environment ...... 29
3.2 Mining Oracle Databases Using IBM’s Intelligent Miner for Data ...... 32
3.2.1 Using Fast Extract with Oracle Databases ...... 33
3.2.2 The DataJoiner Interface to Oracle Databases ...... 34
3.3 System Architecture ...... 35

Chapter 4. Install and Configure DataJoiner with Oracle ...... 39
4.1 Product Installation on AIX ...... 39
4.1.1 Install Oracle Client ...... 39
4.1.2 Connecting to the Oracle Server from the Oracle Client ...... 42
4.1.3 Install DataJoiner ...... 45
4.2 DataJoiner Configuration on AIX ...... 47
4.2.1 Configuring DataJoiner to Access Oracle ...... 47
4.2.2 Creating a DataJoiner Instance ...... 52
4.2.3 Configuring DataJoiner to Accept DB2 Client Connections ...... 54
4.2.4 Configuring DataJoiner Nicknames ...... 55
4.3 Test Access on AIX ...... 58
4.4 Product Installation on Windows NT ...... 60
4.4.1 Install Oracle Client ...... 60
4.4.2 Connecting to the Oracle Server from the Oracle Client ...... 61
4.4.3 Install DataJoiner ...... 63
4.5 DataJoiner Configuration on Windows NT ...... 67
4.5.1 Configuring DataJoiner to Access Oracle ...... 67
4.6 Test Access on Windows NT ...... 68

Chapter 5. Fast Extract Utility ...... 71
5.1 Installation ...... 71
5.2 Use ...... 72

Chapter 6. Sample Mining of Oracle Data ...... 75
6.1 Intelligent Miner for Data Installation on AIX ...... 75
6.2 The Mining Scenario ...... 76
6.3 The Mining Solution ...... 77
6.4 Using Intelligent Miner to Implement the Solution ...... 78
6.4.1 Creating a Data Object for the CUSTOMER Table ...... 80
6.4.2 Setting Up Demographic Clustering ...... 83
6.4.3 Applying the Model ...... 90
6.4.4 Generating the Target Customer Set ...... 97

Part 3. Accessing Nonrelational Data ...... 107

Chapter 7. System Architecture for Mining ODBC Data Sources ...... 109
7.1 Call Level Interface ...... 109
7.2 Open Database Connectivity ...... 109
7.3 Mining ODBC Data Sources with IBM’s Intelligent Miner for Data ...... 110
7.3.1 Using Flat Files with ODBC Data Sources ...... 112
7.3.2 The DataJoiner Interface to ODBC Data Sources ...... 112
7.4 System Architecture ...... 113

Chapter 8. Access SPSS Data from Intelligent Miner for Data ...... 115
8.1 SPSS ODBC Configuration ...... 115
8.2 SPSS - DataJoiner Configuration ...... 117
8.3 SPSS ODBC Specification Compliance Levels ...... 120

Chapter 9. Access SAS Data from Intelligent Miner for Data ...... 121
9.1 SAS ODBC Configuration ...... 121
9.2 SAS - DataJoiner Configuration ...... 125
9.3 SAS Server Communications ...... 126
9.3.1 Local DDE Access Method ...... 126
9.3.2 Remote Access Methods ...... 127
9.3.3 SAS ODBC Specification Compliance Levels ...... 129

Chapter 10. Sample Mining of ODBC Data Sources ...... 131
10.1 Intelligent Miner Server Installation on Windows NT ...... 131
10.2 Starting the Intelligent Miner Server and Clients ...... 133
10.3 Home Sales Mining Scenario ...... 134
10.3.1 Implementing the Solution with Intelligent Miner ...... 135
10.4 Loan Approval Mining Scenario ...... 145
10.4.1 Implementing the Solution with Intelligent Miner ...... 146

Chapter 11. Summary ...... 157
11.1 Environments Used ...... 157
11.2 Sample Environment for Relational Data Sources ...... 158
11.2.1 Current Functional Limitation ...... 159
11.3 Sample Environment for Nonrelational Data Sources ...... 160
11.3.1 Current Functional Limitation ...... 161

Appendix A. Scripts for Creation of Data Access Modules ...... 163
A.1 The djxlink.sh Script ...... 163
A.2 The djxlink.makefile Script ...... 178

Appendix B. Data Mining Functions ...... 185
B.1 Linear Discriminant Analysis ...... 185
B.2 Linear Regression ...... 187
B.3 Tree-Structured Classifiers ...... 189
B.4 Feed-Forward Neural Networks ...... 190
B.5 Projection Methods ...... 193
B.6 Cluster Formation ...... 194
B.7 Association Rules ...... 195

Appendix C. Special Notices ...... 197

Appendix D. Related Publications ...... 201
D.1 International Technical Support Organization Publications ...... 201
D.2 Redbooks on CD-ROMs ...... 201
D.3 Other Publications ...... 201

How to Get ITSO Redbooks ...... 203
How IBM Employees Can Get ITSO Redbooks ...... 203
How Customers Can Get ITSO Redbooks ...... 204
IBM Redbook Order Form ...... 206

Glossary ...... 207

List of Abbreviations ...... 213

Index ...... 215

ITSO Redbook Evaluation ...... 221

Figures

1. Knowledge Discovery and Data Mining ...... 7
2. Intelligent Miner Main Window ...... 14
3. Sample Clustering Visualization ...... 18
4. Sequence - Settings Window ...... 19
5. IBM Business Intelligence Architecture ...... 22
6. IBM Object-Relational Architecture ...... 24
7. IBM Information Warehouse Architecture ...... 25
8. The Oracle Database Environment ...... 30
9. Using Flat Files to Mine Oracle Databases ...... 33
10. Mining Oracle Databases through DB2 ...... 33
11. Accessing Oracle Tables with DataJoiner ...... 35
12. Architecture for Mining Oracle DB with DataJoiner and IM ...... 36
13. Data Mining Environment for Oracle - AIX ...... 36
14. Data Mining Environment for Oracle - Windows NT ...... 37
15. Oracle Installation Settings ...... 60
16. Installation Options ...... 60
17. Client Configuration ...... 61
18. Installation Completed ...... 61
19. SQL*Plus Logon ...... 62
20. SQL*Plus Window ...... 62
21. NT User Manager ...... 63
22. New User ...... 63
23. Group Memberships ...... 64
24. NT User Manager: DataJoiner Administrator ...... 64
25. DataJoiner for Windows NT Installation ...... 65
26. Windows NT Services ...... 66
27. Service Startup Options ...... 66
28. Oracle Server Mapping ...... 67
29. Testing the Oracle Connection ...... 68
30. Test SELECT Results ...... 68
31. Intelligent Miner Client Main Screen ...... 79
32. Server Selection Screen in Preferences Window ...... 79
33. Data Format and Settings Screen for Building Model ...... 80
34. Database Table or View Screen for Building Model ...... 81
35. Field Parameters Screen for Building Model ...... 81
36. Saving Your Mining Base ...... 83
37. Mining Functions and Settings Screen for Building Model ...... 83
38. Input Data Screen for Building Model ...... 84
39. Expression Builder Screen for Building Model ...... 85
40. Input Fields Screen for Building Model ...... 86
41. Results Screen for Building Model ...... 87
42. Progress Screen for Building Model ...... 87
43. Generated Demographic Clustering Model ...... 88
44. Variable Distributions in Cluster 1 ...... 89
45. Distribution for the Gender Variable in Cluster 1 ...... 89
46. Mining Functions and Settings Screen for Applying Model ...... 90
47. Input Data Screen for Applying Model ...... 91
48. Expression Builder Screen for Applying Model ...... 92
49. Mode Parameters Screen for Applying Model ...... 92
50. Input Fields Screen for Applying Model ...... 93
51. Output Fields Screen for Applying Model ...... 94
52. Data Format and Settings Screen for Applying Model ...... 94
53. Database Table or View Screen for Applying Model ...... 96
54. Progress Screen for Applying Model ...... 96
55. Statistics Functions and Settings Screen ...... 97
56. Input Data Screen for Statistical Analysis ...... 98
57. Expression Builder for Statistical Analysis ...... 99
58. Statistics Screen ...... 100
59. Output Fields Screen for Statistical Analysis ...... 100
60. Data Format and Settings Screen for Output of Statistical Analysis ...... 101
61. Database Table or View Screen for Output of Statistical Analysis ...... 102
62. Output Data Screen for Statistical Analysis ...... 102
63. Results Screen for Statistical Analysis ...... 103
64. Progress Screen for Statistical Analysis ...... 104
65. Variable Distributions for Target Customers ...... 104
66. ODBC Functional Components ...... 110
67. Using Flat Files to Mine ODBC Data Sources ...... 111
68. Mining ODBC Data Sources through DB2 ...... 112
69. Accessing ODBC Data Sources with DataJoiner ...... 113
70. The ODBC Environment ...... 114
71. ODBC Data Source Administrator ...... 115
72. Create New Data Source ...... 116
73. SPSS Data Source Setup ...... 116
74. SPSS System Data Source ...... 117
75. Creating a DB2 Instance ...... 117
76. Connecting to the Database ...... 118
77. Testing the SPSS ODBC Connection ...... 119
78. SAS Universal ODBC Setup ...... 122
79. SAS ODBC Driver Configuration - General ...... 122
80. SAS ODBC - Servers ...... 123
81. Local DDE Options ...... 123
82. SAS ODBC - Libraries ...... 124
83. SAS System Data Sources ...... 124
84. Trial Driver Expiration ...... 126
85. NT DDESHARE ...... 127
86. DDE Share Properties ...... 128
87. DDE Share Name Permissions ...... 128
88. DDE Shares ...... 129
89. Trusted Share Properties ...... 129
90. IM Install: Welcome ...... 132
91. IM Install: Components ...... 132
92. IM Install: Domain/User ...... 133
93. Start IM Server ...... 134
94. IM Preferences ...... 134
95. Intelligent Miner: Main Window ...... 136
96. Data TaskGuide Data Format and Settings ...... 137
97. Data TaskGuide Field Parameters ...... 137
98. Data TaskGuide Summary ...... 138
99. Bivariate Statistics ...... 139
100. Frequency Charts ...... 140
101. Transform Variable ...... 140
102. Visualizing the Transformation ...... 141
103. Setting the Prediction Object ...... 142
104. The Result Object ...... 143
105. Neural Prediction Settings, Mode Parameters ...... 144
106. Neural Prediction Settings, Output Fields ...... 144
107. Sequence TaskGuide ...... 145
108. SAS Loan Application Data Set ...... 147
109. SAS Loan Application Variables ...... 147
110. Data Visualization: Frequency Charts ...... 148
111. Encode Missing Values ...... 149
112. Calculate Values ...... 150
113. Value Set Filtering ...... 151
114. Logarithmic Variable Definition ...... 152
115. Normalized Frequency Chart ...... 152
116. Neural Classification Results ...... 153
117. Tree Classification Confusion Matrix ...... 154
118. Decision Tree ...... 154
119. Test Data Classification Comparison ...... 155
120. Sequence Settings ...... 156
121. Mining Relational Data Sources ...... 159
122. Mining Nonrelational Data Sources ...... 161
123. Linear Discriminant Analysis ...... 186
124. Complexity of Decision Boundaries ...... 187
125. Multilayer Neural Network ...... 191
126. Radial-Basis Function Network ...... 193

Preface

IBM Intelligent Miner for Data enables you to mine structured data stored in conventional databases or flat files. Its mining algorithms have been used successfully by customers and Business Partners alike to address business problems in such areas as customer relationship marketing and fraud and abuse detection. Using Intelligent Miner you can increasingly leverage your existing data investment and more quickly derive business value from that investment.

This redbook will help you to install, tailor, and configure the IBM Intelligent Miner for Data and IBM DataJoiner products to mine any non-IBM data sources. Oracle 8 and two nonrelational Open Database Connectivity (ODBC) data sources--SPSS and SAS--are used to demonstrate DataJoiner acting as a heterogeneous data server for Intelligent Miner in the AIX and Windows NT environments.

How This Book Is Organized
Chapter 1 contains an introduction to data mining principles and concepts, and a summary of our conclusions. You will find this chapter useful for developing an understanding of the basic ideas and terminology of data mining.

Chapter 2 describes using IBM DataJoiner to access data sources from Intelligent Miner for Data and presents an overview of the Intelligent Miner for Data.

Chapter 3 provides an overview of a typical Oracle database environment and describes the architecture used in this book to access an Oracle data source from Intelligent Miner for Data.

Chapter 4 describes the setup for accessing an Oracle data source from IBM DataJoiner, covering all steps from installation to configuration.

Chapter 5 describes another access method for using Oracle data from Intelligent Miner for Data.

Chapter 6 presents a sample mining process from Intelligent Miner for Data using the Oracle data source access described in Chapters 3 and 4.

Chapter 7 contains a short description of an Open Database Connectivity (ODBC) database environment.

Chapter 8 describes the setup necessary to access SPSS data from Intelligent Miner for Data through IBM DataJoiner. It covers the ODBC and DataJoiner installation and configuration.

Chapter 9 describes the setup necessary to access SAS as an ODBC data source.

Chapter 10 presents a sample mining process using the SPSS and SAS data made accessible through the setup described in Chapters 8 and 9.

Chapter 11 summarizes the different access methods and data sources described in this book. It covers the main differences among the various data sources and access methods.

The Team That Wrote This Redbook
This redbook was produced by a team of specialists from around the world working at the International Technical Support Organization, San Jose Center.

Joerg Reinschmidt is a Data Mining Specialist at the International Technical Support Organization, San Jose Center. He writes extensively and teaches IBM classes worldwide on all areas of DB2 Universal Database and Internet access to legacy data. Before joining the ITSO in 1998, Joerg worked in the Solution Partnership Center in Germany as a DB2 Specialist, supporting independent software vendors (ISVs) in porting their applications to use IBM products.

Rahul Bhattacharya is an IT Specialist with IBM Global Services in India. He has five years of experience in the field of UNIX and databases. He holds a degree in Electrical and Electronics Engineering from BITS, Pilani. His areas of expertise include UNIX clustering and business intelligence.

Paul Harris is a project manager in Sydney, Australia. He has 13 years of experience in the decision support field and has worked on many development projects for IBM clients in the banking and insurance sectors.

Athanasios Karanasos is a data management specialist in IBM UK. He studied mathematics at Aristotle University in Thessaloniki, Greece, and computational intelligence at the University of Essex, Colchester, UK. His areas of expertise include knowledge discovery in large databases and data mining.

Thanks to the following people for their invaluable contributions to this project:

Ute Baumbach IBM Germany

Gary Chan IBM Santa Teresa Laboratory

David Farb SPSS Inc., Chicago, IL

Christoph Lingenfelder IBM Germany

Susan Monje IBM Santa Teresa Laboratory

Norbert Friedrich Poch IBM Germany

Marc Romano IBM Santa Teresa Laboratory

Werner Staub IBM Germany

Comments Welcome
Your comments are important to us!

We want our redbooks to be as helpful as possible. Please send us your comments about this or other redbooks in one of the following ways:
• Fax the evaluation form found in “ITSO Redbook Evaluation” on page 221 to the fax number shown on the form.
• Use the electronic evaluation form found on the Redbooks Web sites:
  For Internet users: http://www.redbooks.ibm.com
  For IBM Intranet users: http://w3.itso.ibm.com
• Send us a note at the following address: [email protected]

Part 1. Introduction to Data Mining and the Products Used

Chapter 1. An Analytical Approach to Data Mining

Data mining has become one of the most popular techniques for building intelligent decision support systems. Borrowing tools and methods from statistical mathematics and machine learning, data mining has proven to be a valuable tool in the hands of decision makers.

In this chapter we investigate the basic principles of data mining. We review the steps that a decision maker must take to derive knowledge from data, and the data mining tools that he or she can use for more productive and robust data mining operations. We discuss data mining in the context of knowledge discovery, providing a framework for modern decision support systems.

1.1 The Challenge of the Information Age
In a business world characterized by intense competition and globalization, information has become a valuable commodity. On the basis of information, companies can make strategic decisions and keep themselves ahead of the competition. However, the road from enterprise data to information, and finally to the anticipation of business events, is not an easy one.

Over the last 20 years computers have been used to capture detailed transaction data across a range of corporate enterprises. Retail sales, telecommunications, banking, and credit card operations are examples of transaction-intensive industries. These transactional systems were designed to capture detailed information about every aspect of business. In the process they have brought "data overload" to host computers. In addition, the use of personal computers and departmental databases isolated potentially useful data, creating a need to consolidate database records toward a single customer view. The need for database consolidation gave birth to the data warehousing concept and massive centralized data repositories, increasing the need for more powerful data exploitation tools.

1.2 Data Mining
To meet this modern business requirement, a new discipline in data management called data mining was created. We can loosely define data mining as the process of automatically extracting valid, useful, previously unknown, and ultimately comprehensible information from large databases and using it to make crucial business decisions.

Data mining uses a discovery-based approach in which classification, prediction, cluster formation, and association functions are employed to determine the key relationships and trends in the data. Data mining algorithms can look at numerous multidimensional data relationships concurrently, highlighting those that are dominant or exceptional, and transforming data into business knowledge.

In this way data mining differs from other data analysis methods. Multidimensional analysis, relational online analytical processing (OLAP), and ad hoc query tools support a verification-based approach in which the user hypothesizes about specific data interrelationships and then uses these tools to verify or refute those hypotheses. Since an external agent, usually a human analyst, is responsible for generating the hypothesis tests, the effectiveness of this verification-based analysis is limited by a number of factors: we rely on the experience of the analyst to pose appropriate questions that quickly return results, to manage the usually large number of attributes required in the analysis, and to deal with the complexity of the attribute space.

1.3 Data Mining Tools
Data mining draws on theories and techniques used in the fields of machine learning, databases, statistical mathematics, and numerical analysis. The underlying basis for those fields is the extraction of knowledge or patterns of information from data in the context of large databases.

Tools that have been used successfully in related research areas have been modified to meet the special requirements of data mining. Data mining requires a data model that is mostly data driven. The data model should demonstrate robustness even under heavy-tailed distributions and/or high levels of noise in the data. It should provide the means for building a representation of the observed data that has predictive validity, one that not only detects patterns hidden in the available historical data (memorization) but also responds correctly when previously unseen stimuli are presented to the model (generalization).

Variable selection, clustering, classification, association rule formation, and prediction are some of the most popular data mining functions. Each one of them can be performed using a range of tools, most of them outlined in Appendix B, “Data Mining Functions” on page 185. Data mining tools must be selected according to our requirements. Therefore, knowing the advantages, disadvantages, and limitations of each tool is essential for performing a successful data mining operation.

Data mining operates on large data volumes and “real-world” probability distributions. It also operates in a time and computationally constrained environment. This requires scalability and statistically robust performance from a proposed tool.

Statistical robustness is influenced by assumptions made before the construction of the data representation model. In linear regression, for instance, we assume that the predictors are totally independent. If one of the predictors is correlated with one or more of the others, we face large coefficient errors and, under strong dependency, even problems computing the actual values of the coefficients. Although stepwise regression methods, based on a T-test, have been developed, assumptions concerning the distribution of the residuals and the dependent variable are still made. The analyst must therefore introduce normality to the input space by power transformations and plot the predicted values against the standardized residuals to check for a uniform distribution.
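For reference, the model in question here is the multiple linear regression y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon, fit by least squares. When two predictors are strongly correlated, the variance of their estimated coefficients becomes large, which is the source of the coefficient errors mentioned above.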

Besides the assumptions that we have to make about the nature of our data, another factor that influences the robustness of our model is the quality of the historical data used in the data mining operation. Observations with missing values are a common phenomenon in real-world data sets. Some data mining tools are more sensitive to missing values than others. For instance, linear regression, discriminant analysis, and artificial neural networks (ANNs) do not include any mechanism for imputing missing values. However, decision trees can use surrogate rules, available at nonterminal nodes, if the splitting attribute is unobserved.

Another question that we have to address is how complex our data model must be. ANNs and nonlinear regression are powerful tools that can approximate any nonlinear function. Linear regression and analysis of variance account only for linear relationships between the variables of interest. However, linear models do behave differently under noise in the data set. If we are facing a data set with a significant degree of white noise, a simple model like a regression line will have better generalization abilities. On the other hand, a multilayer perceptron (MLP) will tend to fit the noise as well, making the model better in memorization but worse in generalization. Facing this "bias/variance" problem, data mining tools should be as insensitive to noise as possible while keeping their nonlinear properties. Methods in ANNs and decision trees have been developed in this direction, namely penalty functions for ANNs that penalize unnecessary model complexity, and pruning methods for decision trees that reduce the complexity of decision rules. However, these methods do not guarantee a better model, leaving the researcher responsible for choosing the right tool.

Data mining tools can be seen from yet another point of view. Some of them are parametric models, like ANNs or regression, whereas others, like decision trees, are nonparametric. Parametric models require a method for finding optimal parameters for the model. To perform parameter optimization we can use statistical methods or numerical analysis methods. Statistical methods are sensitive to assumptions. Parameter optimization methods based on numerical analysis, on the other hand, are affected by local minima in the error space. In other words, there is no guarantee that an error function used in a learning algorithm to find optimal weights for an ANN will reach a global minimum. The error function can be trapped in a local minimum even with a strong momentum term. This behavior requires the analyst or the data mining algorithm to alter the topology of the ANN, usually the number of hidden neurons, in order to perform error descent in a simpler space. Generally speaking, no ANN learning algorithm is guaranteed to converge in polynomial time; we cannot be sure that a solution to a problem will be found in time that is polynomial in the size of the problem and the accuracy required. This makes ANNs lag behind decision trees in a time-constrained environment.

1.4 From Data to Knowledge
Until now we have briefly examined tools performing some data mining function. In this section we present a knowledge discovery methodology based on data mining. Knowledge discovery starts with data collection, data preparation, and dimension reduction. Then, through data mining functions, it delivers a decision support system. Figure 1 on page 7 illustrates this process.

Figure 1. Knowledge Discovery and Data Mining (the process flows from setting goals through data collection, sampling, data preparation, and dimension reduction to data modeling, model selection, and decision making, with result measuring and new data feeding back into the cycle)

1.4.1 Setting Goals
Clearly the goal of data mining is to provide the foundation for a robust and comprehensive decision support system (DSS). When we develop a DSS, as in any IT project, we need to clearly define the requirements and deliverables of the system at the beginning. Thus the questions that we will address using data mining have to be managed to meet the requirements of the underlying DSS and, ultimately, the general strategy of the company.

1.4.2 Data Collection
After defining the requirements and deliverables of our data mining operation, we need to collect the necessary historical data. The desired database tables may be hosted in one or more databases. For instance, marketing databases contain customer purchasing data, demographic data, and life-style data. An analyst in a marketing department needs to combine customer purchase data with demographic data in order to produce a mailing list for a direct marketing campaign.

1.4.3 Sampling
In most data mining operations the volume of the data sets that we mine is extremely large. Data mining tools designed to cope with this fundamental requirement take advantage of parallel computing to increase computational power and of data mining databases to optimize passes over the available data.

However, there are some things that an analyst can do to reduce the time spent mining large data sets. The most common technique in this direction is sampling. In data sampling we extract a subset of the original data for further analysis. This subset is usually considerably smaller than the original data set. The exact ratio between the sample and the original data depends on our domain knowledge and the type of data mining function that we want to perform. The sample is usually drawn randomly, by stratification, or systematically (every Nth record), without replacement. Again, the sampling method is determined by the properties of the original data set. For prediction we usually use random sampling; for classification, in order to preserve the presence of examples belonging to each class, we use stratified sampling.
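To make this concrete, here is a minimal SQL sketch of drawing an approximate 10% random sample. The CUSTOMER table is a hypothetical example, and the RAND scalar function, while available in DB2, varies in name and availability across database systems.

   -- Draw an approximate 10% random sample of the rows.
   -- CUSTOMER is an illustrative table; RAND() support varies by DBMS.
   SELECT *
   FROM customer
   WHERE RAND() < 0.1;

For stratified sampling, a similar predicate can be applied within each class partition so that every class keeps roughly the same proportion of examples.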

1.4.4 Data Preparation
The preprocessing stage usually starts with the normalization of the input space. In real-world data sets variables may have values that differ significantly. This phenomenon may influence the way that a data mining tool assigns relative importance to these variables in determining the required output. For example, when taking the Euclidean norm to compute the distance between two observations in cluster formation, the distance will be influenced much more by variables with values several orders of magnitude larger than the others. This requires a rescaling of the form:

x'_i = \frac{x_i - \bar{x}}{\sigma}

where i = 1, 2, \ldots, n labels the available examples, \bar{x} is the mean, and \sigma^2 is the variance of the examples.

In other situations a nonlinear transformation may be required to introduce normality in the data set. The above linear transformation resolves to a transformed data set with zero mean and unit standard deviation. However, we can also perform a power transformation of the form x' = \log(x - a). This nonlinear transformation will probably give a more bell-shaped frequency chart of the underlying variable.
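The linear rescaling above can be sketched in SQL as follows. This is a minimal example assuming a hypothetical CUSTOMER table with an INCOME column; the AVG and STDDEV column functions exist in DB2, but function names vary by DBMS.

   -- Standardize INCOME to zero mean and unit standard deviation.
   SELECT c.cust_id,
          (c.income - s.mean_income) / s.sd_income AS income_std
   FROM customer c,
        (SELECT AVG(income) AS mean_income,
                STDDEV(income) AS sd_income
         FROM customer) AS s;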

Missing data is a usual phenomenon when we mine real-world data sets. There are methods that we can use to remedy such a deficiency before the data is used for modeling. We can discard examples with missing values or impute the missing fields using our domain knowledge or various heuristics. Which approach to use depends on the quantity of the available examples. If the data set is sufficiently large and the number of examples with missing values is considerably small, we can discard those examples from the data set. However, this simple solution cannot be followed if all of the available examples are important to future modeling. In that case missing values can be imputed by the sample mean or median of the available values. We can also use clustering methods, replacing the missing values of an example with the representative values of the cluster to which it belongs. More sophisticated and computationally expensive methods suggest the use of regression and neural networks to estimate missing values. In this approach missing values are represented as outputs of a neural network whose inputs are the nonmissing variables, or of regression functions over the other variables.
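As a minimal SQL sketch of mean imputation, again assuming the hypothetical CUSTOMER table: because AVG ignores nulls, the subquery computes the mean of the observed values only.

   -- Replace missing INCOME values with the mean of the observed values.
   SELECT cust_id,
          COALESCE(income,
                   (SELECT AVG(income) FROM customer)) AS income_filled
   FROM customer;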

1.4.5 Dimension Reduction
It is recognized that the effective dimensionality of the data is less than the apparent dimensionality, since variables are correlated with each other. Data mining algorithms that work on an input space with a dimension close to the intrinsic dimensionality are more successful in terms of both generalization ability and time performance. The simplest approach to dimension reduction is feature selection. In feature selection only a subset of the original input variables is selected to take part in the data modeling. The subset is selected according to some selection criterion that judges whether one subset is better than another.

A more generic approach to dimension reduction is principal component analysis. Using principal components instead of the actual variables may yield a smaller dimensionality of the input space, but again we are limited by the linear nature of the technique. We may therefore not be able to capture nonlinear correlations, and consequently we may overestimate the true dimension of the data. Feed-forward multilayer networks can be used to perform nonlinear dimensionality reduction, overcoming some of the limitations of principal component analysis. An ANN with n input nodes, a hidden layer with m < n nodes, and n output nodes can be used. This auto-associative ANN maps the n-dimensional input space to an m-dimensional space. The output layer of the network can then be removed, and the remaining input and hidden layers can be used as the input layers to another ANN that performs the data mining operation.
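For reference, standard principal component analysis (not specific to any product discussed here) projects each n-dimensional example onto the leading eigenvectors of the sample covariance matrix:

   \Sigma = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^{\top}, \qquad z_i = W^{\top}(x_i - \bar{x})

where N is the number of examples and the columns of W are the m < n eigenvectors of \Sigma with the largest eigenvalues, so that each z_i is an m-dimensional representation of x_i.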

1.4.6 Data Modeling
After preparing our data and performing a data mining function, we arrive at a solution represented by our data model. At this point there is a debate regarding the validity of the final product. Though methods for evaluating data models exist (validation, cross validation), there is always a degree of skepticism over the performance of the model in an online situation operating on new data.

A solution to this problem is always to have more than one data model of the same data set. This allows us to increase the confidence level of a certain business decision. Toward this end, a committee of models can be constructed. A committee can be a set of models coming from the same data mining tool (homogeneous) or a collection of models coming from different data mining tools (heterogeneous). A homogeneous committee of data models can be used to average the predictions generated or to assign a pattern to the class with the largest score. Heterogeneous committees can be used to compare models and, through them, different data mining approaches to the same problem. Lift charts and confusion tables can be used to compare results from different data models. The model with the higher lift, or the one that gives fewer positive or negative errors in the confusion table, will be a desirable candidate.

1.4.7 Decision Making
Until now we have described how we can extract information from large volumes of data. We need to take this one step further and convert the information generated through data mining into knowledge and eventually business actions. Having passed through the first cycle of data mining iterations and discovered the optimal data model(s), we need to put the selected data model(s) online, delivering a decision support system. Thus the decision support system will encapsulate any hidden information in the company’s databases. Decision makers can then take advantage of the machine’s intelligence and, together with their own experience, reach more effective and revolutionary ideas.

1.4.8 Measuring the Results
After business decisions have been taken and applied in the marketplace, we need to measure the effectiveness of the data mining effort. This requires monitoring success factors such as sales or return on investment (ROI) over time. In addition, trends must be identified as soon as possible, with a high degree of accuracy, so that the data mining strategy can be adjusted.

Multidimensional analysis can be used successfully for this purpose, since it can time-stamp data volumes, provided that one dimension of the analysis represents time. For instance, by feeding into an OLAP engine new data that captures the market’s response to a marketing campaign based on data mining, we can isolate populations with a negative response. Then we can reapply data mining to that specific population in order to understand why the original campaign did not affect them positively.

Chapter 2. Introduction to Intelligent Miner for Data and DataJoiner

This chapter presents a high-level, functional overview of the IBM Intelligent Miner for Data and IBM DataJoiner products.

2.1 Intelligent Miner
In Chapter 1, “An Analytical Approach to Data Mining” on page 3 we defined data mining as the process of automatically extracting valid, useful, previously unknown, and ultimately comprehensible information from large databases and using it to make crucial business decisions.

The Intelligent Miner (IM) helps us perform data mining tasks. Through an intuitive graphical user interface (GUI) you can visually design data mining operations. You can choose tools and customize them to meet your requirements. The available tools cover the whole spectrum of data mining functions. In addition, IM lets you select data, explore it, transform it, and visually interpret the results for productive and efficient knowledge discovery.

2.1.1 Overview of the Intelligent Miner
The IM is based on a client-server architecture. The server executes mining and processing functions and can host historical data and mining results. The client provides administrative and visualization tools and can be used to visually build a data mining operation, execute it on the server, and have the results returned for visualization and further analysis. In addition, the IM application programming interface (API) provides C++ classes and methods as well as C structures and functions for application programmers.

2.1.2 Working with Databases
Most functions of the IM can use input data from either flat files or database tables. If you want to access DB2 tables, you require only authorization to access that database and permission to query the appropriate tables. If you want to access data in Oracle, SPSS, or SAS format, DataJoiner can be installed as a middleware product.

There are two ways to access database management systems:
• Through the Intelligent Miner server, without installing any database software on the client
• Directly from the client to the table, letting the IM client access the database without using the IM communication interface

2.1.3 The User Interface
The IM helps businesses reduce the cost of data mining and maximize the ROI by providing a consistent administrative user interface. The IM user interface is simple and intuitive and provides consistency across all operations. The interface’s state-of-the-art GUI facilities include online help, task guides, and a graphical representation of the mining operations and their functions.

The main window of the IM GUI is divided into three areas (see Figure 2 on page 14):
• Mining base container. A tree view of object folders containing knowledge discovery tools.
• Contents container. An area for customized objects.
• Work area. An area where customized objects from the contents container can be imported and assigned to a single folder (Mining base).

Figure 2. Intelligent Miner Main Window

2.1.4 Data Preparation Functions
Once the desired database tables have been selected and the data to be mined has been identified, it is usually necessary to perform certain transformations on the data. IM provides a wide range of data preparation functions that help to optimize the performance of the data mining functions. Depending on the data mining technique, you can sample, aggregate, filter, cleanse, and/or transform data in preparation for mining. The data preparation functions are:
• Aggregate values
• Calculate values
• Clean up data sources
• Convert to lower or upper case
• Copy records to file
• Discard records with missing values
• Discretize into quantiles
• Discretize using ranges
• Encode missing values
• Encode nonvalid values
• Filter fields
• Filter records
• Filter records using a value set
• Get random sample
• Group records
• Join data sources
• Map values
• Pivot fields to records
• Run SQL statements

Data preparation functions are performed through the GUI, reducing the time and complexity of data mining operations. You can transform variables, impute missing values, and create new fields at the touch of a button. This automation of the most typical data preparation tasks is aimed at improving your productivity by eliminating the need to program specialized routines.

2.1.5 Statistical and Mining Functions
After transforming the data we use one or more data mining functions to extract the desired type of information. IM provides both statistical analysis tools and state-of-the-art machine learning algorithms for successful data mining.

Statistical functions facilitate the analysis and preparation of data, as well as providing forecasting capabilities. For example, you can apply statistical functions like regression to understand hidden relationships in the data or use factor analysis to reduce the number of input variables. Statistical functions include:
• Factor analysis
• Linear regression
• Principal component analysis
• Univariate curve fitting
• Univariate and bivariate statistics

Based on IBM research, validated through real-world applications, IM has incorporated a number of data mining algorithms as the critical suite to address a wide range of business problems. The algorithms are categorized as follows:
• Association discovery: Given a collection of items and a set of records, each of which contains some number of items from the given collection, an association discovery function is an operation against this set of records that returns the affinities that exist.
• Sequential pattern discovery: A sequence discovery function analyzes collections of related records and detects frequently occurring patterns of products bought over time among the collection of items.
• Clustering: Clustering is used to segment a database into subsets, the clusters, with the members of each cluster sharing a number of interesting properties.
• Classification: Classification is the process of automatically creating a model of classes from a set of records. The induced model consists of patterns, essentially generalizations over the records, that are useful for distinguishing the classes. Once a model is induced, it can be used to automatically predict the class of other unclassified records.
• Value prediction: As in classification, the goal is to build a data model as a generalization over the records. The difference is that the target is not a class membership but an actual value.
• Similar time sequences: The purpose of this process is to discover all occurrences of similar subsequences in a database of time sequences.

Clustering, classification, and value prediction can be covered by a number of different algorithms. For instance, you can perform clustering by using either the demographic or the neural network algorithm, depending on the properties of the input data set and the requirements of the data mining operation.

2.1.6 Processing IM Functions
All IM functions can be customized using two levels of expertise. Users who are not experts can accept the defaults and suppress advanced settings. However, experienced users who want to fine-tune their application have the ability to customize all settings according to their requirements.

You can additionally define the mode of your statistical and mining functions. Possible modes are:
• Training mode. In training mode, a mining function builds a model based on the selected input data.
• Clustering mode. In clustering mode, the clustering functions build a model based on the selected input data.
• Test mode. In test mode, a mining function uses new data with known results to verify that the model created in training mode produces consistent results.
• Application mode. In application mode, a mining function uses a model created in training mode to predict the specified fields for every record in the new input data.

2.1.7 Creating and Visualizing the Results
Information that has been created using statistical or mining functions can be saved for further analysis in the form of result objects. Figure 3 on page 18 illustrates some of the results generated by the clustering function.

Result objects can be used in several ways:
• To visualize or access the results of a mining or statistical function
• To determine what resulting information you want to write to an output data object
• As input when running a mining function in test mode, to validate the predictive model represented by the result
• As input when running a mining function in application mode, to apply the model to new data

Figure 3. Sample Clustering Visualization

2.1.8 Creating Data Mining Operations
The IM provides a means to construct data mining operations as a sequential series of related data preparation, statistical, and mining functions. IM will execute the objects in the order in which they have been specified. A sequence can contain other sequences. You can construct standard sequences that can be reused in similar data mining operations, forming part of a more complex sequence.

In IM, sequence settings objects are created through the sequence task guide. You can select objects from the mining base and place them in the sequence work area in the order in which they are to run. In addition, you can specify whether the sequence should continue if any of the settings objects fails. Figure 4 on page 19 illustrates a sequence settings window.

Figure 4. Sequence - Settings Window

2.2 DataJoiner
Business and government agencies today maintain enormous amounts of data about their operations, from sales transactions to medical records, organized along departmental and functional lines.

Critical business data is often managed by multiple database management systems (or conventional file management systems) running on different operating systems across locations with different network protocols. This has led to fragmented data resources, or to what has come to be known as islands of information.

Turning operational data into an informational asset has become a key focus for many enterprises, but one not easily realized due to the volume and complexity of the data, its distribution, and proprietary incompatibilities.

This is compounded by the fact that much of this data is redundant and too inconsistent to support analytical processing and decision making. Providing an integrated view across the enterprise is a challenge facing many businesses today.

Data mining has become a critical form of analysis in the business intelligence field today, with many competitive applications including product demand prediction, targeted marketing, and fraud and abuse management. In a diverse data environment this advantage is challenged.

2.2.1 The Multidatabase Solution
Many products on the market today seek to address the complexities faced by businesses with distributed and diverse data sources. Middleware is the generic term used to refer to a class of software which aims to simplify access by providing a transparent view of divergent data resources.

Database gateways are a type of middleware which partially satisfy this requirement by passing requests from one database to another, but with poor location transparency and no support for join operations across data sources.

Multidatabase servers surpass gateways by providing single-site images of data that is physically distributed across multiple systems.

DataJoiner is IBM's solution for multivendor data access enabling users to define a single, enterprisewide view of relational and nonrelational data sources across a range of platforms and operating environments.

Through DataJoiner, multilocation joins, unions, views, and other relational operations required for complex business queries can be expressed in a single SQL statement that is optimized to minimize network traffic. Virtual tables, or views, can be defined that allow base tables to be migrated from one location or database to another without impacting the applications that reference them.
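A minimal sketch of such a statement follows. ORA_CUSTOMER and SAS_LOANS are hypothetical stand-ins for DataJoiner nicknames previously defined over an Oracle table and a SAS data set; they are illustrative and not taken from the scenarios later in this book.

   -- One SQL statement joining data from two different back ends.
   -- ORA_CUSTOMER and SAS_LOANS are hypothetical DataJoiner nicknames.
   SELECT c.cust_id, c.income, l.loan_amount
   FROM ora_customer c, sas_loans l
   WHERE l.cust_id = c.cust_id
     AND l.status = 'APPROVED';

DataJoiner decides which predicates to evaluate at each source and performs the cross-source join itself.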

2.2.2 Heterogeneous Data Access
DataJoiner extends the reach of data mining tools like IM by presenting a single database image of a heterogeneous environment to the business analyst, as though the data were local.

DataJoiner provides complete, multiplatform DB2 support in addition to Oracle, Oracle Rdb (formerly DEC Rdb), Sybase, Microsoft SQL Server and Jet databases (including MS Access), Informix, IMS, and VSAM through Classic Connect, third-party gateways such as EDA/SQL and Cross Access, and any ODBC-X/Open compliant data source such as Tandem Non-StopSQL, SPSS, and SAS. Standard embedded SQL, ODBC, and JDBC compliant interfaces to these data sources are provided.

The complexities of establishing connections to different data sources, translating requests into native calls, and ensuring data type compatibility are managed by DataJoiner, providing transparent read/write access to current (that is, live) data, as well as to stable, or point-in-time, snapshots.

Data access methods, SQL dialects, network protocols, operating systems, data types, remote connections, error codes, functional differences, and security are all handled by DataJoiner so that data appears local, accessible through standard SQL to the developer and business analyst alike.

All data manipulation language (DML) statements are supported directly by DataJoiner, with data definition language (DDL) and data control language (DCL) statements passed through to the native database system. Specific functions supported by a particular vendor can also be exploited using DataJoiner’s "pass-through" mechanism.
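As a sketch of the pass-through idea: statements issued between entering and leaving pass-through mode travel unmodified to the named server. The server name ORA1 is illustrative; verify the exact statement syntax against the DataJoiner documentation.

   SET PASSTHRU ORA1;
   -- Oracle-specific SQL can now be issued directly, for example a
   -- query against an Oracle catalog view:
   SELECT table_name FROM user_tables;
   SET PASSTHRU RESET;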

Security is handled by the native data source, with validation performed at the client, at the remote data source, by DataJoiner, or by DataJoiner in conjunction with the data source. DataJoiner observes the locking protocols of the native data source.

DataJoiner additionally provides:
• Two-phase commit support, allowing data to be updated across multiple systems in a single transaction with data integrity
• Transparent, distributed joins and unions across data sources
• Global, heterogeneous, synchronous or asynchronous data replication and warehousing for "universal" or object-relational data management

DataJoiner works synergistically with other components of the IBM Business Intelligence (Figure 5 on page 22) and Information Warehouse (Figure 7 on page 25) architectures.

Figure 5. IBM Business Intelligence Architecture (layers, top to bottom: Business Intelligence Applications such as Intelligent Miner, DecisionEdge, and Discovery Series; Decision Support Tools such as BrioQuery, Business Objects, Cognos, and QMF; Access Enablers such as DataJoiner, DB2 OLAP Server, Net.Data, ODBC, and JDBC; Data Management with DB2 and DataJoiner-enabled data sources; Data Warehouse Modeling/Construction with Visual Warehouse for administration and metadata management, data replication tools, and Data Refresher; and Operational and External Data)

2.2.3 Database Integration
DataJoiner is built on the DB2 server and supports all DB2 object-relational extensions, including:
• Abstract Data Types (ADTs). These are user-defined types that describe complex structures to the database, for example, spatial and geographic data, audio, video, and multidimensional time series. This allows you to apply spatial intelligence to heterogeneous data sources to ask questions like: "Where should we open a new store?" or "Is this home location within our risk parameters?"
• User-Defined Types (UDTs). Examples are US_DOLLARS and EURO as extended decimals, external file links, or DATALINKS. DataJoiner supports strong typing, which can prevent misleading operations such as directly comparing two currency types (see the sketch following this list).
• User-Defined Functions (UDFs). DataJoiner allows users to define their own SQL functions and map these to standard or user-defined functions in the native back-end data source. Encapsulation and polymorphism/function resolution are supported.
• OLAP aggregation and OLAP Server support. Examples are CUBE, ROLLUP, and UDB Summary Table query redirection and OLAP Server support. The DataJoiner optimizer also processes distributed star schema queries efficiently.
• Collection and Reference Types
• Large Objects (LOBs). Examples are multimedia data types such as audio, compressed video, image, and text. DataJoiner supports queries referencing multiple large object columns, even if the underlying DBMS does not natively support this, and optimizes the communications overhead.
• Triggers. Triggers can be defined within a DataJoiner database to perform SQL statements automatically, depending on predefined criteria.
• Stored Procedures. Stored procedures are executable modules, comprised of SQL and application logic, that are held in the database. They greatly improve performance and minimize network traffic. DataJoiner supports the SQL92 level of function for calls. For data sources that do not natively support stored procedures, DataJoiner will run the stored procedure locally and generate the instructions and SQL necessary for the data source to complete the request. Cascaded stored procedures across databases are also supported.
• Conditional Logic. DataJoiner allows you to apply CASE expressions to all supported data sources.
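A minimal sketch of the strong-typing idea follows, using DB2-style distinct-type DDL; the type, table, and column names are illustrative. Comparing the two currency columns directly raises a type error, and the cast function that DB2 generates for each distinct type must be used instead.

   CREATE DISTINCT TYPE US_DOLLARS AS DECIMAL(15,2) WITH COMPARISONS;
   CREATE DISTINCT TYPE EURO AS DECIMAL(15,2) WITH COMPARISONS;

   CREATE TABLE order_totals (
     order_id  INTEGER NOT NULL,
     total_usd US_DOLLARS,
     total_eur EURO
   );

   -- total_usd > total_eur would be rejected by the type system.
   -- CASE expressions work as usual, with an explicit typed constant:
   SELECT order_id,
          CASE WHEN total_usd > US_DOLLARS(1000.00)
               THEN 'large' ELSE 'small'
          END AS order_size
   FROM order_totals;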

Where possible these features are simulated by DataJoiner if native support is not available, effectively delivering an SQL3-based, object-relational API to any defined data source. See Figure 6 on page 24 for the IBM Object-Relational Architecture.

Figure 6. IBM Object-Relational Architecture (an application reaches the SQL3 object-relational API and Object Request Broker API through client object support; DataJoiner, with file links and a file API, connects external files such as video and images, IMS, VSAM, the DB2 product family plus extenders for text, video, spatial, and so on, and non-IBM DBMSs through ODBC, including Oracle, Sybase, Informix, SAS, Excel, Microsoft databases, and Lotus 123)

The ability of DataJoiner to simulate a function (such as recursive SQL) that is not natively supported by a remote data source is a powerful feature. DataJoiner can, for example, simulate EXCEPT and INTERSECT set operations. Where multiple open cursors are not supported natively, DataJoiner will compensate by establishing separate, simultaneous connections to the database. Using this technique it can also maintain cursor position by simulating CURSOR WITH HOLD.
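For example, the following statement uses EXCEPT against two hypothetical nicknames; DataJoiner can evaluate it even when a back-end source has no native EXCEPT support.

   -- Customers present in the Oracle source but not in the SAS source.
   SELECT cust_id FROM ora_customer
   EXCEPT
   SELECT cust_id FROM sas_loans;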

2.2.4 Heterogeneous Replication
DataJoiner can capture database changes in Oracle, for example, using system-generated triggers, and apply these to a supported target database (for example, Informix or DB2) on another platform. This sort of replication supports both "push" and "pull" propagation, so that data can be provided on request ("pulled") or "pushed" off-peak to avoid excessive polling in online transaction processing (OLTP) environments.

Some enterprises have a proliferation of small-scale data warehouses, or data marts, with no unified access. DataJoiner provides a viable solution to this problem by simulating a single enterprise warehouse or cross-departmental data mart, creating a "virtual warehouse" on top of pretransformed and cleansed data. Superior performance can be realized in this type of environment through DataJoiner's query rewrite and push-down processing.

Figure 7. IBM Information Warehouse Architecture (diagram: operational and external data sources such as DB2, Oracle, SAS, SQL Server, VSAM, IMS, Informix, ODBC, and external source files are accessed, transformed, distributed, and stored through open interfaces by DataJoiner, DataPropagator, DataRefresher, and VisualWarehouse; find, analyze, and discover tools include Intelligent Miner, DataGuide, QMF, and third-party products such as Arbor, Vality Technology, Pilot Lightship, BrioQuery, BusinessObjects, and Cognos; automation and management are provided by DataHub, FlowMark, ADSTAR Distributed Storage Manager, and VisualWarehouse)

The replication features of DataJoiner are fully compatible with DataPropagator log-CAPTURE and APPLY, as well as DataPropagator Non-Relational for IMS.

2.2.5 Global Query Optimization
DataJoiner includes sophisticated global query optimization that automatically transforms queries into faster, less costly, logically equivalent SQL wherever I/O, sorts, and communications overhead can be reduced. It does this by collecting and maintaining statistics on cataloged data sources. Examples of such transformations are the removal of redundant DISTINCT clauses on primary key columns and the elimination of unnecessary JOINs and UNIONs. Indeed, DataJoiner's optimizer can often provide faster single-source query execution times than the native DBMS.
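As a hypothetical example of such a rewrite (the custno column is invented for illustration), if custno is the primary key of a table, the DISTINCT below is redundant, because primary key values are already unique, and removing it saves a sort:

SELECT DISTINCT custno, income FROM customer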

Other optimization techniques employed by DataJoiner include push-down analysis. This ensures data is filtered as close to the source as possible, avoiding unnecessary I/O and network traffic. DataJoiner does this by analyzing SQL to determine which predicate(s) to apply to each data source.
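As a sketch, using the customer table that appears later in this book, a predicate such as the one below would be shipped to the data source, so that only qualifying rows travel across the network:

SELECT gender, income FROM customer WHERE income > 90000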

When processing distributed star schema queries, DataJoiner takes account of the communications overhead of pushing down a cartesian product operation and will rewrite such queries to improve performance.

The DataJoiner global catalog statistics allow it to determine the most efficient access path. In doing so it considers:
• Relative communication, I/O, and CPU speed
• Native indices and access methods
• Reference table cardinality
• Table size (pages and fractions)
• High/low values for indexed columns
• Index levels, leaf nodes, and key cardinality
• Cluster ratios for clustered indices
• Distribution histograms
• Estimated function costs
• Function execution location options

DataJoiner generates these statistics directly by querying the data source, and allows them to be entered manually when the native database does not maintain this information. By supplying relative CPU, I/O, and communication cost estimates, administrators can also influence the optimizer's access plans.
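As a hedged illustration only: in the DB2 family, catalog statistics can typically be adjusted by updating the SYSSTAT views, and assuming DataJoiner follows the same convention, a manual cardinality adjustment might look like the statement below (the schema, table, and value are invented):

$ db2 "update sysstat.tables set card = 500000 where tabschema = 'DMINER' and tabname = 'CUSTOMER'"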

An automatic, time-driven refresh technique maintains the statistical metadata held in the DataJoiner catalog. The refresh period is specified per table and indicates how many days must pass before a refresh is triggered when the table is next referenced. A refresh period of "0" triggers a catalog refresh each time the table is referenced, while "-1" suppresses refreshes.

2.2.6 Web Support
DataJoiner fully supports Java, allowing you to write stored procedures and UDFs in Java, and Java Database Connectivity (JDBC) for transparent access to all data sources, including VSAM and IMS. Web development support is extended through the Net.Data product that comes with DataJoiner. Net.Data allows knowledge workers to rapidly develop Web pages that dynamically query any DataJoiner data source through a simple HTML-based macro language. Net.Data also supports the common gateway interface (CGI) and multiple Web server APIs, including IBM's ICAPI, Netscape's NSAPI, and Microsoft's ISAPI, and allows the implementation of Java servlets.

Part 2. Accessing Relational Data

Chapter 3. System Architecture for Mining Oracle Databases

In this chapter we briefly describe an Oracle database environment and define an architecture that will enable data mining on the Oracle database using the IBM IM.

The architecture described here and expanded on in later chapters does not make any assumptions about the purpose of the Oracle database. It can be an operational database, an enterprise data warehouse, or a departmental data mart. However, from the point of view of performance, stability, and ease of use, we recommend that you not implement a decision support system, such as a data mining system, to directly query or manipulate operational data.

3.1 The Oracle Database Environment
The version of Oracle used to set up and test the environment described in this book is Version 8.0.4, for the Server as well as the Client. We have not tested this environment with Version 7 of Oracle. However, as we describe how we built this environment, we will mention anything that is significantly different for Version 7 of Oracle.

A typical Oracle database environment consists of a server platform running the Oracle RDBMS and Net8 with Protocol Adaptors for client connectivity. The clients are typically applications built with Oracle or original equipment manufacturer (OEM) software development environments, or Oracle front-end tools, running on personal computers and using Oracle Net8 to communicate with the Oracle RDBMS over TCP/IP or IPX/SPX.

The Oracle environment we used consists of an RS/6000 server (model J50) running Oracle RDBMS Version 8.0.4 on AIX Version 4.3.1 (see Figure 8 on page 30). For the purpose of client connectivity, we also installed Net8 together with Protocol Adaptors on the RS/6000 server. The client is an IBM PC running Oracle Client Version 8.0.4 on Windows NT Workstation Version 4. Communication between the client and server is over TCP/IP. A similar environment with Oracle Version 7 would have one key difference - it would use SQL*Net instead of Net8.

Figure 8. The Oracle Database Environment (diagram: the database server azov, an IBM RS/6000 Model J50 at 9.1.150.74 running Oracle RDBMS 8.0.4 and Net8, and the client tuvalu, an IBM PC at 9.1.150.165 running Oracle Client 8.0.4, connected over a token-ring network)

The following names were used for the installation performed:

INSTANCE SID   ora8
USERID         analyst
TABLESPACE     warehouse
TABLE          customer

The warehouse tablespace is the default tablespace for the user, and the table called customer resides within this tablespace. The customer table contains bank customer information, and its columns are:

GENDER      Gender of the customer
AGE         Age of the customer
SIBLINGS    The number of brothers/sisters the customer has
INCOME      The annual income of the customer
CAR_COLOUR  Colour of the customer's car
PRODUCT     Retail banking product purchased by the customer from the bank

This table is created from a sample data file called banking.txt that is provided with the IM product on AIX. In order to access this file, you need to start IM in demo mode; you will then find the file in the /tmp/dmtksample directory. For instructions on starting IM in demo mode, please refer to Appendix A, "A Mining Scenario," in the Using the Intelligent Miner for Data manual supplied with the IM product.
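For reference, DDL along the following lines would create the customer table in Oracle (the column types are taken from the desc output shown below; this sketch is not part of the original setup steps):

CREATE TABLE customer (
  gender      VARCHAR2(6),
  age         NUMBER(5,2),
  siblings    NUMBER(2),
  income      NUMBER(7),
  car_colour  VARCHAR2(7),
  product     VARCHAR2(1)
) TABLESPACE warehouse;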

Here is a sample query on the customer table using an SQL*Plus session in our Oracle environment:

$ sqlplus analyst/analyst

SQL*Plus: Release 8.0.4.0.0 - Production on Mon Oct 12 10:53:2 1998

(c) Copyright 1997 Oracle Corporation. All rights reserved.

Connected to:
Oracle8 Release 8.0.4.0.0 - Production
PL/SQL Release 8.0.4.0.0 - Production

SQL> desc customer
 Name                            Null?    Type
 ------------------------------- -------- ------------
 GENDER                                   VARCHAR2(6)
 AGE                                      NUMBER(5,2)
 SIBLINGS                                 NUMBER(2)
 INCOME                                   NUMBER(7)
 CAR_COLOUR                               VARCHAR2(7)
 PRODUCT                                  VARCHAR2(1)

SQL> select * from customer where car_colour='purple' and siblings>5;

GENDER     AGE  SIBLINGS  INCOME  CAR_COL  P
------  ------  --------  ------  -------  -
male     27.06         6   95386  purple   2
male        32         9    9507  purple   2
female    31.2         8   21645  purple   6
female    37.6         7   46517  purple   7
male      63.3         7   54829  purple   4
female   11.08        10    2514  purple   7
male     27.04         6   44339  purple   5
male        28         6   91550  purple   7
female   24.02         6   10043  purple   4
male     31.05         6    5112  purple   5
male      45.8        14   99314  purple   3
male      32.6         6   36068  purple   5
female      49        11    8623  purple   6
female   28.05        11   15908  purple   1
female   20.04        10     226  purple   2
female   17.09         6    1703  purple   1
female    50.5        11   29786  purple   7
male      32.8         6   71465  purple   2
male        51         9   71388  purple   8

19 rows selected.

SQL>

3.2 Mining Oracle Databases Using IBM's Intelligent Miner for Data
IM consists of:
• A server product that provides the engine for running statistical, data processing, and mining functions
• A client product that provides a Java GUI for setting up and executing these functions
• A set of C++ APIs for developing data mining applications using these functions

For a detailed description of the product, please refer to Chapter 2, “Introduction to Intelligent Miner for Data and DataJoiner” on page 13.

IM can use flat files or databases managed by the IBM DB2 family of products as data sources for mining. IM can also store the output data created by its statistical, data processing, and mining functions in flat files or DB2 databases.

So, in order for IM to use an Oracle database as a data source and as a target for output data, either of two approaches can be taken:
• Transfer data from Oracle tables into flat files, then execute all necessary data processing, statistical, and mining functions. Finally, create an Oracle table from the final output data stored in a flat file. Figure 9 on page 33 represents this approach graphically.
• Make Oracle tables appear as DB2 tables to IM, and use them as data sources and output targets. Figure 10 on page 33 represents this approach graphically.

The Fast Extract utility available with IM enables the first approach, and the DataJoiner product from IBM enables the second. We now briefly describe these products and present an overview of the system architecture required for implementing these approaches.

Figure 9. Using Flat Files to Mine Oracle Databases (diagram: source and target tables in the Oracle database are exchanged with flat files, which serve as the input and output data for the statistical, data processing, and data mining functions of Intelligent Miner for Data)

Figure 10. Mining Oracle Databases through DB2 (diagram: source and target tables in the Oracle database appear directly as source and target tables to the statistical, data processing, and data mining functions of Intelligent Miner for Data)

3.2.1 Using Fast Extract with Oracle Databases
Fast Extract is a data collection utility supplied with IM. It allows you to transfer data out of an Oracle table on an AIX or HP-UX machine into a flat file. This flat file can then be used as input for data mining.

Once the mining is done and an output file is produced, this file needs to be transferred to the database server machine and the output data loaded back into the Oracle database. The output data from data mining algorithms usually has more fields than the input data. So, you will need to create a table in Oracle to hold the output data and then use an Oracle utility like SQL*Loader to populate the output table.
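As a sketch of that load step (the file, table, and column names are invented, and the actual output columns depend on the mining function used), a SQL*Loader control file might look like this:

LOAD DATA
INFILE 'mining_output.txt'
INTO TABLE mining_results
FIELDS TERMINATED BY ','
(gender, age, siblings, income, car_colour, product, cluster_id)

It would then be invoked on the database server with a command such as:

$ sqlldr analyst/analyst control=mining_results.ctl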

The Fast Extract utility only supports input data extraction from Oracle databases. It does not support data movement in the other direction, that is, from flat files to database tables.

The Fast Extract utility does not pad data values that it extracts. So, the resulting flat file may have a nonuniform record length. This should be corrected using file processing utilities such as sed or awk before the file is presented to IM as input data for mining.
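For example, a one-line awk filter can pad every record with trailing blanks to a fixed length (the 80-byte record length here is an assumption; adjust it to your extract):

$ awk '{ printf "%-80s\n", $0 }' customer.txt > customer_padded.txt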

A more detailed description of the functionality and usage of Fast Extract is provided in Chapter 5, “Fast Extract Utility” on page 71.

3.2.2 The DataJoiner Interface to Oracle Databases
The DataJoiner software is part of the IBM Data Management family of products. It allows users and applications to communicate transparently with multiple and diverse data sources just as if they were communicating with a DB2 database. It does this by creating aliases, or nicknames, in a DB2 database catalog that refer to tables, or joins of tables, that actually reside in other databases. Applications can then treat these nicknames just like DB2 tables and submit SQL commands against them. DataJoiner translates these SQL commands into the format recognized by the database management system (DBMS) where the tables actually reside, sends them to that database, and relays the result of the SQL command back to the application. For further information on the DataJoiner product, please refer to Chapter 2, "Introduction to Intelligent Miner for Data and DataJoiner" on page 13.

In the current scenario, DataJoiner serves to catalog Oracle tables as nicknames which IM can access as DB2 tables. When an SQL command is issued against these tables through IM, DataJoiner first translates it to a format that can be understood by Oracle. It then sends the statement to the Oracle database and gets back the results. This transfer of SQL and data between DataJoiner and Oracle is accomplished using Net8 (or SQL*Net in the case of Oracle Version 7). So, in a sense, DataJoiner behaves as a client to Oracle. IM is unaware of the existence of the Oracle database and communicates only with DataJoiner's DB2 database. Please note that the actual data from the Oracle database is not replicated into the DataJoiner DB2 database; only references in the form of nicknames are stored there. However, a fully functional DB2 database is included with DataJoiner, and applications such as IM can use this database for storing intermediate tables created during the data mining process.

Figure 11 illustrates the data transfer process using DataJoiner. The dashed representations of the tables in the DB2 database indicate that the data does not physically exist in the DB2 database.

Figure 11. Accessing Oracle Tables with DataJoiner (diagram: Oracle SQL and data flow over Net8 between the source and target tables in the Oracle database and their nicknames in the DB2 database managed by DataJoiner; DB2 SQL and data flow between the nicknames and the statistical, data processing, and data mining functions of Intelligent Miner for Data)

3.3 System Architecture
Figure 12 on page 36 describes the environment that we set up to enable mining of data stored in Oracle databases. The IM client interacts with the IM server to define and execute data mining exercises. The IM server reads source data and writes output data into DB2 tables on DataJoiner. DataJoiner, in turn, maps these tables to source and target tables on Oracle and transfers data to and from Oracle using the Oracle Net8 software. The Oracle Client on the PC is part of the initial environment that provides querying and application development facilities to end users of Oracle.

Figure 12. Architecture for Mining Oracle DB with DataJoiner and IM (diagram: the Intelligent Miner for Data client on a Windows NT V4.0 PC talks to the Intelligent Miner for Data server, DataJoiner, and Oracle Client (Net8) on an RS/6000 server running AIX V4.3.1 or a Netfinity server running Windows NT V4, which in turn communicates over Net8 with the Oracle database server, an RS/6000 running AIX V4.3.1)

We can now change Figure 8 on page 30 to add an RS/6000 server and additional software components. The complete environment that we have used to set up and test data mining of Oracle databases with DataJoiner and Intelligent Miner for Data is shown in Figure 13 on page 36. We also installed a parallel test environment using a Windows NT based PC server. This is illustrated in Figure 14 on page 37.

Figure 13. Data Mining Environment for Oracle - UNIX (diagram: database server azov, an IBM RS/6000 Model J50 at 9.1.150.74 running Oracle RDBMS 8.0.4 and Net8; data mining server sky, an IBM RS/6000 Model J50 at 9.1.150.209 running DataJoiner, Oracle Client (Net8), and Intelligent Miner for Data (Server); and database and mining client tuvalu, an IBM PC at 9.1.150.165 running Oracle Client 8.0.4 and Intelligent Miner for Data (Client), all on a token-ring network)

Figure 14. Data Mining Environment for Oracle - Windows NT (diagram: the same environment as Figure 13, except that the data mining server is candemas, an IBM Netfinity 5500 at 9.1.150.155 running DataJoiner, Oracle Client (Net8), and Intelligent Miner for Data (Server))

A detailed description of the installation and configuration steps for DataJoiner to work with Oracle is provided in Chapter 4, “Install and Configure DataJoiner with Oracle” on page 39.

Chapter 4. Install and Configure DataJoiner with Oracle

In this chapter we walk you through the installation, configuration, and testing of DataJoiner to provide access to Oracle databases. All procedures have been documented for AIX as well as Windows NT.

Based on the system architecture described in Chapter 3, "System Architecture for Mining Oracle Databases" on page 29 and illustrated in Figure 13 on page 36 and in Figure 14 on page 37, the following steps need to be performed on the Data Mining Server machine (sky / candemas) in the order given:
1. Product Installation
   1. Install Oracle Client
   2. Connect to the Oracle Server from the Oracle Client
   3. Install DataJoiner
2. DataJoiner Configuration
   1. Configure DataJoiner to access Oracle
   2. Set up a DataJoiner instance
   3. Configure DataJoiner to accept DB2 client connections
   4. Configure a DataJoiner nickname to reference an Oracle table
3. Test Access

4.1 Product Installation on AIX
In this section we describe the steps mentioned above for the AIX Data Mining Server. In our configuration this machine is an RS/6000 Model J50 running AIX Version 4.3.1. Its IP address is 9.1.150.209 and its name is sky.

4.1.1 Install Oracle Client
To install the Oracle8 Client software on the RS/6000 Data Mining Server, the following tasks need to be performed:
1. Create a filesystem for the Oracle software installation
   In our setup, we created the following:
   • A volume group called datavg
   • A logical volume called oralv in this volume group
   • A filesystem called /oracle8 on this logical volume
2. Create an Oracle administrative user
   In order to do this, we created:
   • An administrative group for the Oracle installation called dba
   • A user called oracle in the dba group with /oracle8/product/8.0.4 as the home directory
3. Set the environment variables for the oracle user
   Log in as oracle and edit the .profile file to add the following environment variable definitions:

PATH=/usr/bin:/etc:/usr/sbin:/usr/ucb:$HOME/bin:/usr/bin/X11:/sbin
JDK_HOME=/usr/jdk_base
ORACLE_BASE=/oracle8
ORACLE_HOME=$ORACLE_BASE/product/8.0.4
LIBPATH=$ORACLE_HOME/lib
LD_LIBRARY_PATH=$ORACLE_HOME/lib:$ORACLE_HOME/network/lib
ORACLE_TERM=hftc
PATH=$PATH:$ORACLE_HOME/bin:/bin:/usr/lbin:.
LINK_CNTRL=L_PTHREADS_D7
ORAENV_ASK=NO

export PATH JDK_HOME ORACLE_BASE ORACLE_HOME ORACLE_SID LIBPATH LD_LIBRARY_PATH ORACLE_TERM LINK_CNTRL ORAENV_ASK

if [ -s "$MAIL" ]          # This is at Shell startup.  In normal
then echo "$MAILMSG"       # operation, the Shell checks
fi                         # periodically.

. /usr/lbin/oraenv

Make sure to add the environment variables after the existing definitions in the .profile file. Additionally, ensure that you export these variables after defining them. Execute the .profile file after saving it; this will activate all the environment variables that you have just defined. The environment variables required by Oracle Version 7 may be slightly different from the ones listed above. Please refer to the installation guide supplied with Oracle Version 7 for the correct variables.
4. Perform the following pre-installation tasks as root user
   • Ensure that the Oracle installation CD-ROM is in the CD-ROM drive. Create a directory, /cdrom, and mount the CD-ROM filesystem on it.
   • Add the following line to the root user's .profile file:

export ORACLE_OWNER=oracle
   • Go to the /cdrom/orainst directory and execute the rootpre.sh shell script. The rootpre.sh script installs the post/wait kernel extension and configures asynchronous I/O. If the kernel extension is already present on your system, rootpre.sh replaces the existing kernel extension and requests you to reboot the system. If asynchronous I/O is already configured, rootpre.sh will report this. No further action needs to be taken in this case.
5. Install the Oracle client software
   • Log in as oracle.
   • Go to the /cdrom/orainst directory and execute orainst.
   • Accept the default options on all screens except for the following:
     • In the Install Type selection screen, select Default Install.
     • Read the prerelease information for any additional version- or platform-specific pre- or post-installation tasks that need to be performed.
     • In the Installation Activity Choice screen, select Install, Upgrade, or De-Install Software.
     • In the Installation Options screen, select Install New Product - Do Not Create DB Objects.
     • In the Relink All Executables? screen, select Yes.
     • In the Software Asset Manager screen, select the following components:
       • AIX-Based System Specific Documentation 8.0.4.0
       • Net8 8.0.4.0.0
       • Net8 External Naming Adapters 8.0.4.0.0
       • Net8 Protocol Adapters 8.0.4.0.0
       • SQL*Plus 8.0.4.0.0
   • Select Install.
   • Wait for the message indicating successful completion of the installation process. Check the installation log for any warning or error messages. Correct any errors by referring to the Oracle product documentation.

   • Some of the above steps may be missing or slightly different for the Oracle Version 7 installation. Please refer to the installation guide supplied with Oracle for exact instructions.
   • Exit the installation program.
6. Perform post-installation tasks as root
   Go to the /oracle8/product/8.0.4/orainst directory and execute the root.sh script.

Some of the above tasks may be slightly different for Oracle Version 7. Please refer to the installation guide supplied with Oracle for exact instructions. You can now proceed to test this installation.

4.1.2 Connecting to the Oracle Server from the Oracle Client
The Oracle Database Server in our installation (please refer to Figure 13 on page 36) is an RS/6000 Model J50 running AIX Version 4.3.1. Its IP address is 9.1.150.74 and its name is azov.

In order to allow clients to communicate with it, the Oracle Server software uses a daemon called the TNS Listener (tnslsnr). Once this daemon is started, it "listens" for clients trying to connect to the database server, using a service called listener. The /etc/services file on your Oracle database server machine should have an entry for this service. The default port for the service is 1521, but because other software products on your machine might already be using that port number, the listener service may be configured on a different port. Note the exact port number that listener is using on your database server machine.

Here is the section from the /etc/services file on our database server machine (azov) where the listener is defined:

lotusnote    1352/tcp   # Lotus Note
lotusnote    1352/udp   # Lotus Note
listener     1521/tcp   # Oracle Net8 Listener
ingreslock   1524/tcp   # ingres
orasrv       1525/tcp   # oracle
orasrv       1525/udp   # oracle
prospero-np  1525/tcp   # prospero non-privileged
prospero-np  1525/udp   # prospero non-privileged
tlisrv       1527/tcp   # oracle
tlisrv       1527/udp   # oracle
coauthor     1529/tcp   # oracle
coauthor     1529/udp   # oracle
issd         1600/tcp
issd         1600/udp
nkd          1650/tcp
"/etc/services" The cursor is at line 553 of 706 --78%-- .

Before you start configuring your Oracle client to access the Oracle database server, you need to ensure that there is an entry for the database server machine in the /etc/hosts file of your data mining server machine.

For example, the /etc/hosts file on our data mining server machine, sky, has an entry for our Database Server machine, azov:

127.0.0.1    loopback localhost datajoiner   # loopback (lo0) name/address
9.1.150.209  sky
9.1.150.74   azov
~
"/etc/hosts" The cursor is at line 55 of 55 --100%-- .

You will also need to have the following information available before configuring access to the Oracle server: • Communication protocol being used to connect to the Oracle server • Hostname of the Oracle database server machine • Port number being used by the listener service on the Oracle database server machine • Instance SID of the Oracle database instance

In our case, these are as follows:
• TCP/IP
• azov
• 1521
• ora8

Now, log in as oracle on the data mining server machine. Change to the network/admin directory under the oracle user's home directory. Edit the tnsnames.ora file to create an alias for the remote Oracle database on the database server machine. The alias has to be appended to the tnsnames.ora file and should have the following format:

<alias> =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = <protocol>)(Host = <hostname>)(Port = <port>))
    (CONNECT_DATA = (SID = <SID>))
  )

In our environment, the tnsnames.ora file on sky looks like this:

azov =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(Host = azov)(Port = 1521))
    (CONNECT_DATA = (SID = ora8))
  )
~
~
"tnsnames.ora" 16 lines, 366 characters

Now, you can use SQL*Plus to submit a query against any Oracle table on the remote database on the database server machine.

On the data mining server machine, sky, you can execute the following:

$ sqlplus analyst/analyst@azov

SQL*Plus: Release 8.0.4.0.0 - Production on Tue Oct 6 12:22:10 1998

(c) Copyright 1997 Oracle Corporation. All rights reserved.

Connected to:
Oracle8 Release 8.0.4.0.0 - Production
PL/SQL Release 8.0.4.0.0 - Production

SQL> select * from customer where car_colour='purple' and siblings>5;

GENDER     AGE  SIBLINGS  INCOME  CAR_COL  P
------  ------  --------  ------  -------  -
male     27.06         6   95386  purple   2
male        32         9    9507  purple   2
female    31.2         8   21645  purple   6
female    37.6         7   46517  purple   7
male      63.3         7   54829  purple   4
female   11.08        10    2514  purple   7
male     27.04         6   44339  purple   5
male        28         6   91550  purple   7
female   24.02         6   10043  purple   4
male     31.05         6    5112  purple   5
male      45.8        14   99314  purple   3
male      32.6         6   36068  purple   5
female      49        11    8623  purple   6
female   28.05        11   15908  purple   1
female   20.04        10     226  purple   2
female   17.09         6    1703  purple   1
female    50.5        11   29786  purple   7
male      32.8         6   71465  purple   2
male        51         9   71388  purple   8

19 rows selected.

SQL>

You are now ready to install DataJoiner on the data mining server.

4.1.3 Install DataJoiner
In this section we walk you through the installation process for DataJoiner Version 2.1.1 on AIX. Before you begin, ensure that you have the DataJoiner distribution media in the appropriate drive.

Execute the following SMIT fastpath:

# smitty install_latest

Select the media device for installation (the CD-ROM drive in our example).

Install and Update from LATEST Available Software

Type or select a value for the entry field. Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* INPUT device / directory for software            [/dev/cd0]          +

From the next screen, press the F4 function key against the SOFTWARE to Install field to select the components to be installed.

Install and Update from LATEST Available Software

Type or select values in entry fields. Press Enter AFTER making all desired changes.

                                                   [Entry Fields]
* INPUT device / directory for software            .
* SOFTWARE to install                              [_all_latest]       +
  PREVIEW only? (install operation will NOT occur) no                  +
  COMMIT software updates?                         yes                 +

The DataJoiner components we installed in order to access Oracle tables from IM for Data over TCP/IP are:
• DB2 Client Application Enabler
• DB2 Command Line Processor
• DataJoiner Code Page Conversions
• DataJoiner Communications Support - Base with TCP/IP
• DataJoiner Executables
• DataJoiner Utilities and Samples

Additionally, if you want to install the documentation, select the DataJoiner Product Library - INF - En_US and DataJoiner Product Library - Postscript - En_US file sets.

Other file sets that can be optionally selected are:
• DataJoiner Database Director
• DataJoiner Visual Explain
• DataJoiner Performance Monitor
• Replication Apply

Ensure that the COMMIT software updates? field is set to yes and hit Enter to continue.

Confirm successful installation in the Command Status screen and exit SMIT.

Using the lslpp command, we can see the DataJoiner components installed on our data mining server, sky:

# lslpp -l djx*
  Fileset                          Level    State      Description
  --------------------------------------------------------------------------
Path: /usr/lib/objrepos
  djx_02_01_01.client              2.1.2.2  COMMITTED  DB2 Client Application Enabler
  djx_02_01_01.clp                 2.1.2.2  COMMITTED  DB2 Command Line Processor
  djx_02_01_01.conv                2.1.2.2  COMMITTED  DataJoiner Code Page Conversions
  djx_02_01_01.cs.rte              2.1.2.2  COMMITTED  DataJoiner Communications Support - Base with TCPIP
  djx_02_01_01.db2.misc            2.1.2.2  COMMITTED  DataJoiner Utilities and Samples
  djx_02_01_01.db2.rte             2.1.2.2  COMMITTED  DataJoiner Executables
  djx_02_01_01.doc.En_US.pscript   2.1.2.2  COMMITTED  DataJoiner Product Library - Postscript - En_US
#

Check the Web site http://www.software.ibm.com/data/datajoiner/support.html for the latest patches (Fixpaks) for DataJoiner. At the time of writing, the latest Fixpak for DataJoiner Version 2.1 on AIX is Fixpak 1. The name of the Fixpak file is djx_02_01_01.U459154, and it is available at the Internet FTP site service.boulder.ibm.com/aix/fixes/v4/other.

4.2 DataJoiner Configuration on AIX
In this section we describe the steps required to configure the software.

4.2.1 Configuring DataJoiner to Access Oracle
As mentioned in Chapter 3, "System Architecture for Mining Oracle Databases," DataJoiner allows users and applications to transparently communicate with multiple and diverse data sources just as if they were communicating with a DB2 database. To do this, DataJoiner libraries must be link-edited with data source client libraries.

DataJoiner provides a script called djxlink.sh in the /usr/lpp/djx_02_01_01/lib directory to do this. The djxlink.sh script builds an executable file called data access module (DAM) through which DataJoiner communicates with data sources.

Each time a particular data source is accessed, DataJoiner loads the DAM for this data source, performs data transfer, and unloads the DAM. In order to reduce the system overhead of loading and unloading DAMs every time data needs to be transferred, you can set up DataJoiner to load all required DAMs at initialization. This is done by setting an environment variable called DJXCOMM (see 4.2.2, “Creating a DataJoiner Instance” on page 52).

4.2.1.1 Environment Variables
In our case, we need to link-edit the DataJoiner libraries with the Oracle client libraries. Before you execute the djxlink.sh script to do this, you need to set a couple of environment variables.

Log in as root and add the following lines to the .profile file:

LIBPATH=/usr/lpp/djx_02_01_01/lib
ORACLE_HOME=/oracle8/product/8.0.4
export LIBPATH ORACLE_HOME

Execute the .profile file:

# . ./.profile
#

4.2.1.2 The djxlink.sh Script
The djxlink.sh script has link-edit subroutines for creating data access modules (DAMs) for most of the data sources that DataJoiner accesses. Oracle Version 8 uses Net8 for client access. The subroutine for net8 in the djxlink.sh file looks like this:

# ----------------------------------------------------
# Subroutine to handle net8 DAM link-edit
# ----------------------------------------------------
net8()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build net8 Data Access Module";
echo "Attempting to build net8 Data Access Module" >> /tmp/djxlink;
/bin/rm net8 >>/dev/null;
/bin/ld -o net8 \
  -L${ORACLE_HOME}/lib/ -L${ORACLE_HOME}/rdbms/lib \
  ${ORACLE_HOME}/rdbms/lib/defopt.o ${ORACLE_HOME}/rdbms/lib/ssdbaed.o \
  -e _no_start \
  -bE:oracle.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  ${ORACLE_HOME}/lib/nautab.o ${ORACLE_HOME}/lib/naeet.o \
  ${ORACLE_HOME}/lib/naect.o ${ORACLE_HOME}/lib/naedhs.o \
  ${ORACLE_HOME}/rdbms/lib/xaondy.o \
  -lnetv2 -lnttcp -lnetwork -lncr -lnetv2 -lnttcp -lnetwork -lclient -lcommon \
  -lgeneric -lmm -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lnetv2 -lnttcp \
  -lnetwork -lncr -lnetv2 -lnttcp -lnetwork -lclient -lcommon -lgeneric \
  -lepc -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lclient -lcommon \
  -lgeneric -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lld -lm \
  /lib/crt0_r.o -lc_r -lpthreads -lodm -lm \
  liboracle8.a \
  libdjexits.a \
  -lclntsh >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;

if [[ $rc -ne 0 ]]; then echo " Was not able to find link-edit parameters for net8 that would work"; echo " Was not able to find link-edit parameters for net8 that would work" >> /tmp/djxlink; results=${results}'net8\t\tFailure\n' return 12; fi echo " Successfully built net8 Data Access Module"; echo " Successfully built net8 Data Access Module" >> /tmp/djxlink; /bin/chmod uog+r net8 >> /tmp/djxlink;\ results=${results}'net8\t\tSuccess\n' return 0; }

If you are using Oracle Version 7, a number of subroutines are provided in the djxlink.sh file for creating DAMs for several different releases of Oracle Version 7 and sqlnet.

These subroutines are called by a main-line function:

#
# Link each Data Access Module (DAM)
#
# Link edit mssql DAM
mssql;
# Link edit dblib DAM
dblib;
# Link edit ctlib DAM
ctlib;
# Link edit eda DAM
eda;
# Link edit informix DAM
informix;
# Link edit informix7 DAM
informix7;
# Link edit sqlnet DAM
sqlnet;
# Link edit net8 DAM
net8;
# Link edit xaccess DAM
xaccess;

A complete listing of the djxlink.sh file is provided in Appendix A, “Scripts for Creation of Data Access Modules” on page 163.

4.2.1.3 The Oracle net8 Data Access Module
As root user, execute the djxlink.sh script from the /usr/lpp/djx_02_01_01/lib directory.

# cd /usr/lpp/djx_02_01_01/lib
# ./djxlink.sh

You should see the following output on your screen:

Starting djxlink.sh on sky at Wed Oct 7 12:10:04 PDT 1998

Warning: Cannot prepare djxmssql Data Access Module for use
Problem: DJX_ODBC_LIBRARY_PATH environment variable not set
Corrective action: Please set DJX_ODBC_LIBRARY_PATH environment variable
to the path where your ODBC driver is installed

Attempting to build dblib Data Access Module

Warning: Cannot prepare dblib Data Access Module for use
Problem: SYBASE environment variable not set
Corrective action: Please set SYBASE environment variable to the path
where your Open-Client is installed

Attempting to build ctlib Data Access Module
Warning: Cannot prepare ctlib Data Access Module for use.
Problem: SYBASE environment variable not set
Corrective action: Please set SYBASE environment variable to the path
where Open-Client is installed and re-run djxlink.sh
Attempting to build eda Data Access Module
Warning: Cannot prepare eda Data Access Module for use.
Problem: EDA_HOME environment variable not set
Corrective action: Please set EDA_HOME environment variable to the path
where edalink is installed and re-run djxlink.sh
Attempting to build informix Data Access Module
Warning: Cannot prepare informix Data Access Module for use.
Problem: INFORMIX_HOME environment variable not set
Corrective action: Please set INFORMIX_HOME environment variable to the path
where Informix*Net is installed and re-run djxlink.sh
Attempting to build informix7 Data Access Module
Warning: Cannot prepare informix Data Access Module for use.
Problem: INFORMIX_HOME environment variable not set
Corrective action: Please set INFORMIX_HOME environment variable to the path
where Informix*Net is installed and re-run djxlink.sh
Attempting to build sqlnet Data Access Module
rm: sqlnet: A file or directory in the path name does not exist.
Trying variation 1
Trying variation 2
Trying variation 3
Trying variation 4
Trying variation 5
Trying variation 6
Trying variation 7
 Was not able to find link-edit parameters for sqlnet that would work
Attempting to build net8 Data Access Module
rm: net8: A file or directory in the path name does not exist.
 Successfully built net8 Data Access Module
Attempting to build xaccess Data Access Module
rm: xaccess: A file or directory in the path name does not exist.
 Was not able to find link-edit parameters for xaccess that would work

After attempting to create DAMs for all the data sources, djxlink.sh reports a summary of the DAMs it tried to construct and the result for each:

Results of djxlink
------------------
djxmssql        Failure
dblib           Failure
ctlib           Failure
eda             Failure
informix        Failure
informix7       Failure
sqlnet          Failure
net8            Success
xaccess         Failure

Finished djxlink.sh on sky at Wed Oct 7 12:10:14 PDT 1998

If djxlink.sh reports failure in constructing a DAM for net8 (or in the case of Oracle version 7, sqlnet), check the djxlink file in the /tmp directory. This is the execution log file for djxlink.sh. Correct the problem listed in the log file and execute the djxlink.sh script again.

Some of the reasons for which DAM creation may fail are:
• DataJoiner has not been updated with the most recent patches.
• Environment variables are not set properly.
• Library levels are different from those supported by djxlink.sh.

The solutions to the first two problems have already been covered in this chapter. If the third problem occurs, you will need to edit a makefile called djxlink.makefile which is in the /usr/lpp/djx_02_01_01/lib directory to change the library names and paths. You will then need to run the makefile to link all needed data source types. As in the djxlink.sh file, there are separate sections in the djxlink.makefile for each data source type.

A complete listing of the djxlink.makefile is provided in Appendix A, “Scripts for Creation of Data Access Modules” on page 163.

4.2.2 Creating a DataJoiner Instance
The next step in the DataJoiner configuration is to create a DataJoiner instance. This is a normal DB2 instance with a DB2 database to which users and applications can connect. The only difference here is that the data does not reside in DB2. All SQL statements are converted and forwarded by DataJoiner to one or more non-DB2 databases, which process these statements and return the output to users. To the end user, however, it appears as if the data is stored in and supplied by one single DB2 database.

To create the DataJoiner instance, you will need to do the following:

1. Create an administrative group for DataJoiner called djadm.

# mkgroup -a djadm
#

2. Create a user called djinst1 in the djadm group. In our setup, the home directory is specified as /data/djinst1. You can specify any directory as long as there is enough space in the filesystem to create the database. Typically, you should not need more than 30 MB of free space in this filesystem.

# mkuser pgrp='djadm' home='/data/djinst1' djinst1
#

3. Assign a password to the djinst1 user, using the passwd djinst1 command.
4. Change the ownership of the /data/djinst1 directory to djinst1.

# cd /data
# chown -R djinst1.djadm djinst1
#

5. Create a DataJoiner instance with djinst1 as the instance administrator.

# cd /usr/lpp/djx_02_01_01/instance
# ./db2icrt djinst1
#

6. Log in as djinst1 and add the following lines to the .profile file:

   . $HOME/sqllib/db2profile
   ORACLE_HOME=/oracle8/product/8.0.4
   ORACLE_BASE=/oracle8
   TNS_ADMIN=$ORACLE_HOME/network/admin
   export ORACLE_HOME ORACLE_BASE TNS_ADMIN

The first statement executes the db2profile file which sets all the environment variables required by DataJoiner. Net8 requires the ORACLE_HOME environment variable to be set prior to starting the DataJoiner instance. The other two environment variables, ORACLE_BASE and TNS_ADMIN, are optional and are used to locate the tnsnames.ora file.

Edit the db2profile file in the sqllib subdirectory. Ensure that the DB2COMM environment variable is set to TCPIP and that net8 (sqlnet if you are using Oracle Version 7) is included in the DJXCOMM variable.

Here are the relevant sections from the db2profile file in our environment:

#-----------------------------------------------------------------------
# Set the DB2COMM environment variable.
# DB2COMM [Not set by default, values: TCPIP APPC IPXSPX]
#   Specifies which communication protocol(s) will be enabled
#   at the server when the database manager is started.
#   Any combination of TCPIP, APPC and IPXSPX separated by commas is valid.
#   If this variable is undefined or is unset then no communication
#   protocols are started.
#-----------------------------------------------------------------------
DB2COMM=TCPIP
export DB2COMM

#-----------------------------------------------------------------------
# DataJoiner data access module Environment Variables.
# The environment variables that are related to data access modules are:
#  - DJXCOMM  This variable indicates which data access modules are
#             loaded at start time (as opposed to those loaded on demand)
#  - DJX_NR_CONFIG - for Classic Connect data sources
#  - DJX_NR_START  - for Classic Connect data sources
#-----------------------------------------------------------------------
DJXCOMM='db2ra drdaIP net8'
export DJXCOMM

7. Log out and log in again as djinst1. Start the DataJoiner instance, using the db2start command.

$ db2start
SQL1063N DB2START processing was successful.
$

4.2.3 Configuring DataJoiner to Accept DB2 Client Connections
You now need to configure DataJoiner to allow remote users and applications to communicate with the DB2 instance you have just created.

Log in as root and edit the /etc/services file. Add an entry for the DataJoiner listener service. You need to provide a service name and a port number for the service. There is no restriction on the name and port number as long as they are unique. In our environment, we used the name djlstn and port 70000. Here is the relevant section from the /etc/services file on sky:

x_st_mgrd          9000/tcp   # IBM X Terminal
man                9535/tcp
man                9535/udp
isode-dua         17007/tcp
isode-dua         17007/udp
dtspc              6112/tcp
ipsec_sk_master    1011/udp
ipsec_sk_engine_s  4001/udp
djlstn            70000/tcp   # Data Joiner Listener Port
~
~
~
~
~
"/etc/services" [Read only] The cursor is at line 675 of 697 --96%-- .

Now you need to update the DB2 database manager configuration to use the service that you have just created.

$ db2 update dbm cfg using svcename djlstn
DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.
DB21025I Client changes will not be effective until the next time the
application is started. Server changes will not be effective until the next
DB2START command.

$

Stop and restart (recycle) DataJoiner for the new configuration to be effective.

$ db2stop
SQL1064N DB2STOP processing was successful.
$ db2start
SQL1063N DB2START processing was successful.
$

4.2.4 Configuring DataJoiner Nicknames
DataJoiner uses nicknames to allow users and applications to map a two- or three-part table or view name (such as Server.Remote_AuthID.Tablename) to a data source. The nickname can subsequently be used in an SQL statement whenever the remote table or view is referenced.

There are several tasks that need to be performed before you can create nicknames:
1. Create a DB2 database
   • Log in as djinst1 and start DB2.

$ db2start
SQL1063N DB2START processing was successful.
$

• Create a database called oradb.

$ db2 create database oradb
DB20000I The CREATE DATABASE command completed successfully.
$

• Change the DB2DBDFT environment variable to oradb in the db2profile file in the $HOME/sqllib directory. This will make the oradb database that you just created the default database for this instance. Here is the relevant section from our db2profile file:

#-----------------------------------------------------------------------
# DB2DBDFT [Default= SAMPLE]
#   is set to the database alias name of the database that will
#   be implicitly connected to when applications are started.
#-----------------------------------------------------------------------
DB2DBDFT=ORADB
export DB2DBDFT

• Recycle the DataJoiner instance.

$ db2stop
SQL1064N DB2STOP processing was successful.
$ db2start
SQL1063N DB2START processing was successful.
$

2. Create a mapping for the Oracle database server
   In this step, you define to DataJoiner the Oracle server you want to access. The syntax of the DDL is:
   db2 create server mapping from <SERVER NAME> to node <NODE NAME> \
      type <DATABASE SERVER TYPE> version <VERSION NUMBER> protocol "<PROTOCOL>"
   The SERVER NAME should be unique. The NODE NAME should be the alias that you configured in your tnsnames.ora file for the Oracle database server (refer to 4.1.2, "Connecting to the Oracle Server from the Oracle Client" on page 42). In this case, the DATABASE SERVER TYPE is oracle, the VERSION NUMBER is 8.0, and the PROTOCOL is net8. If you are using Oracle Version 7, your version number would be different, and the PROTOCOL would be sqlnet. You need to connect to the database you have created before creating the server mapping. This is the command we used in our test environment:

$ db2 connect to oradb

Database Connection Information

Database product        = DB2/6000 2.1.2
SQL authorization ID    = DJINST2
Local database alias    = ORADB

$ db2 create server mapping from djora8 to node azov type oracle version 8.0 \
> protocol \"net8\"
DB20000I The SQL command completed successfully.
$

Please note the back-slash (\) before the quotes (") surrounding the protocol name. If you do not use the back-slash, the shell will drop the quotes before processing the statement, and DataJoiner will return an error. If you execute this DDL from the DB2 Command Line Processor, you will not need to use the back-slashes.

$ db2
(c) Copyright IBM Corporation 1993,1995
Command Line Processor for DB2 SDK 2.1.2
. . .
For more detailed help, refer to the Online Reference Manual.

db2 => create server mapping from djora8 to node azov type oracle \
db2 (cont.) => version 8.0 protocol "net8"
DB20000I The SQL command completed successfully.
db2 =>

This DDL creates an entry in the SYSCAT.SERVERS table in DataJoiner.

Server   Node   Dbname   Server_Type   Server_Version   Server_Protocol  . . .
------   ----   ------   -----------   --------------   ---------------
DJORA8   AZOV            ORACLE        8.0              NET8

Once you have defined the data source server mapping, you can use additional DataJoiner DDL statements to refine access to these data sources.

3. Create a mapping for one or more Oracle database users
   One of the DDLs that you can use maps a local DB2 user to an Oracle user. The Oracle user should have been created at the Oracle database server using the create user command with the identified by clause rather than the identified externally clause. The syntax for this DDL is:
   db2 create user mapping from <LOCAL USERID> to server <SERVER NAME> \
      authid <REMOTE AUTHID> password <REMOTE PASSWORD>
   We created a DB2 user called dminer in our environment. We mapped this user to the analyst user on Oracle with a password of analyst.

$ db2 create user mapping from dminer to server djora8 authid analyst password analyst
DB20000I The SQL command completed successfully.
$

You can now create a nickname for an Oracle table owned by an Oracle user, using the following syntax:
create nickname <NICKNAME> for <SERVER NAME>.<REMOTE AUTHID>.<TABLE NAME>
In our environment, we created a nickname, customer, for an Oracle table, customer, owned by the Oracle user analyst. The DB2 user that maps to the analyst Oracle user is dminer, so we need to log in as dminer before executing the create nickname command. Please ensure that the LIBPATH environment variable is set to /usr/lpp/djx_02_01_01/lib and that the /data/djinst1/sqllib/db2profile file is executed in the dminer user's .profile file.

$ db2 create nickname customer for djora8.analyst.customer
DB20000I The SQL command completed successfully.
$

This nickname will now appear to the dminer DB2 user on DataJoiner as just another table.
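Because the nickname behaves like an ordinary table, any DB2 SQL can be run against it. For instance, this aggregate query (a sketch, not part of the original test) would be processed by DataJoiner, which pushes work down to Oracle where possible:

db2 => select car_colour, count(*), avg(income) from customer group by car_colour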

4.3 Test Access on AIX
You should now be able to test your environment. Log in as the DB2 user for whom you have created the user mapping. Connect to the DB2 database you have defined in DataJoiner and submit an SQL statement against the nickname you have created.

In our scenario, the local DB2 user dminer can now connect to the DB2 database oradb and submit an SQL statement against the nickname customer. This is the same query that we document in 3.1, "The Oracle Database Environment" on page 29.

$ db2
(c) Copyright IBM Corporation 1993,1995
Command Line Processor for DB2 SDK 2.1.2
. . .
For more detailed help, refer to the Online Reference Manual.

db2 => connect to oradb

Database Connection Information

Database product        = DB2/6000 2.1.2
SQL authorization ID    = DMINER
Local database alias    = ORADB

db2 => select * from customer where car_colour='purple' and siblings>5

GENDER     AGE  SIBLINGS  INCOME  CAR_COLOUR  PRODUCT
------  ------  --------  ------  ----------  -------
male     27.06         6   95386  purple      2
male     32.00         9    9507  purple      2
female   31.20         8   21645  purple      6
female   37.60         7   46517  purple      7
male     63.30         7   54829  purple      4
female   11.08        10    2514  purple      7
male     27.04         6   44339  purple      5
male     28.00         6   91550  purple      7
female   24.02         6   10043  purple      4
male     31.05         6    5112  purple      5
male     45.80        14   99314  purple      3
male     32.60         6   36068  purple      5
female   49.00        11    8623  purple      6
female   28.05        11   15908  purple      1
female   20.04        10     226  purple      2
female   17.09         6    1703  purple      1
female   50.50        11   29786  purple      7
male     32.80         6   71465  purple      2
male     51.00         9   71388  purple      8

19 record(s) selected.

db2 =>

This completes the task of configuring DataJoiner to access Oracle. You can now use IM to run data mining tasks on your Oracle tables.

4.4 Product Installation on Windows NT
In this section we describe the steps required to install and configure DataJoiner and Oracle in the Windows NT environment. In the example shown, the data mining server is a Pentium PC running Windows NT 4.0. Its IP address is 9.1.150.155 and its name is candemas.

4.4.1 Install Oracle Client
To install the Oracle8 Client software on the Windows NT data mining server:
1. Run the Setup program on the Oracle client installation CD and click OK on the Oracle Installation Settings dialog (Figure 15).

Figure 15. Oracle Installation Settings

2. On the Installation Options dialog choose Oracle8 Client and click OK (Figure 16).

Figure 16. Installation Options

3. On the Client Configuration dialog, select Application User and click OK (Figure 17).

Figure 17. Client Configuration

When the installation is complete, exit the installer (Figure 18).

Figure 18. Installation Completed

4.4.2 Connecting to the Oracle Server from the Oracle Client
Before connecting to the Oracle Database Server, ensure there is an entry for the server machine in the \WinNT\System32\Drivers\Etc\hosts file on your NT machine. This entry has the IP address and name of the machine. In the ITSO environment this was: 9.1.150.74 AZOV.

To test the client connection:
1. Select Start -> Programs -> Oracle for Windows NT -> SQL Plus 8.0.
2. On the Log On dialog (Figure 19 on page 62), enter the following parameters and click OK:

Figure 19. SQL*Plus Logon

3. To test the connection, enter the following at the SQL> prompt of the Oracle SQL*Plus window (Figure 20):
   desc customer

Figure 20. SQL*Plus Window

4. To leave SQL*Plus type quit at the prompt.

4.4.3 Install DataJoiner
In this section we guide you through the DataJoiner for Windows NT installation process.
1. Create a DataJoiner Administrator ID through the NT User Manager.
   • Select Start -> Programs -> Administrative Tools (Common) -> User Manager (Figure 21).

Figure 21. NT User Manager

• Select User -> New User from the User Manager window and enter the ID and description fields (Figure 22).

Figure 22. New User

• Click the Groups button and add the DataJoiner Administrator to the Administrators group (Figure 23).

Figure 23. Group Memberships

• OK the Group Memberships and New User dialogs.
• Close the User Manager window (Figure 24 on page 64) by selecting User -> Exit, then log on to Windows NT using the ID you created, by selecting Start -> Shutdown (with Close all programs and log on as a different user?).

Figure 24. NT User Manager: DataJoiner Administrator

2. Run the Setup program on the DataJoiner installation CD and select the options you require.
   • On the DataJoiner for Windows NT Installation window (Figure 25), click the Next > button to proceed with the installation. Depending on the installation options selected, you will be prompted for the installation directory, for optional components (for example, Classic Connect data mapper for IMS and VSAM, Replication Administration, and DB2 Spatial Extender), and finally for the DataJoiner NLS Options. For the installation type, select Typical and click the Install > button.

Figure 25. DataJoiner for Window NT Installation

• When the Setup is complete, click Finish to restart your computer.
• After Windows NT has restarted, ensure that the NT services file (\WinNT\System32\Drivers\Etc\Services) includes the following DataJoiner services:
  djlstn 80000/tcp
  djintr 80001/tcp
  These should be configured with unique port numbers for your environment. If you want to check for the latest fix available, go to:
  ftp://service.boulder.ibm.com/ps/products/datajoiner/fixes/

3. If you want to start the DB2 Security Server automatically, follow these steps:
   • Select Start -> Settings -> Control Panel -> Services.
   • Select the DB2 Security Server and click the Startup button (Figure 26).

Figure 26. Windows NT Services

• Select Startup Type=Automatic (Figure 27), OK the Service dialog, click the Start button in the Services window, and then click Close.

Figure 27. Service Startup Options

4.5 DataJoiner Configuration on Windows NT
In this section we describe the steps required to configure this software.

4.5.1 Configuring DataJoiner to Access Oracle
DataJoiner on Windows NT allows users and applications to access remote or local Oracle data as if it were a local DB2 table. For Oracle8 databases it does this by routing SQL through the net8 DAM. To configure this connection:
1. Create a DataJoiner DB2 instance by selecting Start -> Programs -> DataJoiner for Windows NT -> Command Window and submitting the db2icrt djinst1 command.
2. Start the DataJoiner instance with the db2start command.
3. Create a database: db2 create database oradb.
4. Connect to the database with the db2 connect to oradb command.
5. Create a mapping (Figure 28 on page 67) for the Oracle database server by typing:
   db2 create server mapping from djora8 to node azov type oracle version 8.0 protocol "net8"

Figure 28. Oracle Server Mapping

6. Map the client user to the database server's remote user:
   db2 create user mapping from imres2 to server djora8 authid analyst password analyst

7. You can now define a local nickname for the Oracle tables you want to use from Windows NT with this command:
   db2 create nickname customer for djora8.analyst.customer

4.6 Test Access on Windows NT
To test the configuration, submit the SELECT statement shown in Figure 29 from the DB2 prompt of the DB2 command line processor.

Figure 29. Testing the Oracle Connection
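Figure 29 is a screen capture; presumably the statement is the same test query used on AIX in 4.3:

db2 => select * from customer where car_colour='purple' and siblings>5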

The output should look like that illustrated in Figure 30.

Figure 30. Test SELECT Results

With the expected results, your configuration is complete. To stop the DataJoiner instance, type:
connect reset
db2stop
at the DB2 prompt, then quit and exit to close the command window.

Chapter 5. Fast Extract Utility

The IM Fast Extract utility allows you to read data out of Oracle and Sybase tables on AIX, and Oracle tables on HP-UX, and write it into flat files.

In this chapter we discuss the most relevant scenario: transferring data out of Oracle on a remote AIX machine into a flat file on the IM machine. In our scenario, this translates to transferring data out of an Oracle table on azov into a flat file on sky.

There are two components in the Fast Extract utility for Oracle:
• An executable called oracollect, which is a client program that runs on the data mining server
• An executable called idmpldor, which is a server program that runs on the Oracle database server machine and actually does the job of data collection

Whenever the oracollect utility is called, it in turn invokes the idmpldor utility on the database server machine, which collects the data and passes it back to the oracollect utility, which then writes the data into a flat file.

A message log file called oracollect.log is created in the directory on the data mining server machine where the oracollect utility is invoked.

5.1 Installation

Installing the Fast Extract utility for Oracle involves transferring the idmpldor executable program from the /usr/lpp/IMiner/bin directory of the data mining server machine to the Oracle database server machine and changing its mode to make it an executable.

In our case, we logged in to sky as root, changed the active directory to /usr/lpp/IMiner/bin, FTPed to azov, and wrote the idmpldor file in the /usr/bin directory.

# cd /usr/lpp/IMiner/bin
# ftp azov
Connected to azov.
220 azov FTP server (Version 4.1 Tue Mar 17 14:00:13 CST 1998) ready.
Name (azov:root): root
331 Password required for root.
Password:
230 User root logged in.
ftp> cd /usr/bin
250 CWD command successful.
ftp> bin
200 Type set to I.
ftp> put idmpldor
200 PORT command successful.
150 Opening data connection for idmpldor.
226 Transfer complete.
1149572 bytes sent in 3.549 seconds (316.3 Kbytes/s)
local: idmpldor remote: idmpldor
ftp> bye
#

We then logged in as root on azov, changed the active directory to /usr/bin, and changed the idmpldor file to executable mode.

root@azov > cd /usr/bin
root@azov > chmod +x idmpldor
root@azov >

5.2 Use

Before you execute the oracollect utility, you must note the following information:
• Hostname of the Oracle database server machine
• AIX user ID for connecting to Oracle
• Password for this user ID
• Fully qualified pathname for the idmpldor command
• Oracle user ID
• Password for the Oracle user ID
• SID of the Oracle database
• Oracle home directory (this is the same as the ORACLE_HOME environment variable)

• Name of the file containing the SQL query to be used for data extraction
• Name of the flat file where the data has to be written

Use the following syntax to extract data out of Oracle into a flat file:
oracollect -H host-machinename -U host-username -P host-password \
-M path-of-idmpldor -u oracle-login-name -p oracle-login-password \
-D oracle-SID -W oracle-home-directory -q query-file -f local-filename

Before you execute the above command you have to create a file containing the SQL query (the query-file of the command syntax described above) that is to be executed.

A very simple query file (getdata.sql) looks like this:

$ cat getdata.sql
select * from customer
$

Note that there is no semi-colon (;) at the end of the statement.

The oracollect command that we executed was:

$ /usr/lpp/IMiner/bin/oracollect -H azov -U oracle -P oracle \
> -M "/usr/bin/idmpldor" -u analyst -p analyst -D ora8 -W "/oracle8/product/8.0.4" \
> -q getdata.sql -f customer.txt
(C) Copyright IBM Corporation 1996, 1998
Local file is customer.txt
Total 2048 rows processed
$

This creates a customer.txt file in the local directory that looks like this:

$ head customer.txt
"female",27.06,1,6611,"red","1",
"female",49.3,2,7652,"green","2",
"male",16.08,3,1778,"red","3",
"male",14.04,0,567,"red","4",
"female",24.06,2,21334,"green","5",
"male",.4,4,50,"blue","6",
"male",30,2,60346,"blue","7",
"female",53.1,0,33291,"blue","8",
"male",33.2,0,53993,"blue","1",
"female",31.6,1,982,"green","2",
$

Before the above file can be used by IM for data mining, the double-quote characters need to be removed and the fields need to be padded with blanks so that each record is of the same length. This can be done with any text processing tool, such as awk or sed.
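As an illustration only, the following one-line awk script strips the quotes and pads the six fields of customer.txt to fixed widths (the widths chosen here are arbitrary assumptions, not values required by IM):

$ # strip double quotes, then print each field padded to a fixed width
$ awk -F',' '{ gsub(/"/, ""); printf "%-8s%8s%4s%10s%-8s%4s\n", $1, $2, $3, $4, $5, $6 }' \
>   customer.txt > customer.dat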

Chapter 6. Sample Mining of Oracle Data

This chapter begins with the task of installing IM. We then walk you through a sample data mining exercise, using data stored in an Oracle database that is accessed through DataJoiner.

6.1 Intelligent Miner for Data Installation on AIX

We installed the following two IM components:
• Intelligent Miner for Data server - This is the processing engine which runs the mining, data processing, and statistical algorithms.
• Intelligent Miner for Data client - This is the Java interface that is used to set up mining runs.

The IM server software can be installed on an RS/6000, a System/390, an AS/400, or a PC running under Windows NT. The IM client software can be installed on an RS/6000, a PC running OS/2, or a PC running 32-bit Windows. In this section, we consider an RS/6000 server and a Windows NT client.

The IM server software can be installed on an RS/6000 machine using SMIT. The recommended memory is 512 MB or more. The minimum diskspace required for installing and demonstrating the product is 100 MB. However, we recommend that in production environments, two times the amount of raw data be available as free diskspace.

The version of AIX should be 4.1.5 or higher. IM also requires DB2 to be installed on the server machine. The supported versions of DB2 on AIX are:
• IBM DB2 for AIX Version 2.1.1 (and therefore DataJoiner)
• IBM DB2 Parallel Edition Version 1.2
• IBM DB2 UDB Enterprise Edition Version 5.0 or higher
• IBM DB2 UDB Extended Enterprise Edition Version 5.0 or higher

DB2 must be installed on the AIX server before IM is installed. If you attempt to install DB2 on a machine on which IM is already installed, the DB2 installation might fail. A DB2 UDB Enterprise Edition Version 5 CD-ROM is included with the IM product package. This CD-ROM contains installation notes which you must follow to install DB2.

In order to install the IM server on AIX, select the following software packages from the installation CD-ROM, using SMIT:

• IMiner.server
• IMiner.common

If you want to develop data mining applications of your own that use the IM engine, also install the IMiner.toolkit package.

The installation directory for IM is /usr/lpp/IMiner.

After the SMIT installation task completes, you need to make the IM executables available to users. This can be done in either of two ways:
• Execute the /usr/lpp/IMiner/bin/imln script. This creates links in the /usr/bin directory to the IM executable files in the /usr/lpp/IMiner/bin directory.
• Add the /usr/lpp/IMiner/bin path to the PATH environment variable in a user's .profile file, as sketched below.
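A minimal .profile entry for the second approach might look like this (assuming the default installation directory):

export PATH=/usr/lpp/IMiner/bin:$PATH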

Also ensure that the .profile file for the root user has the following entries:
export LIBPATH=/usr/lpp/djx_02_01_01/lib
. /data/djinst1/sqllib/db2profile

The IM client can be installed on a Windows NT system by executing the SETUP.EXE program in the \WIN32\xx directory of the IM V 2.1.2 Client CD-ROM, where xx identifies the language for your installation. For English, you would use the \WIN32\EN directory. The recommended memory is 64 MB, and the recommended diskspace is 70 MB or more.

The client installation program also installs the required Java Runtime Environment (JRE). If you want to view the online help, you have to install a Web browser capable of displaying frames, such as Microsoft Internet Explorer Version 3.0 (or higher) or Netscape Navigator for Windows 95/NT Version 3.01 (or higher).

Please refer to the Using the Intelligent Miner for Data manual supplied with the IM product for more detailed instructions on installation and setup. Also, please read the README.TXT file in the server and client product installation CD-ROMs for additional information before you install the product.

6.2 The Mining Scenario

We implemented a hypothetical scenario to illustrate the process of data mining on an Oracle database.

In Chapter 3, "System Architecture for Mining Oracle Databases", we describe a table in our Oracle database called CUSTOMER. This table contains demographic information about a bank's customers such as age, gender, income, number of siblings, and the color of the car they drive. It also contains information about the kind of service they have purchased from the bank.

The bank knows that its most profitable service is the Premier Checking Account. It wants to use data mining to determine the characteristics of its Premier Checking Account customers and to use this information to sell this service to its existing customers who do not have a Premier Checking Account.

In the CUSTOMER table, the kind of service purchased by a customer is recorded in the PRODUCT column. A value of 1 in this column indicates Premier Checking Account.
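To get a feel for the data before mining, you can check how customers are distributed across the services with an ordinary query against the DataJoiner nickname (a sketch, assuming the customer nickname created in Chapter 4):

db2 "select product, count(*) from customer group by product"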

6.3 The Mining Solution

The mining process to achieve the goal stated in the previous section consists of the following steps:
1. In order to understand the characteristics of Premier Checking Account customers, we perform demographic clustering on all customer records in the CUSTOMER table for which the value in the PRODUCT field is 1. The output of this step is a model containing a number of clusters which are characterized by the distribution of all the fields in the CUSTOMER table except for the PRODUCT field.
2. We apply the model generated in step 1 to the records in the CUSTOMER table for which the value in the PRODUCT field is not 1. These are the customers who do not have a Premier Checking Account with the bank. Based on the values of its fields, a particular customer record is placed in one of the clusters created in step 1. The output of this step is a table called SCORED containing the original records with three additional fields:
• CLUSTERID - The cluster to which a record belongs
• SCORE - The strength of a record's fit in the cluster to which it has been assigned, on a scale of 0 to 1
• CONF - The confidence level of an assignment
3. We select those records in the SCORED table for which the value of the SCORE field is greater than 0.7 and store them in a table called TARGET. We also generate univariate statistics for this group of records to better understand the distribution of values in each of the fields. The TARGET table can now be used by the bank to market the Premier Checking Account.

6.4 Using Intelligent Miner to Implement the Solution

To implement the above solution, the following tasks must be performed on Intelligent Miner:
1. Create a data object for the CUSTOMER table.
2. Create a demographic clustering mining setting and a result object to store the model that is generated.
3. Create a demographic clustering mining setting in application mode to apply the model generated in the previous step to the CUSTOMER table and store the output in a SCORED table.
4. Create a statistics setting to select appropriate records from the SCORED table, store them in the TARGET table, and generate univariate statistics for each of the fields.

To start the IM server, log in as root and execute the idmstart script.

# idmstart
LIBPATH=/usr/lpp/IMiner/lib:/usr/lpp/djx_02_01/lib
Changing ulimit -s from 32768 to unlimited
Changing ulimit -d from 131072 to unlimited
Changing ulimit -m from 32768 to unlimited
#

To start the IM client on a Windows NT machine, select Start -> Programs -> Intelligent Miner -> Intelligent Miner.

The main screen of IM is shown in Figure 31 on page 79. When you start the IM client for the first time on a machine that is not running the IM server, it automatically opens the Preferences window at the Server tab (see Figure 32 on page 79). Here you specify the hostname of the IM server machine, the user you want to connect as, and the password. The user you specify should be a valid AIX user. This user should also have permission to access the database tables that you want to use for mining. You can open this window later by selecting Options -> Preferences in the IM main menu.

In our example we use the dminer user to connect to our Data Mining Server (sky). The dminer user has access to the CUSTOMER table on Oracle through DataJoiner (see Chapter 4, “Install and Configure DataJoiner with Oracle” on page 39).

Figure 31. Intelligent Miner Client Main Screen

Figure 32. Server Selection Screen in Preferences Window

After selecting the IM server hostname, user ID, and password, click on the Add button if this is the first time you are connecting to the server. This will add an entry for the IM server in the Matrix section of the window. Every subsequent time that you need to connect to the IM server, you just need to select the entry from the Matrix section, type in your password, and click the Update button. Click on OK to continue.

You are now ready to start with the mining configuration.

6.4.1 Creating a Data Object for the CUSTOMER Table

In the IM main screen, click on the Create Data button on the toolbar. The Welcome screen of the Data - Taskguide window appears. Click the Next > button to continue.

In the Data format and settings screen (Figure 33), select Database Table/View and enter Customer under Settings name. Select the Show the advanced pages and controls option.

Figure 33. Data Format and Settings Screen for Building Model

Click on the Next > button to continue to the Database Table or View screen shown in Figure 34 on page 81. Select ORADB from the Database server pulldown menu, then DMINER in the Schema list. Finally select CUSTOMER in the Database Tables and Views list and the Read only option under Use mode. Click on the Next > button to continue.

Figure 34. Database Table or View Screen for Building Model

The next screen, Field Parameters, shows the fields in the table that you just selected (see Figure 35). It lists the field names and the corresponding DB2 data types and IM data types.

Figure 35. Field Parameters Screen for Building Model

IM maps database data types to five different data types for its own use:
• Binary - Data containing categorical fields with a specific NULL and NONNULL value
• Categorical - Data containing string values. IM maps database data types of CHAR, VARCHAR, DATE, TIME, TIMESTMP, GRAPHIC, VARGRAPHIC, and LONG VARGRAPHIC to the categorical data type.
• Continuous - Data containing real values. The number of distinct values is unlimited. Statistics are maintained for the ranges of continuous data. IM maps database data types of DOUBLE, REAL, DECIMAL, FLOAT, NUMERIC, DEC, DOUBLE PRECISION, and PACKED to the continuous numeric data type.
• Numeric - Data containing integer or real values. Use this data type to let IM determine whether a numeric field should be treated as a discrete numeric or a continuous numeric field. If the number of distinct values in a field is:
• Below a certain threshold, IM treats it as a discrete numeric field
• Above a certain threshold, IM treats it as a continuous numeric field
The default threshold value is 100. IM maps database data types of INTEGER and SMALLINT to the numeric data type.
• Discrete numeric - Data containing integer or real values. The number of distinct values is limited, for example, in age or salary data. Statistics are maintained for each value in discrete numeric data. Discrete numeric values can also be cyclic. For example, the days of a week represented by numbers from 1 to 7 are cyclic.
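If you are unsure on which side of the threshold a numeric column falls, you can check its cardinality yourself before mining; for example (a sketch using the customer nickname from Chapter 4):

db2 "select count(distinct siblings) from customer"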

For our example, nothing needs to be changed in this screen. Click on the Next > button to continue.

You can skip the Computed fields, Bucket limits, Valid values, and Cyclic fields screens. Just click on the Next > button until you reach the Summary screen. Click the Finish button and your data object is ready.

The Data folder in your mining base will now show a Customer object. Before you continue, save your work (Target Marketing in our example) using the Mining Base -> Save Mining Base as... menu item (Figure 36 on page 83).

Figure 36. Saving Your Mining Base

6.4.2 Setting Up Demographic Clustering

In the IM main screen, click on the Create mining button on the toolbar. The Welcome screen of the Mining - Taskguide window appears. Click on the Next > button to proceed to the Mining functions and settings screen (Figure 37). Select Clustering -> Demographic under the Available mining functions, enter Build Model as the Settings name, and select the Show the advanced pages and controls option. Click on the Next > button to continue.

Figure 37. Mining Functions and Settings Screen for Building Model

In the Input data screen (Figure 38), select the Customer data object under Available input data and click on the button against Filter records condition in the Advanced section.


Figure 38. Input Data Screen for Building Model

This is where we tell IM to run clustering on only those records for which the value of the PRODUCT field is 1. Execute the following steps in the Expression Builder window (Figure 39) to achieve this:
1. Click on the AND button.
2. Select Field Names under Category.
3. Select PRODUCT under Value.
4. Click on the Arg1 button.
5. Click on the = button.
6. Select Constants under Category.
7. Double-click under Value.
8. Type in "1" in the edit box that appears and hit Enter.
9. Select "1" under Value.
10. Click on the Arg2 button.

Figure 39. Expression Builder Screen for Building Model

The Filter records condition in the Advanced section of the Input data screen (Figure 38) should now show the ((PRODUCT = "1")) entry.

Click the OK button to accept the settings and then on the Next > button to continue.

In the Mode parameters screen, the Clustering mode radio button will be selected by default. Click on the Next > button to continue.

This will take you to the Input fields screen shown in Figure 40 on page 86, where you specify the fields in the database that will be used to decide the similarity between members of a cluster. These fields need to be specified as active fields. If there are fields that should not be used to generate the clusters, but whose distribution you want to see in the generated model, specify them as supplementary fields. Select AGE, CAR_COLOUR, GENDER, INCOME, and SIBLINGS under Available fields and click on the button to add these fields to the Active fields list. Click on the Next > button to continue.

Figure 40. Input Fields Screen for Building Model

You can skip the Field parameters, Additional fields parameters, Outlier treatment, Similarity matrix, and Parallel parameters screens. Just click on the Next > button in each of these screens.

In the Output fields screen, the Do not create output radio button should be selected. This implies that we will not create any database table as an output from this setting. Click on the Next > button to continue.

In the Results screen (Figure 41 on page 87), type in Model as the Results name and select the If a result with this name exists, overwrite it option. Click on the Next > button to continue.

86 Mining Relational and Nonrelational Data with IM for Data Figure 41. Results Screen for Building Model

In the Summary screen, make sure that the Run this setting immediately option is selected and click the Finish button to generate the demographic clustering model.

While IM is building the model, you will see a progress window as shown in Figure 42.

Figure 42. Progress Screen for Building Model

After completion, IM opens a window (Figure 43) to show the various clusters found. Each horizontal strip represents one cluster.

Figure 43. Generated Demographic Clustering Model

The clusters are ordered top to bottom in order of decreasing size. The cluster number is shown in the top right-hand corner of each strip. The size of a cluster is shown on the left of each strip as a percentage of the total population. For example, in our case, the biggest cluster consists of 34% of the records that were present in the input table.

You can double-click on any of the clusters to get a better view of the variable distributions within this cluster. For example, if you double-click on the largest cluster, you will see a screen like the one shown in Figure 44 on page 89.

Figure 44. Variable Distributions in Cluster 1

You can further double-click on any of the histograms or pie charts to get an even more detailed view. For example, if you click on the pie chart for gender, you will see a detailed representation of this variable, as shown in Figure 45.

Figure 45. Distribution for the Gender Variable in Cluster 1

The outer band around the pie charts and the red rectangles in the histograms indicate the distribution of that variable for the entire population (all the records in the table). The inner circle in the pie charts and the gray rectangles in the histograms indicate the distribution of that variable within the cluster itself.

In our case, we can say that the target cluster of customers purchasing the Premier Checking Account (as shown in Figure 44 on page 89) are predominantly male, less than 25 years of age, and come from a low-income group.

6.4.3 Applying the Model

The model which was generated in the previous step and stored as a result object called Model can now be applied to customers of the bank who do not have a Premier Checking Account. So, we apply this model to those records in the CUSTOMER table for which the value in the PRODUCT field is not 1.

On the IM main screen, click on the Create mining button on the toolbar. The Welcome screen of the Mining - Taskguide window appears. Click on the Next > button to proceed to the Mining functions and settings screen shown in Figure 46 and select Clustering -> Demographic under Available mining functions. Enter Apply Model as the Settings name, and select the Show the advanced pages and controls option. Click on the Next > button to continue.

Figure 46. Mining Functions and Settings Screen for Applying Model

In the Input data screen (see Figure 47), select the Customer data object under Available input data. Click on the button against Filter records condition in the Advanced section.

Figure 47. Input Data Screen for Applying Model

This is where we tell IM to run clustering on only those records for which the value of the PRODUCT field is not 1. Execute the following steps in the Expression Builder window (Figure 48 on page 92) to achieve this:
1. Click on the AND button.
2. Select Field Names under Category.
3. Select PRODUCT under Value.
4. Click on the Arg1 button.
5. Click on the <> button.
6. Select Constants under Category.
7. Select "1" under Value.
8. Click on the Arg2 button.

The Filter records condition in the Advanced section should now show the ((PRODUCT <> "1")) entry.

Click on the OK button to accept the settings and then on the Next > button to continue.

Figure 48. Expression Builder Screen for Applying Model

In the Mode parameters screen (shown in Figure 49), click on the Application mode radio button and select Model as result. Click on the Next > button to continue.

Figure 49. Mode Parameters Screen for Applying Model

In the Input fields screen (see Figure 50), specify the fields in the database which will be used to decide the cluster in which a record belongs. These fields must be specified as active fields. Select AGE, CAR_COLOUR, GENDER, INCOME, and SIBLINGS under Available fields and click on the button to add these fields to the Active fields list. Click on the Next > button to continue.

Figure 50. Input Fields Screen for Applying Model

You can skip the Field parameters, Additional field parameters, Outlier treatment, Similarity matrix, and Parallel parameters screens. Just click on the Next > button in each of these screens.

In the Output fields screen (see Figure 51 on page 94), the Create output data radio button should be selected. This means that we are going to create a database table as an output. This table will contain all the original fields along with three new fields - CLUSTERID (the cluster number to which a record belongs), SCORE (the strength of a record's fit in the cluster), and CONF (the confidence level of an assignment). The output table, however, will not contain those rows in the CUSTOMER table which had a value of 1 in the PRODUCT field.

Click on the button to add all the fields under Available fields to Output fields. Enter CLUSTERID as the Cluster ID field name, SCORE as the Record score field name, and CONF as the Confidence field name. Click on the Next > button to continue to the Output data screen. In this screen click on the Create data button to create a data object for the output table.

Figure 51. Output Fields Screen for Applying Model

In the Welcome screen of the Data - Taskguide window, click on the Next > button to continue and in the Data format and settings screen shown in Figure 52 select Database Table/View and enter Scored Customer under Settings name.

Figure 52. Data Format and Settings Screen for Applying Model

Then select the Show the advanced pages and controls option and click the Next > button.

In the Database table or view screen (Figure 53 on page 96), select ORADB from the Database server pull-down menu. You now have two options to create the SCORED table:
1. Type ANALYST in the Schema list, and enter SCORED in the edit box under the Database tables and views list. Enter the server mapping name (DJORA8) in the Tablespace field. This will create the SCORED table in the Oracle database. To have the table created in the Oracle database, you must define a user mapping within DataJoiner to satisfy the Oracle authorization check. In our example we created this user mapping:
create user mapping from analyst to server djora8 authid analyst password analyst
All names entered in the Database table or view screen should be typed in upper case. If the table name is entered in lower case, it will be difficult to access the table from the DB2 Command Line Processor, because the DB2 CLP converts all SQL statements to upper case and thus will not find the lower-case table name. If the tablespace is entered in lower case, the table creation will fail because DataJoiner will not find the server mapping. See 11.2.1, "Current Functional Limitation" on page 159, for the IM fix required to create output tables directly in the Oracle database.
2. Select DMINER in the Schema list and enter SCORED in the edit box under the Database tables and views list. Again, use upper-case characters to specify the table name. If you leave the Tablespace field empty, the SCORED table will be created as a local DataJoiner table in the default tablespace. Because this table is an intermediate table in the mining process and should not be required by production systems running off the Oracle database, this should not be a matter for concern.

With both options, select the Read and write option under Use mode and then select the The specified table does not yet exist option. Then click on the Next > button to continue.

Figure 53. Database Table or View Screen for Applying Model

In the Summary screen, click on the Finish button to continue. You are now back in the Output data screen of the Mining - Taskguide window. Select the Scored customer data object under Available output data and click the Next > button. Make sure that the Run this setting immediately option is selected and click the Finish button to apply the model to the CUSTOMER table and generate the SCORED table.

Figure 54. Progress Screen for Applying Model

6.4.4 Generating the Target Customer Set

This is the last stage of the mining process. In this step, we select those records in the SCORED table for which the value in the SCORE field is greater than 0.7 and store them in a table called TARGET. In this manner, we eliminate from the target set those customers who do not fit strongly into any of the clusters generated in the model.
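Although we use the IM statistics function below (which also produces the univariate statistics we want), the record selection itself is plain SQL. Conceptually, it could be expressed like this (a sketch only, using the table names from our example):

db2 "create table target like scored"
db2 "insert into target select * from scored where score > 0.7"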

From the IM client main screen, click on the Create statistics button on the toolbar. The Welcome screen of the Statistics - Taskguide window appears. Click on the Next > button to continue.

In the Statistics functions and settings screen shown in Figure 55, select Bivariate Statistics under Available statistics functions and enter Analyze as the Settings name. Select the Show the advanced pages and controls option, then click on the Next > button to continue.

Figure 55. Statistics Functions and Settings Screen

In the Input data screen (Figure 56 on page 98), select the Scored Customer data object under Available input data. Click on the button against Filter records condition in the Advanced section.

Figure 56. Input Data Screen for Statistical Analysis

This will open the Expression Builder window (Figure 57 on page 99) where you tell IM that statistical analysis needs to be conducted only on those records for which the value of the SCORE field is greater than 0.7.

Execute the following steps to achieve this:
1. Click on the AND button.
2. Select Field Names under Category.
3. Select SCORE under Value.
4. Click on the Arg1 button.
5. Click on the > button.
6. Select Constants under Category.
7. Double-click under Value.
8. Type in 0.7 in the edit box that appears and hit Enter.
9. Select 0.7 under Value.
10. Click on the Arg2 button.
11. Click on the OK button to continue.

Figure 57. Expression Builder for Statistical Analysis

The Filter records condition in the Advanced section of the Input data screen should now read ((SCORE > 0.7)).

Click on the Next > button to get to the Parallel parameters screen, where you again should click on the Next > button to continue.

In the Statistics screen (Figure 58 on page 100), select the Compute statistics radio button. Select all the fields in the Available fields section by clicking on the button.


Figure 58. Statistics Screen

Click on the Next > button to continue. You can skip the Quantiles and Samples screens. Just click on the Next > button in each of these screens.

In the Output fields screen (Figure 59), select the Create an output table radio button and click on the button to include all the fields in the output table, then click on the Next > button to continue.

Figure 59. Output Fields Screen for Statistical Analysis

In the Output data screen, click on the Create data button to create a data object for the output table. In the Welcome screen of the Data - Taskguide window, click on the Next > button to continue.

In the Data format and settings screen (Figure 60), select Database Table/View and enter Target Customer under Settings name. Select the Show the advanced pages and controls option and click on the Next > button to continue.

This time the resulting table will be created in the Oracle database to allow further analysis of the mining result with existing Oracle database query tools.

Figure 60. Data Format and Settings Screen for Output of Statistical Analysis

In the Database table or view screen (Figure 61 on page 102), select ORADB from the Database server pull-down menu. Select ANALYST in the Schema list, type TARGET in the Database tables and views list, and select the The specified table does not yet exist option. Enter the server mapping name, DJORA8, in the Tablespace field and click the Next > button to continue. Remember to use upper-case characters for the table name and tablespace.

Figure 61. Database Table or View Screen for Output of Statistical Analysis

You can click on the Next > button to skip through the rest of the screens without changing anything. In the Summary screen click on the Finish button to continue.

You are now back to the Output data screen (Figure 62) of the Statistics - Taskguide window. Select the Target Customer data object under the Available output data and click on the Next > button.

Figure 62. Output Data Screen for Statistical Analysis

In the Results screen (Figure 63), enter Target Customer Demographics for Results name and select the If a result with this name exists, overwrite it option, then click Next > to continue.

In the Summary screen, ensure that the Run this setting immediately option is selected and click on the Finish button to generate the target customer table and the statistical distribution for it.

Figure 63. Results Screen for Statistical Analysis

While the mining process is running, you will see a Progress window which will look like Figure 64 on page 104 once processing is complete.

Figure 64. Progress Screen for Statistical Analysis

IM then opens a visualization window (Figure 65) to show the distributions of all the fields in the TARGET table.

Figure 65. Variable Distributions for Target Customers

You can double-click on any of the pie charts or histograms to see the distribution for only that particular field.

In our scenario, the target customers are almost equally distributed among the other product types that the bank has. They are mostly women in a lower age category. Also, most of them fit into either cluster 1 or cluster 6 of the model we generated from the Premier Checking Account customers.

The TARGET table on Oracle can now be used by the marketing department of the bank to design target marketing campaigns for the Premier Checking Account.

Part 3. Accessing Nonrelational Data

Chapter 7. System Architecture for Mining ODBC Data Sources

In today's business environment, high availability is imperative for core business applications. Faster and more predictable time to market is key, and in the world of e-commerce systems, quality is mandatory.

In Windows environments, ODBC allows applications to access databases and data formats through a standard API.

ODBC is a widely accepted application programming interface for database access. It is based on the Call Level Interface (CLI) specifications from X/Open and ISO/IEC for database APIs and uses SQL as its database access language.

7.1 Call Level Interface

CLI is an application programming interface for database access that uses function calls to invoke dynamic SQL statements. It is a C programming interface, designed especially for developing database applications in a client/server environment.

It was designed to support SQL access to databases from shrink-wrapped application programs. CLI was originally created by a subcommittee of the SQL Access Group (SAG). The SAG/CLI specification was published in 1992. In 1993, SAG submitted the CLI to the ANSI and ISO SQL committees.

SQL/CLI provides international-standard, implementation-independent access to SQL databases. Client/server tools can easily access databases through DLLs, and the interface supports and encourages a rich set of such tools.

7.2 Open Database Connectivity

ODBC 3.0 is a superset of the X/Open and ISO/IEC CLI specification for database APIs.

ODBC is an industry standard API for accessing data in both relational and nonrelational database management systems (DBMSs). Using ODBC, applications can access data stored in a variety of personal computer, minicomputer, and mainframe DBMSs, even when each DBMS uses a different data storage format and programming interface. Using ODBC, applications do not need to include specialized code for each DBMS. Instead, they use the ODBC interface and can access any ODBC-compliant database.

Thus developers can build and distribute a client/server application without targeting a specific DBMS or having to know specific details of various back-end data sources. When an application needs to get data from a data source, the application sends an ODBC function call to the ODBC Driver Manager (administered through an optionally installable 16- or 32-bit control panel), which then loads the ODBC driver required to talk to the data. The driver then translates the ODBC calls sent by the application into the SQL used by the DBMS and sends it to the back-end database. The DBMS retrieves the data and passes it back to the application through the driver and the Driver Manager (see Figure 66 on page 110).

Figure 66. ODBC Functional Components (the diagram shows a client application calling the ODBC Driver Manager, which loads an ODBC driver to reach local or remote data sources)

You can code directly to the ODBC API by declaring various functions and then using them, for example, to connect, send SQL statements, retrieve results, get errors, and disconnect.

7.3 Mining ODBC Data Sources with IBM's Intelligent Miner for Data

IM is the premier data mining product from IBM. It consists of:

• A server product that provides the engine for running statistical, data processing, and mining functions
• A client product that provides a Java GUI for setting up and executing these functions
• A set of C++ APIs for developing data mining applications using these functions

For a detailed description of the product, please refer to Chapter 2, “Introduction to Intelligent Miner for Data and DataJoiner” on page 13.

IM can use flat files or databases managed by the IBM DB2 family of products as data sources for mining. IM can also store the output data created by its statistical, data processing, and mining functions in flat files or DB2 databases.

So, in order for IM to use an ODBC data source as a mining source and as a target for output data, either of two approaches can be taken:
• Transfer data from the ODBC data source into flat files, then execute all necessary data processing, statistical, and mining functions. Finally, create a table in the ODBC data source from the final output data stored in a flat file. Figure 67 represents this approach graphically.
• Make ODBC data sources appear as DB2 tables to IM and use them as data sources and output targets. Figure 68 on page 112 represents this approach graphically.

Figure 67. Using Flat Files to Mine ODBC Data Sources (the diagram shows data flowing from a source table in the ODBC data source into flat files used as input by the IM statistical, data processing, and data mining functions, and output data flowing back through flat files into a target table)

Figure 68. Mining ODBC Data Sources through DB2 (the diagram shows source and target tables in the ODBC data source surfaced as DB2 tables, which the IM statistical, data processing, and data mining functions read and write directly)

7.3.1 Using Flat Files with ODBC Data Sources

The first action to take when using flat files as input for IM is to create them. Use the front end shipped with the installed ODBC data source, or any other standard ODBC interface. All of the standard ODBC front ends support undelimited ASCII as an export or import file format, thus allowing IM to use these flat files as a mining source.

Once the mining is done and an output file is produced, the file has to be transferred to the database server machine and the output data loaded back into the ODBC data source. The output data from data mining algorithms usually has more fields than the input data. So, you have to create a table in the ODBC source to hold the output data and then use a standard tool to load the data into this table, along the lines of the sketch below.
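As a hypothetical example, if the mining output adds the CLUSTERID, SCORE, and CONF fields described in Chapter 6 to a table of customer records, the receiving table might be created with a statement of this form (the table name, column names, and types here are illustrative only):

create table scored_out (
  gender    char(8),
  age       decimal(5,2),
  clusterid integer,
  score     double precision,
  conf      double precision
)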

7.3.2 The DataJoiner Interface to ODBC Data Sources

The DataJoiner software is part of the IBM Data Management family of products. It allows users and applications to transparently communicate with multiple and diverse data sources just as if they were communicating with a DB2 database. It does this by creating aliases or nicknames in a DB2 database catalog which refer to tables or joins of tables that actually reside in other databases. Applications can then treat these nicknames just like DB2 tables and submit SQL commands against them. DataJoiner translates these SQL commands into the format recognized by the database(s) where the tables actually reside, sends the statement to these database(s), and relays the result of the SQL command from these database(s) back to the application. For further information about the DataJoiner product, please refer to Chapter 2, "Introduction to Intelligent Miner for Data and DataJoiner" on page 13.

In the current scenario, DataJoiner serves to catalog ODBC data sources as nicknames which IM can access as DB2 tables. When IM issues an SQL command against these tables, DataJoiner first translates it to a format that can be understood by the ODBC data source using the ODBC driver. IM is unaware of the existence of the ODBC data source and communicates only with DataJoiner’s DB2 database. Please note that the actual data from the ODBC data source is not replicated into the DataJoiner DB2 database. Only references in the form of nicknames are stored in the DataJoiner DB2 database. However, a fully functional DB2 database is included with DataJoiner and applications such as IM can use this database for storing intermediate data tables created during the data mining process.
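For example, once the credit nickname of Chapter 8 exists, an ordinary DB2 statement such as the following is translated by DataJoiner into ODBC calls against the underlying data source, although to the application it looks like local DB2 SQL (a sketch; the nickname is defined in the next chapter):

db2 "select count(*) from credit"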

Figure 69 illustrates the data transfer process using DataJoiner. The dashed representation of the tables in the DB2 database indicates that the data does not physically exist in the DB2 database.

Figure 69. Accessing ODBC Data Sources with DataJoiner (the diagram shows the IM statistical, data processing, and data mining functions issuing DB2 SQL against source and target nicknames in the DataJoiner DB2 database, which DataJoiner translates into ODBC calls against the ODBC data source)

7.4 System Architecture

Figure 70 on page 114 describes the environment that we set up to enable mining on data stored in ODBC data sources. The IM client interacts with the IM server to define and execute data mining exercises. The Intelligent Miner for Data server reads source data and writes output data into DB2 tables on DataJoiner. DataJoiner, in turn, maps these tables to source and target ODBC data sources and transfers data to and from the data source, using the ODBC client software. To describe the setup and configuration to access different data sources, we installed the software shown in Figure 70 at the indicated service levels, on a Windows NT workstation.

Figure 70. The ODBC Environment. The software stack, from top to bottom, with the service levels we used:
• Application: Intelligent Miner V 2.1.2
• DataJoiner [2.1.1]
• ODBC Driver Manager [ODBCCP32.DLL 3.0.28.22]
• ODBC drivers: SPSS [SPODBC32 1.10.0001], SAS [SASDRV32 6.12], SAS Universal [SASUVDRV 1.00.1997502]
• Windows NT [4.0.1381] and SAS [6.12 TS 0020]

Chapter 8. Access SPSS Data from Intelligent Miner for Data

IM can read and write SPSS data files (that is, SAV files) through the DataJoiner generic ODBC DAM in conjunction with the SPSS ODBC driver. In this chapter we describe the setup to access SPSS data sources from IM and have the SPSS data sources appear to IM like a DB2 database table.

8.1 SPSS ODBC Configuration

To configure SPSS data sources:
• Select Start -> Settings -> Control Panel, open the 32-bit ODBC Data Source Administrator, select the System DSN tab, and click the Add button (see Figure 71).

Figure 71. ODBC Data Source Administrator

• Select SPSS Data Driver 32 (see Figure 72) from the Create New Data Source dialog and click Finish.

Figure 72. Create New Data Source

• On the Setup dialog (Figure 73), enter the Data source name and database (the directory or folder that contains the SPSS data files) and press OK.

Figure 73. SPSS Data Source Setup

• The SPSS data source should be listed in the System DSN tab (Figure 74 on page 117) and is ready to be configured in DataJoiner.

Figure 74. SPSS System Data Source

8.2 SPSS - DataJoiner Configuration

To configure the SPSS ODBC data source in DataJoiner:
• Select Start -> Programs -> DataJoiner for Windows NT -> Command Window. If there is not already a DataJoiner DB2 instance on the database server, create one with the db2icrt command. In Figure 75, DJINST1 is used for the instance name.

Figure 75. Creating a DB2 Instance

• Start the database manager with the db2start command.

• Create a database and connect to it with these commands (Figure 76):
db2 create database spssdata
db2 connect to spssdata

Figure 76. Connecting to the Database

• The following command maps the SPSS ODBC data source to the DataJoiner catalog:
db2 create server mapping from spssserv to node "spssodbc" type generic version 2.1 protocol "generic"

The full syntax for this command, reprinted from the DB2 DataJoiner Version 2 API/SQL Reference Supplement, is (optional clauses are shown in brackets):

CREATE SERVER MAPPING FROM server-name TO NODE "node-name"
  [DATABASE "remote-database-name"]
  TYPE server-type VERSION server-version PROTOCOL "protocol-name"
  [CPU RATIO value] [IO RATIO value] [COMM RATE value]
  [AUTHID name [PASSWORD password]]

• To map the user to the server, issue this command:

db2 create user mapping from imres2 to server spssserv authid " " password " "

The full syntax for this command is:

CREATE USER MAPPING FROM {local-authid | user} TO SERVER server-name
  AUTHID remote-authid [PASSWORD remote-password] [CONNECTOPT string]

• Create nicknames for the SPSS data files to be used by IM:
db2 create nickname credit for spssserv.credit
The full syntax for this command is:
CREATE NICKNAME nickname FOR remote-object-name

At this point DataJoiner connects through ODBC to the SPSS data source and collects attributes for the global catalog.

To test the ODBC connection from the Command Window (Figure 77), issue any select statement against the table or data file; for example:
select count(*) from credit

Figure 77. Testing the SPSS ODBC Connection

8.3 SPSS ODBC Specification Compliance Levels

The Microsoft ODBC specification defines three function levels: CORE, LEVEL1, and LEVEL2. The SPSS ODBC driver supports CORE and LEVEL1 functions with some limitations. These are documented in the SPSS ODBC Driver Version 1.1 document on the installation disk.

The Microsoft ODBC specification also defines three levels of support for SQL grammar: MINIMUM, CORE, and EXTENDED. The SPSS driver provides support for the MINIMUM SQL statements, with documented restrictions.

The SPSS ODBC driver supports all MINIMUM and most EXTENDED SQL data types.

Chapter 9. Access SAS Data from Intelligent Miner for Data

IM can read SAS data sets,1 and any data accessible through SAS/ACCESS,2 through the DataJoiner generic ODBC DAM. The DAM can be configured to use the two licensed ODBC drivers provided by the SAS Institute:
• The standard SAS ODBC driver that comes with the product provides access to SAS data sets and data libraries3 locally through DDE, or remotely via TCP/IP or Network DDE. Remote access additionally requires Base SAS, SAS/SHARE, SAS/SHARE*NET, and SAS/ACCESS for each DBMS on the remote host.
• The Universal SAS ODBC driver provides access to individual data sets and does not require the SAS System software. A trial version of the Universal driver is distributed through the SAS Web site at http://www.sas.com.

NOTE: The SAS SQL procedure (PROC SQL), used by the server for the standard ODBC driver, does not support the NULL/NOT NULL column attribute common to most relational databases. This limitation causes errors when querying SAS data sets with missing values (NULLs), and when saving data to SAS. The Universal driver uses a third-party SQL parser that supports nullable columns and does not rely on SAS to process SQL requests.

We used both the standard and the Universal driver in conjunction with DataJoiner and Intelligent Miner in a Windows NT environment.

9.1 SAS ODBC Configuration

To configure the SAS ODBC data source in Windows NT, follow these steps:
1. Select Start -> Settings -> Control Panel, open the 32-bit ODBC Data Source Administrator, and select the System DSN tab.
2. Click the Add button and select the driver from the Create New Data Source dialog. Depending on the SAS driver you select, a different setup dialog will be displayed when you click Finish.

1 SAS data sets are a proprietary, nonrelational file format. Physical and logical data sets are accessible to ODBC client applications such as DataJoiner.
2 DataJoiner provides native support for many of the databases for which SAS/ACCESS provides individually licensed interfaces.
3 A logical reference (libref) to the physical location (directory, minidisk, or host system data set) of a collection of SAS data files and views.

1. SAS Universal ODBC Setup
• On the SAS Universal ODBC Setup dialog, enter SASUODBC as the Data Source Name and select the SAS data set you want to use by clicking the button (see Figure 78).

Figure 78. SAS Universal ODBC Setup

• OK the dialog when you are finished. The SAS data is now ready to be configured in DataJoiner.
2. SAS ODBC Driver Configuration
• The SAS ODBC Driver Configuration dialog will open to the General tab. Enter the Data Source Name and Description as illustrated in Figure 79 and select the Servers tab.

Figure 79. SAS ODBC Driver Configuration - General

• On the Servers tab (see Figure 80 on page 123), enter the Server Name (SASSERV) and click the button. For local data access, leave the Access Method set to dde.

122 Mining Relational and Nonrelational Data with IM for Data Figure 80. SAS ODBC - Servers

• Check whether the parameters in the Local DDE Options dialog (Figure 81) are correct, and OK the dialog. Back on the Servers tab, the server you defined should appear in the Servers list box; then select the Libraries tab.

Figure 81. Local DDE Options

• On the Libraries tab (Figure 82 on page 124), fill in the Library Name, Host File Name, and Description fields and click the button.

Access SAS Data from Intelligent Miner for Data 123 Figure 82. SAS ODBC - Libraries

• Select the General tab and OK the SAS ODBC Driver Configuration dialog.
3. The new system data source should appear in the ODBC Data Source Administrator control panel (Figure 83) and is ready to be configured in DataJoiner.

Figure 83. SAS System Data Sources

9.2 SAS - DataJoiner Configuration

To configure the SAS ODBC data source in DataJoiner:
1. Select Start -> Programs -> DataJoiner for Windows NT -> Command Window and create a DataJoiner database instance called DJINST1, if one does not already exist, by submitting this command:
db2icrt djinst1
2. Start the database manager with the db2start command.
3. Create a database and connect to it with these commands:
db2 create database sasdata
db2 connect to sasdata
4. Map the SAS ODBC data source to the DataJoiner catalog:
db2 create server mapping from sasserv to node "sasodbc" database "datalib" type generic version 2.1 protocol "generic"
5. Map the user to the server:
db2 create user mapping from imres2 to server sasserv authid " " password " "
6. Create nicknames for the SAS data sets you want to use from the data library:
db2 create nickname asia for sasserv.datalib.asia

For local data sources the SAS ODBC driver will invoke SAS and run PROC ODBCSERVER, through which all requests are routed, in the background.

For SAS data sources accessed through the Universal ODBC driver, the final three steps above are replaced by the following:
• Map the SAS Universal ODBC data source to the DataJoiner catalog:
db2 create server mapping from sasuserv to node "sasuodbc" database "sasuodbc" type generic version 2.1 protocol "generic"
• Define the user mapping:
db2 create user mapping from imres2 to server sasuserv authid " " password " "
• Create the data source nickname:
db2 create nickname asia2 for sasuserv.asia

Access SAS Data from Intelligent Miner for Data 125 Note: If you are using a trial version of the Universal driver, you may need to OK the trial dialog (Figure 84) when an ODBC connection is made.

Figure 84. Trial Driver Expiration

To test the ODBC connection, submit any select statement against the table or data set from the Command Window, for example:
select count(*) from asia

9.3 SAS Server Communications

In this section we discuss the prerequisites for SAS server communications through the standard ODBC driver. The Universal driver operates independently of the SAS System and not through a SAS server process. Hence this section does not apply to client access through the Universal driver.

9.3.1 Local DDE Access Method

Base SAS is required to run PROC ODBCSERVER for local data sources accessed through the standard ODBC driver's DDE access method. If the server process is not executing when an ODBC request is made, the driver will start a SAS session with the parameter settings from the Local DDE Options dialog (see 9.1, "SAS ODBC Configuration" on page 121).

If a SAS session is already open, the following statements can be submitted from the PROGRAM window to initiate the server:
options comamid=dde;
proc odbcserver id=server-name;
run;

The server-name must be the same as the name specified on the SAS ODBC Driver Configuration:Servers tab. While the server is executing, the SAS session will not accept keyboard input.

To terminate the server, disconnect from the database and stop DB2 before ending SAS from the Windows Task List.

9.3.2 Remote Access Methods

For remote data sources the ODBC driver uses a SAS/SHARE server. This requires both SAS/SHARE*NET and SAS/SHARE software on the remote host. The SAS/SHARE server should be invoked on the remote host at system startup. The following describes the configuration using either TCP/IP or Network DDE:
• TCP/IP
If the TCP/IP communications method is to be used, PROC SERVER should be started on the host machine as follows:
options comamid=tcp;
proc server id=server-name;
run;
The server(s) should be defined in the TCP/IP services file for the client machine(s). Entries in this file associate a service name, which in this case is the name of the SAS/SHARE server, with the port number and communications protocol used by the service. The general form of an entry is:
server-name port-number/tcp # comment
The server name should be the same as that specified in the SAS ODBC Driver Configuration:Servers tab. It is case sensitive and must be 1-8 characters long.
• Network DDE
In the case of the Network DDE access method, a DDE share must be defined under Windows NT by executing DDESHARE (see Figure 85).

Figure 85. NT DDESHARE

Select Shares -> DDE Shares and click Add a Share.... On the DDE Share Properties dialog (Figure 86 on page 128), enter the Share Name (by convention, DDE share names end with a dollar sign), the Static name (this should be the same name specified for the ID on the PROC SERVER statement), and the Topic Name=SAS/SHARE on the same line as the PROC SERVER ID. Click the button.

Figure 86. DDE Share Properties

On the DDE Share Name Permissions dialog (Figure 87) select Everyone from the Name list, set Type of Access to Full Control, and click OK to close the DDE Share Name Permissions and DDE Share Properties dialogs.

Figure 87. DDE Share Name Permissions

On the DDE Shares dialog (Figure 88 on page 129) select the share you defined and click Trust Share... .

Figure 88. DDE Shares

On the Trusted Share Properties dialog (Figure 89) check the Start Application Enable and Initiate to Application Enable checkboxes. Click OK to save the properties, then close the DDE Shares dialog.

Figure 89. Trusted Share Properties

Select Shares -> Exit from the DDE Share window. On the ODBC client workstation define the data sources on the remote machine. The server name in the illustrated example would be CANDEMAS.SHR1$, where CANDEMAS identifies the remote machine, and SHR1$ is the DDE share name defined on that machine.

9.3.3 SAS ODBC Specification Compliance Levels

The SAS ODBC driver provides a callable entry point for all CORE and LEVEL1 ODBC functions. Some LEVEL1 functionality is not fully implemented.

Through the standard driver SAS supports all MINIMUM and some CORE SQL statements.

Function and grammar exceptions are documented in the "Reference" section of the SAS ODBC Driver Technical Report. This report documents data type mapping and SQLSTATE return code support.

Chapter 10. Sample Mining of ODBC Data Sources

In this chapter we cover the installation of IM in a Windows NT environment and provide two sample data mining exercises using ODBC data sources accessed through DataJoiner. The first example uses the Home Sales data from SPSS, and the second, the loan application data from SAS.

We define a goal for our data mining operations and show how we can achieve it using the IM functionality. Building the decision support system takes us through some of the most useful data preparation functions, demonstrating the capabilities of IM in the area of data management. Then we build a number of different data models based on the cleansed data and choose the one that satisfies our requirements. Finally we specify a sequence of data preparation and mining functions to automate the data mining operation.

10.1 Intelligent Miner Server Installation on Windows NT

Intelligent Miner for Data consists of two components:
• IM server - This is the processing engine which runs the mining, data processing, and statistical algorithms.
• IM client - This is the Java interface that is used to set up mining runs.

In a Windows environment the IM server software can be installed on a Windows NT server or workstation, with Windows NT or Windows 95 clients.

DB2 must be installed on the NT server before IM is installed. If you attempt to install DB2 on a machine on which IM is already installed, the DB2 installation might fail.

To install the IM server on Windows NT:
1. Run the Setup program in the IMServer212 directory and click Next > on the Welcome window (Figure 90 on page 132) to step through the installation process.

Figure 90. IM Install: Welcome

2. Select the components you want to install (see Figure 91), then click the Next > button. In this example we are installing the server and client on the same machine.

Figure 91. IM Install: Components

3. Enter the domain and user ID from which you will run the server process (see Figure 92 on page 133) and then click Next >. Select how you want to configure your mining bases (common or individual) and their directory location before the installation proceeds.

Figure 92. IM Install: Domain/User

4. When the installation is complete, view the README file and restart your machine.

The client installation program also installs the required Java Runtime Environment (JRE). If you want to view the online help, you will need to install a Web browser that supports frames, such as Microsoft Internet Explorer Version 3.0 (or higher) or Netscape Navigator for Windows 95/NT Version 3.01 (or higher).

Please refer to the Using the Intelligent Miner for Data manual, which is supplied with the IM product, for more detailed instructions on installation and setup. Also, please read the README.TXT file on the server and client product installation CD-ROMs for additional information before you install the product.

10.2 Starting the Intelligent Miner Server and Clients

To start the IM server on the NT machine, log in with the server user ID specified during the installation process, select Start -> Run, type Idmstart into the text entry field and select OK (see Figure 93 on page 134).

Figure 93. Start IM Server

To start the IM client on a Windows NT machine, select Start -> Programs -> Intelligent Miner -> Intelligent Miner. When the client is started for the very first time, the Preferences screen is displayed, as shown in Figure 94.

Figure 94. IM Preferences

10.3 Home Sales Mining Scenario

In our environment, we implemented a hypothetical scenario to illustrate the process of data mining an SPSS ODBC data source.

The SPSS data used in this example contains land and home values for a group of neighborhoods. Our goal is to create a data mining operation to help us forecast or predict home sale prices based on this historical data.

The model for the available data describes hidden patterns or trends in the historical data. Based on these trends the model generalizes over unseen input data, providing a predictive tool for the analyst.

10.3.1 Implementing the Solution with Intelligent Miner

To implement the above solution, the following tasks need to be performed in Intelligent Miner:
• Create the data object
• Visualize and prepare the data
• Model the data
• Apply the data model
• Automate the process

10.3.1.1 Create the Data Object

After you have started the server, opened the client window and connected to the server through the Preferences dialog (see 10.2, "Starting the Intelligent Miner Server and Clients" on page 133), click the Create Data button from the toolbar on the Intelligent Miner main window (Figure 95 on page 136).

Figure 95. Intelligent Miner: Main Window

This will open the Welcome screen of the Data - Taskguide.

Click Next > to open the Data format and settings screen shown in Figure 96 on page 137. Select Database Table/View from the Available Data Formats group box, enter the settings name (Home Sales Data in our example), and click the Next > button.

Figure 96. Data TaskGuide Data Format and Settings

In the Database table or view screen select SPSSDATA from the Database Server list, click on the schema and name of the SPSS data file, and click Next >.

On the Field Parameters dialog (shown in Figure 97) you can choose to change the data types of the columns.

Figure 97. Data TaskGuide Field Parameters

Click Next > on all screens until you reach the Summary window (Figure 98).

Figure 98. Data TaskGuide Summary

Click Finish.

10.3.1.2 Visualize and Prepare the Data

To prepare the data we must understand it. It is crucial to identify any missing values, outliers, and abnormal frequency charts that may negatively influence the data modeling process.

For this purpose we can create a bivariate object to generate frequency charts and descriptive statistics of our variables.

First create a bivariate object, using the Home Sales Data:
1. From the IM Main window, select Create statistics.
2. Click Next > on the Welcome screen of the Statistics - TaskGuide.
3. Select Bivariate statistics and enter a Settings name, then click Next >.
4. Select the Home Sales Data from the Available input data screen and click the Next > button.
5. Skip the Parallel parameters screen by clicking the Next > button.
6. In the Statistics screen select the Compute statistics button and select all the variables contained in the data source for calculating the descriptive statistics as illustrated in Figure 99 on page 139.

7. Skip all subsequent screens by clicking the Next > button until you reach the Result screen.
8. Enter a Result name and click the Next > button.
9. On the Summary screen select the Run this settings immediately option and click the Finish button.

Figure 99. Bivariate Statistics

After executing the bivariate statistics object we can check the results generated. Figure 100 on page 140 illustrates the frequency charts of the underlying variables.

Figure 100. Frequency Charts

As you can see, some of the variable frequencies are skewed to the left. To introduce normality to these variables we need to perform a nonlinear transformation. To do this, open the data object that you have created, go to the Computed Fields tab (see Figure 101), and add a new variable, using a logarithmic function for the variable you want to transform.
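Conceptually, the computed field evaluates a simple expression for each record. As a sketch of the equivalent SQL through DataJoiner (the table and column names here are hypothetical, not the actual SPSS field names):

db2 "select value, log(value) as log_value from homesales"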

Figure 101. Transform Variable

Then open the bivariate object, go to the Statistics tab, and add the variable that you created to the list. Finally, reexecute the bivariate object to see the new results as shown in Figure 102.

Figure 102. Visualizing the Transformation

10.3.1.3 Model the Data

To build a data model we use the Prediction Mining Object (Neural) in train mode (see Figure 103 on page 142).

The input variables are given in Figure 97 on page 137. The optimization for the neural network is set to be selected automatically by IM.

Figure 103. Setting the Prediction Object

10.3.1.4 Check the Results

Finally we can execute the prediction object to generate our results. We can use the result object to visualize the performance of the neural network. The result generated is partially illustrated in Figure 104 on page 143.

Figure 104. The Result Object

10.3.1.5 Apply the Data Model

After visualizing the results and determining the performance of the neural network, we compare its performance with our success expectations. If we are satisfied, we can proceed to apply the model. At this point we use the result object that the train mode generates as input to a prediction object running in application mode (Figure 105 on page 144).

In addition we have to specify an output file that will hold the predictions and confidence limits for new input examples (Figure 106 on page 144).


Figure 105. Neural Prediction Settings, Mode Parameters

Figure 106. Neural Prediction Settings, Output Fields

10.3.1.6 Automating the Process

Having completed each step of the data mining operation we may want to create a sequence of the individual steps. The data preparation and mining function sequence will provide us with an automated analysis tool ready for business use. Figure 107 shows how we can define a data mining operation using the sequence object.

Figure 107. Sequence TaskGuide

10.4 Loan Approval Mining Scenario

In this sample mining operation we address the challenge of building a decision support system dedicated to loan assessment. The decision support system must provide the human decision maker with an evaluation of the loan application based on historical data. The outcome of the evaluation, in its simplest form, will be a negative or a positive answer to the question: "Will this person be able to repay the loan?" The historical data will consist of old application records in which the outcome of this question is known.

In applying data mining to satisfy this requirement our goal is to build a profile of an applicant that is likely to default. Having done that, we can then evaluate any new customer application by comparing the applicant's profile with the one that we have built.

Applying data mining to building a decision support system dedicated to loan assessment requires us to define a sequence of data preparation and data mining functions that will bring us from the historical data to a valuable data model. The data model will represent any hidden patterns in the available data, capturing the behavior of the customers and becoming the core of our decision support system.

10.4.1 Implementing the Solution with Intelligent Miner

We start our data mining operation by selecting the historical data needed. To illustrate the process of accessing data alien to IM, we use data that is provided in the SAS data format. For this task we use DataJoiner, so that we can use the historical data set in IM as if it were a DB2 table.

Following the data collection process we concentrate on data preparation issues. First we investigate the observations of the data set for missing values, outliers, and heavy-tailed variable frequency charts, using the Intelligent Miner's statistical functions. Then we apply a series of data preparation functions depending on the properties of the variables, making sure that the available data is ready for the data modeling process.

After delivering a cleansed data set we split it into two subsets, using random sampling. The purpose of this task is to provide our data mining functions with two distinct data sets - the training/validation data set and the test data set. The training/validation set will be used to actually build the data model(s), and the test set to finally evaluate it over previously unseen patterns.

Our concern is to use the appropriate data mining function to deliver a robust data model that will explain or predict the behavior of past loan applicants. IM provides two methods for performing this classification task - decision trees and artificial neural networks. As we know, a decision tree supplies us with a set of rules on which we can base our decisions. An artificial neural network only gives us a classification decision. If we are not required to explain the reasons for our classification decisions, we can use both tools to fit the available data and finally choose the one that gives a model with a confusion matrix that best meets our requirements.
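To illustrate how a confusion matrix is read (the figures here are purely illustrative and are not taken from our data): suppose that of 100 test applicants, 80 repaid and 20 defaulted, and a model classifies 70 of the 80 good applicants and 15 of the 20 bad applicants correctly. The overall error rate is (10 + 5)/100 = 15%, yet 5/20 = 25% of the defaulters would still be accepted. Depending on the cost of a default relative to a rejected good customer, we may prefer a model with a higher overall error rate but fewer accepted defaulters.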

Finally, after completing each step, we have to construct a sequence of the individual functions to automate our data mining operation. This sequence can then be deployed online and become part of the business decision support system.

10.4.1.1 Select the Data

To build the profile of a typical applicant that is going to default, we require a set of records that capture features relevant to loan applications. The old records that we are going to use must contain a combination of variables that present the most interesting characteristics, such as the amount of loan applied for, the credit history of the person, and his or her current financial status.

For our convenience a data set that meets our requirements is hosted in a SAS system. Figure 108 illustrates a sample of this data and Figure 109 shows a brief description of the underlying variables.

Figure 108. SAS Loan Application Data Set

Figure 109. SAS Loan Application Variables

10.4.1.2 Visualize the Data

After collecting the necessary data, we prepare it for the data mining functions. However, before conducting any data preparation it is useful to perform data visualization to understand the properties of the underlying variables. Visualization enables us to see the condition of our data set: whether there are any missing values or outliers, and whether the variables follow a normal probability distribution.

To perform data visualization with IM we create a data settings object to host the data set. Then, using a bivariate statistics object, we generate frequency charts and other descriptive statistics for the variables contained in the original data object.

Figure 110 presents the frequency charts of all the variables. As you can see, there are variables with heavy-tailed charts and variables that contain missing values. In addition, a close examination of a variable's descriptive statistics provides us with a more detailed picture of the properties of the underlying variable.

Figure 110. Data Visualization: Frequency Charts

10.4.1.3 Encode Missing Values

A number of different methods are available for encoding missing values. A simple method is to impute missing values, using our domain knowledge or the insight into the data that we have gained through visualization. For instance, missing values for a variable can be filled using the sample mean of the variable or the level with the highest frequency. Information about the variable sample mean and frequencies can be found in the bivariate object that we created. To use this information to encode missing values, we create an Encode Missing Values object in IM and impute missing values, as Figure 111 illustrates.
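Conceptually, mean imputation creates a new variable in which missing values are replaced by the sample mean. A minimal SQL sketch of the idea, assuming a hypothetical table name loans and a sample mean of 101000 for the value variable, read from the bivariate results:

db2 "select l.*, coalesce(value, 101000) as value_f from loans l"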

Figure 111. Encode Missing Values

The output of this process is a new data table that holds all old variables as well as the variables we have created by imputing the missing values. We can then use this table for further analysis by creating a new data object.

10.4.1.4 Define New Variables

In data mining operations it is often necessary to create new variables as linear or nonlinear combinations of existing variables in order to derive additional information. In our operation it will be useful to create a new variable representing the ratio between loan and value. For this task we use the calculate values object shown in Figure 112 on page 150.
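The derived ratio is again a simple per-record expression. As a sketch of the equivalent SQL (the table name is hypothetical; loan and value are the variables described in Figure 109):

db2 "select l.*, loan / value as loan_value_ratio from loans l"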

Figure 112. Calculate Values

10.4.1.5 Create Train/Validate and Test Data

Finally, having a data set that has been optimized for data mining, we can split it into two subsets - one for training/validation and one for final testing.

First, we randomly sample the table which contains the original variables, the nonmissing-value variables, and the loan-to-value ratio variable. For this purpose a random sample object is created. This object creates a 20% sample subset of the original data, forming the test data set.

Then we remove the test set records from the original data, using a Filter Records object. Figure 113 on page 151 illustrates part of the process, which filters the sampled records out of the original data set in order to form the train/validate data set.
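A conceptual SQL sketch of this split; the table names, the key column id, and the rand() function are illustrative only, as IM performs these steps through its random sample and filter records objects:

db2 "insert into loan_test select * from loan_clean where rand() < 0.2"
db2 "select c.* from loan_clean c where c.id not in (select id from loan_test)"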

Figure 113. Value Set Filtering

10.4.1.6 Normalize Variables

During the process of creating the train/validate and test data objects, we try to introduce normality in our variables by performing a power transformation. As we have noted, the loan variable is skewed to the left. We can normalize the frequency chart of this numeric variable by creating a new variable, power_loan, which is the logarithmic function of the original variable. Figure 114 on page 152 illustrates this process.

Figure 114. Logarithmic Variable Definition

As you can see in Figure 115, the power transformation has introduced normality into the loan variable frequency chart.

Figure 115. Normalized Frequency Chart

10.4.1.7 Model the Data

Using IM's Classification Task Guide we can build two data models, one based on neural networks and one based on decision trees. First we create a neural network mining object based on the training/validation data. The model is constructed in training mode, using an architecture of one hidden layer with seven nodes (see Figure 116).

Figure 116. Neural Classification Results

Finally we build a tree classifier, using the same input variables, target and train/validation data set. The results of this classification model are given in Figure 117 and Figure 118.

Figure 117. Tree Classification Confusion Matrix

Figure 118. Decision Tree

10.4.1.8 Choose the Best Data Model

As you can see from Figure 116 and Figure 117, the neural network performs better than the decision tree in classifying good applicants (marked as 0). The decision tree has better overall performance, giving better classification rates over bad applicants (marked as 1).

At this point we can decide to use the decision tree model as the basis of our decision support system. However, this requires us to take our analysis one step further and measure the performance of the decision tree and neural network over the test data. This will be the final evaluation of the data model and will prove whether or not it can accurately generalize over new patterns.

Again, as Figure 119 illustrates, the decision tree model performs better overall, making it a more desirable candidate for a final implementation.

Figure 119. Test Data Classification Comparison

10.4.1.9 Create a Sequence

Finally we can create a sequence for the data preparation and mining functions that we require in order to automate the operation. Figure 120 on page 156 illustrates the object sequence.

Figure 120. Sequence Settings

Chapter 11. Summary

In Chapter 1 we give a brief introduction to data mining. We argue that data mining has proven itself to be a valuable data exploitation tool because it fully automates the data analysis and knowledge discovery process. Unlike traditional statistical analysis tools and online analytical processing, data mining generates and examines hypotheses automatically. In addition, data mining algorithms can deal with a very large input space without requiring external guidance.

Data mining tools can perform a number of different functions, such as classification, prediction, clustering, association discovery, and similar time sequences. The deliverables of data mining functions can be a data model, a segmented data set, or a set of affinities. Ultimately, these deliverables, which encapsulate hidden information in the available data, will be used to deduce knowledge for business advantage.

Undeniably data mining is a data-driven process. The raw material is data from a number of databases that capture information about every aspect of a business transaction. In addition, data concerning demographics and life styles can be useful to a data mining operation. Therefore, there is a need to collect and integrate data in a single database dedicated to data mining.

It is true that diversity characterizes the IT infrastructure of most businesses. Databases provided by a number of different vendors hold the data needed in data mining operations. Usually, analysts spend a considerable amount of time searching for the data they need because there is no common interface among the different data sources. Thus they have little time to concentrate on the real issues of the data mining operation; their productivity suffers, and the final product of the data mining operation is of poor quality.

In this redbook we try to address this problem. We provide data mining workers with the means to consolidate different data types into a single database.

11.1 Environments Used

Throughout this book we discuss the mining of non-IBM data sources with IBM Intelligent Miner for Data. The software we use with all of the examples is IBM DataJoiner. The data sources are Oracle Version 8.0.4, representing relational data sources, and SPSS and SAS, representing nonrelational data sources.

After we configured DataJoiner to access the data sources we used on the Windows NT platform, we could not access the data with IM. Whenever we tried to execute a mining run, IM returned an error indicating an incorrect release number of a bind file. There had been no problem during the definition of the mining source.

The reason for this behavior is that IM, in the version we used, requires the DB2 Universal Database as the underlying database system. As we used DataJoiner instead, the CAE bind files required by IM were missing.

The problem is easy to solve, however. All you have to do is bind the IM bind files from a DB2 UDB CAE against the DataJoiner database. The IM bind files are located in the bin directory of the IM installation and are called:
idmpmluc.bnd
idmpmrs.bnd
idmpsdiq.bnd
idmpsdur.bnd
idmpssql.bnd

To bind these files, the DataJoiner database and node have to be cataloged at the UDB CAE workstation. The commands to bind these files are:
DB2 CONNECT TO XXX
DB2 BIND x:\im\bin\IDMPMLUC.BND BLOCKING ALL GRANT PUBLIC
DB2 BIND x:\im\bin\IDMPMRS.BND BLOCKING ALL GRANT PUBLIC
DB2 BIND x:\im\bin\IDMPSDIQ.BND BLOCKING ALL GRANT PUBLIC
DB2 BIND x:\im\bin\IDMPSDUR.BND BLOCKING ALL GRANT PUBLIC
DB2 BIND x:\im\bin\IDMPSSQL.BND BLOCKING ALL GRANT PUBLIC
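If the DataJoiner database has not yet been cataloged at the CAE workstation, the standard DB2 catalog commands can be used first. A minimal sketch; the node name, host name, port number, and database name are illustrative values:
DB2 CATALOG TCPIP NODE djnode REMOTE djserver SERVER 50000
DB2 CATALOG DATABASE djdb AT NODE djnode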

Note: This problem does not occur when DataJoiner is installed on an AIX system.

11.2 Sample Environment for Relational Data Sources

Figure 121 on page 159 shows the environment we set up to document IM access to non-IBM relational databases. Remember that we used Oracle. Any other RDBMS accessible through DataJoiner would work.

The figure shows a Database Server running Oracle RDBMS 8.0.4 with Net8, a Data Mining Server running DataJoiner, the Oracle client (Net8), and the Intelligent Miner for Data server, and a Mining Client running the Intelligent Miner for Data client, all connected by a Token-Ring network.

Figure 121. Mining Relational Data Sources

Here is a description of the three systems:
• Mining Client: This is the end-user workstation where the mining process is defined and initiated. Mining requests from this system are sent to the Data Mining Server.
• Data Mining Server: All mining actions are performed on this system. The database access required is directed to the local database server, DataJoiner in our example. As the data itself resides on a different system, DataJoiner acts as a client to the Database Server, translating the SQL statements into a syntax understandable by the target DBMS.
• Database Server: This system provides the input data for the actual mining process and keeps the result tables created during the data mining process.
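On the Data Mining Server, the Oracle database is made visible to IM through DataJoiner definitions analogous to those shown for SAS in Chapter 9. The following is a sketch only; the server, node, user, and table names as well as the option values are illustrative, and Chapter 4 describes the exact procedure:

db2 create server mapping from oraserv to node "oranode" type oracle version 8 protocol "net8"
db2 create user mapping from imres2 to server oraserv authid "scott" password "tiger"
db2 create nickname sales for oraserv.scott.sales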

Although it might appear otherwise when you read Chapter 4, “Install and Configure DataJoiner with Oracle” on page 39, the Oracle client does not have to be installed on the data mining client. We used this client only to verify the connectivity and to access the Oracle database natively so that we could check the results we produced inside the Oracle database.

11.2.1 Current Functional Limitation

During our tests we determined that everything related to database access works well as described in Chapter 3, "System Architecture for Mining Oracle Databases" on page 29 through Chapter 6, "Sample Mining of Oracle Data" on page 75.

We discovered one problem, however:

As mentioned in 6.4.3, "Applying the Model" on page 90 and visible in Figure 53 on page 96, IM for Data allows a tablespace to be specified together with the output table definition. This should automatically force DataJoiner to create the output table in the Oracle database, using a command syntax of:
create table ... in tablespace
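For example, a concrete statement of this form, with an illustrative table definition and tablespace name, would be:

create table results (id integer, score double) in mining_ts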

When this statement is issued using a CAE, this syntax works fine, as mentioned in 6.4.3, "Applying the Model" on page 90. When the entry field in the IM GUI is used, IM checks the DataJoiner system catalogs for an entry for the specified tablespace. As there is no such entry, IM returns an error stating that the tablespace specified could not be found, and the table is not created in the Oracle database. We reported this problem to the IM development lab and received a fix that allows IM to create Oracle output tables through DataJoiner.

This fix will be available with PTF U462271, which has a prerequisite of IM Version 2.1.3.

11.3 Sample Environment for Nonrelational Data Sources

The environment shown in Figure 122 on page 161 looks just like the one in Figure 121 on page 159. One obvious difference is that here the environment does not use database client software for the data source. The data access is handled through DataJoiner's generic ODBC driver. This environment will work in exactly the same way for all data sources that provide an ODBC driver. The only difference will be the configuration of the ODBC driver itself, as explained in Chapter 8, "Access SPSS Data from Intelligent Miner for Data" on page 115 and Chapter 9, "Access SAS Data from Intelligent Miner for Data" on page 121.

The figure shows a Database Server hosting the ODBC data source, a Data Mining Server running DataJoiner, the ODBC driver, and the Intelligent Miner for Data server, and a Mining Client running the Intelligent Miner for Data client, all connected by a Token-Ring network.

Figure 122. Mining Nonrelational Data Sources

Here is a description of the three systems:
• Mining Client: This is the end-user workstation where the mining process is defined and initiated. Mining requests from this system are sent to the Data Mining Server.
• Data Mining Server: All mining actions are performed on this system. The database access required is directed to the local database server, DataJoiner in our example. As the data itself resides on a different system, DataJoiner acts as a client to the Database Server, translating the SQL statements into a syntax understandable by the target DBMS. In this example, DataJoiner invokes the ODBC driver to connect to the ODBC data source.
• Database Server: This system provides the input data for the actual mining process and keeps the result tables created during the data mining process.

11.3.1 Current Functional Limitation

During our tests we determined that everything related to database access works well as described in Chapter 7, "System Architecture for Mining ODBC Data Sources" on page 109 through Chapter 10, "Sample Mining of ODBC Data Sources" on page 131.

We discovered the following problems, however:

11.3.1.1 SPSS As a Data Source

In our environment we were not able to write the output data or table to the SPSS data source because of an ODBC driver problem. This problem has been reported to the ODBC driver manufacturer.

11.3.1.2 SAS As a Data Source

As described in Chapter 9, "Access SAS Data from Intelligent Miner for Data" on page 121, SAS provides two flavors of ODBC drivers. The "full-function" driver exhibited problems when accessing data with NULL values. This seems to be due to a missing NULL indicator within the ODBC driver.

The Universal ODBC driver supports nullable columns but is read-only. Thus it does not allow creating the output table within the SAS data source or writing the results back to the SAS data source.

Appendix A. Scripts for Creation of Data Access Modules

This appendix provides a listing of the two scripts used to configure data access modules (DAMs) for DataJoiner.

These scripts, both of which are found in the /usr/lpp/djx_02_01/lib directory, are:
• djxlink.sh
• djxlink.makefile
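As the header comments in both files note, they must be run as root from the directory in which they are installed. A typical invocation (using the paths given above):

cd /usr/lpp/djx_02_01/lib
./djxlink.sh

or, to link a single data access module through the makefile (here sqlnet, one of the targets listed in its usage comment):

make -f djxlink.makefile sqlnet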

A.1 The djxlink.sh Script

#!/bin/ksh
#------
# (C) COPYRIGHT International Business Machines Corp. 1995
# All Rights Reserved
# Licensed Materials - Property of IBM
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
#
#
# Please read the DataJoiner Planning, Installation, and Configuration
# Guide before using this script.
#
#
# FUNCTION: This shell script attempts to link all DataJoiner data access
#           modules.
#
# usage: djxlink.sh
#
# Note: This makefile must be run under root from the
#       /usr/lpp/djx_vv_rr/lib directory.
#
#------

# ------
# Subroute to handle mssql DAM link-edit
# ------
mssql()
{
echo "\n********************************************\n" >> /tmp/djxlink;
if [[ "${DJX_ODBC_LIBRARY_PATH}" = "" ]]; then
  echo '';
  echo ' Warning: Cannot prepare djxmssql Data Access Module for use';
  echo ' Problem: DJX_ODBC_LIBRARY_PATH environment variable not set';
  echo ' Corrective action: Please set DJX_ODBC_LIBRARY_PATH environment variable to';
  echo ' the path where your ODBC driver is installed';
  echo '';
  results=${results}'djxmssql\tFailure\n'
  return 12;
fi
echo "Attempting to build djxmssql Data Access Module";
echo "Attempting to build djxmssql Data Access Module" >> /tmp/djxlink;
/bin/rm djxmssql >>/dev/null
/bin/ld -o djxmssql \
  -e _no_start \
  -bE:djxmssql.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  -K \
  -lc \
  -L${DJX_ODBC_LIBRARY_PATH} \
  -lodbc \
  libmssql.a \
  libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for djxmssql that would work";
  echo " Was not able to find link-edit parameters for djxmssql that would work" >> /tmp/djxlink;
  results=${results}'djxmssql\t\tFailure\n'
  return 12;
fi
echo " Successfully built djxmssql Data Access Module";
echo " Successfully built djxmssql Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r djxmssql >> /tmp/djxlink;
results=${results}'djxmssql\t\tSuccess\n'
return 0;
}

# ------
# Subroute to handle dblib DAM link-edit
# ------
dblib()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build dblib Data Access Module";
echo "Attempting to build dblib Data Access Module" >> /tmp/djxlink;
if [[ "${SYBASE}" = "" ]]; then
  echo '';
  echo ' Warning: Cannot prepare dblib Data Access Module for use';
  echo ' Problem: SYBASE environment variable not set';
  echo ' Corrective action: Please set SYBASE environment variable to';
  echo ' the path where your Open-Client is installed';
  echo '';
  results=${results}'dblib\t\tFailure\n'
  return 12;
fi
if [[ ! -f ${SYBASE}/lib/libsybdb.a ]]; then
  echo '';
  echo ' Warning: Cannot prepare dblib Data Access Module for use';
  echo ' Problem: SYBASE environment variable not correctly set';
  echo ' Corrective action: Please set SYBASE environment variable to';
  echo ' the path where your Open-Client is installed';
  echo '';
  results=${results}'dblib\t\tFailure\n'
  return 12;
fi
/bin/rm dblib >>/dev/null
/bin/ld -o dblib \
  -e _no_start \
  -bE:sybase.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  -K \
  -L${SYBASE}/lib \
  -lc \
  -lsybdb \
  libsybase.a \
  libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for dblib that would work";
  echo " Was not able to find link-edit parameters for dblib that would work" >> /tmp/djxlink;
  results=${results}'dblib\t\tFailure\n'
  return 12;
fi
echo " Successfully built dblib Data Access Module";
echo " Successfully built dblib Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r dblib >> /tmp/djxlink;
results=${results}'dblib\t\tSuccess\n'
return 0;
}

# ------
# Subroute to handle ctlib DAM link-edit
# ------
ctlib()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build ctlib Data Access Module";
echo "Attempting to build ctlib Data Access Module" >> /tmp/djxlink;
if [[ "${SYBASE}" = "" ]]; then
  echo ' Warning: Cannot prepare ctlib Data Access Module for use.';
  echo ' Problem: SYBASE environment variable not set';
  echo ' Corrective action: Please set SYBASE environment variable to the path where';
  echo ' Open-Client is installed and re-run djxlink.sh';
  results=${results}'ctlib\t\tFailure\n'
  return 12;
fi
if [[ ! -f ${SYBASE}/lib/libcs.a ]]; then
  echo ' Warning: Cannot prepare ctlib Data Access Module for use.';
  echo ' Problem: SYBASE environment variable not correctly set for ctlib';
  echo ' Corrective action: Please set SYBASE environment variable to the path where';
  echo ' Open-Client is installed and re-run djxlink.sh';
  results=${results}'ctlib\t\tFailure\n'
  return 12;
fi
/bin/rm ctlib >>/dev/null
echo " Trying variation 1"
echo " Trying variation 1" >> /tmp/djxlink
/bin/ld -o ctlib \
  -e _no_start \
  -bE:sybase.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  -K \
  -L${SYBASE}/lib \
  -lcomn.so \
  -lcs.so \
  -lct \
  -linsck \
  -lintl \
  -ltcl \
  -lxa \
  -lc \
  libsybasect.a \
  libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 2"
  echo " Trying variation 2" >> /tmp/djxlink
  /bin/ld -o ctlib \
    -e _no_start \
    -bE:sybase.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -L${SYBASE}/lib \
    -lcomn \
    -lcs \
    -lct \
    -linsck \
    -lintl \
    -ltcl \
    -lc \
    libsybasect.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for ctlib that would work";
  echo " Was not able to find link-edit parameters for ctlib that would work" >> /tmp/djxlink;
  results=${results}'ctlib\t\tFailure\n'
  return 12;
fi
echo " Successfully built ctlib Data Access Module";
echo " Successfully built ctlib Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r ctlib >> /tmp/djxlink;
results=${results}'ctlib\t\tSuccess\n'
return 0;
}

# ------
# Subroute to handle eda DAM link-edit
# ------
eda()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build eda Data Access Module";
echo "Attempting to build eda Data Access Module" >> /tmp/djxlink;
if [[ "${EDA_HOME}" = "" ]]; then
  echo ' Warning: Cannot prepare eda Data Access Module for use.';
  echo ' Problem: EDA_HOME environment variable not set';
  echo ' Corrective action: Please set EDA_HOME environment variable to the path where';
  echo ' edalink is installed and re-run djxlink.sh';
  results=${results}'eda\t\tFailure\n'
  return 12;
fi
if [[ ! -f ${EDA_HOME}/edalink/libeda.a ]]; then
  echo ' Warning: Cannot prepare eda Data Access Module for use.';
  echo ' Problem: EDA_HOME environment variable not correctly set';
  echo ' Corrective action: Please set EDA_HOME environment variable to the path where edalink';
  echo ' is installed and re-run djxlink.sh';
  results=${results}'eda\t\tFailure\n'
  return 12;
fi
/bin/rm eda >>/dev/null
/bin/ld -o eda \
  -e _no_start \
  -bE:djxmssql.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  -K \
  -lc \
  -L${EDA_HOME}/edaapi \
  -L${EDA_HOME}/edalink \
  -leda \
  -lhermes \
  -ltcp \
  -ledasys \
  libeda.a \
  libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for eda that would work";
  echo " Was not able to find link-edit parameters for eda that would work" >> /tmp/djxlink;
  results=${results}'eda\t\tFailure\n'
  return 12;
fi
echo " Successfully built eda Data Access Module";
echo " Successfully built eda Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r eda >> /tmp/djxlink;
results=${results}'eda\t\tSuccess\n';
return 0;
}

# ------
# Subroute to handle informix DAM link-edit
# ------
informix()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build informix Data Access Module";
echo "Attempting to build informix Data Access Module" >> /tmp/djxlink;
if [[ "${INFORMIX_HOME}" = "" ]]; then
  echo ' Warning: Cannot prepare informix Data Access Module for use.';
  echo ' Problem: INFORMIX_HOME environment variable not set';
  echo ' Corrective action: Please set INFORMIX_HOME environment variable to the path where';
  echo ' Informix*Net is installed and re-run djxlink.sh';
  results=${results}'informix\tFailure\n'
  return 12;
fi
if [[ ! -f ${INFORMIX_HOME}/lib/esql/libgen.a ]]; then
  echo ' Warning: Cannot prepare informix Data Access Module for use.';
  echo ' Problem: INFORMIX_HOME environment variable not correctly set';
  echo ' Corrective action: Please set INFORMIX_HOME environment variable to the path where';
  echo ' Informix*Net is installed and re-run djxlink.sh';
  results=${results}'informix\tFailure\n'
  return 12;
fi
/bin/rm informix >>/dev/null
/bin/ld -o informix \
  -e _no_start \
  -bE:informix.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  -K \
  -lc \
  -L${INFORMIX_HOME}/lib/esql \
  -lgen \
  -los \
  -lsql \
  libinformix.a \
  libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for informix that would work";
  echo " Was not able to find link-edit parameters for informix that would work" >> /tmp/djxlink;
  results=${results}'informix\tFailure\n'
  return 12;
fi
echo " Successfully built informix Data Access Module";
echo " Successfully built informix Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r informix >> /tmp/djxlink;
results=${results}'informix\tSuccess\n'
return 0;
}

# ------
# Subroute to handle informix7 DAM link-edit
# ------
informix7()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build informix7 Data Access Module";
echo "Attempting to build informix7 Data Access Module" >> /tmp/djxlink;
if [[ "${INFORMIX_HOME}" = "" ]]; then
  echo ' Warning: Cannot prepare informix Data Access Module for use.';
  echo ' Problem: INFORMIX_HOME environment variable not set';
  echo ' Corrective action: Please set INFORMIX_HOME environment variable to the path where';
  echo ' Informix*Net is installed and re-run djxlink.sh';
  results=${results}'informix7\tFailure\n'
  return 12;
fi
if [[ ! -d ${INFORMIX_HOME}/lib/esql ]]; then
  echo ' Warning: Cannot prepare informix Data Access Module for use.';
  echo ' Problem: INFORMIX_HOME environment variable not correctly set';
  echo ' Corrective action: Please set INFORMIX_HOME environment variable to the path where';
  echo ' Informix*Net is installed and re-run djxlink.sh';
  results=${results}'informix7\tFailure\n'
  return 12;
fi
/bin/rm informix7 >>/dev/null

##################################################################################
# We put the XA variations first so we pick up the oracle xa switch if it is there
##################################################################################
echo " Trying variation 1";
echo " Trying variation 1" >> /tmp/djxlink;
/bin/ld -o informix7 \
  -e _no_start \
  -bE:informix.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  -K \
  -lc \
  -L${INFORMIX_HOME}/lib/esql \
  -L${INFORMIX_HOME}/lib \
  ${INFORMIX_HOME}/lib/esql/libsql.o \
  ${INFORMIX_HOME}/lib/esql/libgen.o \
  ${INFORMIX_HOME}/lib/esql/libinfxxa.o \
  libinformix7.a \
  libinformix.a \
  libdjexits.a >> /tmp/djxlink 2>> /tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 2";
  echo " Trying variation 2" >> /tmp/djxlink;
  /bin/ld -o informix7 \
    -e _no_start \
    -bE:informix.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -lc \
    -L${INFORMIX_HOME}/lib/esql \
    -L${INFORMIX_HOME}/lib \
    -lixsqlshr \
    -lixgenshr \
    libinformix7.a \
    libinformix.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi;
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 3";
  echo " Trying variation 3" >> /tmp/djxlink;
  /bin/ld -o informix7 \
    -e _no_start \
    -Be:informix.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -lc \
    ${INFORMIX_HOME}/lib/esql/libsqlshr.o \
    ${INFORMIX_HOME}/lib/esql/libgenshr.o \
    libinformix7.a \
    libinformix.a \
    libdjexits.a >> /tmp/djxlink 2>> /tmp/djxlink;
  rc=$?;
fi;
if [[ $rc -ne 0 ]]; then
  echo " Trying permutation 4";
  echo " Trying permutation 4" >> /tmp/djxlink;
  /bin/ld -o informix7 \
    -e _no_start \
    -bE:informix.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -lc \
    ${INFORMIX_HOME}/lib/esql/libsql.o \
    ${INFORMIX_HOME}/lib/esql/libgen.o \
    libinformix7.a \
    libinformix.a \
    libdjexits.a >> /tmp/djxlink 2>> /tmp/djxlink;
  rc=$?;
fi;
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for informix7 that would work";
  echo " Was not able to find link-edit parameters for informix7 that would work" >> /tmp/djxlink;
  results=${results}'informix7\tFailure\n'
  return 12;
fi
echo " Successfully built informix7 Data Access Module";
echo " Successfully built informix7 Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r informix7 >> /tmp/djxlink;
results=${results}'informix7\tSuccess\n'
return 0;
}

# ------
# Subroute to handle sqlnet DAM link-edit
# ------
sqlnet()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build sqlnet Data Access Module";
echo "Attempting to build sqlnet Data Access Module" >> /tmp/djxlink;
if [ "${ORACLE_HOME}" = "" ]; then
  echo ' Warning: Cannot prepare sqlnet Data Access Module for use.';
  echo ' Problem: ORACLE_HOME environment variable not set';
  echo ' Corrective action: Please set ORACLE_HOME environment variable to the path';
  echo ' where sql*net is installed and re-run djxlink.sh';
  results=${results}'sqlnet\tFailure\n'
  return 12;
fi
if [[ ! -d ${ORACLE_HOME}/lib ]]; then
  echo ' Warning: Cannot prepare sqlnet Data Access Module for use.';
  echo ' Problem: ORACLE_HOME environment variable not correctly set';
  echo ' Corrective action: Please set ORACLE_HOME environment variable to the path';
  echo ' where sql*net is installed and re-run djxlink.sh';
  results=${results}'sqlnet\tFailure\n'
  return 12;
fi

##################################################################################
# We put the XA variations first so we pick up the oracle xa switch if it is there
##################################################################################

/bin/rm sqlnet >>/dev/null

###############################################
# Often works for Oracle 7.1.4 link edit parms
###############################################
echo " Trying variation 1"
echo " Trying variation 1" >> /tmp/djxlink
/bin/ld -o sqlnet \
  -e _no_start \
  -bE:oracle.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  -K \
  -lc \
  -lld \
  -lm \
  -bI:${ORACLE_HOME}/lib/mili.exp \
  ${ORACLE_HOME}/lib/osntab.o \
  -L${ORACLE_HOME}/lib \
  -lnlsrtl \
  -lsql \
  -locic \
  -lcore \
  -lora \
  -lcv6 \
  -lsqlnet \
  -lnetwork \
  -lxa \
  liboracle.a \
  libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;

###############################################
# Often works for Oracle 7.0.16 link edit parms
###############################################
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 2"
  echo " Trying variation 2" >> /tmp/djxlink
  /bin/ld -o sqlnet \
    -e _no_start \
    -bE:oracle.exp \
    -bM:SRE \
    -K \
    -lc \
    -lld \
    -lm \
    -bI:${ORACLE_HOME}/lib/mili.exp \
    -bI:libdb2e.exp \
    ${ORACLE_HOME}/lib/osntab.o \
    ${ORACLE_HOME}/lib/libnlsrtl.a \
    ${ORACLE_HOME}/lib/libsql.a \
    ${ORACLE_HOME}/lib/libocic.a \
    ${ORACLE_HOME}/lib/libcore.a \
    ${ORACLE_HOME}/lib/libora.a \
    ${ORACLE_HOME}/lib/libcv6.a \
    ${ORACLE_HOME}/lib/libsqlnet.a \
    ${ORACLE_HOME}/lib/libnetwork.a \
    liboracle.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 3"
  echo " Trying variation 3" >> /tmp/djxlink
  /bin/ld -o sqlnet \
    -e _no_start \
    -bE:oracle.exp \
    -bM:SRE \
    -K \
    -lc \
    -lld \
    -lm \
    -bI:libdb2e.exp \
    ${ORACLE_HOME}/lib/osntab.o \
    ${ORACLE_HOME}/lib/libnlsrtl.a \
    ${ORACLE_HOME}/lib/libsql.a \
    ${ORACLE_HOME}/lib/libocic.a \
    ${ORACLE_HOME}/lib/libcore.a \
    ${ORACLE_HOME}/lib/libora.a \
    ${ORACLE_HOME}/lib/libcv6.a \
    ${ORACLE_HOME}/lib/libsqlnet.a \
    ${ORACLE_HOME}/lib/libnetwork.a \
    liboracle.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi

###############################################
# Often works for Oracle 7.1.4 link edit parms
###############################################
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 4"
  echo " Trying variation 4" >> /tmp/djxlink
  /bin/ld -o sqlnet \
    -e _no_start \
    -bE:oracle.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -lc \
    -lld \
    -lm \
    -bI:${ORACLE_HOME}/lib/mili.exp \
    ${ORACLE_HOME}/lib/osntab.o \
    -L${ORACLE_HOME}/lib \
    -lnlsrtl \
    -locic \
    -lcore \
    -lora \
    -lcv6 \
    -lsql \
    -lsqlnet \
    -lnetwork \
    liboracle.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi

################################################
# Often works for Oracle 7.2.2.3 link edit parms
################################################
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 5"
  echo " Trying variation 5" >> /tmp/djxlink
  /bin/ld -o sqlnet \
    -e _no_start \
    -bE:oracle.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -lc \
    -lld \
    -lm \
    -bI:${ORACLE_HOME}/lib/mili.exp \
    ${ORACLE_HOME}/lib/osntab.o \
    -L${ORACLE_HOME}/lib \
    -lnlsrtl \
    -locic \
    -lcore \
    -lora \
    -lcv6 \
    -lsqlnet \
    -lnttcp \
    -lnttli \
    -lcore3 \
    -lnlsrtl3 \
    liboracle.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi

################################################
# Often works for Oracle 7.2.2.3 link edit parms
################################################
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 6"
  echo " Trying variation 6" >> /tmp/djxlink
  /bin/ld -o sqlnet \
    -e _no_start \
    -bE:oracle.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -lc \
    -lld \
    -lm \
    -bI:${ORACLE_HOME}/lib/mili.exp \
    ${ORACLE_HOME}/lib/osntab.o \
    -L${ORACLE_HOME}/lib \
    -lnlsrtl \
    -locic \
    -lcore \
    -lora \
    -lcv6 \
    -lsqlnet \
    -lsql \
    -lnttcp \
    -lnttli \
    -lcore3 \
    -lnlsrtl3 \
    liboracle.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi

##############################################
# Often works for Oracle 7.3.2 link edit parms
##############################################
if [[ $rc -ne 0 ]]; then
  echo " Trying variation 7"
  echo " Trying variation 7" >> /tmp/djxlink
  /bin/ld -o sqlnet \
    -e _no_start \
    -bE:oracle.exp \
    -bI:libdb2e.exp \
    -bM:SRE \
    -K \
    -lc \
    -lld \
    -lm \
    -bI:${ORACLE_HOME}/lib/mili.exp \
    ${ORACLE_HOME}/lib/osntab.o \
    -L${ORACLE_HOME}/lib \
    -lepc \
    -lclient \
    -lcommon \
    -lgeneric \
    -lncr \
    -lnlsrtl \
    -lnlsrtl3 \
    -lcore \
    -lcv6 \
    -lsqlnet \
    -lcore3 \
    -lsql \
    liboracle.a \
    libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
  rc=$?;
fi

if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for sqlnet that would work";
  echo " Was not able to find link-edit parameters for sqlnet that would work" >> /tmp/djxlink;
  results=${results}'sqlnet\t\tFailure\n'
  return 12;
fi
echo " Successfully built sqlnet Data Access Module";
echo " Successfully built sqlnet Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r sqlnet >> /tmp/djxlink;
results=${results}'sqlnet\t\tSuccess\n'
return 0;
}

# ------
# Subroute to handle net8 DAM link-edit
# ------
net8()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build net8 Data Access Module";
echo "Attempting to build net8 Data Access Module" >> /tmp/djxlink;
/bin/rm net8 >>/dev/null;
/bin/ld -o net8 \
  -L${ORACLE_HOME}/lib/ -L${ORACLE_HOME}/rdbms/lib \
  ${ORACLE_HOME}/rdbms/lib/defopt.o ${ORACLE_HOME}/rdbms/lib/ssdbaed.o \
  -e _no_start \
  -bE:oracle.exp \
  -bI:libdb2e.exp \
  -bM:SRE \
  ${ORACLE_HOME}/lib/nautab.o ${ORACLE_HOME}/lib/naeet.o \
  ${ORACLE_HOME}/lib/naect.o ${ORACLE_HOME}/lib/naedhs.o \
  ${ORACLE_HOME}/rdbms/lib/xaondy.o \
  -lnetv2 -lnttcp -lnetwork -lncr -lnetv2 -lnttcp -lnetwork -lclient -lcommon \
  -lgeneric -lmm -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lnetv2 -lnttcp \
  -lnetwork -lncr -lnetv2 -lnttcp -lnetwork -lclient -lcommon -lgeneric \
  -lepc -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lclient -lcommon \
  -lgeneric -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lld -lm \
  /lib/crt0_r.o -lc_r -lpthreads -lodm -lm \
  liboracle8.a \
  libdjexits.a \
  -lclntsh >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for net8 that would work";
  echo " Was not able to find link-edit parameters for net8 that would work" >> /tmp/djxlink;
  results=${results}'net8\t\tFailure\n'
  return 12;
fi
echo " Successfully built net8 Data Access Module";
echo " Successfully built net8 Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r net8 >> /tmp/djxlink;
results=${results}'net8\t\tSuccess\n'
return 0;
}

# ------
# Subroute to handle xaccess DAM link-edit
#
# For CrossAccess v2.1, the ln's are only needed if the library path in
# the file named apif.V2R1M00 is /usr/lpp/CrossAccess/V2R1M00/lib and
# apif.V2R1M00 is installed in /usr/CrossAccess/UAV2R1M00/lib.
# ------
xaccess()
{
echo "\n********************************************\n" >> /tmp/djxlink;
echo "Attempting to build xaccess Data Access Module";
echo "Attempting to build xaccess Data Access Module" >> /tmp/djxlink;
ln -s -f /usr/CrossAccess/UAV2R1M00/lib/asn1.V2R1M00 `pwd`/asn1.V2R1M00
ln -s -f /usr/CrossAccess/UAV2R1M00/lib/xvhs.V2R1M00 `pwd`/xvhs.V2R1M00
/bin/rm xaccess >>/dev/null;
/bin/ld -o xaccess \
  -e _no_start \
  -bE:Xaccess.exp \
  -L/usr/CrossAccess/UAV2R1M00/lib \
  -bI:libdb2e.exp \
  -bI:/usr/CrossAccess/UAV2R1M00/samples/apif.imp \
  -bM:SRE \
  -K \
  -lc \
  libxaccess.a \
  libdjexits.a >>/tmp/djxlink 2>>/tmp/djxlink;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo " Was not able to find link-edit parameters for xaccess that would work";
  echo " Was not able to find link-edit parameters for xaccess that would work" >> /tmp/djxlink;
  results=${results}'xaccess\t\tFailure\n'
  return 12;
fi
echo " Successfully built xaccess Data Access Module";
echo " Successfully built xaccess Data Access Module" >> /tmp/djxlink;
/bin/chmod uog+r xaccess >> /tmp/djxlink;
results=${results}'xaccess\t\tSuccess\n'
return 0;
}

# ------
# Start of main-line function
# ------

#
# Display usage information if any parameters supplied
#
if [[ $# -ne 0 ]]; then
  echo 'djxlink.sh:DB2 DataJoiner Data Access Module Linker';
  echo ' Link edit DB2 DataJoiner Data Access Modules with various';
  echo ' vendors client access packages to prepare the Data Access';
  echo ' module for use.';
  echo ' Usage: djxlink.sh';
  return 12;
fi

#
# Clean out the file we will use to log errors
#
/bin/rm /tmp/djxlink >> /dev/null 2>> /dev/null;

#
# Clear the screen and start displaying progress
#
# We send output of some 'odd' stuff (such as echo) to /dev/null because if we are not root
# (which we have not yet verified) we may not have write access to /tmp/djxlink
#
/bin/clear;
echo 'Starting djxlink.sh on '`hostname`' at '`date`;
echo 'Starting djxlink.sh on '`hostname`' at '`date` >> /tmp/djxlink 2>> /dev/null;
echo '';
echo '';

#
# We should be root already, exit if we are not
#
# We send output of some 'odd' stuff (such as echo) to /dev/null because if we are not root
# (which we have not yet verified) we may not have write access to /tmp/djxlink
#
if [[ `/bin/whoami` != 'root' ]]; then
  echo 'Error: Not root user';
  echo 'Problem: This script can only be executed as root';
  echo 'Corrective action: Log out and log back in as root and rexecute djxlink.sh';
  echo ''
  echo ''
  echo 'Finished djxlink.sh on '`hostname`' at '`date`
  echo 'Finished djxlink.sh on '`hostname`' at '`date` >> /tmp/djxlink 2>> /dev/null;
  echo ''
  echo ''
  return 12;
fi

#
# We should already be in this directory but we assure that we are
#
cd /usr/lpp/djx_02_01_01/lib;
rc=$?;
if [[ $rc -ne 0 ]]; then
  echo 'Error: Cannot find the /usr/lpp/djx_02_01_01/lib subdirectory';
  echo 'Problem: DataJoiner V2 not correctly installed';
  echo 'Corrective action: If DB2 DataJoiner V2 is not installed please install';
  echo ' it. If DataJoiner V2 is installed please call IBM';
  echo ' for assistance.';
  echo ''
  echo ''
  echo 'Finished djxlink.sh on '`hostname`' at '`date`
  echo 'Finished djxlink.sh on '`hostname`' at '`date` >> /tmp/djxlink
  echo ''
  echo ''
  return 12;
fi

#
# Track success/failure of various DAMs using 'results' variable. Prime it
# with header information.
#
results='\n\nResults of djxlink\n------\n';

#
# Link each Data Access Module (DAM)
#
# Link edit mssql DAM
mssql;
# Link edit dblib DAM
dblib;
# Link edit ctlib DAM
ctlib;
# Link edit eda DAM
eda;
# Link edit informix DAM
informix;
# Link edit informix7 DAM
informix7;
# Link edit sqlnet DAM
sqlnet;
# Link edit net8 DAM
net8;
# Link edit xaccess DAM
xaccess;

#
# Finish up
#
echo "\n********************************************\n" >> /tmp/djxlink;
echo ${results}
echo ${results} >> /tmp/djxlink
echo 'Finished djxlink.sh on '`hostname`' at '`date`
echo 'Finished djxlink.sh on '`hostname`' at '`date` >> /tmp/djxlink
echo ''
echo ''
return 0

A.2 The djxlink.makefile Script

#-----------------------------------------------------------------------
# (C) COPYRIGHT International Business Machines Corp. 1995
# All Rights Reserved
# Licensed Materials - Property of IBM
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
#
#
# Please read the DataJoiner Planning, Installation, and Configuration
# Guide before using this makefile.
#
#
# FUNCTION: This makefile shows sample link-edit statements used to
#           build data source access modules. If you want to use it,
#           first verify the ld commands specify the correct directories
#           and libraries for your system. Change any that aren't.
#
# usage: make -f djxlink.makefile nnnnnnnn
#
#        where 'nnnnnnnn' is one of:
#            mssqlodbc
#            dblib
#            ctlib
#            sqlnet
#            Eda
#            Xaccess
#            Generic
#            informix
#            informix7
#            informix7c
#            informix72
#
#
# Note: This makefile must be run under root from the
#       /usr/lpp/djx_vv_rr/lib directory.
#
#-----------------------------------------------------------------------
main: dblib ctlib sqlnet Eda Xaccess Generic informix informix7 informix7c informix72 mssqlodbc

# ----------------------------------------------------------------------
# Accessing Data Sources Through Visigenic ODBC Drivers
# ----------------------------------------------------------------------
DJX_ODBC_LIBRARY_PATH = ./visigenics
DJX_ODBC_LIBRARY = -lodbc

mssqlodbc: djxmssql.exp libdb2e.exp libmssql.a
        ld -o mssqlodbc \
            -e _no_start \
            -bE:djxmssql.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K -lc \
            -L. \
            -L$(DJX_ODBC_LIBRARY_PATH) \
            $(DJX_ODBC_LIBRARY) \
            -lmssql

# ----------------------------------------------------------------------
# Accessing Data Sources Through Sybase Open Client
# ----------------------------------------------------------------------
dblib: libsybase.a sybase.exp
        ld -o dblib \
            -e _no_start \
            -bE:sybase.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K \
            -L$(SYBASE)/lib \
            -lc \
            -lsybdb \
            libsybase.a \
            libdjexits.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through Sybase Open Client Using Multiple
# Levels of Open Client CT-Lib.
# ----------------------------------------------------------------------
ctlib: libsybasect.a sybase.exp
        ld -o ctlib \
            -e _no_start \
            -bE:sybase.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K \
            -L$(SYBASE)/lib \
            -lcomn \
            -lcs \
            -lct \
            -linsck \
            -lintl \
            -ltcl \
            -lc \
            libsybasect.a \
            libdjexits.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through Oracle SQL*Net
#
# Note: Oracle changes the link-edit requirements depending on a great
#       many factors, one of which is the version and level of sql*net
#       you have installed. Below is a sample of different commands that
#       are likely to work for different version/releases of sql*net.
#
#       Please uncomment the section that most closely matches your
#       environment.
#
#       Please consult the DataJoiner FAQ for more up-to-date information
#       at: http://www.software.ibm.com/data/datajoiner/faq.html
# ----------------------------------------------------------------------
# Oracle 7.0.16
#sqlnet: liboracle.a oracle.exp
#        ld -o sqlnet \
#            -e _no_start \
#            -bE:oracle.exp \
#            -bI:libdb2e.exp \
#            -bM:SRE \

#            -K -lc -lld -lm \
#            -bI:$(ORACLE_HOME)/lib/mili.exp \
#            $(ORACLE_HOME)/lib/osntab.o \
#            $(ORACLE_HOME)/lib/libnlsrtl.a \
#            $(ORACLE_HOME)/lib/libsql.a \
#            $(ORACLE_HOME)/lib/libocic.a \
#            $(ORACLE_HOME)/lib/libcore.a \
#            $(ORACLE_HOME)/lib/libora.a \
#            $(ORACLE_HOME)/lib/libcv6.a \
#            $(ORACLE_HOME)/lib/libsqlnet.a \
#            $(ORACLE_HOME)/lib/libnetwork.a \
#            liboracle.a \
#            libdjexits.a

# Oracle 7.1.4
#sqlnet: liboracle.a oracle.exp
#        ld -o sqlnet \
#            -e _no_start \
#            -bE:oracle.exp \
#            -bI:libdb2e.exp \
#            -bM:SRE \
#            -K -lc -lld -lm \
#            -bI:$(ORACLE_HOME)/lib/mili.exp \
#            $(ORACLE_HOME)/lib/osntab.o \
#            -L$(ORACLE_HOME)/lib \
#            -lnlsrtl -locic -lcore -lora -lcv6 -lsqlnet -lnetwork \
#            liboracle.a \
#            libdjexits.a

# Oracle 7.2.2.3
#sqlnet: liboracle.a oracle.exp
#        ld -o sqlnet \
#            -e _no_start \
#            -bE:oracle.exp \
#            -bI:libdb2e.exp \
#            -bM:SRE \
#            -K -lc -lld -lm \
#            -bI:$(ORACLE_HOME)/lib/mili.exp \
#            $(ORACLE_HOME)/lib/osntab.o \
#            -L$(ORACLE_HOME)/lib \
#            -lnlsrtl -locic -lcore -lora -lcv6 -lsqlnet -lnttcp -lnttli -lcore3 -lnlsrtl3 \
#            liboracle.a \
#            libdjexits.a

# Oracle 7.3.2
#sqlnet: liboracle.a oracle.exp
#        ld -o sqlnet \
#            -e _no_start \
#            -bE:oracle.exp \
#            -bI:libdb2e.exp \
#            -bM:SRE \
#            -K -lc -lld -lm \
#            -bI:$(ORACLE_HOME)/lib/mili.exp \
#            $(ORACLE_HOME)/lib/osntab.o \
#            -L$(ORACLE_HOME)/lib \
#            -lepc -lclient -lcommon -lgeneric -lncr -lnlsrtl -lnlsrtl3 -lcore -lcv6 -lsqlnet -lcore3 -lsql \
#            liboracle.a \
#            libdjexits.a

# Oracle 8.0 (net8)
net8: liboracle8.a oracle.exp
        ld -o net8 \
            -L$(ORACLE_HOME)/lib/ -L$(ORACLE_HOME)/rdbms/lib \
            $(ORACLE_HOME)/rdbms/lib/defopt.o $(ORACLE_HOME)/rdbms/lib/ssdbaed.o \
            -e _no_start \
            -bE:oracle.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            $(ORACLE_HOME)/lib/nautab.o $(ORACLE_HOME)/lib/naeet.o \
            $(ORACLE_HOME)/lib/naect.o $(ORACLE_HOME)/lib/naedhs.o \
            $(ORACLE_HOME)/rdbms/lib/xaondy.o \
            -lnetv2 -lnttcp -lnetwork -lncr -lnetv2 -lnttcp -lnetwork -lclient -lcommon \
            -lgeneric -lmm -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lnetv2 -lnttcp \
            -lnetwork -lncr -lnetv2 -lnttcp -lnetwork -lclient -lcommon -lgeneric \
            -lepc -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lclient -lcommon \
            -lgeneric -lnlsrtl3 -lcore4 -lnlsrtl3 -lcore4 -lnlsrtl3 -lld -lm \
            /lib/crt0_r.o -lc_r -lpthreads -lodm -lm \
            liboracle8.a \
            libdjexits.a \
            -lclntsh

# ----------------------------------------------------------------------
# Accessing Data Sources Through EDA/SQL
# ----------------------------------------------------------------------
Eda: libeda.a Eda.exp
        ld -o Eda \
            -e _no_start \
            -bE:Eda.exp \
            -bI:libdb2e.exp \
            -K -lc \
            -L/home/eda/EDA2.1/PRODUCT/edaapi \
            -L/home/eda/EDA2.1/PRODUCT/edalink \
            -leda -lhermes -ltcp -ledasys \
            libeda.a \
            libdjexits.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through the Generic Data Access Module
#
# You need to change two lines in the following link-edit statement
# before trying to make the Generic access module:
#
# 1. Change "$(DJX_GENERIC_LIBRARY_PATH)" to the directory where your
#    generic driver's libraries are.
# 2. Change "$(DJX_GENERIC_LIBRARY)" to the name of your generic driver's
#    library. If your generic driver has more than one library,
#    simply add another "-l" to the ld statement.
#
# ----------------------------------------------------------------------
Generic: generic_nofunc.o libgeneric.a generic.exp
        ld -o Generic \
            -e _no_start \
            -bE:generic.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K -lc \
            -L$(DJX_GENERIC_LIBRARY_PATH) \
            -l$(DJX_GENERIC_LIBRARY) \
            libgeneric.a \

            generic_nofunc.o \
            libdjexits.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through CrossAccess 2.1
#
# For CrossAccess v2.1, the ln's are only needed if the library path in
# the file named apif.V2R1M00 is /usr/lpp/CrossAccess/V2R1M00/lib and
# apif.V2R1M00 is installed in /usr/CrossAccess/UAV2R1M00/lib.
# ----------------------------------------------------------------------
Xaccess: libxaccess.a Xaccess.exp libdjexits.a
        ln -s -f /usr/CrossAccess/UAV2R1M00/lib/asn1.V2R1M00 `pwd`/asn1.V2R1M00
        ln -s -f /usr/CrossAccess/UAV2R1M00/lib/xvhs.V2R1M00 `pwd`/xvhs.V2R1M00
        ld -o Xaccess \
            -e _no_start \
            -bE:Xaccess.exp \
            -L/usr/CrossAccess/UAV2R1M00/lib \
            -bI:libdb2e.exp \
            -bI:/usr/CrossAccess/UAV2R1M00/samples/apif.imp \
            -bM:SRE \
            -K -lc \
            libxaccess.a \
            libdjexits.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through the Informix Data Access Module
# (Informix Version 5)
#
# You need to change one line in the following link-edit statement
# before trying to make the Informix access module:
#
# 1. Change "$(DJX_INFORMIX_LIBRARY_PATH)" to the directory where your
#    Informix Version 5 driver's libraries are.
#
# ----------------------------------------------------------------------
DJX_INFORMIX_LIBRARY_PATH = /home/informix/lib/esql
DJX_INFORMIX_LIBRARY = -l gen -l os -l sql

informix: libinformix.a informix.exp
        ld -o informix \
            -e _no_start \
            -bE:informix.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K -lc \
            -L$(DJX_INFORMIX_LIBRARY_PATH) \
            $(DJX_INFORMIX_LIBRARY) \
            libinformix.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through the Informix Data Access Module
# (Informix Version 7) using the ESQL/RT library
#
# You need to change one line in the following link-edit statement
# before trying to make the Informix access module:
#
# 1. Change "$(DJX_INFORMIX7_LIBRARY_PATH)" to the directory where your
#    Informix Version 7 ESQL/RT driver's libraries are.
#
# ----------------------------------------------------------------------
DJX_INFORMIX7_LIBRARY_PATH = /home/informix7/lib/esql

informix7: libinformix7.a libinformix.a informix.exp

        ld -o informix7 \
            -e _no_start \
            -bE:informix.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K -lc \
            $(DJX_INFORMIX7_LIBRARY_PATH)/libsqlshr.o \
            $(DJX_INFORMIX7_LIBRARY_PATH)/libgenshr.o \
            libinformix7.a \
            libinformix.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through the Informix Data Access Module
# (Informix Version 7) using the ESQL/C library
#
# You need to change one line in the following link-edit statement
# before trying to make the Informix access module:
#
# 1. Change "$(DJX_INFORMIX7_LIBRARY_PATH)" to the directory where your
#    Informix Version 7 ESQL/C driver's libraries are.
#
# ----------------------------------------------------------------------
DJX_INFORMIX7_LIBRARY_PATH = /home/informix7/lib/esql

informix7c: libinformix7.a libinformix.a informix.exp
        ld -o informix7c \
            -e _no_start \
            -bE:informix.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K -lc \
            $(DJX_INFORMIX7_LIBRARY_PATH)/libsql.o \
            $(DJX_INFORMIX7_LIBRARY_PATH)/libgen.o \
            libinformix7.a \
            libinformix.a

# ----------------------------------------------------------------------
# Accessing Data Sources Through the Informix Data Access Module
# (Informix Version 7.2)
#
# You need to change one line in the following link-edit statement
# before trying to make the Informix access module:
#
# 1. Change "$(DJX_INFORMIX72_LIBRARY_PATH)" to the directory where your
#    Informix Version 7.2 driver's libraries are.
#
# ----------------------------------------------------------------------
DJX_INFORMIX72_LIBRARY_PATH = /home/informix7/lib/esql
DJX_INFORMIX72_LIBRARY = -lixsqlshr -lixgenshr

informix72: libinformix.a libinformix7.a informix.exp
        ld -o informix72 \
            -e _no_start \
            -bE:informix.exp \
            -bI:libdb2e.exp \
            -bM:SRE \
            -K -lc \
            -L$(DJX_INFORMIX72_LIBRARY_PATH) \
            $(DJX_INFORMIX72_LIBRARY) \
            libinformix7.a \
            libinformix.a

Appendix B. Data Mining Functions

In this appendix we describe each data mining function in detail to provide you with an understanding of its mathematical concepts and uses.

B.1 Linear Discriminant Analysis
In its simplest form a discriminant function is a linear combination of independent variables $X_i$, $i = 0, \ldots, n$, that best discriminates a dependent variable $Y$:

$$y = \alpha_n x_n + \alpha_{n-1} x_{n-1} + \cdots + \alpha_0$$

Our goal is to optimize the values of the coefficients $\alpha_i$, $i = 0, \ldots, n$, based on the available historical data.

Given a data set of examples with attributes $x_1, x_2$ and two membership classes $C_1, C_2$, the problem can be represented as an equation:

$$y(x_1, x_2) = \alpha_2 x_2 + \alpha_1 x_1 + \alpha_0$$

such that an example $(x_1', x_2')$ will be assigned to class $C_1$ if $y(x_1', x_2') < 0$ and to class $C_2$ if $y(x_1', x_2') \ge 0$.

If we interpret the discriminant equation geometrically, the decision boundary corresponds to a $(d-1)$-dimensional hyperplane in a $d$-dimensional space. We can apply a number of different methods to determine the position and orientation of this decision boundary: Fisher's linear discriminant, least squares, and single-layer ANNs (the Perceptron) are some of them. Figure 123 on page 186 illustrates how the Perceptron can be used to define linear decision boundaries.

Figure 123. Linear Discriminant Analysis

(a) A linear decision boundary is represented by $y(x) = 0$ in a two-dimensional Euclidean space. The orientation of the line is determined by the parameter vector $\alpha = (\alpha_1, \alpha_2)$, and its distance from $(0, 0)$ is determined by $\alpha_0$.

(b) The optimum values of the parameters can be found by using the Perceptron's iterative learning algorithm.
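To make the iterative learning rule concrete, here is a minimal sketch in Python (our illustration, not part of Intelligent Miner; the class labels are assumed to be coded as -1 and +1, and the learning rate is arbitrary):

    import numpy as np

    def train_perceptron(X, y, epochs=100, lr=0.1):
        """Learn a linear boundary y(x) = alpha_0 + alpha_1*x_1 + ... by the
        perceptron rule. X is an (N, d) array; y holds labels -1 or +1."""
        # Prepend a constant column x_0 = 1 so that alpha_0 is learned too.
        Xb = np.hstack([np.ones((X.shape[0], 1)), X])
        alpha = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            for xi, target in zip(Xb, y):
                # Assign to class C2 if y(x) >= 0, otherwise to C1.
                pred = 1 if xi @ alpha >= 0 else -1
                # Adjust the parameter vector only on a misclassification.
                if pred != target:
                    alpha += lr * target * xi
        return alpha

On linearly separable data this loop converges to a parameter vector defining a separating hyperplane; on nonseparable data it keeps oscillating, which is one motivation for the generalized discriminants described below.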

Another form of linear discriminant analysis is logistic discrimination, where we apply a monotonically increasing function (activation function) to the linear combination of the input variables. This yields a discriminant output that can be interpreted as posterior probabilities, which is more convenient in classification problems.

In the simplistic approach to discriminant analysis we have just described, we assumed that the decision boundaries can be linearly separated using a straight or an S-shaped line (see Figure 124 on page 187). A generalized linear discriminant permits a range of possible decision boundaries. This can be achieved by defining a set of $M$ fixed nonlinear functions $f_j(x)$, called basis functions. The discriminant function will then take the form:

$$y(x) = \sum_{j=1}^{M} \alpha_j f_j(x) + \alpha_0$$

This function can indeed approximate, with an arbitrary degree of accuracy, any uniformly continuous function over compacta.

Figure 124. Complexity of Decision Boundaries

(a) and (b) show a line defined by the Perceptron with a scale and logistic sigmoid output function, respectively. (c) corresponds to a generalized linear function. Clearly, it can cope with a much more complex space of examples.

B.2 Linear Regression
With linear discriminant analysis our goal was to approximate the probabilities of membership of the different classes, represented as functions of the input space. However, we can also approximate a function in terms of an average over a quantity (regression) and use this approximation for predicting or forecasting the value of a dependent variable.

Linear regression is concerned with estimating or forecasting a dependent variable $Y$, called the response, under a given set of conditions. The conditions under which we try to estimate or forecast the value of $Y$ are represented through a set of $m$ independent variables $X_i$, $i = 1, \ldots, m$. Provided a data set that holds a series of $n$ observations, we can represent historical knowledge and the way that the predictors influence the response through a design matrix:

$$X = \begin{bmatrix} x_{11} & \cdots & x_{m1} & y_1 \\ x_{12} & \cdots & x_{m2} & y_2 \\ \vdots & & \vdots & \vdots \\ x_{1n} & \cdots & x_{mn} & y_n \end{bmatrix}$$

Furthermore, the model of estimation can be represented in a general form as:

$$Y = \alpha_m x_m + \cdots + \alpha_1 x_1 + \alpha_0 + \varepsilon \qquad (1)$$

where $\varepsilon$ is the error term.

With the above linear model, the problem can be represented as a system of equations to be solved:

$$\begin{aligned}
y_1 &= \alpha_m x_{m1} + \cdots + \alpha_1 x_{11} + \alpha_0 + \varepsilon_1 \\
y_2 &= \alpha_m x_{m2} + \cdots + \alpha_1 x_{12} + \alpha_0 + \varepsilon_2 \\
&\;\;\vdots \\
y_n &= \alpha_m x_{mn} + \cdots + \alpha_1 x_{1n} + \alpha_0 + \varepsilon_n
\end{aligned}$$

In order to use statistical analysis to calculate the coefficients $\alpha_i$ and perform hypothesis testing, we make the assumption that $\varepsilon \sim N(0, \sigma^2)$. This assumption is critical to the robustness of the regression model. Heavy-tailed probability distributions of the independent variables will result in correlated error terms, a common phenomenon in real-world data mining operations.

Having made the necessary assumptions, our problem now is to find the optimal values for the coefficients, provided that the linear model described in (1) is valid. We can apply the least squares method, minimizing the sum of the squared error terms:

$$\sum_{i=1}^{n} \varepsilon_i^2$$

After computing the coefficients, we have to test the hypothesis that our linear model represents a strong relationship between the independent and dependent variables.

The null and alternative hypotheses are:

$$H_0\colon\; \alpha_1 = \alpha_2 = \cdots = \alpha_m = 0$$

$$H_1\colon\; \text{at least one of } \alpha_1, \ldots, \alpha_m \ne 0$$

An F-test can be constructed as follows:

If

$$F = \frac{\sum (\hat{y}_i - \bar{y})^2 / m}{\sum (y_i - \hat{y}_i)^2 / (n - m - 1)} > F_{m,\, n-m-1,\, a}$$

then we can reject $H_0$; if $F < F_{m,\, n-m-1,\, a}$, then we cannot reject $H_0$. Here $F$ is the ratio denoting the significance of the variance explained by the regression line compared to the sum of squared errors, $F_{m,\, n-m-1,\, a}$ is the critical value, and $a$ denotes the significance level.

In the case that we cannot reject the null hypothesis, there is not enough evidence to justify the existence of a linear relation between $Y$ and $X_1, X_2, \ldots, X_m$. However, we cannot reject the possibility of a nonlinear relationship (polynomial or exponential) between the dependent and independent variables.
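To illustrate the least squares fit and the F-test just described, the following Python sketch (ours; the data and the critical value are supplied by the reader, who would normally take the latter from F tables or a statistics library) computes the coefficients and the F statistic:

    import numpy as np

    def fit_linear_model(X, y):
        """Least squares fit of y = a_m*x_m + ... + a_1*x_1 + a_0 + e for an
        (n, m) design matrix X, returning coefficients and the F statistic."""
        n, m = X.shape
        Xb = np.hstack([np.ones((n, 1)), X])            # intercept column for a_0
        alpha, *_ = np.linalg.lstsq(Xb, y, rcond=None)  # minimizes sum of e_i^2
        y_hat = Xb @ alpha
        ss_reg = np.sum((y_hat - y.mean()) ** 2)        # variance explained
        ss_err = np.sum((y - y_hat) ** 2)               # sum of squared errors
        F = (ss_reg / m) / (ss_err / (n - m - 1))
        return alpha, F

If the computed F exceeds the critical value, we reject H0 and accept the linear model as significant.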

B.3 Tree-Structured Classifiers
In decision tree induction a classification tree partitions the space of possible observations (root) into subregions (nodes) until we reach a terminal node (leaf). Thus a decision tree can be seen as a hierarchical way to describe a partition of the space of examples; it eventually classifies an example by the label of the leaf the example reaches.

In each nonterminal node we apply a question on which the next split is based. The question may have the form of a statistical test or of a function measuring the importance of a feature. The feature that has been chosen by the splitting criterion is used for partitioning the node in question.

Consider a set of attributes $A = \{A_1, A_2, \ldots, A_n\}$ on which a node might be split, and a set of classes $C = \{C_1, C_2, \ldots, C_k\}$. Given a set of classified examples $E$, there is a known probability distribution $p_{ij}$ over attributes and classes $E \times C$ and a marginal distribution $p_j$ over $C$. If we consider the feature $A_i$, which has levels $\alpha_1, \alpha_2, \ldots, \alpha_m$, to grow the tree, the probability distribution in the child leaf corresponding to $A_i = \alpha_t$ would be $p(C_j \mid \alpha_t) = p_{tj} / p_t$ over the $k$ classes in $C$.

We can evaluate our decision to use attribute $A_i$ in partitioning the example space by measuring the impurity of the child nodes. Two commonly used measures of impurity are:

The Gini index: $$f(p) = \sum_{i \ne j} p_i p_j$$

The entropy function: $$f(p) = -\sum_t p_t \log p_t, \quad \text{with } 0 \log 0 = 0$$

As we expect, a measure of impurity will be close to zero if the node concentrates on one class. Nodes with a measure of impurity close to zero become terminal nodes (leaves), since growing the tree further will not result in, on average, purer children.

Having grown the tree and reached a leaf, we can represent each branch as an if-then rule.

Starting from the root node and following a branch, we construct a conjunction of the feature levels used to discriminate each node. Finally, the whole tree can be represented as a set of rules using AND and OR operators.
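As a small illustration of the splitting criterion, the Python sketch below (ours; the attribute encoding is hypothetical) computes both impurity measures and the weighted impurity of the children produced by splitting on one attribute:

    import math
    from collections import Counter

    def gini(labels):
        """Gini index: f(p) = sum over i != j of p_i * p_j = 1 - sum of p_i^2."""
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    def entropy(labels):
        """Entropy: f(p) = -sum of p_t * log(p_t), with 0 log 0 taken as 0."""
        n = len(labels)
        return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

    def split_impurity(examples, attribute, measure=gini):
        """Average impurity of the child nodes created by splitting on
        'attribute'. Examples are (feature_dict, class_label) pairs."""
        children = {}
        for features, label in examples:
            children.setdefault(features[attribute], []).append(label)
        n = len(examples)
        return sum(len(ls) / n * measure(ls) for ls in children.values())

The attribute with the lowest weighted child impurity would be chosen for the split; a node whose own impurity is close to zero becomes a leaf.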

B.4 Feed-Forward Neural Networks
Today artificial neural networks (ANNs) incorporating methods from statistical and numerical analysis have developed into successful data mining tools. From a data mining perspective a successful neural network is not one that preserves biological neural network behavior, but one that is able to adapt to large data sets, in terms of both volume and dimensionality.

In this process two things are important, the topology of the ANN and the learning rule. Both are connected to each other and influence the "intelligent" behavior of the ANN. First we design an ANN. We define the number of layers, the number of neurons in each layer, and the type of activation function for each neuron. Then a learning rule can be applied to alter the weight matrix and/or the topology of the ANN in order to meet some measure of fitness. This process is known as training the ANN. Having completed the training process, an ANN can be used to predict the outcome or classify new patterns. As we can see, the "language" that an ANN uses to represent its state and potentially useful business information cannot be interpreted in a convenient form. Therefore an ANN is considered a black box and, as far as data mining

is concerned, prediction and classification are more important than explanation of the relationships between the underlying variables.

Multi-Layer Perceptron (MLP) is one of the most popular ANNs used in data mining. It is a feed-forward multilayer neural network. Feed-forward means that all neurons are ordered from the input neurons to the output neurons so all connections go from a neuron to one with a higher order. The neurons are organized in layers. We have an input layer, an output layer, and a number of hidden layers. An MLP with one hidden layer is represented graphically in Figure 125.

Figure 125. Multilayer Neural Network (the hidden units $j$ and the output units $k$ use logistic activations $f(r) = 1/(1 + \exp(-r))$; $w_{ij}$ and $w_{jk}$ denote the input-to-hidden and hidden-to-output weights)

An ANN such as the one in Figure 125 on page 191 represents this function:

$$y_k = f_k\!\left(\alpha_k + \sum_{j \to k} w_{jk}\, f_j\!\left(\alpha_j + \sum_{i \to j} w_{ij} x_i\right)\right)$$

Thus, MLP can be seen as a nonlinear functional mapping between a set of input variables and a set of output variables.

Since we have established a circuit for the ANN, we need to "train" it, that is, optimize the model's parameters. The parameters in this case are the weights $w_{ij}$, $w_{jk}$. In order to perform the optimization a number of different methods can be used. The most popular is the back-propagation method, which consists of three steps. First the pattern is presented to the network through an input vector or signal. The signal is then passed successively from the input layer to the hidden layer (forward propagation), calculating the activation functions, and finally a signal is output through the activation functions of the output layer neurons. In the second step of the process we compute the total error between the output signal and the true target. Finally, using the generalized delta rule, we "back-propagate" the underlying error from the output layer back to the input layer, adjusting the existing weights. Usually we iterate a number of times until the sum-of-squares error function that the generalized delta rule uses as an error function is satisfyingly small.
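A compact sketch of the forward-propagation step for an MLP with one hidden layer follows (our Python illustration; the weight matrices would be initialized randomly and then adjusted by the generalized delta rule during training):

    import numpy as np

    def logistic(r):
        """Activation function f(r) = 1 / (1 + exp(-r))."""
        return 1.0 / (1.0 + np.exp(-r))

    def mlp_forward(x, W_ij, a_j, W_jk, a_k):
        """Compute y_k = f(a_k + sum_j w_jk * f(a_j + sum_i w_ij * x_i)).
        W_ij maps inputs to hidden units, W_jk maps hidden units to outputs;
        a_j and a_k are the bias terms of the hidden and output layers."""
        hidden = logistic(a_j + W_ij.T @ x)      # hidden layer activations
        return logistic(a_k + W_jk.T @ hidden)   # output layer activations

Back-propagation differentiates the sum-of-squares error of this output with respect to each weight and adjusts the weights against the gradient.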

Another popular form of feed-forward ANN is the radial-basis function network (RBFN). The circuit of the network consists of three or four layers. Figure 126 on page 193 illustrates the topology of a three-layer RBFN.

As with MLP, an RBFN can be seen as a mapping between the input variables and the output variables through a function:

$$y_k(x) = \sum_{j=1}^{m} w_{kj}\, \phi_j\big(\lVert x - \mu_j \rVert\big) + w_{k0}$$

where $\phi_j$, $j = 1, 2, \ldots, m$, are the basis functions and $\lVert \cdot \rVert$ is the norm of the space.

A number of basis functions with different properties can be used. Most popular is the Gaussian function:

$$G(x) = \exp\left(-\frac{x^2}{2}\right)$$

Parameter optimization in an RBFN is performed in two steps.
1. First we organize the available data into clusters. The centers of the clusters (representatives) $\mu_j$ become the centers of the basis functions $\phi_j$, and the components $\mu_{ji}$, $i = 1, 2, \ldots, n$, of a representative $\mu_j$ become the weights.
2. The second stage consists of the final tuning of the network. For this task we can use an iterative training algorithm like the delta rule to optimize the $w_{kj}$ weights.

RBFN is considered to be an exact interpolation model and is therefore useful for solving prediction problems where data neighborhoods are more important than decision boundaries. The degree of sensitivity to the example space and to interpolation can be controlled through the number of clusters formed and the shape of the basis functions used.

Figure 126. Radial-Basis Function Network (the hidden units use Gaussian basis functions $\phi_j(x) = \exp(-\lVert x - \mu_j \rVert^2 / 2\sigma_j^2)$ and the output units logistic activations $f_k(r) = 1/(1 + \exp(-r))$; $\mu_{ji}$ and $w_{jk}$ denote the centers and the hidden-to-output weights)
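The two-stage RBFN training can be sketched as follows (our Python illustration; as a simplification the second stage solves for the output weights by least squares instead of an iterative delta rule, and the centers are assumed to come from a prior clustering step):

    import numpy as np

    def rbf_design_matrix(X, centers, sigma=1.0):
        """Gaussian basis phi_j(x) = exp(-||x - mu_j||^2 / (2 * sigma^2))."""
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    def train_rbfn(X, y, centers):
        """Stage 2: with the cluster centers mu_j fixed (stage 1, found by
        clustering the data), fit the output weights and the bias w_k0."""
        Phi = rbf_design_matrix(X, centers)
        Phi = np.hstack([np.ones((Phi.shape[0], 1)), Phi])  # bias column
        w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        return w

The number of centers and the width sigma control the sensitivity to the example space discussed above.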

B.5 Projection Methods
So far, the data mining tools that we have seen rely on an external agent to provide the data mining engine with a set of classified examples. The data mining engine, after "examining" the historical data, can generalize over newly presented patterns. However, some data mining operations require tools that operate over unclassified examples, where the interest is not to predict the outcome of an event or classify a feature vector, but to indicate groups or to show affinities of examples, revealing structures in the data.

Projection methods belong to this "unsupervised" category of data mining tools. Suppose a data set consists of $n$ vectors with $m$ components ($n \ge m$). Our goal is to project the m-dimensional input space into a smaller q-dimensional space. Each dimension of the rank-q reconstructed data space corresponds to a linear combination of the original features. The features that compose each component are selected in order to maximize some measure of "interestingness."

Principal component analysis is an example of input space projection. In principal component analysis the interest is in variance. The first component

is the combination of variables accounting for the largest part of the variance in the original data set, the second is the combination accounting for the second largest part of the variance but uncorrelated (orthogonal) with the first, and so on. We also require that the projection of the original m-dimensional input space into the q-dimensional space using the first q principal components be as accurate as possible. A measure of accuracy or fitness could be the sum of squared distances from the original data points to their projections into the principal component space.

A classical way to perform principal component analysis is to construct the correlation matrix of the observations. The eigenvalues of the correlation matrix give the proportion of the variance in the original data explained by the corresponding principal components. Thus, we can use the cumulative total of variance explained in order to choose how many components to keep. Usually we use the first two or three components, because then we are able to visualize them in a two- or three-dimensional space. The benefit is that any clusters formed by the original variables can be used for further analysis.
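A compact Python sketch of this procedure (ours) forms the correlation matrix, reads the proportion of explained variance off its eigenvalues, and keeps the first q components:

    import numpy as np

    def principal_components(X, q=2):
        """Project an (n, m) data matrix onto its first q principal components."""
        Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the columns
        R = np.corrcoef(Z, rowvar=False)           # correlation matrix
        eigval, eigvec = np.linalg.eigh(R)         # eigenvalues in ascending order
        order = np.argsort(eigval)[::-1]           # largest variance first
        explained = eigval[order] / eigval.sum()   # proportion per component
        return Z @ eigvec[:, order[:q]], explained

Plotting the two projected columns against each other gives the two-dimensional visualization mentioned above.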

B.6 Cluster Formation
As with principal component analysis, cluster formation is an "unsupervised" method. Our goal is to group observations into subsets, called clusters, maximizing the degree of similarity within clusters and minimizing the similarity between clusters.

One of the most popular clustering methods is k-means clustering. A typical algorithm of this method starts by randomly selecting the centers for k clusters. Each example is assigned to the cluster with the closest center. The cluster centers are then recomputed, and the examples are reassigned to clusters based on the distance to the new cluster centers. This process is repeated until there is no change in the grouping of the examples. This algorithm eventually minimizes:

$$\sum_i \lVert x_i - m_{k(i)} \rVert^2$$

which denotes the sum of squared distances from each example $x_i$ to its cluster center $m_{k(i)}$. As we can see, the minimization over the cluster centers is easy. However, the difficulty is to minimize the squared distances over the class assignments:

$$\min \sum_k \sum_i \lVert x_i - m_{k(i)} \rVert^2$$
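The procedure can be sketched in a few lines of Python (our illustration; the random initialization and the iteration cap are simplifications):

    import numpy as np

    def k_means(X, k, max_iter=100, seed=0):
        """Alternate assignment and center recomputation to reduce the sum
        of squared distances from each example to its cluster center."""
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
        assign = np.full(len(X), -1)
        for _ in range(max_iter):
            d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            new_assign = d.argmin(axis=1)           # closest center wins
            if np.array_equal(new_assign, assign):
                break                               # grouping unchanged: stop
            assign = new_assign
            for j in range(k):
                if np.any(assign == j):             # avoid emptying a cluster
                    centers[j] = X[assign == j].mean(axis=0)
        return centers, assign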

Another popular clustering method is the self-organized feature map. It can be represented as a simple neural network with an input layer that corresponds to the features of an input vector and an output layer that corresponds to the clusters that we want to form. The weight vector of an output neuron corresponds to the representative of the cluster that the output neuron represents. The examples are organized into a one- or two-dimensional feature-map space. The available examples are then presented to the network and, depending on the output of the network, a particular neighborhood of the feature-map space is updated. The algorithm will iterate over the input space until it reaches equilibrium.

B.7 Association Rules
Today's databases hold large volumes of data that capture the shopping behavior of people. For instance, the customers of a supermarket purchase a list of items. These observations, saved into a database, can be used to identify associations between items (association rules). A typical association rule would be:

“30% of the transactions that contain beer and potato chips also contain diapers; 2% of all transactions contain all of these items”.

The 30% is called the confidence of the rule, and the 2% the support of the rule. The challenge for data mining is to discover a statistically significant set of association rules given a specific minimum support and minimum confidence.

Let $A = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ be a set of items and $D = \{T_1, T_2, \ldots, T_m\}$ a set of transactions. Each transaction $T_i$ is a set of items such that $T_i \subseteq A$. An association rule is an implication of the form $G \Rightarrow B$, where $G \subset A$, $B \subset A$, and $G \cap B = \emptyset$. The association rule $G \Rightarrow B$ has a confidence level in the transaction set denoted by the conditional probability $p(B \mid G)$ and a support level equal to the number of transactions $T_i$ that contain $G \cup B$.

A general approach to the problem is to find all combinations of items that have transaction support above the minimum support (frequent itemsets) and then use these itemsets to construct the association rules. If a subset $G$ of the items of a frequent itemset $B$ is itself a frequent itemset, a rule of the form $G \Rightarrow B$ holds if:

$$\frac{\mathrm{support}(B)}{\mathrm{support}(G)} > \text{minimum confidence}$$
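A naive Python sketch of this approach follows (ours; it enumerates every candidate itemset and is therefore exponential, where a real implementation would use level-wise candidate generation as in Apriori; transactions are assumed to be Python sets):

    from itertools import combinations

    def frequent_itemsets(transactions, min_support):
        """Support of every itemset above min_support (demo only)."""
        items = sorted({i for t in transactions for i in t})
        n = len(transactions)
        support = {}
        for size in range(1, len(items) + 1):
            for cand in combinations(items, size):
                s = sum(1 for t in transactions if set(cand) <= t) / n
                if s >= min_support:
                    support[frozenset(cand)] = s
        return support

    def association_rules(support, min_confidence):
        """Emit a rule G implies (B minus G) whenever support(B)/support(G)
        exceeds the minimum confidence threshold."""
        out = []
        for B, s_b in support.items():
            for G in support:
                if G < B and s_b / support[G] > min_confidence:
                    out.append((set(G), set(B - G), s_b / support[G], s_b))
        return out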

However, this approach leads to a very large number of output patterns that just "overfit" the available data. To avoid false discoveries we require a means of testing for statistical significance.

For instance, we can cast the minimum support requirement in a hypothesis testing framework as follows:

$$H_0\colon\; p(G) = t$$

$$H_1\colon\; p(G) > t$$

where the density function $p(G)$ denotes the probability that a transaction contains a given itemset $G$, and $t$ is the support threshold. Thus, the probability that the transactions in our data set contain $G$ is given by a binomial distribution $p'(G)$ over $n$ trials. Finally, we can compare $p'(G)$ with a given critical value $t_0$ (dependent on the significance level of the hypothesis test) and reject $H_0$ only if $p'(G) > t_0$.

Appendix C. Special Notices

This publication is intended to help data mining and information system specialists understand how IBM Intelligent Miner for Data accesses non-IBM data sources. The information in this publication is not intended as the specification of any programming interfaces that are provided by the IBM Intelligent Miner for Data and IBM DataJoiner products. See the PUBLICATIONS section of the IBM Programming Announcement for IBM Intelligent Miner for Data and Data Joiner products for more information about what publications are considered to be product documentation.

References in this publication to IBM products, programs or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program, or service is not intended to state or imply that only IBM's product, program, or service may be used. Any functionally equivalent program that does not infringe any of IBM's intellectual property rights may be used instead of the IBM product, program or service.

Information in this book was developed in conjunction with use of the equipment specified, and is limited in application to those specific hardware and software products and levels.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to the IBM Director of Licensing, IBM Corporation, 500 Columbus Avenue, Thornwood, NY 10594 USA.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact IBM Corporation, Dept. 600A, Mail Drop 1329, Somers, NY 10589 USA.

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The information contained in this document has not been submitted to any formal IBM test and is distributed AS IS. The information about non-IBM ("vendor") products in this manual has been supplied by the vendor and IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate

them into the customer's operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

Any pointers in this publication to external Web sites are provided for convenience only and do not in any manner serve as an endorsement of these Web sites.

The following document contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples contain the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

Reference to PTF numbers that have not been released through the normal distribution process does not imply general availability. The purpose of including these reference numbers is to alert IBM customers to specific information relative to the implementation of the PTF when it becomes available to each customer according to the normal IBM PTF distribution process.

The following terms are trademarks of the International Business Machines Corporation in the United States and/or other countries:

AIX, AS/400, CICS, DATABASE 2, DB2, IBM, IMS, Information Warehouse, Intelligent Miner, MVS, MVS/ESA, Net.Data, OS/2, OS/390, OS/400, RS/6000, S/390, Visual Warehouse

The following terms are trademarks of other companies:

Java and HotJava are trademarks of Sun Microsystems, Incorporated.

Microsoft, SQL Server, Windows, Windows NT, and the Windows 95 logo are trademarks or registered trademarks of Microsoft Corporation.

Oracle, Net8, SQL*Plus, and SQL*Net are trademarks or registered trademarks of Oracle Corporation.

Pentium, MMX, ProShare, LANDesk, and ActionMedia are trademarks or registered trademarks of Intel Corporation in the U.S. and other countries.

SAS, SAS/SHARE, SAS/SHARE*NET, SAS/ACCESS, and PROC SQL are trademarks or registered trademarks of SAS Institute, Incorporated.

SPSS is a trademark or registered trademark of SPSS, Incorporated.

UNIX is a registered trademark in the United States and other countries licensed exclusively through X/Open Company Limited.

Other company, product, and service names may be trademarks or service marks of others.

Appendix D. Related Publications

The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this redbook.

D.1 International Technical Support Organization Publications
For information on ordering these ITSO publications see "How to Get ITSO Redbooks" on page 203.
• Data Modeling Techniques for Data Warehousing, SG24-2238
• From Multiple Operational Data to Data Warehousing and Business Intelligence, SG24-5174
• DataJoiner Implementation and Usage Guide, SG24-2566
• Discovering Data Mining, SG24-4839

D.2 Redbooks on CD-ROMs
Redbooks are also available on CD-ROMs. Order a subscription and receive updates 2-4 times a year at significant savings.

CD-ROM Title                                           Subscription    Collection Kit
                                                       Number          Number
System/390 Redbooks Collection                         SBOF-7201       SK2T-2177
Networking and Systems Management Redbooks Collection  SBOF-7370       SK2T-6022
Transaction Processing and Data Management Redbook     SBOF-7240       SK2T-8038
Lotus Redbooks Collection                              SBOF-6899       SK2T-8039
Tivoli Redbooks Collection                             SBOF-6898       SK2T-8044
AS/400 Redbooks Collection                             SBOF-7270       SK2T-2849
RS/6000 Redbooks Collection (HTML, BkMgr)              SBOF-7230       SK2T-8040
RS/6000 Redbooks Collection (PostScript)               SBOF-7205       SK2T-8041
RS/6000 Redbooks Collection (PDF Format)               SBOF-8700       SK2T-8043
Application Development Redbooks Collection            SBOF-7290       SK2T-8037

D.3 Other Publications
These publications are also relevant as further information sources:
• B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, 1996

• C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995
• C. Moisiadis, Applied Statistics, Ziti Publications, 1990

How to Get ITSO Redbooks

This section explains how both customers and IBM employees can find out about ITSO redbooks, CD-ROMs, workshops, and residencies. A form for ordering books and CD-ROMs is also provided.

This information was current at the time of publication, but is continually subject to change. The latest information may be found at http://www.redbooks.ibm.com/.

How IBM Employees Can Get ITSO Redbooks
Employees may request ITSO deliverables (redbooks, BookManager BOOKs, and CD-ROMs) and information about redbooks, workshops, and residencies in the following ways:
• Redbooks Web Site on the World Wide Web: http://w3.itso.ibm.com/
• PUBORDER – to order hardcopies in the United States
• Tools Disks
  To get LIST3820s of redbooks, type one of the following commands:
    TOOLCAT REDPRINT
    TOOLS SENDTO EHONE4 TOOLS2 REDPRINT GET SG24xxxx PACKAGE
    TOOLS SENDTO CANVM2 TOOLS REDPRINT GET SG24xxxx PACKAGE (Canadian users only)
  To get BookManager BOOKs of redbooks, type the following command:
    TOOLCAT REDBOOKS
  To get lists of redbooks, type the following command:
    TOOLS SENDTO USDIST MKTTOOLS MKTTOOLS GET ITSOCAT TXT
  To register for information on workshops, residencies, and redbooks, type the following command:
    TOOLS SENDTO WTSCPOK TOOLS ZDISK GET ITSOREGI 1998
• REDBOOKS Category on INEWS
• Online – send orders to: USIB6FPL at IBMMAIL or DKIBMBSH at IBMMAIL

Redpieces
For information so current it is still in the process of being written, look at "Redpieces" on the Redbooks Web Site (http://www.redbooks.ibm.com/redpieces.html). Redpieces are redbooks in progress; not all redbooks become redpieces, and sometimes just a few chapters will be published this way.

How Customers Can Get ITSO Redbooks
Customers may request ITSO deliverables (redbooks, BookManager BOOKs, and CD-ROMs) and information about redbooks, workshops, and residencies in the following ways:
• Online Orders – send orders to:
                            IBMMAIL                 Internet
    In United States        usib6fpl at ibmmail     [email protected]
    In Canada               caibmbkz at ibmmail     [email protected]
    Outside North America   dkibmbsh at ibmmail     [email protected]
• Telephone Orders
    United States (toll free)   1-800-879-2755
    Canada (toll free)          1-800-IBM-4YOU

    Outside North America (long distance charges apply):
    (+45) 4810-1320 - Danish        (+45) 4810-1020 - German
    (+45) 4810-1420 - Dutch         (+45) 4810-1620 - Italian
    (+45) 4810-1540 - English       (+45) 4810-1270 - Norwegian
    (+45) 4810-1670 - Finnish       (+45) 4810-1120 - Spanish
    (+45) 4810-1220 - French        (+45) 4810-1170 - Swedish
• Mail Orders – send orders to:
    IBM Publications, Publications Customer Support, P.O. Box 29570, Raleigh, NC 27626-0570, USA
    IBM Publications, 144-4th Avenue, S.W., Calgary, Alberta T2P 3N5, Canada
    IBM Direct Services, Sortemosevej 21, DK-3450 Allerød, Denmark
• Fax – send orders to:
    United States (toll free)   1-800-445-9269
    Canada                      1-800-267-4455
    Outside North America       (+45) 48 14 2207 (long distance charge)
• 1-800-IBM-4FAX (United States) or (+1) 408 256 5422 (Outside USA) – ask for:
    Index # 4421  Abstracts of new redbooks
    Index # 4422  IBM redbooks
    Index # 4420  Redbooks for last six months
• On the World Wide Web
    Redbooks Web Site                 http://www.redbooks.ibm.com
    IBM Direct Publications Catalog   http://www.elink.ibmlink.ibm.com/pbl/pbl


IBM Redbook Order Form

Please send me the following:

Title                                   Order Number        Quantity

First name                              Last name
Company
Address
City                                    Postal code         Country
Telephone                               Telefax             VAT number
Invoice to customer number
Credit card number
Credit card expiration date             Card issued to      Signature

We accept American Express, Diners, Eurocard, Master Card, and Visa. Payment by credit card not available in all countries. Signature mandatory for credit card payment.

Glossary

A

adaptive connection. A numeric weight used to describe the strength of the connection between two processing units in a neural network. The connection is called adaptive because it is adjusted during training. Values typically range from zero to one, or -0.5 to +0.5.

aggregate. To summarize data in a field.

application programming interface (API). A functional interface supplied by the operating system or a separate orderable licensed program that allows an application program written in a high-level language to use specific data or functions of the operating system or the licensed program.

architecture. The number of processing units in the input, output, and hidden layer of a neural network. The number of units in the input and output layers is calculated from the mining data and input parameters. An intelligent data mining agent calculates the number of hidden layers and the number of processing units in those hidden layers.

associations. The relationship of items in a transaction in such a way that items imply the presence of other items in the same transaction.

attribute. Characteristics or properties that can be controlled, usually to obtain a required appearance. For example, color is an attribute of a line. In object-oriented programming, a data element defined within a class.

B

back propagation. A general-purpose neural network named for the method used to adjust weights while learning data patterns. The Classification - Neural mining function uses such a network.

boundary field. The upper limit of an interval as used for discretization using ranges of a processing function.

bucket. One of the bars in a bar chart showing the frequency of a specific value.

C

categorical values. Discrete, nonnumerical data represented by character strings; for example, colors or special brands.

chi-square test. A test to check whether two variables are statistically dependent or not. Chi-square is calculated by subtracting the expected frequencies (imaginary values) from the observed frequencies (actual values). The expected frequencies represent the values that were to be expected if the variables in question were statistically independent.

classification. The assignment of objects into groups or categories based on their characteristics.

cluster. A group of records with similar characteristics.

cluster prototype. The attribute values that are typical of all records in a given cluster. Used to compare the input records to determine whether a record should be assigned to the cluster represented by these values.

clustering. A mining function that creates groups of data records within the input data on the basis of similar characteristics. Each group is called a cluster.

confidence factor. Indicates the strength or the reliability of the associations detected.

continuous field. A field that can have any floating point number as its value.

D

DATABASE 2 (DB2). An IBM relational database management system.

database table. A table residing in a database.

© Copyright IBM Corp. 1998 207 database view. An alternative representation of Euclidean distance. The square root of the sum data from one or more database tables. A view of the squared differences between two numeric can include all or some of the columns contained vectors. The Euclidean distance is used to in the database table or tables on which it is calculate the error between the calculated defined. network output and the target output in neural classification, to calculate the difference between data field. In a database table, the intersection a record and a prototype cluster value in neural from table description and table column where clustering. A zero value indicates an exact match; the corresponding data is entered. larger numbers indicate greater differences. data format. There are different kinds of data formats, for example, database tables, database views, pipes, or flat files. F data table. A data table, regardless of the data field. A set of one or more related data items format it contains. grouped for processing. In this document, with data type. There are different kinds of Intelligent regard to database tables and views, field is Miner data types, for example, discrete numeric, synonymous with column. discrete nonnumeric, binary, or continuous. file. A collection of related data that is stored and discrete. Pertaining to data that consists of retrieved by an assigned name. distinct elements such as character or to physical file name. (1) A name assigned or declared for a quantities having a finite number of distinctly file. (2) The name used by a program to identify a recognizable values. file. discretization. The act of making mathematically flat file. (1) A one-dimensional or discrete. two-dimensional array; a list or table of items. (2) A file that has no hierarchical structure. E formatted information. An arrangement of information into discrete units and structures in a envelope. The area between two curves that are manner that facilitates its access and processing. parallel to a curve of time-sequence data. The Contrast with narrative information. first curve runs above the curve of time-sequence F-test. A statistical test that checks whether two data, the second one below. Both curves have estimates of the variances of two independent the same distance to the curve of time-sequence samples are the same. In addition, the F-test data. The width of the envelope, that is, the checks whether the null hypothesis is true or distance from the first parallel curve to the false. second, is defined as epsilon. function. Any instruction or set of related epsilon. The maximum width of an envelope that instructions that perform a specific operation. encloses a sequence. Another sequence is epsilon-similar if it fits in this envelope. fuzzy logic. In artificial intelligence, a technique using approximate rules of inference in which epsilon-similar. Two sequences are truth values and quantifiers are defined as epsilon-similar if one sequence does not go possibility distributions that carry linguistic labels. beyond the envelope that encloses the other sequence. equality compatible. Pertaining to different data types that can be operands for the = logical operator.

208 Mining Relational and Nonrelational Data with IM for Data I between units for each input pattern and the declaration of a winning unit. Used in neural input data. The metadata of the database table, clustering to partition data into similar record database view, or flat file containing the data you groups. specified to be mined. input layer. A set of processing units in a neural network which present the numeric values L derived from user data to the network. The large item sets. The total volume of items above number of fields and type of data in those fields the specified support factor returned by the are used to calculate the number of processing Associations mining function. units in the input layer. learning algorithm. The set of well-defined rules instance. In object-oriented programming, a used during the training process to adjust the single, actual occurrence of a particular object. connection weights of a neural network. The Any level of the object class hierarchy can have criteria and methods used to adjust the weights instances. An instance can be considered in define the different learning algorithms. terms of a copy of the object type frame that is filled in with particular information. learning parameters. The variables used by each neural network model to control the training interva. A set of real numbers between two of a neural network which is accomplished by numbers either including or excluding both of modifying network weights. them. lift. Confidence factor divided by expected interval boundaries. Values that represent the confidence. upper and lower limits of an interval. item category. A categorization of an item. For example, a room in a hotel can have the following M categories: Standard, Comfort, Superior, Luxury. The lower category is called the child item metadata. In databases, data that describes data category. Each child item category can have objects. several parent item categories. Each parent item mining. Synonym for analyzing or searching. category can have several grandparent item categories. mining base. A repository where all information about the input data, the mining run settings, and item description. The descriptive name of a the corresponding results is stored. character string in a data table. model. A specific type of neural network and its item ID. The identifier for an item. associated learning algorithm. Examples include item set. A collection of items. For example, all the Kohonen Feature Map and back propagation. items bought by one customer during one visit to a department store. N K narrative information. Information that is presented according to the syntax of a natural Kohonen Feature Map. A neural network model language. Contrast with formatted information. comprised of processing units arranged in an neural network. A collection of processing units input layer and output layer. All processors in the and adaptive connections that is designed to input layer are connected to each processor in perform a specific processing function. the output layer by an adaptive connection. The learning algorithm used involves competition

Neural Network Utility (NNU). A family of IBM application development products for creating neural network and fuzzy rule system applications.

nonsupervised learning. A learning algorithm that requires only input data to be present in the data source during the training process. No target output is provided; instead, the desired output is discovered during the mining run. A Kohonen Feature Map, for example, uses nonsupervised learning.

NP-complete. In the context of neurocomputing: A learning algorithm is NP-complete if it converges to a solution in time polynomial in the size of the problem and the accuracy required.

O

offset. (1) The number of measuring units from an arbitrary starting point in a record, area, or control block, to some other point. (2) The distance from the beginning of an object to the beginning of a particular field.

operator. (1) A symbol that represents an operation to be done. (2) In a language statement, the lexical entity that indicates the action to be performed on operands.

output data. The metadata of the database table, database view, or flat file containing the data being produced or to be produced by a function.

output layer. A set of processing units in a neural network which contain the output calculated by the network. The number of outputs depends on the number of classification categories or the maximum cluster value in neural classification and neural clustering, respectively.

P

pass. One cycle of processing a body of data.

prediction. The dependency and the variation of one field's value within a record on the other fields within the same record. A profile is then generated that can predict a value for the particular field in a new record of the same form, based on its other field values.

processing unit. A processing unit in a neural network is used to calculate an output by summing all incoming values multiplied by their respective adaptive connection weights.

Q

quantile. One of a finite number of nonoverlapping subranges or intervals, each of which is represented by an assigned value.

Q is an N%-quantile of a value set S when:
- Approximately N percent of the values in S are lower than or equal to Q.
- Approximately (100-N) percent of the values are greater than or equal to Q.
The approximation is less exact when there are many values equal to Q. N is called the quantile label. The 50%-quantile represents the median.

R

radial basis function (RBF). In data mining functions, radial basis functions are used to predict values. They represent functions of the distance or the radius from a particular point. They are used to build up approximations to more complicated functions.

record. A set of one or more related data items grouped for processing. In reference to a database table, record is synonymous with row.

region. (Sub)set of records with similar characteristics in their active fields. Regions are used to visualize a prediction result.

round-robin method. A method by which items are sequentially assigned to units. When an item has been assigned to the last unit in the series, the next item is assigned to the first again. This process is repeated until the last item has been assigned. The Intelligent Miner uses this method, for example, to store records in output files during a partitioning job.

210 Mining Relational and Nonrelational Data with IM for Data rule. A clause in the form head<== body. It process. Back propagation, for example, uses specifies that the head is true if the body is true. supervised learning and makes adjustments during training so that the value computed by the rule body. Represents the specified input data neural network will approach the actual value as for a mining function. the network learns from the data presented. rule group. Covers all rules containing the same Supervised learning is used in the techniques items in different variations. provided for predicting classifications as well as rule head. Represents the derived items for predicting values. detected by the Associations mining function. support factor. Indicates the occurrence of the detected association rules and sequential patterns based on the input data. S symbolic name. In a programming language, a scale. A system of mathematical notation; unique name used to represent an entity such as fixed-point or floating-point scale of an arithmetic a field, file, data structure, or label. In the value. Intelligent Miner you specify symbolic names, for example, for input data, name mappings, or scaling. To adjust the representation of a taxonomies. quantity by a factor in order to bring its range within prescribed limits. scale factor. A number used as a multiplier in T scaling. For example, a scale factor of 1/1000 would be suitable to scale the values 856, 432, taxonomy. Represents a hierarchy or a lattice of -95, and /182 to lie in the range from -1 to +1, associations between the item categories of an inclusive. item. These associations are called taxonomy relations. self-organizing feature map. See Kohonen Feature Map taxonomy relation. The hierarchical associations between the item categories you sensitivity analysis report. An output from the defined for an item. A taxonomy relation consists Classification - Neural mining function that shows of a child item category and a parent item which input fields are relevant to the classification category. decision. trained network. A neural network containing sequential patterns. Intertransaction patterns connection weights that have been adjusted by a such that the presence of one set of items is learning algorithm. A trained network can be followed by another set of items in a database of considered a virtual processor; it transforms transactions over a period of time. inputs to outputs. similar time sequences. Occurrences of similar training. The process of developing a model sequences in a database of time sequences. which understands the input data. In neural Structured (SQL). An networks, the model is created by reading the established set of statements used to manage records of the input and modifying the network information stored in a database. By using these weights until the network calculates the desired statements, users can add, delete, or update output data. information in a table, request information translation process. Converting the data through a query, and display results in a report. provided in the database to scaled numeric supervised learning. A learning algorithm that values in the appropriate range for a mining requires input and resulting output pairs to be kernel using neural networks. Different presented to the network during the training techniques are used depending on whether the

data is numeric or symbolic. Also, converting neural network output back to the units used in the database.

transaction. A set of items or events that are linked by a common key value, for example, the articles (items) bought by a customer (customer number) on a particular date (transaction identifier). In this example, the customer number represents the key value.

transaction ID. The identifier for a transaction, for example, the date of a transaction.

transaction group. The identifier for a set of transactions. For example, a customer number can represent a transaction group that includes all purchases of a particular customer during the month of May.

V

vector. A quantity usually characterized by an ordered set of numbers.

W

weight. The numeric value of an adaptive connection representing the strength of the connection between two processing units in a neural network.

winner. The index of the cluster which has the minimum Euclidean distance from the input record. Used in the Kohonen Feature Map to determine which output units will have their weights adjusted.
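To illustrate the weight and winner entries, this sketch (hypothetical weight vectors and input record, with an assumed learning rate of 0.1; a simplified stand-in for the Kohonen Feature Map update) finds the winning cluster and adjusts its weights toward the input:

    import math

    weights = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]   # one weight vector per cluster
    record = [0.7, 0.3]                              # one input record

    def euclidean(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # The winner is the cluster with the minimum Euclidean distance.
    winner = min(range(len(weights)), key=lambda i: euclidean(weights[i], record))

    # Move the winner's weights toward the input record.
    rate = 0.1
    weights[winner] = [w + rate * (x - w) for w, x in zip(weights[winner], record)]
    print(winner, weights[winner])                   # 1 [0.79, 0.21]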

List of Abbreviations

ADT      abstract data type
ANN      artificial neural network
ANSI     American National Standards Institute
API      application programming interface
ASCII    American National Standard Code for Information Interchange
DAM      data access module
DBMS     database management system
DCL      data control language
DDL      data definition language
DML      data manipulation language
DSS      decision support system
DW       data warehouse
GUI      graphical user interface
HTML     Hypertext Markup Language
IBM      International Business Machines Corporation
IM       Intelligent Miner
IMS      Information Management System
I/O      input/output
ISAPI    Internet Server Application Programming Interface
ISO      International Organization for Standardization
IT       information technology
ITSO     International Technical Support Organization
JDBC     Java database connectivity
JRE      Java runtime environment
LIS      large item set
LOB      large object
MLP      multilayer perceptron
NSAPI    Netscape Application Programming Interface
ODBC     Open Database Connectivity
OEM      original equipment manufacturer
OLAP     online analytical processing
OLTP     online transaction processing
RBF      radial basis function
RBFN     radial basis function network
RDBMS    relational database management system
ROI      return on investment
SAG      SQL Access Group
SQL      structured query language
TCP/IP   Transmission Control Protocol/Internet Protocol
UDF      user-defined function
UDT      user-defined type
VSAM     Virtual Storage Access Method

Index

A
abstract data type 22
aggregate values 15
algorithm categorization 16
analysis
   factor 16
   principal component 10, 16
application mode 17
architecture
   Business Intelligence 21
   Information Warehouse 21
   Object-Relational 23
association discovery 16

B
binary data 82
bivariate statistic 16
Business Intelligence 21

C
calculate values 15
Call Level Interface
   see CLI 109
CASE expression 23
categorical data 82
classification 16
CLI 109
cluster result screen 88
clustering 16
clustering mode 17
conditional logic 23
configure
   DataJoiner on AIX 47
   DataJoiner on Windows NT 67
   DataJoiner SAS access 125
   Oracle client on AIX 42
   Oracle client on Windows NT 61
   SAS ODBC 121
   SPSS ODBC 115
confusion table 10
continuous data 82
create
   AIX group 53
   AIX user ID 53
   data mining operations 18
   database 67
   DataJoiner instance on Windows NT 67
   DB2 database on AIX 55
   DDE share 127
   filesystem 39
   IM data object 80
   mining 83
   mining base 80
   mining sequence 145
   nickname 58, 119
   nickname on Windows NT 68, 125
   ODBC datasource 116
   Oracle administrative group 40
   Oracle administrative user on AIX 40
   results 17
   server mapping 67, 118, 125
   server mapping on AIX 56
   user mapping 58, 67, 118, 125
   Windows NT user 63

D
data
   abstract type 22
   access module 48
   binary 82
   categorical 82
   collection 8
   collection utility 33, 71
   continuous 82
   discrete numeric 82
   folder 82
   heterogeneous access 20
   integrity 21
   mart 24
   model evaluation 10
   modeling 10
   numeric 82
   preparation 8, 138
   preparation function 15
   sample file 30
   sampling 8
   selection 147
   user-defined type 22
   visualization 147
   visualize 138
data control language 21
data definition language 21
data manipulation language 21
data mining 3
   definition 3
   operations creation 18
   sample scenario on AIX 76
   setting goals 7
   tools 4
data replication
   asynchronous 21
   synchronous 21
data warehouse 3
database gateway 20
DataJoiner
   client access configuration 54
   configure on Windows NT 67
   configure SPSS 117
   create instance on Windows NT 67
   create nickname 125
   create nickname on AIX 58
   create nickname on Windows NT 68
   create server mapping on AIX 56
   create server mapping on Windows NT 67
   create user mapping on AIX 58
   create user mapping on Windows NT 67
   DB2DBDFT 56
   install on Windows NT 63
   introduction 19
   listener service 54
   NLS Option 65
   ODBC DAM 115
   ODBC interface 112
   optional components on Windows NT 65
   Oracle access on Windows NT 67
   Oracle interface 34
   recycle instance 55
   SAS configuration 125
   server mapping 125
   start instance 54
   support
      Net.Data 26
   user mapping 125
db2icrt 67, 117
db2profile 54
DDESHARE 127
decision making 10
demographic clustering 83
dimension reduction 9
discovery
   association 16
   sequential pattern 16
discrete numeric data 82
discretize
   quantiles 15
   ranges 15
distributed joins 21

E
encode missing values 148
environment
   ODBC 109
   Oracle 29
   variables on AIX 40
   Windows 109
Euclidean distance 8
evaluate data model 10

F
factor analysis 16
Fast Extract 71
   idmpldor 71
   install 71
   oracollect 71
   using 33
   utility 32
feature selection 9
field
   filter 15
   pivot 15
filter
   fields 15
   records 15
filter records 84
frequency chart 139
function
   data mining 16
   data preparation 15
   penalty 5
   processing Intelligent Miner 17
   statistical mining 16
   user-defined 23

G
generalization 4
global query optimization 25
goals setting 7

H
heterogeneous replication 24

I
idmpldor 71
Idmstart 133
Information Warehouse 21
input
   data 84
   fields 85
install
   DataJoiner on Windows NT 63
   Fast Extract 71
   Intelligent Miner on Windows NT 131
   Oracle client on AIX 39
   Oracle client on Windows NT 60
instance
   DataJoiner on AIX 53
   DataJoiner on Windows NT 67
Intelligent Miner
   AIX installation 75
   automate process 145
   client main screen 79
   components on Windows NT 132
   create data object 80
   data preparation functions 15
   data selection 147
   data visualization 147
   environment variables on AIX 76
   Fast Extract 32
   idmstart 78
   mining scenario on AIX 76
   mining solution on AIX 77
   modes 17
   overview 13
   packages on AIX 75
   prerequisites 75
   sequence 145
   starting the client 78
   task guide 14
   user interface 14
   using Fast Extract 33
   using on AIX 78
   Windows installation 131
   working with databases 13

L
language
   data control 21
   data definition 21
   data manipulation 21
Large Object
   see LOB 23
lift chart 10
linear discriminant analysis 185
linear regression 16
LOB 23
logical volume 39

M
memorization 4
middleware 20
mining
   algorithm 16
   function 16
   functions 83
   ODBC datasources 110
   settings 83
mining base
   create 80
   save 82
mining process
   automate 145
mode 17
   application 17
   clustering 17
   demo 30
   test 17
   training 17
model
   parametric 6
modeling
   data 10
multidatabase
   server 20
   solution 20

N
Net.Data 26
normalize variables 151
numeric data 82

O
Object-Relational architecture 23
ODBC
   API 110
   configure SPSS 115
   driver 110
   driver manager 110
   environment 109
   function call 110
   SAS compliance 129
   SAS configuration 121
   SAS driver 121
   SPSS compliance level 120
OLAP
   aggregation server 23
online analytical processing
   see OLAP 23
open database connectivity
   see ODBC 109
Oracle
   client installation on AIX 39
   client installation on Windows NT 60
   connect client on Windows NT 61
   create administrative group 40
   create administrative user on AIX 40
   database environment 29
   environment 29
   environment variables on AIX 40
   functional limitation 159
   installation options on Windows NT 60
   post-installation tasks on AIX 42
   pre-installation tasks on AIX 40
   SQL*Plus 61
oracollect 71
outlier treatment 86
output
   fields 86

P
parametric model 6
penalty function 5
pivot field 15
preprocessing 8
principal component analysis 10, 16
PROC SQL 121
processing Intelligent Miner functions 17
propagation
   pull 24
   push 24

Q
query optimization 25

R
recursive SQL 24
recycle
   DataJoiner instance 55
result
   create mining 17
   measuring 11
   object 17
   visualize 17
results screen 86

S
sampling 8
SAS
   data libraries 121
   data set 121
   DataJoiner configuration 125
   functional limitation 161
   local DDE access 126
   Network DDE 127
   ODBC compliance 129
   ODBC configuration 121
   ODBC driver 121
   remote access 127
   server communication 126
   TCP/IP 127
   universal driver 121
   universal ODBC configuration 122
   universal ODBC setup 122
sequential pattern discovery 16
similar time sequence 17
SPSS
   configure DataJoiner 117
   functional limitation 161
   ODBC compliance level 120
   ODBC configuration 115
star schema 26
statistical function 16
statistics
   bivariate 16
   univariate 16
stored procedure 23
system architecture
   ODBC 109
   Oracle 29

T
task guide 14
test mode 17
training mode 17
trigger 23
two-phase commit 21
type
   abstract data 22
   user-defined 22

U
univariate statistic 16
user interface 14
user-defined
   data type 22
   function 23
utility
   data collection 33
   Fast Extract 32

V
value
   aggregate 15
   calculate 15
   missing 15
   nonvalid 15
   prediction 16
visualize results 17
volume group 39

ITSO Redbook Evaluation

Mining Relational and Nonrelational Data with IBM Intelligent Miner for Data Using Oracle, SPSS, and SAS As Sample Data Sources
SG24-5278-00

Your feedback is very important to help us maintain the quality of ITSO redbooks. Please complete this questionnaire and return it using one of the following methods:
• Use the online evaluation form found at http://www.redbooks.ibm.com
• Fax this form to: USA International Access Code + 1 914 432 8264
• Send your comments in an Internet note to [email protected]

Which of the following best describes you?
_ Customer   _ Business Partner   _ Solution Developer   _ IBM employee   _ None of the above

Please rate your overall satisfaction with this book using the scale: (1 = very good, 2 = good, 3 = average, 4 = poor, 5 = very poor)

Overall Satisfaction ______

Please answer the following questions:

Was this redbook published in time for your needs? Yes___ No___

If no, please explain:

What other redbooks would you like to see published?

Comments/Suggestions: (THANK YOU FOR YOUR FEEDBACK!)

Mining Relational and Nonrelational Data with IBM Intelligent Miner for Data Using Oracle, SPSS, and SAS As Sample Data Sources

SG24-5278-00

Printed in the U.S.A.