Affymetrix Data Mining Tool Manual

Affymetrix® Data Mining Tool User’s Guide
Version 3.0

For Research Use Only. Not for use in diagnostic procedures.

Affymetrix Confidential
700233 Rev. 3

Trademarks
Affymetrix®, GeneChip®, EASI™, HuSNP™, GenFlex™, Jaguar™, MicroDB™, 417™, 418™, 427™, 428™, Pin-and-Ring™, Flying Objective™, NetAffx™ and CustomExpress™ are trademarks owned or used by Affymetrix, Inc. Microsoft® is a registered trademark of Microsoft Corporation. Oracle® is a registered trademark of Oracle Corporation.

Limited License
PROBE ARRAYS, INSTRUMENTS, SOFTWARE AND REAGENTS ARE LICENSED FOR RESEARCH USE ONLY AND NOT FOR USE IN DIAGNOSTIC PROCEDURES. NO RIGHT TO MAKE, HAVE MADE, OFFER TO SELL, SELL, OR IMPORT OLIGONUCLEOTIDE PROBE ARRAYS OR ANY OTHER PRODUCT IN WHICH AFFYMETRIX HAS PATENT RIGHTS IS CONVEYED BY THE SALE OF PROBE ARRAYS, INSTRUMENTS, SOFTWARE, OR REAGENTS HEREUNDER. THIS LIMITED LICENSE PERMITS ONLY THE USE OF THE PARTICULAR PRODUCT(S) THAT THE USER HAS PURCHASED FROM AFFYMETRIX.

Patents
Software products may be covered by one or more of the following patents: U.S. Patent Nos. 5,733,729; 5,795,716; 5,974,164; 6,066,454; 6,090,555; 6,185,561 and 6,188,783; and other U.S. or foreign patents.

Copyright ©1999, 2001 Affymetrix, Inc. All rights reserved.
Contents

CHAPTER 1 Welcome
    Data Mining Tool User’s Guide
    What’s New in DMT 3.0
    Conventions Used
    On-line Documentation
    Technical Support
    Your Feedback is Welcome

CHAPTER 2 Installing Data Mining Tool 3.0
    Before You Begin
    Microsoft® SQL Server LIMS Users
    Oracle® LIMS Users
    MicroDB™ Users
    Installing Data Mining Tool
    Creating an Oracle® Alias
    Oracle 8.1.7 Alias Configuration

CHAPTER 3 Affymetrix® Data Mining Tool Overview
    Access Data
    Affymetrix Publish Database
    Affymetrix® Analysis Data Model
    DMT Windowpanes
    Query Data
    Building and Running a Query
    Viewing Query Results
    Tables
    Graphs
    Analyze Query Results
    Statistical Analyses
    Cluster Analysis
    Matrix Analysis

CHAPTER 4 Getting Started
    Starting DMT
    Managing Database Connections
    Registering a Database
    Unregistering a Database
    Selecting a Database
    Specifying the Default Directory

CHAPTER 5 Building and Running a Query
    Building a Query
    Starting a New Query
    Specifying the Filters
    Query Builder
    Selecting Analyses for the Query
    Specifying Analysis Filters
    Running a Query
    Normalizing GeneChip® Signal Data
    Choosing Normalization Before a Query or Pivot
    Choosing Normalization After a Query or Pivot
    Normalization Options

CHAPTER 6 Managing Queries
    Saving a Query
    Using the Save As Command
    Opening a Previously Saved Query
    Deleting a Query

CHAPTER 7 Query Results Tables
    Experiment Information Table
    GeneChip® Data Mode
    Spot Data Mode
    Query Table
    Pivot Data Table
    Selecting Results for the Pivot Table
    Running the Pivot Operation
    Including Probe Descriptions in the Pivot Table
    Including Annotations in the Pivot Table
    Sorting Pivot Table Columns
    Pivot Options
    Working with Tables
    Finding Probes
    Viewing Descriptions & Obtaining Further Gene Information
    Annotating Probes
    Adding Probes to the Filter Grid
    Copying Tables
    Exporting Data
    Expanding the Results Pane
    Clearing the Results Pane

CHAPTER 8 Annotations
    Annotating Probes
    Loading Annotations
    Querying Annotations
    Adding Probes to the Filter Grid
    Deleting Annotations

CHAPTER 9 Probe Lists
    Creating Probe Lists
    Creating a Probe List from the Query or Pivot Table
    Creating a Probe List from Cluster Analysis
    Creating a Probe List from Search Array Descriptions
    Creating a Probe List from Filter
    Creating a Probe List by Combining Existing Lists
    Loading a Probe List
    Specifying Probe List Members
    Specifying an Input File
    Using Probe Lists
    Adding a Probe List to the Filter Grid
    Displaying Selected Probe List Members
    Managing Probe Lists
    Viewing and Editing Probe List Members
    Combining Probe Lists
    Exporting a Probe List
    Deleting a Probe List

CHAPTER 10 Array Sets
    Creating an Array Set
    Working with Array Sets
    Viewing Array Sets
    Managing Array Sets
    Editing an Array Set
    Deleting an Array Set

CHAPTER 11 Graphing Results
    Scatter Graph
    Plotting the Scatter Graph
    Working with the Scatter Graph
    Scatter Graph Options
    Fold Change Graph
    Plotting the Fold Change Graph
    Working with the Fold Change Graph
    Fold Change Graph Options
    Series Graph
    Plotting the Series Graph
    Working with the Series Graph
    Series Graph Options
    Histogram
    Plotting the Histogram
    Working with the Histogram
    Histogram Options
    Other Graphing Features
    Enlarging the Graph Pane
    Changing Graph Colors
    Copying and Clearing Graphs
    Printing Graphs

CHAPTER 12 Statistical Analyses
    Selecting an Operator
    Average, Median, Standard Deviation or Inter-Quartile Range
    Fold Change
    T-Test
    Mann-Whitney Test
    Count & Percentage

CHAPTER 13 Matrix Analysis
    Overview
    Population Size
    Running a Matrix Analysis

CHAPTER 14 Cluster Analysis
    Self Organizing Map (SOM) Algorithm
    Running a SOM Cluster Analysis
    Saving a Probe List
    SOM Filters
    SOM Parameters
    Correlation Coefficient Clustering Algorithm
    Running the Correlation Coefficient Cluster
    Correlation Coefficient Clustering Options
    Effect of Changing Algorithm Parameters
    Saving and Importing Seed Patterns
    Saving a Probe List

CHAPTER 15 DMT Tutorial
    Introduction
        Step 1: Restoring the MicroDB™ Database
        Step 2: Starting DMT
        Step 3: Registering the Database
        Step 4: Selecting the Tutorial Database
        Step 5: Opening the DMT Session
    Lesson 1: Identifying Highly Expressed Genes
        Step 1: Specifying a Filter
        Step 2: Selecting Analyses for the Query
        Step 3: Pivoting on Signal & Detection Call
        Step 4: Querying and Pivoting the Data
        Step 5: Sorting the Pivot Table by Signal
        Step 6: Saving a Probe List
        Step 7: Plotting the Series Line Graph
        Lesson 1 Summary
        Suggested Exercise
    Lesson 2: Calculating Averages of Replicates
        Step 1: Specifying a Probe List for the Filter
        Step 2: Selecting Analyses for the Query
        Step 3: Pivoting on Signal
        Step 4: Query and Pivot the Data
        Step 5: Selecting Average & Standard Deviation Operators
        Step 6: Sorting the Pivot Table
        Step 7: Displaying Probe Set Descriptions
        Lesson 2 Summary
        Suggested Exercise
    Lesson 3: Summarizing Qualitative Data
        Step 1: Pivoting on Detection Call
        Step 2: Performing Count & Percentage Analysis
        Step 3: Sorting Pivot Table Results
        Step 4: Saving a Probe List
        Step 5: Annotating Probe List Members
        Lesson 3 Summary
        Suggested Exercise
    Lesson 4: Evaluating Difference Between Two Tissues
        Step 1: Pivoting on Signal
        Step 2: Mann-Whitney Test
        Step 3: Annotating Probe Sets
        Step 4: Saving a Probe List
        Lesson 4 Summary
        Suggested Exercise
    Lesson 5: Evaluating Change Call Consistency
        Step 1: Clearing the Filter Grid & Selecting Comparison Analyses
        Step 2: Pivoting on Difference Call
        Step 3: Comparison Ranking
        Step 4: Annotating Probe Sets
        Step 5: Saving a Probe List
        Lesson 5 Summary
        Suggested Exercise
    Lesson 6: Self Organizing Map (SOM) Cluster Analysis
        Step 1: Clearing the Filter Grid & Selecting Analyses
        Step 2: Pivoting on Signal
        Step 3: Computing Average Signal
        Step 4: SOM Cluster Analysis
        Step 5: Saving & Annotating a Probe List
        Lesson 6 Summary

APPENDIX A Filter Grid
    GeneChip Data Mode
    Statistical Expression Algorithm
    Empirical Expression Algorithm
    Spot Data Mode

APPENDIX B Working with Windows & Tables
    Query Windowpanes
    Expanding a Windowpane
    Resizing a Windowpane
    Clearing the Results or Graph Pane
    Tables
    Selecting the Entire Table
    Selecting Rows
    Resizing Columns
    Hiding Columns
    Reordering Columns

APPENDIX C Query Table Data
    GeneChip® Data Mode
    Statistical Expression Algorithm Metrics
    Empirical Expression Algorithm Metrics
    Spot Data Mode

APPENDIX D DMT Algorithms
    The SOM Algorithm
    Neighborhood
    Learning Rate
    The Correlation Coefficient Clustering Algorithm
    The Matrix Algorithm

APPENDIX E Toolbars & Shortcuts
    DMT Main Toolbar
    Session Toolbar
    Shortcut Descriptions

CHAPTER 1 Welcome

Welcome to the Affymetrix® Data Mining Tool (DMT) User’s Guide. The DMT filters, queries and analyzes publish databases of GeneChip® or spotted array expression data.

Data Mining Tool User’s Guide

This manual explains how to use DMT to:

- Build a query.
- Display the query results in table or graph format.
- Evaluate and compare replicate data using statistical analyses.
- Calculate the overlap significance between two lists of GeneChip® probe sets or spot probes.
- Apply cluster analysis to experimental results to help identify gene expression patterns.

This manual also includes a tutorial that demonstrates: 1) a data mining strategy to identify genes that significantly change expression level, 2) statistical analyses of replicate data, and 3) cluster analysis.

What’s New in DMT 3.0

Compatible with Microarray Suite Statistical or Empirical Expression Algorithm

DMT can query and analyze experimental results generated by the Statistical Expression algorithm (in Microarray Suite 5.0) as well as the Empirical Expression algorithm (in versions of Microarray Suite prior to 5.0).
