WEKA Manual for Version 3-7-8


Remco R. Bouckaert
Eibe Frank
Mark Hall
Richard Kirkby
Peter Reutemann
Alex Seewald
David Scuse

January 21, 2013

© 2002-2013 University of Waikato, Hamilton, New Zealand
Alex Seewald (original Command-line primer)
David Scuse (original Experimenter tutorial)

This manual is licensed under the GNU General Public License version 3. More information about this license can be found at http://www.gnu.org/licenses/gpl-3.0-standalone.html

Contents

I The Command-line

1 A command-line primer
  1.1 Introduction
  1.2 Basic concepts
    1.2.1 Dataset
    1.2.2 Classifier
    1.2.3 weka.filters
    1.2.4 weka.classifiers
  1.3 Examples
  1.4 Additional packages and the package manager
    1.4.1 Package management
    1.4.2 Running installed learning algorithms

II The Graphical User Interface

2 Launching WEKA

3 Package Manager
  3.1 Main window
  3.2 Installing and removing packages
    3.2.1 Unofficial packages
  3.3 Using a http proxy
  3.4 Using an alternative central package meta data repository
  3.5 Package manager property file

4 Simple CLI
  4.1 Commands
  4.2 Invocation
  4.3 Command redirection
  4.4 Command completion

5 Explorer
  5.1 The user interface
    5.1.1 Section Tabs
    5.1.2 Status Box
    5.1.3 Log Button
    5.1.4 WEKA Status Icon
    5.1.5 Graphical output
  5.2 Preprocessing
    5.2.1 Loading Data
    5.2.2 The Current Relation
    5.2.3 Working With Attributes
    5.2.4 Working With Filters
  5.3 Classification
    5.3.1 Selecting a Classifier
    5.3.2 Test Options
    5.3.3 The Class Attribute
    5.3.4 Training a Classifier
    5.3.5 The Classifier Output Text
    5.3.6 The Result List
  5.4 Clustering
    5.4.1 Selecting a Clusterer
    5.4.2 Cluster Modes
    5.4.3 Ignoring Attributes
    5.4.4 Working with Filters
    5.4.5 Learning Clusters
  5.5 Associating
    5.5.1 Setting Up
    5.5.2 Learning Associations
  5.6 Selecting Attributes
    5.6.1 Searching and Evaluating
    5.6.2 Options
    5.6.3 Performing Selection
  5.7 Visualizing
    5.7.1 The scatter plot matrix
    5.7.2 Selecting an individual 2D scatter plot
    5.7.3 Selecting Instances

6 Experimenter
  6.1 Introduction
  6.2 Standard Experiments
    6.2.1 Simple
      6.2.1.1 New experiment
      6.2.1.2 Results destination
      6.2.1.3 Experiment type
      6.2.1.4 Datasets
      6.2.1.5 Iteration control
      6.2.1.6 Algorithms
      6.2.1.7 Saving the setup
      6.2.1.8 Running an Experiment
    6.2.2 Advanced
      6.2.2.1 Defining an Experiment
      6.2.2.2 Running an Experiment
      6.2.2.3 Changing the Experiment Parameters
      6.2.2.4 Other Result Producers
  6.3 Cluster Experiments
  6.4 Remote Experiments
    6.4.1 Preparation
    6.4.2 Database Server Setup
    6.4.3 Remote Engine Setup
    6.4.4 Configuring the Experimenter
    6.4.5 Multi-core support
    6.4.6 Troubleshooting
  6.5 Analysing Results
    6.5.1 Setup
    6.5.2 Saving the Results
    6.5.3 Changing the Baseline Scheme
    6.5.4 Statistical Significance
    6.5.5 Summary Test
    6.5.6 Ranking Test

7 KnowledgeFlow
  7.1 Introduction
  7.2 Features
  7.3 Components
    7.3.1 DataSources
    7.3.2 DataSinks
    7.3.3 Filters
    7.3.4 Classifiers
    7.3.5 Clusterers
    7.3.6 Evaluation
    7.3.7 Visualization
  7.4 Examples
    7.4.1 Cross-validated J48
    7.4.2 Plotting multiple ROC curves
    7.4.3 Processing data incrementally
  7.5 Plugins
    7.5.1 Flow components
    7.5.2 Perspectives

8 ArffViewer
  8.1 Menus
  8.2 Editing

9 Bayesian Network Classifiers
  9.1 Introduction
  9.2 Local score based structure learning
    9.2.1 Local score metrics
    9.2.2 Search algorithms
  9.3 Conditional independence test based structure learning
  9.4 Global score metric based structure learning
  9.5 Fixed structure ’learning’
  9.6 Distribution learning
  9.7 Running from the command line
  9.8 Inspecting Bayesian networks
  9.9 Bayes Network GUI
  9.10 Bayesian nets in the experimenter
  9.11 Adding your own Bayesian network learners
  9.12 FAQ
  9.13 Future development

III Data

10 ARFF
  10.1 Overview
  10.2 Examples
    10.2.1 The ARFF Header Section
    10.2.2 The ARFF Data Section
  10.3 Sparse ARFF files
  10.4 Instance weights in ARFF files

11 XRFF
  11.1 File extensions
  11.2 Comparison
    11.2.1 ARFF
    11.2.2 XRFF
  11.3 Sparse format
  11.4 Compression
  11.5 Useful features
    11.5.1 Class attribute specification
    11.5.2 Attribute weights
    11.5.3 Instance weights

12 Converters
  12.1 Introduction
  12.2 Usage
    12.2.1 File converters
    12.2.2 Database converters

13 Stemmers
  13.1 Introduction ...


