KNIME User Training

KNIME.com AG

Copyright © 2017 KNIME.com AG

Overview
KNIME Analytics Platform

Licensed under a Creative Commons Attribution-Noncommercial-Share Alike license: https://creativecommons.org/licenses/by-nc-sa/4.0/

What is KNIME Analytics Platform?

• A tool for data analysis, manipulation, visualization, and reporting
• Based on the graphical programming paradigm
• Provides a diverse array of extensions:
  – Text Mining
  – Network Mining
  – Cheminformatics
  – Weka machine learning
• Many integrations, such as Java, Python, etc.

Additional Resources

KNIME pages (www.knime.org)
• SOLUTIONS for example workflows
• RESOURCES/LEARNING HUB: www.knime.org/learning-hub
• RESOURCES/NODE GUIDE: https://www.knime.org/nodeguide

KNIME Tech pages (tech.knime.org)
• FORUM for questions and answers
• DOCUMENTATION for docs, FAQ, changelogs, ...
• COMMUNITY CONTRIBUTIONS for dev instructions and third party nodes

KNIME TV on YouTube https://www.youtube.com/user/KNIMETV

The KNIME® Analytics Platform

Visual KNIME Workflows

NODES perform tasks on data

[Figure: a node with its input ports, output ports, and status indicator; the status labels shown are Not Configured, Idle, Executed, and Error]

Nodes are combined to create WORKFLOWS

Data Access

• Databases
  – MySQL, PostgreSQL
  – any JDBC (Oracle, DB2, MS SQL Server)
• Files
  – CSV, txt
  – Excel, Word, PDF
  – SAS, SPSS
  – XML
  – PMML
  – Images, texts, networks, chem
• Web, Cloud
  – REST, Web services
  – Twitter, Google

Big Data

• Spark • HDFS support • Hive • Impala • HP Vertica • In-database processing

Transformation

• Preprocessing
  – Row, column, matrix based
• Data blending
  – Join, concatenate, append
• Aggregation
  – Grouping, pivoting, binning
• Feature Creation and Selection

Analyze & Data Mining

• Regression
  – Linear, logistic
• Classification
  – Decision tree, ensembles, SVM, MLP, Naïve Bayes
• Clustering
  – k-means, DBSCAN, hierarchical
• Validation
  – Cross-validation, scoring, ROC
• Misc
  – PCA, MDS, item set mining
• External
  – R, Weka

Visualization

• Interactive
  – Scatter plot, histogram, pie charts, box plot
  – Highlighting (brushing)
• JFreeChart
• JavaScript
• Misc
  – Tag cloud, open street map, networks, molecules
• External
  – R

Deployment

• Database • Files • Excel, csv, txt • XML • PMML • to: local, KNIME Server, SSH-, FTP-Server • BIRT Reporting

Over 1500 native and embedded nodes included:

• Data Access: MySQL, Oracle, ...; SAS, SPSS, ...; Excel, Flat, ...; Hive, Impala, ...; XML, JSON, PMML; Text, Doc, Image, ...; Web Crawlers; Industry Specific; Community / 3rd
• Transformation: Row, Column, Matrix; Text, Image; Time Series; Java; Python; Community / 3rd
• Analysis & Mining: Statistics; Data Mining; Machine Learning; Web Analytics; Text Mining; Network Analysis; Social Media Analysis; R, Weka, Python; Community / 3rd
• Visualization: R; JFreeChart; JavaScript; Community / 3rd
• Deployment: via BIRT; PMML; XML, JSON; Databases; Excel, Flat, etc.; Text, Doc, Image; Industry Specific; Community / 3rd

Overview

• Installing KNIME Analytics Platform
• The KNIME Workspace
• The KNIME File Extensions
• The KNIME Workbench
  – Workflow editor
  – Explorer
  – Node repository
  – Node description
• Installing new features

Install KNIME Analytics Platform

• Select the KNIME version for your computer:
  – Mac, Windows, or Linux, in 32-bit or 64-bit
  – Note the different downloads (minimal or full)

• Download archive and extract the file, or download installer package and run it

Start KNIME Analytics Platform

• Go to the installation directory and launch KNIME, or use the shortcut created on your Desktop.

The KNIME Workspace

• The workspace is the folder/directory in which workflows (and potentially data files) are stored for the current KNIME session. • Workspaces are portable (just like KNIME)

Welcome Page

The KNIME Workbench

[Screenshot of the KNIME Workbench with its panels labeled: Servers and Workflows, Workflow Editor, Node Recommendations, Node Description, Node Repository, Console, Outline]

Creating New Workflows, Importing and Exporting

• Right-click the workspace in KNIME Explorer to create a new workflow or workflow group, or to import a workflow • Right-click a workflow or workflow group to export it

KNIME File Extensions

• Dedicated file extensions for Workflows and Workflow groups associated with KNIME Analytics Platform

• *.knwf for KNIME Workflow Files

• *.knar for KNIME Archive Files

More on Nodes…

A node can have 3 states:

Idle: The node is not yet configured and cannot be executed with its current settings.

Configured: The node has been set up correctly and may be executed at any time.

Executed: The node has been successfully executed. Results may be viewed and used in downstream nodes.

Inserting and Connecting Nodes

• Insert nodes into the workflow editor by dragging them from the Node Repository or by double-clicking them in the Node Repository
• Connect nodes by left-clicking the output port of Node A and dragging the cursor to the (matching) input port of Node B
• Common port types: Data, Database Connection, Database Query, Model, Image, Flow Variable

Node Configuration

• Most nodes require configuration
• To access a node configuration window:
  – Double-click the node
  – Right-click > Configure

Node Execution

• Right-click node • Select Execute in context menu • If execution is successful, status shows green light • If execution encounters errors, status shows red light

Node Views

• Right-click node • Select Views in context menu • Select output port to inspect execution results

Plot View

Data View

Workflow Coach

• Recommendation engine – It gives hints about which node to use next in the workflow – Based on the KNIME community's usage statistics – Usage statistics are also available with the Personal Productivity Extension and KNIME Server products (these products require a purchased license)

Getting Started: KNIME Example Server

• Public repository with large selection of example workflows for many, many applications • Connect via KNIME Explorer

Online Node Guide

• Workflows from Example Server also available online – https://www.knime.org/nodeguide

Hot Keys (for future reference)

Task, hot key, and description:

Node Configuration
• F6: opens the configuration window of the selected node

Node Execution
• F7: executes selected configured nodes
• Shift + F7: executes all configured nodes
• Shift + F10: executes all configured nodes and opens all views
• F9: cancels selected running nodes
• Shift + F9: cancels all running nodes

Move Nodes and Annotations
• Ctrl + Shift + Arrow: moves the selected node in the arrow direction
• Ctrl + Shift + PgUp/PgDown: moves the selected annotation to the front or to the back of all overlapping annotations

Workflow Operations
• F8: resets selected nodes
• Ctrl + S: saves the workflow
• Ctrl + Shift + S: saves all open workflows
• Ctrl + Shift + W: closes all open workflows

Meta-node
• Shift + F12: opens the meta-node wizard

Today’s Example: Next Best Offer (NBO)

• Traditional Direct Marketing advertises a single product to a specific audience. The Next Best Offer (NBO) approach focuses on taking existing customers (and their data) and using upsell models to find interesting new products for them.

• Today we construct a workflow that joins diverse data sources into a set of complete customer records. Using this, we will build and deploy a predictive model to find people who might be interested in a newly available product.

Today’s Example: Next Best Offer (NBO)

Explore the final workflow

Importing Data

Accessing files and databases

Data Source Nodes

Typically characterized by: • color • No input ports, 1-2 output ports

Output port Status

Node description

New Node: File Reader

Workhorse of the KNIME Source nodes • Reads text based files • Many advanced features allow it to read most ‘weird’ files • Short lines, inline comments, headers and special encoding

YouTube KNIME TV Channel video: https://youtu.be/flaHQw-Qhlg

File Reader Configuration

File path

Basic Settings Advanced Settings

Preview

Help Button

Alternative Faster Way …

Drag & Drop OR Copy & Paste

Filenames and the knime:// protocol

Absolute URL

Mountpoint-relative URL

Local path

Workflow Relative File Paths

• Best choice if workflows are to be shared • Requires matching folder structure within workflow group • Independent of environment outside of workflow group

Example: Path to „Sentiment Analysis.table“ • Local path: :\Users\rb\knime-workspace\KNIMEUserTraining\data\Sentiment Analysis.table • Workflow relative:

YouTube KNIME TV Channel: https://youtu.be/U9sP4g4yGwY

New Node: Excel Reader (XLS)

• Reads .xls and .xlsx files from Microsoft Excel – Supports reading from multiple sheets

Excel Reader Configuration

File path

Sheet specific settings

Preview

New Node: Table Reader

• Reads tables from the native KNIME format – Maximum performance, minimum configuration

File path

YouTube KNIME TV channel video: https://youtu.be/tid1qi2HAOo

Database Connectivity

• Read data from any JDBC enabled database • Write your own SQL or model it using dedicated nodes

New Nodes: Database Connectors

• Native: PostgreSQL, MySQL, SQLite, MSSQL (SQL Server) • Database Connector for other JDBC-enabled databases (e.g. Oracle, DB2, HANA) • Commercial extensions: Hive and Impala

New Nodes: SQLite

• Propagate connection information to other DB nodes • File-based database • Useful for prototyping (switch to real connector on deployment)

New Node: Database Table Selector

• Take connection information and construct a query • Explore DB metadata • Output is an SQL query

New Node: Database Connection Table Reader

• Executes SQL Query • Reads results into a KNIME table

Other Useful Data Sources

• PMML Reader – reads standard predictive models • XML Reader with XPATH support • Python/R Source nodes • SAS7BDAT (Labs) • REST/SOAP for web services, and many more

Importing Data Exercise

Starting with exercise: Importing Data
Read the following files:
– Sentiment Analysis.table
– Sentiment Rating.csv
– Product Data2.xls

Optional: Read table web_activity from the database WebActivity.sqlite (hint: drag and drop the file to your workflow to get started)

Data Manipulation

Clean, join, aggregate

Data Manipulation Nodes

• Yellow color with a variety of input and output ports • Apply a transformation to input data • Many, many nodes!

New Node: Concatenate

Combine rows from 2 tables with shared columns • Handles duplicate row keys gracefully • Take the union or intersection of columns
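The difference between taking the union and the intersection of columns can be sketched outside KNIME in a few lines of plain Python. This is only an illustration of the idea, not how the Concatenate node is implemented: tables are assumed to be lists of dictionaries, the column names `Visits` and `Device` are made up, `None` stands in for a missing value, and row keys are not modeled at all.

```python
def concatenate(top, bottom, mode="union"):
    """Stack the rows of two tables given as lists of dicts (column -> value)."""
    top_cols = list(top[0]) if top else []
    bottom_cols = list(bottom[0]) if bottom else []
    if mode == "union":
        # Keep every column that occurs in either table.
        columns = top_cols + [c for c in bottom_cols if c not in top_cols]
    elif mode == "intersection":
        # Keep only the columns that both tables share.
        columns = [c for c in top_cols if c in bottom_cols]
    else:
        raise ValueError("mode must be 'union' or 'intersection'")
    # Cells with no source value become None (stand-in for a missing value).
    return [{c: row.get(c) for c in columns} for row in top + bottom]


# Hypothetical web activity rows from an old and a new system.
old_system = [{"Customer Key": 1, "Visits": 12}]
new_system = [{"Customer Key": 2, "Visits": 7, "Device": "mobile"}]
union_rows = concatenate(old_system, new_system, "union")
shared_rows = concatenate(old_system, new_system, "intersection")
```

With the union, the old-system row gets a missing value in the `Device` column; with the intersection, the `Device` column is dropped.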

New Node: Cell Replacer

Replaces the content of a column based on a lookup • Top port references the table to be searched • Bottom port holds the lookup table (search keys and replacement values)
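The lookup-based replacement can be illustrated with a small Python sketch. It is not the node's implementation: the table is assumed to be a list of dictionaries, the lookup table is reduced to a plain dictionary, and the sentiment labels and numeric values are invented for the example.

```python
def replace_cells(rows, column, lookup, default=None):
    """Return copies of the rows with `column` replaced via a lookup dictionary."""
    replaced = []
    for row in rows:
        new_row = dict(row)
        # Values without a match keep their original content unless a
        # default replacement is given.
        fallback = row[column] if default is None else default
        new_row[column] = lookup.get(row[column], fallback)
        replaced.append(new_row)
    return replaced


# Hypothetical lookup table: search key -> replacement value.
sentiment_values = {"negative": -1, "neutral": 0, "positive": 1}
ratings = [
    {"Customer Key": 1, "Sentiment": "positive"},
    {"Customer Key": 2, "Sentiment": "unknown"},
]
numeric_ratings = replace_cells(ratings, "Sentiment", sentiment_values)
```

Here the first row becomes `1`, while the second row keeps `"unknown"` because the lookup table has no entry for it.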

New Node: String Manipulation

Create and edit values in String columns • Clean up capitalization (e.g. lowercase) • Modify existing strings or create new columns

Data Manipulation Exercise, Activity I

Starting with exercise: Data Manipulation, Activity I – Concatenate web activity data from old and new systems – Replace sentiment evaluation (strings) with corresponding numeric values – Use String Manipulation to ensure that all entries of the Products column from the product data spreadsheet are lower case.

Joining Columns of Data

[Figure: a Left Table and a Right Table joined by ID]
• Inner Join
• Left Outer Join: missing values in the right table
• Right Outer Join: missing values in the left table

Joining Columns of Data

[Figure: a Left Table and a Right Table joined by ID]
• Full Outer Join: missing values in the right table and missing values in the left table
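The four join modes can be compared side by side with a small Python sketch. It is a minimal illustration under stated assumptions, not the Joiner node's algorithm: tables are lists of dictionaries, the two tables share only the join column, `None` stands in for a missing value, and the `Age` and `Product` columns are invented.

```python
def join(left, right, key, how="inner"):
    """Join two tables (lists of dicts) on `key`; `how` picks the join mode."""
    if how not in ("inner", "left", "right", "full"):
        raise ValueError("how must be inner, left, right, or full")
    left_cols = [c for c in (left[0] if left else {}) if c != key]
    right_cols = [c for c in (right[0] if right else {}) if c != key]
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    result, matched_keys = [], set()
    for l_row in left:
        partners = right_by_key.get(l_row[key], [])
        if partners:
            matched_keys.add(l_row[key])
            result.extend({**l_row, **r_row} for r_row in partners)
        elif how in ("left", "full"):
            # No partner: the right table's columns become missing values.
            result.append({**l_row, **dict.fromkeys(right_cols)})
    if how in ("right", "full"):
        for r_row in right:
            if r_row[key] not in matched_keys:
                # No partner: the left table's columns become missing values.
                result.append({**dict.fromkeys(left_cols), **r_row})
    return result


left_table = [{"ID": 1, "Age": 34}, {"ID": 2, "Age": 51}]
right_table = [{"ID": 2, "Product": "A"}, {"ID": 3, "Product": "B"}]
inner_rows = join(left_table, right_table, "ID", "inner")
full_rows = join(left_table, right_table, "ID", "full")
```

Only ID 2 appears in both tables, so the inner join returns one row, while the full outer join returns three rows with missing values wherever a partner row was absent.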

New Node: Joiner

• Combines columns from 2 different tables • Top port contains “Left” data table • Bottom port contains the “Right” data table

Joiner Configuration – Linking Rows

Values to join on. Multiple joining columns are allowed.

Joiner Configuration – Column Selection

Columns from left table to output table

Columns from right table to output table

Data Aggregation

aggregated on “group” by method: sum(“value”)
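As a rough illustration of this aggregation, the following Python sketch groups rows on a "group" column and sums a "value" column. The sample rows are invented, and the output column name `sum(value)` simply mirrors the caption above rather than any guaranteed KNIME naming.

```python
def group_by_sum(rows, group_column, value_column):
    """Sum `value_column` for each distinct value of `group_column`."""
    totals = {}
    for row in rows:
        group = row[group_column]
        totals[group] = totals.get(group, 0) + row[value_column]
    # One output row per group, in order of first appearance.
    return [
        {group_column: group, f"sum({value_column})": total}
        for group, total in totals.items()
    ]


table = [
    {"group": "a", "value": 1},
    {"group": "b", "value": 2},
    {"group": "a", "value": 3},
]
aggregated = group_by_sum(table, "group", "value")
```

The three input rows collapse into two output rows, one per group, which is also why grouping on a key column can be used to resolve duplicate records.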

New Node: GroupBy

Aggregate to remove duplicates or summarize data • First tab provides grouping options • Second tab provides control over aggregation details

YouTube KNIME TV video: https://youtu.be/bDwF-TOMtWw

Aggregation columns

Aggregation methods

In-database Data Manipulation

• Model SQL query using nodes • DB versions of GroupBy, Joiner, Row Filter, Sorter, etc.

Comments & Annotations

Double-click to write; right-click to change properties

YouTube KNIME TV Channel: https://youtu.be/AHURYB_O8sA

Workflow Organisation – Good Practices

• Workflow annotations
• Node labels
• Metanodes
  – Right click -> Collapse...
  – Organize workflow by task
  – Hide complexity & improve readability

Data Manipulation Exercise, Activity II

Starting with exercise Data Manipulation, Activity II • Join all data together using a series of joiner nodes and the “Customer Key” field • Resolve duplicates in the joined dataset (hint: GroupBy node) • Clean up and document your workflow using annotations, node labels and metanodes

Data Visualization

Charts and tables


Data Visualization

• Interactive plots and tables (with Highlighting) • JavaScript integration for interactive views • R View nodes for building advanced graphics in KNIME

New Node: Color Manager

One of several visual property managers (e.g. size, shape) • Color by nominal or continuous values • Sync colors between views using the color model port

Hiliting

• Hilited data is visible across all views • Keep multiple views open to explore complex data

New Node: Scatter Plot

• Plot different columns on X and Y • Displays data including pre-calculated visual properties (size, shape, color) • Supports highlighting • Produces a view but no image

New Node: JavaScript Scatter Plot

• Plot different columns on X and Y • Displays data including pre-calculated visual properties (size, shape, color) • Does not support highlighting • Produces an interactive view and an image

New Node: JavaScript Scatter Plot

• 3 configuration tabs

New Node: Scatter Plot (JFreeChart)

• Plot different columns on X and Y • Displays data including pre-calculated visual properties (size, shape, color) • Does not support highlighting • Produces a static view and an image

New Node: Interactive Table

• No graphics • Supports highlighting

Other Nodes: R View

• R View nodes for maximum customizability

Visualization Exercise

Start with exercise: Visualization • Color data by Product • Produce a scatter plot of Age vs. Estimated Yearly Income • In the “Scatter Plot” node, highlight some data points and view only the highlighted points using an Interactive Table node • Optional: Visualize data with the R View node (start with a script like: plot(knime.in))

Data Mining

Partition, learn, predict, score

Data Mining Strategies

Example applications: • Anomaly Detection (fraud, predictive maintenance) • Association Rule Learning (market basket analysis) • Clustering (market segmentation) • Classification (next best offer, churn prevention) • Regression (trend estimation)

Data Mining: Process Overview

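[Diagram: the Original Data Set is split into a Training Set and a Test Set; the Training Set is used to Train Model, the model is applied to the Test Set (Apply Model), and the result is scored (Score Model). Stages: Partition data; Train and apply models; Evaluate performance]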

Data Mining in KNIME

• KNIME has many modeling tools!
  – Decision tree, random forest, SVM, regression, neural networks, clustering, …
  – and integrations with other libraries: WEKA, libSVM, R, Python (scikit-learn) etc.
• And many model evaluation nodes
  – ROC, standard, numeric and entropy scorers
  – Feature elimination
  – Cross validation

New Node: Partitioning

• Use to split data into training and evaluation sets – Partition by count (e.g. 10 rows) or fraction (e.g. 10%) – Sample by a variety of methods: random, linear, stratified
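To make the idea of stratified sampling by fraction concrete, here is a minimal Python sketch. It is an illustration only and does not reproduce the Partitioning node's exact behavior: rows are assumed to be dictionaries, the class column is called `Target` as in the exercise data, and the rounding and the fixed random seed are choices made for the example.

```python
import random
from collections import defaultdict


def stratified_partition(rows, class_column, fraction, seed=0):
    """Split rows into two partitions with similar class proportions."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[class_column]].append(row)

    first, second = [], []
    for members in by_class.values():
        rng.shuffle(members)
        # Take the requested fraction from each class separately.
        cut = round(len(members) * fraction)
        first.extend(members[:cut])
        second.extend(members[cut:])
    return first, second


# Hypothetical data: 8 rows of class "yes" and 4 rows of class "no".
data = [{"id": i, "Target": "yes" if i < 8 else "no"} for i in range(12)]
training, evaluation = stratified_partition(data, "Target", 0.5)
```

Both partitions keep the 2:1 ratio of "yes" to "no" found in the full data set, which a purely random split of a small table would not guarantee.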

Learner-predictor Motif

• Most data mining approaches in KNIME use a Learner-predictor motif. • The Learner node trains the model with its input data. • The Predictor node applies the trained model to a different subset of data.

[Diagram: a Learner node passes the trained model to a Predictor node, which applies it to new data]

Classification

Predict nominal outcomes on existing data (supervised)

• Applications
  – Churn analysis (yes/no)
  – Chemical activity (active/inactive)
  – Spam detection (spam/not spam)
  – Optical character recognition (A-Z)
• Methods
  – Decision Trees
  – Neural Networks
  – Naïve Bayes

KNIME’s Decision Tree

J.R. Quinlan, “C4.5: Programs for Machine Learning”
J. Shafer, R. Agrawal, M. Mehta, “SPRINT: A Scalable Parallel Classifier for Data Mining”

C4.5 builds a tree from a set of training data using the concept of information entropy. At each node of the tree, the attribute of the data with the highest normalized information gain (difference in entropy) is chosen to split the data. The C4.5 algorithm then recurses on the smaller sub lists.
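The split criterion described above can be worked through in a few lines of Python. This sketch computes entropy, information gain, and the gain ratio (information gain divided by the split information) for nominal attributes only. It is not KNIME's Decision Tree Learner, and the toy rows with their `Married` and `Coin` attributes are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, normalized by split info."""
    labels = [row[target] for row in rows]
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row[target])
    weights = [len(subset) / len(rows) for subset in subsets.values()]
    # Entropy that remains after the split, weighted by subset size.
    remainder = sum(w * entropy(s) for w, s in zip(weights, subsets.values()))
    information_gain = entropy(labels) - remainder
    split_information = -sum(w * math.log2(w) for w in weights)
    return information_gain / split_information if split_information > 0 else 0.0


# Hypothetical training rows: "Married" separates the classes, "Coin" does not.
rows = [
    {"Married": "yes", "Coin": "heads", "Income": ">50K"},
    {"Married": "yes", "Coin": "tails", "Income": ">50K"},
    {"Married": "no", "Coin": "heads", "Income": "<=50K"},
    {"Married": "no", "Coin": "tails", "Income": "<=50K"},
]
best = max(["Married", "Coin"], key=lambda a: gain_ratio(rows, a, "Income"))
```

In this toy data the `Married` attribute removes all uncertainty about the class, so it would be chosen for the split, whereas `Coin` yields no gain at all.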

New Node: Decision Tree Learner

Decision Tree View

Most unmarried people earn < 50K per year

New Node: Decision Tree Predictor

• Takes a decision tree model & applies it to new data • Check the box to append class probabilities

New Node: Scorer

• Compare predicted results to known truth in order to evaluate model quality • Confusion matrix shows the distribution of model errors • An accuracy statistics table provides a detailed analysis of model quality.
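What such a comparison involves can be sketched in Python for the two-class case. The example uses the standard convention that one class is designated as positive; the income labels follow the training data set, while the particular predictions are invented. It illustrates the bookkeeping only and is not the Scorer node's implementation.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive):
    """Count TP, FP, FN and TN for one designated positive class."""
    counts = Counter()
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


def accuracy(counts):
    """Fraction of records whose prediction matches the known truth."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


# Hypothetical known truth and model predictions.
actual = ["<=50K", "<=50K", "<=50K", ">50K", ">50K"]
predicted = ["<=50K", "<=50K", ">50K", ">50K", "<=50K"]
counts = confusion_counts(actual, predicted, positive="<=50K")
model_accuracy = accuracy(counts)
```

Three of the five predictions match the known truth, so the accuracy of this toy model is 0.6.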

New Node: Scorer

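With “<=50K” as the positive class:
• True Positives: Income = “<=50K”, Predicted = “<=50K”
• False Negatives: Income = “<=50K”, Predicted = “>50K”
• False Positives: Income = “>50K”, Predicted = “<=50K”
• True Negatives: Income = “>50K”, Predicted = “>50K”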

Scorer: Accuracy Measures
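
Accuracy measures of this kind are derived from the four confusion matrix counts. A minimal sketch in Python, assuming a two-class problem; the function name and the example counts are hypothetical, and this is not the Scorer node's own code:

```python
def accuracy_measures(tp, fp, fn, tn):
    """Common accuracy measures computed from confusion matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0       # also called sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only.
measures = accuracy_measures(tp=80, fp=10, fn=20, tn=90)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```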

Receiver Operating Characteristics

• Sort by confidence in target class
• Plot true positive rate (TPR) vs false positive rate (FPR)
• Ideal models achieve 100% TPR with 0% FPR
• Area under the curve indicates model quality (1 = ideal model, 0.5 = random outcome)
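
The steps above can be sketched in plain Python. This is an illustration under assumptions, not the ROC Curve node's code: the function names and toy data are hypothetical, tied scores are not treated specially, and both classes must occur in the data.

```python
def roc_points(truth, scores, positive):
    """Return (FPR, TPR) points after sorting by confidence in the positive class."""
    ranked = sorted(zip(scores, truth), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for label in truth if label == positive)
    n_neg = len(truth) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == positive:
            tp += 1   # one more positive found: curve moves up
        else:
            fp += 1   # one more false alarm: curve moves right
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predictions: probability assigned to the class ">50K".
truth = [">50K", ">50K", "<=50K", ">50K", "<=50K", "<=50K"]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

points = roc_points(truth, scores, positive=">50K")
auc = area_under_curve(points)
print(f"AUC = {auc:.3f}")
```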

New Node: ROC Curve

• Requires individual class probabilities from a preceding predictor
• User must define:
  1. Original class column
  2. Positive class value
  3. Probability for that class from 1 or more models
• See also the JavaScript ROC Curve node

Data Mining Exercise, Activity I

Starting with exercise: Data Mining, Activity I:
• Partition the fully joined data
  – 50%, Stratified Sampling
• Train a decision tree on the training data
  – (Learn against “Target” column)
• Use the model to predict the upsell potential for the remaining records
• Evaluate the quality of the model with a Scorer
• Optional: Find the AUC for the model using a ROC curve

Regression

Predict numeric outcomes on existing data (supervised)

Applications
– Forecasting
– Quantitative Analysis

Methods
– Linear
– Polynomial
– Regression Trees
– Partial Least Squares

New Nodes: Linear Regression Learner & Regression Predictor

• A linear model relating a dependent variable to 1 or more independent variables
  – Model coefficients are provided in the 2nd output port
  – Also available: Polynomial and Tree Ensemble Regression nodes
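
To make the learner/predictor split concrete, here is a minimal least-squares sketch in Python for the special case of a single independent variable. The function names and the data are hypothetical; the KNIME nodes handle multiple independent variables and are not implemented this way.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, xs):
    """Apply the fitted coefficients to new values of the independent variable."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


# "Learner" step: estimate the coefficients from training data.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
# "Predictor" step: apply the coefficients to unseen data.
predictions = predict(model, [5, 6])
print(model, predictions)
```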

New Node: Numeric Scorer

Similar to the Scorer node, but for models with numeric predictions (e.g. linear/polynomial regression)
• Compare dependent variable values to predicted values to evaluate goodness of fit
• Reports R², RMSD, SEM etc.
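
Goodness-of-fit statistics of this kind can be computed directly from the actual and predicted values. A minimal sketch in Python, with a hypothetical function name and toy values; it shows R², RMSD, and two related error measures, and is not the node's own code:

```python
import math


def numeric_scores(actual, predicted):
    """Goodness-of-fit statistics for numeric predictions."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmsd": math.sqrt(ss_res / n),
        "mean_absolute_error": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
        "mean_signed_difference": sum(p - a for a, p in zip(actual, predicted)) / n,
    }


# Hypothetical values for illustration only.
scores = numeric_scores(actual=[2, 4, 6, 8], predicted=[3, 3, 7, 7])
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

An R² close to 1 and a small RMSD relative to the range of the dependent variable suggest a good fit.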

Data Mining Exercise, Activity II

Starting with exercise: Data Mining, Activity II:
• Partition the fully joined data
  – 50%, Stratified Sampling
• Train a linear regression model that predicts age as a function of some other parameters in the data set
• Use the model to predict the age of the remaining users
• Evaluate the quality of the model with a Numeric Scorer
• Is this model useful for predicting customer age?

Clustering

Discover hidden structure in unlabeled data (unsupervised)

Applications
– Market Segmentation
– Diversity picking

Methods
– K-means/medoids
– Hierarchical
– DBSCAN
– Neighbourgrams

New Nodes: k-Means Clustering

• Looks at n observations to define the means for k clusters
• Each observation is then assigned to its closest cluster center
• You must provide k
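
The procedure can be sketched as the classic iterative k-means loop. This is a minimal Python illustration, not the node's implementation: the function name and toy points are hypothetical, and taking the first k points as starting centers is an assumption made here to keep the example deterministic.

```python
import math


def k_means(points, k, iterations=100):
    """Cluster points into k groups; returns (centers, assignment)."""
    # Assumption: use the first k points as the initial cluster centers.
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: each observation goes to its closest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center becomes the mean of its members.
        new_centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    assignment = [min(range(k), key=lambda i: math.dist(p, centers[i]))
                  for p in points]
    return centers, assignment


# Hypothetical 2-D observations forming two obvious groups.
points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centers, assignment = k_means(points, k=2)
print(centers, assignment)
```

Because k is an input, running the loop for several values of k and comparing the results is a common way to choose it.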

New Node: Entropy Scorer

• Similar to the Scorer node, but used with unsupervised learning (no target to predict)
  – Cluster labels and reference clusters do not need to be in the same domain (e.g. match “Cluster 1” to “iris setosa”)
  – Reports entropy-based statistics that indicate model quality (the aim is low entropy, which means high quality)
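
The idea behind an entropy-based cluster score can be sketched as follows: for each cluster, measure how mixed its reference labels are. This is a simplified Python illustration with hypothetical names and toy labels; the statistics reported by the node itself may be defined or normalized differently.

```python
import math
from collections import Counter, defaultdict


def cluster_entropy(cluster_labels, reference_labels):
    """Entropy of the reference labels inside each cluster, plus the weighted mean."""
    members = defaultdict(list)
    for cluster, reference in zip(cluster_labels, reference_labels):
        members[cluster].append(reference)
    total = len(cluster_labels)
    per_cluster = {}
    for cluster, refs in members.items():
        # 0 bits means the cluster contains a single reference class only.
        per_cluster[cluster] = -sum(
            (n / len(refs)) * math.log2(n / len(refs))
            for n in Counter(refs).values()
        )
    # Weight each cluster's entropy by its share of the observations.
    overall = sum(len(members[c]) / total * e for c, e in per_cluster.items())
    return per_cluster, overall


# Cluster labels and reference labels come from different domains.
clusters = ["Cluster 1", "Cluster 1", "Cluster 2", "Cluster 2"]
references = ["setosa", "setosa", "setosa", "virginica"]
per_cluster, overall = cluster_entropy(clusters, references)
print(per_cluster, overall)
```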

Data Mining Exercise, Activity III

Start with exercise: Data Mining, Activity III
• Read the location_data.table file
• Filter to entries from California (region_code = CA)
• Train a k-means model with k=3. Use only position data for clustering (latitude and longitude)
• Evaluate with Entropy Scorer. Compare cluster labels to the “City” column. Hint: use the k-means output for both ports in the input to the scorer.
• Optional: Plot latitude and longitude in a view (OSM Map or Scatter Plot) and use that to help you visually optimize k.

Integrating External Tools

Goal of This Session

This session gives a quick overview of the external tools that can be called from within KNIME, for example Java, R, web services, or any other external tool of your choice.

KNIME Labs

• KNIME Labs enables you to preview new KNIME features and plug-ins that are still under development.
• The nodes provided in KNIME Labs (KNIME Tech) are not (yet) part of the official KNIME version because the functionality and/or API may not be finalized.
• You can get these plug-ins by installing the KNIME Labs extension package. Most of the plug-ins already come with a detailed description and some workflow examples.

Java Snippet

• Fastest running scripting node in KNIME
• Syntax highlighting, auto completion, error checking
• Templates allow you to save scripts for later re-use
• Import custom libraries

Java Edit Variable

Same as Java snippet, but without table ports

R Integration

• Run R inside KNIME
• Works with existing R installations
• Nodes for many tasks, but all with similar interfaces…
• First run: install.packages('Rserve') and install.packages('Cairo')*

* Mac only

R Integration

• Syntax highlighting
• Create and store templates
• Select R workspace
• Show results
• Evaluate script
• R error messages

Python Integration

• Run Python inside KNIME
• Works with existing installations
• UI modeled after R integration

Python Scripting UI

RESTful Web Services

In KNIME Labs:

JSON Response:

XML Response:

RESTful Web Services

https://www.knime.org/blog/IBM-Watson-meets-Google-API
https://www.knime.org/blog/OSM-meets-CSV-file-and-Google-API

KNIME Server as a REST resource

https://www.knime.org/blog/giving-the-knime-server-a-rest

KNIME Server as a REST resource

• Use cURL, SoapUI, or the Chrome extension Postman to explore the REST API

Generic Web Service Client (SOAP)

“X” resets all fields

“Analyze” parses the WSDL file and automatically fills in the details

Input Values

Output Values

Select to include

External Tool Node

JSON and JSONPath

• Use the JSON Reader (or the GET Resource) node to get a JSON cell
• Use JSONPath nodes to ‘query’ the JSON and extract certain parameters
• The editor window simplifies construction of JSONPath queries by auto-generating them (click on properties)
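
To show what a JSONPath query selects, the sketch below performs the equivalent traversal in plain Python. The document structure and field names are hypothetical, and the Python standard library has no JSONPath engine, so each query is written out as a list comprehension with the corresponding JSONPath expression in a comment.

```python
import json

# Hypothetical document for illustration only.
document = json.loads("""
{"books": [
  {"title": "Book A", "author": "Author A", "price": 10.0},
  {"title": "Book B", "author": "Author B", "price": 12.5}
]}
""")

# Equivalent of the JSONPath query  $.books[*].title
titles = [book["title"] for book in document["books"]]

# Equivalent of the JSONPath query  $.books[*].author
authors = [book["author"] for book in document["books"]]

# Equivalent of the JSONPath filter  $.books[?(@.price > 11)].title
expensive = [book["title"] for book in document["books"] if book["price"] > 11]

print(titles, authors, expensive)
```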

JSONPath

XML and XPath

• Use the XML Reader (or the GET Resource) node to get an XML cell
• Use XPath nodes to ‘query’ the XML and extract certain parameters
• The editor window simplifies construction of XPath queries by auto-generating them (click on XML elements)
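
The same kind of query can be tried outside KNIME with Python's standard library. In this sketch the XML document is hypothetical, and xml.etree.ElementTree supports only a limited subset of XPath, so the expressions are kept simple.

```python
import xml.etree.ElementTree as ET

# Hypothetical document for illustration only.
root = ET.fromstring("""
<books>
  <book><title>Book A</title><author>Author A</author></book>
  <book><title>Book B</title><author>Author B</author></book>
</books>
""")

# Path relative to the root element: every title of a direct book child.
titles = [element.text for element in root.findall("./book/title")]

# Descendant search: every author element anywhere below the root.
authors = [element.text for element in root.findall(".//author")]

print(titles, authors)
```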

XPath

Hive, HDFS, Spark Architecture

Big Data Connector Extensions

Query Big Data Platform with Hive

Loading Data to Hive

Spark Functionality

Machine Learning in Spark

Exercises

• Use the REST nodes to call an external web service
• Choose either books.json or books.xml and use the appropriate tools to extract the book name, author and title
• Choose your favorite from Java, Python and R:
  – Take the KNIME table and multiply all values in one column by 10
  – Python/R: Build a simple decision tree learner
• Alternatively, use the Generic JavaScript node and D3.js to create a custom JavaScript visualization that can be viewed in KNIME Analytics Platform and KNIME WebPortal

Exporting Data & Deployment

Exporting Data

After an analysis is completed, what next?
• Write results to a file
• Create/update a database
• Save the model for use elsewhere
• Generate a rich report
• Deploy via KNIME WebPortal
• Deploy the workflow as a RESTful web service

Input/Output in Deployment

Input
• File (CSV, Table, XLS, …)
• Database
• JSON for REST API

Output
• Report (BIRT, Tableau, Spotfire)
• Email
• File (CSV, table, XLS, …)
• Dashboard on WebPortal

To Report / Email

To BIRT Report

Also available: nodes that send data to Tableau and Spotfire

To File / Database

REST API on Server

To Dashboard on WebPortal

Step 1: Upload File
Step 2: Dashboard

Workflow on KNIME WebPortal

Step 1: Upload File
Step 2: Dashboard

Available in KNIME Server

Wrapped Node to produce Dashboard on Web Page

Interactive Table / Plots on web page

HTML Text

Bar Chart

Data Export Nodes

Typically characterized by:
• Magenta color
• 1 input port, no output ports
• Create file on file system or write to database

New Node: Table Writer

New Node: XLS Writer

New Node: Database Writer

Only if no Connector node

New Node: Database Update

Reporting in KNIME

• Reporting in KNIME is done via a 3rd party application named BIRT (Business Intelligence and Reporting Tools)
• Data is sent to BIRT from KNIME using special nodes
• Reports in BIRT are constructed from report items, which may include images, tables, charts and labels
• Reports may be generated in a variety of formats (html, pdf, pptx, xlsx, docx)

New Node: Data to Report

Send a data table to BIRT

New Node: Image to Report

Send an image to BIRT
• PNG and SVG are supported formats (see node description for details)

Edit the Report

Open the workflow > Click the Report Editor button in the tool bar

Reporting Perspective

Data from KNIME

Views Report Layout

Report Items

Charting in BIRT

• Many chart types
• Fine control of plot appearance
• Familiar Excel-like interface
• Supports interactivity

Exporting Data Exercise

Starting with exercise: Exporting Data
• Write your predictions to disk as a KNIME table
• Create a heatmap of the normalized confusion matrix and send it to BIRT
• Send your model accuracy table to BIRT
• Define a very simple report showing the accuracy of your model
• Generate a PDF of your report

The End

[email protected]

Copyright © 2017 KNIME.com AG