From Insight to Action: Analytics from Both Sides of the Brain


Vaz Balasingham, Director of Solutions Consulting
[email protected]

Insight to Action – from Both Sides of the Brain
• Both Sides of the Brain
• Fast & Slow
• Insight to Action
• Visual Analytics
• Numerical Algorithms
• Insight Execution
• Wrap-Up / Questions
• Come see the demos
Value: Grow Revenue, Reduce Risk, Increase Productivity, ROI

© Copyright 2000-2016 TIBCO Software Inc.

Fun Problem from Kahneman
“A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?”

Themes: Thinking Fast and Slow… from Both Sides of the Brain
• System 1: Association Engine ("What you see is all there is")
• System 2: Monitor & Control ("Lazy Controller")
[Diagram: System 2 (Monitor & Control) and System 1 (Association Engine), shown with Spotfire]

Themes: Insight without Action has Little Value
[Chart: Business Value ($ to $$$$) against Time, with the stages Event, Data Ready for Analysis, Analysis Completed, Decision Made, Action Taken]

[Diagram labels: People & Reporting & Processes Dashboards Streaming APIs Analytics Data & Data Systems Visualizations]

The TIBCO Insight Platform
• Inputs: Data, Events
• Insights: Wrangle, Analyze, Model
• Actions: Predict, Decide, Act, Monitor
• Model sits between Insights and Actions
• Products: TIBCO Spotfire, Matlab, TIBCO Streambase, TIBCO Live Datamart

#1. Smart Visual Analytics
Spotfire: Visual Analytics

Spotfire: Graphs
Venn, Sankey, Dials, Donut, Chord, Gantt

Enhanced Data Connectivity
Connect Seamlessly with your Data Sources
• Hassle-free connectivity to your data sources
• Database drivers included with Spotfire software download

Recommendations
Jump Start Your Analysis
• Visualization recommendations based on your selected data
• Build one visualization, or an entire dashboard, in just a couple of clicks
• Data previews accelerate discovery & insights
• Powered by Spotfire Analytics Recommendation Engine™

Data Source View
Visual Overview of data table sources & operations
• Gain a better understanding of the data you’re trying to visualize.
• Speed up visualization design.

Data Panel
Key Information At Your Fingertips
• See all dimensions and measures available to visualize.
• Expanded view gives extra insights without going to other screens.
• Change data type, category, and sort order, all without dropping what you’re doing.
• Data filters built-in.
• Views based on type of data in column.

Annotations
Add Commentary To Your Analysis
• Overlay annotations on any part of your analysis.
• Select size, color, and font to match the look and feel of the rest of your visualizations.

Collaboration
Connect Your Data To The Conversation
• Have conversations about specific issues and insights found in your data.
• Save your views and discoveries on each message in the conversation.

TIBCO GeoAnalytics
WHAT IS IT?
• High-accuracy, global base maps designed for data exploration & location analytics
• Visualize, explore, analyze data in the context of location
• Modern, interactive and easy map navigation
• Mash-in new data sources quickly and provide accurate enterprise geo-coding
• Use multi-layered analysis to understand geographic correlations in data and expand understanding
FOR WHOM
• Analysts
• Data Scientist
• Business Users
MATURITY CURVE STEP
• Diagnose (foundation for Predict & Optimize)
QUICK DETAILS
A rich foundation for location analytics
• Automatic, offline geo-coding
• Combine multiple layers of data on the same map
• Import & save geographic files for future use
• Cloud based base map always available, always up-to-date
• Drill-down to a deeper level of detail

#2. Numerical Models

Insight to Action: the Model
[Diagram: model formats exchanged between Spotfire and Streambase]
• PMML: R PMML (CRAN), R2PMML (JPMML), KNIME PMML, SAS PMML, … many
• H2O: POJO, Model object
• R Model Object (TERR): *.rds, *.RData
• Matlab Model Object: MDS, *.mdl, PMML

Algorithms: Rules, Machine Learning & Optimization
• Declarative & Heuristic Rules
• SPC and Anomaly Detection
• Machine Learning
  • Supervised
  • Unsupervised
  • Gradient Boosting Machines
  • Random Forests
  • Deep Learning
• Optimization
  • Linear & Quadratic Programming
  • Genetic Algorithms
  • Process optimization
  • Capacity constraints

Machine Learning
Machine Learning finds predictive models in data without being told where to look
• Supervised – Solve known problems: y=f(X)
  • Build a model that predicts a condition (failure, success, ..)
  • What factors are driving customer network issues?
  • Decision Trees, Random Forests, Gradient Boosting Machines, Deep Learning
• Unsupervised – Identify new patterns, detect anomalies: X only
  • Are there new patterns or failure modes emerging?
  • Clustering, Principal Components, Deep Learning Autoencoder
• Optimization – Support Decision-making
  • Find best solution even when there are constraints on the process
  • What is the optimum allocation of resources for equipment maintenance?
  • Genetic Algorithm, Linear/Quadratic Programming

Anomaly Detection
Sensor Data / Anomaly Detection
Existence of known anomaly? What if we don’t have this?
Unsupervised – no y variable. Can still model!
• SPC and Western Electric Rules (ESP)
• Single class Support Vector Machine
• Principal Components: PC Score
• Deep Learning: Auto encoding

Model: Gradient Boosting Machine
• Boosting Process
  • Sample the data: fit a tree: recursive partitioning
  • Drop the observations down the tree
  • Re-sample the data, up-weighting the observations that weren’t fitted well in the previous model
  • Save all the trees and average them
• Excellent fit + prediction

Reference Diagram

Increasing Capacity and Resilience

Spotfire & Matlab Demo

Spotfire Platform – Complete

How does Spotfire Consume Data?
• In-Memory: Load data from source in to memory.
• In-Database (SQL, MDX): Leave data in DB. Dynamically load and discard data to visualize.
• On-Demand: Dynamically swap data in and out of memory.
DATA SOURCES: Event data streams, ActiveSpaces, RDBMS, Spreadsheets, Flat Files, Hadoop & Big Data stores, Analytical DWs (e.g. Exadata), XML, Cubes

#3. Streaming Analytics

Issues with Traditional Data Processing
[Diagram: Store, Analyze, Act]
• Data collected from multiple sources and periodically placed in a persistent store.
• Analytical processes are executed against the stored data.
• Introduces too much “decision latency”.
• Responses are delivered “after the fact”.
• Decisions are made on old and stale data.
• Maximum value is lost.

The New Era: Fast Data Processing
[Diagram: Analyze, Act, Store]
• Events are analyzed and processed in real-time as they arrive.
• Decisions are timely, contextual, and based on fresh data.
• Decision latency is eliminated, resulting in
  • Superior Customer Experience
  • Operational Excellence
  • Instant Awareness and Timely Decisions

Anomaly Detection: Data Flow
[Diagram: Streambase, Spotfire, Matlab]
• Transactions: “featurize” data from 1 row / transaction detail to 1 row / group
• Data at rest (Spotfire/Matlab): prepare data for anomaly detection modeling (Matlab, TERR)
• Anomaly detection models (TERR, Matlab): Autoencoder, Gradient Boosting Machine; model saved as Rdata object
• Streambase: real-time scoring with the anomaly detection models (TERR, Matlab)
• Spotfire / Matlab dashboards

Streaming Analytics with Streambase
[Diagram labels: Business Strategy, Continuous, Spotfire Visualization, Connection, Analytics designed by data scientists, Data Infrastructure, Automated Action]

Streaming Analytics
Analytics designed by data scientists

Streaming Analytics: Equipment Monitoring
• Load Reference Data to be used in Rules/Alerts, Cleaning, or Anywhere else
• Publish and Expose Clean Data to Whole Organization
• Output to BPM Systems, Web Services, Databases, and anywhere else!
• Continuously Build Features and Publish Summary Statistics for Analysis
• Continuously Publish Raw Events to another StreamBase Workflow or other Applications
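The "continuously build features and publish summary statistics" step, together with the SPC rules listed on the Anomaly Detection slide, can be sketched in plain Python. This is an illustration only and is not StreamBase or EventFlow code: the `RollingMonitor` class, the `monitor_stream` helper, the window size, and the 3-sigma threshold are all assumptions made for the example.

```python
from collections import deque
from math import sqrt


class RollingMonitor:
    """Keep a fixed-size window of recent readings and flag outliers.

    Hypothetical helper for illustration; not part of any TIBCO API.
    """

    def __init__(self, window=20, n_sigma=3.0):
        self.values = deque(maxlen=window)  # oldest readings fall out automatically
        self.n_sigma = n_sigma

    def update(self, x):
        """Score x against the current window, then add x to the window."""
        mean = std = None
        alert = False
        n = len(self.values)
        if n >= 2:
            # Summary statistics of the window *before* the new reading arrives.
            mean = sum(self.values) / n
            var = sum((v - mean) ** 2 for v in self.values) / (n - 1)
            std = sqrt(var)
            # Western Electric rule 1: a single point beyond 3 sigma of the centre line.
            alert = std > 0 and abs(x - mean) > self.n_sigma * std
        self.values.append(x)
        return {"value": x, "mean": mean, "std": std, "alert": alert}


def monitor_stream(readings, window=20, n_sigma=3.0):
    """Yield one summary record per incoming reading."""
    monitor = RollingMonitor(window, n_sigma)
    for x in readings:
        yield monitor.update(x)


if __name__ == "__main__":
    sensor = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 25.0, 10.0]
    flagged = [r["value"] for r in monitor_stream(sensor) if r["alert"]]
    print(flagged)  # -> [25.0]
```

Each reading is scored against the statistics of the readings that came before it, so the decision is made as the event arrives rather than after a batch has been stored. A real deployment would likely add the remaining Western Electric rules and publish the summary records downstream.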
StreamBase Adapters and Connectivity
• Financial
• Databases
• IoT: OSI-PI, WITSML, MQTT
• Integration / Messages: Kafka, Flume, Rabbit MQ
• Big Data
• Social
• Compute
• APIs: C++, Java, .NET, JavaScript, Python
• Messaging: TIBCO RV & EMS, JMS, MQ Series, RMDS, Solace, Tervela, Wombat
• General Purpose Adapters: FIX, STAMP, JDBC, ActiveSpaces, POP3, IRC, HTTP, SMTP, IM, XML, RSS

Streambase & Matlab Demo

Insight Platform Demo

Learn more about TIBCO
• www.tibco.com
• Accelerators & Community
  • https://community.tibco.com/
  • https://community.tibco.com/wiki/accelerators
• Spotfire micro site
  • http://spotfire.tibco.com/
  • http://spotfire.tibco.com/demos
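Looking back at the Gradient Boosting Machine slide, the boosting process can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the implementation used in the demos: it uses squared-error loss, one-split trees (stumps) on a single feature, and no re-sampling. "Up-weighting the observations that weren't fitted well" is realised here by fitting each new stump to the current residuals, and the trees are combined as a shrunken sum rather than a plain average. All function names are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All x values identical: no split possible, predict the mean residual.
        mean = sum(residuals) / len(residuals)
        return x[0], mean, mean
    return best[1], best[2], best[3]


def predict_stump(stump, xi):
    """Drop one observation down the stump."""
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_gbm(x, y, n_trees=50, learning_rate=0.1):
    """Fit stumps sequentially, each one to the residuals of the model so far."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Poorly fitted observations have large residuals, so they dominate the next fit.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * predict_stump(stump, xi)
                for xi, pi in zip(x, pred)]
    return base, learning_rate, stumps


def predict_gbm(model, xi):
    """Combine all saved stumps: base value plus the shrunken sum of their outputs."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    y = [1, 1, 1, 1, 5, 5, 5, 5]
    model = fit_gbm(x, y)
    print(round(predict_gbm(model, 2), 2), round(predict_gbm(model, 7), 2))
```

A small learning rate means each stump corrects only part of the remaining error, which is why many trees are saved and combined. Production libraries add deeper trees, row and column sampling, and other loss functions.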