SQL ANALYTICS – Spark Mllib Algorithm Integration E.G

Total Page:16

File Type:pdf, Size:1020Kb

Load more

Oracle Machine Learning

& Advanced Analytics

Data Management Platforms

Move the Algorithms; Not the Data!

Charlie Berger, MS Engineering, MBA

Sr. Director Product Management,

Machine Learning, AI and Cognitive Analytics [email protected]

www.twitter.com/CharlieDataMine

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for

information purposes only, and may not be incorporated into any contract. It is not a

commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or

functionality described for Oracle’s products remains at the sole discretion of Oracle.

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

2

Machine Learning, Data Mining, Predictive Analytics

Automatically sift through large amounts of data to find

hidden patterns, discover new insights and make predictions

A1 A2 A3 A4 A5 A6 A7

• Identify most important factor (Attribute Importance)

• Predict customer behavior (Classification)

• Predict or estimate a value (Regression)

• Find profiles of targeted people or items (Decision Trees)

• Segment a population (Clustering)

• Find fraudulent or “rare events” (Anomaly Detection) • Determine co-occurring items in a “baskets” (Associations)

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |

Oracle’s Data Management and Machine Learning

Market Observations:

• Machine learning, predictive analytics & “AI” now must-have requirements

• Separate islands for data management and data science just don’t work

• Enterprises whose data science teams most rapidly extract insights and

predictions win

Conclusions:

• Must “operationalize” ML insights and predictions throughout enterprise

• Multilingual Machine Learning: SQL, R, Python, Workflow UI, Notebooks, Embed ML in Appls,

• Evolving towards combined Data Management + Machine Learning

environment that can essentially to manage and “think” about data

  • Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
  • 4

Oracle’s Data Management and Machine Learning

Architectural Strategy

• In-Database proprietary implementations of machine learning algorithms • Leverage strengths of the Database and adds new ML tech

– Counting, conditional probabilities, sort, rank, partition, group-by, collections, etc. – Parallel execution, bitmap indexes, partitioning, aggregations, recursion w/in parallel infrastructure, IEEE float, frequent itemsets, Automatic Data Preparation (ADP), Text processing, etc.

• Focus on intelligent ML defaults, simplification & automation to enable applications

– ADP, xforms, binning, missing values, Prediction_Details, Predictive_Queries, Model_views

• Machine learning models built via PL/SQL script; scored via SQL functions (1st class DB objects)

select cust_id

from customers

True power evident when scoring

where region = ‘US’

and prediction_probability(churnmod, ‘Y’ using *) > 0.8;

models using SQL functions, e.g.

– “Smart scan” ML model scoring “push down”; Supports OLTP and ATPC environments

• Machine Learning and Advanced Analytics are peer to rest of Oracle Data Mgmt features

– Security, Back-up, Encryption, Scalability, “Big Data” eco-system, BDA, Big Data SQL, Cloud, Spark, etc. – Best ML & analytical development and deployment platform

  • Copyright © 2016 Oracle and/or its affiliates. All rights reserved.
  • 5

Oracle’s Machine Learning & Advanced Analytics

Fastest Way to Deliver Enterprise-wide Predictive Analytics

Traditional ML
Oracle’s in-DB Machine Learning

Major Benefits

Data Import Data Mining

. Data remains in Database & Hadoop

. Model building and scoring occur in-database

. Use R packages with data-parallel invocations

Model “Scoring”

Data Prep. &
Transformation

avings

. Leverage investment in Oracle IT

. Eliminate data duplication

Data Mining
Model Building

. Eliminate separate analytical servers

Data Prep &
Transformation

. Deliver enterprise-wide applications

. GUI for ML/Predictive Analytics & code gen

. R interface leverages database as HPC engine

Model “Scoring”

Embedded Data Prep Model Building

Data Extraction

Data Preparation

Secs, Mins or Hours
Hours, Days or Weeks

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle’s Machine Learning & Adv. Analytics Algorithms

  • CLASSIFICATION
  • REGRESSION
  • FEATURE EXTRACTION

  • – Naïve Bayes
  • – Linear Model
  • – Principal Comp Analysis (PCA)

– Non-negative Matrix Factorization – Singular Value Decomposition (SVD) – Explicit Semantic Analysis (ESA)
– Logistic Regression (GLM) – Decision Tree
– Generalized Linear Model – Support Vector Machine (SVM) – Stepwise Linear regression

– Neural Network

– Random Forest

– Neural Network

TEXT MINING SUPPORT

– Support Vector Machine – Explicit Semantic Analysis
– LASSO
– Algorithms support text type – Tokenization and theme extraction – Explicit Semantic Analysis (ESA) for

document similarity

A1 A2 A3 A4 A5 A6 A7

ATTRIBUTE IMPORTANCE
CLUSTERING

– Minimum Description Length

– Principal Comp Analysis (PCA)

– Unsupervised Pair-wise KL Div – CUR decomposition for row & AI

– Hierarchical K-Means

– Hierarchical O-Cluster – Expectation Maximization (EM)

STATISTICAL FUNCTIONS

– Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.

  • ANOMALY DETECTION
  • ASSOCIATION RULES

  • – One-Class SVM
  • – A priori/ market basket

  • TIME SERIES
  • PREDICTIVE QUERIES
  • R PACKAGES

– State of the art forecasting using
Exponential Smoothing.

  • – Predict, cluster, detect, features
  • – CRAN R Algorithm Packages

through Embedded R Execution
– Spark MLlib algorithm integration

SQL ANALYTICS

– Includes all popular models e.g. Holt-Winters with trends,

seasons, irregularity, missing data

– SQL Windows, SQL Patterns,

SQL Aggregates

EXPORTABLE ML MODELS

• OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined

– REST APIs for deployment

• OAA includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data. etc,

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Multiple Data Scientist User Roles Supported

Oracle’s Machine Learning/Advanced Analytics

Additional relevant data
Historical or Current Data to

be “scored” for predictions and “engineered features”

Oracle Database 12c

  • Historical data
  • Assembled
  • Predictions & Insights

historical data

Sensor data, Text, unstructured data, transactional data, spatial data, etc.

  • DBAs
  • Application Developers

  • OML Notebook Users
  • R, Python Users, Data Scientists
  • Data Analysts, Citizen Data Scientists

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

OAA Model Build and Real-time SQL Apply Prediction

Simple SQL Syntax

ML Model Build (PL/SQL)

begin dbms_data_mining.create_model('BUY_INSUR1', 'CLASSIFICATION',
'CUST_INSUR_LTV', 'CUST_ID', 'BUY_INSURANCE', CUST_INSUR_LTV_SET'); end;

/

Model Apply (SQL query)

Select prediction_probability(BUY_INSUR1, 'Yes'
USING 3500 as bank_funds, 825 as checking_amount, 400 as credit_balance, 22 as age, 'Married' as marital_status, 93 as

MONEY_MONTLY_OVERDRAWN, 1 as house_ownership)

from dual;

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Fraud Prediction Demo

Automated In-DB Analytical Methodology

drop table CLAIMS_SET; exec dbms_data_mining.drop_model('CLAIMSMODEL');

create table CLAIMS_SET (setting_name varchar2(30), setting_value varchar2(4000));

insert into CLAIMS_SET values ('ALGO_NAME','ALGO_SUPPORT_VECTOR_MACHINES'); insert into CLAIMS_SET values ('PREP_AUTO','ON'); commit;

begin dbms_data_mining.create_model('CLAIMSMODEL', 'CLASSIFICATION',

'CLAIMS', 'POLICYNUMBER', null, 'CLAIMS_SET');

end; /

Automated Monthly “Application”! Just

-- Top 5 most suspicious fraud policy holder claims select * from (select POLICYNUMBER, round(prob_fraud*100,2) percent_fraud, rank() over (order by prob_fraud desc) rnk from
(select POLICYNUMBER, prediction_probability(CLAIMSMODEL, '0' using *) prob_fraud from CLAIMS

add:

Create

View CLAIMS2_30 As Select * from CLAIMS2 Where mydate > SYSDATE – 30 where PASTNUMBEROFCLAIMS in ('2to4', 'morethan4'))) where rnk <= 5 order by percent_fraud desc;

Time measure: set timing on;

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

ML Model Deployment for Real-Time Scoring

Real-Time Scoring, Predictions and Recommendations

• On-the-fly, single record apply with new data (e.g. from call center)

Oracle Cloud

Select prediction_probability(CLAS_DT_1_15, 'Yes'

USING 7800 as bank_funds, 125 as checking_amount, 20 as credit_balance, 55 as age, 'Married' as marital_status, 250 as MONEY_MONTLY_OVERDRAWN, 1 as house_ownership)

from dual;

Social Media

Likelihood to respond:

Call Center
Branch

Office

Get Advice

R

Mobile
Web

Email

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Manage and Analyze All Your Data

Data Scientists, R Users, Citizen Data Scientists

Architecturally,

Many Options

and Flexibility

SQL / R

Boil down the Data Lake

“Engineered Features”

Big Data SQL / R

– Derived attributes that reflect domain

knowledge—key to

best models e.g.:
• Counts

Object

Store

• Totals • Changes

over time

  • Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
  • 12

OAA Algorithm Performance

  • Algorithm
  • Time (sec.)
  • Number of Attributes

SVM SGD Solver GLM Classification SVM IPM Solver

GLM Regression

K-Means
83s
180s 811s

59s

900
9000
900

900

  • 168s
  • 9000

• In-memory 640 million rows, Airline On-Time dataset • SPARC M8-2 hardware • Courtesy of the SPARC Performance Team

Copyright © 2018 Oracle and/or its affiliates. All rights reserved. |

OAA Algorithm Scalability: Rows and Parallelism

Random Forest

200 180

160

140 120 100

80

60 40 20
0

  • 100
  • 150
  • 200
  • 250
  • 300
  • 350
  • 400
  • 450
  • 500
  • 550

Row Count (M records)

• X6-2 hardware • Benchmarks from Oracle Labs, Systems
Research Group

  • Copyright © 2018 Oracle and/or its affiliates. All rights reserved. |
  • 14

Operationalizing and Embedding Analytics for Action

How long does it take to put a defined model into operational use?

?
?

Length of time to put a model into production.

?

Based on 141 respondents who

stated they are doing this today

  • Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |
  • 15

The Core Ingredients of Good Machine Learning

  • Domain Knowledge + Data Machine Learning Algorithms
  • Insights, Predictions

Machine Learning Algorithms

A1 A2 A3 A4 A5

A1 A2 A3 A4 A5 A6 A7

A6 A7

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Deployment!!

Most Important Factor in Machine Learning?

Machine Learning Algorithms

A “Thinking”

Database

A1 A2 A3 A4 A5 A6 A7

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Introducing:

Oracle Machine Learning SQL Notebook

Copyright © 2017, Oracle and/or its affiliates. All rights reserved. |

Introducing Oracle Autonomous Data Warehouse Cloud

Value Proposition

  • Easy
  • Fast
  • Elastic

• Provision a data warehouse in as little

  • • Up to 14x performance advantage than
  • • Only Pay for What you Use with user

defined sizing, on-demand scaling & idle shut-off

as 15-seconds
• Automated management of database

Redshift1
• High concurrency supports multi-user

administration

• Simple Load and Go with Automated

access and workloads

• Based on Exadata for extreme

• Independent scaling of compute and

storage

Tuning

  • performance
  • • Instant scaling with zero downtime

• Dedicated cloud-ready migration tools including Redshift

  • Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
  • 19

Confidential – Oracle

Oracle Autonomous Data Warehouse Cloud Key Features

Highly Elastic
High-Performance Queries

Independently scale compute and storage, without having to overpay for fixed blocks of resources

and Concurrent Workloads

Optimized query performance with

preconfigured resource profiles for different types of users

Oracle Machine Learning

Oracle SQL

Built-in Web-Based SQL ML Tool

Apache Zeppelin Oracle Machine Learning

notebooks ready to run ML from browser
Autonomous DW Cloud is compatible with all business analytics tools that support

Oracle Database

Database migration utility
Self Driving

Dedicated cloud-ready migration tools

for easy migration from Amazon
Fully automated database for self-tuning patching and upgrading itself while the system is running
Redshift, SQL Server and other databases

Cloud-Based Data Loading

Fast, scalable data-loading from Oracle Object Store, AWS S3, or on-premises

Enterprise Grade Security

Data is encrypted by default in the cloud, as well as in transit and at rest

  • Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
  • 20

Confidential – Oracle

ADWC + Oracle Machine Learning Notebooks

Developer
Analytics

Autonomous Data Warehouse Cloud

Tools

Data Warehouse Services

(EDWs, DW, departmental marts and sandboxes)

  • Service Management
  • Built-in Access Tools

Oracle Exadata Cloud Service

Oracle Analytics Cloud
Oracle SQL Developer

Oracle Database Cloud Service

Service Console

Oracle Machine Learning

Express Cloud Service

Data
Integration
Services

Oracle Data Integration
Platform Cloud

Autonomous Database Cloud

3rd Party DI on
Oracle Cloud Compute

Oracle Object Storage Cloud

Flat Files and Staging

3rd Party DI On-premises

  • Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
  • 21

Oracle Machine Learning

Machine Learning Notebook for Autonomous Data Warehouse Cloud

Key Features

• Collaborative UI for data scientists

– Packaged with Autonomous Data

Warehouse Cloud (V1)
– Easy access to shared notebooks, templates, permissions, scheduler, etc.

– SQL ML algorithms API (V1)

– Supports deployment of ML analytics

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle Machine Learning

Machine Learning Notebook for Autonomous Data Warehouse Cloud

Key Features

• Collaborative UI for data scientists

– Packaged with Autonomous Data

Warehouse Cloud (V1)
– Easy access to shared notebooks, templates, permissions, scheduler, etc.

– SQL ML algorithms API (V1)

– Supports deployment of ML analytics

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle’s Machine Learning & Adv. Analytics Algorithms

  • CLASSIFICATION
  • REGRESSION
  • FEATURE EXTRACTION

  • – Naïve Bayes
  • – Linear Model
  • – Principal Comp Analysis (PCA)

– Non-negative Matrix Factorization – Singular Value Decomposition (SVD) – Explicit Semantic Analysis (ESA)
– Logistic Regression (GLM) – Decision Tree
– Generalized Linear Model – Support Vector Machine (SVM) – Stepwise Linear regression

– Neural Network

– Random Forest

– Neural Network

STATISTICAL FUNCTIONS

– Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.
– Support Vector Machine – Explicit Semantic Analysis

A1 A2 A3 A4 A5 A6 A7

ATTRIBUTE IMPORTANCE

– Minimum Description Length

– Principal Comp Analysis (PCA)

– Unsupervised Pair-wise KL Div

CLUSTERING

– Hierarchical K-Means

– Hierarchical O-Cluster – Expectation Maximization (EM)

ASSOCIATION RULES
ANOMALY DETECTION

– A priori/ market basket
– One-Class SVM

Recommended publications
  • Oracle White Paper June 2009

    Oracle White Paper June 2009

    An Oracle White Paper June 2009 Oracle Data Mining 11g Competing on In-Database Analytics Oracle White Paper— Oracle Data Mining 11g: Competing on In-Database Analytics Disclaimer The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. Oracle White Paper— Oracle Data Mining 11g: Competing on In-Database Analytics Executive Overview............................................................................. 1 In-Database Data Mining .................................................................... 1 Key Benefits .................................................................................... 3 Introduction ......................................................................................... 4 Oracle Data Mining ......................................................................... 4 Data Mining Deep Dive ....................................................................... 6 Oracle Data Mining for Data Analysts ............................................... 15 Oracle Data Mining for Applications Developers............................... 16 Competing on In-Database Analytics................................................ 18 Beyond a Tool; Enabling
  • Oracle Data Sheet

    Oracle Data Sheet

    ORACLE DATA SHEET ORACLE DATA MINING KEY FEATURES AND BENEFITS IN-DATABASE DATA MINING Oracle Data Mining, an Option to the Oracle Database EE, provides 12 powerful in-database data mining algorithms as a feature of the FEATURES database. Oracle Data Miner help users mine their data and define, Option to Oracle Database save and share advanced analytical methodologies. Developers can use Enterprise Edition In-Database analytics the SQL APIs to build applications to automate knowledge discovery. Wide range of algorithms Oracle Data Miner Automatic data preparation Easy to use Oracle Data Oracle Data Miner, a free download from the Oracle Technology Network with SQL Miner GUI (Extension to SQL Developer 3.0) Developer 3.0, simplifies and automates the data mining process. Text mining PL/SQL and Java APIs Score models at storage layer on Exadata BENEFITS Eliminate data movement Discover patterns and new insights Build predictive applications Maintain security during analysis In-Database Data Mining With Oracle Data Mining, everything occurs in the Oracle Database—in a single, secure, scalable platform for advanced business intelligence. Coupled with the power of SQL, Oracle Data Mining eliminates data movement and duplication, maintains security and minimizes latency time from raw data to valuable information. ORACLE DATA SHEET Full Set of Mining Algorithms Oracle Data Mining provides support for a wide range of data mining functionality. Oracle Data Mining 11g Release 2 Algorithms Technique Algorithm Classification Logistic Regression (GLM)—classic statistical technique, supports nested data, text, star schemas. Naive Bayes—Fast, simple, commonly applicable Support Vector Machine—Cutting edge. Supports many input attributes, transactional and text data.
  • Courses Can Be Given Throughout the Year, on Request

    Courses Can Be Given Throughout the Year, on Request

    Training Education of R - from the mere basics to the necessary skills to use and deploy R and advanced analytics in your organisation www.bnosac.be – [email protected] Overview This document gives an overview of the training offered by BNOSAC regarding the use of R (www.r- project.org). The training consists of a set of modules which can be chosen depending on your skillset and ranges from introductory to advanced R programming, package development, R deployments and R analytics (statistical learning, text mining, plotting, graphics, spatial statistics) as well as on-demand training. Courses can be given throughout the year, on request. If interested, do not hesitate to get into contact. Some of the courses are offered also through LStat – the Leuven Statistics Research Center on a yearly basis, other courses are offered at the BNOSAC offices or can be given at your offices. The courses about the use of Oracle R Enterprise are offered in cooperation with our Oracle Partner Tripwire Solutions. Courses The following courses are offered by BNOSAC - R programming o R for starters o Common data manipulation for R programmers o Reporting with R o Creating R packages and R repositories o Managing R processes o R code management, Git and Continuous Integration o Integration of R into web applications - R analytics o Statistical Machine Learning with R o Text mining with R o Applied Spatial modelling with R o Big data analytics with R on top of Hadoop, Spark & HAWQ o Computer Vision with R and Python - Oracle R Enterprise o ROracle and Oracle R Enterprise (ORE) - transparancy layer o Oracle R Enterprise - advanced data manipulation o Data mining models inside Oracle R Enterprise (ORE) and Oracle Data Mining (ODM) Price The prices of the courses are indicated as a price per person per day.
  • Visualizing In-Database Machine Learning with APEX

    Visualizing In-Database Machine Learning with APEX

    Visualizing in-database Machine Learning with APEX Heli Helskyaho, Marko Helskyaho APEX Connect 2019 Introduction, Heli Graduated from University of Helsinki (Master of Science, computer science), currently a doctoral student, researcher and lecturer (databases, Big Data, Multi-model Databases, methods and tools for utilizing semi-structured data for decision making) at University of Helsinki Worked with Oracle products since 1993, worked for IT since 1990 Data and Database! CEO for Miracle Finland Oy Oracle ACE Director Ambassador for EOUC (EMEA Oracle Users Group Community) Listed as one of the TOP 100 influencers on IT sector in Finland (2015, 2016, 2017, 2018) Public speaker and an author Winner of Devvy for Database Design Category, 2015 Author of the book Oracle SQL Developer Data Modeler for Database Design Mastery (Oracle Press, 2015), co-author for Real World SQL and PL/SQL: Advice from the Experts (Oracle Press, 2016) Copyright © Miracle Finland Oy Copyright © Miracle Finland Oy 500+ Technical Experts Helping Peers Globally 3 Membership Tiers Connect: • Oracle ACE Director bit.ly/OracleACEProgram [email protected] • Oracle ACE • Oracle ACE Associate Facebook.com/oracleaces @oracleace Nominate yourself or someone you know: acenomination.oracle.com Introduction, Marko Graduated from University of Helsinki Master of Science, computer science Full stack Developer Oracle since 1990, v6 Copyright © Miracle Finland Oy Why in-database? Why to move the data? It usually is a lot of data… No hassle with moving the
  • Data-Centric Automated Data Mining

    Data-Centric Automated Data Mining

    Data-Centric Automated Data Mining Marcos M. Campos Peter J. Stengard Boriana L. Milenova Oracle Data Mining Oracle Data Mining Oracle Data Mining Technologies Technologies Technologies [email protected] [email protected] [email protected] Abstract models. Models should only be used with data that have the same set or a subset of the attributes in the Data mining is a difficult task. It requires complex model and that come from a distribution similar to that methodologies, including problem definition, data the training data. Due to the many steps required for preparation, model selection, and model evaluation. the successful use of data mining techniques, it is This has limited the adoption of data mining at large common for applications and tools to focus on the and in the database and business intelligence (BI) process of model creation instead of on how to use the communities more specifically. The concepts and results of data mining. The complexity of the process methodologies used in data mining are foreign to is well captured by data mining methodologies such as database and BI users in general. This paper proposes CRISP-DM [4]. a new approach to the design of data mining In order to increase the penetration and usability of applications targeted at these user groups. This data mining in the database and business intelligence approach uses a data-centric focus and automated communities, this paper proposes a new design methodologies to make data mining accessible to non- approach for incorporating data mining into experts. The automated methodologies are exposed applications.
  • Oracle Data Mining Administrator's Guide Explains How to Install the Various Components of Oracle Data Mining and Perform Basic Administration Tasks

    Oracle Data Mining Administrator's Guide Explains How to Install the Various Components of Oracle Data Mining and Perform Basic Administration Tasks

    Oracle® Data Mining Administrator's Guide 10g Release 2 (10.2) B14338-02 November 2005 Oracle Data Mining Administrator’s Guide, 10g Release 2 (10.2) B14338-02 Copyright © 2005, Oracle. All rights reserved. The Programs (which include both the software and documentation) contain proprietary information; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent, and other intellectual and industrial property laws. Reverse engineering, disassembly, or decompilation of the Programs, except to the extent required to obtain interoperability with other independently created software or as specified by law, is prohibited. The information contained in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. This document is not warranted to be error-free. Except as may be expressly permitted in your license agreement for these Programs, no part of these Programs may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose. If the Programs are delivered to the United States Government or anyone licensing or using the Programs on behalf of the United States Government, the following notice is applicable: U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the Programs, including documentation and technical data, shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement, and, to the extent applicable, the additional rights set forth in FAR 52.227-19, Commercial Computer Software—Restricted Rights (June 1987).
  • Empowering Applications with Spatial Analysis and Mining

    Empowering Applications with Spatial Analysis and Mining

    Empowering Applications with Spatial Analysis and Mining ORACLE WHITE PAPER | NOVEMBER 2016 Disclaimer The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. EMPOWERING APPLICATIONS WITH SPATIAL ANALYSIS AND MINING Table of Contents Disclaimer 1 Introduction 1 Spatial Analysis in Oracle Database12c 3 Tiled Bins 3 Aggregates_For_Geometry 3 Nearest-neighbor aggregate 4 Within-distance aggregate 4 Aggregates_For_Layer 4 Spatial_Clusters 5 Datasets 6 Spatial Analysis: Example Applications 6 Location Prospecting 6 Clustering Analysis 10 Neighborhood-based Estimation 12 Within-distance aggregates 12 Nearest-neighbor aggregates 13 Weighted Proximity Aggregates 15 Real-estate scenario 15 EMPOWERING APPLICATIONS WITH SPATIAL ANALYSIS AND MINING Spatial Data Mining: Framework and Case Studies 16 Classification: Case Study using US Blockgroups Data 16 Classification with Spatial Aggregates: Case Study 18 Regression Analysis with and without Spatial Estimates 20 Conclusion 22 . EMPOWERING APPLICATIONS WITH SPATIAL ANALYSIS AND MINING Introduction Oracle has several products that support efficient and effective analysis of warehouse data. For exploratory analysis, using online analytical processing/warehousing techniques, users could analyze data by rolling-up/drilling-down on specific dimension hierarchies. Unlike such OLAP processing, data mining allows automatic discovery of knowledge (or information content) from a database. In this context, Oracle Data Mining (ODM), a component of Oracle Advanced Analytics option, offers several techniques to mine large datasets inside an Oracle database.
  • HOL: Learn to Use SQL Developer's Oracle Data Miner Workflow UI Move the Algorithms; Not the Data!

    HOL: Learn to Use SQL Developer's Oracle Data Miner Workflow UI Move the Algorithms; Not the Data!

    HOL: Learn to use SQL Developer's Oracle Data Miner Workflow UI Move the Algorithms; Not the Data! Charlie Berger Sr. Director Product Management, Machine Learning, AI and Cognitive Analytics, [email protected] www.twitter.com/CharlieDataMine Brendan Tierney, President, Oralytics, Author multiple books, [email protected] Tim Vlamis, VP Analytics, Vlamis Software, [email protected] Dhvani Sheth, Senior Solution Engineer, [email protected] Siddesh C Prabhu Dev Ujjni, Staff Solution Engineer, [email protected] Copyright © 2020 Oracle and/or its affiliates. Safe harbor statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation. Copyright © 2020 Oracle and/or its affiliates. d Oracle Machine Learning OML4SQL OML Notebooks SQL API with Apache Zeppelin on Autonomous Database OML4R Oracle Data Miner R API Oracle SQL Developer extension OML4Py* OML4Spark Python API R API on Big Data OML Microservices* Model Deployment and Management, Cognitive Image and Text Copyright © 2020 Oracle and/or its affiliates. * Coming soon What is Machine Learning? Algorithms automatically sift through
  • Oracle R Enterprise Charlie Berger Sr

    Oracle R Enterprise Charlie Berger Sr

    1 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. FPO In-Database Analytics: Statistics and Advanced Analytics with R—Oracle R Enterprise Charlie Berger Sr. Director Product Management, Data Mining and Advanced Analytics Oracle Corporation [email protected] R 2 Copyrightwww.twitter.com/CharlieDataMine © 2011, Oracle and/or its affiliates. Open Source All rights reserved. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remain at the sole discretion of Oracle. 3 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Agenda • Big Data & Big Data Analytics • Open Source Project – Challenges limiting enterprise adoption of R New• R Enterprise Open Source – Features, benefits and advantages • Big Data Appliance New – Open source distribution of R • Oracle R Enterprise Beta Program • Q & A 4 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. What Makes it Big Data? SOCIAL BLOG SMART METER VOLUME VELOCITY VARIETY VALUE 5 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Big Data in Action DECIDE ACQUIRE Make Better Decisions Using Big Data ANALYZE ORGANIZE 6 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Announcing Oracle R Enterprise New 7 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. R Statistical Programming Language Open source language and environment Used for statistical computing and graphics Strength in easily producing publication-quality plots Highly extensible with open source community R packages 8 Copyright © 2011, Oracle and/or its affiliates.
  • Oracle Data Mining Concepts, 11G Release 2 (11.2) E16808-07

    Oracle Data Mining Concepts, 11G Release 2 (11.2) E16808-07

    Oracle® Data Mining Concepts 11g Release 2 (11.2) E16808-07 June 2013 Oracle Data Mining Concepts, 11g Release 2 (11.2) E16808-07 Copyright © 2005, 2013, Oracle and/or its affiliates. All rights reserved. Primary Author: Kathy L. Taylor This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs.
  • Oracle Data Miner (Extension of SQL Developer 4.0) Generate a PL/SQL Script for Workflow Deployment

    Oracle Data Miner (Extension of SQL Developer 4.0) Generate a PL/SQL Script for Workflow Deployment

    An Oracle White Paper October 2013 Oracle Data Miner (Extension of SQL Developer 4.0) Generate a PL/SQL script for workflow deployment Denny Wong Oracle Data Mining Technologies 10 Van de Graff Drive Burlington, MA 01803 USA [email protected] 1 Contents Introduction .................................................................................................................................................. 3 Importing the Workflow ............................................................................................................................... 3 Create the User ......................................................................................................................................... 3 Load the Table ........................................................................................................................................... 3 Import the Workflow ................................................................................................................................ 4 Workflow Overview ...................................................................................................................................... 5 Modeling ................................................................................................................................................... 5 Scoring....................................................................................................................................................... 6 Deployment Use Case ..............................................................................................................................
  • Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining

    Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining

    Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining R The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle. 2 Copyright © 2012, Oracle and/or its affiliates. All rights reserved. USE CASE 8: ELECTRICAL CONSUMPTION . Short Description : – CERN ENS (Electrical Network Supervisor) system archives values from measurements – Around 12000 every 10 minutes – The storage is managed by an external company on an instance of Oracle relational database. – Nevertheless, the DB instance is installed on one dedicated server connected to CERN Technical Network . Issues: – Make data aggregation (by time and by families of measurements) – Electricity load forecast by system (families of measurements) using the historical data – Make the result obtained accessible to CERN users from General Purpose Network 3 Copyright © 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 13 Oracle Technology mapped to Analytics Landscape Oracle R Oracle Data Master & Oracle 12g Oracle 12g Enterprise & Integrator Reference Oracle Data Structured Mining Oracle Transactions Files Times Ten - Oracle BI Oracle Enterprise Semi Golden Gate Endeca structured Machine MDEX Oracle Oracle Real Time Generated NoSQL Decisions Oracle Oracle Endeca Oracle Text, Image, Hadoop Oracle Information Hadoop Video, Audio MapReduce Essbase Discovery HDFS Unstructured Data Acquire Organize Analyze Decide 4 Copyright © 2012, Oracle and/or its affiliates.