Establish an Analytics Hub on Linux on IBM z Systems & LinuxONE
Wilhelm Mild
Executive IT Architect, Integration Architectures for Mobile, Linux & IBM Z
IBM Boeblingen Laboratory
© 2016 IBM Corporation

The Platform of Choice: integrate transactional and analytic processing
Reduced data movement, reduced complexity, reduced configuration resources, more accurate data, more secure, more available. Streamlined decision process.
Platform qualities: Transaction Processing; Data Serving; Mixed Workloads; Operational Efficiency; Trusted and Secure Computing; Reliable, Available, Resilient; Virtually Limitless Scale
• Faster transactions
• Faster, more accurate scores
• Faster, more accurate modeling
• Faster transformation of data
• Faster, more accurate reporting
• Easier integration of other data sources

IBM Z point-of-view: Building a foundation to grow with business needs
(Diagram labels: Cognitive, Spark)
Why z13 and z13s or z14?
• Larger cache = in-memory analytics for faster insight
• MASS libraries = 2X to 10X improvement porting x86 analytic workloads
• SIMD improvements = 80% increase in CPLEX on z/OS modeling
• CPU enhancements = no performance impact to real-time scoring in transactions
Why IBM Z?
• Currency of data
• Reduce complexity
• Eliminate data duplication
• Improve synchronization
• Bring analytic function to the data

Evolution of Analytics: data is the resource for competitive advantage
Descriptive: Analytics by spreadsheet
• A business management exercise
• Who: analysis by small teams
• What: on historical structured data
• How: using utilities or roll your own tools
• Why: internal executive decision support
Predictive: Alerts & suggest next actions
• Studying business data to understand trends and predict the future
• Who: owned by IT organizations
• What: copy data to build data warehouses & data marts
• How: tools build models to learn from historical data
• Why: increase business opportunities
Cognitive: Identify & immediately act
• Expanding beyond business data for business opportunities
• Who: lines of business
• What: combine structured & unstructured data
• How: discover & act using open source tools
• Why: embed analytics into business processes

IT infrastructure matters not because of what you are doing now, but because of what happens next
Analytics
Now: increasingly sophisticated analysis on static data
What happens next:
• Currency of data becomes the driver of business value
• Embedding real-time analytics into the business transaction
• Privacy of client data and issues of ethical use
• Integration of structured and unstructured data: bringing business-level QoS to a large scale with clustered and unstructured data
• Analytics evolving into Cognitive: Power of Inference
Cloud
Now: virtualization
What happens next:
• Orchestration, automation, and control of all resources: on-premise and off-premise
• Splitting of application and data layers
• Hybrid Cloud and multi-architecture integration
• Data privacy and security
Mobile
Now: “Let’s talk to the iPhone”
What happens next:
• “Remember the lessons of the PC”: today's toy can run tomorrow's world
• Broader access: unprecedented and unpredictable transaction and data volumes, fluid scale
• Unpredictable timing dictates 24x7 response and availability
• Security: data integrity and privacy with broader transaction access

Imagine the possibility of leveraging all of your data
assets

Traditional technique: structured, analytical, logical
• Data Warehouse
• “Here’s a question, what’s the answer?”
• Structured, repeatable, linear
• Traditional sources: transaction data, internal app data, mainframe data, OLTP system data, ERP data
Emerging technique: creative, holistic thought, intuition
• Hadoop, Streams
• “Here’s some new data, are there correlations?”
• Unstructured, exploratory, dynamic
• New ideas, new questions, new answers
• New sources: multimedia, web logs, social data, text data (emails), sensor data (images), RFID
Transformational benefit comes from integration of new data sources with traditional corporate data.

IBM Systems with z Architecture
IBM Z: IBM z13, IBM z13s
IBM LinuxONE Systems: IBM LinuxONE Emperor, IBM LinuxONE Rockhopper
• The world’s fastest processor
• Massive I/O throughput
• Dedicated cryptographic processors

IBM LinuxONE Systems: Linux your way, open & without limits
An open source community and ISV ecosystem to support a data serving environment
Categories shown: Database, Analytics, Languages, Db2, Distributions

The Enterprise Analytics Hub on Linux on z Systems
Build an end-to-end analytics and real-time fraud detection environment on Linux on z Systems
• Off-prem / cloud services: Watson Analytics
• SAP, Siebel, PeopleSoft
• Linux on z Systems (Explore & Analyze): Spark, Cognos BI, SPSS, Watson Explorer, dashDB local*, DB2 BLU
• z/OS: DB2 z/OS, SPARK, IDAA, Machine Learning, Analytics
• Unstructured content: Collaboration, Web, Email, File Systems, Content Management, Cloud, More…
• Federation Server: SQL structured data (DB2, Oracle, Postgres, MariaDB, MySQL, …) + no-SQL (MongoDB, Cloudant, CouchDB); VSAM, IMS, SAG Adabas, CA Datacom
* - offering not yet available

The Enterprise Analytics Hub on LinuxONE
Product: Cognos Analytics
• Function: analyze current & historical data
• UI: Report Studio for business users
• Main interfaces: JDBC
• Analytics results: interactive reports and dashboards
• Data sources: structured data
Product: SPSS
• Function: predictive analytics
• UI: SPSS dashboard
• Main interfaces: JDBC, ODBC
• Analytics results: predictive models, statistics, scoring
• Data sources: structured data
Product: SPARK
• Function: analyze the context of data in structured & unstructured data
• UI: Scala IDE or Notebooks
• Main interfaces: SPARK APIs (SQL, ML, GraphX)
• Analytics results: reports, scoring, machine learning, in-memory processing
• Data sources: structured & unstructured data, MESOS, Hadoop FS
Product: dashDB local*
• Function: analyze and predict from the context of data using structured & unstructured data
• UI: commands, Jupyter Notebook
• Main interfaces: SPARK APIs (SQL, ML, GraphX)
• Analytics results: DB2 BLU warehouse & in-memory analytics
• Data sources: structured & unstructured data, MESOS, Hadoop
Product: Watson Explorer
• Function: analyze the context of data in structured & unstructured data
• UI: Explorer visual, interactive, self-service dashboard
• Main interfaces: various, i.e. JDBC, JSON
• Analytics results: interactive reports, predictive, statistics
• Data sources: structured and unstructured data, and Watson cloud services
* - offering not yet available

What is Apache Spark?
• Languages: Java / Python / Scala / R
• Spark Libraries: Spark SQL (relational operators), Spark MLlib (machine learning), Spark GraphX (graph processing), Spark Streaming (real-time streaming)
• Spark Core: general execution engine
• Cluster Manager: YARN, MESOS, Standalone
• Data Abstraction: HDFS / Cassandra / HBase / Parquet / ...
• DB2 Connect Driver (JDBC)

Get Started – Exploit Your Data: download free software to start a project
• Federated, data-in-place analytics: reduce ETL
• Performance gains from co-location with data
• z Systems: SMT2, zEDC, SIMD, Large Pages, very high zIIP use
Components shown: Stream, GraphX, SQL, MLlib; Compression, RDD cache; File System, Network, Scheduler, Serialization
Find insights from structured & unstructured data with Apache Spark
https://www.ibm.com/developerworks/java/jdk/spark/

Spark Analytics on Linux on z and z/OS
IBM z Systems provides an optimized platform to derive insights from all client data without moving it. Accurate, secure, federated analysis in a hybrid cloud model.
• Linux on z Systems: Spark instances; leverage Linux on z virtualization benefits
• z/OS: Spark alongside CICS, IMS, and WAS, with a z/OS optimized data layer; leverage z/OS data and transactions
• IBM P Systems and x86: Spark instances; leverage non-z data
• Spark instances work on RDDs*
• Data access through Type 2 / Type 4 drivers: DB2, DB2 z/OS, IMS, VSAM, SMF and more…
*RDD: Resilient Distributed Dataset
Spark is the Java of data analysis!

IBM Machine Learning for z/OS - Overview
• Linux on z: Machine Learning User Interface, Machine Learning Application
• z/OS: IBM z/OS Platform for Apache Spark, z/OS Data

IBM dashDB Local for Analytics*
dashDB Local is the premier private cloud data warehouse optimized for analytic workloads for Software Defined Environments (SDE) such as private clouds, virtual private clouds, and other infrastructures that support Docker containers.

Benefits of dashDB Technology with Fast Deployment into Private Cloud Environment
Diagram labels: Private or Virtualized Private Cloud; Docker Container Technology; MPP with Automatic Scaling
• Highly flexible data warehouse
• Optimized for fast and flexible deployment into private or virtual private clouds
• Uses Docker container technology
• Built on top of dashDB technology, it shares the benefits of dashDB
• BLU Acceleration in-memory columnar technology
• Massively Parallel Processing (MPP) with automated scaling capabilities to increase infrastructure efficiency
For apps that need:
• Elastic scalability
• High availability
• Data model flexibility
• Data mobility
• Text search
• Geospatial
Available as:
• Fully managed DBaaS
• On-premises private cloud
• Hybrid architecture
* - offering not yet available

dashDB local*
https://developer.ibm.com/clouddataservices/docs/dashdb/analyze/use-dashdb-local-spark-notebooks/
https://www.youtube.com/watch?v=mzOi45-KJN4
Streaming data using the built-in Apache Spark infrastructure in dashDB Local (IoT example)
• Runs in Docker containers
• UI using Jupyter Notebook
* - offering not yet available

DB2 w/ BLU Acceleration – inside dashDB
Super Simple. Super Fast.
Solution
• DB2 with BLU Acceleration is the preferred solution for customers who would like to run analytics with z Systems & Linux data
• Satisfies the requirement for a columnar in-memory database
• An alternative to Oracle installations on Linux on z
• Enhanced for distributed consolidations onto z Systems

Predictive Analytics: IBM SPSS on Linux on z
Learn from historical data to make predictions
Why IBM Z
• Currency of data
Techniques used to analyze data
- Data mining
- Statistics
- Modeling
- Machine learning
- Artificial intelligence