Pivotal Data Science Labs is the new name for the Greenplum Analytics Lab.

Data Sheet

Pivotal Data Science Labs
Expert-led discovery for actionable business insight

Overview

At-a-glance
Onsite assistance delivered by experienced data scientists to help:
• Solve critical problems using advanced analytics and deliver actionable insights
• Advance new methodologies within the full range of the Pivotal UAP’s analytical capabilities
• Learn and apply Magnetic, Agile, and Deep design philosophies to analytics

Key benefits
• Rapid insights into critical business questions
• Training for key analysts
• Conversion of existing models
• A roadmap for on-going analytics including sample models, tools, methods, and best practices
• An opportunity to guide Pivotal analytics development

Accelerating Analytics in a World
Acquiring and developing advanced analytics skills is a key priority for many organizations. Faced with limited staffing and skills while demand for talented analytics and data science professionals rises, many organizations struggle to grow their abilities and deliver results quickly.

To help both existing and prospective Pivotal users develop actionable insights for the business and grow their skill base more rapidly, Pivotal has assembled a team of experienced data scientists that is available for analytics-focused engagements. Through our Data Science Labs, Pivotal’s data science team can help accelerate analytical skill development and kick-start your ability to deliver immediate value to the business.

Pivotal Data Science Labs
A Data Science Lab is a package of services, technology, or training delivered by Pivotal’s team of leading data scientists. During the engagement, your analytics stakeholders and data platform leadership work in partnership with Pivotal’s team of statisticians and modelers to solve real business problems using Big Data advanced analytics.

Data Science Labs are collaborative projects that can include:

• Studies to build analytical insight regarding key business problems
• Development of analytics roadmaps
• Data asset study and validation
• Scoping and initial development of analytics applications and algorithms
• Alignment of analytics goals with business needs


WORLD-CLASS DATA SCIENTISTS GUIDE YOUR SUCCESS
The Data Science Labs are delivered by Pivotal’s team of leading data scientists. Pivotal’s data scientists partner with your analysts, data platform administrators, and business leadership to crack your top business challenges and opportunities over a project duration of your choosing. In each Lab, our data scientists identify and, in some cases, implement the appropriate analytical methods on massive datasets using the full range of capabilities of the Pivotal Unified Analytics Platform (UAP).

HOW A PIVOTAL DATA SCIENCE LAB WORKS
• Identify goals for the project that are critical to the business and amenable to advanced analytics; agree on priorities and a timeline.
• Build a set of detailed requirements and exit criteria.
• Assemble the Pivotal team to execute: in addition to the data scientist, we provide project management, architectural oversight, and data migration/loading support for the project.
• Employ an iterative approach while pursuing clear project goals and milestones, encouraging a process of discovery based on what the data reveals, all targeted to contribute to project goals.
• Work closely with IT, paying close attention to security, permissions, protocols, and testing procedures.

CHOICE OF MODELING AND STATISTICS TOOLS
SQL, Python, R, C, Java, Perl

Your analysts will likely employ a wide variety of analytics, development, and business intelligence tools. Pivotal Data Science Labs are unique in the broad array of these technologies and approaches that our data scientists can support. Some of the major analytical technologies supported are as follows:

SAS®
During Pivotal Data Science Labs, our analysts can help you leverage and improve analytics initiatives for your SAS users. In addition to analytics and modeling in SAS, we can also help you run models directly in Pivotal UAP using the SAS In-Database Scoring Accelerator for Pivotal and the new SAS High Performance Analytics (HPA) for Pivotal.

Alpine Data Labs
Pivotal has worked with Alpine Data Labs to develop an intuitive graphical workflow model builder for data mining, seamlessly integrated with Pivotal Greenplum. Alpine Miner provides statistical transformation and modeling methods for data analysis, modeling, and scoring, with which analysts can flexibly and efficiently conduct end-to-end knowledge discovery and predictive analysis. Alpine Miner operates directly on data where it resides, regardless of the number of independent variables or complexity of the data types.

PMML
PMML, or Predictive Model Markup Language, while not a tool itself, provides a method for exchanging models between model development and model execution environments. PMML can help to optimize development processes, and Pivotal data scientists can help you begin to leverage PMML to speed analytics development.

TECHNIQUES FOR ACCELERATING AGILE ANALYTICS
Optimizing analytics on Big Data requires new techniques to harness the massively parallel computational capabilities of UAP. During a Pivotal Data Science Lab, we can help your team quickly apply a variety of the following new techniques.

Embedded Analytic Functions
Pivotal is dedicated to bringing the power of parallelism to commonly used modeling and analytics functions, and supports many of these within UAP, including matrix operations, multiple linear regressions, and Bayesian statistics.

MADlib
The analytics team at Pivotal is actively contributing, in cooperation with the University of California at Berkeley, to an open-source library of advanced analytics functions designed to run on MPP platforms. These functions are available at no cost as part of the MADlib analytics library, including documentation and source code from which users can customize the algorithms.

MapReduce
Internet leaders including Google and Yahoo! have proven MapReduce to be a powerful platform for Big Data analytics. Pivotal UAP uniquely supports MapReduce in both Pivotal HD and Pivotal Greenplum Database, giving your analysts freedom to choose the right tool and environment for each job. Pivotal’s data scientists can help you effectively apply MapReduce and jumpstart its use in Pivotal UAP.
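For readers new to the MapReduce programming model mentioned above, the following is a minimal, single-process Python sketch of its three steps (map, shuffle, reduce) applied to a word count. It is an illustration only: the function names are hypothetical, it runs on one machine, and it does not use any Pivotal HD or Pivotal Greenplum Database interface.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit one (word, 1) pair per word in an input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over an iterable of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    sample = ["big data analytics", "agile analytics on big data"]
    print(word_count(sample))
```

In a real MPP or Hadoop environment the map and reduce steps run in parallel across many nodes and the framework performs the shuffle; the sketch only shows how the work is divided.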

HBase, Pig, and Hive
The Hadoop community is rapidly augmenting MapReduce with new higher-level tools, including the Apache Software Foundation’s HBase, Hive, and Pig toolsets. Each is included in Pivotal UAP, and can be utilized during a Data Science Lab engagement to address new analytics challenges.

SQL Analytics
SQL, and more specifically the SQL:2003 OLAP functions, are commonly used in analytic environments. Our data scientists can help your team tune and optimize SQL to accelerate your SQL analytics in the massively parallel environment of Pivotal UAP.

Custom Analytical Algorithms
For many, adapting existing algorithms to run in a massively parallel environment can vastly increase analytical agility. For computer scientists, well-known procedural languages including Java, C, Python, and Perl can be used to create algorithms that harness the parallel computational power of UAP. For statisticians, the R programming language can also be used to parallelize existing analytical algorithms and create new ones.

During Data Science Labs, Pivotal data scientists can help you exploit any of these procedural languages for flexibility and performance, while shortening the time-to-value for high-performance agile analytics.

DATA SCIENCE LAB PACKAGES
Pivotal Data Science Labs are available in a range of engagement durations and deliverables:

Lab Primer (One-Day Workshop)
• Analytics Roadmap
• Prioritized Opportunities
• Architectural Recommendations

Lab 100 (Analytics Bundle; two weeks)
• Onsite MPP Analytics Training
• Analytics Toolkit
• Quick Insight

Lab 600 (Six-Week Lab)
• Analytics Roadmap
• Prof. Services on Pivotal UAP
• Ready-to-deploy Model(s)

Lab 1200 (12-Week Lab)
• Analytics Roadmap
• Prof. Services on Pivotal UAP
• Ready-to-deploy Model(s)

LAB PRIMER (ONE-DAY WORKSHOP)
A Lab Primer is a one-day moderated session bringing together your data and business leadership with Pivotal’s data scientists and architects to review the existing data platforms and business goals. Through the day, teams discuss opportunities and approaches to apply advanced analytics, and chart a concrete path toward making this a reality.

The result is an analytic roadmap, with a step-by-step guide for an analytics-based approach to solving one to three business opportunities, as well as tactical and strategic recommendations for data, process, and platform enhancements to enable best-in-class analytical performance.

This option is appropriate for companies that are new to exploring analytics on massively parallel processing (MPP) platforms, those that are in the beginning stages of elevating analytics as a mission-critical business function, or those suspecting they are under-leveraging valuable data assets.

LAB 100 (ANALYTICS BUNDLE)
Lab 100 engagements are typically two weeks long. Our data scientists work with your data and analytics team to assist with introducing or optimizing your Pivotal UAP analytics environment, and work closely with your team to ensure that users are fully equipped with the tools to leverage the advanced analytics capabilities of Pivotal.

The result is onsite analytics training with the Pivotal UAP targeting your future in-house data scientists, a review of languages and tools such as SQL, R, MapReduce, MADlib, SAS, and Alpine Miner (a GUI-based statistical package optimized for Pivotal), and the presentation of a business insight by our data science team.

This option is appropriate for new or existing Pivotal customers who are interested in jumpstarting their advanced analytics efforts to maximize the performance of their Pivotal environments and the value they’re extracting from their data assets.

ANALYTICS LAB 600
An Analytics Lab 600 is a six-week model-development engagement focusing on solving a top business challenge or on discovering a key insight that can be further operationalized to address marketing or product goals.

The result of the Lab 600 is typically a QA’d, ready-to-deploy model or set of models that are tuned to optimally perform in a Pivotal UAP environment.

This option is appropriate for companies with a known business challenge that can benefit from the brief injection of additional analytics experience. Examples of possible business challenges addressable in a six-week timeframe include: customer segmentations, affinity models, and experimentation frameworks.

ANALYTICS LAB 1200
Analytics Lab 1200 is a 12-week Pivotal UAP-based Analytics Lab engagement, focused on solving a top business challenge deemed more complex than would be tractable in a Lab 600 engagement. As with the Lab 600, the focus is usually upon discovering one or more key insights that can be further operationalized to address marketing or product goals. As with a Lab 600, our data science team works with you to address your analytics challenges, work hands-on with your data, and deliver actionable insights for your business or organization.

The result of a Lab 1200 is a QA’d, ready-to-deploy complex model or set of models that are tuned to optimally perform in the Pivotal UAP environment.

This option is appropriate for companies with a known business challenge that could benefit from the injection of analytical knowledge and capability to address a particularly tough analytics or modeling challenge. Examples of possible business challenges addressable in a 12-week timeframe include: churn drivers, behavioral targeting, risk models, fraud detection, media mix, and campaign attribution modeling.

Time is of the Essence
Pivotal UAP brings a platform rich with advanced analytical capabilities to your data science teams. Capitalizing on that capability depends on execution of an analytics plan and strategy that takes time, and that is time you may not have. With Pivotal Data Science Labs, project start-up time shrinks from months to weeks, accelerated by the efforts of experienced data scientists, working on your behalf, at your site. The standard packages of assistance previously described are flexible guidelines, and customized engagements are encouraged.

Learn More
To learn more about our products, services and solutions, visit us at goPivotal.com.

Key Benefits of PIVOTAL DATA SCIENCE LABS
• Insights into critical business questions
• Training for key analysts
• Conversion of existing models
• A framework for on-going analytics
  - sample models
  - tools, methodologies, best practices
  - Analytics Roadmap
• An opportunity to guide Pivotal analytics development

At Pivotal our mission is to enable customers to build a new class of applications, leveraging big and fast data, and do all of this with the power of cloud independence. Uniting selected technology, people and programs from EMC and VMware, the following products and services are now part of Pivotal: Greenplum, Cloud Foundry, Spring, GemFire and other products from the VMware vFabric Suite, Cetas and .

Pivotal
1900 S Norfolk Street
San Mateo CA 94403
goPivotal.com

GoPivotal, Pivotal, and the Pivotal logo are registered trademarks or trademarks of GoPivotal, Inc. in the United States and other countries. All other trademarks used herein are the property of their respective owners. © Copyright 2013 GoPivotal, Inc. All rights reserved. Published in the USA. PVTL-DS-118-04/13