Pentaho

Big Data & More: The Power to Access, Prepare & Blend Multiple Data Sources Faster
Pentaho simplifies managing the enormous volume, variety and velocity of data entering organizations, regardless of the type of data or the number of data sources. Pentaho’s complete data integration platform delivers “analytics ready” data to end users 15x faster, with visual tools that reduce time and complexity. Instead of coding in SQL or writing MapReduce, organizations use a graphical designer to gain immediate value from data in sources like Hadoop, NoSQL and relational data stores.

Turn Big Data into Actionable Analytics
Pentaho’s adaptive big data layer lets you plug into popular big data stores with flexibility and insulation from change. Data can be accessed once, then processed, combined and consumed anywhere. The Pentaho adaptive big data layer includes plug-ins for Hadoop distributions from Cloudera, Hortonworks, MapR and Intel, as well as the popular NoSQL databases Cassandra and MongoDB, and Splunk, for consumable and actionable analytics.

Deliver Data to a Wide Variety of Applications
Pentaho’s out-of-the-box data standardization, enrichment and quality capabilities provide information to SaaS providers and ISVs in the shape and form best suited to their applications.

Integrate and Blend Big Data with Existing Enterprise Data
With broad connectivity to any data type and high-performance in-Hadoop execution, Pentaho simplifies and speeds the process of integrating existing databases with new sources of data.

Pentaho Data Integration’s graphical designer includes:
> Intuitive, drag and drop designer
> Rich library of pre-built components
> Dynamic transformations that use variables to determine field mappings, validation and enrichment rules
> Integrated debugger for testing and tuning job execution

Integration and High-Volume Data Processing
Pentaho reduces the time and complexity of integrating with big data sources. Pentaho’s intuitive graphical design provides:
> Native connectivity to leading Hadoop, NoSQL and analytic databases
> Visual designer for MapReduce jobs to reduce development cycles by as much as 15x
> Data preparation, modeling and exploration of unstructured data sets

Pentaho’s powerful data integration engine provides:
> Multi-threaded engine for fast execution
> Cluster support, enabling distributed processing of jobs across multiple nodes
> Unique in-Hadoop execution for extremely fast performance

Broad Connectivity and Data Delivery
Pentaho Data Integration offers broad connectivity to a wide variety of data, including all popular structured, unstructured and semi-structured data sources. Some examples include:
> Standard relational databases, Oracle, DB2, MySQL, SQL Server
> Hadoop, Cloudera, Hortonworks, MapR
> NoSQL databases, MongoDB, Cassandra, HBase
> Analytic databases, Vertica, Greenplum, Teradata

> Packaged enterprise applications, SAP
> Cloud-based and SaaS applications, Salesforce, Amazon Web Services
> Files, XML, Excel, flat file and web service APIs

To increase the performance of data extraction, loading and delivery processes, Pentaho offers the following capabilities:
> Native connectivity and bulk-loading to most common data sources
> Data delivery in a multi-dimensional format for analytics
> Data delivery through real-time data services for operational 3rd party applications

Team Work and Collaboration for Developers
Pentaho Data Integration is built on a centralized repository where all stakeholders in a data integration project share and collaborate on developing data flows. Pentaho provides:
> Shared repository for collaboration among data analysts, job developers and data stewards
> Content management, versioning and locking, so that jobs can easily be versioned and rolled back to prior versions

Powerful Administration and Management
Pentaho Data Integration provides out-of-the-box capabilities for managing the operations of data integration projects. These capabilities include:
> Managing security privileges for users and roles
> Integrating with existing security definitions in LDAP and Active Directory
> Setting permissions to control user actions: read, execute or create
> Scheduling of data integration flows
> Monitoring and analyzing the performance of data integration processes

Data Profiling and Data Quality
Pentaho provides basic data profiling capabilities such as row counts, mathematical functions and identification of null values, as well as data quality operators such as string manipulators, mapping functions, filtering and sorting. For name and address verification capabilities, Pentaho integrates with leading data quality vendors, such as Human Inference and Melissa Data. Pentaho data profiling and data quality capabilities help:
> Identify data that fails to comply with business rules and standards
> De-duplicate and cleanse inconsistent and redundant data
> Validate, standardize and correct name, address, email and telephone data

WHY PENTAHO DATA INTEGRATION?
> Power of big data orchestration and integration: Integration of all data (Hadoop, NoSQL and relational) in one platform; In-Hadoop and clustered execution of data processing for maximum scalability
> Ease of use: Simple set up; Intuitive graphical designer; No extra code generation; Over 100 out-of-the-box mapping objects, including a visual MapReduce designer for Hadoop
> Modern and extensible: 100% Java for cross-platform deployment; Pluggable architecture for adding connectors, transformations and user-defined expressions
> High value, low cost: No upfront fees; Subscription license model with no developer/user license fees; No maintenance fees

[Screenshot: the Spoon designer in the Data Integration perspective, showing the mongo_data_merge transformation and a Big Data step palette with Cassandra, HBase, Hadoop File, MapReduce and MongoDb input and output steps]

Copyright ©2014 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at pentaho.com.

To learn more about Pentaho software and services, contact Pentaho Certified Partner Layer-9: www.layer-9.com or +44 (0) 1223 797243