
Pentaho & MongoDB Partner to Solve Government Big Data Challenges

Pentaho & MongoDB Partner to Solve Government Challenges

December 2013

Bob Gourley, Publisher, CTOvision.com
Will LaForest, Director of Federal, MongoDB
Dave Henry, SVP Enterprise Solutions, Pentaho

© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555

Big Data Management Best Practices for Federal Big Data Projects

Bob Gourley, Publisher, CTOvision.com

Brief

• Purpose
• Research & Reports
• Intro to top 5 “Best Practices” of Federal Data activities
• A focus on a new discipline of “Big Data Management”
• Invitation to collaborate and refine approaches
• A perpetual draft - your input is requested
• Contribute your thoughts at CTOvision.com

Update Sources

• Big Data Government Newsletter - reader survey
  • 2,600 readers
  • 2% response rate, across Federal agencies
• Review of openly published research by Wikibon, TDWI, IDC, Gartner, Forrester and of course our own CTOvision
• Review of best practices and use cases from the best vendors in Enterprise Big Data
• Engagement of the community at events like Strata and Hadoop World

Planning Assumption: The ability to collect, parse, and analyze machine data in real time, whether on premise or in the cloud, will continue to grow.

Big Data Management

• Agencies are thinking through the right changes to concepts and technologies
• Old approaches are still important, but cannot solve emerging problems
• Big Data Management is an evolved discipline which builds on existing data management approaches to leverage new concepts, technologies and best practices to optimize mission support

Solutions That Require Big Data Management

• Open Source Information: analysis and integration
• Situational Awareness across disparate data sets
• Two use cases: “Connect the Dots” and “Needle in Haystack”
• Cyber Security: rapid real time analysis of all relevant data
• Asset catalog across extensive/dynamic enterprises
• Rapid return of geospatial data
• Location based push of data
• Real time return of relevant search
• Real time suggestion of topics
• Bioinformatics:
  • Human Genome
  • Patient location, treatment, outcomes
• Law Enforcement: Predictive Policing
• Data Hub: Unified storage, governance, security, functionality

Best Practices in Big Data Management

VISION: Start with a mission-focused vision. This will vary by organization. Support to mission will drive everything else.

STRATEGY: Consider that analytics and Big Data go together. Prioritize and tackle challenges like: changes to governance processes, the right mix of skills for the workforce, learning new technology, and prioritizing which workload types will be handled by which part of the architecture.

KNOW: Know existing infrastructure and process, with a focus on: understanding of legal/policy dynamics relevant to your agency, understanding of new capabilities available, current and required throughputs/capacities, types of workloads supported by each component in the architecture, and available tech choices. Document and continuously improve.

DESIGN: Architect to manage data in its original form. Include the right mix of traditional and new in your design. Don’t assume any one platform will be a solution. Architect to insulate applications and users from a variety of disparate big data platforms.

EXECUTE: Avoid custom coding wherever possible. Don’t let new Big Data Platforms become proprietary silos. ETL remains important. Ensure training for all based on job function. Don’t neglect your own training. Serve the analyst.

Next Steps

• Continue your market surveys, stay aware of what new technologies can do for you.

• Revisit your vision. As you do, ponder this: How can you leverage data to support your mission?

• Continue to study use-cases and exchange best practices. Dialog with others in and out of your sector. Great lessons are coming from other industries.

• Continue to engage with the broader community. Sign up for our Government Big Data Weekly.

• Share your lessons learned.

Provide Your Thoughts, Input, Questions

E-mail: [email protected]
Blog: http://ctovision.com
Twitter: http://www.twitter.com/bobgourley
Facebook, LinkedIn, etc: See the blog

The Modern Operational Database for Government

Will LaForest, Director of Federal, MongoDB

The Evolution of Databases

[Figure: database technologies by decade (1990, 2000, 2010) on an axis running from Online to Offline. Labels: Operational & Real-time, Datawarehouse, RDBMS, NoSQL, OLAP/BI, Hadoop.]

Relational Database Challenges

Variety
• Unstructured data
• Semi-structured data
• Polymorphic data

Agile Development
• Iterative
• Short development cycles
• New workloads

Volume & Velocity
• Petabytes of data
• Trillions of records
• Millions of queries per second

New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing

MongoDB: The Modern Operational Database

• General Purpose
• Document Oriented
• Open Source

Fully Featured

• Rich Queries
  • Find Paul’s cars
  • Find everybody in London with a car built between 1970 and 1980
• Geospatial
  • Find all of the car owners within 5km of Trafalgar Sq.
• Text Search
  • Find all the cars described as having leather seats
• Aggregation
  • Calculate the average value of Paul’s car collection
• Native Indexes
  • Secondary
  • Compound
  • Geospatial
  • Full Text
  • Hash
  • Covering

MongoDB sample document:

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123,47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
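The example questions on this slide map onto MongoDB query documents roughly as follows. This is a minimal sketch in Python that only builds the filter and pipeline documents as plain dicts, so it runs without a server. Field names follow the Paul Miller sample document; the Trafalgar Square coordinates are approximate, and the geospatial and text queries assume that a `2dsphere` index and a text index have been created, which the slide does not show.

```python
# Query documents for the slide's example questions, built as plain dicts.
# Field names come from the sample document; everything else is illustrative.

# Rich query: Paul's cars (the projection returns only the cars array).
pauls_cars_filter = {"first_name": "Paul", "surname": "Miller"}
pauls_cars_projection = {"cars": 1, "_id": 0}

# Rich query: everybody in London with a car built between 1970 and 1980.
# $elemMatch makes both year bounds apply to the same array element.
london_1970s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

# Geospatial: car owners within 5 km of Trafalgar Sq.
# Assumes a 2dsphere index on "location"; coordinates are approximate
# and given as [longitude, latitude].
TRAFALGAR_SQ = [-0.1281, 51.5080]
near_trafalgar_filter = {
    "location": {
        "$near": {
            "$geometry": {"type": "Point", "coordinates": TRAFALGAR_SQ},
            "$maxDistance": 5000,  # metres
        }
    }
}

# Text search: cars described as having leather seats.
# Assumes a text index on whichever field holds the description.
leather_seats_filter = {"$text": {"$search": "\"leather seats\""}}

# Aggregation: average value of Paul's car collection.
average_value_pipeline = [
    {"$match": {"first_name": "Paul", "surname": "Miller"}},
    {"$unwind": "$cars"},
    {"$group": {"_id": "$_id", "average_value": {"$avg": "$cars.value"}}},
]
```

With the PyMongo driver these would be passed to `find()` and `aggregate()` on a collection handle, for example `owners.find(london_1970s_filter)`, where `owners` is a hypothetical collection name. For the two cars shown on the slide, the average works out to 215,000.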

MongoDB and Enterprise IT Stack

[Figure: enterprise IT stack.
• Applications: CRM, ERP, Collaboration, Mobile, BI
• Data Management: Online Data and Offline Data (RDBMS, Hadoop, EDW)
• Infrastructure: OS & Virtualization, Compute, Storage, Network
• Alongside all layers: Management & Monitoring, Security & Auditing]

Variety – Modern Data
Document Data Model

Relational MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123,47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
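To make the contrast concrete, the sketch below folds two relational-style row sets into the single document shown above. The table and column names (`persons`, `cars`, `id`, `owner_id`) are assumptions for illustration; the slide shows only the resulting document, and the `location` field is left out for brevity.

```python
# Relational shape: one row per person, one row per car, joined by a key.
# Table and column names are hypothetical.
persons = [
    {"id": 1, "first_name": "Paul", "surname": "Miller", "city": "London"},
]
cars = [
    {"owner_id": 1, "model": "Bentley", "year": 1973, "value": 100000},
    {"owner_id": 1, "model": "Rolls Royce", "year": 1965, "value": 330000},
]


def to_documents(persons, cars):
    """Embed each person's cars as an array inside the person document."""
    cars_by_owner = {}
    for car in cars:
        # Drop the foreign key; nesting replaces the join.
        embedded = {k: v for k, v in car.items() if k != "owner_id"}
        cars_by_owner.setdefault(car["owner_id"], []).append(embedded)

    documents = []
    for person in persons:
        doc = {k: v for k, v in person.items() if k != "id"}
        doc["cars"] = cars_by_owner.get(person["id"], [])
        documents.append(doc)
    return documents


documents = to_documents(persons, cars)
```

Reading one owner with all of their cars then needs a single document lookup rather than a join across two tables.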

Dynamic Schema

MongoDB does not need any defined data schema. Every document could have different data.

{name: "will", eyes: "blue", birthplace: "NY", aliases: ["bill", "la ciacco"], gender: "???", boss: "ben"}
{name: "jeff", eyes: "blue", height: 72, boss: "ben"}
{name: "brendan", aliases: ["el diablo"]}
{name: "matt", pizza: "DiGiorno", height: 74, boss: 555.555.1212}
{name: "ben", hat: "yes"}
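One practical consequence of a dynamic schema is that code reading the collection has to tolerate missing fields. The sketch below models the five slide documents as Python dicts and shows two such reads. It is an in-memory illustration, not driver code; the helper names are hypothetical, and the phone-number value is stored as a string because the slide writes it as a bare number.

```python
# The five documents from the slide, as Python dicts. The slide writes the
# last boss value as a bare phone number; it is stored as a string here.
people = [
    {"name": "will", "eyes": "blue", "birthplace": "NY",
     "aliases": ["bill", "la ciacco"], "gender": "???", "boss": "ben"},
    {"name": "jeff", "eyes": "blue", "height": 72, "boss": "ben"},
    {"name": "brendan", "aliases": ["el diablo"]},
    {"name": "matt", "pizza": "DiGiorno", "height": 74, "boss": "555.555.1212"},
    {"name": "ben", "hat": "yes"},
]


def reports_to(docs, boss):
    """Names of documents whose 'boss' equals boss; documents without the field are skipped."""
    return [d["name"] for d in docs if d.get("boss") == boss]


def field_usage(docs):
    """Count how many documents carry each field."""
    counts = {}
    for d in docs:
        for field in d:
            counts[field] = counts.get(field, 0) + 1
    return counts


team_ben = reports_to(people, "ben")
usage = field_usage(people)
```

Here `team_ben` contains will and jeff, and `usage` shows that only `name` appears in all five documents.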

Volume, Velocity, and New Architectures
Automatic Sharding

• Increase or decrease capacity as you go
• Automatic balancing
• Optimized for commodity servers and cloud infrastructure
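A toy model may help show what automatic balancing involves: keys are hashed into a fixed number of chunks, chunks are assigned to shards, and a balancer moves chunks when a shard is added. This is an illustrative sketch only, not MongoDB's actual balancer; the chunk count, shard names, and function names are all assumptions, and it covers only the case of adding capacity.

```python
import hashlib

NUM_CHUNKS = 12  # assumed fixed number of chunks for the toy model


def chunk_for(shard_key):
    """Map a shard key value to a chunk number with a stable hash."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_CHUNKS


def balance(assignment, shards):
    """Move chunks from the fullest shard to the emptiest one until the
    chunk counts differ by at most one. Updates assignment in place."""
    while True:
        load = {s: [c for c, owner in assignment.items() if owner == s]
                for s in shards}
        fullest = max(shards, key=lambda s: len(load[s]))
        emptiest = min(shards, key=lambda s: len(load[s]))
        if len(load[fullest]) - len(load[emptiest]) <= 1:
            return assignment
        assignment[load[fullest][0]] = emptiest


def shard_for(shard_key):
    """Route a key to the shard that currently owns its chunk."""
    return assignment[chunk_for(shard_key)]


# Start with two shards holding six chunks each.
shards = ["shard-a", "shard-b"]
assignment = {c: shards[c % 2] for c in range(NUM_CHUNKS)}

# Add capacity: a third shard joins and the balancer evens out the chunks.
shards.append("shard-c")
balance(assignment, shards)
```

Because keys map to chunks rather than directly to shards, adding a shard moves whole chunks and leaves the key-to-chunk mapping unchanged.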

High Availability

• Automated replication and failover
• Zero downtime with hardware failure and upgrades
• Multi-data center support
• Improved operational simplicity (e.g., HW swaps)
• Data durability and consistency

MongoDB Performance*

           Top 5 Marketing Firm    Government Agency                       Top 5 Investment Bank
Data       Key/value               10+ fields, arrays, nested documents    20+ fields, arrays, nested documents
Queries    Key-based;              Compound queries; range queries;        Compound queries; range queries;
           1 – 100 docs/query;     MapReduce; 20/80 read/write             50/50 read/write
           80/20 read/write
Servers    ~250                    ~50                                     ~40
Ops/sec    1,200,000               500,000                                 30,000

* These figures are provided as examples. Your application governs your performance.

Replication Benefits Operational and Analytical Workloads

• Application interacts with primaries
• Analytical workloads on secondaries
• Workloads are isolated from one another
• Working set appropriate for each application
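One common way to get this separation is to give operational and analytical clients different read preferences in their connection strings. The sketch below builds two such MongoDB URIs using the standard `replicaSet` and `readPreference` options. The host names and the replica set name `rs0` are placeholders, not values from the presentation.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference):
    """Build a MongoDB connection string for a replica set."""
    options = urlencode({
        "replicaSet": replica_set,
        "readPreference": read_preference,
    })
    return "mongodb://" + ",".join(hosts) + "/?" + options


# Placeholder hosts; a real deployment would list its own members.
HOSTS = [
    "db1.example.net:27017",
    "db2.example.net:27017",
    "db3.example.net:27017",
]

# The application reads and writes through the primary.
operational_uri = replica_set_uri(HOSTS, "rs0", "primary")

# Reporting and analytics read from secondaries only.
analytics_uri = replica_set_uri(HOSTS, "rs0", "secondary")
```

Reads from secondaries can lag slightly behind the primary, which is usually acceptable for reporting but worth keeping in mind when deciding which workloads to route there.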

Global Data Distribution

[Figure: global data distribution, with links between sites labeled “Real-time”.]

Read Global / Write Local

• Primary: LON; Secondaries: NYC, SYD
• Primary: NYC; Secondaries: LON, SYD
• Primary: SYD; Secondaries: LON, NYC

Solving Big Data Challenges in the Federal Government

Dave Diegtel, Head of Federal Sales, Pentaho

Why Pentaho for Federal Government

• Company and Product Maturity: Pentaho has been around for over 9 years, has thousands of paid customers, and has reached its 5.0 release. Pentaho is proven and less risky.

• Business Model and Subscription: Pentaho’s subscription model and server-based pricing allow for lower upfront investment and risk compared to legacy BI vendors, which traditionally cost an average of 4X as much for similar-size deployments.

• Government Certifications: Pentaho has made significant investments in Government Certifications and Compliance such as 508 and Security.

• Open APIs and an extensible architecture enable ease of integration and reduce the potential for vendor lock-in.

• Existing Government Customers and Cleared Personnel

A Comprehensive Big Data Platform

Dave Henry, Senior VP Enterprise Solutions, Pentaho

Pentaho 5.0 Architected for the Future
Simplified analytics experience for all users

[Figure: data types (Billing, Customer, Social Media, Web, Location, Network) around Existing & New Data Infrastructure and Analytics & Processes.]

ANY Data
• Relational
• Operational
• Big Data
• Data sources not yet anticipated…

ANY Environment
• Data warehouses
• Data marts
• Stack vendors
• Cloud
• Embedded

ANY Analytics
• Reports
• Dashboards
• Visualizations
• Discovery
• Predictive

The New Reality
Simplified analysis for all users

Simplified Analytics Experience

Blended Big Data

Enterprise Big

Pentaho & MongoDB Enable Key Use Cases
Customer 360 and Device Data Analytics enable comprehensive insight

• MongoDB delivers Scalable, Low-Latency Enterprise Data Store
• Visual ETL development with Pentaho Data Integration (PDI)
• Reporting, Dashboards, Visualization and Discovery with Pentaho Analytics

[Figure labels: Mission Scope; Pentaho Data Integration; Pentaho Analytics (Reporting, Dashboards, Visualization, Discovery)]

Enterprise Customer Data Store
Powerful data integration for MongoDB

[Figure: Customer Master, POS Data, and Web Event Data feed PDI ETL, which loads a MongoDB cluster using $push to data arrays.]
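MongoDB's `$push` update operator appends a value to an array field and creates the array if it does not exist yet, which is what lets point-of-sale and web events accumulate on one customer document. The sketch below builds such update documents and applies them to an in-memory dict to show the effect. It is a simplified stand-in for the server that handles top-level fields only, and the field names (`customer_id`, `pos_transactions`, `web_events`) and event values are hypothetical.

```python
import copy


def push_update(field, value):
    """Build a MongoDB update document that appends value to an array field."""
    return {"$push": {field: value}}


def apply_push(document, update):
    """Apply a $push update to an in-memory dict (top-level fields only).
    Returns a new dict and leaves the input untouched."""
    result = copy.deepcopy(document)
    for field, value in update["$push"].items():
        result.setdefault(field, []).append(value)
    return result


# Hypothetical customer master record and incoming events.
customer = {"customer_id": 42, "name": "Paul Miller"}
pos_event = {"store": "LON-01", "amount": 19.99}
web_event = {"page": "/cars/bentley", "action": "view"}

customer = apply_push(customer, push_update("pos_transactions", pos_event))
customer = apply_push(customer, push_update("web_events", web_event))
```

Against a real cluster, the same update documents could be sent with PyMongo's `update_one({"customer_id": 42}, push_update(...), upsert=True)`.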

Data Integration
Exploits MongoDB’s native APIs and query language

Operational Reports
Multi-page, highly formatted reports – real-time, scheduled or burst to email

Operational Dashboards
Highly tailored, pixel-perfect dashboards on MongoDB

Analyzer
Explore and visualize data

James Dixon
Founder and CTO, Pentaho

As CTO at Pentaho, James Dixon is responsible for Pentaho's architecture and technology roadmap. James has over 15 years of professional experience in software architecture, development and systems consulting. Prior to Pentaho, James held key technical roles at AppSource Corporation (acquired by Arbor Software which later merged into Hyperion Solutions) and Keyola (acquired by Lawson Software). Earlier in his career, James was a technology consultant working with large and small firms to deliver the benefits of innovative technology in real-world environments.

Why Pentaho?

• Pentaho is the best platform to connect, integrate, and analyze both traditional sources and MongoDB

• Pentaho embraces and extends the MongoDB environment with rich visualization and exploration of data

• Pentaho’s Subscription-based business model lowers upfront investments, enabling faster ROI

• Pentaho has dozens of Federal Government customers and has made significant investments in government certifications and cleared personnel

• Pentaho and MongoDB are established partners – Pentaho carefully engineers its products to use the latest MongoDB APIs to provide the best possible performance

Next Steps and Q&A

• Needs Assessment with Pentaho and MongoDB
  • Dave Diegtel - [email protected]
  • Will LaForest - [email protected]
• Try Pentaho (30-Day Free Trial) -- pentaho.com/download
• Learn More about Big Data and Government Solutions
  • Pentaho:
    • Big Data Website: pentahobigdata.com/
    • Government Solutions: pentaho.com/solutions/government
  • MongoDB:
    • Government Solutions: .com/industries/government
    • Big Data: Examples and Guidelines for the Enterprise Decision Maker: mongodb.com/lp/whitepaper/big-data-nosql
    • MongoDB Top 5 Considerations When Evaluating NoSQL Databases: mongodb.com/lp/whitepaper/nosql-considerations
• Sign up for the Big Data Government Newsletter at CTOvision.com & take the reader survey

Thank You