Data Integration Solutions Buyers Guide

2014 Data Integration Solutions Buyers Guide

Includes a Category Overview; the Top 10 Questions to Ask; Plus a Capabilities Reference of the Top 24 Solution Providers in Data Integration Technology

Released by Solutions Review 11/1/2014

INTRODUCTION:

Big Data is the buzz, with everybody looking to jump onto the bandwagon. However, there is substance behind the buzz. When we talk about Big Data, we mean the following: with the sources and amounts of data available for analysis increasing rapidly by orders of magnitude, new analytical tools are increasingly able to gain new and important insights into the past, present and future that were not available before.

Many areas of human endeavor benefit from Big Data. Science and research have certainly seen a gain, with examples including the Large Hadron Collider using hundreds of millions of data points per second to find the Higgs Boson and DNA sequencing now taking less than a week to complete. So have many private sector companies, such as Walmart, Facebook and Amazon, which are now better able to discover trends and exploit normally hidden shopper idiosyncrasies in order to drive revenue and profit. And of course Uncle Sam is getting in on the Big Data action, with the construction in Utah of a massively powerful data center for the NSA that will be able to handle yottabytes of internet data. The result of all this is an industry now valued at $100 billion, growing at 10% a year.

Solutions Review is quite interested in covering this quickly evolving topic. However, we face a conundrum in trying to organize Big Data, which is, well, big. Too big for a single category, at least. So, as with any overly complex problem, the first step is to break it down into its constituent parts. Solutions Review is therefore launching our newest category, Data Integration, perhaps the most vital solution needed to take advantage of Big Data.
First, a description of what we mean by Data Integration in terms of the specific tools you will need. In order to take advantage of Big Data, you have to be able to access all of that data, no matter its physical/virtual location and structure. Data Federation Technology, also called Data Virtualization Technology or Data Federation Services, offers a way to access information about your information, called Metadata, across all parts of your organization, no matter how or where it's stored. You can also set up a Data Federation solution to enable queries over multiple data sources, ensure data integrity, manage transactions and enable an integrated, real-time view of all data across the enterprise. This is done by mapping all the data you want federated into a virtual database.

Accessing data doesn't just mean having a unified view of it all, however. For the practical purpose of crunching all that data, it needs to be in one place where your analytics program can reach it. That involves "moving" data (more like copying and pasting, actually) from one place to another, usually from storage systems into a Data Warehouse capable of analyzing it.

Methods for doing just that include processes called ETL (for Extract, Transform and Load) and Data Replication. Data Replication is often used for tasks like disaster recovery and data migration, but in relation to big data it offers a high-performance data movement tool that should be able to quickly synchronize large quantities of data.

In order to conceptualize the ETL and Data Replication processes, people in the Data Integration space usually refer to where the data is stored at the start of the process as the source or sources, whereas where you want to move/copy the data to is the target or targets.
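The Data Federation idea described earlier in this section (one query layer over several separately stored data sets) can be illustrated with a minimal sketch. It is not how any product in this guide works: it assumes Python's built-in sqlite3 module as a stand-in for a federation layer, and the `crm` and `billing` stores, their tables and their sample rows are hypothetical names invented for the example.

```python
import sqlite3


def build_federated_connection():
    """Open one connection that can see two separate, hypothetical data stores."""
    # The main connection plays the role of the federation layer.
    conn = sqlite3.connect(":memory:")
    # Each ATTACH adds an independent database under its own schema name.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    # Illustrative sample data only.
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )
    return conn


def revenue_by_customer(conn):
    """Run a single query that joins data living in two different stores."""
    return conn.execute(
        """
        SELECT c.name, SUM(i.amount)
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
```

Calling `revenue_by_customer(build_federated_connection())` returns one combined result even though the customer and invoice rows were never copied into a shared table, which is the essential difference between federating data and moving it.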
ETL tools are your basic data movement tools, which extract all the data files selected from the source, transform them into a structure readable by the target and by the Business Intelligence applications that run on it, and then load the transformed data into the target. ETL tools are good at moving large quantities of data all at once in what is called a batch.

Solutions Review | 500 West Cummings Park | Woburn, Massachusetts 01801 | USA

ETL tools also do a good job when significant transformation of the data is required before loading into the target. ETL on its own can have trouble handling certain situations, however. If data in the source is changing in real time, you may need to be able to analyze that data quickly, and perhaps in real time as well. Because ETL loads everything from the source into the target all at once during a batch transfer, the target can experience hours of downtime while the data is loaded. The more data you need to move, the more downtime.

If the target isn't supposed to be used for long periods of time, like at night, and if you don't require immediate analysis of new/changed data, then that downtime may not be an issue. However, you could still be wasting time and money if much of the data that your ETL tool is extracting from the source is already in the target from a previous batch load.

So, for those with high-performance requirements and a need to increase the efficiency of data transfer, a data replication solution will be a necessary add-on. Most data replication solutions will contain a Change-Data-Capture (CDC) module, which captures changes made in source systems and then replicates those changes into a target system, keeping the databases synchronized. In some cases, the CDC tool can be sold separately from the rest of the data replication package.
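The difference between a full batch load and a change-based load can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any vendor's CDC module: the `Row` type, its `version` change counter and the in-memory `target` dictionary are all hypothetical, commercial CDC tools may detect changes by other means, and deletes are not handled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    key: int
    value: str
    version: int  # hypothetical counter that increases with every source change


def transform(row):
    """The 'T' in ETL: reshape a source row into the target's format."""
    return {"key": row.key, "value": row.value.strip().upper()}


def full_batch_load(source, target):
    """Batch ETL: rebuild the target from every source row; return rows moved."""
    target.clear()
    for row in source:
        target[row.key] = transform(row)
    return len(source)


def incremental_load(source, target, last_version):
    """CDC-style load: move only the rows changed since last_version."""
    changed = [row for row in source if row.version > last_version]
    for row in changed:
        target[row.key] = transform(row)
    # Remember how far we got so the next run can skip unchanged rows.
    new_watermark = max((row.version for row in changed), default=last_version)
    return len(changed), new_watermark
```

With a large source where only a handful of rows change between runs, `full_batch_load` moves every row each time, while `incremental_load` moves only the changed rows and returns a watermark to pass into the next run. That is the efficiency argument the text makes for adding replication alongside ETL.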
Other parts of a data replication solution can include schema and DDL replication, an easy-to-manage user interface, and software and hardware architecture designed for moving large amounts of data very quickly without creating downtime for your sources or targets or interfering with the ability of your enterprise applications to keep running. This same capability can also ensure that in the event of a crash, your company has the most up-to-date data with which to pick up the pieces. In addition, good data replication solutions should be fully automated in order to optimize IT productivity and save costs on professional service needs.

Data replication can have drawbacks, however. An enterprise data replication solution can cost many tens of thousands of dollars, placing the capability out of reach for many smaller companies. Additionally, many data replication solutions are not very good at the transformation task that's often needed when moving data from sources to targets. The result is that you need to piggyback the replication solution on top of an ETL solution, and not all replication solutions work with all ETL solutions. In fact, it's best to think of data replication not as a replacement for ETL, but as a complementary solution. Both processes will be needed in executing the data integration part of your Big Data strategy.

To recap, we've covered the Federation, ETL and Replication tools for the data integration space. For the purposes of keeping the topic of data integration as narrow as possible, we at Solutions Review are going to limit the Data Integration site, solutions directory and buyers guide to just those three solution types. This obviously ignores the physical databases and warehouses that store and process data, as well as the business intelligence platforms and applications needed to get value out of all that data, along with a whole host of other technologies that go into the big data environment.
These will be topics we will revisit in other sections of our Big Data suite over the coming months.

Matt Adamson
Editor, Solutions Review
[email protected]
(339) 927-9237

5 Questions You Should Ask Yourself Before Selecting a Data Integration Solution

QUESTION #1
What are the business and technical needs driving my interest in data integration?

In other words, why do you need to integrate your data? The nature of the application needing that data, such as a BI/Analytics program, and what you need that program to do will determine many of the technical requirements of the data integration solution you need. Will you require real-time data access and transfer? How much data will you need to move and how quickly? Can you afford some downtime on source/target systems, or do you need them running at all times? Note all these data requirements based on your technical and business needs so that you can compare them with what prospective solutions offer.

QUESTION #2
What IT resources are available to implement, run and maintain the solution?

Of course, this doesn't just refer to your IT budget. Your IT people's time and skills will also need to align with any data integration solution, as implementation, operation and maintenance time and costs are key considerations. If the data integration solution is sucking up your IT Department's time, problems can go unfixed and ROI on multiple IT projects could suffer.

QUESTION #3
What are my data sources? And where are they located: on-premise or in the Cloud?

The basic elements of data integration revolve around moving data from sources (applications) to targets (data warehouses, etc.). Much of what is powering the Big Data movement is the massive data being collected in the cloud through very large Software as a Service (SaaS) solutions like Salesforce.com.
Some solutions listed in this buyers guide specialize in the integration of cloud application data with on-premise systems to ensure that your users can access complete, current, and accurate data.