TDWI Research

TDWI Best Practices Report, Second Quarter 2011

Next Generation Data Integration

By Philip Russom

Research Sponsors

DataFlux

IBM

Informatica

SAP

Syncsort

Talend

Table of Contents

Research Methodology and Demographics
Introduction to Next Generation Data Integration
  Ten Rules for Next Generation Data Integration
  Why Care About NGDI Now?
Leading Generational Changes for Data Integration
  Expanding Into More DI Techniques
  Users’ Data Integration Tool Portfolios
  DI Tool and Platform Replacements
  Data Types Being Integrated
  Data Integration Architecture
Organizational Issues for NGDI
  Organizational Structures for DI Teams
  Unified Data Management
  Collaborative Data Integration
Catalog of NGDI Practices, Tools, and Platforms
  Potential Growth versus Commitment for DI Options
  Trends for Next Generation Data Integration Options
  Vendor Products and Platforms for NGDI
Recommendations

© 2011 by TDWI (The Data Warehousing Institute™), a division of 1105 Media, Inc. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. E-mail requests or feedback to [email protected]. Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies.


About the Author PHILIP RUSSOM is a well-known figure in data warehousing and business intelligence, having published more than 500 research reports, magazine articles, opinion columns, speeches, Webinars, and more. Today, he’s TDWI Research Director for Data Management at The Data Warehousing Institute (TDWI), where he oversees many of TDWI’s research-oriented publications, services, and events. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research, Giga Information Group, and Hurwitz Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at [email protected].

About TDWI TDWI, a division of 1105 Media, Inc., is the premier provider of in-depth, high-quality education and research in the business intelligence and data warehousing industry. TDWI is dedicated to educating business and information technology professionals about the best practices, strategies, techniques, and tools required to successfully design, build, maintain, and enhance business intelligence and data warehousing solutions. TDWI also fosters the advancement of business intelligence and data warehousing research and contributes to knowledge transfer and the professional development of its Members. TDWI offers a worldwide Membership program, five major educational conferences, topical educational seminars, role-based training, onsite courses, certification, solution provider partnerships, an awards program for best practices, live Webinars, resourceful publications, an in-depth research program, and a comprehensive Web site: tdwi.org.

About the TDWI Best Practices Reports Series This series is designed to educate technical and business professionals about new business intelligence technologies, concepts, or approaches that address a significant problem or issue. Research for the reports is conducted via interviews with industry experts and leading-edge user companies and is supplemented by surveys of business intelligence professionals. To support the program, TDWI seeks vendors that collectively wish to evangelize a new approach to solving business intelligence problems or an emerging technology discipline. By banding together, sponsors can validate a new market niche and educate organizations about alternative solutions to critical business intelligence issues. Please contact TDWI Research Director Philip Russom ([email protected]) to suggest a topic that meets these requirements.

Acknowledgments TDWI would like to thank many people who contributed to this report. First, we appreciate the many users who responded to our survey, especially those who responded to our requests for phone interviews. Second, we thank our report sponsors, who diligently reviewed outlines, survey questions, and report drafts. Finally, we would like to recognize TDWI’s production team: Jennifer Agee, Rod Gosser, and Denelle Hanlon.

Sponsors DataFlux, IBM, Informatica, SAP, Syncsort, and Talend sponsored the research for this report.


Research Methodology and Demographics

Report Scope. Data integration (DI) has changed so quickly and completely in recent years that it scarcely resembles older definitions. For example, some people still think of DI as merely ETL for data warehousing or data movement utilities for database administration. Those basic tasks and use cases are still prominent in DI practice. Yet, DI practices and tools have broadened into many more techniques and use cases. While it’s good to have options, it’s hard to track them and determine in which situations they are ready for use. The purpose of this report is to accelerate users’ understanding of the many new products and options that have entered DI practices in recent years. It will also help readers map newly available technologies, products, and practices to real-world use cases.

Survey Methodology. In November 2010, TDWI sent an invitation via e-mail to the data management professionals in its database, asking them to complete an Internet-based survey. The invitation was also distributed via Web sites, newsletters, and publications from TDWI and other firms. The survey drew responses from almost 350 respondents. From these, we excluded incomplete responses and respondents who identified themselves as academics or vendor employees. The resulting completed responses of 323 respondents form the core data sample for this report.

Survey Demographics. The wide majority of survey respondents are corporate IT professionals (67%), whereas the remainder consists of consultants (26%) or business sponsors/users (7%). We asked consultants to fill out the survey with a recent client in mind.

The financial services (17%) and consulting (16%) industries dominate the respondent population, followed by insurance (9%), software (8%), telecommunications (6%), and other industries. Most survey respondents reside in the U.S. (51%) or Europe (25%). Respondents are fairly evenly distributed across all sizes of companies and other organizations.

Other Research Methods. In addition to the survey, TDWI Research conducted many telephone interviews with technical users, business sponsors, and recognized data management experts. TDWI also received product briefings from vendors that offer products and services related to the best practices under discussion.

Position
Corporate IT professional 67%
Consultants 26%
Business sponsors/users 7%

Industry
Financial services 17%
Consulting/professional services 16%
Insurance 9%
Software/Internet 8%
Telecommunications 6%
Healthcare 5%
Manufacturing (non-computers) 5%
Retail/wholesale/distribution 4%
Government: federal 4%
Education 3%
Pharmaceuticals 3%
Media/entertainment/publishing 3%
Utilities 3%
Other 14%
(“Other” consists of multiple industries, each represented by 2% or less of respondents.)

Geography
United States 51%
Europe 25%
Asia 8%
Australia 4%
Canada 4%
Africa 2%
Central or South America 2%
Middle East 1%
Other 3%

Company Size by Revenue
Less than $100 million 22%
$100–500 million 14%
$500 million–$1 billion 11%
$1–5 billion 16%
$5–10 billion 9%
More than $10 billion 18%
Don’t know 10%

Based on 323 survey respondents.


Introduction to Next Generation Data Integration

All aspects of DI have improved significantly of late.

Data integration (DI) has undergone an impressive evolution in recent years. Today, DI is a rich set of powerful techniques, including ETL (extract, transform, and load), data federation, replication, synchronization, changed data capture, natural language processing, business-to-business data exchange, and more. Furthermore, vendor products for DI have achieved maturity, users have grown their DI teams to epic proportions, competency centers regularly staff DI work, new best practices continue to arise (such as collaborative DI and agile DI), and DI as a discipline has earned its autonomy from related practices such as data warehousing and database administration.

This report brings the reader up to date on DI’s many changes.

To help user organizations understand and embrace all that next generation data integration (NGDI) now offers, this report catalogs and prioritizes the many new options for DI. This report literally redefines data integration, showing that its newest generation is an amalgam of old and new techniques, best practices, organizational approaches, and home-grown or vendor-built functionality. The report brings readers up to date by discussing relatively recent (and ongoing) evolutions of DI that make it more agile, architected, collaborative, operational, real-time, and scalable. It points to new platforms for DI tools (open source, cloud, SaaS, and unified data management) and DI’s growing coordination with related best practices in data management (especially data quality, metadata and master data management, data integration acceleration, data governance, and stewardship). The report also quantifies trends among DI users who are moving into a new generation, and it provides an overview of representative vendors’ DI tools. The goal is to help users make informed decisions about which combinations of DI options match their business and technology requirements for the next generation. But the report also raises the bar on DI, under the assumption that a truly sophisticated and powerful DI solution will leverage DI’s modern best practices using up-to-date tools.

Ten Rules for Next Generation Data Integration

DI’s 10 rules define desirable traits of its next generation.

Data integration has evolved and grown so fast and furiously in the last 10 years that it has transcended ancient definitions. Getting a grip on a modern definition of DI is difficult, because “data integration” has become an umbrella term and a broad concept that encompasses many things. To help you get that grip, the 10 rules for next generation data integration listed below provide an inventory of techniques, team structures, tool types, methods, mindsets, and other DI solution characteristics that are desirable for a fully modern next generation DI solution. Note that the list is a summary that helps you see the new-found immensity of DI; the rest of the report will drill into the details of these rules. Admittedly, the list of 10 rules is daunting because it’s thorough. Few organizations will need or want to embrace all of them; you should pick and choose according to your organization’s requirements and goals. Even so, the list both defines the new generation of data integration and sets the bar high for those pursuing it.1

1 For a similar list with more details, see the TDWI Checklist Report Top Ten Best Practices for Data Integration, available on tdwi.org.

DI encompasses many techniques that may be hand coded or tool based, either analytic or operational.

1. DI is a family of techniques. Some data management professionals still think of DI as merely ETL tools for data warehousing or data replication utilities for database administration. Those use cases are still prominent, as we’ll see when we discuss TDWI survey data. Yet, DI practices and tools have broadened into a dozen or more techniques and use cases.

2. DI techniques may be hand coded, based on a vendor’s tool, or both. TDWI survey data shows that migrating from hand coding to using a vendor DI tool is one of the strongest trends as organizations move into the next generation. A common best practice is to use a DI tool for most solutions, but augment it with hand coding for functions missing from the tool.

3. DI practices reach across both analytics and operations. DI is not just for data warehousing (DW). Nor is it just for operational database administration (DBA). It now has many use cases spanning many analytic and operational contexts, and expanding beyond DW and DBA work is one of the most prominent generational changes for DI.

4. DI is an autonomous discipline. Nowadays, there’s so much DI work to be done that DI teams with 13 or more specialists are the norm; some teams have more than 100! The diversity of DI work has broadened, too. Due to this growth, a prominent generational decision is whether to staff and fund DI as is, or to set up an independent team or competency center for DI.

Don’t do DI in a vacuum. It needs coordination with many technical and business teams.

5. DI is absorbing other data management disciplines. The obvious example is DI and data quality (DQ), which many users staff with one team and implement on one unified vendor platform. A generational decision is whether the same team and platform should also support master data management, replication, data sync, event processing, and data federation.

6. DI has become broadly collaborative. The larger number of DI specialists requires local collaboration among DI team members, as well as global collaboration with other data management disciplines, including those mentioned in the previous rule, plus teams for message/service buses, database administration, and operational applications.

7. DI needs diverse development methodologies. A number of pressures are driving generational changes in DI development strategies, including increased team size, operational versus analytic DI projects, greater interoperability with other data management technologies, and the need to produce solutions in a more lean and agile manner.

8. DI requires a wide range of interfaces. That’s because DI can access a wide range of source and target IT systems in a variety of information delivery speeds and frequencies. This includes traditional interfaces (native database connectors, ODBC, JDBC, FTP, APIs, bulk loaders) and newer ones (Web services, SOA, and data services). The new ones are critical to next generation requirements for real time and services. Furthermore, as many organizations extend their DI infrastructure, DI interfaces need to access data on-premises, in public and private clouds, and at partner and customer sites.

Like any enterprise application, DI deserves architecture, which affects whether it can support next generation requirements.

9. DI must scale. Architectures designed by users and servers built by vendors need to scale up and scale out to both burgeoning data volumes and increasingly complex processing, while still providing high performance at scale. With volume and complexity exploding, scalability is a critical success factor for future generations. Make it a top priority in your plans.

10. DI requires architecture. It’s true that some DI tools impose an architecture (usually hub and spoke), but DI developers still need to take control and design the details. DI architecture is important because it strongly enables or inhibits other next generation requirements for scalability, real time, high availability, server interoperability, and data services.


Why Care About NGDI Now?

The recession has changed business, so DI needs to realign with new business goals for data.

Businesses face change more often than ever before. Recent history has seen businesses repeatedly adjusting to boom-and-bust economies, a recession, financial crises, shifts in global dynamics or competitive pressures, and a slow economic recovery. DI supports real-world applications and business goals, which are affected by economic issues. Periodically, you need to adjust DI solutions to align with technical and business goals for data.

The next generation is an opportunity to fix the failings of prior generations. For example, most DI solutions lack a recognizable architecture, whereas achieving next generation requirements—especially real time, data services, and high availability—requires a modern architecture. Older ETL solutions, in particular, are designed for serial processing, whereas they need to be redesigned for parallel processing to meet next generation performance requirements for massive data volumes (see the sketch at the end of this section).

Most DI solutions are out-of-date or feature-poor, in some respect.

Some DI solutions are in serious need of improvement or replacement. For example, most DI solutions for business-to-business (B2B) data exchange are legacies, based on low-end techniques such as hand coding, flat files, and file transfer protocol (FTP). These demand a serious makeover—or rip and replace—if they’re to bring modern DI techniques into B2B data exchange. Similar makeovers are needed with older data warehouses, customer data hubs, and data sync solutions.

Even mature DI solutions have room to grow. Successful DI solutions mature through multiple lifecycle stages. In many cases, NGDI focuses on the next phase of a carefully planned evolution. For many, the next generation is about tapping more functions of DI tools they already have. For example, most DI platforms have supported data federation for a few years now, yet only 30% of users have tapped this capability. Also to be tapped are newer capabilities for real time, micro-batch processing, changed data capture (CDC), messaging, and complex event processing (CEP).

Unstructured data is still an unexplored frontier for most DI solutions. Many vendor DI platforms now support text analytics, text mining, and other forms of natural language processing. Handling non-structured and complex data types is a desirable generational milestone in text-laden industries such as insurance, healthcare, and federal government.

Plan to evolve DI into shared enterprise infrastructure.

DI is on its way to becoming IT infrastructure. For most organizations, this is a few generations away. But you need to think ahead to the day when data integration infrastructure is open and accessible to most of the enterprise the way that local area networks are today. Evolving DI into a shared infrastructure fosters business integration via shared data.

Many DI teams need a next generation reorganization.

DI is a growing and evolving practice. More organizations are doing more DI, yet staffing hasn’t kept pace with the growth. And DI is becoming more autonomous every day. You may need to rethink the headcount, skill sets, funding, management, ownership, and structure of DI teams.
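The serial-to-parallel ETL redesign mentioned above is easy to picture in miniature. Below is a minimal sketch (not from this report, with hypothetical function and field names) of the idea: instead of transforming an extract end to end in a single process, the extract is partitioned and the partitions are transformed by parallel worker processes.

```python
# A minimal sketch of parallelizing the transform step of ETL.
# All names (transform, amount, fx_rate) are hypothetical.
from multiprocessing import Pool

def transform(row):
    """Row-level transformation: here, a simple currency conversion."""
    return {**row, "amount_usd": round(row["amount"] * row["fx_rate"], 2)}

def transform_partition(rows):
    """Transform one partition of the extracted rows."""
    return [transform(r) for r in rows]

def parallel_etl(extracted_rows, workers=4):
    # Split the extract into one partition per worker.
    size = max(1, len(extracted_rows) // workers)
    partitions = [extracted_rows[i:i + size]
                  for i in range(0, len(extracted_rows), size)]
    with Pool(workers) as pool:
        transformed = pool.map(transform_partition, partitions)
    # Flatten the partitions back into one list for the load step.
    return [row for part in transformed for row in part]

if __name__ == "__main__":
    extract = [{"amount": 10.0, "fx_rate": 1.3}] * 100_000
    print(len(parallel_etl(extract)))
```

The partition-and-parallelize idea sketched here is roughly what parallel ETL engines apply at much larger scale, across threads, processes, and nodes.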


Leading Generational Changes for Data Integration

Expanding Into More DI Techniques

As pointed out earlier, DI consists of multiple, related data management techniques. The number of techniques applied in a DI solution or used by a DI team can be an indicator of DI maturity. For example, many DI solutions begin with a focus on one technique, then add others as the solution moves through project phases or generations. Increasing the number of techniques is often paralleled by increases in team head count and DI tools in the software portfolio. Many teams are driven to adopt more techniques because they begin supporting a new user constituency, which demands new approaches to data integration (as when new performance management requirements demand data federation). Hence, the number of DI techniques and the priority each receives are milestones on the road to the next generation of a DI solution.

To quantify this situation, a survey question asked respondents which DI techniques they’re using today, and in what priority order (see Figure 1). Respondents selected techniques from a short list of the most common ones. (Later, we’ll see responses from a much longer list.)

ETL is by far the top DI priority, seconded by its variant, ELT.

ETL is the most common first priority. Extract, transform, and load (ETL) is without doubt the preferred DI technique for business intelligence (BI) and data warehousing (DW). Given that a large portion of respondents are BI/DW professionals, it’s no surprise that ETL is in use at 95% of surveyed organizations. In fact, 75% identified it as their first priority among DI techniques.

ELT is the leading secondary priority. As a variant of ETL, ELT also scored well with the survey audience (see the sketch below). TDWI sees ELT as a gainer, its use being driven up by the increased processing power of recent DBMS releases, the arrival of new analytic DBMSs, increased use of in-database processing, lingering hand-coded traditions for SQL, and increased use of secondary ETL tools (especially open source tools, which support in-database transforms).

Although not popular in DW, replication and sync are big elsewhere.

Replication and data synchronization are a significant, though tertiary, priority. At 45% total, these fared well in the survey. For moving data with little or no transformation (for which ETL may be preferred), these kinds of tools are a good choice because of their low cost (relative to ETL), simplicity, changed data capture functions, minimal invasiveness, and their ability to run in real time or be event driven. Given their strengths, it seems odd that replication and synchronization aren’t used more in BI and DW contexts.

Application integration technologies often transport data for integration. Judging by Figure 1, almost 40% of organizations surveyed are doing this today. This form of technology uses some type of bus to support messages, events, and services. Although not designed for DI, a bus can carry data in its messages and processing instructions via services. For organizations with a hefty bus implementation in place, this infrastructure is often open to and effective for some DI functions, especially those that must reach operational applications or operate in real time.

Federation has a new presence, and event processing has just arrived.

Data federation is finally ensconced as a DI technique. Federation has been around for years in low-end forms such as distributed queries and materialized views. Modern tools, however, provide superior design and maintenance functions for federation (plus higher performance) that make it far more compelling as a feature you’d depend on. Federation is also more compelling as it becomes ever more virtual. These advances help explain why federation has recently become an ensconced DI technique (30% in Figure 1).
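To make the ETL-versus-ELT distinction concrete, here is a minimal sketch, not taken from the report, of the ELT pattern: raw rows are extracted and loaded into a staging table first, and the transform then runs inside the target DBMS as SQL. SQLite stands in for an analytic DBMS, and all table and column names are hypothetical.

```python
# A minimal ELT sketch: load raw data, then transform in-database with SQL.
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for an analytic DBMS

# Extract + Load: land the raw rows, untransformed, in a staging table.
conn.execute("CREATE TABLE stg_orders (order_id INT, amount REAL, fx_rate REAL)")
conn.executemany(
    "INSERT INTO stg_orders VALUES (?, ?, ?)",
    [(1, 10.0, 1.3), (2, 25.5, 1.3), (3, 7.0, 0.9)],
)

# Transform: the "T" runs last, pushed down to the database engine.
conn.execute("""
    CREATE TABLE fact_orders AS
    SELECT order_id,
           ROUND(amount * fx_rate, 2) AS amount_usd
    FROM stg_orders
""")

print(conn.execute("SELECT * FROM fact_orders").fetchall())
```

Because the transform is expressed as SQL against the loaded data, it runs on the database engine itself, which is exactly the in-database processing trend the survey commentary describes.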


Event processing is a recent addition to the DI arsenal. More than 20% of survey respondents have incorporated some form of event processing into their DI solutions, which is significant given the newness of this practice.
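As a toy illustration (not any vendor’s implementation) of what event processing looks like inside a DI solution, the sketch below consumes change events from a queue and applies each one to a target store as it arrives, rather than waiting for a nightly batch. The event shapes and names are invented for the example.

```python
# A toy event-driven data movement sketch: change events arrive on a queue
# and are applied to a target store one by one, in arrival order.
import queue

events = queue.Queue()
target = {}  # stands in for the target database

def apply_change(event):
    """Apply one captured change (insert/update/delete) to the target."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        target[key] = event["value"]
    elif op == "delete":
        target.pop(key, None)

# Simulate a stream of captured changes from a source system.
events.put({"op": "insert", "key": 101, "value": {"status": "new"}})
events.put({"op": "update", "key": 101, "value": {"status": "shipped"}})
events.put({"op": "delete", "key": 101})

while not events.empty():
    apply_change(events.get())

print(target)  # empty again after the final delete
```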

Which of the following DI techniques are you using in your DI solutions today? Click one or more of the following answers, in priority order from most used to least used.

[Stacked bar chart: for each DI technique—extract, transform, and load (ETL); data federation or virtualization; extract, load, and transform (ELT); messaging or application integration; replication or data synchronization; event processing—the percentage of respondents using it, broken down by priority order (first through sixth).]

Figure 1. Based on 323 responses. Sorted by first priority.

USER STORY Generational change can entail a deeper dive into a vendor’s tool. “To enhance our ability to track data lineage, standardize load scripts, validate domains, and cleanse our customer data, we purchased a vendor’s data integration platform. We have now replaced our old hand-coded scripts with this platform,” said Rick Ellis, the enterprise data architect at Frost Bank. “Today, the platform is up and running. We now need to enhance our knowledge of the integration platform’s functionality to perform data analysis and integrate new data stores, as well as address the business’ next generation of requirements.

“For example, we’ve made our first pass with a data quality solution, and this will continue to be a high priority. Our grass-roots data stewardship program made a meaningful contribution to the quality solution, and we have morphed stewardship into a broader data governance board to assist with other data management disciplines. Upcoming priorities are to get beyond matching, de-duping, and name- and-address cleansing and go into other quality functions. Before any changes are made, impact analysis is essential.

“In the longer term, our ETL team will assist with database migrations, consolidations, and upgrades to help to keep the data clean. Plus, they will probably inherit business-to-business data exchange with partnering financial services companies. The vendor platform we acquired has functions for these, which should help as we grow beyond data warehousing into operational data integration.”


Users’ Data Integration Tool Portfolios

There are different ways to characterize a user’s software portfolio. For DI tools, it’s interesting to assess portfolios by the number of tools and the number of tool providers. This is what the survey question in Figure 2 quantifies. A few generational trends are suggested by comparing results for “today” and “would prefer”:

Many users desire more tools, but from fewer vendors.

Users would prefer to simplify their portfolios. If user preferences pan out, fewer will acquire DI tools from multiple vendors. According to the survey data, the number of user organizations using multiple DI tools from multiple vendors will drop from 44% to 25%. Part of this is the “one throat to choke” issue concerning support and maintenance. Related reasons may include the ongoing trends toward tool standardization and focusing on preferred suppliers for the sake of bulk discounts and other preferential treatment.

Users want to reduce the amount of hand coding. Only 18% of respondents report depending mostly on hand coding for DI. This seems low compared to other surveys TDWI has run. With this survey population, hand coding will drop down to a minuscule 1%. Migrating from hand coding to tool use as the primary development medium is, indeed, a prominent generational change for DI.

Users are very interested in integrated suites of tools. Only 9% report using one today, yet 42% of respondents would prefer one. Integrated suites are available today from a few software vendors. This kind of suite typically has a strong DI and/or DQ tool at its heart, with additional tools for master data management, metadata management, stewardship, governance, changed data capture, replication, event processing, data services development, data profiling, data monitoring, and so on. As you can see, the list can be quite long, amounting to an impressive arsenal of related data management tools and tool features. As more user organizations coordinate diverse data management teams and their solutions, it makes sense for the consolidated team to use a single platform for easier collaboration. Coordinated teams of this sort generally want to share meta and master data, profiles, development templates, and other development artifacts. Thus, one of the noticeable generational trends in DI is the movement toward the use of integrated suites.

Which of the following best describes your organization’s portfolio of DI tools today? For your organization’s next generation DI implementation, how would you prefer that the DI portfolio be?

Using multiple DI tools from multiple vendors: today 44%, would prefer 25%
Using just one DI tool: today 22%, would prefer 24%
Mostly hand coded without much use of vendor DI tools: today 18%, would prefer 1%
Using a DI tool that’s part of an integrated suite of data management tools from one vendor: today 9%, would prefer 42%
Using multiple DI tools from one vendor: today 3%, would prefer 6%
Other: today 4%, would prefer 2%

Figure 2. Based on 323 respondents. Sorted by “today.”

Approximately 60% of DI functions are untouched today.

DI tools and platforms from vendors tend to be feature-rich, especially when a single product supports multiple DI techniques. DI tools are like all enterprise software: Users employ the functionality they need and ignore the rest—at least for the time being. Eventually, business and technology requirements or resources change, and the DI team starts to employ functions they’ve previously ignored. For example, many users stick to core ETL functions for years before expanding their usage into functions that are tangential to ETL, such as changed data capture, services, and interoperability with buses. With the integrated data management suites discussed earlier in this report, users typically start with a particular tool type—usually for data integration or data quality—and later start using other tools built into the suite.

TDWI suspects that users have tapped a relatively small percentage of their DI tools’ functions. To test this, a survey question asked: “What approximate percentage of your primary DI tool’s functions are you using?” The question demanded responses for today and for three years from now. See Figure 3. Survey responses show that, indeed, the percentage is rather low today, but will increase substantially in three years. For example, on the area graph, you can see that the largest concentration of users is employing between 30% and 50% of their DI tool’s functions today. In other words, the average DI shop is only using roughly 40% of tool functions, leaving the other 60% untouched. However, in three years, the largest concentration will be employing 50% to 80% of functions, for an average of approximately 65%.

What approximate percentage of your primary DI tool’s functions are you using?

[Area graph: distribution of respondents by the percentage of their primary DI tool’s functions used (0%–100%), plotted for today and for three years from now.]

Figure 3. Based on 323 respondents.

DI Tool and Platform Replacements

One of the most extreme generational changes you can make for your DI solution is to rip out and replace its underlying tool or platform. As discussed later, the top reason for a replacement is the need for a unified platform that supports multiple tool types, including business-oriented functions (stewardship, exception processing). Other leading reasons are to get a DI platform that supports scalability and/or real-time functionality better than the current one does.

Most organizations are content with their current DI tool or platform.

Those sound reasonable. But how many users really need to replace their DI platforms now? According to the survey, the answer is that relatively few user organizations are considering such an extreme change. (See Figure 4.) One-third of respondents are planning a platform replacement in 2011 (19%) or 2012 (14%). Yet, a whopping 62% report they have no plans to replace their DI platform. The conclusion is that most DI users are content with their current DI platform and tool portfolio.

When do you plan to replace your current primary data integration platform?

No plans to replace DI platform 62%
2011 19%
2012 14%
2013 2%
2014 1%
2015 1%
2016 0%
2017 or later 1%

Figure 4. Based on 323 respondents.

It’s apparent that most users are satisfied with their current DI platform and see no need to replace it. Even so, it’s interesting to hear what kinds of problems would be so onerous as to drive a user to rip and replace. The question expressed in Figure 5 speaks to the heart of this matter by asking: “What problems will eventually drive you to replace your current primary data integration platform?” Responses reveal a few generational trends:

User fascination with single-vendor, integrated DI platforms recurs in survey questions.

Again, users are interested in integrated DI suites. At the top of the survey results in Figure 5, the multiple-choice answer most often selected is: “We need a unified platform that supports DI, plus data quality, governance, MDM, etc.” (40%). We also noted this interest in Figure 2. Here, respondents are going a step further to say that the demand for an integrated suite or platform would be so strong as to drive them to a platform replacement. Note that this is a dramatic generational shift, given that multi-vendor, best-of-breed approaches to data management software portfolios have been the norm for many years.

There’s a growing need for DI tool functions that business people can use. Note that 19% of respondents selected: “We need a platform with tools for some business users.” The growing inclusion of business people in the DI user community is a noteworthy trend. Some vendors are responding to this demand by supplying new, easy-to-use functionality for business-oriented tasks, such as stewardship, exceptions processing, and collaboration with a multi-functional team. This is yet another generational decision that planners of DI must consider.

Scalability and real time are DI’s most pressing requirements.

Scalability is naturally a concern for DI. Scalability problems can manifest themselves in different ways, such as the cost of scaling up (37%) and inadequate data processing speed (35%). With any IT system, frustrations over scalability can lead to a change of platform, and DI platforms are especially susceptible, due to increases in data volumes and processing complexity.

Real-time and related capabilities are enabled or inhibited by a DI platform. A substantial 33% of respondents fear their DI platform may be poorly suited to real-time or on-demand workloads. They’re also concerned that the platform may suffer inadequate support for Web services and SOA (30%) or inadequate high availability (20%). These are all related, because users need services for real-time interfaces, and the interfaces aren’t real time if the DI platform isn’t highly available. For many organizations, accelerating DI functions into real time is just as pressing a generational goal as scaling to massive data volumes.

Legacy platforms or platform components can be a problem. A DI tool, like any IT system, can reach the end of its useful lifecycle. Apparently, a few survey respondents are at that stage, because they report that their “current platform is a legacy we must phase out” (18%). Legacy and related upgrade issues are also seen in responses to survey answers such as “current platform is 32-bit, and we need 64-bit” (12%) and “current platform is SMP, and we need MPP” (5%). Note that these upgrades are on the critical path to achieving generational goals in DI platform performance.

What problems will eventually drive you to replace your current primary data integration platform? (Select nine or fewer.)

We need a unified platform that supports DI, plus data quality, governance, MDM, etc. 40%
Cost of scaling up is too expensive 37%
Inadequate data processing speed 35%
Poorly suited to real-time or on-demand workloads 33%
Inadequate support for Web services and SOA 30%
Inadequate high availability 20%
We need a platform with tools for some business users 19%
Current platform is a legacy we must phase out 18%
Can’t secure the data properly 18%
We need a platform better suited to cloud or virtualization 15%
Inadequate support for in-memory processing 14%
Current platform is 32-bit, and we need 64-bit 12%
Current vendor has questionable practices or viability 8%
Current platform is SMP, and we need MPP 5%
Other 5%

Figure 5. Based on 1,100 responses from 323 respondents (3.4 responses per respondent on average).

USER STORY A private cloud is a likely next generation platform for DI. “Our data integration server runs in a shared server environment, which uses a popular operating system for server resource virtualization,” said the lead data integration specialist at an insurance company. “My team was concerned when we moved to the private cloud provided by IT, because we’re used to owning the servers, plus having one each for data integration, reporting and analysis, and the data warehouse. Not all software servers cohabitate and perform well under virtualized services, you know. But the data integration server I’m using does really well. IT recently upgraded the server farm controlled by virtualization, as part of our migration from legacy UNIX systems to Linux. With greater server bandwidth, I’m now able to set up larger virtual machines for ETL jobs and other routines. Data warehouse loads that used to take 20 hours now complete in about two.”


Data Types Being Integrated

Structured data is still the bread and butter of DI, but other data types are catching up.

The majority of data handled via DI tools and platforms today falls under the rubric structured data. This is primarily about the tables and other data structures of relational databases. But other sources yield predictable structures, such as the record formats of most applications and the character-delimited rows of many flat files. In our survey, a whopping 99% of respondents report handling structured data today, and 78% will continue to do so in three years. See Figure 6.

The hegemony of structured data types has been the norm in DI for decades, and that’s old news. The latest news is that DI solutions have begun handling a wider range of data types. In particular, 84% of respondents report handling some form of complex data today (hierarchical or legacy sources) with their DI tools. Almost as many respondents anticipate handling complex data in three years. Similarly, 62% report handling semi-structured data today (XML and similar standards), and this should grow to 87% in three years.2

Event, spatial, and textual data types are experiencing greater demand.

Three data types are poised for explosive growth, namely event data (messages, usually in real time), spatial data (longitude and latitude coordinates, GPS output), and unstructured data (mostly text expressing human language). All three will go from limited use today to over 90% use in three years. These and other non-structured data types are driven up by increased use of industry standards (SWIFT, ACORD, HL7), smart devices (smart meters, RFID), digital content (images, video), social media (Twitter, Facebook), and many types of Web applications.

Once again, the survey data of this report shows that more people than anticipated are handling events and their data through DI platforms. While that’s surprising, it’s not surprising that spatial data is on the rise. For years, TDWI has noted its Members adding tables and other structures to their data warehouses to fulfill new requirements for location data in support of asset management, actuarial assessments, delivery address augmentation, and consumer demographics. In fact, it’s a bit surprising that the handling of unstructured text is so low at present; TDWI has interviewed many of its Members who apply text mining or text analytic capabilities (whether built into their DI tool or supplied via a separate tool) to convert facts (discovered in textual documents) into structured data (typically a record or table row per discovered fact). For example, insurance companies regularly extract facts from text gathered in the claims process, then use that data to extend their analytic data sets for risk management and fraud detection.
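The text-to-data conversion just described can be pictured with a deliberately simplified sketch. It is not a vendor text-analytics tool (real products use far richer natural language processing), but it shows the shape of the pattern: scan free text for facts and emit one structured record per discovered fact. The regex patterns and the claim text are invented for the example.

```python
# A simplified text-to-data sketch: extract facts from free text and emit
# one structured row per discovered fact.
import re

CLAIM_TEXT = """
Claimant reports rear-end collision on 2010-11-04. Estimated damage $4,200.
Vehicle towed; claimant treated for whiplash. Estimated damage $850 to trailer.
"""

# One pattern per fact type; real text analytics goes far beyond regexes.
FACT_PATTERNS = {
    "incident_date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
    "damage_amount": re.compile(r"\$([\d,]+)"),
}

def extract_facts(doc_id, text):
    """Return one structured row per fact found in the document."""
    rows = []
    for fact_type, pattern in FACT_PATTERNS.items():
        for match in pattern.finditer(text):
            rows.append({"doc_id": doc_id,
                         "fact_type": fact_type,
                         "value": match.group(1)})
    return rows

for row in extract_facts("claim-001", CLAIM_TEXT):
    print(row)
```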

For the types of data on the following list, which are you integrating today through your primary data integration implementation? Which do you anticipate using in three years or so?

Structured data (tables, records): using now 99%, using in three years 78%
Complex data (hierarchical or legacy sources): using now 84%, using in three years 79%
Semi-structured data (XML and similar standards): using now 62%, using in three years 87%
Event data (messages, usually in real time): using now 43%, using in three years 93%
Spatial data (long/lat coordinates, GPS output): using now 29%, using in three years 95%
Unstructured data (human language, audio, video): using now 21%, using in three years 95%

Figure 6. Based on varying numbers of responses from 323 respondents. Sorted by “using now.”

2 For more information about how various data types are handled via data integration, see the TDWI Monograph Complex Data: A New Challenge for Data Integration.

USER STORY Addressing complex data on its own terms can be generational. “Traditionally, our enterprise data warehouse—or EDW—housed mostly source data for highly detailed reports. In terms of ETL, that means a lot of E and L, but little T,” said an enterprise data architect at a manufacturing company. “As I work on our next generation of data integration, I’m focused on integrating core data, not just collecting it, as in the past. I’m developing numerous transformations that will yield aggregated, enterprisewide views of data, instead of the concatenated data marts we have now. The data product of my work goes into an enterprise data model our group has recently designed, in close collaboration with a wide range of other technical and business people. It’s still in review, but we feel confident that the logical model is an accurate view of how the business needs to be represented.

“To ensure that I populate the enterprise data model from the most appropriate data sources, master data has become a priority. Our primary data domain is products, and there are many definitions of product here. We believe we can reduce all these to a single, master definition. But it will be complex and hierarchical, so we’re investigating an XML-based representation of product data. The catch is that few data modeling tools support complex data types, like XML. Plus, we’ll have to move XML hierarchies into and out of our EDW, which is cast in third normal form. These challenges are worth overcoming, because we really need to handle complex data like XML, if we’re to design a master hierarchy that accurately represents the relations among products, parts, subassemblies, and bills of material.”
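A minimal sketch of the XML-to-relational step this architect describes appears below. It is not his actual design; it shows one common approach, in which each node in the hierarchy becomes a row that carries its parent’s key, so the hierarchy can be stored in a third-normal-form schema and rebuilt with self-joins. The XML shape is hypothetical.

```python
# A minimal sketch of flattening an XML product hierarchy into relational
# rows: one row per node, each carrying its parent's key.
import xml.etree.ElementTree as ET

PRODUCT_XML = """
<product id="P1" name="Tractor">
  <part id="A10" name="Engine">
    <part id="A10-1" name="Fuel pump"/>
  </part>
  <part id="B20" name="Transmission"/>
</product>
"""

def flatten(elem, parent_id=None, rows=None):
    """Walk the hierarchy depth-first, emitting one row per node."""
    if rows is None:
        rows = []
    node_id = elem.get("id")
    rows.append({"id": node_id, "parent_id": parent_id,
                 "type": elem.tag, "name": elem.get("name")})
    for child in elem:
        flatten(child, node_id, rows)
    return rows

for row in flatten(ET.fromstring(PRODUCT_XML)):
    print(row)
```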

Data Integration Architecture

DI demands architecture, as any application type would.

To many people, the term data integration architecture sounds like an oxymoron. That’s because they don’t think that data integration has its own architecture. For example, a few data warehouse professionals still cling to the practices of the 1990s, when data integration was subsumed into the larger data warehouse architecture. Today, many data integration specialists still build one independent interface at a time—a poor practice that is inherently anti-architectural. A common misconception is that using a vendor product for data integration automatically assures architecture. Here’s the problem: If you don’t fully embrace the existence of data integration architecture, you can’t address how architecture affects data integration’s scalability, high availability, staffing, cost, and ability to support real-time operation, master data management, SOA, and interoperability with related integration and quality tools. All of these are worth addressing.3

To get a sense of generational trends in DI architecture, the survey asked which architectural types respondents are using today, in priority order. The survey also asked what they’d prefer. See Figure 7.

No consistent architecture. This is risky for any DI solution and the businesses that count on it. Without an architecture, there are few or no data standards, preferred interfaces, coding guidelines, or any other form of consistency. In turn, their absence works against reuse and performance tuning. Though 27% of respondents today lack a DI architecture, only 3% anticipate still being in this undesirable position in the future.

3 For a detailed discussion of DI architectures, see TDWI’s What Works in Data Integration (Volume 25) feature article, “Data Integration Architecture: What It Does, Where It’s Going, and Why You Should Care.”

Hand-coded spaghetti is not an architecture.

Collections of point-to-point interfaces. Most point-to-point (P2P) interfaces are designed and built in a vacuum, without reference to standards. Most are hand coded. The colloquial name for this is “spaghetti coding.” Of course, you realize that P2P is not really an “architecture”—spaghetti is the antithesis of architecture! This is the last thing you want to inherit from other developers, because it’s nearly impossible to see relations among interfaces, much less the big picture of the DI solution (see the sketch below). Lamentably, at 53%, P2P is the most common DI architecture today, and it’s the approach with the (current) highest first priority. Luckily, users surveyed anticipate cutting their dependence on P2P in half in the near future.

Hub-and-spoke architecture. This has become the preferred architecture for most integration technologies today, including the form of data integration known as extract, transform, and load (ETL). (Variations of ETL—such as TEL and ELT—may or may not have a recognizable hub.) However, this is not true of ETL alone; for example, hubs are common in deployments of data federation. Replication usually entails direct interfaces between databases, without a hub, but high-end replication tools support a control server or other device that acts as a hub. Data staging areas and operational data stores (ODSs) often serve as hubs, which are then critical for customer data integration and MDM. Enterprise application integration (EAI) tools and their buses depend on message queue management, and the queue is usually supported by a central integration server (i.e., a hub) through which messages are managed. Hub-and-spoke rated well in our survey, and users surveyed anticipate applying this architecture in the future.

Hub-and-spoke is popular, but that’s no reason to be doctrinaire about its application. At some point, most architectures evolve into some form of hybrid. Many successful DI implementations are mostly hub-and-spoke, but with a little bit of spaghetti thrown in. A common best practice in DI is to replace a spoke with a point-to-point interface when the spoke doesn’t scale or perform. Sometimes performance and scalability take precedence over architecture.

Services and buses will reinvigorate DI architecture.

Data service architecture. Data integration architecture is heading out on the leading edge by incorporating service-oriented architecture (SOA). Note that SOA won’t replace current hub-based architectures for data integration. Hubs will remain, but be extended by services. The goal is to provide the ultimate spoke, namely the data integration service or simply data service. According to the survey, this type of DI architecture is set to grow the most, nearly doubling from 41% of respondents using it today to 73% in users’ next generation DI solutions.

Buses for messages, events, and services. Similar to services, the use of buses with DI solutions is set to grow significantly (from 23% to 56%). Note that services and buses are related. Most services (regardless of type) are transported over a bus, as are responses to services. In addition, recall that survey questions discussed earlier in this report show that event processing is a new but growing technique for DI. It, too, may depend on an enterprise bus for event delivery and reaction. As the need for data services and event processing grows for next generation data integration solutions, so will the need for DI tools and platforms to access enterprise buses.
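Some quick arithmetic (an illustration, not from the report) shows why point-to-point collections degenerate into spaghetti: connecting n systems pairwise needs on the order of n(n-1)/2 interfaces, while a hub-and-spoke design needs only one spoke per system.

```python
# Interface counts for n systems: point-to-point versus hub-and-spoke.
def p2p_interfaces(n):
    return n * (n - 1) // 2   # one interface per pair of systems

def hub_interfaces(n):
    return n                  # one spoke per system, all meeting at the hub

for n in (5, 10, 20, 40):
    print(f"{n:>3} systems: {p2p_interfaces(n):>4} point-to-point "
          f"vs {hub_interfaces(n):>3} hub-and-spoke interfaces")
```

At 40 systems, that is 780 interfaces versus 40, which also suggests why the hybrid approach above reserves point-to-point links for the few spokes that cannot meet performance needs.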


Which of the following approaches to DI architecture are you using in your data integration infrastructure today? (Click one or more of the following answers, in priority order from used most to used least.) For your next generation data integration infrastructure, which DI architectures would you prefer to be using, in priority order?

[Stacked bar chart comparing “today” versus “next gen” usage, by priority order, for five approaches to DI architecture: collection of point-to-point interfaces; hub-and-spoke architecture; no consistent architecture; data service architecture; bus for messages, events, and services.]

Figure 7. Based on 323 respondents. Sorted by “today” and first priority. Note that values for fifth and sixth priorities (all 1% or 0%) are omitted to simplify the chart.


USER STORY On the leading edge: Data integration as infrastructure. “When I came into my current position, I immediately saw a need for data integration as a shared enterprise infrastructure. It would be analogous to a local area network that’s accessible to just about anyone, with ample bandwidth for everyone,” said the lead data integration specialist at a pharmaceutical company. “A generous site license from a leading data integration vendor was key to making this feasible. Today, use of the platform is free to any group, without much review of their purposes. Due to the large size of the company and the honest need for data integration, we’ve spawned over 400 implementations, supported by over 400 data integration developers worldwide.

“The site license isn’t cheap, but the business feels it’s worth the expense. Pharma companies tend to suffer dozens of siloed business units, each focused on a different pharma product. Data integration as a shared enterprise infrastructure has greatly accelerated the sharing of data across these units, which results in desirable knowledge transfers and more accurate reporting across the entire enterprise.”

Organizational Issues for NGDI

Organizational Structures for DI Teams

Corporations and other user organizations have hired more in-house data integration specialists in response to an increase in the amount of data warehousing work and operational data integration work outside of warehousing. In the “old days,” an organization had one or maybe two data integration specialists in house (if any), whereas a dozen or more are common today.

The average number of DI specialists per organization is in the range of 13.1 to 16.4.

To quantify the size of DI teams today, the report survey asked: “How many full-time data integration specialists work in your organization?” See Figure 8. The survey required respondents to type an integer between zero and 99. A simple average of the entries tallies to 16.4 DI specialists per organization. Admittedly, this number is a bit skewed, because a few respondents reported having zero (7%) or 99 (5%). Treating these as outliers and omitting them brings the average down to 13.1 (see the sketch below). Either way, these numbers indicate rather sizable DI teams.

To give this growth a context, let’s compare surveys. In a TDWI Technology Survey from May 2007, one-quarter of surveyed organizations reported having five or more DI specialists. In this report’s survey, roughly half of respondents fit that bill. By that standard, the number of DI specialists has doubled in the last four years. As another data point, the number of DI specialists filling out the TDWI Salary Survey has almost doubled in the same time frame—and their salaries have increased substantially!4

4 The total annual compensation of the average DI specialist finally broke $100,000 in 2010. For details, see the 2011 TDWI Salary, Roles, and Responsibilities Report (available to TDWI Members on tdwi.org).
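As a toy version of the outlier trimming described above (with fabricated values, not TDWI’s survey data), the calculation amounts to recomputing the mean after dropping the boundary responses of zero and 99:

```python
# Toy illustration of trimming boundary responses before averaging.
responses = [0, 0, 3, 5, 8, 12, 15, 22, 40, 99]  # fabricated, for illustration

simple_mean = sum(responses) / len(responses)
trimmed = [r for r in responses if r not in (0, 99)]
trimmed_mean = sum(trimmed) / len(trimmed)

print(f"simple mean:  {simple_mean:.1f}")   # 20.4, pulled around by 0s and 99
print(f"trimmed mean: {trimmed_mean:.1f}")  # 15.0
```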

How many full-time data integration specialists work in your organization?

Zero 7%
1 to 5 39%
6 to 10 19%
11 to 15 8%
16 to 25 9%
26 to 50 10%
51 to 98 2%
99+ 5%

Figure 8. Based on 323 responses.

Most DI specialists work in a BI/DW team. The rest are strewn across other enterprise teams.

As the number of DI specialists grows and the breadth of their work expands over analytic and operational tasks, organizations are driven to reevaluate how and where they manage DI specialists and their work. Today, a number of team structures organize the work of data integration specialists, as seen in Figure 9:

BI/DW team. In many organizations, the bulk of DI still centers on data warehousing (DW) and business intelligence (BI), so it makes sense to manage DI work through the BI/DW team (59%).

Data architecture and administration. DI doesn’t just originate from BI/DW teams. Another common starting point is the database administration group (DBA, 15%). A common reorganization nowadays is for the DBA group to be subsumed into an enterprise data architecture group (EDA, 24%). This makes sense, because a lot of information lifecycle management work that EDA groups initiate involves operational DI to migrate, consolidate, sync, and upgrade operational databases.

DI managed by IT. One of the newer trends in DI is to treat DI platforms and solutions like shared infrastructure, akin to how networks and storage are managed centrally and openly made accessible to many enterprise organizations and their IT systems. In these cases, central IT management (25%) or the CIO’s office (12%) manages DI specialists and their work.

Recent years have seen the birth of the DI-specific team, often in a competency center.

DI-specific teams. For many firms, a next generation priority is to find an appropriate home for DI specialists, as well as their tools, platforms, and solutions. A conclusion more and more organizations are coming to is that there should be an independent data integration team as a standalone unit (23%). The standalone unit often takes the shape of a data integration competency center (17%), although DI may also be folded into other forms of competency centers that are not exclusive to DI (12%). Among TDWI Members, the BI competency center is a common example. In all these cases, the competency center (sometimes called a center of excellence) provides shared human resources (namely, DI specialists) who can be allocated by the center’s manager to DI work as it arises, whether it’s analytic, operational, or both.


Data stewardship and governance. Stewardship and governance are, themselves, evolving into a new generation. Both originated as advisory boards, where committee members identify data quality or data compliance problems and opportunities, then recommend that data experts in other groups take action to correct or leverage them. In the next generation, expect to see more data management professionals—especially DI and DQ specialists—reporting to a data governance board (5%) or data stewardship program (5%), so they can do the technical work that the board identifies as a priority.

Where you work, what kind of organizational structure coordinates the work of most data integration specialists? Select all that apply.

Data warehouse or business intelligence team 59%
Central IT management 25%
Enterprise data architecture group 24%
Data integration team—as a standalone unit 23%
Data integration competency center 17%
Database administration group 15%
CIO’s office 12%
Competency center—not exclusive to DI 12%
Data governance board 5%
Data stewardship program 5%
Other 5%

Figure 9. Based on 652 responses from 323 respondents (2 responses per respondent on average).

USER STORY Competency centers and other central teams offer advantages. “My employer is a large, multi-billion-dollar company that has grown mostly through mergers and acquisitions,” said Ron Woodyard, the primary integration manager at Cardinal Health. “This helps explain why we have so many data integration specialists and so much work to do. To handle it, we’ve brought close to 250 employees into our Integration Services Center (ISC). Around 140 members of the ISC constitute the pure integration team, while the other folks work on MDM, content management, EDI services, and so on. I run the ISC like a business, based on shared human resources and technology services.

“Centralizing data integration and similar work in the ISC has its advantages. Having most of the eggs in one basket makes it easier to align our work with the firm’s information agenda. With all data integration processing flowing through one center, planning capacity is more accurate, as opposed to quantifying many tools in many business units on many platforms. Having development standards and enforcing them through a code review process is a lot smoother. We can now source data once, then distribute it multiple times. And the ISC has saved money by replacing hand coding with vendor tools as the primary development method.”


Unified Data Management

In most organizations today, data and other information are managed in isolated silos by independent teams using various data management tools for data quality, data integration, data governance and stewardship, metadata and master data management, B2B data exchange, database administration and architecture, information lifecycle management, and so on. In response to this situation, some organizations are adopting what TDWI calls unified data management (UDM), a practice that holistically coordinates teams and integrates tools.

TDWI Research defines unified data management (UDM) as a best practice for coordinating diverse data management disciplines, so that data is managed according to enterprisewide goals that promote technical efficiencies and support strategic, data-oriented business goals.

The “big picture” that results from bringing diverse data disciplines together through UDM yields several benefits, such as cross-system data standards, cross-tool architectures, cross-team design and development synergies, leveraging data as an organizational asset, and assuring data’s integrity and lineage as it travels across multiple organizations and technology platforms. However, the ultimate goal of UDM is to achieve strategic, data-driven business objectives, such as fully informed operational excellence and business intelligence, plus related goals in governance, compliance, business transformation, and business integration.5

Data integration is but one of the many data management disciplines that may be coordinated via UDM and similar organizational practices. Yet the need for UDM affects DI, in that DI specialists and their managers must revisit when and how certain DI work should be coordinated with related work by other data management teams. The priority and importance of such collaboration by DI specialists varies from one data management team to the next. These priorities are sorted in Figure 10.

BI and DW. DI specialists have their priorities straight. Coordinating with BI/DW teams is both the greatest first priority and the greatest second priority. As pointed out earlier, TDWI’s survey populations tend to have a strong representation of DW and BI professionals. Even if we pare back the survey results to compensate for the survey population, the DI specialist’s commitment to BI/DW coordination is still clear.

Application integration and SOA. As we’ve seen in other data points of this report, DI specialists are continuing the trend of integrating some data (usually time sensitive) over application buses. In another trend, they’re embracing data services and the concept of data virtualization. Both trends require more coordination between the DI specialist and application integration teams. These trends have progressed to the point that this coordination is now a high priority.

Data architecture and modeling. There’s a long-standing tradition in which DI specialists get a lot (if not all) of the requirements they need to design and build a solution from a data architect or modeler. This is the case for most DI specialists working in a traditional DW team. More and more DI specialists work on database architecture and administration teams, where they get much of their direction from an enterprise data architect or similar team leader. (In some organizations, the data architect is called a data analyst.) As the next generation takes DI specialists off to independent teams, this coordination will most likely continue, but without the DI person reporting directly to an architect.

5 For a detailed discussion of unified data management (UDM), see the TDWI Best Practices Report Unified Data Management: A Collaboration of Data Disciplines and Business Strategies.

Governance, stewardship, and quality. In one trend, DI specialists are getting involved as committee members for data governance and stewardship. In another trend, DI specialists coordinate their efforts ever deeper with DQ specialists. Put these together, and you can expect increased coordination between DI specialists and teams or boards for governance, stewardship, and quality. Besides risk and compliance, good data governance also provides a medium for coordinating data management work.

Meta and master data. According to TDWI surveys, most implementations of master data management (MDM) are home-grown, built atop a data integration tool (usually in the ETL style). All data management professionals have to do a fair amount of metadata management in the due course of their work.

Secondary, supporting data management disciplines. Data integration is a primary data management discipline in that it generates a deliverable, similar to other primary disciplines such as data quality and MDM. DI and the other primary disciplines demand a fair amount of coordination with secondary, supporting disciplines such as metadata management and data profiling.

Data archiving. The use of DI tools in data archiving has come out of nowhere in recent years to become a sizable presence. That’s because enterprises are struggling to manage the giant volumes of data they’ve amassed. To reduce the burden of less valuable or older data on primary storage systems, they’re aggressively moving data into archives. Doing that with efficiency and sophistication requires DI tools and techniques. Suddenly, data archiving is part of the DI workload, and it will increase in the next generation.

With which other data management practices or teams do you coordinate DI work? (Respondents ranked their answers in priority order, first through sixth.)

[Figure 10 charts the percentage of respondents ranking each practice, covering: business intelligence and data warehousing; application integration and SOA; enterprise data architecture; data modeling; data governance; data quality; master data management; data archiving; metadata management; data stewardship; content management, including text analytics; inter-enterprise (or B2B) data exchange; and data profiling.]

Figure 10. Based on 323 respondents. Sorted by first priority. Note that seventh through thirteenth priorities (all 2% or less) are omitted to simplify the chart.


USER STORY Selecting a platform is a key generational decision. “I spearhead my firm’s data management initiative, which involves the coordination of teams and solutions for data integration, data quality, MDM, warehouse design, and business intelligence,” said James Brousseau, the enterprise data architect at SonoSite, Inc. “Early on, we decided that coordinating this many tool types and disciplines would be easier and yield more sustainable results if we standardized on a platform that supports as many of these disciplines as possible. We also knew that data quality and data integration would be immediate needs. So we acquired a vendor platform that excels in both quality and integration, plus has other tools. To be sure we’d get the enterprisewide coordination we need, we made the platform an enterprise resource, owned and maintained by central IT, but accessible to various teams on an as-needed basis.

“With this foundation successfully deployed, we can now focus on next generation goals. Most of these revolve around transforming our data warehouse. Today, it’s mostly an operational data store for ERP reporting. We’ll keep that, plus evolve the warehouse into an enterprise-scope view of corporate performance that’s more appropriate to business intelligence. After that, the next priority will be to develop a gold copy of addresses and other customer data.”

Collaborative Data Integration

The need for collaboration around data integration has increased recently. On the technology side, data integration specialists are growing in number, data integration work is increasingly dispersed geographically, and data integration is more tightly coordinated with other data management practices (especially data quality and MDM). On the business side, business people have long taken an interest in data integration related to business intelligence and mergers, but they now need direct involvement due to new requirements for compliance and governance. Collaboration reaches within newly expanded DI teams, plus across to related teams and business management.

TDWI Research defines collaborative data integration as a collection of user best practices and tool functions that foster collaboration among the increasing number of technical and business people who are involved in data integration projects and initiatives.6

The leading business benefits of collaborative data integration are that it supports governance and gives business people self-service visibility into the details and progress of data integration projects. Technology benefits include more efficient and effective collaboration between the business and IT, the reuse of development objects, and more options for IT management to manage geographically dispersed teams. Despite its benefits, there are barriers to collaborative data integration:

DI in terms business people can understand. Business and technical people speak different languages, according to 60% of respondents in Figure 11. The problem is exacerbated because most DI implementations today lack a business-friendly view of data (52%). To alleviate this problem, some organizations create a semantic layer or data virtualization layer with a DI tool, using its metadata management, data services, and related capabilities.

DI tools for business people. According to survey respondents, their current tools lack functions for business people to use (41%). As explained earlier, a number of vendor tools now include business-oriented functions for data governance, stewardship, exception processing, business views of data, requirements and specifications, and annotations for metadata and data profiles. The point of these new DI tool functions is to let business users collaborate over DI.

Collaborative tool features. A number of respondents complained that their current tools lack adequate version control (20%). DI tools need the kind of collaborative functions that have been common in application development tools for years.

6 For in-depth discussions of collaborative DI, see the two TDWI Monographs Collaborative Data Integration: Coordinating Efforts within Teams and Beyond and Second-Generation Collaborative Data Integration.

For example, check-in/check-out and versioning for routines, data flows, and other DI development artifacts are absolute requirements. Optional features include project management, project progress reports, object annotations, and discussion threads. Most collaborative functions should be accessible via a browser, so a wide range of people (regardless of location) can collaborate.
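To make that baseline requirement concrete, here is a minimal sketch in Python. It is purely illustrative—it models no particular vendor tool, and the artifact and user names are hypothetical—but it captures the check-in/check-out and versioning behavior described above.

```python
# Purely illustrative sketch -- not any vendor's API. A minimal model of
# check-in/check-out and versioning for DI development artifacts such as
# routines and data flows. Artifact and user names are hypothetical.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class DIArtifact:
    name: str
    versions: List[str] = field(default_factory=list)  # saved definitions
    checked_out_by: Optional[str] = None               # current lock holder

    def check_out(self, user: str) -> None:
        """Lock the artifact so only one developer edits it at a time."""
        if self.checked_out_by is not None:
            raise RuntimeError(f"{self.name} is locked by {self.checked_out_by}")
        self.checked_out_by = user

    def check_in(self, user: str, definition: str) -> int:
        """Save a new version and release the lock; returns the version number."""
        if self.checked_out_by != user:
            raise RuntimeError("check the artifact out before checking it in")
        self.versions.append(definition)
        self.checked_out_by = None
        return len(self.versions)

flow = DIArtifact("customer_load_flow")
flow.check_out("di_developer")
print(flow.check_in("di_developer", "<data flow definition, v1>"))  # 1
```

A real DI tool would, of course, persist versions in a shared repository and expose these functions through a browser, per the paragraph above.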

What are some barriers to collaboration for DI in your organization? Select all that apply.

Business and technical people speak different languages 60%
Lack of a business-friendly view of data 52%
Our current tools lack functions for business people to use 41%
Our current tools lack adequate version control 20%
Collaboration is not an issue for us (if you check this, do not check other answers) 17%
Other 5%

Figure 11. Based on 632 responses from 323 respondents (2 responses per respondent on average).

Catalog of NGDI Practices, Tools, and Platforms

At this point in the report, we’ve defined the terms and concepts of next generation data integration (NGDI), listed the drivers that push organizations into a new generation, and discussed common generational changes. As you have likely noticed, the next generation of data integration involves many different options, which include tool features and tool types, user-oriented techniques and methods, and team or organizational structures. Now it’s time to draw the big picture so we can answer questions about these options, such as:

• What are the many options that users need to incorporate into the next generation of their data integration solutions?

• Which ones are users adopting and growing the most?

• Which are in decline?

• At what rate is generational change occurring?

To help quantify these and other questions, TDWI presented survey respondents with a long list of options for data integration. (See the left side of Figure 12, page 25.) These options include a mix of vendor-oriented product features and product types, as well as user-oriented techniques and organizational structures. The list includes options that have arrived fairly recently (real-time functions, complex event processing), have been around for a few years but are just now experiencing broad adoption (changed data capture, high availability for DI servers, services), or have been around for years and are firmly established (ETL, hand coding, batch processing). The list is a catalog of available options for DI, and survey responses enable us to sort and interpret the list in a variety of ways.

Concerning the list of DI options presented in the survey, TDWI asked: “For the techniques, features, and practices on the following list, which are you using today in or around your primary data integration implementation?” To get a sense of how this will change over time, TDWI also asked: “Which do you anticipate using in three years or so?” Survey responses for these two questions are charted as pairs of bars on the left side of Figure 12.


The “potential growth” chart in the middle of Figure 12 simply shows the per-row delta between responses for “using now” and “using in 3 years,” to provide an indication of how much the usage of a DI option will increase or decrease.

The survey question told the respondents: “Checking nothing on a row means you have no plans for using that technique now or in the future.” This enables us to quantify the approximate percentage of user organizations surveyed that are using a particular DI option, whether now, in the future, or both. The cumulative usage measured here is a sign of how committed users are, on average, to a particular DI option. These percentages are charted in the “commitment” column of Figure 12.

Potential Growth versus Commitment for DI Options

Figure 12 is fairly complex, so let’s explain how to read it. First off, Figure 12 is sorted by the “potential growth” column in descending order. “Master data management (MDM)” appears at the top of the chart because—with a delta of 45%—this option has the greatest potential growth. However, not all organizations plan to use this option. In the commitment column, we see that 72% of survey respondents have committed to implement MDM at some point; apparently, the other 28% have no plans to implement MDM. By scanning the commitment column in Figure 12, you can see that 72% is a very high level of commitment for a DI option. Coupled with the very high potential growth, it’s obvious that, in the wide majority of organizations, the next generation of DI will include some form of MDM.

From this, we see that there are two forces at work in Figure 12, as well as in the planning processes of user organizations. Commitment and potential growth are two different metrics for the future of DI options. A sketch of how the two metrics are derived follows this list.

• Potential growth. The potential growth chart subtracts “using now” from “using in 3 years,” and the delta provides a rough indicator of the growth or decline in use of DI options over the next three years. The charted numbers are positive or negative. A negative number indicates that the use of an option may decline or remain flat instead of grow; a positive number indicates growth, whether good or strong.

• Commitment. Collected during the survey process, the numbers in the commitment column represent the number of survey respondents who selected “using now” and/or “using in 3 years.” That number is expressed as a percentage of 323, the total number of respondents who answered the questions in Figure 12. Note that the measure of commitment is cumulative, in that the commitment may be realized today, sometime in the near future, or both.

• Balance of commitment and potential growth. To get a complete picture, it’s important to look at the metrics for both growth and commitment. For example, some features or techniques have significant growth rates, but within a weakly committed segment of the user community (clouds, open source DI, SaaS). Others have low growth rates, but are strongly committed through common use today (ETL, batch processing). Options seeing the greatest activity in the near future will most likely be those with strong ratings for both growth and commitment (MDM, data governance, data quality).

To help you visualize the balance of growth and commitment, Figure 13 plots the potential growth and commitment numbers from Figure 12 as opposing axes of a single chart. DI options are plotted in terms of growing or declining usage (x-axis) and narrow or broad commitment (y-axis).
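As a concrete illustration of the two metrics, here is a minimal sketch in Python. The per-respondent answer sets are hypothetical stand-ins (the raw survey data is not published); the point is the arithmetic described above.

```python
# Hypothetical respondent IDs for one DI option; illustration only.
using_now = {"r01", "r02", "r03"}                  # checked "using now"
using_in_3_years = {"r02", "r03", "r04", "r05"}    # checked "using in 3 years"
total_respondents = 10                             # answered the question

# Potential growth: per-row delta between anticipated and current usage.
# Positive means growth; negative means flat or declining use.
growth = (len(using_in_3_years) - len(using_now)) / total_respondents * 100

# Commitment is cumulative: a respondent who checked either box (or both)
# counts once, so it is a set union, not a sum of the two percentages.
commitment = len(using_now | using_in_3_years) / total_respondents * 100

print(growth)      # 10.0 -> modest potential growth
print(commitment)  # 50.0, not 70.0: r02 and r03 appear in both sets
```

Because commitment is a set union, it cannot be computed from the two charted percentages alone; it was collected as its own measure during the survey.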


For the techniques, features, and practices on the following list, which are you using today in or around your primary data integration implementation? Which do you anticipate using in three years or so? (Answer these two questions for each row in the following table. Checking nothing on a row means you have no plans for using that technique now or in the future.)

OPTION | USING IN 3 YEARS | USING NOW | POTENTIAL GROWTH | COMMITMENT
Master data management (MDM) | 69% | 24% | 45% | 72%
Real-time data quality | 50% | 7% | 42% | 52%
Real-time data integration | 56% | 16% | 40% | 60%
Data governance and stewardship | 69% | 29% | 40% | 74%
Complex event processing (CEP) | 46% | 12% | 34% | 49%
Tool functions for business people | 41% | 9% | 32% | 42%
Metadata management | 67% | 37% | 29% | 72%
Text analytics or text mining | 36% | 7% | 29% | 38%
Real-time alerts | 44% | 15% | 29% | 46%
In-memory processing without landing data to disk | 41% | 14% | 27% | 44%
Data quality functions | 73% | 47% | 26% | 84%
Data profiling | 68% | 42% | 26% | 76%
Data federation and virtualization | 47% | 22% | 25% | 52%
Web services | 51% | 28% | 24% | 56%
Service-oriented architecture (SOA) | 47% | 25% | 23% | 52%
Interoperability with message bus or service bus | 35% | 14% | 20% | 37%
Single integrated platform for DI, DQ, MDM, etc. | 31% | 10% | 20% | 34%
Changed data capture (CDC) | 67% | 47% | 20% | 76%
High availability (HA) for DI server | 35% | 16% | 19% | 38%
Private cloud as a DI platform | 23% | 4% | 19% | 24%
Cross-team collaborative functions | 40% | 21% | 19% | 46%
Trickle or streaming data loads | 25% | 7% | 18% | 27%
Metadata repository used for non-metadata | 28% | 11% | 17% | 31%
Micro batches during business day | 32% | 17% | 15% | 36%
XML as source data or message type | 53% | 40% | 13% | 59%
DI tool licensed via open source | 20% | 8% | 12% | 23%
Hadoop-based data processing | 15% | 3% | 12% | 16%
DI tool licensed via software-as-a-service (SaaS) | 17% | 7% | 10% | 20%
Data synchronization | 55% | 45% | 10% | 65%
Public cloud as a DI platform | 11% | 2% | 10% | 12%
Inter-enterprise or B2B data exchange | 22% | 14% | 9% | 25%
Java messaging service (JMS) | 20% | 14% | 6% | 25%
Secondary DI tool to clear specific bottlenecks | 11% | 6% | 5% | 13%
Sort tool, to augment main DI tool | 8% | 5% | 3% | 10%
Replication | 31% | 29% | 2% | 39%
Extract, load, and transform (ELT) | 51% | 49% | 2% | 61%
Extract, transform, and load (ETL) | 68% | 80% | -11% | 85%
Batch processing | 67% | 91% | -24% | 92%
Hand-coded DI routines | 22% | 49% | -27% | 51%

Figure 12. Based on 323 respondents. The above charts are sorted by “potential growth.”


Next Generation DI Options Plotted for Growth and Commitment

[Figure 13 is a scatter plot of the DI options from Figure 12, with potential growth on the x-axis (from declining at -50%, through flat and good, to strong at +50%) and commitment on the y-axis (from weak at 0%, through moderate and good, to strong at 100%). Five circled groups are labeled: (1) strong-to-moderate commitment, strong potential growth; (2) good commitment, good potential growth; (3) moderate commitment, good potential growth; (4) weak commitment, good potential growth; and (5) strong commitment, declining growth.]
Figure 13. Plots are approximate, based on values from Figure 12.

Trends for Next Generation Data Integration Options

Figures 12 and 13 show that most DI options will experience some level of growth in the near future. The figures also indicate which options will grow the most, and they reveal a number of trends concerning how users plan to apply various options to their next generation data integration solutions. In particular, five groups of options stand out based on combinations of growth and commitment. (See the groups circled, numbered, and labeled in Figure 13.)

1. Strong-to-moderate commitment, strong potential growth. The options most likely to live up to our great expectations and sustain growth over the long haul are those that have solid survey results for both commitment and potential growth. Group 1 in Figure 13 has those numbers, and it includes some of the most hotly pursued features and techniques of recent years. In many ways, group 1 is the epitome of next generation data integration because of its mix of leading-edge options supported by real-world organizational commitment.

Group 1 is a mix of growing real-time techniques, data management disciplines, and organizational practices. The real-time techniques include real-time data integration, real-time data quality, complex event processing (CEP), and real-time alerts. Real-time techniques appear prominently in other groups in Figure 13, reminding us that the gradual migration of DI solutions toward real-time operation is possibly the strongest trend in DI today. Among these, CEP is a relatively new addition to the DI inventory of options; it has come on strong, and TDWI expects CEP to become common in DI contexts in upcoming years.


The data management disciplines of group 1 include MDM, data quality, data profiling, metadata management, and text analytics. Organizational practices include data governance and business people’s new hands-on involvement using DI tools.

2. Good commitment, good growth. As with group 1, features and techniques seen in group 2 have real-time data movement in common, ranging from changed data capture (CDC) to Web services and SOA to data federation and data sync. Group 2 also includes ELT (which has replaced ETL in many user solutions and vendor tools) and XML (which is quickly becoming a common data type for DI thanks to its use in B2B data exchange and other operational DI practices).

3. Moderate commitment, good potential growth. This group is an eclectic collection of DI options. Again, there are options that can move data in real time or close to it, as with trickle feeds, intraday microbatches, and message/service buses. High availability has become a priority because real-time DI isn’t real time if it’s not highly available. Group 3 also includes collaborative DI, an organizational practice that has skyrocketed in recent years to coordinate work among burgeoning numbers of DI specialists. In a related practice, users often enable collaboration via shared project documents and development artifacts managed in a metadata repository; such repositories now manage much more than metadata, handling master and reference data, browser views of data, discussion threads, object annotations, and a wide range of productivity documents.

4. Weak commitment, good growth. It’s interesting that this category includes some of the newest options for data integration, including software as a service (SaaS), public and private clouds, and open source software for DI and related data management disciplines. The appearance of Hadoop in this group (plus text analytics in group 1) reminds us that DI solutions are progressively embracing the integration of unstructured data, especially in the form of natural language text. These options are so new to data integration that they have only minimal commitment so far, but they should see good growth soon. Group 4 also includes the use of sort tools and secondary DI tools to augment primary ones. TDWI has seen organizations clear performance bottlenecks with such tools. In a distributed DI architecture, these extra tools help offload processing workloads from overtaxed DI servers at the hub of the architecture.

5. Strong commitment, declining growth. This group includes three of the great pillars of traditional data integration: extract, transform, and load (ETL); batch processing; and hand-coded routines. In fact, these are some of the most common components found in data integration solutions deployed today. If these are so popular, then why does the survey show them in decline?

Think of the many new real-time capabilities that users are employing in DI, plus the strong trend toward data services. Batch processing will never go away, because it’s still very useful. Yet it’s being used less, replaced in a growing number of use cases by other speeds and frequencies for processing and information delivery. Likewise, hand coding is being progressively supplanted by solutions built primarily atop a vendor DI tool, as described earlier. Hand coding won’t go away, either, because it’s indispensable for custom work that complements vendor tool capabilities. Long story short, batch processing and hand coding are becoming a smaller percentage of the options applied to DI, as more of the newer options become prominent. Older DI options won’t disappear, but they will be a lesser percentage of DI functions as they’re joined by new ones.


ETL is a similar case. A common knee-jerk reaction to ETL is that it’s only for overnight batch processing, with heavy transformational processing in support of data warehousing. That might have been true in the early 1990s, but today’s ETL tools support most of the options listed in Figures 12 and 13. Ironically, as users progressively tap into more of these new functions, they usually don’t think of them as ETL, even when the functionality is available directly from an ETL tool or a DI platform with an ETL lineage. Similar to batch processing and hand coding, ETL is not going away. It’s just contracting as a percentage of DI capabilities as new options join it.

USER STORY An alternative view of data integration. “First, I’m not a fan of ETL, so I’m looking for a solution that will replace it,” said a data architect and solution architect at a large bank in the United States. “It’s ironic that ETL specialists are hardened technology guys, yet they’re supposed to satisfy business requirements. I need a solution that gives business users control over metadata, instead of the ETLers. That way, sales can view data one way this week, another way next week. Second, if I can’t replace ETL, then I’ll at least improve it by moving from a time-consuming waterfall development method to an agile one. Third, data integration should just expose data to mathematicians and statisticians for analytic purposes. The deliverable is mostly transactional data, with little or no transformation. Hence, there’s no real need for ETL in my department.”

Vendor Products and Platforms for NGDI

Since the firms that sponsored this report are all good examples of software vendors that offer tools, platforms, and services conducive to the next generation of DI, let’s take a brief look at the product portfolio of each, with a focus on next generation trends and requirements. The sponsors form a representative sample of the vendor community, yet their DI offerings illustrate different approaches to DI tools and platforms.7

DataFlux

From a vendor’s viewpoint, one of the most challenging next generation requirements to satisfy is the demand for data management tools that are appropriate for business people. For years, DataFlux has offered a mature DQ suite, and more recently it has built out the suite’s stewardship functions to evolve them toward data governance, exception processing, management dashboards for quality metrics, business-friendly views of data, and other needs specific to business users. DataFlux is a subsidiary of SAS, and a few years ago the two executed a reorganization that moved SAS’s DI products to DataFlux. This has helped them deepen the integration between DQ and DI tools. These tools, of course, also integrate tightly with SAS’s DW, BI, and analytic tools. All of these together comprise a broad and deep portfolio of data management tools.

IBM

For many user organizations, DI’s next generation is about tapping more functions outside basic DI ones, which often requires acquiring more tool types. In response to this demand, the IBM Software Group provides a comprehensive portfolio of integrated products and capabilities for a variety of use cases. The IBM InfoSphere Information Server platform has common metadata services and integrated user-centric tooling designed to promote enterprisewide collaboration between lines of business and IT. The platform also supports automated integration of best practices, reference architectures, and control for reducing risk on future projects. Integrated capabilities include DI, DQ, CDC, replication, data federation, and many other data management disciplines. Multiple approaches to MDM are supported through the IBM Master Data Management Server. Data modeling and process tools are available through IBM’s Rational Software product line. IBM has also taken a leadership position in the big data and analytics domain with the introduction of InfoSphere Streams and the Hadoop-based InfoSphere BigInsights.

7 The vendors and products mentioned here are representative, and the list is not intended to be comprehensive.

Informatica

TDWI survey data reveals that most users would prefer to acquire as many DI and related tools as possible from a single vendor—but only if the tools are fully integrated. Toward this end, Informatica has built up a broad portfolio encompassing DI, DQ, MDM, profiling, stewardship, data services, changed data capture, unstructured data processing, B2B data exchange, cloud data integration, information lifecycle management, CEP, and messaging. But Informatica has gone the extra mile by assuring a deep level of integration across development environments, expanded data analyst and steward capabilities, and interoperability among deployed servers. In recent years, Informatica has shown thought leadership on a number of next generation DI issues by helping define and make practical DI competency centers, data services, cloud-based DI, business self-service, and lean DI development methods.

SAP

Coordinating DI with other data management disciplines is a priority for next generation DI. SAP enables this goal by providing a comprehensive suite of integrated enterprise information management (EIM) tools. Furthermore, SAP has extended this priority by providing tight ties among its multiple portfolios of applications for data management, operational applications, and business intelligence. The EIM portfolio includes a mature, integrated solution for DI, DQ, text analytics, data profiling, and metadata management. There are also tools for several next generation hot spots such as CEP, text analytics, CDC, and MDM. The recent acquisition of Sybase adds Sybase IQ (a columnar analytic database) and Sybase Replication Server (for high-end replication and synchronization). To serve the business user who needs to actively support data management work, the new SAP BusinessObjects Information Steward pulls together a business user interface for profiling, metadata, data definitions, and DQ rules.

Syncsort

Scalability and speed are near the top of the priority list for next generation data integration solutions, and Syncsort has long served organizations that have a pressing need to accelerate their data integration environments. Well known for its high-speed mainframe sorting product (Syncsort MFX), Syncsort Incorporated offers a sophisticated portfolio of high-performance data integration solutions for open systems running on commodity hardware (Syncsort DMExpress) and data protection (Syncsort BEX). These can be deployed as standalone implementations, but DMExpress is often deployed to extend the data performance capabilities of existing DI environments—or independent software vendor (ISV) applications—to clear their performance and scalability bottlenecks. DMExpress is known for its efficiency, easy learning curve, flexible deployment options, and ability to integrate with other DI and data management tools to deliver extremely high performance at scale.

Talend

The Talend Unified Platform is in tune with a number of generational trends in data integration. Many users surveyed are interested in a unified platform, and Talend’s platform includes tools for DI, DQ, MDM, and data profiling. All four tools are built atop a shared platform with a unified metadata repository, only one metadata and administration server to deploy, and a common development GUI integrated into Eclipse. Talend has recently acquired application integration vendor Sopera, whose tool will soon be integrated into the platform. Another generational trend is to use a single tool or platform for analytic DI, operational DI, and other use cases; Talend has a reputation for serving multiple DI user constituencies. Finally, some users are looking for cost-effective data management tools, and Talend’s open source tools are available at a modest price.


Recommendations

Modernize your definition of data integration. DI has evolved so much in recent years that even data integration specialists find it hard to keep up with the changes. Avoid outmoded mindsets that banish data integration to a dark corner of data warehousing or database administration. You’ll never grasp the next generation of data integration if you can’t see its newly acquired diversity. Redefine DI for yourself and your peers.

Help your colleagues understand that DI is a family of techniques. It’s not just ETL or a DBA utility. The list of techniques is already long, and it will get longer.

Note that DI practices reach across analytic and operational boundaries. This affects everything, from staffing and funding to tool selection and solution designs to development standards and architecture. Plan the next generation accordingly.

Get out more often. DI has a new requirement for collaboration. You’re not doing the job fully unless you’re involved in stewardship and governance. Assume you should coordinate your work with that of other data management disciplines, especially data quality and master data management. Collaborate and coordinate to truly know and satisfy DI solution requirements.

Think of stewardship and governance as data management disciplines. They aren’t per se, but they might as well be, because these collaboration and control groups have tremendous influence on next generation data management.

Create a home for wayward DI specialists. As the number of DI specialists and the diversity of DI work increase, expect to reorganize the DI team. Most organizations continue to be successful with DI sourced from teams for data warehousing and database administration, but there’s a trend toward independent DI teams, sometimes organized as a competency center.

Admit that DI needs an architecture. If you don’t have one, get one. Architecture can enable or inhibit critical next generation functions such as real time, scalability, and services. Tools assume certain architectures, but you still have to design your own. Besides, no rule says you must have only one DI tool; many DI architectures have room for specialized tools that assist with scalability and speed.

Dig deeper into the DI tool you already have. Modern tools are amazingly feature-rich, and survey data shows that organizations are using only about 40% of tool functionality. More and deeper tool use is inevitable for upcoming generations.

Use a tool. Hand coding is feature-poor and unproductive by nature, and there’s no way you can hand code most leading-edge requirements for the next generation, such as event processing, text analytics, and advanced DQ functions (e.g., identity resolution).

Anticipate integrating new data types. Complex data (as in hierarchies and XML) and text (human language) are the most likely new data types for the average DI implementation.

Look into the newest DI functions—whether you need them or not. Stay educated so you can map available DI options to new requirements as they arrive.

Be open to new platform choices. It’s just a matter of time before DI tools are commonly running on private or public clouds and being licensed as open source or software-as-a-service.

Keep an eye on the DI techniques poised for the greatest growth. These are the options plotted toward the right side of Figure 13. You may not need all of them now, but you will someday.

Don’t forget the meat and potatoes. ETL has lost its sex appeal for some people, but it’s still the heart and soul of most DI solutions. Likewise, protect and grow the DI disciplines that have the strongest demand from your user base, such as data quality, metadata management, CDC, data sync, and MDM. All future generations will be a mix of old and new, legacy and leading edge.


Consider DI as infrastructure. If your organization truly needs to share lots of data broadly across business units, making DI a centrally owned resource that’s openly shared is more likely to achieve enterprise goals than a plague of departmentally owned DI solutions.

Expect DI to keep evolving. It’s just now exploring new frontiers such as extended collaboration and coordination, complex data, clouds, open source, services, and DI as infrastructure.

Assume there is a new generation of DI in your future. Either business changes will force you into one, or your current generation will age to the point that you need to bring it up to date. Most DI solutions are out of date or feature-poor in some respect, anyway. Leverage one generation after the next to fix the failings of prior ones or to reposition for tomorrow’s computing needs. The rampant changes in DI aren’t over. Revel in what’s to come!

Research Sponsors

DataFlux
www.dataflux.com

DataFlux is a software and services company that enables business agility and IT efficiency. A wholly owned subsidiary of SAS (sas.com), DataFlux provides data management technology that helps organizations reduce costs, optimize revenue, and mitigate risks as well as manage critical aspects of data. By providing solutions that meet the needs of business and IT users, DataFlux offers complete enterprise solutions, including enterprise data quality, data integration, data migration, data consolidation, master data management (MDM), and data governance. It also provides a full range of training and consulting services.

SAP
www.sap.com

As market leader in enterprise application software, SAP (NYSE: SAP) helps companies of all sizes and industries run better. From back office to boardroom, warehouse to storefront, desktop to mobile device—SAP empowers people and organizations to work together more efficiently and use business insight more effectively to stay ahead of the competition. SAP applications and services enable more than 109,000 customers to operate profitably, adapt continuously, and grow sustainably.

IBM
www.ibm.com/software/data/integration

IBM InfoSphere Information Server is a data integration platform that helps enterprises understand, cleanse, transform, and deliver trusted information to critical business initiatives. The platform provides everything needed to integrate heterogeneous information from across disparate systems, including capabilities to support information governance, data quality, data transformation, and data synchronization so that information is consistently defined, accurately represented, reliably transformed, and updated on an ongoing basis. Business and IT professionals use these capabilities to design, deploy, and monitor the core business rules, data integration, and data quality processes they need to deliver effective business analytics and to optimize their information architecture.

Syncsort
www.syncsort.com

Syncsort is a global software company that helps the world’s most successful organizations rethink the economics of data. Syncsort provides extreme data performance and rapid time to value through easy-to-use data integration and data protection solutions. With over 12,000 deployments, Syncsort has transformed decision making and delivered more profitable results to thousands of customers worldwide.

Informatica
www.informatica.com

Informatica is the world’s number one independent leader in data integration software. With Informatica, thousands of organizations around the world gain a competitive advantage in today’s global information economy with timely, relevant, and trustworthy data for their top business imperatives. With Informatica, enterprises gain a competitive advantage from all their information assets to grow revenues, increase profitability, further regulatory compliance, and foster customer loyalty. The Informatica Platform provides corporations with a comprehensive, unified, open, and economical approach to lower IT costs and gain competitive advantage from their information assets held in the traditional enterprise and in the Internet cloud.

Talend
www.talend.com

Talend is the recognized market leader in open source data management and application integration. Talend revolutionized the world of data integration when it released the first version of Talend Open Studio in 2006. Talend’s data management solution portfolio now includes operational data integration, ETL, data quality, and master data management. Through the acquisition of Sopera in 2010, Talend also became a key player in application integration. Unlike proprietary, closed solutions, which can only be afforded by the largest and wealthiest organizations, Talend makes middleware solutions available to organizations of all sizes, for all integration needs.

TDWI Research

TDWI Research provides research and advice for business intelligence and data warehousing professionals worldwide. TDWI Research focuses exclusively on BI/DW issues and teams up with industry thought leaders and practitioners to deliver both broad and deep understanding of the business and technical challenges surrounding the deployment and use of business intelligence and data warehousing solutions. TDWI Research offers in-depth research reports, commentary, and inquiry services as well as custom research, topical conferences, and strategic planning services to user and vendor organizations.

1201 Monster Road SW, Suite 250
Renton, WA 98057-2996
T 425.277.9126
F 425.687.2842
E [email protected]
tdwi.org