PERSPECTIVE

EFFECTIVE

Abstract Data governance is no more just another item that is good to talk about and nice to have, for global organizations. This PoV looks into why data governance is now on the core agenda of next-generation organizations, and how they can implement it in the most effective manner. Why is data governance Variety of data and increase in demanding around data privacy, personal important and sandboxing culture information protection, , data lineage, and historical data. challenging now? The next-generation utilize data from all kinds of social networks and This makes data governance top priority for Data has grown significantly blogospheres, machine-generated data, Chief Information Officers (CIOs). In fact, a Omniture / clickstream data, as well as survey by Gartner suggested that by 2016, Over time the desire and capability of customer data from credit management 20% of CIOs in regulated industries would organizations, to collect and process data lose their jobs for failing to implement and loyalty management. Alongside this, has increased manifold. Some of the facts the discipline of , organizations have now set up sandboxes, that came out in various analyst surveys successfully [3]. pilot environments, and adopted data and research suggest that: discovery tools and self-service tools. Such Data to insights to actions: Need for Structured data is growing by over 40% • data proliferation and the steep increase in accurate information every year data consumption applications demands Today’s managers use data for decisions Traditional content types, including stringent and effective data governance. • and actions. In our experience, many unstructured data, are growing by managers feel that the data they are [1] More stringent regulations and up to 80% every year accessing is inaccurate or incomplete, compliance Global data will grow to 40 zettabytes and hence, both confidence and adoption • The growing regulatory and compliance (ZB) by 2020 [2] of and analytics requirements are making effective data systems is low. Hence, data governance • 85% of this data growth is expected to governance a must have for industries initiatives need to generate good come from new types; with machine- like financial services and healthcare. The confidence in the data managers see and generated data being projected to regulatory requirements are now more use for decision-making. increase 15 times by 2020[2]

What are organizations consumption applications like data vision, identify data to be governed, doing to establish warehousing, business intelligence support the implementation of DG effective data and analytics systems, , policies, while actively participating metadata management, master data in change management and governance? management, data strategy, and data monitoring. IT teams are responsible Role of the chief data officer life cycle. for the management of data and measurement of data quality, as well (CDO) Collaborative Data Governance as collection and management of Gartner predicts that 50% of all Organizations metadata and master data. They also companies in regulated industries It is important for Chief Data Officers provide tools and technologies for will have a CDO by 2017 [4]. The to engage the right teams in the Data implementation of the related tools CDO will be responsible for Governance Organization, to align and technologies. enterprise data management, data governance and business goals. enterprise data governance, data Business teams are important to define

External Document © 2018 Infosys Limited External Document © 2018 Infosys Limited Typical Data Governance Org Structure

et stratey and direction for oerall data roducts teerin ommittee roe fundin and resource allocation from s BU rioritie adotion of enterrise data caailities Sr. Mgmt.

ltimate usiness utority for te data tey on Data oernance Business ouncil Data Owners uort data oernance and standards teams Deelo inestment recommendations for Exec anaement reie aroal

Ensure datarelated or is erformed accordin to oeratin rocedures Data Determine usae ermissions and olicies tandards rou Liaison eteen usiness and tecnoloy around data issues and uestions

Deliery esonsile for understandin data modelin tracin usae access Execution rou Data Custodians Key SMEs and imlementation Ensure data is aailale accessile ellerformin and recoerale

Figure 1: Role of Business Teams in Data Governance

Articulate tangible benefits from governance strategy includes policy and implementations that drive common data governance initiatives and rules for data quality management, goals and integrate outcomes across these metadata management, master data disciplines enjoy benefits like improved In order to keep all the stakeholders management, as well as data security. The reliability, traceability, and authenticity engaged and ensure continuous implementation of each of these individual of data. This will also provide a common investments in data governance initiatives, disciplines presents the intended benefits. communication platform across all the it is important to articulate the value However, data governance strategies stakeholders. generated by data governance initiatives via the right metrics. A dashboard with relevant metrics can be an easy tool to assess the impact of data governance initiatives, such as improvement in data quality parameters, improvement Reliability Traceability Authenticity

in dollar-value impact because of EndtoEnd data lineae onsistent consumtion of improved data quality, and the ability oernance olicies and traceaility enalin data from autoritatie standards and rocedures etter audit controls data assets tat are to rationalize enterprise metrics to data etter accountaility and aaility to resond to dataonersi imroes canes raidly trou etter understandin of definition standardization. Such tools comreensie imact play an important role in monitoring data ein reorted ot analysis consistent usae externally and internally Imroe uality oust uality controls and controlling adherence to the data Enale etter uality consistency and usaility across te data life cycle governance policies. information deliery and of master and reference analytics data across sements

Sum of parts is greater than the KEY BENEFITS whole – seamless data governance ccountaility usiness etter troner or Data ility omliance I ility Insits Organizations are now looking at

data governance holistically. The Data Figure 2: Benefits of Seamless Data Governance

External Document © 2018 Infosys Limited External Document © 2018 Infosys Limited user ratings are replacing Effective Data Governance at requirements, tools and business traditional data quality (DQ) metrics Infosys processes, easy decision-making process, Ratings by data consumers are replacing stakeholder interaction, and business Infosys holistic service offerings help next- traditional means of measuring data access. generation organizations to implement accuracy for non-traditional data such Once an enterprise establishes the above effective data governance. as sentiments and opinions. Most of the process of enabling informed business leading data platform providers such as The goal of data governance initiative decisions as the collective aim of data Cloudera, Hortonworks, and Informatica is to manage data for delivering timely, governance, it becomes significantly easier (BDE) offer ways of capturing the usage trustworthy, and relevant information to define the goals and priorities for any of data from data lakes and user ratings to make informed business decisions. initiative. Good data governance paves the about the usefulness and quality of the However, data governance is not only way for enterprise policies, people, and information. about data, but also about enabling clear processes to launch that initiative. ownership, business rules, operational

Enterprise Data Quality Master Data Metadata Data Data Strategy Data Management Management Management Protection & Diagnostics Governance

Enterrise data Data uality Leeraes rameors Deloy data Data oernance assessment Infosys dee accelerators security across ssessment frameor exerience on aced u y te enterrise oolit to includin monitorin master data dee it identify rocesses transformation manaement cometency in otimiation oraniation frameor imlementation industryleadin scalale areas in data and Infosys and toolit s and uniue tools and solution landscae Data aced u y metodoloy tecnoloies ensurin data redundancy oernance reuilt rotection and tecnoloy solutions and riacy as a and data solution accelerators serice life cycle ettin data manaement life cycle manaement olicies

Figure 3: Effective Data Governance @ Infosys

The focus now turns to the tools and technologies to implement and ensure compliance with data governance requirements in accordance to business access.

Infosys Data Governance Solution offers a robust framework and technology solutions that leverage partnerships with leading product vendors in this space.

External Document © 2018 Infosys Limited External Document © 2018 Infosys Limited Infosys Data Governance Framework with measurable KPIs efficiency and People- RACI Processes and Policies efficacy • Data Owners / Custodians, Stewards • Standardization Definition, and Reuse • IT Solution Technologists • Data Quality Mgmt This framework helps next-generation • Data Creators and Consumers • IT Systems CRUD – SDLC Management organizations to assess the data governance maturity and needs, define the strategy accordingly, and subsequently Measured By implement the same. Efficiency and Efficacy KPIs / SLAs / Metrics Data Governance Maturity Assessment Execution Model Tools and Technologies Understand data assets of an • • Scope and Mandate • Metadata / Data Quality organization • Centralized / Federated • Classification, Security Audits, and • Funding and Budgeting Masking • Understand data-spread across, on • Reference and Master Data premise, and Cloud • Understand , metadata management processes, data Figure 4: Infosys Data Governance Framework quality processes, and data security setup and processes

Predictive Proactive • Process feedback • Continuous loops are tuning as Defined improvement feedback opposed to fixing loops operating • KPIS identified & • DQM processes fully measured • Root cause analysis automated with feeding into feedback complete audit trail • Data dictionary and process rule dictionary • Top-down strategy documented and • Proactive approach to fully in tune with the maintained management of data bottom up application Reactive and rules dictionary of stewardship=> • N-Tiered stewardship complete cultural • Standards established • DQM process alignment across Chaotic established automating the enterprise • Master data plan measurement of • No standrads • Basic DQM process executed function performance • People, Process, and established Technology operating • Reactive • Supporting technology • All information silos in harmony approach • Master data plan framework deployed fully integrated with identified master data systems • No master • Root cause for issues data plan being tracked and • Strategy defined and communicated measured • No strategy 4

2

Figure 5: Data Governance Maturity Phases

External Document © 2018 Infosys Limited External Document © 2018 Infosys Limited Establish Goals and Objectives organization structure to maintain quality rules management, stewardship • Understand the key drivers of data accountability and data quality metrics reporting, governance, and the pain points of • Define the responsibilities of key roles master data management, lineage and data management such as application and data stewards traceability visualization, and data issue • Define the objectives to be achieved • Define the review frequency for investigations. Some of the key features of with data governance policies, standards, governance the tool are: • Define the business case of data structure, and roles Single point access for measurable DG metrics for all the DG stakeholders governance • Define mechanisms to enforce policies Data Policies, Procedures, and Standards and procedures • Overall data heath for executive sponsors • The policies and standards for • Communication plan to keep the data Dollar-value impact of data quality for data management, data availability, stewardship community informed • transparency, and security business leaders Data stewardship reports to enable the • Impacted data consumers – reports, • Define the metadata standards and effective management of data • policies for representation of data feeds, and analytics due to bad data stewardship lineage, data models, and data quality Infosys Data Governance Tool dictionaries • Correlation analysis of data quality This is a robust data governance tool • Define the policies for capturing issues across the application for data metadata, and Information Life cycle that provides single point access for data stewards, and many more Management • Define the policies for data transparency – audit, data security, authentication, and authorization Metric Control and Framework 18 • Define the control metrics to evaluate the effectiveness of data governance Integrity Uniqueness Format Validity • Define the process of capturing the metrics and how they will be reported • Establish the link between the metrics and the business goals of the organization 2 11 10 • Define the roadmap for the implementation of data governance, its specific objectives, and the timelines for achievement Completeness Precision Consistency Others Organization Structure and Rollout • Define the data governance Figure 6: Infosys Data Governance Solution

Complementary to industry-leading MM tools, with data quality metrics as Informatica and IBM have string metadata management, master from DQ tools to enable analytics for offerings around data profiling, cleansing, data management, and data quality various stakeholders. lineage, master data management, and management tools metadata management. These vendors Covers unhandled data quality issues • have already introduced, or are in the Infosys data governance tool has from support tickets. • process of enhancing the capabilities of connectors to popular third part Big data governance their technology stacks to address the metadata management (MM) and data Most of the leading vendors in the challenges of big data. quality (DQ) management tools. It information management space, such integrates lineage data available from Big data platform vendors such as

External Document © 2018 Infosys Limited External Document © 2018 Infosys Limited Cloudera, Hortonworks, and MapR are offering components that provide capabilities around metadata management (like LOOM and HCatalog), data lineage, and data security (like Sentry, Ranger, and Knox).

The Infosys data governance solution offers a holistic approach to manage big data governance with connectors to these leading tools, and establishing traditional data quality metrics, as well as user ratings. Leverage Strategic Partners

Infosys works with leading technology vendors in data governance, data quality, diagnostics like Informatica, IBM and These partnerships help us provide truly metadata management, master data Collibra to name a few – to bring the best integrated solutions and accelerate management, data security, and data of solutions and services to our customers. implementations.

Success Stories

Infosys helped one of the world’s Infosys helped a leading bank in For a leading national cable television leading financial services company South East USA to automate data operator in the US, Infosys defined that provides asset management, quality checks for auditing data and implemented a total data portfolio management, and mutual for FR FY14 compliance reporting quality solution across data from fund services, to realize their to the Federal. The solution was sales and marketing, commercials, enterprise data governance vision. then extended to other lines of and finance lines of business. The We achieved this by implementing business and BI applications. solution monitored the health of the the Infosys Data Governance Infosys managed to deliver data landing in the enterprise data Solution, data quality management compliance reporting in time, warehouse, and raised alarms in case tools, and metadata management with about 300+ rules configured, of threshold breaches. tools. The customer was able to and executed using Ab Initio BRE, Timely capture of key metrics variances visualize metadata, data quality, in a very short span of time. The between sources and corresponding and the impact of data quality in a customer benefited hugely from targets, and timely automated much better way; thereby improving the improvement in the regulatory communications across stakeholders operation efficiency and saving a lot compliance reporting, reinforced when thresholds were met, helped in of effort in data quality check. confidence, and credibility. raising the confidence in data.

References 1 Source Gartner Information Governance: 12 Things to Do in 2012 by Gartner 2 THE DIGITAL UNIVERSE IN 2020: Big Data, Bigger Digital Shadow s, and Biggest Growth in the Far East Sponsored by EMC Corporation. 3 Source Gartner http://www.gartner.com/newsroom/id/1898914 4 http://www.gartner.com/smarterwithgartner/understanding-the-chief-data-officer-role/

External Document © 2018 Infosys Limited External Document © 2018 Infosys Limited For more information, contact [email protected]

© 2018 Infosys Limited, Bengaluru, India. All Rights Reserved. Infosys believes the information in this document is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of other companies to the trademarks, product names and such other intellectual property rights mentioned in this document. Except as expressly permitted, neither this documentation nor any part of it may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the prior permission of Infosys Limited and/ or any named intellectual property rights holders under this document.

Infosys.com | NYSE: INFY Stay Connected