Data Lineage Handbook 2018


www.datamanagementinsight.com
From A-Team Insight
Data Lineage Handbook 2018

Contents
Introduction
Overview
Benefits
Regulation
Challenges
Best practice approaches
Technology solutions
Outlook

Editor: Sarah Underwood
Marketing Operations Manager: Leigh Hill
Production Manager: Sharon Wilbraham
Director of Event Operations: Jeri-Anne McKeon
Design: Victoria Wren
Events Content Manager: Lorna Van Zyl
Group Marketing Manager: Claire Snelling
Social Media Manager: Jamie Icenogle
Client Services Manager: Ron Wilbraham

A-Team Group
Chief Executive Officer: Angela Wilbraham
President & Chief Content Officer: Andrew P. Delaney
Editorial: Sarah Underwood
Sales Director: Jo Webb

Postal Address
Church Farmhouse, Old Salisbury Road, Stapleford, Salisbury, Wiltshire, SP3 4LN
+44-(0)20 8090 2055
www.a-teamgroup.com
www.a-teaminsight.com
www.datamanagementinsight.com


Introduction

The critical need for data lineage and how to get it right

Welcome to our handbook on data lineage, a subject that has shot up the agenda at financial institutions over the past few years as it is not only essential to regulatory compliance, but also a real benefit for the business. Among regulations requiring data lineage are the General Data Protection Regulation (GDPR), Markets in Financial Instruments Directive II (MiFID II) and the US Comprehensive Capital Analysis and Review (CCAR). Others will follow, including the Fundamental Review of the Trading Book (FRTB) regulation, which is scheduled to take effect in January 2022.

From a business perspective, data lineage has much to offer, including opportunities to get a better understanding of your data, know that the data you depend on is reliable, make smarter business decisions and explore new business propositions. Operational benefits include the ability to reduce costs and risk by eradicating redundant systems and data. By tracing how data flows through an organisation’s IT landscape from source to destination, and discovering who uses the data, when and what for, data lineage also allows data ownership to be handed over to individuals or lines of business that can best exploit the data for financial or operational gain.

With so much at stake, this handbook is designed to help you build and sustain effective and cost-efficient data lineage. It discusses the nature of lineage and why it is important, considers the challenges and opportunities of implementation, touches on regulations requiring lineage, and sets out some best practice approaches and technology solutions to help you get your lineage programme right.

We’ll continue to update you on the development of data lineage, its technologies and potential with blogs on our Insight website (formerly Data Management Review) – www.datamanagementinsight.com – which will give you broader content and easier access to our leading commentary on data management. You can find out more about our data management services and sign up for webinars, events and our weekly newsletter on the website.

In the meantime, I would like to thank the sponsors of this handbook for their valuable input, and wish you success as you roll out data lineage across your organisation.

Angela Wilbraham CEO A-Team Group

Overview

Introduction
Data lineage has become a critical concern and challenge for data managers working in capital markets. Initially implemented without specific regulatory requirements to track data across individual development projects, data lineage rose to prominence following the implementation of BCBS 239 in January 2016, a Basel Committee on Banking Supervision (BCBS) rule designed to avert another financial disaster on the scale of the crisis experienced in 2008.

BCBS 239 called for improved data aggregation and reporting across financial markets, as well as better accountability for data. This required enhancements to data governance and data lineage that have since been reinforced by other regulations and by financial institutions’ recognition of the importance of data governance and accurate, complete and sustainable data lineage.

What is data lineage?
Essentially, data lineage covers the lifecycle of data, from its origins, through what happens to the data when it is processed by different systems, and where it moves from and to over time. It can be applied to most types of data and systems, and is particularly valuable in complex environments.

Data lineage is usually represented visually to show the movement of data from source to destination, changes to the data and how it is transformed by processes or users as it moves from one system to another across an enterprise, and how it splits or converges after each move. Visualisation can demonstrate data lineage at different levels of granularity, perhaps at a low level providing a view of what systems data interacts with before it reaches its destination. As granularity increases, it is possible to view detail around particular data, such as its attributes and the quality of the data, at specific points in the data lineage.
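One common way to hold the picture described above is as a directed graph, with systems as nodes and each movement or transformation of data as an edge. The following is a minimal sketch in Python under that assumption; the system names, the transformation labels and the `Flow` and `upstream_sources` names are all hypothetical and are not taken from any particular lineage product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One movement of data between two systems (hypothetical model)."""
    source: str
    target: str
    transformation: str


# Illustrative flows only: two sources converge on a risk engine,
# whose output then splits into two destinations.
FLOWS = [
    Flow("trade_capture", "risk_engine", "enrich with reference data"),
    Flow("market_data_feed", "risk_engine", "normalise prices"),
    Flow("risk_engine", "risk_report", "aggregate exposures"),
    Flow("risk_engine", "finance_ledger", "post valuations"),
]


def upstream_sources(flows, destination):
    """Return every system that feeds `destination`, directly or indirectly."""
    parents = {}
    for flow in flows:
        parents.setdefault(flow.target, set()).add(flow.source)

    seen = set()
    stack = [destination]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:  # guard against revisiting shared ancestors
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(upstream_sources(FLOWS, "risk_report")))
```

Finer granularity, such as attribute-level lineage, could be modelled in the same way by using attributes rather than systems as the nodes.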


By building a picture of how data flows through an organisation and is transformed from source to destination, it is possible to create complete audit trails of data points, an aspect of lineage that has become increasingly necessary to meeting regulatory requirements (more of which later) and ensuring data integrity for the business.

The Ws of data lineage
• Where is the data
• What does it mean
• Where was it sourced
• Who is using it
• Why is it used
• When is it used
• Where does it flow
• What is its end point

While data lineage helps to track data from its origin to destination and identify different processes involved in the data flow and their dependencies, metadata management – the management of data that describes data – is often employed to capture enterprise data flow and present data lineage.

Metadata management collects and integrates consistent end-to-end metadata throughout an organisation, and creates a metadata repository that is accessible and can provide complete data lineage information to different user groups.

The scope of data lineage determines the volume of metadata required, with scope often determined by regulatory requirements, enterprise data management strategy, data impact and critical data elements of an organisation.

In many financial firms, users of data lineage include business managers and analysts, compliance professionals, enterprise strategy developers, data governance teams, data modellers, and IT management, development and support personnel. When considering a data lineage programme, avoid boiling the ocean and instead identify regulations requiring data lineage and business areas to which its application could be beneficial.

Why is data lineage important?
Data lineage is key to both regulatory compliance and business opportunity.

From a regulatory perspective, compliance requirements have been tightened up considerably since the 2008 financial crisis. Rather than merely producing reports for compliance, regulations – such as BCBS 239, General Data Protection Regulation (GDPR), Markets in Financial Instruments Directive II (MiFID II), the US Comprehensive Capital Analysis and Review (CCAR), and Fundamental Review of the Trading Book (FRTB) – now require firms to implement data lineage to demonstrate exactly how they came to the results published in reports. Using data lineage, they can not only prove the accuracy of results, but also take a proactive approach to identifying and fixing any gaps in reporting data.

Complete data lineage can also reduce the burden of regulation by providing operational transparency, and reducing risk and costs. Its metadata can help firms consolidate regulatory reporting by identifying data that is used across numerous regulations and move towards processing the data once for multiple purposes. Similarly, metadata for data lineage can simplify and reduce the cost of implementing new regulations.

From a business perspective, and at a base level, data lineage helps firms stay on the right side of regulators and avoid the penalties of non-compliance. Equally importantly, it helps firms gain an understanding of their data, ensure the data is reliable, and envision the impact on data from any changes to systems and processes. Armed with these capabilities, firms can gain business and operational benefits beyond compliance.

FIGI
Knowing the lineage or having accurately recorded history of changes to data is critical to the successful operations of a firm. The Financial Instrument Global Identifier, or FIGI, is an important component in the identification framework. The FIGI can help standardize the process and overcome some of the hurdles in tracing data lineage in a cost-effective manner. Go to OpenFIGI.com today to learn more.
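The idea of identifying data that is used across numerous regulations, so that it can be processed once for multiple purposes, can be illustrated with a small sketch. It assumes lineage metadata has already been reduced to a mapping from each regulation to the data elements it consumes. The element names and the mapping below are invented for illustration and do not describe what any regulation actually requires; `shared_elements` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical metadata: which data elements each report consumes.
# These sets are illustrative only, not real regulatory requirements.
REQUIRED_ELEMENTS = {
    "BCBS 239": {"counterparty_id", "exposure_amount", "instrument_id"},
    "MiFID II": {"counterparty_id", "instrument_id", "trade_timestamp"},
    "CCAR": {"exposure_amount", "instrument_id", "capital_ratio"},
}


def shared_elements(required):
    """Map each data element used by more than one regulation to its users."""
    users = defaultdict(set)
    for regulation, elements in required.items():
        for element in elements:
            users[element].add(regulation)
    # Keep only elements that could be produced once and reused.
    return {element: regs for element, regs in users.items() if len(regs) > 1}


if __name__ == "__main__":
    for element, regs in sorted(shared_elements(REQUIRED_ELEMENTS).items()):
        print(element, "->", sorted(regs))
```

The same mapping can be checked when a new regulation arrives, to see which of its required elements are already governed and documented.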


Extent of development
The development of data lineage and acknowledgement of its importance has tracked the increasing burden of regulation since the financial crisis. Few firms can claim complete and entirely successful systems, but most have developed a response to regulation that is beginning to morph into a business benefit.

Looking at how firms’ perception of data lineage has changed over time, the results of a poll run during an A-Team Group webinar in 2016 showed about half the respondents in the early stages of implementing a data lineage strategy to track the starting point of data through to the end of its lifecycle.

Fast forward to 2018, and another A-Team Group webinar, ‘How to Get Data Lineage Right’, and things have moved on, with a poll considering how much progress firms have made showing 13% of respondents with a complete data lineage solution, 19% close to having a complete solution, 38% starting to build, and 28% in the planning stage. The remainder have not yet addressed data lineage.

Looking at the extent of business and operational benefits organisations are gaining, or expect to gain, from data lineage, another poll noted 58% of respondents gaining or expecting to gain significant business benefits, 42% significant operational benefits, 27% some operational benefits, and 21% some business benefits. Just 3% suggested they would gain no benefits.

The webinar speakers noted that firms making most progress are Tier 1 banks and other large organisations subject to extensive regulation and with the resources to implement and maintain data lineage, although all financial firms that want to stay in the game are likely to need data lineage across some aspects of their business going forward.

Rise in automation
Early approaches to data lineage involved manual processes, vast numbers of large spreadsheets, complex and incomplete in-house developments, and custom solutions developed by expensive consultants that were difficult to defend from a regulatory standpoint.

In recent years, and in response to regulatory requirements, there has been an inexorable move towards automation delivered by vendors of data lineage solutions. Automation can improve speed to implementation, accuracy of lineage, flexibility to accommodate changes, and sustainability, but it is not foolproof and unlikely to provide a 100% solution, particularly in the majority of circumstances where firms face the complexity of retrofitting data lineage into existing organisations. Green field start-ups certainly have the advantage here.

Restrictions to automation include data that is difficult to access, perhaps data in black boxes, or in legacy systems, such as old mainframes whose inner workings no one left in the organisation understands.

One approach to increasing automation is to plan and document data lineage and assess what will happen if there is a change in a regulation or system, and then apply the change on the fly.

Market participants suggest that taking into account the problems of complexity and data access, firms should be able to automate about 70% of the data lineage process now and look forward to increasing that percentage as new technologies emerge.

Ultimate goals
Summing up the importance and development of data lineage results in ultimate goals including improved compliance, optimised data flows, better insight into data, smarter business decisions, reduced risk, lower operational costs, and ultimately, the ability to recognise new business opportunities.

ASG
ASG Data Intelligence is a metadata management platform for data-related compliance and data-driven business agility that enables users to find, understand, trust, and securely share their data, across legacy and relational data stores and the data lake. ASG Data Intelligence provides best in class automation and lineage and the broadest support for data sources and applications available. www.asg.com/en/Solutions/Enterprise-Data-Intelligence.aspx

Benefits

Overview
Data lineage offers both business and operational benefits, but it must be approached as a long-term service rather than a point solution if it is to provide ongoing value. It also requires data ownership and accountability, and can be expensive to implement, but on the whole, the costs and requirements can be countered by the many benefits it delivers.

Improved business decisions: By supporting a better understanding of an organisation’s data and providing access to trusted data quickly and efficiently, data lineage allows the business to make smarter, faster and better informed decisions. Decisions can be made more proactively where there is data lineage and defended on the basis of being able to determine the exact data underlying any decision.

Identifying business opportunities: Using data lineage to gain a better understanding of data and to visualise data and processes, organisations can identify new business opportunities, such as the potential to create new products by combining certain data and processes, or the possibility of finding an external partner to upscale and commercialise specific datasets.

Improved data reliability: By tracking data from its origin to its destination, data lineage can identify any gaps in data that need to be filled, which is often done manually, and reduce the data remediation cycle. It can also combine data sources and eliminate data duplication. First phase implementation begins the journey towards improved data reliability, which will continue as data lineage matures and data becomes trusted throughout an organisation.

Understanding data: It may sound simple, but understanding data that is used and stored across an organisation can be very difficult when it includes masses of internal data, several sources of external data, data silos and data in different formats. By applying data lineage, at either a business or technical level, it is possible to gain a greater understanding of the data a company holds, where it is, what it is used for, its value and potential. With a good understanding of data, it is also possible to assign responsibility for data ownership to individuals, departments or lines of business within the organisation.

Data discovery: Data lineage provides the ability to decide what data is important and find the right data quickly. This is crucial to business decisions and can help firms remain competitive and identify new business opportunities.

Data quality and accuracy: Data quality is often an objective of data lineage programmes and can be a natural benefit of tracking data, eliminating duplicate or redundant data, and improving data reliability. Alternatively, data quality can be built into data lineage by monitoring, checking and improving the accuracy and consistency of specific datasets that must be of high quality.

The audit element of data lineage helps data managers trace errors back to their source, fix any problems and improve data accuracy on an ongoing basis.

Data governance: Data lineage is an important component of data governance and when implemented successfully can support the role of governance in managing the availability, quality and security of data across an organisation.

Easier regulatory compliance: By creating complete and trusted data with transparency and a full audit trail, data lineage eases the burden of gathering the right data for regulatory reporting and helps firms defend decisions when challenged by regulators.

Data lineage can also support data harmonisation across multiple regulations by discovering common data that can be generated once for use by several regulations. On this basis and by reducing duplication of effort, when a new regulation is introduced, it is also possible to establish what part of the required data is already being governed and documented by another regulation.

Improved analytics: More reliable and better quality data that is understood and easily accessible supports improved analytics and the knock-on effect of better business decisions.

Increased efficiency: By eliminating duplicated data and redundant data and systems, and providing a clear view of data and how it changes and moves around an organisation, data lineage can provide increased operational efficiency that can support both cost reduction and business needs for fast access to trusted data.

Impact assessment: Data lineage can be used to study how changes in data, systems or processes can affect specific products or financial reports downstream.

Reduced risk: Data lineage also plays into data safeguarding and risk. By collecting large amounts of data, organisations expose themselves to regulatory and business liabilities around data breaches and the disclosure of sensitive data. Data lineage can reduce the amount of data held by a company, improve data management and knowledge, and provide an audit trail that helps firms avoid the liabilities associated with data breaches, disclosure, and not knowing where data is at any given time.

Visualisation of data lineage also allows organisations to identify key risks in the data cycle and check if proper controls are in place or need to be improved.

Cost reduction: While often expensive to implement, data lineage offers a number of ways to reduce costs. The need to review data across an organisation as a first step towards successful data lineage allows firms to identify and delete any duplicated data, focus on data silos and decide their fate, and discover unused data that can be eradicated and redundant systems that can be switched off. This will optimise a firm’s data footprint and reduce the challenges and costs of data management.

Understanding data provides an opportunity to review licensed data, which may be licensed more than once by any one large organisation or not used to any great extent, avoid the penalties of using unlicensed data, and renew licenses with data vendors to make external data provision more efficient and cost effective.

Data lineage and data discovery can also support new projects, and even transformation programmes, at lower cost as some required data and processes can be identified and reused, avoiding the need to create all new data and the delays and costs of doing so. In the same vein, data lineage can support and reduce the costs of data modernisation and migration programmes.

Business intelligence and change management: The ability of data lineage to expose an organisation’s data lends itself well to business intelligence and change management. What-if analyses can be made using existing data and processes, starter projects can be undertaken to predict outcomes of change, and favourable projects can be developed quickly using existing and new resources. Rather than calling on IT to build new systems from scratch, the business can discover how new commercial concepts could work before investing in systems.

Data ownership: With a hodgepodge of data and numerous data silos, firms find it difficult, if not impossible, to assign data ownership and accountability to individuals, departments or lines of business and make it stick. Data lineage solves the problem by clarifying where data is, who uses it and what for, and allows data ownership to be handed over to the relevant individual, department or line of business that can best exploit the data for financial or operational gain.

AxiomSL
AxiomSL’s platform and dynamic data-lineage capabilities provide organizations with a foundation of data transparency, integrity and control where each data point’s path is visible and documented from origination onward. Empowered to trace, understand and trust their data, firms satisfy regulators’ data-governance and reporting requirements and withstand audits confidently, while unlocking insight and value from their data to drive competitive advantage. www.axiomsl.com

Solidatus
NEXT-GENERATION DATA LINEAGE
Solidatus is a specialised, powerful and modern data lineage tool. It’s a cloud-ready, flexible web-based application which allows organisations to rapidly document and visualise how data flows through their systems landscape. Whether used to demonstrate regulatory lineage, improve governance, assist with transformational change projects or reduce inefficiencies in data handling, Solidatus is uniquely engineered to build end-to-end models efficiently and effectively. solidatus.com/handbook
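The impact assessment benefit described in this section amounts to asking which downstream consumers are reachable from a system that is about to change. A minimal sketch of that question as a breadth-first search follows. It assumes lineage is available as a simple mapping from each system to the systems it feeds; the system names and the `impacted_by` function are hypothetical.

```python
from collections import deque

# Hypothetical lineage: each system maps to the systems it feeds directly.
FEEDS = {
    "pricing_service": ["risk_engine", "valuation_store"],
    "risk_engine": ["risk_report"],
    "valuation_store": ["finance_ledger"],
    "finance_ledger": ["regulatory_return"],
    "crm": ["marketing_dashboard"],
}


def impacted_by(feeds, changed):
    """Map every downstream consumer of `changed` to its distance in hops."""
    distance = {}
    queue = deque([(changed, 0)])
    while queue:
        node, hops = queue.popleft()
        for consumer in feeds.get(node, ()):
            if consumer not in distance:  # first visit is the shortest path
                distance[consumer] = hops + 1
                queue.append((consumer, hops + 1))
    return distance


if __name__ == "__main__":
    for system, hops in impacted_by(FEEDS, "pricing_service").items():
        print(f"{system}: {hops} hop(s) downstream")
```

The hop count gives a rough ordering for review, with direct consumers first, while systems that do not appear in the result, such as the marketing dashboard here, are unaffected by the change.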

www.datamanagementinsight.com 15 From A-Team Insight Data Lineage Handbook 2018 Regulation

Overview It is based on 14 principles that The regulatory requirement for are aimed at underpinning data lineage kicked in with BCBS accurate risk aggregation 239 in 2016 and has since been and reporting in normal extended to other regulations times and times of crisis, and that oblige firms to provide are split into four sets: data transparency and a data audit governance and IT architecture trail. These include General Data requirements necessary to risk Protection Regulation (GDPR), data aggregation and reporting; Markets in Financial Instruments effective risk data aggregation; Directive II (MiFID II), the US improved risk reporting; and Comprehensive Capital Analysis regulatory supervision. and Review (CCAR), and most likely Fundamental Review The regulation is a supplement of the Trading Book (FRTB) of the capital adequacy regulation scheduled to take requirements of Basel III, which effect in January 2022. consider whether firms have enough resources to monitor BCBS 239 and cover risk exposure. BCBS 239 is a regulation issued Like Basel III, BCBS 239 has by the Basel Committee on a significant effect on data Banking Supervision (BCBS) and management, requiring firms to designed to improve risk data improve risk data aggregation aggregation and reporting across capabilities according to the financial markets. It came into principles and present accurate force on January 1, 2016. risk data for reporting. Risk data must be captured across a bank, which means consistent AxiomSL data taxonomies need to be With solution coverage encompassing 70 regulators across 50 jurisdictions established, and the data and 4,000 regulatory reports, AxiomSL enables firms to meet regulatory and risk reporting requirements for Financial Regulations, Liquidity, Capital/ needs to be stored in a way that Credit, Trade/Transactions, and Tax. 
Its ‘Platform-for-Change’ empowers makes it accessible and easy to firms to manage risk/regulatory data transparently and strategically, understand. integrating risk, finance and operational data environments to align with data-governance initiatives, e.g., Basel-III/IV, BCBS-239, IFRS-9/CECL, MiFID, GSD. Data lineage: The requirements www.axiomsl.com of BCBS 239, particularly around risk data aggregation, data accuracy, and risk management

16 www.datamanagementinsight.com From A-Team Insight Regulation

4 Contracts & licenses in place 4 Application data itemised 4 Data and content flow captured 4 Internal and external reporting 4 Regulatory mandates covered ∑ = Data Compliance GOOD COMPLIANCE IS GOOD BUSINESS.

Provide your enterprise with the tools to make the right decisions and protect your business from risk. Data compliance and licensing solutions from 3d innovations’ range from online executive training to data lineage and traceability surveillance. Everything you require for best practice data management and adherence to both proprietary and regulatory licensing rules.

l Executive training workshops on IP restrictions and contractual data compliance. l Application inventory and data recording for effective license reporting. l Graphical surveillance of application connectivity and data content flow.

data compliance & licensing solutions

3di-ltd.com/Consultancy Governance | Licensing | Cost Management

3DIAdvertsX2.indd 1 31/08/2018 09:26 Data Lineage Handbook 2018

reports used to make decisions personal data, ensuring data about risk, highlight the portability, notifying authorities importance of applying and individuals of data breaches, metadata to strengthen risk data and giving individuals the right to aggregation and data lineage to have their data deleted provided ensure data can be tracked and there are no legitimate grounds risk reports defended. for keeping it.

GDPR Financial institutions processing GDPR is an EU regulation large volumes of sensitive replacing Data Protection personal data may need to Directive 95/46/EC that was appoint a data protection established in 1995. The officer and must carry out regulation came into force on privacy impact assessments May 25, 2018 and is designed to identify risks, minimise to harmonise data privacy laws potential data breaches and across Europe and protect EU implement data protection citizens’ data privacy. While strategy. Those that do this well GDPR sustains the key principles should benefit from improved of data privacy set out in customer communication and the 1995 directive, many are a higher level of trust in the extended. market. For those that breach compliance, the stakes are high The challenges of GDPR include – reputational damage and fines gaining explicit consent to of up to 4% of annual turnover process personal data, giving or €20 million. data subjects access to their Data lineage: Firms subject Bloomberg to GDPR are dependent on The need for an LEI continues to expand as regulators endorse its use to data lineage to track data bring transparency and efficiency to the capital markets. Bloomberg, as an accredited Local Operating Unit (LOU), is proud to be a part of this global across their organisation and effort to adopt LEIs by offering services to issue new LEIs and maintain, provide transparency about renew, or even transfer existing LEIs. where it is and how it used. Go to lei.bloomberg.com to register today. Data lineage also provides firms with the ability to demonstrate compliance with the regulation. From a data subject’s perspective, data lineage

18 www.datamanagementinsight.com From A-Team Insight Data Lineage Handbook 2018

supports access to personal data and the execution of other rights such as the right to be forgotten.

MiFID II
MiFID II is a principles-based directive issued by the EU. It is much broader than MiFID, which was introduced in 2007. MiFID II went live on January 3, 2018, and aims to improve the competitiveness of European markets by creating a single market for investment services and ensuring protection for investors in financial instruments.

The most sweeping changes made by the directive include the extension of MiFID requirements covering equity trades on regulated markets to non-equity instruments traded on any trading venue, greater demand for pre- and post-trade transparency, and the inclusion of systematic internalisers and other investment firms that trade financial instruments over the counter in the expanded pre- and post-trade transparency regime.

The demand for reference and market data for both pre- and post-trade transparency, including trade reporting and transaction reporting, is unprecedented, as is the regulatory focus on investor protection. Ensuing data management challenges include sourcing required data, standardisation of specific security and entity identifiers, reporting in near real-time, and managing requirements such as attaching traders’ names to trades and uploading reference and market data to MiFID II mechanisms including Approved Publication Arrangements (APAs), Approved Reporting Mechanisms (ARMs) and the European Securities and Markets Authority’s (ESMA) Financial Instruments Reference Data System (FIRDS).

Data lineage: MiFID II operations can benefit from data lineage in a number of ways. Data lineage can be used to identify any gaps in trade reporting data and identify similarities and differences across reporting requirements. It can also be used to map MiFID II reporting from source systems to APAs. Visualisation of data lineage provides a view of a firm’s MiFID II data landscape including changes, why they were made and by whom, and the ability to run impact analysis ahead of changes to understand the implications of change for data consumers.

CCAR
CCAR is an annual exercise carried out by the Federal Reserve to assess whether the largest bank holding companies (BHCs) operating in the US have sufficient capital to continue operations throughout times of economic and financial stress, and have robust, forward-looking capital planning processes that account for their unique risks.

The Federal Reserve issued the CCAR capital plan rule in November 2011. The rule specifies four mandatory requirements that span both quantitative and qualitative factors: the first is an assessment of expected uses and sources of capital over a nine-quarter planning horizon; the second calls for a detailed description of a BHC’s process for assessing capital adequacy; the third covers a BHC’s capital policy; and the fourth requires a BHC to notify the regulator of any changes to its business plan that are likely to have a material impact on capital adequacy or liquidity.

From a data management perspective, CCAR requires data sourcing, analytics, risk data management and risk data aggregation for stress tests designed to assess the capital adequacy of BHCs and for regulatory reporting purposes. Data must be accessed, validated and reconciled across a BHC, often requiring data to be managed across siloed systems to provide consistent and accurate data. Financial, risk and reference data must then be integrated to fulfil the reporting requirement.

Data lineage: CCAR requires attribute level data lineage to track data from source to destination and ensure the validity and veracity of capital plans. Data lineage can also be used to identify any data gaps and highlight any data quality issues.

FRTB
The Basel Committee on Banking Supervision introduced FRTB in a May 2012 consultation paper that set out a revised market risk framework and proposals to improve trading book capital requirements. The final FRTB paper was released on January 15, 2016, replacing existing capital requirements for market risk. The regulation is due to take effect in January 2022.

FRTB is a response to the 2008 financial crisis, which exposed fundamental weaknesses in the overall design of the trading book regime, and focuses on a revised internal model approach to market risk and capital requirements, a revised standardised approach, a shift from value at risk to an expected shortfall measure of risk, incorporation of the risk of market illiquidity, and reduced scope for arbitrage between banking and trading books.

The data management challenges of the regulation are significant and include data sourcing and quality, deciding whether to use the internal or standardised model approach to calculate capital to cover risk, and gathering long-term historical data as well as real price observations for executed trades or committed quotes to meet requirements around non-modellable risk factors (NMRFs) and the linked risk factor eligibility test.

Data lineage: To satisfy the demands of FRTB, firms may need to implement data lineage to track historical data and trade data aggregation required for the risk factor eligibility test of NMRFs, essentially the provision of at least 24 real price observations of the value of the risk factor over the previous 12 months, with no more than a one-month gap between any two observations.

To find out more about the latest regulations that are likely to have an impact on data and data management at your organisation, download your copy of A-Team Group’s Regulatory Data Handbook: http://bit.ly/RegDataHB5
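The FRTB risk factor eligibility test described above lends itself to a short worked example. The following is a minimal sketch rather than a regulatory implementation: the function name, the 365-day reading of "previous 12 months", the 31-day reading of "one-month gap" and the treatment of same-day observations as a single observation are all assumptions made for illustration. In practice, data lineage would be what shows where each observation came from.

```python
from datetime import date, timedelta

# Assumptions for this sketch (not taken from the regulation text):
WINDOW_DAYS = 365        # "previous 12 months" approximated as 365 days
MAX_GAP_DAYS = 31        # "one-month gap" approximated as 31 days
MIN_OBSERVATIONS = 24    # minimum count stated in the handbook text


def passes_eligibility_test(observation_dates, as_of):
    """Return True if the observations meet the count and gap conditions."""
    window_start = as_of - timedelta(days=WINDOW_DAYS)
    # Keep distinct observation dates inside the window, in date order.
    # Several observations on one day count once (an assumption).
    in_window = sorted({d for d in observation_dates if window_start < d <= as_of})
    if len(in_window) < MIN_OBSERVATIONS:
        return False
    # Check the gap between each pair of consecutive observations.
    gaps = (later - earlier for earlier, later in zip(in_window, in_window[1:]))
    return all(gap.days <= MAX_GAP_DAYS for gap in gaps)


as_of = date(2018, 9, 30)
# 24 observations, one every 15 days: enough observations, no long gaps.
regular = [as_of - timedelta(days=15 * i) for i in range(24)]
# 32 observations, one every 10 days, but with a 50-day hole in the series.
with_hole = [as_of - timedelta(days=10 * i) for i in range(36) if i not in (5, 6, 7, 8)]

print(passes_eligibility_test(regular, as_of))
print(passes_eligibility_test(with_hole, as_of))
```

The sketch only checks gaps between consecutive observations, as the text describes; it does not test the interval between the start of the window and the first observation.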

Challenges

Overview
Like most data management programmes, data lineage has inherent challenges, from winning management buy-in for initial projects to understanding and tracking huge volumes of data with complex links in a big data environment. The challenges tend to fall into three buckets – operations, technology and data management – and while many are ongoing pain points for data managers across all sorts of programmes, some are specific to data lineage.

Operational challenges
The operational challenges of data lineage start with winning management buy-in and funding for a solution that can be expensive, requires significant human input, and offers only a modicum of advantage in early implementation. Poor understanding of data lineage and its potential benefits by senior executives can stymie approval, while the prospect of lengthy and complex projects could be enough to bring the shutters down.

The best solution here is to educate management and start small. Decide whether a pilot project is going to provide insight into business processes or achieve an element of regulatory compliance, prioritise the most important and relevant data, scope the project carefully, and identify stakeholders that should be involved.

In the first instance, it may be useful to assess manually where required data comes from and create baseline data lineage before considering automation. It is also important to make sure the pilot project is scalable for other data sources or areas of the organisation before making a business case for data lineage.

Proving the concept of data lineage and demonstrating quick wins to the business should, at least in some cases, be enough to start the journey towards a larger data lineage programme spanning part or all of the organisation.

While a good start to any data management project means it should gain momentum, the success of data lineage is particularly dependent on people and their approaches. It takes a range of data and metadata management skills to develop and maintain data lineage, but if data producers and consumers don’t see its value, they are unlikely to fall in with the cause and follow carefully created data lineage processes. These producers and consumers need to look beyond their own environment and understand how the organisation can benefit from data lineage.

That is not to say any data lineage will do. As data lineage can be expensive to build and manage, it is important to understand what level of data lineage users require. Depending on resources, it may or may not be possible to match extensive requirements, so the initial aim must be to build a data lineage solution that delivers value and is right-sized for consumers, with later iterations providing more detail around data and data flows.

Data ownership and accountability is an ongoing challenge that many organisations with huge amounts of data, myriad systems and applications, and little appetite among employees to take responsibility for data have failed to resolve. Data lineage isn’t a silver bullet, but by tracking data and showing how it is used and by whom, it does add clarity to the data chaos and allows responsibility for specific areas of data to be allocated to their rightful owners.

AxiomSL
AxiomSL’s data integrity and control platform and dynamic data-lineage capabilities empower firms to map complex data flows, trace data lineage, separate data attributes from calculation processes, and manage structured/unstructured data in a controlled production environment. Its data-driven processes enable executives to strengthen internal data management and controls, and by delivering fit-for-use trustworthy data with dynamic lineage, firms easily withstand audits.
www.axiomsl.com

Solidatus
Solidatus is a specialised, powerful and modern data lineage tool. It’s a cloud-ready, flexible web-based application which allows organisations to rapidly discover and visualise how data flows through their architecture. Whether used to demonstrate regulatory lineage, improve governance, assist with transformational change projects or reduce inefficiencies in data handling, Solidatus is uniquely engineered to build end-to-end models more efficiently and effectively.
solidatus.com/handbook

KNOW YOUR RIGHTS.

Data is widely recognized as critical to the functioning of a financial institution. Yet financial markets firms don’t own much of the external data they consume, creating issues around control and commercial / operational usage.

Whether you are distributing exchange prices and indices information across your organisation, downloading ratings data from your Bloomberg terminals, or storing identifiers in your Enterprise Data Hub – there are intellectual property constraints as to what you can do with that data.

Data lineage will tell you what you have, where it’s going and where it came from. Let 3d innovations help you legally resolve what you can do with that data.

• Exchange data policies
• Third party rights management
• Licensing and costs

data compliance & licensing solutions

3di-ltd.com/Product Governance | Licensing | Cost Management

Technology challenges
The technology challenges of data lineage reflect growing numbers of regulations with overlapping requirements, and smarter auditors and regulators asking for responses to questions on demand. Advances in technology add to the challenge, with cloud-based applications and services, and big data systems – not to mention emerging artificial intelligence and natural language processing technologies – creating a complex data infrastructure. Data can be managed in new and interesting ways, but keeping track of it and ensuring it can be trusted is increasingly difficult.

At the heart of addressing these challenges is the selection of a solution, or solutions, to support an organisation’s data lineage. Questions to consider include: how much lineage is already in place, to what extent will manual lineage be necessary, how will lineage be documented, how will it need to be scaled, how will impact assessment be managed, what is the long-term aim for automation, which areas of the organisation will be covered and at what level in terms of technical and business lineage, how will data lineage be sustained, what skills are required, and how much will it cost?

There are no catch-all answers to these questions and few organisations that will find answers to all the questions in one solution, leading most firms to implement a combination of in-house development and deployment of vendor solutions.

A poll question posed during a recent A-Team Group webinar resulted in 42% of respondents saying they are using in-house and vendor solutions, 29% building in house, 26% using vendor enterprise solutions, 26% using consultancy support, and 10% using only a vendor managed solution. Speakers on the webinar suggested that firms will move away from early in-house implementations made when few vendor solutions were available, and migrate to vendor solutions, including big data and cloud solutions, as data lineage becomes a commodity.

Whatever the selected solution, however, it will not provide value in isolation. It is important to consider how data lineage and its metadata will integrate with the rest of an organisation’s business metadata as this will provide rich data and the ability to slice and dice it. Lineage also needs to run alongside an organisation’s systems development lifecycle plan to ensure it is maintained as technologies are changed.

And, of course, scalable and flexible technology is essential, not only to master growing volumes of existing data types, but also to embrace additional datasets, alternative data, and data resulting from mergers and acquisitions.

AxiomSL
AxiomSL’s ‘Platform-for-Change’ integrates non-disruptively into existing technology infrastructures. Ingesting data seamlessly from diverse sources without transformation, it enables automated enrichment, pre-processing aggregation, validation and calculation processes for its broad array of regulatory/risk solutions, delivering intelligent insights with dynamic lineage visibility (no black-box). Utilizing AxiomSL’s data integrity and control platform significantly strengthens the firm’s data-management/governance capabilities and regulatory transparency.
www.axiomsl.com

3d Innovations
Educate your business, change behaviour and manage market data risk. 3d Innovations helps firms with practical, interactive and customisable solutions that help change behaviours and support a culture of data integrity and compliance. The company surveys regulators and exchanges globally to provide a library of policies, licensing interpretation and regulatory overviews enabling and empowering you to implement and manage data-centric solutions with confidence.
3di-ltd.com/Consultancy

The only global open data standard enabling effective data management.
Financial instrument global identifier*

Only through an open, shared framework will the financial industry be able to finally address core data quality and lineage issues across legacy codes and standards.

90% of firms globally using more than one existing standard to identify instruments.
48% of firms pointing to incorrect or incomplete instrument identification as cause for a growing percentage of operational errors.
86% of firms agree that an open, shared framework that can establish relationships between different existing legacy instrument identifiers is needed.

Streamline your trading workflow and reduce operational risk. For inquiries regarding FIGI integration: [email protected] OpenFIGI.com

Source: TABB Group Research, April 2018
* FIGI is provided under the MIT Open Source license, with Governance by the Financial Domain Task

Data management challenges
Implementing data lineage is a complex data management task that could include huge volumes of data, the creation of metadata, multiple legacy systems, mountains of spreadsheets, disparate systems, siloed data, uncharted data flows and mixed data formats. The impact of regulatory changes must also be assessed and fixes made, data quality considered, and manual data processes brought into the data lineage framework.

Big data, data lakes, swamps and repositories also raise issues around how data is stored, tagged and linked to other data and systems, while outsourced data and automated data feeds need to be mined and brought into the data lineage scheme.

Is all this data valuable, is it duplicated, or is it redundant? Is it internal data, or is it external data that is, or should be, licensed, and what are the tools for the task?

Whatever the data management challenges, an early inventory of an organisation’s data can start the process of identifying which data is important to the business and should be part of a data lineage programme, which data can be left as is, and which data can be scrapped. Data lineage in legacy systems and black boxes will be difficult, if not impossible, to capture, as will data that changes continually but not consistently.

Considering the scope and scale of these data management challenges, particularly in large organisations, data lineage utopia is not in sight, but there are tools and solutions that can break the backbone of implementation and provide a sturdy platform on which to build and maintain data lineage that can provide useful and timely information to the business.

AxiomSL
With AxiomSL’s ‘Platform-for-Change’, firms conquer the three ‘V’s of data-management challenges: velocity, volume, veracity. They can establish Service-Level-Expectations (SLEs) that drive data quality and integrity and well-controlled business processes that deliver accountability and sustainability. High-quality data, fully documented from origin onward by AxiomSL’s dynamic data-lineage capability, gives leaders confidence in their regulatory data and reporting and yields actionable insights enterprise-wide.
www.axiomsl.com
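The early data inventory described under data management challenges sorts data into three outcomes: include it in the lineage programme, leave it as is, or scrap it. A minimal sketch of that triage follows. The `Dataset` fields and the decision rules are hypothetical and chosen only to illustrate the idea; a real inventory would apply criteria agreed with data owners.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    name: str
    business_critical: bool              # hypothetical flag set by data owners
    consumers: int                       # number of known downstream users
    duplicate_of: Optional[str] = None   # name of the dataset it copies, if any


def triage(dataset: Dataset) -> str:
    """Sort a dataset into one of the three outcomes named in the text."""
    # Illustrative rule: duplicated or unused data is a candidate for removal.
    if dataset.duplicate_of is not None or dataset.consumers == 0:
        return "scrap"
    # Illustrative rule: data the business depends on joins the programme.
    if dataset.business_critical:
        return "include in lineage"
    return "leave as is"


inventory = [
    Dataset("trade_positions", business_critical=True, consumers=12),
    Dataset("trade_positions_copy", business_critical=True, consumers=1,
            duplicate_of="trade_positions"),
    Dataset("canteen_menu", business_critical=False, consumers=40),
    Dataset("legacy_extract_2009", business_critical=False, consumers=0),
]

for item in inventory:
    print(f"{item.name}: {triage(item)}")
```

Anything marked "scrap" here is only a candidate for review, since the text notes that some external data may need to be licensed rather than removed.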

Best practice approaches

Overview
Best practice approaches to data lineage have previously been driven by regulatory compliance requirements, but they are evolving as more firms adopt the discipline for not only regulatory but also business reasons, as technology tools support increasing automation, and as successful programmes result in significant benefits. Not all best practice approaches will fit all financial firms, but here are some guidelines that should be considered in any data lineage project, whether or not they are implemented.

Planning and preparation: Early planning and preparation are key to data lineage to ensure projects stay within scope and budget, and are delivered to an expected specification. Consider the scope of the effort, whether the drivers are about regulation or business, or both, and whether the data lineage should cover one data flow, many flows or the organisation’s entire IT landscape. It is also important to determine whether a project will cover lineage for data elements or information assets, although these can be combined. Early planning must also detail how lineage metadata will be included in the organisation’s metadata repository and integrated with business metadata to deliver maximum benefits.

Structured and unstructured data: A decision must also be taken on whether to include structured and unstructured data in a lineage project. While structured data lineage is relatively straightforward, perhaps using a handful of attributes that uniquely identify a specific data element, unstructured data is not so linear and may require critical data elements to be identified and included in lineage.

Start small: Start with a relatively small pilot project with a well-defined scope that will have a relatively large impact on the organisation. This project may take a manual approach as a starter to demonstrating the potential of data lineage automation. It also needs to show scalability, efficiency and true business value to win management buy-in and funding.

Understand required skills: Assess whether the organisation has sufficient resources to implement and maintain a data lineage project or whether it must look for help externally. Define an internal management team and consider how best data ownership could pan out. A programme is likely to need a champion and will need business involvement, as well as metadata experts, data custodians, stewards and operators.

Manual data lineage: Consider the scope of the project, its objectives, and the elements in the dataflows that will be covered, such as data sources, systems that aggregate or calculate data, and reporting tools. Make an inventory of all data, circle data specific to the area to which lineage is being applied and discover how the data originates and moves between people, processes, services and products.

The level of data granularity that will be included in lineage must be decided and it is then possible to work backwards or forwards to map data flows from source to end point.

Ensure the data lineage is documented accurately and completely, and that the documentation can be updated easily when changes are made to data, systems and flows. This will sustain lineage, but can be difficult in a large or dynamic environment, suggesting development of at least some automated data lineage.

Data Compliance – The problem...
Contemporary data licensing practices of exchanges and other information providers revolve around three key areas:
1. Data usage (breadth & depth of securities and field attributes).
2. End users with line of sight/use of data.
3. Applications and functions consuming/processing market and reference data.
3di’s solutions effectively manage the licensing in the context of application and end-user usage profiles.
3di-ltd.com/Consultancy

Automated data lineage: Early planning and preparation for automated data lineage follows similar steps to manual lineage. With scope, flows and data granularity for the data lineage established, an automation tool can be implemented to gather required data or lineage metadata and demonstrate data flows. These tools should be able to react quickly to any changes to data or systems, verify the origins of data, and help firms identify any missing data, data quality issues that need to be addressed, and any other weaknesses in the lineage.

As mentioned previously, automation is unlikely to provide a complete solution to data lineage, with experts suggesting 70% is closer to the mark. The remaining portion will need manual intervention, which is also important in terms of monitoring the automated process to ensure it is working correctly, exception management and responding to alerts raised by the solution.

Visualisation: Visualisation of automated data lineage is key to providing an accurate and timely view of how data is flowing through an organisation. It also supports assessment of the impact of any changes to regulatory or business data or systems on upstream and downstream systems. Ultimately, it can help firms determine operational improvements and potential business opportunities.

Automated data lineage and visualisation tools should be made available to all stakeholders in a lineage programme.

Software development lifecycle (SDLC): To make data lineage accurate, efficient, accessible and sustainable, it is important to consider how it is going to be included in an organisation’s SDLC.
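The guidance above on documenting lineage completely, and on identifying missing data and weaknesses in the lineage, can be illustrated with a simple completeness check. This is a minimal sketch under stated assumptions: lineage is documented as hypothetical (upstream, downstream) pairs of data elements, and a separately maintained set names the accepted points of origin. The element names are invented for the example.

```python
def find_lineage_gaps(flows, known_sources):
    """Return elements that feed other elements but have no documented origin.

    flows: iterable of (upstream, downstream) pairs from the lineage record.
    known_sources: elements accepted as true points of origin.
    """
    flows = list(flows)  # allow any iterable, since it is scanned twice
    upstream = {up for up, _ in flows}
    downstream = {down for _, down in flows}
    # An input that is neither a known source nor the output of another
    # documented flow has an origin that the lineage does not explain.
    return sorted(upstream - downstream - set(known_sources))


documented_flows = [
    ("trade_capture.notional", "risk_engine.exposure"),
    ("spreadsheet.fx_rate", "risk_engine.exposure"),
    ("risk_engine.exposure", "regulatory_report.total_exposure"),
]
accepted_sources = {"trade_capture.notional"}

print(find_lineage_gaps(documented_flows, accepted_sources))
```

Here the spreadsheet rate feeds a calculation but has no recorded origin, the kind of manual process the handbook says needs to be brought into the lineage framework.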

Technology solutions

Overview
While most data lineage projects in financial firms start as in-house manual developments responding to regulatory requirements, times are changing. An increasingly regulated environment, growing volumes of data, complex data infrastructures and the need to react quickly to changes and provide fast access to data are driving firms towards a mix of in-house and vendor, or purely vendor, technology solutions. These come in varying types from enterprise solutions to cloud-based services, their commonality being in bringing automation and increased timeliness, flexibility and accuracy to previously manual data lineage processes.

Technology solutions
There are two ends to the spectrum of data lineage technology and much in between. At one end is the manual approach that maps data across the IT landscape – from source to destination – and uses generic tools such as Microsoft Excel and PowerPoint to maintain and visualise lineage. At the other end of the spectrum is fully automated, zero-gap data lineage supported by vendor solutions, although in practice, many firms need hybrid solutions covering both manual and automated lineage. These requirements can also be supported by solution providers.

A typical data lineage automation solution includes functionality that captures and documents data flows, such as a flow of financial instruments, from the data source to its final destination, perhaps a regulatory or internal report. Drilldown functionality allows particular points in the lineage to be inspected more closely, while traceability and audit ensure it is possible to track a piece of data through its journey across an organisation and verify its accuracy. Filtering capabilities allow users to filter for different data categories, such as reference data or trade data, and understand the data’s corresponding attributes.

Another important facet of data lineage is visualisation technology, which can provide a real-time view of data moving through an organisation’s processes and systems, improve the understanding of data, highlight any defects in data flows, and visualise the impact on the IT landscape of any regulatory or business changes. Documentation is managed dynamically to reflect any changes in the lineage.

Automation can also capture business logic and/or metadata that can be stored in a repository and used to create source to target data lineage, eliminate duplicated or redundant data, and provide business and technical users with the ability to locate, understand, and manage information that supports business operations.

In terms of data quality, an organisation’s critical data elements can be identified in data lineage and data quality checks can be established across the organisation as part of a data governance strategy.

These types of automated solutions offer many benefits, including the ability to trace data errors, identify discrepancies, control access to information and model what would happen if a new process or department were added to the business. They can also reduce time spent on validating data accuracy and put trusted information in the hands of decision-makers.

Technology options
Today, most data lineage solutions are coupled to traditional databases and data warehouses, and include automation tools, data management, visualisation and, increasingly, cloud technologies, but more options are expected as new technologies emerge.

Among these are blockchain, machine learning, artificial intelligence (AI) and graph databases. Blockchain technology is a potential candidate as it provides consensus on the most recent version of golden copy data and an immutable historical record of data, both of which could support data lineage.

Machine learning has a part to play by learning and repeating required actions within data lineage such as automatically tagging private or sensitive data elements and visualising these. AI will go further, perhaps replicating data lineage processes and identifying new business propositions without human intervention.

Graph database technology is also a good match for data lineage as it is relatively easy to model the flow of data in a graph, data relationships can be queried in real time, and a graph schema can evolve to accommodate new data and relationships. Similarly, the database’s query language can be used to understand what data is used by whom, and which systems and reports would be impacted by a change in a particular process. Graph visualisation can help technology and business users investigate data lineage.

Vendor approaches
Vendor solutions cover similar data lineage functionality in terms of capturing data and creating lineage that makes trusted data accessible to business users and can accommodate changes on the fly. There may be slight differences in underlying technologies, scope and potential for automation, but the key difference, for the moment at least, is delivery, with some vendors providing cloud-based solutions that can be up and running quickly, and others offering enterprise software solutions that need to be implemented and maintained in-house. Going forward, however, data lineage is likely to follow the steady flow of data, applications and analytics into the cloud.
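The impact analysis described under technology options, finding which systems and reports would be affected by a change in a particular process, amounts to a downstream walk over a graph of data flows. The sketch below uses a plain in-memory Python graph as a stand-in for a graph database and its query language. The node names and helper functions are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Index (source, target) data flows by their source node."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)
    return graph


def impacted_by(graph, changed):
    """Breadth-first walk collecting everything downstream of `changed`."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    seen.discard(changed)  # ignore the changed node if a cycle leads back to it
    return sorted(seen)


flows = [
    ("reference_data_feed", "pricing_process"),
    ("trade_system", "pricing_process"),
    ("pricing_process", "risk_report"),
    ("pricing_process", "valuation_store"),
    ("valuation_store", "regulatory_report"),
    ("trade_system", "settlement_report"),
]
lineage = build_graph(flows)

print(impacted_by(lineage, "pricing_process"))
# prints ['regulatory_report', 'risk_report', 'valuation_store']
```

Reversing each edge before building the graph gives the opposite view, tracing a report back to the sources that feed it.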

Outlook

Business need and new technologies drive data lineage development
Today’s data lineage solutions, whether built in-house or bought from vendors, deliver significant business and operational benefits, new business opportunities and the ability to reduce cost and risk – but there is more to come as firms push lineage beyond regulatory compliance and further into the business, solutions mature and new technologies emerge.

From a standing start in 2016, data lineage took off when regulation BCBS 239 came into force with the aim of averting further financial crises on the scale of the 2008 crash by improving data aggregation and reporting across markets. Regulation drove data lineage, but as this handbook outlines, it is now as much a business as a compliance concern, with financial firms showing a growing appetite for its sweet spots.

These include fast access to reliable information, smarter analytics, better decision making, improved impact assessment of any changes to an organisation’s IT landscape, and ultimately, the potential to identify new business opportunities. From an operational perspective, data lineage can help firms reduce costs and risk by eliminating redundant systems and data.

Increasing business interest in data lineage and its ability to provide a better understanding of data and easier access to meaningful information in real time will be met by the vendor community as it fine tunes solutions and automates data flows that have previously been difficult to address. While 100% automation remains a utopia, supply and demand will raise the bar of automation over coming years.

To get the most out of data lineage quickly and gain competitive edge, firms that have still to implement will probably go straight to vendor solutions, those with in-house builds will follow suit, and those already running vendor solutions will look for more functionality.

With some data lineage solutions already in the cloud, others – and their users – will follow, joining the growing ranks of non-core applications and data moving to the cloud, although it could be argued that the outcomes of data lineage are as valuable as those of any other core assets.

These advances will continue to be pushed forward by market acknowledgement of the benefits of data lineage, which could be amplified by the application of emerging technologies such as blockchain, machine learning and artificial intelligence. Alternatively, data lineage per se could become extinct as it is woven into the fabric of more far-reaching technologies that we have not yet conceived.

Whatever the outcome, the purpose of implementing data lineage beyond regulatory compliance will remain the same, at least for the foreseeable future: to move from defensive to offensive data intelligence that drives smarter decisions and unlocks new business opportunities.

Not all LEI issuers are created equal.

Legal Entity Identifier (LEI) records and the Local Operating Units (LOUs) that issue and manage them have become an essential part of the trading process.

Through the Bloomberg LEI website your organization has access to a competitive rate for both registration and renewal of LEIs. Bolstered by proven record-keeping experience, 24-hour support and a dataset designed to handle evolving regulatory demand, the user-friendly platform makes managing your LEI simple.

Sign up for a Bloomberg LEI account today.

[email protected] lei.bloomberg.com LEI on the Bloomberg Terminal®

©2018 Bloomberg 188585 0618