WHITE PAPER: Rethinking Your Data Retention Strategy to Better Exploit the Big Data Explosion

Total Pages: 16

File Type: PDF, Size: 1020 KB

Sponsored by: Dell. Richard L. Villars and Marshall Amaldas, October 2011.

IDC OPINION

The continued generation of business-critical semistructured data (including large volumes of machine-generated data [MGD] from smart sensors and mobile devices) is changing the storage dynamic in a wide range of industries and organizations. Making investments to extract value from this expanding pool of information is fast becoming a core business mandate, but without the right data retention and long-term archiving strategy, such efforts can quickly lead to spiraling IT costs and growing corporate risk.

Making the wrong technology choice (e.g., deciding between an OLTP, OLAP, or OLDR approach to data storage) can lead to significantly higher data management and retention costs in both the short run and the long run. It can also jeopardize compliance and privacy standards for data such as call detail records (CDRs) and trading records. IT organizations need to deploy active archival storage solutions that address the total cost of ownership (TCO) for archival data at many layers. Specifically, such a solution:

• Provides a semistructured archive platform that's significantly less expensive than archiving that same information on individual database, data warehouse, or file systems

• Maximizes the utilization of that hardware with intelligent data management/reduction software

• Reduces the ongoing operational burden of the archival storage environment

When selecting a storage and data management partner to help you manage the "Big Data" challenge, you will need a partner that can address the entire spectrum of data assessment, data retention, and data use requirements of this new environment. Dell, as a leading designer and provider of IT solutions optimized for Big Data analytics, is also providing enterprise-class solutions that address the cost, performance, and intelligence requirements at the heart of Big Data retention and active archiving.

Global Headquarters: 5 Speen Street, Framingham, MA 01701 USA | P. 508.872.8200 | F. 508.935.4015 | www.idc.com

INFORMATION EVERYWHERE, BUT WHERE'S THE KNOWLEDGE?

For the first 40 years of the IT industry, the main data challenge for most organizations was enabling/recording more and faster business transactions, often referred to as structured data. Today, much of the focus is on more and faster exchanges of information (e.g., documents, medical images, movies, gene sequences, data streams, tweets) from scale-out cloud clusters to systems, PCs, mobile devices, and living rooms. This information is often categorized as unstructured data (e.g., image, audio, or video files) or semistructured data (e.g., emails, logs, call detail records).

Semistructured data is often overlooked, but with the advent of RFID tracking, smart sensors, mobile devices with geospatial information, and a growing array of data collection devices, MGD will be a leading driver of the data explosion. The business challenge for the next decade will be finding ways to better analyze, monetize, and capitalize on all this MGD (see Figure 1). It will be the age of Big Data.
For the IT organization, the challenge will be to implement an archival storage system that ensures that this information is reliably and efficiently ingested, protected, organized, accessed, and preserved.

FIGURE 1: Changing Business Priorities in a Fast-Shifting World

[Figure: companies rely on a growing range of devices, data sources, and applications (more devices, more applications, more content, more data; examples shown include Facebook, VMware, salesforce.com, and Apple) to compete in today's evolving business environment. The range of information created, accessed, and retained affects how companies organize datacenters and retain information. Source: IDC, 2011]

The Ongoing Data Explosion

Data creation is occurring at a record rate. In 2010, the world generated over 1 zettabyte (ZB) — that's 1 million petabytes (PB) — of data; by 2014, we will generate 7ZB a year. While much of this is "unsaved" or highly duplicated data like personal photos or copies of music/videos, one of the fastest-growing and most important sources of growth is machine-generated data:

• Financial transactions. With the consolidation of global trading environments and the greater use of programmed trading, the volume of transactions that needs to be collected and analyzed can double or triple, transaction volumes can fluctuate faster, more widely, and more unpredictably, and competition among firms forces trading decisions to be made at ever smaller intervals.

• Smart instrumentation. The use of intelligent meters in "smart grid" energy systems that shift from a monthly meter read to an "every 15 minutes" meter read can translate into a multi-thousandfold increase in data generated (see the back-of-the-envelope sketch after this list). Similar data bursts are looming in healthcare, where low-cost gene sequencing will have a profound impact on medical data volumes.

• Mobile devices. Until quite recently, the main data generated on landline and traditional mobile phones was limited to CDRs with caller, receiver, and length-of-call data. With smartphones and tablets, additional CDR data to harvest includes geographic location, text messages, browsing history, and (thanks to the addition of accelerometers) even motions.

All of this data creates new opportunities to "extract more value" in sectors such as energy, human genomics, healthcare, retail, online search, surveillance, and finance, as well as many other areas. IDC believes that organizations that are best able to make real-time business decisions based on machine-generated data streams at the lowest possible cost will thrive, while those that are unable to embrace and make use of this expanding data source will increasingly find themselves at a competitive disadvantage in the market. This will be particularly true in industries experiencing high rates of business change and aggressive consolidation.
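To make the "multi-thousandfold" figure concrete, here is a back-of-the-envelope sketch; the meter count and per-reading record size are illustrative assumptions, not figures from this paper:

```python
# Rough scale of the smart-meter shift described above: one monthly read
# versus one read every 15 minutes. All sizes below are assumptions.

READS_PER_MONTH_TRADITIONAL = 1
READS_PER_MONTH_SMART = (60 // 15) * 24 * 30   # 4/hour * 24 h * 30 days = 2,880

factor = READS_PER_MONTH_SMART / READS_PER_MONTH_TRADITIONAL
print(f"per-meter increase: {factor:,.0f}x")   # ~2,880x, i.e. "multi-thousandfold"

METERS = 1_000_000          # hypothetical utility fleet size
BYTES_PER_READING = 100     # hypothetical record size
monthly_bytes = READS_PER_MONTH_SMART * METERS * BYTES_PER_READING
print(f"~{monthly_bytes / 1e9:.0f} GB of raw readings per month")  # ~288 GB
```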
Big Data Value: What's in It for Me?

Regardless of industry or sector, the ultimate value of Big Data implementations will be judged on one or more of three criteria:

• Does it provide more useful information? For example, a major retailer might implement a digital video system throughout its stores, not only to monitor theft but also to run a Big Data pattern detection system that analyzes the flow of shoppers — including demographic information such as gender and age — through the stores at different times of the day, week, and year.

• Does it improve the fidelity of the information? For example, a number of earth science and medical epidemiological research teams are using Big Data systems to monitor and assess the quality of data being collected from remote sensor systems; they are using Big Data not just to look for patterns but also to identify and eliminate false data caused by malfunctions, user error, or temporary environmental anomalies.

• Does it improve the timeliness of the response? Consumer products companies can use kiosks like Coca-Cola's Freestyle to collect real-time consumer taste preferences in different regions, making it easier to tune promotions and control inventory levels on a regional or even store-by-store basis.

Big Data Analytics Versus Retention: Distinct Solutions for Distinct Needs

Today, a number of Big Data analytics solutions use a combination of open source software frameworks such as Hadoop and MPP (massively parallel processing) hardware architectures to support compute- and data-intensive applications that can consume multiple petabytes of disk storage across thousands of individual server nodes. Both the hardware and software components of such analytics systems are optimized for performance, with the data distributed over multiple nodes kept redundant for resiliency and high availability. MPP architecture–based systems are designed so that compute and storage are tightly coupled to minimize contention for resources. While these solutions are best suited to running complex large-scale analytics where performance is the prime objective, they are not suitable targets for long-term retention of Big Data content.

A key element in all these use cases is that organizations must be able to continually go back and reanalyze the same machine-generated data sets over and over again. They need to continually look for patterns stretching over hours, days, months, and years. If it is too expensive to retain the needed historical data, or too difficult to organize the data for timely, ad hoc retrieval, organizations won't be able to capitalize on their collected information. The key question you need to be asking is whether your current storage environment can handle this new data explosion and the data retention challenges it will create.

Traditionally, MGD was treated like either structured or unstructured data sets:

1. It was maintained in a database or data warehouse (leveraging SAN-attached storage), which is very expensive and can significantly impact performance, unless an organization used the archiving functions (not always provided) for each application. In this approach, the data is also trapped in a single application environment and is difficult to repurpose/reuse.

2. It was pushed down as a blob (sometimes aptly called a TARball) onto a file system to be retained. In this approach, an organization sacrificed the structural detail, significantly impairing querying and analysis and, once again, the ability to repurpose/reuse (see the sketch after this list). Because MGD was often linked to a tape library, it posed significant data retrieval burdens.

3. It was kept as a set of personal files on a file server or NAS device and then either orphaned (when the owner left) or deleted. In both cases, the ability to access the data and to manage its retention/disposal for regulatory reasons was severely compromised.
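The tradeoff behind approaches 1 and 2 can be shown in a few lines. The sketch below (referenced in approach 2 above) contrasts retaining CDR-like records as an opaque TARball with retaining them in a structured, queryable archive; the field names, file names, and the use of SQLite are assumptions for illustration, not anything prescribed by IDC or Dell:

```python
import csv, io, sqlite3, tarfile

# Illustrative CDR-like records; the fields are assumptions for this sketch.
cdrs = [
    ("555-0100", "555-0199", "2011-09-01T10:00:00", 120),
    ("555-0142", "555-0100", "2011-09-01T10:05:00", 45),
]

# Approach 2: push the data down as an opaque blob (a TARball).
# Retention is cheap, but structure is lost: answering "all calls longer
# than 60s" later means extracting and re-parsing the whole archive.
buf = io.StringIO()
csv.writer(buf).writerows(cdrs)
with tarfile.open("cdrs.tar.gz", "w:gz") as tar:
    data = buf.getvalue().encode()
    info = tarfile.TarInfo(name="cdrs.csv")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

# A structured archive keeps the same records queryable ad hoc.
con = sqlite3.connect("cdr_archive.db")
con.execute("CREATE TABLE IF NOT EXISTS cdr "
            "(caller TEXT, callee TEXT, ts TEXT, seconds INT)")
con.executemany("INSERT INTO cdr VALUES (?, ?, ?, ?)", cdrs)
con.commit()
long_calls = con.execute(
    "SELECT caller, ts FROM cdr WHERE seconds > 60").fetchall()
print(long_calls)   # [('555-0100', '2011-09-01T10:00:00')]
```

The blob is cheap to keep but must be unpacked and re-parsed for every ad hoc question; the structured archive answers the same question with a single query, which is the repurpose/reuse property the paper argues gets lost.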
Recommended publications
  • Data Retention Policy (GDPR Compliant)
Data Retention Policy (GDPR Compliant). Data controller: Habasit (UK) Ltd, Habegger House, Gannex Park, Dewsbury Road, Elland HX5 9AF. 1 INTRODUCTION 1.1 Habasit (UK) Ltd (“we”, “our”, “us” or “the Company”) must comply with our obligations under data protection laws (including the GDPR and the Data Protection Act 2018) whenever we Process Personal Data relating to our employees, workers, customers and suppliers and any other individuals we interact with. 1.2 This includes the obligation not to Process any Personal Data which permits the identification of Data Subjects for any longer than is necessary, and the purpose of this policy is to assist us to comply with that obligation. This policy should be read alongside the Data Retention Matrix which is appended at Schedule 1 to this policy and which provides guideline data retention periods for the various types of Personal Data we hold. 1.3 Compliance with this policy will also assist us to comply with our ‘data minimisation’ and accuracy obligations under data protection laws, which require us to ensure that we do not retain Personal Data which is irrelevant, excessive, inaccurate or out of date. 1.4 A failure to comply with data protection laws could result in enforcement action against the Company, which may include substantial fines of up to €20 million or 4% of total worldwide annual turnover (whichever is higher), significant reputational damage and potential legal claims from individuals. It can also have personal consequences for individuals in certain circumstances, i.e. criminal fines/imprisonment or director disqualification. 1.5 Compliance with this policy will also assist in reducing the Company’s information storage costs and the burden of responding to requests made by Data Subjects under data protection laws such as access and erasure requests.
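The Data Retention Matrix mentioned in clause 1.2 lends itself to a mechanical check. Below is a minimal sketch of how such a matrix might be applied in code; the categories and periods are invented placeholders, not values from the actual policy's schedule:

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical retention matrix: record category -> guideline period in days.
# These values are illustrative, not the schedule from the actual policy.
RETENTION_MATRIX = {
    "payroll_record": 6 * 365,
    "job_application_unsuccessful": 183,
    "customer_contract": 7 * 365,
}

def disposition(category: str, created: date,
                today: Optional[date] = None) -> str:
    """Return 'retain' or 'review_for_erasure' per the guideline matrix."""
    today = today or date.today()
    period = RETENTION_MATRIX.get(category)
    if period is None:
        return "no_rule_defined"   # escalate: do not silently keep forever
    expiry = created + timedelta(days=period)
    return "retain" if today < expiry else "review_for_erasure"

print(disposition("job_application_unsuccessful",
                  date(2018, 1, 10), date(2019, 1, 1)))
# -> 'review_for_erasure'
```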
  • Basic Overview of Data Retention Mandates – Privacy and Cost
BASIC OVERVIEW OF DATA RETENTION MANDATES – PRIVACY AND COST September 2012 Introduction The use of telephone and Internet services generates information useful to governments in conducting law enforcement and national security investigations. In an effort to guarantee the availability of communications data for investigations, some governments have imposed or have considered imposing legal obligations requiring communications service providers to retain for specified periods of time certain data about all of their users. Generally, under these “data retention” mandates, data about individuals’ use of communications services must be collected and stored in a manner such that it is linked to a specific user’s name or other identification information. Government officials may then request access to this data, pursuant to the laws of their respective countries, for use in investigations. As a tool for addressing law enforcement challenges, data retention comes with a very high cost and is ultimately disproportionate to the goals it seeks to advance. Less privacy-burdensome alternatives are likely to accomplish governments’ legitimate goals just as, and perhaps more, effectively. I. Data Retention: The Basics Data retention laws vary with respect to the types of companies, data, and services that they cover. Types of companies covered: Most of the data retention laws that have been adopted by governments around the world focus on telephone companies (both fixed line and wireless) and Internet service providers (ISPs), including cable companies and mobile providers. Some data retention laws also apply to any entity that offers Internet access, such as Internet cafes and WiFi “hotspots.” Some data retention laws place retention obligations on online service providers (OSPs) – companies that provide, among other things, web-hosting services, email services, platforms for user-generated content, and mobile and web applications.
  • Facts About the Federal Government's Data Retention Scheme
Consumer Fact Sheet: Facts about the Federal Government’s data retention scheme. The Federal Government’s data retention scheme, enacted in March 2015, will come into effect between 13 October 2015 and 12 April 2017. Our fact sheet covers what consumers need to know. What is metadata? Metadata, simply put, is ‘data about data’. In telecommunications it is information about communications (e.g. the time a phone call was made and its duration), information about the people communicating (e.g. the sender and the receiver) including account and location information, and the device used. The scheme requires that service providers retain metadata but not the content or substance of a communication. However, metadata can still reveal a lot of information about an individual and those they interact with. The set of metadata that will be required is set out in the legislation – see http://www.ag.gov.au/NationalSecurity/DataRetention/Documents/Dataset.pdf How will your metadata be used? It will be mandatory for telcos and ISPs to store your metadata for two years (some may have a business need for longer retention of some data). This metadata will be available to specified government agencies (such as law enforcement and national security agencies) upon request. You will be able to access your own data, and many service providers already provide some of this in your ordinary bill. How will it affect consumers? Costs: We don’t know how much a data retention scheme will cost to set up, but in March the Government estimated it at $400 million to set up and $4 per year, per customer to run.
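As an illustration of why metadata alone is revealing, here is a sketch of the kind of fields a retained call record might carry; the field names and values are hypothetical, not the statutory dataset (see the AG's Department link above for the real set):

```python
from dataclasses import dataclass

@dataclass
class CallRecord:
    """Illustrative metadata-only record: no call content, yet still revealing."""
    subscriber_account: str   # who holds the account
    caller: str               # sender identifier
    recipient: str            # receiver identifier
    started_utc: str          # when the communication occurred
    duration_seconds: int     # how long it lasted
    cell_tower_id: str        # approximate location of the device
    device_imei: str          # which handset was used

rec = CallRecord("ACC-0042", "+61400000001", "+61400000002",
                 "2015-10-13T09:30:00Z", 340, "TWR-SYD-118", "490154203237518")
# Two years of records like this trace movements and relationships
# even though the conversation itself is never stored.
```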
  • 10. GCHQ. Handling Arrangements for Bulk
OFFICIAL. This information has been gisted. GCHQ Bulk Personal Datasets Closed Handling Arrangements. 1. Introduction 1.1 These handling arrangements are made under section 4(2)(a) of the Intelligence Services Act 1994 (ISA). They come into force on 4 November 2015. 1.2 These arrangements apply to the Government Communications Headquarters (GCHQ) with respect to the obtaining, use and disclosure of the category of information identified in Part 2 below, namely "bulk personal datasets". 1.3 The rules set out in these arrangements are mandatory and must be followed by GCHQ staff. Failure by staff to comply with these arrangements may lead to disciplinary action, which can include dismissal, and potentially to criminal prosecution. 2. Information covered by these arrangements 2.1 The Security and Intelligence Agencies (SIA) have an agreed definition of a "Bulk Personal Dataset" (BPD). A BPD means any collection of information which: • comprises personal data; • relates to a wide range of individuals, the majority of whom are unlikely to be of intelligence interest; • is held, or acquired for the purpose of holding, on one or more analytical systems within the SIA. 2.2 Bulk Personal Datasets will in general also share the characteristic of being too large to be manually processed (particularly given that benefit is derived from using them in conjunction with other datasets). 2.3 In this context, "personal data" has the meaning given to it in section 1(1) of the Data Protection Act 1998 (DPA), which defines "personal data" as follows: "data which relate to a living individual who can be identified — • from those data; or • from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller [e.g.
  • NSA) Surveillance Programmes (PRISM) and Foreign Intelligence Surveillance Act (FISA) Activities and Their Impact on EU Citizens' Fundamental Rights
DIRECTORATE GENERAL FOR INTERNAL POLICIES POLICY DEPARTMENT C: CITIZENS' RIGHTS AND CONSTITUTIONAL AFFAIRS The US National Security Agency (NSA) surveillance programmes (PRISM) and Foreign Intelligence Surveillance Act (FISA) activities and their impact on EU citizens' fundamental rights NOTE Abstract In light of the recent PRISM-related revelations, this briefing note analyzes the impact of US surveillance programmes on European citizens’ rights. The note explores the scope of surveillance that can be carried out under the US FISA Amendment Act 2008, and related practices of the US authorities which have very strong implications for EU data sovereignty and the protection of European citizens’ rights. PE xxx.xxx EN AUTHOR(S) Mr Caspar BOWDEN (Independent Privacy Researcher) Introduction by Prof. Didier BIGO (King’s College London / Director of the Centre d’Etudes sur les Conflits, Liberté et Sécurité – CCLS, Paris, France). Copy-Editing: Dr. Amandine SCHERRER (Centre d’Etudes sur les Conflits, Liberté et Sécurité – CCLS, Paris, France) Bibliographical assistance: Wendy Grossman RESPONSIBLE ADMINISTRATOR Mr Alessandro DAVOLI Policy Department Citizens' Rights and Constitutional Affairs European Parliament B-1047 Brussels E-mail: [email protected] LINGUISTIC VERSIONS Original: EN ABOUT THE EDITOR To contact the Policy Department or to subscribe to its monthly newsletter please write to: [email protected] Manuscript completed in MMMMM 200X. Brussels, © European Parliament, 200X. This document is available on the Internet at: http://www.europarl.europa.eu/studies DISCLAIMER The opinions expressed in this document are the sole responsibility of the author and do not necessarily represent the official position of the European Parliament.
  • Data Localization and the Role of Infrastructure for Surveillance, Privacy, and Security
    International Journal of Communication 10(2016), 2221–2237 1932–8036/20160005 Data Localization and the Role of Infrastructure for Surveillance, Privacy, and Security TATEVIK SARGSYAN American University, USA Due to the increased awareness of the politics embedded in Internet technologies, there has been a growing tendency for state and nonstate actors around the world to leverage Internet infrastructure configurations to attain various political and economic objectives. Governments push for infrastructure modifications in pursuit of economic development, data privacy and security, and law enforcement and surveillance effectiveness. Information intermediaries set and enact their infrastructure to maximize revenue by enabling data collection and analytics, but have the capacity to implement tools for protecting privacy and limiting government surveillance. Relying on a conceptual framework of the politics of infrastructure, this article explores tensions and competing interests that emerge around intermediaries’ technical and policy infrastructure through analysis of (a) data localization strategies in a number of countries and (b) privacy and security undertakings by information intermediaries. Keywords: privacy, security, Internet infrastructure, surveillance, data localization The Politics of Infrastructure Governments across the world have come to recognize the importance of information intermediaries’ infrastructure for national security, public safety, and other political interests. Law enforcement and intelligence agencies are tasked with addressing various challenges, including the growth of terrorism, cyberattacks, cybercrime, fraud, and—in some regimes—political opposition and social movements. To pursue these goals, government agencies often need to access communications data that are beyond their immediate control, facilitated by a handful of information intermediaries. These companies mediate content by providing online services and communication platforms to global users.
  • 2017 Data Mining Report to Congress, October 2018
Privacy Office, 2017 Data Mining Report to Congress, October 2018. FOREWORD (August 2018): I am pleased to present the Department of Homeland Security’s (DHS) 2017 Data Mining Report to Congress. The Federal Agency Data Mining Reporting Act of 2007, 42 U.S.C. § 2000ee-3, requires DHS to report annually to Congress on DHS activities that meet the Act’s definition of data mining. For each identified activity, the Act requires DHS to provide the following: (1) a thorough description of the activity and the technology and methodology used; (2) the sources of data used; (3) an analysis of the activity’s efficacy; (4) the legal authorities supporting the activity; and (5) an analysis of the activity’s impact on privacy and the protections in place to protect privacy. This is the twelfth comprehensive DHS Data Mining Report and the tenth report prepared pursuant to the Act. Two annexes to this report, which include Law Enforcement Sensitive information and Sensitive Security Information, are being provided separately to Congress as required by the Act. With the creation of DHS, Congress authorized the Department to engage in data mining and the use of other analytical tools in furtherance of Departmental goals and objectives. Consistent with the rigorous compliance process it applies to all DHS programs and systems, the DHS Privacy Office works closely with the programs discussed in this report to ensure that they employ data mining in a manner that both supports the Department’s mission to protect the homeland and protects privacy. (www.dhs.gov/privacy) Pursuant to congressional requirements, this report is being provided to the following Members of Congress: The Honorable Michael Pence President, U.S.
  • Humanitarian Futures for Messaging Apps
HUMANITARIAN FUTURES FOR MESSAGING APPS: UNDERSTANDING THE OPPORTUNITIES AND RISKS FOR HUMANITARIAN ACTION. Cover image: Syrian refugees who have landed on Lesbos in Greece look for a mobile signal to check their location and notify relatives that they arrived safely (I. Prickett/UNHCR). International Committee of the Red Cross, 19, avenue de la Paix, 1202 Geneva, Switzerland. T +41 22 734 60 01 F +41 22 733 20 57 E-mail: [email protected] www.icrc.org January 2017. This report, commissioned by the International Committee of the Red Cross (ICRC), is the product of a collaboration between the ICRC, The Engine Room and Block Party. The content of this report does not reflect the official opinion of the ICRC. Responsibility for the information and views expressed in the report lies entirely with The Engine Room and Block Party. Commissioning Editors: Jacobo Quintanilla and Philippe Stoll (ICRC). Lead Researcher: Tom Walker (The Engine Room). Content: Eytan Oren (Block Party), Zara Rahman (The Engine Room), Nisha Thompson, and Carly Nyst. Editors: Michael Wells and John Borland. Project Manager: Waiyee Leong (ICRC). The ICRC, The Engine Room and Block Party request due acknowledgement and quotes from this publication to be referenced as: ICRC, The Engine Room and Block Party, Humanitarian Futures for Messaging Apps, January 2017. This report is available at www.icrc.org, https://theengineroom.org and http://weareblockparty.com. This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit: http://creativecommons.org/licenses/by-sa/4.0/.
  • Two Years After Snowden
TWO YEARS AFTER SNOWDEN: PROTECTING HUMAN RIGHTS IN AN AGE OF MASS SURVEILLANCE (June 2015). Cover image: a student works on a computer that is projecting former U.S. National Security Agency contractor Edward Snowden as he appears live via video during a world affairs conference in Toronto. © REUTERS/Mark Blinch; © REUTERS/Zoran Milich. “The hard truth is that the use of mass surveillance technology effectively does away with the right to privacy of communications on the Internet altogether.” – Ben Emmerson QC, UN Special Rapporteur on counter-terrorism and human rights. EXECUTIVE SUMMARY: On 5 June 2013, a British newspaper, The Guardian, published the first in a series of revelations about indiscriminate mass surveillance by the USA’s National Security Agency (NSA) and the UK’s Government Communications Headquarters (GCHQ). Edward Snowden, a whistleblower who had worked with the NSA, provided concrete evidence of global communications surveillance programmes that monitor the internet and phone activity of hundreds of millions of people across the world. Governments can have legitimate reasons for using communications surveillance, for example to combat crime or protect national security. The revelations exposed by the media based on files leaked by Edward Snowden have included evidence that: companies – including Facebook, Google and Microsoft – were forced to hand over their customers’ data under secret orders through the NSA’s Prism programme; the NSA recorded, stored and analysed metadata related to every single telephone call and text message transmitted in Mexico, Kenya, and the Philippines; and GCHQ and the NSA have co-opted some of the world’s largest telecommunications companies to tap
  • II. Data Retention: The Basics
    INTRODUCTION TO DATA RETENTION MANDATES September 2012 This memo introduces the concept of data retention, describes the common attributes of data retention laws, and discusses the risks to human rights, broadband deployment, economic growth and law enforcement effectiveness that such laws create. I. What is data retention? The telephone network (both fixed and wireless) and Internet services generate huge amounts of transactional data that reveals the activities and associations of users. Increasingly, law enforcement officers around the world seek such information from service providers for use in criminal and national security investigations. In order to ensure the ready availability of such data, some governments have imposed or have considered imposing mandates requiring communications companies to retain certain data – data that these companies would not otherwise keep – about all of their users. Under these mandates (imposed by law or regulation or through licensing conditions), data must be collected and stored in such a manner that it is linked to users’ names or other identification information. Government officials may then demand access to this data, pursuant to the laws of their respective countries, for use in investigations.1 As a tool for addressing law enforcement challenges, data retention comes with a very high cost and is ultimately disproportionate to the goals it seeks to advance. Less privacy-burdensome alternatives are likely to accomplish governments’ legitimate goals just as effectively and perhaps more effectively. II. Data Retention: The Basics Data retention laws vary with respect to the types of companies, data, and services that they cover. Types of companies covered: Most of the data retention laws that have been adopted thus far focus on telephone companies (both fixed line and wireless) and Internet service providers (ISPs), including cable companies and mobile providers.
  • A Practical Guide to Implementing a Data Retention Policy
A PRACTICAL GUIDE TO IMPLEMENTING A DATA RETENTION POLICY (ACHTERGROND/Background) drs. J. Blaauw and Y. Ajibade MSc* Keywords: data management, data retention, record retention, GDPR, AVG, data, data processing, privacy. Every organization processes data for different reasons and in different ways. In a data-driven world, an organization largely depends on the data it has and uses. Part of fruitful data processing is making sure that you know when data must be kept and when it must be removed. Data is subject to different data retention periods, which may vary per country and/or industry. As a result, thousands of retention rules require you to either keep or destroy data, and it is challenging to get advice on this topic. Implementing a data retention policy will help organizations to be in control of their data and will reduce the risk of being non-compliant with laws and regulations, including the General Data Protection Regulation (GDPR). Properly implementing such a policy takes effort, commitment and management support. As a bonus, up-to-date and relevant data increases the value of your organization. This article offers a helping hand to organizations that are in various stages of implementing a data retention policy. In the first part of the article, we focus on the legal framework and zoom in on the complexity of the various rules and regulations. In the second part, we outline eight steps to build a solid data retention policy. (* Joris Blaauw is Senior Compliance Officer, Group Data Protection Officer and Group
  • Benefits of Data Archiving in Data Warehouses
IBM Software White Paper: Benefits of data archiving in data warehouses. Contents: Executive summary; Typical reasons for rapid data growth; Challenges associated with data warehouse growth; Traditional data growth solutions that do not work; Understanding data archiving; Benefits of data archiving; Guiding principles and technology requirements; Managing data growth responsibly with data warehouse archiving. Executive summary: Data warehouses are the pillars of business intelligence and analytics systems, often integrating data from multiple data sources in an organization to provide historical, current or even predictive analysis of the business. Information from multiple internal or external transactional systems is extracted, transformed and loaded into data warehouses as atomic data. This unchecked data growth often results in ever-increasing infrastructure and operational costs, poor data warehouse performance, and an inability to support complex data retention and legal hold requirements. A data archiving solution helps organizations address these challenges by allowing IT staff to intelligently move (and purge) historical and inactive data from production databases into a more cost-effective location while still providing the capabilities to query, search or even restore data if needed. A tiered archiving strategy provides additional benefits in terms of managing performance and cost-effectiveness. Data archiving can also alleviate data growth issues by: • Removing or relocating inactive and dormant data out of the database to improve data warehouse performance • Reducing the infrastructure and operational costs typically associated with data growth • Leveraging proven policies and processes to cost-effectively manage multi-temperature data • Improving disaster recovery and backup/restore plans to consistently meet service-level agreements (SLAs) • Supporting compliance with data retention, purge or hold policies
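The "intelligently move (and purge)" pattern in the summary can be sketched in a few lines: copy rows past a retention cutoff to a cheaper tier, then delete them from production. The table name, cutoff, and the SQLite/CSV stand-ins are assumptions for illustration; a real deployment would use the warehouse's native archiving tooling rather than this sketch:

```python
import csv, sqlite3

CUTOFF = "2009-01-01"   # illustrative retention boundary

con = sqlite3.connect("warehouse.db")
con.execute("CREATE TABLE IF NOT EXISTS sales (id INT, sold_on TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                [(1, "2008-06-01", 10.0), (2, "2010-03-15", 25.0)])

# 1. Select the inactive rows destined for the archive tier.
old_rows = con.execute(
    "SELECT id, sold_on, amount FROM sales WHERE sold_on < ?",
    (CUTOFF,)).fetchall()

# 2. Write them to the cheaper tier (a CSV file stands in for archive storage,
#    which remains searchable and restorable if needed).
with open("sales_archive.csv", "a", newline="") as f:
    csv.writer(f).writerows(old_rows)

# 3. Purge from production only after the archive write succeeded,
#    shrinking the hot tier and improving warehouse performance.
con.execute("DELETE FROM sales WHERE sold_on < ?", (CUTOFF,))
con.commit()
print(f"archived and purged {len(old_rows)} rows")
```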