These materials are © 2015 John Wiley & Sons, Inc. Any dissemination, distribution, or unauthorized use is strictly prohibited. Cloud Archiving For Dummies

Actiance Special Edition

By Bill Tolson with David Seidl, Scott Whitney, and Trevor Starr

Cloud Archiving For Dummies®, Actiance Special Edition Published by John Wiley & Sons, Inc. 111 River St. Hoboken, NJ 07030‐5774 www.wiley.com Copyright © 2015 by John Wiley & Sons, Inc., Hoboken, New Jersey No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at http://www.wiley.com/go/permissions. Trademarks: Wiley, For Dummies, the Dummies Man logo, The Dummies Way, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc., and/or its affiliates in the United States and other countries, and may not be used without written permission. Actiance and the Actiance logo are trademarks or registered trademarks of Actiance, Inc. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.

LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services, or how to create a custom For Dummies book for your business or organization, please contact our Business Development Department in the U.S. at 877‐409‐4177, contact [email protected], or visit www.wiley.com/go/custompub. For information about licensing the For Dummies brand for products or services, contact [email protected]. ISBN 978‐1‐119‐14972‐9 (pbk); ISBN 978‐1‐118‐14973‐6 (ebk) Manufactured in the United States of America 10 9 8 7 6 5 4 3 2 1

Publisher’s Acknowledgments

Some of the people who helped bring this book to market include the following:

Project Editor: Jennifer Bingham
Acquisitions Editor: Amy Fandrei
Editorial Manager: Rev Mengle
Business Development Representative: Karen Hattan

Introduction

In this data‐driven world, organizations rely on email, social media, instant messaging, and other new types of communication and collaboration capabilities. This means that more data is being generated and stored than ever before. The ever‐increasing reliance on cloud storage and other third‐party tools to handle these new types of communication means that it can be difficult to meet new demands from electronic discovery requirements and compliance regulations.

When IT staff, legal counsel, information security, and compliance officers are asked to store, manage, and search this massive assortment of information, it can look like an impossible challenge. Traditional backup solutions may capture and store the data, but don’t make it easily searchable or provide context. Legacy archives are often limited in the platforms or applications they support, and can have significant costs associated with the growth of records they’re purchased to retain.

When you add in the need to track how conversations and data exchanges are occurring between several different mediums (sometimes at the same time), then try to make them accessible and viewable in a way that preserves the context they occurred in, your task may seem like science fiction. Fortunately, cloud archives are quickly evolving to handle these and future requirements in a way that solves the complexity, cost, and sizing challenges that legacy on‐premise solutions face.

About This Book

This book examines the evolving case for archiving, and how cloud archiving can help you tackle difficult data handling and management challenges head‐on. I explain how archiving tools can help you meet regulatory compliance and eDiscovery requirements while saving space and making data easier to analyze. I also show you how cloud archiving tools can help you provide data in its original context across a wide variety of communications platforms while allowing you to cut costs, empower your legal and compliance staff, and make accessing and searching the data easier than ever before.

Icons Used in This Book

The margins of this book use several helpful icons that can help guide you through the content:

This icon marks tips that can save you time and effort.

This icon is for the technical types who are reading the book. It may be geeky, but it can be useful too.

If you see this icon, make sure you pay attention — you’ll want this knowledge close at hand later.

This icon marks something that you will want to take note of because it can cause problems.

Beyond the Book

You can find additional information about Actiance’s cloud archiving capabilities and its cloud archiving platform, Alcatraz, by visiting www.actiance.com.

Chapter 1: Evolving the Case for Archiving

In This Chapter

▶ Understanding the need for archiving
▶ Using archiving for regulatory compliance and eDiscovery
▶ Improving storage management
▶ Enhancing analytics using archiving tools

Enterprises today generate massive amounts of data while facing compliance, electronic discovery, storage, and analysis requirements. Spending days or weeks sorting through traditional backups to find and extract the data you need isn’t acceptable when you need to meet a compliance requirement, respond to an audit, or respond to an eDiscovery request. This scenario gets more alarming when you take into account the fact that many organizations, probably including yours, have multiple data repositories, each with different retention and management policies.

In short, backups aren’t archives. Simply backing up files like you have done for years doesn’t give you an easy way to find files, and doesn’t provide the capability to easily understand the context that communications were stored in. Backups also typically don’t include social media and other emerging methods of communication that are used in addition to email and collaboration systems such as SharePoint.

Fortunately, enterprise archiving can help. Enterprise archiving can store user‐generated data like email, instant messages, SharePoint documents, and other files, as well as website information and social media content all in one place while applying intelligent management policies and analysis tools. That giant mass of archived data can become a centrally managed and fully accessible information repository your users can rely on to easily perform the work your organization needs them to do.

Exploring Regulatory Compliance

Following your industry’s regulatory requirements can be both complex and costly. Many of these regulations require that you retain documents and communications for a specific length of time, and that you be able to produce them within a relatively short time frame when a regulatory agency issues an information request. An archiving solution can help you meet these retention requirements and can make being audited for compliance a much less stressful and risky task.

Regulations and archiving

There are thousands of regulations around the world that require content archiving for regulatory compliance. For example, in the United States, the Financial Industry Regulatory Authority (FINRA) Rule 2210 specifically states that “any written (hardcopy or electronic) communication that is distributed or made available to . . . investors” be archived for a specific period of time and be made available when asked by a court or regulatory agency. Other regulatory agencies such as the FDA, the SEC, and the Financial Conduct Authority, as well as regulations like the Health Insurance Portability and Accountability Act (HIPAA), also require that data be retained and be made available.

Because there is a broad range of regulatory agencies and legal requirements that require data to be held and available for audits for prescribed periods of time, you’re probably wondering how to keep these sometimes conflicting requirements straight. Each subsidiary, office, and department in your organization likely has differing regulatory requirements, from retention periods to country‐specific laws around sharing data. This means that your archiving solution should be flexible and have the capability to define policies and manage content for each part of your organization. I discuss how you can target those specific needs in Chapter 2.

Handling regulatory information requests

Regulators are increasingly aware that business communication is transitioning beyond the old‐reliable email system. That means that you need to be able not only to capture all compliance‐related communications, including email, instant messaging, social media, SMS, and other collaboration applications, but also to respond fully and quickly to agency information requests. For example, public sector organizations in the U.S. must meet Freedom of Information Act (FOIA) requests within 30 days, making slow responses a compliance issue and eventually an eDiscovery issue.

Supporting eDiscovery

A major reason you might want to acquire an archiving solution is to support your electronic discovery (eDiscovery) efforts. The eDiscovery process in the U.S. is the process of collecting, protecting, and producing all information (no matter where it resides) that could be relevant to the opposing side’s case in a civil lawsuit. Lawyers (especially opposing counsel) and judges are now increasingly tech savvy. This means you should be ready to fully respond to an eDiscovery request across a broad variety of data and communications channels in official (and unofficial) use in your organization.

It’s important to check with your legal counsel to make sure that you’re doing what you need to for electronic discovery. Before you commit to a plan or tool, make sure your legal counsel is on board.

Identifying data

The identification phase of eDiscovery covers identification of all the places where relevant information could be found. Searching through data and communications stored by a variety of business units in multiple IT systems (or across different countries) can be complex and time consuming, and can result in missed deadlines, overlooked data, rising costs, and loss of the case. Plus, as the case and eDiscovery phase proceed, eDiscovery requirements can change, forcing a re‐review of content that was already set aside.

Fortunately, an archiving solution can help. Centralizing and indexing data in an auditable system before discovery starts (think of it as proactive eDiscovery collection) means that identification can focus on the archiving system as the system of record. You can spend your time identifying the relevant data, instead of spending it trying to figure out where every individual or department has data stashed.

Preserving and collecting

After data sources have been identified in an electronic discovery workflow, the data needs to be preserved. The preservation phase centers on protecting potentially relevant data from loss, destruction, or altering. This can mean placing the data in a distinct, secure location or making an unalterable copy of it so that it can be represented as an “original copy” with a verifiable chain of custody.

After the data has been preserved, it’s collected. Collection includes gathering all potentially relevant data (including all metadata) in a legally defensible manner (including chain of custody). If you don’t have a central archive, this can mean manually pulling data from many different sources and matching it among those sources. An archiving solution can help collection by both centralizing data and providing context to the conversations and other histories you’re collecting.

Accurate searches and export capabilities are especially important during collection for two major reasons: First, inac- curate or incomplete searches of the various data repositories can cause relevant items to be missed and can lead to chal- lenges in court. And second, reviewing the exported data can be a big part of your legal costs, because counsel has to look through all the data provided.

Integrated archives can include very powerful search capabilities to help you ensure all items have been found. Those same archives can also include first‐level review capabilities that allow you to cull irrelevant information from the collected data set before it is sent out for review, dramatically reducing the overall cost of eDiscovery.
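To get a feel for how much first‐level culling might be worth, here is a rough back‐of‐the‐envelope sketch. It uses the RAND average review cost per gigabyte cited in this chapter as a default; the function names, the data volume, and the cull rate are all invented for illustration, so plug in your own numbers.

```python
# Hypothetical sketch: estimate review cost before and after first-level culling.
# The default cost per gigabyte is the RAND average cited in this chapter;
# the data volume and cull rate below are made up for illustration.


def review_cost(gigabytes: float, cost_per_gb: float = 18_000.0) -> float:
    """Return the estimated cost of sending `gigabytes` of data out for review."""
    if gigabytes < 0:
        raise ValueError("gigabytes must be non-negative")
    return gigabytes * cost_per_gb


def savings_from_culling(collected_gb: float, cull_fraction: float,
                         cost_per_gb: float = 18_000.0) -> float:
    """Return the review cost avoided by culling `cull_fraction` of the data."""
    if not 0.0 <= cull_fraction <= 1.0:
        raise ValueError("cull_fraction must be between 0 and 1")
    remaining_gb = collected_gb * (1.0 - cull_fraction)
    return (review_cost(collected_gb, cost_per_gb)
            - review_cost(remaining_gb, cost_per_gb))


if __name__ == "__main__":
    collected = 50.0  # assumed size of the collected data set, in GB
    culled = 0.40     # assumed share removed by first-level review
    print(f"Cost without culling: ${review_cost(collected):,.0f}")
    print(f"Estimated savings:    ${savings_from_culling(collected, culled):,.0f}")
```

Real review costs vary widely by matter and by vendor, so treat the result as a conversation starter with your legal counsel rather than a budget figure.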


The eDiscovery Reference Model (EDRM)

EDRM provides a reference model for electronic discovery that details the process from information governance and identification of data through preservation and processing to production and presentation. In the following figure, you can see the EDRM path from volume to relevance.

The EDRM framework includes detailed breakdowns of each of the phases with guidelines on what occurs in each part of the workflow. The following figure shows the Preservation stage, which includes developing a preservation strategy, identifying how data will be preserved, as well as regular reporting and documentation. A well‐designed and implemented archiving tool with built‐in workflow tools can make it easier to use the reference model for your organization’s eDiscovery process.


A critical part of the collection process is ensuring that the collected data remains unaltered and tamper proof. If data can be modified from its original form or deleted, your discov- ery process can be questioned — which is never a good thing when standing before the judge!
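One common way to show that collected data hasn’t changed is to record a cryptographic fingerprint of each item at collection time and compare it later. The sketch below assumes SHA‐256 hashes and a simple in‐memory manifest; the book doesn’t say how any particular archive does this, and the function names and item IDs are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest that changes if the content changes."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(items: dict[str, bytes]) -> dict[str, str]:
    """Record a fingerprint for every collected item (hypothetical item IDs)."""
    return {item_id: fingerprint(content) for item_id, content in items.items()}


def find_altered(items: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return IDs of items that are missing or no longer match the manifest."""
    altered = []
    for item_id, expected in manifest.items():
        content = items.get(item_id)
        if content is None or fingerprint(content) != expected:
            altered.append(item_id)
    return sorted(altered)


if __name__ == "__main__":
    collected = {"msg-1": b"Quarterly numbers attached.", "msg-2": b"See you at 3."}
    manifest = build_manifest(collected)  # taken at collection time
    collected["msg-2"] = b"See you at 4."  # simulate tampering
    print("Altered or missing items:", find_altered(collected, manifest))
```

A manifest like this only helps if it is itself stored where it can’t be quietly edited, which is part of what a verifiable chain of custody is about.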

A 2012 report by the RAND Institute for Civil Justice found that it cost on average $18,000 to review one gigabyte of data. Trimming what you export for review can be a huge money saver!

Performing litigation holds

Litigation holds (also called legal holds) are sent by legal counsel to their custodians to preserve data when litigation is anticipated or underway or when a regulatory audit or investigation may occur, to stop the inadvertent destruction of potential evidence. Email, chat, text messages, and the dozens of other ways that communications occur in organizations mean that litigation holds will always have an electronic component in modern investigations. In fact, as of December 2006, the U.S. Federal Rules of Civil Procedure (FRCP) specifically highlight electronically stored information (ESI) as discoverable, and require that organizations keep all potentially relevant electronic data until the eDiscovery request has been fulfilled.

The EDRM model notes that the litigation hold process used should be legally defensible, efficient, auditable, proportionate, and targeted. In short, when issuing a litigation hold notice, you want to be able to defend how and why you did things and demonstrate that they were done properly and in a timely manner!

Processing, review, and analysis

After you’ve identified, preserved, and collected potentially relevant data, the next step is to process it.

The primary goal of processing is to figure out exactly what data is contained in the collected data set. This includes pointing out all situations that could cause problems during review, such as the presence of encrypted files, password-protected files, corrupted files, or hidden embedded content such as locked cells in an Excel spreadsheet. Processing should occur with strict adherence to process auditing; quality control; analysis and validation; and chain of custody considerations.

The review and analysis processes are dependent on each other. The analysis process applies sophisticated analytical techniques to the review phase to increase productivity and reduce review costs. The review process involves reviewing all processed content to determine relevancy to the eDiscovery request. Nowadays, judges won’t allow a data dump, which means overloading the opposing side with information to drive up the cost of defending against the lawsuit. The court’s expectation is that only relevant data will be turned over, so the responsibility to review each and every document falls on the responding party. The review process also includes the identification of privileged and confidential information that doesn’t need to be turned over.

Production

The production phase’s main aim is to prepare and produce the reviewed content in an efficient and usable format in order to reduce errors and be in compliance with agreed production specifications and timelines.

Providing Storage Management

If you’re used to relying on backups as your long‐term storage, you probably think in terms of the total space used by daily, weekly, and monthly backups, and how much your storage grows from year to year. Backups don’t apply much intelligence to the content they contain and don’t provide an easily searchable format; that’s where archiving provides real storage management advantages.

Metadata is data that describes other data. This descriptive data can include things like who the author or creator is; the time, date, and location of the data’s creation; and even tags that indicate specific information about the data that will help with searching and indexing.
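If it helps to see metadata as a concrete record, here is a minimal sketch. The field names mirror the examples in the paragraph above (author, creation time, location, tags), but the `ItemMetadata` class and `find_by_tag` helper are hypothetical and not taken from any archiving product.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class ItemMetadata:
    """Descriptive data about one archived item (hypothetical field names)."""
    author: str
    created_at: datetime
    location: str
    tags: frozenset[str] = field(default_factory=frozenset)


def find_by_tag(records: list[ItemMetadata], tag: str) -> list[ItemMetadata]:
    """Return the records carrying `tag`, without opening the items themselves."""
    return [record for record in records if tag in record.tags]


if __name__ == "__main__":
    records = [
        ItemMetadata("gary", datetime(2015, 4, 12, 10, 0), "Chicago",
                     frozenset({"contract", "q2"})),
        ItemMetadata("dana", datetime(2015, 4, 13, 9, 30), "London",
                     frozenset({"marketing"})),
    ]
    for match in find_by_tag(records, "contract"):
        print(match.author, match.created_at.isoformat())
```

The point of the sketch is that a search or index can work from these small descriptive records alone, which is why good metadata makes archived content so much easier to find.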

Saving space

The way you use your storage space can drive up a significant portion of your costs when storing data. Backup solutions often back up entire drives, or all the data in a given set of directories on a scheduled basis. That can result in large amounts of duplicated data across multiple backups, and doesn’t provide any granularity. On the other hand, if you can apply a more nuanced retention scheme to your data storage, you can save money and make data easier to find.

Archives have three major capabilities beyond those that backups normally include:

✓ Deduplication, particularly across multiple storage areas or systems. This can save a lot of space if the same content is stored in many places throughout your organization’s email or other communication systems.
✓ Retention management, which provides a systematic way to determine what is kept and for how long.
✓ Disposal policies, which, when paired with retention management, help to rein in growth and ensure that old data (including smoking guns) doesn’t haunt your organization.

Reducing staff workload

If you have a lot of stored data, you probably have a dedicated storage staff whose job involves overseeing, maintaining, and optimizing how you use storage. An archiving solution can help save them time (and therefore money) in a number of ways:

✓ Automatically handling many of the day‐to‐day storage optimizations
✓ Decreasing the amount of time they spend hunting for files and data during eDiscovery processes
✓ Allowing centralized configuration of policies and workflows for file retention and deletion
✓ Providing analytical capabilities that help your storage staff quickly identify where waste may be occurring
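To make the deduplication capability described earlier in this section a little more concrete, here is a minimal sketch of one way it can work: identical content is stored once, keyed by a hash, and every location that holds a copy simply points at it. The `DedupStore` class and the mailbox paths are hypothetical, and real archives may use quite different techniques.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}         # digest -> stored content
        self._locations: dict[str, set[str]] = {}  # digest -> where it was seen

    def add(self, location: str, content: bytes) -> str:
        """Store content (if new) and remember that `location` holds a copy."""
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)
        self._locations.setdefault(digest, set()).add(location)
        return digest

    def stored_bytes(self) -> int:
        """Return the bytes actually kept after deduplication."""
        return sum(len(blob) for blob in self._blobs.values())

    def locations(self, digest: str) -> set[str]:
        """Return every location known to hold this content."""
        return set(self._locations.get(digest, set()))


if __name__ == "__main__":
    store = DedupStore()
    attachment = b"Q3 sales report" * 1000
    for mailbox in ("alice/inbox", "bob/inbox", "carol/sent"):
        store.add(mailbox, attachment)
    print("Bytes received:", 3 * len(attachment))
    print("Bytes stored:  ", store.stored_bytes())
```

Keeping the list of locations matters for eDiscovery: the content is stored once, but you can still show everyone who had a copy.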

Managing and optimizing costs

Storage space can be expensive, particularly when you’re paying for on‐site disk arrays, filers, and storage management software. An archiving solution’s analysis and deduplication capabilities can help reduce costs, but an even better solution can be found with a cloud solution that allows you to avoid the up‐front investment a major local storage solution requires. Instead, you pay for just what you need, rather than spending the money that a failure-tolerant system would cost.

Enhancing Analytics

Archived data can be a treasure-trove of information about your organization and its business. The capability to analyze the data and to create useful knowledge based on its content, size, growth patterns, and many other details can provide big benefits.

Analyzing archives

Imagine being able to look at all the metadata from every file, instant message conversation, and email in an easy‐to‐understand graphical interface. For example, you can look at where your email goes, who sent it, and a multitude of other details. Applying big data analysis techniques to your own organization’s data can help you detect patterns and trends that you might otherwise miss. In Figure 1-1, you can see how archive analytics can also provide a view of your organization’s social media efforts.

Make your analytical capabilities available to support business in addition to discovery and compliance efforts. Social media and email analysis can provide useful insight for business.

In 2014, Nucleus Research found that the average return of a dollar spent on analytics was $13, and that that number had increased from $10.66 in 2010. Analyzing your archives can allow you to quickly leverage a data set you already have. That preexisting data can help you to look for patterns that indicate compliance issues, where new technology is being used, or how well your email and social media campaigns are doing.


Figure 1-1: Analyzing archived data in Actiance’s Socialite.

Considering the future of analytics

The big advantage of cloud archiving analytics is the fact that your data set already exists; you just need to analyze what you have. Leveraging scalable cloud platforms to quickly provide deep insight means that your organization can detect issues as they’re starting, including important and hard‐to‐detect issues like financial impropriety, by looking at many channels at once.

Chapter 2: Moving Archiving to the Cloud

In This Chapter

▶ Examining cloud advantages
▶ Supporting eDiscovery and compliance efforts
▶ Making costs predictable

Backup and archiving solutions were traditionally on‐site, using local employees to manage storage, servers, and the software platform itself. That meant that you had to do your own patching and upgrades, deal with storage and server growth, and build redundancy and disaster recovery into your plans and design.

Legacy archiving products were also often designed around a relatively narrow set of communication and collaboration technologies that were hosted on‐site: Exchange, SharePoint, and similar locally managed technologies were supported through what were essentially dedicated archiving solutions. Backup solutions, on the other hand, back up files from many platforms, but don’t build in intelligence to understand context and content.

In this chapter, I show you what a cloud archiving solution can offer as your organization uses an ever‐increasing number of social communications, instant messaging, email, and mobile devices to conduct its business; handle electronic discovery (eDiscovery); and manage compliance needs.

Examining Cloud Advantages

The move to the cloud from legacy on‐premise archiving solutions provides a number of advantages. Some of these will be familiar if you have adopted software as a service (SaaS) solutions already. Taken together, these advantages make cloud archiving a powerful solution.

Handling more types of data

The big advantage of a cloud archiving solution is the capability to add functionality quickly, without interrupting service. That means that users can add more support (and capability) more quickly to a list of communication mediums that includes the following:

✓ Email
✓ Unified communications
✓ Public social networks
✓ Enterprise social networks
✓ Instant messaging systems
✓ Corporate devices
✓ SMS
✓ Specialized community networks

The broad range of communication mediums can make a cloud archive much more useful than a limited on-site legacy archive designed to handle one or two solutions. In fact, a cloud archiving solution makes a lot of sense when your social media requirements include capturing metadata such as when a posting was edited, hashtags and their context, the content of drafts and deleted postings, or even what images were included in the message.

The capability to handle popular social media content from platforms such as LinkedIn and YouTube can be a big advantage if you’re trying to meet compliance requirements. Putting those into context with email and unified communications is an even bigger benefit.


Office 365

If you’ve made the jump to the cloud for your productivity software, there’s a good chance that you’re using Office 365. Office 365 puts Office and Exchange into a cloud environment, and builds in some basic email archiving and eDiscovery capabilities. This may sound like an easy way to take advantage of a cloud archive, but there is a reason that dedicated archiving solutions exist. Here are a few of Office 365’s current limitations:

✓ It only provides two archive tags, which limits your flexibility when it comes to retention and disposal policies.
✓ Office 365 indexes 58 types of files, but doesn’t cover the full range of possible file types, meaning that searches can miss important files.
✓ Search index builds can be slow with delays reaching up to a day for some content, calling searches of newer content into question.
✓ Unlike a dedicated archiving solution, Office 365 doesn’t protect items in an archive from changes or deletion by default. It requires a mailbox hold to be placed before it does so.
✓ Searches of over 100 simultaneous mailboxes and more than two concurrent searches aren’t currently supported.

These limitations help show why it is important to understand your discovery, compliance, and archiving needs when you select a solution.

Delivering context‐aware archives

Archiving tools are designed to store content (including communications) and make it easily accessible and searchable. That’s a great feature, and it drove the market for years. But there’s something missing if the tool just brings back all the email sent by, for example, an employee named Gary Smith on April 12: Where’s the context? If the reason why an email was sent, or what it was in response to, matters, and if the instant messages or VoIP phone calls and voicemails that Gary made in that same time frame to the same set of people might matter, then you want your archive to be context aware so that it gives you wider visibility into, for example, a conversation that bridged several communication platforms. Figure 2-1 shows how a context‐aware archiving system can be helpful.
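As a rough illustration of the Gary Smith example, the sketch below pulls together messages from any channel that involve the same people within a time window around one email. The `Message` class, the two‐hour window, and the matching rule are all assumptions made for this example; a real context‐aware archive would likely use richer signals such as reply chains and thread IDs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Message:
    """One archived communication on any channel (hypothetical fields)."""
    channel: str
    sender: str
    recipients: frozenset[str]
    sent_at: datetime
    body: str


def related_context(anchor: Message, archive: list[Message],
                    window: timedelta = timedelta(hours=2)) -> list[Message]:
    """Return other messages between the same people near the anchor's time."""
    people = anchor.recipients | {anchor.sender}
    related = [
        message for message in archive
        if message != anchor
        and abs(message.sent_at - anchor.sent_at) <= window
        and message.sender in people
        and message.recipients & people
    ]
    return sorted(related, key=lambda message: message.sent_at)


if __name__ == "__main__":
    day = datetime(2015, 4, 12)  # the year is assumed for the example
    email = Message("email", "gary", frozenset({"dana"}),
                    day.replace(hour=10), "Draft terms attached.")
    archive = [
        email,
        Message("im", "dana", frozenset({"gary"}),
                day.replace(hour=10, minute=30), "Got it, calling you."),
        Message("voicemail", "gary", frozenset({"dana"}),
                day.replace(hour=11, minute=15), "Call me back."),
    ]
    for message in related_context(email, archive):
        print(message.sent_at.time(), message.channel, message.body)
```

Even this simple rule turns a single email into a short cross‐channel timeline, which is the kind of view that saves reviewers from reconstructing conversations by hand.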


Figure 2-1: Context‐aware archives in Actiance Alcatraz.

Enabling compliance and legal staff

More and more federal, state, industry, and foreign authorities require that social media communications receive the same archiving care as email and other business records did in previous years. That means that building an archive that captures and enables access to those new types of data, in context, and with metadata that provides relevant detail about the data, is incredibly important.

At the same time, staff are being asked to respond to more frequent and more complex information requests. That means that compliance and legal staff will spend more time interfacing with archiving tools. A highly intuitive archiving tool with a web interface that compliance and legal staff can easily learn and use is key.

Supporting eDiscovery

In addition to compliance efforts, eDiscovery is a major reason to use a cloud‐based archiving solution. Moving archiving to the cloud can provide a number of advantages, including the following:


✓ Allowing investigators to do searches themselves, rather than waiting for technical staff to get involved.
✓ Providing a more intuitive search interface. A search engine style search interface means that the searches that investigators conduct will work the way that they’re already used to. That can lead to big‐time savings and less support overhead.
✓ Enabling independent (and more granular) legal holds. This reduces another historical drain on IT staff time because counsel or investigators can place legal holds on just the relevant data, themselves.
✓ Performance increases. More servers with more powerful infrastructure behind them means that a cloud archive can deliver search results up to ten times faster than a local solution.
✓ Export support that allows legal hold data sets to be exported in a format that works for dedicated discovery and analysis tools (and that is legally defensible). This can be another big timesaver for legal counsel, which translates directly into money saved on legal costs.
✓ Support for data retention policies to ensure old data that should have expired or aged out of storage is removed when it should be.

Exporting an entire conversation (possibly with a detailed timeline) rather than individual email messages can save time spent by counsel sorting through messages and reconstructing conversation threads, which doesn’t come cheap!

Evolving from point production to life cycle management

Electronic discovery and compliance work have historically been triggered when a government agency issued an information request or when an eDiscovery request was received from opposing counsel. The point‐of‐production model was encouraged by the design of many legacy archiving products that focused on storing messages (without context), and then searching through them to produce data when needed.

Redesigning the workflow to focus on the full life cycle of data highlights a number of advantages to cloud archives. The first, and perhaps biggest, advantage is that policies and rules are applied to data when it’s archived. At the same time, metadata is created and indexed, providing a more complete view of the data. This means that compliance objectives can be set before the data enters the archive, and that reports on compliance can be automated as part of the normal operation of the archive itself. It also makes legal holds and eventual production for electronic discovery easier by keeping communications in context and easily searchable.

The final element is the end of the data life cycle. Compliance and business record retention requirements usually have an end date after which data can be deleted. This defensible deletion is an important part of space management and helps maintain performance when searching and reviewing data. Adding retention rules management to the archive can not only limit the scope of what is lurking in your archives, but it can also help to save space and time.
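To make the idea of a retention rule concrete, here is a minimal sketch in Python. It is not how any particular archive implements retention. The function name, the fixed retention period in years, and the rule that a legal hold always blocks deletion are all assumptions made for illustration.

```python
from datetime import date


def is_deletable(archived_on: date, retention_years: int,
                 on_legal_hold: bool, today: date) -> bool:
    """Return True when an item is past its retention period and not on hold.

    Hypothetical rule for illustration: a legal hold always blocks deletion.
    """
    if on_legal_hold:
        return False
    target_year = archived_on.year + retention_years
    try:
        expires_on = archived_on.replace(year=target_year)
    except ValueError:
        # Archived on Feb 29 and the target year is not a leap year.
        expires_on = archived_on.replace(year=target_year, day=28)
    return today >= expires_on
```

With a seven‐year rule, an item archived on March 1, 2008 becomes deletable on March 1, 2015, unless it is on legal hold.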

Keeping data that you don’t need can be as big of an issue as not having the data you need!

Improving searches

Searches are one of the most important tools at a user’s disposal when working with an archiving solution. A cloud archive can offer two major advantages over an on‐premise legacy solution. First, and possibly most important, the search tool can take advantage of the scale of the cloud infrastructure. That means that where a single server might have been used for an on‐site solution, multiple indexing and search servers can be leveraged in the cloud. Indexing as well as searches can be done more quickly due to this simple performance enhancement.

The second big advantage that a cloud archive can bring to searches is flexibility and similarity to what staff members are used to. Making search work like an everyday Google search may not seem like a big advantage, but it really pays off when your nontechnical staff can get complex searches done without calling IT staff for help.

Scaling to handle large volumes of data

Organizations generate increasing amounts of data every year, and with the arrival of big data analysis as a mainstream concept, the need to keep and work with data keeps growing. If you add in the need to handle more types of data, from social communications to unified communications and from email to work files, you’re quickly faced with a huge amount of information. That means that archiving solutions need to be scalable and handle significantly more data than they ever did in the past.

Unified communications includes the integration of instant messaging, telephony, presence, conferencing, and other features under a central umbrella.

The same cloud scalability advantages that help make searches faster apply to intelligently storing massive volumes of data. Cloud storage can scale faster because the needs of one organization are added to the needs of many others, allowing a cloud archive to buy storage resources in bulk, proactively, to be ready for your growing needs quickly (and to share the costs among many customers). It means that your costs reflect what you actually need, now, and that they can scale directly with your data rather than growing in huge and often costly chunks when you add more local storage to your on‐premise archive.
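The difference between costs that grow in chunks and costs that scale with your data can be sketched in a few lines of Python. This is a simplified illustration, not a pricing model. The function names and every figure used with them (chunk size, chunk price, per‐terabyte price) are hypothetical.

```python
import math


def on_premise_storage_cost(needed_tb, chunk_tb, chunk_cost):
    """Local storage is bought in whole chunks, so cost jumps in steps."""
    return math.ceil(needed_tb / chunk_tb) * chunk_cost


def cloud_storage_cost(needed_tb, cost_per_tb):
    """Pay-as-you-go storage scales directly with the amount stored."""
    return needed_tb * cost_per_tb
```

With hypothetical 10 TB chunks, growing from 10 TB to 11 TB doubles the on‐premise storage cost, while the pay‐as‐you‐go cost rises by just one terabyte’s worth.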

That also means that costs can shrink through smart use of filtering and retention policies. With legacy archiving solutions that rely on local storage, even if lots of data is removed, the costs of maintaining the underlying systems and storage remain. They can’t grow or shrink dynamically like a cloud solution can.

Cutting IT Costs

One advantage of an SaaS solution is that you no longer deploy the infrastructure elements necessary to support an archiving solution on‐site. All you need is bandwidth to transfer the data to your cloud provider. That means that staff time and hardware costs associated with on‐site legacy archives can be significantly reduced or eliminated altogether.

Switching to an SaaS solution isn’t a guarantee of savings. Make sure you have a clear picture of your total cost of ownership and a realistic expected return on investment (ROI). I give you some tips in Chapters 3 and 4.

Changing subscription models

The same SaaS model that is causing major changes in a wide variety of traditionally on‐premise solutions has a big impact here too. The costs of scaling a traditional on‐premise solution to handle a large increase in data storage can be substantial. I show you more of what this can mean in Chapter 3, where I discuss how to effectively leverage cloud archives.

Key considerations for cloud contracts

Moving to an SaaS vendor means that you’ll have to rely on the vendor to provide services that you may have handled internally before. That means it is really important to negotiate these key items into your contract:

✓ How you will be billed for your usage. You need to understand the different costs that your organization will be subject to, including per-user costs, how much storage costs and how those costs change as you use more, and other costs that change as your usage grows or shrinks.
✓ What is the service level agreement? In other words, what are the expectations for uptime, how fast should searches occur, how long will exports take, and what should you be able to expect for eDiscovery responses?
✓ What are the penalties if the service level is not met? How will you be notified if the service has problems?
✓ What is the process to get data out and what is the cost if you want to move your data or even to move to another vendor?
✓ What happens when you want all your data out? Does the vendor guarantee deletion?

Chapter 3
Leveraging Cloud Archives

In This Chapter
▶ Improving data collection
▶ Using cloud archives in your workflow
▶ Saving money

Legacy archiving systems are sometimes associated with a single type of data or a single system such as email or a major platform like SharePoint. At times, they’re installed and managed in different parts of a large organization, resulting in multiple data stores spread through departments and locations. That means that legacy archives tend to proliferate in organizations, resulting in a multitude of places where data can exist. When multiple archives need to be searched for important data, it can quickly lead to a lot of time and expense, or worse, missed information.

A major advantage of moving to a cloud archive is the opportunity to use a single, centrally accessible platform to store and search many types of archival data. Rather than using a platform or data‐type specific archive, it is possible (and preferable) to store many types of data in a single archive designed for ease of access and scalability.

Using a central repository also means that you have the opportunity to place email, instant messages, files, and other data into a shared context. That’s a huge advantage over manual processes that require you to match timestamps between each communication or file when performing eDiscovery, and it reduces the number of mistakes and missed items.

Obviously, putting all your data in one repository means that you’ll have quite a bit to deal with. Fortunately, an effective cloud archive is designed from the ground up to handle bulk data in intelligent, cost effective ways.

If you simply move all your historical data and don’t take advantage of the filtering and data management capabilities that your cloud archive offers, you can drive your costs and time to migrate up, even with the increased scalability and performance a cloud archive can offer.

Handling Bulk Data

One of the biggest concerns that comes up when organizations consider a cloud archive is how they will move terabytes of data to the cloud. It turns out that concern can be easier to manage than it might appear at first glance.

The solution to bulk migration issues is scaling. In simple terms, to send more data to the cloud, you need more servers to handle the data, and sufficient bandwidth to transfer the data. To do this, cloud archiving consultants and vendors will deploy multiple servers with sufficient Internet bandwidth to extract and upload data at a rate that best matches the capabilities of the existing legacy storage environment as well as the receiving cloud archive ingestion rate.

Make sure your vendor or consultant does an assessment of the data transfer rates your current archive or file stores can support without causing issues for your current users. It’s safe to make the move, but it is important to plan and test thoroughly first!

Figure 3-1 shows an example of the infrastructure used to export data to a cloud archive. Note that there are multiple servers on‐site sending data to the cloud vendor’s ingestion servers. That allows the processing and filtering load to be spread rather than relying on a single transfer server. Scaling horizontally across multiple resources can speed up transfers, allowing terabytes a day to be uploaded.
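A rough estimate shows why adding transfer servers helps only until another limit is reached. The Python sketch below assumes each server sustains a fixed upload rate and that the Internet link is the only other bottleneck. The function name, its parameters, and the numbers in the example are hypothetical, and a real assessment would also account for the legacy storage read rate and the cloud ingestion rate.

```python
def migration_days(total_tb, servers, per_server_mbps, link_mbps):
    """Estimate days needed to upload total_tb of data.

    The effective rate is the slower of the combined server rate and the
    Internet link, both in megabits per second.
    """
    effective_mbps = min(servers * per_server_mbps, link_mbps)
    # 1 TB = 10**12 bytes = 8 * 10**6 megabits (decimal units).
    total_megabits = total_tb * 8_000_000
    seconds = total_megabits / effective_mbps
    return seconds / 86_400  # seconds per day
```

In this sketch, 10 TB over a single 250 Mbps server takes about 3.7 days, while four such servers on a 1,000 Mbps link take just under a day. A fifth server would not help because the link is already saturated.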

Figure 3-1: Bulk data ingest scaled onto multiple servers.

Improving Data Collection

Cloud archives can make a big difference in your data collection process by removing some of the barriers that traditional on‐premise archiving solutions create.

Taking advantage of additional data sources

An archiving solution that can handle a wide variety of data sources can provide better context and relevance for compliance and discovery work. It can also make organizational choices easier by not restricting the storage and search options to the limited choices provided by your legacy archive.

Dealing with new data types

The software and systems organizations use are changing quickly, and that means that you will have to adapt to new data types more frequently. When you have a legacy archiving solution, you may have to deal with constant patching and upgrades with accompanying downtime for the archive and extra work for staff. It can also mean purchasing an entirely new archiving product to add to a growing list of specialized tools due to the need to stay compliant or to be able to conduct discovery work.

A cloud archive can help fix that issue by handling the upgrade process seamlessly while adding new capabilities without causing downtime and interruptions for your organization. It is still important to make sure your cloud archiving solution supports your existing key systems and data types from the start, but the sometimes painful process of staying on top of changes can be a lot easier with a cloud archive!

Using Metadata

When you have massive amounts of data stored in an archive, sorting through it can be a nightmare. Different types of information are created by various users, groups, and automated systems, and you can’t rely on that data always having a full complement of useful context built into it, such as a file name, date sent, email subject, or other content you might look for to quickly identify specific data. Fortunately, there is a way to provide more granular and richer information about the contents of your archives: metadata.

Defining metadata

Metadata is data about the data itself. It can include details about a file such as where it was created, who created it, and when, among other things. Informational metadata tags can also be added to provide additional capabilities, such as helping identify financial data, personally identifiable information (PII), or email messages with content covered by a specific compliance requirement.

Think of metadata like you would an old library card catalog. Because it contains information about the data, you can use it to find relevant data more quickly than by simply searching through every message or file.
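The card catalog idea can be sketched in Python as a small index that maps each tag to the items that carry it. This is only an illustration of the concept. The record fields, the IDs, and the tag names are made up for the example and don’t reflect any particular archive’s schema.

```python
from collections import defaultdict

# Hypothetical archived items, each carrying a few metadata fields and tags.
items = [
    {"id": "msg-001", "type": "email", "author": "alice", "tags": {"financial"}},
    {"id": "im-002", "type": "instant_message", "author": "bob", "tags": {"pii"}},
    {"id": "doc-003", "type": "file", "author": "alice", "tags": {"financial", "pii"}},
]


def build_tag_index(records):
    """Map each tag to the set of item IDs that carry it (the 'card catalog')."""
    index = defaultdict(set)
    for record in records:
        for tag in record["tags"]:
            index[tag].add(record["id"])
    return index


tag_index = build_tag_index(items)
# Look up every item tagged as containing PII without scanning item contents.
pii_items = sorted(tag_index["pii"])
```

Once the index exists, finding every item tagged with PII is a single lookup rather than a search through every message or file.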

Using metadata

Because a cloud archive can centralize many types of data, the importance of metadata is even greater than it would be in a platform- or application-specific legacy environment. That means that your cloud archive should make using metadata quick, flexible, easy, and more productive. Clear graphical user interfaces (GUIs) like the one shown in Figure 3-2 can make leveraging metadata for searches, analysis, and compliance efforts a lot easier.

Figure 3-2: Metadata analysis in Alcatraz.

When you use and apply new metadata, remember that you’re designing a tagging system for long‐term usage. Hiring an expert or working with your service provider to learn the best way to use it can really pay off!

Metadata and reporting

Reporting and analysis tools provide an extra layer of useful information about archives. They can also work with metadata to help you understand what your archive contains from a high-level perspective. For example, if you have to supply a specific conversation thread in response to an eDiscovery request, being able to search for a specific conversation thread ID and assemble the entire conversation automatically from among thousands of archived emails saves a great deal of time and expense and ensures that the entire conversation is found and produced. Using your cloud archive’s search and reporting tools in combination with captured metadata can be very powerful.

Taking Advantage of Improved Usability

A cloud archiving platform can offer usability improvements for both technical and nontechnical users. Technical users can spend their time on data management tasks such as managing user accounts, setting retention and expiration policies, and reviewing usage and total current storage requirements. These tools are specifically designed to give them the information they need without making them do the underlying integration and systems management that a legacy archiving system would require (via APIs, or even proprietary interface software), which saves time and effort and requires less deep technical knowledge of the underlying infrastructure.

For nontechnical users, the advantages can be even greater: They can use an easy‐to‐understand web interface. For most nontechnical staff, a web interface is a familiar way to access systems, and using a web interface also means that they don’t need to have a specific software package installed on their local workstation to do their job. That means they’re able to do their work more easily from more places!

Saving your storage team time

Legacy archiving solutions rely on underlying storage systems to provide space for data storage. Moving to a cloud archiving solution means that organizational storage administrators and support teams don’t need to provide bulk storage for archives, and time they previously spent supporting compliance and discovery efforts can be spent supporting other important organizational efforts.

Helping auditors and compliance teams

The centralization, metadata, and reporting capabilities that cloud archives offer also mean that auditors and compliance managers can perform their tasks far more easily than they could in a legacy system where they don’t have access to those tools. Centralizing data and providing powerful search tools can make compliance and audit work faster and easier, while ensuring that it meets both sets of requirements.

Plugging cloud archives into your workflow

A cloud archive can support multiple workflows using the same data. For example, it can support the following:

✓ eDiscovery workflows by archiving, tagging, filtering, and providing auditable information about communications and files that can then be exported as needed for faster legal hold, collection, and first pass review.
✓ Compliance workflows involving searching for and reporting on compliance‐related data.
✓ Reporting and analysis workflows to show both visual representations of data and its context and detailed granular reports to help find specific issues or items.

Saving Money with a Cloud Archive

If the capability to handle many types of data combined with built‐in scalability, availability, and disaster recovery capabilities didn’t sell you on the need for a cloud archive, the cost model probably will. Cloud archiving can save you staff, infrastructure, and licensing costs.

TCO and archiving solutions

The total cost of ownership of a legacy archiving solution includes more than just the cost of storage, the archiving product, and the staff time that is spent managing the platform. In fact, when you look at your total archiving costs, you should include the following:

✓ The cost of underlying storage and systems
✓ The cost of licensing
✓ Disaster recovery and backup costs
✓ Costs associated with scaling the system to larger capacities and speeds
✓ Staff time to maintain the storage and archiving system, as well as staff time spent using it
✓ Costs of downtime during upgrades and patches
✓ The relative speed with which the archiving solution can ingest data versus a cloud platform if you anticipate acquiring another organization or simply adding new types of data from your own existing organization

It is also a good idea to look at how your cost models handle growth (scalability). With many legacy platforms, growth involves purchasing and scaling up local storage systems that require expensive upgrades. That can be a major cost if only a small amount of additional storage is required. Adding redundancy and then providing for disaster recovery can make an otherwise small increase in capacity a major cost if the underlying system needs to be replaced or upgraded.

Cloud archiving costs follow a similar model to those of other cloud services: Scalability, disaster recovery, and redundancy are typically built in, and patching and upgrades are handled by the vendor. That leaves you paying for only the capacity and capabilities you need, and takes the large capital outlays that often come with growth or changes in usage out of the equation. Instead, you pay for what you actually use, with the knowledge that you can grow easily and quickly if you need to.

These before and after costs will play a major role in determining the return on investment for moving to a cloud archive, discussed in more detail in Chapter 4.

As an example, legacy archiving solutions are usually composed of three major components:

✓ A data source, or data sources, like Exchange email servers, chat and instant messaging systems, SharePoint servers, or other messaging and data storage systems
✓ A large‐scale data storage capability like an EMC or NetApp filer with many disks and a central server(s) to provide access to storage space
✓ The archiving software and hardware platform

Of course, these all have infrastructure behind them like power, heating and cooling, tape backup arrays, redundant systems, and annual support contracts.

If you use a cloud archive, you may have a gateway server or servers that send data out to the cloud from your local data sources. That’s it. The rest of the infrastructure stays in the cloud.

Chapter 4
Understanding Cloud ROI

In This Chapter
▶ Understanding ROI
▶ Examining your costs
▶ Exploring cost savings

A cloud Software as a Service (SaaS) archiving platform provides several additional benefits beyond what a traditional on‐premise archive may deliver, including a noticeable increase in your return on investment.

Understanding ROI

Return on investment, or ROI, is a term used to describe how efficient an investment is. You can use ROI analysis to evaluate whether adopting a cloud‐based archive for the first time, or moving from an on‐premise archive to a cloud‐based archive, makes sense for your organization.

ROI is the gain on the investment, minus the cost of the investment, divided by the cost of the investment, as shown in Figure 4-1. It gives you a calculated rate of return on the investment (as a percentage).
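The formula described above translates directly into a couple of lines of Python. The function names are my own, and the figures in the example that follows are invented purely to show the arithmetic.

```python
def roi(gain, cost):
    """Return ROI as a fraction: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost of the investment must be positive")
    return (gain - cost) / cost


def roi_percent(gain, cost):
    """Return ROI expressed as a percentage."""
    return roi(gain, cost) * 100
```

For instance, a hypothetical investment that costs 100,000 and returns a gain of 150,000 has an ROI of (150,000 - 100,000) / 100,000 = 0.5, or 50 percent.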

In order to calculate ROI for an archiving solution, you need to understand your costs before and after the investment so you can fully understand what your expected investment return may be.

Figure 4-1: Calculating ROI.

Understanding Your Costs

In order to really understand your potential ROI for a cloud archiving solution, you need to calculate the first part of the equation in Figure 4-1: the cost of the process before the investment. You can look at the advantages of a cloud archive from two positions: from a current state of not having any archive solution, and from a state of already having an on‐premise archiving solution. If you don’t currently have any archiving solution, the three primary parts of this calculation are infrastructure costs, eDiscovery costs, and compliance costs. If you already have an on‐premise archiving solution, the main costs will be infrastructure costs (assuming your current on‐premise archive already provides eDiscovery and compliance benefits).

Infrastructure costs

The first step to understanding your total cost of ownership (TCO) for an archiving solution is to look at the costs of the servers, storage, staff, and other infrastructure costs that go into supporting your archiving system. I discuss the most common costs faced by organizations, but you may have other costs that you’ll want to include. Make sure to involve all the technical areas that support your infrastructure and systems to get a complete view.

For those without an on‐premise archive, infrastructure would include the cost of enterprise storage, file servers, and hardware support agreements.

Personnel

The cost of technical staff with the expertise to build, deploy, and support an archiving system can be significant. A typical on‐premise archiving system requires support from staff with multiple disciplines:

✓ Email, social media, and unified communications
✓ Windows or Linux server administration
✓ Storage administrators for large‐scale storage and backups

More complex installations may require other experts. Remember to look at all the time your staff spend working to support your on‐premise archiving solution, not just the time they spend working with the solution itself. If you’re calculating TCO, that should include some of the time spent doing maintenance of underlying infrastructure too.

Servers

Archiving environments typically require dedicated servers to perform indexing, search, storage management, and analysis work. That means that even a small on‐site archive normally requires at least one, but usually two or more, servers. Archives for larger organizations, or installations that deal with a large amount of data, require more processing power, more memory, and more storage resources. Because archives often also contain sensitive data, the requirements for handling that data may preclude the archiving system from being run in a virtual environment, further increasing costs by requiring dedicated hardware.

Don’t forget the costs of annual software and hardware support agreements and the costs to maintain servers and other devices in your data center.

Storage

Another significant driver of costs in on-premise archiving deployments is the cost of the underlying storage. It’s not just the disk space itself that raises costs. In fact, if you look at a typical deployment for 10 terabytes of archived email and related data, you can see how quickly costs can add up. Here’s what that can look like:

✓ 10 terabytes of storage for the data itself
✓ 80 terabytes of backups on disk or tape
✓ Costs for additional storage devices or servers
✓ Software and licensing costs for storage systems

The estimated space for backups is from a real‐life example of an enterprise Exchange cluster using daily differential backups, weekly backups, and six months of monthly backups, which is a reasonably common backup strategy. It may vary for your organization, but your backups are likely at least three to four times larger than your actual data store.
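To see how the storage example adds up, here is a small Python sketch that totals the footprint. The function name and returned field names are hypothetical. The 10 TB and 80 TB inputs come from the example above, and your own numbers will differ.

```python
def storage_footprint(primary_tb, backup_tb):
    """Summarize total storage and how backups compare with the primary data."""
    return {
        "total_tb": primary_tb + backup_tb,
        "backup_ratio": backup_tb / primary_tb,
    }


# The example deployment: 10 TB of archived data plus 80 TB of backups.
example = storage_footprint(primary_tb=10, backup_tb=80)
```

In the example deployment, 10 TB of archived data ends up occupying 90 TB in total, with backups eight times the size of the data itself.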

Software support

The final common infrastructure element is the cost of annual software support. Industry‐wide, this often amounts to 20 percent or more of the initial license cost of the software that you acquire. In addition to that cost, you may have licensing and support costs for the underlying server operating system, storage system, backup software, and other software elements of the solution.

eDiscovery costs

One of the most common benefits of an archiving system is to support eDiscovery. That means that the costs of eDiscovery without an archiving solution should be part of your equation when looking at a cloud archiving solution. These costs can include:

✓ Legal time spent by counsel searching, reviewing, or dealing with exports
✓ Technical staff time creating exports, copying or moving files, or otherwise supporting discovery
✓ Time spent building custom integrations or performing searches for nontechnical users

Regulatory compliance costs

Regulatory compliance can be costly, because it often requires specific actions, including data preservation for specific periods of time, and monitoring. Look at what you spend on staff time for recurring compliance efforts by legal, technical, and compliance staff members.

Exploring Cost Savings

After you have a good idea of the costs before and after you start archiving to the cloud, and a realistic estimate of the cost of the cloud archiving solution, you can calculate your estimated ROI (refer to Figure 4-1 for the ROI formula). Areas where you’re likely to save money include the following:

✓ Storage savings: Cloud archives charge for the space you use, not for layers of backups and expanding storage in large chunks.
✓ Faster, more efficient eDiscovery: Save legal and technical staff time when searching for data, placing holds, and performing other common tasks.
✓ Easier compliance: Achieve major staff time-savings by avoiding manual processes while ensuring the correct content is retained.
✓ Moving large capital expenses to a pay‐as‐you‐go operational expense model: Directly control archiving costs and get better insight into what you actually spend. Also take advantage of moving the cost of archiving from a capital expense (CapEx) to an operating expense (OpEx).

Chapter 5
Avoiding Cloud Migration Pitfalls

In This Chapter
▶ Avoiding slow migrations
▶ Dealing with data
▶ Supporting eDiscovery and preserving the chain of custody

Even if your organization is highly committed to adopting a cloud archiving solution, it can still fall prey to some of the common issues that any migration can face. Fortunately, you can avoid those issues with a little bit of prior knowledge and planning.

As you might expect, there are ways to avoid each of these pitfalls through a combination of technical solutions, policies and practices, and, of course, planning how you approach the migration. In this chapter, I discuss how to prevent each of these problems so that your cloud archive migration goes smoothly.

Avoiding Slow Migrations

One of the biggest issues you can face when moving to a cloud archive is a slow migration. There are a few reasons that migrations can go slowly when moving to a cloud archive.

Your archives or traditional backups no doubt hold a massive amount of data in its raw form. Moving to a new archive is a chance to address quite a few decisions about data and data retention, which can not only save space and money but also improve your capability to use the cloud archive effectively. You should make decisions based on the following:

✓ Deduplication of files, ensuring that a valid copy is kept, but that you don’t have dozens (or thousands!) of copies of the same data.
✓ The age of the data and whether it is still relevant, based on legal, regulatory, or business reasons. Getting rid of expired and useless data can help you meet your information governance goals.
✓ Retention of unwanted data like spam email, marketing email, and other junk that can fill your archive, raising the cost during discovery or regulatory response. You can even filter these by the domain they come from or their subject line, so the “25% off at Awesome Shop” email everyone got three years ago can be automatically removed.
✓ The types of files you should archive. Archiving policies can ensure you’re not archiving common operating system files, RSS feeds, or application installation and “readme” files.
✓ File size limitations. This feature can help avoid retaining enormous files that aren’t relevant to the reasons you maintain the archive.
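The decisions in the list above can be pictured as a series of checks applied to each item before it is migrated. The Python sketch below is only an illustration under stated assumptions: the item fields, the spam domain, the skipped file extensions, the size limit, and the cutoff date are all hypothetical, and a real migration would also check legal holds before dropping anything.

```python
import hashlib
from datetime import date

# Hypothetical policy values; set these from your own legal and business rules.
SPAM_DOMAINS = {"awesomeshop.example"}
SKIPPED_EXTENSIONS = (".dll", ".sys", ".rss")
MAX_SIZE_BYTES = 50 * 1024 * 1024
CUTOFF = date(2008, 1, 1)


def should_migrate(item, seen_hashes):
    """Return True if the item passes every filter and is not a duplicate."""
    digest = hashlib.sha256(item["content"]).hexdigest()
    if digest in seen_hashes:
        return False  # duplicate of an item that was already kept
    if item["created"] < CUTOFF:
        return False  # older than the retention cutoff
    if item.get("sender_domain") in SPAM_DOMAINS:
        return False  # junk or marketing mail from a filtered domain
    if item["name"].lower().endswith(SKIPPED_EXTENSIONS):
        return False  # file type excluded by archiving policy
    if len(item["content"]) > MAX_SIZE_BYTES:
        return False  # over the file size limit
    seen_hashes.add(digest)  # remember the one copy that is kept
    return True
```

Each item is hashed so that only the first valid copy is kept, and anything that fails a policy check is left behind rather than uploaded.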

Take advantage of classification and filtering capabilities to target and specially handle your highly sensitive or very important data. This allows you to either remove the data or intentionally retain it for a specific period of time before it goes into a cloud archive. That can be particularly important if you deal with personally identifiable information (PII) such as Social Security numbers, credit card numbers, or other data that may have a compliance requirement.
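To give a feel for what classification can mean in practice, here is a minimal Python sketch that flags text containing something shaped like a U.S. Social Security number or a payment card number. The patterns and the `classify` function are assumptions made for illustration; commercial classifiers are far more thorough, and simple patterns like these produce both false positives and misses. The card check uses the Luhn checksum, a standard way to weed out random digit strings.

```python
import re

# Simplified, illustrative patterns; real classifiers use many more rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def classify(text: str) -> set:
    """Return the set of sensitive-data labels found in the text."""
    labels = set()
    if SSN_PATTERN.search(text):
        labels.add("ssn")
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            labels.add("card")
    return labels
```

Items that come back with a label can then be routed for special handling instead of flowing straight into the new archive.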

The whole decision‐making process for filtering data during the move from a legacy archive to a cloud archive is shown in Figure 5‐1. Note that there are three major flows: a litigation support flow at the beginning to ensure that eDiscovery activities are supported, a remediation flow for the sensitive data that needs attention and special handling, and finally, the filtered high‐value data you actually want to move.

Figure 5-1 illustrates an easy way to think about where bad migration decisions can cost you increased risk and additional time, and leave you with a lot of data you don’t want or need in your archive. Make sure you think carefully about each layer to avoid leaving yourself with lots of useless data!

Figure 5-1: Decision points in data migration.

Ensuring adoption and usage

IT organizations that handle cloud archive migrations are often excited about the chance to save money and improve the quality and accessibility of the data they archive. The only sure way to succeed in a legacy archive migration, however, is to involve the actual end users of the solution in the change to make sure that they have bought in and both can and will use the new tool as intended.

That means that all the technical and policy decisions that this chapter focuses on won’t mean much if those end users haven’t been included in the decisions and adoption process. The most successful adoptions include end users in selection, deployment, training, and long‐term feedback loops for the entire solution.

Dealing with Data

Cloud archives exist to manage large amounts of data, so it shouldn’t come as a surprise that the pitfalls of migration deal heavily with data handling. There are three key areas to pay attention to when migrating.

Handling historical data

Legacy archives often contain a massive amount of historical (and potentially worthless) data that has accumulated over time with little oversight and less ongoing management. That historical data usually increases the cost of legacy archives because it drives the need to add more and more disk space and management hardware/software, and it can create issues around moving your most important/valuable data to a cloud solution. Historical data can also include sensitive information such as intellectual property, confidential information, or worse, smoking guns that shouldn’t be retained, which means that cleaning it up is important.

Handling historical data is a two‐part process. First, a regularly updated organizational data retention policy can really help to determine what should be kept, for how long, and why. Although legacy archives and backup products may not be designed to filter through all the data that they contain, a cloud archiving solution should have built‐in filtering capabilities to meet these additional requirements. That means that an updated, legally defensible data retention policy can make decisions on what to keep easier.

The second key element of dealing with historical data is leveraging the filtering capabilities that a cloud archive can deliver. For example, in Figure 5-2, you can see how the Actiance Alcatraz cloud archiving solution provides granular filtering capabilities that support each layer of the filtering process.

The more effectively you use filtering capabilities, the faster your migration will go. It will also improve searches and other work!


Figure 5-2: Analyzing historical data.

Removing unneeded data

One of the key data migration decisions you will face is what data you can dispose of before migration into the new cloud archive. Opting to not clean up your data when you migrate it to the cloud can create a lot of additional workload and cost, whereas defensible deletion can save a lot of space, time, and expense.

Defensible deletion is the process of deleting files and documents that are not required to be retained for compliance, legal, or business reasons. A data retention and deletion policy is a key part of ensuring that your data deletion is defensible.
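As a rough illustration of how a retention policy turns into delete-or-keep decisions, the Python sketch below looks up a retention period by record category and returns both the decision and the reason for it. The categories, the periods, and the legal hold flag are hypothetical assumptions; your own schedule should come from your legal and compliance teams.

```python
from datetime import date

# Hypothetical retention schedule, in years, keyed by record category.
RETENTION_YEARS = {"financial": 7, "hr": 5, "general": 2}


def disposition(category, created, today, on_legal_hold=False):
    """Return ("retain" or "delete", reason) so every decision is documented."""
    if on_legal_hold:
        # Assumption: anything under a legal hold is kept regardless of age.
        return "retain", "legal hold"

    years = RETENTION_YEARS.get(category)
    if years is None:
        # Never delete what the policy does not cover; flag it for review.
        return "retain", "no policy for category; needs review"

    try:
        expires = created.replace(year=created.year + years)
    except ValueError:
        # Created on February 29 and the expiry year is not a leap year.
        expires = created.replace(year=created.year + years, day=28)

    if today >= expires:
        return "delete", f"retention period of {years} years ended {expires.isoformat()}"
    return "retain", f"retained until {expires.isoformat()}"
```

Keeping the reason alongside each decision is what makes the deletion defensible: you can show that a documented policy, not a whim, drove each outcome.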

The total cost of ownership advantages of cloud archives can be significant, but you can inadvertently reduce those advantages if you migrate lots of duplicate files, spam, and unnecessary email messages, or if you opt to keep data you no longer need. Not only does this add to the overall cost of using the cloud service, but it can also drive up the costs of discovery and compliance efforts when lawyers, investigators, and auditors spend additional time working through junk data to find the items that are actually needed.

Maintaining access to data

It should go without saying that the ease of access to the migrated data is important to your migration project’s success. This can go wrong if your cloud archiving solution isn’t able to map and ingest the data correctly so that end users can intuitively find it in the new cloud archive. Another access issue occurs when the end user’s access permissions are wrong in the new archive. That means you need to pay attention to users, roles, and permissions when you migrate the data to your cloud vendor.

This is a good time to engage your nontechnical users. Involve your compliance, legal, or eDiscovery experts early, and have them make sure that they’re comfortable getting to the data they need to succeed.

Maintaining the Chain of Custody

In many cases, the biggest driver for moving to cloud archiving is the need to provide eDiscovery or compliance support for large amounts of data. When adopting a cloud archive, you need to ensure that you have the capability to run reports on various aspects of the archived data and demonstrate the chain of custody of any data movement to an auditor or possibly in court.

That means your cloud migration vendor needs to be able to positively answer several critical questions to make sure you don’t end up with unrecoverable post‐migration problems. Those questions are as follows:

✓ How do you prove that the data in your archive wasn’t moved or changed, and that the files in fact match the original files?
✓ Did the migration encounter problems like corrupted files or files that were encrypted or unreadable? If so, were the exceptions fully documented?
✓ How were embedded files, zipped files, encrypted files, or password‐protected files handled? If issues occurred, how were those issues logged and reported?
✓ Are all actions taken by the archiving system and its users logged, and are those logs secure and auditable?
✓ When you extract/copy data for discovery, compliance, or audit purposes, what information goes with it? Does it include chain of custody information, and is there descriptive log information or metadata that the archiving solution can provide?

As you can tell, access to auditing and log reports is a critical part of the capability to prove the chain of custody. If your cloud archiving solution makes that easy, you can avoid a lot of the potential problems you might otherwise face. Figure 5-3 shows the Actiance Alcatraz solution’s audit log reporting capability. The capability to easily provide audit logs like these can make discovery and compliance reporting a lot simpler, saving time, money, and effort.

Figure 5-3: Audit logs views.
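One common way to answer the first chain of custody question, proving that migrated files match the originals, is to record a cryptographic hash of every item before the move and compare it afterward. The Python sketch below shows the idea. It is a generic illustration under assumed names (`build_manifest`, `verify_migration`, and the log fields are hypothetical) and does not describe how any specific vendor’s product does it.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(source_items: dict) -> dict:
    """Map each item ID to the digest of its bytes before migration."""
    return {item_id: sha256_hex(data) for item_id, data in source_items.items()}


def verify_migration(manifest: dict, migrated_items: dict) -> list:
    """Compare migrated bytes with the manifest; return one log entry per item."""
    log = []
    for item_id, expected in sorted(manifest.items()):
        data = migrated_items.get(item_id)
        if data is None:
            status = "missing"
        elif sha256_hex(data) == expected:
            status = "verified"
        else:
            status = "mismatch"
        log.append({
            "item": item_id,
            "status": status,
            "expected_sha256": expected,
            "checked_at": datetime.now(timezone.utc).isoformat(),
        })
    return log
```

Every item gets a log entry, including the ones that verify cleanly, so the exceptions (missing or mismatched items) are documented alongside the evidence that everything else arrived unchanged.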

Chapter 6

Ten Things to Look For in Cloud Archiving Systems

In This Chapter
▶ Recognizing ten key features of cloud archiving systems
▶ Understanding what to look for when selecting a cloud archiving system

When you’re considering an archiving solution, it is important to pick the right one. Your cloud archive not only stores your data, but it also allows you to easily access and leverage it for compliance, eDiscovery, and a multitude of other uses. That means that choosing the right cloud archiving system is especially critical. In this chapter, you find ten things that you should have on your short list of considerations when choosing a solution:

✓ Ease of use: Your archiving system will only be a success if it is easy to use and doesn’t take your end users out of their comfort zones. Look for a solution that emphasizes usability, and have your nontechnical staff and a sampling of end users try it out before you buy.
✓ The capability to handle a wide range of data sources: Look for a cloud archiving system that not only handles the data sources you have today, but can also keep up with what you might use in the future.
✓ Fit to workflow: Workflows are really important when managing compliance, analysis, and discovery, so make sure that your cloud archiving solution can fit seamlessly into your existing infrastructure.
✓ Configurable retention policies: What data you keep and how long you keep it can have a big impact on your costs and how easy it is to find data when you need to. Look for smart, flexible retention policy capability that you can adjust as needed.
✓ High availability: Cloud providers can help provide much better availability and disaster recovery than an on-site system can. They should have guarantees (via SLA) on uptime and a plan for recovery if the worst happens.
✓ Strong security: Sending your data to the cloud can be scary, especially if it’s intellectual property or data you need for compliance and eDiscovery purposes. Check your cloud provider’s security capabilities to make sure you’re picking the right one, check its references, and run a security audit on the provider annually.
✓ Cost savings: When you’re looking at a cloud archive, you should look for one that can provide a lower total cost of ownership. Remember to account for things like system administration time so that you get the whole picture.
✓ Search speed: Cloud archives combine their expertise with lots of hardware to provide faster searches and indexing. You should look for a provider whose searches are fast in addition to being accurate so that you can optimize how your users spend their time using the archive.
✓ Fit for compliance: If you have compliance or eDiscovery needs, or want to analyze your archives, you need a tool that can fit your workflow instead of making you fit its requirements. Look for a tool with flexible workflow capabilities and connectors that can integrate with the way you do things.
✓ Handling large mailboxes: Everyone has a co‐worker who retains every email he or she has ever received. Those huge mailboxes mean that you need to select a tool that can handle them with ease but also alert your system administrators if they go over a set limit.


WILEY END USER LICENSE AGREEMENT

Go to www.wiley.com/go/eula to access Wiley’s ebook EULA.