Andrew Harrison

Read this before buying a Thin storage system

Thin Provisioning: what is it and why are you buying it?

In the current economic climate IT is under ever-increasing pressure to reduce infrastructure costs. Storage can account for up to 50% of the equipment costs in a datacenter, so increasing storage utilization and reducing unit prices is clearly a priority.

Thin Provisioning systems tackle over-allocation, a major source of inefficiency that reduces storage utilization. Over-allocation occurs when application owners ask for more storage than they end up needing. The storage is provided to the servers hosting the application, but only a percentage of it is used. There are many reasons why over-allocation is endemic; some can be solved by focusing on people and process, and some are best addressed by technology.

Two key factors drive over-allocation: long lead times for application owners to obtain additional storage if they run out, and real difficulty in sizing the exact amount of storage required to support new applications. In these cases simply restricting the amount of storage an application owner can request may be impractical and may impose unacceptable constraints on the business.

Thin Provisioning allows storage administrators to continue to allocate application owners more storage than they actually need, while ensuring that the amount of storage used in the arrays matches the amount the application is really consuming. If, for example, the owner of an application such as a contact management system (CMS) asks for 1TB of storage while actually consuming 300GB, the Thin system presents the application with 1TB of storage while allocating only the 300GB on disk that the application uses. When applications grow, the Thin system automatically allocates the necessary storage to the application that needs it.
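The allocate-on-write behaviour described above can be sketched in a few lines of Python. This is a toy model only, not any vendor's implementation: the `ThinVolume` class, its 1GB extent size and the assumption that 1TB equals 1000GB are all illustrative choices made here.

```python
class ThinVolume:
    """Toy thin volume: presents a virtual size, allocates extents on first write."""

    def __init__(self, virtual_gb, extent_gb=1):
        self.virtual_gb = virtual_gb    # size the application sees
        self.extent_gb = extent_gb      # hypothetical allocation unit
        self.allocated = set()          # extents backed by physical storage

    def write(self, offset_gb):
        if not 0 <= offset_gb < self.virtual_gb:
            raise ValueError("write beyond the presented volume size")
        # Physical storage is assigned only when an extent is first written.
        self.allocated.add(offset_gb // self.extent_gb)

    @property
    def physical_gb(self):
        return len(self.allocated) * self.extent_gb


# The CMS example: 1TB presented, 300GB actually written.
cms = ThinVolume(virtual_gb=1000)
for gb in range(300):
    cms.write(gb)

print(cms.virtual_gb, cms.physical_gb)
```

In this model the application can address the full 1000GB, while physical consumption follows the writes it has actually made.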

How do you cost justify buying Thin Provisioning technology?

On the face of it, Thin Provisioning is relatively simple to cost justify. Organizations with access to a comprehensive Storage Resource Management (SRM) product such as Symantec CC Storage can run reports showing over-allocated storage. For example, an organization has 50TB of storage allocated to servers. The SRM tool shows that this 50TB is on average 50% utilized, so 25TB is being used and 25TB is being wasted in over-allocation.

It is fairly simple to calculate the cost of that 25TB to the business. Using, for example, the Gartner storage pricing data for Tier 1 storage, we can assume that this 25TB has a value of $360,000. This is the return that may be achieved if IT makes a capital investment in Thin Provisioning technology.
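The arithmetic behind this business case is small enough to write down. The sketch below uses only the figures quoted above (50TB allocated, 50% utilized, $360,000 for the wasted capacity); the function name is hypothetical, and the per-TB price is simply what those figures imply rather than a quoted Gartner number.

```python
def over_allocation(allocated_tb, utilization_pct, value_of_waste_usd):
    """Split allocated capacity into used and wasted TB and price the waste."""
    used_tb = allocated_tb * utilization_pct / 100
    wasted_tb = allocated_tb - used_tb
    implied_usd_per_tb = value_of_waste_usd / wasted_tb
    return used_tb, wasted_tb, implied_usd_per_tb


used_tb, wasted_tb, usd_per_tb = over_allocation(50, 50, 360_000)
print(used_tb, wasted_tb, usd_per_tb)
```

Substituting the utilization figure from your own SRM report gives the ceiling on what a Thin Provisioning purchase could return.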

Clearly there is no point in spending more on the Thin system than the calculated return, and most organizations expect any capital outlay to provide a return that is a multiple of the outlay. Most organizations also require these returns to be made within a specified period, often within a financial year.

Investment committees evaluating projects that require capital outlay stack rank them based on projected returns and time to projected return. With limited funds available, projects that underperform on return or that have a long time to return are unlikely to be funded.

So purchasing a Thin Provisioning system seems to be a project that should be funded. The technology is low risk, it appears to offer a return that is a multiple of the expenditure, and with access to effective data migration technology there is little to suggest that the storage team cannot migrate the bulk of the existing storage to a Thin environment, thus freeing up 25TB of storage with a value of $360,000.

The devil is in the detail!

Now you need to stop and carefully re-evaluate the likely returns and the time to return, based not on how the Thin Provisioning system works in a best-case scenario but on how it works in practice. I apologize in advance if any of this is technical, but in this case the technology choices make the difference between a project that will satisfy your investment criteria and a project that will not!

The 5 Thin Provisioning problems

Problem 1. How do you migrate to Thin?

One could simply target all new applications at the Thin environment. However, with the relatively slow rates of new application deployment we are seeing in the current economic climate, many IT organizations may conclude that the quickest way to achieve significant savings using Thin Provisioning systems is to migrate existing applications into the Thin environment.

With access to technology able to migrate application data from existing storage to Thin storage, it would be tempting to assume that this is a simple process. A data migration tool copies the existing data volumes to the Thin environment. Once moved, the Contact Management System referred to earlier thinks that it has access to 1TB of storage, while the Thin system has allocated only the 300GB that it is actually using.

Caution: this may or may not be true, depending on the migration tool used and how sophisticated the Thin Provisioning system is.

Reasons: nearly all volume migration tools, whether host based or appliances, simply copy all the blocks in a volume from one storage system to another. They make no distinction between blocks that contain data and blocks that do not; they are all copied. In the case of the CMS application this means that nearly all volume migration tools will copy the full 1TB of storage allocated to the application, not the 300GB actually used.

Impact: (1) the migration takes longer than it would have if the tool had copied only 300GB, and (2), more critically, depending on the Thin Provisioning system the application may actually be allocated the full 1TB and not the 300GB you are expecting. Some Thin systems simply detect that a block has been written and then allocate physical storage to it. It does not matter that the block contains no data; it still gets physical storage allocated to it.

Potential mitigation: some Thin Provisioning systems are able to detect that blocks contain no actual data and either reclaim them later or never allocate storage to them in the first place. In this case, even though your data migration tool is not Thin friendly, you are likely to end up with less storage allocated in the Thin environment than was allocated in the previous setup.
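A small simulation makes the difference between the two kinds of Thin system concrete. It is a sketch under stated assumptions: the volume is scaled to one block per GB, an "empty" block is modeled as all zeros, and `migrate_full_copy` is a hypothetical stand-in for a volume migration tool that copies every block.

```python
ZERO = bytes(4)  # stand-in for an all-zero (empty) block


def migrate_full_copy(source_blocks, zero_detect):
    """Copy every block; return how many blocks the Thin target allocates."""
    allocated = 0
    for block in source_blocks:
        if zero_detect and block == ZERO:
            continue        # target recognises the empty block and skips it
        allocated += 1      # any other write gets physical storage
    return allocated


# CMS volume: 300 blocks holding data, 700 never written.
source = [b"data"] * 300 + [ZERO] * 700

without_detection = migrate_full_copy(source, zero_detect=False)
with_detection = migrate_full_copy(source, zero_detect=True)
print(without_detection, with_detection)
```

Without zero detection the target ends up fully allocated; with it, only the blocks that hold data consume physical storage. This model assumes every unused block really is empty.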

Questions to ask your Thin Provisioning and data migration technology suppliers:

1 Is the data migration tool Thin friendly? Does it copy all the blocks or just the blocks being used by the application?
2 Is the Thin Provisioning system capable of recovering blocks that are empty?

Problem 2. Are you really migrating what you think you are migrating?

Clearly, if the answers to the previous two questions are "no, the tool is not Thin friendly" and "no, the Thin system cannot recover empty blocks", then you must abandon any plans to migrate existing applications using your existing volume migration tools. If you proceed with this plan your Thin Provisioning system will not return any value at all. Using the CMS application as an example, the Thin system will end up allocating 1TB to the application, saving no storage.

What happens if you find that the data migration tool is not Thin friendly but that the Thin Provisioning system can reclaim empty blocks?

Will the project still provide the returns modeled in your business case?

Caution: not necessarily. The Thin Provisioning system can only reclaim empty blocks. The filesystem consuming the original storage may have created and deleted files. Typically when a filesystem deletes a file it does not write nulls to the storage used by that file; it simply stops referencing the storage. Because of this, storage that has been in use for a period of time contains three distinct classes of block.

1 Blocks that are being used by the filesystem.
2 Blocks that have been used by the filesystem but which are currently unused.
3 Blocks that have never been used.

Of these three, Thin Provisioning systems can only reclaim (3), blocks that have never been used, while your Storage Reporting product will generally show (1), blocks that are being used by the application. How effective your Thin Provisioning system is in this scenario depends on how many blocks fall into class (2). If they make up a large percentage of the total volume, the amount of storage allocated will be much higher than you are expecting.

Potential mitigation: how many blocks have been used but are currently unused is determined by how effective the filesystem is at reclaiming previously used blocks. Some filesystems are much more intelligent about the way they allocate blocks to files, and the more efficient they are, the more blocks the Thin Provisioning system is likely to be able to recover. Some filesystems such as ZFS and NTFS use 40% more of the underlying blocks during the typical life of a filesystem than more efficient technologies. In the example of the CMS application, using a filesystem which is not Thin Friendly could result in the Thin system having to allocate 420GB (300GB plus 40%) rather than 300GB to the application.
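The 420GB figure follows from the three block classes once they are counted. The sketch below is illustrative only: the `BlockState` names, the one-block-per-GB scale and the assumption that 1TB equals 1000GB are choices made here, and the 40% overhead is the paper's own estimate.

```python
from collections import Counter
from enum import Enum


class BlockState(Enum):
    IN_USE = "referenced by a file"
    STALE = "used earlier, no longer referenced, old contents still on disk"
    NEVER_USED = "never written, still empty"


def physical_allocation(volume):
    """Blocks a Thin system that reclaims empty blocks must still back."""
    counts = Counter(volume)
    # Only NEVER_USED blocks are empty; STALE blocks look like data to the array.
    return counts[BlockState.IN_USE] + counts[BlockState.STALE]


in_use = 300
stale = in_use * 40 // 100      # the paper's 40% overhead estimate
volume = ([BlockState.IN_USE] * in_use
          + [BlockState.STALE] * stale
          + [BlockState.NEVER_USED] * (1000 - in_use - stale))

allocated_gb = physical_allocation(volume)
saved_gb = len(volume) - allocated_gb
print(allocated_gb, saved_gb)
```

The storage report shows only the 300 in-use blocks, while the array sees 420 non-empty ones, so the saving on the 1TB volume drops from 700GB to 580GB.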

Question to ask your filesystem suppliers: 3 Is the filesystem Thin Friendly?

If your filesystem supplier does not know what this means then assume no!

Problem 3. How fast is your real data migration going to be?

In many cases, because of the answers to questions 1, 2 and 3, the option to migrate applications using volume migration tools will be unattractive. IT will have to resort to different routes into Thin storage. Methods include re-loading existing applications from the backup environment and/or only targeting new applications at the Thin storage platform. Both take longer than online volume migration, and many IT professionals would rightly have reservations about the risk and disruption associated with re-loading existing data from backup.

Depending on what you decide to do at this point, your time to return will be longer or much longer, and the project may require more resources with a much greater potential for disruption.

It is likely that a project with a revised time to return caused by more complex data migrations or only new application deployment will not meet the criteria required by the investment board to qualify it for funding.

In the case of the CMS application, the value of the project will be reduced by waiting for 3 years until a new major release of the application is deployed, and the return will then be 580GB rather than the original 700GB.

Problem 4. Surely you meant a saving of 700GB for the new app, not 580GB?

The issue with deploying new applications on top of filesystems which are not Thin Friendly is the same as in Problem 2.

Any blocks which are not empty but which are not being used by the filesystem, in other words blocks that have been used but are no longer in use, will have physical storage allocated to them in a Thin Provisioning environment.

Over time all filesystems end up with some blocks falling into this category; how many depends on how Thin Friendly the filesystem is. Because this is partly workload dependent, filesystems which are not Thin Friendly may initially appear to work effectively with your Thin storage platform. Over time, however, the amount of physical storage allocated to the filesystem will increase even if the amount of storage apparently used by the filesystem does not appear to be growing.

Caution: Thin Provisioning systems, however intelligent, are unable by themselves to reclaim a block that is not empty. Filesystems do not write nulls to blocks they no longer need because this would be too expensive from a performance perspective.

Potential mitigation: this is the same as for Problem 2. A Thin Friendly filesystem addresses this issue by keeping the proportion of previously used but no longer used blocks as low as possible. Without this you should assume 40% more storage being physically used than your original allocated versus used calculation suggests. In the case of the 50TB example, you could assume that without a Thin Friendly filesystem your physical usage will be 35TB, not 25TB, reducing your saving by 10TB and your cost saving by roughly $140,000.
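As a check on these figures, the same adjustment can be applied to the 50TB business case from earlier in the paper, with all quantities read in TB. The function name is hypothetical, and the per-TB price is the one implied by the paper's $360,000 for 25TB.

```python
USD_PER_TB = 360_000 // 25   # implied by the paper's $360,000 for 25TB


def saving_lost_to_overhead(allocated_tb, used_tb, overhead_pct):
    """Return (physical TB consumed, TB of expected saving lost to stale blocks)."""
    physical_tb = used_tb * (100 + overhead_pct) / 100
    ideal_saving_tb = allocated_tb - used_tb
    actual_saving_tb = allocated_tb - physical_tb
    return physical_tb, ideal_saving_tb - actual_saving_tb


physical_tb, lost_tb = saving_lost_to_overhead(50, 25, 40)
lost_usd = lost_tb * USD_PER_TB
print(physical_tb, lost_tb, lost_usd)
```

At the implied price the 10TB shortfall works out to $144,000, which the paper quotes rounded down to $140,000.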

Question to ask your filesystem suppliers: 4 Is the filesystem Thin Friendly?

Problem 5. How do you reclaim the storage at the end of the life of the files?

As mentioned in Problems 1 and 2, even the most intelligent Thin Provisioning systems are unable to automatically reclaim blocks that are not empty. At the end of the life of a file, database, etc., you will end up with a filesystem which has had most of its files deleted. However, because deletion does not write nulls to the blocks used by a file, the Thin Provisioning system will still have physical storage allocated to all the blocks the file used.

Unless you have a mechanism to reclaim these blocks, the effectiveness of your Thin Provisioning system will decline over time as more and more storage is allocated to files/applications that no longer need it.

This compounds the time to return problem illustrated in Problem 3. Without effective reclamation mechanisms your actual return declines over time, and a possible outcome is that by the time your migration into Thin storage is complete your return has declined to zero or is negative.

Potential mitigation: all the major Thin Provisioning suppliers support a Symantec interface that allows the filesystem to provide the system with a list of all the blocks that the filesystem has used in the past but is no longer using. The Thin Provisioning system is then able to reclaim all these blocks, reducing cost and increasing utilization.
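The idea behind such an interface can be sketched with two toy classes. Nothing here reflects the real interface: `ThinLun`, `ToyFilesystem` and their methods are hypothetical names, and the model only illustrates the division of labour, in which the filesystem knows which blocks are stale and the array acts on that list.

```python
class ThinLun:
    """Toy Thin LUN: tracks which blocks have physical storage."""

    def __init__(self):
        self.allocated = set()

    def write(self, block):
        self.allocated.add(block)

    def reclaim(self, blocks):
        # Hypothetical reclamation call: release blocks reported as unused.
        self.allocated -= set(blocks)


class ToyFilesystem:
    def __init__(self, lun):
        self.lun = lun
        self.files = {}         # file name -> list of block numbers
        self.stale = set()      # used in the past, no longer referenced
        self.next_block = 0

    def create(self, name, nblocks):
        blocks = list(range(self.next_block, self.next_block + nblocks))
        self.next_block += nblocks
        for block in blocks:
            self.lun.write(block)
        self.files[name] = blocks

    def delete(self, name):
        # Deletion drops the reference only; nothing is zeroed on disk.
        self.stale.update(self.files.pop(name))

    def reclaim(self):
        self.lun.reclaim(self.stale)
        self.stale.clear()


lun = ThinLun()
fs = ToyFilesystem(lun)
fs.create("old.db", 700)
fs.create("live.db", 300)
fs.delete("old.db")
after_delete = len(lun.allocated)
fs.reclaim()
after_reclaim = len(lun.allocated)
print(after_delete, after_reclaim)
```

Deleting the file changes nothing on the array; only the explicit reclaim step releases the physical storage.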

Question to ask your filesystem suppliers: 5 Does your filesystem support the Thin Provisioning Reclamation interface?

The solution to the 5 Thin Provisioning problems

These 5 problems have the potential to invalidate the business case for deploying Thin Provisioning systems. However, it would be wrong to assume that there are no solutions available which address these 5 issues.

Symantec Storage Foundation is a host based storage management and virtualization suite which provides the only solution to all 5 Thin Provisioning problems.

Problems 1, 2 and 3. Storage Foundation solves Problem 1 (How do you migrate to Thin?), Problem 2 (Are you really migrating what you think you are migrating?) and Problem 3 (How fast is your real data migration going to be?) in two ways.

1 Storage Foundation includes SmartMove, which is used when data is migrated from an existing array to a Thin storage platform. SmartMove only copies blocks being used by the filesystem; it does not copy empty blocks or blocks which have been used but are no longer in use. This reduces the time taken to migrate data from one platform to another and also ensures that the Thin Provisioning system allocates the minimum storage required to support the application. In the case of the CMS application this results in a fast online migration with the Thin Provisioning system allocating 300GB of physical storage to the application, not the minimum of 420GB and maximum of 1TB that other methods would produce.
2 If your Thin Provisioning system can reclaim empty blocks and you have chosen to use a non host based data migration technology, Storage Foundation still helps because it is designed to ensure that disk space is not wasted and that blocks which were used but have been discarded are reclaimed and re-used. This Thin friendly design is 30-40% more efficient than other filesystems and means that the source data, when migrated to a Thin platform, will use 30-40% less physical storage initially and over time.
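A filesystem-aware copy of the kind described in the first point can be illustrated as follows. This is not SmartMove's implementation, which the paper does not describe; it is a minimal sketch assuming the copy tool can consult a per-block in-use map supplied by the filesystem, with the volume scaled to one block per GB.

```python
def copy_used_blocks(source, in_use_map):
    """Copy only blocks the filesystem marks as in use; return the target contents."""
    target = {}
    for index, block in enumerate(source):
        if in_use_map[index]:
            target[index] = block   # each write makes the Thin target allocate
    return target


# 300 live blocks, 120 stale (non-empty but unreferenced), 580 never written.
source = [b"live"] * 300 + [b"old!"] * 120 + [bytes(4)] * 580
in_use_map = [True] * 300 + [False] * 700

target = copy_used_blocks(source, in_use_map)
print(len(target))
```

Because stale blocks are never written to the target, the result does not depend on whether the array can detect empty blocks.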

Problem 4. Storage Foundation solves Problem 4 (Surely you meant a saving of 700GB for the new app, not 580GB?) by being Thin friendly and re-using blocks in a way that is most friendly to Thin Provisioning systems. It also locates filesystem metadata at the beginning of the filesystem, whereas some filesystems write metadata across the space occupied by the filesystem. While this metadata is very small in comparison with the data held in the filesystem, it can have a considerable impact on Thin Provisioning system efficiency, in particular if the Thin storage system uses large block sizes for allocation purposes. For example, HDS USP uses a 42MB block size while EMC uses 768KB. Writing 4KB of metadata will result in an allocation of either 42MB or 768KB, depending on the type of system.
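The effect of allocation unit size on a small metadata write is simple to quantify. The sketch assumes 1MB equals 1024KB and that the write lands entirely inside one previously unallocated unit; the function name is illustrative.

```python
KB_PER_MB = 1024


def allocated_kb(write_kb, unit_kb):
    """Physical KB allocated when write_kb lands in previously unallocated space."""
    units = -(-write_kb // unit_kb)     # ceiling division
    return units * unit_kb


for name, unit_kb in [("42MB unit", 42 * KB_PER_MB), ("768KB unit", 768)]:
    got = allocated_kb(4, unit_kb)
    print(name, got, "KB allocated,", got // 4, "times the data written")
```

Metadata scattered across the filesystem can trigger this amplification once per allocation unit touched, which is why keeping metadata together matters more as the unit size grows.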

Problem 5. Storage Foundation solves Problem 5 (How do you reclaim the storage at the end of the life of the files?) by supporting the Thin Reclamation interface originally developed by Symantec. This allows Storage Foundation to provide the Thin platform with all the blocks which have been used by the filesystem but are no longer required. The Thin Provisioning system then de-allocates all the storage allocated to these blocks.

The financial impact of different technology choices

The chart shows the possible financial impacts on ROI and time to ROI for the different Thin Provisioning technology choices.

Storage Foundation with SmartMove, a Thin Friendly filesystem and Thin Reclamation will allow the Thin Provisioning system to deliver the financial returns you are expecting.

What should you do next?

If you need your Thin Storage project to achieve the financial returns modeled in your initial business case, then we would strongly recommend you consider using Storage Foundation as the Thin Friendly host layer. It is the only solution that provides the capabilities at the server level required by Thin Provisioning systems to provide the returns these systems are capable of.

If this is not an option, then we would strongly recommend that you re-visit your Thin Provisioning business case, adjusting the time to return up and the level of return down to reflect the real levels achievable with the toolsets you have available to you.

References:

EMC Symmetrix V-Max with Storage Foundation (benefits of SmartMove and a Thin Friendly filesystem)
http://eval.symantec.com/mktginfo/enterprise/white_papers/b-symmetrix-v-max-storage-foundation-whitepaper.en-us.pdf

3Par (benefits of SmartMove, Thin Friendly filesystem and thin reclamation)
http://eval.symantec.com/mktginfo/enterprise/white_papers/b-whitepaper_start_thin_get_thin.en-us.pdf#17

Why thin provisioning is not the holy grail for increasing utilization.
http://www.thestoragearchitect.com/2009/06/04/enterprise-computing-why-thin-provisioning-is-not-the-holy-grail-for-utilisation/comment-page-1/#comment-1668
