INCORPORATING STORAGE WITH AN OPEN SOURCE PLATFORM

Ganpat Agarwal
Software Engineer, EMC
[email protected]

Table of Contents

Introduction
What is cloud computing?
Cloud Computing Service Models
Storage in Cloud Computing
Advantages of Storage as a Service
Disadvantages of Storage as a Service
Limitations with vendor-based cloud service
OpenStack – An Open Source Cloud Computing platform
Overview of OpenStack
Cinder – OpenStack Block Storage Service
Enterprise Storage Solutions with OpenStack Cinder
EMC and OpenStack?
Writing an OpenStack Cinder Driver with EMC storage
Summary
References

Disclaimer: The views, processes, or methodologies published in this article are those of the author. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.

2014 EMC Proven Professional Knowledge Sharing 2

Introduction

As cloud computing becomes well established in the corporate world, organizations want to explore and leverage the functionality that cloud computing provides. However, vendor-based cloud solutions often face compatibility issues. Organizations are therefore looking for open source-based platforms on which a wide range of features can be implemented without compatibility or other issues.

OpenStack, an open source cloud computing platform designed to avoid such compatibility issues, enables organizations to integrate the features they need.

Because every cloud computing deployment must maintain a large amount of data, storage providers have an important and exciting role to play.

OpenStack provides a platform to integrate object and block storage features. As a leader in software-defined storage, EMC has the potential to participate in the OpenStack open source program. In addition to providing its own storage solutions, EMC can leverage the many features that OpenStack provides.

This Knowledge Sharing article will walk through the advantages that OpenStack provides over vendor-based cloud solutions, the storage as a service aspect in cloud computing platforms, and integration of an EMC block storage device with OpenStack.


What is cloud computing?

Cloud computing is based on the fundamental principle of ‘reusability of IT capabilities’. In science, cloud computing is a synonym for distributed computing over a network, enabling programs or applications to run on many connected computers at the same time.[1]

The term ‘cloud’ implies that it has an unlimited capability to expand. This expansion could be anything which is a part of a computing setup, i.e. software applications, computing resources, networking resources, storage resources, etc.

What distinguishes cloud computing from the traditional concepts of “grid computing”, “distributed computing”, “utility computing”, and “autonomic computing” is that it broadens horizons across organizational boundaries. Cloud computing is a practical approach to experiencing direct cost benefits and has the potential to transform a data center from a capital-intensive setup to a variable-priced environment.

Cloud Computing Service Models

Popular service models of cloud computing include:

 Infrastructure as a Service (IaaS)

 Platform as a Service (PaaS)

 Software as a Service (SaaS)

 Network as a Service (NaaS)

In addition, a new service model is approaching fast:

 “Storage as a Service”


Storage as a Service in Cloud Computing

Computing produces a large amount of data, which can be either the input for another computation job or the end result of a computation process. As the complexity of the work increases, so does data production, creating the need for a good data storage service for the cloud computing platform.

[Figure: a cloud storing documents, music, files, photos, videos, and presentations]

Cloud storage is a model of networked enterprise storage where data is stored in virtualized pools of storage which are generally hosted by third parties. Hosting companies operate large data centers, and customers that require their data to be hosted buy or lease storage capacity from these hosting companies. The data center operators virtualize the resources according to customer requirements and expose them as storage pools, which the customers can use to store data. Physically, the resource may span multiple servers and multiple locations. The safety of the data depends upon the hosting companies and on the applications that leverage the cloud storage.[2]


Cloud storage is based on highly virtualized infrastructure and has the same characteristics as cloud computing in terms of agility, scalability, elasticity, and multi-tenancy. It is available both off-premises and on-premises.

While it is difficult to declare a canonical definition of cloud storage architecture, object storage is reasonably analogous. Cloud storage software such as OpenStack Cinder[2], cloud storage products such as EMC Atmos®[3] and Hitachi Content Platform, and distributed storage research projects such as OceanStore or VISION Cloud are examples of storage with these cloud characteristics and suggest the following guidelines.

Cloud storage is:

 Made up of many distributed resources but still acts as one; often referred to as federated storage clouds.

 Highly fault tolerant through redundancy and distribution of data.

 Highly durable through the creation of versioned copies.

 Typically eventually consistent with regard to data replicas.[4]

Introduction to EMC Atmos [5]


Many people use the public cloud applications provided by Dropbox and other service providers to leverage cloud storage options. Public cloud storage gives users the flexibility to store their data in one place and retrieve it from different devices.

Public Cloud storage providers[6]

Advantages of Storage as a Service

 Less capital expense.

 Elasticity.

 Depending on project and application cost, storage capacity can be adjusted to meet changing needs.

 No need to think about storage maintenance.

 Can leverage features provided by the storage service provider, such as data encryption, recovery, and backup.

Disadvantages of Storage as a Service

 Risk of unauthorized physical access to data that may be spread across multiple locations.

 Risk of supplier instability.

 Changing the storage vendor may delay production and increase project cost.


Limitations with vendor-based cloud service

A cloud computing platform may need to accommodate resources from different service providers or vendors and then assemble them into one working environment.

Possible limitations for using vendor-based services include:

 Inappropriate costs – Some vendors charge extra for their built-in cloud functionality and services, for example for data transfer. Organizations should select a vendor and its features only after a proper analysis of project needs.

 Compatibility Issues – Compatibility issues across a vendor's own products can be a cause for concern. These should be taken into consideration prior to vendor selection.

 Inflexibility – Take care when choosing a cloud computing vendor not to lock the business in to proprietary applications or formats; for example, a document created in another application may not import cleanly into a Google Docs spreadsheet. Vendors should ensure ease of provisioning and de-provisioning cloud computing users.

 Vendor Lock-In – Vendor lock-in occurs when an organization becomes overly reliant on a single vendor for too many solutions and/or services. Organizations face this problem when they need to migrate from their current service provider to another vendor. Sometimes the migration cost becomes so high that they are unable to migrate, which creates a serious problem for the environment.[7]

 Regulatory and Compliance Restrictions – In some countries, government regulations do not allow a customer's personal information and other sensitive information to be physically located outside the state or country. In that case, organizations must select vendors based on the services they provide in the particular zone where they want to do business.

 Device Integration – Integrating peripherals such as printers and mobile devices with a vendor-specific cloud service can be challenging. Organizations should collect all the necessary information regarding peripheral compatibility from the service providers.[8]

 Understanding – A vendor might not disclose its internal implementation details, which can become a problem for organizations.

 Support Issues – Getting good support from vendors can be challenging for organizations. Many companies have a slow turnaround time when answering customer questions.


OpenStack – An Open Source Cloud Computing platform

As cloud computing platforms evolved, cloud service providers began to offer different features that were not common across all vendors. This made it difficult for organizations to choose the right service provider.

In the year 2010, Rackspace and NASA jointly launched an open source cloud computing platform known as OpenStack. The OpenStack project intended to help organizations offer cloud computing services running on standard hardware.

The technology consists of a series of interrelated projects that control pools of processing, storage, and networking resources throughout a data center. These resources can be managed or provisioned through a web-based dashboard, command-line tools, or a Representational State Transfer (REST) API.

OpenStack[9]

Mission statement by the OpenStack community

“The OpenStack Open Source Cloud Mission: to produce the ubiquitous Open Source Cloud Computing platform that will meet the needs of public and private clouds regardless of size, by being simple to implement and massively scalable.”[10]

According to Rackspace, “OpenStack is an open and scalable operating system for building public and private clouds. It provides both large and small organizations an alternative to closed cloud environments, reducing the risks of lock-in associated with proprietary platforms. OpenStack offers flexibility and choice through a highly engaged community.”[11]


A worldwide community of more than 6,000 developers from more than 200 companies participates in the OpenStack program. The project is managed by the OpenStack Foundation, a non-profit corporate entity established in September 2012.

The community collaborates around a six-month, time-based release cycle with frequent development milestones. During the planning phase of each release, the community gathers for the design summit to facilitate developer working-sessions and to assemble plans.[12]

Overview of OpenStack

 Provides for both public and private clouds.

 Simple to implement.

 Feature rich.

 Total cloud infrastructure solution – Combination of interrelated projects.

 Flexibility – Modular design which helps in integrating third party applications.

 Compatibility – Very easily portable on different platforms.

 Well supported – Supported by a large community of enthusiastic developers.

 Scalability and elasticity are the main goals.

 All OpenStack components are horizontally scalable.

 Distributed approach.

 Test every implementation with the complete work-flow.

 Being open source, it gives a better understanding of how features are implemented with third-party applications.

 No vendor lock-in.


Cinder – OpenStack Block Storage Service

Cinder is the OpenStack component responsible for block storage functionality. It exposes an Application Programming Interface (API) that is used to integrate different block storage devices into an OpenStack deployment. Cinder provides persistent block storage resources that OpenStack Compute instances can consume, including secondary attached storage similar to the Amazon Elastic Block Store (EBS) offering. In addition, images can be written to a Block Storage device for Compute to use as a bootable persistent instance.[13]

Cinder block storage services are delivered using the following Cinder services:

 cinder-api - A Web Server Gateway Interface (WSGI) application that authenticates and routes requests throughout the Block Storage service. It supports the OpenStack APIs only, although a translation can be done through Compute's Amazon Elastic Compute Cloud (EC2) interface, which calls in to the cinderclient.

 cinder-scheduler - Schedules and routes requests to the appropriate volume service. Depending upon the configuration, this may be simple round-robin scheduling to the running volume services, or it can be more sophisticated through the use of the Filter Scheduler. The Filter Scheduler enables filtering on criteria such as storage capacity, availability zone, volume types, and capabilities, as well as custom filters.

 cinder-volume - Manages Block Storage devices, specifically the back-end devices themselves.

 cinder-backup - Provides a means to back up a Block Storage volume to the OpenStack Object Store (Swift).
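The filtering idea behind the Filter Scheduler can be illustrated with a minimal Python sketch. This is not Cinder's scheduler code: the Backend fields, the function names, and the final "most free capacity wins" choice are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical view of one cinder-volume back end."""
    name: str
    free_capacity_gb: float
    availability_zone: str


def capacity_filter(backend: Backend, size_gb: float) -> bool:
    # Keep only back ends with enough free space for the request.
    return backend.free_capacity_gb >= size_gb


def zone_filter(backend: Backend, zone: Optional[str]) -> bool:
    # No requested zone means any zone is acceptable.
    return zone is None or backend.availability_zone == zone


def schedule(backends: List[Backend], size_gb: float,
             zone: Optional[str] = None) -> Optional[Backend]:
    """Apply every filter, then pick the candidate with the most free space."""
    candidates = [b for b in backends
                  if capacity_filter(b, size_gb) and zone_filter(b, zone)]
    if not candidates:
        return None
    return max(candidates, key=lambda b: b.free_capacity_gb)
```

A custom filter would simply be one more predicate added to the list comprehension in schedule().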

Cinder Architecture [14]


Cinder can be deployed on the OpenStack platform in the following ways:

 Single-node deployment: All the Cinder services run on the controller node only.

 Multi-node deployment: All Cinder services except cinder-volume run on the controller node, and the cinder-volume service runs on a standalone server or is distributed among several standalone servers as required.

Characteristics of Cinder

 Manages the block storage part of OpenStack.

 Volumes that are created can be attached to Virtual Machine (VM) instances.

 VM instances can boot from a volume.

 Cinder provides VM instances with block storage volumes that persist even when the instances they are attached to are terminated. Volumes can exist independently of VM instances.

 A single block volume can be attached to only one VM instance at any one time, but it can be detached from one VM instance and attached to another. The best analogy is a USB drive, which can be attached to one machine and then moved to a second machine with the data on the drive staying intact.

 A VM instance can attach multiple volumes.
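The attachment rules listed above can be captured in a small toy model. The sketch below is not Cinder code; the Volume class and the terminate_instance helper are hypothetical names used only to show the single-attachment rule and the fact that a volume outlives the instance it was attached to.

```python
from typing import List, Optional


class Volume:
    """Toy model of a persistent block volume (not Cinder code)."""

    def __init__(self, name: str, size_gb: int) -> None:
        self.name = name
        self.size_gb = size_gb
        self.attached_to: Optional[str] = None  # instance ID, or None

    def attach(self, instance_id: str) -> None:
        # A volume may be attached to only one instance at a time.
        if self.attached_to is not None:
            raise RuntimeError(
                f"{self.name} is already attached to {self.attached_to}")
        self.attached_to = instance_id

    def detach(self) -> None:
        self.attached_to = None


def terminate_instance(instance_id: str, volumes: List[Volume]) -> None:
    # Terminating an instance detaches its volumes; the volumes survive.
    for vol in volumes:
        if vol.attached_to == instance_id:
            vol.detach()
```

After terminate_instance() runs, the same Volume object can be attached to a new instance, which mirrors the USB drive analogy.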

In a Cloud platform such as OpenStack, persistent block storage has several potential use cases:

 If you need to terminate and re-launch an instance, you can keep any “non-disposable” data on Cinder volumes and re-attach them to the new instance.

 If an instance misbehaves or “crashes” unexpectedly, you can launch a new instance and attach Cinder volumes to that new instance with data intact.

 If a compute node crashes, you can launch new instances on surviving compute nodes and attach Cinder volumes to those new instances with data intact.

 Using a dedicated storage node or storage subsystem to host Cinder volumes, capacity can be provided that is greater than what is available via the direct-attached storage in the compute nodes.

 A Cinder volume can be used as the boot disk for a Cloud instance; in that scenario, an ephemeral disk is not required.[14]


Cinder API functionalities

The following operations can be performed via Cinder-API:

 Volume Create/Delete

 Snapshot Create/Delete

 Volume-clone creation

 Volume Attach/Detach (via compute node)

 Backup Create/Restore

 Volume-types

 Volume-quota

 Copy image to volume

 Copy volume to image

 Extend volume

Enterprise Storage Solutions with OpenStack Cinder

Cinder also supports drivers that enable Cinder volumes to be created and presented using storage solutions from vendors such as EMC, NetApp, Ceph, GlusterFS, etc. These solutions can be used in place of Linux servers with commodity storage leveraging LVM.[14]


Enterprise Storage Solutions [14]

Since each storage provider has its own way of presenting storage to clients, this driver-based approach works well. Each storage provider can write its own code for the Cinder API functionality without interfering with the others.

Since the different storage providers are working on a common open source platform, they can share high-level descriptions and ideas with one another and can collaborate if they face implementation problems.

EMC and OpenStack?

EMC is one of the market leaders in providing software-defined storage solutions. EMC’s flagship products such as VMAX®, VNX®, and VPLEX® are well known in the corporate world for the class of storage solutions they provide.

EMC’s Atmos® product provides a highly efficient cloud storage solution for organizations.


OpenStack is an open source cloud computing platform which uses object and block storage features from the storage perspective.

EMC should contribute to the open source program for the following reasons:

 EMC can provide world-class block storage and object storage solutions.

 EMC has a strong support community for organizations.

 Existing customers of EMC are moving toward OpenStack and want EMC storage solutions for OpenStack.

 By providing storage solutions to an open source community, EMC can increase its customer base.

 EMC can leverage the features provided by the OpenStack community.

 Via the open source program, we can interact with our counterparts to get suggestions and high-level views on the challenges we may face while contributing to OpenStack.

 It will be a good platform for learning and for facing the upcoming challenges in the world of cloud.

Writing an OpenStack Cinder Driver with EMC storage

In this section, I will walk through the process of creating a Cinder driver, using the EMC Direct driver for VNX as a reference.

We used the EMC NaviSecCli command-line tool to implement the Cinder API functionality on the OpenStack platform. NaviSecCli is used to send status or configuration requests to a storage system through the command line. This enables us to execute different commands to communicate with the VNX storage and to perform storage-related operations.
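As a rough illustration of how a driver might invoke the tool and check its result, here is a minimal Python sketch. The executable name naviseccli and the connection options (-h, -user, -password, -scope) are assumptions that should be checked against the NaviSecCli documentation for your release; build_command, run_subprocess, and cli_execute are hypothetical helper names, not part of the EMC driver.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Runner = Callable[[List[str]], Tuple[int, str]]


def build_command(sp_ip: str, user: str, password: str,
                  args: Sequence[str]) -> List[str]:
    # Assumed connection options; verify against your NaviSecCli documentation.
    return ["naviseccli", "-h", sp_ip, "-user", user,
            "-password", password, "-scope", "0", *args]


def run_subprocess(argv: List[str]) -> Tuple[int, str]:
    """Run the command and return (return code, combined output)."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return proc.returncode, proc.stdout


def cli_execute(sp_ip: str, user: str, password: str, args: Sequence[str],
                runner: Runner = run_subprocess) -> str:
    """Execute one NaviSecCli command; raise if it reports a failure."""
    argv = build_command(sp_ip, user, password, args)
    rc, out = runner(argv)
    if rc != 0:
        # Report only the sub-command, never the credentials.
        raise RuntimeError(
            f"command {list(args)} failed (rc={rc}): {out.strip()}")
    return out
```

Passing the runner as a parameter keeps the return-code handling testable on a machine where the CLI is not installed.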


NaviSecCli Implementation Diagram

For writing a Cinder driver, there are certain functionalities which must be implemented to make the OpenStack platform use the underlying storage efficiently.

List of operations to be performed on the EMC VNX via OpenStack

Operation to be performed on OpenStack    Operation performed on VNX
Create/Delete Volume                      Create/Destroy LUN
Create/Delete Snapshot                    Create/Destroy SNAP
Create Cloned Volume                      Create clone of a LUN
Create Volume from Snapshot               Create LUN from a SNAP
Attach/Detach Volume                      Associate/Disassociate LUN to/from an instance
Extend Volume                             LUN expansion
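The mapping in the table above could be recorded in code as a simple lookup. In the sketch below, the keys follow the method names commonly used by Cinder volume drivers and the values repeat the VNX operations from the table; the CINDER_TO_VNX dictionary and the vnx_operation helper are hypothetical names for illustration.

```python
# Keys follow Cinder volume driver method names; values are the VNX
# operations listed in the table above.
CINDER_TO_VNX = {
    "create_volume": "Create LUN",
    "delete_volume": "Destroy LUN",
    "create_snapshot": "Create SNAP",
    "delete_snapshot": "Destroy SNAP",
    "create_cloned_volume": "Create clone of a LUN",
    "create_volume_from_snapshot": "Create LUN from a SNAP",
    "initialize_connection": "Associate LUN to an instance",
    "terminate_connection": "Disassociate LUN from an instance",
    "extend_volume": "LUN expansion",
}


def vnx_operation(driver_method: str) -> str:
    """Look up the VNX operation behind a driver method name."""
    try:
        return CINDER_TO_VNX[driver_method]
    except KeyError:
        raise NotImplementedError(
            f"{driver_method} is not covered here") from None
```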


Details of NaviSecCli commands used to implement the Cinder functionalities:

 Create Volume

The “lun -create” command of the NaviSecCli tool is executed to create a volume of the specified size.

lun -create -capacity lunCapacity -sq gb -poolId storagePoolID -name lunName

 Delete Volume

The “lun -destroy” command of the NaviSecCli tool is executed to delete the specified volume.

lun -destroy -name lunName -forceDetach -o

 Create Snapshot

The “snap -create” command of the NaviSecCli tool is executed to create the specified snapshot.

snap -create -res lunID -name snapName -allowReadWrite yes

 Delete Snapshot

The “snap -destroy” command of the NaviSecCli tool is executed to delete the specified snapshot.

snap -destroy -id snapName -o

 Find LUN ID

The “lun -list” command of the NaviSecCli tool is executed, and its output is parsed to fetch the LUN ID of the specified LUN.

lun -list -name lunName
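Parsing the LUN ID out of the command output might look like the sketch below. The output layout is an assumption (a line of the form "LOGICAL UNIT NUMBER <id>" followed by name and other properties), so the pattern should be adjusted to whatever your array actually prints; parse_lun_id is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumed output layout, for example:
#   LOGICAL UNIT NUMBER 12
#   Name:  vol-001
_LUN_ID_RE = re.compile(r"^LOGICAL UNIT NUMBER\s+(\d+)\s*$", re.MULTILINE)


def parse_lun_id(output: str) -> Optional[int]:
    """Return the LUN ID found in 'lun -list' output, or None if absent."""
    match = _LUN_ID_RE.search(output)
    return int(match.group(1)) if match else None
```

Returning None rather than raising lets the caller decide whether a missing LUN is an error (delete) or an expected state (existence check).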

 Create Volume from Snapshot

Sequence of commands to be executed for creating a volume from snapshot:

o Create a mount point for the volume:

lun -create -type Snap -primaryLunName src_lunName -sp A -name new_lunName

o Attach the snapshot to the mount point:

lun -attach -name new_lunName -snapName snapshotname


o Migrate the mount point LUN:

migrate -start -source src_lun_id -dest temp_lun_id -rate ASAP -o
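The three-command sequence above can be assembled programmatically before being handed to whatever executes NaviSecCli. The sketch below only builds the argument lists in order, using the same placeholders as the commands in this section; the function name and parameter names are hypothetical.

```python
from typing import List


def volume_from_snapshot_commands(src_lun_name: str, new_lun_name: str,
                                  snapshot_name: str, src_lun_id: int,
                                  temp_lun_id: int) -> List[List[str]]:
    """Return the three NaviSecCli argument lists in execution order."""
    return [
        # 1. Create a mount point for the new volume.
        ["lun", "-create", "-type", "Snap",
         "-primaryLunName", src_lun_name, "-sp", "A", "-name", new_lun_name],
        # 2. Attach the snapshot to the mount point.
        ["lun", "-attach", "-name", new_lun_name,
         "-snapName", snapshot_name],
        # 3. Migrate the mount point LUN.
        ["migrate", "-start", "-source", str(src_lun_id),
         "-dest", str(temp_lun_id), "-rate", "ASAP", "-o"],
    ]
```

Keeping each command as a list of arguments, rather than one string, avoids quoting problems when LUN or snapshot names contain spaces.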

 Create Cloned Volume

Steps to create a cloned volume:

1. Create a temporary snapshot of the source volume.

2. Create the target volume using the temporary snapshot. (as explained in the section above)

3. Delete the temporary snapshot.
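The three steps can be sketched as a small orchestration function. This is an illustration rather than the EMC driver's code: the three callables stand in for the snapshot and volume operations described in this section, and the temporary snapshot naming scheme is an assumption.

```python
from typing import Callable


def create_cloned_volume(
        source: str, target: str,
        create_snapshot: Callable[[str, str], None],
        create_volume_from_snapshot: Callable[[str, str], None],
        delete_snapshot: Callable[[str], None]) -> None:
    """Clone 'source' into 'target' through a temporary snapshot."""
    temp_snap = f"tmp-snap-for-{target}"  # hypothetical naming scheme
    # Step 1: temporary snapshot of the source volume.
    create_snapshot(source, temp_snap)
    try:
        # Step 2: build the target volume from that snapshot.
        create_volume_from_snapshot(target, temp_snap)
    finally:
        # Step 3: runs even if step 2 fails, so no snapshot is left behind.
        delete_snapshot(temp_snap)
```

A real driver would probably also need to wait for any background migration to finish before deleting the snapshot; that wait is omitted here.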

 Get Volume Stats

Finds the current storage pool usage and returns it to the caller for informational purposes.

storagepool -list -id POOLID -userCap -availableCap
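Turning the pool capacity output into the statistics a scheduler can use might look like the following sketch. The output labels ("User Capacity (GBs)" and "Available Capacity (GBs)") are assumptions about what the command prints, and parse_pool_stats is a hypothetical helper; the dictionary keys follow the names Cinder drivers conventionally report.

```python
import re
from typing import Dict, Union


def _capacity_gb(label: str, output: str) -> float:
    # Assumed line format: "<label> (GBs):  1234.567"
    pattern = rf"^{re.escape(label)} \(GBs\):\s*([\d.]+)"
    match = re.search(pattern, output, re.MULTILINE)
    if match is None:
        raise ValueError(f"'{label} (GBs)' not found in output")
    return float(match.group(1))


def parse_pool_stats(output: str) -> Dict[str, Union[str, float]]:
    """Convert 'storagepool -list' output into a stats dictionary."""
    return {
        "storage_protocol": "iSCSI",
        "total_capacity_gb": _capacity_gb("User Capacity", output),
        "free_capacity_gb": _capacity_gb("Available Capacity", output),
    }
```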

 Initialize Connection

In initialize_connection, a volume is mapped to a compute node, followed by an iSCSI discovery. The volume’s iSCSI properties, including the IQN of the iSCSI target, the portal of the iSCSI target, the LUN of the iSCSI target, and the volume ID, are then returned to the caller.

1. get_storage_group:

a) This function first checks whether a storage group already exists for the hostname provided.

storagegroup -list -host

b) If the storage group does not exist, it creates a new storage group and connects it to the host.

storagegroup -create -gname storagegrpname

storagegroup -connecthost -host hostname -gname storagegrpname


2. find_device_details: This function returns a dictionary of hostlunid, ownersp, and lunmap using these commands:

storagegroup -list -gname storagegrpname

lun -list -l LUNID -owner

3. _add_lun_to_storagegroup: This function adds the LUN to the storage group using this command:

storagegroup -addhlu -o -gname storagegrpname -hlu HOSTLUNID -alu LUNID
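The properties returned to the caller at the end of initialize_connection can be pictured as a nested dictionary. The sketch below shows the shape conventionally used for iSCSI connections in Cinder; build_iscsi_connection_info is a hypothetical helper name, and the default portal port of 3260 is the standard iSCSI port rather than a value taken from this article.

```python
from typing import Any, Dict


def build_iscsi_connection_info(volume_id: str, target_iqn: str,
                                portal_ip: str, host_lun_id: int,
                                portal_port: int = 3260) -> Dict[str, Any]:
    """Assemble the iSCSI properties handed back to the caller."""
    return {
        "driver_volume_type": "iscsi",
        "data": {
            "target_discovered": True,
            "target_iqn": target_iqn,            # IQN of the iSCSI target
            "target_portal": f"{portal_ip}:{portal_port}",
            "target_lun": host_lun_id,           # host LUN ID in the group
            "volume_id": volume_id,
        },
    }
```

The host LUN ID here is the value assigned with -hlu when the LUN was added to the storage group.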

 Terminate Connection

In terminate_connection, a volume is unmapped from a compute node.

1. get_storage_group: This function first checks whether a storage group already exists for the hostname provided.

storagegroup -list -host

2. find_device_details: This function returns a dictionary of hostlunid, ownersp, and lunmap using these commands:

storagegroup -list -gname storagegrpname

lun -list -l LUNID -owner

3. _remove_lun_from_storagegroup: This function removes the LUN from the storage group using this command:

storagegroup -removehlu -gname storagegrpname -hlu HOSTLUNID -o


 Find iSCSI protocol end points

The function _find_iscsi_protocol_endpoints returns the iSCSI initiators for a particular SP.

connection -getport -sp device_sp

 Extend Volume

The “lun -expand” command of the NaviSecCli tool is executed to extend the volume.

lun -expand -name lunName -capacity new_size -sq gb -o -ignoreThresholds
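A driver would typically validate the requested size before issuing the expand command. The sketch below builds the argument list shown above and rejects requests that do not grow the volume; extend_volume_command is a hypothetical helper, and the size check is an assumption about sensible driver behavior rather than something the CLI is known to require.

```python
from typing import List


def extend_volume_command(lun_name: str, current_size_gb: int,
                          new_size_gb: int) -> List[str]:
    """Build the 'lun -expand' argument list after a basic sanity check."""
    if new_size_gb <= current_size_gb:
        raise ValueError("new size must be larger than the current size")
    return ["lun", "-expand", "-name", lun_name,
            "-capacity", str(new_size_gb), "-sq", "gb",
            "-o", "-ignoreThresholds"]
```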

Summary

Integrating an open source cloud platform with EMC enterprise storage was a good learning experience.

Executing the NaviSecCli commands, getting back the return code, and processing the command output was challenging because the functionality had to work across platforms.

Advantages I realized from OpenStack:

 Managing the storage arrays through a different API, which looked simpler and more robust.

 The user interface was easy to understand.

 Easy to perform volume-related operations.

 Easy to provision a volume to a VM instance.

 Provision to create an image of a VM instance and use that image to launch a new VM instance.

 Provision to copy an image to a volume and boot an instance using that volume.

 Focus only on the storage part; the rest is taken care of by the OpenStack community.

 Good support for any issues I faced during the implementation phase.

 Free to interact with other solution providers to understand a common issue.


References

[1] http://en.wikipedia.org/wiki/Cloud_computing

[2] https://wiki.openstack.org/wiki/Cinder

[3] http://en.wikipedia.org/wiki/EMC_Atmos

[4] http://en.wikipedia.org/wiki/Cloud_storage

[5] http://www.emc.com/R1/images/EMC_Image_C_1310596345255_atmos-capabilities,1.jpg

[6] http://blog.webafrica.co.za/wp-content/uploads/2013/06/CloudStorage.jpg

[7] http://www.linuxinsider.com/story/79417.html

[8] http://www.superb.net/blog/2013/03/04/top-9-disadvantages-of-cloud-computing/

[9] http://upload.wikimedia.org/wikipedia/en/thumb/4/4c/OpenStack.png/170px-OpenStack.png

[10] https://wiki.openstack.org/wiki/Main_Page

[11] http://www.rackspace.com/cloud/openstack/

[12] http://en.wikipedia.org/wiki/OpenStack

[13] http://docs.openstack.org/havana/config-reference/content/section_block-storage-overview.html

[14] http://cloudarchitectmusings.com/2013/11/18/laying-cinder-block-volumes-in-openstack-part-1-the-basics/

[15] http://www.openstack.org/software/openstack-storage/

[16] http://www.openstack.org/downloads/openstack-object-storage-datasheet.pdf


EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
