MEETING BIG DATA CHALLENGES WITH EMC ISILON STORAGE SYSTEMS
Anuj Sharma

Contents
Abstract
Big Data Challenges and EMC Isilon Storage Systems
Big Data Value-Add To Business
OneFS Architecture
Which EMC Isilon Cluster to choose?
EMC Isilon Cluster Networking Best Practices
EMC Isilon SmartConnect Internals
SmartConnect Architecture Example
EMC Isilon SmartQuotas Internals
EMC Isilon and vSphere Integration Best Practices
EMC Isilon SyncIQ Architecture and Tips
EMC Isilon NDMP Backup Configuration for EMC NetWorker
Cluster Performance Tuning
EMC Isilon Cluster Maintenance
References
Glossary

Figures
Figure 1: Data Growth
Figure 2: Big Data Sources
Figure 3: Isilon Node Types
Figure 4: Traditional Property and Casualty Insurance Policy Premium Factors
Figure 5: Big Data Enabled Property and Casualty Insurance Policy Premium Factors
Figure 6: OneFS vs. Traditional File Systems
Figure 7: OneFS vs. Traditional File Systems
Figure 8: Isilon Cluster
Figure 9: OneFS Protection
Figure 10: 10GigE Networking with Accelerator Node
Figure 11: Redundant Internal Network Topology
Figure 12: SmartConnect Communication
Figure 13: SmartConnect Architecture Quality of Service
Figure 14: SmartConnect Configuration
Figure 15: Optimizing Isilon NFS for VM I/O Operations
Figure 16: Isilon NFS Architecture
Figure 17: Isilon iSCSI Architecture
Figure 18: SyncIQ
Figure 19: Direct NDMP Method
Figure 20: Remote NDMP Model
Figure 21: isi tape ls -v output

Disclaimer: The views, processes, or methodologies published in this article are those of the author. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.


Abstract
What is Big Data? Big Data does not refer to a specific type of data; every kind of unstructured data can be considered big data when a single file runs to terabytes in size. Digital data is growing at an exponential rate today, and "big data" is the new buzzword in IT circles. According to International Data Corporation (IDC), the amount of digital data created and replicated will surpass 1.8 zettabytes (1.8 trillion GB) in 2011, having grown by a factor of nine in just five years. The information we deal with today is very different from the information we dealt with 20-30 years ago. Chip manufacturers render terabyte-sized files, oil and gas exploration companies deal with terabytes of data to be analyzed, advancements in healthcare have led to the creation of high-definition 4-D imaging files that reach terabytes in size, and NASA deals with large numbers of terabyte-scale files. Social networking sites and online communities generate huge volumes of data. It seems no industry is safe from massive data growth, and the storage implications are profound.

Figure 1: Data Growth

The massive amount of rich unstructured file data generated by richer file formats and Internet Era computing is creating a demand for new and innovative scale-out file storage solutions to economically scale bandwidth and performance to previously unheard of capacities.


I have seen many organizations relying on scale-up storage for Big Data storage and analytics. However, in the long run, companies are bound to encounter performance and storage problems when using scale-up storage systems for Big Data. Scale-up storage systems are monolithic: a large amount of storage sits behind one or two file server heads and is designed to scale to the multi-terabyte range behind those heads. Once the storage and performance limits are reached, a new monolithic storage system must be added. Because the existing file systems residing on the previous scale-up system cannot be expanded to leverage the storage of the new system, a new file system needs to be managed, even when only minimal incremental capacity is required. This is one of the problems enterprises run into when dealing with Big Data on scale-up storage systems. A single file system is limited to terabytes on scale-up systems, and file system migration is often a painful exercise that requires downtime.

Also, traditional analysis tools cannot be used to analyze Big Data. Data needs to be mined in real time and results need to be published. For example, a retail chain can see instantaneously which stores are most profitable, which items are in demand, the consumer choices per region, and so on. Big Data analysis is a critical factor in the future business decisions of an organization, and parallel processing is required to mine data of such huge volumes simultaneously. Consequently, scale-out storage systems are the best candidates for highly parallel processing.

Scale-out storage architectures are significantly different from the monolithic scale-up storage architectures (e.g., traditional NAS or SAN systems) that were developed to meet distributed computing needs. Scale-out NAS systems are designed from the ground up to scale economically and dynamically and to support extremely high-bandwidth applications dealing with multi-terabyte files, often referred to as Big Data. The EMC Isilon® storage system is the world leader in the scale-out NAS category.


Figure 2: Big Data Sources

In this article we will discuss why scale-up storage is not able to meet the performance, cost, and storage requirements of Big Data efficiently and how scale-out storage successfully meets all the requirements of Big Data. We also discuss an example of how Big Data can add value to business. Additionally, the following areas will be covered:

• Big Data Challenges
• EMC Isilon Storage Systems and OneFS® Architecture
• EMC Isilon Storage Systems features such as SyncIQ, SmartConnect, and SmartQuotas
• Best practices for the implementation of EMC Isilon clustered storage systems
• Best practices for vSphere 4 integration with EMC Isilon storage systems
• Cluster Maintenance
• EMC Isilon NDMP Backup Configuration

And much more …


Big Data Challenges and EMC Isilon Storage Systems

• Unstructured data is being generated at exponential rates

The pace of data growth requires storage that can scale on demand. Typically, storage is purchased with a view to future or peak requirements, and we often end up spending more because the cost of equipment declines over time. EMC Isilon provides the benefit of adding storage on demand instead of buying it all at once: start with a minimal number of nodes and then scale out over time.

• Seismic applications, NASA satellite imaging, and high-performance video rendering applications require storage that supports very high IOPS and data transfer throughput

EMC Isilon has a scale-out architecture; whenever a node is added, storage and compute are added with it. Hence, compute increases linearly as nodes are added to an EMC Isilon cluster, and EMC accelerator nodes can be added to further increase compute and data transfer throughput. For example, using Isilon IQ, NASA has been able to consolidate more than 8,000 large Landsat 7 satellite images into a single volume and single file system while providing high-performance, concurrent access to i-cubed geoprocessing applications.

• File system size required in petabytes
Some Big Data applications require that petabyte-scale data be stored in a single file system. EMC Isilon OneFS spans the nodes of an EMC Isilon cluster and presents the application with a single petabyte-scale file system spread across those nodes.

• Big Data analytics requirements
Analyzing Big Data in real time requires storage that can withstand the simultaneous read and write load of the analytical engine. Owing to the architecture of EMC Isilon, data can be analyzed almost in real time, so organizations can look at the analytics as they happen and make quick decisions. EMC Isilon addresses the requirements of Big Data analytical applications.


For example, the cost of oil and gas exploration continues to skyrocket, making the rapid analysis of exploratory data essential in order to stay ahead of competitors. Companies cannot afford to have crews sit idle while drill/no-drill decision-making data is being analyzed. To speed up data analysis and computational workflows, oil and gas organizations' analytical applications require storage such as EMC Isilon, with the latest multi-core processors, petabytes of capacity, the fastest available networking, and the intelligence to divide and handle workloads across an array of compute nodes.

• Data transfer patterns for Big Data can be random or sequential
Depending on the dominant transfer pattern, organizations can choose from different Isilon models, providing the flexibility to select the model that best meets their requirements.

• Tight backup windows
As data grows exponentially, so does the time required to back it up. EMC Isilon offers NDMP accelerator nodes that increase NDMP backup throughput, thus reducing the backup window.

• Isilon Solves Media Industry Challenges
  o Rendering/compositing/encoding jobs no longer need to be scheduled and queued based on storage limitations.
  o Artists do not need to wait for one job to complete before another can begin. Nor do they need to determine what volume or drive a particular file resides on or where it should be written.
  o Data wranglers are no longer needed to manually move files and processes from over-taxed drives.
  o No downtime is required when more space is added.
  o And most importantly, heavy workloads with concurrent access patterns will not degrade the performance of an Isilon IQ cluster.

• High Performance Computing Challenges Solved
High Performance Computing (HPC) applications need multiple processors, memory modules, and data paths. HPC needs parallel data services, which break up single files and deliver them in pieces in parallel. Isilon meets these requirements: multiple computing nodes can access data in parallel from the cluster nodes and perform the desired operations on the data efficiently and at


faster rates. Isilon prevents storage from becoming a bottleneck in High Performance Computing.

EMC Isilon is designed to address all of the Big Data challenges above. Organizations can mix and match hardware elements depending on specific needs. For example, the Isilon S-Series delivers the performance needed for IOPS-intensive applications, the X-Series is ideal for highly concurrent and sequential throughput workflows, and the NL-Series provides economical storage that enables organizations to keep data online and available for longer periods of time. This article will look at implementing EMC Isilon features using best practices to get the best performance out of EMC Isilon systems.

Figure 3: Isilon Node Types


Big Data Value-Add To Business
Business houses, corporations, and enterprises are dealing with huge amounts of unstructured data. This data can turn out to be a real value-add in terms of revenue. There are many use cases where Big Data can do wonders for an organization.

The insurance industry can benefit from Big Data analytics by analyzing the large amount of data almost in real time.

Typically, to generate a quote, an insurance company will judge the premium by application form and the credit history of the individual. Thus, insurance companies depend on the data that the applicant fills in and the credit history.


Figure 4: Traditional Property and Casualty Insurance Policy Premium Factors

Now, with the power of Big Data analytics, insurance companies can analyze the factors below to decide on the insurance premium.


Figure 5: Big Data Enabled Property and Casualty Insurance Policy Premium Factors


Suppose a request is made for an auto insurance quote. The insurance company can use big data analytics to calculate the insurance premium by analyzing the data points below that are beyond the typical application form data.

• An individual purchases a new car and requests an insurance premium quote. His previous car, insured by the same insurance company, has been fitted with a telemetric device.
• The telemetric device provides data that the insurance company can use to derive data points such as the speed at which the driver drives the car, accidents, fuel economy, rapid acceleration, average speed, highway speeds, and city speeds. The big data analytical software can grade the insurance seeker by comparing these data points with the poor, average, good, and excellent benchmarks that have been decided by the insurance company. For example, they can set ratings (a safe driver has a 5-star rating, a poor driver a 1-star rating) and factor these data points in while calculating the premium.
• Consequently, companies are able to increase or decrease the premium amount according to the driving behavior of the insurance quote seeker.
• In addition to the data points above, the analytical software can take data from social networking sites such as Facebook, Twitter, and YouTube for the insurance premium quote seeker. For example, their YouTube activities may show that they liked or shared videos related to Formula 1 racing or car stunts, which would influence the insurance quote calculation. Similarly, data from Facebook status updates such as "car bumped into other car" or "touched 150 miles/hour on the highway" can also be used by the analytical software when calculating the premium.

User     Driving Rating   Social Networking Analysis Rating   Average Premium   Big Data Analytics Calculated Premium
User A   5                4                                   $800              $600
User B   1                1                                   $800              $1200

These examples provide an overview of how companies can use big data to add value to their business while also benefiting users. To store this big data efficiently and economically and analyze the big data in real time, EMC Isilon is the storage system that companies should consider.


OneFS Architecture

Figure 6: OneFS vs. Traditional File Systems

OneFS eliminates the need for a separate file system, volume manager, and RAID system. OneFS runs on a scale-out NAS architecture across the cluster of Isilon nodes, creating a single namespace and file system on each Isilon cluster. OneFS is spread across all the nodes in the cluster. All information is shared among nodes, and the entire file system is accessible by clients connecting to any node in the cluster. Because all nodes in the cluster are peers, the Isilon clustered storage system does not have any master or slave nodes. All data is striped across all nodes in the cluster. As nodes are added, the file system grows dynamically and content is redistributed. Each Isilon storage node contributes globally coherent RAM, meaning that as a cluster becomes larger, it also becomes faster; each time a node is added, the cluster's concurrent performance scales linearly.


Figure 7: OneFS vs. Traditional File Systems

• On-the-fly node expansion
Adding a new node requires no downtime and takes under 60 seconds. Scaling a cluster requires no reconfiguration, no changes to server or client mount points, and no application changes.
• OneFS file system scalability
OneFS can scale to 15.5 PB of storage in a single file system, so there is no need to create small volumes or logical units. As the cluster scales, Isilon AutoBalance™ migrates content to new storage nodes while the system is online and in production; there is never more than a 5% imbalance in used capacity between any nodes in the cluster. Data is automatically balanced across all nodes, reducing costs, complexity, and risk.


Figure 8: Isilon Cluster

• Separate Internal and External Networks Isilon clusters use separate internal and external networks, so each node in the cluster requires multiple network connections. Even the simplest non-redundant network topology requires two network connections per node—an internal network connection for intra-cluster communication and an external network connection for client traffic. The internal network, also called the back-end network, uses InfiniBand to connect the nodes in a cluster. InfiniBand is a switched-fabric I/O standard that offers high throughput and low latency. Essentially, the back-end network acts as the backplane of the cluster, enabling each node to act as a contributor to the whole. Clusters using an InfiniBand back end can grow to a maximum of 144 nodes.

• OneFS uses Reed-Solomon encoding to provide redundancy and high availability
As traditional storage systems scale, techniques that were appropriate at a small size become inadequate at a larger size, and there is no better example of this than RAID. RAID can only be effective if data can be reconstructed before another failure occurs. However, as the amount of data increases, the speed at which it can be accessed does not increase proportionally, and the probability of additional failures continues to grow. OneFS does not depend on hardware-based RAID technologies to provide data protection. Instead, OneFS includes a core technology, FlexProtect™, which is built on solid mathematical constructs and utilizes Reed-Solomon encodings to provide redundancy and availability.


FlexProtect provides protection for up to four simultaneous failures of either full nodes or individual drives, and as the cluster scales in size, FlexProtect delivers on the need to ensure minimal reconstruction time for an individual failure.
• OneFS uses FlexProtect to protect data in case of failure
FlexProtect, a key innovation in OneFS, takes a file-specific approach to data protection, storing protection information for each file independently. This independent protection allows protection data to be dispersed throughout the cluster along with the file data, dramatically increasing the potential parallelism for access and reconstruction when required. When a node or drive fails in an Isilon storage system, FlexProtect is able to identify which portions of which files are affected by the failure and employs multiple nodes to participate in the reconstruction of only the affected files. Since the AutoBalance feature in OneFS spreads files across the cluster, the number of spindles and CPUs available for reconstruction far exceeds what would be found in a typical hardware RAID implementation. In addition, FlexProtect doesn't need to reconstruct data back to a single spare drive (which would create a bottleneck); instead, the file data is reconstructed into available space, providing a virtual "hot spare".


Figure 9: OneFS Protection

• OneFS regularly checks the integrity of files and disks
OneFS monitors the health of all files and disks within the cluster; if components are at risk, the file system automatically flags the problem components for replacement and transparently reallocates the affected files to healthy components. OneFS also ensures data integrity if the file system has an unexpected failure during a write operation. Each write operation is transactionally committed to the NVRAM journal to protect against node or cluster failure. In the case of a write failure, the journal enables a node to rejoin the cluster quickly, without the need for a file system consistency check. With no single point of failure, the file system is also transactionally safe in the event of an NVRAM failure. Since the FlexProtect feature in OneFS is file aware, it also provides file-specific protection capabilities. An individual file (or, more typically, a directory) can be given a specific protection level, allowing different portions of the file system to be protected at levels aligned to the importance of the data or workflow. Critical data can be protected at a


higher level whereas less critical data can be protected at a lower level. This provides storage administrators with a very granular protection/capacity trade-off that can be adjusted dynamically as a cluster scales and a workflow ages.
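As an illustration of per-directory protection levels, the requested protection of a directory tree can be changed from the command line with isi set (the same command used later in this article for write caching). The paths below are hypothetical, and the -p flag and its values should be verified against your OneFS release:

isi set -p +3 -R /ifs/data/finance    # critical data: tolerate up to three simultaneous failures
isi set -p +2 -R /ifs/data/scratch    # less critical data: lower protection, less capacity overhead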

Which EMC Isilon Cluster to choose?
EMC Isilon cluster nodes are available for different workloads as per user requirements. Common use cases for the various nodes available are shown below.

Isilon IQ S-Series
• Suitable for applications that require maximum IOPS performance, such as organizations that broadcast real-time streams.
• Suitable where the primary storage requirement is mission-critical data with high random read/write transaction rates.
• Suitable for design and simulation environments that provide analysis of electronic and mechanical systems.
• Suitable for Big Data analytical engines that analyze data in real time as it is being stored.

Isilon IQ X-Series
• Suitable for mission-critical workloads with high concurrent and sequential data throughput rates.
• Ideal for virtualized and research environments that have large sequential I/O and throughput requirements.
• Well suited to companies that produce digital media or that deliver video, audio, and image services over the Internet.
• Best for organizations that perform heavy computation, such as life science companies, or organizations that provide virtualized environments.


Isilon NL-Series
• Nearline storage that can be used as a disk-based backup device.
• Can be used as an archiving target.
• Best used for nearline archiving of reference data that is kept for business and legal reasons for long durations.

Accelerator Nodes

• For applications requiring high-throughput such as uncompressed HD ingest, editing, and playback. Isilon Accelerator-x nodes should be used with 10GigE ports.

EMC Isilon Cluster Networking Best Practices
• Internal Networking Best Practices
  o Below are some EMC Isilon recommended InfiniBand switches that can be considered for internal node networking.
  o InfiniBand cables with copper connectors should be used.


• External Networking Best Practices
  o Below are some EMC Isilon recommended Ethernet switches that can be considered for external networking.
  o CAT-5e or CAT-6 cables with RJ-45 copper connectors should be used.
  o Isilon Accelerator-x nodes should use 10GigE ports for applications requiring high throughput, such as uncompressed HD ingest, editing, and playback.
  o Below are some EMC Isilon recommended 10GigE Ethernet switches.
  o EMC Isilon Accelerator nodes should use 10GigE CX4 copper cables.


Figure 10: 10GigE Networking with Accelerator Node

• Non-blocking Switch Fabric
A switch used for external connectivity must use a non-blocking switch fabric. A switch is non-blocking if the internal switching fabric can handle the theoretical total of all ports, such that any routing request to any free output port can be established successfully without interfering with other traffic. All switches recommended by Isilon have this feature.
• Switch Port Buffer Size
The switch must have a port buffer size of at least 1MB. At load, smaller port buffers will fill up, resulting in dropped packets that must be retransmitted, negatively impacting performance.
• Jumbo Frame Support
For best performance with most applications, jumbo frames should be enabled on the external network. "Jumbo frames" refers to a Maximum Transmission Unit (MTU) size of 9000 bytes, compared to the standard MTU of 1500 bytes. Jumbo frames allow more data to be transferred between network endpoints with fewer operations, which increases throughput for most applications.
• Internal Network High Availability
For redundancy, the redundant internal network topology should be deployed. In this topology, two InfiniBand switches are used; if a node fails over to the second switch, all nodes in the cluster also fail over to the second switch. The IP addresses used on the redundant internal switch must be on a different subnet than those on the primary switch.


Figure 11: Redundant Internal Network Topology

• Link Aggregation can be used for increased data throughput by aggregating the external interfaces of the Isilon Nodes together.

EMC Isilon SmartConnect Internals
SmartConnect™ is a software component that balances client connections across the Isilon cluster nodes, or across a selected set of nodes. It does this by providing a single virtual host name for clients to connect to, which greatly simplifies connection mapping.

SmartConnect comes in two versions: SmartConnect Basic and SmartConnect Advanced. SmartConnect Basic manages client connection balancing using a simple round robin policy. SmartConnect Advanced is a licensable module that provides multiple balancing methods.

Feature                               SmartConnect Basic       SmartConnect Advanced
Load Balancing - Round Robin          Yes                      Yes
Load Balancing - CPU Utilization      No                       Yes
Load Balancing - Connection Count     No                       Yes
Load Balancing - Network Throughput   No                       Yes
NFS Failover                          No                       Yes
IP Allocation                         Static                   Dynamic
SmartConnect Zones                    Single Zone per Subnet   Multiple Zones
Rebalance Policy                      No                       Yes
IP Failover Policy                    No                       Yes

How SmartConnect Works

1. The client sends a request to connect to the Isilon cluster via a hostname registered in the DNS server. This hostname is defined in the SmartConnect settings and is used by clients to access the Isilon cluster.
2. The DNS server determines that the SmartConnect name will be resolved by the SmartConnect-based delegation entry in the DNS and queries SmartConnect.
3. SmartConnect provides DNS with the IP address of a node, chosen according to the load balancing policy selected when configuring SmartConnect.
4. The DNS server replies to the client with the IP address provided by SmartConnect.
5. The client then connects to the Isilon cluster node via the IP address provided.



Figure 12: SmartConnect Communication
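To see this behavior from a client, the delegated SmartConnect zone name can simply be resolved repeatedly; with the round robin policy each lookup should return a different node IP. The zone name and addresses below are hypothetical, and the output is abbreviated:

# nslookup cluster.example.com
Name: cluster.example.com    Address: 192.168.1.12
# nslookup cluster.example.com
Name: cluster.example.com    Address: 192.168.1.13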

Configuring SmartConnect

• Entries should be made in the name servers.
  BIND server: In BIND, a new name server (NS) record needs to be added to the existing authoritative DNS zone, specifying the server of authority for the new sub-zone. An A record, referenced by that NS record, must also be added that points to the SmartConnect service IP (SIP) address of the cluster. For example, if the SmartConnect zone name is cluster.example.com, the DNS entries would look like:
  >> cluster.example.com IN NS sip.example.com
  >> sip.example.com IN A {IP address}
  Windows DNS Server: In the Microsoft DNS wizard, a "New Delegation" record is added in the forward lookup zone for the parent domain, which is equivalent to the NS record mentioned above.
• SmartConnect can be configured from the Isilon GUI by specifying the load balancing policy, SmartConnect zone, node IPs, failover IPs, and SmartConnect zone IP. You will


need to browse to Cluster > Networking, then click Add Subnet and define the settings as required.

• Once you click on Add subnet, define the subnet settings and SmartConnect Service IP.

• Specify the IP Address Pool settings as defined below for the nodes. Node Interfaces added to the Subnet Pool will have the IP address assigned from this pool.


• Specify the SmartConnect zone name and the load balancing policy. SmartConnect Advanced is grayed out because the license has not been procured.

• Add the interfaces of the nodes you want to be used for this SmartConnect Zone and click Submit.


• The configured settings will be displayed as shown in the screenshot below.

• The steps above define the basic settings needed for SmartConnect; they can be customized as required.
• For NFS failover functionality, Dynamic IP allocation should be used and there should be at least one NFS failover IP per client, with a minimum of one failover IP per node in the cluster. For example, in a 4-node cluster a minimum of 4 NFS failover IPs should be configured. For optimal configuration, a one-to-one relationship between the


number of clients and NFS failover IPs should be maintained, i.e. for 20 clients accessing the cluster, 20 failover IPs should be configured.
• The following load balancing policies can be used as per requirements, but Round Robin and CPU Utilization are the recommended policies.
Round Robin – This connection method works on a rotating basis: as one node IP address is handed out, it moves to the back of the list; the next node IP address is handed out, and then it moves to the end of the list, and so on. This follows an orderly sequence to distribute client connections. This is the default policy (once SmartConnect is activated) if no other policy is selected.
CPU Utilization – This connection method examines CPU load on each node, and then attempts to distribute connections to balance the workload evenly across all nodes in the cluster.
Connection Count – In this algorithm, the number of established TCP connections is determined, and an attempt is made to balance connections evenly per node.
Network Throughput – This method relies on an evaluation of the overall file system throughput per node, and client connections are then distributed to balance throughput consumption.

Each node will collect these statistics regularly (CPU Utilization – every 5 seconds, Connection Count and Network Throughput – every 10 seconds) and send to the delegated server of authority. This information is maintained in the delegated server of authority for one minute (sliding window) and will be used to determine where a new connection request will be sent. These status messages also double as the heartbeat from the nodes. • SmartConnect zones can be used to segregate data traffic. For example, separating production traffic from test traffic in case the Cluster has high performance S-Series nodes and Nearline Storage NL-Nodes.

In many cases, both SMB and NFS connections to the same cluster can be used by the customer. In these cases, a different SmartConnect zone should be created for each of these workloads. The NFS clients can be put in a dedicated SmartConnect zone that will facilitate failover while the SMB clients are put into another SmartConnect zone that will not participate in failover. This will ensure all SMB clients mount to the “Static node IPs” which do not failover. If a SMB client is put into a zone where NFS failover is enabled,


the clients will experience more frequent lost connections on those SMB mounts, requiring connections to be re-established.
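As a sketch of this segregation, clients simply use different SmartConnect zone names when connecting; the zone names below are hypothetical examples in the style used elsewhere in this article:

# NFS clients mount through a dynamic, failover-enabled zone
mount -t nfs -o vers=3,proto=tcp nfs.anuj.com:/ifs/data /mnt/isilon
# SMB clients map the share through a separate static zone so they keep stable node IPs
net use Z: \\smb.anuj.com\data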

SmartConnect Architecture Example

Figure 13: SmartConnect Architecture Quality of Service

There may be requirements where production and test clients need to access the same data from an Isilon cluster. SmartConnect can manage which nodes a client connects to based on the SmartConnect zone it connects through. For example, in Figure 13, Nodes 1, 2, and 3 are high-performing nodes (S- or X-Series) and Nodes 4, 5, and 6 are more general-purpose (NL-Series). The company has two separate workflows for production and test environments, each of which has to access the same network share/NFS export on the Isilon cluster. With the SmartConnect settings shown in Figure 14, all traffic for the production environment (prod.anuj.com) will go to Nodes 1, 2, and 3, and all traffic for the test environment (test.anuj.com) will go to Nodes 4, 5, and 6. The production environment is allowed the use of the high-performance nodes, and the test environment has access to the same data through the other set of nodes.

DNS Server Configuration: Zone of Authority: anuj.com; 1st Delegated Zone: prod.anuj.com, Authority: 192.168.1.1; 2nd Delegated Zone: test.anuj.com, Authority: 192.168.2.1
SmartConnect Configuration Example: Zone prod.anuj.com, Policy Round Robin, SmartConnect IP 192.168.1.1; Zone test.anuj.com, Policy Round Robin, SmartConnect IP 192.168.2.1
SmartConnect Zone Configuration: Ext-1: 192.168.1.2-192.168.1.4, SmartConnect IP 192.168.1.1, Policy Round Robin, Interface External-1, Nodes 1, 2, 3; Ext-2: 192.168.2.2-192.168.2.4, SmartConnect IP 192.168.2.1, Policy Round Robin, Interface External-1, Nodes 4, 5, 6

Figure 14: SmartConnect Configuration

EMC Isilon SmartQuotas Internals
Isilon IQ SmartQuotas™ provides administrators with full control of space utilization by enforcing quotas at the user level as well as at the directory level. Isilon IQ SmartQuotas also provides the benefit of thin provisioning. Using SmartQuotas, we can enforce thresholds. SmartQuotas supports three types of thresholds:

1. Hard thresholds deny further attempts to write data. It is up to the user or application to decide what to do with the partially written data. Unless space under the quota is cleared, no further writes are possible.
2. Soft thresholds provide a grace period from the moment the threshold is reached, during which further writes are still accepted. Once the grace period expires, the threshold becomes a hard threshold and further writes are denied.
3. Advisory thresholds simply track usage and record when the threshold was exceeded for reporting and notification purposes. All threshold types provide the added benefit of generating notifications and displaying overage statistics in reports; see the notifications section and reporting use cases below.

SmartQuotas Example
OneFS presents a single pool of space, so a basic use case is dividing this pool into smaller portions using quotas. In the following example we assume a 90 TB cluster on which OneFS has a directory structure with a top-level directory for each department, for example /ifs/data/engineering, /ifs/data/marketing, and /ifs/data/sales. We can simply create a directory quota on each of these top-level directories and put a hard threshold of 30 TB on each quota. By default, quotas account for user data without protection overhead. In this scenario we want to create a storage limit, so we will create quotas that include protection overhead thresholds using the '--include-overhead' option. Here, protection overhead refers to all disk blocks used by OneFS, including the data itself as well as blocks allocated for protection.

anuj-1# isi quota create --force --directory --path /ifs/data/engineering --path /ifs/data/sales --path /ifs/data/marketing --hard-threshold 30TB --include-overhead

anuj-1# isi quota ls
Type       Path                    Hard  Soft  Advr  Usage
----------------------------------------------------------
directory  /ifs/data/sales         30T   N/A   N/A   1.0K(+)
directory  /ifs/data/engineering   30T   N/A   N/A   1.0K(+)
directory  /ifs/data/marketing     30T   N/A   N/A   1.0K(+)
anuj-1#


We can also add an advisory threshold, which will generate a warning before the hard threshold is reached, as shown below.

anuj-1# isi quota modify --force --directory --path /ifs/data/engineering --path /ifs/data/sales --path /ifs/data/marketing --adv 28TB
anuj-1# isi quota ls
Type       Path                    Hard  Soft  Advr  Usage
----------------------------------------------------------
directory  /ifs/data/sales         30T   N/A   28T   1.0K(+)
directory  /ifs/data/engineering   30T   N/A   28T   20M(+)
directory  /ifs/data/marketing     30T   N/A   28T   1.0K(+)
anuj-1#

Thin Provisioning (Over-Subscription)
Thin provisioning enables the administrator to present more storage than is physically available. An alert is generated and an administrator can take the necessary action when the physical storage capacity used reaches a certain threshold. When creating thinly provisioned quotas, best practice is to nest them within a physical storage quota with an advisory threshold that will trigger a notification when the threshold is met. This will act as a physical boundary warning for the administrator to add more storage or adjust the thinly provisioned nested quotas.

anuj-1# isi quota create --directory --path /ifs/data/sales --path /ifs/data/marketing --path /ifs/data/engineering --hard 10TB --container true
anuj-1# isi quota create --directory --path /ifs/data/ --adv 6T --include-overhead
anuj-1# isi quota ls -v
Type       Path                    Policy       Usage
------------------------------------------------------
directory  /ifs/data               enforcement  1.1T(+)
           [advisory-threshold]     ( 6.0T)
           [usage-with-no-overhead] ( 926G)
           [usage-with-overhead]    ( 1.1T)
directory  /ifs/data/sales         enforcement  0B
           [hard-threshold]         ( 10T) [container]
           [usage-with-no-overhead] ( 0B)
           [usage-with-overhead]    ( 1.0K)
directory  /ifs/data/engineering   enforcement  20M
           [hard-threshold]         ( 10T) [container]
           [usage-with-no-overhead] ( 20M)
           [usage-with-overhead]    ( 25M)
directory  /ifs/data/marketing     enforcement  0B
           [hard-threshold]         ( 10T) [container]
           [usage-with-no-overhead] ( 0B)
           [usage-with-overhead]    ( 1.0K)

In the example above, three quota domains with a 10 TB container threshold each are presented to users (as shares or exports) as 10 TB volumes, nested within a parent quota on /ifs/data with a 6 TB advisory threshold on physical storage. This configuration masks the cluster capacity from users and over-subscribes the top quota by 24 TB (3 * 10 TB - 6 TB). Once a notification is triggered by the advisory threshold, adding more nodes to meet storage consumption can be done in minutes.


The power of SmartQuotas over-subscription combined with the ease of adding storage to Isilon IQ clusters is another proof-point of ease of use and scalability.

Customizing SmartQuotas Email Notifications
Email notifications are generated from notification templates. By default there are two template files used by all quota domains: one for hard and advisory quotas, located in /etc/ifs/quota_email_template.txt, and one for soft quotas including grace period information, located in /etc/ifs/quota_email_grace_template.txt.

The content of the default templates is:
• /etc/ifs/quota_email_template.txt for general notifications:
Subject: Disk quota exceeded
The disk quota on directory <ISI_QUOTA_PATH> owned by <ISI_QUOTA_OWNER> was exceeded. The quota limit is <ISI_QUOTA_THRESHOLD>, and <ISI_QUOTA_USAGE> is currently in use. Please free some disk space by deleting unnecessary files. Contact your system administrator for details.
• /etc/ifs/quota_email_grace_template.txt for grace period notifications:
Subject: Disk quota exceeded
The disk quota on directory <ISI_QUOTA_PATH> owned by <ISI_QUOTA_OWNER> was exceeded. The quota limit is <ISI_QUOTA_THRESHOLD> with a grace period of <ISI_QUOTA_GRACE>, and <ISI_QUOTA_USAGE> is currently in use. Please free some disk space by deleting unnecessary files before <ISI_QUOTA_EXPIRATION>. Failure to do so will restrict your ability to write. Contact your system administrator for details.

Template files can be customized and assigned to notification rules. Additionally, new templates can be created. For example, an administrator may want to omit some information from emails sent to end users or be more verbose, giving as much information as possible by customizing or creating a new template. The various tags in the messages are explained below:

<ISI_QUOTA_OWNER>: the user or group name of the owner of the quota domain for which a threshold triggered the notification.
<ISI_QUOTA_PATH>: the path of the quota domain for which a threshold triggered the notification.
<ISI_QUOTA_USAGE>: the human-readable usage of the quota domain for which a threshold triggered the notification.
<ISI_QUOTA_TYPE>: the threshold name that triggered the notification: "advisory", "soft", or "hard".
<ISI_QUOTA_THRESHOLD>: the human-readable threshold value that triggered the notification.
<ISI_QUOTA_EXPIRATION>: the soft threshold expiration time, after which it behaves like a hard threshold.
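Using these variables, a customized template might look like the following sketch. The file name is an example, and the exact variable names should be verified against the OneFS release in use:

cat > /etc/ifs/quota_email_custom.txt <<'EOF'
Subject: Storage quota alert for <ISI_QUOTA_PATH>

The <ISI_QUOTA_TYPE> threshold on <ISI_QUOTA_PATH> (owner: <ISI_QUOTA_OWNER>) has been exceeded.
Limit: <ISI_QUOTA_THRESHOLD>   Current usage: <ISI_QUOTA_USAGE>
Please delete unnecessary files or contact the storage team.
EOF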

I have tried to explain the underlying concepts of the Isilon SmartQuotas in the section above. Refer to the OneFS administration guide for more information regarding notification rules and how to assign templates to them.

EMC Isilon and vSphere Integration Best Practices
VMware vSphere is used as a Cloud OS by most Cloud vendors, and many Big Data applications are being migrated to the Cloud for scalability and flexibility. It is therefore important to follow best practices so that VMware vSphere and the storage system deliver optimum application performance. The EMC Isilon storage system can be used as a Cloud storage platform for VMware owing to the scalable, flexible, and elastic architecture of EMC Isilon OneFS. Both iSCSI and NFS datastores can be provisioned using EMC Isilon storage. Important considerations common to both iSCSI and NFS datastore storage networking include:

• Separate storage and VM network traffic onto different physical network ports. Doing so separates LAN traffic and storage traffic so that LAN traffic doesn't hamper storage performance. It is also recommended to have separate vSwitches for the VM network, vMotion, and storage networking.
• Consider using switches that support "cross-stack EtherChannel" or "virtual port channel" trunking, where interfaces on different physical switches are combined into an 802.3ad LAG for network and switch redundancy.
• Choose a production-quality switch, with the required amount of port buffers and other internal resources, for mission-critical, network-intensive purposes such as VMware datastores (on iSCSI or NFS).
• As we move towards converged infrastructure, 10GbE is a great way to consolidate network ports. Where 10GbE infrastructure cannot be deployed, Gigabit infrastructure can be used; for Gigabit, use Cat6 cables rather than Cat5/5e.
• Flow control should be used. ESX hosts and the Isilon cluster should be set to On to send and Off to receive. Conversely, the switch ports connected to the ESX host and Isilon cluster nodes should be set to Off to send and On to receive.
• ESX hosts that have fewer NICs (such as blade servers) should use VLANs to group common IP traffic onto separate VLANs for optimal performance and improved security. VMware recommends separating service console access and the virtual machine network from the VMkernel network used for IP storage and vMotion traffic.


• Create the virtual switch using at least one dedicated NIC for network storage traffic. This will ensure good storage I/O performance, as well as isolate any problems caused by other network traffic (a command-line sketch follows the table below).
• For network redundancy on the ESX host, Isilon recommends teaming multiple network interfaces to the vSwitch of the VMkernel port used for NFS storage. Multiple interfaces in the same vSwitch can also be used to increase the aggregate throughput through the VMkernel NFS storage stack (as long as multiple datastores are assigned different IP addresses). The VMkernel will use multiple TCP/IP connections across multiple network interfaces to provide parallel virtual machine I/O access.

iSCSI Storage vs. NFS Storage

Capability                                      iSCSI         NFS
File System                                     VMFS or RDM   OneFS
Maximum Datastores                              256           64
Max ESX Datastore Size                          64TB          10PB
Max LUN/File System Size                        32TB          10PB
Recommended Number of VMs per LUN/File System   20/LUN        In Thousands
Network Bandwidth                               1G/10GbE      1G/10GbE
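Returning to the networking bullets above, a minimal sketch of creating a dedicated vSwitch and VMkernel port for NFS storage traffic on a classic ESX 4.x host is shown below; the vSwitch, port group, NIC, and IP values are examples only:

esxcfg-vswitch -a vSwitch2                       # create a vSwitch dedicated to storage traffic
esxcfg-vswitch -L vmnic3 vSwitch2                # uplink a dedicated physical NIC
esxcfg-vswitch -A NFS-Storage vSwitch2           # add a port group for the VMkernel port
esxcfg-vmknic -a -i 192.168.10.11 -n 255.255.255.0 NFS-Storage   # VMkernel interface used for NFS traffic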

Optimizing OneFS for NFS Datastores with I/O-Intensive Virtual Machines
• For virtual machines with small random I/O operations (less than 32K), use a mirroring data layout (2x) on VM directories and their sub-directories. This setting increases protection overhead and decreases write overhead. It can also be applied on a per-VM-directory basis while other VM directories continue to use the parity protection layout, which is more space efficient. To change a VM directory's write data protection and access mode:
• From the WebUI, select File System > File System Settings > FlexProtect Policy and set it to Advanced
• From the WebUI, select File System > File System Explorer and navigate to the desired VM directory
• Select the 2x protection level and apply protection to contents
• Select Optimize writing for Streaming and apply optimization to contents


Figure 15: Optimizing Isilon NFS for VM I/O Operations

• Enable OneFS 'streaming' mode on virtual machines with high read and write I/O requirements (as shown in the screenshot above).
• Disable OneFS read caching (prefetch) in cases where many virtual machines generate a high ratio of small random read operations. Disabling read prefetch instructs OneFS to avoid prefetching adjacent file blocks and eliminates prefetch latency overhead:
  o Log on to any of the nodes in the cluster over an SSH connection.
  o At the login prompt, issue the following command: sysctl efs.bam.enable_prefetch=0
  o To make this setting persistent across cluster reboots, add the following line to the file /etc/mcp/override/sysctl.conf: efs.bam.enable_prefetch=0
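The per-directory protection change described in the WebUI steps above can also be sketched from the command line with isi set, the command used later in this article for write caching. The -p flag and the example path are assumptions to verify against your OneFS release:

isi set -p 2x -R /ifs/vmware/vm01   # request 2x mirroring for one VM directory and its contents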

EMC Isilon Recommended Architecture for performance, flexibility, and availability of NFS datastores

• Assign at least one dynamic IP for each member node interface in a dynamic IP pool. This will enable each unique IP to be mounted by a separate ESX NFS datastore, and the more NFS datastores are created, the more TCP connections an ESX host can leverage to balance VM I/O. • OneFS Dynamic IPs and Cisco cross-stack EtherChannel link aggregation should be combined to provide both node/interface redundancy and switch redundancy.


Figure 16: Isilon NFS Architecture

• EMC Isilon recommends a mesh connectivity topology as the best design, in which every ESX server is connected to every IP address in a cluster dynamic IP pool.
• Connecting "everything to everything" enables all ESX hosts to reach all datastores, so vMotion can be performed between any ESX servers in the knowledge that they all see the same datastores and the migration will succeed. Virtual machines can be created on any datastore to balance the I/O load between ESX servers and the cluster, and virtual machines can easily be moved between datastores to eliminate hot spots without moving VM data.
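For example, the same OneFS directory tree could be mounted as several datastores, each through a different dynamic IP in the pool; the IP addresses, export paths, and datastore names below are illustrative only:

esxcfg-nas -a -o 192.168.1.21 -s /ifs/vmware/ds01 isilon-ds01
esxcfg-nas -a -o 192.168.1.22 -s /ifs/vmware/ds02 isilon-ds02
esxcfg-nas -l    # list the NAS datastores to verify the mounts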

EMC Isilon Recommended Architecture for performance, flexibility, and availability of iSCSI datastores

• It is always recommended to dedicate a static IP pool on the Isilon cluster for managing iSCSI target IPs. • Create multiple VMkernel port groups on the ESX host, with a single active network interface and no standby interfaces. • Make sure the selected Storage Array Type Plugin (SATP) is Active/Active and the Path Selection Policy (PSP) is FIXED. If those are not set by default, they can be manually changed on a per target, datastore, or LUN level. • To avoid lock-contention as a result of multiple ESX hosts accessing the same LUN from different target nodes, make sure all ESX hosts use the same preferred path to the same LUN.


Figure 17: Isilon iSCSI Architecture

• Isilon recommends using a larger number of smaller VMFS datastores because there is less wasted storage space (with thick-provisioned LUNs). You can also tune each underlying LUN in a more granular way to meet a specific set of application I/O and data protection requirements for a virtual machine. Aggregate I/O increases as more datastores are used concurrently, because more paths can be used in parallel and contention on each VMFS volume is reduced.


EMC Isilon SyncIQ Architecture and Tips
EMC Isilon SyncIQ is used for data replication for disaster recovery and business continuity purposes.

Figure 18: SyncIQ

SyncIQ replicates data from a primary site to a secondary site, which can be local or remote. When a SyncIQ job initiates, the system first takes a snapshot of the data to be replicated. SyncIQ compares this snapshot to the snapshot from the previous replication job, which enables it to quickly identify the changes that need to be addressed. SyncIQ then pools the aggregate resources of the cluster, splitting the replication job into smaller work items and farming them out to multiple workers across all nodes in the cluster. Each worker scans a part of the snapshot differential for changes and transfers those changes to the target cluster. After the initial full replication, every incremental execution of that policy transfers only the 8 KB blocks that changed since the previous replication job. This is critical in cases where only a small fraction of a dataset has changed, as with virtual machine VMDK files in which perhaps only one block has changed in a multi-gigabyte virtual disk file.

SyncIQ tips

• After creating a policy and before running the policy for the first time, use the policy assessment option to see how long it takes to scan the source cluster dataset. • We can increase workers per node in cases where network utilization is low, for example over WAN. • Use file rate throttling to control how much CPU and disk I/O SyncIQ consumes while jobs are running through the day.


• Use SmartConnect IP address pools to control which nodes participate in a replication job and to avoid contention with other workflows accessing the cluster through those nodes.

EMC Isilon NDMP Backup Configuration for EMC NetWorker
Backup forms a critical part of the IT infrastructure, and we need to meet the stringent backup windows defined by the business as the amount of data to back up grows day by day. We can back up Isilon data using the NDMP protocol with EMC NetWorker or any backup software that supports NDMP backups. The topologies below can be deployed for NDMP backups, as appropriate to the customer environment.

Direct NDMP Model
• The Direct NDMP model is recommended for Isilon data backup because the data travels over the SAN to the backup device.
• Backup throughput is highest since the data travels over the SAN; this model is recommended in situations where the backup window is very tight.
• In this method, the EMC Isilon Backup Accelerator node is connected to the Isilon cluster and, through the SAN, to the backup tape library/VTL.
• The backup server initiates the Isilon data backup, and the backup data is transferred through the SAN to the backup device.
• Metadata is transferred across the LAN to the backup server.

Figure 19: Direct NDMP Method


• General recommendations to achieve peak performance using the Backup Accelerator:
  o I-Series: 5 nodes for each Backup Accelerator
  o X-Series: 3 nodes for each Backup Accelerator
  o NL-Series: 3 nodes for each Backup Accelerator
  o S-Series: 2 nodes for each Backup Accelerator
  o LTO-4: 4 tape devices per Backup Accelerator
  o LTO-3: 8 tape devices per Backup Accelerator
• It is recommended to limit the number of concurrent backup/restore sessions to eight per Backup Accelerator.
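For example, under these guidelines a 12-node X-Series cluster streaming to LTO-4 drives would call for roughly four Backup Accelerators (12 nodes ÷ 3 nodes per accelerator), each attached to no more than four LTO-4 devices and running no more than eight concurrent backup/restore sessions. This is only a sizing illustration; actual ratios should be validated for the specific workload.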

Remote NDMP Model
• In the remote NDMP model, the backup server or any dedicated server is connected to the backup device.
• The backup server initiates the backup, and the backup data travels over the LAN to the backup device.
• Because the backup data travels over the LAN, throughput will be lower than with the Direct NDMP method.

Figure 20: Remote NDMP Model


Per best practice, if using a single Backup Accelerator connected to four tape devices via the four fiber channel ports, it makes sense to create four large directories under /ifs, so that the backup configuration can include four job policies using each of the four directories as a source and each of the four tape devices as a target.

In cases where Backup Accelerator is not being used and four tape devices are available as targets along with four Isilon nodes available for backup, it makes sense to manage four large directories under /ifs so that four backup jobs can run simultaneously using each of the four directories as a source, each of the four nodes as a transport of data, and each of the four tape devices as a target.

Direct NDMP Backup Configuration
• Create an NDMP user on the Isilon data nodes.
  o isi ndmp user create
• Set the NDMP port to the default of 10000 on the Isilon data nodes.
  o isi ndmp settings set port 10000
• Set the DMA vendor to EMC on the Isilon data nodes.
  o The default DMA vendor setting is generic but can be set to a specific vendor; for EMC NetWorker we set it to EMC to ensure optimum performance.
  o isi ndmp settings set dma EMC
• On the Backup Accelerator, verify that the FC ports are enabled.
  o isi fc ls
• Scan for the tape devices attached to the Backup Accelerator in the case of the Direct NDMP backup topology.
  o isi tape rescan [--reconcile]
• Verify that the tape devices are configured.
  o isi tape ls -v

Figure 21: isi tape ls -v output
• Configure the tape devices attached to the Backup Accelerator in NetWorker using the jbconfig command on the backup server/storage node, as shown below.


# jbconfig
Jbconfig is running on host anuj-1 (Rhel 5.0), and is using anuj-1 as the NetWorker server.
1) Configure an AlphaStor Library.
2) Configure an Autodetected SCSI Jukebox.
3) Configure an Autodetected NDMP SCSI Jukebox.
4) Configure an SJI Jukebox.
5) Configure an STL Silo.
What kind of Jukebox are you configuring? [1] 3
Enter NDMP Tape Server name: ? nasanuj
Communicating to devices on NDMP Server 'nasanuj', this may take a while...
14484:jbconfig: Scanning SCSI buses; this may take a while ...
These are the SCSI Jukeboxes currently attached to your system:
1) [email protected]: Standard SCSI Jukebox, QUANTUM / Scalar i500
2) [email protected]: Standard SCSI Jukebox, ADIC / Scalar i500
Which one do you want to install? 1
Installing 'Standard SCSI Jukebox' jukebox - [email protected].
What name do you want to assign to this jukebox device? mc001
15814:jbconfig: Attempting to detect serial numbers on the jukebox and drives ...
15815:jbconfig: Will try to use SCSI information returned by jukebox to configure drives.
Turn NetWorker auto-cleaning on (yes / no) [yes]? no
The following drive(s) can be auto-configured in this jukebox:
1> LTO Ultrium-4 @ -24151.9.200 ==> tape002 (NDMP)
2> LTO Ultrium-4 @ -24151.9.200 ==> tape001 (NDMP)
These are all the drives that this jukebox has reported.
To change the drive model(s) or configure them as shared or NDMP drives, you need to bypass auto-configure.
Bypass auto-configure? (yes / no) [no] yes
Is (any path of) any drive intended for NDMP use? (yes / no) [no] yes
Is any drive going to have more than one path defined? (yes / no) [no]
Drive 1, element 256, system device name = tape002, local bus, target, lun value = -24151.9.200, WWNN=500308C09F139000 model LTO Ultrium-4
Drive path ? [nasanuj:tape002]
Is this device configured as NDMP? (yes / no) [no] yes
Drive 2, element 257, system device name = tape001, local bus, target, lun value = -24151.9.200, WWNN=500308C09F139004 model LTO Ultrium-4
Drive path ? [nasanuj:tape001]
Is this device configured as NDMP? (yes / no) [no] yes
Only model LTO Ultrium-4 drives have been detected.
Are all drives in this jukebox of the same model? (yes / no) [yes]
Would you like to configure another jukebox? (yes/no) [no]

• Create a client resource in NetWorker.
  Name:
  Storage node: , followed by <nsrserverhost> on the next line
  NDMP: Yes
  Saveset: name of the saveset to back up, i.e. the directories to be backed up from Isilon
  Remote access: *@* or administrator@ (one entry per line)
  Remote user: ndmp (the NDMP user created earlier)
  Remote password: password associated with the NDMP user
  Backup command: nsrndmp_save -T tar
  Application information: HIST=y UPDATE=y DIRECT=y UTF8=y OPTIONS=NT
• Schedule the backup of the NDMP client as required.


Remote NDMP Backup Configuration
• Create the NDMP username and password as in the Direct NDMP configuration.
• In the remote configuration, a drive will need to be configured on the backup server or storage node as an NDMP drive using jbconfig.
• Create a client resource in NetWorker.
  Name:
  Storage node: , followed by <nsrserverhost> on the next line
  NDMP: Yes
  Saveset: name of the saveset to back up, i.e. the directories to be backed up from Isilon
  Remote access: *@* or administrator@ (one entry per line)
  Remote user: ndmp (the NDMP user created earlier)
  Remote password: password associated with the NDMP user
  Backup command: nsrndmp_save -T tar
  Application information: HIST=y UPDATE=y DIRECT=y UTF8=y OPTIONS=NT
• Schedule the backup of the NDMP client as required.

Cluster Performance Tuning
• Cluster Write Caching (Coalescer) Setting By default, all writes to the Isilon cluster are cached by the file system coalescer, which allows the file system to determine when it is best to flush the content to disk. This setting is typically optimal for sequential write access, both small concurrent writes and large single-stream writes. However, for highly random or highly transactional access patterns where intermittent latencies are not acceptable to the application, turning off the coalescer ensures a more consistent latency across all write operations. This does not mean the data is not kept in cache for subsequent read operations; it simply means that each write operation is flushed to disk. The setting can be turned on or off at the directory level, giving a high degree of flexibility to tune data access per application based on the directory being accessed.

# isi set -c on -R /ifs/data/dir1_with_write_cache

Or

# isi set -c off -R /ifs/data/dir1_without_write_cache

• OneFS NFS Server Sync/Async Setting By default, the OneFS NFS service synchronously commits every write operation to disk as client NFS commit requests are sent to the cluster, in effect disabling the OneFS write buffer (coalescer). In many cases, synchronous commits are not necessary and disabling them can improve performance. This is particularly true for large sequential writes to large files, and also when small write operations stall waiting for the commit response to return from the cluster. Run the following command to disable the NFS sync feature:

isi_for_array sysctl vfs.nfsrv.async=1
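To confirm the change on every node, or to revert to the default synchronous behavior later, the same sysctl can be queried or reset (a minimal sketch using the sysctl shown above):

# isi_for_array sysctl vfs.nfsrv.async
# isi_for_array sysctl vfs.nfsrv.async=0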

• OneFS Data Prefetch Setting By default, OneFS prefetches file data through its SmartCache feature. This is very beneficial for large sequential read operations, both single-stream and concurrent, because the likelihood of the client needing the data outweighs the cost of the additional I/O. If the prefetched data is not needed by the client, however, the extra I/O operations become undesired overhead. This is the case with random access, where the likelihood of reading prefetched data is very low. In that case, prefetching can be disabled with the following command:

isi_for_array sysctl efs.bam.enable_prefetch=0
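As with the NFS async setting, the current prefetch value can be checked across all nodes before and after the change, and prefetching re-enabled if the workload changes (same sysctl as above):

# isi_for_array sysctl efs.bam.enable_prefetch
# isi_for_array sysctl efs.bam.enable_prefetch=1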

To make these sysctl changes permanent on the cluster, edit the file /etc/override/sysctl.conf and add either or both of the following lines:

vfs.nfsrv.async=1
efs.bam.enable_prefetch=0

• Client Performance Tuning Options Client performance depends on two major factors: the file-sharing protocol that is predominant for each client operating system, and the hardware considerations that need to be addressed for each type of client.
• Make sure that the uplink to the switch connected to your cluster has sufficient bandwidth to fulfill the client requests.
• Use protocol options, where available, to tune for the highest performance.
• Make sure that the client is not the limiting factor, for example, reading from or writing to a slow client hard drive.
• For optimal results when connecting to an IQ cluster via NFS clients, Isilon recommends the following settings (see the example mount command after this list):
  - Use NFS v3 over TCP (not UDP).
  - For 1GigE connections, read and write buffer sizes of at least 32KB (32,768 bytes) are recommended.
  - For 10GigE connections, a read buffer size of 128KB (131,072 bytes) and a write buffer size of 512KB (524,288 bytes) are recommended.
• Metadata on SSD drives: Isilon clusters are sometimes observed to slow down over time when responding to internal (metadata) queries. In such cases, configure the cluster to store its metadata on SSD drives; this considerably improves response times.
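As an illustration of the buffer-size recommendations above, a Linux NFS v3 mount over 10GigE might look like the following sketch; the SmartConnect zone name cluster.example.com, the export path /ifs/data, and the mount point are hypothetical.

# mount -t nfs -o vers=3,proto=tcp,rsize=131072,wsize=524288 cluster.example.com:/ifs/data /mnt/isilon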


EMC Isilon Cluster Maintenance
• When using SmartConnect, client connections can become unbalanced in both performance and quantity. Administrators can rebalance the existing connections with the following commands.

For rebalancing all the connections: isi networks --sc-rebalance-all

For rebalancing connections of a specific subnet and pool: isi networks modify pool --name=subnet0:pool0 --sc-rebalance
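Before forcing a rebalance, it can help to review how the pools are configured and how connections are distributed. The listing subcommand below is an assumption for this OneFS generation and should be verified against your release:

# isi networks list pools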

• For upgrading OneFS, there is the option of performing either a rolling or a simultaneous upgrade of the cluster operating system. A rolling upgrade upgrades and restarts each node in the cluster individually, in sequence; rolling or simultaneous upgrades can be used for minor releases (X.X.X to X.X.Y). A simultaneous upgrade installs the new operating system and restarts all nodes in the cluster at the same time; simultaneous upgrades are required for major releases (X.X.X to X.Y.Y). Major release upgrades often require an upgrade job to run after the reboot.
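Before and after an upgrade, the running version can be confirmed with isi version. The isi update invocation below is a hedged sketch: the command is assumed to prompt for the location of the install image and for the upgrade type (rolling or simultaneous), so verify the exact procedure in the release notes for your target version.

# isi version
# isi update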

• Adding a node to the cluster As nodes are added to the cluster, they are automatically assigned a node number and an IP address, numbered in the order in which they join. If a node attempts to join the cluster with a newer or older OneFS version, the cluster automatically reimages the node to match the cluster's OneFS version; after the reimage completes, the node finishes the join. A reimage should take no longer than 5 minutes, which brings the total join time to approximately 10 minutes. For clusters running a OneFS version prior to 5.5.x, do not join the node directly; first reimage the node to the same OneFS version as the cluster, then join it.

To join a node, launch a serial communication utility such as HyperTerminal for Windows clients. Configure the connection utility to use the following settings:
• Transfer rate = 115,200 bps
• Data bits = 8
• Parity = none
• Stop bits = 1
• Flow control = hardware


• Connect the laptop to the node to be added to the cluster through a serial connection.
• Power on the node.

• Once the node has booted, a configuration menu appears. Choose option 2, Join an existing cluster.
• The names of the clusters available to join are displayed, each with a number.
• Enter the number of the cluster to join.
• The node joins the cluster and is configured according to the cluster settings.
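Once the join completes, the new node should appear in the cluster status output from any existing node:

# isi status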

• Adding a disk drive
• Verify that FlexProtect has completed before continuing: type isi status and confirm there is no active FlexProtect job.
• At the command prompt, type "isi devices -a add -d n:b" (where n is the node number and b is the drive bay number). This scans the bus for the drive and tries to add it back into the cluster.
• Confirm your intention to re-add the device by typing yes.
• Because it is the same physical drive, the system does not re-add it to the cluster automatically; format it by typing "isi devices -a format -d n:b" (where n is the node number and b is the drive bay number). When formatting completes, the drive is automatically re-added to the cluster. This can take a few minutes.
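The state of the drive bays can be reviewed before and after re-adding the drive. Running isi devices without an action argument is assumed here to list the drives of the local node; verify on your OneFS release.

# isi devices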

• Removing a node from the cluster Physically removing a node from the cluster first requires removing it logically. Once the node has been logically removed, it automatically reformats its own drives and resets itself to the factory default settings; the reset occurs only after OneFS has confirmed that all data has been reprotected. To logically remove the node from the cluster, use the smartfail process. During smartfail, the node to be removed is placed in a read-only state while the cluster performs a FlexProtect process to logically move all data off the affected node. After all data migration is complete, the cluster logically changes its width to the new configuration; at this point, it is safe to physically remove the node. When replacing a node with another, it is better to add the replacement node to the cluster before failing the old one, because FlexProtect can then immediately use the replacement node to rebuild the failed node's data. If the failed node is removed first, FlexProtect must rebuild its data into available space in the cluster, and AutoBalance then transfers the data back to the replacement node once it is added.
• Open the Isilon GUI.
• Navigate to Cluster → Cluster Management → Remove Node.

• Select the node to be removed and click Submit.


• The GUI displays the progress of the smartfail process and marks the status as complete once it finishes.
• The node can be physically removed once the status shows as finished.
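The smartfail can also be started from the command line. This is a hedged sketch that reuses the isi devices action syntax shown earlier for drives; smartfail as the action name and the bare node number as the device (node 3 is illustrative) are assumptions to verify against your OneFS release. Progress can then be followed with isi status until FlexProtect completes.

# isi devices -a smartfail -d 3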




Glossary

Backup Server The backup server is the controlling backup entity that directs client backups and stores tracking and configuration information.

LUN In computer storage, a logical unit number (LUN) is simply the number assigned to a logical unit. A logical unit is a SCSI protocol entity, the only one which may be addressed by the actual input/output (I/O) operations. Each SCSI target provides one or more logical units and does not perform I/O as itself, but only on behalf of a specific logical unit.

NAS Network-attached storage (NAS) is file-level computer data storage connected to a computer network providing data access to heterogeneous network clients. A NAS unit is essentially a self-contained computer connected to a network, with the sole purpose of supplying file-based data storage services to other devices on the network. The operating system and other software on the NAS unit provide the functionality of data storage, file systems, and access to files, and the management of these functionalities.

NDMP Network Data Management Protocol (NDMP) is a protocol, originally developed by NetApp and Legato, for transporting backup data between NAS devices (also known as filers) and backup devices. This removes the need to pass the data through the backup server itself, improving speed and removing load from the backup server.

Scale-Out NAS Scale-out NAS grows by adding clustered nodes. If we need to add memory, cache, or storage, we add a node to the cluster. The whole cluster behaves as a single storage system and is managed as one.

Scale-Up NAS Scale-up NAS, i.e. traditional NAS, comprises one or two controllers, or NAS heads, and a pre-set amount of CPU, memory, and drive slots. If more capacity is needed, the system can only be upgraded up to its peak limit; once peak limits are exhausted, the only way to boost capacity and performance is to buy a new, separately managed system.


Storage Node The storage node is the host that receives the client-generated data, writes it to the backup device, generates the tracking information, and reads the data back at recovery time. The NetWorker storage node component is also installed on the backup server itself.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
