Recovery

KwaiSeng Consulting Systems Engineer

© 2006 Cisco Systems, Inc. All rights reserved. Cisco Confidential

Agenda

• Data Center: The Evolution
• Data Center Disaster Recovery
  - Objectives
  - Failure Scenarios
  - Design Options
• Components of Disaster Recovery
  - Site Selection: Front End GSLB
  - Server High Availability: Clustering
  - Data Replication and Synchronization: SAN Extension
• Data Center Technology Trends
• Summary

The Evolution of Data Centers

Data Center Evolution

[Figure: timeline from 1960 to 2010 with parallel tracks. Compute evolution: mainframes, client/server, thin client (HTTP). Network evolution: terminal, TCP/IP, content networking, data center networking. Networked data center phase: 1. Consolidation, 2. Integration, 3. Virtualization, 4. High availability. Stage labels: data center consolidation, network optimization, virtualization, continuous availability, business agility.]

Today's Data Center: Integration of Many Systems and Services

[Figure: a data center with a front-end network (firewall, IDS, content switch, cache), N-tier applications (web, app and DB servers, mainframe, IP communications, operations) and a storage network (FC switches, VSANs, NAS, RAID, tape), connected over WAN/Internet, MAN/Internet and a metro network (DWDM/SONET/Ethernet) to a secondary DR data center. Callouts: application and server optimization, data center security, resilient and scalable infrastructure, DC storage networks, distributed data centers.]

What Is a Distributed Data Center?

[Figure: a primary and a secondary data center, each with FC-attached storage. Applications A and B run at one site and applications A and C at the other, with data replication between the sites.]

Distributed Data Centers

• Required by disaster recovery and business continuance
• Avoid a single, concentrated data depository
• High availability of applications and data access
• Load balancing together with performance scalability
• Better response and optimal content routing: proximity to clients

Front End IP Access Layer

[Figure: primary and secondary data centers with the front-end IP access layer highlighted. "Content routing" performs site selection between the two sites.]

Application and Database Layer

[Figure: primary and secondary data centers with the application and database layer highlighted. "Content switching" provides load balancing and "server clustering" provides high availability.]

Backend SAN Extension

[Figure: primary and secondary data centers with the back end highlighted. "Storage" and "optical" networking provide data replication and transport between the sites.]

Data Center Disaster Recovery


Disaster Recovery

• Recovery of data and resumption of service: ensuring the business can recover and continue after a failure or disaster
• Ability of a business to adapt, change and continue when confronted with various outside impacts
• Mitigating the impact of a disaster

Disaster Recovery: What It Means for Business

• Business Resilience: continued operation of business during a failure
• Business Continuance: restoration of business after a failure
• Disaster Recovery: protecting data through offsite data replication

Zero downtime is the ultimate goal.

Disaster Recovery Planning

• Business Impact Analysis (BIA)
  - Determines the impacts of various disasters on specific business functions and company assets
• Risk analysis
  - Identifies important functions and assets that are critical to the company's operations
• Disaster Recovery Plan (DRP)
  - Restores operability of the target systems, applications, or computing facility at the secondary data center after the disaster

Disaster Recovery Objectives

• Recovery Point Objective (RPO)
  - The point in time (prior to the outage) to which systems and data must be restored
  - Tolerable loss of data in the event of a disaster or failure
  - The impact of, and the cost associated with, the loss
• Recovery Time Objective (RTO)
  - The period of time after an outage within which the systems and data must be restored to the predetermined RPO
  - The maximum tolerable outage time
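The two objectives can be checked mechanically against a candidate design. The following is a minimal Python sketch and is not part of the original deck: the class and function names and the example figures (a 1-hour RPO, a 4-hour RTO, nightly tape backup, asynchronous replication with a 30-second lag) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class RecoveryObjectives:
    rpo_seconds: float  # maximum tolerable window of lost data
    rto_seconds: float  # maximum tolerable outage time


def meets_objectives(objectives, last_good_copy_age_s, restore_duration_s):
    """Return (rpo_ok, rto_ok) for one failure scenario."""
    rpo_ok = last_good_copy_age_s <= objectives.rpo_seconds
    rto_ok = restore_duration_s <= objectives.rto_seconds
    return rpo_ok, rto_ok


# Hypothetical targets: lose at most 1 hour of data, be back within 4 hours.
objectives = RecoveryObjectives(rpo_seconds=3600, rto_seconds=4 * 3600)

# Nightly tape: the copy can be up to 24 hours old, and restore takes 2 days.
nightly_tape = meets_objectives(objectives, 24 * 3600, 48 * 3600)

# Asynchronous replication: about 30 seconds of lag, 30 minutes to fail over.
async_replication = meets_objectives(objectives, 30, 1800)
```

Under these assumed numbers the tape design misses both objectives and the replicated design meets both, which is the cost trade-off the deck goes on to describe.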

Recovery Point/Time vs. Cost

[Figure: timeline with three marks: t0, the point to which critical data is recovered; t1, when disaster strikes; t2, when systems are recovered and operational. The recovery point is the interval from t0 to t1 and the recovery time is the interval from t1 to t2. Cost increases as either interval shrinks.]

Recovery point, from days down to seconds: tape backup, periodic replication, asynchronous replication, synchronous replication.
Recovery time, from seconds up to weeks: extended cluster, manual migration, tape restore.

• Smaller RPO/RTO: higher $$$, replication, hot standby
• Larger RPO/RTO: lower $$$, tape backup/restore, cold standby


Failure Scenarios: Disaster Could Mean Many Types of Failure

• Network failure
• Device failure
• Storage failure
• Site failure

Network Failures

• ISP failure
  - Dual ISP connections
  - Multiple ISP
• Connection failure within the network
  - EtherChannel®
  - Multiple route paths

[Figure: data center connected to the Internet through Service Provider A and Service Provider B.]

Device Failures

• Routers, switches, FWs
  - HSRP
  - VRRP
• Hosts
  - HA cluster
  - LB server farm
  - NIC teaming

Storage Failures

• Disk arrays
• RAID
• Disk controllers
• Storage replication
• Site to site mirroring
• Optimization

Site Failures

• Partial site failure
  - Application maintenance
  - Application migration
  - Application scheduled DR exercise
• Complete site failure
  - Disaster


Warm Standby

• A data center that is equipped with hardware and communications interfaces capable of providing backup operating support
• Latest data from the production data center must be delivered
• Network access needs to be activated
• Application needs to be manually started

Disaster Recovery: Active/Standby

[Figure: primary data center and secondary data center (warm standby), each with FC-attached storage, connected over an IP/optical network.]

Hot Standby

• A data center that is environmentally ready and has sufficient hardware and software to provide data processing service with little downtime
• Hot backup offers disaster recovery with little or no human intervention
• Application data is replicated from the primary site
• A hot backup site provides better RTO/RPO than warm standby but costs more to implement
• Business continuance

Disaster Recovery: Active/Standby

[Figure: primary data center and secondary data center, each with FC-attached storage, connected over an IP/optical network.]

Active/Active DR Design: Multiple Tiers of Application

[Figure: two sites reached over the Internet through Service Provider A and Service Provider B, each with a presentation tier, an application tier and a storage tier.]

Active/Active Data Centers

[Figure: two data centers connected to the Internet through Service Provider A and Service Provider B and to each other through the internal network.]

• Web hosting: active/active
• Application processing: active/active
• Database processing: active/standby, or active/active for different applications

Components of Disaster Recovery


Site Selection Mechanisms

• Site selection mechanisms depend on the technology or mix of technologies adopted for request routing:
  1. HTTP redirect
  2. DNS-based
  3. L3 routing with Route Health Injection (RHI)
• Health of servers and/or applications needs to be taken into account
• Optionally, other metrics (like load) can be measured and utilized for a better selection
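As an illustration of how health and load could feed a site selection decision, here is a minimal Python sketch. The site records, field names and the lowest-load policy are assumptions made for illustration and do not describe any particular GSLB product.

```python
def select_site(sites):
    """Return the name of the healthy site with the lowest load, or None."""
    healthy = [site for site in sites if site["healthy"]]
    if not healthy:
        return None  # no site can serve the request
    return min(healthy, key=lambda site: site["load"])["name"]


# Hypothetical probe results for two data centers.
sites = [
    {"name": "dc1", "healthy": True, "load": 0.80},
    {"name": "dc2", "healthy": True, "load": 0.35},
]
chosen = select_site(sites)
```

Health acts as a hard filter and load only breaks ties among healthy sites, which mirrors the ordering of the two bullets above.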

HTTP Redirection: Traffic Flow

[Figure: HTTP redirection between http://www.cisco.com/, http://www1.cisco.com/ and http://www2.cisco.com/.]

1. The client sends GET / HTTP/1.1 with Host: www.cisco.com.
2. The site answers with an HTTP/1.1 302 response whose Location header points to www2.cisco.com.
3. The client sends GET / HTTP/1.1 with Host: www2.cisco.com and receives HTTP/1.1 200 OK.

DNS-Based Site Selection: Traffic Flow

[Figure: numbered steps 1 to 10 trace the resolution of http://www.cisco.com/ from the client through the DNS proxy, the root name server, the authoritative name server for .com, the authoritative name server for cisco.com and the authoritative name servers for www.cisco.com. Keepalives run toward Data Center 1 and Data Center 2. Legend: UDP:53 and TCP:80.]

Route Health Injection: Implementation

[Figure: Client A and Client B reach Location A and Location B through Router 10, Router 11, Router 12 and Router 13. The preferred location for VIP x.y.w.z is advertised with a low cost and the backup location for VIP x.y.w.z with a very high cost.]
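A rough way to picture the routing decision: each location injects a route to the VIP only while its servers pass health checks, and routers pick the lowest-cost route that is still advertised. The Python sketch below models only that choice. The location names and cost values are illustrative assumptions, and no routing protocol is implemented.

```python
def preferred_location(advertised_costs):
    """Pick the lowest-cost location among those still injecting the VIP route.

    advertised_costs maps location name -> route cost.
    """
    if not advertised_costs:
        return None  # the VIP is unreachable everywhere
    return min(advertised_costs, key=advertised_costs.get)


# Hypothetical costs: low at the preferred site, very high at the backup.
routes = {"preferred_site": 10, "backup_site": 10000}
normal = preferred_location(routes)

# Health checks fail at the preferred site, so its route is withdrawn.
del routes["preferred_site"]
after_failure = preferred_location(routes)
```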

Site Selection Summary

Mechanism        Redundancy Mode   Convergence   App Health Visibility   Site Persistence
HTTP Re-Direct   Active/Active     No            No                      Yes
DNS              Active/Active     DNS cache     Yes                     No
RHI              Active/Standby    Within secs   Yes                     No


Cluster Overview

• Load balancing cluster: multiple copies of the same application against the same data set, usually read only
• High availability cluster: multiple copies of an application that requires access to a common data depository, usually read and write
• Clustering provides benefits for availability, reliability, scalability, and manageability

[Figure: tiers of web servers, application servers and database servers.]

High Availability Cluster Design

• Public network: client/application requests
• Private network: interconnection between nodes
• Storage disk: shared storage array, NAS or SAN

[Figure: each node runs a stack of application, cluster software, cluster enabler and OS.]

HA Cluster Application View

• Active/standby
  - Standby takes over when active fails
  - Two-node or multi-node
• Active/active
  - Database requests load balanced across all nodes
  - Lock mechanism ensures data integrity
• Shared everything
  - Each node mounts all storage resources
  - Provides a single layout reference system for all nodes
• Shared nothing
  - Each node mounts only its "semi-private" storage
  - Data stored on the peer system's storage is accessed via peer-to-peer communication

[Figure: Node1 and Node2.]

GeoCluster Considerations

GeoCluster: a cluster that spans multiple data centers

[Figure: Node1 in the local data center and Node2 in the remote data center, connected over a WAN, with synchronous or asynchronous disk replication between the sites (2 x RTT).]

• Challenges:
  - Split brain
  - L2 heartbeats
  - Storage

HA Cluster Challenges: Split Brain

• Split brain: both nodes are active and concurrently access the same disk, which leads to data corruption
• Resolution: use a quorum, a tie breaker for gaining access to the disk

[Figure: Node1 and Node2 both writing to the shared disk, causing data corruption.]
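One common way to realise a quorum is majority voting, where each node and a quorum disk hold one vote and a node may take the disk only if it can see more than half of the votes. The Python sketch below assumes that scheme. The slide itself only states that a quorum acts as a tie breaker, so the vote counts here are illustrative.

```python
def has_quorum(visible_votes, total_votes):
    """True if this partition sees a strict majority of all votes."""
    return visible_votes > total_votes / 2


TOTAL_VOTES = 3  # assumed: Node1, Node2, and one quorum disk

# The inter-site link fails; only Node1 can still reach the quorum disk.
node1_may_own_disk = has_quorum(visible_votes=2, total_votes=TOTAL_VOTES)
node2_may_own_disk = has_quorum(visible_votes=1, total_votes=TOTAL_VOTES)
```

With an odd number of votes, at most one partition can hold a strict majority, so the two nodes can never both claim the disk.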

Layer 2 Heartbeats

• Extended L2 network: L2 adjacency is required for the nodes' heartbeat. Extending a VLAN across sites is hazardous.
• Resolution: L3 capability for the cluster heartbeat. EoMPLS to carry L2 heartbeats across DR sites.

[Figure: Node1 in the local data center and Node2 in the remote data center, joined across the WAN by a public Layer 2 network and a private Layer 2 network, with synchronous or asynchronous disk replication.]

Storage Disk Zoning

• Storage zoning: taking over of the storage disk array when the active node fails.
• Resolution: cluster software communicates with the cluster enabler, instructing the disk array to perform a failover when a failure is detected.

[Figure: Node1 (active) and Node2 (standby) on an extended SAN, with disk arrays sym1320 and sym1291 and device states labelled RW and WD.]


Storage for Applications

• Presentation tier
  - Unrelated small data files commonly stored on internal disks
  - Manual distribution
• Application processing tier
  - Transitional, unrelated data
  - Small files residing on file systems
  - May use RAID to spread data over multiple disks
• Storage tier
  - Large, permanent data files or raw data
  - Large batch updates, most likely real time
  - Log and data on separate volumes

Replication: Modes of Operation

• Synchronous
  - All data written to local and remote arrays before I/O is complete and acknowledged to host
  - Speed of light = 3 x 10^8 m/s (vacuum) ≈ 3.3 µs/km
  - Speed through fiber ≈ ⅔ c ≈ 5 µs/km
  - 2 RTT per write I/O = 20 µs/km
• Asynchronous
  - Write acknowledged and I/O is complete after write to local array; changes (writes) are replicated to remote array asynchronously
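The per-kilometre figures translate directly into added write latency. A small Python sketch of the arithmetic follows. It uses the slide's fiber delay of 5 µs/km and two round trips per write, while the function names and the example distance and latency budget are illustrative assumptions. Switch, array and protocol overheads are ignored.

```python
FIBER_DELAY_US_PER_KM = 5.0   # one-way propagation delay in fiber
ROUND_TRIPS_PER_WRITE = 2     # per the slide: 2 RTT per write I/O


def sync_write_penalty_us(distance_km):
    """Propagation delay added to each synchronous write, in microseconds."""
    rtt_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return ROUND_TRIPS_PER_WRITE * rtt_us


def max_sync_distance_km(latency_budget_us):
    """Longest site separation that keeps the added delay within budget."""
    return latency_budget_us / (ROUND_TRIPS_PER_WRITE * 2 * FIBER_DELAY_US_PER_KM)


penalty_100km = sync_write_penalty_us(100)   # 2000 µs, i.e. 2 ms per write
reach_1ms = max_sync_distance_km(1000)       # 50 km for a 1 ms budget
```

This is why synchronous replication is described as distance limited: every extra kilometre adds 20 µs to each write, regardless of link bandwidth.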

Synchronous vs. Asynchronous Trade-Off: Enterprises Must Evaluate the Trade-Offs

Synchronous                                        Asynchronous
Impact to application performance                  No application performance impact
Distance limited (are both sites within            Unlimited distance (second site
the same threat radius?)                           outside threat radius)
No data loss                                       Exposure to possible data loss

• Maximum tolerable distance ascertained by assessing each application
• Cost of data loss

Data Replication with DB Example

• Control files identify other files making up the database and record the content and state of the db
• Datafile is only updated periodically
• Redo logs record db changes resulting from transactions
  - Used to play back changes that may not have been written to datafile when failure occurred
  - Typically archived as they fill to local and DR site destinations

[Figure: control files (DB name, creation date, backup performed, redo log time period, datafile state) identify the datafiles (tablespaces, indexes, data dictionary) and the redo log files (database changes). Redo log files record changes to the datafiles.]

Data Replication with DB Example (Cont.)

[Figure: timeline with a hot backup of datafiles and control files taken at time t0, followed by archived redo logs and online redo logs up to time t1.]

Failure or disaster occurs at time t1:
• Media failure (e.g., disk)
• Human error (datafile deletion)
• Database corruption

• Database restored to state at time of failure (time t1) by:
  1. Restoring control files and datafiles from last hot backup (time t0)
  2. Sequentially replaying changes from subsequent redo logs (archived and online): changes made between time t0 and t1
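The two recovery steps can be sketched in a few lines of Python. This is a toy model under stated assumptions: the database is a dictionary, each redo record is a hypothetical (timestamp, key, value) tuple, and real redo formats and recovery tooling are far richer.

```python
def recover(backup, redo_records, t0, t1):
    """Rebuild database state at failure time t1 from a backup taken at t0."""
    state = dict(backup)  # step 1: restore from the hot backup taken at t0
    for timestamp, key, value in sorted(redo_records, key=lambda r: r[0]):
        if t0 < timestamp <= t1:  # step 2: replay changes made after t0, up to t1
            state[key] = value
    return state


backup_at_t0 = {"balance": 100}
redo = [
    (12, "balance", 150),
    (5, "balance", 90),      # already contained in the backup, skipped
    (15, "owner", "alice"),
    (25, "balance", 0),      # after the failure time, skipped
]
restored = recover(backup_at_t0, redo, t0=10, t1=20)
```

Replay has to be sequential because later records overwrite earlier ones. Sorting by timestamp before applying them keeps the final state consistent with the order of the original transactions.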

Data Replication with DB Example (Cont.)

[Figure: primary and secondary sites connected by a SAN extension transport. Cyclic redo logs hold a copy of every committed transaction and are synchronously replicated for zero loss. Archive logs are replicated/copied. A point-in-time database copy, taken at time t0 when the DB is quiescent, is replicated/copied, and earlier DB backups are kept.]

Mixture of sync and async replication technologies commonly used:
• Usually only redo logs are sync replicated to the remote site
• Archive logs are created from the redo log and copied when the redo log switches
• Point-in-Time (PiT) copies of datafiles and control files are copied periodically (e.g., nightly)

Data Center Interconnection Options

[Figure: two mirrored data centers, each with Internet access, stateful firewalls, content caching, server load balancing, intrusion detection, a high-density multilayer LAN switch, front-end and back-end application servers, a high-density multilayer SAN director and enterprise-class storage arrays. Interconnection options between the sites: SONET/SDH, DWDM/CWDM and IP/Metro Ethernet.]

Data Center Transport Options

Transport             Replication                        Limit
Dark fiber            Sync                               Limited by optics (power budget)
CWDM                  Sync (2 Gbps)                      Limited by optics (power budget)
DWDM                  Sync (2 Gbps lambda)               Limited by BB_Credits
SONET/SDH             Sync (1 Gbps + subrate), Async
IP (MDS 9000 FCIP)    Sync (Metro Eth), Async (1 Gbps+)

Distance scale, increasing: data center, campus, metro, regional, national. The figure groups the transports as optical and IP.

DATA CENTER ARCHITECTURE TRENDS

Cisco Data Center Vision

• CONSOLIDATION: centralization and standardization to lower costs, improve efficiency and uptime
• VIRTUALIZATION: management of resources independent of underlying physical infrastructure to increase utilization, efficiency and flexibility
• AUTOMATION: dynamic provisioning and autonomic Information Lifecycle Management (ILM) to enable business agility

[Figure: compute, network and storage resources are consolidated, then virtualized into an intelligent information network (LAN, WAN, MAN, SAN, HPC cluster, GRID), then automated under business policies as an on-demand, service-oriented infrastructure spanning the server fabric network, the data network, the storage network and enterprise applications.]

Summary

What Have We Talked About So Far?

• DR and its business objectives
  - Define budget, technical solution
  - Management buy-in
  - DR is a process
• Components of a data center
  - Multi-tier architecture
  - Front end, application, back-end database
• Techniques in data center disaster recovery
  - HTTP redirection/GSS/RHI
  - Clustering
  - SAN extension
• Trends in data center technology

Today's Data Centers Require an Architectural Approach to ...

• Protect with business resilience
  - Tighten security
  - Improve business continuance
• Optimize with consolidation
  - Improve operational efficiency and resource utilization
  - Lower complexity and cost of ownership
• Grow towards a services-oriented infrastructure
  - Align virtualized resources with business demands
  - Automate infrastructure to respond dynamically

The Big Picture: The Cisco Data Center

[Figure: the emerging data center architecture. Enterprise SAN switching (MDS 9000 family) connects mainframe connectivity, enterprise tape storage and enterprise disk storage, with embedded intelligent storage services: virtual fabrics (VSANs), storage virtualization, data replication services, fabric routing services and multiprotocol gateway services. Server farm switching (Catalyst 6500 family) provides embedded intelligent network services: server balancing, VPN termination, SSL termination, firewall services and intrusion detection. Server fabric switching (Topspin family) provides embedded intelligent virtualization services: server virtualization (VFrame), virtual I/O, grid/utility computing, low-latency RDMA services and clustering. Attached systems: enterprise NAS storage, UNIX/Windows servers, blade servers and virtual private server fabrics #1 to #3.]

What's Next?

• A security strategy to protect the data center
  - Understand the vulnerabilities and apply the relevant mitigations
• Leverage Cisco's technology to optimize server resources
  - Reducing TCO for DRs
  - Virtualization to maximize resources invested
  - Grow DC infrastructure, enabling business agility
  - Automating provisioning of computing resources
  - Speed of deploying new services

Q and A
