Using PubSub For Scheduling in Azure SDN
Qi Zhang (Microsoft - Azure Networking)


Azure Networking

[Diagram labels: Azure Region ‘A’, Azure Region ‘B’, Regional Network, WAN, Microsoft Edge, CDN, Internet Exchanges, ExpressRoute, Cable, Carrier, Consumers, Enterprise, SMB, mobile, Enterprise DC/Corpnet]

DC Hardware
• SmartNIC/FPGA
• SONiC

Services
• Virtual Networks
• Load Balancing
• VPN Services
• Firewall
• DDoS Protection
• DNS & Traffic Management

Intra-Region
• DC Networks
• Regional Networks
• Optical Modules

WAN Backbone
• Software WAN
• Subsea Cables
• Terrestrial Fiber
• National Clouds

Edge and ExpressRoute
• Internet Peering
• ExpressRoute

CDN
• Acceleration for applications and content

Last Mile
• E2E monitoring (Network Watcher, Network Performance Monitoring)


Microsoft Global Network

One of the largest private networks in the world
• 8,000+ ISP sessions
• 130+ edge sites
• 44 ExpressRoute locations
• 33,000 miles of lit fiber
• SDN Managed (SWAN, OLS)

[Figure: world map of the network. Legend: Owned Capacity, Leased Capacity, Moving to Owned, Data center, Edge Site. DCs and Network sites not exhaustive.]


Software Defined Networking (SDN)

Azure SDN: Basis of all NW virtualization in our datacenters
• Control Plane: Centralized, hierarchical, highly scalable and available controllers
• Data Plane: Host agent, drivers

Key to flexibility and scale is SDN

[Diagram labels: Management, API, Central Controllers, Commodity HW, Agents, Host, SmartNIC, vNIC]


PubSub in SDN

• Scale:
  • 40+ regions, hundreds of DCs, millions of servers
  • millions of VNets and LBs
• Flexible, scalable and efficient scheduling between controllers and agents
• Publisher/Subscriber pattern

[Diagram: Controller, publish flow, PubSub, notification flow, Agent 1 … Agent i … Agent N]


Virtual Network in Azure

• Secure per customer virtual datacenter in the cloud
• Instantiate and configure complex topologies in minutes
• Rich security and networking services

[Diagram labels: Virtual Network, VNet Peering, Cross premises Connectivity, Internet]


CA-PA Mappings

• Payload, including CA, is encapsulated
• Traverses physical network

[Figure: a Directory Service holds CA/PA tables. Host Node 1, Host Node 2 and Host Node 3 run VMs behind VM switches (VM-SW1, VM-SW2, VM-SW3); each VM is labeled with a CA in 10.0.0.x and a PA in 10.1.x.x. The example packet carries a payload addressed with CAs 10.0.0.1 and 10.0.0.7 inside an outer header with PAs 10.1.1.2 and 10.1.5.2. Legend: Data traffic, Control msgs.]


PubSub for CA-PA Mapping

Challenges:
• Scale: hundreds K agents, millions of VNets
• Scope: cluster, regional, global
• VNet size limit: 4K mappings -> 64K mappings, 500 peerings
• Provisioning Speed: minutes -> seconds

[Diagram: VNet Controller, Directory Service, Agent 1 … Agent i … Agent N; alongside VNet Controller, PubSub, Agent 1 … Agent i … Agent N]


Scenario I: Global Peering

[Diagram: Region A / VNET A and Region B / VNET B, each with a VNet Controller, a PubSub and an Agent]


Scenario II: DataExfil

Resource “Metadata”

METADATA (resource A):
{
  subscription: “{guid}”,
  account: “users”,
  storage_type: “blob”
}

METADATA (resource B):
{
  subscription: “{guid}”,
  account: “users”,
  storage_type: “table”
}

METADATA (resource C):
{
  subscription: “{guid}”,
  account: “wikimain”,
  storage_type: “blob”
}

Service Tunnel Policy
{
  id: “policy-123”,
  service: “xstore”,
  subscription: “{guid}”,
  accounts: [
    “users”,
    “wiki.*”
  ],
  storage_type: “blob”,
  access: “rw”
}

[Diagram labels: NRP, Policy, PubSub, Host, Agent, VNetPolicyCache, Storage FE, Resource A, Resource B, Resource C, BLOCK]


Overview

• Persisted KV Store
• Hierarchical name space
• Set watcher on a node
  • Single watcher
  • Bulk watcher
• Interfaces
  • Publish (batch/multi supported)
  • Subscribe
  • Notification
  • Query
• State Update/Delivery
  • Initial state
  • Subsequent state updates

[Diagram: a Publisher issues Query (GetNodeInfo) and Publish (CreateNode, UpdateNode) against a tree rooted at Root, with partition keys PK1 … PKi … PKn and child nodes a1, a2, a3, …, n and b1 to b5; watchers (W) are set on nodes. A Subscriber issues Subscribe (watcher, bulkwatcher) and receives Notification events: Created, Deleted, DataChanged, ChildrenChanged.]


SDN PubSub Service

4 Microservices:
Stateless Service
• Routing Service
• Notification Service
Stateful Service
• Selector Service
• Madari Service

Publisher (Vnet Controller):
  PK: /Vnet/{VnetId1}
  Path: /mappings/ipv4/{CA1}
  Data (bond message): {PA1}

Subscriber (Agent):
  SetBulkWatcher:
  PK: /Vnet/{VnetId2}
  Path: /
  receives <notifications>

[Diagram: numbered steps 1 to 6 from the Publisher and the Subscriber through the stateless services to MadariService_02 and MadariService_03, which hold partition keys /Vnet/{VnetId1} and /Vnet/{VnetId2}]


Madari Selector Service: Data Partitioning

AddPartitionKey(“baz”)

Partition Key    Madari Instance
“foo”            MadariService_01
“bar”            MadariService_02
…..              …..
“baz”            MadariService_01

Madari Instance     Total Data Size
MadariService_01    1.05G
MadariService_02    1.9G
MadariService_03    1.6G

[Diagram: numbered steps 1 to 3 between the caller, the Selector Service, and MadariService_01, MadariService_02, MadariService_03]


Subscription through Notification Service

[Diagram: MadariService_02 and MadariService_04 each hold a Root with partition keys (A, B, C, D; vnet 1 to vnet 4). NotificationService_03 and NotificationService_08 hold subscriptions for vnet1, vnet2 and vnet3 on behalf of Subscriber I, Subscriber II and Subscriber III.]


Service Fabric Ring

• Service Fabric ring
  • Multiple PaaS tenants form a Service Fabric ring
  • Service Fabric ring is on a VNET
• PubSub as Service Fabric application
  • Routing Service/Notification Service
    • Stateless
    • On every node
  • MadariService/MadariSelectorService(s)
    • Stateful
    • Min 3, target 7

[Diagram: Tenant1 (n1 to n5, Cluster1), Tenant2 (n6 to n10, Cluster2), Tenant3 (n11 to n15, Cluster3)]


Client Libraries

• Managed Libraries
  • Madari.ClientLibrary
    • Publishing through WCF channel
  • Reliable Publisher
    • IMOS-based publishers
    • User implements:
      • Commit hooks
      • Handler
    • Nuget package:
      Madari.ReliablePublisher.RSL
      Madari.ReliablePublisher.ServiceFabric
• Native Libraries
  • Publish
    • Nuget package: Madari.MadariFrontEnd.Native
  • Subscribe
    • Nuget package: Madari.Subscriber.Native

[Diagram labels: Commit hooks, Mark objects modified, Commit hooks triggered, IMOS Lib, Repo, Runtime, Persist reliable tasks, Retry on failure, Worker, Pick up tasks, Execute handler, Handler, Delete executed tasks on success]


Hierarchical PubSub Infrastructure

Resource Scope => PubSub Service Scope

Resource           Scope       Publisher          Subscriber
CA-PA mapping      regional    VNet Controller    Agent
DataExfil policy   global      NRP                Agent

[Diagram: DataExfil policy goes to the Global PubSub; CA->PA mappings go to three Regional PubSubs]


Global PubSub

[Diagram: Global PubSub Replication Service connecting Region A and Region B; each region has PubSub (AZ01), PubSub (AZ02) and PubSub (AZ03)]


Publish Policy – No Replication (Sync)

[Diagram: /DataExfil/Policies/{policyid} is published to the Global PubSub in numbered steps 1 to 8 through the Routing Service, Selector Service, Madari Service and Replication Service, and on to a Remote Regional P/S]


Replication Service

Madariservice/01: Partition 1
Replicationservice/01: Replication Queue, Request to Partition 1

Operation Tracking Table

Op Id   Status        Operation                               Replication Details
1001    Replicated    [add] /DataExfil/Policies/Policy1       {Dest1:Y, Dest2:Y, Dest3:Y }
1002    Replicating   [update] /DataExfil/Policies/Policy1    {Dest1:Y, Dest2:N, Dest3:Y }
1003    Committed     [remove] /DataExfil/Policies/Policy1    {Dest1:N, Dest2:N, Dest3:N }

Destination Tracker
Dest1: req1002
Dest 2: req1001
Dest 3: req1001


Global SF Ring

[Diagram: five tenants (Tenant1 to Tenant5), each with nodes n1 to n5 on its own VNet (vnet1 to vnet5), spread over the regions uswest, useast, europewest, asiasoutheast and uswestcentral; Tenant1 is in uswest on vnet1]


Major Performance KPIs

• 15 partitions

KPI                   Value
Write throughput      10k req/s
Read throughput       42k req/s
End to End latency    10ms/300ms (50%/99%)
Max subscribers       500K

• In a large region:
  • < 300k agents
  • < 100K VNets
  • ~1k read/sec, ~200 write/sec


Work in Progress

• Accelerating read flow
• End to end validation


Q & A

Thank you!
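
Sketch: watchers on a hierarchical KV store

The Overview slide describes a KV store with a hierarchical name space, single and bulk watchers, and delivery of initial state followed by subsequent updates. The Python sketch below is only an illustration of that pattern for one partition key. The class `PartitionStore` and its method names are hypothetical and are not the Madari API; replaying initial state as `Created` events is an assumption, persistence is left out, and `ChildrenChanged` is not modeled.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Callback signature: (event, path, data). Event names follow the slide.
Callback = Callable[[str, str, bytes], None]


class PartitionStore:
    """Toy in-memory node tree for a single partition key (hypothetical API)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, bytes] = {}
        self._watchers: Dict[str, List[Callback]] = defaultdict(list)
        self._bulk: List[Tuple[str, Callback]] = []

    @staticmethod
    def _under(path: str, prefix: str) -> bool:
        # True when path equals prefix or lies below it; "/" covers everything.
        base = prefix.rstrip("/")
        return path == base or path.startswith(base + "/")

    def set_watcher(self, path: str, cb: Callback) -> None:
        """Single watcher: fires only for changes to exactly this node."""
        self._watchers[path].append(cb)

    def set_bulk_watcher(self, prefix: str, cb: Callback) -> None:
        """Bulk watcher: replay current state, then receive later updates."""
        for path in sorted(self._nodes):
            if self._under(path, prefix):
                cb("Created", path, self._nodes[path])  # assumed initial-state form
        self._bulk.append((prefix, cb))

    def _notify(self, event: str, path: str, data: bytes) -> None:
        for cb in self._watchers.get(path, []):
            cb(event, path, data)
        for prefix, cb in self._bulk:
            if self._under(path, prefix):
                cb(event, path, data)

    def publish(self, path: str, data: bytes) -> None:
        """Create the node if it is new, otherwise update it."""
        event = "DataChanged" if path in self._nodes else "Created"
        self._nodes[path] = data
        self._notify(event, path, data)

    def delete(self, path: str) -> None:
        if path in self._nodes:
            del self._nodes[path]
            self._notify("Deleted", path, b"")


# Usage: a controller publishes CA -> PA mappings, an agent bulk-watches "/".
store = PartitionStore()
store.publish("/mappings/ipv4/10.0.0.1", b"10.1.1.2")
events: List[Tuple[str, str, bytes]] = []
store.set_bulk_watcher("/", lambda e, p, d: events.append((e, p, d)))
store.publish("/mappings/ipv4/10.0.0.1", b"10.1.5.3")
store.delete("/mappings/ipv4/10.0.0.1")
```

In this sketch the agent first sees the mapping that existed before it subscribed, then the update, then the deletion, which mirrors the "initial state, subsequent state updates" split on the slide.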
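
Sketch: partition key placement

On the Data Partitioning slide, AddPartitionKey(“baz”) ends up on MadariService_01, which is also the instance with the smallest total data size in the table. The slides do not state the placement rule, so the sketch below assumes a least-data-first policy purely for illustration; the function name and the tie-break on instance name are hypothetical.

```python
from typing import Dict


def add_partition_key(
    key: str,
    assignments: Dict[str, str],
    data_size_gb: Dict[str, float],
) -> str:
    """Assign `key` to a Madari instance and return the instance name.

    Assumption (not stated in the slides): pick the instance that currently
    holds the least data, breaking ties by instance name.
    """
    if key in assignments:
        # Already placed: keep the existing assignment stable.
        return assignments[key]
    instance = min(data_size_gb, key=lambda name: (data_size_gb[name], name))
    assignments[key] = instance
    return instance


# Numbers taken from the slide's two tables.
assignments = {"foo": "MadariService_01", "bar": "MadariService_02"}
data_size_gb = {
    "MadariService_01": 1.05,
    "MadariService_02": 1.9,
    "MadariService_03": 1.6,
}
chosen = add_partition_key("baz", assignments, data_size_gb)
```

With the slide's sizes, this assumed policy selects MadariService_01 for “baz”, consistent with the table; a real selector could equally weigh key count, request rate, or replica health.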