Network-As-A-Service on Bare-Metal Cloud
Total Page:16
File Type:pdf, Size:1020Kb
Network-as-a-Service on Bare-metal Cloud Rakuten, Inc. Cloud Platform Department Tomohisa Egawa JANOG47 Overview Sharing our experience of new private cloud project focusing on Platform and Network Contents (23 minutes) • History of our infrastructure • Concept of new private cloud based on baremetal • Design of platform and network Discussion (7 minutes) 2 Looking back our Infrastructure 3 2018: Freedom and Chaos Silo was occurred due to platform crowd Issues • Bad experience for in-house engineers Container Container PaaS • Multiple tools and portals for each platform Platform 1 Platform 2 • It’s difficult for users to know which one is best for their services • Not consolidated Billing and authentication • Platformer view • Can’t keep up with life cycle even with automation VM VM Baremetal • Upgrade cost is too heavy about platform software • OS management/patching to manage ton of the VMs Platform 1 Platform 2 Platform • Company view • Challenges in resource optimization at company-wide level • Unified security control Concerned looking ahead 3-5 years • Support life cycle Late 2018: Change Infrastructure strategy + Reorganization 4 New Private Cloud based on Bare-metal 5 Concept of Bare-metal Cloud (1) Service-B Service-B Service-B Service-A Service-B Service-C Managed Service approach • Reorganization with inverse Conway’s law • New services are created by combining the managed services GUI Portal/API + IAM + Billing • Generate higher-level managed service based on the combination LBaaS CaaS DBaaS Storage Monitoring Each component is loosely coupled • Clarification of the demarcation point of responsibility Baremetal • Update process can be performed any time • Right technology can be selected according to the trend Network Baremetal Network Controller Controller Let’s use CNCF/OSS product mainly DC-1 DC-4 DC-5 • Adopt de fact standard software DC-2 DC-3 • Easy to obtain information and lower learning costs • Cut CAPEX dramatically Backbone Network/DCI 6 Concept of Bare-metal Cloud (2) Service-B Service-B Service-B Service-A Service-B Service-C GUI Portal/API + IAM + Billing Develop our original IAM and Portal by using OSS • Support for frequent internal reorganizations LBaaS CaaS DBaaS Storage Monitoring • Support for irregular delegation of authority Baremetal Private cloud based on bare-metal Network • Bare-metal server can be provided to all in-house users Baremetal Network • Build a managed service from Bare-metal Controller Controller Focusing on “Core-Infrastructure” DC-1 DC-2 DC-3 DC-4 DC-5 from the next slide Backbone Network/DCI 7 Design of Core-Infrastructure 8 Design concept of Core-Infrastucture 1. Sutainable Core-Infrastructure • Being strong against the change of technology trends and having full control • Design to keep the latest version without affecting the services on the Core-Infrastructure • Support Multi-tenancy 2. (Stable + Scalability + Easy-operation) Network • No risk of whole network failure by eliminating SPOF • Ensure sufficient scalability without worrying about limitation • Easy operation = We can keep extra capacity for the next challenge 3. Network-aaS • Provide useful network function for users via API/GUI • Network is also treated as “Product” 9 Goal 1 Sustainable Core-Infrastructure 10 Comparison: Virtulization Infrastructure Over 10000 VMs Pros VM VM VM VM VM • Integrated management through platform software VM VM VMVM • Resource abstraction • Users donʼt need to care about physical servers • Optimizing utilization by pooling resources Resource Pool Cons Platform Software • Platform software management is high OPEX Tightly • Difficult to keep up with life cycle and upgrades coupled • Impact of changes is a risk against all services Hypervisor • Increased OPEX due to large number of VMs • Configuration drift even using IaC Storage Network • OS-level security patching Suppress the number of VM à Increse the ratio of Container 11 Comparison: Core-Infrastructure based on Bare-metal LifeCycle Advantages short GUI Portal/API • Provide stable and high-performance compute resources LBaaS CaaS DBaaS Storage Monitoring • No noisy-neighbor problem • Simple and resistant to changes in technology trends Loosely-coupled • Offload high-level function as Managed-service • Always up to date Baremetal • Controllers are our permanent software assets Network long Baremetal Network Controller Controller Trade-off Require high infrastructure skills for bare-metal users Sustainable Core-Infrastructure • Ensuring redundancy in case of bare-metal failure • No physical resource abstration • We have full control of the product life • Apply improvement continuously • Resource utilization optimization should be covered by user product • Eliminate of the concept of upgrading Develop new services using a combination of Managed-services 12 Baremetal / Network Controller • Controllers are independent for each data center • No service impact even if controller down Network Engineer GUI/API Portal Operation Baremetal Controller Network Controller System Developper DB DHCP API DB IPAM API Network Network Template Network Template Engineer Apply Deploy or Destroy Configuration 13 Bare-metal Servers Network Switches Multi tenant network design (Ideal) Not adopted • Running different workloads in same subnet should be easier to manage? • Run Bare-metal, VM and Container in same subnet à Simple management? • This approach was discarded because it’s needed tightly coupled between Core-Infrastructure and Platform Tenant-A Tenant-B Subnet-1 Subnet-2 Subnet-1 Subnet-2 Baremetal Baremetal Baremetal Baremetal Default Default Default VM deny VM deny VM deny VM Container Container Container Container 14 Multi tenant network design Adopted Separate tenant resources provided by each mechanism on the Managed-service • Portal provides centralized management to handle different workload resources easier • Example) Core-Infrastructure side provides isolation of bare-metal by subnets Tenant-A Tenant-B Portal BMaaS CaaS XaaS BMaaS CaaS XaaS Subnet-1 Namespace-1 Resource-1 Subnet-1 Namespace-1 Resource-1 BM-1 C-1 X-1 Default BM-1 C-1 X-1 BM-2 C-2 X-2 deny BM-2 C-2 X-2 Subnet-2 Namespace-2 Resource-2 Subnet-2 Namespace-2 Resource-2 BM-3 C-3 X-3 BM-3 C-3 X-3 BM-4 C-4 X-4 BM-4 C-4 X-4 15 Goal 2 Stable + Scalability+ Easy-operation Network 16 How to realize L3 Bare-metal? We aimed to eliminate MLAG from ToR layer to avoid a difficult situation Approach 1: VXLAN/EVPN on the Host Not adopted Approach 2: Routing on the Host Adopted • Switch only needs to support BGP • (Good) Rack-wide L2 network can be deployed without MLAG • Reduce the probability of switch bug • (Bad) High OPEX • Low learning cost and easy trouble shooting • Complicated network configuration on the host • More NOS selection due to simpler requirement • Difficult trouble shooting • Simple redundancy thanks to ECMP • Many features depend on Linux Kernel • Goodbye MLAG ToR Switch ToR Switch ToR Switch ToR Switch Advertise 10.0.0.0/8 eBGP+EVPN Our DC Routing Daemon eBGP Advertise Routing L2VXLAN EVPN x.x.x.x/32 Daemon Server IP Address L3VXLAN VRF vni0 Linux Kernel x.x.x.x/32 Bare metal Bare metal 17 Usecase of L3 Bare-metal Rack wide Mobility Multi Network Subnets can be deployed across the racks • Provide some VRF directly to the bare-metal • Select a server regardless of rack location • Ex) LBaaS nodes have multiple virtual NICs • Simplify subnet management • Set route-map automatically as network catalog Rack-1 Rack-2 ToR Switch ToR Switch Tenant A Tenant A VRF VRF VRF VRF subnet-1 /26 subnet-2 /26 private public private public No VXLAN Tenant B Tenant B on IP-CLOS subnet-2 subnet-2 /27 10.0.0.0/8 0.0.0.0/0 Our DC Internet Tenant C Tenant B subnet-1 /26 subnet-2 NIC1.101 NIC1.201 NIC2.101 NIC2.201 route-map route-map set taG 101 set taG 201 Rack-1 Rack-2 vni0 vni1 x.x.x.x/32 y.y.y.y/32 Tenant A route-map export subnet-1 /22 route-map export Routing-on-Host match taG 101 match taG 201 on IP-CLOS Tenant B set src vni0 Kernel FIB set src vni1 subnet-1 /24 10.0.0.0/8 src vni0 Tenant C 0.0.0.0/0 src vni1 subnet-1 /25 Bare metal 18 Kubernetes Networking on L3 Bare-metal Consider the best network design for CaaS on Bare-metal Bare metal • Demarcation point of responsibility eBGP (FRR) • FRR: Managed by Core-Infrastructure side • iBGP Calico (BIRD): Managed by CaaS (RR) • No concern about explosion of container routes • Switch doesnʼt learn the route of the container ToR ToR • PoC: Container-aware Load balancing by LBaaS eBGP eBGP • Notice • Create a FRR container for Fedora-CoreOS eBGP iBGP iBGP eBGP • systemd-nspawn, not docker for user (FRR) (Calico) (Calico) (FRR) • Change BGP port number on FRR Kernel FIB Kernel FIB • BIRD:179, FRR:20179 • Disable ECMP on CaaS Bare-metal lo /32 C C C /26 lo /32 C C C /26 • Use local-preference (Transmission: Active/Standby) Bare metal Bare metal • Why?: BIRD couldnʼt process IPv4 link-local provided by FRR IPv6 unnumbered (Just my assumption) 19 Provisioning of L3 Bare-metal • Providing a self-service bare-metal server for users • Select Linux distribution, subnet and network catalog (L2 or L3 etc..) • ToR switch port configuration is changed dynamically • Deploy a server: Change switch port connection from L2 to L3 • Back to pooled server: Change switch port connection from L3 to L2 ToR Switch ToR Switch Network 4. Port config 6. Apply Controller Bridge port Routed port Native VLAN eBGP 2. Network Parameter L2 L3 Network Catalog-1 1. Deploy Server with Network Catalog-1 DHCP Routing Provisioning 3. Install OS (PXE) 5. Reboot Daemon Portal