Mellanox Container Journey

RDMA Device Isolation
Dror Goldenberg, Parav Pandit - Mellanox Technologies
ISC Container Workshop, Frankfurt, June 2019
© 2019 Mellanox Technologies

RDMA Device Access Paths

▪ RDMA Connection Manager (CM)
▪ Verbs (via uverbs char device)
▪ Rdmatool (via netlink sockets)
  ▪ Resource, connection information
▪ Sysfs file interface
  ▪ Counters
  ▪ Network addresses (IP/GID/LID)
  ▪ More...
▪ UMAD char device
  ▪ MAD packets
  ▪ Device, address information

[Diagram labels: sysfs interface, rdma tool, uverbs device, rdmacm device, umad/issm device; kernel RDMA subsystem with rds, nvme fabrics, nfs-rdma; rdma device]

RDMA Device - The Need for Isolation

▪ Device cgroup - char devices ACL
  ▪ Too coarse for network level
▪ RDMA cgroup - # of resources
  ▪ Does not control the network access
▪ RDMA is yet another network device

Need to protect RDMA devices
▪ At the network level
▪ In a reliable, unified, deterministic way
▪ Fit in existing orchestration frameworks (CNI, device plugin…)
▪ Future proof for new apps, interfaces, APIs
▪ Backward compatible

RDMA Devices in Net Namespace

▪ Isolation ring to access the RDMA device
  ▪ Use existing net namespace of Linux kernel
▪ RDMA network namespace modes
  ▪ Exclusive or shared
  ▪ Via netlink
  ▪ Default as shared mode (backward compatible)
▪ RDMA device associated with net namespace
  ▪ New netlink command
▪ Integrates with CNI and device plugin of K8s, Docker network plugin extension

[Diagram: the same access paths, with one rdma device in net namespace = foo and another in net namespace = bar]

RDMA Devices in Network Namespaces

[Diagram: Kubernetes/Docker with SR-IOV operator, SR-IOV Device Plugin and SR-IOV CNI, on a Mellanox ConnectX Adapter Card with SR-IOV Enabled (PF plus three VFs)]

Net namespace   Pod/Container     ibdev    netdev     Function
A               Pod1/Container1   mlx5_1   ib1/eth1   VF-1
B               Pod2/Container2   mlx5_2   ib2/eth2   VF-2
C               Pod3/Container3   mlx5_3   ib3/eth3   VF-3

▪ Every container/POD has an IB device (mlx5_1,2,3) and netdevice
▪ Isolation is done on the net namespace level

Additional Information…

▪ Examples
  ▪ Query, Change RDMA subsystem mode
    ▪ $ rdma system show
    ▪ $ rdma system set netns exclusive
  ▪ Move RDMA device to new network namespace
    ▪ $ ip netns add foo
    ▪ $ rdma dev set mlx5_1 netns foo
▪ Current status (6/15/2019)
  ▪ Merged to upcoming Linux kernel 5.2 and iproute2/rdma tool
  ▪ Merged to netlink golang library
▪ Ahead of us
  ▪ Integrate to docker sr-iov plugin
  ▪ Integrate to SR-IOV operator and CNI plugin

Thank You
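
The example commands on the "Additional Information" slide (switch the RDMA subsystem to exclusive mode, create a net namespace, move a device into it) could be scripted roughly as follows. This is a minimal sketch, not part of the original deck: the helper names isolation_commands and run_commands are hypothetical, and the only CLI invocations used are the ones shown on the slide. The wrapper defaults to a dry run that just returns the command lines, since actually running them is assumed to need root privileges, an RDMA-capable device, and an iproute2 build that includes the rdma tool.

```python
import shlex
import subprocess


def isolation_commands(ibdev, netns, set_exclusive=True):
    """Build argv lists for the slide's example sequence.

    ibdev and netns are caller-supplied names (the slide uses
    "mlx5_1" and "foo"). The order follows the slide: mode change
    first, then namespace creation, then the device move.
    """
    cmds = []
    if set_exclusive:
        # Slide: $ rdma system set netns exclusive
        cmds.append(["rdma", "system", "set", "netns", "exclusive"])
    # Slide: $ ip netns add foo
    cmds.append(["ip", "netns", "add", netns])
    # Slide: $ rdma dev set mlx5_1 netns foo
    cmds.append(["rdma", "dev", "set", ibdev, "netns", netns])
    return cmds


def run_commands(cmds, dry_run=True):
    """Return the shell-quoted command lines; execute them only if dry_run is False."""
    lines = []
    for argv in cmds:
        lines.append(shlex.join(argv))
        if not dry_run:
            # Assumption: caller has the privileges these tools require.
            subprocess.run(argv, check=True)
    return lines


if __name__ == "__main__":
    for line in run_commands(isolation_commands("mlx5_1", "foo")):
        print("$", line)
```

Calling run_commands(..., dry_run=False) would execute the same three commands in order and stop at the first failure, because check=True raises on a non-zero exit status.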
