SMB3.1.1 to the Cloud and Beyond: Update on Progress in SMB3 Support for Steve French Principal Software Engineer Azure Storage - Microsoft Legal Statement

– This work represents the views of the author(s) and does not necessarily reflect the views of Microsoft Corporation – Linux is a registered trademark of . – Other company, product, and service names may be trademarks or service marks of others. Who am I?

– Steve French [email protected] – Author and maintainer of Linux cifs vfs (for accessing Samba, Windows and various SMB3/CIFS based NAS appliances) – Also wrote initial SMB2 kernel client prototype – Member of the Samba team, coauthor of SNIA CIFS Technical Reference,former SNIA CIFS Working Group chair – Principal Software Engineer, Azure Storage: Microsoft Outline

● General Linux File System Status – Linux FS and VFS Activity

● What are the goals?

● What’s New – Key Feature Status

● Cifs-utils enter a new era …

● Features expected soon …

● Performance Overview

● POSIX Extensions

● Testing Improvements A year ago … and now … kernel (including SMB3 client cifs.ko) improving

● A year ago Linux 4.18 Last week:5.3 “Bobtail “Merciless Moray” Squid” Discussions driving some of the FS development activity ?

● New mount API

● Improving support for Containers

● Better support for faster storage (NVME, RDMA)

● io_uring

● Fixing copy offload (e.g. copy and clone file range syscalls)

● Shift to Cloud (longer latencies, object & file coexisting) The “real reason” for kernel 5.0

● Quoting Linus (January 7th email announcing 5.0-rc1): “People might well find a feature _they_ like so much that they think it can do as a reason for incrementing the major number. So go wild. Make up your own reason for why it's 5.0." ● So should we claim: “Version 5.0 marks the reborn, new improved SMB3 Client For Linux” …? 2019 Linux FS/MM summit (last month in Puerto Rico)

● Great group of talented developers

Most Active Linux Filesystems this year

● 5344 kernel filesystem changesets last year (since 4.18 kernel). (down slightly) – FS activity: 6% of overall kernel changes (which are dominated by drivers) down slightly as % of activity – Kernel is huge (> 18 million lines of code, measured Saturday)

● There are many Linux file systems (>60), but six (and the VFS layer itself) drive two thirds of the activity (, xfs and cifs are the three most active) – File systems represent 5.1% of kernel source code (930KLOC)

● cifs.ko (cifs/smb3 client) among more active fs – #3 most active with 466 changesets! – 53KLOC (not counting cifs-utils which are 10.5KLOC and samba tools which are larger) Linux File System Change Detail (4.16- now)

● BTRFS 1004 changesets (down)

● VFS (overall fs mapping layer and common functions) 775 (down)

● XFS 521 (up slightly)

● CIFS/SMB2/SMB3 client 466. And increasing!

● F2FS 358 (down)

● NFS client 328 (down)

● And others: EXT4 206(down), AFS 155, Ceph 172, GFS2 142, 103, ocfs2 108 ...

● NFS server 137 (down). Linux NFS server is MUCH smaller than CIFS client or Samba

● NB: Samba is as active as all Linux file systems put together (>5000 changesets per year) - broader in scope (by a lot) and also is user space not kernel. 100x larger than the NFS server in Linux! Samba now 3.5 million lines of code (measured Saturday with sloccount)! What are the goals?

● Make SMB3 (SMB3.11 and followons) fastest, most secure general purpose way to access file data, whether in the cloud or on premises or virtualized

● Implement all reasonable Linux/POSIX features - so apps don’t have to know running on SMB3 mounts (vs. local)

● As Linux evolves, and need for new features discovered, quickly add them to client and Samba Examples of Great Progress this year! Snapshot mounts

 Want to compare backups?  Look at previous versions?  Recover corrupted data  ...  An example, one mount with “snapshot=” and one without # cat /proc/mounts | grep cifs //172.22.149.186/public /mnt1 cifs ro,vers=default,addr=172.22.149.186,snapshot=131748608570000000,... //172.22.149.186/public /mnt2 cifs rw,vers=default,addr=172.22.149.186,... root@Ubuntu-17-Virtual-Machine:~/cifs-2.6# ls /mnt1 EmptyDir newerdir root@Ubuntu-17-Virtual-Machine:~/cifs-2.6# ls /mnt1/newerdir root@Ubuntu-17-Virtual-Machine:~/cifs-2.6# ls /mnt2 EmptyDir file newerdir newestdir timestamp-trace.cap root@Ubuntu-17-Virtual-Machine:~/cifs-2.6# ls /mnt2/newerdir new-file-not-in-snapshot Snapshots rock!

● Different mounts

● SS is Read-only

● Easy to use rsize and wsize increase

 Previous default 1MB – 4MB gave 1 to 13% improved performance to Samba depending on network speed, 1% better for read.  Moved to 4MB in 4.20 kernel  And default block size moved to 1MB (helps cp) Global Name Space – Much better! Thank you Paulo!

● Remember what DFS could do in Windows … now the Linux client can handle failover and DFS entry caching too New security models

● Modefromsid

● Multiuser improvements Multichannel

● Thank you Aurelien!

● Expected to be a big performance win …

● Testing continues at SDC SMB3 Plugfest Compounding helps a lot – thanks Ronnie!

Added in so far: 1) update timestamps on existing file: touch /mnt/file" goes from 6 request/resp pairs to 4 2) delete file "rm /mnt/file" from 5 to 2 3) make directory "mkdir /mnt/newdir" 6 to 3 4) remove directory "rmdir /mnt/newdir" 6 down to 2 5) rename goes from 9 request/response pairs to 5 ("mv /mnt/file /mnt/file1") 6) hardlink goes from 8 to only 3 (!) ("ln /mnt/file1 /mnt/file2") 7) symlink with mfsymlinks enabled goes from 11 to 9 ("ln -s /mnt/file1 /mnt/file3") 8) query file information “stat /mnt/file” goes from six roundtrips down to 2 9) And get/set xattr, and statfs and more Sparse File Support (and other network fs can’t do this)!

Example test code using sparse files: Now 77 smb3 dynamic tracepoints (a year ago was 20 ...)! GCM Fast

● Can more than double write perf! 80% for read

● In 5.3 kernel NetName context added!

● In 5.3 kernel Can now view detailed info on open files (not just “lsof” output) RDMA – Performance Improved

● Thank you Long Li! Many fixes/improvements

● No longer CONFIG_EXPERIMENTAL for smbdirect (RDMA) on Linux kernel client And can view (on the client) the number of open files per share Slow Response Threshold now configurable

/sys/module/cifs/parameters/slow_rsp_threshold or via modprobe: slow_rsp_threshold:Amount of time (in seconds) to wait before logging that a response is delayed. Default: 1 (if set to 0 disables msg). (uint) Recommended values are 0 (disabled) to 32767 (9 hours) with the default remaining as 1 second SMB3/CIFS Features by release

● 4.17 (56 changesets) - June 3 – Add signing support for smbdirect – Add support for SMB3.11 encryption, and preauth integrity (Thanks Aurelien!) – SMB3.11 dialect improvements (reconnect and encryption) (and no longer marked experimental) ● 4.18 (89 changesets) – August 12 – RDMA and Direct I/O improvements (thank you Long Li!) – SMB3 POSIX extensions (initial minimal set, open and negotiate context only. use ‘’ mnt parm) – Add “smb3” alias to cifs.ko (“insmod smb3”) – Allow disabling less secure dialects through new module install parm (disable_legacy_dialects) – Add support for improved tracing (, trace-cmd) – Cache root file handle, reducing redundant opens, improving perf (Thanks Ronnie) 4.19 (69 cifs/smb3 changesets, Oct 22) cifs.ko internal module version 2.12

● Snapshot (previous version support)

● SMB3.1.1 ACL support

● Remove SMB3.1.1 config option (so support for newest dialect, SMB3.1.1, always built)

● Compounding for statfs (perf improvement)

● smb3 stats and tracepoints much improved (e.g. slow command trancepoint, counter)

● Fix statfs output

● smb3 xattr alias (eg getfattr -n system.smb3_acl /mnt1/file)

● Allow disable insecure dialect, vers=1.0, in kconfig

● Bug fixes (signing, firewall, root dir missing file, backup intent, lease break, security) 4.20 (70 Changesets, December 23rd) cifs.ko internal module version 2.14

● RDMA and direct i/o performance improvements (add direct i/o to smb3 file ops)

● Much better compounding (create/delete/set/unlink/mkdir/rmdir etc.), huge perf improvements for metadata access

● Additional dynamic (ftrace) tracepoints

● Add /proc/fs/cifs/open_files to allow easier debugging

● Slow response threshold is now configurable

● Requested rsize/wsize larger (4MB vs. 1MB)

● Query Info passthrough (enables new “smb-info” tool to display useful metadata in much detail and also ACLs etc.), and allow ioctl on directories ● Many Bug Fixes (including for krb5 mounts to Azure, and fix for OFD locks, backup intent mounts) 5.0 (82 changesets) March 3rd, 2019 (cifs.ko internal module version 2.17)

● SMB3.1.1 requested by default (ie is now in default dialects list)

● DFS failover support added (can reconnect to alternate DFS target) for higher availability and DFS referral caching now possible, cache updated regularly (Thank you Paulo)

● Support for reconnect if server IP address changes (coreq change in user space implemented in latest version of cifs-utils) (Thank you Paulo!)

● Performance improvement for get/set xattr (compounding support extended) ● Many Bug Fixes (24 important enough for stable) including for large file copy in cases where network connection is slow or interrupted, reconnect fixes, and fix for OFD lock support. The buildbot is really helping improve cifs.ko code quality! 5.1 (86 changesets) May 5th 2019 (cifs.ko internal module version 2.19)

● “fsctl passthrough” support improved: allows tools like cifs-utils to easily query any info available over SMB3 fsctl or query_info

● New mount parm “handletimeout” to allow persistent/resilient handle behavior to be configurable

● Allow fallocate zero range to expand a file

● Improve perf: cache FILE_ALL_INFO for the shared root handle

● Improve perf: default inode block size reported as 1MB (NB: 4MB rsize/wsize)

● Cleanup mknod and special file handling (thank you Aurelien)

● Support guest mounts over smb3.1.1

● Add many dynamic trace points to ease debugging and perf analysis

● Bug fixes (23 important enough for stable). Adding even more tests to the buildbot really helped. Multiple fixes for ‘crediting’ (SMB3 flow control) – thank you Pavel! 5.2 (64 changesets, so far) July 7 2019. cifs.ko version 2.20

● Bug fixes (11 important enough for stable)

● Improved perf: sparse file support now allows fiemap, SEEK_HOLE and SEEK_DATA (helps cp to Samba e.g.)

● Add support for fallocate ZERO_RANGE

● Support “fsctl passthrough” for cases where send (write) data in SMB3 fsctl – allows user space tools to do more!

● RDMA support (smbdirect) improved (thanks Long Li) should no longer be considered ‘experimental’ 5.3 (55 changesets) Sept 15th, 2019 (cifs internal module number 2.22) 5.4 (38 changesets so far). Cifs version 2.23 Bugzilla

● Bugzilla.samba.org: 59 bugs open against cifsVFS component

● Bugzilla.kernel.org: 37 bugs open against CIFS component (down significantly)

● Help would be appreciated cleaning up some of these (some are already fixed but not backported by the distros they are using) Cifs-utils enters a new era!

● Smbinfo! Cifs-utils now even has a GUI!

● secddesc-ui.py Another example:

root@smf-Thinkpad-P51:~/cifs-utils-staging# ./smbinfo fileallinfo /mnt2/400M Creation Time Fri May 24 15:06:42 2019 Last Access Time Fri May 24 15:06:42 2019 Last Write Time Fri May 24 15:06:43 2019 Last Change Time Fri May 24 15:06:43 2019 File Attributes 0x00000020: Allocation Size 409993216 End Of File 409600000 Number Of Links 1 Delete Pending 0 Delete Directory 0 Index Number 10354692 Ea Size 0 File/Printer access flags 0x00020080: Current Byte Offset 0 Mode 0x00000000: File alignment: BYTE_ALIGNMENT cifs-utils

● With pass-through SMB3 fsctl and query-info (and set-info) now possible it is easy to write user space tools to get any interesting info from the server

● Would love more contributions!

● Look at smbinfo from cifs-utils for examples Features expected soon

● Security – Make the two common security use cases easy! – Improved GCM support (best SMB3.1.1 crypto, faster) ● Performance

● New function – Two new negotiate contexts: ● Compression (partially implemented) ● NETNAME_NEGOTIATE_CONTEXT ● More improvements to the SMB3.1.1 POSIX Extensions (see slides from yesterday) Exciting year!!

● Faster performance

● Much better tools

● SMB3.1.1 default, improved security

● SMB3.1.1 POSIX Extensions improved!

● LOTS of new features

● And most importantly MUCH better tested and stable! ... Testing … testing … testing

● The “buildbot” - automated regression testing! Thank you Paulo, Ronnie and Aurelien. See: http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com

● See xfstesting page in cifs wiki https://wiki.samba.org/index.php/Xfstesting-cifs

● Easy to setup, exclude file for slow tests or failing ones

● Huge improvement in XFSTEST – up to 127 groups of tests run over SMB3 (more than run over NFS)! And more being added every release The “buildbot” and automated testing Five test groups (should we add more?) Thanks to the buildbot – 5.1 Best Release Ever for SMB3!

● Prevents regressions

● Continues to improve quality Thank you for your time

● Future is very bright! S + M B 3 Additional Resources to Explore for SMB3 and Linux

– https://msdn.microsoft.com/en-us/library/gg685446.aspx ● In particular MS-SMB2.pdf at https://msdn.microsoft.com/en-us/library/cc246482.aspx – https://wiki.samba.org/index.php/Xfstesting-cifs – Linux CIFS client https://wiki.samba.org/index.php/LinuxCIFS – Samba-technical mailing list and IRC channel – And various presentations at http://www.sambaxp.org and Microsoft channel 9 and of course SNIA … http://www.snia.org/events/storage-developer – And the code: ● https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/cifs ● For pending changes, soon to go into upstream kernel see: – https://git.samba.org/?p=sfrench/cifs-2.6.git;a=shortlog;h=refs/heads/for-next