AA NextNext GenerationGeneration CIFSCIFS ClientClient

Steven French Technology Center IBM Austin Connectathon 2002 March 5, 2002 email:[email protected] OutlineOutline

„ ValueValue ofof thethe clientclient perspectiveperspective „ SomeSome currentcurrent clientsclients „ CIFSCIFS ClientClient (VFS)(VFS) forfor LinuxLinux StatusStatus „ NameName ResolutionResolution „ ReliabilityReliability „ CompatibilityCompatibility „ PerformancePerformance „ SecuritySecurity „ AreasAreas toto extendextend thethe dialectdialect &OLHQW 3HUVSHFWLYH ValueValue ofof thethe clientclient

„ ClientClient drivesdrives protocolprotocol flowflow soso cancan determinedetermine ...... „ WhatWhat isis thethe smallestsmallest setset ofof SMBSMB PDUsPDUs thatthat couldcould bebe usedused byby aa clientclient withoutwithout lossloss ofof functionfunction oror performance?performance? –– HowHow differentdifferent isis thethe answeranswer forfor WindowsWindows 20002000 vs.vs. Linux?Linux? „ AndAnd aa wellwell written,written, easilyeasily available,available, modifiablemodifiable referencereference clientclient wouldwould promotepromote interoperability.interoperability. –– smbfssmbfs?? JCIFS?JCIFS? ValueValue ofof thethe clientclient

„ AA consensusconsensus isis emergingemerging thatthat smallsmall enhancementsenhancements areare neededneeded toto keepkeep CIFSCIFS aheadahead ofof NFSNFS andand WebDAVWebDAV inin intranetintranet filefile sharingsharing „ ManyMany ofof thesethese changeschanges couldcould bebe invaluableinvaluable forfor aa smartsmart CIFSCIFS clientclient „ ...... butbut therethere isis constantconstant tensiontension (especially(especially forfor clients)clients) betweenbetween perfectperfect reliability/interoperabilityreliability/interoperability andand enhancedenhanced features/performancefeatures/performance ClientClient

„ MostMost attentionattention hashas beenbeen onon serverserver –– Example:Example: SambaSamba vs.vs. smbfssmbfs » Just smbd (main server daemon) alone is more than 20KLOC vs » VFS filesystem Smbfs (less than 4KLOC) and all related utilities for mount and session establishment (client and library functions less than 14 KLOC) –– ServerServer performanceperformance optimizationoptimization isis bigbig foucsfoucs » benchmarks often show performance of server with client constant (rather than reverse) &XUUHQW &OLHQWV CurrentCurrent ClientsClients „ FilesystemsFilesystems (IFS(IFS oror VFSVFS like)like) –– WindowsWindows 20002000 andand XPXP » Viewed by many as the reference client –– WindowsWindows 9x,9x, WindowsWindows MEME –– ThursbyThursby MacintoshMacintosh ClientsClients –– BSDBSD smbfssmbfs –– AppleApple –– SharitySharity –– LinuxLinux SmbfsSmbfs –– OS/400OS/400 CurrentCurrent ClientsClients -- otherother typestypes

„ ClientClient SMBSMB UtilitiesUtilities suchsuch asas –– SmbclientSmbclient -- ftpftp likelike –– VariousVarious utilitiesutilities builtbuilt onon smblibsmblib „ JavaJava ClientsClients suchsuch asas –– JCIFSJCIFS SomeSome URLsURLs forfor moremore infoinfo

–– SmbclientSmbclient http://www.samba.orghttp://www.samba.org –– JCIFSJCIFS http://http://jcifsjcifs.samba.org.samba.org –– AnotherAnother JCIFS:JCIFS: www.hranitzky.purespace.de/jcifs/jcifs.htmwww.hranitzky.purespace.de/jcifs/jcifs.htm –– jSMBjSMB http://members.nbci.com/harikris_vhttp://members.nbci.com/harikris_v –– SharitySharity light:light: http://www.obdev.at/products/sharity-http://www.obdev.at/products/sharity- lightlight –– LinuxLinux SmbfsSmbfs::http://www.kernel.org/pub/linux/kernel/2.4http://www.kernel.org/pub/linux/kernel/2.4 –– http://http://cvswebcvsweb..netbsdnetbsd.org/.org/bsdwebbsdweb..cgicgi//syssrcsyssrc/sys//sys/smbfssmbfs// 6WDWXV CIFSCIFS StatusStatus

„ Working: – Mount in kernel at NTLM level – Unicode or ASCII – Most Current level of SMBs following CIFS T/R – Open, Read, Write, Release (close) – Get and Set Attributes – Readdir (for small to medium sized directories) – Passes 5 of first 7 tests attempted „ Not implemented: – Chmod (777 returned), Native CIFS ACL/SecurityDescriptors – Locking – Distrubuted caching/oplock, MMAP – Kerberos/SPNEGO Authentication ProblemProblem areasareas

„ Mount does not pass UNC name: – Mount //server/share /mnt –o user=name,pass= „ Hard to detect mandatory vs. advisory locks „ Invalidating mmap pages „ AccessFlags „ Setting Security Descriptors „ Group field of 777 permissions „ Add security code and DNS resolution helper code in kernel „ Efficiency of large scale directory notification „ O_ACCESS flag maps poorly „ Many missing or hard to set flags (such as sparse file, access pattern hints) „ AD Integration, Winbind/NSS integration 1DPH 5HVROXWLRQ KeyKey namename resolutionsresolutions questionsquestions

„ SkippingSkipping NBTNBT (RFC(RFC 10011001 namename service)service) -- "pure"pure TCP"TCP" onon portport 139.139. IsIs alwaysalways ok?ok? „ InIn purepure TCPTCP -- whatwhat shouldshould gogo inin UNCUNC pathpath inin tConXtConX?? IPIP addressaddress worksworks butbut whatwhat aboutabout multiplemultiple serversservers onon oneone port?port? *SMBSERVER*SMBSERVER doesdoes notnot workwork forfor Win2KWin2K „ DoesDoes ipv6ipv6 addressaddress workwork inin tConxtConx path?path? „ AreAre therethere casescases inin whichwhich UnicodeUnicode translationtranslation shouldshould bebe disableddisabled eveneven thoughthough bothboth serverserver andand clientclient supportsupport it?it? (same(same codepagecodepage onon eacheach -- howhow cancan thethe clientclient detectdetect thethe server'sserver's codecode page)page) MoreMore namename resolutionresolution ...... „ IsIs therethere aa recommendrecommend orderorder forfor tryingtrying addressaddress resolution?resolution? BigBig performanceperformance hithit –– ProbablyProbably notnot -- manymany /LinuxUnix/Linux clientsclients maymay bebe satisfiedsatisfied withwith purepure DNSDNS resolutionresolution „ NeedNeed forfor DFSDFS supportsupport inin moremore clientsclients isis aa givengiven butbut ...... –– HowHow dodo youyou locatelocate thethe globalglobal root?root? » DNS lookup of reserved domain specific name » LDAP query (Win2K seems to store it in both) » UDDI query » Local config file (worst case for most) 5HOLDELOLW\ ReliabilityReliability pointspoints toto considerconsider „ ThreeThree bigbig issues:issues: –– LogonLogon reliability,reliability, redundancyredundancy –– RedundancyRedundancy ofof DFSDFS rootroot –– RedundancyRedundancy ofof (especially(especially r/o)r/o) resourcesresources usingusing DFSDFS replicasreplicas –– ReconnectionReconnection afterafter transienttransient TCPTCP sessionsession failurefailure withoutwithout useruser interventionintervention -- transparenttransparent toto anan application?application? –– WhatWhat ifif TGTTGT hashas expired?expired? CanCan KDEKDE oror GNOMEGNOME promptprompt thethe useruser toto reenterreenter thethe passwordpassword oror isis aa locallocal serviceservice necessary?necessary? ReliabilityReliability pointspoints toto considerconsider

„ ItIt isis aa givengiven thatthat clientsclients shouldshould leverageleverage DFSDFS whenwhen availableavailable forfor reconnectionreconnection -- –– ButBut howhow longlong cancan wewe keepkeep infoinfo onon replicasreplicas safelysafely withoutwithout askingasking forfor aa referralreferral again?again? –– WhatWhat orderorder shouldshould clientsclients trytry toto connectconnect toto replicas?replicas? RoundRound robinrobin inin samesame subnetsubnet 1st?1st? –– AndAnd thethe mgmtmgmt APIAPI forfor DFSDFS ((AddRootAddRoot,, etc.)etc.) isis onlyonly partiallypartially implementedimplemented ...... ReliabilityReliability pointspoints toto considerconsider

„ AndAnd howhow dodo wewe handlehandle outout ofof diskdisk spacespace onon write?write? IfIf storagestorage isis replicatedreplicated cancan clientclient initiateinitiate failoverfailover toto replica?replica? ShouldShould serverserver movemove directory,directory, thenthen dropdrop sessionsession thenthen generategenerate referralreferral onon reconnection?reconnection? „ ShouldShould wewe addadd aa newnew returnreturn codecode onon writewrite "ERR_FILE_MOVED""ERR_FILE_MOVED" soso serversservers cancan auto-failoverauto-failover toto anotheranother serverserver thatthat isis lessless full?full? &RPSDWLELOLW\ VFSVFS OperationsOperations

„ RAMFSRAMFS inin LinuxLinux 2.42.4 isis aa goodgood exampleexample ofof aa minimal,minimal, cleanlycleanly writtenwritten filesystemfilesystem „ NetworkNetwork filesystemsfilesystems unfortunatelyunfortunately areare muchmuch moremore complicatedcomplicated forfor LinuxLinux seesee include/include/linuxlinux//fsfs.h.h toto seesee aa completecomplete listlist VFSVFS OperationsOperations (file(file ops)ops)

„ LlseekLlseek „ IoctlIoctl „ ReadRead „ MmapMmap „ WriteWrite „ OpenOpen „ ReaddirReaddir „ FlushFlush „ PollPoll „ ReleaseRelease „ FsyncFsync „ FasyncFasync VFSVFS OperationsOperations (file(file opsops partpart 2)2)

„ LockLock „ ReadvReadv „ WritevWritev „ SendpageSendpage „ Get_unmapped_areaGet_unmapped_area VFSVFS OperationsOperations (part(part 3)3) inodeinode opsops „ CreateCreate „ LookupLookup „ LinkLink „ UnlinkUnlink „ SymlinkSymlink „ MkdirMkdir „ RmdirRmdir „ MknodMknod „ RenameRename „ readlinkreadlink VFSVFS OperationsOperations (part(part 4)4) inodeinode opsops „ Follow_linkFollow_link „ TruncateTruncate „ PermissionPermission „ RevalidateRevalidate „ SetattrSetattr „ GetattrGetattr

„ AndAnd superblocksuperblock operationsoperations – Statfs – Unlockfs – Write_super – Inode_from_fh KeKeyy DataData StructuresStructures

„ SuperblockSuperblock matchesmatches okok (connected(connected toto CIFSCIFS servers)servers) „ InodeInode isis easyeasy matchmatch exceptexcept forfor 777777 permissionspermissions (vs.(vs. SecurityDescriptorsSecurityDescriptors)) andand toto lesserlesser extendextend FileAttributesFileAttributes „ FileFile structurestructure isis okok matchmatch butbut missingmissing AccessFlagsAccessFlags „ DentryDentry structurestructure isis mostlymostly transparenttransparent „ LockLock structurestructure isis reasonablereasonable matchmatch (behavior(behavior differencesdifferences though)though) –– mandatorymandatory vs.vs. advisoryadvisory ErrorError codescodes

„ ListsLists ofof errorerror codescodes byby SMBSMB PDUPDU areare nowherenowhere nearnear completecomplete „ ErrorError mapping,mapping, especiallyespecially fromfrom (large(large listlist of)of) NTNT STATUSSTATUS codescodes toto (smaller)(smaller) errnoerrno.h.h PosixPosix errorserrors isis adhocadhoc „ OutOut ofof bandband alerts,alerts, managementmanagement API,API, CIFSCIFS clientclient MIBMIB etc.etc. isis possibilitypossibility forfor futurefuture documentationdocumentation and/orand/or standardizationstandardization SubtleSubtle OSOS specificspecific featuresfeatures „ UnderdocumentedUnderdocumented andand lessless understoodunderstood featuresfeatures thatthat affectaffect thethe clientclient are:are: –– ReparseReparse pointspoints (e.g.(e.g. copycopy onon writewrite likelike linkslinks usedused byby SISSIS onon Win2KWin2K amongamong others)others) –– SymbolicSymbolic linkslinks –– UnixUnix extensionsextensions cancan bebe usedused –– SparseSparse filefile attributeattribute bitbit –– CompressedCompressed andand encryptedencrypted filefile attributeattribute bitsbits –– ExtendedExtended attributesattributes andand streamsstreams (tend(tend toto bebe application/desktopapplication/desktop specific).specific). StreamStream mechanismmechanism itselfitself isis understoodunderstood though.though. –– EverchangingEverchanging newnew ioctlioctl andand fsctlfsctl callcall andand newnew Transact2Transact2 InfoInfo levelslevels …… 3HUIRUPDQFH PerformancePerformance pointspoints toto considerconsider forfor clientclient „ ForFor commoncommon casescases inin (at(at least)least) aa typicaltypical intranetintranet -- let'slet's showshow (and(and fixfix ifif need)need) –– MaximizeMaximize parallelismparallelism fromfrom eacheach clientclient –– MaximizeMaximize clientclient cachingcaching opportunitiesopportunities –– MinimizeMinimize roundtripsroundtrips fromfrom cltclt toto srvsrv » Command chaining - does it need improvements ala NFS v4? –– MinimizeMinimize protocolprotocol overheadoverhead (frame(frame hdrshdrs)) –– LatencyLatency whenwhen lightlylightly loadedloaded (timers(timers ...)...) –– SessionSession establishmentestablishment andand authauth overheadoverhead PerformancePerformance pointspoints -- whatwhat isis missing?missing? „ ChangeChange openopen typetype -- rerequestrerequest oplockoplock (on(on longlong openedopened file)file) withoutwithout closeclose (Breaking(Breaking newsnews …… ioctlioctl discovereddiscovered thatthat seemsseems toto bebe forfor thisthis purpose)purpose) „ DoDo wewe needneed aa newnew token/token/oplockoplock managermanager toto betterbetter handlehandle replicas?replicas? „ SecondarySecondary sessionsession ((maxvsmaxvs >> 1)1) -- rawVCrawVC oror newnew RDMARDMA secondarysecondary connectionconnection „ ClientClient timing/observationstiming/observations ofof serverserver -- autotuningautotuning read-aheadread-ahead andand oplockoplock/lazy/lazy closeclose behaviorbehavior PerformancePerformance pointspoints -- whatwhat isis missing?missing? „ LotsLots ofof distinctdistinct infrequentlyinfrequently usedused longlong pathspaths areare bigbig problemproblem forfor LinuxLinux „ ...... butbut couldcould bebe toughtough toto skipskip duedue toto referralsreferrals „ VnodeVnode invalidationinvalidation trickytricky onon clientclient duedue toto kernelkernel cachingcaching „ AreAre accessaccess hintshints onon filefile sufficient?sufficient? IsIs somethingsomething similarsimilar forfor directoriesdirectories needed?needed? DataData integrityintegrity inin presencepresence ofof datadata cachedcached byby clientclient kernelkernel „ InvalidatingInvalidating PinnedPinned mmapedmmaped pagespages areare trickytricky invalidate_invalidate_inodeinode_pages2()_pages2() maymay helphelp butbut notnot demonstrateddemonstrated sincesince notnot directlydirectly invokedinvoked byby anyany filesystemsfilesystems inin 2.42.4 „ WhatWhat havinghaving aa distinctdistinct locklock protocolprotocol help?help? „ HowHow toto handlehandle directorydirectory entryentry caching?caching? –– OneOne optionoption isis thethe notifynotify SMBsSMBs Performance issues unproven so ignored by most 6HFXULW\ SecuritySecurity featuresfeatures

„ MinimumMinimum – NTLM authentication „ ButBut wewe shouldshould bebe offeringoffering – Kerberos authentication – NTLMv2 & packet signing – Per file (server decides) signing and encryption (ala NFSv4 proposal) – requires addition of new RC „ TheThe CIFSCIFS andand KerberosKerberos protocolprotocol shouldshould addadd supportsupport forfor – Optional AES encryption (mostly a Kerberos issue) – Optional per-packet privacy or mode widespread support for SSL or dual transport SSL and TCP – Are changes needed for Hardware assist for session establishment SecuritySecurity featuresfeatures

„ ForFor LinuxLinux filesysystemsfilesysystems inin particularparticular aa choicechoice mustmust bebe mademade earlyearly onon –– – Do all users get the same SMB UID when going to the server (relying on local ACLs to prevent users from modifying server data)? – Do you do a new SMB SessSetupX for each new user of the mount point? „ ComplicationsComplications –– wherewhere dodo youyou getget thethe passwordpassword (especially(especially onon implicitimplicit dfsdfs connections)?connections)? DesktopDesktop toto FSFS interactionsinteractions notnot strongstrong suitesuite ofof UnixesUnixes todaytoday „ AndAnd …… whatwhat aboutabout passwordpassword duringduring reconnect?reconnect? StoreStore inin cifscifs fsfs?? 6RPH LQWHUHVWLQJ VHFXULW\ UHIHUHQFHV ForFor moremore informationinformation -- SecuritySecurity „ KerberosKerberos andand interoperableinteroperable authenticationauthentication –– ProjectProject PismerePismere/MIT/MIT http://web.http://web.mitmit..eduedu//pismerepismere/M/M „ MITMIT kerberoskerberos -- –– http://web.http://web.mitmit..eduedu//kerberoskerberos/www//www/ „ HeimdalHeimdal KerberosKerberos implementationimplementation –– http://www.http://www.pdcpdc..kthkth.se/.se/heimdalheimdal/L/L „ LukeLuke Leighton'sLeighton's BookBook onon DCE/RPCDCE/RPC && SMBSMB –– '&(53&RYHU60%6DPEDDQG'&(53&RYHU60%6DPEDDQG :LQGRZV17'RPDLQ,QWHUQDOV:LQGRZV17'RPDLQ,QWHUQDOV ForFor moremore informationinformation -- SecuritySecurity „ WesterlundWesterlund,, AA andand DanielssonDanielsson,, J,J, ""HeimdalHeimdal andand WindowsWindows 20002000 KerberosKerberos ---- HowHow toto getget themthem toto playplay together",together", USENIXUSENIX 20012001 „ Swift,Swift, MM andand BrezakBrezak,, J,J, "The"The WindowsWindows 20002000 RC4-HMACRC4-HMAC KerberosKerberos EncryptionEncryption Type"Type" WorkWork inin progress,progress, draft-draft-brezakbrezak-- win2k-win2k-krbkrb-rc4--rc4-hmachmac-02.txt-02.txt 2SWLRQVIRU WKH)XWXUH $VVLVWDQFH :HOFRPH