BSD Portals for LINUX 2.0
A. David McNab†
NAS Technical Report NAS-99-XXX
NASA Ames Research Center
Mail Stop 258-6
Moffett Field, CA 94035-1000
mcnab@nas.nasa.gov

Abstract

Portals, an experimental feature of 4.4BSD, extend the filesystem namespace by exporting certain open() requests to a user-space daemon. A portal daemon is mounted into the file namespace as if it were a standard filesystem. When the kernel resolves a pathname and encounters a portal mount point, the remainder of the path is passed to the portal daemon. Depending on the portal "pathname" and the daemon's configuration, some type of open(2) is performed. The resulting file descriptor is passed back to the kernel, which eventually returns it to the user, to whom it appears that a "normal" open has occurred. A proxy portalfs filesystem is responsible for kernel interaction with the daemon. The overall effect is that the portal daemon performs an open(2) on behalf of the kernel, possibly hiding substantial complexity from the calling process. One particularly useful application is implementing a connection service that allows simple scripts to open network sockets. This paper describes the implementation of portals for LINUX 2.0.

†David McNab is an employee of MRJ Technology Solutions, Inc.

1/ INTRODUCTION

One of the fundamental design commitments of UNIX is that all I/O devices can be represented as simple files in a hierarchical namespace. For example, disk devices can be addressed as long arrays of disk blocks through a device mapped into the filesystem. As networking became more integrated into the UNIX environment, it was natural to seek to extend this model to the network.

Programmatically, sockets are represented as file descriptors just as regular files are. However, they require considerably different and more complex handling. For example, to open a TCP socket to a particular port, one must first create the socket, then build a structure describing the relevant network address, then connect(2) the socket to the address, and finally begin exchanging data. In a flexible programming language with full access to the UNIX API this is not terribly complicated, and in most cases it can be automated by writing a more abstract tcp_open() function that hides the details. But other programming environments do not have access to the full UNIX API. For example, the standard UNIX shells and tools like awk(1) can open and close files but present no interface to the necessary socket system calls.

Thus it is desirable, at the very least on an experimental basis, to explore the possibility of mapping the set of possible network connections into the filesystem. This would allow any process capable of the simplest of UNIX file I/O operations to open a network connection and read or write to it. It also renders the use of detail-hiding helper functions, as described above, unnecessary, since a programmer can simply open a standard filesystem pathname to gain access to the network (although in some cases this will not provide the necessary flexibility). Finally, there is a certain aesthetic appeal in demonstrating that the elegant UNIX model can be extended to support a type of I/O for which it was not explicitly designed.

In 4.4BSD, an experimental feature called portals was introduced to meet these types of requirements. The work described here is a transfer of the portal concept to the LINUX 2.0 environment.

2/ OVERVIEW OF PORTALS

In 1985 Presotto and Ritchie described a connection server[4]: a service process that accepts requests to open connections of some sort, whether they be to a network resource or instructions to dial a modem and connect to a service provider, and then processes them on behalf of its clients. This has the desirable effect of hiding the details of the opening procedure from the client.

After a connection server does its work, it must have some mechanism by which it can transfer the open connection to the client. The infrastructure to support this was first introduced in 4.2BSD and is now implemented in almost all UNIX systems. The BSD UNIX sendmsg(2) and recvmsg(2) system calls can be used to pass open file descriptors across UNIX domain sockets. Thus a connection server can receive a request from a client, open the appropriate connection, then use sendmsg() to pass back the file descriptor. The client accepts it with recvmsg(), and the file descriptor can then be used to access the underlying connection. Conceptually, the server's file descriptor is a handle for the kernel data structures representing the connection. When the handle is passed to another process, the kernel structures remain unchanged but the recipient gains a reference to them.

4.4BSD portals use essentially the same mechanism, although the problem is complicated because in order to map the connection service into the filespace a kernel filesystem implementation is required. A portal daemon provides the connection service. This is a user-space background process that accepts connection requests, does the appropriate work, and returns an open file descriptor using sendmsg(). The kernel generates connection requests based on the pathname used to access a portal-space "file".

The mapping of "connection space" into the file namespace is performed by a filesystem called portalfs. Conceptually, portalfs is simple. The point at which the portal namespace is mounted indicates a border between the "normal" file namespace and the connection namespace. As the kernel processes open() requests it gradually resolves the pathname passed as an argument of open(). If a portal mount point is encountered, the resolution process stops. The portion of the pathname after the mount point is passed to the portal daemon. The daemon interprets the path as a network connection specification, compares it with its configuration file to see whether the type of connection is supported, and then executes the open() on behalf of the kernel. The open file descriptor is passed back using sendmsg(), and the kernel effectively calls recvmsg() to extract the file descriptor. All of the communication between kernel and daemon process takes place over UNIX domain sockets, using a simple protocol. portalfs is responsible for accepting the unresolved portion of the pathname from the kernel, brokering the exchange with the daemon process, and returning the new file descriptor to the process that called open().

Consider an example. We mount a portal daemon into the file namespace at /p, and a user subsequently opens /p/tcp/foo.com/1966. The kernel begins translating the pathname, but when it encounters /p it determines that a portal mount point has been crossed. The remainder of the pathname, tcp/foo.com/1966, is passed by portalfs to the portal daemon. This is interpreted as a request to open a TCP connection to host foo.com, accessing network port 1966. The daemon builds an address structure and performs the necessary socket() and connect() calls to set up the connection, then sends back the open file descriptor. Eventually the descriptor is passed back to the client process, which can now use it to access port 1966 on foo.com.

The implementation of 4.4BSD portals is described in detail by Pendry and Stevens[2]. The source code is also available via the freely available BSD implementations NetBSD and FreeBSD, as well as others.

3/ DESIGN AND IMPLEMENTATION

The primary design goal was to provide portal service and support the 4.4BSD portal daemon without substantial modification. [...] that some minor modifications were necessary, but they consist of a handful of additional lines of code in two kernel modules.

Another secondary goal was to ensure that the portal code did not introduce any additional instability to the LINUX kernel, and that it could not cause user processes to "lock up", in other words to sleep uninterruptibly while accessing portals.

3.1/ THE PORTAL DAEMON

The 4.4BSD portal daemon is implemented as a child process of the command that mounts a portal filesystem. This program, mount_portal(8), typically runs as part of the boot process. Its first action is to open a socket that will be used to listen for kernel requests. This listen socket is recorded as one of the portalfs-specific arguments to mount(2), which is called to graft portalfs into the file namespace. If the mount succeeds, the program spawns a copy of itself which becomes the portal daemon. This daemon runs in the background for the remainder of the portal's lifetime, accepting incoming requests and spawning a copy of itself to process each of them. After spawning the daemon the parent exits, so that the mount command behaves synchronously as one would expect.

The mount_portal program used for the LINUX port was taken from release 1.3.1 of NetBSD. It supports two types of portals: tcp and fs. The tcp type maps pathnames of the form tcp/<host>/<port> into network connections, where <host> can be a hostname or an IP address and <port> is a port number. The fs-style portal simply re-maps pathnames of the form fs/<path> to <path>. It is intended to be used to support controlled egress from a chroot(8) environment. An extended version of the portal daemon could support filesystem extensions, for example access control lists. The NetBSD