UNIX Network Programming
Overview of STREAMS
Douglas C. Schmidt

STREAMS Overview

- STREAMS is a flexible communication subsystem framework
  - Originally developed by Dennis Ritchie for Research UNIX

- STREAMS provides a uniform infrastructure for developing and
  configuring character-based I/O
  - e.g., networks, terminals, local IPC

- STREAMS supports the addition and removal of processing components
  at installation-time or run-time
  - Via user-level or kernel-level commands
STREAMS Overview (cont'd)

- The STREAMS paradigm is data-driven, not demand-driven
  - i.e., asynchronous in the kernel, yet synchronous at the
    application

- Supports both immediate and deferred processing

- STREAMS supports dynamic "service substitution" controlled by
  user-level commands

STREAMS Benefits

- STREAMS provides an integrated environment for developing
  kernel-resident networking services

- STREAMS promotes definition of standard service interfaces
  - e.g., TPI and DLPI

- Internally, data are transferred by passing pointers to messages
  - Goal is to reduce memory-to-memory copying overhead
STREAMS Benefits (cont'd)

- Message-based interfaces enable off-board protocol migration

- Permits layered and de-layered multiplexing

- More recent implementations take advantage of parallelism in the
  operating system and hardware

A Simple Stream

    fd = open ("/dev/dev0");

[Figure: an application process calls open("/dev/dev0"). Below the
user/kernel boundary, the Stream Head's write and read queues connect
via q_next to the "dev0" STREAM driver's write and read queues; the
write side is DOWNSTREAM, the read side is UPSTREAM.]
The Stream Head

- A "stream head" exports a uniform service interface to other layers
  of the UNIX kernel
  - Including the application "layer" running in user-space

- General stream head services include
  1. Queueing
     - Provides a synchronous interface to asynchronous STREAM devices
  2. Datagram- and stream-oriented data transfer
  3. Segmentation and reassembly of messages
  4. Event propagation
     - i.e., signals

A Module on a Stream

    ioctl (fd, I_PUSH, "modx");

[Figure: after the I_PUSH ioctl, the "modx" module's write and read
queues are inserted between the Stream Head's queues and the "dev0"
STREAM driver's queues.]
The Stream Head (cont'd)

- Stream head operations include

  - stropen
    - Called from the file system layer to open a Stream

  - strclean
    - Called from the file system layer to remove event cells from
      the Stream Head when a file is closed

  - strclose
    - Called from the file system layer to dismantle a Stream

  - strread
    - Called from the file system layer to retrieve data messages
      coming upstream

[Figure: the Stream Head's stdata_t structure holds sd_wrq (the write
queue), sd_pollist, and sd_siglist (a list of strevent cells). Its
queue_t pair (strwsrv()/strrput()) links via q_next to the
intermediate STREAM modules and the STREAM driver.]
Stream Head operations (cont'd)

  - strwrite
    - Called from the file system layer to send data messages
      downstream

  - strioctl
    - Called from the file system layer to perform control operations

  - strgetmsg
    - Called from the system call layer to get a protocol or data
      message coming upstream

  - strputmsg
    - Called from the system call layer to send a protocol or data
      message downstream

  - strpoll
    - Called from the file system layer to check if pollable events
      are satisfied

Messages

- In STREAMS, all information is exchanged via messages
  - i.e., both data and control messages of various priorities

- A multi-component message structure is used to reduce the overhead of
  1. Memory-to-memory copying
     - i.e., via "reference counting"
  2. Encapsulation/de-encapsulation
     - i.e., via "composite messages"

- Messages may be queued at STREAM modules
Message Structure

[Figure: an mblk_t (b_next, b_prev, b_cont, b_band, b_rptr, b_wptr,
b_datap) points to a dblk_t (db_base, db_lim, db_type, db_ref), which
describes a dynamically allocated data buffer.]

Composite Message

[Figure: several mblk_t headers chained through b_cont form one
logical message on a queue_t; each header's b_datap references its
own dblk_t and data buffer.]
STREAMS Message Types

    M_DATA     user data
    M_PROTO    protocol information
    M_PASSFP   pass file pointer
    M_IOCTL    user ioctl request
    M_BREAK    request line break
    M_SIG      signal process group
    M_DELAY    request transmit delay
    M_CTL      module-specific control message
    M_SETOPTS  set Stream head options
    M_RSE      reserved for RSE use

- Normal priority messages
  - M_DATA, M_PROTO, M_PASSFP, and M_IOCTL may be generated from
    user-level
  - Typically subject to flow control

Message Buffer Sharing

[Figure: two mblk_t headers, each with its own b_rptr and b_wptr,
share a single dblk_t and data buffer; the shared dblk_t records
db_ref = 2.]
STREAMS Message Types (cont'd)

    M_PCPROTO  protocol information
    M_FLUSH    flush queues
    M_IOCACK   acknowledge ioctl request
    M_IOCNAK   fail ioctl request
    M_COPYIN   request to copyin ioctl data
    M_COPYOUT  request to copyout ioctl data
    M_IOCDATA  reply to M_COPYIN and M_COPYOUT
    M_PCSIG    signal process group
    M_READ     read notification
    M_HANGUP   line disconnect
    M_ERROR    fatal error
    M_STOP     stop output immediately
    M_START    restart output
    M_STOPI    stop input immediately
    M_STARTI   restart input
    M_PCRSE    reserved for RSE use

- High priority messages
  - Typically not flow controlled
  - M_PCPROTO may be generated from user-level
  - Others are passed between STREAM components

Queue Structure

[Figure: each queue_t (q_first, q_last, q_count, q_hiwat, q_lowat,
q_minpsz, q_maxpsz, q_flag, q_link, q_ptr) links to its neighbor via
q_next and to its qband list (qb_first, qb_last, qb_count, qb_hiwat,
qb_lowat, qb_next). q_qinfo points to the module's qinit -- open(),
close(), put(), service() -- which references module_info and
module_stat. The Stream Head's write/read queue pair connects
downstream to the STREAM driver's pair.]
Queue Structure (cont'd)

- queue_t
  - Primary data structure
    - Each module contains a write queue and a read queue
  - Stores information on flow control, scheduling, max/min interface
    sizes, linked messages, private data

- qinit
  - Contains put, service, open, close subroutines

- qband
  - Contains information on each additional message band > 0 and < 256

- module_info
  - Stores default flow control information

Queue and Message Linkage

[Figure: a queue_t's q_band/q_nband fields lead to a chain of qband
structures (one per band, linked via qb_next). The queue's q_first
and q_last span the whole message list, ordered high-priority
messages first, then priority band 2, then band 1, then normal
(band 0) messages, doubly linked via b_next and b_prev; each band's
qb_first and qb_last bracket that band's messages.]
Queue Subroutines

- Four standard subroutines are associated with queues in a module or
  driver, e.g.,

  - put (queue_t *q, mblk_t *mp)
    - Performs immediate processing
    - Supports synchronous communication services
      - i.e., further queue processing is blocked until put returns

  - service (queue_t *q)
    - Performs deferred processing
    - Supports asynchronous communication services
      - Uses the message queue available in a queue
    - Runs as a "weightless" process...

Queue Subroutines (cont'd)

  - open (queue_t *q, dev_t *devp, int oflag, int sflag,
          cred_t *cred_p)
    - Called when a Stream is first opened and on any subsequent opens
    - Passed a pointer to the new read queue
    - Also called any time a module is "pushed" onto the Stream

  - close (dev_t dev, int flag, int otyp, cred_t *cred_p)
    - Called when the last reference to a Stream is closed
    - Also called when a module is "popped" off the Stream
Queue Flow Control

[Figure: a typical non-multiplexed example. write() sleeps until the
Stream is "back-enabled"; there is typically no queueing on the
Stream Head's write queue. Back-pressure propagates upstream from a
STREAM module whose write queue has q_flag & QFULL set, through the
q_next links, toward the Stream Head; below the module sits the
STREAM driver's write queue.]

- Flow control is more complex with multiplexers and concurrency

Flow Control and the service Procedure

- Typical service example:

    int service (queue_t *q)
    {
        mblk_t *mp;

        while ((mp = getq (q)) != 0) {
            if (queclass (mp) == QPCTL || canputnext (q)) {
                /* Process message */
                putnext (q, mp);
            }
            else {
                putbq (q, mp);
                return 0;
            }
        }
        return 0;
    }

- put and service work together to support advisory flow control
Flow Control and the put Procedure

- Typical put example:

    int put (queue_t *q, mblk_t *mp)
    {
        if (queclass (mp) == QPCTL || canputnext (q)) {
            /* Process message */
            putnext (q, mp);
        }
        else
            putq (q, mp);  /* Enables service routine */
        return 0;
    }

Flow Control and the canput Procedure

- canputnext() is used by put and service routines to test advisory
  flow control conditions, e.g., (in pseudocode)

    int canputnext (queue_t *q)
    {
        find closest queue with a service procedure;
        if (queue is full) {
            set flag for "back-enabling";
            return 0;
        }
        return 1;
    }

- Note that non-MP systems may use canput...
putq

- The int putq (queue_t *, mblk_t *) function enqueues a message on a
  queue
  - It is typically called by a queue's put procedure when it can no
    longer proceed...

- putq automatically handles
  1. Priority-band allocation
  2. Priority-band message insertion
  3. Flow control

- Enqueueing a high priority message automatically schedules the
  queue's service procedure to run at some point
  - Differs on MP vs. non-MP systems

getq

- The mblk_t *getq (queue_t *) function dequeues a message from a
  queue
  - It is typically called by a queue's service procedure

- Messages are dequeued in priority order
  - i.e., higher priority messages are stored first in the queue!

- getq handles
  1. Flow control
  2. Back-enabling

- getq returns 0 when there are no available messages
Multiplexing

- STREAMS provides rudimentary support for multiplexing via
  multiplexor drivers
  - Unlike modules, multiplexors occupy a file-system node that can
    be "opened"
    - Rather than "pushed"

- Multiplexors may contain one or more upper and/or lower connections
  to other STREAM modules and/or multiplexors
  - Enables support for layered network protocol suites
    - e.g., "/dev/tcp"

- Note there is no automated support for propagating flow control
  across multiplexors

Multiplexor Links (before)

    fd1 = open ("/dev/ip");
    fd2 = open ("/dev/eth");

[Figure: before linking, the application holds two independent
Streams: one from a Stream Head down to the "/dev/ip" multiplexor
driver, and one from a second Stream Head down to the "/dev/eth"
driver.]
Multiplexor Links (after)

    ioctl (fd1, I_LINK, fd2);

[Figure: after I_LINK, the "/dev/eth" Stream Head is disabled and its
queues are "borrowed" by the "/dev/ip" multiplexor, which now sits
between its upper queues (still attached to the application's Stream
Head) and the lower queues leading down to the "/dev/eth" driver.]

Internetworking Multiplexor

[Figure: several applications open upper Streams onto an
internetworking multiplexor (e.g., IP); its lower multiplexor
connections fan out to multiple STREAM drivers for the network
interfaces.]
Driver and Module Data Structure Linkage

[Figure: for a STREAM driver, the cb_ops d_str field points to its
streamtab; for a STREAM module, the fmodsw f_str field does; the
Stream Head (strdata) reaches its streamtab via sd_strtab. Each
streamtab's st_rdinit and st_wrinit fields point to the read and
write qinit structures referenced by the queue_t pair's q_qinfo, and
each qinit's qi_minfo points to the module_info.]
Pipes and FIFOs

- In SVR4 and Solaris, the traditional local IPC mechanisms such as
  pipes and FIFOs have been reimplemented using STREAMS

- This has broadened the semantics of pipes and FIFOs in the
  following ways:

  1. Pipes are now bidirectional STREAM pipes
  2. Pipes and FIFOs now support both bytestream-oriented and
     message-oriented communication
  3. Pipes can be explicitly named and exported into the file system
  4. Pipes and FIFOs can be extended by having modules pushed onto
     them

- ioctls exist to modify the behavior of STREAM descriptors to enable
  or disable many of the new features

Pipes and FIFOs (cont'd)

STREAM pipe:

    int fds[2];
    pipe (fds);
    fork ();

[Figure: a STREAM pipe joins two Stream Heads back to back; each
head's write queue feeds the other head's read queue, so both
processes can read and write.]

FIFO:

    mkfifo ("/tmp/fifo");
    open ("/tmp/fifo", 0);   /* reader */
    open ("/tmp/fifo", 1);   /* writer */

[Figure: a FIFO is a single Stream Head whose write queue loops back
to its own read queue; the writer's data is read by the reader
through the same head.]
Mounted Streams and CONNLD

- In earlier generations of UNIX, pipes were restricted to allowing
  multiplexed communication
  - This is overly complex for many client/server-style applications

- SVR4 UNIX and Solaris provide a mechanism known as "Mounted
  Streams" that permits non-multiplexed communication
  - This provides semantics similar to UNIX domain sockets
  - However, Mounted Streams are more flexible since they incorporate
    other features of STREAMS

Mounted Streams and CONNLD (cont'd)

Server:

    int fds[2];
    pipe (fds);
    fattach (fds[1], "/tmp/server_x");
    ioctl (fds[1], I_PUSH, "connld");
    ioctl (fds[0], I_RECVFD, &recv_fd);

Client:

    open ("/tmp/server_x", O_RDWR);

[Figure: the server mounts one end of a pipe at "/tmp/server_x" and
pushes the connld module onto it. When the client opens the mounted
Stream, connld creates a fresh pipe and passes its descriptor to the
server via I_RECVFD; client and server then communicate over that
private pipe with read()/write().]
Layered Multiplexing

[Figure: several applications' Stream Heads feed an upper multiplexor
(e.g., transport), which feeds a lower multiplexor (e.g., network)
atop a single STREAM driver and network interface.]

Layered Multiplexing (cont'd)

- Advantages
  - Share resources such as control blocks
  - Supports standard OSI and Internet layering models

- Disadvantages
  - More processing involved to demultiplex in deep protocol stacks
  - May be more difficult to parallelize due to locks and shared
    resource contention
  - Hard to propagate flow control and QOS info across the muxer
De-layered Multiplexing

[Figure: each application's Stream Head sits atop its own full column
of STREAM modules, all converging only at a single multiplexor just
above the STREAM driver and network interface.]

De-layered Multiplexing (cont'd)

- Advantages
  - Less processing for deep protocol stacks
  - Potentially increased parallelism due to less contention and
    locking required
  - Easier to propagate flow control and QOS info

- Disadvantages
  - Violates layering, e.g., need a packet filter
  - Replicates resources such as control blocks
STREAM Concurrency

- Modern versions of STREAMS support multi-processing
  - Since modern UNIX systems have multi-threaded kernels

- Different levels of concurrency support include
  1. Fine-grain
     - Queue-level
     - Queue-pair-level
  2. Coarse-grain
     - Module-level
     - Module-class-level
     - Stream-level

- Note, developers must use kernel locking primitives to provide
  mutual exclusion and synchronization

Concurrency Alternatives: Layer Parallelism

[Figure: two connections C1 and C2 share one stack; each layer
(LP_XDR::svc, LP_TCP::put, LP_DLP::svc, with separate incoming and
outgoing halves) runs as its own process/thread between the Stream
Heads and the Stream Tail at the network devices or pseudo-devices.]
Concurrency Alternatives: Connectional Parallelism

[Figure: connections C1 and C2 each get their own column of modules
(CP_XDR::put, CP_TCP::put, CP_IP::put, CP_DLP::svc, with incoming and
outgoing halves), so each connection is processed end-to-end by its
own process/thread between the Stream Heads and the Stream Tail at
the network devices or pseudo-devices.]
Concurrency Alternatives: Message Parallelism

[Figure: a single stack of modules (MP_XDR::put, MP_UDP::put,
MP_DLP::svc, with incoming and outgoing halves) in which each message
is handled by its own process/thread as it moves between the Stream
Heads and the Stream Tail at the network devices or pseudo-devices.]
STREAMS Evaluation

- Advantages
  - Portability, availability, stability
  - Kernel-mode efficiency

- Disadvantages
  - Stateless process architecture
    - i.e., cannot block!
  - Lack of certain support tools
    - e.g., standard demultiplexing mechanisms
  - Kernel-level development environment
    - Limited debugging support...
  - Lack of real-time scheduling for STREAMS processing
    - Timers may be used for "isochronous" service
Summary

- STREAMS provides a flexible communication framework that supports
  dynamic configuration and reconfiguration of protocol functionality

- Module interfaces are well-defined and reasonably well-documented

- Support for multi-processing exists in Solaris 2.x