3/3/2021
The Triumph of the New Jersey Style: How Architectural Elegance, Components, and Reuse Shaped the Evolution of Unix Diomidis Spinellis Paris Avgeriou Department of Management Science and Technology Institute for Mathematics and Computer Science Athens University of Economics and Business University of Groningen & Department of Software Technology Delft University of Technology
www.spinellis.gr @CoolSWEng
1
Unix 50th Anniversary 1969–2019
2
1
3/3/2021 Faces of Faces of Open Source / Peter Adams
3
4
2 3/3/2021
5
6
3 3/3/2021
7
8
4 3/3/2021
9
10
5 3/3/2021
11
12
6
3/3/2021 History
13 Data Sources Data
14
7
3/3/2021
Architectural Design Design Decisions
15
Results Quantitative
16
8
3/3/2021
Theory of OS of Theory Architectural Evolution Architectural
17 History
18
9 3/3/2021
19
20
10 3/3/2021
21
22
11 3/3/2021
23
24
12 3/3/2021
25
26
13 3/3/2021
27
28
14 3/3/2021
29
30
15 3/3/2021
31
32
16 3/3/2021
33
34
17 3/3/2021
35 Data Sources Data
36
18 3/3/2021
37
38
19 3/3/2021
39
40
20 3/3/2021
41
42
21 3/3/2021
github.com/dspinellis/unix-history-repo/
43
44
22 3/3/2021
45
46
23 3/3/2021
47
48
24 3/3/2021
49
50
25 3/3/2021
51
dspinellis.github.io/unix-history-man
52
26 3/3/2021
53
54
27
3/3/2021
Architectural Design Design Decisions
55
56
28 3/3/2021
PDP-7 [Unix] (1970)
57
… and I once heard an old-timer growl at a young programmer:
“I've written boot loaders that were shorter than your variable names!”
— Stephen C. Johnson
58
29 3/3/2021
Layering and Partitioning
adm.s cat.s dskio.s init.s s6.s ald.s check.s dskres.s lcase.b s7.s apr.s chmod.s dsksav.s maksys.s s8.s as.s chown.s ds.s s1.s s9.s bc.s chrm.s dsw.s s2.s scope.v bi.s cp.s ed1.s s3.s sop.s bl.s db.s ed2.s s4.s trysys.s cas.s dmabs.s ind.b s5.s
59
Process Management (fork)
60
30 3/3/2021
Descriptor Management
61
Separation of File Metadata from File Naming
62
31 3/3/2021
Devices as Files
63
File I/O API
• open • read • write • seek • tell • close
64
32 3/3/2021
File System API
• creat • rename • link • unlink
65
Interpreter
66
33 3/3/2021
67
First Research Edition (Nov 1971)
68
34 3/3/2021
First Edition Unix 1972 FreeBSD 11.1 2018
69
The Shell as a User Program
70
35 3/3/2021
Abstraction of Standard I/O
71
Interoperability through Documented File Formats
72
36 3/3/2021
Format Description Clients
a.out Assembler and linker output as, ld, strip, nm, un
Archive Object code libraries ar, ld
Core Crashed program image Kernel, db
Directory File system directories du, find, ls, ln, mkdir, rmdir
File system File system format check, dump,* mkfs, restor*
Ident GECOD ident card format opr
Password User accounts and passwords chown, find, getpw,* login,* ls, passwd*
Tape* DECtape file format mt,* tap*
Uid User identifier to name map chown
utmp Logged in users init, login,* who,* write*
wtmp* Users login history acct, date, init, login, tacct, who
73
User-Contributed Tools and Games
74
37 3/3/2021
Tree Directory Structure
• mkdir(II) • chdir(I) • chdir(II) • find(I) • ln(I) • ls(I) • stat(I) • mkdir(I) • mv(I) • rm(I) • rmdir(I)
75
Mountable File System Interface
• mount(II) • mount(I) • umount(II) • umount(I)
76
38 3/3/2021
Second Research Edition (Jun 1972)
• Software Library
77
Third Research Edition (Feb 1973)
78
39 3/3/2021
Pipes and Filters
79
Fourth Research Edition (Nov 1973)
80
40 3/3/2021
Structured Programming
• Kernel implemented in “New B” • 6373 lines New B • 768 lines PDP-11 assembly • 105 functions + 50 assembly symbols • vs 248 global symbols in the First Edition
81
Language-Independent API
82
41 3/3/2021
Data Structure Definition & Reuse
buf.h filsys.h proc.h text.h conf.h inode.h reg.h tty.h file.h param.h systm.h user.h
83
Dynamic Resource Management
int coremap[CMAPSIZ]; int swapmap[SMAPSIZ];
struct map { char *m_size; char *m_addr; };
malloc(mp, size) struct map *mp; { …
84
42 3/3/2021
Device Driver Abstraction
85
Driver Interface
struct { int (*d_open)(); int (*d_close)(); int (*d_strategy)(); } bdevsw[];
struct { int (*d_open)(); int (*d_close)(); int (*d_read)(); int (*d_write)(); int (*d_sgtty)(); } cdevsw[];
86
43 3/3/2021
Buffer Cache
Fourth Edition FreeBSD 11.1 #define B_READ 01 #define B_ASYNC 0x00000004 #define B_DONE 02 /* Start I/O, do not wait. #define B_ERROR 04 */ #define B_BUSY 010 […] #define B_XMEM 060 #define B_DELWRI 0x00000080 #define B_WANTED 0100 /* Delay I/O until buffer #define B_RELOC 0200 reused. */ #define B_ASYNC 0400 #define B_DONE 0x00000200 #define B_DELWRI 01000 /* I/O completed. */
87
Fifth Research Edition (Jun 1974) • Command Files chdir /usr/source/s3 cc -c ctime.c ar r /lib/liba.a ctime.o rm ctime.o chdir /usr/source/s1 cc -s -n date.c cp a.out /bin/date cc -s -n dump.c cp a.out /bin/dump cc -s -n ls.c cp a.out /bin/ls rm a.out
88
44 3/3/2021
Sixth Research Edition (May 1975)
89
Portable C Library
alloc.c clenf.c makbuf.c scan1.c calloc.c copen.c maktab.c scan2.c cclose.c cputc.c nexch.c scan3.c ceof.c cwrd.c nodig.c system.c cerror.c dummy.s printf.c tmpnam.c cexit.c ftoa.c putch.c unget.c cflush.c getch.c puts.c unprnt.s cfree.c gets.c relvec.c wdleng.c cgetc.c getvec.c revput.c ciodec.c iehzap.c run
90
45 3/3/2021
Seventh Research Edition (Jan 1979)
91
Unix as a Virtual Machine
Also, about this time [1973] I had a fateful discussion with Dennis, in which he said:
“I think it may be easier to port Unix to a new piece of hardware than to port a complex application from Unix to a new OS” — Steve Johnson
92
46 3/3/2021
Dynamic User Memory Allocation
• malloc(3), free(3) – Used by 26 programs: awk cc col cron dc dcheck diff ed eqn expr graph icheck learn ls m4 neqn nm quot ratfor spline struct tar tsort uucp xsend quiz • stdio(3), mp(3)
93
@ CoolSWEng
94
47 3/3/2021
Environment Variables
• KEY=value • Kernel • Shell • C Library
95
Language Development Tools
• lex(1) • yacc(1) • 12 clients: awk bc cpp egrep eqn lex m4 make pcc neqn struct •
96
48 3/3/2021
Domain-Specific Languages
• sh • awk • sed • find • expr • egrep • m4 • make
97
Filesystem Directory Hierarchy
98
49 3/3/2021
99
First and Second Berkeley Software Distributions (1978) • Software Packages – csh – ex – Mail – Pascal – termlib
100
50 3/3/2021
3BSD (1979)
• Virtual Memory Paging – vm_*.c – 2808 out of 16039 C source code – 17% of kernel source code – vread(2), vwrite(2), vfork(2)
101
4BSD (Oct 1980)
102
51 3/3/2021
Regular Expression Library: regex(3)
– 5 implementations: awk, sed, ed, grep, expr – 1 client: more(1) – 2 more by 4.3: dbx(1), rdist(1) – 4 replacements in FreeBSD: ed, grep, sed, expr
103
Optimized Screen Handling
– curses(3) – termcap(5)
104
52 3/3/2021
4.2BSD (Sep 1983)
• Internet Protocol Family – ARP, IP, TCP, UDP, ICMP • Local and Remote Interprocess Communication – socket(2), etc. • Network and User Database Access – getfsent(3x), getgrent(3), gethostent(3n), getnetent(3n), getprotoent(3n), getpwent(3), getservent(3n) • Pseudo-Terminal Driver – pty(4)
105
• 2 library functions – rcmd(3x) System call Uses – rexec(3x) • 11 system daemons bind 23 – comsat(8c) connect 15 – ftpd(8c) – gettable(8c) accept 13 – implogd(8c) – rexecd(8c) select 12 – rlogind(8c) listen 11 – af(8c) – rshd(8c) sendto 10 – rwhod(8c) – telnetd(8c) shutdown 9 – tftpd(8c) recvfrom 8 • 8 user-mode programs – ftp(1c) getsockname 6 – rlogin(1c) recv 2 – rsh(1c) – talk(1c) send 2 – telnet(1c) sendmsg 1 – tftp(1c) – whois(1c) getsockopt 0 – sendmail(1c) recvmsg 0 socketpair 0
106
53 3/3/2021
4.3BSD Tahoe (Jun 1988)
• Multiple CPU Architecture Support Tahoe 19% – VAX – CCI Power 6/32 (Tahoe) Machine independent 33%
• Timezone Handling – Community contribution – Separation of timezone rules from code VAX 48%
107
4.3BSD Reno (Jun 1990)
• Kernel Packet Forwarding Database – route(4) /* – routed(8), XNSrouted(8) * Operations on vnodes. */ struct vnodeops { • Virtual Filesystem Interface int (*vn_lookup)( /* ndp */ ); int (*vn_create)( /* ndp, fflags, vap, cred */ ); int (*vn_mknod)( /* ndp, vap, cred */ ); int (*vn_open)( /* vp, fflags, cred */ ); int (*vn_close)( /* vp, fflags, cred */ ); int (*vn_access)( /* vp, fflags, cred */ ); int (*vn_getattr)( /* vp, vap, cred */ ); int (*vn_setattr)( /* vp, vap, cred */ );
int (*vn_read)( /* vp, uiop, offp, ioflag, cred */ ); int (*vn_write)( /* vp, uiop, offp, ioflag, cred */ ); int (*vn_ioctl)( /* vp, com, data, fflag, cred */ ); int (*vn_select)( /* vp, which, cred */ ); int (*vn_mmap)( /* vp, ..., cred */ ); int (*vn_fsync)( /* vp, fflags, cred */ ); int (*vn_seek)( /* vp, (old)offp, off, whence */ ); ….
108
54 3/3/2021
4.3BSD Net/2 (Jun 1991)
• Stream I/O Functions – funopen(3) – GNU funopencookie(3) added in FreeBSD 11
109
4.4BSD (Jun 1994)
• Stackable Filesystems – mount_null(8) – mount_union(8) • Generic System Control Interface (MIB) – sysctl(1) – sysctl(3) – sysctl(9)
110
55 3/3/2021
111
386BSD Patch Kit (1992-1993)
• Organized Community Contributions – From open source software … – … to an open source project • Patch metadata – title – author – description – prerequisites
112
56 3/3/2021
113
FreeBSD 1.1 (May 1994)
• Package Manager – Patch – Compile – Install – Uninstall – Handling of dependencies
114
57 3/3/2021
FreeBSD 2.0 (Nov 1994)
• Process Filesystem – procfs(5) • Dynamically Loadable Kernel Modules – lkm(4), then kld(4) – device drivers – file systems – emulators – system calls – 992 modules in FreeBSD 11.1
115
FreeBSD 2.1 (Nov 1995)
• Linux Emulation • Packet Capture Library
116
58 3/3/2021
FreeBSD 3.0 (Jan 1999)
• Common Access Method I/O Subsystem (CAM)
117
FreeBSD 3.4 (Dec 1999)
• Graph-based Kernel Networking and User Library (NETGRAPH)
118
59 3/3/2021
FreeBSD 4.0 (Mar 2000)
• OpenSSL Secure Sockets Layer and Transport Layer Security framework – Version 0.9.4 – 1127 files, 227118 lines – libssl(3), libcrypto(3), openssl(1) • Jail: Isolate a process and its descendants • Access Control Lists
119
FreeBSD 5.0 (May 2006)
• Symmetric Multiprocessing • Modular Disk I/O Request Transformation Framework (GEOM) • Mandatory Access Control • Pluggable Authentication Module
120
60 3/3/2021
FreeBSD 5.3 (Nov 2004)
• Streaming Archive Access Library • Miniport Driver Wrapper
121
FreeBSD 6.2 (Jan 2007)
• Basic Security Module Auditing
122
61 3/3/2021
FreeBSD 7 (Feb 2008)
• ZFS File System and Storage Pools • Dynamic Tracing
123
FreeBSD 9.0 (Jan 2012)
• Para-virtualized I/O • Infiniband Support • Application Compartmentalization
124
62 3/3/2021
FreeBSD 10.0 (Jan 2014)
• Virtual Machine Monitor (bhyve) • Fast User Space Raw Packet Processing
125
FreeBSD 11.0 (Sep 2016)
• Network Blacklisting
126
63
3/3/2021
Results Quantitative
127
128
64 3/3/2021
129
130
65 3/3/2021
131
132
66 3/3/2021
133
134
67 3/3/2021
Mean Cyclomatic Complexity
135
Mean Cyclomatic Complexity
FreeBSD GNU/Linux
136
68
3/3/2021
Theory of OS of Theory Architectural Evolution Architectural
137
Form and Pace of and Pace Form Architectural Evolution Architectural
138
69 3/3/2021
Many core architecture decisions are taken at the beginning of the system's lifetime
139
Most architecture decisions survive over the system lifetime
140
70 3/3/2021
141
142
71 3/3/2021
New architecture decisions are continuously made, further fueling architecture evolution
143
The rate of architecture decisions declines over the system's lifetime
144
72
3/3/2021
Accumulation of Accumulation Architectural Technical Debt Technical Architectural
145
Architecture decisions offering features that are either similar to existing ones or remain under-used
• read, pread, readv, preadv, recv, recvfro, recvmsg • /var/log, syslogd, acct, auditd • UGO permissions, ACL, MAC • Threads and processes for multitasking
146
73 3/3/2021
Debt systematically paid back despite increasing system size and complexity
147
Forces of Forces Architectural Evolution Architectural
148
74 3/3/2021
Conventions instead of enforcement facilitate evolution by reducing effort and offering flexibility
• Text files • Directory hierarchy • Documented file formats • Environment variables
149
Portability, due to its inherent complexity, is a key driver of architectural evolution
150
75 3/3/2021
An ecosystem of other OSs and third parties constantly shapes the architecture evolution
151
Third-party subsystems facilitate evolution through code reuse, but add technical debt
152
76 3/3/2021
Large subsystems form their own independent architecture
• Netgraph, SSL, MAC, PAM, GEOM, BSM, ZFS, DTrace
153
Thank you!
Diomidis Spinellis and Paris Avgeriou. Evolution of the Unix system architecture: An exploratory case study. IEEE Transactions on Software Engineering, 2019. doi:10.1109/TSE.2019.2892149
www.spinellis.gr @CoolSWEng
154
77 3/3/2021
Free open edX course (MOOC): Unix Tools: Data, Software and Production Engineering
Grow from being a Unix novice to Unix wizard status! Process big data, analyze software code, run DevOps tasks and excel in your everyday job through the amazing power of the Unix shell and command-line tools. https://www.spinellis.gr/unix?sbcars2020
155
Image Credits
• Faces of Open Source / Peter Adams • PDP11/40: Stefan_Kögl, CC BY-SA • Data: Joshua Sortino 3.0 • Hackers at Junction 2015: Vmuru • Digital VAX 11/780: Emiliano Russo, • ASR-33 Teletype: Rama & Musée PD Bolo • Numbers: Nick Hillier • VT100: Jason Scott • Building construction: chuttersnap • PDP 11/20: Image courtesy of • Technical Debt: Jacob Duck Die Computer History Museum Pfändung • Twisted skyscraper: Mitya Ivanov (Creative commons licenses) • SPARCstation, Mike Chapman, PD
156
78 3/3/2021
Funding Credit
The research described has been partially carried out as part of the CROSSMINER Project, which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 732223.
157
Diomidis Spinellis and Paris Avgeriou. Evolution of the Unix system architecture: An exploratory case study. IEEE Transactions on Software Engineering, 2019. doi:10.1109/TSE.2019.2892149
158
79 3/3/2021
Data Sources
159
160
80 3/3/2021
161
81