3/3/2021

The Triumph of the New Jersey Style: How Architectural Elegance, Components, and Reuse Shaped the Evolution of Paris Avgeriou Department of Management Science and Technology Institute for Mathematics and University of Economics and Business University of Groningen & Department of Software Technology Delft University of Technology

www.spinellis.gr @CoolSWEng

1

Unix 50th Anniversary 1969–2019

2

1

3/3/2021 Faces of Faces of Open Source / Peter Adams

3

4

2 3/3/2021

5

6

3 3/3/2021

7

8

4 3/3/2021

9

10

5 3/3/2021

11

12

6

3/3/2021 History

13 Data Sources Data

14

7

3/3/2021

Architectural Design Design Decisions

15

Results Quantitative

16

8

3/3/2021

Theory of OS of Theory Architectural Evolution Architectural

17 History

18

9 3/3/2021

19

20

10 3/3/2021

21

22

11 3/3/2021

23

24

12 3/3/2021

25

26

13 3/3/2021

27

28

14 3/3/2021

29

30

15 3/3/2021

31

32

16 3/3/2021

33

34

17 3/3/2021

35 Data Sources Data

36

18 3/3/2021

37

38

19 3/3/2021

39

40

20 3/3/2021

41

42

21 3/3/2021

github.com/dspinellis/unix-history-repo/

43

44

22 3/3/2021

45

46

23 3/3/2021

47

48

24 3/3/2021

49

50

25 3/3/2021

51

dspinellis.github.io/unix-history-man

52

26 3/3/2021

53

54

27

3/3/2021

Architectural Design Design Decisions

55

56

28 3/3/2021

PDP-7 [Unix] (1970)

57

… and I once heard an old-timer growl a young programmer:

“I've written boot loaders that were shorter than your variable names!”

— Stephen C. Johnson

58

29 3/3/2021

Layering and Partitioning

adm.s .s dskio.s init.s s6.s ald.s check.s dskres.s lcase.b s7.s apr.s .s dsksav.s maksys.s s8.s as.s .s ds.s s1.s s9.s bc.s chrm.s dsw.s s2.s scope.v bi.s .s ed1.s s3.s sop.s bl.s db.s ed2.s s4.s trysys.s cas.s dmabs.s ind.b s5.s

59

Process Management (fork)

60

30 3/3/2021

Descriptor Management

61

Separation of Metadata from File Naming

62

31 3/3/2021

Devices as Files

63

File I/O API

• open • read • • seek • tell • close

64

32 3/3/2021

File System API

• creat • rename • link • unlink

65

Interpreter

66

33 3/3/2021

67

First Research Edition (Nov 1971)

68

34 3/3/2021

First Edition Unix 1972 FreeBSD 11.1 2018

69

The Shell as a User Program

70

35 3/3/2021

Abstraction of Standard I/O

71

Interoperability through Documented File Formats

72

36 3/3/2021

Format Description Clients

a.out Assembler and linker output as, ld, , , un

Archive Object code libraries , ld

Core Crashed program image Kernel, db

Directory directories , , , , ,

File system File system format check, dump,* mkfs, restor*

Ident GECOD ident card format opr

Password User accounts and passwords chown, find, getpw,* login,* ls, passwd*

Tape* DECtape file format mt,* tap*

Uid User identifier to name map chown

utmp Logged in users init, login,* ,* write*

wtmp* Users login history acct, date, init, login, tacct, who

73

User-Contributed Tools and Games

74

37 3/3/2021

Tree Directory Structure

• mkdir(II) • chdir(I) • chdir(II) • find(I) • ln(I) • ls(I) • stat(I) • mkdir(I) • (I) • (I) • rmdir(I)

75

Mountable File System Interface

• mount(II) • mount(I) • umount(II) • umount(I)

76

38 3/3/2021

Second Research Edition (Jun 1972)

• Software Library

77

Third Research Edition (Feb 1973)

78

39 3/3/2021

Pipes and Filters

79

Fourth Research Edition (Nov 1973)

80

40 3/3/2021

Structured Programming

• Kernel implemented in “New B” • 6373 lines New B • 768 lines PDP-11 assembly • 105 functions + 50 assembly symbols • vs 248 global symbols in the First Edition

81

Language-Independent API

82

41 3/3/2021

Data Structure Definition & Reuse

buf.h filsys.h proc.h text.h conf.h inode.h reg.h tty.h file.h param.h systm.h user.h

83

Dynamic Resource Management

int coremap[CMAPSIZ]; int swapmap[SMAPSIZ];

struct map { char *m_size; char *m_addr; };

malloc(mp, size) struct map *mp; { …

84

42 3/3/2021

Device Driver Abstraction

85

Driver Interface

struct { int (*d_open)(); int (*d_close)(); int (*d_strategy)(); } bdevsw[];

struct { int (*d_open)(); int (*d_close)(); int (*d_read)(); int (*d_write)(); int (*d_sgtty)(); } cdevsw[];

86

43 3/3/2021

Buffer Cache

Fourth Edition FreeBSD 11.1 #define B_READ 01 #define B_ASYNC 0x00000004 #define B_DONE 02 /* Start I/O, do not . #define B_ERROR 04 */ #define B_BUSY 010 […] #define B_XMEM 060 #define B_DELWRI 0x00000080 #define B_WANTED 0100 /* Delay I/O until buffer #define B_RELOC 0200 reused. */ #define B_ASYNC 0400 #define B_DONE 0x00000200 #define B_DELWRI 01000 /* I/O completed. */

87

Fifth Research Edition (Jun 1974) • Command Files chdir /usr/source/s3 cc -c ctime.c ar r /lib/liba.a ctime.o rm ctime.o chdir /usr/source/s1 cc -s -n date.c cp a.out /bin/date cc -s -n dump.c cp a.out /bin/dump cc -s -n ls.c cp a.out /bin/ls rm a.out

88

44 3/3/2021

Sixth Research Edition (May 1975)

89

Portable C Library

alloc.c clenf.c makbuf.c scan1.c calloc.c copen.c maktab.c scan2.c cclose.c cputc.c nexch.c scan3.c ceof.c cwrd.c nodig.c system.c cerror.c dummy.s .c tmpnam.c cexit.c ftoa.c putch.c unget.c cflush.c getch.c puts.c unprnt.s cfree.c gets.c relvec.c wdleng.c cgetc.c getvec.c revput.c ciodec.c iehzap.c run

90

45 3/3/2021

Seventh Research Edition (Jan 1979)

91

Unix as a Virtual Machine

Also, about this [1973] I had a fateful discussion with Dennis, in which he said:

“I think it may be easier to port Unix to a new piece of hardware than to port a complex application from Unix to a new OS” — Steve Johnson

92

46 3/3/2021

Dynamic User Memory Allocation

• malloc(3), free(3) – Used by 26 programs: cc col dc dcheck eqn graph icheck learn ls neqn nm quot ratfor spline struct tar tsort uucp xsend quiz • stdio(3), mp(3)

93

@ CoolSWEng

94

47 3/3/2021

Environment Variables

• KEY=value • Kernel • Shell • C Library

95

Language Development Tools

(1) • (1) • 12 clients: awk bc cpp egrep eqn lex m4 pcc neqn struct •

96

48 3/3/2021

Domain-Specific Languages

• sh • awk • • find • expr • egrep • m4 • make

97

Filesystem Directory Hierarchy

98

49 3/3/2021

99

First and Second Berkeley Software Distributions (1978) • Software Packages – csh – – Mail – Pascal – termlib

100

50 3/3/2021

3BSD (1979)

• Virtual Memory Paging – vm_*.c – 2808 out of 16039 C source code – 17% of kernel source code – vread(2), vwrite(2), vfork(2)

101

4BSD (Oct 1980)

102

51 3/3/2021

Regular Expression Library: regex(3)

– 5 implementations: awk, sed, ed, , expr – 1 client: (1) – 2 more by 4.3: dbx(1), rdist(1) – 4 replacements in FreeBSD: ed, grep, sed, expr

103

Optimized Screen Handling

– curses(3) – termcap(5)

104

52 3/3/2021

4.2BSD (Sep 1983)

• Internet Protocol Family – ARP, IP, TCP, UDP, ICMP • Local and Remote Interprocess Communication – socket(2), etc. • Network and User Database Access – getfsent(3x), getgrent(3), gethostent(3n), getnetent(3n), getprotoent(3n), getpwent(3), getservent(3n) • Pseudo-Terminal Driver – pty(4)

105

• 2 library functions – rcmd(3x) System call Uses – rexec(3x) • 11 system daemons bind 23 – comsat(8c) connect 15 – ftpd(8c) – gettable(8c) accept 13 – implogd(8c) – rexecd(8c) select 12 – rlogind(8c) listen 11 – af(8c) – rshd(8c) sendto 10 – rwhod(8c) – telnetd(8c) shutdown 9 – tftpd(8c) recvfrom 8 • 8 user-mode programs – ftp(1c) getsockname 6 – rlogin(1c) recv 2 – rsh(1c) – (1c) send 2 – telnet(1c) sendmsg 1 – tftp(1c) – whois(1c) getsockopt 0 – sendmail(1c) recvmsg 0 socketpair 0

106

53 3/3/2021

4.3BSD Tahoe (Jun 1988)

• Multiple CPU Architecture Support Tahoe 19% – VAX – CCI Power 6/32 (Tahoe) Machine independent 33%

• Timezone Handling – Community contribution – Separation of timezone rules from code VAX 48%

107

4.3BSD Reno (Jun 1990)

• Kernel Packet Forwarding Database – route(4) /* – routed(8), XNSrouted(8) * Operations on vnodes. */ struct vnodeops { • Virtual Filesystem Interface int (*vn_lookup)( /* ndp */ ); int (*vn_create)( /* ndp, fflags, vap, cred */ ); int (*vn_mknod)( /* ndp, vap, cred */ ); int (*vn_open)( /* vp, fflags, cred */ ); int (*vn_close)( /* vp, fflags, cred */ ); int (*vn_access)( /* vp, fflags, cred */ ); int (*vn_getattr)( /* vp, vap, cred */ ); int (*vn_setattr)( /* vp, vap, cred */ );

int (*vn_read)( /* vp, uiop, offp, ioflag, cred */ ); int (*vn_write)( /* vp, uiop, offp, ioflag, cred */ ); int (*vn_ioctl)( /* vp, com, data, fflag, cred */ ); int (*vn_select)( /* vp, which, cred */ ); int (*vn_mmap)( /* vp, ..., cred */ ); int (*vn_fsync)( /* vp, fflags, cred */ ); int (*vn_seek)( /* vp, (old)offp, off, whence */ ); ….

108

54 3/3/2021

4.3BSD Net/2 (Jun 1991)

• Stream I/O Functions – funopen(3) – GNU funopencookie(3) added in FreeBSD 11

109

4.4BSD (Jun 1994)

• Stackable Filesystems – mount_null(8) – mount_union(8) • Generic System Control Interface (MIB) – (1) – sysctl(3) – sysctl(9)

110

55 3/3/2021

111

386BSD Kit (1992-1993)

• Organized Community Contributions – From open source software … – … to an open source project • Patch metadata – title – author – description – prerequisites

112

56 3/3/2021

113

FreeBSD 1.1 (May 1994)

• Package Manager – Patch – Compile – Install – Uninstall – Handling of dependencies

114

57 3/3/2021

FreeBSD 2.0 (Nov 1994)

• Process Filesystem – (5) • Dynamically Loadable Kernel Modules – lkm(4), then kld(4) – device drivers – file systems – emulators – system calls – 992 modules in FreeBSD 11.1

115

FreeBSD 2.1 (Nov 1995)

Emulation • Packet Capture Library

116

58 3/3/2021

FreeBSD 3.0 (Jan 1999)

• Common Access Method I/O Subsystem (CAM)

117

FreeBSD 3.4 (Dec 1999)

• Graph-based Kernel Networking and User Library ()

118

59 3/3/2021

FreeBSD 4.0 (Mar 2000)

• OpenSSL Secure Sockets Layer and Transport Layer Security framework – Version 0.9.4 – 1127 files, 227118 lines – libssl(3), libcrypto(3), openssl(1) • Jail: Isolate a process and its descendants • Access Control Lists

119

FreeBSD 5.0 (May 2006)

• Symmetric Multiprocessing • Modular Disk I/O Request Transformation Framework (GEOM) • Mandatory Access Control • Pluggable Authentication Module

120

60 3/3/2021

FreeBSD 5.3 (Nov 2004)

• Streaming Archive Access Library • Miniport Driver Wrapper

121

FreeBSD 6.2 (Jan 2007)

• Basic Security Module Auditing

122

61 3/3/2021

FreeBSD 7 (Feb 2008)

• ZFS File System and Storage Pools • Dynamic Tracing

123

FreeBSD 9.0 (Jan 2012)

• Para-virtualized I/O • Infiniband Support • Application Compartmentalization

124

62 3/3/2021

FreeBSD 10.0 (Jan 2014)

• Virtual Machine Monitor () • Fast User Space Raw Packet Processing

125

FreeBSD 11.0 (Sep 2016)

• Network Blacklisting

126

63

3/3/2021

Results Quantitative

127

128

64 3/3/2021

129

130

65 3/3/2021

131

132

66 3/3/2021

133

134

67 3/3/2021

Mean Cyclomatic Complexity

135

Mean Cyclomatic Complexity

FreeBSD GNU/Linux

136

68

3/3/2021

Theory of OS of Theory Architectural Evolution Architectural

137

Form and Pace of and Pace Form Architectural Evolution Architectural

138

69 3/3/2021

Many core architecture decisions are taken at the beginning of the system's lifetime

139

Most architecture decisions survive over the system lifetime

140

70 3/3/2021

141

142

71 3/3/2021

New architecture decisions are continuously made, further fueling architecture evolution

143

The rate of architecture decisions declines over the system's lifetime

144

72

3/3/2021

Accumulation of Accumulation Architectural Technical Debt Technical Architectural

145

Architecture decisions offering features that are either similar to existing ones or remain under-used

• read, pread, readv, preadv, recv, recvfro, recvmsg • /var/log, syslogd, acct, auditd • UGO permissions, ACL, MAC • Threads and processes for multitasking

146

73 3/3/2021

Debt systematically paid back despite increasing system size and complexity

147

Forces of Forces Architectural Evolution Architectural

148

74 3/3/2021

Conventions instead of enforcement facilitate evolution by reducing effort and offering flexibility

• Text files • Directory hierarchy • Documented file formats • Environment variables

149

Portability, due to its inherent complexity, is a key driver of architectural evolution

150

75 3/3/2021

An ecosystem of other OSs and third parties constantly shapes the architecture evolution

151

Third-party subsystems facilitate evolution through code reuse, but add technical debt

152

76 3/3/2021

Large subsystems form their own independent architecture

• Netgraph, SSL, MAC, PAM, GEOM, BSM, ZFS, DTrace

153

Thank you!

Diomidis Spinellis and Paris Avgeriou. Evolution of the Unix system architecture: An exploratory case study. IEEE Transactions on Software Engineering, 2019. doi:10.1109/TSE.2019.2892149

www.spinellis.gr @CoolSWEng

154

77 3/3/2021

Free open edX course (MOOC): Unix Tools: Data, Software and Production Engineering

Grow from being a Unix novice to Unix wizard status! Process big data, analyze software code, run DevOps tasks and excel in your everyday job through the amazing power of the Unix shell and command-line tools. https://www.spinellis.gr/unix?sbcars2020

155

Image Credits

• Faces of Open Source / Peter Adams • PDP11/40: Stefan_Kögl, CC BY-SA • Data: Joshua Sortino 3.0 • Hackers at Junction 2015: Vmuru • Digital VAX 11/780: Emiliano Russo, • ASR-33 Teletype: Rama & Musée PD Bolo • Numbers: Nick Hillier • VT100: Jason Scott • Building construction: chuttersnap • PDP 11/20: Image courtesy of • Technical Debt: Jacob Duck Die Computer History Museum Pfändung • Twisted skyscraper: Mitya Ivanov (Creative commons licenses) • SPARCstation, Mike Chapman, PD

156

78 3/3/2021

Funding Credit

The research described has been partially carried out as part of the CROSSMINER Project, which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 732223.

157

Diomidis Spinellis and Paris Avgeriou. Evolution of the Unix system architecture: An exploratory case study. IEEE Transactions on Software Engineering, 2019. doi:10.1109/TSE.2019.2892149

158

79 3/3/2021

Data Sources

159

160

80 3/3/2021

161

81