The Computer History Simulation Project

The Computer History Simulation Project is a loose Internet-based collective of people interested in restoring historically significant computer hardware and software systems by simulation. The goal of the project is to create highly portable system simulators and to publish them as freeware on the Internet, with freely available copies of significant or representative software.

Simulators

SIMH is a highly portable, multi-system simulator.

● Download the latest sources for SIMH (V3.5-1 updated 15-Oct-2005 - see change log).

● Download a zip file containing Windows executables for all the SIMH simulators. The VAX and PDP-11 are compiled without Ethernet support. Versions with Ethernet support are available here. If you download the executables, you should download the source archive as well, as it contains the documentation and other supporting files.

● If your host system is Alpha/VMS, and you want Ethernet support, you need to download the VMS Pcap library and execlet here.

SIMH implements simulators for:

● Data General Nova, Eclipse
● Digital Equipment Corporation PDP-1, PDP-4, PDP-7, PDP-8, PDP-9, PDP-10, PDP-11, PDP-15, VAX
● GRI Corporation GRI-909
● IBM 1401, 1620, 1130, System 3
● Interdata (Perkin-Elmer) 16b and 32b systems
● Hewlett-Packard 2116, 2100, 21MX
● Honeywell H316/H516
● MITS Altair 8800, with both 8080 and Z80
● Royal McBee LGP-30, LGP-21
● Scientific Data Systems SDS 940

Also available is a collection of tools for manipulating simulator file formats and for cross-assembling code for the PDP-1, PDP-7, PDP-8, and PDP-11.

Software Kits to run on SIMH

Help with SIMH

System Photographs

Papers on Simulation and Historic Hardware

Future Work and Items Needed

List of Contributors

Links to Computer History and Simulation Resources

Updated 15-Oct-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

http://simh.trailing-edge.com/

Available Simulators

For details on the simulated implementation of each system, please click on the appropriate link. Except where noted, all simulators have been developed by Bob Supnik.

● Data General Nova (plotter and multiplexor support by Bruce Ray)

● Data General Eclipse (developed by Charles Owen)

● Digital Equipment Corporation PDP-1

● Digital Equipment Corporation PDP-4

● Digital Equipment Corporation PDP-7

● Digital Equipment Corporation PDP-8

● Digital Equipment Corporation PDP-9

● Digital Equipment Corporation PDP-10 (Ethernet support by David Hittner)

● Digital Equipment Corporation PDP-11 (Ethernet support by David Hittner, DHQ support by John Dundas)

● Digital Equipment Corporation PDP-15

● Digital Equipment Corporation VAX (Ethernet support by David Hittner, DHQ support by John Dundas)

● GRI Computer Corporation GRI-909

● IBM 1401

● IBM 1620

● IBM System 3 (developed by Charles Owen)

● IBM 1130 (developed by Brian Knittel - latest version at www.ibm1130.org)

● Interdata (Perkin Elmer) 16b and 32b systems

● Hewlett-Packard HP 2100

● Honeywell H316

● MITS Altair 8800 (developed by Charles Owen; Z80 version by Peter Schorn - latest version at www.schorn.ch)

● Royal McBee LGP-30 (and LGP-21)

● Scientific Data Systems SDS 940

The simulators have been tested in the following environments:

● Windows 9x/NT/2000 (Visual C++, Mingw gcc)
● DEC (DEC C)
● OpenVMS (DEC C)
● Linux (gcc)
● NetBSD, OpenBSD, FreeBSD (gcc)
● Solaris (gcc)
● OS/2 (EMX)
● Macintosh OS 9 (CodeWarrior)
● Macintosh OS X (Apple Developer Tools)

Updated 30-Jun-2004 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Change Log For V3.5

The change log for the previous version (V3.4) is here.

V3.5-1, 15-Oct-05

IBM 1401:
● Changed character encodings to be consistent with Paul Pierce 709X simulator
● Changed card column binary format to be consistent with Paul Pierce 709X simulator
● Added choice of business or Fortran encoding for card punch, line printer, and inquiry terminal output
● Added mode control for old/new character encodings

IBM 1620:
● Changed character encodings to be consistent with 7094 and 1401

PDP-11:
● Fixed bug in autoconfiguration algorithm (missing XU table entry)

VAX:
● Fixed bug in autoconfiguration algorithm (missing XU table entry)
● Fixed bug in floating point structure definitions with 32b compilation option

V3.5-0, 09-Sep-05

SCP and libraries:
● Fixed to trim trailing spaces on file names
● sim_ether: added Windows user-defined adapter names (from Timothe Litt)
● sim_sock: fixed SIGPIPE error on UNIX
● sim_tape: fixed misallocation of TPC map array in 64b configurations
● sim_tmxr: added support for SET DISCONNECT

IBM 1401:
● Fixed clearing of SSB-SSG on reset (reported by Ralph Reinke)
● Fixed problem with 2, 5 character R, P instructions (reported by Van Snyder)
● Removed error stops from MCE

PDP-11:
● Revised autoconfiguration algorithm and interface
● Added additional 11/60 registers
● pdp11_vh: fixed bug in vector display routine
● pdp11_xu: fixed runt packet processing (found by Tim Chapman)

PDP-15:
● Removed spurious AAS instruction
● Fixed bug in SHOW TTIX CONN/STATS
● Fixed bug in SET TTIXn LOG/NOLOG

PDP-8:
● Fixed bug in SHOW TTIX CONN/STATS
● Fixed bug in SET TTIXn LOG/NOLOG

HP2100:
● Added SET MUXLn DISCONNECT

Interdata:
● Fixed bug in SHOW PAS CONN/STATS
● Added SET PASLn DISCONNECT

SDS:
● Fixed bug in SHOW MUX CONN/STATS
● Added SET MUXLn DISCONNECT

Updated 15-Oct-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Software Kits

Some of these software kits are governed by licenses restricting their use. Each kit contains license information, if a license applies.

● RDOS V7.5 for the Nova (under license provided by Data General Corporation).

● Mapped RDOS 7.5 for the Eclipse (under license provided by Data General Corporation).

● Updated documentation! Lisp for the PDP-1 (courtesy of the author, L Peter Deutsch).

● DDT for the PDP-1 (courtesy of Derek Peschel).

● SIM8 for the PDP-7 (courtesy of the author, David J Waks).

● ESI-X for the PDP-8 (courtesy of the author, David J Waks).

● FOCAL69 for the PDP-8 (courtesy of Digital Equipment Corporation).

● OS/8 for the PDP-8 (under license provided by Digital Equipment Corporation).

● Updated documentation! TSS/8 for the PDP-8 (courtesy of Digital Equipment Corporation).

● 4k Disk Monitor System for the PDP-8 (courtesy of Digital Equipment Corporation).

● TOPS-10 and TOPS-20 for the PDP-10 (under license provided by Digital Equipment Corporation).

● ITS (Incompatible Timesharing System) for the PDP-10; click here for installation instructions.

● DOS/Batch-11 V10 for the PDP-11.

● RT-11 V4 for the PDP-11 (under license provided by Mentec Corporation).

● RT-11 V5.3 for the PDP-11 (under license provided by Mentec Corporation).

● RSTS/E V7 distribution system and RSTS/E prebuilt system for the PDP-11 (under license provided by Mentec Corporation).

● PDP-11 UNIX V5 with sources (under license provided by Caldera Corporation).

● PDP-11 UNIX V6 with sources (under license provided by Caldera Corporation).

● PDP-11 UNIX V7 with sources (under license provided by Caldera Corporation).

● FOCAL15 for the PDP-15 (courtesy of Digital Equipment Corporation).

● Advanced Software System, both Keyboard Monitor and Foreground/Background, for the PDP-15 (courtesy of Digital Equipment Corporation).

● DOS-15 for the PDP-15 (courtesy of Digital Equipment Corporation).

● XVM/DOS for the PDP-15 (courtesy of Digital Equipment Corporation).

● Interdata UNIX V6 (under license provided by Caldera Corporation).

● Interdata UNIX V7 (under license provided by Caldera Corporation).

● VAX/VMS (under a hobbyist license provided by Hewlett-Packard Corporation); click here for details.

● VAX/Rdb (under a Network Developer's license provided by Oracle Corporation); click here for details.

● NetBSD for the VAX; click here for installation instructions and a pointer to the installation kit.

● 4.3BSD for the VAX; click here for the installation directory.

● Diagnostics and SPS assembler for the IBM 1401.

● Three single-card programs or "koans" for the IBM 1401.

● Paper-tape BASIC for the HP 2100.

● CP/M and DOS for the MITS Altair 8800 (CP/M under license provided by Caldera Corporation); and an updated kit with 4K Basic, 8K Basic, Prolog, and CP/M 3 (latest kits at www.schorn.ch).

● System 3 Model 10 SCP.

● DMS R2 V12 for the IBM 1130 (from www.ibm1130.org).

● Sources to three classic adventure games: Adventure (Colossal Cave), 1994 VMS version; Zork, final ITS version; and the Fortran translation, Dungeon, 1990 VMS version.

Updated 25-Apr-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Help With SIMH

Sources for Help

● SIMH Frequently Asked Questions (FAQ).

● Mirian Crzig Lennox, "Building an ITS from scratch on the Supnik PDP-10 simulator" - a step-by-step guide for bringing PDP-10 ITS up under SIMH.

● Phil Wherry, "Running VAX/VMS Under Linux Using SIMH" - a step-by-step guide for bringing VAX/VMS up under SIMH on a Linux host.

● Lars Brinkhoff, "NetBSD SIMH How-To" - a step-by-step guide for bringing NetBSD for the VAX up under SIMH.

● Paulo de Silva, "How To Use HP 2100 Algol On SIMH" - a step-by-step guide for bringing HP 2100 Algol up under SIMH.

● The SIMH users' mailing list (simh AT trailing-edge DOT com). To subscribe, send an email with subscribe in the subject line to simh-request AT trailing-edge DOT com. All of the email addresses are anti-spam encoded.

Updated 04-Apr-04 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

System Photographs

All photographs courtesy of Digital Equipment Corporation, unless otherwise indicated.

Data General Corporation

● SuperNova (courtesy of Paul Pierce)

● Nova 840 (courtesy of Carl Friend)

● Nova 4 (courtesy of Toby Thain)

● Eclipse S130 (courtesy of Emil Sarlija)

Digital Equipment Corporation

● PDP-1

● PDP-4

● PDP-5

● PDP-6

● PDP-7

● PDP-7/A

● PDP-8

● PDP-8/S

● PDP-8/I

● PDP-8/E

● VT78

● DECmate III

● PDP-9

● PDP-10 (KA10)

● PDP-10 (KI10)

● DECSYSTEM-10 (KL10)

● DECSYSTEM-20

● PDP-11/20

● PDP-11/45

● PDP-11/05

● PDP-11/40

● PDP-11/34

● PDP-11/70

● LSI-11

● PDP-11/60

● LSI-11/23

● PDP-11/23

● PDP-11/24

● PDP-11/44

● PDP-11/83

● PDP-11/94

● PDP-12

● PDP-14

● PDP-15

● PDP-16/M

Hewlett-Packard

● HP2112B (courtesy of Jay Jaeger)

● HP2114A (courtesy of Jeff Moffatt)

Honeywell

● H316 (courtesy of Mike Umbricht)

IBM

● IBM 1401 (courtesy of Paul Pierce)

● IBM 1620 (courtesy of Technology Museum of Thessaloniki)

● IBM System 3 (courtesy of Jim Watt)

● IBM 1130 (courtesy of Brian Knittel)

Interdata

● Interdata 4 (courtesy of Carl Friend)

● Interdata 70 (courtesy of Tony Farrell)

● Interdata 70 detail (courtesy of Tony Farrell)

MITS

● Altair 8800 (courtesy of Data General Corporation)

Royal McBee

● LGP-30

● LGP-21 (courtesy of Tom Jennings)

Updated 10-Sep-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Papers on Simulation and Historic Systems

These papers were written to present the rationale for SIMH, to document SIMH, or to document relatively obscure architectural or implementation details in historic systems.

● "Preserving Computing's Past: Restoration and Simulation", by Max Burnet and Bob Supnik, from the Digital Technical Journal, Volume 8, Number 3, 1996 (PDF)

● "SIMH: Forward Into The Past", a presentation to the Vintage Computer Festival East, 16-Jul-2004 (PDF)

● "Writing a Simulator for the SIMH System", by Bob Supnik (PDF)

● "Adding an I/O Device to a SIMH Virtual Machine", by Bob Supnik (PDF)

● "The SIMH Breakpoint System", by Bob Supnik (PDF)

● "SIMH Magtape Representation and Handling", by Bob Supnik (PDF)

● "Architectural Evolution in DEC's 18b Computers", by Bob Supnik (PDF)

● "Decoding the H316/H516 'Generic A' Instructions", by Bob Supnik (PDF)

● "Unearthing the PDP-15 Operating Systems", by Bob Supnik (PDF)

● "PDP-11 : Variations on a Theme", by Bob Supnik (PDF)

● "Bug, Feature, or Code Rot? Adventures in O/S Debugging", by Bob Supnik (PDF)

● "A Massbus Mystery, or, Why Primary Sources Matter, Even In Computer History", by Bob Supnik - documenting the oldest extant bug in VAX/VMS (PDF)

● "The Case Of The Missing PLA Term, or, Bugs I Have Known", by Bob Supnik (PDF)

● "HP's IOP Implementations: 2100 vs 21MX", by Bob Supnik (PDF)

● New! "CTSS Hardware", by Bob Supnik (PDF)

● "What was the PDP-X?", by Bob Supnik (PDF)

● "VAX Processor Chart", by Bob Supnik (text)

● Three late 1980's presentations on the design of VLSI VAXen, by Bob Supnik:
1. "VLSI VAX Micro-Architecture" (PDF)
2. "Microcoding Considered as a Fine Art" (PDF)
3. "CVAX and : The Development Process" (PDF)

Updated 23-Jun-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Future Work and Items Needed

Items Needed

● Other PDP-1 software
● Interdata 16b software
● PDP-7 operating systems: DECsys; UNIX V1-V4
● PDP-15 operating systems: RSX-15, MUMPS-15
● GRI-909 documentation and software

How You Can Help

The Computer History Simulation Project is a volunteer effort by enthusiasts world-wide. If you are interested in preserving computing's history, you can help with the project by extending the existing simulators or writing new ones. Additional work needed includes:

● Recovering additional software; see the Items Needed list above.
● Debugging simulators. There are always additional software sets to try or debug... OS2/MT on Interdata, for example.
● Writing additional simulators. Is your favorite system languishing in the dustbin of computer history? Bring it back to life by writing a simulator for it.

Updated 24-Mar-2004 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Contributors

Many individuals have contributed to the Computer History Simulation Project, including:

Bill Ackerman - PDP-1 consulting
Anders Ahgren - VMS Ethernet simulation, Windows direct device access
Dave Babcock - IBM 1620 simulator debugging and enhancements
Alan Bawden - ITS consulting
Winfried Bergmann - Linux port testing
J. David Bryan - HP debugging and features
Phil Budne - Solaris port testing, PDP-1 software transcription and debugging
Max Burnet - DEC documentation, software, and working examples
Robert Alan Byer - VMS socket support and testing, VMS MMS file
Doug Carman - PDP-11 bootstrap debugging, RSX-11M+ images and debugging
James Carpenter - Linux port testing
Chip Charlot - Mentec Corporation license
Louis Chrétien - Macintosh porting
Dave Conroy - HP 21xx documentation, PDP-10 and 18b PDP debugging
L Peter Deutsch - PDP-1 Lisp source code and permission
Ethan Dicks - PDP-11 2.9 BSD debugging
John Dundas - PDP-11 CPU debugging, programmable clock simulator, DHQ11 simulator
Jonathan Engdahl - PDP-11 device debugging
Carl Friend - Nova and Interdata documentation, RDOS disk images
Megan Gentry - PDP-11 debugging
David Gesswein - PDP-8 and PDP-9/15 documentation, PDP-8 DECtape, disk, and paper tape images, PDP-9/15 DECtape images
Dick Greeley - Digital Equipment Corporation licenses
Gordon Greene - PDP-1 Lisp machine readable source
Lynne Grettum - PDP-11 RT-11, RSTS/E, RSX-11M legal permissions
Franc Grootjen - PDP-11 2.11 BSD debugging
Doug Gwyn - Portability debugging
Philipp Hachtmann - H316 debugging
Kevin Handy - VAX VMS/NetBSD debugging, TS11/TSV05 documentation, make file
Ken Harrenstein - KLH PDP-10 simulator
Bill Haygood - PDP-8 information and simulator, OS/8 disk images
Wolfgang Helbig - DZ11 implementation, PDP-11 debugging
Mark Hittinger - PDP-10 debugging
David Hittner - SCP debugging, DEQNA simulator and Ethernet library
Tarik Isani - DEC Pro/350 simulator
Sellam Ismail - GRI-909 documentation
Jay Jaeger - IBM 1401 consulting
Doug Jones - PDP-8 information and simulator, OS/8 disk images
Brian Knittel - IBM 1130 simulator, SCP extensions for GUI support
Al Kossow - HP 21xx, Varian 620, TI 990, Interdata, SDS, IBM, DEC documentation and software, XVM/DOS recovery
Arthur Krewat - DZ11 changes for the PDP-10
Mirian Crzig Lennox - PDP-10 ITS debugging and instructions
Don Lewine - Data General license
Tim Litt - PDP-10 hardware documentation, schematics, software documentation, media recovery and conversion
Bill McDermith - HP 21xx debugging, 12565A disk simulator
Scott McGregor - The Santa Cruz Operation license
Richard Miller - Interdata UNIX V6 and V7 debugging
Jeff Moffatt - HP 21xx information, documentation, and software
Alec Muffett - Solaris port testing
Terry Newton - HP 21MX debugging
Thord Nilson - DZ11 implementation
Charles Owen - MITS Altair 8800 simulator, Data General Eclipse simulator, IBM System 3 simulator, Nova documentation and simulator debugging, IBM 1401 diagnostics, software, and debugging
Sergio Pedraja - MINGW environment debugging
Derek Peschel - DDT-1 transcription and debugging
Paul Pierce - Media recovery and conversion, IBM 1401 information and documentation
Dave Pitts - IBM 709x simulator and debugging
Mark Pizzolato - SCP, Ethernet, VAX simulator debugging and improvements
Hans Pufal - PDP-10 debugging, PDP-15 bootstrap, PDP-9 restoration, Elliot 903 simulator, DOS-15 recovery and debugging
Bruce Ray - Software, documentation, bug fixes, and new devices for the Nova, OS/2 porting
Craig St Clair - Digital Equipment Corporation archives
Richard Schedler - Public repository maintenance
Peter Schorn - Macintosh porting; Altair Z80 simulator
Stephen Schultz - PDP-11 2.x BSD debugging
Olaf Seibert - NetBSD port testing
Brian and Barry Silverman - PDP-1 simulator, Lisp and Spacewar sources
Tim Shoppa - Nova documentation and software, PDP-10 and PDP-11 software archive, site hosting for SIMH
Michael Short - IBM 1620 debugging
Chris Smith - PDP-10 floating point debugging
Van Snyder - IBM 1401 zero footprint card and tape bootstraps, testing and debug of numerous fine points of instructions and I/O
Michael Somos - PDP-1 debugging
Hans-Michael Stahl - OS/2 port, TERMIOS terminal implementation
Ken Staley - 4.3BSD kit for the VAX
Tim Stark - TS10 PDP-10 simulator; VAX debugging
Larry Stewart - Original suggestion for the project
Chris Suddick - PDP-11 simulator floating point debugging
Ben Supnik - Macintosh timing routine
Bob Supnik - SIMH simulators
Ben Thomas - VMS terminal emulator implementation
Deb Toivonen - Digital Equipment Corporation archives
Warren Toomey - PDP Unix Preservation Society (PUPS), PDP-11 UNIX disk images
Peter Trimmel - VAX debugging, Linux large file support
Mike Umbricht - DEC documentation, H316 schematics and documentation
Leendert Van Doorn - PDP-11 UNIX V6 debugging, TERMIOS terminal emulator implementation
Fred Van Kempen - DEUNA code, RK611 emulator, PDP-11 debugging, VAX/ debugging
Holger Veit - OS/2 socket support
David J Waks - PDP-7 SIM8 and PDP-8 ESI-X source code and permission
Tom West - Nova documentation
Adrian Wise - H316 simulator, documentation, and software
John Wilson - PDP-11 simulator and RT-11, RSX-11M, and RSTS/E disk images
Joe Young - RP debugging on Ultrix-11 and BSD

In addition, the following companies have graciously licensed their software for hobbyist use:

Data General Corporation
Digital Equipment Corporation
Computer Corporation
Hewlett-Packard Corporation
Mentec Corporation
The Santa Cruz Operation
Caldera Corporation

Last, but hardly least, I would like to thank the Computer History Museum, Mountain View, California, and the Rhode Island Retro-Computing Society, Providence, Rhode Island, and their staffs and volunteers, for their support and help.

Updated 03-May-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Computer History and Simulation Links

This set of links is not intended to be complete. For more information, search Google with queries such as "computer collection" or "computer museum".

Online Computer Collections

Carl Friend's Museum - Data General Nova and Eclipse; DEC PDP-8, PDP-11, PDP-12, and LINC-8; Interdata 4; Packard Bell 250; and more

Jay Jaeger's Computer Collection - Data General Nova and Eclipse; DEC PDP-8, PDP-11, and PDP-12; HP 2112B; IBM 1410; and more

Paul Pierce's Computer Collection - Data General SuperNova; DEC PDP-8 and PDP-11; IBM 709, 7094, and 1401; and more

The -Cyber Project - CDC and Cray supercomputers, with online access

Online Documentation and Software Archives

Al Kossow's PDF document collection - many, many different systems

Tim Shoppa's Trailing Edge - PDP-10 and PDP-11 software archive

David Gesswein's pdp8.net - PDP-8 documentation and software archive

John Wilson's Dbit archive - PDP's, Alpha, IBM 370, and more

Zane Healy's DEC Emulation website

Eric S Raymond's Retrocomputing Museum

Computer and Software Information Pages

Gordon Greene's PDP-1 pages

Tom Knight's PDP-6 pages are offline

Doug Jones' PDP-8 pages

Hans Pufal's PDP-9 pages

Joe Smith's PDP-10 pages

Bruce Ray's Nova pages

Jeff Moffatt's HP 21xx pages

Bob Mader's Project Delta (RSTS/E) pages

Brian Knittel's IBM 1130 pages

Peter Schorn's Altair Z80 and CP/M pages

Mike Umbricht's H316 pages

Other Simulators

Bill Haygood's simulator for the PDP-8

Doug Jones' simulator for the PDP-8

John Wilson's simulator for the PDP-11

Tarik Isani's simulator for the Pro/350

The Hercules simulator for the IBM S/370, ESA/390, and Z/Architecture

Tom Hunter's simulator for the CDC 6600

Peter Ingerman's simulator for the Univac I and II

Ron Burkey's simulator for the Apollo Guidance Computer

Dave Pitts' simulator for the IBM 7094, running IBSYS (modified from Paul Pierce's 709x simulator)

Computer Museums

The Retro-Computing Society of Rhode Island, Providence, Rhode Island

The Computer History Museum, Mountain View, California

Updated 03-May-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

Nova Simulator Configuration

Introduced in 1969, the Nova was Data General's first architecture and one of the first 16-bit minicomputers. Its elegant and simple instruction set architecture prefigured today's RISC processors. The Nova family remained in production until the mid 1980's. There were eight processor designs:

● Nova
● SuperNova
● Nova 800
● Nova 1200
● Nova 2
● Nova 3
● Nova 4
● MicroNova

Photographs:

● SuperNova (courtesy of Paul Pierce)

● Nova 840 (courtesy of Carl Friend)

● Nova 4 (courtesy of Toby Thain)

Option | Description | Capacity
CPU and memory | Nova | 4KW - 32KW
CPU options | MDV (hardware multiply-divide) |
CPU options | Nova 3 instruction set |
CPU options | Nova 4 instruction set |
Console | KSR-33 Teletype or Dasher VDT |
Second terminal | KSR-33 Teletype or Dasher VDT |
Asynchronous multiplexor | 4060 multiplexor | 1-64 lines
Asynchronous multiplexor | 425x multiplexor | 1-64 lines
Paper tape | |
Real time clock | |
Line printer | |
Plotter | |
Disk | 4019 fixed head disk | .5MB - 4MB
Disk | 6030 single sided, single density floppy disk | .35MB
Disk | 6097 double sided, double density floppy disk | 1.2MB
Disk | 4047 (4237, 4238) cartridge disk | 2.4MB
Disk | 4234 (6045) fixed + removable cartridge disk | 10MB
Disk | 6070 double density fixed + removable cartridge disk | 20MB
Disk | 4048 | 6.2MB
Disk | 4057 (2311) disk pack | 25MB
Disk | 4231 (3330) disk pack | 92MB
Disk | 6225 non-removable disk | 5MB
Disk | 6227 non-removable disk | 15MB
Disk | 6099 non-removable disk | 12.5MB
Disk | 6103 non-removable disk | 25MB
Magnetic tape | 6026 800bpi 9 track magnetic tape |

Updated 24-Mar-2004 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

PDP-1 Simulator Configuration

Introduced in 1960, the PDP-1 was DEC's first computer, and the world's first minicomputer. Documentation and drawings are sketchy; in addition, the machine was often customized for particular uses. The simulator is based on the 1963 maintenance manual.

Photographs (courtesy of Digital Equipment Corporation):

● PDP-1

● Screenshot of PDP-1 Spacewar

Option | Description | Capacity
CPU and memory | PDP-1 | 4KW - 64KW
CPU options | Type 10 automatic multiply divide |
CPU options | Type 15 memory extension control |
Console | Soroban B (FIODEC code) |
Paper tape | integral |
Line printer | Type 62 line printer (Hollerith code) |
Microtape (DECtape) | Type 550/555 Microtape | 148KW
Drum | Type 23 parallel drum | 131KW
Drum | Type 24 serial drum | 131KW

Updated 12-Dec-03 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

PDP-4 Simulator Configuration

The PDP-4, introduced in 1963, was DEC's second (and second 18-bit) computer system. It simplified both the instruction set and I/O architecture of the PDP-1 in order to reduce cost. About 54 PDP-4's were sold.

Photographs (courtesy of Digital Equipment Corporation):

● PDP-4

Option | Description | Capacity
CPU and memory | PDP-4 | 4KW - 8KW
CPU options | Type 18 EAE, Extended Arithmetic Element |
Memory | 4K - 8K, 18b words |
Console | Type 65 KSR-28 Teletype (Baudot code) |
Paper tape | integral paper tape reader |
Paper tape | Type 75 paper tape punch |
Real time clock | integral |
Line printer | Type 62 line printer (Hollerith code) |
DECtape | Type 550/555 DECtape | 148KW
Drum | Type 24 serial drum | 131KW

Updated 12-Sep-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

PDP-7 Simulator Configuration

The PDP-7, introduced in 1965, was DEC's third 18-bit computer system. Both faster and less expensive than the PDP-1, the PDP-7 maintained upward compatibility from the PDP-4 but introduced many new options, including extended memory, extended arithmetic capability, and a crude form of memory protection. The PDP-7 featured DEC's first mass storage operating system (DECsys, based on DECtape). It was also the development system for the first versions of UNIX.

Photographs (courtesy of Digital Equipment Corporation):

● PDP-7

● PDP-7A

Option | Description | Capacity
CPU and memory | PDP-7 | 4KW - 32KW
CPU options | Type 177 EAE, extended arithmetic element |
CPU options | Type 148 memory extension control |
Console | Type 649 KSR-33 Teletype |
Paper tape | Type 444 paper tape reader |
Paper tape | Type 75 paper tape punch |
Real time clock | integral |
Line printer | Type 647B line printer (sixbit ASCII) |
DECtape | Type 550/555 DECtape | 148KW
Drum | Type 24 serial drum | 131KW

Updated 21-Apr-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

PDP-8 Simulator Configuration

DEC's PDP-8, introduced in 1965, was the first mass produced minicomputer, as well as the first computer costing less than $20,000. It remained in active production until 1988; more than 50,000 units were produced. All the models in the family were based on one of the seven CPU designs (photographs courtesy of Digital Equipment Corporation):

● 1965: PDP-8

● 1966: PDP-8/S - serial logic design

● 1968: PDP-8/I - first design with integrated circuits (also PDP-8/L)

● 1970: PDP-8/E - first Omnibus design (also PDP-8/F, PDP-8/M)

● 1975: PDP-8/A

● 1979: Intersil 6100 - first LSI design (VT78)

● 1981: Harris 6120 - last design (DECmate I, DECmate II, DECmate III)

Option | Description | Capacity
CPU and memory | PDP-8/E | 4KW - 32KW
CPU options | KM8E memory extension and timeshare control |
CPU options | KE8E EAE (extended arithmetic element) |
CPU options | TSC8-75 timesharing controller for ETOS |
Console | KL8E and KSR-33 Teletype |
Extra terminals | KL8JA and KSR-33 Teletype | 4 terminals
Paper tape | PC8E paper tape reader and punch |
Real time clock | KW8E real time clock |
Line printer | LP8E line printer |
DECtape | TC08/TU56 DECtape | 190KW
DECtape | TD8E/TU56 DECtape | 190KW
Disk | RX8E/RX01 single density floppy disk | 256KB
Disk | RX28/RX02 double density floppy disk | 512KB
Disk | DF32/DS32 fixed head disk | 128KW
Disk | RF08/RS08 fixed head disk | 1MW
Disk | RK8E/RK05 cartridge disk | 1.6MW
Disk | RL8A/RL01-RL02 cartridge disk | 5-10MB
Magnetic tape | TM8E/TU10, 800bpi 9 track magnetic tape |

Updated 01-Dec-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

PDP-9 Simulator Configuration

The PDP-9, introduced in 1967, was DEC's fourth 18-bit computer system. It was upward compatible from the PDP-7, introducing a more advanced form of memory protection. A cost-reduced version, the PDP-9/L, was introduced in 1968.

Photographs:

● PDP-9 (courtesy of Digital Equipment Corporation)

Option | Description | Capacity
CPU and memory | PDP-9 | 4KW - 32KW
CPU options | KE09A EAE, extended arithmetic element |
CPU options | KG09A memory extension control |
CPU options | KP09A power fail detection |
CPU options | KX09A memory protection control |
Console | KSR-33 Teletype |
Additional terminals | LT09 and KSR-33 Teletypes | 1-4 lines
Paper tape | PC09A paper tape reader and punch |
Real time clock | integral |
Line printer | Type 647E line printer (sixbit ASCII) |
DECtape | TC02/TU55 DECtape | 148KW
Disk | RF09/RS09 fixed head disk | 2.1MW
Magnetic tape | TC59 magnetic tape (9 track only) |

Updated 24-Mar-2004 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

PDP-10 Simulator Configuration

The PDP-10, also known as the DECsystem-10 and the DECsystem-20, was among the most famous and popular 36-bit computers. Digital introduced its first 36-bit computer, the PDP-6, in 1964. The product was not a success. Digital relaunched the family, with a new design, as the PDP-10 in 1966, and shipped the first systems in 1967. In all, Digital brought six 36-bit designs to market:

● 1964: PDP-6 - first design

● 1967: KA10 - first PDP-10

● 1972: KI10 - first DECSYSTEM-10

● 1975: KL10 - first ECL design, sold as both DECSYSTEM-10 and DECSYSTEM-20

● 1978: KL10B - extended addressing KL10

● 1979: KS10 - last 36-bit system, also known as DECSYSTEM-2020

Option | Description | Capacity
CPU and memory | KS10 | 1MW
IO | dual Unibus |
Console | 8080-based front end processor |
Paper tape | PC11 paper tape reader and punch |
Timer | built in |
Time of year clock | TCU150 (from Digital Pathways) |
Line printer | LP20 line printer |
Terminal multiplexor | DZ11 terminal multiplexor | 8-32 lines
Disk | RH11/RM03, RM05 disk packs | 67-256MB
Disk | RH11/RM80 non-removable disk | 124MB
Disk | RH11/RP04 (RP05), RP06 disk pack | 87-174MB
Disk | RH11/RP07 non-removable disk | 516MB
Floppy disk | RX211/RX02 floppy disk | 512KB
Magnetic tape | RH11/TU45 800/1600bpi 9 track magnetic tape |
Network | DEUNA Ethernet interface |

Updated 24-Mar-2004 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

PDP-11 Simulator Configuration

The PDP-11 was the most popular 16-bit minicomputer. Introduced by DEC in 1970, it remained in active production until 1996. The PDP-11 family included many processor designs (photographs courtesy of Digital Equipment Corporation):

● 1970: PDP-11/20 - first design (also PDP-11/15)

● 1972: PDP-11/45 - first design with memory extension, instruction and data space, floating point (with fast bipolar memory, PDP-11/55)

● 1972: PDP-11/05 (also PDP-11/10)

● 1972: PDP-11/40 (also PDP-11/35)

● 1975: PDP-11/04 - first single board design

● 1975: PDP-11/34

● 1975: PDP-11/70 - first design using cache memory

● 1975: LSI-11 - first LSI design (also PDP-11/03, PDT150)

● 1977: PDP-11/60

● 1979: LSI-11/23 (F11) - second LSI design, first with floating point (also PDP-11/23, PDP-11/24, Pro 350)

● PDP-11/44 - last TTL design

● LSI-11/73 (J-11) - first CMOS design, last design (also PDP-11/53, PDP-11/73, PDP-11/74, PDP-11/83, PDP-11/84, PDP-11/93, PDP-11/94, Pro 380)

Option                  Description                                          Capacity
CPU and memory          19 models supported (Unibus and Qbus)                16KB - 4MB
CPU options             Memory management unit (MMU)
                        Floating instruction set (FIS)
                        Floating point processor (FPP)
                        Commercial instruction set (CIS)
Console                 DL11 full duplex asynchronous interface
Paper tape              PC11 paper tape reader and punch
Real time clock         KW11L real-time clock
Line printer            LP11 line printer
Terminal multiplexor    DZ11 multiplexor                                     8-32 lines
                        DHQ11 multiplexor                                    8-32 lines
Disk                    RX11/RX01 single density floppy disk                 256KB
                        RX211/RX02 double density floppy disk                512KB
                        RK11/RK05 cartridge disk                             2.5MB
                        RLV12/RL01-RL02 cartridge disk                       5-10MB
                        RK611/RK06-RK07 cartridge disk                       13-26MB
                        RH11-RH70/RM03, RM05 disk packs                      67-256MB
                        RH11-RH70/RM80 non-removable disk                    124MB
                        RH11-RH70/RP04 (RP05), RP06 disk packs               87-174MB
                        RH11-RH70/RP07 non-removable disk                    516MB
                        RQDX3/RX50,RX33 floppy disks                         .4-1.2MB
                        RQDX3/RD51,RD52,RD53,RD54,RD31 disks                 10.8-155.6MB
                        RQDX3/RA60,RA71,RA72,RA73,RA81,RA82,RA90,RA92 disks  200-1960MB
DECtape                 TC11/TU56 DECtape                                    296KB
Magnetic tape           TM11/TE10 800bpi 9 track magnetic tape
                        TS11/TSV05 1600bpi 9 track magnetic tape
                        RH11-RH70/TM03/TE16,TU45,TU77 1600bpi 9 track magnetic tape
                        TQK50/TK50 TMSCP magnetic tape
Network                 DELQA/DEQNA Qbus Ethernet controller
                        DEUNA/DELUA Unibus Ethernet controller

Updated 18-Feb-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


PDP-15 Simulator Configuration

Introduced in 1969, the PDP-15 was DEC's fifth (and last) 18-bit computer system. Upward compatible from the PDP-9, it included several new architectural features, including indexing, expanded memory capacity, full memory protection, and floating point. The last PDP-15 was produced in the late 1970's.

Photographs (courtesy of Digital Equipment Corporation):

● PDP-15

Option                  Description                 Capacity
CPU and memory          PDP-15, no floating point   4KW - 128KW
CPU options             KE15 EAE, extended arithmetic element
                        KF15 power fail detection
                        KM15 memory protection control
                        KT15 memory relocation control
                        XM15 (XVM)
                        FP15 floating point processor
Console                 KSR-35 Teletype
Additional terminals    LT19 and KSR-35 Teletypes   1-16 lines
Paper tape              PC15 paper tape reader and punch
Real time clock         KW15 real time clock
Line printer            LP15 line printer
DECtape                 TC15/TU56 DECtape           148KW
Disk                    RF15/RS15 fixed head disk   2.1MW
                        RP15/RP02 disk pack         10.4MW
Magnetic tape           TC59D magnetic tape (9 track only)

Updated 24-Mar-2004 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


VAX Simulator Configuration

The VAX was the most popular 32-bit minicomputer. Introduced by DEC in 1977, it remained in active production until 1999. The VAX family included many processor designs:

● 1977: VAX-11/780 (TTL); derivative design, VAX-11/785

● 1980: VAX-11/750 (TTL gate arrays)

● 1982: VAX-11/730 (TTL)

● 1984: VAX 8600 (ECL gate arrays); derivative design, VAX 8650

● 1984: MicroVAX I (4u NMOS data path chip + TTL control)

● 1985: MicroVAX II (3u NMOS)

● 1986: VAX 8800 (ECL gate arrays)

● 1987: CVAX (2u CMOS); derivative designs CVAX II (1.5u CMOS) and SOC (1u CMOS)

● 1989: Rigel (1.5u CMOS); derivative design Mariah (1u CMOS)

● 1990: VAX 9000 (ECL gate arrays)

● 1991: NVAX (.75u CMOS); derivative designs NVAX+ (.75u CMOS) and NVAX+5 (.5u CMOS)

While the relationship between processor designs and system models is obvious for TTL and ECL processors, the VAX microprocessors were used in many different types and models of systems. For example, CVAX was the basis for the MicroVAX 3500, the VAX and VAXstation 3100, and the VAX 6200. A complete chart of the VAX processors and their instruction sets and system implementations can be found here.

Option                  Description                                          Capacity
CPU and memory          MicroVAX 3900 (CVAX CPU)                             16MB - 64MB
Console                 Full duplex asynchronous interface
Real time clock         VAX standard real-time and TOY clock
Line printer            LPV11 line printer
Terminal multiplexor    DZV11 multiplexor                                    4-16 lines
                        DHQ11 multiplexor                                    8-32 lines
Disk                    RLV12/RL01-RL02 cartridge disk                       5-10MB
                        RQDX3/RX50,RX33 floppy disks                         .4-1.2MB
                        RQDX3/RD51,RD52,RD53,RD54,RD31 disks                 10.8-155.6MB
                        RQDX3/RA60,RA71,RA72,RA73,RA81,RA82,RA90,RA92 disks  200-1960MB
Floppy disk             RXV21/RX02 floppy disk                               512KB
Magnetic tape           TS11/TSV05 1600bpi 9 track magnetic tape
                        TQK50/TK50 TMSCP magnetic tape
Network                 DELQA/DEQNA Ethernet controller

Updated 18-Feb-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


GRI-909 Simulator Configuration

Introduced in 1969, the GRI-909 was the first system from the newly formed GRI Computer Corporation. Targeted at embedded and process control applications, the machine was very spare, with only one instruction format. The simulator is based on the 1969 preliminary reference manual, as well as a surviving program, the MIT Crystal Physics System.

Photographs (courtesy of Al Kossow):

● GRI-909

Option                  Description                   Capacity
CPU and memory          GRI-909                       4KW - 32KW
CPU options             Extended arithmetic operator
Console                 S42-001 Teletype input
                        S42-002 Teletype output
Paper tape              S42-004 High speed reader
                        S42-006 High speed punch
Real time clock         line-time clock

Updated 15-Sep-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


IBM 1401 Simulator Configuration

The IBM 1401 was introduced in 1960 as an I/O pre- and post-processor for its 36-bit mainframes. In addition, it was the first widely used small business computer.

Photographs (courtesy of Paul Pierce):

● IBM 1401

Option                  Description                       Capacity
CPU and memory          IBM 1401                          4K - 16K characters
CPU special features    advanced programming (indexing)
                        compare high-low-equal
                        branch on bit equal
                        modify address
                        move record
                        extended print edit
                        multiply/divide
Console                 1407 inquiry terminal
Cards                   1402 card reader punch
Line printer            1403 line printer
Disk                    1311 disk pack                    2M characters
Magnetic tape           7 track magnetic tape

Updated 21-Apr-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


IBM 1620 Simulator Configuration

The IBM 1620 was introduced in 1960 as a small-scale scientific computer. For many people, the 1620 was the first computer they ever used.

Photographs (courtesy of the Technology Museum of Thessaloniki):

● IBM 1620 Model 2

Option                  Description                    Capacity
CPU and memory          IBM 1620, Model 1 or Model 2   20K - 60K digits
CPU special features    indirect addressing
                        edit instructions
                        automatic divide
                        modify address
Model 2 CPU features    indexing
                        binary capability
                        floating point
Console                 integrated typewriter
Paper tape              1621 paper tape reader
                        1624 paper tape punch
Cards                   1622 card reader/punch
Line printer            1443 line printer
Disk                    1311 disk pack, 4 drives       2M characters

Updated 21-Apr-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


IBM System 3 Simulator Configuration

The IBM System 3 was a system for small business. Its successors included the System 34 and the System 36.

The simulator for the IBM System 3 was developed by Charles Owen.

Option                  Description                           Capacity
CPU and memory          IBM System 3 Model 10 (5410)          8KB - 64KB
Console                 5471 printer/keyboard console
Card reader/punch       1442 card reader/punch
Line printer            1403 line printer
Disk                    5440 fixed/removable cartridge disk   2.467MB

Updated 21-Apr-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


Interdata Simulator Configuration

Interdata was founded in the mid 1960's. It produced a family of 16b minicomputers loosely modeled on the IBM 360 architecture. Microprogramming allowed a steady increase in the functionality of successive models.

● Interdata 3

● Interdata 4 (autoload, floating point)

● Interdata 5 (list processing, microcoded automatic I/O channel)

● Interdata 70, 74, 80

● Interdata 6/16, 7/16

● Interdata 8/16, 8/16e (double precision floating point, extended memory)

In the early 1970's, Interdata was purchased by Perkin-Elmer. In 1974, it introduced one of the first 32b minicomputers, the 7/32. Several generations of 32b systems followed:

● Interdata 7/32

● Interdata 8/32

● Perkin-Elmer 3205, 3210, 3220

● Perkin-Elmer 3250

Interdata was spun out of Perkin-Elmer as Concurrent Computer Corporation.

Photographs:

● Interdata 4 (courtesy of Carl Friend)

● Interdata 70 (courtesy of Tony Farrell)

● Interdata 70 front panel detail (courtesy of Tony Farrell)

Option                  Description                         Capacity
16b CPU and memory      Interdata 3, 4, 7/16, 8/16, 8/16e   8KB-64KB (256KB on 8/16e)
32b CPU and memory      Interdata 7/32, 8/32                64KB-1024KB
32b options             double precision floating point
DMA                     Selector channels                   1-4
Console                 KSR-33 Teletype
                        PASLA-based terminal
Paper tape              reader and punch
Clocks                  line time clock
                        precision real-time clock
Line printer            line printer
Terminal multiplexor    PASLA-based terminal multiplexor    32 lines
Disk                    floppy disk                         256KB
                        cartridge disk                      2.5MB/10MB
                        mass storage module controller      13.5MB - 268MB
Magnetic tape           9 track magnetic tape

Updated 18-Jun-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


HP 2100 Simulator Configuration

Introduced in the late 1960's by Hewlett-Packard Corporation, the HP 2100 series was one of the most widely used families of real-time minicomputers. Among the models in this series were:

● HP 2112

● HP 2114

● HP 2116

● HP 2100

● HP 21MX

● HP 1000

Photographs:

● HP 2112B (courtesy of Jay Jaeger)

● HP 2114A (courtesy of Jeff Moffatt)

Option                  Description                           Capacity
CPU and memory          HP 2116, 2100, or 21MX                4KW - 32KW (1024KW on 21MX)
2116 options            extended arithmetic
2100 options            floating point, memory protection, IOP
21MX options            memory protection, DMS, IOP
Console                 12631A ASR-33 Teletype
Paper tape              12597A reader and punch
Real time clock         12539A/B/C time base generator
Line printers           12653A line printer
                        12845B line printer
Terminal multiplexor    12920A terminal multiplexor           16 lines
Disk                    12557A/13210A cartridge disk          1.25/2.5MW
                        12565A disk pack                      11.95MW
                        13037/7905,7906,7920,7925 disk pack   7.5-60MW
                        12606A/12610A fixed head disk/drum    0.18MW - 1.54MW
Magnetic tape           12559C 9 track magnetic tape
                        13181A/13183A 9 track magnetic tape
Network                 12556B interprocessor link for Access

Updated 18-Feb-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


Honeywell H316 Simulator Configuration

The Honeywell Series 16 was initially developed by Computer Control Company (3C) in the mid-60's. 3C's systems were named DDP-x16. When Honeywell bought 3C, it renamed the models to Hx16. Among the models in this series were:

● DDP-516 (H516) - the original Arpanet IMP

● DDP-116

● DDP-416

● H316

● H716

Photographs:

● H316 (courtesy of Mike Umbricht)

Option                  Description                           Capacity
CPU and memory          H316 CPU                              8KB - 32KB
CPU options             double precision integer arithmetic
Console                 316/516-33 KSR-33 Teletype
Paper tape              316/516-50 paper tape reader
                        316/516-52 paper tape punch
Clock                   316/516-12 real-time clock
Line printer            Analex unbuffered shuttle line printer
Fixed head disk         4400 fixed head disk                  98KW - 1532KW
Moving head disk        4623/4653/4720 disk pack controller   831KW - 8314KW
Magnetic tape           4100 7-track magtape controller

Updated 01-Dec-03 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


MITS Altair 8800 Simulator Configuration

The MITS (Micro Instrumentation and Telemetry Systems) Altair 8800 was announced on the January 1975 cover of Popular Electronics, which boasted you could buy and build this powerful computer kit for only $397. The kit consisted at that time of only the parts to build a case, power supply, card cage (18 slots), CPU card, and memory card with 256 *bytes* of memory. Still, thousands were ordered within the first few months after the announcement, starting the personal computer revolution as we know it today.

Many laugh at the small size of that first kit, noting there were no peripherals and the 256 byte memory size. But the computer was an open system, and by 1977 MITS and many other small startups had added many expansion cards to make the Altair quite a respectable little computer. The "Altair Bus" that made this possible was soon called the S-100 Bus, later adopted as an industry standard, and eventually became the IEEE-696 Bus.

The simulator for the Altair 8800 was developed by Charles Owen. The updated version with a Z80 CPU was developed by Peter Schorn.

Photographs:

● MITS Altair 8800 (courtesy of Data General Corporation)

Option                    Description                                  Capacity
CPU and memory            8080 or Z80 8b                               4KB - 64KB
Console                   MITS 88-2SIO serial interface card, port 1
Paper tape reader/punch   MITS 88-2SIO serial interface card, port 2
Floppy disk               MITS 88-DISK, 1-8 drives                     337.5KB

Updated 21-Apr-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


LGP-30 Simulator Configuration

The LGP-30 was a small scientific and business computer. Introduced in 1956, it used vacuum tube and diode logic and featured 4096 words of drum based memory, and a Flexowriter for input and output. (There was also an optional high-speed paper-tape reader and punch.) The later LGP-21 used transistor logic and a disk memory but was about three times slower.

Photographs:

● LGP-30

● LGP-21 (courtesy of Tom Jennings)

Option                  Description          Capacity
CPU and memory          LGP-30               4KW
                        LGP-21               4KW
Console                 Friden Flexowriter
Paper tape              High speed reader
                        High speed punch

Updated 30-Jan-2004 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


SDS-940 Simulator Configuration

Scientific Data Systems of El Segundo, California, produced a line of 24-bit computers in the early and mid 1960's. The first system, the SDS 900, was succeeded by the SDS 920 and SDS 930. Project Genie at the University of California, Berkeley, modified the SDS 930 to increase its capability and add support for timesharing. The Project Genie timesharing system provided many of the seminal ideas for Tenex and thus TOPS-20.

The SDS 940 is a commercial version of Project Genie's modified SDS 930. It was succeeded by the SDS 9300, which was not compatible. After the 9300, SDS built 32b systems, the SDS (and then XDS) Sigma series.

Option                  Description                             Capacity
CPU and memory          SDS 940                                 16KW - 64KW
CPU options             Genie mode or SDS 940 mode I/O decoding
Console                 integral typewriter
Paper tape              integral tape reader
                        integral paper tape punch
Real time clock         integral
Line printer            line printer
Drum                    Project Genie drum                      1376KW
Fixed head disk         rapid access disk                       2097KW
Moving head disk        9164 moving head disk                   16777KW
Magnetic tape           7-track magnetic tape

Updated 21-Apr-2003 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)


Change Log For V3.4

The change log for the previous version (V3.3) is here.

SCP and libraries:

● Fixed ASSERT code

● Revised syntax for SET DEBUG (from Dave Bryan)

● Revised interpretation of fprint_sym, fparse_sym returns

● Moved DETACH sanity tests into detach_unit

● Added test for WSAEINPROGRESS (from Tim Riker)

PDP-10:

● Fixed TU bug, ERASE and WREOF should not clear done (reported by Rich Alderson)

● Fixed TU error reporting

PDP-11:

● Fixed TU error reporting

Interdata 16b:

● Fixed bug in show history routine (from Mark Hittinger)

● Revised examine/deposit to do words rather than bytes

Interdata 32b:

● Fixed bug in initial memory allocation

● Fixed bug in show history routine (from Mark Hittinger)

● Revised examine/deposit to do words rather than bytes

HP2100 (all changes and fixes from Dave Bryan)

● CPU: reorganized CPU options

● CPU1: reorganized EIG routines

● Added FFP support


Updated 03-May-2005 by Bob Supnik (bob AT supnik DOT org - anti-spam encoded)

http://simh.trailing-edge.com/simh_faq.txt

SIMH FAQ, 31-Mar-2004

1 General

1.1 What is SIMH?
1.2 Why was SIMH written?
1.3 What is the history of SIMH?
1.4 Who writes and maintains SIMH?
1.5 How is SIMH licensed?
1.6 How is SIMH distributed?
1.7 Which computer systems can SIMH simulate?
1.8 Which host systems does SIMH run on?
1.9 What software packages are available for use with the SIMH simulators?
1.10 Where can I get more information about SIMH?

------

2 Operational

2.1 How do I install SIMH on Windows?
2.2 How do I install SIMH with Ethernet support on Windows?
2.3 How do I install SIMH on Unix?
2.4 How do I install SIMH on VMS?
2.5 How do I transcribe a real CD for use with SIMH?
2.6 How do I transcribe other archival media for use with SIMH?
2.7 How can I get text files in and out of SIMH?
2.8 How can I get binary files in and out of SIMH?
2.9 Can I connect real devices on the host computer to SIMH?
2.10 My Windows host can't communicate with the PDP-11 or VAX over Ethernet; why?

------

3 Writing and Debugging New Code

3.1 What resources are available for writing new simulators?
3.2 What debugging facilities are available in SIMH?
3.3 When do I need to use the host for debugging a simulator?
3.4 What is the release process for SIMH?

------

4 VAX

4.1 Where can I get software and hobbyist licenses for the VAX?
4.2 How do I install VMS?
4.3 How do I install NetBSD?
4.4 How do I install Ultrix?
4.5 What's the CPU serial number for my hobbyist license PAK?
4.6 How do I change the simulator from a VAXserver 3900 to a MicroVAX 3900?
4.7 Is there an example of the simulator running VMS?
4.8 How can I import files into a simulated VMS environment?
4.9 How can I export files from a simulated VMS environment?

------

5 PDP-11

5.1 When installing RSTS/E from simulated magtape, the installation process hangs with no error message; why?

======1. General Questions ======

1.1 What is SIMH?

SIMH is the Computer History Simulation system. It consists of simulators for approximately 20 different computers, all written around a common package and set of supporting libraries. SIMH can be used to simulate any computer system for which sufficient detail is available, but the focus to date has been on simulating computer systems of historic interest.

------

1.2 Why was SIMH written?

Significant portions of the computing past are being irretrievably lost, as old systems are scrapped, documentation and software are thrown out, media become obsolete or unreadable, and inventors and pioneers die. SIMH was written as a vehicle to allow the computing past to be made accessible to a wider audience, for recreational and educational purposes. SIMH preserves historic computers as portable software that can be run on any modern system. SIMH also preserves representative software packages for these systems. With SIMH, anyone with a desktop computer can call up and run significant samples from the computing past, at any time.

------

1.3 What is the history of SIMH?

The SIMH project started in 1993, at the suggestion of Larry Stewart of DEC. Its immediate purpose was to preserve the fading hardware and software record of early minicomputers. Since then, the project has been expanded to include other important systems, spanning the history of computing from the late 50's to the late 80's.

SIMH's core design is based on an earlier simulation system called MIMIC. MIMIC was written in the late 1960's at Applied Data Research, by Mike McCarthy, Len Fehskens, and Bob Supnik. MIMIC was a mini-computer simulator that ran on the PDP-10. Its purpose was to facilitate the development and debugging of real-time embedded systems by using the PDP-10 timesharing environment for program development, instead of the limited facilities of the native minicomputer environments. Ironically, given SIMH's mission to preserve the computing record, all machine-readable copies of MIMIC have been lost.

------

1.4 Who writes and maintains SIMH?

Many people have contributed, and continue to contribute, to SIMH. The full list of contributors can be found on the SIMH web site. Bob Supnik coordinates SIMH development.

------

1.5 How is SIMH licensed?

SIMH is licensed under a modified X-Windows license. This license allows more or less unrestricted use of the sources and binaries. The license is included with the documentation and is also included in every source module. The software packages are available under various terms and conditions; see the documentation included with each software package.

------

1.6 How is SIMH distributed?

SIMH is distributed in source form from the SIMH web site, in the form of a Zip archive. For Windows users, pre-compiled binaries are also available.

------

1.7 Which computer systems does SIMH simulate?

SIMH simulates the following computer systems:

Manufacturer                    Model

Digital Equipment Corporation   PDP-1, PDP-4, PDP-7, PDP-8, PDP-9, PDP-10, PDP-11, PDP-15, VAX
Data General Corporation        Nova, Eclipse
IBM Corporation                 1130, 1401, 1620, System 3
GRI Corporation                 GRI-909
Honeywell Corporation           H316/516
Hewlett Packard Corporation     HP2116, HP2100, HP21MX
Interdata Corporation           16b systems, 7/32, 8/32
Scientific Data Systems         SDS-940
MITS                            Altair 8080, Altair Z80
Royal-Mcbee                     LGP-30, LGP-21

The documentation contains more details on supported models and peripherals.

------

1.8 Which host systems does SIMH run on?

Host System        Compiler         Comments

OpenVMS/VAX        DEC C            no 64b support; no Ethernet support
OpenVMS/Alpha      DEC C            Ethernet support provided in pcap-vms

Windows 9x or      Mingw/gcc or     requires WinPcap for Ethernet support
Windows 2000 or    Visual C++ or
Windows XP         Borland C++

Mac OS/X                            requires libpcap for Ethernet support

Linux              gcc              requires libpcap for Ethernet support
Tru64 UNIX         DEC C            no Ethernet support
AIX                                 no Ethernet support
Solaris                             requires libpcap for Ethernet support
HP/UX                               no Ethernet support
NetBSD             gcc              requires libpcap for Ethernet support
OpenBSD            gcc              requires libpcap for Ethernet support
FreeBSD            gcc              requires libpcap for Ethernet support

OS/2               EMX              no Ethernet support

------

1.9 What software packages are available to run on SIMH?

The list of available software packages can be found on the SIMH web site.

------

1.10 Where can I get more information on SIMH?

The SIMH web site is http://simh.trailing-edge.com.

======2 Operational Questions ======

2.1 How do I install SIMH on Windows?

The simplest way is to download the pre-compiled binaries. Unzip these into the directory where you want to run SIMH. You can then run whichever binary you want.

If you want to run the VAX emulator, you will also need files ka655.bin and ka655x.bin from the source kit.

------

2.2 How do I install SIMH with Ethernet support on Windows?

Separate pre-compiled binaries contain Ethernet support. Before running these binaries, you must download and install the WinPCAP AutoInstaller from

http://winpcap.polito.it

This creates a network packet driver in Windows for SIMH to attach to.

To use network support, you must either be an administrator on the Windows machine (implied in Windows 9X), or you must set the windows packet driver to autostart when the system boots; see the WinPCAP FAQ page for detailed information on how to do this.

------

2.3 How do I install SIMH on Unix?

- Unzip the archive of sources to a new directory. You must specify the -a option to unzip for proper conversion of Windows cr-lf sequences to UNIX newline sequences.

- If your system supports gmake, you can compile the simulators with the command:

% gmake all

- If you want Ethernet support in the PDP-11 and VAX, you should compile the simulators with the command:

% gmake USE_NETWORK=1 all

Note that Ethernet support is available ONLY on Linux, NetBSD, and OpenBSD.

------

2.4 How do I install SIMH on VMS?

Download the SIMH source kit, and UNZIP it using the /TEXT=AUTO qualifier to the directory that you want SIMH to reside in. Use MMK or MMS and the descrip.mms file to build the binaries.

On a VAX use:

$ MMx

On an Alpha use:

$ MMx/MACRO=("__ALPHA__=1") !Without ethernet support
$ MMx/MACRO=("__ALPHA__=1","__PCAP__=1") !With ethernet support

UNZIP can be found on the VMS freeware CDs, or from www.info-zip.org
MMK can be found on the VMS freeware CDs, or from www.madgoat.com
MMS can be licensed from HP/Compaq/Digital.

Note that the PDP-10 emulator cannot be built and used on VAX/VMS, because the DEC C compiler for VAX/VMS does not support 64-bit integers. DEC C on Alpha VMS has the required 64-bit capability to build and run all of the emulators. Ethernet support is only available on Alpha VMS 7.3-1 and above.

------

2.5 How do I transcribe a real CD for use with SIMH?

- On UNIX, you can copy a CD to an ISO file with the dd command:

% dd if=/dev/raw_cd_device of=/path/cdimage.iso

Linux, and many Unix variants, support direct access to the CD ROM from SIMH:

sim> set rq1 cdrom
sim> att rq1 /dev/cdrom_drive

- On Windows, there are quite a few products that can do this. The two most common products are detailed below. Make sure to disable any antivirus software before proceeding. Antivirus software tends to interfere with the smooth flow of data from the CD and will occasionally transform the data in strange and unexpected ways to 'protect' you.

1) Roxio EZ-CD Creator 5.x
Go to the Disc menu and select Disc Info (there will be a delay).
Select the track shown, then click the Read Track button.
Enter the Save file name, then OK.

2) Nero 5.5
Select Recorder|Save Track
Select the track, set the output filename
In Options, you may need to set the Read Speed down; the VMS Hobbyist CD didn't work after a 52x read, but worked fine at 8x
Click GO
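
The UNIX step in this answer amounts to a block-by-block copy of the raw CD device into an ordinary file. As a rough sketch of what that copy does, the Python function below reads fixed-size blocks until end of input. The function name copy_image, the 2048-byte block size, and the file names are illustrative assumptions; nothing here is part of SIMH.

```python
import os
import tempfile


def copy_image(src_path, dst_path, block_size=2048):
    """Copy src_path to dst_path block by block; return the bytes copied.

    Hypothetical helper, not part of SIMH. On a real system src_path
    would be the raw CD device and dst_path the image file to create.
    """
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # an empty read means end of input
                break
            dst.write(block)
            total += len(block)
    return total


if __name__ == "__main__":
    # Demonstrate on a temporary file standing in for the CD device.
    with tempfile.TemporaryDirectory() as tmp:
        fake_cd = os.path.join(tmp, "fake_cd.bin")
        image = os.path.join(tmp, "cdimage.iso")
        with open(fake_cd, "wb") as f:
            f.write(os.urandom(5 * 2048 + 100))
        print(copy_image(fake_cd, image), "bytes copied")
```

The resulting image file can then be attached to the simulated drive in the same way as the device path shown above.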

------

2.6 How do I transcribe other archival media for use with SIMH?

You must have access to a real system that can read the media to be transcribed (e.g., a system with a working DECtape drive to read a DECtape). Most systems have utilities to copy raw data to a disk file; that file can then be transferred over the console serial line to a system with an Internet link. Utility programs are available to convert raw data streams to SIMH format.

------

2.7 How can I get text files in and out of SIMH?

Since SIMH supports the universal serial interface using TELNET, text can be transferred using one of the serial line transfer protocols (X/Y/Zmodem, Kermit) or using standard cut and paste techniques, if the host's TELNET program supports it.

To use the TELNET feature, connect to the SIMH machine using TELNET, and set the target environment into a 'receive' mode. This is usually something like running a text editor. Then tell the TELNET program to 'send', 'transfer', or 'paste' the text that you want sent into the SIMH system.

To get text out of the system, have the TELNET program either log the output, or if the TELNET program supports a backscroll region you can use that. Tell the SIMH system to 'type' or 'cat' the file, sending the output to the TELNET device, where you can edit it into a text file.

Many TELNET programs also support transferring large files via X/Y/ZModem or Kermit, which you can use as long as the SIMH system has the appropriate matching program.

C-Kermit from Columbia University (http://www.columbia.edu/kermit) is probably the most universal way to transfer files in and out of SIMH systems.

If the SIMH system supports Ethernet connectivity (PDP-11, VAX), you can also use the various network copy programs (FTP, DECNET) to transfer files.

Finally, you can "print" text files to the simulated line printer. Printer output is automatically formatted as an ASCII text file.

------

2.8 How can I get binary files in and out of SIMH?

Since SIMH supports the universal serial interface using TELNET, binary files can be transferred using one of the serial line transfer protocols (X/Y/ZModem, Kermit) or by converting the binary to a text-encoded file (HEXify, UUENCODE, VMShare, etc.) and transferring it in text mode (see section 2.7).
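
To illustrate the text-encoding route, here is a minimal Python sketch that turns arbitrary bytes into short lines of hexadecimal text and back again. It is a generic hex encoding, not the file format of HEXify, UUENCODE, or VMShare, and the function names and the 32-bytes-per-line width are assumptions chosen for the example. The receiving side would need a matching decoder.

```python
import binascii


def to_hex_lines(data, bytes_per_line=32):
    """Encode bytes as a list of ASCII hex lines (hypothetical helper)."""
    hexed = binascii.hexlify(data).decode("ascii")
    width = bytes_per_line * 2  # two hex digits per byte
    return [hexed[i:i + width] for i in range(0, len(hexed), width)]


def from_hex_lines(lines):
    """Decode lines produced by to_hex_lines back into the original bytes."""
    return binascii.unhexlify("".join(line.strip() for line in lines))


if __name__ == "__main__":
    original = bytes(range(256))
    lines = to_hex_lines(original)
    print(lines[0])
    # The round trip must reproduce the input exactly.
    print(from_hex_lines(lines) == original)
```

Hex encoding doubles the size of the data, which is the price of keeping every character printable for a text-mode transfer.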

Many TELNET programs also support transferring large files via X/Y/ZModem or Kermit, which you can use as long as the SIMH system has the appropriate matching program.


C-Kermit from Columbia University (http://www.columbia.edu/kermit) is probably the most universal way to transfer files in and out of SIMH systems.

If the SIMH system supports Ethernet connectivity (PDP-11, VAX), you can also use the various network copy programs (FTP, DECNET) to transfer files.

------

2.9 Can I connect real devices on the host computer to SIMH?

At the moment, Ethernet is the only supported real device.

------

2.10 My Windows host can't communicate with the PDP-11 or VAX over Ethernet; why?

Due to the inherent limitations of WinPCAP, the SIMH system _CANNOT_ communicate with the host on the primary interface. To establish communications between SIMH and a PC host, add a second Ethernet controller, attach both controllers to the same hub, and attach SIMH to the second controller. The host and SIMH will now be able to communicate across the physical network connection.

======3 Writing and Debugging New Code ======

3.1 What resources are available for writing new simulators?

The SIMH web site contains documentation on the internals of SIMH, as well as specific help for writing new peripherals for several of the popular simulators.

------

3.2 What debugging facilities are available in SIMH?

Most simulators provide the following debugging capabilities:

- Symbolic assembly and disassembly of memory contents.
- Numeric examination and modification of the data store of any simulated device.
- Numeric search on both memory and device data.
- Visibility to simulator internal structures, such as the event queue.
- An unlimited number of instruction breakpoints.
- Proceed counts on breakpoints.
- Automatic execution of simulator commands on a breakpoint.
- Stepped execution (from single step to 'n' steps).
- A PC change queue, usually 64 instructions deep.

Specific simulators may provide additional features, such as an instruction history buffer, CPU and/or device logging, and breakpoints on memory reads and writes.

------

3.3 When do I need to use the host debugger for debugging a simulator?

While a simulator is being debugged, its execution of instructions or debugging support code may be unreliable. During this process, the programmer may need to use the host debugger to stop in the middle of an instruction execution, or to trap an error condition. Host debugger breakpoints should be invisible to the simulator; with the exception of clock calibration, all simulator events are driven off the event queue rather than real-world events.

If the programmer needs to force a simulator stop from the host debugger, most simulators provide an "address stop" global variable. Setting this variable to 1 will cause the simulator to stop after completing the current instruction.

------

3.4 What is the release process for SIMH?

SIMH is released whenever a significant number of new features, or important bug fixes, has accumulated. This has averaged every 4-8 weeks. The major version number only changes when there is a major restructuring of SIMH's internal structures. The minor version number is changed when the format of the save/restore file must be updated.

====== 4 VAX ======

4.1 Where can I get software and hobbyist licenses for the VAX?

HP (formerly Compaq, formerly DEC) provides licenses for OpenVMS for hobbyist use. A description of the hobbyist license program can be found at http://www.montagar.com/hobbyist/.

------

4.2 How do I install VMS?

To install VMS, you will need a distribution CD ROM. Any version after VMS 5.5-2 should run on the simulator.

- Transcribe the distribution CD ROM to an ISO-format CD image file. (See question 2.5 for information on how to do this.)
- Set drive RQ1 to be a CD ROM.
- Attach the CD ROM image file to simulated drive RQ1.
- Set drive RQ0 to be the type of disk you want. Be sure that the disk is large enough to hold VMS.
- Attach a blank disk image file to simulated drive RQ0.
- Boot the CPU.
- When the self-test code completes, boot the CD ROM. (The console refers to simulated drive RQ1 as DUA1.)
- Use standalone backup to restore the CD ROM contents to the simulated disk.

sim> set rq0 rd54
sim> set rq1 cdrom
sim> att rq0 new_vms.dsk
sim> att rq1 cd_rom_image.iso
sim> boot cpu
:
>>> boot dua1

$ (prompt from standalone backup)

A writeup on the procedure can be found on the VMS hobbyist site.

------

4.3 How do I install NetBSD?

Directions for installing NetBSD are on the NetBSD web site, at http://www.netbsd.org/Ports/vax/emulator-howto.html.

------

4.4 How do I install Ultrix?

Ultrix is not presently licensed for hobbyist use. If you have a valid license for Ultrix, and distribution tapes for a version that supports the MicroVAX 3900 series (V4 or later), then you should be able to install Ultrix on the simulator.

- Transcribe the distribution tapes to SIMH-format tape image files. (See question 2.6 for information on how to do this.)
- Mount the installation tape image on simulated drive TQ0.
- Set drive RQ0 to be the type of disk you want. Be sure that the disk is large enough to hold Ultrix.
- Mount a blank disk image file on simulated drive RQ0.
- Boot the CPU.
- When the self-test code completes, boot the installation tape.
- The installation tape will guide you through the installation of Ultrix.

sim> set rq0 rd54
sim> att rq0 new_vms.dsk
sim> att tq0 ultrix_install.tap
sim> boot cpu
:
>>> boot mua0

(Ultrix installation dialog)

------

4.5 What's the CPU serial number for my hobbyist license PAK?

On a MicroVAX 3900, the CPU serial number is not readable and can be an arbitrary value. 12345 will work fine.

------

4.6 How do I change the simulator from a VAXserver 3900 to a MicroVAX 3900?

The system type is controlled by a "magic byte" in the CPU's boot ROM. By default, the system type is a VAXserver 3900. To change the type to a MicroVAX 3900, patch the boot ROM as follows:

sim> set ptr ena
sim> att ptr ka655.bin
sim> ie ptr 4
4: 2 1
sim> det ptr

and reboot the simulated VAX.

------

4.7 Is there an example of the simulator running VMS?

This example assumes you are trying to emulate a MicroVAX 3900 with 64MB of memory, with a single 1GB disk drive, a CDROM, and an Ethernet controller.

The host OS is Windows NT/2000/XP, and you have previously dumped the contents of the VMS Hobbyist CD to a disk file as detailed in 2.5, and have loaded WinPCAP on the system for Ethernet support. Other host OS's will look similar but will have different file name syntax.

c:\simh> vax                          ; run VAX emulator
sim> set cpu 64m                      ; set memory size to 64MB
sim> load -r vax\ka655.bin            ; load the MicroVAX 3900 console ROM
sim> attach NVR vax\ka655.nvr         ; create/load a Non-Volatile RAM file
sim> set LPT disable                  ; disable devices we don't want/need
sim> set TQ disable                   ; "
sim> set rq0 ra90                     ; set disk 0 to 1GB (RA90 size)
sim> attach rq0 vax\vaxsys.dsk        ; create/use disk file
sim> set rq1 rrd40                    ; set disk 1 as a cdrom
sim> attach -r rq1 vax\hobbyist.dsk   ; attach cdrom dump file as read-only
sim> set rq2 offline                  ; turn off disk rq2
sim> set rq3 offline                  ; turn off disk rq3
sim> attach xq eth0                   ; attach to host ethernet controller
sim> b cpu                            ; start (boot) VAX console

KA655-B V5.3, VMB 2.7
 1) Dansk                             ; will not appear if the controlling
 ..                                   ; keyboard doesn't support multi-
15) Svenska                           ; national characters!
(1..15): 5
Performing normal system tests.
40..39..38..37..36..35..34..33..32..31..30..29..28..27..26..25..
24..23..22..21..20..19..18..17..16..15..14..13..12..11..10..9..
8..7..6..5..4..3..
Tests completed.
>>> show device                       ; tell console to show all devices
UQSSP Disk Controller 0 (772150)
-DUA0 (RA90)
-DUA1 (RRD40)

Ethernet Adapter 0 (774440)
-XQA0 (08-00-2B-AA-BB-CC)
>>> b dua1                            ; tell console to boot cdrom
(BOOT/R5:1 DUA1)

2..1..0

------

4.8 How can I import files to a simulated VMS environment?

- Use a CD burner program, like Easy CD Creator or Nero, to create an ISO 9660 CD image containing the files you want to import. Note that file names are limited to DOS '8.3' conventions.
- Attach the simulated CD image to a simulated CD drive.
- Mount the simulated CD as an ISO 9660 file system under VMS.
- Copy the files you need from the simulated CD to the simulated disk.

(Thanks to Tim Stark for this suggestion.)

------

4.9 How can I export files from a simulated VMS environment?

- Utility ODS2 (available on the Web) can read an ODS-2 disk image and copy files from that image to the host file system.
- Text files can be printed to the simulated line printer, as described above.

====== 5 PDP-11 ======

5.1 When installing RSTS/E from simulated magtape, the installation process hangs with no error message; why?

- RSTS/E installation from magnetic tape requires that the tape be write locked.

http://simh.trailing-edge.com/docs/hp2100_algol_howto_doc.txt

How To Use HP 2100 Algol On SIMH by Paulo da Silva

DISCLAIMER
======

THIS DOCUMENT SHOWS HOW TO USE HP ALGOL WITH THE HP2100 SIMH SIMULATOR USING TAPES FROM THE INTERNET. YOU MUST ENSURE THAT YOU ARE ALLOWED TO USE THESE TAPES. THE AUTHOR (PAULO DA SILVA) IS NOT RESPONSIBLE FOR ANY UNAUTHORIZED USE OF ANY SOFTWARE HERE CITED OR ANY OTHER. THE AUTHOR (PAULO DA SILVA) IS ALSO NOT RESPONSIBLE FOR ANY CONSEQUENCES RESULTING DIRECTLY OR INDIRECTLY FROM USING THIS DOCUMENT.

IF YOU DISAGREE, PLEASE DO NOT CONTINUE READING.

Paulo da Silva ======

1. After installing SIMH HP2100, you may test it using the included basic1.abs as indicated in the simulator documentation.

2. From http://oscar.taurus.com/~jeff/2100/index.html, get the following files:

.abs
bcsioc.rel
bcslib.rel
bcsloadr.rel
bcsprep.abs
bcsptp.rel
bcsptr.rel
bcstty.rel
sio16k11.abs

BE SURE YOU HAVE PERMISSION TO USE THESE FILES.

3. Prepare a bcs16k.abs tape (linker) using bcsprep.abs and the .rel files you got. (See step 10 for how to do this.)

4. The downloaded files include a configured sio tape. It is configured for 16k words using the following units:

tty 11
ptr 13
ptp 20

5. Change the default device numbers and load the sio tape, as follows:

set clk dev=12
set ptr dev=13
set ptp dev=20
load sio16k11.abs

6. Load the Algol compiler:

load algol.abs

7. Compile your HP Algol program from an ASCII file helloworld.alg:

------
HPAL,B,"PROG"

BEGIN
WRITE(2,#("Hello world"));
END$
------

att ptp helloworld.rel
att ptr helloworld.alg
run 100

8. Link your .rel file(s) with the bcslib.rel. You need to reassign the device numbers to the default values:

set clk dev=13
set ptr dev=10
set ptp dev=12
load bcs16k.abs
att ptp helloworld.abs
att ptr helloworld.rel

Set bits 15 and 14 of the switch register to 1:

de s 140000
run 2

Repeat the following steps for any other tape you may have (other procedures for example):

de s 140000
att ptr OTHER_FILE.rel
run

Finally, link the library, also setting bit 2 of the switch register:

de s 140004
att ptr bcslib.rel
run
de s 140004
run
det ptr
det ptp
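The switch register values in step 8 are octal numbers built from individual bit positions. The short sketch below shows the arithmetic; the function name `switch_register` is illustrative and not part of SIMH.

```python
def switch_register(*bits: int) -> str:
    """Return the 16-bit switch register value, in octal, with the given bits set."""
    value = 0
    for bit in bits:
        if not 0 <= bit <= 15:
            raise ValueError(f"bit {bit} is outside the 16-bit register")
        value |= 1 << bit
    return f"{value:06o}"

if __name__ == "__main__":
    print(switch_register(15, 14))      # 140000, as in "de s 140000"
    print(switch_register(15, 14, 2))   # 140004, as in "de s 140004"
```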

9. You may now run your standalone program.

load helloworld.abs
de s 000000
run 2

------

10. To prepare a BCS tape, duplicate the commands shown in this log:

HP 2100 simulator V3.1-0
sim> load bcsprep.abs
sim> i s
S: 000000 11
sim> run 2000

HS INP? 10
HS PUN? 12

FWA MEM? 200
LWA MEM? 37677 ; THIS IS FOR 16k. For 32k, use 77677

* LOAD

HALT instruction, P: 02041 (JMP 613)
sim> att ptr bcstty.rel
sim> run

D.00 37126 37677

* LOAD

HALT instruction, P: 02041 (JMP 613)
sim> att ptr bcsptr.rel
sim> run

D.01 36564 37125

* LOAD

HALT instruction, P: 02041 (JMP 613)
sim> att ptr bcsptp.rel
sim> run

D.02 36253 36563

* LOAD

HALT instruction, P: 02041 (JMP 613)
sim> att ptr bcsioc.rel
sim> run

IOC 36026 36252

* TABLE ENTRY

EQT?

HALT instruction, P: 04005 (CLA)
sim> run
11,D.00
10,D.01
12,D.02
/E

SQT?
-KYBD? 7
-TTY? 7
-LIB? 10
-PUNCH? 11
-INPUT? 10
-LIST? 7

DMA? 0

* LOAD

HALT instruction, P: 02041 (JMP 613)
sim> att ptr bcsloadr.rel
sim> run

LOADR 33501 36002

INTERRUPT LINKAGE ?

HALT instruction, P: 02303 (JSB 2122)
sim> run
10,101,I.01
11,102,I.00
12,103,I.02
/E

.SQT. 36003
.EQT. 36011
D.00 37126
I.00 37320
.BUFR 36177
D.01 36564
I.01 36704
D.02 36253
I.02 36370
.IOC. 36026
DMAC1 36247
DMAC2 36250
IOERR 36226
XEQT 36246
XSQT 36245
HALT 35770
.LDR. 35154
.MEM. 35775
LST 33532

*SYSTEM LINK 00200 00247

*BCS ABSOLUTE OUTPUT

HALT instruction, P: 02746 (LDA 3110)
sim> det ptr
sim> att ptp bcs.abs
PTP: creating new file
sim> run

HALT instruction, P: 02763 (LIA 1)
sim> q
Goodbye
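The two LWA MEM? answers quoted in the log (37677 for 16k, 77677 for 32k) both lie 100 octal (64) words below the top of memory. The sketch below reproduces that arithmetic. The reason for the gap is not stated in this document (presumably the space is kept for a resident loader, which is an assumption), and the names `lwa_mem` and `reserved_words` are illustrative.

```python
def lwa_mem(memory_words: int, reserved_words: int = 0o100) -> str:
    """Return, in octal, the last usable word address below a reserved top area."""
    last_address = memory_words - 1          # addresses start at 0
    return f"{last_address - reserved_words:o}"

if __name__ == "__main__":
    print(lwa_mem(16 * 1024))   # 37677
    print(lwa_mem(32 * 1024))   # 77677
```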

http://simh.trailing-edge.com/photos/dgsn.jpg
http://simh.trailing-edge.com/photos/nova840.jpg
http://simh.trailing-edge.com/photos/nova4.jpg
http://simh.trailing-edge.com/photos/eclipse-s130.jpg
http://simh.trailing-edge.com/photos/pdp1.jpg
http://simh.trailing-edge.com/photos/pdp4.jpg
http://simh.trailing-edge.com/photos/pdp5.jpg
http://simh.trailing-edge.com/photos/pdp6.jpg
http://simh.trailing-edge.com/photos/pdp7.jpg
http://simh.trailing-edge.com/photos/pdp7a.jpg
http://simh.trailing-edge.com/photos/pdp8.jpg
http://simh.trailing-edge.com/photos/pdp8s.jpg
http://simh.trailing-edge.com/photos/pdp8i.jpg
http://simh.trailing-edge.com/photos/pdp8e.jpg
http://simh.trailing-edge.com/photos/vt78.jpg
http://simh.trailing-edge.com/photos/decmate_iii.jpg
http://simh.trailing-edge.com/photos/pdp9.jpg
http://simh.trailing-edge.com/photos/pdp10.jpg
http://simh.trailing-edge.com/photos/ki10.jpg
http://simh.trailing-edge.com/photos/decsystem_10.jpg
http://simh.trailing-edge.com/photos/decsystem_20.jpg
http://simh.trailing-edge.com/photos/pdp11_20.jpg
http://simh.trailing-edge.com/photos/pdp11_45.jpg
http://simh.trailing-edge.com/photos/pdp11_05.jpg
http://simh.trailing-edge.com/photos/pdp11_40.jpg
http://simh.trailing-edge.com/photos/pdp11_34.jpg
http://simh.trailing-edge.com/photos/pdp11_70.jpg
http://simh.trailing-edge.com/photos/lsi11.jpg
http://simh.trailing-edge.com/photos/pdp11_60.jpg
http://simh.trailing-edge.com/photos/lsi11_23.jpg
http://simh.trailing-edge.com/photos/pdp11_23.jpg
http://simh.trailing-edge.com/photos/pdp11_24.jpg
http://simh.trailing-edge.com/photos/pdp11_44.jpg
http://simh.trailing-edge.com/photos/pdp11_83.jpg
http://simh.trailing-edge.com/photos/pdp11_94.jpg
http://simh.trailing-edge.com/photos/pdp12.jpg
http://simh.trailing-edge.com/photos/pdp14.jpg
http://simh.trailing-edge.com/photos/pdp15.jpg
http://simh.trailing-edge.com/photos/pdp16m.jpg
http://simh.trailing-edge.com/photos/hp2112b.jpg
http://simh.trailing-edge.com/photos/hp2114.jpg
http://simh.trailing-edge.com/photos/honeywel.jpg
http://simh.trailing-edge.com/photos/ibm1401.jpg
http://simh.trailing-edge.com/photos/ibm1620.jpg
http://simh.trailing-edge.com/photos/system3.jpg
http://simh.trailing-edge.com/photos/ibm1130.jpg
http://simh.trailing-edge.com/photos/id4.jpg
http://simh.trailing-edge.com/photos/I70_sized.jpg
http://simh.trailing-edge.com/photos/I70-frontpanel_sized.jpg
http://simh.trailing-edge.com/photos/altair.jpg
http://simh.trailing-edge.com/photos/lgp30.jpg
http://simh.trailing-edge.com/photos/lgp21.jpg

Maxwell M. Burnet
Robert M. Supnik

Preserving Computing’s Past: Restoration and Simulation

Digital Technical Journal Vol. 8 No. 3 1996

Restoration and simulation are two techniques for preserving computing systems of historical interest. In computer restoration, historical systems are returned to working condition through repair of broken electrical and mechanical subsystems, if necessary substituting current parts for the original ones. In computer simulation, historical systems are re-created as software programs on current computer systems. In each case, the operating environment of the original system is presented to a modern user for inspection or analysis. This differs with computer conservation, which preserves historical systems in their current state, usually one of disrepair. The authors argue that an understanding of computing’s past is vital to understanding its future, and thus that restoration, rather than just conservation, of historic systems is an important activity for computer technologists.

The Computing Past

The continuous improvements in computing technology cause the rapid obsolescence of computer systems, architectures, media, and devices. Since old computing systems are rarely perceived to have any value, the danger of losing portions of the computing record is significant. When a computing architecture becomes extinct, its software, data, and written and oral records often disappear with it.

Older computer systems embody major investments in software, the value of which may persist long after the systems have lost their technical relevancy. For example, the PDP-11 computer has not been a leading-edge architecture since the introduction of 32-bit systems in the late 1970s and has not received a new hardware implementation since 1984. Nonetheless, PDP-11 systems continue to be used worldwide, particularly in real-time and control applications. The unavailability of suitable replacements of worn-out original parts is a serious issue for PDP-11 systems still in use.

Another area of potential loss is data. In recent years, archival storage media have undergone rapid technologic evolution, and the industry standards of computing’s first 30 years, such as 0.5-inch magnetic tape, are now antiques. Salvaging data from original media is an industry-wide problem and has generated a small cottage industry of specialists in data recovery. This problem will only proliferate, as transitions in media types accelerate. Ten years from now, the large-diameter optical disks used for today’s archives will look as quaint as DECtape and magnetic tape storage systems do to current computer users.

Finally, the disappearance of older equipment typically entails loss of information: not only design sketches, blueprints, and documentation but also the folklore about these systems. The absence of systematic archiving, as well as the absence of a perceived value of the archived data, causes continual information decay about design and operational details.

This paper describes two techniques for preserving computing systems of historical interest. The first section of the paper discusses the restoration of old computers to working order. It also includes a description of the Australian Museum collection and the process of restoring a particular PDP-11 minicomputer. The second section discusses the simulation of old computers on modern systems. It describes a simulation framework called SIM, which has been used to implement simulators for the PDP-8, PDP-11, PDP-4/7/9/15, and Nova minicomputers.

Restoring Old Computers

Since the computer became a mass-produced item in the late 1960s, its typical life cycle has consisted of initial installation, rental or depreciation for about five years, retention and use for a few more years (just in case), and then retirement and a trip to the refuse dump. There is only a brief window of opportunity to collect old computers at the end of their working life. Once that window is closed, the computers are gone forever.

The Australian Museum Collection

In Sydney, Australia, this window of opportunity first became apparent in 1971, when the early PDP systems reached the ends of their life cycles. Digital’s Australian subsidiary began collecting systems by a creative program of trade-ins for new equipment.[1] It was especially urgent to obtain examples of the 12-bit, 18-bit, and 36-bit PDP series, as they were relatively few in number. Table 1 lists the percentage of available units that have been collected. The status of each is given as

■ Static: can never be made to work for various reasons
■ Restorable: could be made to work with enough care, patience, time, and effort
■ Working: running its operating system the last time it was turned on

Once a representative sample of the early PDP systems had been collected, the urgency abated. Hundreds of PDP-11 and VAX systems were then brought to Australia; the window of opportunity for collecting them is still open.

The collection has grown significantly during the last 25 years. At the present time, we have in Sydney a comprehensive collection of most early Digital machines, including hardware, manuals, software, and spares (see Table 2). The collection is catalogued in a 6,000-line database that resides, appropriately, on a MicroVAX I computer, running the first version of the MicroVMS operating system. Figure 1 shows an example from the collection, a PDP-8/E computer system with peripheral equipment.

The goals of the collection are varied and are summarized in Table 3. Apart from the academic challenge of keeping all old data media running, there is the responsibility to ensure that they can be kept alive and available. The extensive variety of media types offered by Digital alone in only 30 years is summarized in Table 4. The evolving status of the collection has been reported at several Australian DECUS Symposia.[2,3] The restoration of the Australian collection will probably ensure a retirement job for the curator for the next 30 years!

General Issues in Restoration

Restoration is a painstaking and time-consuming process. The goal of restoration is to return a system to a state where it will reliably run a major operating system and offer as many media conversion facilities of the vintage as possible. Fortunately, computers do not deteriorate greatly in storage, provided the storage area is dry. (One item that does decay dramatically is the black foam used to line side panels and to separate

Table 1
Early Digital CPUs in Australia

Model      Number Brought   Number in Museum
Name       to Australia     Collection         Condition
PDP-5             1                1           Restorable
PDP-6             1                1           Some items
PDP-7             1                1           Static
PDP-8            28                3           Working
PDP-8/S          20                2           Static
LINC-8            2                2           Restorable
PDP-9             7                1           Restorable
PDP-10            8                1           Some items
PDP-12            2                2           Restorable
PDP-8/I          24                2           Restorable
PDP-8/L          21                2           Restorable
PDP-15           10                1           Static
PDP-8/E          90                4           Working
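Table 1 gives counts rather than percentages. As a small worked example, assuming the figures transcribed in the table are correct, the percentage collected for each model (and overall) can be computed as sketched below. The variable and function names are illustrative.

```python
# (model, number brought to Australia, number in museum collection), from Table 1
TABLE_1 = [
    ("PDP-5", 1, 1), ("PDP-6", 1, 1), ("PDP-7", 1, 1), ("PDP-8", 28, 3),
    ("PDP-8/S", 20, 2), ("LINC-8", 2, 2), ("PDP-9", 7, 1), ("PDP-10", 8, 1),
    ("PDP-12", 2, 2), ("PDP-8/I", 24, 2), ("PDP-8/L", 21, 2), ("PDP-15", 10, 1),
    ("PDP-8/E", 90, 4),
]

def percent_collected(brought: int, collected: int) -> float:
    """Percentage of the units brought to Australia that are in the collection."""
    return 100.0 * collected / brought

if __name__ == "__main__":
    for model, brought, collected in TABLE_1:
        print(f"{model:<8} {percent_collected(brought, collected):6.1f}%")
    total_brought = sum(row[1] for row in TABLE_1)
    total_collected = sum(row[2] for row in TABLE_1)
    print(f"{'Overall':<8} {percent_collected(total_brought, total_collected):6.1f}%")
```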

Table 2
The Digital Australian Collection (chronological order)

Year  Item           Description                                     Status
1958  138            A/D converter                                   Static
1960  ASR-33         Teletype reader/punch, 110 baud                 Working
1962  KSR-35         Heavy-duty Teletype                             Working
1963  PDP-6          Modules of first Digital computer in Australia  Parts
1963  PDP-5          First minicomputer in Australia                 Working
1967  PDP-7          Third Digital computer in Australia             Static
1965  PDP-8          Classic, table-top model                        Working
1965  PDP-8          Cabinet model                                   Restorable
1965  PDP-8          [...] system                                    Static
1965  PDP-8          Cabinet model, first in New Zealand             Restorable
1965  COPE-45        Remote batch (OEM PDP-8)                        Restorable
1966  PDP-9          18-bit computer                                 Static
1966  KA10           Console of PDP-10 mainframe                     Static
1966  Linc-8         Early medical computer                          Working
1967  PDP-8/S        Serial, under $10,000, CPU                      Static
1967  PDP-8/S        Serial computer                                 Static
1967  DF32           Digital’s first disk, 1/16 Mb                   Static
1967  PDP-9/L        Last transistor logic, 18-bit                   Static
1968  PDP-8/I        Digital’s first IC minicomputer                 Working
1968  PDP-8/L        OEM version of PDP-8/I                          Static
1969  PDP-12         Laboratory computer                             Working
1969  PDP-12         Laboratory computer                             Static
1969  PDP-15         Last of 18-bit family                           Static
1969  KI10           Console of DECsystem-10                         Static
1970  PDP-8/E        Pinnacle of PDP-8 development                   Working
1970  PDP-8/E        Full LAB 8 configuration                        Working
1970  PDP-11/20      The first PDP-11                                Working
1970  CR11           Card reader, 285 cpm                            Working
1971  PDP-8/F        Small PDP-8/E                                   Working
1971  VT05           Digital’s first video terminal                  Working
1971  LA30P          Digital’s first hard-copy terminal              Working
1971  PDP-11/45      Last PDP-11                                     Static
1972  GT40           Graphics                                        Broken
1972  PDP-11/10      Small PDP-11                                    Static
1973  PDP-11E10      First packaged system                           Working
1973  PDP-11/35      Mid-range PDP-11                                Static
1973  PDP-8/A        Last non-chip PDP-8                             Working
1974  PDP-11/40      Mid-range, end-user PDP-11                      Restorable
1975  VT50           Video terminal                                  Working
1975  LA36           DECwriter II printer                            Working
1975  DS310          Desk-based commercial system                    Working
1975  PDP-11/70      Largest PDP-11                                  Restorable
1976  PDP-11/34      Mid-range PDP-11                                Working
1977  PRS01          Portable paper tape reader                      Working
1977  LS120          DECwriter printer                               Working
1977  WS78           Word processor, 8-inch floppy disks             Working
1978  LA120          DECwriter III printer, 180 cps                  Working
1978  VAX-11/780     Original unit of 1 VAX-11/780                   Restorable
1979  VT100          Famous video terminal                           Working
1980  MINC           LSI-11 lab unit with RT-11                      Working
1980  VAX-11/750     Mid-range VAX system                            Restorable
1980  PDT-150        Table-top LSI-11 with RX01 drives               Working
1981  GIGI           Low-cost terminal for schools                   Working
1982  VT125          Video terminal with graphics                    Working
1982  WS278          DECmate I word processor                        Restorable
1982  VAX-11/730     Low-performance VAX system                      Working
1982  LA12           Portable hard-copy terminal                     Static
1982  LQP03          Letter-quality printer                          Working
1982  DECmate II     Word processor on mobile stand                  Working
1982  DECmate II     Word processor                                  Working
1982  Rainbow        Personal computer                               Working
1982  PRO350         Professional PC                                 Working
1983  VT241          Graphics color terminal                         Working
1983  MicroVAX I     Smallest VAX .3 VUP                             Working
1983  VAX-11/725     Lowest cabinet VAX .3 VUP                       Working
1984  LN03           Laser printer                                   Working
1985  MicroVAX II    Famous MicroVAX II                              Working
1986  VAXmate        286-based PC with RX33 drive                    Working
1986  DECmate III    Small word processor                            Working
1987  MicroVAX III   3-VUP MicroVAX II system                        Working
1987  VAX 8250       Dual VAX CPU, BI-based                          Restorable
1989  VAX 9000       Chip set                                        Static
1990  DS3100         Mips UNIX workstation                           Restorable

ribbon cables. After 20 years, it turns into a sticky, gooey mess. It should be removed as soon as possible; otherwise, it falls into the modules and [...]. Replacing it with a modern equivalent can be done but is not essential.)

The first step in restoration is to collect hardware, software, and documentation.

■ Collect the hardware, if possible two or ideally three items of each example. This provides a system to work on and a spare, as well as the ability to make comparisons between units.
■ Collect diagnostic and operating software on original bootstrap media. Sources are very useful, particularly for diagnostics.
■ Collect hardware manuals and schematics.

There is a network of enthusiasts around the world who can help at this stage.

Once the “ingredients” have been collected, the steps needed to restore a 1960s or 1970s vintage machine are as follows:

■ Inspect the hardware for physical safety, particularly the heavy drawers and slide mechanisms.
■ Physically assemble the hardware, checking module allocations, cabling, etc.
■ Carefully inspect the power system, high-voltage sources can kill. Although most of the power wiring material appears to stand the test of time, the early machines often had rather thin coverings on terminals. Safety-first is a principal criterion in restoration, since someday nontechnical people may open the back door.
■ Assemble a minimal system of CPU, memory, and console switch register for initial tests.
■ Power up the computer, checking supply voltages, fans, and front console for signs of life.
■ Use simple routines at the switch register to check for elementary operation.
■ Fit a serial line unit so that a VT or a Teletype console can be used.
■ Get the keyboard echoing to the screen or printer with simple routines.
■ If they are available, run the internal tests of the read-only memory (ROM).

Figure 1
PDP-8/E Computer System

(Figure labels: RX01 dual 8-inch floppy diskettes; TD8E TU56 transfer dual DECtape system; PC8E 300-cps reader, 50-cps punch paper tape; PDP-8/E CPU with extended arithmetic element, 16K words memory, KL8E 2400-baud console, KL8E 2400-baud communication port, DECtape bootstrap, RK05 disk bootstrap, real-time clock; RK05 removable 2.4-MB cartridge disk; storage rack for 10 DECtape systems; H861 power distribution.)

Conventional wisdom would now advise that all the bility is questionable, however, and the procedure is diagnostic routines be run. However, diagnostics were tedious. Many diagnostics were on paper tape, but (philosophically) always used to find bugs in a previ- usually the quickest test is to load a complete paper ously good machine; they are too complex when huge system (such as FOCAL for Digital’s systems). If the chunks of the machine might still be missing. The diagnostics run, the system is probably functional. most practical next step is to get mass storage on-line. Once the CPU, console, and memory are verified, Depending on the manufacturer, the target device additional peripherals can be added, one at a time. It may be a floppy disk drive, a cartridge , pays to take the time and effort to research bus or some form of magnetic tape. With a working mass addresses, vectors, power supply loading, storage device and a bootstrap routine, it becomes and module placement, and to keep a log book with possible to boot a simple operating system (like OS/8 configuration diagrams and results. In general, if the or RT-11 for Digital’s systems). This quickly shows configuration rules are followed, the items will work. whether the machine is working or not. There are few electronic failures, even in 20- or 30- If a mass storage device is not available, the next best year-old modules. When a problem arises, it is usually thing is paper tape. This can be either the system’s address vector strapping, physical damage, or missing rack-mounted reader and punch or the paper tape cables. Corrosion of board contacts can be a problem; reader on an ASR33 or ASR35 console. The relia- they should be cleaned with a clean cloth or cardboard

(for example, a business card), not with a pencil eraser, which leaves residues. Silicon components appear to be very stable and a tribute to the conservative design principles of early computer engineers.

The main components that seem to age are power supply capacitors, fans, and lights. The filter capacitors across the high-voltage sources can short, and reference electrolytic capacitors in power supply regulators can dry out. Although the large capacitors in power supply RC filters have proven to be reliable, some restorers replace them as a matter of course for safety reasons. Small rotary fans may seize if they have logged many hours. Incandescent panel lamps are always failing and can be replaced by modern light-emitting diodes (LEDs) if required. The irony is that the panel lamps are needed only during initial checkout; once the operating system is running, they are rarely used.

Once restored, are old units reliable? Experience proves that they are. A classic PDP-8 system restored in 1988 still turns on happily (untouched) eight years later. A fully configured PDP-8/E system is still working four years after restoration.

Restoring a Minicomputer: A Case Study

An ongoing project is the restoration of a large, UNIBUS-based PDP-11 system with many UNIBUS peripherals attached to it. The project was started using the original PDP-11/20 CPU. Since many PDP-11 peripherals were designed long after the PDP-11/20 CPU, it could not cope with single-board direct memory access (DMA) devices, metal-oxide semiconductor (MOS) memory, and other later inventions. The project refocused on the mid-range PDP-11/34, which in retrospect has proved wise. The PDP-11/34 supports MOS memory, has an LED and push-button console, and represents a mature implementation of the PDP-11 instruction set. It has an optional cache, battery backup, floating-point operation, and the extended instruction set (EIS).

The current configuration occupies three large cabinets in what used to be the dining room of Max Burnet’s house. The virtues of the UNIBUS are many; in particular, it allows modular connection of I/O devices and other components. However, I/O devices of the era often weigh 100 pounds and are mounted in 10-inch drawers; their sheer physical size and weight are disincentives to reconfiguration.

The project currently uses the RT-11 operating system because of its simplicity and extensive device drivers. Eventually, it may be possible to run the RSX-11M and the RSTS/E systems, but there is little to gain from a media conversion point of view, because RT-11 includes utilities for dealing with foreign file formats.

The main difficulties encountered have been associated with the power supply: the DC low signal threads its way through every peripheral. The absence of UNIBUS grant continuity cards can create havoc. Since this PDP-11 system is very large, it is straining the design rules concerning floating vectors, current loading, and bus loads.

Table 3
Goals of the Australian Digital Museum

To preserve one of each model of Digital’s computers
To keep each major Digital operating system working
To have a working unit of each Digital terminal, console, and PC
To provide conversion and archival facilities for old systems media
To preserve significant Digital literature and manuals
To preserve a VAX-11/780 computer as the original unit of 1 VUP
To disseminate instructive and educational material
To educate and amuse our staff, our customers, and the public
To support the DECUS NOP (nostalgic obsolete product) Special Interest Group
To preserve spares, tools, test gear, and documentation to keep the collection working
To preserve and protect these treasures for future generations

Table 4
Digital Data Media from 1960 to 1996

Paper tape
80-column punched and mark sense cards
7-track, half-inch magnetic tape
9-track, half-inch magnetic tape
DECtape and LINCtape
Audiocassette
DECtape II cartridge (TU58)
CompacTape (TK50, etc.)
Quarter-inch cartridge tape
Digital audio tape
8-inch floppy disk
5.25-inch floppy disk
3.5-inch floppy disk
RK05 removable disk
RK06, RK07 removable disk
RL01, RL02 removable disk
RP01…RP06 removable disk
RM03, RM05 removable disk
RC25 removable disk

The CPU and memory are relatively easy to check out. Due to the versatility of the UNIBUS, however, checking out the I/O system is very laborious. Starting with programmed I/O tests works best, followed by interrupt tests, and finally DMA or non-processor reference (NPR) tests. Experience shows that tests need to be rerun whenever a new peripheral is added.

The system currently runs the RT-11 version 5.04 operating system on a configuration comprising

■ PDP-11/34 CPU with real-time clock and bootstraps
■ 256 kilobits of MOS memory
■ RX01 and RX02 floppy disks
■ Dual RL02 disks
■ TU56 dual DECtape storage system
■ TU58 DECtape II storage system
■ Serial line units for console and serial printer
■ CM11 mark sense and CR11 card reader
■ TU60 cassette
■ PC11 paper tape reader and punch

Although the following peripherals are available, they await installation time and effort:

■ LPS-40 analog-to-digital (A/D) converter
■ TU10 magnetic tape
■ TSV03 magnetic tape
■ Cache and commercial instruction set
■ Battery backup kit

The eventual goal is to keep “the last great (UNIBUS) PDP-11” running with almost every UNIBUS peripheral ever made.4 Time will tell.

Simulating Old Computers

A simulator is a program operating on one computer system (known as the host system) which mimics the behavior of another computer system (known as the target system). The simulator’s data is the state of the target computer system: registers, memory, timed events, and so on. The simulator operates on presented state and transforms it, usually by sequential evaluation, in the same manner as would the target computer system.

Simulators typically consist of an execution engine, which performs the state transformations; a simple timed-event mechanism, which supports deferred and asynchronous events such as I/O completions; and a control panel, which provides user access to simulated state. The execution engine is responsible for decoding instructions in simulated memory and performing the specified alterations of simulated machine state. The execution engine keeps track of simulated time in arbitrary units, which may be precise representations of the execution time of the target system, or simple representations of advancing time, such as the number of instructions executed. The event mechanism provides a way to schedule events, such as I/O completion, for later evaluation. It can also implement other time-based mechanisms such as keyboard polling. Finally, the control panel provides access to simulated state as well as control commands such as start and stop. It may also provide more elaborate facilities to support performance instrumentation or debugging.

Historically, simulators have been used for many purposes, including the following:

■ Design of new systems. The simulator mimics the behavior of a future chip or computer system and is used to understand and debug the behavior of the proposed design. For example, prior to fabrication, all modern microprocessors are extensively simulated, first as abstract performance models and then at increasing levels of detail.5–9

■ Debugging for embedded systems. If the simulator contains facilities for program debugging, it becomes a useful tool for debugging programs that run in highly constrained environments such as embedded systems. Simulators can capture more state and provide a wider range of facilities than in situ debuggers. For example, simulators can implement program counter (PC) change queues, data access breakpoints, or precise traps on errors.

■ Replicable event tracing. Most simulators are fully deterministic. Asynchronous events are scheduled based on simple, nonrandom algorithms, such as fixed time-out or calculated seek time. As a result, simulators allow for straightforward replication or playback of complicated sequences, removing the randomness factor that often plagues the debugging of asynchronous software on real systems.

■ Preservation of past software. Simulators can provide migration assistance in the transition from older to newer architectures. Many transitional computer systems have provided simulators for older architectures, typically at the microcode level, to assist customers and developers in preserving their investments in the previous architecture. Examples include the early IBM System/360 series, which had models that simulated the 1401, 1410, 7070, and 7090 families, and the early Digital VAX systems, which included a PDP-11 compatibility mode.10,11

Simulation Levels

Simulators can be written at various levels of detail and thus various levels of fidelity to the target system. Three common levels of simulation are register transfer level (RTL), instruction, and software specific.

An RTL simulator attempts to mimic the major hardware blocks of the target system and to implement its actual logic equations. The goal is absolute

fidelity, the test of which is that no piece of software running on the simulator should behave differently than it would on the target hardware. In practice, such perfect mimicry is difficult to achieve, as it requires a painstaking re-creation of timing detail (for example, the actual acceleration curve of a DECtape storage system) and access to implementation documentation that has often vanished. Nonetheless, some simulators have achieved results very close to this goal: MIMIC, a DECsystem-10 simulator written at Applied Data Research, was able to run CPU- and device-specific diagnostics. (As testimony to the vulnerability of computing’s past, all machine-readable copies of the MIMIC sources appear to have been lost.)

An instruction simulator steps back from the RTL level and tries to simulate at the functional or the behavioral level. System elements are treated as functions that transform state according to the abstract definitions of the system architecture, rather than as logic blocks that transform state based on implementation equations. Instruction simulators sacrifice absolute fidelity to the idiosyncrasies of a particular implementation and focus on the intentions of the architecture specification. As a result, instruction simulators can usually run systems software and applications but can rarely fool diagnostics.

Finally, a software-specific simulation further abstracts the functions of the target system to only those needed by a particular piece of target system software. For example, the OS/8 operating system on the PDP-8 computer does not use program interrupts; a simulator aimed at running only the OS/8 operating system would not need to implement interrupts or even queued events. A recent PDP-11 simulator designed to run the 2.9 BSD UNIX operating system abstracted parts of the PDP-11 system’s interrupt model and could not run other PDP-11 operating systems.12

Simulating Minicomputers: A Case Study

SIM is a portable instruction-level minicomputer simulator implemented in C. Its objectives are to facilitate the study and use of historic computer architectures by making simulated implementations and historic software available to anyone who has a 32-bit computer. It supports the following target architectures

■ PDP-8
■ PDP-11
■ Nova
■ 18-bit PDP series (PDP-4, PDP-7, PDP-9, PDP-15)

and has been successfully ported to the VAX VMS, the Alpha OpenVMS, the Digital UNIX, and the Linux architectures. Ports to the Windows NT and the architectures and to an IBM 1401 simulator are under way.

General Design Considerations

The design of an instruction-level simulator is not technically complicated; indeed, simulating a PDP-8 system is a common problem in undergraduate computer science courses. SIM follows the processor-memory-switch (PMS) structure proposed by Bell and Newell and implemented in MIMIC and countless other simulators since.10,13 The simulated system is a collection of devices, one of which has special properties (the CPU). Each device has state (registers) and one or more units. Each unit has state and fixed- or variable-sized storage. In the CPU device, the storage is main memory. In an I/O device, the storage is the device media. The CPU is distinguished from other devices by having the master routine for instruction execution. This routine is responsible for the sequential evaluation of instructions and for the state transformations that represent simulated execution. The CPU also provides a few systemwide routines, such as symbolic disassembly and input and a binary loader.

The devices interface to a control panel that provides access to simulated state and control over execution. The available commands in SIM are listed in Table 5.

The control panel also includes routines that are needed by most simulators, such as event queue maintenance and character-by-character terminal I/O. Different simulators need not use the same time base, but all the SIM-based implementations to date use the number of instructions executed as the time base.

Note that the control panel provides for starting simulation, but termination is determined entirely by the simulated CPU. By convention, the CPU returns control to the control panel under the following conditions:

1. If a HALT instruction is executed
2. If a fatal exception is detected
3. If a fatal I/O error is detected
4. If a special character is typed at the controlling terminal

Likewise, the control panel does not implement any debugging facilities beyond state examination and modification and instruction stepping. To facilitate debugging with operating systems, CPUs provide a simple instruction breakpoint capability and a one-level PC trace facility.

Implementation

The implementation of a particular simulator begins with collecting reference manuals, maintenance manuals, design documents, folklore, and prior simulator implementations for the target system. This is nontrivial. In the early days of computing, companies did not systematically collect and archive design documentation. In addition, collected material is subject to information decay, as noted

earlier. Lastly, the material is likely to be contradictory, embodying differing revisions or versions of the architecture, as well as errors that have crept in during the documentation process.

For Digital’s 12-bit and 16-bit minicomputers, the typical hierarchy of documentation was the following:

■ Processor Handbook. Providing an all-inclusive summary of the instruction set architecture, peripherals, bus interface, and software, these paperback-size books are the most common form of system documentation but also the least accurate.

■ Subsystem Reference Manual. As the programmer’s reference manual for a particular subsystem, such as the CPU or the disk drive, these manuals describe the registers and functions accurately but omit maintenance-level features and other fine points.

■ Subsystem Maintenance Manual. As the maintenance engineer’s manual for a particular subsystem, these manuals describe the registers and functions at the hardware implementation level, often including substantial abstracts from the print set. Because of the level of detail, the maintenance manuals have proven to be the most useful references for simulator implementation.

■ Design documents. For systems that do not have very large-scale integration (VLSI), the only extant design documents are the logic prints and the binary microcode ROM listings. The prints are essential for RTL simulation: they provide the only documentation of implementation quirks. For VLSI systems, there are chip-level design specifications as well as human-readable microprogram listings.

■ Folklore. During the useful lifetime of a system, its users exchange information and create an informal record, both written and verbal, of shared experiences (folklore) regarding the fine points of operations, hardware/software interfaces, system “personality,” and other factors. Folklore is subject to rapid information decay, particularly once the target system becomes obsolete.

■ Prior implementations. Prior simulator implementations can provide useful information, but it must be used cautiously. Unless the prior implementation is an RTL model, it embodies simplifications and abstractions that are not explicitly documented. The MIMIC sources (which are fragmentary and available only on paper) proved trustworthy, but others did not: for example, the 1970s PDP-11 simulator in the DECUS archives is highly misleading about interrupts, condition codes, and other details.

Table 5
Commands Available in SIM

Command                   Definition
attach <unit> <file>      Associate file with unit’s media.
detach <unit> | ALL       Disassociate unit’s (all units) media from any file.
reset <device> | ALL      Reset device (all devices).
load <file>               Load binary program from file.
boot <unit>               Reset all devices and bootstrap from unit.
run {<new PC>}            Reset all devices and resume execution at the current PC {or new PC}.
go {<new PC>}             Resume execution at the current PC {or new PC}.
cont                      Resume execution at the current PC.
step {<number>}           Execute one instruction {or number instructions}.
examine <list>            Display contents of list of memory locations or registers.
iexamine <list>           Display contents of list of memory locations or registers and allow interactive modification.
deposit <list> <value>    Store value in list of memory locations or registers.
ideposit <list>           Interactively modify list of memory locations or registers.
save <file>               Save simulator state in file.
restore <file>            Restore simulator state from file.
show queue                Display the simulator’s event queue.
show configuration        Display the simulator’s configuration.
show time                 Display the simulated time counter.
show <device>             Show device’s configuration options.
set
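The structure described under General Design Considerations (simulated state, a timed-event queue that uses the instruction count as its time base, and a CPU routine that returns control to the control panel on defined stop conditions) can be illustrated with a small sketch. This is not SIM's code. SIM is written in C; the sketch is in Python, and the toy instruction set, the ToySim class, and every name in it are hypothetical, chosen only to show the general shape of an instruction-level simulator.

```python
import heapq

HALT, INC, JMP, START_IO = range(4)   # toy opcodes, not any real machine's


class ToySim:
    """Hypothetical toy simulator: CPU loop plus a timed-event queue."""

    def __init__(self, program):
        self.mem = list(program)   # simulated memory of (opcode, operand) pairs
        self.pc = 0                # simulated program counter
        self.ac = 0                # simulated accumulator
        self.time = 0              # simulated time = instructions executed
        self.events = []           # heap of (due_time, sequence, callback)
        self.seq = 0               # tie-breaker keeps event order deterministic
        self.io_done = 0           # number of completed I/O operations

    def schedule(self, delay, callback):
        """Queue callback to fire `delay` instructions from now."""
        heapq.heappush(self.events, (self.time + delay, self.seq, callback))
        self.seq += 1

    def _io_complete(self):
        self.io_done += 1

    def run(self, break_at=None, max_steps=None):
        """Execute until the CPU hands control back; return the stop reason."""
        steps = 0
        while True:
            # Fire every queued event whose due time has arrived.
            while self.events and self.events[0][0] <= self.time:
                _, _, callback = heapq.heappop(self.events)
                callback()
            if break_at is not None and self.pc == break_at and steps > 0:
                return "breakpoint"
            if max_steps is not None and steps >= max_steps:
                return "step complete"
            if not 0 <= self.pc < len(self.mem):
                return "fatal exception"      # fetch outside simulated memory
            op, arg = self.mem[self.pc]
            self.pc += 1
            self.time += 1
            steps += 1
            if op == HALT:
                return "HALT"
            elif op == INC:
                self.ac += arg
            elif op == JMP:
                self.pc = arg
            elif op == START_IO:
                self.schedule(arg, self._io_complete)
            else:
                return "fatal exception"      # undefined opcode


# Start an I/O that completes three instructions later, count to 4, halt.
demo = ToySim([(START_IO, 3), (INC, 1), (INC, 1), (INC, 1), (INC, 1), (HALT, 0)])
print(demo.run(), demo.ac, demo.time, demo.io_done)   # prints: HALT 4 6 1
```

Because events are keyed to the instruction count and ties are broken by insertion order, every run of the same program produces the same sequence of events, which is the replicability property described earlier. The max_steps and break_at arguments stand in, very loosely, for the step command and the instruction breakpoint.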

An important consideration is that much of the documentation, all the folklore, and most working systems are in the hands of individual collectors. The Internet plays a vital role in locating material held by enthusiasts, through news conferences such as alt.folklore.computers, alt.sys.pdp8, alt.sys.pdp11, and comp.emulators.misc, and more recently, through World Wide Web sites devoted to historic systems.14–16 The sources for each simulator in SIM are listed in Table 6.

The last step in implementation is collecting software to run on the simulator. Software collection immediately raises the problem of media translation. Software for historic systems resides on paper tapes, DECtape storage systems, 200/556/800 bits-per-inch magnetic tapes, disk cartridges, 8-inch floppy disks, and so on. Few if any modern systems have these peripherals; and few if any historic systems have modern network interconnects. Thus, media translation usually entails linking a working version of the target system to a modern system by means of a serial line. KERMIT or some other simple protocol allows for a byte-by-byte network copy from the original media to a file on a modern system.

Once the software has been located and moved to a file, the next issue is sources. Without sources, diagnostics and other test programs are useless; detected errors cannot be traced back to causes without manual decode of the binary program. The absence of sources was a principal reason for including symbolic disassembly and input in SIM.

The final issue in software is licensing. Even though the target systems are obsolete and often no longer manufactured, the operating system software may be protected by copyrights and licenses. Most PDP-8 software is in the public domain; however, the PDP-11 and Nova operating systems are still licensed, as are all versions of UNIX. Corporate licensing policies rarely accommodate hobbyists; this limits operating system distribution to legitimate (that is, business) users. Table 7 lists the software found for each simulator in SIM.

Debug

The debug path for a simulator depends on the available software. Ideally, the simulator would be debugged with the same software tests used to debug the target hardware, but this software is rarely archived. Diagnostics can provide low-level checking, but diagnostics typically check for broken parts in a correct implementation, rather than an incorrect implementation. Even when diagnostics do check architecture rather than implementation (as in the basic instruction diagnostics on the PDP-11 system), the absence of sources limits their utility. Consequently, the simulators were debugged mostly with simple hand tests and then with the operating systems.

Operating systems are both exacting and imprecise tests of implementation correctness. Unless an operating system takes a deliberately restrictive view of hardware (for example, OS/8 does not use the PDP-8 interrupt system, and RT-11 does not use any

optional PDP-11 instructions), the operating system will be sensitive to every error in implementation. For example, Digital’s second-generation PDP-11 systems (the PDP-11/05, 11/40, and 11/45) were debugged with DOS-11 and RSTS after diagnostics failed to detect certain subtle implementation errors. Unfortunately, in an operating system, the distance in time and space between the error and the symptom may be enormous, and the traceable path may be lengthy and complicated. Artifacts in the software can also complicate debug: the OS/8 disk image on the Internet contains a copy of BASIC that is broken.

Table 6
Sources for Simulators in SIM

Architecture   Documents                 Location
PDP-8          Minicomputer Handbook     Private collection
               Reference manuals         Digital archive
               Maintenance manuals       Digital Australia collection
               Print sets                Digital Australia collection
               Prior implementations     Public archive17
                                         Public archive18
                                         MIMIC, private collection
PDP-11         Minicomputer Handbook     Private collection
               Reference manuals         Digital archive
               Maintenance manuals       Digital Australia collection
               Chip specifications       Private collection
               Microcode listings        Private collection
               Prior implementations     Public archive19
                                         MIMIC, private collection
Nova           System Reference Manual   Private collection
               Reference manuals         Data General archive
               Maintenance manuals       Private collection
               Prior implementations     MIMIC, private collection
18-bit PDP     Reference manuals         Digital archive
               Maintenance manuals       Digital archive
               Print sets                Digital archive

Table 7
Software for Simulators in SIM

Architecture   Software                          Location
PDP-8          Basic instruction tests 1 and 2   Digital Australia collection
               Memory management test            Digital Australia collection
               FOCAL69                           Digital Australia collection
               OS/8 system disk                  Public archive18
PDP-11         RT-11                             Transcribed from real system
               RSX-11M                           Transcribed from real system
               RSTS/E                            Transcribed from real system
               UNIX V5, V6, V7, 2.9 BSD          PDP UNIX Preservation Society (PUPS) archive20
               2.11 BSD                          Private collection
Nova           RDOS                              Private collection
18-bit PDP     No software to date

Results

SIM implements four minicomputer architectures: PDP-8, PDP-11, Nova, and 18-bit PDP. Each simulator includes a particular CPU; basic peripherals such as terminal, paper tape, clock, and printer; and a selection of mass storage peripherals (see Table 8). The PDP-8 simulator has run the FOCAL69 and the OS/8 operating systems. The PDP-11 simulator has run the following operating systems: RT-11 V4 and V5; RSX-11M V4; RSTS/E V8; UNIX V5, V6, and V7; and BSD V2.9 and V2.11. The Nova simulator has run the RDOS V7.5 operating system. No system software for the 18-bit PDP systems has been found. The simulators were exercised on an AlphaStation 3000/600 workstation (approximately 120 SPECint92); the performance is given in Table 9. Figures 2, 3, and 4 show screen shots from the various simulators running their principal operating systems.

Table 8
Architectures Implemented by SIM

                PDP-8            PDP-11           Nova
CPU             PDP-8/E          J-11, Q-bus      Nova 820
Options         KE8E EAE,        Integral FP11    Multiply/divide
                KM8E memory
                extension
Memory          4–32K words      16 KB–4 MB       4–32K words
Terminal        KL8E             DL11             KSR-33, Dasher
Paper tape      PC8E             PC11             Yes
Clock           DK8E             KW11L            Yes
Printer         LE8E             LP11             Yes
Storage         RX8E/RX01        RX11/RX01        4019
                RK8E/RK05        RK11/RK05        4046/4047, 4048,
                RF08/RS08        RLV11/RL01,2     4057, 4234
Magnetic tape   TM8E/TU10        TM11/TU10        6026

                PDP-4            PDP-7            PDP-9            PDP-15
CPU             PDP-4            PDP-7            PDP-9            PDP-15/30
Options                          T177 EAE,        KE09A EAE,       KE15 EAE,
                                 T148 memory      KX09A memory     KM15 memory
                                 extension        protection,      protection,
                                                  KP09A power      KP15 power
Memory          4–8K words       4–32K words      4–32K words      4–128K words
Terminal        KSR-28           KSR-33           KSR-33           KSR-35
Paper tape      Integral,        T444 reader,     PC09A            PC15
                T75 punch        T75 punch        reader-punch     reader-punch
Clock           Yes              Yes              Yes              Yes
Printer         T62              T647             T647E            LP15
Storage                          T24 drum         RF09/RS09        RF15/RS09,
                                                                   RP15/RP02
Magnetic tape                                     TC59/TU10        TC59/TU10

In Defense of Computing’s History

As professional engineers who have been lucky enough to witness the computer revolution, the authors believe that the industry has a duty to keep early machines alive. There are practical reasons, such as preservation of software and data; beyond that, there is an obligation to future generations. In 100 years, the systems from computing’s early history will appear to be absolute dinosaurs of the past. Yet their educational and sociological value will be considerable. A computer is a machine with a soul, and it must be kept alive with its operating environment to show its abilities and the contemporary state of the art.

Acknowledgments

Max Burnet: I would like to thank Digital Equipment Corporation Australia Pty Ltd for tolerating my eccentricity and for supporting the Australian Digital museum collection. Also the DECUS Australia NOP (nostalgic obsolete product) SIG members for help, encouragement, knowledge, good humor, and camaraderie on the last Wednesday of the month. My thanks to my coauthor Bob Supnik for his continued inspiration; it is great to see a V.P. who can cut code with the best of them. My thanks also to the contributors to the Digital Notes files, a great source of folklore. Therein lies a treasure trove of solutions from people who are helping each other solve the same problems.

Bob Supnik: The design, implementation, and debug of SIM was made possible by the generous help of many people. Craig St. Clair and Deb Toivonen of the Digital archives located rare manuals and documents on Digital’s 12-bit, 16-bit, and 18-bit systems. Tom West and Don Lewine of Data General Corporation provided documentation and support on the Nova. Carl Friend’s private collection of Data General hardware and software was a crucial source of documentation and software for the Nova and the RDOS operating system. Doug Jones, Bill Haygood, and John Wilson allowed me to use the sources to their simulators and freely answered arcane questions about the hardware. In addition, Bill provided a working OS/8 system disk, and John copied several PDP-11 operating system disks off a working PDP-11/34. Megan Gentry was an important source of PDP-11 folklore, debugged some of the subtlest problems, created the Makefile, and provided the first and most frequently used distribution site. Ben Thomas provided the character-by-character I/O routines for VMS. Chris Suddick helped debug the PDP-11 floating-point code. Warren Toomey and the enthusiasts at PUPS (the PDP UNIX Preservation Society) in Australia allowed me access to their archive of early UNIX releases. Leendert Van Doorn debugged the PDP-11 simulator with UNIX V6, and Franc Grootjen with 2.11 BSD. Larry Stewart provided the initial impetus to the project, and Ken Harrenstein made an important contribution to preservation by implementing a DECsystem-10 simulator. Last, but not least, Max Burnet generously provided documentation and software from the Digital Australia collection, answered questions based on his 30 years of experience with Digital’s systems, and made connections with and introductions to the worldwide community of historic machine hobbyists and enthusiasts.

References and Notes

1. As managing director of Digital’s Australian subsidiary from 1975 to 1982, Max Burnet created and operated the PDP trade-in program.
2. M. Burnet, “An Update on the Museum Treasures,” DECUS Australia Symposium Proceedings, August 1993.
3. M. Burnet, “The ’94 Update on the Museum Treasures,” DECUS Australia Symposium Proceedings, August 1994.
4. M. Burnet, “The Last Great PDP-11,” DECUS Australia Symposium Proceedings, August 1995.

Table 9
Simulator Performance

Simulator   Simulated Instructions   Real Instructions   Ratio
            per Second               per Second
PDP-8       1,800,000                400,000             4.5:1
PDP-11      440,000                  500,000             .88:1
Nova        1,700,000                750,000             2.26:1
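The ratio column in Table 9 is the simulated instruction rate divided by the real machine's instruction rate. As a quick worked check, the short Python sketch below recomputes it from the other two columns; the dictionary and function names are illustrative only.

```python
# Figures transcribed from Table 9: (simulated instr/s, real instr/s).
table9 = {
    "PDP-8": (1_800_000, 400_000),
    "PDP-11": (440_000, 500_000),
    "Nova": (1_700_000, 750_000),
}


def speed_ratio(simulated, real):
    """Return the simulated speed as a multiple of the real machine's speed."""
    return simulated / real


ratios = {name: speed_ratio(*rates) for name, rates in table9.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}:1")
```

The PDP-8 and PDP-11 values come out at 4.50 and 0.88, matching the table. The Nova value is 2.2667, so the table's 2.26:1 appears to be truncated rather than rounded. A ratio below 1 means the simulated PDP-11 ran slower than the real machine on this host.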

ucoder> pdp8

PDP-8 simulator V2.2b
sim> att rk0 os8.dsk
sim> boot rk0

.DA 08-APR-96

.DIR

08-Apr-96

COPYIT.SV    2 09-Mar-93   PASS2 .SV   20 11-Oct-92   FORT3 .LD    3 06-Jul-93
DIRECT.SV    7 11-Oct-92   PASS2O.SV    5 11-Oct-92   CLOSE .SV    2 10-Jul-93
CCLX  .SV   24 25-Feb-93   PASS3 .SV    8 11-Oct-92   FORT4 .FT    1 11-Jul-93
PIP   .SV   11 11-Oct-92   RALF  .SV   19 11-Oct-92   FORT4 .LD    2 04-Aug-93
FOTP  .SV    8 11-Oct-92   RESORC.SV   10 11-Oct-92   FORT6 .LD    2 09-Aug-93
ABSLDR.SV    5 11-Oct-92   RUNOFF.SV   24 11-Oct-92   FORT5 .FT    1 09-Aug-93
BASIC .SV   11 11-Oct-92   SABR  .SV   24 11-Oct-92   FORT5 .LD    2 09-Aug-93
BATCH .SV   10 11-Oct-92   SCROLL.SV   17 11-Oct-92   FORT6 .FT    1 09-Aug-93
BCOMP .SV   26 11-Oct-92   SET   .SV   20 11-Oct-92   METSC .SV   10 11-Aug-93
BITMAP.SV    5 11-Oct-92   SRCCOM.SV    5 11-Oct-92   METSC2.SV   10 11-Aug-93
BLOAD .SV   10 11-Oct-92   TECO  .SV   32 11-Oct-92   EMAT  .SV    9 11-Aug-93
BOOT  .SV    5 11-Oct-92   VERSN3.SV   10 11-Oct-92   EMDCT .SV   14 11-Aug-93
BRTS  .SV   24 11-Oct-92   BUILD .SV   33 11-Oct-92   EMTST .SV   10 11-Aug-93
CHEKMO.SV   15 11-Oct-92   BASIC .OV   16 11-Oct-92   SINST1.SV   14 11-Aug-93
COMPAF.SV    5 11-Oct-92   BUILD6.SV   33 11-Oct-92   ADDER .SV   13 11-Aug-93
CREF  .SV   13 11-Oct-92   BUILT .SV   33 12-Oct-92   FORT7 .FT    1 30-Aug-93
EDIT  .SV   10 11-Oct-92   HELP  .HE    1 18-Oct-92   CLEAR .LS    2 13-Jan-94
EDITS .SV    6 11-Oct-92   HELP  .HL   72 18-Oct-92   CLEAR .CF    2 13-Jan-94
EPIC  .SV   14 11-Oct-92   HELP  .OC    4 18-Oct-92   CLEAR .SV    2 13-Jan-94
F4    .SV   20 11-Oct-92   FORT7 .LD    2 07-Sep-93   CLEAR .PA    1 13-Jan-94
FRTS  .SV   26 11-Oct-92   JMPTST.SV    3 18-Oct-92   CLEAR .BN    2 13-Jan-94
FUTIL .SV   26 11-Oct-92   JMPJMS.SV    3 18-Oct-92   DEMO  .     28 21-Mar-95
HELP  .SV    5 11-Oct-92   RK8ENS.BN    1 30-Oct-92   DOS   .PA    4 25-Jan-94
LIBRA .SV   11 11-Oct-92   INST1 .SV   14 01-Dec-92   DOS   .BN    1 25-Jan-94
LIBSET.SV    5 11-Oct-92   INST2 .SV   11 01-Dec-92   DOS   .LS   10 25-Jan-94
LOAD  .SV   16 11-Oct-92   FORT  .FT    1 17-Jun-93   SHELL .PA    1 25-Jan-94
LOADER.SV   12 11-Oct-92   FORT  .LD    2 09-Jul-93   SHELL .BN    1 25-Jan-94
MATST .SV    9 11-Aug-93   FORT2 .LD    2 09-Jul-93   SHELL .LS    2 25-Jan-94
MDTST .SV   14 11-Aug-93   FORT2 .FT    1 22-Jun-93   BASIC .WS    1 10-Mar-94
OCOMP .SV    8 11-Oct-92   DOS   .SV    2 25-Jan-94   FOO   .PA    1 31-Mar-94
OPTF4 .SV   13 11-Oct-92   SHELL .SV    2 25-Jan-94   FOO   .BN    1 31-Mar-94
PAL8  .SV   19 11-Oct-92   FORT3 .FT    1 26-Jun-93

95 Files In 980 Blocks - 2212 Free Blocks

.
Simulation stopped, PC: 01207 (KSF)
sim>

Figure 2 PDP-8 Simulator Running OS/8
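The stop message in Figure 2 reports the instruction at the PC in symbolic form (KSF). The Python sketch below shows roughly how such symbolic disassembly can work for the PDP-8's memory-reference instructions and a handful of console terminal I/O instructions. It is an illustration only, not SIM's disassembler: the function name, the small IOT table, and the fallback of printing unrecognized words in octal are all assumptions made for the example.

```python
# Memory-reference mnemonics for opcodes 0 through 5.
MRI = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP"]

# A few console keyboard/printer IOT instructions; a full table is much larger.
IOT = {
    0o6031: "KSF", 0o6032: "KCC", 0o6034: "KRS", 0o6036: "KRB",
    0o6041: "TSF", 0o6042: "TCF", 0o6044: "TPC", 0o6046: "TLS",
}


def disassemble(word, pc):
    """Return a symbolic form of a 12-bit PDP-8 word located at address pc."""
    word &= 0o7777
    opcode = word >> 9                     # top three bits select the opcode
    if opcode <= 5:
        indirect = " I" if word & 0o400 else ""
        offset = word & 0o177              # 7-bit offset within the page
        # Page bit set: operand is on the current page; clear: page zero.
        page = (pc & 0o7600) if word & 0o200 else 0
        return f"{MRI[opcode]}{indirect} {page | offset:04o}"
    if word in IOT:
        return IOT[word]
    return f"{word:04o}"                   # unknown here: show raw octal


print(disassemble(0o6031, 0o1207))   # prints: KSF
print(disassemble(0o1234, 0o1207))   # prints: TAD 1234
print(disassemble(0o5610, 0o0200))   # prints: JMP I 0210
```

The operate group (opcode 7) and most IOT codes are left out to keep the example short, so a word such as 7402 (HLT) falls through to the octal fallback.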

ucoder> nova

NOVA simulator V2.2b
sim> att dp0 rdos.dsk
sim> set tti dasher
sim> boot dp0

Filename?

NOVA RDOS Rev 7.50
Date (m/d/y) ? 4 8 96
Time (h:m:s) ? 16 26 0

R
list/e sys-.-
SYS5.LB      17216  D   05/24/77 13:18  05/31/85  [001017]  0
SYS.SV       56320  SD  12/14/95 16:21  12/14/95  [005057]  0
SYS.LB       20240  D   04/30/85 14:49  05/31/85  [000746]  0
SYS.OL       30720  C   12/14/95 16:21  12/14/95  [005272]  0
SYSGEN.SV    23040  SD  05/02/85 22:20  05/31/85  [001401]  0
R
disk
LEFT: 2158 USED: 2706 MAX. CONTIGUOUS: 2054
R

Simulation stopped, PC: 41740 (LDA 1,4,3)
sim>

Figure 3 Nova Simulator Running RDOS

5. A. Ahi, G. Burroughs, A. Gore, S. LaMar, C.-Y. Lin, and A. Wiemann, “Design Verification of the HP 9000 Series 700 PA-RISC,” Hewlett-Packard Journal, vol. 43, no. 4 (1992).
6. W. Anderson, “Logical Verification of the NVAX CPU Chip Design,” Digital Technical Journal, vol. 4, no. 3 (1992): 38–46.
7. R. Calcagni and W. Sherwood, “VAX 6000 Model 400 CPU Chip Set Functional Design Verification,” Digital Technical Journal, vol. 2, no. 2 (1990): 64–72.
8. A. Hutchings, “The Evolution of the Custom CAD Suite Used on the MicroVAX II System,” Digital Technical Journal, vol. 1, no. 2 (1986): 48–55.
9. M. Kantrowitz and L. Noack, “Functional Verification of a Multiple-issue, Pipelined, Superscalar Alpha Processor: the CPU Chip,” Digital Technical Journal, vol. 7, no. 1 (1995): 136–144.
10. D. Siewiorek, C. Bell, and A. Newell, Computer Structures: Principles and Examples, “The IBM System/360, System/370, 3030, and 4300: A Series of Planned Machines That Span a Wide Performance Range,” and “PMS Notation” (New York: McGraw-Hill, 1982).
11. R. Brunner, ed., VAX Architecture Reference Manual, chapter 9, “Compatibility Mode” (Bedford, Mass.: Digital Press, 1991).
12. This simulator has since been withdrawn from the network.
13. R. Rustin, ed., Debugging Techniques in Large Systems, R. Supnik, “Debugging Under Simulation” (Englewood Cliffs, N. J.: Prentice-Hall, 1971).
14. For information on and pictures of Data General minicomputers, see C. Friend’s web page at http://www.ultranet.com/~engelbrt/carl/museum/index.html.
15. For information on and pictures of many historic computers, see J. Jaeger’s web page at http://www.msn.fullfeed.com/~cube/collect.htm.
16. For information on and pictures of many historic computers, see P. Pierce’s web page at http://www.teleport.com/~prp/collect/index.html.
17. For documentation and relevant links, see D. Jones’s web page at www.cs.uiowa.edu/~jones/pdp8/. For his simulator, cross assembler, and core images, see ftp://ftp.cs.uiowa.edu/pub/jones/pdp8.
18. For information on his simulator and OS/8 disk image, see W. Haygood’s web page at ftp://sunsite.unc.edu/pub/academic/computer-science/history/pdp-8/emulators/haygood.
19. For more information on J. Wilson’s simulator (executable only), see his web page at ftp://ftp.update.uu.se/pub/ibmpc/emulators.
20. For more information on the PDP-11 UNIX archive, see the PUPS home page at http://minnie.cs.adfa.oz.au/PUPS/index.html.

36 Digital Technical Journal Vol. 8 No. 3 1996

ucoder> pdp11

PDP-11 simulator V2.2b
sim> att rk0 rtrk.dsk
sim> boot rk0

RT-11SJ (S) V05.04

.

.da 8-apr-96

.dir
 08-Apr-96
NL    .SYS     2  18-Sep-89      RT11FB.SYS    94  18-Sep-89
RT11SJ.SYS    80  18-Sep-89      SPOOL .REL    11  14-Apr-87
PTESTX.MAC    23  27-Jan-94      GVI   .SAV     5  18-Apr-90
BINCOM.SAV    24  27-Sep-88      DUP   .SAV    49  27-Sep-88
DIR   .SAV    19  27-Sep-88      IND   .SAV    58  27-Sep-88
LIBR  .SAV    24  27-Sep-88      MACRO .SAV    61  27-Sep-88
LINK  .SAV    49  27-Sep-88      RESORC.SAV    25  27-Sep-88
FORMAT.SAV    24  27-Sep-88      ODT   .SAV     8  05-Oct-89
PBCOPY.SAV     2  16-Feb-89      SYSLIB.OBJ    55P 05-Oct-89
ODT   .OBJ     8  05-Oct-89      SYSMAC.SML    61  16-Mar-89
SIPP  .SAV    21  27-Sep-88      DATE  .SAV     3  02-Feb-89
IOP   .SAV    11  24-Apr-89      SWAP  .SYS    27  27-Sep-88
TT    .SYS     2  18-Sep-89      DL    .SYS     4  18-Sep-89
DM    .SYS     5  18-Sep-89      DP    .SYS     3  18-Sep-89
DX    .SYS     4  18-Sep-89      RK    .SYS     3  18-Sep-89
LS    .SYS     5  05-Oct-89      MT    .SYS     9  18-Sep-89
LP    .SYS     2  18-Sep-89      SP    .SYS     6  18-Sep-89
PIP   .SAV    30  27-Sep-88      HANDLE.SAV     7  16-Feb-89
LD    .SYS     8  26-Dec-90      MAC   .SAV    61  27-Sep-88
LC    .SYS     2  01-Jan-80      UCL   .SAV    13  22-Dec-89
UCL   .CCL     4  07-Oct-90      STARTS.COM     1  19-Jan-94
MTPIP .SAV    28  27-Feb-87      MTROL .SAV    17  27-Feb-87
MLIB  .SYS   300  20-Dec-90      HELP  .SAV   132  20-Dec-90
XPC   .SAV    16  25-Jun-91      DESS  .SAV    18  09-Mar-88
PTESTX.OBJ     8
 49 Files, 1432 Blocks
 3330 Free blocks

.sho dev

Device    Status          CSR     Vector(s)
------    ------          ---     ---------
NL        Installed       000000  000
TT        Installed       000000  000
DL        Installed       174400  160
DM        Not installed   177440  210
DP        Not installed   176710  254
DX        Installed       177170  264
RK        Resident        177400  220
LS       -Not installed   176500  470 474 300 304
MT        Installed       172520  224
LP        Installed       177514  200
SP        Installed       000000  110
LD        Installed       000000  000
LC        Installed       177514  200

.
Simulation stopped, PC: 146506 (ASR R5)
sim>

Figure 4 PDP-11 Simulator Running RT-11

Biographies

Maxwell M. Burnet Max Burnet has been with Digital in Australia for 29 years. During that time, he has sold, serviced, or marketed all the machines in the collection. He managed the Digital Australia subsidiary for seven years. He was a salesman in Boston during 1971 and managed to replace an IBM 1620 at Tufts University with a PDP-10. He is currently the oldest surviving “techie” in the Sydney office and makes many corporate presentations in Australia. He manages the Australian DECUS Society, the Subsidiary’s local content and export obligations with the Australian Government, and the local Product Assurance Group. He has collected a museum of early Digital machines and is known around Sydney as “Museum Max.” He received a B.Sc. (honours) from Melbourne University.

Robert M. Supnik Bob Supnik has been with Digital in the United States for 19 years. He joined the Mass Storage Group and then moved into Semiconductor Engineering, where he successively managed the last PDP-11 implementation (the J-11), Advanced Development, the first single-chip VAX implementation (the MicroVAX chip), and the VAX Microprocessor Group. He also wrote or contributed to the microcode of every single-chip VAX microprocessor. In 1988, he started the Alpha program, which he managed through launch of the first products in 1992. He then became technical director, first of Engineering and then of the Computer Systems Division. In 1996, he became vice president of Research and Advanced Development. He has B.A. degrees in mathematics and in history from MIT, and an M.A. in history from Brandeis University.

SIMH: Forward... Into The Past

Bob Supnik, VP, Sun Microsystems

Contents

• An introduction to SIMH
• Rationale
• SIMH's development history
• The role of the Internet
• SIMH design principles
• Building a simulator
• The computer history ecosystem
• Demonstrations
• Going forward

What is SIMH?

• SIMH is an Internet-based collaborative project focused on preserving computers (and software) of historic interest via simulation
• SIMH consists of
  – A portable application framework for implementing simulators
  – Portable implementations of 20+ simulators on this framework
  – Demonstration software to run on these simulators
  – Papers and presentations documenting interesting facts and tidbits gleaned from the simulators
• On the Web at http://simh.trailing-edge.com

Portability

• SIMH runs on
  – Linux, NetBSD, OpenBSD, FreeBSD (gcc)
  – X86 Windows 95, Windows 98, Windows 2000, Windows XP (Visual C++ or MinGW)
  – WindowsCE
  – Mac OS/9 (Codewarrior) or OS/X (Apple Development Tools)
  – Sun Solaris (gcc)
  – HP/UX (gcc)
  – AIX (gcc)
  – Alpha Unix (DEC C)
  – VAX/VMS, Alpha/VMS, IA64/VMS (DEC C)
  – OS/2 (emx)

Scope

• SIMH implements simulators for
  – Data General Nova, Eclipse
  – Digital Equipment Corporation (DEC) PDP-1, PDP-4, PDP-7, PDP-8, PDP-9, PDP-10, PDP-11, PDP-15, VAX
  – GRI-909
  – IBM 1401, 1620, 1130, System/3
  – Hewlett-Packard (HP) 2116, 2100, 21MX
  – Interdata (Perkin-Elmer) 16b, 32b architectures
  – Honeywell H316/H516
  – MITS Altair 8800, both 8080 and Z80 versions
  – Royal-McBee LGP-30, LGP-21
  – Scientific Data Systems SDS-940
• More than two dozen machines in all
• Rather biased towards East Coast USA...

Software

• With SIMH, you can run
  – PDP-1 Lisp and DDT, early interactive systems
  – PDP-11 Unix V5, V6, V7, the earliest extant releases of Unix (V1 to V4 are lost)
  – Interdata 7/32 Unix V6, the first port of Unix (and the first port to a 32b system)
  – PDP-10 TOPS-10, TOPS-20, ITS
  – PDP-11 DOS, RT-11, RSX-11M, RSX-11M+, RSTS/E
  – PDP-15 ADSS-15, F/B-15, DOS-15, DOS/XVM
  – PDP-8 OS/8, TSS-8, ETOS, DMS
  – VAX/VMS, VAX/Ultrix, VAX/BSD, VAX/NetBSD
  – Nova RDOS, Eclipse AOS
  – HP DOS, RTE-III, RTE-IV
  – MITS Altair CP/M, DOS
  – System/3 SCP, CMS

Rationale

• Computing’s past is disappearing
  – As machines are scrapped, software and documentation are thrown out, media becomes unreadable, and industry pioneers die
• The Santayana principle
  – “Those who do not study the past are condemned to repeat it”
  – Hardware and software engineers re-invent the breakthroughs (and mistakes) of the past because they don’t know what they are
  – What’s the difference between optimizing for the 8KW of a PDP-8 and the 8KW of a microprocessor’s first level cache? (Answer: no one does the second)
• The relative lifespans of hardware, software, and data
  – Hardware and architectures come and go
  – Software and data have much longer duration: 15-50 years
  – What will run the BART signs when the last PDP-8 breaks down?

No, No, Why Those Systems?

• The LGP-30 was the first computer I ever saw
• The IBM 1620 was the first computer I ever programmed
• The PDP-7 and the PDP-8 were the first computers I wrote complete projects for
• The Nova was the first computer I did a complete system design for
• The GRI-909 was the weirdest architecture I ever wrote code for (one instruction)
• I’ve always had a sneaking fondness for 24b machines (there were never any successful ones)
• I worked at DEC for 22 years

And Why Did You Really Write SIMH?

• Larry made me do it
  – Larry Stewart, DEC Research, pointed out in 1993 that computing's past was being lost and suggested I do something about it
• I needed to graduate from programming in assembly code and microcode
  – With Alpha, availability of VAXen was beginning to decrease
• It seemed like a good idea at the time
  – I hadn't done a major software project since porting Dungeon to the PDP-11 (in the late 70's)
  – I needed an excuse to learn C
• Besides, how difficult could it be to write a simulator?
  – 11 years and 125K lines of code later…

Project Goals

• Goal: make computers and software of historic interest accessible to a broad technical population
  – Simulators, rather than restored hardware
  – Highly portable (at least VMS, UNIX, and Win32)
• Starting point: MIMIC, an RTL simulation system from the late 60's and early 70's
  – Theft (of one's own prior work) is the highest form of productivity
• Initial targets: well documented minicomputers
  – DEC PDP-8, PDP-11
  – Data General Nova
• One man, one code base
• Then came the Internet...

SIMH And The Internet

• The Internet has turned out not to be a “global city” but a million global villages, and computer history is one of its communities
• Initial contacts through newsgroups and mailing lists
• The Web made the activities of collectors and hobbyists visible and accessible
  – Document repositories
  – Simulation systems
  – Software stashes
• As a result, SIMH morphed from a one-person hobby project to a collaborative effort of more than 30 people
• Most of us have never met IRL

The Computer History Ecosystem

• Private collectors
  – Sometimes there’s no substitute for real hardware
• Document archivists
  – Bringing the written word on-line
• Simulator writers
  – SIMH, MAME, Hercules, CyberCray, and many, many others
• Restoration projects
  – Rhode Island Computer Museum PDP-9, La Cite des Sciences et L'Industrie PDP-9, Computer History Museum PDP-1 and 1620
• Institutions
  – Computer History Museum, RICM/RICS
• eBay
  – Ultimately, everything is put up for sale

SIMH and The Internet: Recovering Software for the 18b PDP’s

[Diagram. Labeled items: PDP-15 paper-tape software; 7-track tape transcription; PDP-7 software on 7-track tape; documentation, simulator, debug; working system, ADSS boot, DOS debug; DOS media; documentation, DOS/XVM media; documentation, ADSS/FB media; documentation, working system.]

SIMH Design Principles

• Simulators are collections of devices (the CPU is just a device that executes instructions)
• Devices contain named registers, which hold state, and numbered units, which contain data sets

[Diagram: the Simulator Control Package framework sits above the devices (CPU, Device, Device, Device); each device has registers and units; each unit has a data set (mem for the CPU, data for the other devices).]

Design Principles, continued

• Data sets are mapped into a uniform set of host system containers
  – Containers can be in memory (arrays) or on disk (files)
  – Containers are constrained to “natural” size boundaries (e.g., a 12b memory is mapped as a 16b array or file)
• Asynchronous behaviour is modelled explicitly
  – Time tracked in convenient units (nanoseconds, instructions, etc)
  – Device events are scheduled for “future time”
  – Simulator calls a device event handler at appropriate point
• Common device classes are implemented through libraries that hide host OS dependencies
  – Libraries for disks, tapes, terminal multiplexers, Ethernet
  – Future extensions will support graphics, “raw” device access

Writing A Simulator: A Three Step Program

• Step 1 – research
• Gather as much documentation as possible
  – Primary documentation (maintenance manuals, print sets, microcode listings) is preferable to secondary sources (handbooks, user's guides, prior simulators)
• Make contact with actual users
  – Folklore can be as important as the printed word
• Gather and transcribe required software
  – Diagnostics, operating system(s), application code
• RTFM (both the target system's and SIMH's)
• The Internet provides a wide variety of starting points for gathering information

Writing A Simulator, continued

• Step 2 – implementation
• Critical design decisions
  – How will instructions be decoded and executed?
    • Modern computers are fast, don't waste time on optimization
  – How will the I/O subsystem be modelled?
    • Typically, the more accurately the better
  – How will interrupts and exceptions be handled?
    • Typical hardware mechanisms, like microtraps, are easily implemented with longjmp and poorly implemented with try-except-finally
  – What debugging facilities should be included?
    • These will be used to debug the simulator, not new programs
• There are plenty of examples to emulate (or borrow)

Writing A Simulator, continued

• Step 3 – debug
  – Hand test cases, to get out the stupid bugs
  – Diagnostics, to get out the straightforward bugs
  – Operating systems, to get the details right
• Successful operating system and application operation is the only real proof of completion
• Operating systems make excellent go/no-go diagnostics, but their reporting mechanisms (crash, hang) leave something to be desired
• A typical “large” simulator seems to have ~100 bugs
  – The last 20 have to be found with operating system software
• Hence, the need for strong simulator debug tools (step, save and repeat, breakpoints, traces, etc)

The Devil Is In The Details

• Writing and debugging a simulator can resemble detective work more than software engineering
  – Hardware documentation may be incomplete
  – Hardware documentation may be misleading or false
  – Software may be incomplete
  – Software paths may be untested
  – Simulated configurations may be untested
  – Software may have undocumented timing dependencies
  – Software may have undocumented hardware dependencies

Making The Right Tradeoffs

• Accuracy is more important than performance
  – Implement a specific system, not an architecture
  – Model the hardware accurately at the “black box” level
  – Follow the microcode (if applicable and available) or the logic prints
  – Include the fine-grain details
• Allow for “real-world” interactions
  – Wall-clocks vs simulated clocks
  – Timing loops and real-time devices
• Be prepared to “scale out”
  – Users may not be satisfied with real-world speeds and feeds

Demonstrations

• PDP-11 Unix V5 – the earliest extant Unix

• Interdata Unix V6 – the first port, the first 32b Unix

• PDP-10 TOPS-10 – a timesharing bureau on your laptop

• VAX/VMS – 6 9's availability for your PC

SIMH In The Real World

• SIMH has grown steadily in scope, scale, and complexity
  – 1996: 6 simulators; today: 24 simulators
  – 1996: 12b and 16b systems; today: up to 32b (with 64b to follow)
  – 1996: 2 host platforms; today: 18 host platforms
• SIMH is being used beyond the hobbyist community
  – As the development platform for PDP-11 OS development
  – As a replacement VAX platform in a government software development program (reduced build cycles from 135 minutes to 14 minutes)
• Future releases will allow more “real world” interactivity
  – Graphics
  – Access to additional real devices on the host system

What You Can Do

• Check your attic (or your father’s… or grandfather’s)
  – Lots of equipment, media, software, documentation still in private hands
  – The greatest risk is simple discarding of “unimportant” artefacts
  – If in doubt, consult the Computer History Museum
• Write (or adopt) a simulator
  – Lots of interesting machines still to do
  – Lots of simulators sitting in unfinished or untested state
  – You never forget your first computer (much as you'd like to)
• Get involved!
  – Preservation of computing’s history depends more on individuals than on institutions or governments

Writing a Simulator for the SIMH System
Revised 21-Apr-2005 for SIMH V3.4-0

1. Overview
2. Data Types
3. VM Organization
   3.1 CPU Organization
       3.1.1 Time Base
       3.1.2 Step Function
       3.1.3 Memory Organization
       3.1.4 Interrupt Organization
       3.1.5 I/O Dispatching
       3.1.6 Instruction Execution
   3.2 Peripheral Device Organization
       3.2.1 Device Timing
       3.2.2 Clock Calibration
       3.2.3 Data I/O
4. Data Structures
   4.1 sim_device Structure
       4.1.1 Awidth and Aincr
       4.1.2 Device Flags
       4.1.3 Context
       4.1.4 Examine and Deposit Routines
       4.1.5 Reset Routine
       4.1.6 Boot Routine
       4.1.7 Attach and Detach Routines
       4.1.8 Memory Size Change Routine
       4.1.9 Debug Controls
   4.2 sim_unit Structure
       4.2.1 Unit Flags
       4.2.2 Service Routine
   4.3 sim_reg Structure
       4.3.1 Register Flags
   4.4 sim_mtab Structure
       4.4.1 Validation Routine
       4.4.2 Display Routine
   4.5 Other Data Structures
5. VM Provided Routines
   5.1 Instruction Execution
   5.2 Binary Load and Dump
   5.3 Symbolic Examination and Deposit
   5.4 Optional Interfaces
       5.4.1 Once Only Initialization Routine
       5.4.2 Address Input and Display
       5.4.3 Command Input and Post-Processing
       5.4.4 VM-Specific Commands
6. Other SCP Facilities
   6.1 Terminal Multiplexor Emulation Library
   6.2 Magnetic Tape Emulation Library
   6.3 Breakpoint Support

1. Overview

SIMH (history simulators) is a set of portable programs, written in C, which simulate various historically interesting computers. This document describes how to design, write, and check out a new simulator for SIMH. It is not an introduction to either the philosophy or external operation of SIMH, and the reader should be familiar with both of those topics before proceeding. Nor is it a guide to the internal design or operation of SIMH, except insofar as those areas interact with simulator design. Instead, this manual presents and explains the form, meaning, and operation of the interfaces between simulators and the SIMH simulator control package. It also offers some suggestions for utilizing the services SIMH offers and explains the constraints that all simulators operating within SIMH will experience.

Some terminology: Each simulator consists of a standard simulator control package (SCP and related libraries), which provides a control framework and utility routines for a simulator; and a unique virtual machine (VM), which implements the simulated processor and selected peripherals. A VM consists of multiple devices, such as the CPU, paper tape reader, disk controller, etc. Each controller consists of a named state space (called registers) and one or more units. Each unit consists of a numbered state space (called a data set). The host computer is the system on which SIMH runs; the target computer is the system being simulated.

SIMH is unabashedly based on the MIMIC simulation system, designed in the late 1960’s by Len Fehskens, Mike McCarthy, and Bob Supnik. This document is based on MIMIC’s published interface specification, “How to Write a Virtual Machine for the MIMIC Simulation System”, by Len Fehskens and Bob Supnik.

2. Data Types

SIMH is written in C. The host system must support (at least) 32-bit data types (64-bit data types for the PDP-10 and other large-word target systems). To cope with the vagaries of C data types, SIMH defines some unambiguous data types for its interfaces:

SIMH data type          interpretation in typical 32-bit C

int8, uint8             signed char, unsigned char
int16, uint16           signed short, unsigned short
int32, uint32           signed int, unsigned int
t_int64, t_uint64       long long, _int64 (system specific)
t_addr                  simulated address, uint32 or t_uint64
t_value                 simulated value, uint32 or t_uint64
t_svalue                simulated signed value, int32 or t_int64
t_mtrec                 mag tape record length, uint32
t_stat                  status code, int
t_bool                  true/false value, int

[The inconsistency in naming t_int64 and t_uint64 is due to VC++, which uses int64 as a structure name member in the master Windows definitions file.]

In addition, SIMH defines structures for each of its major data elements:

DEVICE                  device definition structure
UNIT                    unit definition structure
REG                     register definition structure
MTAB                    modifier definition structure
CTAB                    command definition structure
DEBTAB                  debug table entry structure

3. VM Organization

A virtual machine (VM) is a collection of devices bound together through their internal logic. Each device is named and corresponds more or less to a hunk of hardware on the real machine; for example:

VM device               Real machine hardware

CPU                     central processor and main memory
PTR                     paper tape reader controller and paper tape reader
TTI                     console keyboard
TTO                     console output
DKP                     disk pack controller and drives

There may be more than one device per physical hardware entity, as for the console; but for each user-accessible device there must be at least one. One of these devices will have the pre-eminent responsibility for directing simulated operations. Normally, this is the CPU, but it could be a higher-level entity, such as a bus master.

The VM actually runs as a subroutine of the simulator control package (SCP). It provides a master routine for running simulated programs and other routines and data structures to implement SCP’s command and control functions. The interfaces between a VM and SCP are relatively few:

Interface                       Function

char sim_name[]                 simulator name string
REG *sim_PC                     pointer to simulated program counter
int32 sim_emax                  maximum number of words in an instruction
DEVICE *sim_devices[]           table of pointers to simulated devices, NULL terminated
char *sim_stop_messages[]       table of pointers to error messages
t_stat sim_load (…)             binary loader subroutine
t_stat sim_instr (void)         instruction execution subroutine
t_stat parse_sym (…)            symbolic instruction parse subroutine (optional)
t_stat fprint_sym (…)           symbolic instruction print subroutine (optional)

In addition, there are six optional interfaces, which can be used for special situations, such as GUI implementations:

Interface                       Function

void (*sim_vm_init) (void)      pointer to once-only initialization routine for VM
t_addr (*sim_vm_parse_addr) (…) pointer to address parsing routine
void (*sim_vm_fprint_addr) (…)  pointer to address output routine
char (*sim_vm_read) (…)         pointer to command input routine
void (*sim_vm_post) (…)         pointer to command post-processing routine
CTAB *sim_vm_cmd                pointer to simulator-specific command table

There is no required organization for VM code. The following convention has been used so far. Let name be the name of the real system (i1401 for the IBM 1401; i1620 for the IBM 1620; pdp1 for the PDP-1; pdp18b for the other 18-bit PDP’s; pdp8 for the PDP-8; pdp11 for the PDP-11; nova for Nova; hp2100 for the HP 21XX; h316 for the Honeywell 316/516; gri for the GRI-909; pdp10 for the PDP-10; vax for the VAX; sds for the SDS-940):

• name.h contains definitions for the particular simulator
• name_sys.c contains all the SCP interfaces except the instruction simulator
• name_cpu.c contains the instruction simulator and CPU data structures
• name_stddev.c contains the peripherals which were standard with the real system.
• name_lp.c contains the line printer.
• name_mt.c contains the mag tape controller and drives, etc.

The SIMH standard definitions are in sim_defs.h. The base components of SIMH are:

Source module       header file         module

scp.c               scp.h               control package
sim_console.c       sim_console.h       terminal I/O library
sim_fio.c           sim_fio.h           file I/O library
sim_timer.c         sim_timer.h         timer library
sim_sock.c          sim_sock.h          socket I/O library
sim_ether.c         sim_ether.h         Ethernet I/O library
sim_tmxr.c          sim_tmxr.h          terminal multiplexor simulation library
sim_tape.c          sim_tape.h          magtape simulation library

3.1 CPU Organization

Most CPU’s perform at least the following functions:

• Time keeping
• Instruction fetching
• Address decoding
• Execution of non-I/O instructions
• I/O command processing
• Interrupt processing

Instruction execution is actually the least complicated part of the design; memory and I/O organization should be tackled first.

3.1.1 Time Base

In order to simulate asynchronous events, such as I/O completion, the VM must define and keep a time base. This can be accurate (for example, nanoseconds of execution) or arbitrary (for example, number of instructions executed), but it must be consistently used throughout the VM. All existing VM’s count time in instructions.

The CPU is responsible for counting down the event counter sim_interval and calling the asynchronous event controller sim_process_event. SCP does the record keeping for timing.

3.1.2 Step Function

SCP implements a stepping function using the step command. STEP counts down a specified number of time units (as described in section 3.1.1) and then stops simulation. The VM can override the STEP command’s counts by calling routine sim_cancel_step:

• t_stat sim_cancel_step (void) – cancel STEP count down.

The VM can then inspect variable sim_step to see if a STEP command is in progress. If sim_step is non-zero, it represents the number of steps to execute. The VM can count down sim_step using its own counting method, such as cycles, instructions, or memory references.

3.1.3 Memory Organization

The criterion for memory layout is very simple: use the SIMH data type that is as large as (or if necessary, larger than), the word length of the real machine. Note that the criterion is word length, not addressability: the PDP-11 has byte addressable memory, but it is a 16-bit machine, and its memory is defined as uint16 M[]. It may seem tempting to define memory as a union of int8 and int16 data types, but this would make the resulting VM endian-dependent. Instead, the VM should be based on the underlying word size of the real machine, and byte manipulation should be done explicitly. Examples:

Simulator               memory size     memory declaration

IBM 1620                5-bit           uint8
IBM 1401                7-bit           uint8
PDP-8                   12-bit          uint16
PDP-11, Nova            16-bit          uint16
PDP-1                   18-bit          uint32
VAX                     32-bit          uint32
PDP-10, IBM 7094        36-bit          t_uint64
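As an illustration of explicit byte manipulation on word-organized memory, the following is a minimal sketch for a 16-bit, byte-addressable target in the style of the PDP-11. The names MEMSIZE, read_byte, and write_byte are hypothetical and are not part of SIMH; the sketch uses the standard C fixed-width types in place of SIMH's uint8/uint16, and it assumes that the even byte address selects the low byte of the word.

```c
#include <stdint.h>

#define MEMSIZE 4096                 /* hypothetical memory size, in 16-bit words */

static uint16_t M[MEMSIZE];          /* memory is an array of target words */

/* Read the byte at byte address ba; even addresses select the low byte. */
static uint8_t read_byte (uint32_t ba)
{
    uint16_t w = M[(ba >> 1) % MEMSIZE];
    return (uint8_t) ((ba & 1) ? (w >> 8) : (w & 0xFF));
}

/* Write the byte at byte address ba, leaving the other half of the word intact. */
static void write_byte (uint32_t ba, uint8_t val)
{
    uint32_t wa = (ba >> 1) % MEMSIZE;
    if (ba & 1)
        M[wa] = (uint16_t) ((M[wa] & 0x00FF) | ((uint16_t) val << 8));
    else
        M[wa] = (uint16_t) ((M[wa] & 0xFF00) | val);
}
```

Because all byte selection is done with shifts and masks on the word value, the result is the same on big-endian and little-endian hosts.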

3.1.4 Interrupt Organization

The design of the VM’s interrupt structure is a complex interaction between efficiency and fidelity to the hardware. If the VM’s interrupt structure is too abstract, interrupt driven software may not run. On the other hand, if it follows the hardware too literally, it may significantly reduce simulation speed. One rule I can offer is to minimize the fetch-phase cost of interrupts, even if this complicates the (much less frequent) evaluation of the interrupt system following an I/O operation or asynchronous event. Another is not to over-generalize; even if the real hardware could support 64 or 256 interrupting devices, the simulators will be running much smaller configurations. I’ll start with a simple interrupt structure and then offer suggestions for generalization.

In the simplest structure, interrupt requests correspond to device flags and are kept in an interrupt request variable, with one flag per bit. The fetch-phase evaluation of interrupts consists of two steps: are interrupts enabled, and is there an interrupt outstanding? If all the interrupt requests are kept as single-bit flags in a variable, the fetch-phase test is very fast:

if (int_enable && int_requests) { …process interrupt… }

Indeed, the interrupt enable flag can be made the highest bit in the interrupt request variable, and the two tests combined:

if (int_requests > INT_ENABLE) { …process interrupt… }

Setting or clearing device flags directly sets or clears the appropriate interrupt request flag:

set:   int_requests = int_requests | DEVICE_FLAG;
clear: int_requests = int_requests & ~DEVICE_FLAG;
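The combined test can be sketched as follows, assuming a 32-bit unsigned request variable whose top bit is the enable flag. The device flag names INT_TTI and INT_TTO and the helper names are hypothetical. The variable must be unsigned: the comparison is true only when the enable bit and at least one lower bit are both set.

```c
#include <stdint.h>

#define INT_ENABLE  0x80000000u      /* interrupt enable: the highest bit */
#define INT_TTI     0x00000001u      /* hypothetical keyboard flag */
#define INT_TTO     0x00000002u      /* hypothetical printer flag */

static uint32_t int_requests = 0;    /* enable bit plus one bit per device flag */

static void set_flag (uint32_t f)   { int_requests |= f; }
static void clear_flag (uint32_t f) { int_requests &= ~f; }

/* Nonzero when interrupts are enabled and at least one request is set. */
static int interrupt_due (void)
{
    return int_requests > INT_ENABLE;
}
```

With the enable bit clear, the value is always below INT_ENABLE; with the enable bit set and no requests, it equals INT_ENABLE; in both cases the test fails, as intended.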

At a slightly higher complexity, interrupt requests do not correspond directly to device flags but are based on masking the device flags with an enable (or disable) mask. There are now two parallel variables: device flags and interrupt enable mask. The fetch-phase test is now:

if (int_enable && (dev_flags & int_enables)) { …process interrupt… }

As a next step, the VM may keep a summary interrupt request variable, which is updated by any change to a device flag or interrupt enable/disable:

enable:  int_requests = device_flags & int_enables;
disable: int_requests = device_flags & ~int_disables;

This simplifies the fetch phase test slightly.

At yet higher complexity, the interrupt system may be too complex or too large to evaluate during the fetch-phase. In this case, an interrupt pending flag is created, and it is evaluated by subroutine call whenever a change could occur (start of execution, I/O instruction issued, device time out occurs). This makes fetch-phase evaluation simple and isolates interrupt evaluation to a common subroutine.

If required for interrupt processing, the highest priority interrupting device can be determined by scanning the interrupt request variable from high priority to low until a set bit is found. The bit position can then be back-mapped through a table to determine the address or interrupt vector of the interrupting device.
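The priority scan and back-mapping might look like the following sketch. It assumes four interrupt sources with the highest-numbered bit having the highest priority; NUM_INT, the vector values, and highest_vector are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

#define NUM_INT 4                    /* hypothetical number of interrupt sources */

/* Hypothetical vectors, indexed by bit position; bit NUM_INT-1 is highest priority. */
static const uint32_t int_vector[NUM_INT] = { 0060, 0064, 0200, 0220 };

/* Return the vector of the highest-priority pending request, or 0 if none. */
static uint32_t highest_vector (uint32_t int_requests)
{
    int bit;

    for (bit = NUM_INT - 1; bit >= 0; bit--) {   /* scan from high priority to low */
        if (int_requests & (1u << bit))
            return int_vector[bit];              /* back-map bit position to vector */
    }
    return 0;
}
```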

3.1.5 I/O Dispatching

I/O dispatching consists of four steps:

• Identify the I/O command and analyze for the device address.
• Locate the selected device.
• Break down the I/O command into standard fields.
• Call the device processor.

Analyzing an I/O command is usually easy. Most systems have one or more explicit I/O instructions containing an I/O command and a device address. Memory mapped I/O is more complicated; the identification of a reference to I/O space becomes part of memory addressing. This usually requires centralizing memory reads and writes into subroutines, rather than as inline code.

Once an I/O command has been analyzed, the CPU must locate the device subroutine. The simplest way is a large switch statement with hardwired subroutine calls. More modular is to call through a dispatch table, with NULL entries representing non-existent devices; this also simplifies support for modifiable device addresses and configurable devices. Before calling the device routine, the CPU usually breaks down the I/O command into standard fields. This simplifies writing the peripheral simulator.
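A dispatch table of this kind can be sketched as below. The layout of the I/O command (a 6-bit device address above a 3-bit function code), the handler signature, and the names dev_tab, IO_NODEV, demo_dev, and io_dispatch are all assumptions made for the example, not SIMH definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define DEV_MAX  64                  /* hypothetical size of the device address space */
#define IO_NODEV (-1)                /* hypothetical "non-existent device" result */

/* A device handler receives the decoded function code and data, and returns data. */
typedef int32_t (*dev_handler) (int32_t func, int32_t data);

static dev_handler dev_tab[DEV_MAX]; /* NULL entries represent non-existent devices */

/* Hypothetical handler, for demonstration only: returns data plus function code. */
static int32_t demo_dev (int32_t func, int32_t data)
{
    return data + func;
}

/* Break the I/O command into standard fields and call through the table. */
static int32_t io_dispatch (uint32_t iocmd, int32_t data)
{
    uint32_t dev  = (iocmd >> 3) & 077;          /* device address field */
    int32_t  func = (int32_t) (iocmd & 07);      /* function field */

    if (dev_tab[dev] == NULL)
        return IO_NODEV;
    return dev_tab[dev] (func, data);
}
```

Changing a device's address, or removing the device from the configuration, then amounts to moving or clearing one table entry.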

3.1.6 Instruction Execution

Instruction execution is the responsibility of VM subroutine sim_instr. It is called from SCP as a result of a RUN, GO, CONT, or BOOT command. It begins executing instructions at the current PC (sim_PC points to its register description block) and continues until halted by an error or an external event.

When called, the CPU needs to account for any state changes that the user made. For example, it may need to re-evaluate whether an interrupt is pending, or restore frequently used state to local register variables for efficiency. The actual instruction fetch and execute cycle is usually structured as a loop controlled by an error variable, e.g.,

reason = 0;
do { … } while (reason == 0);

or

while (reason == 0) { … }

Within this loop, the usual order of events is:

• If the event timer sim_interval has reached zero, process any timed events. This is done by SCP subroutine sim_process_event. Because this is the polling mechanism for user-generated processor halts (^E), errors must be recognized immediately:

if (sim_interval <= 0) { if (reason = sim_process_event ()) break; }

• Check for outstanding interrupts and process if required.

• Check for other processor-unique events, such as wait-state outstanding or traps outstanding.

• Check for an instruction breakpoint. SCP has a comprehensive breakpoint facility. It allows a VM to define many different kinds of breakpoints. The VM checks for execution (type E) breakpoints during instruction fetch.

• Fetch the next instruction, increment the PC, optionally decode the address, and dispatch (via a switch statement) for execution.
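The order of events above can be sketched with a toy machine. This is not SIMH code: sim_interval and sim_process_event here are local stand-ins for the SCP facilities of the same names, and the two-opcode target, the stop code STOP_HALT, and the routine name toy_instr are hypothetical.

```c
#include <stdint.h>

#define STOP_HALT 1                  /* hypothetical stop code */

/* Stand-ins for SCP's event counter and event processor (not the real SCP code). */
static int32_t sim_interval = 0;
static int events_processed = 0;

static int32_t sim_process_event (void)
{
    events_processed++;
    sim_interval = 3;                /* pretend the next event is 3 instructions away */
    return 0;                        /* 0 means no error: keep running */
}

/* Toy target: opcode 1 increments the accumulator, anything else halts. */
static uint16_t M[8] = { 1, 1, 1, 1, 0 };
static uint32_t PC = 0;
static uint32_t AC = 0;

static int32_t toy_instr (void)
{
    int32_t reason = 0;
    uint16_t ir;

    while (reason == 0) {
        if (sim_interval <= 0) {                 /* timed events due? */
            if ((reason = sim_process_event ()) != 0)
                break;
        }
        /* interrupt, trap, and breakpoint checks would go here */
        ir = M[PC & 7];                          /* fetch */
        PC = (PC + 1) & 7;                       /* increment the PC */
        sim_interval = sim_interval - 1;         /* count one instruction */
        switch (ir) {                            /* dispatch */
        case 1:
            AC = AC + 1;
            break;
        default:
            reason = STOP_HALT;
            break;
        }
    }
    return reason;
}
```

Time is counted in instructions here, so the event counter is decremented once per fetch.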

A few guidelines for implementation:

• In general, code should reflect the hardware being simulated. This is usually simplest and easiest to debug.

• The VM should provide some debugging aids. The existing CPU’s all provide multiple instruction breakpoints, a PC change queue, error stops on invalid instructions or operations, and symbolic examination and modification of memory.

3.2 Peripheral Device Organization

The basic elements of a VM are devices, each corresponding roughly to a real chunk of hardware. A device consists of register-based state and one or more units. Thus, a multi-drive disk subsystem is a single device (representing the hardware of the real controller) and one or more units (each representing a single disk drive). Sometimes the device and its unit are the same entity as, for example, in the case of a paper tape reader. However, a single physical device, such as the console, may be broken up for convenience into separate input and output devices.

In general, units correspond to individual sources of input or output (one tape transport, one A-to-D channel). Units are the basic medium for both device timing and device I/O. Except for the console, all I/O devices are simulated as host-resident files. SCP allows the user to make an explicit association between a host-resident file and a simulated hardware entity.

Both devices and units have state. Devices operate on registers, which contain information about the state of the device, and indirectly, about the state of the units. Units operate on data sets, which may be thought of as individual instances of input or output, such as a disk pack or a punched paper tape. In a typical multi-unit device, all units are the same, and the device performs similar operations on all of them, depending on which one has been selected by the program being simulated.

(Note: SIMH, like MIMIC, restricts registers to devices. Replicated registers, for example, disk drive current state, are handled via register arrays.)

For each structural level, SIMH defines, and the VM must supply, a corresponding data structure. sim_device structures correspond to devices, sim_reg structures to registers, and sim_unit structures to units. These structures are described in detail in section 4.

The primary functions of a peripheral are:

• command decoding and execution • device timing • data transmission.

Command decoding is fairly obvious. At least one section of the peripheral code module will be devoted to processing directives issued by the CPU. Typically, the command decoder will be responsible for register and flag manipulation, and for issuing or canceling I/O requests. The former is easy, but the latter requires a thorough understanding of device timing.

3.2.1 Device Timing

The principal problem in I/O device simulation is imitating asynchronous operations in a sequential simulation environment. Fortunately, the timing characteristics of most I/O devices do not vary with external circumstances. The distinction between devices whose timing is externally generated (e.g., console keyboard) and those whose timing is internally generated (disk, paper tape reader) is crucial. With an externally timed device, there is no way to know when an in-progress operation will begin or end; with an internally timed device, given the time when an operation starts, the end time can be calculated.

For an internally timed device, the elapsed time between the start and conclusion of an operation is called the wait time. Some typical internally timed devices and their wait times include:

PTR (300 char/sec)      3.3 msec
PTP (50 char/sec)       20 msec
CLK (line frequency)    16.6 msec
TTO (30 char/sec)       33 msec

Mass storage devices, such as disks and tapes, do not have a fixed response time, but a start-to-finish time can be calculated based on current versus desired position, state of motion, etc.

For an externally timed device, there is no portable mechanism by which a VM can be notified of an external event (for example, a key stroke). Accordingly, all current VM’s poll for keyboard input, thus converting the externally timed keyboard to a pseudo-internally timed device. A more general restriction is that SIMH is single-threaded. Threaded operations must be done by polling using the unit timing mechanism, either with real units or fake units created expressly for polling.

SCP provides the supporting routines for device timing. SCP maintains a list of devices (called active devices) that are in the process of timing out. It also provides routines for querying or manipulating this list (called the active queue). Lastly, it provides a routine for checking for timed-out units and executing a VM-specified action when a time-out occurs.

Device timing is done with the UNIT structure, described in section 4. To set up a timed operation, the peripheral calculates a waiting period for a unit and places that unit on the active queue. The CPU counts down the waiting period. When the waiting period has expired, sim_process_event removes the unit from the active queue and calls a device subroutine. A device may also cancel an outstanding timed operation and query the state of the queue. The timing subroutines are:

• t_stat sim_activate (UNIT *uptr, int32 wait). This routine places the specified unit on the active queue with the specified waiting period. A waiting period of 0 is legal; negative waits cause an error. If the unit is already active, the active queue is not changed, and no error occurs.

• t_stat sim_cancel (UNIT *uptr). This routine removes the specified unit from the active queue. If the unit is not on the queue, no error occurs.

• int32 sim_is_active (UNIT *uptr). This routine tests whether a unit is in the active queue. If it is, the routine returns the time (+1) remaining; if it is not, the routine returns 0.

• double sim_gtime (void). This routine returns the time elapsed since the last RUN or BOOT command.

• uint32 sim_grtime (void). This routine returns the low-order 32b of the time elapsed since the last RUN or BOOT command.

• int32 sim_qcount (void). This routine returns the number of entries on the clock queue.

• t_stat sim_process_event (void). This routine removes all timed out units from the active queue and calls the appropriate device subroutine to service the time-out.

• int32 sim_interval. This variable counts down the first outstanding timed event. If there are no timed events outstanding, SCP counts down a “null interval” of 10,000 time units.
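The queue behavior described by these routines can be sketched with a delta list, in which each entry stores its delay beyond the previous entry (compare the time field of the UNIT structure in section 4.2). This is a minimal sketch under stated assumptions, not SCP's source: the toy_ names are hypothetical, and the head entry's delay is kept in its own time field rather than in a separate countdown variable.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy unit: only the two fields the queue needs. */
typedef struct toy_unit {
    struct toy_unit *next;   /* next unit in active queue */
    int32_t time;            /* delay beyond the previous queue entry */
} TOY_UNIT;

static TOY_UNIT *toy_queue = NULL;   /* head of the active queue */

/* Return remaining time + 1 if the unit is queued, else 0. */
int32_t toy_is_active(const TOY_UNIT *uptr)
{
    int32_t accum = 0;
    for (const TOY_UNIT *p = toy_queue; p != NULL; p = p->next) {
        accum += p->time;
        if (p == uptr)
            return accum + 1;
    }
    return 0;
}

/* Remove a unit, giving its delay to its successor so that the
   absolute times of later entries are unchanged. */
void toy_cancel(TOY_UNIT *uptr)
{
    for (TOY_UNIT **pp = &toy_queue; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == uptr) {
            *pp = uptr->next;
            if (uptr->next != NULL)
                uptr->next->time += uptr->time;
            uptr->next = NULL;
            return;
        }
    }
}

/* Queue a unit to time out `wait` units from now; -1 for a negative wait. */
int toy_activate(TOY_UNIT *uptr, int32_t wait)
{
    if (wait < 0)
        return -1;
    if (toy_is_active(uptr))
        return 0;                       /* already queued: queue unchanged */
    TOY_UNIT **pp = &toy_queue;
    while (*pp != NULL && (*pp)->time <= wait) {
        wait -= (*pp)->time;            /* make wait relative to this entry */
        pp = &(*pp)->next;
    }
    uptr->time = wait;
    uptr->next = *pp;
    if (*pp != NULL)
        (*pp)->time -= wait;            /* successor is now relative to uptr */
    *pp = uptr;
    return 0;
}
```

With relative delays, counting down time only ever touches the entry at the head of the queue, however many units are waiting.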

3.2.2 Clock Calibration

The timing mechanism described in the previous section is approximate. Devices, such as real-time clocks, which track wall time will be inaccurate. SCP provides routines to synchronize multiple simulated clocks (to a maximum of 8) to wall time.

• int32 sim_rtcn_init (int32 clock_interval, int32 clk). This routine initializes the clock calibration mechanism for simulated clock clk. The clock_interval argument is returned as the result.

• int32 sim_rtcn_calb (int32 tickspersecond, int32 clk). This routine calibrates simulated clock clk. The argument is the number of clock ticks expected per second.

The VM must call sim_rtcn_init for each simulated clock in two places: in the prolog of sim_instr, before instruction execution starts, and whenever the real-time clock is started. The simulator calls sim_rtcn_calb to calculate the actual interval delay when the real-time clock is serviced:

/* clock start */

if (!sim_is_active (&clk_unit))
    sim_activate (&clk_unit, sim_rtcn_init (clk_delay, clkno));
etc.

/* clock service */

sim_activate (&clk_unit, sim_rtcn_calb (clk_ticks_per_second, clkno));

The real-time clock is usually simulated clock 0; other clocks are used for polling asynchronous multiplexors or interval timers.

3.2.3 Data I/O

For most devices, timing is half the battle (for clocks it is the entire war); the other half is I/O. Some devices are simulated on real hardware (for example, Ethernet controllers). Most I/O devices are simulated as files on the host file system in little-endian format. SCP provides facilities for associating files with units (ATTACH command) and for reading and writing data from and to devices in an endian- and size-independent way.

For most devices, the VM designer does not have to be concerned about the formatting of simulated device files. I/O occurs in 1, 2, 4, or 8 byte quantities; SCP automatically chooses the correct data size and corrects for byte ordering. Specific issues:

• Line printers should write data as 7-bit ASCII, with newlines replacing carriage-return/line-feed sequences.

• Disks should be viewed as linear data sets, from sector 0 of surface 0 of cylinder 0 to the last sector on the disk. This allows easy transcription of real disks to files usable by the simulator.

• Magtapes, by convention, use a record based format. Each record consists of a leading 32-bit record length, the record data (padded with a byte of 0 if the record length is odd), and a trailing 32-bit record length. File marks are recorded as one record length of 0.

• Cards have 12 bits of data per column, but the data is most conveniently viewed as (ASCII) characters. Column binary can be implemented using two successive characters per card column.
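The magtape record convention in the list above can be sketched as follows. This is a minimal sketch: the function names are hypothetical, the records are built in a memory buffer rather than a host file, and the length words are assumed to be little-endian, following the general statement that simulated device files are little-endian.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Append one record to out: leading length, data, a zero pad byte if the
   length is odd, trailing length. Returns the number of bytes written, or
   0 if the record is empty or does not fit in cap bytes. */
size_t tape_write_record(uint8_t *out, size_t cap,
                         const uint8_t *data, uint32_t len)
{
    size_t padded = (size_t)len + (len & 1u);
    size_t total = 4 + padded + 4;

    if (len == 0 || total > cap)    /* a length of 0 would read as a file mark */
        return 0;
    put_le32(out, len);
    memcpy(out + 4, data, len);
    if (len & 1u)
        out[4 + len] = 0;           /* pad odd-length data to an even size */
    put_le32(out + 4 + padded, len);
    return total;
}

/* Append a file mark: a single record length of 0. */
size_t tape_write_mark(uint8_t *out, size_t cap)
{
    if (cap < 4)
        return 0;
    put_le32(out, 0);
    return 4;
}
```

The trailing length is what lets a simulated drive space backward over a record without scanning from the start of the file.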

Data I/O varies between fixed and variable capacity devices, and between buffered and non-buffered devices. A fixed capacity device differs from a variable capacity device in that the file attached to the former has a maximum size, while the file attached to the latter may expand indefinitely. A buffered device differs from a non-buffered device in that the former buffers its data set in host memory, while the latter maintains it as a file. Most variable capacity devices (such as the paper tape reader and punch) are sequential; all buffered devices are fixed capacity.

3.2.3.1 Reading and Writing Data

The ATTACH command creates an association between a host file and an I/O unit. For non-buffered devices, ATTACH stores the file pointer for the host file in the fileref field of the UNIT structure. For buffered devices, ATTACH reads the entire host file into a buffer pointed to by the filebuf field of the UNIT structure. If unit flag UNIT_MUSTBUF is set, the buffer is allocated dynamically; otherwise, it must be statically allocated.

For non-buffered devices, I/O is done with standard C subroutines plus the SCP routines sim_fread and sim_fwrite. sim_fread and sim_fwrite are identical in calling sequence and function to fread and fwrite, respectively, but will correct for endian dependencies. For buffered devices, I/O is done by copying data to or from the allocated buffer. The device code must maintain the number (+1) of the highest address modified in the hwmark field of the UNIT structure. For both the non-buffered and buffered cases, the device must perform all address calculations and positioning operations.
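The endian correction mentioned above can be sketched as an in-place fix-up applied after a plain fread (or before a plain fwrite). This is a sketch of the idea only, not the SCP routines themselves: toy_le_fix and host_is_big_endian are hypothetical names, and the sketch assumes the file data is little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if this host stores the high-order byte first. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 0;
}

/* Convert count items of size bytes each between little-endian file order
   and host order, in place. The same call works in both directions. */
void toy_le_fix(void *buf, size_t size, size_t count)
{
    uint8_t *p = (uint8_t *)buf;

    if (size < 2 || !host_is_big_endian())
        return;                          /* nothing to do on this host */
    for (size_t i = 0; i < count; i++, p += size) {
        for (size_t lo = 0, hi = size - 1; lo < hi; lo++, hi--) {
            uint8_t t = p[lo];           /* reverse the bytes of one item */
            p[lo] = p[hi];
            p[hi] = t;
        }
    }
}
```

On a little-endian host the function returns immediately, so the correction costs nothing where it is not needed.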

SIMH provides capabilities to access files >2GB (the int32 position limit). If a VM is compiled with flags USE_INT64 and USE_ADDR64 defined, then t_addr is defined as t_uint64 rather than uint32. Routine sim_fseek allows simulated devices to perform random access in large files:

• int sim_fseek (FILE *handle, t_addr position, int where). This routine is identical to standard C fseek, with two exceptions: where = SEEK_END is not supported, and the position argument can be 64b wide.

The DETACH command breaks the association between a host file and an I/O unit. For buffered devices, DETACH writes the allocated buffer back to the host file.
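For a buffered device, the device code's obligation to maintain hwmark can be sketched as below. The TOY_BUFUNIT type and toy_buf_write are hypothetical stand-ins that carry only the fields this sketch needs; a real device would use the UNIT structure described in section 4.

```c
#include <stdint.h>

/* Toy stand-in for the buffer-related fields of a unit. */
typedef struct {
    uint16_t *filebuf;   /* in-memory copy of the data set */
    uint32_t hwmark;     /* highest modified address, + 1 */
    uint32_t capac;      /* capacity in words */
} TOY_BUFUNIT;

/* Store one word in the buffer and keep hwmark up to date.
   Returns 0 on success, -1 if addr is beyond the unit's capacity. */
int toy_buf_write(TOY_BUFUNIT *u, uint32_t addr, uint16_t val)
{
    if (addr >= u->capac)
        return -1;
    u->filebuf[addr] = val;
    if (addr >= u->hwmark)
        u->hwmark = addr + 1;    /* only ever grows */
    return 0;
}
```

Writes below the current high-water mark leave hwmark alone, so it always records the extent of modified data.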

3.2.3.2 Console I/O

SCP provides three routines for console I/O.

• t_stat sim_poll_char (void). This routine polls for keyboard input. If there is a character, it returns SCPE_KFLAG + the character. If the user typed the interrupt character (^E), it returns SCPE_STOP. If the console is attached to a Telnet connection, and the connection is lost, the routine returns SCPE_LOST. If there is no input, it returns SCPE_OK.

• t_stat sim_putchar (int32 char). This routine types the specified ASCII character to the console. If the console is attached to a Telnet connection, and the connection is lost, the routine returns SCPE_LOST.

• t_stat sim_putchar_s (int32 char). This routine outputs the specified ASCII character to the console. If the console is attached to a Telnet connection, and the connection is lost, the routine returns SCPE_LOST; if the connection is backlogged, the routine returns SCPE_STALL.
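A keyboard service routine typically has to separate the three kinds of result that the polling routine can return. The sketch below shows one way to do that. The CON_ constants are hypothetical values chosen for illustration (the real status codes are defined by SCP), and kbd_decode is not an SCP routine.

```c
#include <stdint.h>

/* Hypothetical status values; the real ones come from the SCP headers. */
enum { CON_OK = 0, CON_STOP = 1, CON_LOST = 2, CON_KFLAG = 0x1000 };

typedef struct {
    int     have_char;   /* nonzero if a character was typed */
    uint8_t ch;          /* the character, if have_char is set */
    int     stop;        /* CON_OK, or the status to pass back to the CPU */
} KBD_RESULT;

/* Split a poll status into "character", "nothing", or "stop". */
KBD_RESULT kbd_decode(int32_t status)
{
    KBD_RESULT r = { 0, 0, CON_OK };

    if (status >= CON_KFLAG) {           /* flag + character */
        r.have_char = 1;
        r.ch = (uint8_t)((status - CON_KFLAG) & 0xFF);
    } else if (status != CON_OK) {       /* ^E or lost connection */
        r.stop = status;
    }
    return r;
}
```

Returning the stop status from the unit service routine is what carries a user's ^E back to the instruction loop.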

4. Data Structures

The devices, units, and registers that make up a VM are formally described through a set of data structures which interface the VM to the control portions of SCP. The devices themselves are pointed to by the device list array sim_devices[]. Within a device, both units and registers are allocated contiguously as arrays of structures. In addition, many devices allow the user to set or clear options via a modifications table.

4.1 sim_device Structure

Devices are defined by the sim_device structure (typedef DEVICE):

struct sim_device {
    char                *name;          /* name */
    struct sim_unit     *units;         /* units */
    struct sim_reg      *registers;     /* registers */
    struct sim_mtab     *modifiers;     /* modifiers */
    int32               numunits;       /* #units */
    uint32              aradix;         /* address radix */
    uint32              awidth;         /* address width */
    uint32              aincr;          /* addr increment */
    uint32              dradix;         /* data radix */
    uint32              dwidth;         /* data width */
    t_stat              (*examine)();   /* examine routine */
    t_stat              (*deposit)();   /* deposit routine */
    t_stat              (*reset)();     /* reset routine */
    t_stat              (*boot)();      /* boot routine */
    t_stat              (*attach)();    /* attach routine */
    t_stat              (*detach)();    /* detach routine */
    void                *ctxt;          /* context */
    uint32              flags;          /* flags */
    uint32              dctrl;          /* debug control flags */
    struct sim_debtab   *debflags;      /* debug flag names */
    t_stat              (*msize)();     /* memory size change */
    char                *lname;         /* logical name */
};

The fields are the following:

name        device name, string of all capital alphanumeric characters.
units       pointer to array of sim_unit structures, or NULL if none.
registers   pointer to array of sim_reg structures, or NULL if none.
modifiers   pointer to array of sim_mtab structures, or NULL if none.
numunits    number of units in this device.
aradix      radix for input and display of device addresses, 2 to 16 inclusive.
awidth      width in bits of a device address, 1 to 64 inclusive.
aincr       increment between device addresses, normally 1; however, byte addressed devices with 16-bit words specify 2, with 32-bit words 4.
dradix      radix for input and display of device data, 2 to 16 inclusive.
dwidth      width in bits of device data, 1 to 64 inclusive.
examine     address of special device data read routine, or NULL if none is required.
deposit     address of special device data write routine, or NULL if none is required.
reset       address of device reset routine, or NULL if none is required.
boot        address of device bootstrap routine, or NULL if none is required.
attach      address of special device attach routine, or NULL if none is required.
detach      address of special device detach routine, or NULL if none is required.
ctxt        address of VM-specific device context table, or NULL if none is required.
flags       device flags.
dctrl       debug control flags.
debflags    pointer to array of sim_debtab structures, or NULL if none.
msize       address of memory size change routine, or NULL if none is required.
lname       pointer to logical name string.

4.1.1 Awidth and Aincr

The awidth field specifies the width of the VM’s fundamental computer “word”. For example, on the PDP-11, awidth is 16b, even though memory is byte-addressable. The aincr field specifies how many addressing units comprise the fundamental “word”. For example, on the PDP-11, aincr is 2 (2 bytes per word).

If aincr is greater than 1, SCP assumes that data is naturally aligned on addresses that are multiples of aincr. VM’s that support arbitrary byte alignment of data (like the VAX) can follow one of two strategies:

• Set awidth = 8 and aincr = 1 and support only byte access in the examine/deposit routines.

• Set awidth and aincr to the fundamental sizes and support unaligned data access in the examine/deposit routines.

In a byte-addressable VM, SAVE and RESTORE will require (memory_size_bytes / aincr) iterations to save or restore memory. Thus, it is significantly more efficient to use word-wide rather than byte-wide memory; but requirements for unaligned access can add significantly to the complexity of the examine and deposit routines.

4.1.2 Device Flags

The flags field contains indicators of current device status. SIMH defines 7 flags:

flag name meaning if set

DEV_DISABLE     device can be set enabled or disabled
DEV_DIS         device is currently disabled
DEV_DYNM        device requires call on msize routine to change memory size
DEV_NET         device attaches to the network rather than a file
DEV_DEBUG       device supports SET DEBUG command
DEV_RAW         device supports raw I/O
DEV_RAWONLY     device supports only raw I/O

Starting at bit position DEV_V_UF, the remaining flags are device-specific. Device flags are automatically saved and restored; the device need not supply a register for these bits.

4.1.3 Context

The ctxt field contains a pointer to a VM-specific device context table, if required. SIMH never accesses this field. The context field allows VM-specific code to walk VM-specific data structures from the sim_devices root pointer.

4.1.4 Examine and Deposit Routines

For devices which maintain their data sets as host files, SCP implements the examine and deposit data functions. However, devices which maintain their data sets as private state (for example, the CPU) must supply special examine and deposit routines. The calling sequences are:

t_stat examine_routine (t_val *eval_array, t_addr addr, UNIT *uptr, int32 switches) – Copy sim_emax consecutive addresses for unit uptr, starting at addr, into eval_array. The switch variable has bit n set if the n’th letter was specified as a switch to the examine command.

t_stat deposit_routine (t_val value, t_addr addr, UNIT *uptr, int32 switches) – Store the specified value in the specified addr for unit uptr. The switch variable is the same as for the examine routine.
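A pair of examine and deposit routines for a device that keeps its data as private state can be sketched as follows. This is a toy under stated assumptions: the 16-word memory, the EX_ status codes, the -B switch, and the TOY_SWMASK macro (which assumes the letter A maps to bit 0) are all hypothetical, and the unit pointer argument is omitted because the toy has a single unit.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MEMSIZE 16u
/* Bit n for the n'th letter, with A as bit 0 (an assumption of this sketch). */
#define TOY_SWMASK(c) (1u << ((c) - 'A'))

enum { EX_OK = 0, EX_NXM = 1, EX_ARG = 2 };    /* hypothetical status codes */

static uint16_t toy_mem[TOY_MEMSIZE];          /* private state, no host file */
static const uint32_t toy_emax = 2;            /* plays the role of sim_emax */

/* Copy toy_emax consecutive words starting at addr into eval_array.
   Words past the end of memory read as 0. */
int toy_examine(uint32_t *eval_array, uint32_t addr, uint32_t switches)
{
    (void)switches;                            /* no examine switches here */
    if (eval_array == NULL)
        return EX_ARG;
    if (addr >= TOY_MEMSIZE)
        return EX_NXM;
    for (uint32_t i = 0; i < toy_emax; i++)
        eval_array[i] = (addr + i < TOY_MEMSIZE) ? toy_mem[addr + i] : 0;
    return EX_OK;
}

/* Store value at addr; with the hypothetical -B switch, only the low byte. */
int toy_deposit(uint32_t value, uint32_t addr, uint32_t switches)
{
    if (addr >= TOY_MEMSIZE)
        return EX_NXM;
    if (switches & TOY_SWMASK('B'))
        toy_mem[addr] = (uint16_t)((toy_mem[addr] & 0xFF00u) | (value & 0xFFu));
    else
        toy_mem[addr] = (uint16_t)(value & 0xFFFFu);
    return EX_OK;
}
```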

4.1.5 Reset Routine

The reset routine implements the device reset function for the RESET, RUN, and BOOT commands. Its calling sequence is:

t_stat reset_routine (DEVICE *dptr) – Reset the specified device to its initial state.

A typical reset routine clears all device flags and cancels any outstanding timing operations. Switch –p specifies a reset to power-up state.

4.1.6 Boot Routine

If a device responds to a BOOT command, the boot routine implements the bootstrapping function. Its calling sequence is:

t_stat boot_routine (int32 unit_num, DEVICE *dptr) – Bootstrap unit unit_num on the device dptr.

A typical bootstrap routine copies a bootstrap loader into main memory and sets the PC to the starting address of the loader. SCP then starts simulation at the specified address.

4.1.7 Attach and Detach Routines

Normally, the ATTACH and DETACH commands are handled by SCP. However, devices which need to pre- or post-process these commands must supply special attach and detach routines. The calling sequences are:

t_stat attach_routine (UNIT *uptr, char *file) – Attach the specified file to the unit uptr. sim_switches contains the command switches; bit SIM_SW_REST indicates that attach is being called by the RESTORE command rather than the ATTACH command.

t_stat detach_routine (UNIT *uptr) – Detach unit uptr.

In practice, these routines usually invoke the standard SCP routines, attach_unit and detach_unit, respectively. For example, here are special attach and detach routines to update line printer error state:

t_stat lpt_attach (UNIT *uptr, char *cptr)
{
    t_stat r;

    if ((r = attach_unit (uptr, cptr)) != SCPE_OK) return r;
    lpt_error = 0;
    return SCPE_OK;
}

t_stat lpt_detach (UNIT *uptr)
{
    lpt_error = 1;
    return detach_unit (uptr);
}

If the VM specifies an ATTACH or DETACH routine, SCP bypasses its normal tests before calling the VM routine. Thus, a VM DETACH routine cannot be assured that the unit is actually attached and must test the unit flags if required.

SCP executes a DETACH ALL command as part of simulator exit. Normally, DETACH ALL only calls a unit’s detach routine if the unit’s UNIT_ATT flag is set. During simulator exit, the detach routine is also called if the unit is not flagged as attachable (UNIT_ATTABLE is not set). This allows the detach routine of a non-attachable unit to function as a simulator-specific cleanup routine for the unit, device, or entire simulator.

4.1.8 Memory Size Change Routine

Most units instantiate any memory array at the maximum size possible. This allows apparent memory size to be changed by varying the capac field in the unit structure. For some devices (like the VAX CPU), instantiating the maximum memory size would impose a significant resource burden if less memory was actually needed. These devices must provide a routine, the memory size change routine, for RESTORE to use if memory size must be changed:

t_stat change_mem_size (UNIT *uptr, int32 val, char *cptr, void *desc) – Change the capacity (memory size) of unit uptr to val. The cptr and desc arguments are included for compatibility with the SET command’s validation routine calling sequence.

4.1.9 Debug Controls

Devices can support debug printouts. Debug printouts are controlled by the SET {NO}DEBUG command, which specifies where debug output should be printed; and by the SET <dev> {NO}DEBUG command, which enables or disables individual debug printouts.

If a device supports debug printouts, device flag DEV_DEBUG must be set. Field dctrl is used for the debug control flags. If a device supports only a single debug on/off flag, then the debflags field should be set to NULL. If a device supports multiple debug on/off flags, then the correspondence between bit positions in dctrl and debug flag names is specified by table debflags. debflags points to a contiguous array of sim_debtab structures (typedef DEBTAB). Each sim_debtab structure specifies a single debug flag:

struct sim_debtab {
    char    *name;  /* flag name */
    uint32  mask;   /* control bit */
};

The fields are the following:

name name of the debug flag. mask bit mask of the debug flag.

The array is terminated with a NULL entry.
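A debug flag table and the lookup that maps a flag name to its dctrl bit can be sketched as follows. The TOY_DEBTAB type mirrors the two fields above; the flag names CMD, DATA, and INT and both helper functions are hypothetical, and real command parsing would be done by SCP rather than by the device.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name;    /* flag name */
    uint32_t    mask;    /* control bit */
} TOY_DEBTAB;

/* Hypothetical flags for a toy device; a NULL name ends the table. */
static const TOY_DEBTAB toy_debug[] = {
    { "CMD",  1u << 0 },
    { "DATA", 1u << 1 },
    { "INT",  1u << 2 },
    { NULL, 0 }
};

/* Return the control bit for a flag name, or 0 if the name is unknown. */
uint32_t toy_debug_mask(const TOY_DEBTAB *tab, const char *name)
{
    for (; tab->name != NULL; tab++)
        if (strcmp(tab->name, name) == 0)
            return tab->mask;
    return 0;
}

/* Enable or disable one named debug flag in dctrl; -1 if unknown. */
int toy_set_debug(uint32_t *dctrl, const TOY_DEBTAB *tab,
                  const char *name, int enable)
{
    uint32_t mask = toy_debug_mask(tab, name);

    if (mask == 0)
        return -1;
    if (enable)
        *dctrl |= mask;
    else
        *dctrl &= ~mask;
    return 0;
}
```

Device code would then guard each printout with a test such as (dctrl & mask) before writing to the debug output.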

4.2 sim_unit Structure

Units are allocated as a contiguous array. Each unit is defined with a sim_unit structure (typedef UNIT):

struct sim_unit {
    struct sim_unit *next;          /* next active */
    t_stat          (*action)();    /* action routine */
    char            *filename;      /* open file name */
    FILE            *fileref;       /* file reference */
    void            *filebuf;       /* memory buffer */
    uint32          hwmark;         /* high water mark */
    int32           time;           /* time out */
    uint32          flags;          /* flags */
    t_addr          capac;          /* capacity */
    t_addr          pos;            /* file position */
    int32           buf;            /* buffer */
    int32           wait;           /* wait */
    int32           u3;             /* device specific */
    int32           u4;             /* device specific */
    int32           u5;             /* device specific */
    int32           u6;             /* device specific */
};

The fields are the following:

next        pointer to next unit in active queue, NULL if none.
action      address of unit time-out service routine.
filename    pointer to name of attached file, NULL if none.
fileref     pointer to FILE structure of attached file, NULL if none.
hwmark      buffered devices only; highest modified address, + 1.
time        increment until time-out beyond previous unit in active queue.
flags       unit flags.
capac       unit capacity, 0 if variable.
pos         sequential devices only; next device address to be read or written.
buf         by convention, the unit buffer, but can be used for other purposes.
wait        by convention, the unit wait time, but can be used for other purposes.
u3          user-defined.
u4          user-defined.
u5          user-defined.
u6          user-defined.

buf, wait, u3, u4, u5, u6, and parts of flags are all saved and restored by the SAVE and RESTORE commands and thus can be used for unit state which must be preserved.

Macro UDATA is available to fill in the common fields of a UNIT. It is invoked by

UDATA (action_routine, flags, capacity)

Fields after buf can be filled in manually, e.g.,

UNIT lpt_unit = { UDATA (&lpt_svc, UNIT_SEQ+UNIT_ATTABLE, 0), 500 };

defines the line printer as a sequential unit with a wait time of 500.

4.2.1 Unit Flags

The flags field contains indicators of current unit status. SIMH defines 13 flags:

flag name meaning if set

UNIT_ATTABLE    the unit responds to ATTACH and DETACH.
UNIT_RO         the unit is currently read only.
UNIT_FIX        the unit is fixed capacity.
UNIT_SEQ        the unit is sequential.
UNIT_ATT        the unit is currently attached to a file.
UNIT_BINK       the unit measures “K” as 1024, rather than 1000.
UNIT_BUFABLE    the unit buffers its data set in memory.
UNIT_MUSTBUF    the unit allocates its data buffer dynamically.
UNIT_BUF        the unit is currently buffering its data set in memory.
UNIT_ROABLE     the unit can be ATTACHed read only.
UNIT_DISABLE    the unit responds to ENABLE and DISABLE.
UNIT_DIS        the unit is currently disabled.
UNIT_RAW        the unit is attached in RAW mode.

Starting at bit position UNIT_V_UF, the remaining flags are unit-specific. Unit-specific flags are set and cleared with the SET and CLEAR commands, which reference the MTAB array (see below). Unit-specific flags and UNIT_DIS are automatically saved and restored; the device need not supply a register for these bits.

4.2.2 Service Routine

This routine is called by sim_process_event when a unit times out. Its calling sequence is:

t_stat service_routine (UNIT *uptr)

The status returned by the service routine is passed by sim_process_event back to the CPU.

4.3 sim_reg Structure

Registers are allocated as a contiguous array, with a NULL register at the end. Each register is defined with a sim_reg structure (typedef REG):

struct sim_reg {
    char    *name;      /* name */
    void    *loc;       /* location */
    uint32  radix;      /* radix */
    uint32  width;      /* width */
    uint32  offset;     /* starting bit */
    uint32  depth;      /* save depth */
    uint32  flags;      /* flags */
    uint32  qptr;       /* current queue pointer */
};

The fields are the following:

name        register name, string of all capital alphanumeric characters.
loc         pointer to location of the register value.
radix       radix for input and display of data, 2 to 16 inclusive.
width       width in bits of data, 1 to 32 inclusive.
offset      bit offset (from right end of data).
depth       size of data array (normally 1).
flags       flags and formatting information.
qptr        for a circular queue, the entry number for the first entry.

The depth field is used with “arrayed registers”. Arrayed registers are used to represent structures with multiple data values, such as the locations in a transfer buffer; or structures which are replicated in every unit, such as a drive status register. The qptr field is used with “queued registers”. Queued registers are arrays that are organized as circular queues, such as the PC change queue.
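A queued register such as a PC change queue can be sketched as a power-of-two array plus an index that plays the role of qptr. This is a minimal sketch: the names and the queue size are hypothetical, and the convention of moving the index backward on each insertion is one possible choice that keeps the newest entry first.

```c
#include <stdint.h>

#define PCQ_SIZE 4u                  /* must be a power of 2 */
#define PCQ_MASK (PCQ_SIZE - 1u)

static uint32_t pcq[PCQ_SIZE];       /* the arrayed register's data */
static uint32_t pcq_p = 0;           /* index of the first (newest) entry */

/* Record a PC value, overwriting the oldest entry when the queue is full. */
void pcq_entry(uint32_t old_pc)
{
    pcq_p = (pcq_p - 1u) & PCQ_MASK;
    pcq[pcq_p] = old_pc;
}

/* Return the n'th most recent entry; n = 0 is the newest. */
uint32_t pcq_get(uint32_t n)
{
    return pcq[(pcq_p + n) & PCQ_MASK];
}
```

Displaying the register from the queue pointer onward, wrapping at the array depth, then lists entries from newest to oldest.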

Macros ORDATA, DRDATA, and HRDATA define right-justified octal, decimal, and hexadecimal registers, respectively. They are invoked by:

xRDATA (name, location, width)

Macro FLDATA defines a one-bit binary flag at an arbitrary offset in a 32-bit word. It is invoked by:

FLDATA (name, location, bit_position)

Macro GRDATA defines a register with arbitrary location and radix. It is invoked by:

GRDATA (name, location, radix, width, bit_position)

Macro BRDATA defines an arrayed register whose data is kept in a standard C array. It is invoked by:

BRDATA (name, location, radix, width, depth)

For all of these macros, the flag field can be filled in manually, e.g.,

REG lpt_reg[] = { { DRDATA (POS, lpt_unit.pos, 31), PV_LEFT }, … };

Finally, macro URDATA defines an arrayed register whose data is part of the UNIT structure. This macro must be used with great care. If the fields are set up wrong, or the data is actually kept somewhere else, storing through this register declaration can trample over memory. The macro is invoked by:

URDATA (name, location, radix, width, offset, depth, flags)

The location should be an offset in the UNIT structure for unit 0. The width should be 32 for an int32 or uint32 field, and T_ADDR_W for a t_addr field. The flags can be any of the normal register flags; REG_UNIT will be OR’d in automatically. For example, the following declares an arrayed register of all the UNIT position fields in a device with 4 units:

{ URDATA (POS, dev_unit[0].pos, 8, T_ADDR_W, 0, 4, 0) }

4.3.1 Register Flags

The flags field contains indicators that control register examination and deposit.

flag name meaning if specified

PV_RZRO         print register right justified with leading zeroes.
PV_RSPC         print register right justified with leading spaces.
PV_LEFT         print register left justified.
REG_RO          register is read only.
REG_HIDDEN      register is hidden (will not appear in EXAMINE STATE).
REG_HRO         register is read only and hidden.
REG_NZ          new register values must be non-zero.
REG_UNIT        register resides in the UNIT structure.
REG_CIRC        register is a circular queue.
REG_VMIO        register is displayed and parsed using VM data routines.
REG_VMAD        register is displayed and parsed using VM address routines.

4.4 sim_mtab Structure

Device-specific SHOW and SET commands are processed using the modifications array, which is allocated as a contiguous array, with a NULL at the end. Each possible modification is defined with a sim_mtab structure (synonym MTAB), which has the following fields:

struct sim_mtab {
    uint32  mask;           /* mask */
    uint32  match;          /* match */
    char    *pstring;       /* print string */
    char    *mstring;       /* match string */
    t_stat  (*valid)();     /* validation routine */
    t_stat  (*disp)();      /* display routine */
    void    *desc;          /* location descriptor */
};

MTAB supports two different structure interpretations: regular and extended. A regular MTAB entry modifies flags in the UNIT flags word; the descriptor entry is not used. The fields are the following:

mask        bit mask for testing the unit.flags field
match       value to be stored (SET) or compared (SHOW)
pstring     pointer to character string printed on a match (SHOW), or NULL
mstring     pointer to character string to be matched (SET), or NULL
valid       address of validation routine (SET), or NULL
disp        address of display routine (SHOW), or NULL

For SET, a regular MTAB entry is interpreted as follows:

1. Test to see if the mstring entry exists.
2. Test to see if the SET parameter matches the mstring.
3. Call the validation routine, if any.
4. Apply the mask value to the UNIT flags word and then OR in the match value.

For SHOW, a regular MTAB entry is interpreted as follows:

1. Test to see if the pstring entry exists.
2. Test to see if the UNIT flags word, masked with the mask value, equals the match value.
3. If a display routine exists, call it, otherwise
4. Print the pstring.
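The SET and SHOW steps for regular entries can be sketched as follows. This is a simplified toy, not SCP's implementation: the TOY_MTAB type omits the validation and display routines, the write-lock flag and its bit position are hypothetical, names must match exactly, and the table is assumed to end at an entry whose mask is 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t    mask;       /* bits of the flags word that this entry owns */
    uint32_t    match;      /* value to store (SET) or compare (SHOW) */
    const char *pstring;    /* printed on a match (SHOW), or NULL */
    const char *mstring;    /* matched against the parameter (SET), or NULL */
} TOY_MTAB;

#define TOY_WLK (1u << 8)   /* hypothetical unit-specific write-lock flag */

static const TOY_MTAB toy_mod[] = {
    { TOY_WLK, 0,       "write enabled", "WRITEENABLED" },
    { TOY_WLK, TOY_WLK, "write locked",  "LOCKED" },
    { 0, 0, NULL, NULL }
};

/* SET: find the entry whose mstring matches, clear the masked bits,
   then OR in the match value. Returns -1 if nothing matched. */
int toy_set(uint32_t *flags, const TOY_MTAB *tab, const char *param)
{
    for (; tab->mask != 0; tab++) {
        if (tab->mstring == NULL || strcmp(tab->mstring, param) != 0)
            continue;
        *flags = (*flags & ~tab->mask) | (tab->match & tab->mask);
        return 0;
    }
    return -1;
}

/* SHOW: return the pstring of the first entry whose masked flags
   equal its match value, or NULL if none does. */
const char *toy_show(uint32_t flags, const TOY_MTAB *tab)
{
    for (; tab->mask != 0; tab++) {
        if (tab->pstring != NULL && (flags & tab->mask) == tab->match)
            return tab->pstring;
    }
    return NULL;
}
```

Because SET clears only the bits under the mask, unrelated flags in the same word are left untouched.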

Extended MTAB entries have a different interpretation:

mask        entry flags
                MTAB_XTD    extended entry
                MTAB_VDV    valid for devices
                MTAB_VUN    valid for units
                MTAB_VAL    takes a value
                MTAB_NMO    valid only in named SHOW
                MTAB_NC     do not convert option value to upper case
                MTAB_SHP    SHOW parameter takes optional value
match       value to be stored (SET)
pstring     pointer to character string printed on a match (SHOW), or NULL
mstring     pointer to character string to be matched (SET), or NULL
valid       address of validation routine (SET), or NULL
disp        address of display routine (SHOW), or NULL
desc        pointer to a REG structure (MTAB_VAL set) or a validation-specific structure (MTAB_VAL clear)

For SET, an extended MTAB entry is interpreted as follows:

1. Test to see if the mstring entry exists.
2. Test to see if the SET parameter matches the mstring.
3. Test to see if the entry is valid for the type of SET being done (SET device or SET unit).
4. If a validation routine exists, call it and return its status. The validation routine is responsible for storing the result.
5. If desc is NULL, exit.
6. If MTAB_VAL is set, parse the SET option for “option=n”, and store the value n in the register described by desc.
7. Otherwise, store the match value in the int32 pointed to by desc.

For SHOW, an extended MTAB entry is interpreted as follows:

1. Test to see if the pstring entry exists.
2. Test to see if the entry is valid for the type of SHOW being done (device or unit).
3. If a display routine exists, call it, otherwise,
4. If MTAB_VAL is set, print “=n”, where the value, radix, and width are taken from the register described by desc, otherwise,
5. Print the pstring.

SHOW [dev|unit] <modifier>{=<value>} is a special case. Only two kinds of modifiers can be displayed individually: an extended MTAB entry that takes a value; and any MTAB entry with both a display routine and a pstring. Recall that if a display routine exists, SHOW does not use the pstring entry. For displaying a named modifier, pstring is used as the string match. This allows implementation of complex display routines that are only invoked by name, e.g.,

MTAB cpu_tab[] = {
    { mask, value, "normal", "NORMAL", NULL, NULL, NULL },
    { MTAB_XTD|MTAB_VDV|MTAB_NMO, 0, "SPECIAL", NULL, NULL, &spec_disp, NULL },
    { 0 } };

A SHOW CPU command will display only the modifier named NORMAL; but SHOW CPU SPECIAL will invoke the special display routine.

4.4.1 Validation Routine

The validation routine can be used to validate input during SET processing. It can make other state changes required by the modification or initiate additional dialogs needed by the modifier. Its calling sequence is:

t_stat validation_routine (UNIT *uptr, int32 value, char *cptr, void *desc) – test that uptr->flags can be set to value. cptr points to the value portion of the parameter string (any characters after the = sign); if cptr is NULL, no value was given. desc points to the REG or int32 used to store the parameter.
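As an illustration of this calling sequence, the following is a minimal sketch of a validation routine for a made-up modifier, LINES=n. The routine name set_lines, the 1 to 16 range, the reduced UNIT structure, and the numeric value of SCPE_ARG are all assumptions made so that the example compiles on its own; a real VM would take the types and status codes from sim_defs.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for SIMH definitions (assumptions, not the real headers). */
typedef int32_t int32;
typedef int t_stat;
enum { SCPE_OK = 0, SCPE_ARG = 1 };        /* SCPE_ARG value is a placeholder */
typedef struct { uint32_t flags; } UNIT;    /* only the field mentioned above */

/* Hypothetical validation routine for "SET <unit> LINES=n", 1 <= n <= 16.
   desc is assumed to point to the int32 that holds the setting. */
t_stat set_lines (UNIT *uptr, int32 value, char *cptr, void *desc)
{
    char *end;
    long n;

    assert (uptr != NULL);
    (void) value;                           /* match value is unused here */
    if (cptr == NULL || desc == NULL)       /* no "=value" was supplied */
        return SCPE_ARG;
    n = strtol (cptr, &end, 10);
    if (end == cptr || *end != '\0' || n < 1 || n > 16)
        return SCPE_ARG;                    /* not a number, or out of range */
    *(int32 *) desc = (int32) n;            /* the routine stores the result itself */
    return SCPE_OK;
}
```

Because the routine returns before storing anything when the input is rejected, a failed SET leaves the previous setting untouched.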

4.4.2 Display Routine

The display routine is called during SHOW processing to display device- or unit-specific state. Its calling sequence is:

t_stat display_routine (FILE *st, UNIT *uptr, void *desc) – output device- or unit-specific state for uptr to stream st. If the modifier is a regular MTAB entry, or an extended entry without MTAB_SHP set, desc points to the structure in the MTAB entry. If the modifier is an extended MTAB entry with MTAB_SHP set, desc points to the optional value string or is NULL if no value was supplied.

When the display routine is called for a regular MTAB entry, SHOW has output the pstring argument but has not appended a newline. When it is called for an extended MTAB entry, SHOW hasn’t output anything. SHOW will append a newline after the display routine returns, except for entries with the MTAB_NMO flag set.

4.5 Other Data Structures

char sim_name[] is a character array containing the VM name.

int32 sim_emax contains the maximum number of words needed to hold the largest instruction or data item in the VM. Examine and deposit will process up to sim_emax words.

DEVICE *sim_devices[] is an array of pointers to all the devices in the VM. It is terminated by a NULL. By convention, the CPU is always the first device in the array.

REG *sim_PC points to the REG structure for the program counter. By convention, the PC is always the first register in the CPU’s register array.

char *sim_stop_messages[] is an array of pointers to character strings, corresponding to error status returns greater than zero. If sim_instr returns status code n > 0, then sim_stop_messages[n] is printed by SCP.

5. VM Provided Routines

5.1 Instruction Execution

Instruction execution is performed by routine sim_instr. Its calling sequence is:

t_stat sim_instr (void) – execute from current PC until error or halt.

5.2 Binary Load and Dump

If the VM responds to the LOAD (or DUMP) command, the load routine (dump routine) is implemented by routine sim_load. Its calling sequence is:

t_stat sim_load (FILE *fptr, char *buf, char *fnam, t_bool flag) - If flag = 0, load data from fptr. If flag = 1, dump data to binary file fptr. For either command, buf contains any VM-specific arguments, and fnam contains the file name.

If LOAD or DUMP is not implemented, sim_load should simply return SCPE_ARG. The LOAD and DUMP commands open and close the specified file for sim_load.

5.3 Symbolic Examination and Deposit

If the VM provides symbolic examination and deposit of data, it must provide two routines, fprint_sym for output and parse_sym for input. Their calling sequences are:

t_stat fprint_sym (FILE *ofile, t_addr addr, t_value *val, UNIT *uptr, int32 switch) – Based on the switch variable, symbolically output to stream ofile the data in array val at the specified addr in unit uptr.

t_stat parse_sym (char *cptr, t_addr addr, UNIT *uptr, t_value *val, int32 switch) – Based on the switch variable, parse character string cptr for a symbolic value val at the specified addr in unit uptr.

If symbolic processing is not implemented, or the output value or input string cannot be parsed, these routines should return SCPE_ARG. If the processing was successful and consumed more than a single addressing unit, then these routines should return the number of extra addressing units consumed, as a negative number. If the processing was successful and consumed a single addressing unit, then these routines should return SCPE_OK. For example, PDP-11 parse_sym would respond as follows to various inputs:

input            return value

XYZGH            SCPE_ARG
MOV R0,R1        -1
MOV #4,R5        -3
MOV 1234,5670    -5
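The return convention can be sketched as a small helper that converts the number of addressing units consumed into the status value. The helper name sym_status is hypothetical; on the PDP-11 the addressing unit is the byte, so a one-word instruction consumes two units.

```c
#include <assert.h>

typedef int t_stat;
enum { SCPE_OK = 0 };                 /* SCPE_OK is zero */

/* Hypothetical helper: convert the number of addressing units consumed
   (bytes on the PDP-11) into the status that parse_sym or fprint_sym
   would return on success. */
t_stat sym_status (int units_consumed)
{
    assert (units_consumed >= 1);
    if (units_consumed == 1)
        return SCPE_OK;               /* exactly one addressing unit */
    return -(units_consumed - 1);     /* extra units, as a negative number */
}
```

With this helper, the three successful rows of the table correspond to 2, 4, and 6 bytes consumed.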

There is an implicit relationship between the addr and val arguments and the device’s aincr fields. Each entry in val is assumed to represent aincr addressing units, starting at addr:

val[0]    addr + 0
val[1]    addr + aincr
val[2]    addr + (2 * aincr)
val[3]    addr + (3 * aincr)
  :         :

Because val is typically filled in and stored by calls on the device’s examine and deposit routines, respectively, the examine and deposit routines and fprint_sym and parse_sym must agree on the expected width of items in val, and on the alignment of addr. Further, if parse_sym wants to modify a storage unit narrower than awidth, it must insert the new data into the appropriate entry in val without destroying surrounding fields.

The interpretation of switch values is arbitrary, but the following are used by existing VM’s:

switch    interpretation

-a        single character
-c        character string
-m        instruction mnemonic

In addition, on input, a leading ‘ (apostrophe) is interpreted to mean a single character, and a leading “ (double quote) is interpreted to mean a character string.

5.4 Optional Interfaces

For greater flexibility, SCP provides some optional interfaces that can be used to extend its command input, command processing, and command post-processing capabilities. These interfaces are strictly optional and are off by default. Using them requires intimate knowledge of how SCP functions internally and is not recommended to the novice VM writer.

5.4.1 Once Only Initialization Routine

SCP defines a pointer (*sim_vm_init)(void). This is a “weak global”; if no other module defines this value, it will default to NULL. A VM requiring special initialization should fill in this pointer with the address of its special initialization routine:

void sim_special_init (void);
void (*sim_vm_init)(void) = &sim_special_init;

The special initialization routine can perform any actions required by the VM. If the other optional interfaces are to be used, the initialization routine can fill in the appropriate pointers; however, this can just as easily be done in the CPU reset routine.

5.4.2 Address Input and Display

SCP defines a pointer t_addr (*sim_vm_parse_addr)(DEVICE *, char *, char **). This is initialized to NULL. If it is filled in by the VM, SCP will use the specified routine to parse addresses in place of its standard numerical input routine. The calling sequence for the sim_vm_parse_addr routine is:

t_addr sim_vm_parse_addr (DEVICE *dptr, char *cptr, char **optr) – parse the string pointed to by cptr as an address for the device pointed to by dptr. On return, *optr points to the first character not successfully parsed. If *optr == cptr, parsing failed.
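A minimal sketch of such a routine is shown here, assuming a VM whose addresses are plain numbers in the device's address radix. The name vm_parse_addr and the one-field DEVICE structure are stand-ins for illustration; the real DEVICE structure is defined in sim_defs.h. The sketch relies on the standard C behavior that strtoul leaves its end pointer equal to the input pointer when no digits are found, which is exactly the failure signal described above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t t_addr;
typedef struct { uint32_t aradix; } DEVICE;   /* stand-in: address radix only */

/* Hypothetical address parser: accept a number in the device's radix.
   On return, *optr points to the first character not parsed. */
t_addr vm_parse_addr (DEVICE *dptr, char *cptr, char **optr)
{
    assert (dptr != NULL && cptr != NULL && optr != NULL);
    return (t_addr) strtoul (cptr, optr, (int) dptr->aradix);
}
```

A VM with a more elaborate address syntax (for example, a segment and an offset) would parse each part in turn and reset *optr to cptr if any part failed.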

SCP defines a pointer void (*sim_vm_fprint_addr)(FILE *, DEVICE *, t_addr). This is initialized to NULL. If it is filled in by the VM, SCP will use the specified routine to print addresses in place of its standard numerical output routine. The calling sequence for the sim_vm_fprint_addr routine is:

void sim_vm_fprint_addr (FILE *stream, DEVICE *dptr, t_addr addr) – output address addr to stream in the format required by the device pointed to by dptr.

5.4.3 Command Input and Post-Processing

SCP defines a pointer char *(*sim_vm_read)(char *, int32 *, FILE *). This is initialized to NULL. If it is filled in by the VM, SCP will use the specified routine to obtain command input in place of its standard routine, read_line. The calling sequence for the sim_vm_read routine is:

char *sim_vm_read (char *buf, int32 *max, FILE *stream) – read the next command line from stream and store it in buf, up to a maximum of max characters.

The routine is expected to strip off leading whitespace characters and to return NULL on end of file.

SCP defines a pointer void (*sim_vm_post)(t_bool from_scp). This is initialized to NULL. If filled in by the VM, SCP will call the specified routine at the end of every command. This allows the VM to update any local state, such as a GUI console display. The calling sequence for the sim_vm_post routine is:

void sim_vm_post (t_bool from_scp) – if called from SCP, the argument from_scp is TRUE; otherwise, it is FALSE.

5.4.4 VM-Specific Commands

SCP defines a pointer CTAB *sim_vm_cmd. This is initialized to NULL. If filled in by the VM, SCP interprets it as a pointer to an SCP command table. This command table is checked before user input is looked up in the standard command table.

A command table is allocated as a contiguous array. Each entry is defined with a sim_ctab structure (typedef CTAB):

struct sim_ctab {
    char    *name;          /* name */
    t_stat  (*action)();    /* action routine */
    int32   arg;            /* argument */
    char    *help;          /* help string */
    };

If the first word of a command line matches ctab.name, then the action routine is called with the following arguments:

t_stat action_routine (int32 arg, char *buf) – process input string buf based on optional argument arg

The string passed to the action routine starts at the first non-blank character past the command name.
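The lookup can be sketched as follows. This is a simplified illustration, not SCP's actual code: it assumes the table ends with an entry whose name is NULL, it requires an exact name match, and the command names, help strings, and the routines demo_action and find_vm_cmd are invented for the example. The action pointer is given a full prototype here so that the compiler can check the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t int32;
typedef int t_stat;
enum { SCPE_OK = 0 };

typedef struct sim_ctab {
    char    *name;                              /* name */
    t_stat  (*action)(int32 arg, char *buf);    /* action routine */
    int32   arg;                                /* argument */
    char    *help;                              /* help string */
} CTAB;

static int32 last_arg;                          /* records what the demo action saw */

static t_stat demo_action (int32 arg, char *buf)
{
    (void) buf;                                 /* rest of the command line, unused */
    last_arg = arg;
    return SCPE_OK;
}

/* Hypothetical VM command table; a NULL name is assumed to end it. */
static CTAB vm_cmd[] = {
    { "FRONTPANEL", &demo_action, 1, "frontpanel    show the front panel\n" },
    { "LAMPTEST",   &demo_action, 2, "lamptest      light all the lamps\n" },
    { NULL, NULL, 0, NULL }
};

/* Return the entry whose name matches the first word of the command, or NULL. */
CTAB *find_vm_cmd (CTAB *tab, const char *word)
{
    assert (tab != NULL && word != NULL);
    for ( ; tab->name != NULL; tab++)
        if (strcmp (tab->name, word) == 0)
            return tab;
    return NULL;
}
```

Sharing one action routine between several entries and distinguishing them by arg, as done here, is the reason the table carries the optional argument.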

6. Other SCP Facilities

6.1 Terminal Multiplexor Emulation Library

SIMH supports the use of multiple terminals. All terminals except the console are accessed via Telnet. SIMH provides two supporting libraries for implementing multiple terminals: sim_tmxr.c (and its header file, sim_tmxr.h), which provide OS-independent support routines for terminal multiplexors; and sim_sock.c (and its header file, sim_sock.h), which provide OS-dependent socket routines. Sim_sock.c is implemented under Windows, VMS, UNIX, and MacOS.

Two basic data structures define the multiple terminals. Individual lines are defined by an array of tmln structures (typedef TMLN):

struct tmln {
    SOCKET  conn;               /* line conn */
    uint32  ipad;               /* IP address */
    uint32  cnms;               /* connect time ms */
    int32   tsta;               /* Telnet state */
    int32   rcve;               /* rcv enable */
    int32   xmte;               /* xmt enable */
    int32   dstb;               /* disable Tlnt bin */
    int32   rxbpr;              /* rcv buf remove */
    int32   rxbpi;              /* rcv buf insert */
    int32   rxcnt;              /* rcv count */
    int32   txbpr;              /* xmt buf remove */
    int32   txbpi;              /* xmt buf insert */
    int32   txcnt;              /* xmt count */
    FILE    *txlog;             /* xmt log file */
    char    *txlogname;         /* xmt log file name */
    char    rxb[TMXR_MAXBUF];   /* rcv buffer */
    char    rbr[TMXR_MAXBUF];   /* rcv break */
    char    txb[TMXR_MAXBUF];   /* xmt buffer */
    };

The fields are the following:

conn         connection socket (0 = disconnected)
tsta         Telnet state
rcve         receive enable flag (0 = disabled)
xmte         transmit flow control flag (0 = transmit disabled)
dstb         Telnet bin mode disabled
rxbpr        receive buffer remove pointer
rxbpi        receive buffer insert pointer
rxcnt        receive count
txbpr        transmit buffer remove pointer
txbpi        transmit buffer insert pointer
txcnt        transmit count
txlog        pointer to log file descriptor
txlogname    pointer to log file name
rxb          receive buffer
rbr          receive buffer break flags
txb          transmit buffer

The overall set of extra terminals is defined by the tmxr structure (typedef TMXR):

struct tmxr {
    int32   lines;      /* # lines */
    int32   port;       /* listening port */
    SOCKET  master;     /* master socket */
    TMLN    *ldsc;      /* pointer to line descriptors */
    };

The fields are the following:

lines     number of lines (constant)
port      master listening port (specified by ATTACH command)
master    master listening socket (filled in by ATTACH command)
ldsc      array of line descriptors

Library sim_tmxr.c provides the following routines to support Telnet-based terminals:

int32 tmxr_poll_conn (TMXR *mp) – poll for a new connection to the terminals described by mp. If there is a new connection, the routine resets all the line descriptor state (including receive enable) and returns the line number (index to line descriptor) for the new connection. If there isn’t a new connection, the routine returns –1.

void tmxr_reset_ln (TMLN *lp) – reset the line described by lp. The connection is closed and all line descriptor state is reset.

int32 tmxr_getc_ln (TMLN *lp) – return the next available character from the line described by lp. If a character is available, the return value is:

(1 << TMXR_V_VALID) | character

If no character is available, the return value is 0.

void tmxr_poll_rx (TMXR *mp) – poll for input available on the terminals described by mp.

int32 tmxr_rqln (TMLN *lp) – return the number of characters in the receive queue of the line described by lp.

t_stat tmxr_putc_ln (TMLN *lp, int32 chr) – output character chr to the line described by lp. Possible errors are SCPE_LOST (connection lost) and SCPE_STALL (connection backlogged).

void tmxr_poll_tx (TMXR *mp) – poll for output complete on the terminals described by mp.

int32 tmxr_tqln (TMLN *lp) – return the number of characters in the transmit queue of the line described by lp.

t_stat tmxr_attach (TMXR *mp, UNIT *uptr, char *cptr) – attach the port contained in character string cptr to the terminals described by mp and unit uptr.

t_stat tmxr_open_master (TMXR *mp, char *cptr) – associate the port contained in character string cptr to the terminals described by mp. This routine is a subset of tmxr_attach.

t_stat tmxr_detach (TMXR *mp, UNIT *uptr) – detach all connections for the terminals described by mp and unit uptr.

t_stat tmxr_close_master (TMXR *mp) – close the master port for the terminals described by mp. This routine is a subset of tmxr_detach.

t_stat tmxr_ex (t_value *vptr, t_addr addr, UNIT *uptr, int32 sw) – stub examine routine, needed because the extra terminals are marked as attached; always returns an error.

t_stat tmxr_dep (t_value val, t_addr addr, UNIT *uptr, int32 sw) – stub deposit routine, needed because the extra terminals are marked as attached; always returns an error.

void tmxr_linemsg (TMLN *lp, char *msg) – output character string msg to line lp.

void tmxr_fconns (FILE *st, TMLN *lp, int32 ln) – output connection status to stream st for the line described by lp. If ln is >= 0, preface the output with the specified line number.

void tmxr_fstats (FILE *st, TMLN *lp, int32 ln) – output connection statistics to stream st for the line described by lp. If ln is >= 0, preface the output with the specified line number.

t_stat tmxr_dscln (UNIT *uptr, int32 val, char *cptr, void *mp) – parse the string pointed to by cptr for a decimal line number. If the line number is valid, disconnect the specified line in the terminal multiplexor described by mp. The calling sequence allows tmxr_dscln to be used as an MTAB processing routine.
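The return value of tmxr_getc_ln packs a valid flag together with the character. The following sketch shows one way a device simulator might unpack it. The bit position given for TMXR_V_VALID and the 8-bit character mask are assumptions made for the example, and tmxr_result_char is a hypothetical helper; the real constants come from sim_tmxr.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t int32;

#define TMXR_V_VALID  15                    /* assumed bit position */
#define TMXR_VALID    (1 << TMXR_V_VALID)

/* Hypothetical helper: if result (as returned by tmxr_getc_ln) holds a
   character, store it in *ch and return 1; otherwise return 0. */
int tmxr_result_char (int32 result, int *ch)
{
    assert (ch != NULL);
    if ((result & TMXR_VALID) == 0)         /* no character available */
        return 0;
    *ch = result & 0xFF;                    /* assumed: character in low 8 bits */
    return 1;
}
```

The valid flag is what lets a NUL character (value 0) be distinguished from "no character available".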

The OS-dependent socket routines should not need to be accessed by the terminal simulators.

6.2 Magnetic Tape Emulation Library

SIMH supports the use of emulated magnetic tapes. Magnetic tapes are emulated as disk files containing both data records and metadata markers; the format is fully described in the paper “SIMH Magtape Representation and Handling”. SIMH provides a supporting library, sim_tape.c (and its header file, sim_tape.h), that abstracts handling of magnetic tapes. This allows support for multiple tape formats, without change to magnetic tape device simulators.

The magtape library does not require any special data structures. However, it does define some additional unit flags:

MTUF_WLK unit is write locked

If magtape simulators need to define private unit flags, those flags should begin at bit number MTUF_V_UF instead of UNIT_V_UF. The magtape library maintains the current magtape position in the pos field of the UNIT structure.

Library sim_tape.c provides the following routines to support emulated magnetic tapes:

t_stat sim_tape_attach (UNIT *uptr, char *cptr) – Attach tape unit uptr to file cptr. Tape Simulators should call this routine, rather than the standard attach_unit routine, to allow for future expansion of format support.

t_stat sim_tape_detach (UNIT *uptr) – Detach tape unit uptr from its current file.

t_stat sim_tape_set_fmt (UNIT *uptr, int32 val, char *cptr, void *desc) – Set the tape format for unit uptr to the format specified by string cptr.

t_stat sim_tape_show_fmt (FILE *st, UNIT *uptr, int32 val, void *desc) – Write the tape format for unit uptr to the file specified by descriptor st.

t_stat sim_tape_rdrecf (UNIT *uptr, uint8 *buf, t_mtrlnt *tbc, t_mtrlnt max) – Forward read the next record on unit uptr into buffer buf of size max. Return the actual record size in tbc.

t_stat sim_tape_rdrecr (UNIT *uptr, uint8 *buf, t_mtrlnt *tbc, t_mtrlnt max) – Reverse read the next record on unit uptr into buffer buf of size max. Return the actual record size in tbc. Note that the record is returned in forward order, that is, byte 0 of the record is stored in buf[0], and so on.

t_stat sim_tape_wrrecf (UNIT *uptr, uint8 *buf, t_mtrlnt tbc) – Write buffer buf of size tbc as the next record on unit uptr.

t_stat sim_tape_sprecf (UNIT *uptr, t_mtrlnt *tbc) – Space unit uptr forward one record. The size of the record is returned in tbc.

t_stat sim_tape_sprecr (UNIT *uptr, t_mtrlnt *tbc) – Space unit uptr reverse one record. The size of the record is returned in tbc.

t_stat sim_tape_wrtmk (UNIT *uptr) – Write a tape mark on unit uptr.

t_stat sim_tape_wreom (UNIT *uptr) – Write an end-of-medium marker on unit uptr (this effectively erases the rest of the tape).

t_stat sim_tape_rewind (UNIT *uptr) – Rewind unit uptr. This operation succeeds whether or not the unit is attached to a file.

t_stat sim_tape_reset (UNIT *uptr) – Reset unit uptr. This routine should be called when a tape unit is reset.

t_bool sim_tape_bot (UNIT *uptr) – Return TRUE if unit uptr is at beginning-of-tape.

t_bool sim_tape_wrp (UNIT *uptr) – Return TRUE if unit uptr is write-protected.

t_bool sim_tape_eot (UNIT *uptr, t_addr cap) – Return TRUE if unit uptr has exceeded the capacity specified by cap.

Sim_tape_attach, sim_tape_detach, sim_tape_set_fmt, and sim_tape_show_fmt return standard SCP status codes; the other magtape library routines return private codes for success and failure. The currently defined magtape status codes are:

MTSE_OK       operation successful
MTSE_UNATT    unit is not attached to a file
MTSE_FMT      unit specifies an unsupported tape file format
MTSE_IOERR    host operating system I/O error during operation
MTSE_INVRL    invalid record length (exceeds maximum allowed)
MTSE_RECE     record header contains error flag
MTSE_TMK      tape mark encountered
MTSE_BOT      beginning of tape encountered during reverse operation
MTSE_EOM      end of medium encountered
MTSE_WRP      write protected unit during write operation
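Because these codes are private to the magtape library, a tape simulator has to translate them into controller status or into a simulator stop. The sketch below shows one plausible policy, not a rule stated by the library: host-level problems stop the simulator, and everything else is reported to the simulated program. The numeric values in the enum are placeholders (the real ones are in sim_tape.h), and the tape_action type and classify_tape_status routine are invented for the example.

```c
#include <assert.h>

typedef int t_stat;

/* Placeholder numbering; the real values come from sim_tape.h. */
enum { MTSE_OK, MTSE_TMK, MTSE_UNATT, MTSE_IOERR, MTSE_INVRL,
       MTSE_FMT, MTSE_BOT, MTSE_EOM, MTSE_RECE, MTSE_WRP };

enum tape_action {
    TA_CONTINUE,        /* operation succeeded */
    TA_SET_STATUS,      /* report through the controller's status register */
    TA_STOP             /* stop the simulator */
};

/* Hypothetical policy: map a magtape status code to a simulator action. */
enum tape_action classify_tape_status (t_stat st)
{
    assert (st >= MTSE_OK && st <= MTSE_WRP);
    switch (st) {
    case MTSE_OK:
        return TA_CONTINUE;
    case MTSE_IOERR:                        /* host I/O error */
    case MTSE_FMT:                          /* unsupported file format */
        return TA_STOP;
    default:                                /* tape mark, BOT, EOM, and so on */
        return TA_SET_STATUS;
    }
}
```

Each simulator then maps the TA_SET_STATUS cases onto the status bits of the controller it models.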

Sim_tape_set_fmt and sim_tape_show_fmt should be referenced by an entry in the tape device’s modifier list, as follows:

MTAB tape_mod[] = {
    { MTAB_XTD|MTAB_VDV, 0, "FORMAT", "FORMAT", &sim_tape_set_fmt, &sim_tape_show_fmt, NULL },
    … };

6.3 Breakpoint Support

SCP provides underlying mechanisms to track multiple breakpoints of different types. Most VM’s implement at least instruction execution breakpoints (type E); but a VM might also allow for break on read (type R), write (type W), and so on. Up to 26 different breakpoint types, identified by the letters A through Z, are supported.

The VM interface to the breakpoint package consists of three variables and one subroutine:

sim_brk_types – initialized by the VM (usually in the CPU reset routine) to a mask of all supported breakpoints.

sim_brk_dflt – initialized by the VM to the mask for the default breakpoint type.

sim_brk_summ – maintained by SCP, providing a bit mask summary of whether any breakpoints of a particular type have been defined.

If the VM only implements one type of breakpoint, then sim_brk_summ is non-zero if any breakpoints are set.

To test whether a breakpoint of particular type is set for an address, the VM calls

t_bool sim_brk_test (t_addr addr, int32 typ) – test to see if a breakpoint of type typ is set for location addr

Because sim_brk_test can be a lengthy procedure, it is usually prefaced with a test of sim_brk_summ:

if (sim_brk_summ && sim_brk_test (PC, SWMASK ('E'))) { }
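The value of the summary test can be illustrated with a toy model. Everything below is a stand-in written for this example: the two-entry table, the search counter, and the routines toy_brk_test and should_break are not part of SCP, and the SWMASK definition shown (one bit per letter A through Z) is an assumption consistent with the 26 breakpoint types described above. The sketch tests the summary bit for the specific type, which is slightly stricter than testing the whole summary word for non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t t_addr;
typedef int t_bool;

#define SWMASK(x) (1u << ((x) - 'A'))       /* letter -> type bit (assumed form) */

/* Toy breakpoint table standing in for SCP's internal one. */
static const struct { t_addr addr; uint32_t typ; } brk_tab[] = {
    { 0100, SWMASK ('E') },
    { 0200, SWMASK ('R') | SWMASK ('W') },
};
static const uint32_t brk_summ = SWMASK ('E') | SWMASK ('R') | SWMASK ('W');
static int brk_searches = 0;                /* counts full table searches */

/* The (potentially lengthy) search. */
static t_bool toy_brk_test (t_addr addr, uint32_t typ)
{
    size_t i;

    brk_searches++;
    for (i = 0; i < sizeof (brk_tab) / sizeof (brk_tab[0]); i++)
        if (brk_tab[i].addr == addr && (brk_tab[i].typ & typ) != 0)
            return 1;
    return 0;
}

/* Cheap summary test first; search only if that type is in use at all. */
t_bool should_break (t_addr pc, char type)
{
    uint32_t mask;

    assert (type >= 'A' && type <= 'Z');
    mask = SWMASK (type);
    return (brk_summ & mask) != 0 && toy_brk_test (pc, mask);
}
```

When no breakpoint of the requested type exists, the table is never searched, which is what keeps the per-instruction cost low in the main execution loop.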

Adding An I/O Device To A SIMH Virtual Machine

Updated 01-Oct-04 for SIMH V3.3

This memo provides more detail on adding I/O device simulators to the various virtual machines supported by SIMH.

1. SCP and I/O Device Interactions

1.1 The SCP Interface

The simulator control package (SCP) finds devices through the device list, DEVICE *sim_devices. This list, defined in _sys.c, must be modified to add the DEVICE data structure(s) of the new device to sim_devices:

extern DEVICE new_device;
:
DEVICE *sim_devices[] = {
    &cpu_dev,
    :
    &new_device,
    NULL };

The device then defines data structures for UNITs, REGISTERs, and, if required, options.

1.2 I/O Interface Requirements

SCP provides interfaces to attach files to, and detach them from, I/O devices, and to examine and modify the contents of attached files. SCP expects devices to store individual data words right- aligned in container words. The container words should be the next largest power of 2 in width:

Data word      Container word

1b to 8b       8b
9b to 16b      16b
17b to 32b     32b
33b to 64b     64b (requires compile flag –DUSE_INT64)
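The table amounts to rounding the data width up to the next power of two, with a minimum of 8 bits. A minimal sketch, using a hypothetical helper named container_bits:

```c
#include <assert.h>

/* Hypothetical helper: return the container width, in bits, for a data
   word of data_bits bits (1 to 64), following the table above. */
int container_bits (int data_bits)
{
    int width = 8;

    assert (data_bits >= 1 && data_bits <= 64);
    while (width < data_bits)               /* 8 -> 16 -> 32 -> 64 */
        width *= 2;
    return width;
}
```

For example, a 12b PDP-8 word is kept in a 16b container and a 36b PDP-10 word in a 64b container.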

1.3 Save/Restore Interactions

The Save/Restore capability allows simulations to be stopped, saved, resumed, and repeated. For save and restore to work properly, I/O devices must save and restore all state required for operation. This includes control registers, working registers, intermediate buffers, and mode flags.

Save and restore automatically handle the following state items:

• Content of declared registers.
• Content of memory-like structures.
• Device user-specific flags and DEV_DIS.
• Whether each unit is attached to a file and, if so, the file name.
• Whether each unit is active, and, if so, the unit time out.
• Unit U3-U6 words.
• Unit user-specific flags and UNIT_DIS.

There are two methods for handling intermediate buffers. First, the buffer can be made accessible as unit memory. This requires buffer-specific examine and deposit routines. Alternately, the buffer can be declared as an arrayed register.

2. PDP-8

2.1 CPU and I/O Device Structures

Simulated memory is kept in array uint16 M[MAXMEMSIZE]. 12b words are right justified in each array entry; the high order 4b must be zero.

The interrupt structure is implemented in three parallel variables:

• int32 int_req: interrupt requests. The two high order bits are the interrupt enable flag and the interrupts-not-deferred flag
• int32 dev_done: device done flags
• int32 int_enable: device interrupt enable flags

A device without interrupt control keeps its interrupt request, which is also the device done flag, in int_req. A device with interrupt control keeps its interrupt request in dev_done and its interrupt enable flag in int_enable. Pictorially,

+----+----+…+----+----+…+----+----+----+
|ion |indf| |irq1|irq2| |irqx|irqy|irqz|   int_req
+----+----+…+----+----+…+----+----+----+

+----+----+…+----+----+…+----+----+----+
| 0  | 0  | | 0  | 0  | |donx|dony|donz|   dev_done
+----+----+…+----+----+…+----+----+----+

+----+----+…+----+----+…+----+----+----+
| 0  | 0  | | 0  | 0  | |enbx|enby|enbz|   int_enable
+----+----+…+----+----+…+----+----+----+

<- fixed ->  <-no enbl->  <- with enable->

Logically, the relationship is

int_req = (int_req & (OVHD+NOENB)) | (dev_done & int_enable);

Macro INT_UPDATE maintains this relationship after a change to any of the three variables.
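The relationship can be exercised with a deliberately tiny bit layout. The three masks below are hypothetical (the real positions are defined in pdp8_defs.h and are spread across a 32b word), and int_update is a function written for this example rather than the INT_UPDATE macro itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, much smaller than the real one. */
#define INT_ENB    0x07u    /* bits 0-2: devices with interrupt enable */
#define INT_NOENB  0x18u    /* bits 3-4: devices without enable        */
#define INT_OVHD   0x60u    /* bits 5-6: ion and indf overhead flags   */

/* Recompute int_req: keep the overhead and no-enable bits as they are,
   and rebuild the enable-controlled bits from dev_done and int_enable. */
uint32_t int_update (uint32_t int_req, uint32_t dev_done, uint32_t int_enable)
{
    assert ((dev_done & ~INT_ENB) == 0);    /* dev_done holds only enable-controlled devices */
    return (int_req & (INT_OVHD + INT_NOENB)) | (dev_done & int_enable);
}
```

A device that is done but not enabled therefore contributes nothing to int_req, and a stale request bit left over from before the enable was cleared is dropped on the next update.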

Device enable flags are kept in dev_enb. By convention, the device enable flag occupies the same bit position as the device interrupt flag.

I/O dispatching is done by explicit case decoding in the IOT instruction flow for CPU IOT’s, and dispatch through table dev_tab[64] for devices. Each entry in dev_tab is a pointer to a device IOT processing routine. The calling sequence for the IOT routine is:

new_data = iot_routine (IOT instruction, current AC);

where

new_data<11:0> = new contents of AC
new_data = 1 if skip, 0 if not
new_data<31:IOT_V_REASON> = stop code, if non-zero
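The packing of the IOT return value can be sketched with a few helpers. The bit positions used here (skip flag in bit 12, stop code starting at bit 13) are assumptions made for the example and should be checked against pdp8_defs.h; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t int32;

#define IOT_V_SKP     12                    /* assumed position of the skip flag */
#define IOT_V_REASON  13                    /* assumed start of the stop code */
#define IOT_SKP       (1 << IOT_V_SKP)

/* Build the value an IOT routine returns. */
int32 iot_pack (int32 ac, int skip, int32 reason)
{
    assert (reason >= 0 && reason < (1 << 16));
    return (ac & 07777) | (skip ? IOT_SKP : 0) | (reason << IOT_V_REASON);
}

/* Take the value apart again, as the CPU's IOT flow would. */
int32 iot_ac (int32 v)     { return v & 07777; }
int   iot_skip (int32 v)   { return (v & IOT_SKP) != 0; }
int32 iot_reason (int32 v) { return v >> IOT_V_REASON; }
```

An IOT routine that only tests a flag would typically return the AC unchanged with the skip bit set or clear.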

2.2 DEVICE Context and Flags

The DEVICE ctxt (context) field must point to the device information block (DIB), if one exists. The DEVICE flags field must specify whether the device supports the “SET ENABLED/SET DISABLED” commands (DEV_DISABLE). If a device can be disabled, the state of the device flag must be declared as a register for SAVE/RESTORE.

2.3 Adding A New I/O Device

2.3.1 Defining The Device Number and Done/Interrupt Flag

Module pdp8_defs.h must be modified to add the device number definitions and the device interrupt flag definitions. The device number is the lowest device number that the device responds to (e.g., 060 for the RL8A):

#define DEV_NEW 0nn /* not 0,010,020-027 */

If the device has a separate interrupt enable, the interrupt flag must be added above INT_V_DIRECT, and the latter increased accordingly:

#define INT_V_TTI4      (INT_V_START+13)    /* clock */
#define INT_V_NEW       (INT_V_START+14)    /* new */
#define INT_V_DIRECT    (INT_V_START+15)    /* direct start */
:
#define INT_NEW         (1 << INT_V_NEW)

If the device has only an interrupt/done flag, it must be added between INT_V_DIRECT and INT_V_OVHD, and the latter increased accordingly:

#define INT_V_UF        (INT_V_DIRECT+8)    /* user int */
#define INT_V_NEW       (INT_V_DIRECT+9)    /* new */
#define INT_V_OVHD      (INT_V_DIRECT+10)   /* overhead start */
:
#define INT_NEW         (1 << INT_V_NEW)

2.3.2 Adding The Device Information Block

The device information block is declared in the device module, as follows:

int32 iotrtn1 (int32 instruction, int32 AC);
int32 iotrtn2 (int32 instruction, int32 AC);
:
DIB dev_dib = { DEV_NEW, num_iot_routines, { &iotrtn1, &iotrtn2, … } };

DEV_NEW is the device number, and num_iot_routines is the number of IOT dispatch routines (allocated contiguously starting at DEV_NEW). If a device number in the range defined by [DEV_NEW, DEV_NEW + num_iot_routines - 1] is not needed, the corresponding dispatch address should be NULL.
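How a DIB might be turned into entries of the 64-entry dispatch table can be sketched as follows. The DIB layout, the DIB_MAXIOT limit, the conflict check, and the routine install_dib are assumptions made for this illustration; the real structure and table-building code live in the PDP-8 simulator sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t int32;

#define DEV_MAX     64                      /* entries in dev_tab */
#define DIB_MAXIOT  8                       /* assumed limit on routines per DIB */

typedef int32 (*iot_fn)(int32 instruction, int32 AC);

typedef struct {
    uint32_t dev;                           /* lowest device number */
    uint32_t num;                           /* number of IOT dispatch routines */
    iot_fn   dsp[DIB_MAXIOT];               /* routines, NULL where unused */
} DIB;

static iot_fn dev_tab[DEV_MAX];             /* dispatch table, initially empty */

/* Two placeholder IOT routines for a made-up device at 060 and 062. */
static int32 iot_a (int32 ir, int32 ac) { (void) ir; return ac; }
static int32 iot_b (int32 ir, int32 ac) { (void) ir; return ac & 07770; }

static DIB new_dib = { 060, 3, { &iot_a, NULL, &iot_b } };

/* Install a DIB's routines; return 0 on success, -1 on conflict or overflow. */
int install_dib (const DIB *dibp)
{
    uint32_t i;

    assert (dibp != NULL && dibp->num <= DIB_MAXIOT);
    for (i = 0; i < dibp->num; i++)         /* check everything first */
        if (dibp->dsp[i] != NULL &&
            (dibp->dev + i >= DEV_MAX || dev_tab[dibp->dev + i] != NULL))
            return -1;
    for (i = 0; i < dibp->num; i++)         /* then install */
        if (dibp->dsp[i] != NULL)
            dev_tab[dibp->dev + i] = dibp->dsp[i];
    return 0;
}
```

The NULL in the middle of new_dib corresponds to a device number inside the range that the device does not use, so device 061 stays free for another device.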

3. PDP-11, VAX, VAX-780, and PDP-10

3.1 Memory

For the PDP-11, simulated memory is kept in array uint16 *M, dynamically allocated. For the VAX and VAX-780, simulated memory is kept in array uint32 *M, dynamically allocated. For the PDP-10, simulated memory is kept in array t_uint64 *M, dynamically allocated. Because the three systems use different memory widths and different I/O mapping schemes, DMA peripherals that are shared among them use interface routines to access memory.

3.2 Interrupt Structure

The interrupt structure is implemented by array int_req, indexed by priority level (except on the PDP-10, where all levels are kept in one word). Each device is assigned a request flag in int_req[device_IPL], according to its priority, with highest priority at the right (low order bit). To facilitate access to int_req across the three systems, each device dev defines three variables:

INT_V_dev – the bit number of the device’s interrupt request flag
INT_dev – the mask of the device’s interrupt request flag
IPL_dev – the index into int_req for the device’s priority level (PDP-11, VAX only)

Four macros allow simulated devices to access and manipulate interrupt structures independent of the underlying VM:

IVCL (dev) – vector locator for DIB (IPL * 32 + bit number)
IREQ (dev) – resolves to int_req[device_IPL]
CLR_INT (dev) – clears the device’s interrupt request flag
SET_INT (dev) – sets the device’s interrupt request flag
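One way to write such macros is with preprocessor token pasting, so that the device name selects the matching INT_V_, INT_, and IPL_ definitions. The definitions below are a sketch under that assumption and are not copied from the simulator headers; the array size IPL_HLVL, the unsigned element type, and the routine irq_demo are choices made for the example. The NEW device uses bit 5 at IPL 4, matching the example values used later in this memo.

```c
#include <assert.h>
#include <stdint.h>

#define IPL_HLVL 8                          /* assumed number of priority levels */
static uint32_t int_req[IPL_HLVL];          /* one request word per level */

/* Per-device definitions for a hypothetical device NEW. */
#define INT_V_NEW   5
#define INT_NEW     (1u << INT_V_NEW)
#define IPL_NEW     4

/* Access macros built by pasting the device name onto the prefixes. */
#define IVCL(dv)    ((IPL_##dv * 32) + INT_V_##dv)
#define IREQ(dv)    int_req[IPL_##dv]
#define SET_INT(dv) (IREQ (dv) |= INT_##dv)
#define CLR_INT(dv) (IREQ (dv) &= ~INT_##dv)

/* Raise and then drop the NEW device's interrupt request. */
void irq_demo (void)
{
    SET_INT (NEW);
    assert ((IREQ (NEW) & INT_NEW) != 0);
    CLR_INT (NEW);
    assert ((IREQ (NEW) & INT_NEW) == 0);
}
```

With these values IVCL (NEW) evaluates to 4 * 32 + 5 = 133, which gives every flag at every level a distinct slot in the vector tables.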

3.3 I/O Dispatching

3.3.1 Unibus/Qbus Devices

For Unibus and Qbus devices, I/O dispatching is done by table-driven address decoding in the I/O page read and write routines. Interrupt handling is done by table driven processing of vector and interrupt handling tables. These tables are constructed at run time from device information blocks (DIB’s). Each I/O device has a DIB with the following information:

{ IO page base address, IO page length, read_routine, write_routine, num_vectors, vector_locator, vector, { &iack_rtn1, &iack_rtn2, … } }

The calling sequence for an I/O read is:

t_stat read_routine (int32 *data, int32 pa, int32 access)

The calling sequence for an I/O write is:

t_stat write_routine (int32 data, int32 pa, int32 access)

For both, the access parameter can have one of the following values:

READ      normal read
READC     console read (PDP-11 only)
WRITE     word write
WRITEC    console word write (PDP-11 only)
WRITEB    byte write

I/O read and I/O word write use word (even) addresses; the low order bit of the address should be ignored. I/O byte write uses byte addresses, and the data byte to be written is right-justified in the calling argument.

If the device has vectors, the vector_locator field specifies the position of the vector in the interrupt tables, using macro IVCL (dev). If the device has static interrupt vectors, they are specified by the DIB vector field and by the DIB num_vectors field. The device is assumed to have vectors at vector, …, vector + ((num_vectors –1) * 4). If the device has dynamic interrupt acknowledge routines, they are specified by the DIB interrupt acknowledge routines. The calling sequence for an interrupt acknowledge routine is:

int32 iack_rtn (void)

It returns the interrupt vector for the device, or 0 if there is no interrupt (passive release).

3.3.2 Massbus Devices (PDP-11, VAX-780 only)

For Massbus devices, I/O dispatching is done by table-driven address decoding in the Massbus adapter (RH for the PDP11, MBA for the VAX-780). These tables are constructed at run time from device information blocks (DIB’s). Each Massbus device has a DIB with the following information:

{ Massbus number, 0, mb_read_routine, mb_write_routine, 0, 0, 0, { &abort_routine } }

The calling sequence for a Massbus register read is:

t_stat mb_read_routine (int32 *data, int32 offset, int32 drive)

The calling sequence for a Massbus register write is:

t_stat mb_write_routine (int32 data, int32 offset, int32 drive)

For both, offset is the internal register offset of the Massbus register being accessed, and drive is the unit number of the Massbus controller being accessed. These routines can return the following status values:

SCPE_OK    access ok
MBE_NXD    non-existent drive
MBE_NXR    non-existent register
MBE_GOE    error attempting to initiate function

The abort routine is called if the Massbus adapter must stop a data transfer or reset the associated controllers. Its calling sequence is:

t_stat mba_abort (void)

The abort routine typically invokes the device reset routine to stop all transfers and reset all device controller state.

3.4 DEVICE Context and Flags

For the PDP-11, VAX, and PDP-10, the DEVICE ctxt (context) field must point to the device information block (DIB), if one exists. The DEVICE flags field must specify whether the device is a Unibus device (DEV_UBUS); a Qbus device with 22b DMA capability, or no DMA capability (DEV_QBUS); a Qbus device with 18b DMA capability (DEV_Q18); a Massbus device (DEV_MBUS); or a combination thereof. The DEVICE flags field must also specify whether the device supports the “SET ENABLED/SET DISABLED” commands (DEV_DISABLE). Lastly, the DEVICE flags field specifies whether the device addresses and vectors are autoconfigured (DEV_FLTA).

Most devices do not care whether the I/O bus is Unibus or Qbus. Those that do can use macro UNIBUS to see if the host bus is Unibus (true) or Qbus (false). On the PDP-11, UNIBUS is derived from the CPU model; on the PDP-10 and VAX-11/780, it is always true; and for CVAX, it is always false.

3.5 Memory Access Routines

3.5.1 Unibus/Qbus Devices

Unibus/Qbus DMA devices access memory through four interface routines:

int32 Map_ReadB (t_addr ba, int32 bc, uint8 *buf);
int32 Map_ReadW (t_addr ba, int32 bc, uint16 *buf);
int32 Map_WriteB (t_addr ba, int32 bc, uint8 *buf);
int32 Map_WriteW (t_addr ba, int32 bc, uint16 *buf);

The arguments to these routines are:

ba      starting memory address
bc      byte count
*buf    pointer to device buffer

Note that the PDP-10 can only share a small number of PDP-11 peripherals, because of its dependence on 18b transfers on the Unibus; and that all non-Massbus peripherals are on Unibus 3.

The routines return the number of bytes not transferred: 0 indicates a successful transfer. Transfer failures can occur if the mapped address uses an invalid mapping register or maps to non-existent memory.
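The "bytes not transferred" convention is easy to misread, so here is a toy model of a word read. The routine toy_Map_ReadW is a stand-in written for this example over a 16-byte memory with no address mapping; it is not the simulator's Map_ReadW, and it shares only the argument order and the return convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t t_addr;
typedef int32_t int32;

#define MEMSIZE 16u                         /* toy memory size in bytes */
static uint16_t M[MEMSIZE / 2];             /* word-organized toy memory */

/* Toy word read: copy bc bytes starting at byte address ba into buf.
   Returns the number of bytes NOT transferred (0 = complete success). */
int32 toy_Map_ReadW (t_addr ba, int32 bc, uint16_t *buf)
{
    int32 i;

    assert (buf != NULL && bc >= 0 && (bc & 1) == 0);
    for (i = 0; i < bc; i += 2) {
        if (ba + (t_addr) i >= MEMSIZE)     /* non-existent memory */
            return bc - i;
        *buf++ = M[(ba + (t_addr) i) >> 1];
    }
    return 0;
}
```

A device simulator would typically compute the number of bytes actually moved as bc minus the return value and set its non-existent-memory error flag whenever the return value is non-zero.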

3.5.2 Massbus Devices

Massbus devices access memory through three interface routines, for read, write, and write check respectively:

    int32 mba_rdbufW (uint32 mbus, int32 bc, uint16 *buf);
    int32 mba_wrbufW (uint32 mbus, int32 bc, uint16 *buf);
    int32 mba_chbufW (uint32 mbus, int32 bc, uint16 *buf);

The arguments to these routines are:

    mbus    Massbus adapter number
    bc      byte count
    *buf    pointer to device buffer

The routines return the number of bytes successfully transferred. Transfer failures can occur if a mapped address uses an invalid mapping register, maps to non-existent memory, or, on a write-check, if a miscompare occurs.

3.6 Adding A New I/O Device

3.6.1 Defining The I/O Page Region

I/O page regions are defined by a base address and a byte length. The base address is defined as an offset against the I/O page base address (IOPAGEBASE). These definitions are kept in pdp11_defs.h (vaxmod_defs.h). For example, if a new IPL 4 device has I/O addresses 17777700-17777707:

    #define IOBA_NEWIPL4    (IOPAGEBASE + 017700)   /* base addr */
    #define IOLN_NEWIPL4    010                     /* length = 8 bytes */

Note that the offsets are always the low order 13b of the I/O address, because the I/O page is only 8KB long.
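As a rough illustration of the offset rule, the sketch below masks a full I/O address down to its low order 13b. The helper name io_page_offset and the IOPAGE_MASK constant are hypothetical and are not part of SIMH.

```c
#include <stdint.h>

/* Hypothetical helper (not a SIMH routine): keep the low order 13b
   of an I/O address, i.e. its offset within the 8KB I/O page. */
#define IOPAGE_MASK 017777u             /* 13 one bits */

static uint32_t io_page_offset (uint32_t io_addr)
{
    return io_addr & IOPAGE_MASK;
}
```

For the example device above, address 17777700 yields offset 017700, which is the value added to IOPAGEBASE in the IOBA_NEWIPL4 definition.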

3.6.2 Defining The Device Parameters

If the device can interrupt, pdp11_defs.h (vaxmod_defs.h, vax780_moddefs.h, pdp10_defs.h) must be modified to add the device interrupt flag(s) and priority level. The device flag(s) should be inserted using a spare bit (or bits) at the appropriate priority level. On the PDP-11, the PIRQ interrupt flags (PIR) must always be the last (lowest priority) device in the level.

/* IPL 4 devices */

    #define INT_V_LPT       4
    #define INT_V_NEW       5               /* new IPL 4 dev */
    #define INT_V_PIR4      6               /* used to be 4 */
    :
    #define INT_NEW         (1u << INT_V_NEW)
    :
    #define IPL_NEW         4

The device vector(s) must also be defined:

#define VEC_NEW 0360

If the device participates in autoconfiguration, its rank must be specified as well:

#define RANK_DEV 17 /* rank 17 */

3.6.3 Adding The Device Information Block

The device information block is declared in the device module, as follows:

    t_stat new_rd (int32 *data, int32 addr, int32 access);
    t_stat new_wr (int32 data, int32 addr, int32 access);
    int32 new_iack1 (void);
    int32 new_iack2 (void);
    :
    DIB new_dib = { IOBA_NEW, IOLN_NEW, &new_rd, &new_wr,
                    num_vectors, IVLC (NEW), VEC_NEW,
                    { &new_iack1, &new_iack2, … } };

3.6.4 Adding The Device To Autoconfiguration (PDP-11, VAX, VAX-780 only)

If the device needs to be autoconfigured, and it is not presently included in the autoconfiguration table, it must be added to table auto_tab in pdp11_io.c (vax_io.c). Entry ‘n’ in auto_tab corresponds to autoconfiguration rank n + 1; the first two fields of the entry are filled in. The fields are:

    uint32 amod         address modulus
    uint32 vmod         vector modulus
    uint32 flags        flags
    uint32 num          number of controllers if determined statically
    uint32 fix          CSR address if first controller has fixed address
    char *dnam[4]       list of controller names in this rank, maximum 4

Currently defined flags are AUTO_DYN (number of controllers is determined dynamically) and AUTO_VEC (autoconfiguration determines the device vectors as well as the device addresses).

4 Nova

4.5 CPU and I/O Device Structures

Simulated memory is kept in array uint16 M[MAXMEMSIZE].

The interrupt structure is implemented in three parallel variables:

• int32 int_req: interrupt requests. The two high order bits are the interrupt enable flag and the interrupts-not-deferred flag
• int32 dev_done: device done flags
• int32 dev_disable: device interrupt disable flags

Pictorially,

+----+----+…+----+----+…+----+----+----+
|ion |indf| |irqa|irqb| |irqx|irqy|irqz|    int_req
+----+----+…+----+----+…+----+----+----+

+----+----+…+----+----+…+----+----+----+
| 0  | 0  | |dona|donb| |donx|dony|donz|    dev_done
+----+----+…+----+----+…+----+----+----+

+----+----+…+----+----+…+----+----+----+
| 0  | 0  | |disa|disb| |disx|disy|disz|    dev_disable
+----+----+…+----+----+…+----+----+----+

<- fixed ->  <------ I/O devices ------>

Logically, the relationship is

int_req = (int_req & ~INT_DEV) | (dev_done & ~dev_disable);
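The relationship can be sketched as a small helper. The value given to INT_DEV here and the function name update_int_req are assumptions made for illustration only; the actual mask is whatever the simulator's definitions file specifies.

```c
#include <stdint.h>

/* Assumed for illustration: the low 16 bits hold the device flags. */
#define INT_DEV 0x0000FFFF

/* Recompute int_req: keep the fixed (non-device) bits, then raise a
   request for every device that is done and not disabled. */
static int32_t update_int_req (int32_t int_req, int32_t dev_done,
                               int32_t dev_disable)
{
    return (int_req & ~INT_DEV) | (dev_done & ~dev_disable);
}
```

Under these assumptions, a device whose done flag is set but whose disable flag is also set contributes nothing to int_req, while the fixed high order bits pass through unchanged.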

Device enable flags are kept in iot_enb. By convention, the device enable flag is in the same bit position as the device interrupt flag.

I/O dispatching is done indirectly through dispatch table dev_table, which has one entry for each possible I/O device. Each entry is a structure of the form:

    int32   mask;                   /* interrupt/done mask bit */
    int32   pi;                     /* PI out mask bit */
    t_stat  (*iot_routine)();       /* addr of I/O routine */

The I/O routine is called by

    new_data = iot_routine (IOT pulse, IOT subopcode, AC value);

where

    new_data<15:0>              = new contents of AC, if DIA/DIB/DIC
    new_data                    = 1 if skip, 0 if not
    new_data<31:IOT_V_REASON>   = stop code, if non-zero

4.6 DEVICE Context and Flags

The DEVICE ctxt (context) field must point to the device information block (DIB), if one exists. The DEVICE flags field must specify whether the device supports the “SET ENABLED/SET DISABLED” commands (DEV_DISABLE). If a device can be disabled, the state of the device flag must be declared as a register for SAVE/RESTORE.

4.7 Memory Mapping

On mapped Novas and on Eclipses, DMA transfers use a memory map to translate 15b virtual addresses to physical addresses. The mapping function is called by:

    int32 MapAddr (int32 map, int32 addr)

with the following arguments:

    map     map number, usually 0
    addr    virtual address

The routine returns the physical address to be used for the transfer.

4.8 Adding A New I/O Device

4.8.1 Defining The Device Number And The Done/Interrupt Flag

Module nova_defs.h must be modified to add the device number definitions and the device interrupt flag definitions.

#define DEV_NEW 0nn /* can’t be 00, 01 */

Device flags are kept as a bit vector. If priority is unimportant, the device flag can be defined as one of the currently unused bits:

    #define INT_V_NEW   1                   /* new */
    :
    #define INT_NEW     (1 << INT_V_NEW)

If the device requires a specific priority with respect to existing devices, it must be assigned the appropriate flag bit, and the other device flag bits moved up or down.

The device’s PI mask bit must also be defined:

#define PI_NEW 000200

4.8.2 Adding The Device Information Block

The device information block is declared in the device module, as follows:

    int32 iot (int32 pulse, int32 code, int32 AC);
    :
    DIB new_dib = { DEV_NEW, INT_NEW, PI_NEW, &iot };

The SIMH Breakpoint Subsystem
Bob Supnik, 26-Jul-2003

Summary

SIMH provides a highly flexible and extensible breakpoint subsystem to assist in debugging simulated code. Its features include:

· Up to 26 different kinds of breakpoints
· Unlimited numbers of breakpoints
· Proceed counts for each breakpoint
· Automatic execution of commands when a breakpoint is taken

If debugging is going to be a major activity on a simulator, implementation of a full-featured breakpoint facility will be of immense help to users.

Breakpoint

SIMH breakpoints are characterized by a type, an address, a proceed count, and an action string. Breakpoint types are arbitrary and are defined by the virtual machine. Each breakpoint type is assigned a unique letter. All simulators to date provide execution (“E”) breakpoints. A useful extension would be to provide breakpoints on read (“R”) and write (“W”) data access. Even finer gradations are possible, e.g., physical versus virtual addressing, DMA versus CPU access, and so on.

Breakpoints can be assigned to devices other than the CPU, but breakpoints don’t contain a device pointer. Thus, each device must have its own unique set of breakpoint types. For example, if a simulator contained a programmable graphics processor, it would need a separate instruction breakpoint type (e.g., type G rather than E).

The virtual machine defines the valid breakpoint types to SIMH through two variables:

sim_brk_types – initialized by the VM (usually in the CPU reset routine) to a mask of all supported breakpoints; bit 0 (low order bit) corresponds to type ‘A’, bit 1 to type ‘B’, etc.

sim_brk_dflt – initialized by the VM to the mask for the default breakpoint type.

SIMH in turn provides the virtual machine with a summary of all the breakpoint types that currently have active breakpoints:

sim_brk_summ – maintained by SIMH; provides a bit mask summary of whether any breakpoints of a particular type have been defined.

When the virtual machine reaches the point in its execution cycle corresponding to a breakpoint type, it tests to see if any breakpoints of that type are active. If so, it calls sim_brk_test to see if a breakpoint of a specified type (or types) is set at the current address. Here is an example from the fetch phase, testing for an execution breakpoint:

/* Test for breakpoint before fetching next instruction */

    if ((sim_brk_summ & SWMASK ('E')) && sim_brk_test (PC, SWMASK ('E')))

If the virtual machine implements only one kind of breakpoint, then testing sim_brk_summ for non-zero suffices. Even if there are multiple breakpoint types, a simple non-zero test distinguishes the no-breakpoints case (normal run mode) from debugging mode and provides sufficient efficiency.
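The letter-to-bit convention described above (bit 0 for type 'A', bit 1 for type 'B', and so on) can be sketched as follows. BRK_MASK, supported_types, and type_active are hypothetical names used only for this illustration; they stand in for the SWMASK-style test shown earlier and are not SIMH definitions.

```c
#include <stdint.h>

/* Modeled on the convention described above: bit 0 is type 'A',
   bit 1 is type 'B', and so on. Hypothetical stand-in macro. */
#define BRK_MASK(ch) (1u << ((ch) - 'A'))

/* Mask of the types a hypothetical VM might support: E, R, and W. */
static uint32_t supported_types (void)
{
    return BRK_MASK ('E') | BRK_MASK ('R') | BRK_MASK ('W');
}

/* Nonzero if the summary mask shows an active breakpoint of type. */
static int type_active (uint32_t summ, char type)
{
    return (summ & BRK_MASK (type)) != 0;
}
```

A VM with only execution breakpoints would have a supported-types mask with just the 'E' bit set, which is why a plain non-zero test of the summary word is enough in that case.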

Testing For Breakpoints

Breakpoint testing must be done at every point in the instruction decode and execution cycle where an event relating to a breakpoint type occurs. If a virtual machine implements data breakpoints, it simplifies implementation if data reads and writes are centralized in subroutines, rather than scattered throughout the code. For this reason (among others), it is good practice to perform memory access through subroutines, rather than by direct access to the memory array.

As an example, consider a virtual machine with a central memory read subroutine. This routine takes an additional parameter, the type of read (often required for memory protection):

    #define IF  0                           /* fetch */
    #define ID  1                           /* indirect */
    #define RD  2                           /* data read */
    #define WR  3                           /* data write */

    t_stat Read (uint32 addr, uint32 *dat, uint32 acctyp)
    {
    static uint32 bkpt_type[4] = {
        SWMASK ('E'), SWMASK ('N'), SWMASK ('R'), SWMASK ('W') };

    if (sim_brk_summ && sim_brk_test (addr, bkpt_type[acctyp]))
        return STOP_BKPT;
    else *dat = M[addr];
    return SCPE_OK;
    }

This routine provides differentiated breakpoints for execution, indirect addresses, and data reads, with a single test.

SIMH Magtape Representation and Handling
Bob Supnik, 03-Mar-03

Magtape Representation

SIMH represents magnetic tapes as disk files. Each disk file contains a series of objects. Objects are either metadata markers, like tape mark or end of medium, or they are data records. Location 0 of the file is interpreted as beginning of tape; end of file is interpreted as end of medium. Pictorially:

Location 0:     +----------+
                |   data   |
                |  record  |
                +----------+
                |   data   |
                |  record  |
                +----------+
                     :
                +----------+
                |   tape   |
                |   mark   |
                +----------+
                |   data   |
                |  record  |
                +----------+
                     :
end of file:

Metadata markers are 4 bytes stored in little-endian order. The currently defined metadata markers are:

    0xFFFFFFFF                  end of medium
    0xFF000000:0xFFFFFFFE       reserved
    0x00000000                  tape mark

Data records consist of an initial 4 byte record length n, (n + 1) & ~1 bytes of data, and a trailing 4 byte record length n that must be the same as the initial record length:

bytes 0:3         +----------+
                  |  record  |
                  |  length  |
                  +----------+
bytes 4:n+3       |   data   |
                  |    :     |
                  |    :     |
                  +----------+
bytes n+4:n+7     |  record  |
                  |  length  |
                  +----------+

Note that the data is rounded to an even number of bytes. If the record length is odd, the extra byte is undefined but should be 0.

Record lengths are 4 bytes stored in little-endian order. The high order bit is a flag, indicating that the record contains an error; the next 7b must be zero; the low 24b are the record length:

    bit<31>         1 = record contains error
                    0 = record is error-free
    bits<30:24>     must be zero
    bits<23:0>      record length, must be non-zero

The leading and trailing record lengths allow a record to be accessed either forward or backward.
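The record layout can be illustrated with a short sketch that builds one error-free data record in a memory buffer. The function names put_le32 and encode_record are hypothetical and are not taken from the SIMH tape library; the caller is assumed to supply an output buffer of at least n + 9 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32b value in little-endian order. */
static void put_le32 (uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t) (v & 0xFF);
    p[1] = (uint8_t) ((v >> 8) & 0xFF);
    p[2] = (uint8_t) ((v >> 16) & 0xFF);
    p[3] = (uint8_t) ((v >> 24) & 0xFF);
}

/* Build one data record in out: leading length, data padded to an
   even byte count, trailing length. Returns the bytes written, or 0
   if n is not a valid record length (zero or more than 24 bits). */
static size_t encode_record (uint8_t *out, const uint8_t *data, uint32_t n)
{
    size_t padded = ((size_t) n + 1) & ~(size_t) 1;

    if (n == 0 || n > 0x00FFFFFFu) return 0;
    put_le32 (out, n);
    memcpy (out + 4, data, n);
    if (padded != n) out[4 + n] = 0;        /* pad byte should be 0 */
    put_le32 (out + 4 + padded, n);
    return padded + 8;
}
```

A 3 byte record therefore occupies 12 bytes: 4 for the leading length, 4 for the padded data, and 4 for the trailing length.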

Magtape Operations

Magnetic tape drives can perform the following operations:

· Read forward
· Read backward
· Write forward
· Space forward record(s)
· Space backward record(s)
· Space forward file(s)
· Space backward file(s)
· Write tape mark
· Security erase
· Write extended gap

On a real magtape, all operations are implicitly sequential, that is, they start from the current position of the tape medium. SIMH implements this with the concept of the current tape position, kept in the pos field of the tape drive’s UNIT structure. SIMH starts all magtape operations at the current position and updates the current position to reflect the results of the operation:

· Read forward. Starting at the current position, read the next 4 bytes from the file. If those bytes are a valid record length, read the data record and position the tape past the trailing record length. If they are a tape mark, signal tape mark and position the tape past the tape mark. If they are end of medium, or an end of file occurs, signal no more data (‘long gap’ or ‘bad tape’) and do not change the tape position.
· Read reverse. If the current position is beginning of tape, signal BOT. Otherwise, starting at the current position, read the preceding 4 bytes from the file. If those bytes are a valid record length, read the data record and position the tape before the initial record length. If they are a tape mark, signal tape mark and position the tape before the tape mark. If they are end of medium, or an end of file occurs, signal no more data (‘long gap’ or ‘bad tape’) and position the tape before the end of medium marker.
· Write. Starting at the current position, write the initial record length, followed by the data record, followed by the trailing record length. Position the tape after the trailing record length.
· Space forward record(s). Starting at the current position, read the next 4 bytes from the file. If those bytes are a valid record length, position the tape past the trailing record length and continue until operation count exhausted or metadata encountered. If those bytes are a tape mark, signal tape mark and position the tape after the tape mark. If they are end of medium, or an end of file occurs, signal no more data (‘long gap’ or ‘bad tape’) and do not change the tape position.
· Space reverse record(s). If the current position is beginning of tape, signal BOT. Otherwise, starting at the current position, read the preceding 4 bytes from the file. If those bytes are a valid record length, position the tape before the initial record length and continue until operation count exhausted, BOT, or metadata encountered. If they are a tape mark, signal tape mark and position the tape before the tape mark. If they are end of medium, or an end of file occurs, signal no more data (‘long gap’ or ‘bad tape’) and position the tape before the end of medium marker.
· Space forward file(s). Starting at the current position, read the next 4 bytes from the file. If those bytes are a valid record length, position the tape past the trailing record length and continue. If those bytes are a tape mark, signal tape mark, position the tape after the tape mark, and continue until operation count exhausted. If they are end of medium, or an end of file occurs, signal no more data (‘long gap’ or ‘bad tape’) and do not change the tape position.
· Space reverse file(s). If the current position is beginning of tape, signal BOT. Otherwise, starting at the current position, read the preceding 4 bytes from the file. If those bytes are a valid record length, position the tape before the initial record length and continue. If they are a tape mark, position the tape before the tape mark and continue until operation count exhausted or BOT. If they are end of medium, or an end of file occurs, signal no more data (‘long gap’ or ‘bad tape’) and position the tape before the end of medium marker.
· Write tape mark. Starting at the current position, write a tape mark marker. Position the tape beyond the new tape mark.
· Security erase. Starting at the current position, write an end of medium marker. Do not update the tape position.
· Write extended gap. All implementations to date treat this as a NOP, because it does not create readable data. This should erase the next object on the tape (as a minimum), but because tape records are only 16b aligned instead of 32b aligned, there is no reliable way to do this.
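The read forward rule can be sketched against a tape image held in memory. This is a simplified illustration, not the SIMH library code: the names tape_status, get_le32, and read_forward are hypothetical, reserved markers are treated like end of medium (an assumption), and bits <30:24> of the record length are not validated.

```c
#include <stddef.h>
#include <stdint.h>

enum tape_status { TAPE_RECORD, TAPE_MARK, TAPE_NODATA };

/* Fetch a little-endian 32b value. */
static uint32_t get_le32 (const uint8_t *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Read forward from *pos in a tape image of size bytes. On a data
   record, *len receives the record length and *pos moves past the
   trailing length. On a tape mark, *pos moves past the mark. On end
   of medium, a reserved marker, or end of file, *pos is unchanged. */
static enum tape_status read_forward (const uint8_t *img, size_t size,
                                      size_t *pos, uint32_t *len)
{
    uint32_t hdr, n;
    size_t padded;

    if (size < 4 || *pos > size - 4) return TAPE_NODATA;  /* end of file */
    hdr = get_le32 (img + *pos);
    if (hdr == 0) {                         /* tape mark */
        *pos += 4;
        return TAPE_MARK;
    }
    if (hdr >= 0xFF000000u) return TAPE_NODATA;     /* EOM or reserved */
    n = hdr & 0x00FFFFFFu;                  /* low 24 bits = length */
    padded = ((size_t) n + 1) & ~(size_t) 1;
    if (n == 0 || padded + 8 > size - *pos) return TAPE_NODATA;
    *len = n;
    *pos += padded + 8;
    return TAPE_RECORD;
}
```

The position argument plays the role of the pos field in the UNIT structure: it is advanced only when a record or tape mark is actually consumed.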

Magtape Error Handling

The following matrix defines error responses versus events for simulated magtapes. PNU signifies position not updated; PU signifies position updated.

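Read forward
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   Error: tape mark, PU
    End of medium mark          Error: bad tape or runaway tape, PNU
    Write locked                ok
    End of attached file        Error: bad tape or runaway tape, PNU
    Data read or write error    Error: parity or data, PNU

Read reverse
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   Error: tape mark, PU
    End of medium mark          Error: bad or runaway tape, PU
    Write locked                ok
    End of attached file        Error: bad or runaway tape, PU
    Data read or write error    Error: parity or data, PNU

Write forward
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   na
    End of medium mark          na
    Write locked                Error: unit write locked, PNU
    End of attached file        na
    Data read or write error    Error: parity or data, PNU

Space records forward
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   Error: tape mark, PU
    End of medium mark          Error: bad or runaway tape, PNU
    Write locked                ok
    End of attached file        Error: bad or runaway tape, PNU
    Data read or write error    Error: parity or data, PNU

Space records reverse
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   ok
    End of medium mark          Error: bad or runaway tape, PU
    Write locked                ok
    End of attached file        Error: bad or runaway tape, PU
    Data read or write error    Error: parity or data, PNU if error on record length, otherwise PU

Space files forward
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   Error: tape mark, PU
    End of medium mark          Error: bad or runaway tape, PNU
    Write locked                ok
    End of attached file        Error: bad or runaway tape, PNU
    Data read or write error    Error: parity or data, PNU

Space files reverse
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   ok
    End of medium mark          Error: bad or runaway tape, PU
    Write locked                ok
    End of attached file        Error: bad or runaway tape, PU
    Data read or write error    Error: parity or data, PNU if error on record length, otherwise PU

Write tape mark
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   na
    End of medium mark          na
    Write locked                Error: unit write locked, PNU
    End of attached file        na
    Data read or write error    Error: parity or data, PNU

Erase
    Unit not attached           Error: unit not ready, PNU
    Tape mark                   na
    End of medium mark          na
    Write locked                Error: unit write locked, PNU
    End of attached file        na
    Data read or write error    Error: parity or data, PNU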

The behavior of simulated tapes mirrors that of real tapes, except for errors that make determination of the record length impossible. On a real tape, a read or write error would update the position of the tape. On a simulated tape, this isn’t possible; the length of the record is unknown. Real tape drivers would try to recover from the error by backspacing over the erroneous record and trying again. This won’t work on a simulated tape.

For intelligent tapes, like the TK50 and the TS11, this problem is handled by reporting “position lost”. This status tells the tape driver that tape position is no longer known, and normal error recovery isn’t possible. Older tapes do not have this status. Accordingly, these tapes implement a limited form of state “memory” for error recovery. If an error occurs on a forward operation, and the position is not updated, the simulated tape unit “remembers” this fact. If the next operation is a backspace record, the first backspace is skipped, because the simulated tape is still positioned at the start of the erroneous record. If a read is then attempted, the tape will read the record that caused the original error.

Magtape Emulation Library

SIMH provides a support library, sim_tape.c (and its header file sim_tape.h), that implements the standard tape format and functions. The library is described in detail in the associated document, “Writing A Simulator For The SIMH System”.

Architectural Evolution in DEC’s 18b Computers
Bob Supnik, revised 14-Jan-2004 [revised links 18-Feb-2005]

Abstract

DEC built five 18b computer systems: the PDP-1, PDP-4, PDP-7, PDP-9, and PDP-15. This paper documents the architectural changes that occurred over the lifetime of the 18b systems and analyses the benefits and tradeoffs of the changes made.

Introduction

From 1961 to 1975, Digital Equipment Corporation (DEC) built five 18b computer systems: the PDP-1, PDP-4, PDP-7, PDP-9, and PDP-15 (see table below). Each system differed from its predecessors, sometimes in major ways representing significant architectural breaks, and sometimes in minor ways representing new features or incompatibilities. The architectural evolution of these systems demonstrates how DEC’s ideas about architectural versus implementation complexity, I/O structures, and system features evolved over the period of a decade.

                PDP-1       PDP-4       PDP-7       PDP-9       PDP-15
First ship      Nov 1960    Jul 1962    Dec 1964    Aug 1966    May 1970
Number built    50          45          120         445         790
Memory cycle    5usec       8usec       1.75usec    1usec       0.8usec
Base price      $120K       $65.5K      $45K        $25K        $19.8K

Reproduced from Computer Engineering: A DEC View Of Hardware Systems Design

The PDP-1

The PDP-1 was DEC’s first computer system. Introduced in 1960, the PDP-1 reflected ideas from Lincoln Labs’ TX-2 project as well as the existing capabilities of DEC’s module . It was implemented in 5Mhz logic.

Arithmetic System

The PDP-1 was a 1’s complement arithmetic machine. In 1’s complement arithmetic, negative numbers are represented by the bit-for-bit inversion of their positive counterparts:

+1 = 000001 -1 = 777776

+4 = 000004 -4 = 777773

One’s complement arithmetic has two problems. First, zero has two representations, +0 and -0:

+0 = 000000 -0 = 777777

Second, addition of negative numbers requires an “end around carry” from the high order position to the low order position:

     -1 =    777776
     -1 =    777776
           --------
    sum    1 777774
           |----->1
     -2 =    777775

The PDP-1 tried to solve the zero-representation problem by guaranteeing that arithmetic operations never produced –0. To do this, it performed an extra logic step during addition, checking the result for –0 and converting it to 0. However, the PDP-1 performed subtraction by complementing the AC, adding the memory operand, and recomplementing the result. The recomplementation step occurred in the same time slot as the –0 detect during add. As a result, subtract had one special case: -0 – (+0) yielded –0.
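The addition rule, including the end around carry and the conversion of -0 to +0, can be sketched in C. This is an illustration of the arithmetic only, under the assumption that values are held in the low 18 bits of a 32b word; the names MASK18 and add_ones18 are hypothetical, and overflow detection is omitted.

```c
#include <stdint.h>

#define MASK18 0777777u                 /* 18 one bits; also -0 */

/* 18b one's complement add with end around carry. As described
   above for PDP-1 addition, a -0 result is converted to +0. */
static uint32_t add_ones18 (uint32_t a, uint32_t b)
{
    uint32_t sum = (a & MASK18) + (b & MASK18);

    if (sum > MASK18)                   /* carry out of high order bit */
        sum = (sum + 1) & MASK18;       /* end around carry */
    if (sum == MASK18)                  /* -0 detected */
        sum = 0;
    return sum;
}
```

Adding 777776 to 777776 produces 777775 (-2), matching the worked example, and adding +1 to -1 produces 000000 rather than 777777.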

Character Sets

The PDP-1’s first console typewriter was a Friden Flexowriter. (Production units used a Soroban typewriter, which was a modified IBM Model B.) The console’s six bit character set was called FIODEC, which stood for Friden Input Output for Digital Equipment Corporation. This code included both upper and lower case letters, using shift characters to move between sets. The PDP-1’s line printer used Hollerith (BCD) coding. FIODEC and Hollerith had common encodings for letters but not for symbols, requiring character conversions throughout the software.

Instruction Set Architecture

The PDP-1’s visible state included the following registers and capabilities:

    AC<0:17>        accumulator
    IO<0:17>        I/O register
    OV              overflow flag
    PC<0:11>        program counter
    EPC<0:3>        extended program counter (if memory > 4K)
    EXTM            extend mode
    PF<1:6>         program flags
    SS<1:6>         sense switches
    TW<0:17>        test word (front panel switches)
    IOSTA<0:17>     I/O status

In addition, the PDP-1 had non-observable state in the I/O system for I/O timing (see below).

The PDP-1 had 32 opcodes and implemented six instruction formats: memory reference, skip, shift, operate, I/O, and load immediate. The memory reference format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|      op      |in|              address              | mem reference
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

<0:4>  <5>  mnemonic  action

 00
 02         AND       AC = AC & M[MA]
 04         IOR       AC = AC | M[MA]
 06         XOR       AC = AC ^ M[MA]
 10         XCT       M[MA] is executed as an instruction
 12
 14
 16     0   CAL       M[100] = AC, AC = PC, PC = 101
 16     1   JDA       M[MA] = AC, AC = PC, PC = MA + 1
 20         LAC       AC = M[MA]
 22         LIO       IO = M[MA]
 24         DAC       M[MA] = AC
 26         DAP       M[MA]<6:17> = AC<6:17>
 30         DIP       M[MA]<0:5> = AC<0:5>
 32         DIO       M[MA] = IO
 34         DZM       M[MA] = 0
 36
 40         ADD       AC = AC + M[MA]
 42         SUB       AC = AC - M[MA]
 44         IDX       AC = M[MA] = M[MA] + 1
 46         ISP       AC = M[MA] = M[MA] + 1, skip if AC >= 0
 50         SAD       skip if AC != M[MA]
 52         SAS       skip if AC == M[MA]
 54         MUL       AC'IO = AC * M[MA]
 56         DIV       AC, IO = AC'IO / M[MA]
 60         JMP       PC = MA
 62         JSP       AC = PC, PC = MA
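Field extraction for the memory reference format can be sketched as below. The structure and function names are hypothetical. The opcode is returned in the two digit octal form used in the table (bit <5> cleared), which is a presentation choice made for this sketch.

```c
#include <stdint.h>

struct mem_ref {
    unsigned op;        /* bits <0:4>, shown as in the table (00-62) */
    unsigned indirect;  /* bit <5> */
    unsigned addr;      /* bits <6:17> */
};

/* Split an 18b PDP-1 memory reference instruction into its fields. */
static struct mem_ref decode_mem_ref (uint32_t ir)
{
    struct mem_ref m;

    m.op = (unsigned) ((ir >> 12) & 076u);  /* opcode, bit <5> cleared */
    m.indirect = (unsigned) ((ir >> 12) & 1u);
    m.addr = (unsigned) (ir & 07777u);
    return m;
}
```

For instance, the word 200100 decodes as opcode 20 (LAC), direct, address 100, while 170123 decodes as opcode 16 with bit <5> set, which the table lists as JDA.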

The skip format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  0  1  0|  |  |  |  |  |  |  |  |  |  |  |  |  | skip
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                |     |  |  |  |  |  \______/ \______/
                |     |  |  |  |  |     |        |
                |     |  |  |  |  |     |        +---- program flags
                |     |  |  |  |  |     +------------- sense switches
                |     |  |  |  |  +------------------- AC == 0
                |     |  |  |  +---------------------- AC >= 0
                |     |  |  +------------------------- AC < 0
                |     |  +---------------------------- OV == 0
                |     +------------------------------- IO >= 0
                +------------------------------------- invert skip

The shift format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  0  1  1| subopcode |      encoded count       | shift
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                |  |  \___/
                |  |    |
                |  |    +------ 1=AC, 2=IO,
                |  |            3=both
                |  +----------- rotate/shift
                +-------------- right/left

The shift count was the number of 1’s in bits <9:17>.
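The encoded count can be computed with a simple bit count over the low order nine bits of the instruction word. The function name shift_count is hypothetical.

```c
#include <stdint.h>

/* Shift count of a PDP-1 shift instruction: the number of 1's in
   bits <9:17>, which are the low order nine bits of the word. */
static int shift_count (uint32_t ir)
{
    uint32_t field = ir & 0777u;
    int count = 0;

    while (field != 0) {
        count += (int) (field & 1u);
        field >>= 1;
    }
    return count;
}
```

Because only the number of 1's matters, several different bit patterns encode the same count; a count field of 003 and one of 600 both mean a shift of two places.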

The load immediate format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  0  0| S|             immediate             | LAW
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                 |
                 +----- if S = 0, AC = IR<6:17>
                        else AC = ~IR<6:17>

The I/O transfer format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  0  1| W| C|  subopcode   |     device      | I/O transfer
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

The operate format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  1  1|  |  |  |  |  |  |  |  |  |  |  |  |  | operate
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                   |  |  |  |  |  |        |  \______/
                   |  |  |  |  |  |        |     |
                   |  |  |  |  |  |        |     +---- PF select
                   |  |  |  |  |  |        +---------- clear/set PF
                   |  |  |  |  |  +------------------- or PC
                   |  |  |  |  +---------------------- clear AC
                   |  |  |  +------------------------- halt
                   |  |  +---------------------------- CMA
                   |  +------------------------------- or TW
                   +---------------------------------- clear IO

There are significant discrepancies in the PDP-1 documentation about memory extension options. The original 1960 User Handbook (F15) didn’t mention any. The 1961 Handbook (F15B) described two, the Type 13 and Type 14. The 1962 and 1963 Handbooks (F15C and F15D, respectively), and the Maintenance Manual, described only one, the Type 15. This option expanded memory to 64K words. The address space was divided into sixteen 4K word fields. An instruction could directly address, via its 12b address, the entire current field. If extend mode was off, indirect addresses accessed the current field, and multi-level indirect addressing was enabled; if on, indirect addresses could access all 64K, and indirect addressing was single level. The state of extend mode was captured by subroutine calls and sequence breaks, and extend mode was cleared at the start of a sequence break.

BBN built a custom memory manager for its PDP-1 timesharing system. There are also scattered references to a PDP-1D, built by DEC itself for timesharing. Gordon Bell believes two were built, one for BBN and one for Stanford.

I/O System

The PDP-1’s I/O system offered multiple modes for I/O instructions, including synchronous waiting, timed waiting, asynchronous, and sequence break (interrupt) driven. This multiplicity made the I/O system complex and redundant.

I/O operations were initiated by a single instruction, Input/Output Transfer (IOT). Bits<12:17> addressed a particular device; bits <7:11> provided additional control or opcode bits. Bits<5:6> specified the mode for the I/O transfer:

    <5:6>   mode

    00      asynchronous - no wait, no device completion pulse
    01      timed wait - no wait, device completion pulse
    10      synchronous - wait for completion
    11      not used - wait, no completion pulse
            (hung the system if <12:17> != 0)

In synchronous wait, the CPU effectively stalled until the I/O operation completed. If synchronous wait was not specified, three different mechanisms were available for I/O completion:

• Timed wait. Execution proceeded. Eventually, the CPU issued a wait instruction. The CPU then stalled until the I/O operation completed and the device issued a completion pulse.
• Polled wait. Execution proceeded. The CPU monitored the device’s flag in the I/O status word until the I/O operation completed.
• Sequence break driven. Execution proceeded. When the I/O operation completed, a sequence break (interrupt) occurred, signaling I/O done.

The IOT wait mechanism was implemented with four control flip-flops:

• IOC (I/O command): when asserted, allowed IOT pulses to be sent to a device; when clear, IOT was effectively a NOP.
• IOH (I/O halt): when asserted, stalled the CPU by re-executing the current instruction.
• IHS (I/O halt save): saved the state of IOH on a no-wait IOT.
• IOS (I/O synchronization): when asserted, terminated I/O wait state.

An IOT that specified wait would set IOH, execute the IOT, and, if IOS was clear, clear IOC and decrement the PC. Thus, subsequent re-executions of the IOT would do nothing, because IOC was not asserted. When the I/O operation completed, the device would set IOS. This caused the IOT to set IOC and not decrement the PC, allowing execution to proceed.

An IOT that did not specify wait copied IOS to IHS, set IOC, executed the IOT, and copied IHS back to IOH. If IOH was set as a result of the copy, IOC was cleared. This implemented a one-level memory for wait state. If an IOT with wait was interrupted, the interrupt routine could execute no-wait IOT’s while preserving wait state for the main line program.

The sequence break mechanism recorded break requests in a single pulse sensitive flip flop. Thus, like the PDP-11 but unlike the other 18b systems, break requests were independent of the device completion flags. If the sequence break system was enabled, and a break request occurred, the CPU automatically stored the state of the machine and initiated a new program by:

• storing AC in location 0
• storing EPC and PC, plus overflow and extend mode, in location 1
• storing IO in location 2
• clearing overflow and extend mode
• setting the PC to 3
• setting the sequence break in progress flag

The sequence break in progress flag blocked further breaks.

The end of the break was recognized when the CPU decoded a JMP I 1 (from field 0 in a multi-field system) while the sequence break system was enabled. At that point, the CPU automatically restored the state of the system by:

• temporarily turning on extend mode
• obtaining the new PC from location 1
• restoring the original values of overflow and extend mode
• clearing sequence-break-in-progress

A CPU option expanded the standard sequence break system from one channel to sixteen. Each channel was a unique priority level and had a dedicated four location memory block (0 – 3 for the highest priority channel, 4 – 7 for the next, etc.). The first three locations of the block were used to store AC, PC, and IO when a break occurred; the PC was then set to point to the fourth location.
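The memory block layout for the sixteen channel option can be sketched as follows, assuming channel 0 is the highest priority channel and blocks are laid out consecutively as the text describes. The structure and function names are hypothetical.

```c
/* Locations used by sequence break channel ch (0 = highest
   priority) in the sixteen channel option, per the text above. */
struct sbs_block {
    unsigned ac_loc;    /* where AC is stored */
    unsigned pc_loc;    /* where PC is stored */
    unsigned io_loc;    /* where IO is stored */
    unsigned new_pc;    /* PC after the break */
};

static struct sbs_block sbs_block_for (unsigned ch)
{
    struct sbs_block b;
    unsigned base = (ch & 017u) * 4;    /* four locations per channel */

    b.ac_loc = base;
    b.pc_loc = base + 1;
    b.io_loc = base + 2;
    b.new_pc = base + 3;
    return b;
}
```

Channel 0 reproduces the single channel behavior (locations 0 through 2, with execution resuming at 3), and the second channel uses locations 4 through 7.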

Software

The PDP-1 featured some notable software offerings, including an interactive editor (called ), a macro assembler (Macro), a symbolic debugger (DDT), a Lisp , and the world’s first computer video game, Spacewar. Sources to Lisp and Spacewar are available on the Internet, and source listings for Macro and DDT are in the Computer History Museum collections.

The PDP-4

The PDP-4 was intended to be substantially lower cost than the PDP-1. Part of the cost reduction was achieved by using slower and less expensive logic (500KHz instead of 5MHz), but part was achieved by simplifying the system and reducing the number of gates. Thus, the PDP-4 (and its closely related successors, the PDP-7 and PDP-9) simplified the architecture of the PDP-1 along multiple dimensions.

Arithmetic Systems

The PDP-4 introduced two’s complement arithmetic in parallel with the PDP-1’s one’s complement arithmetic. Two’s complement arithmetic eliminated the need for -0 detection and made implementation of multi-precision arithmetic much easier. However, 1’s complement capability was not dropped; indeed, it remained the predominant arithmetic system, as reflected in architectural extensions such as the EAE. Thus, the PDP-4 still needed end around carry propagation, as well as 1’s complement overflow detection. The result was greater, rather than lesser, complexity in the hardware, and loss of valuable opcode space in the architecture. Gordon Bell commented that the retention of 1’s complement arithmetic was, simply, “a mistake”. By the PDP-5, it had vanished from DEC’s architectures.

Character Sets

The PDP-4’s console typewriter was an ASR-28 Teletype. Its five bit character code was called Baudot. It supported only upper case letters and required shift characters to get from letters to figures and back again. The line printer was unchanged and continued to use Hollerith coding.

Instruction Set Architecture

The PDP-4 and its successors reduced the amount of visible state in the CPU. Specifically,

register  PDP-1                PDP-4,-7,-9

AC        arithmetic register  same, plus I/O register
IO        I/O register         removed (MQ with EAE option)
OV        overflow indicator   replaced by Link register
PF        program flags        removed
SS        sense switches       removed
TW        test word            front panel switches
EXTM      extend mode          same
IOSTA     IO flags             same

The register changes simplified the logic implementation. The L was essentially the 19th bit of the AC, rather than a special flag. The AC no longer implemented -0 detection. I/O now used the existing access paths to the AC rather than separate paths to an IO register. The elimination of the program flags, and the sense switches, was pure gain.

The PDP-4 halved the number of instructions, from 32 to 16, and reduced the number of instruction formats from 6 to 4. The memory reference format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|    op     |in|               address                |  mem reference
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

The I/O transfer format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  0  0  0|     device      | sdv |cl| pulse  |  I/O transfer
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

The operate format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  1  0|  |  |  |  |  |  |  |  |  |  |  |  |  |  operate
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                |  |  |  |  |  |  |  |  |  |  |  |  |
                |  |  |  |  |  |  |  |  |  |  |  |  +- CMA (3)
                |  |  |  |  |  |  |  |  |  |  |  +---- CML (3)
                |  |  |  |  |  |  |  |  |  |  +------- OAS (3)
                |  |  |  |  |  |  |  |  |  +---------- RAL (3)
                |  |  |  |  |  |  |  |  +------------- RAR (3)
                |  |  |  |  |  |  |  +---------------- HLT (4)
                |  |  |  |  |  |  +------------------- SMA (1)
                |  |  |  |  |  +---------------------- SZA (1)
                |  |  |  |  +------------------------- SNL (1)
                |  |  |  +---------------------------- inv skip (1)
                |  |  +------------------------------- rotate two (2)
                |  +---------------------------------- CLL (2)
                +------------------------------------- CLA (2)

The immediate format was:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  1  1|              immediate               |  LAW
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
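The four formats can be told apart from the top five bits of the instruction word. The sketch below is an illustrative decoder written for this discussion: the function names and the returned tuples are assumptions, and only the bit positions come from the diagrams above (bit 0 is the most significant bit). The EAE opcode (1101) belongs to the optional Extended Arithmetic Element and is only classified, not decoded.

```python
# Operate bits 5..17, in the order shown in the operate format diagram
OPERATE_BITS = ["CLA", "CLL", "rotate two", "inv skip", "SNL", "SZA", "SMA",
                "HLT", "RAR", "RAL", "OAS", "CML", "CMA"]

def bit(word, n):
    """Return bit n of an 18-bit word, with bit 0 as the most significant bit."""
    return (word >> (17 - n)) & 1

def decode(word):
    """Classify an 18-bit PDP-4 instruction word by format (illustrative only)."""
    op = word >> 14                          # bits 0-3
    if op == 0o17:                           # 1111: operate or LAW
        if bit(word, 4):
            return ("LAW", word & 0o17777)   # 13-bit immediate field
        names = [name for i, name in enumerate(OPERATE_BITS) if bit(word, 5 + i)]
        return ("operate", names)
    if op == 0o16:                           # 1110: I/O transfer
        return ("IOT", {"device": (word >> 6) & 0o77,   # bits 6-11
                        "sdv": (word >> 4) & 0o3,       # bits 12-13
                        "cl": (word >> 3) & 1,          # bit 14
                        "pulse": word & 0o7})           # bits 15-17
    if op == 0o15:                           # 1101: EAE option, not decoded here
        return ("EAE", word & 0o37777)
    return ("mem reference", {"op": op, "in": bit(word, 4),
                              "address": word & 0o17777})
```

For example, `decode(0o750000)` reports an operate instruction with only CLA set, and `decode(0o760017)` reports a LAW with an immediate field of 017.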

The following table shows the reduction in instruction count between the PDP-1 and the PDP-4:

PDP-1 instruction  PDP-4 instruction

AND                AND
IOR                removed
XOR                XOR
LAC                LAC
DAC                DAC
DZM                DZM
DIP                removed
DAP                removed
LIO                removed
DIO                removed
ADD                ADD; L used in place of overflow
SUB                removed
MUL                (EAE option)
DIV                (EAE option)
not present        TAD (2’s complement add)
IDX                removed
ISP                ISZ
XCT                XCT
SAD                SAD
SAS                removed
CAL                CAL
JDA                JMS
JSP                removed
JMP                JMP
skips              OPR skips
operate            OPR operates
shifts             (EAE option)
LAW                LAW
IOT                IOT

Beyond the reduction in instruction count, the PDP-4’s instruction set required less logic to implement.

• Instructions were encoded to minimize logic. For example, all instructions with IR<0:1> = 00 (CAL, DAC, JMS, DZM) did not read a memory operand. All instructions with IR<0:1> = 11 (JMP, EAE, IOT, OPR/LAW) were single cycle.
• ISZ (replacing IDX and ISP) did not modify the AC. By using 2’s complement arithmetic, it did not need to detect -0.
• JMS (replacing JDA and JSP) did not modify the AC. This eliminated the transfer path from the PC to the AC. JMS (and interrupts) saved PC and L, and in later systems, the memory extend and memory protection flags.
• LAW did not mask or modify the address but instead copied the entire instruction to AC.
• OPR no longer guaranteed conflict-free execution of any combination of bits.

Finally, indirect addressing was simplified by the elimination of multi-level indirection.

The PDP-4 replaced the PDP-1’s multiply, divide, and multi-bit shifts with an option, the Extended Arithmetic Element (EAE). The EAE added a second 18b arithmetic register, the MQ, and a shift/multiply/divide instruction. The EAE instruction was microprogrammed and could implement a wide variety of unsigned and signed (one’s complement) operations:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  0  1|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  EAE
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
             |  |  |  |  |  |  |  |  |  |  |  |  |  |
             |  |  |  |  |  |  |  |  |  |  |  |  |  +- or SC (3)
             |  |  |  |  |  |  |  |  |  |  |  |  +---- or MQ (3)
             |  |  |  |  |  |  |  |  |  |  |  +------- compl MQ (3)
             |  |  |  |  |  |  |  |  \_____/
             |  |  |  |  |  |  |  |     |
             |  |  |  |  |  \_____/     +------------- shift count
             |  |  |  |  |     |
             |  |  |  |  |     +---------------------- EAE cmd (3)
             |  |  |  |  +---------------------------- clear AC (2)
             |  |  |  +------------------------------- or AC (2)
             |  |  +---------------------------------- load sign (1)
             |  +------------------------------------- clear MQ (1)
             +---------------------------------------- load link (1)

The EAE architecture remained unchanged in the PDP-7, PDP-9 and PDP-15.

The PDP-4 included an extended addressing option (Type 16); two of the surviving PDP-4s in the 1972 census have more than 8KW of memory. No documentation has yet been found on this option, but it’s reasonable to assume that it was the same as the PDP-7’s. If that is true, the PDP-4’s extended memory model was essentially the same as the PDP-1’s, with 13 direct address bits instead of 12. Addressable memory was divided into four 8K word banks. Direct addresses always referenced the current memory bank; indirect addresses accessed either the current memory bank or all of memory, depending on the extend mode flag. As on the PDP-1, subroutine calls and interrupts saved the state of extend mode automatically.
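The addressing rule can be sketched as follows. This is a minimal model under stated assumptions, not a description of the Type 16 hardware: it assumes a 15-bit full address whose upper two bits select the bank, takes the current bank from the PC, and uses a hypothetical function name and a dictionary as memory.

```python
BANK_MASK = 0o60000   # upper two bits of a 15-bit address select the bank (assumed)
ADDR_MASK = 0o17777   # 13 direct address bits
FULL_MASK = 0o77777   # 15-bit address covering all of memory (assumed)

def effective_address(memory, pc, address, indirect, extend_mode):
    """Compute the effective address of a memory reference instruction."""
    bank = pc & BANK_MASK
    ea = bank | (address & ADDR_MASK)           # direct: always the current bank
    if indirect:
        pointer = memory[ea]
        if extend_mode:
            ea = pointer & FULL_MASK            # indirect may reach all of memory
        else:
            ea = bank | (pointer & ADDR_MASK)   # indirect confined to current bank
    return ea

memory = {0o20100: 0o65432}                     # a pointer word in bank 1
direct = effective_address(memory, 0o20000, 0o100, False, False)
banked = effective_address(memory, 0o20000, 0o100, True, False)
extended = effective_address(memory, 0o20000, 0o100, True, True)
```

With the program running in bank 1, the same pointer word resolves to 025432 with extend mode off and to 065432 with extend mode on.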

In all, the architectural tradeoffs in the PDP-4 substantially reduced control logic at the cost of complete software incompatibility with the PDP-1. There were also a few oversights; in particular, the lack of a “complement and increment” operate (present in the PDP-5) made two’s complement subtract an instruction longer. The PDP-15 finally corrected this oversight.

The PDP-4 (and the PDP-5) introduced a new feature, the concept of “auto-index” memory locations, that is, locations which, when used as indirect addresses, incremented before use. This feature allowed efficient traversal of linear data structures and made the IDX and DAP instructions unnecessary.
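A short sketch shows why auto-indexing makes table traversal cheap. The code is illustrative only: the function names are hypothetical, and the use of locations 010-017 as the auto-index locations is an assumption taken from the addresses cited for the later 18b systems.

```python
WORD_MASK = 0o777777
AUTO_INDEX = range(0o10, 0o20)   # locations 010-017 (assumed for this sketch)

def indirect_pointer(memory, location):
    """Fetch an indirect address, incrementing auto-index locations before use."""
    if location in AUTO_INDEX:
        memory[location] = (memory[location] + 1) & WORD_MASK
    return memory[location]

def sum_table(memory, index_location, count):
    """Walk a table through an auto-index location, summing its entries."""
    total = 0
    for _ in range(count):
        total = (total + memory[indirect_pointer(memory, index_location)]) & WORD_MASK
    return total

memory = [0] * 0o100
memory[0o40:0o43] = [5, 6, 7]    # a three-word table at 040-042
memory[0o10] = 0o37              # the pointer starts one below the table
table_sum = sum_table(memory, 0o10, 3)
```

Because the increment happens before use, the pointer is initialized to one less than the first table address; no separate instruction is needed to advance it.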

I/O System

The I/O system was pruned even more dramatically than the CPU. Synchronous waits and timed waits were dropped. Instead, only two mechanisms were supported: polled waits and interrupts. Further, the two mechanisms were integrated by having the device flag for polling be the triggering mechanism for device interrupts. Finally, polled waiting was implemented more efficiently by allowing devices to increment the PC (skip) in response to an IO instruction. The PDP-5 also used this I/O paradigm, and it was retained throughout the life of the 12b and 18b families.

In the PDP-4, an ideal I/O device had one flag representing the state of an I/O operation. This flag was cleared when the device initiated I/O; it was set when the device completed I/O. For example, in the paper tape reader, the reader flag was cleared by a request to read a character or by explicit command, and set when the character was in the I/O buffer.
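The flag-based model can be sketched as below. This is an idealized illustration, not a model of any particular PDP-4 controller; the class and function names are assumptions. It shows the three ideas described above: the flag is cleared when I/O starts and set when it completes, a polling IOT skips when the flag is set, and the interrupt request is the logical OR of the device flags.

```python
class Device:
    """Idealized device with a single flag and a data buffer (illustrative)."""
    def __init__(self):
        self.flag = False
        self.buffer = 0

    def start(self):
        """Initiating I/O clears the flag."""
        self.flag = False

    def complete(self, data):
        """Completing I/O fills the buffer and sets the flag."""
        self.buffer = data
        self.flag = True

def skip_on_flag(pc, device):
    """Polled wait: an IOT that skips the next instruction if the flag is set."""
    return pc + 2 if device.flag else pc + 1

def interrupt_request(devices, interrupts_on):
    """The interrupt request is the logical OR of all device flags."""
    return interrupts_on and any(d.flag for d in devices)

reader, punch = Device(), Device()
reader.start()
waiting_pc = skip_on_flag(0o100, reader)     # flag clear: no skip
reader.complete(0o101)
done_pc = skip_on_flag(0o100, reader)        # flag set: skip
pending = interrupt_request([reader, punch], interrupts_on=True)
```

The same flag serves both mechanisms, so a program can poll a device or take an interrupt from it without any change to the device itself.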

Interrupts (as sequence breaks were now called) were simplified, and control was made explicit rather than implicit.

Function            PDP-1                     PDP-4

interrupt request   request flip-flop         logical or of device flags
interrupt block     request in progress flop  interrupts turned off
interrupt action    save AC                   --
                    save PC + flags           save PC + flags
                    save IO                   --
                    clear OV                  --
                    clear extend mode         clear extend mode
                    set break in progress     turn off interrupts
                    set PC = 3                set PC = 1
interrupt complete  monitor for JMP I 1       turn on interrupts, one cycle
                                              delay to allow for JMP I 0

The PDP-4 offered a multi-level interrupt option. As in the PDP-1, each interrupt vectored to a unique memory block. Unlike the PDP-1, the memory block was a single location, which was executed. If the location contained a JMS, control transferred to an interrupt service routine. If the location contained any other instruction, the instruction was executed, but control returned to the main line program.

Software

Because the PDP-4 was not compatible with the PDP-1, it required new software. DEC provided an editor, an assembler, and, most notably, a Fortran II compiler, all paper-tape based. While the Fortran compiler was a significant advance, the assembler was actually a step backward: the PDP-1’s assembler had supported macros, the PDP-4’s did not. But it offered some consolation by being a one pass assembler, obviating the need to read the source paper tape twice. The assembler assumed that unresolved references would in fact be resolved and punched unresolved binary code as it processed the source, with a resolution dictionary at the end of the output tape. The resulting tape was then read, upside down and backward, by the loader, which used the resolution dictionary to “fix up” the broken references in the binary.
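The one-pass scheme can be illustrated with a toy model. Nothing here reproduces the actual PDP-4 assembler or its tape format: the source representation, the function names, and the opcode values are all assumptions chosen for the example. The sketch emits each word as soon as it is seen, leaves a blank address for forward references, and appends a resolution dictionary that the loader uses to patch the binary.

```python
def assemble(source):
    """One pass: emit words immediately and record unresolved references."""
    symbols, code, unresolved = {}, [], []
    for loc, (label, opcode, operand) in enumerate(source):
        if label:
            symbols[label] = loc
        if isinstance(operand, str):
            if operand in symbols:
                operand = symbols[operand]          # backward reference
            else:
                unresolved.append((loc, operand))   # forward reference
                operand = 0                         # punched with a blank address
        code.append(opcode | operand)
    # the resolution dictionary follows the binary at the end of the output
    dictionary = [(loc, symbols[name]) for loc, name in unresolved]
    return code, dictionary

def load(code, dictionary):
    """Loader: use the dictionary to fix up the unresolved addresses."""
    memory = list(code)
    for loc, value in dictionary:
        memory[loc] |= value
    return memory

# (label, opcode, operand); opcode values are placeholders for illustration
source = [(None, 0o600000, "done"),
          ("loop", 0o200000, 0o100),
          ("done", 0o740040, 0)]
code, dictionary = assemble(source)
memory = load(code, dictionary)
```

Since the dictionary is written last, a loader that reads the tape backward presumably encounters it before the code it has to patch, which would explain the unusual reading direction.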

The PDP-4’s programs later became the basis for the PDP-7’s software offerings, which accounts for lingering use of Baudot code on the PDP-7. However, the presence of FIODEC on the PDP-4 (and thus on the PDP-7) is a mystery, since the PDP-1 software base was not carried forward.

Early Mass Storage

The PDP-1 and PDP-4 started out as paper tape based systems. The development software was paper tape based; magnetic tape, if used at all, was used strictly for data. This situation was clearly unsatisfactory, and by 1963 DEC was experimenting with mass storage.

The first mass storage products were based on Vermont Research Drums. The Type 23 parallel and Type 24 serial drums offered 131,072 words of storage with rapid access. But the drums were big (two six-foot cabinets for the Type 23, one for the Type 24), expensive, and inflexible: storage was tied to the computer. This didn’t fit with the typical use of the 18b computers as “personal” or serially shared systems.

To find a solution, DEC again turned to Lincoln Labs. In 1962, Wes Clark had demonstrated the prototype of the LINC computer. It featured LINCtape, a block-replaceable tape system with a simple, rugged transport and small, inexpensive tape reels. LINCtape offered exactly the kind of “personal” storage needed to complement DEC’s computers. With some changes in tape format, DEC offered “MicroTape” (later renamed DECtape) on the PDP-1 and PDP-4 in 1963. The product also included a stand-alone program librarian, Microtrieve. DECtape was to remain the dominant form of mass storage on DEC’s 12b and 18b systems into the early 1970’s, when it was supplanted by the RK05 (2315-style) cartridge disk drive.

The PDP-7

According to the history of the 18b series in Computer Engineering, the PDP-4 was not a success. The use of slower logic yielded a system that was 5/8 the performance of the PDP-1 at ½ the price. What the market required was a system that was both higher performance and lower cost. That system was the PDP-7. Implemented (primarily) in 10 MHz logic, its basic 1.75 usec cycle time was almost three times the speed of the PDP-1, at 1/3 the cost.

The PDP-7’s basic architecture consisted of minor refinements of the PDP-4’s instruction set, accompanied by one interesting architectural extension: multi-user protection, the first in the 18b family. The PDP-7 also was the first 18b PDP to use ASCII coding.

Arithmetic Systems and Character Sets

The PDP-7’s arithmetic systems were identical to the PDP-4. The console typewriter was an ASR-33 Teletype. Its eight-bit character set was an early version of ASCII, with the high order bit always forced on. The character set supported both upper and lower case letters, although the console only supported upper case. The line printer’s SIXBIT character set was derived from ASCII by truncating codes 040 - 0137 to six bits. The rapid evolution of character sets in the 18b family was embodied in the PDP-7’s DECtape-based operating system DECsys. DECsys stored information in FIODEC, Baudot, and SIXBIT, depending on whether the underlying software was derived from the PDP-4 or newly written.
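The two character codes described above are simple transformations of ASCII. The sketch below is illustrative; the function names are assumptions. It forces the high-order bit on for the console code and truncates ASCII codes 040 - 0137 to six bits for the line printer’s SIXBIT.

```python
def console_code(ch):
    """Eight-bit console code: 7-bit ASCII with the high-order bit forced on."""
    return ord(ch) | 0o200

def ascii_to_sixbit(ch):
    """SIXBIT: ASCII codes 040-0137 truncated to their low six bits."""
    code = ord(ch)
    if not 0o40 <= code <= 0o137:
        raise ValueError("character has no SIXBIT equivalent: %r" % ch)
    return code & 0o77

codes = [oct(ascii_to_sixbit(c)) for c in "A 0_"]
```

Under this mapping "A" (0101) becomes 01, space (040) stays 040, and lower case letters, which lie above 0137, have no SIXBIT equivalent.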

Instruction Set and I/O Architecture

The PDP-7 used the same instruction set architecture as the PDP-4, including the EAE. The extended memory model was the same as the PDP-4’s. A new feature was a primitive form of multi-user protection called trap mode. If trapping was enabled, IOT’s and HLT became privileged instructions. If extend mode was simultaneously disabled, indirect addresses were confined to the current bank. This allowed for simple time-sharing, with each user in a separate memory bank. (An option, the KA70A, added a small bounds control register to protect memory within a bank.)

The PDP-7’s I/O architecture was identical to the PDP-4’s, and it used the same controllers for major I/O devices such as DECtape, magnetic tape, and the serial drum. A few new IOT’s were added, for management of the trap system. The PDP-7 featured an interprocessor link; this device set the model for the general purpose parallel I/O options in subsequent DEC computers. Like the PDP-1 (but unlike the PDP-4), the PDP-7 console featured a “read-in” switch, to automate system bootstrapping from paper tape. The “read-in” function did not use the PDP-4’s RIM format but instead loaded memory sequentially from the tape. Therefore, loading software required three steps: use the “read-in” switch to load the RIM loader; use the RIM loader to load the binary loader; and finally use the binary loader to load the software.

Software

The PDP-7 offered DEC’s first mass-storage operating system, the DECtape-based DECsys. (DECsys also ran on the PDP-4.) DECsys was a modest first step in operating system development. It consisted of a simple memory-resident DECtape I/O library, a keyboard monitor, a Fortran II compiler, an assembler, a linking loader, and a symbolic debugger. All of the components were based on PDP-4 and PDP-7 paper-tape counterparts, with calls to the DECtape I/O library replacing paper-tape I/O. The internals of DECsys reflect its heterogeneous origins, with directory information stored in Baudot and source files in FIODEC.

A DECsys system tape contained the bootstrap monitor in blocks 0 and 1, and the directory in block 2. The first word of the directory contained the directory length; the last word contained the address of the first free block on the tape. Directory entries consisted of 5 or 6 words:

Word 1: Type (1 for System, 2 for Working)
Words 2-3: File name, in Baudot
S, word 4: starting block on tape
S, word 5: starting address in memory
W, word 4: starting block on tape for F (Fortran) version
W, word 5: starting block on tape for A (assembler) version
W, word 6: starting block on tape for R (relocatable binary) version

Files were simply linked DECtape blocks, with the first word of a block pointing to the next; a pointer of 0 signified end of file.
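Reading such a file amounts to following a linked list. The sketch below is a minimal illustration; the tape is modeled as a dictionary from block number to a list of words, which is an assumption of the example rather than the DECsys on-tape layout, and the function name is hypothetical.

```python
def read_file(tape, first_block):
    """Follow the block chain; word 0 of each block points to the next block."""
    data = []
    block = first_block
    while block != 0:              # a pointer of 0 signifies end of file
        words = tape[block]
        data.extend(words[1:])     # the remaining words are file contents
        block = words[0]
    return data

# a two-block file: block 5 links to block 7, which ends the chain
tape = {5: [7, 0o111, 0o222], 7: [0, 0o333, 0o444]}
contents = read_file(tape, 5)
```

The starting block number would come from word 4, 5, or 6 of the directory entry, depending on the entry type and the version wanted.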

As far as the author can tell, all copies of DECsys have vanished. This is equally true of an even more historic system for the PDP-7, UNIX. The PDP-7’s multi-user protection, crude as it was, sufficed for implementation of the first version of UNIX, making the PDP-7 a significant system in the history of computing. Unfortunately, all copies of UNIX for the PDP-7 have been lost. Some details of the PDP-7 version can be found on Dennis Ritchie’s personal web site.

The PDP-9

The PDP-7 was considerably more successful than its predecessors, selling more than 100 systems thanks to its significant price/performance improvements. The PDP-9 was intended to carry the line forward. The arithmetic system and character sets were unchanged, and the instruction set and I/O architecture changed only minimally. The I/O subsystem changed from a radial to a bus design, necessitating redesign of all peripherals. Interfaces to programmed I/O peripherals (paper tape, console, line printer) remained basically the same as the PDP-4 and PDP-7; however, interfaces to mass storage peripherals (magnetic tape, DECtape) changed significantly. An entirely new multi-level interrupt option, called the Automatic Priority Interrupt (API), was designed. The PDP-9 carried over little of the PDP-7’s admittedly small software base.

Instruction Set and I/O Architecture

The PDP-9 introduced a more flexible form of memory management, with a bounds register separating user (lower) memory from system (upper) memory. The PDP-7’s trap flag now became the PDP-9’s user mode flag.

Although intended to be upward compatible with the PDP-7, the PDP-9 introduced a number of differences:

• Auto-indexing. In the PDP-7, each bank of memory had auto-index registers. In the PDP-9, only bank 0 had auto-index registers, and indirect references through addresses 00010-00017 were forced to reference bank 0.
• Extend mode restore. The PDP-7 used EMIR to prepare the system to restore extend mode at the end of an interrupt. The PDP-9 introduced the more ambitious RES, which prepared the system to restore the link, extend mode, and memory protect mode. This removed two instructions from the end of all interrupt routines.
• Extend mode behavior. The PDP-7 set extend mode on a protection trap but cleared it on an interrupt; the PDP-9 cleared it on both. The PDP-7 performed a modified JMS, storing the program state in location 0 but taking the next instruction from location 2; the PDP-9 performed a JMS 0 or JMS 20, depending on whether interrupts were on or off.

The PDP-9’s I/O architecture contained some modest improvements in flexibility and error detection. Status flags were added for reader and punch errors. The line printer controller implemented a device-specific interrupt enable/disable. The new DECtape, magnetic tape, and fixed head disk controllers implemented better programming models than their PDP-7 counterparts, and used up fewer device numbers in the process.

The PDP-9 also implemented an entirely new design for multi-level interrupts. Called the Automatic Priority Interrupt (API) option, the API separated the concept of interrupt channel from priority. The API option supported 32 channels (interrupting devices), but the channels were grouped into eight priority levels. Four channels, on the four lowest priorities, were reserved for software interrupts. When an API break occurred, the memory location corresponding to the channel was executed. The location had to contain a JMS to an interrupt service routine; use of other instructions was not supported. The API was carried over unchanged to the PDP-15.

Software

The PDP-9’s close compatibility with the PDP-7 allowed the latter’s software to be brought forward. However, that code base, dating from the PDP-4, was considered inadequate and relegated to use in the smallest systems. For mainstream use, a new software suite was written from scratch. The three-step software loading process was simplified by eliminating the intermediate RIM loader. The hodge-podge of I/O routines and libraries was replaced by a standard I/O executive that maintained compatible interfaces from the paper-tape environment through the mass-storage based operating systems (Advanced Monitor System, its foreground/background extension, and DOS). The PDP-4/7 assembler syntax and binary formats were scrapped and replaced with a new macro assembler, Macro 9. Fortran II was replaced by Fortran IV. The console was changed from software echoing of input characters to hardware echoing. The intent versus the practice for PDP-9 software is illustrated by the changes in the manual set. The examples in the Systems Reference Manual all follow PDP-7 assembler syntax, but most surviving software is written in Macro 9.

The PDP-15

The PDP-15 introduced the most significant set of architectural changes in the 18b product line since the transition from the PDP-1 to the PDP-4. It represented a major technology shift, from discrete to TTL integrated circuits. The PDP-15 was the fastest and most popular 18b computer in Digital’s history. It was also the last.

Instruction Set and I/O Architecture

The PDP-15 introduced four architectural extensions:

• two new registers, an 18b index register and a 9b limit register
• extended addressing to 128K words
• memory relocation and protection
• hardware floating point

The introduction of the index register made the PDP-15 more competitive with contemporary machines such as the SDS 940 and DDP 516, both of which had indexing. To get an index register select into the memory reference instructions, the directly addressable memory range was reduced from 8K to 4K:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|    op     |in| x|              address              |  mem reference
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Direct addressing beyond 4K words could be done by indirect addressing (maximum 32K words), or by indexing (maximum 128K words). However, return addresses remained limited to 15b; thus the maximum practical code segment size remained 32K words. Extended memory worked best with the new memory relocation and protection option; in that environment, multiple 32K word programs could reside in memory simultaneously.

The addition of indexing created a serious compatibility problem with the PDP-9. To ameliorate migration issues, the PDP-15 redefined the PDP-7’s and PDP-9’s extend mode flag as PDP-9 compatibility mode, or bank mode. If bank mode was enabled, memory reference decoding was identical to the PDP-9, without index capability. The PDP-15 did not implement the PDP-9’s extend mode capability within bank mode, because extend mode, which was a compatibility aid for PDP-4 and PDP-7 programs, was no longer needed.

The hardware floating point unit was another new addition to the architecture. It dramatically improved the performance of the system in scientific applications. To support indexing and floating point, the PDP-15 introduced two new instructions, both carved out of the IOT instruction. Bits <4:5> of the IOT instruction had been defined as sub-device selects but in practice were unused. The PDP-15 used them to differentiate between IOT instructions (<4:5> = 00), floating point instructions (<4:5> = 01),

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  0  0  1|             subopcode             |  floating point
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|in|                     address                      |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

and index operate instructions (<4:5> = 1x):

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  0  1| subopcode |        immediate         |  index operate
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
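The three-way split of the IOT opcode can be expressed as a small classifier. This is an illustrative sketch with a hypothetical function name; only the bit assignments come from the text above (bit 0 is the most significant bit of the 18b word).

```python
def classify_pdp15_iot(word):
    """Classify a PDP-15 word with opcode 1110 by bits <4:5> (illustrative)."""
    if (word >> 14) != 0o16:
        raise ValueError("not in the IOT opcode space")
    sub = (word >> 12) & 0o3       # bits <4:5>
    if sub == 0:
        return "IOT"
    if sub == 1:
        return "floating point"
    return "index operate"         # <4:5> = 1x

kinds = [classify_pdp15_iot(w) for w in (0o700000, 0o710000, 0o720000, 0o730000)]
```

Words of the form 70xxxx remain ordinary IOT instructions, 71xxxx words select the floating point unit, and both 72xxxx and 73xxxx words are index operates.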

In addition to the major changes outlined above, the PDP-15 had its own set of tweaks and incompatibilities compared to its predecessor. Two meaningless operates were redefined as IAC (increment AC) and BSW (byte swap). The former facilitated a one-instruction 2’s complement, thereby correcting a hole in the arithmetic system. On the PDP-9, DBR and RES were triggered by a JMP indirect, on the PDP-15 by any indirect. The PDP-15 implemented new IOT skips for bank mode. A mid-life ECO (called the “re-entrancy ECO”) added two additional IOT’s to inhibit and enable interrupts. The PDP-15 EAE did not require that the link be cleared for IMUL and IDIV. Lastly, the PDP-15 API placed the program interrupt priority between the API hardware and software interrupts, rather than below the software interrupts.

From a programming viewpoint, the PDP-15’s I/O architecture was the same as the PDP-9’s, but the implementations were quite different. The PDP-15 implemented a separate I/O processor, providing greater expandability and flexibility, and a different I/O bus. It had more powerful peripherals, including the RP15/RP02 disk pack and the LP15 DMA line printer. Some PDP-9 controllers, such as the TC09 DECtape controller and the RF09 fixed head disk controller, were redesigned to connect directly to the PDP-15’s I/O bus; others were interfaced through a backwards-compatible bus converter.

Although the PDP-15 was more successful than any prior 18b system, compared to the PDP-11 its volume was low. This made continuing investment in new technology and options difficult. The CPU was never re-implemented to take advantage of advances in component integration. Investments in new peripheral types and controllers had to be limited. The PDP-15 group responded with great ingenuity to these constraints. Notable developments included:

• Multiprocessing. Two CPU’s could share memory and I/O subsystems, for increased throughput in a multiprogramming environment.
• PDP-11 add-on processor. The Unichannel-15 was a PDP-11/05 CPU that functioned as an I/O controller. The Unibus tied in directly to the PDP-15’s memory system, using the two data parity lines as extra data lines. This gave the PDP-15 access to inexpensive PDP-11 peripherals, such as the RK05 and LP11.
• XVM memory manager. The XVM project was the final spin on the PDP-15. It replaced the initial memory relocation option with a more sophisticated unit. The new relocation unit allowed individual programs to extend beyond 32K words.

These structural innovations stretched the lifetime of the product line but could not reverse its status as a niche rather than a volume product. By the mid 1970’s, the PDP-15’s position in DEC’s product line was eclipsed by the success of the more flexible PDP-11 (as the position of the PDP-10 would be by the VAX). In 1977, the PDP-15 was retired, ending the history of the 18b product family.

Software

The PDP-15 built on the PDP-9’s software base. The Advanced Monitor System was retained and extended to create DOS-15 and its batch extension, BOS-15. A new real-time operating system, RSX15, evolved from an execution-only environment into a full-featured multiprogramming system, RSX15-Plus III, that exploited the memory relocation hardware and multiprocessing capabilities to provide simultaneous timesharing, batch, and real-time capabilities. Another notable system was MUMPS (MGH Utility Multi Programming System), a timesharing system developed at Massachusetts General Hospital for processing medical records. Descendants of MUMPS (now known as the M language) continue to be used today in medical systems. DOS, RSX-Plus III, and MUMPS were all substantially rewritten in the mid-70’s to take advantage of XVM memory management.

18b Systems Today

Because of the low numbers produced (< 1500), and the early retirement of the product line, relatively few examples of the DEC 18b computers are still extant (a fate shared by the early 36b products as well). Surviving systems are scattered and often in private collections, making an accurate census difficult.

• PDP-1: The Computer History Museum (Mountain View, Ca) has three PDP-1’s. One of these was running as recently as 1995 and is being restored to operation. The other two are from DEC’s history collection.
• PDP-4: The Computer History Museum has three PDP-4’s, all from DEC’s history collection. None are considered restorable.
• PDP-7: The Computer History Museum has a PDP-7, from DEC’s history collection. Max Burnet (Sydney, Australia) has a PDP-7 in his collection. Neither is considered restorable. There is a partially running PDP-7 in Norway and, incredibly, one still in operation in Oregon.
• PDP-9, 9/L: The Computer History Museum has both a PDP-9 and a –9/L. Max Burnet also has one of each, and the PDP-9/L works. The Rhode Island Computer Museum has a PDP-9, which is being restored. There are two PDP-9’s at ACONIT (Grenoble, France); Hans Pufal and his team have restored one to working order.
• PDP-15: Multiple examples in private hands.

Sources

The primary source for this article was DEC’s documentation archive. The author was fortunate to have access to the archive while it was still being staffed and maintained (Compaq dismissed the archive staff and dispersed the documents; HP has donated the archive to the Computer History Museum). Max Burnet has graciously shared his unique collection of DEC documents and hardware. In addition, Al Kossow and Dave Gesswein have done the field of “computer archaeology” a tremendous service by scanning, transcribing, and publishing online the surviving documents, DECtapes, and paper tapes from the 18b family. Last, but hardly least, the staff of the Computer History Museum has made available its significant archive of DEC material. Among the items consulted:

Family
1972 Field Service Census of Systems under contract – Computer History Museum

PDP-1
PDP-1 Handbook (F-15, 1960 edition) – online
PDP-1 Handbook (F-15B, 1961 edition) – online
PDP-1 Handbook (F-15C, 1962 edition) – Max Burnet’s collection, now online
PDP-1 Handbook (F-15D, 1963 edition) – Computer History Museum, now online
PDP-1 Maintenance Manual (F-17) – Max Burnet’s collection, now online
PDP-1 Input-Output Systems Manual (F-25) – DEC archive, now online

PDP-4
PDP-4 Handbook (F-45, 1962 edition) – DEC archive, now online
PDP-4 Maintenance Manual (F-47) – Max Burnet’s collection, now online
PDP-4 Technical Specification (DEC memo M-1142) – online
PDP-4 Fortran Users’ Manual (J-4FT) – DEC library, now online
PDP-4 EAE Option Bulletin (F-43(18)P) – Computer History Museum
PDP-4 Paper, Gordon Bell, August 1977 – Computer History Museum

PDP-7
PDP-7 Reference Manual (F-75, 1964 edition) – DEC archive, now online
PDP-7 Maintenance Manual and logic prints (F-77) – Max Burnet’s collection
DECSYS-7 Operating Manual (7-5-S) – DEC library, now online

PDP-9
PDP-9 User’s Handbook (F-95, 1968 edition) – online
PDP-9 Maintenance Manual (F-97) – online
PDP-9 logic prints – online
KE09A Extended Arithmetic Element Instruction Manual – online
PDP-9 – Design History, Don Vonada, undated – Computer History Museum

PDP-15
PDP-15 Reference Manual (first and sixth editions) – online
PDP-15 Maintenance Manual – online
XVM System Reference Manual – online
XVM Maintenance Manual – online
PDP-15 processor diagnostics – online
PDP-15 Development Project History, Jerry Butler, September 1977 – Computer History Museum

Another critical source was Computer Engineering: A DEC View Of Hardware Systems Design. The article “The PDP-1 and Other 18-Bit Computers”, by Gordon Bell, Gerald Butler, Robert Gray, John McNamara, Donald Vonada, and Ronald Wilson, contains unique hardware, marketing, and technology information about the 18b family. The book, out of print for years, is now online, thanks to the efforts of Gordon Bell.

Lastly, the author had the benefit of the recollections of people who worked on the 18b family, including Gordon Bell, Dennis Ritchie, and Barry Rubinson, as well as access to the surviving archive of PDP-7 software from Applied Data Research.

18b PDP Web Sites

Gordon Greene’s PDP-1 web site, http://www.dbit.com/~greeng3/pdp1/

Al Kossow’s documentation archive, including the 18b PDP’s, http://bitsavers.org/pdf/

Dennis Ritchie’s memoir of early UNIX, http://www.bell-labs.com/history/unix/pdp7.html

SIMH simulation site, http://simh.trailing-edge.com

Decoding The H316/H516 “Generic A” Instructions

Bob Supnik, 23-Jul-2001 [revised links 18-Feb-2005]

Introduction

The Honeywell Series 16 (H116, H316, H416, H516, H716) was a family of 16b minicomputers sold from the mid-60’s to the mid-70’s. The series was originally built by Computer Controls Corporation and designated the DDP family. In 1969, Honeywell purchased Computer Controls and renamed the family the Series 16. Historically, the most famous model in the series was the H516, which was used as the original Arpanet “Interface Message Processor” or IMP – the world’s first router. This paper is concerned with the H316 and H516, which were logically identical.

Like many 1960’s minicomputers, the H316/H516 was accumulator rather than general register based. It had an instruction group (known in the hardware documentation as the Generic A group) for manipulating the accumulator (A) and the carry flag (C). In many contemporary machines, the accumulator manipulation instruction was microcoded; that is, individual bits or fields in the instruction controlled individual functions in the data path. For example, the PDP-7/9 operate instruction was decoded as follows:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  1  1  1  0|  |  |  |  |  |  |  |  |  |  |  |  |  | operate
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                |  |  |  |  |  |  |  |  |  |  |  |  |
                |  |  |  |  |  |  |  |  |  |  |  |  +- CMA (3)
                |  |  |  |  |  |  |  |  |  |  |  +---- CML (3)
                |  |  |  |  |  |  |  |  |  |  +------- OAS (3)
                |  |  |  |  |  |  |  |  |  +---------- RAL (3)
                |  |  |  |  |  |  |  |  +------------- RAR (3)
                |  |  |  |  |  |  |  +---------------- HLT (4)
                |  |  |  |  |  |  +------------------- SMA (1)
                |  |  |  |  |  +---------------------- SZA (1)
                |  |  |  |  +------------------------- SNL (1)
                |  |  |  +---------------------------- rev skip (1)
                |  |  +------------------------------- rot twice (2)
                |  +---------------------------------- CLL (2)
                +------------------------------------- CLA (2)
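The bit assignments above can be turned into a small decoder. The following Python sketch is illustrative only: the function name `decode_operate` and the table layout are assumptions made here, not part of any DEC or SIMH source. It simply reports which microcoded functions are selected by an 18b operate instruction word, using the bit numbering of the diagram (bit 0 is the most significant bit).

```python
# Microcoded functions of the PDP-7/9 operate instruction, keyed by bit
# number (bit 0 = most significant bit of the 18-bit word).
OPERATE_BITS = {
    5: "CLA", 6: "CLL", 7: "rot twice", 8: "rev skip", 9: "SNL",
    10: "SZA", 11: "SMA", 12: "HLT", 13: "RAR", 14: "RAL",
    15: "OAS", 16: "CML", 17: "CMA",
}

def decode_operate(word):
    """Return the functions selected by an operate instruction word."""
    # Bits 0-4 must be 1 1 1 1 0 for the operate instruction.
    if (word >> 13) & 0o37 != 0o36:
        raise ValueError("not an operate instruction")
    return [name for bit, name in sorted(OPERATE_BITS.items())
            if word & (1 << (17 - bit))]

print(decode_operate(0o750001))   # CLA and CMA selected together
```

Because each bit enables one function independently, combinations such as the one shown fall out of the encoding without any extra decode logic, which is the sense in which the instruction is "microcoded".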

In the H316/H516, the skip instruction group was also microcoded:

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| 1  0  0  0  0  0|rv|po|pe|ev|ze|s1|s2|s3|s4|cz| skip
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
                   |  |  |  |  |  |  |  |  |  |
                   |  |  |  |  |  |  |  |  |  +- skip if C = 0
                   |  |  |  |  |  |  |  |  +---- skip if ssw 4 = 0
                   |  |  |  |  |  |  |  +------- skip if ssw 3 = 0
                   |  |  |  |  |  |  +---------- skip if ssw 2 = 0
                   |  |  |  |  |  +------------- skip if ssw 1 = 0
                   |  |  |  |  +---------------- skip if A == 0
                   |  |  |  +------------------- skip if A<0> == 0
                   |  |  +---------------------- skip if mem par err
                   |  +------------------------- skip if A<15> = 0
                   +---------------------------- reverse skip sense

But the generic A instruction group was not, apparently, microcoded. In addition, the generic A group was very sparsely encoded: only 16 combinations out of a possible 1024 were defined. What did the undefined instructions do? How did the group really work?

Prior Work

In 1971, Donald Bell, at the National Physical Laboratory in the UK, wrote a technical note on “Micro-coding the DDP-516 Computer” [1]. By scanning all 1024 possible generic A instructions, he demonstrated that:

1. All of the generic A instructions had reproducible results.
2. Instruction bit 7 had no effect on operation, effectively halving the number of possible unique instructions.
3. The 512 potential remaining instructions fell into groups, with up to 46 different instructions producing the same result.

Bell offered a partial explanation of how the generic A group was implemented; but his explanation was insufficient, as Adrian Wise demonstrated in his 1999 H316/H516 simulator [2].

Generic A Decoding

The implementation of the generic A group depends on the particular details of the H316/H516 data path. The data path consists of a two input adder, multiplexors on the adder inputs, a results distribution register (D), logic for storing part or all of D back into A, and logic for manipulating the carry flag:

    A     X          M    ~M    P     Y
    |     |          |     |    |     |
  -----------      ---------------------
  \  G mux  /      \       H mux       /
   ---------        -------------------
       |                     |
     ---------------------------
     \          adder          / <- carry in
      -------------------------
          |               |
          v               v
   +------------+   carry out logic
   | D register |
   +------------+
          |
          v
   D1 to A1
   D1:8 to A1:8
   D9:16 to A9:16
   D1:8 to A9:16
   D9:16 to A1:8

Some points to note:

1. The G mux selects A, X, or no input. If there is no input, the output is 0. For the generic A instructions, the only available choices are A or 0.
2. The H mux selects M (memory input), ~M (memory input complemented), P, Y, or no input. If there is no input, the output is 177777. For the generic A instructions, the only available choices are M and ~M together, producing 0, or no input, producing 177777.
3. The adder performs either a true add or, if carries are suppressed, an exclusive OR.
4. Unless a register is explicitly cleared, a transfer OR’s new information into the register. If multiple sources are transferred simultaneously, all the sources are OR’d together.
5. The adder lacks a ~A input. Any instruction requiring the complement of A must use the adder to perform the operation A XOR 177777.

Generic A instructions are performed in four or six phases. A four phase instruction consists of:

T1          decoding
T2          setup
T3          adder
T4          distribution, carry

A six phase instruction repeats phases 2 and 3, with special overrides on the arithmetic unit during the repeated cycles:

T1          decoding
T2          setup
T3          arithmetic
T2 repeat   distribution
T3 repeat   adder, forced add
T4          distribution, carry

The data path and timing are controlled by hard-wired decode logic, as follows:

phase     signal  decoding                function

2-3       EASTL   ((m12+m16)x!AZZZZ) +    Enable A to adder input G
(tlate)           (m9+m11+AZZZZ)          (else, input 1 = 0)
          EASBM   m9+m11+AZZZZ            Enable 0 to adder input H
                                          (else, input 2 = ‘177777)
          JAMKN   (m12+m16)x!AZZZZ        Force adder carry network to 0
                                          (adder generates exclusive OR)
          EIKI7   (m15x(C+!m13))x!JAMKN   Enable 1 to adder carry in
                                          (else, adder carry in = 0)
3         SETAZ   m8xm15x!AZZZZ           Set AZZZZ
          CLDTR   always                  Clear D
          ESDTS   always                  Enable adder output to D

If AZZZZ

2         CLATR   t2xAZZZZ                Clear A
          EDAHS   t2xAZZZZ                Enable D high to A high
          EDALS   t2xAZZZZ                Enable D low to A low
2-3       EASTL   ((m12+m16)x!AZZZZ) +    Enable A to adder input 1
(tlate)           (m9+m11+AZZZZ)          (else, input 1 = 0)
          EASBM   m9+m11+AZZZZ            Enable 0 to adder input 2
                                          (else, input 2 = ‘177777)
          JAMKN   (m12+m16)x!AZZZZ        Force adder carry network to 0
                                          (adder generates exclusive OR)
          EIKI7   (m15x(C+!m13))x!JAMKN   Enable 1 to adder carry in
                                          (else, adder carry in = 0)
3         CLDTR   always                  Clear D
          ESDTS   always                  Enable adder output to D

End if AZZZZ

4         CLATR   t4x(m11+m15+m16)        Clear A
          CLA1R   t4x(m10+m14)            Clear A1
          EDAHS   t4x((m11xm14)+m15+m16)  Enable D high to A high
          EDALS   t4x((m11xm13)+m15+m16)  Enable D low to A low
          ETAHS   t4x(m9xm11)             Enable D low to A high
          ETALS   t4x(m10xm11)            Enable D high to A low
          EDA1R   t4x((m8xm10)+m14)       Enable D1 to A1
          CBITL   t4x(m9x!m11)            Clear C, conditionally set C from adder overflow
          CBITG   D1xm10xm12              Conditionally set C if D1 = 1
          CBITE   m8xm9                   Unconditionally set C

Generic A Instructions

The logic in the previous section was implemented as part of the SIMH simulator [3] for the H316/H516. Using a special test harness, the simulator produced a decomposition of the generic A group into unique instructions. This was compared to the output of the original instruction scan program, executing on a real H316; the results were identical. Thus, the simulated logic accurately reproduced the generic A implementation of a real H316.
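As an illustration of how the decode table in the previous section translates into software, the following Python sketch models the generic A logic. It is not the SIMH source (which is written in C). The names `generic_a`, `_adder`, and `_signed` are hypothetical, and the reading of "adder overflow" as signed 16b overflow of the final add is an assumption made here. Instruction bits are numbered 1 (most significant) to 16, as in the table.

```python
MASK = 0o177777   # 16-bit word
SIGN = 0o100000   # A1 / D1, the sign bit

def _signed(x):
    return x - 0o200000 if x & SIGN else x

def _adder(g, h, cin, xor):
    """Return (D, overflow); with carries suppressed the adder is an XOR."""
    if xor:
        return (g ^ h) & MASK, False
    total = _signed(g) + _signed(h) + cin
    # Assumption: "adder overflow" means signed 16-bit overflow.
    return (g + h + cin) & MASK, not -0o100000 <= total <= 0o77777

def generic_a(ir, a, c):
    """Return (A, C) after executing generic A instruction ir."""
    m = lambda n: (ir >> (16 - n)) & 1
    # Phases 2-3, first pass (AZZZZ = 0).
    jamkn = bool(m(12) or m(16))
    easbm = bool(m(9) or m(11))
    eastl = jamkn or easbm
    eiki7 = bool(m(15) and (c or not m(13))) and not jamkn
    d, ovf = _adder(a if eastl else 0, 0 if easbm else MASK, int(eiki7), jamkn)
    if m(8) and m(15):
        # SETAZ: repeat phases 2-3 with AZZZZ = 1. D is copied to A, then
        # the adder is forced to a true add of A + 0 + carry in.
        a = d
        d, ovf = _adder(a, 0, int(bool(c or not m(13))), False)
    # Phase 4: distribution. Transfers OR into A unless A was cleared.
    if m(11) or m(15) or m(16):
        a = 0                                   # CLATR
    if m(10) or m(14):
        a &= MASK & ~SIGN                       # CLA1R
    if (m(11) and m(14)) or m(15) or m(16):
        a |= d & 0o177400                       # EDAHS
    if (m(11) and m(13)) or m(15) or m(16):
        a |= d & 0o377                          # EDALS
    if m(9) and m(11):
        a |= (d << 8) & 0o177400                # ETAHS
    if m(10) and m(11):
        a |= (d >> 8) & 0o377                   # ETALS
    if (m(8) and m(10)) or m(14):
        a |= d & SIGN                           # EDA1R
    # Phase 4: carry logic.
    if m(9) and not m(11):
        c = 1 if ovf else 0                     # CBITL
    if (d & SIGN) and m(10) and m(12):
        c = 1                                   # CBITG
    if m(8) and m(9):
        c = 1                                   # CBITE
    return a, c

print(generic_a(0o140407, 1, 0))   # TCA applied to A = 1
```

Running such a function over all 1024 encodings with a few test values of A and C, and grouping encodings that give identical results, is one way a decomposition like the one described here could be produced.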

The following table lists the unique instructions within the generic A group. Where Bell provided mnemonics, they are used. Where he did not, the instruction function is shown in a C-like notation.

NOP: no operation 140000 140010 140020 140030 140041 140043 140045 140047 140051 140053 140054 140055 140057 140061 140062 140063 140065 140066 140067 140071 140072 140073 140074 140075 140076 140077 140400 140410 140420 140430 140441 140445 140451 140454 140455 140461 140465 140471 140474 140475

CMA (complement accumulator): A ← ~A 140001 140003 140005 140007 140011 140013 140015 140017 140021 140022 140023 140025 140026 140027 140031 140032 140033 140035 140036 140037 140101 140103 140105 140107 140111 140113 140115 140117 140401 140405 140411 140415 140421 140425 140431 140435 140501 140505 140511 140515

CRA (clear A): A ← 0 140002 140006 140040 140060 140102 140106 140440 140460

SSM (set sign minus): A1 ← 1 140004 140014 140104 140114 140404 140414 140500 140504 140510 140514

CM1: A ← C - 1 140012 140016 140112 140116

CHS (change sign): A1 ← ~A1 140024 140034 140424 140434

AD1 (add 1 to A, do not change C): A ← A + 1 140042 140046 140443 140447 140462 140463 140466 140467

CAR (clear A right): A ← A & 177400 140044 140064 140444 140464

CAL (clear A left): A ← A & 377 140050 140070 140450 140470

ADC (add C to A, do not change C): A ← A + C 140052 140056 140453 140457 140472 140473 140476 140477

SSP (set sign plus): A1 ← 0 140100 140110

C ← C | ~A1, A1 ← 0 140120 140130

CMA/ORC: A ← ~A, C ← C | A1 140121 140122 140123 140125 140126 140127 140131 140132 140133 140135 140136 140137 140521 140525 140531 140535

CHS/ORC: A1 ← ~A1, C ← C | A1 140124 140134 140520 140524 140530 140534

ICL (interchange and clear left): A ← A >> 8 140140

BTR (OR left to right): A ← A | (A >> 8) 140141 140143 140145 140147 140151 140153 140154 140155 140157 140541 140545 140551 140554 140555

A ← (A + 1) | ((A + 1) >> 8) 140142 140146 140543 140547

LTR (copy left to right): A ← (A & 177400) | (A >> 8) 140144 140544

BCL (OR to right, clear left): A ← (A & 377) | (A >> 8) 140150

A ← (A + C) | ((A + C) >> 8) 140152 140156 140553 140557

ORC/ICL: C ← C | A1, A ← A >> 8 140160

ORC/BTR: C ← C | A1, A ← A | (A >> 8) 140161 140162 140163 140165 140166 140167 140171 140172 140173 140174 140175 140176 140177 140561 140565 140571 140574 140575

ORC/LTR: C ← C | A1, A ← (A & 177400) | (A >> 8) 140164 140564

ORC/BCL: C ← C | A1, A ← (A & 377) | (A >> 8) 140170

RCB (reset C bit): C ← 0 140200 140201 140203 140204 140205 140207 140210 140211 140213 140214 140215 140217 140220 140221 140222 140223 140224 140225 140226 140227 140230 140231 140232 140233 140234 140235 140236 140237 140301 140303 140304 140305 140307 140311 140313 140314 140315 140317

AOA (add 1 to A): A ← A + 1, C ← overflow 140202 140206 140302 140306

ACA (add C to A): A ← A + C, C ← overflow 140212 140216 140312 140316

ICR (interchange and clear right): A ← A << 8 140240 140260

BTL (OR right to left): A ← A | (A << 8) 140241 140243 140245 140247 140251 140253 140254 140255 140257 140261 140262 140263 140265 140266 140267 140271 140272 140273 140274 140275 140276 140277

A ← (A + 1) | ((A + 1) << 8) 140242 140246

BCR (OR to left, clear right): A ← (A & 177400) | (A << 8) 140244 140264

RTL (copy right to left): A ← (A & 377) | (A << 8) 140250 140270

A ← (A + C) | ((A + C) << 8) 140252 140256

RCB/SSP: C ← 0, A1 ← 0 140300 140310

CSA (copy sign and set plus): C ← A1, A1 ← 0 140320 140330

CPY (copy sign): C ← A1 140321 140322 140323 140324 140325 140326 140327 140331 140332 140333 140334 140335 140336 140337

ICA (interchange A): A ← byteswap (A) 140340

BTB (OR to both halves): A ← A | byteswap (A) 140341 140343 140345 140347 140351 140353 140354 140355 140357

A ← (A + 1) | byteswap (A + 1) 140342 140346

A ← A1 | byteswap (A) 140344

A ← (A & 0377) | byteswap (A) 140350

A ← (A + C) | byteswap (A + C) 140352 140356

ORC/ICA: C ← C | A1, A ← byteswap (A) 140360

ORC/BTB: C ← C | A1, A ← A | byteswap (A) 140361 140362 140363 140365 140366 140367 140371 140372 140373 140374 140375 140376 140377

C ← C | A1, A ← A1 | byteswap (A) 140364

C ← C | A1, A ← (A & ‘377) | byteswap (A) 140370

LD1 (load 1): A ← 1 140402 140406 140502 140506

TCA (two’s complement A): A ← -A 140403 140407 140422 140423 140426 140427 140503 140507

ISG (inverse sign): A ← 2*C - 1 140412 140416 140512 140516

CMA/ADC: A ← ~A + C 140413 140417 140432 140433 140436 140437 140513 140517

A2A (add 2 to A): A ← A + 2 140442 140446

A2C (add 2*C to A): A ← A + 2*C 140452 140456

TCA/ORC: A ← -A, C ← C | A1 140522 140523 140526 140527

CMA/ADC/ORC: A ← ~A + C, C ← C | A1 140532 140533 140536 140537

ICS (interchange, clear left, keep sign bit): A ← A1 | (A >> 8) 140540

A ← (A + 2) | ((A + 2) >> 8) 140542 140546

A ← A1 | (A & 0377) | (A >> 8) 140550

A ← (A + 2*C) | ((A + 2*C) >> 8) 140552 140556

A ← A1 | (A >> 8), C ← C | A1 140560

A ← (A + 1) | ((A + 1) >> 8), C ← C | A1 140562 140563 140566 140567

A ← A1 | (A & 377) | (A >> 8), C ← C | A1 140570

A ← (A + C) | ((A + C) >> 8), C ← C | A1 140572 140573 140576 140577

SCB (set C bit): C ← 1 140600 140601 140604 140605 140610 140611 140614 140615 140620 140621 140624 140625 140630 140631 140634 140635 140700 140701 140704 140705 140710 140711 140714 140715 140720 140721 140724 140725 140730 140731 140734 140735

A2A/SCB: A ← A + 2, C ← 1 140602 140606 140702 140706

AOA/SCB: A ← A + 1, C ← 1 140603 140607 140622 140623 140626 140627 140703 140707 140722 140723 140726 140727

A2C/SCB: A ← A + 2*C, C ← 1 140612 140616 140712 140716

ACA/SCB: A ← A + C, C ← 1 140613 140617 140632 140633 140636 140637 140713 140717 140732 140733 140736 140737

ICR/SCB: A ← A << 8, C ← 1 140640 140660

A ← A | (A << 8), C ← 1 140641 140645 140651 140654 140655 140661 140665 140671 140674 140675

A ← (A + 2) | ((A + 2) << 8), C ← 1 140642 140646

A ← (A + 1) | ((A + 1) << 8), C ← 1 140643 140647 140662 140663 140666 140667

A ← (A & 177400) | (A << 8), C ← 1 140644 140664

RTL/SCB: A ← (A & 377) | (A << 8), C ← 1 140650 140670

A ← (A + 2*C) | ((A + 2*C) << 8), C ← 1 140652 140656

A ← (A + C) | ((A + C) << 8), C ← 1 140653 140657 140672 140673 140676 140677

A ← A1 | byteswap (A), C ← 1 140740 140760

BTB/SCB: A ← A | byteswap (A), C ← 1 140741 140745 140751 140754 140755 140761 140765 140771 140774 140775

A ← (A + 2) | byteswap (A + 2), C ← 1 140742 140746

A ← (A + 1) | byteswap (A + 1), C ← 1 140743 140747 140762 140763 140766 140767

A ← (A & 177400) | byteswap (A), C ← 1 140744 140764

A ← A1 | (A & 377) | byteswap (A), C ← 1 140750 140770

A ← (A + 2*C) | byteswap (A + 2*C), C ← 1 140752 140756

A ← (A + C) | byteswap (A + C), C ← 1 140753 140757 140772 140773 140776 140777

This chart differs from Bell’s in one case. Bell identified 140413 as CMA/ACA, with equivalent encodings 140417, 140432, 140433, 140436, 140437, 140513, 140517, 140532, 140533, 140536, 140537. On the H316, 140413 is actually CMA/ADC (C is not changed), and the equivalent encodings are 140417, 140432, 140433, 140436, 140437, 140513, 140517. The four instructions 140532, 140533, 140536, 140537 are a separate group implementing CMA/ADC/ORC. This does not mean that Bell was wrong: he ran his experiment on an H516, while this table is derived from an H316. The machines are supposedly equivalent, but without H516 logic prints, or access to a real system, we can’t be sure.

Acknowledgements

As is often the case in computer history work, this paper would not have been possible without the help of colleagues whom I know mostly or exclusively through the Internet. Adrian Wise created and maintains an invaluable set of web pages on the computers, transcribed software and manuals, and wrote the first H316/H516 simulator. Al Kossow provided online documentation. Mike Umbricht provided the hardware prints that unlocked the secrets of the generic A logic. Finally, Adrian closed the loop between simulated logic and real machine by running the instruction scan on his H316.

References

[1] On the web at http://www.series16.adrianwise.co.uk/computers/microcode.html.
[2] On the web at http://www.series16.adrianwise.co.uk/computers/emulator.html. The current version (1.4) reflects the results of this paper.
[3] On the web at http://simh.trailing-edge.com.

Unearthing The PDP-15’s Operating Systems

Bob Supnik, Revised 18-Feb-2005

Summary

On 13-May-2001, the PDP-15’s Advanced Monitor operating system was successfully booted on the SIMH (history simulator) emulation system. On 31-Jan-2003, DOS-15 was recovered as well. In July 2003, the sources to XVM/DOS were found. The discoveries and events leading up to these milestones illustrate the vital role that the Internet plays in enabling computer history enthusiasts world-wide to work together for computing preservation.

Background

DEC’s 18b computing family (the PDP-1, PDP-4, PDP-7, PDP-9, and PDP-15) is of significant historic interest. The PDP-1 was DEC’s first computer, and the base for the first “video game” (SpaceWar). The PDP-7 ran DEC’s first mass storage operating system (DECsys), and the first version of UNIX. Despite its historic importance, the 18b family was a limited financial success and was the first of DEC’s computing families to go out of production. Development ceased in the mid 1970’s, and the last 18b computer was produced in 1979. As a result, functioning 18b systems are rare, and many of the key software systems (DECsys, UNIX V1) have been lost. At the beginning of 2000, none of the later 18b software systems was available.

The Computer History Simulation Project is an Internet-based collection of computer history enthusiasts. Its goal is to recreate historic systems via emulation and to collect and transcribe software that ran on those historic systems.

Initial Documentation

The first step in preservation and recreation is to collect documentation about the target systems. For the 18b family, this was very challenging. The User Manuals for the early systems (PDP-1, PDP-4, PDP-7) are inaccurate, misleading, and sometimes just plain wrong. Peripheral and option documentation is sketchy or non-existent. The Maintenance Manuals and logic prints are really the only authoritative sources. For all these systems, documentation is rare, as fewer than 200 were produced in total.

Documentation for the later systems (PDP-9, PDP-15) is more plentiful but not necessarily better. The PDP-15’s User Manual, in particular the First Edition, is notorious for its inaccuracies. Again, the Maintenance Manuals, logic prints, and where available, diagnostics are the best source of accurate information.

The starting point for the documentation search was the DEC Archive, while it still existed. The Archive contained these documents:

- PDP-1 Handbook (second edition)
- PDP-1 I/O Manual
- PDP-1 Maintenance Manual
- PDP-4 Handbook
- PDP-7 User Manual (preliminary)
- PDP-7 Maintenance Manual (including logic prints)
- PDP-9 User Manual
- PDP-15 User Manual (first edition)

This sufficed to write a pair of preliminary 18b simulators, one for the PDP-1, the other for the rest.

Initial Software

With first pass simulators available, the next step was to collect software for testing and demonstration purposes. This proved to be as difficult as finding the documentation. For the PDP-1, the sources to Lisp and SpaceWar had been published. For the later 18b systems, no software was available from public sources. The restoration of Lisp illustrates the importance of the Internet in historic computer salvage. The source came from the Internet (transcribed by Gordon Greene); the PDP-1 macro assembler was derived from a PDP-8 cross-assembler (by Gary Messenbrink) found on the Internet; and the final debug was done remotely (by Paul McJones).

The initial PDP-7 software came from an old PDP-10 backup tape found by Dave Waks. Again, the Internet played a vital role in salvaging the code. The tape was transcribed on a 7-track tape drive by Paul Pierce. He shipped the raw bits over the Internet to Tim Litt, who decoded the PDP-10 backup formats. Tim in turn sent the transcribed bits to the author for debugging.

From 1998 to 2000, nothing was found for the PDP-9 or PDP-15. In 2000, Al Kossow found and transcribed a set of paper tapes from the McMaster physics lab. Among these tapes were some diagnostics and several copies of FOCAL for the PDP-15. The diagnostics sufficed to debug the PDP-15 simulator and get FOCAL running. At the same time, David Gesswein found a set of DECtapes for the Advanced Software System.

DECtapes

To revive the Advanced Monitor System, the SIMH team would have to find a way of dealing with DECtapes. In the 60’s, DECtapes were the principal form of mass storage on DEC minicomputers; the only affordable alternative was fixed-head disk, which was non-removable. DECtapes posed multiple challenges:

- DECtapes must be simulated with great precision. DECtape software was timing-dependent; in addition, it relied on the ability to examine individual words as they were read into or written from memory.
- DECtapes are difficult to transcribe. The only way to transcribe a DECtape is on a real DECtape drive, which is a complex mechanical device and difficult to maintain. In addition, the DECtape format for the PDP-9/15 requires special handling on the PDP-8 and PDP-11, the systems most likely to have survived and to have working drives.

As a prerequisite to implementing DECtapes, the unresolved issues in the 18b simulators needed to be cleared up. Via the Internet, Al Kossow, Max Burnet, and David Gesswein provided additional critical documents:

- PDP-1 Handbook (third edition)
- PDP-4 Maintenance Manual
- PDP-9 Maintenance Manual
- PDP-9 Schematics
- PDP-15 User Manual (sixth edition)
- KE09A (EAE) Reference Manual
- KF09A (API) Reference Manual
- TC02 (DECtape) Instruction Manual
- RF15 (DECdisk) Maintenance Manual
- PDP-9 Advanced Software Systems Monitors Manual
- PDP-15 Foreground/Background Reference Manual

These sufficed to answer the outstanding questions.

To cope with the expectations of DECtape software, the SIMH DECtape emulator implemented a word-by-word, time-based model that provided full simulation of acceleration, deceleration, and tape turn-around. By source code count, it was the most complicated peripheral simulator in SIMH. The simulator successfully ran the DECtape exercisers for the PDP-9 before any attempt was made to run DECtape-based software.

David Gesswein was able to transcribe PDP-9/15 DECtapes, including both the Keyboard Monitor System and the Foreground/Background System, using a PDP-8/E with TD8-E controller. The TD8-E was a highly simplified version of a traditional DECtape controller. It read every frame off the tape and left the decoding of the format to software. Thus, it was the ideal device for reading PDP-9/15 DECtapes; it disregarded the format differences and delivered raw bits from the tape for software to decode.

The Keyboard Monitor System

All the prerequisites for reviving the PDP-15’s DECtape operating systems – documentation, simulator, transcribed DECtapes – seemed to be at hand. There was one last problem. Unlike contemporary systems on the PDP-8 and PDP-11, the Advanced Software System did not bootstrap by reading DECtape block 0 into memory and jumping to it. Instead, it required a bootstrap paper tape that was loaded by the hardware read-in facility. For more than a year, all attempts to find this paper tape failed. Appeals on the Internet brought no response. The various private collectors on the Internet, and the Computer History Center, drew a blank.

In May 2001, an email exchange with Hans Pufal in Grenoble, France, about the PDP-10 revealed that Hans had a paper-tape bootstrap for the PDP-9. He had no direct way to transcribe it. Ingeniously, he scanned the paper tape in sections on a standard optical scanner, wrote a program to decode the pattern of holes, and verified the results by hand. He sent the results to me on May 10, 2001, and I immediately tried it with David Gesswein’s DECtape images.

Unfortunately, it didn’t work. One issue was a lingering bug in the DECtape simulator. A second was that the Foreground/Background System required the Automatic Priority Interrupt option (API), which hadn’t been implemented. But the major hurdles were undocumented software changes that occurred between the PDP-9 and PDP-15. As I wrote on May 11:

I tried bootstrapping the ADSS [Keyboard Monitor] as well as the F/B monitors. For the former, the starting address is a SKP HLT. For the latter, it's all 0's. The very next set of instructions picks up the starting address, masks the address with 070000, and performs other manipulations - clearly reconstructing the bootstrap address, provided that the BOOTSTRAP EXITS WITH A JMS RATHER THAN A JMP. So I changed the exit instruction (at 17745) to be JMS I 17755, and suddenly I am a lot further - not running mind you, but further. ADSS, in particular, prints out a nice error message IOPS03 021400, which indicates that the basic I/O system is alive if not well. (Somewhere I have documentation on its error messages.) I think F/B requires the API option, which isn't implemented.

So this is tremendous progress! I am fairly sure that the boot process is:

1. read 36(8) DECtape blocks, starting at block 0, into memory, starting at location 100
2. use location 105 of the loaded image as the starting vector
3. if pdp-9 software, jump to it; for the -15, jms to it

The next day, the error message was traced to a customization in the interrupt skip chain:

With Hans' bootstrap tape, modified (as I think) for the PDP-15 (exit instruction is JMS rather than JMP), I was getting an IOPS03 error - invalid interrupt. Tracing through the interrupt skip chain, I got to:

IOT 1041
JMP* handler

I have no idea what IOT 1041 does; it's not any standard DEC device, and it is backwards from normal I/O tests, which are always:

PSF
SKP
JMP* handler

So it must have been a custom device SYSGEN'd into this version of the monitor for the installation. I nop'd the tests out, and the tape got to the keyboard monitor prompt! I don't know how the keyboard monitor works, so I typed in PIP, that resulted in

.SYSLD 1
IOPS03 ...

so there's still more debug to do, but this is the first sign of life out of an 18b operating system!

The next day, this bug was traced to another undocumented change between the PDP-9 and PDP-15 bootstraps:

Second difference in bootstrap for the 15 vs the 9: the load image is one block longer. The 9 bootstrap loads 17000(8) words from 100 to 17100. The -15 monitor is 17400(8) words long, from 100 to 17500. The failure to load the last 400 words accounts for all the crashes on keyboard monitor commands. I can now take a directory, print the information message, print the SCOM region, etc.

I can't load or run a system program, so there's more work to be done. One possibility is that the system is gen'd for more memory than I am allowing.

And that indeed proved to be the case. Loading the bootstrap in upper memory allowed the Keyboard Monitor System to run correctly. By running SYSGEN, references to custom devices were eliminated, creating a clean DECtape image.
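The block arithmetic behind the second bootstrap difference can be checked with a short sketch. The block size of 400(8) words is not stated directly in the messages above; it is inferred here from the quoted figures (17000(8) words in 36(8) blocks), and the function name `load_range` is hypothetical.

```python
# Assumption: 400(8) words per DECtape block, inferred from the quoted
# figures (17000(8) words / 36(8) blocks).
WORDS_PER_BLOCK = 0o400
LOAD_BASE = 0o100          # load address given in the quoted messages

def load_range(blocks):
    """Return (first address, end address, word count) for a boot image."""
    words = blocks * WORDS_PER_BLOCK
    return LOAD_BASE, LOAD_BASE + words, words

for name, blocks in (("PDP-9", 0o36), ("PDP-15", 0o37)):
    start, end, words = load_range(blocks)
    print(f"{name}: {blocks:o}(8) blocks = {words:o}(8) words, "
          f"{start:o} to {end:o}")
```

Under that assumption, the one extra block accounts exactly for the 400(8) words that the PDP-9 bootstrap failed to load.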

From Keyboard Monitor To Foreground/Background

With the Keyboard Monitor System running, the next challenge was to bring up the Foreground/Background System. This required additional hardware: memory protection, automatic priority interrupts (API), and (although I didn’t realize it) a second terminal. All of these had their issues:

- Memory protection, though implemented, had never been tested and contained serious bugs. - API required intrusive changes throughout the CPU simulator. - SIMH had no capability for multiple terminals.

Fortunately, the PDP-9/15’s API closely resembled the PDP-10’s, allowing the latter to be used as a model for the implementation. The second terminal problem was solved by a kludge that allowed multiple terminals to access the controlling window and keyboard on a sequential, rather than a simultaneous, basis. With these changes, the Foreground/Background System was successfully run on 28-May-2001.

DOS-15

Once again, there was a long interval with no apparent progress. However, Hans Pufal and a team in Grenoble, France, had been restoring a real PDP-9 (part of the collection of La Cite des Sciences et d'Industrie in Paris). On 30-Aug-2002, they succeeded in booting the Advanced Monitor System on real hardware, for the first time in two decades. The availability of a working system enabled Hans to go, as he put it, “DECtape fishing”. One of the first discoveries was a complete source set for ADSS. But on 30-Jan-2003, he reported an even more significant find:

I've been doing some more DECtape fishing and have recovered three tapes which appear to be DOS-15 V2A. I have another set of restore tapes which I have so far not been able to dump successfully.

Having read the manuals and checked out the tapes all appeared to be in agreement and I wrote a C program to perform the same functions as the DOSSAVE program would do to reload the tapes onto the disk.

I've looked at the disk structures and all appears to be coherent, I see the MFD at disk block 1777 and the UIC's appear at their proper places also. So I THINK I have an image of an RF single platter disk containing DOS-15.

How to boot it? I have no bootstrap!

Looking through the DOS-15 System Manual (which Al Kossow had scanned and made available on the Internet), I noticed that the calling sequence for the resident bootstrap looked very similar to the DECtape bootstrap for ADSS. Hans had the sources for the ADSS bootstrap and tried it on the DOS image, but it didn’t work. He wrote:

Actually, I've just been reading RFSBT source and it seems to be "converting" DT units numbers to platter addresses - I don't think we want that. Can you take a look around label PLAT3.

Based on the DOS-15 Manual, I suggested a patch to the boot program:

You're right, the code is wrong. The DOS15 System Manual says that the unit number should be ignored for the RF15; instead the disk is numbered sequentially from block 00000 to 17777.

Assuming that the "4 word parameter block" for TRAN ends up in .dtblk through dkword, the address calculation needs to do something like this:

This would at least be consistent with the DOS15 documentation, and would function identically to the current code for bootstrapping the system.

That sufficed to get to the DOS-15 $ prompt, as Hans reported on 31-Jan-2003:

Progress:

sim> load rfsboot.rim 77637
sim> at rf dosv2a.rfa
RF: buffering file in memory
sim> go

DOS-15 V2A
ENTER DATE (MM/DD/YY) -

But none of the system programs seemed to work; they all aborted with an IOPS21 message. Hans traced the problem to the driver for the RF15, which was sitting in a strange loop:

I'm having difficulties figuring out the following code:

75072: CLA
75073: IOT 7045
75074: IOT 0
75075: IOT 0
75076: IOT 0
75077: IOT 7065
75100: IOT 0
75101: IOT 0
75102: DSSF
75103: JMP 75106
75104: DSCD
75105: JMP 75113
75106: DSCD
75107: TAD 75401
75110: SAD 75722
75111: JMP 75231
75112: JMP 75077
75113: DAC 75072
75114: SNA CLL

In particular the disk IOT's; also the IOT 0s seem somewhat strange. I assume they are time delays, or does IOT 0 do something?

The code is executed right before the IOPS error is declared, my tentative assumption is that the code does something different on the emulator than on the real hardware causing failure.

I found a similar piece of code in the ADSS RF15 driver. Based on the source, I concluded that the code was attempting to size the number of platters and failing due to an emulator bug. Hans verified this assumption by patching the code and was able to get further, but he still hit the IOPS21 error (because there were multiple copies of the code, as it turned out). Even though the extant documentation did not mention this sizing capability, it clearly worked. I revised the RF15 emulator accordingly, and on 02-Feb-2003 Hans was able to get into and out of the system programs. By 03-Feb-2003 he had run the DOS checkout package. A few days later he was able to generate systems on larger disks. Shortly thereafter, he demonstrated that DOS-15 contained a bug that prevented maximum-sized RF15’s from running correctly and generated the first DOS-15 “patch” in more than 30 years! But that’s another story.

From image discovery to complete recovery had taken less than two weeks, thanks to tapes from France, documentation from California, simulation from Massachusetts, and Internet-based collaboration and debugging.

XVM/DOS

The recovery of XVM/DOS involves a new Internet-based element, namely, the worldwide trading market created by Ebay. Over the last few years, a small but growing market has emerged for historic computer artifacts. Almost all of the buying and selling occurs on Ebay. While the number of items offered for sale on any given day is small, over time a variety of interesting items have appeared.

Through Ebay, Al Kossow was able to purchase a set of “PDP-15 DECtapes”. Al used a functioning PDP-11 with a TC11/TU56 (and a modified version of John Wilson’s program for reading PDP-8 DECtapes) to recover the contents. The tapes contained an intact set of sources for V1A of XVM/DOS. Using these sources, it proved possible to build XVM/DOS for a system without a Unichannel (that is, without an attached PDP-11). After suitable extensions to the PDP-15 simulator (and debugging of those extensions using XVM/DOS itself), XVM/DOS was run successfully in early January, 2004.

Next Steps?

The Keyboard Monitor, Foreground/Background Monitor, DOS-15, and XVM/DOS do not exhaust the variety of environments available for the PDP-15. The advanced disk-based operating systems -- RSX-PLUS-III (and its later version XVM/RSX) and MUMPS -- represent even higher levels of capability and would be of great interest historically. Recently, a full set of sources for XVM/RSX surfaced in a lot of DECtapes purchased on Ebay. If system generation documentation can be found, it should be possible to bring back XVM/RSX. That would leave only MUMPS unrecovered. A partial source listing of XVM/MUMPS turned up in the DEC Archives; no other copies have been found. DEC junked its PDP-15 media archive at the end of the 80’s, when the last systems went off contract. DECdisks and RP02’s were huge and have all ended up on the scrap heap. Unless there is a complete save set on magnetic tape somewhere, MUMPS is lost.

Acknowledgements

The revival of the Advanced Monitor Systems and DOS-15 demonstrates the critical role of the Internet in creating a virtual community of computer history enthusiasts. The project would not have succeeded without the help of individuals whom I know mostly or exclusively through the Internet. In addition, the Internet allowed for rapid interchange of documents, software images, and folklore.

I am particularly indebted to:

- Max Burnet (Australia), for hardware documentation on the 18b systems.
- Al Kossow (California), for hardware and software documentation on the 18b systems, for paper tape images, and for the XVM/DOS DECtapes.
- David Gesswein (Maryland), for hardware and software documentation on the 18b systems, as well as the ADSS DECtape images.
- Hans Pufal (France), for finding the critical missing ADSS bootstrap tape and transcribing it without a paper tape reader; and for finding, transcribing, and debugging DOS-15. Hans filled in an enormous number of missing pieces, including reconstruction of DOSSAVE from a description of its functional behavior.

PDP-11 Interrupts: Variations On A Theme

Bob Supnik, 03-Feb-2002 [revised 20-Feb-2004]

Summary

Despite the presence of documented standards and example implementations, PDP-11 devices showed significant variability in implementing interrupts. While some of these variations were nearly invisible or harmless, others required explicit support or workarounds in device drivers. Consequently, PDP-11 emulators must model device interrupt control with great care.

The “Standard” Implementation

Until the advent of message-oriented devices like the TS11 and the MSCP controllers, all PDP-11 devices contained a “control/status register”. The CSR contained a device ready flag in bit <7> and an interrupt enable flag in bit <6> (other common assignments were error summary in bit <15> and go in bit <0>):

 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
ERR                              RDY IE                      GO

The device’s interrupt request was implemented with an edge sensitive flip-flop. INTR was set by the rising edge of the logical AND of RDY and IE; it was cleared by device initialization or by interrupt acknowledge; and its output was masked by the logical AND of RDY and IE. The entire circuit required just two AND gates and a JK flip flop:

                   +-----+     |--\
       |--\   +5V--|D   1|-----|   |
RDY ---|   |       |     |     |   |--- INTERRUPT REQUEST
       |   |---+---|C    |  +--|   |
IE  ---|   |   |   +--^--+  |  |--/
       |--/    |      |     |
               +------|-----+
                      |
     INIT + IAK ------+

Behaviorally,

- A transition of RDY AND IE from 0 to 1 set the interrupt request.
- Granting the request, or initializing the device, cleared the interrupt request.
- Clearing either RDY or IE blocked the interrupt request. This cannot be distinguished from clearing the request.
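The three behaviors above can be restated as a small behavioral model. The following Python sketch is an illustration only, not SIMH source: the class and method names are hypothetical, and it models the described behavior rather than the gates.

```python
class StandardInterrupt:
    """Behavioral model of the 'standard' PDP-11 interrupt circuit."""

    def __init__(self):
        self.rdy = False    # CSR bit <7>
        self.ie = False     # CSR bit <6>
        self.intr = False   # edge-triggered request flip-flop

    def set_csr(self, rdy=None, ie=None):
        """Update RDY and/or IE; latch INTR on a 0 -> 1 edge of RDY AND IE."""
        before = self.rdy and self.ie
        if rdy is not None:
            self.rdy = rdy
        if ie is not None:
            self.ie = ie
        if not before and (self.rdy and self.ie):
            self.intr = True

    def clear(self):
        """INIT or interrupt acknowledge (IAK) clears the flip-flop."""
        self.intr = False

    @property
    def request(self):
        """Bus request: flip-flop output masked by RDY AND IE."""
        return self.intr and self.rdy and self.ie


dev = StandardInterrupt()
dev.set_csr(rdy=True)     # RDY alone: no request
dev.set_csr(ie=True)      # rising edge of RDY AND IE latches INTR
assert dev.request
dev.set_csr(ie=False)     # clearing IE blocks the request
assert not dev.request
```

In this model, blocking and re-enabling a request re-creates the rising edge, which is why software could not tell a blocked request from a cleared one.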

This circuit is presented in all the standard Unibus handbooks and is included in all the early PDP-11 device controllers, such as the PC11 (paper tape), DL11 (serial line), and RK11 (cartridge disk). The circuit was reduced to silicon in the Qbus interrupt logic chip (DC003).

Variations

As the TTL logic family broadened, variations began to appear in the interrupt circuitry. The standard implementation seemed overly general. While it made sense to request an interrupt on the rising edge of RDY, why request an interrupt on the rising edge of IE? The following variation began to appear:

           +-----+     |--\
IE--+------|D   1|-----|   |
    |      |     |     |   |--- INTERRUPT REQUEST
    | RDY--|C    |  +--|   |
    |      +--^--+  |  |--/
    |         |     |
    +---------|-----+
              |
INIT + IAK ---+

If IE was set, the rising edge of RDY requested an interrupt. Once the interrupt was set, clearing IE would block the interrupt request. As before, both device initialization and interrupt acknowledge cleared the interrupt request.

This variation apparently saved a gate with no impact on function. But it had one peculiarity: an interrupt request, once set, could not be cleared by program action. Clearing IE did not actually clear the interrupt request; more importantly, clearing RDY neither cleared nor blocked the interrupt request. The seeds for future confusion were sown.
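A minimal Python sketch of this variant makes the peculiarity concrete. It follows the prose description above (request latched on a rising edge of RDY while IE is set, output masked by IE only); the class name and methods are hypothetical and the gate-level details are not modeled.

```python
class RdyEdgeInterrupt:
    """Variant circuit: D = IE, clock = RDY, output masked by IE only."""

    def __init__(self):
        self.rdy = False
        self.ie = False
        self.intr = False

    def set_rdy(self, value):
        # Only a rising edge of RDY, with IE already set, latches a request.
        if value and not self.rdy and self.ie:
            self.intr = True
        self.rdy = value

    def set_ie(self, value):
        self.ie = value      # no edge detection on IE in this variant

    def clear(self):
        self.intr = False    # INIT or interrupt acknowledge only

    @property
    def request(self):
        return self.intr and self.ie


dev = RdyEdgeInterrupt()
dev.set_ie(True)
dev.set_rdy(True)            # request latched
dev.set_rdy(False)
assert dev.request           # clearing RDY neither clears nor blocks it
dev.set_ie(False)
assert not dev.request and dev.intr   # blocked, but still latched
dev.set_ie(True)
assert dev.request           # the old request reappears
```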

The RH70 and RH11

The Massbus controllers took the circuit variant described above, added an additional “feature”, and in doing so created something unique. Disk controllers have always had an issue in handling overlapped seeks on multiple units. Software would like to have an interrupt for each distinct operation; but to multiplex all the seek complete interrupts, and the controller complete interrupt, onto a single interrupt request line requires complex mechanisms like unit polling, as in the RK11. Without this mechanism, software must time overlapped seeks, as in the RL11.

The Massbus designers proposed a simple hardware-software combination to handle this problem. Each disk drive would have an “attention” (ATA) flag. Provided that the controller was enabled for interrupts and was not performing a data transfer, the controller would request an interrupt if any ATA flag was set. The software driver would have to clear the ATA flag of the drive requesting attention. Thus, ATA interrupts behaved like the level-sensitive interrupts of the PDP-8 and PDP-15.

To implement this additional class of interrupt, the Massbus controller simply OR’d the ATA interrupt requests with the output of the RDY interrupt circuit. But because the ATA request already included IE, and the RDY circuit gated IE, it omitted the final AND with IE:

                    +-----+     |--\
IE------------------|D   1|----- \  \
                    |     |       |  >--- INTERRUPT REQUEST
RDY-----------------|C    |  +-- /  /
                    +--^--+  |  |--/
                       |     |
ATA + RDY + IE --------|-----+
                       |
INIT + IAK ------------+

Now here was a circuit with truly peculiar properties. A transition of RDY from 0 to 1 with IE set latched an interrupt request. This request not only couldn’t be cleared by program action; it couldn’t be blocked by software action. That is, once the interrupt request was latched, clearing IE did not prevent the interrupt from happening! This inexcusable mistake didn’t even save gates; the correct implementation required only a 3-input AND gate in place of the 3-input OR gate:

    +---------------------------------------+
    |  +---------------------------------+  |
    |  |                                 |  |  |--\
    |  |      +-----+     |--\           |  +--|   |
IE--+--|------|D   1|----- \  \          +-----|   |-- INTR REQUEST
       |      |     |       |  >---------------|   |
RDY----+------|C    |  +-- /  /                |--/
              +--^--+  |  |--/
                 |     |
ATA -------------|-----+
                 |
INIT + IAK ------+

With this implementation, the RDY interrupt would have behaved like “classical” PDP-11 edge-sensitive interrupts, while ATA would have been a level-sensitive interrupt conditioned on RDY and IE.
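The difference between the two circuits comes down to the final gating term. The sketch below expresses both as Boolean functions; the function names are hypothetical, `intr` stands for the latched RDY flip-flop output, and the ATA term is taken from the text as the AND of ATA, RDY and IE.

```python
def rh_request_as_built(intr, ata, rdy, ie):
    """Latched RDY interrupt OR'd with the ATA term; no final gate on IE."""
    return bool(intr or (ata and rdy and ie))


def rh_request_corrected(intr, ata, rdy, ie):
    """Corrected form: (INTR OR ATA) AND RDY AND IE."""
    return bool((intr or ata) and rdy and ie)


# A RDY interrupt was latched while IE was set; software then clears IE.
latched = dict(intr=True, ata=False, rdy=True, ie=False)
assert rh_request_as_built(**latched) is True     # interrupt still happens
assert rh_request_corrected(**latched) is False   # interrupt is blocked
```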

DEC drivers didn’t utilize the peculiarities of the RH70 and RH11, but UNIX variants, such as Ultrix-11, did. Simulators attempting to model the RH series controllers cannot run Ultrix-11 without mimicking the behavior of its interrupt logic.

The DEUNA

The DEUNA shows that the advent of purpose-built IC’s did not fix the problem. The DEUNA replaced the last AND gate of the classic implementation with the DC013 Unibus grant controller chip:

                  +-----+   +-----+
       |--\  +5V--|D   1|---|     |
RDY ---|   |      |     |   |DC013|--- INTERRUPT REQUEST
       |   |------|C    |   |     |
IE  ---|   |      +--^--+   +-----+
       |--/          |
                     |
    INIT + IAK ------+

Unfortunately, the DC013 had no enable input, only a request input. Thus, once the DEUNA requested an interrupt, there was no programmatic way to remove it. As with the RH11/RH70, simulators attempting to model the DEUNA must model its incorrect interrupt behavior.

Acknowledgements

Joseph Young first diagnosed problems in the RH simulators running BSD 2.9 and 2.11 and pointed to the interrupt logic as the source. Tom Evans and Alan Baldwin brought the DEUNA example to my attention.

Bug, Feature, or Code Rot? Adventures in OS Debugging

Bob Supnik, 24-Mar-2002 (updated 26-Jan-2005)

Summary

In bringing up an old operating system on a simulator, the assumption must be that any problem is the simulator’s fault; after all, the operating system worked on real hardware. This assumption has not always proved to be true. Simulators on modern PC’s are often significantly faster than real hardware and thus may expose race conditions or timing bugs. Simulators may exercise code paths that, in late stage operating systems, were no longer used, such as full installs. Simulators may create configurations that were not practical, due to physical or financial limitations. Finally, simulators may present late stage operating systems with hardware configurations that, while nominally supported, could in practice no longer be tested.

Timing Problems

On modern PC’s, simulators for a computer architecture are often significantly faster than any real hardware that was ever built. PDP-10 simulators, for example, have been clocked at over 10 mips; the fastest DEC PDP-10 (the KL10) was 1.5 mips. Simulated devices are often much faster than their real counterparts. These speed changes can expose timing dependencies in operating system code.

A trivial case is a timing loop. Some software environments, such as console- or -based games, are very dependent on timing loops. Timing loops also occur in system bootstraps. For example, the VAX KA655 boot code uses delay loops executing directly from boot ROM to run “slowly” as a wait loop on clock ticks. Finally, timing loops show up frequently in diagnostics.

More subtle problems occur around interrupts. Operating system code often assumes that a large amount of time elapses between initiation of an I/O operation and receipt of the completion interrupt. If the interrupt occurs “too soon”, it may be misinterpreted or lost. All versions of the RSX11M+ MSCP driver prior to V4.5 had this problem in the initialization sequence. The VAX NetBSD driver has this problem during normal operations, as Kevin Handy documented in this note to the author:

Starting at '1->', we set up a mscp packet to put the drive online. At '2->' we ping the mscp controller to take a look at it's packets. And '3->' waits up to (100*100) time units for the controller to respond with an interrupt.

The problem, is that by the time it gets to '3->', the interrupt has already occurred and been processed. It's waiting for an interrupt that has already occurred, thus the timeout fails. You can see it by single stepping through the code (it suddenly jumps out of the sequence, putters around for a while, then jumps back in).

The CPU is expecting to have enough time to set up a timeout routine before it will get a response back. It's not expecting an instant response back. You need to delay the responses from your emulated controllers for instructions/microseconds, and then you will then get past this problem.

Ken Harrenstein found a similar problem in the disk driver in ITS.

OpenBSD has the opposite problem: if the interrupt occurs “too late”, it is lost, as I documented in this note to the OpenBSD maintainers:

1. In rx_putonline, the code apparently does an online and waits for an interrupt:

    /* Poll away */
    i = bus_space_read_2(mi->mi_iot, mi->mi_iph, 0);
    if (tsleep(&rx->ra_dev.dv_unit, PRIBIO, "rxonline", 100*100))
        rx->ra_state = DK_CLOSED;

2. But, in fact, tsleep is being short-circuited. Because of autoconfiguration, it simply opens and then closes a window for an interrupt to occur.

    s = splhigh();
    if (cold || panicstr) {
        /*
         * After a panic, or during autoconfiguration,
         * just give interrupts a chance, then just return;
         * don't run any other procs or panic below,
         * in case this is the idle process and already asleep.
         */
        splx(safepri);
        splx(s);
        if (interlock != NULL && relock == 0)
            simple_unlock(interlock);
        return (0);
    }

I can't follow the code in detail (the compiler optimization is very good), but none of the normal path processing (storing the arguments in the process data structure) occurs. SPL is raised to 1F, then lowered to 0, then put back where it was, and 0 is returned.

3. In the meantime, the RQ is loafing along, trying to simulate the qtime polling delay, which is set at 200. tsleep returns so quickly that only about 30 instruction times have elapsed. The packet hasn't been fetched, and no interrupt has occurred. The rx_putonline call fails.

4. A bit later, the OS goes through all this again, because rxopen finds that the state is still DK_CLOSED. It calls rx_putonline, etc. But the RQ is loafing around on the first poll. The OS gets into tsleep and out again, before the online packet has even been fetched. So the state remains DK_CLOSED, and panic ensues.

5. You can prove this trivially: set RQ QTIME to 100 before booting, and OpenBSD 3.5 comes right up.

So one answer is to set RQ QTIME down to 100, or 50, or whatever, declare victory and go home. But QTIME is set so high because operating systems had timing windows. In particular, RSX11M+ V3 requires a QTIME of at least 175, or it won't boot. And NetBSD has timing problems as well.

All of these didn't occur in 'real' life, because the CPU speeds versus the MSCP speeds were very different. The fastest PDP11 (the J11) was only about 2X the speed of the T11 in the RQDX3. But the slowest VAX was at least 3.5X, and the VAX I'm simulating is 10X faster. So the 'right' QTIME will vary, depending on the CPU/disk hardware combinations.

To address these issues, the SIMH MSCP simulator simulates ‘delays’ between initialization steps, and between initiation of an operation and completion. The delays have to be tuned experimentally to get the right values. For example, M+ requires at least 200 instructions between initialization steps, but RSTS/E can tolerate virtually no delay after the completion of step 4. PDP-11’s need “slow” operations, but VAXen need fast operations.
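The tuning problem can be pictured as a window per driver: the completion interrupt must arrive after the driver is ready for it and before the driver stops waiting. The Python sketch below is a toy illustration under that assumption; it is not how SIMH is implemented, and the function name, driver labels and window values are hypothetical.

```python
def interrupt_outcome(delay, armed_at, gives_up_at):
    """Classify a completion interrupt arriving `delay` instructions
    after the driver starts an operation.

    armed_at    -- instruction count at which the driver can accept it
    gives_up_at -- instruction count at which the driver stops waiting
    """
    if delay < armed_at:
        return "too soon"
    if delay > gives_up_at:
        return "too late"
    return "ok"


# Hypothetical windows, chosen only to illustrate the conflict.
drivers = {"needs-slow": (175, 10_000), "needs-fast": (0, 100)}
for delay in (50, 200):
    print(delay, {name: interrupt_outcome(delay, *window)
                  for name, window in drivers.items()})
```

When two drivers have disjoint windows, as in this made-up example, no single delay satisfies both, which is why the delay has to be a tunable parameter.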

Finally, the changed timing of the simulation environment may expose race conditions and bugs that have lain dormant in the code. RSX11M+ has a compound bug in which a coding error in MSCP device initialization is masked, on real hardware, by the outcome of a timing race condition. If the boot device is an MSCP disk, M+ routine RVEC brings up first the controller (routine $KRBSC) and then the boot disk (routine $UCBSC) by issuing three MSCP commands:

- Set controller characteristics
- Unit online
- Get unit status

There is a bug in the MSCP driver’s handling of the get unit status command. In the interrupt handler for command completion, routine RQRCT destroys the success status code and overwrites it with 0310 (bad block replacement needed). If the MSCP disk is ‘fast’, or the driver code paths are really long, the get unit status command completes before control returns to $UCBSC. $UCBSC sees an error status and marks the disk as offline, causing the bootstrap to fail. This is what happens on the simulator with M+ V3.0.

On the other hand, if the MSCP disk is slow, or the driver code paths tighter, control returns to $UCBSC while the get unit status is still in progress. The error status code is from the successful unit online, and $UCBSC marks the boot disk as online and returns. This is what happens on the simulator with M+ V4.0 or later, and, apparently, with real hardware.

Even with the timing race falling the ‘right’ way, it requires another bug to prevent routine RVEC from seeing the erroneous status code from the get unit status. When $UCBSC returns, RVEC sees that the unit online sequence is not complete and waits for the get unit status to set a final status code. When that status (the erroneous 0310) is set, it is ignored. RVEC only checks to see whether the disk is online. And the disk is online, because $UCBSC set status from the unit online command, rather than the get unit status command.

Interestingly, when the bug in RQRCT was addressed in M+ V4.0, the fix was incorrect, and the code continued to work only because of the timing race condition.

To get around this race condition, the SIMH MSCP simulator command completion delay must be tuned experimentally. M+ 3.0 requires at least 175 instructions between initiation and completion of a command.
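The interaction of the two bugs with the timing race can be summarized in a few lines. This Python sketch is a simplified reading of the description above, not a transcription of the M+ code; the function name and the `return_delay` parameter are hypothetical.

```python
SUCCESS, BAD_STATUS = 0, 0o310


def boot_disk_online(completion_delay, return_delay):
    """Return True if the boot disk ends up marked online.

    completion_delay -- instructions until get unit status completes
    return_delay     -- instructions until control returns to $UCBSC
    """
    status = SUCCESS                      # left by the successful unit online
    if completion_delay < return_delay:   # completion handler runs first...
        status = BAD_STATUS               # ...and RQRCT overwrites the status
    online = (status == SUCCESS)          # $UCBSC samples the status here
    # A later completion still stores 0310, but RVEC only tests `online`.
    return online


assert boot_disk_online(completion_delay=50, return_delay=100) is False
assert boot_disk_online(completion_delay=175, return_delay=100) is True
```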

One Good Bug Deserves Another

RSX11M+ was not the only PDP-11 operating system to benefit from bugs that cancelled each other out. RSTS/E’s tape boot for the TE16/TU45/TU77 drives (driver MM:) contains a code sequence that appears, on the surface, to be clearly incorrect. RSTS/E boots from tape by rewinding the tape and reading in the tape label (!), which contains the secondary bootstraps for the various drives. On an MM: drive, it issues rewind and then read through this subroutine:

MMCMD:  MOV     (R5)+,(R1)      ;Issue the command
10$:    TSTB    (R1)            ;Controller ready?
        BPL     10$             ;Nope, not yet.
        TST     RHDS(R1)        ;Drive ready?
        BPL     10$             ;Nope, keep waiting.

Now, this should not work. TST RHDS(R1) is not testing drive ready, it's testing formatter attention. This is clear from a different copy of the routine in the boot block:

110$:   TSTB    (R0)            ;WAIT FOR CONTROLLER READY
        BPL     110$            ;NOT YET
        TSTB    RHDS(R0)        ;IS DRIVE READY IN RHDS?
        BPL     110$            ;NO

This copy tests bit <7>, which is drive ready, rather than bit <15>, which is attention. So how did the boot block code ever work? The answer is that there's another bug in the code, namely, the call to MMCMD to read the tape:

        MOV     #157000-MMBOOT,RHBA(R1) ;Load MM boot driver
        MOV     #BOOTSZ,RHWC(R1)        ;Read our extended DOS label
        CALL    MMCMD,R5                ;Issue a command to the TM02/TM03
         .WORD  71                      ;The command is read

BOOTSZ is the size of the bootstrap in bytes, as defined by this structure:

.DSECT
        .BLKB   14.             ;Normal DOS label area
MUBOOT: .BLKB   1000            ;MU boot driver
MSBOOT: .BLKB   1000            ;MS boot driver
MMBOOT: .BLKB   1000            ;MM boot driver
MTBOOT: .BLKB   1000            ;MT boot driver
BOOTSZ:                         ;Length of the entire DOS label

But there are two problems in the calling sequence. First, RHWC is word count, not byte count. And second, it's supposed to be 2's complement. So MOV #BOOTSZ,RHWC(R1) is loading an absurdly long word count, and the record runs out before the Massbus WC does. In the real hardware, this sets frame count error (FCE), which sets attention; hence, the TST RHDS(R1) instruction exits the loop. This detail, which is not mentioned in the hardware documentation, must be simulated for RSTS/E to boot.
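The size of the error is easy to work out. The sketch below assumes the usual MACRO-11 conventions (a trailing dot marks a decimal number, otherwise octal) and a 16-bit two's-complement word count register; the Python names are hypothetical.

```python
LABEL_BYTES = 14                  # .BLKB 14.   (decimal)
DRIVER_BYTES = 0o1000             # .BLKB 1000  (octal), one per boot driver
BOOTSZ = LABEL_BYTES + 4 * DRIVER_BYTES   # record length in bytes


def words_requested(rhwc):
    """Words transferred for a 16-bit two's-complement word count."""
    return (-rhwc) & 0xFFFF


# What the code should have loaded: minus the length in words.
correct_rhwc = (-(BOOTSZ // 2)) & 0xFFFF

print("record length:", BOOTSZ, "bytes =", BOOTSZ // 2, "words")
print("correct RHWC :", oct(correct_rhwc))
print("words asked for by MOV #BOOTSZ,RHWC(R1):", words_requested(BOOTSZ))
```

Under these assumptions the record is 2062 bytes (1031 words), while loading BOOTSZ directly asks for 63474 words, so the record ends long before the word count is exhausted.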

Unused Paths

Simulator users routinely perform full installations of operating systems onto empty disks; indeed, a full installation is one of the litmus tests for simulator success. But in real life, this path might no longer be used or tested. DEC ceased production on DECsystem-10’s in the early 80’s but continued to update TOPS-10 through 1988. When the last release (7.04) came out, there were no new DECsystem-10’s requiring full installs, and the code path was insufficiently tested. And, in fact, it contains a bug. This problem burned Tim Stark during debug of TS10, as documented in this note to KLH10 author Ken Harrenstein:

There's also a bug that interferes with TOPS-10 7.04 from being built correctly from scratch; that was presumably not found because no one was doing clean installs in 1988. It has to do with enumerating magtape channels or units; the code's counting loop overflows from the MTCS2 formatter select field into the Unibus address inhibit, so that the next magtape read doesn't work. SIMH got away with it because I didn't implement address inhibit, but Tim Stark got burned in TS10 because he did. (He thought the driver required it.)

Because SIMH doesn’t accurately follow the hardware, it is, ironically, immune from this problem.

A more complex case is a magtape boot bug in TOPS-20 V4.1 for the KS10. The magtape bootstrap is read into low memory and then relocated to high memory for execution. For some reason, the move is done with EXCH instructions rather than conventional moves, thus replacing the low core image with the contents of high memory. The bootstrap contains the instruction WRCSTM [77B5]. After relocation of the bootstrap, the WRCSTM’s address is still pointing to low core, which has been overwritten. The WRCSTM writes garbage to the CSTM, and the boot fails, as documented in this note in alt.sys.pdp10:

The tape bootstrap moves itself into high memory with a routine that exchanges memory locations, rather than copies them. (I have no idea why.) The WRCSTM instruction in the boot references absolute address 40127, but that's been copied to high memory, and garbage (zero for the simulator) exchanged into its place. When paging is turned on, the simulator gets an age page fail error, because the CSTM is all 0's, and the age bit gets zeroed on the second page fill. Ugh. If I run the boot again, in the same core image, it works, because the contents of 40127 are already in high memory and are brought back to the right spot by the exchange.

How could such an obvious problem have been overlooked? One suggestion – that the tape bootstrap of V4.1 had simply not been tested on the KS10 – was indignantly rejected by veterans of the TOPS-20 group. They insisted that the code worked on a real KS10 CPU but could not explain how.

The answer, perhaps, lies in the observation that the bootstrap succeeds the second time, because the exchange moves a copy of the bootstrap back to low memory, and the WRCSTM retrieves the correct data. On a real KS10, the front-end console had a watchdog timer. If the main CPU failed to respond with a heartbeat in a given amount of time, the console would reboot the system – without disturbing memory. The second bootstrap would succeed. From the viewpoint of anyone debugging the bootstrap process on real hardware, there would be a small tape movement, a delay, a backspace, and then a normal boot. If the tape motion wasn’t noticed, the delay could be ascribed to self-test procedures in the front-end console or other “normal” delays. The system did boot; there was no need to look deeper.
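The "fails once, works the second time" behavior follows directly from relocating by exchange. The Python sketch below is a toy model: the addresses, sizes and image contents are hypothetical, and a single list element stands in for the WRCSTM operand.

```python
LOW, HIGH, SIZE = 0, 8, 4           # hypothetical addresses and size
IMAGE = [0o77, 0o11, 0o22, 0o33]    # stand-in bootstrap; word 0 is the operand


def load_and_boot(mem):
    """Read the bootstrap into low memory, relocate it by exchange, then
    fetch the operand through its original (low memory) address."""
    mem[LOW:LOW + SIZE] = IMAGE
    for i in range(SIZE):
        mem[LOW + i], mem[HIGH + i] = mem[HIGH + i], mem[LOW + i]
    return mem[LOW]


mem = [0] * 16                 # simulator memory starts out as zeros
first = load_and_boot(mem)     # low memory now holds the old high memory
second = load_and_boot(mem)    # high memory already held the image
assert first == 0              # garbage operand: the boot fails
assert second == IMAGE[0]      # exchange brings the operand back: boot works
```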

Impractical Configurations

In today's computers with megabytes of memory and gigabytes of storage, the largest configuration of a historic computer represents a tiny fraction of the available resources. Simulators can create configurations that for physical or financial reasons were impractical with real hardware. For example, the SIMH PDP-15 simulator supports an RF15 fixed head disk controller with up to 8 RS09 fixed head disks. In practice, no customer would buy that many fixed head disks; instead, the customer would buy an RP15/RP02 disk pack, which provided five times the storage at lower cost.

Apparently, the maximum RF15 configuration was never tested with the PDP-15's DOS-15 operating system. The predecessor operating system, ADSS-15, had been limited to 4 RS09 disks. DOS-15 increased this to 8, but the configuration was never tested, as Hans Pufal documented in a mail message:

The OS exits to IOPS error code 21 when it reaches a platter number 010. The problem is that with 8 platters there will never be a NED indication. I think the problem is in the OS code:

75072: CLA              ; set platter to 0
75073: IOT 7045         ; force controller idle, clear done
75074: IOT 0            ; padding
75075: IOT 0
75076: IOT 0

; Top of platter loop
75077: IOT 7065         ; set disk platter
75100: IOT 0            ; padding
75101: IOT 0
75102: DSSF             ; skip on error (NED)
75103: JMP 75106        ; jump if disk exists

75104: DSCD             ; clear status
75105: JMP 75113        ; found NED, AC equals number of platters

; Disk exists, inc disk and loop back if not done 8
75106: DSCD             ; clear status
75107: TAD 75401 = 001  ; add 1
75110: SAD 75722 = 010  ; compare with 010
75111: JMP 75231        ; jmp out if disks = 8
75112: JMP 75077        ; not 8 so go back for next disk

75113: DAC 75072        ; store # of platters
75114: SNA CLL          ; skip if AC = 0
75115: JMP 75231        ; jump to IOPS error

; Error path
75231: LAC 75131
75232: DAC* 75731
75233: LAW 21           ; IOPS number
75234: JMP 75240        ; go do IOPS error

I think the JMP at 75111 should be a SKP.

And indeed it should. A maximum RF15 configuration, impractically expensive at the time, was never tested.
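The control flow of the platter-sizing loop, with and without the suggested fix, can be paraphrased in Python. This is a reading of the listing above rather than a simulation of the PDP-15; the function name and the `fixed` flag are hypothetical.

```python
def count_platters(present, fixed=False, limit=8):
    """Return the platter count, or None if the IOPS 21 path is taken.

    present -- number of RS09 platters actually attached
    fixed   -- True models SKP at 75111 instead of JMP 75231
    """
    ac = 0                          # CLA
    while True:
        if ac >= present:           # NED on selecting platter `ac`
            break                   # JMP 75113
        ac += 1                     # add 1
        if ac == limit:             # compare with 010
            if fixed:
                break               # SKP: fall into 75113
            return None             # JMP 75231: IOPS error
    return ac if ac != 0 else None  # zero platters is also an error


assert count_platters(4) == 4
assert count_platters(8) is None            # as shipped: full config fails
assert count_platters(8, fixed=True) == 8   # with SKP at 75111
```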

Untestable Configurations

A simulator can mimic any implementation of a computer architecture. Further, it can implement an arbitrary assemblage of peripherals. This flexibility may significantly exceed the testing capabilities available to real developers in late stage operating systems. For example, the SIMH PDP-11 simulator emulates a KDJ11A CPU with a broad set of peripherals ranging from DECtape (out of production by the early 70s) to MSCP disks (still current in the early 90s). DEC in its heyday would have been hard pressed to assemble such an eclectic set of devices. Therefore, it is not surprising that by the late 90’s, the skeleton crew maintaining the PDP-11 operating systems could no longer test older hardware.

This problem is evident in the behavior of RSX11M+ V4.5 autoconfigure. V4.2 correctly identifies the simulator as an LSI-11/73 (KDJ11A CPU). But V4.5 identifies it as an “M11”, Mentec’s 1997 re-implementation of the J-11 in gate arrays. What happened?

M+ autoconfigure implements a series of tests that act as a sieve to eliminate classes of PDP-11 processors. When the tests are done, one and only one CPU model should be flagged. The tests are very fine grained, but the KDJ11A and M11 are almost identical. Both respond with MFPT = 5 and maintenance ID = 20. To distinguish them, the following code was added to autoconfigure in V4.5 (as disassembled by the simulator):

                                    ;;; PDR7 has W bit set
131640: MOV  @#172317,@#172317      ;;; write odd byte of kernel PDR7
131646: BIT  #100,@#172316          ;;; is W bit still set?
131654: BEQ  131664                 ;;; if eq no
131656: BIC  #200,R4                ;;; if ne yes, clear J11 bit (ie, it's an M11)
131662: BR   132124
131664: BIC  #20000,R4              ;;; if eq no, clear M11 bit (ie, it's a J11)
131670: BR   132056

This code sequence cannot work as written. On the KDJ11A, and presumably on the M11, the MOV instruction accesses an odd address and traps while fetching the source address. The trap handler simply RTI’s, and the third word of the MOV is executed as an instruction ADDF F3,(PC), which is harmless. Because the PDR is not actually written, the W bit isn’t cleared, and the CPU is always classified as an M11. What is going on?

The answer comes by comparison with the CPU identification code in routine SAVSIZ:

20$:    MOV     #KISDR7+1,R0    ;;;POINT TO KERNEL PDR7                 ;DC535
        MOVB    (R0),(R0)       ;;;WRITE THE HIGH BYTE OF THE PDR       ;DC535
        BITB    #100,-(R0)      ;;;DOES IT SHOW WRITTEN?                ;DC535
        BNE     60$             ;;; IF NE, YES, WE HAVE AN M11          ;DC535

This sequence will work. The MOVB doesn’t trap. On a KDJ11A, a write to the PDR clears the W bit, even if the PDR is mapping itself. On the M11, apparently, it does not.
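The two code sequences differ only in whether the write to the PDR actually happens. The Python sketch below encodes that reading of the text; the function name and arguments are hypothetical, and the M11 behavior is taken from the "apparently" above rather than from hardware documentation.

```python
W_BIT = 0o100


def classify(cpu, byte_access):
    """Return the CPU type the test concludes.

    cpu         -- the real processor, 'J11' (KDJ11A) or 'M11'
    byte_access -- True for the MOVB sequence, False for the word MOV
    """
    pdr7 = W_BIT                # the test starts with the W bit set
    wrote = byte_access         # a word MOV to an odd address traps instead
    if wrote and cpu == "J11":
        pdr7 &= ~W_BIT          # KDJ11A: a write to the PDR clears W
    return "M11" if pdr7 & W_BIT else "J11"


assert classify("J11", byte_access=False) == "M11"   # autoconfigure bug
assert classify("J11", byte_access=True) == "J11"    # SAVSIZ sequence
assert classify("M11", byte_access=True) == "M11"
```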

How did the bug in autoconfigure go undetected? One possibility is that autoconfigure was not tested. But a more compelling hypothesis is that the developer simply didn’t have a KDJ11A available for testing. The KDJ11A is a relatively rare survival as a system processor; most J11-based PDP-11 systems were built with the KDJ11B, D, or E processor modules. The developer tried the code on an M11, and it worked; he probably didn’t have a KDJ11A available to see that it didn’t.

A similar problem exists with the SIMH VAX simulator. It emulates a MicroVAX 3900 with a broad range of MSCP-attached storage devices: floppies, disks, CDROM’s, etc. Many of the devices had a brief lifespan and then disappeared. In particular, MSCP-attached CDROM’s were superseded by SCSI-attached CDROM’s. By the late 90’s, MSCP CDROM’s had disappeared, and their behavior could not be tested.

This problem showed up as a bug in the VMS 7.3 installation procedure. VMS 7.3 allowed a read-only copy of VMS to be booted from CD. On SIMH, this procedure did not work: the bootstrap process entered mount verification and hung forever. What happened?

Read-only bootstrapping is controlled by flag DEV$M_SWL in the DEVCHR word of the unit control block (UCB). If the system mount code detects that the boot device is write-locked, it sets DEV$M_SWL, preventing any writes to the boot disk. But the boot process never reaches system mount. Instead, it enters mount verification as a side effect of establishing a connection to an MSCP controller. Unless DEV$M_SWL is already set, mount verification treats the write-locked condition as an error, outputs an error message, and loops indefinitely waiting for the write-lock to be removed.

The SCSI class driver handles this problem by setting DEV$M_SWL if it detects write-lock in an information packet. The MSCP class driver says it sets DEV$M_SWL:

; RECORD_UNIT_STATUS - copy data from GET UNIT STATUS end message to UCB.
;
; Functional Description:
;
;       The supplied MSCP end message is analyzed and appropriate fields in
;       the UCB are filled in with information contained in the end message.
;
;       If the end message is shorter than MSCP$K_LEN, it is zero filled
;       to that length. This compensates for controllers possibly passing
;       some fields back as zeros by returning short messages. Then various
;       geometry parameters are copied or calculated from the geometry
;       information in the end message. If the basic cylinders/tracks/sectors
;       information produced by these calulations contains any zeros, a class
;       driver bugcheck is declared. Finally, the two write-locked bits are
;       tested. If either is set, the DEV$M_SWL bit is set. Otherwise, the
;       bit is not set.

Actually, it does something quite different:

        BICL    #UCB$M_MSCP_WRTP, -     ; Clear class driver write protected
                UCB$L_DEVSTS(R3)        ; flag.
        BITW    #,-                     ; software write protected?
                MSCP$W_UNT_FLGS(R2)
        BEQL    60$                     ; Branch if not write protected.
        BISL    #UCB$M_MSCP_WRTP, -     ; Else, set the class driver write
                UCB$L_DEVSTS(R3)        ; protected flag.
60$:

The driver is setting a flag in the device-dependent DEVSTS field, rather than in the device-independent DEVCHR field (the SCSI class driver sets both). Thus, mount verification regards the write-locked condition as an error and loops forever.
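Reduced to its essentials, the difference between the two class drivers is which flag gets set. The Python sketch below is a schematic of the behavior described in the text, not VMS code; the function name and the driver labels are hypothetical.

```python
def mount_verification_hangs(write_locked, driver):
    """Return True if mount verification would loop forever.

    driver -- 'scsi' or 'mscp' (labels used only in this sketch)
    """
    devchr_swl = False      # device-independent flag (DEV$M_SWL)
    devsts_wrtp = False     # device-dependent write-protect flag
    if write_locked:
        devsts_wrtp = True              # both drivers record this
        if driver == "scsi":
            devchr_swl = True           # only the SCSI driver sets DEV$M_SWL
    # Mount verification consults only the device-independent flag.
    return write_locked and not devchr_swl


assert mount_verification_hangs(True, "mscp") is True
assert mount_verification_hangs(True, "scsi") is False
```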

How did the bug go undetected? By the time read-only booting was added to VMS, all MSCP CDROM’s had long-since gone out of service. CDROM boot was tested with SCSI CDROM’s, and worked (because the SCSI driver sets DEV$M_SWL). It could not be tested with MSCP CDROM’s, and failed.

Conclusion

In debugging a simulator, 99% of all problems that occur in bringing up an operating system will be the simulator’s fault. Occasionally, the problem will be in the operating system itself. Operating systems contain timing dependencies that simulated devices break, and they may not have been tested against all possible hardware configurations. They contain canceling bugs that only work due to obscure details of the hardware. Late stage operating systems suffer from inadequate staffing, incomplete test facilities, and other limitations. The result is the introduction of bugs through coding mistakes or “code rot” (code breakage as a side effect of new features). Locating these problems, and tracing them to root causes, is one of the most difficult challenges in simulator debugging.

Acknowledgements

Once again, the Internet gang of historical computer enthusiasts played an indispensable role in the work documented in this paper. Doug Carman raised the initial alarms about RSX11M+’s behavior under simulation and provided access to critical sources. Robert Alan Byer showed that there were inconsistencies in several different versions of autoconfigure. Brian McCarthy, one of the stalwarts of M+ development, provided crucial insights into the autoconfigure algorithm. Tim Shoppa demonstrated the sensitivity of RSTS/E to processor and clock parameters, and John Dundas suggested how to work around the problems. Tim Stark uncovered the TOPS-10 7.04 and TOPS-20 4.01 bugs. Kevin Handy debugged the MSCP simulator issues in VAX NetBSD. Ken Harrenstein documented the ITS disk driver bug. Markus Weber and Mark Kettenis patiently coached me through OpenBSD debug. Hans Pufal recovered DOS-15 from archival DECtapes, restored it to operation, and found the RF15/RS09 bug. Andy Goldstein, stalwart of the VMS group since its inception, pointed out the controlling role of DEV$M_SWL in read-only booting and helped me trace why it was not getting set.

SIMH is on the web at http://simh.trailing-edge.com.

A Massbus Mystery, or, Why Primary Sources Matter, Even In Computer History

Bob Supnik, 24-Sep-2004

Summary

In preparing a simulator for the VAX-11/780, I discovered that all the extant printed documentation for the DEC RP04/RP05/RP06 controllers is incorrect. Further, VMS followed this error in its drivers, creating a latent bug that has been present since the first release of the operating system in 1977.

Background: The Massbus

The Massbus is a simple, 16b, high-speed interconnect between a CPU host adapter and one or more mass storage devices. DEC created the Massbus in the early 1970’s, to provide a CPU-to-mass-storage interconnect that was faster than the Unibus. The Massbus was implemented in the PDP-11/70 (via the RH70 host bus adapter) and the DECsystem-20 (via the RH20 host bus adapter). The Massbus was the primary storage interconnect on the VAX-11/780 (via the RH780 host bus adapter). Massbus storage could also be connected to Unibus PDP-11’s (via the RH11 host bus adapter).

The Massbus implemented a very simple command and control structure between the host bus adapter and devices. The host adapter maintained the address and word count (DMA) logic. It communicated with the devices via register reads and writes. The host adapter mapped host addresses either to internal (adapter) register offsets or to external (device) register offsets. On the PDP-11, this mapping was quite complicated, and a mapping PROM was used between host addresses and register offsets; on the VAX, it was very simple, with different partitions of the adapter’s address space being used for internal offsets and external offsets.

RP vs RM, VAX vs PDP-11

The issue at hand arose in trying to understand how this mapping actually worked across different Massbus storage devices, particularly the RP and RM families of removable disk packs. The RP04/05/06 family came first, based on buyout Memorex drives; the RM03/RM05/RP07 family (yes, the RP07 was an RM, despite the name) came later, based on buyout CDC and ISS drives. According to the maintenance manuals for the respective drive families [1,2], the internal register offsets were not quite the same:

Offset (decimal)   RP family   RM family

 0                 CS0         CS0
 1                 DS          DS
 2                 ER1         ER1
 3                 MR          MR
 4                 AS          AS
 5                 DA          DA
 6                 DT          DT
 7                 LA          LA
 8                 ER2         SN
 9                 OF          OF
10                 DC          DC
11                 CC          HR
12                 SN          MR2
13                 ER3         ER2
14                 EC1         EC1
15                 EC2         EC2

Because the RH780 didn’t map the external registers in any way, this difference was also reflected in the VMS driver.

But the PDP-11 (RH70/RH11), which did map the registers, showed a different picture:

Address (octal)   RP family   RM family

176700            CS0         CS0
176702            RH BA       RH BA
176704            RH WC       RH WC
176706            DA          DA
176710            RH CS2      RH CS2
176712            DS          DS
176714            ER1         ER1
176716            AS          AS
176720            LA          LA
176722            RH DB       RH DB
176724            MR          MR
176726            DT          DT
176730            SN          SN
176732            OF          OF
176734            DC          DC
176736            CC          HR
176740            ER2         MR2
176742            ER3         ER2
176744            EC1         EC1
176746            EC2         EC2

The correspondence between RP and RM registers in the VAX and in the PDP-11 is identical, except for RP SN, RM SN, RP ER2, and RM MR2. If the VAX offsets were correct, then somehow the RH70/RH11 was mapping 176730 to offset 12 on the RP and offset 8 on the RM, and 176740 to offset 8 on the RP and offset 12 on the RM. How could this be?

First Hypothesis: Magic In the RH70/RH11

My first hypothesis was that, somehow, the RH70/RH11 was generating different mappings for the RP and RM drive families. Because this mapping was done with a PROM [3], this hypothesis implied that the RH70/RH11 had to be customized for the drive type. Further, RP and RM drives could not be mixed on the same PDP-11 Massbus controller, because addresses 176730 and 176740 would be incorrectly mapped if the drive type and controller PROM didn’t match.

A Beautiful Theory vs Ugly Facts

This hypothesis was quickly overwhelmed by evidence from developers and users. A typical response was from Paul Koning, from RSTS/E development.

“I'm somewhat puzzled about all this but I know for a fact that we supported mixed configs, and we had all sort of odd mixes on our development machines. Remember DECnet host ARK? It was called that because it had "two of everything". (Eventually that was more than was possible, but it had an amazing collection even so. I distinctly remember RP04, RP06, RM03, and RP07 disks mixed on the two RH70s.)”

A perusal of the RSTS/E driver showed that of the suspect registers, SN and MR2 were never accessed, and ER2 was only accessed if the drive was known to be an RP. So even with scrambled numbering, mixed strings would work, provided the RH70/RH11 always used an RP-style mapping.

TOPS-10 and TOPS-20 developers were even firmer: mixed configurations were not only supported but routine. The only restriction was that disks and tapes could not be mixed on the same Massbus adapter. The TOPS-10 and TOPS-20 drivers accessed SN and ER2 and expected them to be in their proper Unibus locations.

Further, the TOPS-10 and TOPS-20 drivers for the RH20 (which, like the RH780 on the VAX, did not map device offsets) stated that the RP offset for SN was 8, not 12, and for ER2 was 12, not 8.

TOPS-10 and TOPS-20 ran with real hardware; so too did VMS. Who was right?

Back To The Primary Source

At this point, the only remaining option was to consult the primary source for computer designs: the schematics. Fortunately, the schematics for the RP04/RP05/RP06 were online, in Al Kossow’s invaluable archive. The schematics provided the answer.

The RP04/05/06 implemented register decoding with a 74154 4:16 demultiplexor. The selects were laid out in numeric order, with 8 = SN and 12 = ER2 [4]. There were no jumpers or select swizzling logic, before or after the demultiplexor. The PDP-11, TOPS-10, and TOPS-20 were right. The RH70/RH11 needed only one, consistent mapping between host addresses and Massbus offsets. The RP maintenance manual, and the VMS driver, were wrong.
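The effect of the two register orders can be checked mechanically. The following Python sketch is only an illustration built from the two tables above; the dictionary and function names are hypothetical, and the RH-internal registers (BA, WC, CS2, DB) are left out because they never reach the drive.

```python
# RP-family drive registers at their Unibus addresses (RP column of the
# PDP-11 table). The last entry assumes the addresses run consecutively.
UNIBUS_RP = {
    0o176700: "CS0", 0o176706: "DA",  0o176712: "DS",  0o176714: "ER1",
    0o176716: "AS",  0o176720: "LA",  0o176724: "MR",  0o176726: "DT",
    0o176730: "SN",  0o176732: "OF",  0o176734: "DC",  0o176736: "CC",
    0o176740: "ER2", 0o176742: "ER3", 0o176744: "EC1", 0o176746: "EC2",
}

# Massbus offsets 0..15 in the order decoded by the drive (schematics),
# and in the order printed in the RP maintenance manual.
SCHEMATIC_ORDER = ["CS0", "DS", "ER1", "MR", "AS", "DA", "DT", "LA",
                   "SN", "OF", "DC", "CC", "ER2", "ER3", "EC1", "EC2"]
MANUAL_ORDER = ["CS0", "DS", "ER1", "MR", "AS", "DA", "DT", "LA",
                "ER2", "OF", "DC", "CC", "SN", "ER3", "EC1", "EC2"]


def massbus_offsets(order):
    """Return {unibus_address: massbus_offset} implied by a register order."""
    return {addr: order.index(name) for addr, name in UNIBUS_RP.items()}


schematic = massbus_offsets(SCHEMATIC_ORDER)
manual = massbus_offsets(MANUAL_ORDER)

# Addresses whose Massbus offset depends on which document is believed.
differing = sorted(a for a in UNIBUS_RP if schematic[a] != manual[a])
print([oct(a) for a in differing])
```

Only the SN and ER2 addresses differ between the two orders, which is why a driver that never touches those two registers cannot tell the documents apart.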

The Smoking Gun; And An Explanation

So how did VMS work? The answer couldn’t be simpler: although the register offsets for SN and ER2 were defined, they were never used. It didn’t matter that they were wrong; it was only a problem in the comments, not in functional operation. The definitions were probably copied over on “day 1” from an incorrect document (like the maintenance manual) and never changed.

As confirmation, the VMS error logging facility (ERF) differed from the driver. The error logger stored the RP registers in numeric order, lowest to highest, and then defined the following data structure to access the resulting information (I’ve added the implicit register offsets to make the correspondence clearer):

{
{ RP04/5/6/7 Disk Device Error Sub-packets
{

Aggregate RP0X_DE_PKT structure prefix RP0X_DE$;
    MBA_REGS structure longword unsigned dimension 7;  /* MBA adapter registers
        MBA_CNFG longword unsigned;       /* Configuration register (RH780)
        MBA_CNTRL longword unsigned;      /* Control register
        MBA_STAT longword unsigned;       /* Status register
        MBA_VA longword unsigned;         /* Virtual address register
        MBA_BYTE_CNT longword unsigned;   /* Byte count register
        MBA_FNL_MAP longword unsigned;    /* Final map register
        MBA_PRE_MAP longword unsigned;    /* Previous map register
    End MBA_REGS;
 0  CSR1 longword unsigned;               /* RP04/5/6/7 control/status reg.
 1  DRV_STAT longword unsigned;           /* RP04/5/6/7 drive status register
 2  ERROR1 longword unsigned;             /* RP04/5/6/7 error register
 3  MAINT longword unsigned;              /* RP04/5/6/7 maintenance register
 4  ATTN_SUM longword unsigned;           /* RP04/5/6/7 attention summary reg.
 5  D_ADDR longword unsigned;             /* RP04/5/6/7 desired address reg.
 6  DRV_TYP longword unsigned;            /* RP04/5/6/7 drive type register
 7  LOOK_AHEAD longword unsigned;         /* RP04/5/6/7 look ahead register
 8  SN longword unsigned;                 /* RP04/5/6/7 serial number register
 9  OFFSET longword unsigned;             /* RP04/5/6/7 offset register
10  D_CYL longword unsigned;              /* RP04/5/6/7 desired cylinder addr.
11  CUR_CYL longword unsigned;            /* RP04/5/6/7 current cylinder addr.
12  ERROR2 longword unsigned;             /* RP04/5/6/7 error register 2
13  ERROR3 longword unsigned;             /* RP04/5/6/7 error register 3
14  ECC1 longword unsigned;               /* RP04/5/6/7 ECC position register
15  ECC2 longword unsigned;               /* RP04/5/6/7 ECC pattern register
End RP0X_DE_PKT;

The error logger, which certainly did care about the definitions of SN and ER2, had them in the correct (i.e., schematic) order.

Conclusions

In an article on SIMH in ACM Queue, I wrote,

“As in most forms of historical research, primary sources (schematics, microcode listings, and maintenance documentation) are best; secondary sources such as handbooks, marketing material, textbooks, and even user manuals cannot be trusted.” [5]

As this Massbus mystery illustrates, the definition of primary sources has to be narrowed further: even maintenance manuals cannot be trusted. Errors can pile on errors over time: user manuals from maintenance manuals, drivers from user manuals, etc. And only reference to the schematics can unravel a 25+ year old error.

Acknowledgements

Paul Koning, Fred Van Kempen, Mark Crispin, Dave Carroll, and Joe Smith all provided personal evidence that knocked down my first hypothesis and eventually led me to look at the RP schematics. Andy Goldstein confirmed my analysis of the VMS drivers. Al Kossow’s multi-year project to scan schematics and manuals and put them online made the entire effort possible.

References

[1] Digital Equipment Corporation, “RP05/RP06 Device Control Logic Maintenance Manual”, EK-RP056-MM-01, December 1975

[2] Digital Equipment Corporation, “RM Massbus Adapter Technical Description Manual”, EK-RMADA-TD-001, October, 1980

[3] Digital Equipment Corporation, “RH11B Field Maintenance Print Set”, MP00382, schematic BTCA, Bus Control page 1

[4] Digital Equipment Corporation, “RP04/05/06 Field Maintenance Print Set”, MP-00086, schematic RG5, Register Logic page 5

[5] Bob Supnik, “Simulators: Virtual Machines of the Past (and Future)”, ACM Queue, Vol 2 No 5, July/August 2004

The Case Of The Missing PLA Term, or, Microcode Bugs I Have Known
Bob Supnik, 24-Sep-2004

Introduction

Perusing a recently scanned copy of a later PDP-11 system manual (the PDP-11/84), I was surprised to see two entries in the “PDP-11 differences” list that I had never seen before:

55. The ASH instruction with a source operand of octal 37 (shift left 31 decimal times) will cause the register to be shifted right instead of left.

56. The ASHC instruction with an octal value of 37 (shift left 31 decimal times) in source operand bits 5:0 and bits 15:6 of the operand being non-zero, will cause the register to be shifted right instead of left.

Both new entries had a single check mark in the column for the J-11. These weren’t “differences”, these were bugs in the microcode that had been discovered too late to be fixed before the J-11 was in general release.

Microcode bugs were an inherent risk in a ROM-based VLSI microprocessor. Large-scale microprogrammed machines used PROM chips, which could be replaced, or RAM chips, which could be reloaded, for control store. But microprocessor control stores, once fabricated, were fixed for all time.

This paper documents the microcode bugs in DEC microprocessors that got out into general release.

Bad-Mannered Testing: F-11 MULP/DIVP

The Commercial Instruction Set (CIS) was a late addition to the PDP-11 architecture. Modeled after the commercial instructions in the VAX, CIS was intended to boost the PDP-11’s COBOL performance. It provided an extensive set of string and decimal instructions; indeed, its capabilities were more complete than its VAX counterparts.

The F-11 (11/23) microprocessor was the first PDP-11 to implement CIS. CIS was a relatively late addition to the project (always a danger sign). The F-11 didn’t offer much hardware support for CIS: a decimal adjust microinstruction was about the extent of its capabilities. As a result, the microprograms for CIS were large. CIS required six extra microcode control store chips (six times the size of the base instruction set). MULP and DIVP required two chips just by themselves – as much as the entire floating point instruction set.

Almost all of the decimal instructions were structured as loops that counted down by bytes or nibbles. These loops were always interruptible; if an interrupt occurred, the instruction was “packed up” into the general registers, and a flag in the PSW was set. A typical loop would round the nibble count to even, divide by two, and then count down by one, testing against zero for end of loop. However, a few loops simply rounded and counted down by two, without dividing. This worked fine, provided that the instruction wasn’t interrupted, or that the interrupt-level program didn’t tamper with the state saved in the registers.

All was well until the code was tested with a program called BADMAN. Intended to flush out latent microcode bugs, BADMAN deliberately mangled the state saved in the general registers, set the PSW flag, and tried to see what would happen. When BADMAN set the saved loop count of one of these count-by-two loops to an odd value, the microprogram ran forever. It was still interruptible (control was never lost), but the instruction never completed.
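The failure mode is easy to reproduce in miniature. The Python sketch below is not the F-11 microcode; it assumes a 16-bit saved count and uses hypothetical function names, with a step limit standing in for "ran forever".

```python
def resume_count_by_two(saved_count, max_steps=100_000):
    """Resume a loop that counts down by two and tests for exactly zero.

    Returns the number of iterations, or None if the step limit is hit.
    The 16-bit register width is an assumption of this sketch.
    """
    count = saved_count & 0xFFFF
    steps = 0
    while count != 0:
        if steps == max_steps:
            return None                  # never reached zero
        count = (count - 2) & 0xFFFF     # an odd value stays odd forever
        steps += 1
    return steps


def resume_count_by_one(saved_count):
    """Resume a loop whose count was halved up front and counts down by one."""
    count = saved_count & 0xFFFF
    steps = 0
    while count != 0:
        count -= 1
        steps += 1
    return steps


print(resume_count_by_two(6))   # an even saved count terminates
print(resume_count_by_two(7))   # a mangled, odd saved count does not
print(resume_count_by_one(7))   # the halved form terminates for any value
```

Because the halved form decrements by one, every saved value reaches zero; the count-by-two form depends on the saved value staying even, an invariant that only tampering could break.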

This problem was found after the F11 CIS option shipped, and after DEC’s interest in CIS for the PDP-11 had waned. It was never fixed.

The Case of the Missing PLA Term: J-11 ASH/ASHC

Like its predecessors the LSI-11, F-11, and T-11, the J-11 used an elegant control store structure that contained both ROM words and PLA terms. The PLA provided great flexibility and conciseness in implementing the nearly (but not quite) regular instruction set of the PDP-11. A base micro-operation that depended only on the PDP-11 opcode, e.g.,

ADD:                            ; Do ADD.
=10********0
        PLA0 [^0 111 0X0, ^0 110 XXX XXX XXX XXX]
        ADD.W* [RF, RE],        ; Override: RF to RSRC if Source Mode 0,
                                ; RE to RDST if Destination Mode 0,
        NAF/NOP-PF              ; NAF to ID1 (477) if Dest. Modes 00-06

would be modified by other PLA terms reflecting instruction modes:

DOP-SM0:                        ; Turns on at DOP execution.
=10********0                    ; Register select override for Source
                                ; Mode 0.
        PLA0 [^0 111 0X0, ^X XXX 000 XXX XXX XXX]
        NOP.B [RSRC, RE],       ; Override RF to RSRC (PDP11 register).
        NAF/1777                ; Do not affect NAF.

All the PLA terms that were selected drove their outputs, which ‘wire ANDed’ together. Thus, any particular part of the base micro-instruction (source register, destination register, next address) could be overridden; and the overrides would apply to all base micro-instructions with appropriate addresses.

The PLA decode mechanism was so powerful that in the J11, its use was extended from decoding instructions to decoding arbitrary information. The microcode could load the PLA input register (PIR) from the data path. This technique was faster than testing bits and branching and was extensively used in computationally intensive instructions such as floating point, ASH, and ASHC.

The basic algorithm for ASH was to use PLA overrides to modify a load counter instruction based on bits<5:0>:

ASH1:                           ; Default for EXTRX-X, ASHX-0-OVR and ASHX-NEG-1 PLAs.
=10********0
        PLA0 [^1 011 110, ^X]
        LCNTR.W [037, RE],      ; Load CNTR with shift count.
        NAF/ASH-R               ; Overridden to NOP-PF1 for no shift or
                                ; to ASH-L for positive shift or
                                ; to ASH-R1 for negative 1 shift.

These overrides were used throughout the microcode to “extract” the value in the PIR and use the value to change microcode constants.

ASHX-0-OVR:                     ; SRC<5:0> = 0. No shift.
        PLA0 [^1 01X 110, ^X XXX XXX XXX 000 000]
        NOP.W [RF, RF],         ; Do not override microinstruction.
        NAF/ASHC-NO             ; Override to NOP-PF1 for ASH or
                                ; to ASHC-NO for ASHC.

EXTR0-0:                        ; Override on 0 in PIR<5> and PIR<0>.
        PLA0 [^1 011 1XX, ^X XXX XXX XXX 0XX XX0]
        LBIS.B [336, RF],       ; Override literal<5,0> (SPL, ASHX).
        NAF/1775                ; Override NAF<1> for left shift (ASH and ASHC)

EXTR0-1:                        ; Override on 0 in PIR<5> and PIR<1>.
        PLA0 [^1 011 1XX, ^X XXX XXX XXX 0XX X0X]
        LBIS.B [275, RF],       ; Override literal<6,1> (SPL, ASHX).
        NAF/1775                ; Override NAF<1> for left shift (ASH and ASHC)

EXTR0-2:                        ; Override on 0 in PIR<5> and PIR<2>.
        PLA0 [^1 011 1XX, ^X XXX XXX XXX 0XX 0XX]
        LBIS.B [173, RF],       ; Override literal<7,2> (SPL, ASHX).
        NAF/1775                ; Override NAF<1> for left shift (ASH and ASHC)

EXTR0-3:                        ; Override on 0 in PIR<5> and PIR<3>.
        PLA0 [^1 011 11X, ^X XXX XXX XXX 0X0 XXX]
        LBIS.B [367, RF],       ; Override literal<3>.
        NAF/1775                ; Override NAF<1> for left shift (ASH and ASHC)

EXTR0-4:                        ; Override on 0 in PIR<5> and PIR<4>.
        PLA0 [^1 011 11X, ^X XXX XXX XXX 00X XXX]
        LBIS.B [357, RF],       ; Override literal<4>.
        NAF/1775                ; Override NAF<1> for left shift (ASH and ASHC)

ASHX-NEG-1:                     ; SRC<5:0> = -1. One right shift only.
        PLA0 [^1 01X 110, ^X XXX XXX XXX 111 111]
        NOP.W [RF, RF],         ; Do not override microinstruction.
        NAF/ASHC-R1             ; Override to ASH-R1 or ASHC-R1.

EXTR1-0:                        ; Override on 1 in PIR<5> and PIR<0>.
        PLA0 [^1 011 11X, ^X XXX XXX XXX 1XX XX1]
        LBIS.B [376, RF],       ; Override literal<0>.
        NAF/1777                ; Do not override NAF.

EXTR1-1:                        ; Override on 1 in PIR<5> and PIR<1>.
        PLA0 [^1 011 11X, ^X XXX XXX XXX 1XX X1X]
        LBIS.B [375, RF],       ; Override literal<1>.
        NAF/1777                ; Do not override NAF.

EXTR1-2:                        ; Override on 1 in PIR<5> and PIR<2>.
        PLA0 [^1 011 11X, ^X XXX XXX XXX 1XX 1XX]
        LBIS.B [373, RF],       ; Override literal<2>.
        NAF/1777                ; Do not override NAF.

EXTR1-3:                        ; Override on 1 in PIR<5> and PIR<3>.
        PLA0 [^1 011 11X, ^X XXX XXX XXX 1X1 XXX]
        LBIS.B [367, RF],       ; Override literal<3>.
        NAF/1777                ; Do not override NAF.

EXTR1-4:                        ; Override on 1 in PIR<5> and PIR<4>.
        PLA0 [^1 011 11X, ^X XXX XXX XXX 11X XXX]
        LBIS.B [357, RF],       ; Override literal<4>.
        NAF/1777                ; Do not override NAF.

This sequence used the PLA mechanism elegantly and efficiently to decode both the amount and direction of the shift. For example, if the instruction being executed was ASH #10,Rn, then PLA overrides EXTR0-0, EXTR0-1, EXTR0-2, and EXTR0-4 would be selected. The load counter literal would be modified to

        037 & 336 & 275 & 173 & 357 = 010

and the next address field (NAF) would be modified to left shift. If the instruction being executed was ASH #76,Rn, then PLA overrides EXTR1-1, EXTR1-2, EXTR1-3, and EXTR1-4 would be selected. The load counter literal would be modified to

        037 & 375 & 373 & 367 & 357 = 1

since right shift expected a 1’s complement result. There was even a special case for ASH #77, to account for the expected 1’s complement counter value.

But what happened if no PLA term was selected? If the operand value was 037, then none of the left extract, right extract, or special case overrides was selected. The load counter instruction was unmodified. It loaded the correct value (37) but failed to override the next address field. A right shift was executed, by default, instead of a left shift.
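The override arithmetic can be replayed in a few lines. This Python sketch is a simplified model, not the J-11 microcode: it ANDs the listed LBIS literals into the base literal 037, treats the two special-case terms as taking precedence, and reduces the next address field to a direction string. The function name and the `with_missing_term` flag are hypothetical.

```python
BASE_LITERAL = 0o037                                   # LCNTR.W [037, RE]
EXTR0_MASKS = (0o336, 0o275, 0o173, 0o367, 0o357)      # EXTR0-0 .. EXTR0-4
EXTR1_MASKS = (0o376, 0o375, 0o373, 0o367, 0o357)      # EXTR1-0 .. EXTR1-4


def decode_ash(src, with_missing_term=False):
    """Return (counter literal, shift direction) for ASH source bits <5:0>."""
    pir = src & 0o77
    if pir == 0:
        return 0, "none"                # ASHX-0-OVR: no shift
    if pir == 0o77:
        return 0, "right-one"           # ASHX-NEG-1: one right shift only
    literal = BASE_LITERAL
    direction = "right"                 # default NAF/ASH-R
    bit5 = (pir >> 5) & 1
    for n in range(5):
        bit_n = (pir >> n) & 1
        if bit5 == 0 and bit_n == 0:    # EXTR0-n selected
            literal &= EXTR0_MASKS[n]
            direction = "left"          # NAF<1> overridden for left shift
        elif bit5 == 1 and bit_n == 1:  # EXTR1-n selected
            literal &= EXTR1_MASKS[n]
    if with_missing_term and pir == 0o37:
        direction = "left"              # the extra special-case term
    return literal & 0o37, direction


for operand in (0o10, 0o76, 0o37):
    literal, direction = decode_ash(operand)
    print(oct(operand), oct(literal), direction)
```

In this model every left count except 037 selects at least one EXTR0 term, so 037 is the only operand that keeps the default right-shift address.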

Another special case term was needed, to account for ASH #37,Rn:

ASHX-L-37:                      ; SRC<5:0> = 011111.
        PLA0 [^1 01X 110, ^X XXX XXX XXX 011 111]
        NOP.W [RF, RF],         ; Do not override microinstruction.
        NAF/1775                ; Override NAF<1> for left shift (ASH and ASHC)

ASHC used a slightly different algorithm. It loaded the counter as well as the PIR from the data path and only used the extracts if source operands bits 15:6 were non-zero. Thus, ASHC only showed the effect of the missing PLA term if source operand bits 15:6 were non-zero.

How did these cases get overlooked? The PDP-11 never had a formal architectural exerciser like AXE for the VAX. Machines were verified by running diagnostics, hand tests, and system software. The two failing cases were fairly meaningless. ASH #37,Rn should clear Rn. With the bug, ASH #37,Rn would clear Rn for positive values, and set Rn to –1 for negative values. ASHC #xxxx37,Rn, where xxxx was non-zero, would be a meaningless looking instruction, because the shift count would appear out of range.

Whatever the reason, the bugs weren’t found until long after the J11 shipped. By then, it was too late to recall the thousands of chips in the field. The bugs became “differences”, indelible markers of the J11.

Always Check The Edge Cases: MicroVAX Passive Release

In June, 1985, the MicroVAX II system had been launched to great acclaim, when I got a call from the system group’s services leader. A MicroVAX II system at a customer site was failing unpredictably but regularly. Running diagnostics, swapping boards, and other standard service procedures had failed to find or cure the problem. Could I help?

I went to the customer site with personnel from the systems group. The first clue was that the problem always occurred in the immediate vicinity of a MOVCx instruction. The second was that the system had a third-party Qbus to Unibus converter: a configuration that DEC didn’t support and therefore had never tested. Looking at the setup with a scope, we determined that the problem occurred on a Unibus passive release (an interrupt cycle that was never completed). Why?

The interrupt flows for interruptible instructions like MOVCx and POLYx were very convoluted.

1. The interruptible instruction set a global flag to indicate interruptibility.
2. The interruptible instruction periodically tested for an interrupt; if one was pending, the microcode branched to the interrupt “fault” entry.
3. The interrupt fault handler called a cleanup routine. If the global flag was set, the fault handler called an instruction specific cleanup routine.
4. For MOVCx, the instruction specific fault handler packed up instruction state into the general registers, set PSL<FPD> (first part done), and returned to the main interrupt flows.

The main interrupt flow issued a bus cycle to read the vector. If the bus cycle was aborted, or if the returned vector was zero, the interrupt flow simply exited to instruction decode. Instruction decode fetched the instruction, saw that PSL<FPD> was set, called the instruction-specific restart routine, and the microcode was back in business. Interruptibility had been extensively tested. Why was passive release failing?

The key was in that ambiguous phrase, “fetched the instruction”. MicroVAX, like all VAXen, had an instruction prefetch mechanism. When the microcode reached its execution flows, the prefetch mechanism was already pointing at the next instruction. All paths through the interrupt and exception microcode set a new PC, resetting the prefetch mechanism, except for passive release. Passive release just resumed execution. Therefore, the instruction that was fetched was the next instruction, not the interrupted instruction. If that instruction was not interruptible, PSL<FPD> was ignored. The MOVCx never completed, the registers were in the wrong state, and the program crashed.
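A toy model makes the sequence concrete. The Python sketch below is a deliberate simplification with hypothetical addresses and names: a serviced interrupt is collapsed to "a new PC is set and execution eventually returns to the interrupted instruction", while passive release leaves the prefetch pointer where it was.

```python
def resume_after_interrupt(passive_release):
    """Return (opcode decoded next, whether the MOVCx was restarted)."""
    program = {0x1000: "MOVC3", 0x1004: "ADDL2"}   # hypothetical code stream
    interruptible = {"MOVC3"}

    interrupted_pc = 0x1000
    prefetch_pc = 0x1004      # prefetch already points past the MOVC3
    fpd = True                # state packed up, first-part-done flag set

    if not passive_release:
        # Every other path sets a new PC, which resets the prefetcher;
        # the return from the handler lands on the interrupted instruction.
        prefetch_pc = interrupted_pc

    opcode = program[prefetch_pc]
    restarted = fpd and opcode in interruptible
    return opcode, restarted


print(resume_after_interrupt(passive_release=False))
print(resume_after_interrupt(passive_release=True))
```

In the passive release case the decoder sees the following instruction with the flag still set, so the packed-up MOVCx state is never consumed.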

Even though Qbus devices did not generate passive releases, the problem was regarded as serious enough to ECO every MicroVAX system in the field. A microcode revision was hurriedly generated and tested, new parts fabricated, and systems in the field upgraded.

The moral of this story was the importance of edge-case testing. Dynamic conditions, such as interrupts, DMA, and halts, were difficult to incorporate into architectural testing. These conditions needed to be tested explicitly, whether they were “impossible” or not.

There is another microcode bug in MicroVAX. It has never been seen in the field; and I’ll never tell where it is ;)

If An Exception Case Is Never Tried, Is It Really There? Rigel INSV

Rigel systems had been in the field for two years when testing with a new microcode verification tool, MAX, turned up a bug. In an INSV to a register, if the position operand was a reserved operand (that is, > 31), and the equation 32 - size - position caused integer overflow, the INSV instruction would be treated like a NOP instead of causing a reserved operand fault. For this to happen, position (a longword) had to be 800000xy (hex), where xy <= 32 - size. For example:

        clrl    r3                  ; source
        movl    #^x8000000F,r4      ; position
        movl    #^x3,r5             ; size
        movl    #^x7004,r6          ; base
        insv    r3,r4,r5,r6         ; should fault

The INSV should have generated a reserved operand fault. Instead, it was treated as a NOP, and the next instruction executed normally.

The problem occurred because of an insufficiently constrained n-way branch:

INSV.RMODE.1.255:
        ;------;                                ; sc<5> = 0:
        [MD.T3] <-- [MD.T3] RROT (SC),          ; [6] rotate field
                                                ; surround by position
        CASE [WBUS.NZV] AT [INSV.RMODE.1]       ; case on 32 - (pos +
                                                ; size) test from [3]

;= ALIGNLIST 01** (INSV.RMODE.1, INSV.RMODE.2)
; WBUS.NZVC set by subtract of bytes in longword --> V = 0

The alignlist was insufficiently constrained; it assumed the third bit (corresponding to the overflow condition code) was a don’t care, that is, the bit would always be zero. In fact, the subtract was a longword subtract, because position was a longword parameter, and integer overflow could occur. By chance, the microcode instruction selected by an errant branch was benign, resulting in an effective NOP.
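The overflow condition can be confirmed with ordinary 32-bit arithmetic. The Python sketch below is not the Rigel data path; it computes 32 - (pos + size) as a two's complement longword subtract and reports the N, Z, and V condition codes. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sub32(a, b):
    """32-bit two's complement a - b; return (result, N, Z, V)."""
    a &= MASK32
    b &= MASK32
    r = (a - b) & MASK32
    n = r >> 31
    z = int(r == 0)
    # Signed overflow: operands differ in sign and the result's sign
    # differs from the minuend's.
    v = ((a ^ b) & (a ^ r)) >> 31
    return r, n, z, v


def insv_range_test(pos, size):
    """Condition codes for the 32 - (pos + size) test."""
    return sub32(32, (pos + size) & MASK32)


print(insv_range_test(0x8000000F, 3))   # the failing case: V is set
print(insv_range_test(40, 3))           # ordinary reserved operand: V is clear
```

With size = 3, V is set for positions 80000000 through 8000001D (hex) and clear from 8000001E upward, matching the condition xy <= 32 - size.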

Because the bug had not been seen in the field, and because the same operands caused an exception on all other VAXen, the bug was waived, and Rigel was never fixed.

The End Of An Era: NVAX and Alpha

By the time NVAX was designed, late-stage microcode bugs had become a sufficient annoyance, and silicon real estate had become sufficiently great, that the design team returned to a strategy used in early VAXen: a patchable control store. This made microcode bugs a thing of the past. Alpha carried the idea a stage further, by eliminating microcode altogether. Alpha’s equivalent (PAL code) was always executed from main memory, allowing errors to be corrected by simple firmware updates. Patchable control stores (for CISC machines) and loadable firmware (for RISC machines) continue to be the preferred implementations to this day.

HP’s IOP Implementations: 2100 vs 21MX
Bob Supnik, 22-Nov-2002 (revised 16-Apr-2004)

Summary

HP’s Access system (a late version of TimeShared Basic) consisted of two processors linked by a parallel interconnect. One of the computers was responsible for computation and mass storage and was called the System Processor; the other was responsible for character-by-character I/O and was called the I/O processor. To improve performance, the IOP used unique microcode assists called the IOP instruction set. On the HP 2100, these instructions overlapped with, and were mutually exclusive with, the floating point microcode. On the HP 21MX, these instructions occupied different code points in the extended instruction space.

Encoding Differences

Instruction   2100          21MX          notes

ILIST         105000        105470
LAI           105020-57     105400-37
SAI           105060-117    101400-37
MBYTE         105120        105765        standard 21MX instruction
CRC           105150        105460
TRSLT         105160        105467        not used in 21MX IOP code
MWORD         105200        105777        standard 21MX instruction
READF         105220        105462
PRFIO         105221        105473
PRFEI         105222        105471
PRFEX         105223        105472
ENQ           105240        105464
PENQ          105257        105465
DEQ           105260        105466
SBYTE         105300        105764        standard 21MX instruction
LBYTE         105320        105763        standard 21MX instruction
REST          105340        105461
SAVE          105362        105474
INS           -             105463

Functional Differences

1. The 2100’s byte and word move instructions have the same functional definition as the 21MX’s standard MVB and MVW, but the 2100’s microcode implementation checked an additional condition (do nothing if the count is less than zero).
2. The 2100 uses the memory protect option’s fence register (internally, the F register) as the IOP stack pointer. The F register is loaded with an OTx 5 instruction, and stored with READF. The 21MX uses a different internal register, because it provides a new instruction to load the stack pointer.
3. The 2100’s TRSLT instruction is not used in the 21MX IOP code, even though the code point is defined.

CTSS Hardware
Bob Supnik (fourth revision, 23-Jun-2005)

All of the "specifications" given below are derived from the CTSS sources on Paul Pierce's site and from surviving CTSS documentation. They are likely to be incomplete; only the functionality used by CTSS can be deduced.

1. Interval timer

Location 5 is incremented at 60Hz. When it overflows, that is, bits 1-35 increment to 0, a trap occurs. The PC is saved in location 6, and the next instruction is taken from location 7. Traps are inhibited, as with data channel traps.

It also appears that ENB <17> controls the interval timer trap, but this is not certain; data channel traps are always gang-enabled in CTSS.

This is similar to, but simpler than, the interval timer option for the 7044. In particular, there is no interval timer reset trap (which indicates that a second interval timer overflow occurred without the first one being serviced). The differences are documented in the 7044 Principles Of Operation.

2. Extended memory

Memory is doubled with the addition of "B core", a second bank of 32KW. There are separate controls for use of B core for instruction (and indirect) references, and for data references. Four new instructions control B core:

SEA (-076100t00041)    set data references to A core
SEB (-076100t00042)    set data references to B core
TIA (+0101f0tyyyyy)    transfer to A core at eff addr and set user mode
TIB (-0101f0tyyyyy)    transfer to B core at eff addr and set user mode

When a trap occurs that writes the decrement (protection trap, clock trap, channel trap, or floating point trap), the use of A and B core is recorded in the decrement of the saved trap word, as follows:

bit<3>    0 = instructions executing in A core
          1 = instructions executing in B core
bit<4>    0 = data references in A core
          1 = data references in B core

All traps store the trap word in A core, set the data reference mode to A core, start executing in A core, and reset user mode.

[The behavior of traps that only modify the address of the trap word cannot be deduced from the sources. For example, on an STR trap, CTSS simulates the behavior by copying A-core location 0 to B-core location 0 and resuming execution at B-core location 2. For a floating point trap, CTSS clears bit <3:4> before copying the address and decrement (only) of A-core location 0 to B-core location 0.]

Channels specify A core vs B core by bit<20> of the channel command word. Bit<20> = 0 is A core, = 1 is B core. This effectively extends the channel initial starting address to 16b. The channel address register remains 15b and does not change the A/B select bit on overflow.

[The selection of A vs B core is illustrated in routine CMEXIT, which sets up memory for a return to an interrupted user process or a system task. The selection of A vs B core for a channel is illustrated in the CTSS bootstrap.]

3. Protection

Protection is implemented through two mechanisms: implementation of user mode, and implementation of memory relocation and protection.

3.1 User Mode

User mode is a subset of standard 7094 mode. In user mode,

- Memory accesses are subject to memory relocation and protection
- Certain instructions are forbidden and cause a protection trap

The TIx instructions implicitly set user mode. User mode is cleared by any trap, including a protection trap. Because TIA is privileged, it is the conventional way of entering the monitor in A core. Because TIA and TIB are the same opcodes with different signs, TIB presumably sets user mode as well.

The list of privileged instructions:

- all I/O instructions (RDS, WRS, BSR, BSF, SDN, RUN, REW, etc)
- all channel instructions (RCHx, LCHx, SCHx, etc)
- all I/O transfer instructions (TEFx, TRCx, TCOx, TCNx)
- plus and minus sense (+0760... and -0760...)
- HPR and HTR
- ENB
- SEA, SEB, TIA, TIB, LRI, and LPI

Nested XEC's were not trapped, and XEC * would hang the system.

Protection traps save the PC and A/B flags in the address and decrement, respectively, of location 032, clear user mode, set instruction and data memory to A core, and execute the instruction in location 033.

The list of privileged instructions is derived from the CTSS background task emulation routine, which executes a subset of the privileged instructions on behalf of a background process. The lack of trapping on nested XEC's comes from Jerry Saltzer and Stan Dunten.

3.2 Memory Relocation and Protection

According to the CTSS book, memory is managed in 256W blocks. Memory relocation and protection is done by three 7b registers:

- base: defines the 256W block that is the base memory addr
- end: defines the 256W block that is the end memory addr
- relo: defines the value to be added to the block number prior to memory access

The relocation and protection registers are loaded as follows:

LRI (+0562f0tyyyyy)    load relocation from M[ea]
LPI (-0564f0tyyyyy)    load base from M[ea]
                       load end from M[ea]

The CTSS code operates as though the relocation and protection registers were 15b rather than 7b, and as though the base and relocation are always zero. The algorithm is:

- load image base into both base and relo
- add number of words
- load sum into end

Various code sequences that check the legality of virtual addresses skip relocation and check only the end address. This is consistent with the so-called "onion-skin" swap algorithm used in CTSS. However, one sequence adds in the relocation register and then checks it against the end register. One could conjecture that relocation is performed before protection checks, but there's no way to confirm that from the code.

The actual protection check appears to be:

base<0:6> <= addr<0:6> <= end<0:6>

This is confirmed by a statement in the CTSS guide, that a user may be able to access more than the allocated number of words, because protection checks are done on 256W boundaries; but only the allocated number of words is swapped in and out. If the user was allocated 1024 words, then base = 0 and end = 4; any address up to 1024 + 255 would be considered valid.
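
The check above can be sketched in a few lines of Python. This is only an illustration, assuming 15b addresses whose high-order 7 bits (addr<0:6>) select the 256W block. The function names are hypothetical, and relocation is left out because its ordering relative to the check cannot be confirmed from the code.

```python
BLOCK_SHIFT = 8      # 256W blocks: the low 8 bits are the word within the block
BLOCK_MASK = 0o177   # 7b block number

def block_of(addr):
    """Return the 256W block number, i.e. addr<0:6> of a 15b address (assumed layout)."""
    return (addr >> BLOCK_SHIFT) & BLOCK_MASK

def is_access_valid(addr, base, end):
    """Apply base<0:6> <= addr<0:6> <= end<0:6>; base and end are block numbers."""
    return base <= block_of(addr) <= end

# A user allocated 1024 words has base = 0 and end = 4.
print(is_access_valid(1024 + 255, 0, 4))   # last word of block 4
print(is_access_valid(1024 + 256, 0, 4))   # first word of block 5
```

With base = 0 and end = 4, the first access passes the check and the second does not, which matches the 1024 + 255 limit described above.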

4. I/O

The CTSS system corresponding to the sources that we have is the “red machine” from Project MAC. It has the following I/O configuration:

channel A: 7607, card reader and punch, line printer, 8 tape drives, Chronolog clock
channel B: 7607, 7 tape drives
channel C: 7909, 2 2302 disks, 1 7320 drum
channel D: 'direct channel' connection to display
channel E: 7909, 7750 communications controller
channel F: unused
channel G: 7289 (7389?), 2 7320A high-speed drums

4.1 Channel A (7607)

The card reader, card punch, and line printer are described in the 7094 Principles Of Operation. CTSS allows the background processor to use all three, although the line printer can only be used in BCD mode.

The tape drives are described in the 7094 Principles Of Operation. CTSS allows the background processor to use drives 1-7 and 0. [Note that this allows the background processor to read the Chronolog clock.]

Tape drive "7" is the Chronolog clock. It returns 2 words of BCD digits giving the current time of year (without the year):

word1: month|day in month|hour in day
word2: minute in hour|second in minute|60th second

The month and day in month are 1-based; all others are 0-based. The Chronolog apparently accounts for leap years correctly.
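
For a simulator, the two clock words could be generated from the host time roughly as follows. This is a sketch under stated assumptions: each field is taken to occupy two 6b characters (six characters per 36b word), and each character is taken to hold the plain digit value 0-9. Neither detail is confirmed by the description above, and the function names are hypothetical.

```python
from datetime import datetime

def pack_bcd(fields):
    """Pack three two-digit fields into one 36b word, two 6b digits per field (assumed)."""
    word = 0
    for value in fields:
        tens, ones = divmod(value, 10)
        word = (word << 12) | (tens << 6) | ones
    return word

def chronolog_words(t):
    """Return (word1, word2) for a datetime; the year is not represented."""
    sixtieths = t.microsecond * 60 // 1_000_000   # 0-based 60ths of a second
    word1 = pack_bcd([t.month, t.day, t.hour])     # month and day are 1-based
    word2 = pack_bcd([t.minute, t.second, sixtieths])
    return word1, word2

w1, w2 = chronolog_words(datetime(1965, 2, 3, 14, 5, 6, 500000))
print(f"{w1:012o} {w2:012o}")
```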

4.2 Channel B (7607)

The tape drives are described in the 7094 Principles Of Operation. CTSS allows the background processor to use drives 1-6 and 0.

4.3 Channel C (7909)

4.3.1 7631 File Control

Channel C is a 7909 supporting a single 7631 file controller. The 7631 in turn supports ten logical units (called modules), of which CTSS uses 5. The best online document for the 7631 and its devices is "IBM 1301, Models 1 and 2, Disk Storage, and IBM 1302, Models 1 and 2, Disk Storage, with IBM 7040 and 7044 Data Processing Systems", A22-6768. The material is directly applicable to the 709x series as well.

4.3.2 2302 Disk

The 2302 disk is apparently a later model of the 1302 disk, which in turn was the successor to the original 1301 disk. The basic building block of the series is the module, a stack of 25 fixed disk platters, 20 of which hold data on both sides (40 data surfaces). Each surface was divided into tracks (250 on the 1301, 500 on the 1302 and 2302). Each track is a serial bit stream. The 2302-2 that CTSS used has two modules per physical enclosure, and each system has two enclosures, for a total of four modules – a total of less than 80MW (340MB) of storage.

Tracks are numbered sequentially, from 0-9999, based on the cylinder number and the track within cylinder. On the 1302/2302, the second set of 250 cylinders is addressed via a second access arm.

The 1301/2302 series allows variable formatting per cylinder. Each track has a fixed home address (the track number), a user-specified variable home address, and then user-specified record numbers of variable length. The formula for how much data fit in a track is a complex function of the length of home address 2, the length of the record numbers, and the records themselves. Fortunately, CTSS uses the 2302 disk in a simplified way:

home address 2 = 67676767676 (=HXXXXXX)
record address = TTTTRM, where
    TTTT = track number
    R = record within track (0 or 1)
    M = physical module number
records per track = 6: 2 data, 4 filler
    435 word data record
    31 word filler record
    14 word filler record
    435 word data record
    16 word filler record
    1 word filler record

The mapping from logical modules to physical disks is as follows:

logical   physical

0         access 0, module 0
1         access 1, module 0
2         access 0, module 1
3         access 1, module 1
4         access 0, module 4
5         access 1, module 4
6         access 0, module 5
7         access 0, module 5
8         access 0, module 2 [drum]

Presumably, the 1301 had just a single record per track, in the same format as the 7320 drum, but this cannot be proven from the sources.

4.3.3 7320 Drum

The 7320 drum is much smaller. It has 400 tracks. As with the 1301 series disks, the tracks support variable format. CTSS uses the drum in a simplified way:

home address 2 = 67676767676 (=HXXXXXX)
record address = TTTTRM, where
    TTTT = track number
    R = record within track (always 0)
    M = physical module number (always 2)
records per track = 3: 1 data, 2 filler
    435 word data record
    16 word filler record
    1 word filler record

Thus, the drum holds 174K 36b words - less than 6 core loads. The drum is on access 0, module 2, of the 7631 file controller and responds as logical module 8.
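
The capacity figure follows directly from the track format. The short calculation below reproduces it; the 32K-word core size used for a "core load" is an assumption based on the standard 7094 core bank, and the variable names are only illustrative.

```python
TRACKS = 400                 # tracks on the 7320 drum
DATA_WORDS_PER_TRACK = 435   # one 435 word data record per track, as CTSS formats it
CORE_WORDS = 32 * 1024       # assumed size of one core load (32K 36b words)

drum_words = TRACKS * DATA_WORDS_PER_TRACK
core_loads = drum_words / CORE_WORDS

print(drum_words)             # usable data words on the drum
print(round(core_loads, 2))   # number of core loads that fit
```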

4.4 Channel D (7607)

Channel D is the experimental channel. For the sources that we have, it is connected to an experimental display driven by a PDP-7 and uses the 7094's "direct interrupt" capability. The experimental display is not documented, except by the source code, and is unlikely to be reproducible in simulation.

4.5 Channel E (7909)

Channel E is a 7909 supporting a 7750 communications computer. The 7750 is not well documented, other than an overview in the IBM Journal of Research, March 1963. For simulation purposes, its detailed internal behavior is irrelevant; instead, its externally visible behavior has to be reverse-engineered from the CTSS source.

According to the Journal article, the 7750 supports up to 112 lines. The CTSS sources allow for 62. Lines 0-3 are "high-speed" lines and are used for computer-to-computer communications. High-speed lines 0-1 are 12b only, lines 2-3 support 6b and 12b. For simulation purposes, the high-speed lines can be ignored.

The 7750 has two basic operations: read and write. Read is used to drain accumulated input from terminals and is an asynchronous operation. Write is used to output to terminals and is a synchronous operation. Reads occur in response to user input and are signaled by the ATTENTION1 signal. Writes occur in response to program operations.

The 7750 relied on local echo ("half duplex") to provide timely feedback for typed characters. This includes inserting LF after CR on ASCII terminals.

The 7750 is initialized by sending it a 36b word of all 1's.

4.5.1 Read Messages

Read messages consist of a string of 12b characters, aligned to a 36b (word) boundary. The message terminates with at least one EOM (end of message) of 3777; if the message must be padded to a 36b word boundary, the padding also consists of EOM's.

The first character is the transmission number, which counts modulo 2048. There is no apparent protocol for initializing this count. CTSS ignores message number mismatches and resynchronizes after each message. From the comments, one error is to be expected, i.e., an error on the first message. According to Stan Dunten, this feature was added to debug a problem of dropped messages between the 7750 and the 7094; once the problem was found, the feature was no longer used.

After the message number, the read message consists of pairs of line number and input characters. The line number character is expected to be between 1024 and 2046, that is, to have the 2**10 bit set. If that bit is clear, the character pair is discarded.
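
The framing rules so far (12b characters, EOM termination, a leading transmission number, then line/character pairs) can be sketched as a small parser. This is illustrative only. It assumes that the first 12b character of each word sits in the high-order bits, and it treats everything after the first EOM as padding. The function names are hypothetical.

```python
EOM = 0o3777        # end of message
BIT_2_10 = 0o2000   # the 2**10 bit

def unpack_chars(words):
    """Split 36b words into 12b characters, leftmost character first (assumed order)."""
    chars = []
    for word in words:
        chars += [(word >> 24) & 0o7777, (word >> 12) & 0o7777, word & 0o7777]
    return chars

def parse_read_message(words):
    """Return (transmission number, [(line character, input character), ...])."""
    chars = unpack_chars(words)
    if not chars:
        return None, []
    transmission, body = chars[0], chars[1:]
    if EOM in body:
        body = body[:body.index(EOM)]      # drop the EOM and any EOM padding
    pairs = []
    for i in range(0, len(body) - 1, 2):
        line_char, input_char = body[i], body[i + 1]
        if line_char & BIT_2_10:           # pairs without the 2**10 bit are discarded
            pairs.append((line_char, input_char))
    return transmission, pairs

message = [(5 << 24) | (0o2005 << 12) | 0o2001, (EOM << 24) | (EOM << 12) | EOM]
print(parse_read_message(message))
```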

Input characters can be either control or data characters. Control characters have the 2**10 bit set. There are six control characters:

2001    dialup
2002    end ID sequence
2003    interrupt
2004    quit
2005    hangup
30nn    completed typeout of nn (<= 31) characters

Dialup is the first character received from a newly dialed in line. It is followed by the ID sequence in subsequent data characters. The first data character following dialup contains the device ID:

<0:6>     discarded
<7:11>    device ID
              1 = KSR-35
              2 = 1050
              3 = Telex
              4 = TWX
              5 = inhouse TWX
              6 = KSR-35 standard
              7 = KSR-37
              8 = 2741
              9 = ESL scope

The low order 6b of subsequent data characters are shifted into bits <6:35> of the UNITID field until an end ID character is received.

Interrupt and quit are generated by specific character sequences on each input device (for example, BREAK generates quit on a KSR-28). The keyboard mappings for these keys are not documented in the CTSS listings. Hangup is generated when a line disconnects.

The 7750 sends data characters in 1's complement form. This is evident from two tables, one for identifying the spacing count, the other for character conversion. The former has entries for 001-137, 162, 166, 167, that is, 176-040, 015 (CR), 011 (tab), and 010 (backspace). The latter clearly shows the BCD equivalent of A for entry 76 (~101), B for entry 75 (~102), etc. Stan Dunten speculates that IBM's engineers didn't understand or didn't like the serial line protocol that uses 1's as the idle line state and made 0 the idle state, effectively inverting the protocol.
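
The table correspondences quoted above are simply 7b 1's complements, which is easy to check. The helper name below is hypothetical.

```python
def complement7(code):
    """Return the 7b 1's complement of a character code."""
    return ~code & 0o177

# Spacing-count table entries and the characters they stand for.
for entry in (0o001, 0o137, 0o162, 0o166, 0o167):
    print(f"{entry:03o} -> {complement7(entry):03o}")

# ASCII 'A' (101) appears in the conversion table as entry 76.
print(f"{complement7(0o101):o}")
```

Because the operation is its own inverse, the same helper serves for both input and output conversion in a simulator.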

Data characters consist of 7b plus parity. IBM terminals and the KSR-37 generate even parity. Parity is checked on the incoming (1's complement) character. All other terminals do not generate parity.

4.5.2 Write Messages

Unlike read messages, write messages are always for a single line. There are two formats for write messages: control messages and data messages. Control messages are always one 36b word long and contain:

<0:11>     3000 + line number
<12:23>    control function
<24:35>    7777 (end of medium)

The only defined control function is all 0's, which resets a line and releases all buffered characters (see flow control).
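
A control message can be assembled as a single 36b value. The sketch below assumes that the constants 3000 and 7777 are octal and that bit 0 is the most significant bit of the word; the function name is hypothetical.

```python
END_OF_MEDIUM = 0o7777
LINE_RESET = 0          # the only defined control function

def control_message(line, function=LINE_RESET):
    """Build the one-word control message for a line (octal constants assumed)."""
    first = (0o3000 + line) & 0o7777          # bits <0:11>
    return (first << 24) | ((function & 0o7777) << 12) | END_OF_MEDIUM

print(f"{control_message(5):012o}")   # reset line 5
```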

Data messages consist of a 12b line number, a 12b character count, and then either 6b or 12b characters, followed by end of medium. If the message is 6b characters, then bit 2**10 in the line number is clear, and end of medium is 077; if 12b, then bit 2**10 in the line number is set, and end of medium is 07777. The largest write that CTSS can do in one operation is 94 words (559 6b characters). In practice, 6b mode is used only for high-speed lines, that is, for computer-to-computer communication without character translation.

In 12b mode, data characters consist of 7b plus parity plus start bit. As with input, the actual character is 1's complemented; this is evident from the code conversion table. The start bit is inserted by code for data characters but comes from a table for control characters. Because of the 1's complement coding, the inserted start bit is always a 1. Only the KSR-37 and ESL scope require even parity; other devices do not.

Character 3777 is treated specially. It causes the 7750 to repeat the last bit sent for the number of bit times specified in the next character. If the last bit was a space (idle), as is always the case for ASCII devices, then 3777 holds the line idle for the specified number of bit times. This is used to add delay for positioning characters like CR, LF, and TAB. If the last bit was a mark, then 3777 produces a break sequence, which was needed for the IBM terminals. The special sequences for the KSR-37 are:

CR           753,3777,0
TAB          755,3777,1
FORM FEED    747,3777,240
CR-NO-LF     345,3777,24
LF           333,333,0
POFF         711,633 (printer off)
PON          711,211 (printer on)
VT           351,3777,30
BLACK        711,277
RED          711,631

The BCD equivalents are (where UC implies values >= 0100):

UC 07 => PON
UC 16 => LF
UC 17 => VT
UC 36 => POFF
LC 52 => FF
LC 55 => CR
UC 61 => CR-NO-LF
LC 72 => TAB
UC 72 => RED
UC 75 => BLACK

For the KSR-35, some equivalent sequences:

CR      345,3777,0
TAB     355,3777,22
FORM    347,3777,54

The KSR-37 treats LF as CR-LF on output, and recognizes an auxiliary code (022) as bare LF. The KSR-35 treats CR as CR-LF on output.

4.5.3 Flow Control

On input, the 7750 relies on the "natural" balance between slow typists and a "fast" CPU to keep buffers from overflowing. But on output, a program can generate data much faster than the terminals can output. The 7750 cooperates with the operating system to implement global flow control on output.

CTSS tracks the number of characters of 7750 buffer consumed by a given user. It increments this total each time a buffer is output. The 7750 sends asynchronous “output complete” messages to let CTSS know when to decrement the user's character count. Output complete messages are sent at least every 31 characters, as the maximum character count in the message is 31. I suspect that they are also sent if the user's output line goes idle (all characters output).

4.5.4 Possible Simulation Strategy

- Ignore the high-speed lines; start the effective line count above the high-speed/low-speed boundary. This eliminates 6b mode.
- Simulate only a single type of terminal. The KSR-37 would seem to be the easiest, as it requires trivial code conversion and no tracking of shift codes. Red/black could be handled by ANSI compatible bold/normal sequences, for ANSI-compliant Telnet clients.

4.5.5 Unanswered questions:

- What keystrokes generated interrupt and quit for various terminals? Stan Dunten recollects that BREAK was used for quit on all ASCII terminals.
- Are there other output control messages in addition to line reset?
- Was 6b mode used for anything other than the high-speed lines?

4.6 Channel F

Unused.

4.7 Channel G (7289)

Channel G is a unique 7289 (possibly 7389) channel supporting two "high-speed" 7320A drums. The programming model for this drum is a hybrid between the select-driven model of the 7607 and the channel-driven model of the 7909. Read or write select is done with RDS and WDS, respectively, and data transfers with IOCP and IOCD, just like the 7607; but the channel takes the first word of the channel program as a drum address, and responds to the SCDx command, like the 7909. In addition, the drum channel has unique characteristics for storing information on SCD (store channel diagnostic).

Each physical drum has 192K 36b words, organized as six "logical" drums of 32K 36b words each. Each "logical" drum, in turn, is divided into 16 "chunks" (sectors) of 2048 words. The drum channel supports some number of physical drums, but CTSS only has 2. Drum addresses to the channel have the following format:

bits<3:5>      physical drum
bits<15:17>    logical drum
bits<21:35>    sector (word) address
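
Extracting these fields from a 36b drum address word is a matter of shifting and masking. The sketch below assumes that bit 0 is the most significant bit of the word, as the bit ranges elsewhere in this note suggest; the helper names are hypothetical.

```python
def field(word, first, last):
    """Extract bits <first:last> of a 36b word, with bit 0 as the MSB (assumed)."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)

def decode_drum_address(word):
    """Split a drum address word into its three fields."""
    return {
        "physical": field(word, 3, 5),
        "logical": field(word, 15, 17),
        "sector": field(word, 21, 35),
    }

# Physical drum 1, logical drum 5, word address 04000.
address = (1 << 30) | (5 << 18) | 0o4000
print(decode_drum_address(address))
```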

The high-speed drum controller in CTSS responds as unit 330 on channel G. RDS and WDS prepare the controller for a read or write operation. When the channel is started (with LCHx or RCHx), the first word of the channel program is the drum address. The second and subsequent words are normal 7607 commands, of which CTSS uses only IOCP and IOCD.

When a drum operation completes, and the channel disconnects, if the enable bit is set for channel G, a data channel trap occurs. The trap location contains the standard 7607 error indicators in <15:17>: end of file, redundancy check, other error. SCDG returns 36b of channel error status. These are not defined, but bits <0:2> and <13> are considered to be errors; bit <0> is apparently I/O check.

Acknowledgements

First and foremost, I want to acknowledge the help of the CTSS team, whose continued interest in, and lively recollection of, the project made possible the salvaging of the sources as well as resolution of many obscure points. Tom Van Vleck, Stan Dunten, Jerry Saltzer, Donald Widrig, Dan Edwards, Bob Fenichel, Bob Daley, Roger Roach and Karolyn Martin have read all the drafts of this note and supplied corrections to errors and anecdotes that illuminated operation of the system. Needless to say, any remaining errors are my responsibility.

Paul Pierce did the mammoth task of transcribing the CTSS source tapes to online form. Paul also wrote the first 709 simulator, which contributed substantially to my knowledge of the architecture, and transcribed key hardware documents for online use. Dave Pitts extended Paul’s simulator to be a full 7094 and got it to run IBSYS successfully. Dave’s cross-assembler and cross-linker for the 7094 have made it possible to create executable modules from the CTSS source set and will be key to any reconstruction of an executable CTSS image. Rob Storey’s 7094 simulator has also been very helpful.

Last, but hardly least, Al Kossow’s multi-year project to transcribe manuals for online use has been critical in this effort, as in so many others. Al transcribed not only extant 7094 manuals, but also 704X manuals, which provided documentation about the interval clock and the 7631 file control.

Web resources

Paul Pierce’s CTSS source set: http://www.piercefuller.com/library/ctss.html

Al Kossow’s document archive: http://bitsavers.org/pdf/

Tom Van Vleck’s CTSS page: http://www.multicians.org/thvv/7094.html

What Was The PDP-X?

Bob Supnik, 10-Jan-2004 [revised 23-Feb-2004]

Introduction

The PDP-X was one of Digital Equipment Corporation’s legendary lost designs. The leaders of the project, Edson DeCastro and Henry Burkhardt, left DEC when the project was cancelled to found Data General Corporation, amid charges of bad faith and IP theft. The PDP-X was rumored to be the prototype for the Nova, the PDP-11, both, or neither.

Recently uncovered documents in the DEC Archive (now in possession of the Computer History Museum in Mountain View, California) make it possible to debunk these rumors. The “PDP-X” technical memorandum series shows conclusively that the proposed PDP-X had little similarity to either the DG Nova or the PDP-11. Both the Nova and the PDP-11 demonstrate substantial advances in architectural thinking over the PDP-X, with the Nova pointing the way to future RISC processors, and the PDP-11 to the VAX.

The PDP-X Project

The documentary record for the PDP-X begins in June, 1967, with an introductory memo about the Technical Memorandum series, and ends in February, 1968 with a note about proposed assembler syntax. Critical memos include the Processor Architecture (#13) and the System Architecture (#16), both dating from the summer of 1967. By the spring of 1968, the project had been rejected, its value vis-à-vis the established 12b (PDP-8) and 18b (PDP-9) product lines insufficiently proven to warrant further development.

The PDP-X proposal represented a way-station between the “one-off” system design embodied in DEC’s 12b and 18b systems and the “family” concept of the PDP-11. From the outset, the PDP-X was intended to include a variety of models at a variety of price points. These models would have (upward) compatible features and capabilities and would share common peripherals and software. The lower cost model (the model I) was intended to be price competitive with the PDP-8, the higher cost model (the model II) with the PDP-9.

Architecturally, the PDP-X was also a way-station between the accumulator-oriented systems of the early 60’s and the more radical Nova and PDP-11. Multiple accumulators and index registers gave the architecture more flexibility, at the cost of greater complexity (including variable length instructions). The instruction set followed a register-memory model, like the PDP-10, rather than the load-store model of the Nova or the generalized operands of the PDP-11. Real-time processing was a central concern, with fast context switching through multiple register sets.

The PDP-X Architecture

Data Types

The PDP-X was a word-oriented, multiple accumulator, variable instruction length computer. There were five basic data types:

• 16b unsigned integers
• 16b signed integers – 2’s complement
• 8b bytes – stored two per word, with the “first” byte on the right (“little endian”)
• 32b floating point – IBM “hex” format
• 64b floating point – IBM “hex” format

Bits in memory were numbered left to right, starting with bit 0.

Memory

Memory consisted of 16b words. A minimal system had 8KW. A system without memory protection could support 32KW, with memory protection, 128KW. Memory was contiguous; references to non-existent memory caused a trap.

Registers

Processor state was organized around 16 registers. These registers occupied memory addresses 0-15, as in the PDP-10. The first eight registers could be implemented in discrete logic, again following the model of the PDP-10:

R0    program status word
R1    program counter, “index register”
R2    accumulator, subroutine linkage, index register
R3    accumulator, index register
R4    accumulator
R5    accumulator
R6    accumulator
R7    accumulator

The second eight registers were always in memory and had dedicated purposes:

R8     extended op PC
R9     extended op instruction
R10    extended op effective address
R11    extended op entry address
R12    push down pointer
R13    push down counter
R14    trap PC
R15    trap entry address

Each interrupt priority level had its own register set. A minimal system had two priority levels, user and interrupt (monitor); a fully populated system had eight.

The program status word (PSW) consisted of 16b of status information and the 15b program counter. The PSW provided trap information, condition codes, and priority level control (register set select):

bit<0>         arithmetic trap enable
bit<1>         arithmetic trap flag
bit<2>         push down list error
bit<3>         non-existent memory error
bit<4>         privileged instruction violation
bit<5>         memory protection violation
bits<10:12>    priority (register set select)
bits<13:15>    condition codes

The three condition codes were carry/borrow, negative, and not-zero, respectively.
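
As an illustration of the layout, the status half of the PSW can be decoded as follows. The sketch assumes that bit 0 is the leftmost (most significant) bit of the 16b word, as stated for memory words, and that CC 0, CC 1, and CC 2 occupy bits 13, 14, and 15 in that order. The function and key names are hypothetical.

```python
def psw_field(psw, first, last):
    """Extract bits <first:last> of a 16b word, with bit 0 as the MSB."""
    width = last - first + 1
    return (psw >> (15 - last)) & ((1 << width) - 1)

def decode_psw(psw):
    """Return the documented PSW fields as a dictionary."""
    return {
        "arith_trap_enable": psw_field(psw, 0, 0),
        "arith_trap_flag": psw_field(psw, 1, 1),
        "pushdown_error": psw_field(psw, 2, 2),
        "nonexistent_memory": psw_field(psw, 3, 3),
        "privileged_violation": psw_field(psw, 4, 4),
        "protection_violation": psw_field(psw, 5, 5),
        "priority": psw_field(psw, 10, 12),
        "carry": psw_field(psw, 13, 13),      # CC 0 (assumed position)
        "negative": psw_field(psw, 14, 14),   # CC 1 (assumed position)
        "nonzero": psw_field(psw, 15, 15),    # CC 2 (assumed position)
    }

print(decode_psw(0x8000 | (3 << 3) | 0b101))
```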

The PDP-X also implemented four 64b floating point accumulators, in memory locations 32-47.

Instructions

PDP-X instructions were either 16b or 32b in length, depending on the opcode and the displacement field. There were multiple instruction formats:

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| opcode |   R    |  X  |         disp          |  short,
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  disp != 10000000

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| opcode |   R    |  X  | 1  0  0  0  0  0  0  0|  long,
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  X != 01
| I|               direct address               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| opcode |   R    | 0  1| 1  0  0  0  0  0  0  0|  immediate
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|               immediate operand               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| opcode |   R    |  X  | subopcode or dev addr |  extended or IO,
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  X != 01
| I|               direct address               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| opcode |   R    | 0  1| subopcode or dev addr |  extended or IO
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  immediate
|               immediate operand               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Effective address calculation was controlled by the opcode, the X field (bits<6:7>), and by the displacement field (bits<8:15>), as follows:

Opcode <=5 && displacement != 0x80
    X == 0    field 0 direct         ea = displacement
    X == 1    PC relative            ea = PC + SEXT8 (displacement)
    X == 2    R2 (link) relative     ea = R2 + SEXT8 (displacement)
    X == 3    R3 (index) relative    ea = R3 + SEXT8 (displacement)

Opcode >5 || displacement == 0x80
    X == 0    direct                 ea = direct address
    X == 1    immediate              ea = PC + 1
    X == 2    R2 (link) relative     ea = R2 + direct address
    X == 3    R3 (index) relative    ea = R3 + direct address
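
The two tables can be folded into one function, as the sketch below shows. It is an interpretation rather than a description of any implementation: results are wrapped to 15 bits on the assumption that addresses match the 15b program counter, indirection is ignored, and the function and parameter names are hypothetical.

```python
ADDR_MASK = 0x7FFF   # assumed 15b address space

def sext8(value):
    """Sign-extend an 8b displacement."""
    return value - 0x100 if value & 0x80 else value

def effective_address(opcode, x, disp, pc, r2, r3, direct=0):
    """Compute ea from the opcode, the X field, and the displacement field."""
    if opcode <= 5 and disp != 0x80:
        # Short (one word) format.
        if x == 0:
            return disp                          # field 0 direct
        base = (pc, r2, r3)[x - 1]
        return (base + sext8(disp)) & ADDR_MASK
    # Long (two word) format; 'direct' is the second instruction word.
    if x == 1:
        return (pc + 1) & ADDR_MASK              # immediate operand follows
    base = (0, None, r2, r3)[x]
    return (base + direct) & ADDR_MASK

print(oct(effective_address(0, 1, 0xFF, pc=0o100, r2=0, r3=0)))   # PC relative, disp = -1
```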

Long addresses supported indirection. The memos don’t make clear whether indirect addressing was single-level or multi-level.

Because the opcode field was so small (3 bits), the number of basic operations was very small and was almost the same as the PDP-8:

op == 0    LDA       Rn = M[ea], CC unchanged
op == 1    STA       M[ea] = Rn, CC unchanged
op == 2    ADD       Rn = Rn + M[ea], set CC 0-2
op == 3    AND       Rn = Rn & M[ea], set CC 1-2
op == 4    branch    Rn selects one of 8 branch functions
op == 5    modify    Rn selects one of 8 modify operations

The 8 available branches were:

R == 0    BCN    branch if CC 0 (carry) == 1
R == 1    BM     branch if CC 1 (minus) == 1
R == 2    BN     branch if CC 2 (non-zero) == 1
R == 3    B      unconditional branch
R == 4    BCZ    branch if CC 0 (carry) == 0
R == 5    BP     branch if CC 1 (minus) == 0
R == 6    BZ     branch if CC 2 (non-zero) == 0
R == 7    BAL    R2 = PC + 1, unconditional branch

The 8 available modify functions were:

R == 0    TST    set CC 1 and 2 from M[ea]
R == 1    COM    M[ea] = ~M[ea], set CC 1-2
R == 2    INC    M[ea] = M[ea] + 1, set CC 1-2
R == 3    NEG    M[ea] = -M[ea], set CC 1-2
R == 4    RR     rotate M[ea] right through CC 0, set CC 1-2
R == 5    RL     rotate M[ea] left through CC 0, set CC 1-2
R == 6    SWP    swap bytes in M[ea], set CC 1-2
R == 7    CLR    M[ea] = 0, set CC 1-2

The extended operation instructions (opcode 6) provided an “escape” for more complex instructions, at the cost of an additional word. The extended operation class provided room for 256 additional instructions. The first 64 were reserved as UUO’s (unused operation orders), for program/monitor communication (again, like the PDP-10). Of the remaining 192, the following were defined:

subop == 100    LMUL        Rn’Rn v 1 = Rn * M[ea], unsigned, set CC 1-2
subop == 101    MUL         Rn’Rn v 1 = Rn * M[ea], signed, set CC 1-2
subop == 102    LDIV        Rn,Rn v 1 = Rn’Rn v 1 / M[ea], unsigned, set CC 1-2
subop == 103    DIV         Rn,Rn v 1 = Rn’Rn v 1 / M[ea], signed, set CC 1-2
subop == 104    TSTN        Rn & M[ea], set CC 1-2
subop == 105    TSTZ        Rn & M[ea], set CC 1-2; Rn = Rn & ~M[ea]
subop == 106    TSTO        Rn & M[ea], set CC 1-2; Rn = Rn | M[ea]
subop == 107    TSTC        Rn & M[ea], set CC 1-2; Rn = Rn ^ M[ea]
subop == 110    LCMP        Rn : M[ea], unsigned, set CC 1-2
subop == 111    CMP         Rn : M[ea], signed, set CC 1-2
subop == 112    SUB         Rn = Rn - M[ea], set CC 0-2
subop == 113    shift       Rn = Rn (shftop) SEXT8(M[ea]<8:15>)
subop == 114    LDC         Rn<8:15> = M-byte[ea]; Rn<0:7> = 0
subop == 115    STC         M-byte[ea] = Rn<8:15>
subop == 116    push/pop    one of 8 push-down list operations, selected by R

The shift operation used the effective operand as a control word. Bits<6:7> specified the type of shift:

• Bits<6:7> == 00: arithmetic shift
• Bits<6:7> == 01: rotate through CC 0
• Bits<6:7> == 10: rotate without CC 0
• Bits<6:7> == 11: logical shift

while bits<8:15>, sign extended, controlled the direction and amount of the shift.

The push-down list operations used R12 as the push-down pointer and R13 as the push-down counter. The counter had two bytes; the left for tracking pops, the right for tracking pushes. Push and pop were defined as follows:

• void push (operand): M[R12++] = operand; R13<0:7>++; R13<8:15>--
• int16 pop (void): result = M[--R12]; R13<0:7>--; R13<8:15>++

If either half of the counter was decremented past 0, a trap occurred. This provided both overflow and underflow detection but limited the push-down list to 256 entries. The push-down list operations were:

R == 0    PUC     push but no memory store
R == 1    PUSH    push (M[ea])
R == 2    PUB     push (PC); PC = ea
R == 3    PUL     push (R2); push (PC); R2 = PC; PC = ea
R == 4    POC     pop but no memory read
R == 5    POP     M[ea] = pop ()
R == 6    POB     PC = ea + pop ()
R == 7    POL     PC = ea + pop (); R2 = pop ()
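
The push and pop primitives and the two-byte counter can be modeled in a few lines. In this sketch the trap is raised before any state changes, which is an assumption; the memos as summarized here do not say when the trap is taken. The class and attribute names are hypothetical.

```python
class PushDownTrap(Exception):
    """Raised when a counter half would be decremented past 0."""

class PushDownList:
    def __init__(self, pointer, room):
        self.memory = {}
        self.r12 = pointer   # push-down pointer
        self.entries = 0     # R13<0:7>: incremented by push, decremented by pop
        self.room = room     # R13<8:15>: decremented by push, incremented by pop

    def push(self, operand):
        if self.room == 0:
            raise PushDownTrap("overflow")
        self.memory[self.r12] = operand & 0xFFFF
        self.r12 += 1
        self.entries += 1
        self.room -= 1

    def pop(self):
        if self.entries == 0:
            raise PushDownTrap("underflow")
        self.r12 -= 1
        self.entries -= 1
        self.room += 1
        return self.memory[self.r12]

stack = PushDownList(pointer=0o1000, room=2)
stack.push(0o111)
stack.push(0o222)
print(oct(stack.pop()), oct(stack.pop()))
```

PUB and POB then reduce to push (PC) followed by a jump, and to a jump through ea + pop (), respectively.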

Other extended operations were reserved for floating point and future extensions.

I/O

The I/O architecture was fairly standard for the day. I/O devices were addressed via ports rather than memory locations. There were four basic I/O primitives:

• read status,
• read data,
• write command,
• write data,

plus acknowledge interrupt. Data transfers were 8b; a device could optionally supply 16b. Direct memory access was implemented via a medium speed multiplexor channel or a dedicated selector channel.

The ‘four primitives’ I/O model reflected current competitive practices; the same model could be found in the Interdata and 3C (later Honeywell) minicomputers. The multiplexor channel extended DEC’s existing 3-cycle data break designs; the selector channel was, from an architectural viewpoint, invisible.

The PDP-X and the Nova

The PDP-X bears little resemblance to the Nova. To list the most obvious differences:

• The PDP-X had a register-memory instruction set, the Nova had a load-store instruction set.
• The PDP-X was little-endian, the Nova was big-endian.
• The PDP-X was architected for a microcoded implementation, the Nova was architected for a hard-wired implementation.
• The PDP-X had 8 accumulators, the Nova had 4.
• The PDP-X’s accumulators could be addressed as memory locations, the Nova’s could not.
• The PDP-X had multiple register sets, the Nova did not.
• The PDP-X had variable length instructions, the Nova had fixed length instructions.
• The PDP-X had condition codes and used branches, the Nova had a single carry bit and used skips.
• The PDP-X had many specific single-register operate instructions, the Nova had eight generalized dual-register operate instructions.

Indeed, the only bit of resemblance is in the addressing modes for single-word memory reference instructions. The PDP-X’s four modes:

Opcode <=5 && displacement != 0x80
    X == 0    field 0 direct         ea = displacement
    X == 1    PC relative            ea = PC + SEXT8 (displacement)
    X == 2    R2 (link) relative     ea = R2 + SEXT8 (displacement)
    X == 3    R3 (index) relative    ea = R3 + SEXT8 (displacement)

are pretty much the same as the Nova’s (although the Nova reversed the roles of R2 and R3). However, there are also differences: the Nova provided indirect addressing for its 16b load-store instructions, the PDP-X did not.

The Nova demonstrates a substantial advance in architectural simplicity, elegance, and orthogonality over the PDP-X. Except for the loop instructions ISZ/DSZ, the Nova was a strict load-store machine, foreshadowing the later RISC processor movement. The I/O system was more flexible than the Interdata/Honeywell-like PDP-X model. The simplicity of the architecture (and the newly available S181 ALU slice) made it possible to build a system that was smaller, faster, and less expensive than the PDP-X would have been.

The PDP-X and the PDP-11

The PDP-X also has little relationship to DEC’s eventual 16b architecture, the PDP-11. To list the most obvious differences:

• The PDP-X was a multi-accumulator architecture, the PDP-11 was a general-register architecture.
• The PDP-X had a register-memory instruction set, the PDP-11 had a generalized operand instruction set.
• The PDP-X addressed memory as words, the PDP-11 addressed memory as bytes.
• The PDP-X’s accumulators could be addressed as memory locations, the PDP-11’s general registers could not.
• The PDP-X had 16b and 32b instructions, the PDP-11 had 16b, 32b, and 48b instructions.
• The PDP-X had an explicit push down list mechanism, the PDP-11 integrated stacks into the overall addressing modes.
• The PDP-X used the PC-as-general-register only to implement relative addressing, the PDP-11 used the PC as a general register in all addressing modes.
• The PDP-X had multiple register sets, the PDP-11 had only one (until the 11/45, which added a second). The PDP-X register sets were tied to the processor mode, the PDP-11’s were not.
• The PDP-X used separate instructions and addressing for devices, the PDP-11 integrated device addressing into standard addressing and used standard instructions for I/O.

There are some similarities. Both designs had a processor status word, both had branches rather than skips, and their lists of single operand instructions are similar.

Like the Nova, the PDP-11 is a substantial advance in architectural thinking over the PDP-X. The major advances:

• Generalized addressing modes integrating indexing and stack
• Generalized two operand instructions
• Use of the PC as a full general register for addressing
• Integration of I/O with memory

These advances represented a significant break with prior systems. The PDP-11 set the model for most minicomputer and microcomputer architecture of the 1970’s, and was considered the epitome of architectural ingenuity until the VAX.

Summary

The PDP-X was not the direct architectural precursor of either the Nova or the PDP-11. Indeed, its most obvious relationship is not to those systems but to contemporary competitive minicomputers. Its I/O system borrowed heavily from the Interdata and Honeywell models. The Nova abandoned all the complexity of the PDP-X; and the PDP-11 rethought it from scratch. Both proved to be major advances in computer architecture. The PDP-X, despite the nine months of hard work that went into it, was just another minicomputer.

http://simh.trailing-edge.com/docs/vax_proc.txt

VAX Processor Charts Rev 7 15-Sep-2000

1. Definitions

Integer Inst = Integer instruction group

arith/logic: ADAWI ADDB2 ADDB3 ADDL2 ADDL3 ADDW2 ADDW3 ADWC ASHL ASHQ BICB2 BICB3 BICL2 BICL3 BICW2 BICW3 BISB2 BISB3 BISL2 BISL3 BISW2 BISW3 BITB BITL BITW CLRB CLRL CLRQ CLRW CMPB CMPL CMPW CVTBL CVTBW CVTLB CVTLW CVTWB CVTWL DECB DECL DECW DIVB2 DIVB3 DIVL2 DIVL3 DIVW2 DIVW3 EDIV EMUL INCB INCL INCW MCOMB MCOML MCOMW MNEGB MNEGL MNEGW MOVB MOVL MOVQ MOVW MOVZBW MOVZBL MOVZWL MULB2 MULB3 MULL2 MULL3 MULW2 MULW3 PUSHL ROTL SBWC SUBB2 SUBB3 SUBL2 SUBL3 SUBW2 SUBW3 TSTB TSTL TSTW XORB2 XORB3 XORL2 XORL3 XORW2 XORW3

address: MOVAB MOVAL MOVAQ MOVAW PUSHAB PUSHAL PUSHAQ PUSHAW

bit field: CMPV CMPZV EXTV EXTZV FFC FFS INSV

control: ACBB ACBL ACBW AOBLEQ AOBLSS BBC BBCC BBCCI BBCS BBS BBSC BBSS BBSSI BEQ BGEQ BGEQU BGTR BGTRU BLBS BLBC BLEQ BLEQU BLSS BLSSU BNEQ BRB BRW BSBB BSBW BVC BVS CASEB CASEL CASEW JMP JSB RSB SOBGEQ SOBGTR

proc call: CALLG CALLS RET

miscellaneous: BICPSW BISPSW BPT INDEX MOVPSL NOP POPR PUSHR HALT

queue: INSQHI INSQTI INSQUE REMQHI REMQTI REMQUE

oper system: CHMK CHME CHMS CHMU HALT LDPCTX MFPR MTPR PROBER PROBEW REI SVPCTX

string: CMPC3 CMPC5 LOCC MOVC3 MOVC5 SCANC SKPC SPANC

F_flt = F_floating instruction group


ADDF2 ADDF3 CMPF CVTBF CVTFB CVTFL CVTFW CVTLF CVTRFL CVTWF DIVF2 DIVF3 MNEGF MOVF MULF2 MULF3 SUBF2 SUBF3 TSTF

D_flt = D_floating instruction group

ADDD2 ADDD3 CMPD CVTBD CVTDB CVTDF CVTDL CVTDW CVTFD CVTLD CVTRDL CVTWD DIVD2 DIVD3 MNEGD MOVD MULD2 MULD3 SUBD2 SUBD3 TSTD

G_flt = G_floating instruction group

ADDG2 ADDG3 CMPG CVTBG CVTFG CVTGB CVTGF CVTGL CVTGW CVTLG CVTRGL CVTWG DIVG2 DIVG3 MNEGG MOVG MULG2 MULG3 SUBG2 SUBG3 TSTG

H_flt = H_floating instruction group

ADDH2 ADDH3 CLRH CMPH CVTBH CVTDH CVTFH CVTGH CVTHB CVTHD CVTHF CVTHG CVTHL CVTHW CVTLH CVTRHL CVTWH DIVH2 DIVH3 MNEGH MOVAH MOVH MOVO MULH2 MULH3 PUSHAH SUBH2 SUBH3 TSTH

Decimal = Decimal instruction group

ADDP4 ADDP6 ASHP CMPP3 CMPP4 CVTLP CVTPL CVTPS CVTPT CVTSP CVTTP DIVP MOVP MULP SUBP4 SUBP6

Vectors = Vector instruction group

IOTA MFVP MTVP VGATHL VGATHQ VLDL VLDQ VSADDD VSADDF VSADDG VSADDL VSBICL VSBISL VSCATL VSCATQ VSCMPD VSCMPF VSCMPG VSCMPL VSDIVD VSDIVF VSDIVG VSMERGE VSMULD VSMULF VSMULG VSMULL VSSLLL VSSRLL VSSUBD VSSUBF VSSUBG VSSUBL VSTL VSTQ VSXORL VSYNC VVADDD VVADDF VVADDG VVADDL VVBICL VVBISL VVCMPD VVCMPF VVCMPG VVCMPL VVCVT VVDIVD VVDIVF VVDIVG VVMERGE VVMULD VVMULF VVMULG VVMULL VVSLLL VVSRLL VVSUBD VVSUBF VVSUBG VVSUBL VVXORL

Emulated fp = Emulated floating point instruction group

ACBF ACBD ACBG ACBH EMODF EMODD EMODG EMODH POLYF POLYD POLYG POLYH


Emulated st = Emulated string instruction group

CRC EDITPC MATCHC MOVTC MOVTUC

Compat mode = Compatibility mode instruction group


2. Processors

y   = implemented in hardware/microcode
n   = emulated in macrocode
opt = implemented in hardware/microcode as an option

Processor       Shipped  Process  Integer  F_flt  D_flt  G_flt  H_flt  Decimal  Vectors  Emul'td  Emul'td  Compat  Phys    Sys    TLB
(SID)                                                                                    fp       st       Mode    Addr    Space  Size

780 = 1         1977     TTL      y        y      y      opt    opt    y        n        y [1]    y        y       30b     1gb    128,2,s
750 = 2         1981     TTL GA   y        y      y      opt    opt    y        n        y [1]    y        y       24b     1gb    512,2,s
730 = 3         1983     TTL      y        y      y      y      y      y        n        y        y        y       24b     1gb    128,1,s
8600 = 4        1984     Mosaic1  y        y      y      y      y      y        n        y        y        y       30b     1gb    512,2,s
V-11 = 5        1986     ZMOS     y        y      y      y      y      y        n        y        y        n       30b     1gb    5,f,* + 512,1,s
8800 = 6,17     1986     Mosaic1  y        y      y      y      y      y        n        y        y        n       30b     1gb    1024,1,s
uVAX I/D = 7[2] 1984     NMOS     y        y      y      n      n      n        n        y [1]    n        n       22b     1gb    512,1,s
uVAX I/G = 7[2] 1984     NMOS     y        y      n      y      n      n        n        y [1]    n        n       22b     1gb    512,1,s
uVAX II = 8     1985     ZMOS     y [3]    y      y      y      n      n        n        y [1]    n        n       30b     1gb    8,f,u
CVAX = 10       1987     CMOS-1   y        y      y      y      n      n        n        y [1]    n        n       30b     1gb    28,f,u
(CVAX+ shrink)  1988     CMOS-2
Rigel = 11      1989     CMOS-2   y        y      y      y      n      n        opt      n        n        n       30b     1gb    64,f,u
9000 = 14       1990     Mosaic3  y        y      y      y      y      y        opt      n        n        n       30b     1gb    1024,1,s
Mariah = 18     1990     CMOS-3   y        y      y      y      n      n        opt      n        n        n       30/32b  1gb    64,f,u
NVAX = 19       1991     CMOS-4   y        y      y      y      n      n        n        n        n        n       30/32b  2gb    96,f,u
SOC = 20        1990     CMOS-3   y        y      y      y      n      n        n        y [1]    n        n       30b     1gb    28,f,u
NVAX+ = 23      1992     CMOS-4   y        y      y      y      n      n        n        n        n        n       30/32b  2gb    96,f,u
(NVAX++ shrink) 1994     CMOS-5

Notes:

[1] Only for supported floating point data types.

[2] MicroVAX 1's were available with either D_floating or G_floating, but not both. 90% of all sold implemented D_floating.

[3] CMPC3, CMPC5, LOCC, SKPC, SCANC, SPANC were emulated.

Translation lookaside buffer (TLB) size is: number of entries, associativity, organization --

associativity:  1 = direct map
                2 = 2 way associative
                f = fully associative


organization:   s = split (half system, half process)
                u = unified
                * = V-11 mini-TLB, 1 instruction, 4 data
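The TLB column notation can be decoded mechanically. The following is a minimal Python sketch, assuming each chart entry is one or more comma-separated triples joined by "+"; the names `TlbSpec` and `parse_tlb` are hypothetical and are not part of the chart.

```python
from dataclasses import dataclass
from typing import List

# Legend taken from the chart notes.
ASSOCIATIVITY = {"1": "direct map", "2": "2 way associative", "f": "fully associative"}
ORGANIZATION = {
    "s": "split (half system, half process)",
    "u": "unified",
    "*": "V-11 mini-TLB, 1 instruction, 4 data",
}


@dataclass
class TlbSpec:
    entries: int
    associativity: str
    organization: str


def parse_tlb(field: str) -> List[TlbSpec]:
    """Parse a chart entry such as '128,2,s' or '5,f,* + 512,1,s'."""
    specs = []
    for part in field.split("+"):
        entries, assoc, org = (item.strip() for item in part.split(","))
        specs.append(TlbSpec(int(entries), ASSOCIATIVITY[assoc], ORGANIZATION[org]))
    return specs
```

For example, `parse_tlb("5,f,* + 512,1,s")` yields two records: the V-11 mini-TLB and its 512 entry direct mapped split TLB.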

Processes --

Mosaic1   3u ECL gate arrays
Mosaic2   1u ECL gate arrays
ZMOS      3u NMOS, 2 metal layers
CMOS-1    2u CMOS, 2 metal layers
CMOS-2    1.5u CMOS, 2 metal layers
CMOS-3    1u CMOS, 3 metal layers
CMOS-4    .75u CMOS, 3 metal layers
CMOS-5    .5u CMOS, 4 metal layers


3. Processors and Models

Processor Used in Models                            Cycle   Primary   Secondary
                                                    Time    Cache     Cache

780       VAX-11/780, VAX-11/782                    200ns   8KB       -

750       VAX-11/750                                320ns   4KB       -

730       VAX-11/730, VAX-11/725                    270ns   -         -

785       VAX-11/785                                133ns   32KB      -

8600      VAX 8600                                  80ns    16KB      -
8650      VAX 8650                                  55ns    16KB      -

V-11      VAX 8200, VAX 8300, VAXstation 8000       200ns   8KB       -
          VAX 8250, VAX 8350                        160ns   8KB       -

8800      VAX 8500, VAX 8530, VAX 8550, VAX 8700,   45ns    64KB      -
          VAX 88xx
uVAX I    MicroVAX I, VAXstation I                  250ns   8KB       -
uVAX II   MicroVAX II, MicroVAX 2000,               200ns   -         -
          VAXstation II, VAXstation II/GPX,
          VAXstation 2000

CVAX      MicroVAX 3300/3400                        100ns   1KB       -
          MicroVAX 3500/3600,                       90ns    1KB       64KB
          VAXstation 3200, VAXstation 3500
          MicroVAX 3100-10/20                       90ns    1KB       -
          VAXstation 3100-30/40                     90ns    1KB       32KB
          VAXstation 3520/3540                      80ns    1KB       64KB
          VAX 6000-2XX                              80ns    1KB       256KB

CVAX+     MicroVAX 3800/3900                        60ns    1KB       64KB
          MicroVAX 3100-10e/20e                     60ns    1KB       -
          VAXstation 3100-38/48                     60ns    1KB       32KB
          VAX 6000-3XX                              60ns    1KB       256KB
          VAXft 110, 310                            60ns    1KB       32KB

Rigel     VAX 4000-300                              28ns    2KB       128KB
          VAXstation 3100-76                        28ns    2KB       128KB
          VAX 6000-4XX                              28ns    2KB       128KB

Mariah    MicroVAX 3100-80                          20ns    2KB       256KB
          VAXstation 4000-60                        18ns    2KB       256KB
          VAX 6000-5XX                              16ns    2KB       512KB

SOC       VAX 4000-200                              35ns    6KB       -
          MicroVAX 3100-40                          40ns    6KB       -
          VAXstation 4000-VLC                       40ns    6KB       -
          VAXft 410, 610                            35ns    6KB       128KB

9000      VAX 9000-110,210,3XX,4XX                  16ns    8+128KB   -

NVAX      VAX 4000-100                              14ns    2+8KB     128KB
          VAX 4000-105A                             12ns    2+8KB     128KB
          VAX 4000-400                              16ns    0+8KB     128KB
          VAX 4000-500 and -500A                    14ns    2+8KB     128KB
          VAX 4000-600 and -600A and -505A          12ns    2+8KB     512KB
          VAX 4000-700A                             10ns    2+8KB     2MB
          VAX 4000-705A                             9ns     2+8KB     2MB
          MicroVAX 3100-85                          16ns    0+8KB     128KB
          MicroVAX 3100-90                          14ns    2+8KB     128KB
          MicroVAX 3100-95                          12ns    2+8KB     512KB
          VAXstation 4000-90                        14ns    2+8KB     256KB
          VAXstation 4000-90A                       12ns    2+8KB     256KB
          VAX 6000-6XX                              12ns    2+8KB     2MB

NVAX+     VAX 7000-6XX, 10000-6XX                   11ns    2+8KB     4MB
          VAXft 810                                 12ns    2+8KB     512KB

NVAX++    VAX 7000-7XX, 10000-7XX                   7.5ns   2+8KB     4MB

Caches with separate instruction and data spaces are indicated as I+D.

VLSI VAX Micro-architecture

May 1988

Bob Supnik

For Internal Use Only
Semiconductor Engineering Group

Contents

• The macro-architectural issues
• The canonical micro-architectural model
• MicroVAX
• Improving performance
• CVAX and Rigel
• Summary

The VAX Architecture

• The VAX architecture is a complex instruction set computer (CISC) characterized by:
  – Irregular instruction format (1 to 50+ bytes)
  – Large instruction set (304 instructions)
  – Multiple addressing modes (21)
  – Demand paged memory management
  – Few hardware limitations on software
• TTL/ECL implementations have typically been characterized by:
  – Complex microcode-based control
  – Large control store (400k bits to 2200k bits)
  – Redundant facilities (microcoded and hardware floating point)
  – Inbuilt console I/O
  – Complex memory subsystem (large TB, large cache)
• These implementation characteristics pose severe problems for a single chip VLSI implementation.

The MicroVAX Subset

• Generically, the MicroVAX subset is a set of hardware/ microcode/ software/ performance tradeoffs intended to facilitate VLSI implementation.
• Firmware to software tradeoffs:
  – 59 instructions implemented in macrocode rather than microcode: character string, decimal, EDITPC, CRC, octaword, h-floating
  – Console implemented in macrocode rather than microcode
• Firmware to hardware tradeoffs:
  – Hardware floating point only
• Performance tradeoffs:
  – Small translation buffer, fully associative, fast replacement
  – No cache or small cache

The Canonical VAX Micro-model

• Most VAX implementations, including the 78X, 750, 730, V-11, MicroVAX, CVAX, Rigel, and Nautilus, have the same basic block structure:
  – I(nstruction) box
  – E(xecution) box
  – M(emory) box
  – Microsequencer/ control store
  – Bus interface unit
  – Interrupts
  – Memory subsystem
  – Console subsystem

IBox

• Parses and decodes instruction stream using internal state and prefetch queue data fetched by the M Box and BIU.
• Gives “VAXness” to the rest of the chip by directing the Microsequencer through specifier evaluation and instruction execution.
• Supplies parameters to specifier evaluation.
• Formats I-stream data for specifier evaluation and instruction execution.

EBox

• Contains main execution data path:
  – Register file
  – ALU and shifter
• Under microcode control, performs:
  – Specifier evaluation
  – Instruction execution
  – Interrupts and exceptions
  – Memory management processing
• Maintains PC, backup PC, PSL, GPRs, RLOG, and other architecturally specified state.

MBox

• Performs address translation and access checking.
• Decodes and initiates memory references, TB accesses.
• Maintains address registers.
• Performs instruction prefetching when idle.

Microsequencer/ Control Store

• Forms next micro-word address and performs micro-word sequencing and access.
• Decodes and selects micro-branch conditions.
• Evaluates requests and initiates micro-traps.
• Maintains micro-stack and pointer.

BIU

• Controls DAL and other external interface pins.
• Controls DAL latches and rotators for proper positioning and formatting of incoming and outgoing data.
• Cooperates with M Box in processing of unaligned data.
• Provides autonomous operation on selected I/O functions.

Other

• Interrupt section responds to external hardware and internal software interrupt requests.
• Memory subsystem provides connection of processor to external storage (memory and I/O).
• Console subsystem provides diagnostic and control interface to entire system.
• Note: Console subsystem is external in VLSI VAXen and will not be discussed.

Canonical VAX Problems

• All VAX implementations must wrestle with thorny implementation problems posed by the architecture, including:
  – Variable length instructions
  – Unaligned data
  – Virtual memory management
  – Instructions with multiple destinations
  – Instructions with complex algorithms
  – Exceptions
  – Clocking and stalls
• It is interesting to note that there is no reasonable relationship between the difficulty of implementing a feature and its importance.

MicroVAX Overview

• MicroVAX was the first single chip implementation of the VAX. Its characteristics included:
  – Single chip MicroVAX subset CPU (175 instructions) plus companion floating point unit (70 instructions)
  – ZMOS process (3u drawn, NMOS, double level metal)
  – 125,000 transistors, 353 mils x 358 mils
  – 200ns microcycle, 400ns I/O cycle
  – 8 entry TB, no cache
  – 1600 x 39b control store
  – PG - 2/84, LR - 3/85, FRS - 5/85
• MicroVAX implemented a simple external interface:
  – Multiplexed data and address bus (DAL)
  – Address and data strobes (AS, DS)
  – Byte masks for masked writes (BM<3:0>)
  – Cycle status for I/O differentiation (CS<2:0>,WR)
  – DMA request and grant (DMR, DMG)
• MicroVAX drew heavily on the only VLSI full VAX (V-11) for its micro-architectural inspiration (E Box, micro-word, clocking).

MicroVAX I Box

• Principal function: prefetch, parse, and decode instructions.
• Prefetch queue:
  – 8 bytes (2 aligned longwords)
  – Maximum of 4 bytes retired per microcycle
• Instruction data register:
  – Data link from I Box to E Box
  – Automatically loaded for simple conditional branches, byte/ word displacements, short literals
  – Manually loaded for complex conditional branches, longword displacements, immediates, absolute addresses
• Decode logic:
  – Decode PLAs for exceptions, instructions, specifiers
  – Initial decode PLA (IPLA) to supply instruction parameters to specifier flows
• Control registers:
  – Opcode register
  – Access type/ data length register
  – Current GPR register

MicroVAX Instruction Decode

            initial instruction decode
                        |
         +--------------+--------------+
         |              |              |
     exception        direct         first
     dispatch       execution      specifier
                    dispatch         decode
                                       |
                              next specifier decode
                                       |
                                +------+------+
                                |             |
                            execution      second
                            dispatch      specifier
                                            decode
                                              |
                                      execution dispatch

MicroVAX I Box Dispatches

• Exception dispatches:
  – VAX trap (divide by zero, subscript)
  – Interrupt
  – Trace trap
  – Prefetch exception (no data available)
  – Note: Trap, interrupt dispatch inhibit T bit update
  – Note: Initial instruction decode (IID) can only happen once
• Direct execution dispatches:
  – Instructions with no specifiers and simple branches
  – FD prefix
  – PSL set
  – Note: FD and FPD interaction

I Box Dispatches, continued

• Specifier dispatches:
  – Short literal
  – Register
  – Indexed
  – Register deferred
  – Autodecrement
  – Autoincrement
  – Immediate
  – Autoincrement deferred
  – Absolute
  – Byte/ word displacement and relative
  – Byte/ word displacement and relative deferred
  – Longword displacement and relative
  – Longword displacement and relative deferred
  – Note: Separate longword dispatch due to 4 bytes per cycle limit on prefetch queue

MicroVAX Decode Flow

EXAMPLE: ADDL3 R4, l^disp(R5), (R6)+

initial instruction decode --> first spec decode

    W[0] := GPR[Rn]             ! Flow for first spec
    next specifier decode --> second specifier decode

    IDR := IB.LONG and case     ! Flow for second spec
    VA := GPR[Rn] + IDR
    W[2] := MEM(VA)
    next specifier decode --> execution dispatch

    W[0] := W[0] + W[2]         ! Execution microcode
    dispatch to write destination

    VA := GPR[Rn]               ! Flow for third spec
    GPR[Rn] := GPR[Rn] + 4
    MEM(VA) := W[0]             ! End of instruction
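The data movement in the flow above can be traced in software. The following is a minimal Python sketch, assuming registers and memory are modelled as plain dictionaries and ignoring condition codes, prefetch, and memory management; the function name `addl3_flow` is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # registers and data paths are 32 bits wide


def addl3_flow(gpr, mem, disp):
    """Trace ADDL3 R4, l^disp(R5), (R6)+ through the three specifier flows.

    gpr maps register number to value, mem maps longword address to value.
    Both dictionaries are illustrative stand-ins, not a model of the chip.
    """
    w = {}  # microcode working registers, indexed as W[n]

    # First specifier: register mode, Rn = 4.
    w[0] = gpr[4]

    # Second specifier: longword displacement, Rn = 5.
    idr = disp & MASK32             # IDR := IB.LONG
    va = (gpr[5] + idr) & MASK32    # VA := GPR[Rn] + IDR
    w[2] = mem[va]                  # W[2] := MEM(VA)

    # Execution microcode.
    w[0] = (w[0] + w[2]) & MASK32   # W[0] := W[0] + W[2]

    # Third specifier: autoincrement, Rn = 6.
    va = gpr[6]                     # VA := GPR[Rn]
    gpr[6] = (gpr[6] + 4) & MASK32  # GPR[Rn] := GPR[Rn] + 4
    mem[va] = w[0]                  # MEM(VA) := W[0]
    return w[0]
```

With R4 = 10, R5 = 0x1000, R6 = 0x2000, a displacement of 0x10, and 32 stored at 0x1010, the sketch writes 42 to address 0x2000 and leaves R6 at 0x2004.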

MicroVAX E Box

• Principal function: Execute VAX macro-instructions.
• Register file:
  – 15 single ported general purpose registers (GPR’s): R0 - R14
  – 12 single ported temporary registers (T’s): IS, P0BR, P1BR, SBR, SISR, PSL, etc.
  – 7 dual ported working registers (W’s): microcode temporaries
• Program counter:
  – Architectural PC (R15)
  – Backup PC, loaded at IID from PC, for exception recovery
  – PC adder, for incrementing PC during instruction parse
• Constant generator:
  – Literal constants from micro-word
  – Fixed constants (0, 1, 4)
  – State dependent constants (KDL, SEXTN)

MicroVAX E Box, continued

• SC/Q register:
  – Working register (W7)
  – Shift counter to control barrel shifter
  – Multiplier/quotient register
  – Case generation register
• Arithmetic/ logical unit (ALU):
  – 32b arithmetic and logic function unit
  – Condition code outputs for 8b, 16b, 32b results
• Barrel shifter:
  – 64b in, 32b out funnel shifter
  – Right shift in hardware
  – Left shift by ‘32-n’ right shift
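The "left shift by 32-n right shift" trick follows from the funnel shifter's 64b input. The following is a minimal Python sketch of the idea, assuming the value to be shifted left is placed in the high half of the input with zeros below it; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift_right(high, low, n):
    """Select 32 bits from the 64-bit value high:low, shifted right by n (0..32)."""
    if not 0 <= n <= 32:
        raise ValueError("shift count must be between 0 and 32")
    wide = ((high & MASK32) << 32) | (low & MASK32)
    return (wide >> n) & MASK32


def shift_left(value, n):
    """Left shift by n, expressed as a right shift by 32 - n.

    The value sits in the high longword and zeros sit in the low longword,
    so shifting the pair right by 32 - n leaves value << n in the low 32 bits.
    """
    return funnel_shift_right(value, 0, 32 - n)
```

Only the right shift needs hardware in this arrangement; the shift count for a left shift is computed before the shifter is used.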

MicroVAX E Box, continued

• Condition code logic:
  – ‘Raw’ ALU condition codes for microcode testing
  – Architecturally defined PSL condition codes
  – Instruction specified condition code ‘map’ raw condition codes to architecturally specified condition codes
  – Override for developing multi-word condition codes
• Conditional branch logic:
  – Maps opcode against ALU/ PSL condition codes
  – Generates ‘branch taken’ for conditional update of PC
• State logic:
  – Microcode settable/ testable flags
  – Half are global, half are cleared at IID
• Register logging stack:
  – Records autoincrement/ autodecrement modifications to GPRs
  – Used in exception recovery
  – Cleared at IID
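The register logging stack can be pictured as a small undo log. The following is a minimal Python sketch, assuming each entry records a register number and the signed amount by which it was changed; the class name `RegisterLog` and the entry format are hypothetical, since the slides do not give the hardware encoding.

```python
MASK32 = 0xFFFFFFFF


class RegisterLog:
    """Undo log for autoincrement/ autodecrement side effects on GPRs."""

    def __init__(self):
        self.entries = []

    def clear(self):
        """Called at initial instruction decode (IID)."""
        self.entries.clear()

    def autoincrement(self, gpr, reg, size):
        """Return the operand address, then bump the register and log it."""
        address = gpr[reg]
        gpr[reg] = (gpr[reg] + size) & MASK32
        self.entries.append((reg, size))
        return address

    def autodecrement(self, gpr, reg, size):
        """Bump the register down, log it, and return the operand address."""
        gpr[reg] = (gpr[reg] - size) & MASK32
        self.entries.append((reg, -size))
        return gpr[reg]

    def unwind(self, gpr):
        """On an exception, undo the logged changes in reverse order."""
        while self.entries:
            reg, delta = self.entries.pop()
            gpr[reg] = (gpr[reg] - delta) & MASK32
```

After `unwind`, the GPRs hold the values they had at the start of the instruction, so the instruction can be restarted from the backup PC.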

MicroVAX M Box

• Principal function: Address translation and external I/O.
• Address registers:
  – VA - address register for data
  – VA’ - backup address register for data, autoincrements
  – VIBA - address register for instructions, autoincrements
• Length check logic:
  – SLR, P0LR, P1LR - architecturally specified length registers
  – Length comparator
  – Status output - only tested on TB miss
• Translation buffer:
  – Tag store (CAM) looks up addresses, fully associative
  – Data store (PTEs) holds corresponding PTEs
  – Management algorithm is true LRU
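A fully associative translation buffer with true LRU replacement behaves like a small ordered map. The following is a minimal Python sketch, assuming the tag is a virtual page number and the data is an opaque PTE value; the class name `TranslationBuffer` is hypothetical, and the default size of 8 matches the MicroVAX TB.

```python
from collections import OrderedDict


class TranslationBuffer:
    """Fully associative TB with true LRU replacement (software model only)."""

    def __init__(self, size=8):
        self.size = size
        self.entries = OrderedDict()  # virtual page number -> PTE, oldest first

    def lookup(self, vpn):
        """Return the cached PTE, or None on a TB miss."""
        if vpn not in self.entries:
            return None
        self.entries.move_to_end(vpn)  # mark as most recently used
        return self.entries[vpn]

    def fill(self, vpn, pte):
        """Install a PTE after a miss, evicting the least recently used entry."""
        if vpn in self.entries:
            self.entries.move_to_end(vpn)
        elif len(self.entries) >= self.size:
            self.entries.popitem(last=False)
        self.entries[vpn] = pte
```

Every hit reorders the entries, which is what makes the policy true LRU rather than an approximation.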

MicroVAX M Box, continued

• Access check logic:
  – Validity check (PTE.V # 0) and micro-trap
  – Access (privilege) check and micro-trap
  – M = 0 check and micro-trap
  – Note: probe vs memory request
  – Note: read vs read check, write vs write check
• Unaligned logic:
  – Checks for data transfer across longword boundary
  – Breaks transfer into two transfers with proper data rotation and latching
• Cross page logic:
  – Checks for data transfer across page boundary
  – Initiates micro-trap for proper access check
• Data length logic - drives data length on DAL.
• Micro-trap and abort logic.
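The unaligned and cross page checks reduce to simple address arithmetic. The following is a minimal Python sketch, assuming transfers of 1 to 4 bytes and the 512 byte VAX page; the function names are hypothetical, and data rotation is not modelled.

```python
def split_transfer(address, length):
    """Return the (address, byte count) pieces of a 1 to 4 byte transfer.

    A transfer that crosses an aligned longword boundary is broken in two.
    """
    if not 1 <= length <= 4:
        raise ValueError("length must be 1 to 4 bytes")
    room = 4 - (address & 3)  # bytes left in the current aligned longword
    if length <= room:
        return [(address, length)]
    return [(address, room), (address + room, length - room)]


def crosses_page(address, length, page_size=512):
    """True when the first and last bytes of the transfer lie on different pages."""
    return address // page_size != (address + length - 1) // page_size
```

A longword written at address 0x1002 becomes two transfers of two bytes each; the same longword at 0x1FE also crosses a page, which is the case that needs the second access check.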

Control Store

• Principal function: control memory for chip.
• 1600 words x 39b control store.
• 25b of data path control, in nine formats:
  – Basic (ALU)
  – Shift (shifter)
  – Constant (ALU + microcode constant)
  – Special (state twiddling)
  – Mem req (external I/O)
  – MXPR (internal I/O)
  – F Box transfer (FPU I/O)
  – F Box execute (not used)
  – Spare (integer multiply/divide)
  Too many! Decoding is a nightmare.
• 14b of sequencing control, in two formats:
  – Jump
  – Branch (conditional or case)

Microsequencer

• Principal function: sequence access of micro-words from control store.
• Provides multiple access modes:
  – Absolute next address
  – Relative next address (signed offset)
  – Sequential next address (micro-PC + 1)
  – Conditional branch
  – Case branch
  – Micro-subroutine and return
  – Externally generated address (test mode)
• Maintains micro-PC (11b), micro-stack (8 entries).
• Mediates and generates micro-traps:
  – M Box - TB miss, ACV/TNV, M = 0, cross page
  – E Box - integer overflow
  – I Box - reserved opcode
  – BIU - floating point error, DAL error

BIU

• Principal function: control external I/O.
• Sequences external I/O functions:
  – Data and interrupt vector read, instruction prefetch
  – Data write with overlap (write and run)
  – FPU transfer
  – DMA request and grant
• Controls data formatting:
  – Write data rotators
  – Read data rotators, latches, zero extender
  – Byte mask pins

Interrupts and Clocks

• Interrupt logic mediates external and internal interrupts:
  – External hardwired interrupts - HALT, PWRFL
  – External vectored interrupts - IRQ<3:0> = IPL<17:14>
  – Interval timer interrupt and disable flag - ICCS<6>
  Internal software interrupts - SISR<15:1> = IPL<0F:01> - are implemented entirely in microcode.
• Clock logic provides master clocks for all chip logic:
  – Divide by two logic for internal master clock
  – Clock generators for 8 two phase internal clocks
  – Reset and synchronization logic
  Too complicated!

Improving Performance

• MicroVAX, like the 11/780, runs at about 500,000 VAX instructions per second:
  – Average 10 microcycles per macro-instruction
  – 200ns microcycles
  – Average macro-instruction is 2.0 microseconds
• To improve performance, there are two, and only two, techniques that can be tried:
  – Shorten the microcycle
    ∗ by improving technology
    ∗ by pipelining micro-instructions
  – Reduce the number of microcycles (ticks) per instruction (tpi)
    ∗ by improved macro-level parallelism
    ∗ by piecemeal improvement
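The 500,000 instructions per second figure is the product of the two quantities the slide names, and the same arithmetic shows why only those two levers exist. The following is a minimal Python sketch; the function name is hypothetical.

```python
def instructions_per_second(ticks_per_instruction, microcycle_ns):
    """Macro-instruction rate from ticks per instruction (tpi) and microcycle time."""
    instruction_ns = ticks_per_instruction * microcycle_ns
    return 1e9 / instruction_ns


# MicroVAX: 10 ticks x 200ns = 2000ns per macro-instruction.
microvax_rate = instructions_per_second(10, 200)

# Halving either factor doubles the rate; nothing else appears in the formula.
faster_cycle_rate = instructions_per_second(10, 100)
lower_tpi_rate = instructions_per_second(5, 200)
```

Here `microvax_rate` evaluates to 500,000 and both variations to 1,000,000, under the simplifying assumption that the average tpi does not change when the microcycle does.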

Faster Microcycles: Technology

• There are four critical loops in a VAX implementation:
  – The E Box loop (register read, ALU, register write)
  – The I Box loop (data in, decode, micro-address out)
  – The Microsequencer loop (control store access, next address decode)
  – The TB/cache loop (address out, translation, access, data in)
• In MicroVAX, each of these loops is balanced around a 200ns period.
• Each generation of technology provides approximately 30% faster gates.
• Therefore, successive generations of VLSI VAXen can speed up by 30% on technology alone.

Can’t we do better than that?

Faster Microcycles: Pipelining

By pipelining the E Box microcycle, micro-instruction throughput can be dramatically increased, thereby reducing the apparent microcycle time.

unfolded (1X):

  read   ALU    write
+------+------+------+

half folded (1.5X):

  read   ALU    write
+------+------+------+
                read   ALU    write
              +------+------+------+

fully folded (3X):

  read   ALU    write
+------+------+------+
         read   ALU    write
       +------+------+------+
                read   ALU    write
              +------+------+------+
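The 1X, 1.5X, and 3X multipliers follow from how often a new micro-instruction can start. The following is a minimal Python sketch, assuming three equal length segments (read, ALU, write) and no stalls; the function names and the `issue_interval` parameter are hypothetical.

```python
def total_segments(instructions, issue_interval, depth=3):
    """Segment times needed to finish a run of micro-instructions.

    issue_interval is the number of segment times between successive starts:
    3 when unfolded, 2 when half folded, 1 when fully folded.
    """
    return depth + (instructions - 1) * issue_interval


def throughput_gain(issue_interval, depth=3):
    """Steady state speedup relative to the unfolded microcycle."""
    return depth / issue_interval
```

For a run of 100 micro-instructions the totals are 300, 201, and 102 segment times, which approach the ideal 1X, 1.5X, and 3X ratios as the run gets longer.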

Micro-pipelining, continued

• Micro-pipelining impacts entire micro-architecture:
  – I Box must be pipelined to meet apparent faster microcycle
  – Microsequencer and control store must get faster to meet apparent faster microcycle
  – TB/cache must get faster to meet apparent faster microcycle
  – Control becomes much more complex throughout due to formal pipeline controls, stalls, etc
  – Microcode becomes much more complex due to longer micro-branch latencies, pipeline side effects, etc
• Micro-pipelining is not a perfect win:
  – Segments are not equal length, effective microcycle time determined by longest segment
  – Pipeline introduces some inefficiencies and stalls
• Micro-pipelining provides the biggest ‘multiplier’ for improving VAX performance; but where do we go after fully folding?

Reduced TPI: Pipelining

• The high TPI of most VAXen is due to two primary factors:
  – Serial decoding of specifiers
  – Lengthy execution times of complex instructions (CALLx, RET, etc)
• Increasing macro-level parallelism could reduce apparent TPI by:
  – Parallel decoding of multiple specifiers, or
  – Overlap of specifier decoding with instruction execution
• However, the VAX architecture is highly resistant to macro-level parallelism:
  – Variable length specifiers make parallel decoding of specifiers difficult and expensive
  – Interlocks within and between instructions make overlap of specifiers with instruction execution difficult and expensive
• Most (but not all) VAX architects feel that the costs of macro-level parallelism outweigh the benefits; hence, this approach is not being actively pursued.

Reduced TPI: Nibbling

• If we cannot get a radical reduction in TPI, we can nonetheless get small reductions via piecemeal improvements to the micro-architecture.
• One area for improvement is the memory subsystem. Improvements can include:
  – Enlarged translation buffer
  – On chip cache
  – Multi-level cache
  – Multi-word I/O
  – Write and run (write pipelining)
  – Multiple write buffers
  – Read and run (read pipelining)
  – Hits under misses
• Other areas for improvement:
  – Optimized (via special case) specifier decoding
  – Better hardware support or microcode algorithms for long instructions

CVAX

• Second generation VLSI VAX single chip microprocessor:
  – MicroVAX subset CPU (175 instructions) plus companion floating point unit (70 instructions)
  – CMOS-1 process (2u drawn, CMOS, double level metal)
  – 175,000 transistors, 390 mils x 375 mils
  – 80ns - 100ns microcycle, 160ns - 200ns I/O cycle
  – 28 entry TB, 1kb cache
  – 1600 x 41b control store
• Performance goal is 2.5X - 3.0X current generation:
  – 1.5X from technology improvements
  – 1.5X from micro-architectural pipelining
  – Remainder from improved memory subsystem

CVAX, continued

• Faster microcycle – technology:
  – CMOS-1 process substantially faster than ZMOS (2ns representative gate delay vs 3ns)
  – Lower power permits fuller use of large devices for speed-critical paths
• Faster microcycle – micro-pipelining:
  – Half folded micro-pipeline
  – Register file writes through, thereby allowing writes under reads with no explicit bypass logic
  – RAS/CAS addressing of control store provides same micro-branch latency as in MicroVAX (one cycle)
  – Pipeline in I Box adds one cycle to macro-branch latency
• Reduced TPI – better memory subsystem:
  – Enlarged TB (28 entries vs 8) for reduced misses
  – On chip single cycle cache (1kb, two way associative, 8 byte block)
  – Off chip two cycle cache (64kb+, direct map)
  – Multi-word read for on chip cache fill

CVAX, continued

• CVAX I Box is based on Nautilus rather than 780:
  – I Box is an autonomous state machine which parses the instruction stream based on its own state data
  – I Box parses all specifiers using one generic (parameterized) set of specifier flows
  – I Box and E Box are synchronized by a single directive, DECODER NEXT
  – Prefetch queue is 12 bytes (3 aligned longwords), allowing retirement of up to 6 bytes per microcycle
  – Instruction data register automatically loaded in most cases (only immediates and complex branch displacements are done manually)
• CVAX E Box implements half folded micro-pipeline:
  – All registers have extra (write) port
  – Writes are executed under reads, with bypass through the register file
  – 4 extra T registers for per process stack pointers
  – SC and Q are separate registers
  – PSL is maintained in hardware
• CVAX M Box is like MicroVAX:
  – 28 TB entries
  – Not last used (NLU) replacement algorithm
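Not last used replacement is cheaper than true LRU because it needs only one pointer rather than a full usage ordering. The following is a minimal Python sketch of one way such a scheme can work, assuming a single replacement pointer that is moved on whenever it points at the entry just used; the slides do not describe the actual CVAX mechanism, so the class `NluBuffer` and its details are illustrative only.

```python
class NluBuffer:
    """Fully associative buffer with a not-last-used replacement pointer."""

    def __init__(self, size=28):
        self.tags = [None] * size
        self.victim = 0  # entry that the next fill will replace

    def _advance(self):
        self.victim = (self.victim + 1) % len(self.tags)

    def lookup(self, vpn):
        """Return True on a hit; keep the pointer off the entry just used."""
        if vpn not in self.tags:
            return False
        if self.tags.index(vpn) == self.victim:
            self._advance()
        return True

    def fill(self, vpn):
        """Replace the entry under the pointer and return its slot number."""
        slot = self.victim
        self.tags[slot] = vpn
        self._advance()
        return slot
```

The only guarantee is that the most recently used entry is never the next one replaced, which is weaker than LRU but needs no per-entry age tracking.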

CVAX, continued

• CVAX control store and Microsequencer are simplified:
  – 1600 x 41b control store
  – Five (rather than nine) data path formats
  – Two sequencing formats
  – Paged rather than signed displacement addressing
  – Case rather than conditional branching
  – 8 way rather than 16 way cases
• CVAX BIU provides increased flexibility:
  – On chip 1kb single cycle cache
  – Multi-word reads for cache fills
  – Externally requested cycle retry
  – Optional data parity
  – Much more efficient FPA protocol
• Improvements in interrupts and clocking:
  – Two more hardwired interrupts (CRD, MEMERR)
  – Partial hardware implementation of software interrupts
  – Externally generated four phase overlapping clocks

CVAX Decode Flow

EXAMPLE: ADDL3 R4, l^disp(R5), (R6)+

decoder next --> specifier decode

    W[Sn] := GPR[Rn]            ! Flow for specifier
                                ! Sn = 0, Rn = 4
    decoder next --> specifier decode (IDR loaded)

    VA := GPR[Rn] + IDR         ! Flow for specifier
    W[Sn] := MEM(VA)            ! Sn = 2, Rn = 5
    decoder next --> specifier decode

    VA,W[Sn] := GPR[Rn]         ! Flow for specifier
    GPR[Rn] := GPR[Rn] + 4      ! Sn = 4, Rn = 6
    decoder next --> execution dispatch

    W[0] := W[0] + W[2]         ! Execution
    MEM(VA) := W[0]             ! End of instruction

Rigel

• Third generation VLSI VAX single chip microprocessor:
  – MicroVAX subset CPU (175 instructions) plus companion floating point unit (70 instructions)
  – CMOS-2 process (1.5u drawn, CMOS, double level metal)
  – 325,000 transistors, tbd mils x tbd mils
  – 30ns - 40ns microcycle, 90ns - 120ns I/O cycle
  – 64 entry TB, 2kb cache
  – 1700 x 50b control store
• Performance goal is 6X - 8X current generation:
  – 2X from technology improvements
  – 3X from micro-architectural pipelining
  – Remainder from improved memory subsystem
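The "remainder from improved memory subsystem" in the CVAX and Rigel performance goals can be estimated by dividing out the other two factors. The following is a minimal Python sketch, assuming the three contributions multiply, which the slides imply but do not state; the function name is hypothetical.

```python
def memory_factor_needed(goal, technology, pipelining):
    """Speedup the memory subsystem must supply to reach the overall goal."""
    return goal / (technology * pipelining)


# CVAX: 1.5X technology, 1.5X pipelining, goal of 2.5X to 3.0X.
cvax_low = memory_factor_needed(2.5, 1.5, 1.5)
cvax_high = memory_factor_needed(3.0, 1.5, 1.5)

# Rigel: 2X technology, 3X pipelining, goal of 6X to 8X.
rigel_low = memory_factor_needed(6.0, 2.0, 3.0)
rigel_high = memory_factor_needed(8.0, 2.0, 3.0)
```

Under that assumption the memory subsystem has to contribute roughly 1.1X to 1.3X for CVAX and 1.0X to 1.3X for Rigel.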

Rigel

• Faster microcycle – technology:
  – CMOS-2 process substantially faster than ZMOS (1.5ns representative gate delay vs 3ns)
  – Lower power permits fuller use of large devices for speed-critical paths
• Faster microcycle – micro-pipelining:
  – Fully folded micro-pipeline
  – Register file writes through, thereby allowing writes under reads with just one level of explicit bypass logic
  – Micro-branch latency increases to three cycles
  – Pipeline in I Box adds yet another cycle to macro-branch latency
• Reduced TPI – better memory subsystem:
  – Enlarged TB (64 entries) for reduced misses
  – On chip single cycle cache (2kb, direct map, 8 byte block)
  – Off chip three cycle cache (128kb, direct map, 16 byte fill size, 64 byte block size)
  – Multi-word read for all cache fills
  – Multi-word writes for burst output situations
  – Read and run pipeline

Rigel, continued

• Rigel I Box is based on CVAX/ Nautilus rather than 780:
  – I Box is an autonomous state machine which parses the instruction stream based on its own state data
  – I Box parses all specifiers using one generic (parameterized) set of specifier flows
  – I Box and E Box are synchronized by a single directive, DECODER NEXT
  – Prefetch queue is 16 bytes (4 aligned longwords), allowing retirement of up to 10 bytes per microcycle
  – Instruction data register automatically loaded in all cases
• Rigel E Box implements fully folded micro-pipeline, plus read pipelining:
  – All registers have extra (write) port
  – MD (working) registers have second write port plus valid bits for synchronization
  – Bypass around ALU/ shifter and through register file
  – 8 extra T registers for per process stack pointers and memory management length registers
  – MD7 is separate register, SC and Q are again combined
  – PSL is maintained in hardware
• Rigel M Box is simplified:
  – 64 TB entries
  – Not last used (NLU) replacement algorithm
  – Length checks implemented in microcode rather than in hardware

Rigel, continued

• Rigel control store and Microsequencer are simplified:
  – 1600 x 50b control store
  – Four data path formats
  – Two sequencing formats
  – Paged rather than signed displacement addressing
  – Case rather than conditional branching
  – 8 way rather than 16 way cases
• BIU provides even more flexibility:
  – On chip 2kb single cycle cache
  – Multi-word reads for cache fills
  – Multi-word writes for high output
  – Externally requested cycle retry
  – Mandatory data parity
  – Much more efficient FPA protocol
• Improvements in interrupts and clocking:
  – Two more hardwired interrupts (CRD, MEMERR)
  – Full hardware implementation of software interrupts
  – Externally generated four phase overlapping clocks

VLSI VAX Micro-architecture Rigel Decode Flow

EXAMPLE: ADDL3 R4, l^disp(R5), (R6)+

decoder next --> specifier decode

MD[Sn] := GPR[Rn]              ! Flow for specifier
                               ! Sn = 0, Rn = 4
decoder next --> specifier decode (IDR loaded)

MD[Sn] := MEM(GPR[Rn]+IDR)     ! Flow for specifier
                               ! Sn = 2, Rn = 5
decoder next --> specifier decode

VA := GPR[Rn]                  ! Flow for specifier
GPR[Rn] := GPR[Rn] + 4         ! Sn = 4, Rn = 6
decoder next --> execution dispatch

MEM(VA) := MD[0] + MD[2]       ! Execution
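The decode flow above can be traced in software. The following is a minimal sketch that replays the three specifier flows and the execution flow for the ADDL3 example, using plain dictionaries for the GPRs and memory. The function name and the dictionary-based models are assumptions made for illustration; timing, the prefetch queue, and exceptions are ignored.

```python
MASK = 0xFFFFFFFF  # 32-bit longword arithmetic


def addl3_example(gpr, mem, disp):
    """Replay ADDL3 R4, l^disp(R5), (R6)+ as four microflows."""
    md = {}

    # Specifier 1, register mode: Sn = 0, Rn = 4.
    md[0] = gpr[4]

    # Specifier 2, longword displacement: Sn = 2, Rn = 5, IDR holds disp.
    idr = disp
    md[2] = mem[(gpr[5] + idr) & MASK]

    # Specifier 3, autoincrement: Rn = 6. Capture the address, then bump the GPR.
    va = gpr[6]
    gpr[6] = (gpr[6] + 4) & MASK

    # Execution flow: add the two source operands and write the destination.
    mem[va] = (md[0] + md[2]) & MASK
    return va
```

For example, with R4 = 10, R5 = 0x1000, R6 = 0x2000, a displacement of 0x10, and the longword 32 stored at 0x1010, the sketch writes 42 to address 0x2000 and leaves R6 = 0x2004.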

VLSI VAX Micro-architecture Summary

• The implementation of the VAX in VLSI has required some adaptations and adjustments at the macro-architectural level. • The four VLSI VAXen defined to date (MicroVAX, V-11, CVAX, and Rigel) all follow the same (canonical) micro-architectural model. • The implementation process is complex, with much effort expended on architectural nits that have little or no performance benefit. • The constraints of the VAX architecture have limited attempts at performance improvement to just three basic areas: – Improved technology – Microcycle pipelining – Improved memory subsystem • Despite the difficulties, the VLSI VAXen have proven both popular and competitive, and will form the basis of DEC’s low end and mid range product offerings for years to come.

VLSI VAX Micro-architecture Microcoding Considered As A Fine Art

May 1988

Bob Supnik

For Internal Use Only

Introduction

If once a man indulges himself in microcoding, very soon he comes to think little of assembly coding; and from assembly coding he next comes to Fortran and Forth; and from that to terse comments and goto statements.

Contents: • Microcoding. • Microarchitecture.

SEG/AFL Microcoding Microcoding

• (Narrow) definition: Microcoding is the implementation of an instruction set interpreter on a low-level hardware engine. • Goals (descending priority): – Accuracy. – Performance. – Schedule. – Space. – Maintainability. • Non-goals: – Plasticity. – Modularity. – Aesthetics.

SEG/AFL Microcoding The Design Process

• Preparation - studies, prework. • Comparative analysis - plagiarism no vice. • Algorithm development - ways and means. • Coding - implementation and documentation. • Verification - functional, restrictive, dynamic. • Optimization - squeezing the cycles.

SEG/AFL Microcoding Preparation

• Study the target ISP; know the SRM cold: – Complex instruction definitions. – Interactions between specifiers and execution. – Memory management. – Interrupts and exceptions. • Study the micromachine: – Control and branching structure. – Parallel capabilities. – Serial restrictions. – Duplicate facilities. • Understand performance tradeoffs in the architecture. • Set goals and priorities: space vs speed, etc. • Establish conventions for coding. • Choose tools for development and verification.

SEG/AFL Microcoding Comparative Analysis

“Plagiarism is the highest form of flattery.” • Study past implementations: – Algorithms for complex instructions. – Space vs speed tradeoffs. – Features that worked especially well. – Features that were not used. – Utilization of parallelism. – Bugs that escaped pre-PG verification. • Study contemporaneous implementations: – Study list is the same as given above. – Establish contact with other microcode projects. – Review others’ microcode as it is developed. – Invite others to review the code.

SEG/AFL Microcoding Algorithm Development

• Start with a baseline sketch, derived from: – Simple implementation of SRM description. – Direct translation of past implementation. – Direct translation of contemporary implementation. • Critique by constraints: – Is decision tree compressed to minimum? • Critique by comparison: – Does it meet performance goals? – Is it as fast, or faster, than past machines? – Is it as fast, or faster, than contemporary machines? • Critique by dead space: – Are there NOPs? • Examples: Rigel VFIELD, Rigel CSTRING.

SEG/AFL Microcoding Coding

• Start with register allocation. – Minimize (eliminate) data moves. – Maximize use of common routines and common exits. • NOPs are verboten! – Shorten the decision tree. – Push calculations up the tree. – Use cases instead of compares. – Add additional functions to fill slots. • Think parallel. – Multiple tests from one calculation. – Multiple consequences of one action. – Multiple interpretations of one result. – Multiple actions on one source.

SEG/AFL Microcoding Coding, continued

• Think hardware. – What hardware features are not being used, and could be eliminated? – What microcode routines are not meeting goal, and could use additional hardware support? • Document, document, document! – Introductions to entry points, algorithms, etc. – Specialized comments on microcode restrictions. – State description on every continuation page. – Cycle counts for complex decision trees. • Code review. – When code is complete, or even before. – Starting point is accuracy. – Next issue is performance. – Next issue is space. – Subordinate points: reusability, allocation plasticity.

SEG/AFL Microcoding Coding Examples

• Rigel VFIELD: [SC] <-- 000000[32.] - [MD.T2], LONG

– Tests size > 32. – Using SC case, tests size = 0.

• Rigel VFIELD: [SC] <-- [MD.T0] - 000000[32.], LONG

– Tests position > 31. – Loads position to SC (SC operates mod 32).

• Rigel QUEUE: MEM(VA)&, [SC] <-- [MD.T0] - [SC], LONG

– Calculates result. – Saves result for later use. – Writes result to memory.
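The second VFIELD example gets two results from one subtraction. The sketch below models that idea for the position check: a single 32-bit subtract of 32 yields both the range test and the value loaded into SC. Treating the range test as the borrow out of an unsigned subtract is an assumption; the slide does not say which ALU condition the microcode actually tests.

```python
MASK32 = 0xFFFFFFFF


def position_check(position):
    """Model [SC] <-- [MD.T0] - 32 for an unsigned 32-bit position.

    Returns (out_of_range, sc).
    """
    diff = (position - 32) & MASK32
    borrow = position < 32          # borrow out of the unsigned subtract
    out_of_range = not borrow       # no borrow means position > 31
    sc = diff & 0x1F                # SC operates mod 32, so it holds position mod 32
    return out_of_range, sc
```

Because 32 is congruent to 0 mod 32, the low five bits of the difference equal the low five bits of the position, so no separate move into SC is needed.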

SEG/AFL Microcoding Verification

• Functional verification: – HCORE - for initial debug. – AXE - for thorough coverage; use latest version, always! – MAXE - for pipeline interactions. – SEGUE - for pipeline interactions. – ELN - a reasonably short OS test. – VMS - the ultimate test. • Restrictive verification: – ARCS - microcode restrictions check. • Dynamic verification: – Check every error and exception path! – Interrupts (regular and passive release). – DMA and invalidates. – Coincident transactions (prefetch+, invalidate+, error+), etc.

SEG/AFL Microcoding Optimization

• Microcode, like bread dough, needs to “rest” a while before being worked again. • After initial verification, take a break. • Then look for “peephole optimizations”: – Eliminate NOPs by restructuring, filling, etc. – Eliminate STALLs by scheduling parallel work. – Eliminate MOVEs by revised register allocation. – Eliminate COMPAREs by cases. – Eliminate duplicate actions. – Eliminate words by finding common sequences. – Eliminate words by finding one line subroutines. – Eliminate allocation bottlenecks by alignlist strength reduction. • The number of changes needed to free up a word or cycle may be enormous!

SEG/AFL Microcoding Optimization Examples

• RIGEL MISC: Eliminated two NOPs in INDEX by: – Adding functionality (size = 1 test). – Pushing calculation (subscript + size) up the tree. – Casing into subroutine instead of calling. – Saved multiple cycles, words too! • RIGEL CALLRET: Eliminated cycle in CALLs by: – Making one shift serve two purposes (align mask for call frame, align mask for casing). • RIGEL CALLRET: Eliminated STALL in RET by: – Placing more work under LOAD PC shadow. – Rewriting as needed to free up work for shadow. • RIGEL MULDIV: Eliminated register copy in DIVIDE by: – Noticing “useless” move on error path. – Reallocating registers to eliminate move.

SEG/AFL Microcoding Optimization Examples

• RIGEL OPSYS: Eliminated COMPAREs in CHMx by: – Implementing full case tree for opcode vs current mode. – Saved compare, extra data move. – Eliminated “wrong choice” path. – Saved 2+ cycles, no extra words. • RIGEL MISC: Eliminated duplicate read in POPR: – Prologue reads end of stack frame to test accessibility. – Main loop unwinds mask bits <11:0>. – Epilogue unwinds mask bits <14:12>. – Reusing longword saves 3 words, can save cycles. • RIGEL INTLOG: Eliminated words, cycles in ASHQ by: – Noticing right shift case had extra cycle due to conflict between condition code order requirements and shift order requirements. – Reused MOVQ storage routine to set condition codes. – Allowed shift to be done in optimized order.
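The CHMx item replaces a compare with a case. As an illustration only, the sketch below shows the same idea in software: the new mode for CHMK/CHME/CHMS/CHMU is the more privileged of the opcode's target mode and the current mode, and that choice can be read from a table indexed by both values rather than computed by comparison. The single 16-entry table and the names are assumptions; the actual Rigel case tree is not shown in the slides.

```python
# VAX access modes: 0 = kernel, 1 = executive, 2 = supervisor, 3 = user.
# Table indexed by (target_mode << 2) | current_mode; each entry is the new mode.
CHM_CASE = [
    0, 0, 0, 0,   # CHMK: always kernel
    0, 1, 1, 1,   # CHME
    0, 1, 2, 2,   # CHMS
    0, 1, 2, 3,   # CHMU
]


def chm_new_mode(opcode, current_mode):
    """Pick the new mode by table lookup; the low two opcode bits give the target."""
    target = opcode & 0x3          # CHMK..CHMU are opcodes 0xBC..0xBF
    return CHM_CASE[(target << 2) | current_mode]
```

Every path through the table is a "right choice", which mirrors the slide's point about eliminating the compare and the wrong-choice path.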

SEG/AFL Microcoding Random Thoughts

• Ordinary programming (systems or application) and microcoding are very different. – An ordinary program is implemented once, with a view towards long term maintenance and modification. – Microcode may be implemented many times, but once finished is complete. – Microcode is hacking at its best: the last refuge of the assembly language fanatic. • The worst enemy of good microcode is NIH. – Beg, borrow, and steal good ideas from others. – Use others to review and critique code. • Microcode demands optimization, and optimization demands multiple passes. – Multiple passes, dispersed in time, by the same person. – Multiple reviewers, at the same time.

SEG/AFL Microcoding Microarchitecture

• (Narrow) definition: Microarchitecture is the process of defining a low-level hardware engine for a microcoded processor implementation. • Goals (descending priority): – Implementation feasibility. – Performance. – Implementation complexity. – Schedule. – Implementation cost. • Non-goals: – Extensibility. – Reusability. – Aesthetics.

SEG/AFL Microcoding The Design Process

• Preparation - studies, prework. • Comparative analysis - the state of the art. • Microword development - trying out ideas. • Tradeoffs - hardware cost and microcode cost. • Formal definition - the final result.

SEG/AFL Microcoding Preparation

• Study the SRM; how does the architecture impact hardware? – Basic data path requirements (eg, GPRs, working registers, ALU, shifter, condition codes, RLOG, etc). – Specifier decomposition. – Memory management. – Interrupts and exceptions. • Study the “fine print” backbreakers. – Unaligned memory references. – Conflicting specifier usage (eg, (R),(R)+). – Double write specifiers. – Implicit specifiers. • Understand reliability and recoverability requirements for target systems. • Understand performance tradeoffs in the architecture. • Set goals and priorities: cycle time, cpi, etc.

SEG/AFL Microcoding Comparative Analysis

• Study past implementations: – Data path facilities. – Instruction parsing facilities. – Memory management facilities. – BIU structure. – Cache and memory structures. – Microcode structure (horizontal vs vertical, serial vs parallel). – Handling of architectural nasties. – Performance return from individual features. • Study contemporaneous implementations: – Study list is the same as given above. • Study hypothetical implementations: – Cache and memory subsystem configurations. – Theoretical limits on implementation efficiency.

SEG/AFL Microcoding Microword Development

• Goal: first cut at a microword (E Box structure). • Starting point: select a theme. – MicroVAX - get it to fit (microword existed)! – CVAX (first theme) - minimize microword count. – CVAX (final theme) - minimize logic. – Rigel - allow parallel E Box, BIU operations. – NVAX - minimize cpi of complex instructions. • Select a sequencing style. • Code key routines in “free form” microcode. – Specifiers. – Integer/logical, control, field, procedure call. • Derive minimum set of fields and functions per field. • Define initial microword.

SEG/AFL Microcoding Example: CVAX

• Initial CVAX goal was minimum number of microwords: • Final CVAX goal was minimum amount of logic while maintaining narrow microword: – MicroVAX’ 9 data path formats reduced to 5. – MicroVAX’ 32 destination selects reduced to 4. – MicroVAX’ 32 branch conditions reduced to 16. – MicroVAX’ 16 way cases reduced to 8, corresponding to hardware organization of CS ROM. – MicroVAX’ 9 literal formats reduced to 4. – MicroVAX’ 8 CC recipes reduced to 4. – MicroVAX’ 8 state flags reduced to 6. – MicroVAX’ signed offset addr changed to page mode. – MicroVAX’ conditional branches eliminated. – Fields common to all formats always in same place. – Memory request field horizontally encoded. – Special control functions horizontally encoded. – Duplicate data path functions eliminated. – Microword grew from 39b to 41b.

SEG/AFL Microcoding Example: Rigel

• Rigel goal was to augment CVAX with parallel MRQ and ALU functions and to further simplify decoding: – Combined MRQ/ALU format to support read and run. – Also supports calculate and write. – Destination select replaced by explicit destination field for simpler decoding. – Microword grew from 41b to 50b, since ROM width was no longer much of a performance issue. – Wider format allowed further simplification of shift format. • The main challenge of Rigel was not in the data path but in the sequencing. – Micropipeline implied longer branch latencies. – In particular, 1 cycle ALU latency grew to 3 cycles. – Could the microcode cope with the extra latency? – Feasibility proven by probe coding.

SEG/AFL Microcoding Example: NVAX

• NVAX is a macropipelined machine. • The largest irreducible component of cpi is complex instructions which require the I Box to shut down during execution. • Therefore, the goal of the (E Box) microarchitecture must be minimized cpi. • The Nautilus and Aquarius microcode are directly applicable. • Possible I/O facilities: – Automatic compaction of related writes into quadwords. – I/O operation every cycle at external pins. – Separate I and D (I/D) caches.

SEG/AFL Microcoding Example: NVAX, continued

• Possible microcode facilities: – Tailored ALU operations (eg, sign extend). – Tailored shift operations (eg, sign extend). – Parallel MRQ, ALU, SHF operations for maximum parallelism. – Tailored register operations (eg, byte writable PSL). – Tailored branch conditions (eg, case on all interesting CALLx mask bits). – Special function units for complex instructions (eg, mask unit, population counter, REI validator). – Shortened microbranch latency. – Microbranch tests at ALU/SHF input as well as output.

SEG/AFL Microcoding Tradeoffs

Ultimately, every microarchitectural feature must be justified by an SRM constraint or by a performance payback. • All VLSI uVAXen have featured: – Hardware implemented unaligned I/O. – Narrow, horizontally encoded microwords. – <2k word control store limit. – 32b right funnel shifter. – Multifunction SC register. • Some CVAX tradeoffs: – Opcode dependent ALU functions rejected. – Per-instruction register optimization rejected. – SC casing limited to bits <5:0>. – SISR partially implemented in hardware. – CC map select moved from IPLA to microword. – Branch logic enhanced to support loop branches. – Branches implemented via microtrap. – 780-like I Box replaced by 8800-like I Box. – VA on B Bus to speed up TB miss flows.
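The 32b right funnel shifter listed above is a common building block: it concatenates two 32-bit inputs, shifts the 64-bit value right, and keeps the low 32 bits. The sketch below is a generic model of that operation, showing how left shifts and rotates fall out of it. The function names and the 0..32 shift range are assumptions and do not describe the exact CVAX or Rigel shifter controls.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift_right(high, low, count):
    """Shift the 64-bit value high:low right by count and keep the low 32 bits."""
    if not 0 <= count <= 32:
        raise ValueError("count must be in 0..32")
    combined = ((high & MASK32) << 32) | (low & MASK32)
    return (combined >> count) & MASK32


def rotate_right(value, count):
    """Rotate: feed the same word to both funnel inputs."""
    return funnel_shift_right(value, value, count % 32)


def shift_left(value, count):
    """Left shift by count (0..32): right funnel shift of value:0 by 32 - count."""
    return funnel_shift_right(value, 0, 32 - count)
```

One shifter therefore covers right shifts, left shifts, rotates, and field extraction across a longword boundary, which is one way to read its presence in every design on the list.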

SEG/AFL Microcoding Tradeoffs, continued

• Some Rigel tradeoffs: – Per-instruction register optimization added. – SC casing broadened to bits <11:0>. – Population counter added. – REI, mask validator rejected. – SISR fully implemented in hardware. – Edge triggered latches reset by microcode. – Memory management length checks done in E Box. – No static ALU condition codes. – RLOG decoding done in hardware.

The list of decisions is endless.

SEG/AFL Microcoding Formalization

• The tentative microword definition is checked for hardware implementation feasibility. – Ease of decoding and execution. – Minimization of hardware maintained state. – Effect on cycle time. • The sketch feasibility microcode is fleshed out to be a full scale trial implementation. – Flushes out missing features for obscure cases. – Ensures adequate facilities for exceptions. • The trial implementation is used to drive a performance model of the system, to validate performance goals. • At the conclusion of performance modelling, the microcode definition is tentatively frozen. • But, there is always room for inventiveness, or catastrophe, to strike.

SEG/AFL Microcoding Concluding Thoughts

• Tbd.

SEG/AFL Microcoding CVAX and Rigel: The Development Process

May 1988

Bob Supnik

For Internal Use Only

Semiconductor Engineering Group

Introduction

• CVAX is the second generation single chip VAX microprocessor – 2.5X to 3.5X 11/780 performance – CMOS-1 (2u) process technology – 175,000 transistor sites in CPU – PG in Q1 FY87, system FRS in Q1 FY88 • Rigel is the third generation single chip VAX microprocessor – 6X to 8X 11/780 performance – CMOS-2 (1.5u) process technology – 325,000 transistor sites in CPU – PG in Q1 FY88, system FRS in Q2 FY89 • CVAX and Rigel use an evolutionary extension of the full custom design methodology employed in MicroVAX – Structured design style – Floorplan (physical partitioning) driven – Multi-level hierarchical simulation – Fully hand-crafted layout – Logical and physical verification

SEG/AFL Development Process Overall Design Process

• System performance model – Functionally crude but accurate models of system components – Trace driven simulation of large programs – Full modelling of TB, cache, I/O, multiprocessor effects – Used for high level design tradeoffs (TB and cache structures, bus protocols, chip partitioning) – Unique PASCAL program per system • Chip (set) external specification – Defines external interface and protocols – Reviewed by system groups and customers – Used by customers as basis for system designs • Chip (set) design specification – Defines internal blocks, functions, and interfaces – Reviewed by design team and external reviewers – Used by designers as highest level reference
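To make the idea of trace driven simulation concrete, here is a minimal sketch of a direct-mapped cache model driven by a list of addresses. It is written in Python for brevity, whereas the models described above were PASCAL programs, and it covers only hit and miss counting. The function name and the trace format are assumptions; the real models also covered TB, I/O, and multiprocessor effects.

```python
def simulate_cache(trace, cache_bytes, block_bytes):
    """Count (hits, misses) for a direct-mapped cache over an address trace."""
    lines = cache_bytes // block_bytes
    tags = [None] * lines          # one tag per cache line, None means empty
    hits = misses = 0
    for addr in trace:
        block = addr // block_bytes
        index = block % lines      # line selected by the low block-number bits
        tag = block // lines       # remaining high bits identify the block
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag      # fill the line on a miss
    return hits, misses
```

Running the same trace against several `cache_bytes` and `block_bytes` choices is the kind of high level tradeoff study the slide describes, here reduced to a single metric.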

SEG/AFL Development Process Overall Design Process, continued

• Chip (set) behavioral model – Starts as RTL transcription of design specification – Equation level descriptions of all logic blocks – Used by designers to guide/verify logic design – Used by microcoders to debug/verify microcode – Used by logic modellers as shell for logic simulation – Used by test developers to debug/verify tests – Incrementally updated to greater accuracy as design progresses • Physical partitioning (floorplanning) – Based on behavioral model – Detailed area, routing estimates for all major sections – Establishes layout feasibility, guides layout • Logic design (unsized schematics) – Based on behavioral model – Entered via terminal/workstation, maintained online – Verified with standalone or mixed mode switch level simulation – Logic simulation expands from schematics to sections to entire chip

SEG/AFL Development Process Overall Design Process, continued

• Circuit design (sized schematics) – Based on unsized schematics – Done by table lookup from design guides, or by iterative analysis – Verified with circuit level simulator at small schematic level • Layout design (physical data base) – Based on sized schematics – Done by skilled professionals online with dedicated workstations – Verified against physical design rules and schematics • Final design checks – Power/ground grid, electromigration analysis – Coupling, dynamic node noise analysis – Whole chip layout verification – Whole chip timing verification (still very crude) – Resimulation of critical paths with physically correct parameters

SEG/AFL Development Process Overall Design Process, continued

• Pattern generation – PG – Direct transcription of online layout data base to masks • Debug – Component level verification with test vectors – Subsystem level verification with macrocode programs – System level verification with VMS, AXE, Ultrix – Problem isolation in logic, circuits, manufacturing • Manufacturing introduction – Reliability demonstration – Process tolerance demonstration (characterization) – Risk production and prototype approval – Limited Release • Volume production – Yield demonstration – Test time and coverage demonstration – Yield improvement plan – Field feedback – Production Release

SEG/AFL Development Process High Level Specification, Verification

• Specification – Performance model – Written specifications – Behavioral model (functional level) • Verification – System level traces for performance checking – Behavioral execution with microcode – Macro diagnostics and DVTs (EVKAA, HCORE, small benchmarks) – AXE (> 150k cases per instruction group) – Bootstrapping of VAXELN, VMB

SEG/AFL Development Process Low Level Specification, Verification

• Specification – Iterated behavioral model (accurate to schematic level) – Sized schematic set • Verification – Regression testing on behavioral model as updated – Standalone and mixed mode logic simulation from schematic up through entire chip – Design rule and interconnect verification of layout from individual cell up through entire chip – Back end checks of whole chip effects (coupling, noise, power, electromigration, etc) – Whole chip timing verification (for gross timing errors) – Resimulation of critical paths with physically correct parameters

SEG/AFL Development Process Tools

Activity                 CAD Tool
-----------------------  ----------------
data management          CHAS or KATIE
behavioral modelling     DECSIM
logic entry              Quickdraw
logic verification       DECSIM MOS
circuit sizing           SPICE, tables
physical design          GDS-II or MEGAN
physical verification    DRC
phys reconciliation      IV
back end checks          XREF
timing verification      TV
pattern generation       MDP

Except for GDS-II (which is being phased out), DRC, and MDP, all of these tools were developed by DEC. All of these tools are supported by the SEG CAD group.

SEG/AFL Development Process System Level Processes

• Chip behavioral model is component to larger system level model • Systems groups build whole system behavioral model for functional verification, test pattern generation • First system user is tightly coupled to chip development team (colocation, joint debugging) • System level functional checkout is prerequisite to chip release

SEG/AFL Development Process Libraries

• Process models – Detailed process models, based on extracted data, for circuit analysis using SPICE – Crude process models, abstracted from detailed model, for fast simulation of circuit effects in logic simulator – Reliability models for power, electromigration analyses • Circuit libraries – Reusable components (I/O buffers, latches, etc) developed by design teams – High performance memory predesigns (ROM, RAM, CAM) developed by SEG/AD Memory design team – Design is full custom, all other circuits are unique to each chip

SEG/AFL Development Process Testing Goals

• Testing – Probe (initial wafer sort) test time < 20 seconds/die – Final (die sort) test time < 10 seconds/die – > 99 percent correlation to next higher level of assembly • Yield – Probe test yield goal is a direct function of die size and transistor count – Final test yield is expected to start at 50 percent and rise steadily thereafter • Reliability – Set by DEC standard for IC components – Operating temperature range of 0 C - 70 C – > 1,000,000 MTBF • Repair – not applicable. Failing parts are put on the cover of the Annual Report.

SEG/AFL Development Process Test Strategy

• Engineering test strategy – Component level DVTs runnable on behavioral simulator, logic simulator, real chip – 100 percent nodal coverage, full functional coverage – Macro DVTs and diagnostics runnable on behavioral simulator, logic simulator, real chip – Bench tester with microprober, test generator, and logic analyzer, linked to chip models, for debug – ELN, Ultrix, VMS debug as part of chip checkout • Manufacturing test strategy – Design for test features inbuilt in chips – 100 percent nodal coverage correlated to functional coverage – Demonstration of tolerance to process variations prior to release – Demonstration of reliability prior to release – High volume specialized production testers – Initial screen at wafer level to catch functional failures – Final screen at die level to catch parametric failures – Multiple speed bins from day 1 – Statistical feedback from test results to design team • Field Service test strategy – system level issue.

SEG/AFL Development Process Intermittent Failures

• Chip failure mechanisms are different from system failure mechanisms – Alpha particle upset of dynamic memory cells – Burst bond wire – Electromigration – Hot electrons In particular, transistors do not suddenly appear or disappear (eg, a ROM will not change its contents) • Intermittent failure techniques are tailored to failure mechanisms – Minimized use of dynamic latches outside memory arrays – Parity protection on dynamic memory arrays – Parity protection on data busses – Test of microprocessor critical paths as part of board self test Note that the intermittent failures in a single board CPU are very different, in type and number, from a multi board system
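Parity protection, as listed above, detects a single flipped bit such as an alpha particle upset. The sketch below shows the basic check on one byte. The per-byte granularity, the choice of even parity, and the function names are assumptions; the slides do not specify how parity is organized on the arrays or busses.

```python
def parity_bit(byte):
    """Return the bit that makes the total number of ones even (even parity)."""
    return bin(byte & 0xFF).count("1") & 1


def parity_ok(byte, stored_parity):
    """True if the byte still matches the parity bit stored alongside it."""
    return parity_bit(byte) == stored_parity


# Store a byte with its parity, then flip one bit to mimic a soft error.
original = 0b1011_0010
stored = parity_bit(original)
upset = original ^ 0b0000_1000
```

Here `parity_ok(original, stored)` holds and `parity_ok(upset, stored)` fails. A second flipped bit would go undetected, which is the usual limit of single-bit parity.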

SEG/AFL Development Process Test Development/Techniques

• Component level tests are developed by engineering – DVTs driven from specs for functional verification – Vectors driven from schematics for nodal verification – All tests are hand coded and debugged on the behavioral and/or logic simulators – Test vectors are formally qualified as part of release process • Parametric tests are developed by manufacturing – Control programs for wafer and die sort – AC and DC parametric tests for die sort – AC characterization tests for process tolerance demonstration – All programs are hand coded and debugged on the actual test hardware – Test programs are formally qualified as part of release process • System level tests are standard from chip to chip (EVKAA, HCORE, AXE, VMS, ELN, ULTRIX)

SEG/AFL Development Process New Technologies

• Process Technology – CMOS-1 (2u) and CMOS-2 (1.5u) processes – Developed by LSI Mfg/Adv Semi Development – Supported by LSI Mfg/Adv Mfg Engineering – CAD support jointly by LSI Mfg/ASD and SEG/CAD – Process strategy, with CAD support, is joint partnership of engineering and manufacturing • Packaging Technology – Surface mount 1 (50 mil) and 2 (25 mil) technology – Surface mounted ceramic packages (44, 68, 84, 132, 164, 196 pins) – Tape area bonding (TAB) for > 132 pin packages – Packaging strategy, with CAD support, is joint partnership of LSI Mfg and P/DS • Test Technology – Design for test hooks in chips – New high performance testers (Sentry 21, Takeda-Reiken)

SEG/AFL Development Process Data Management

• Entire chip data base managed by central data manager and tool interface – CHAS – first generation, DBMS based, not suited to distributed design environment – KATIE – current product, flexible data manager, suitable for operation in both clusters and workstations – KATIE not only manages data but provides flexible and user-invisible methods for integrating new tools and special procedures • Data archiving – Online data bases are backed up by SEG Computer Resources using both disks and tapes – Full tape backups are made at PG of each pass – All data bases for released chips are maintained by Hudson Document Control – Data bases are archived with current tool revisions

SEG/AFL Development Process Data Base Contents

• Specifications – Engineering (user) specification – Design specification • Behavioral model and documentation • Microcode and documentation • Sized schematics (wirelists, logic simulation model are automatically derived from the schematics) • Complete layout • Package specifications and packaging procedures • Checkout software and procedures – Component level DVTs and test vectors – Wafer sort, die sort, and characterization programs – Macrocode DVTs and diagnostics • Selected verification results (final DRC, IV, simulations) • CAD and design tools, both generic and special

To build the chip, Manufacturing requires, in addition, the process recipe, which is archived on a per process rather than a per chip basis

SEG/AFL Development Process Revision Control

• Revision sources: – Functional errors (bugs found in use) – Manufacturing problems (tolerance to process variation) • Functional errors are disastrous, as there is no effective ECO method other than total replacement. The chip must be right when released to production. This is why full system test is an essential prerequisite to production release. Areas of particular concern: – Power up/power failure – Interrupts – DMA • Manufacturing problems – Chip is followed by full time manufacturing engineer both before and after production release – Statistics on yield vs process variations are gathered continuously – Field returns are also monitored to gather failure data – Production problems are first screened out, then tweaked out by targeting the process, and then designed out by the chip team

SEG/AFL Development Process CVAX/CFPA Resources

• Engineering manpower = $10M • Engineering capital – 3 dedicated 785s, 2 shared 86XXs – 4 uVAX workstations – 9 Calma workstations – 2 bench testers – Terminal per team member, office and home – Multiple system prototypes • Test manpower = $.6M • There are no special test capital investments for this project • Manufacturing manpower = $.9M • There are no special manufacturing capital investments for this project, beyond the general investment in CMOS-1 manufacturing

SEG/AFL Development Process Rigel Resources

• Engineering manpower = $14M • Engineering capital – 1 dedicated 8800, 1 dedicated 86XX – 20 uVAX workstations – 2 bench testers – 1 shared SEM based debug system – Terminal per team member, office and home – Multiple system prototypes • Test manpower = $.9M • There are no special test capital investments for this project • Manufacturing manpower = $1.6M • Manufacturing capital – Rigel is dependent on success of, and capital investment in, TAB technology program – Rigel is also dependent on the success of, and capital investment in, the CMOS-2 program

SEG/AFL Development Process Integration

• Engineering, Test, Manufacturing function as one team – Test engineer and manufacturing (product) engineer join during design phase, live with design team – Engineering fields support team to work with test and manufacturing for full year following Limited Release – Jointly agreed goals on yield, reliability, test correlation, etc provide joint success metrics for all groups • Engineering, Test, Manufacturing functions reside at same site – Increases bandwidth of communication – Facilitates joint problem solving and closure • Engineering, Test, Manufacturing use same CAD tools and models – Behavioral model ties together microcode, functional verification, test patterns – Circuit models used for design development and manufacturing problem analysis – Manufacturing problem data base maintained online

SEG/AFL Development Process The Bottom Line

The basis for success is commitment to, and achievement of, excellence in design, implementation, and follow up. Good processes are no substitute for good people.

SEG/AFL Development Process