The Long Road to 64 Bits
Total Page:16
File Type:pdf, Size:1020Kb
DOI:10.1145/1435417.1435431 Double, double toil and trouble —Shakespeare, Macbeth, Act 4, Scene 1 BY JOHN MASHEY The Long Road To 64 Bits SHAKESpeare’S WORDS OFTEN cover circumstances beyond his wildest dreams. Toil and trouble accompany major computing transitions, even when people plan ahead. Much of tomorrow’s software will still be driven by decades-old decisions. Past decisions have unanticipated side effects that last decades and can be difficult to undo. ranging from high-level strategies down For example, consider the overly to programming specifics. long, often awkward, and sometimes contentious process by which 32-bit Fundamental Problem (late 1980s) microprocessor systems evolved into Running out of address space is a long 64/32-bitters needed to address larger tradition in computing, and often quite storage and run mixtures of 32- and 64- predictable. Moore’s Law grew DRAM bit user programs. Most major general- approximately four times bigger every purpose CPUs now have such versions, three to four years, and by the mid- so bits have “doubled,” but “toil and 1990s, people were able to afford 2GB trouble” are not over, especially in soft- to 4GB of memory for midrange micro- ware. processor systems, at which point sim- This example illustrates the interac- ple 32-bit addressing (4GB) would get tions of hardware, languages (especial- awkward. Ideally, 64/32-bit CPUs would ly C), operating system, applications, have started shipping early enough standards, installed-base inertia, and (1992) to have made up the majority of industry politics. We can draw lessons the relevant installed base before they Chronology: Multiple E K Interlocking RANS IBM S/360 F EN 32-bit, with 24-bit B Threads addressing (16MB total) BY H P of real (core) memory HOTOGRA P 1964 1965 JANUary 2009 | VOL. 52 | NO. 1 | COMMUNICATIONS OF THE ACM 45 practice were actually needed. Then people mode operating systems, compilers, Problem: CPU Must Address could have switched to 64/32-bit oper- and libraries; and modify application Available Memory ating systems and stopped upgrading source code to work in both 32- and 64- Any CPU can efficiently address some 32-bit-only systems, allowing a smooth bit environments. Numerous details amount of virtual memory, done most transition. Vendors naturally varied had to be handled correctly to avoid conveniently by flat addressing, in in their timing, but shipments ranged redundant hardware efforts and main- which all or most of the bits in an in- from “just barely in time” to “rather tain software sanity. teger register form a virtual memory late.” This is somewhat odd, consider- Solutions. Although not without address that may be more or less than ing the long, well-known histories of in- subtle problems, the hardware was actual physical memory. Whenever af- sufficient address bits, combined with generally straightforward, and not fordable physical memory exceeds the the clear predictability of Moore’s Law. that expensive—the first commercial easily addressable, it stops being easy All too often, customers were unable to 64-bit micro’s 64-bit data path added to throw memory at performance prob- use memory they could easily afford. at most 5% to the chip area, and this lems, and programming complexity Some design decisions are easy to fraction dropped rapidly in later chips. rises quickly. Sometimes, segmented change, but others create long-term Most chips used the same general ap- memory schemes have been used with legacies. Among those illustrated here proach of widening 32-bit registers to varying degrees of success and pro- are: 64 bits. Software solutions were much gramming pain. History is filled with ˲ Some unfortunate decisions may more complex, involving arguments awkward extensions that added a few be driven by real constraints (1970: about 64/32-bit C, the nature of exist- bits to extend product life a few years, PDP-11 16-bit). ing software, competition/cooperation usually at the cost of hard work by oper- ˲ Reasonable-at-the-time decisions among vendors, official standards, and ating-system people. turn out in 20-year retrospect to have influential but totally unofficial ad hoc Moore’s Law has increased afford- been suboptimal (1976–1977: usage of groups. able memory for decades. Disks have C data types). Some better usage recom- Legacies. The IBM S/360 is 40 years grown even more rapidly, especially mendations could have saved a great old and still supports a 24-bit legacy ad- since 1990. Larger disk pointers are deal of toil and trouble later. dressing mode. The 64/32 solutions are more convenient than smaller ones, al- ˲ Some decisions yield short-term at most 15 years old, but will be with us, though less crucial than memory point- benefits but incur long-term problems effectively, forever. In 5,000 years, will ers. These interact when mapped files (1964: S/360 24-bit addresses). some software maintainer still be mut- are used, rapidly consuming virtual ad- ˲ Predictable trends are ignored, tering, “Why were they so dumb?”4 dress space. or transition efforts underestimated We managed to survive the Y2K In the mid-1980s, some people start- (1990s: 32 -> 64/32). problem—with a lot of work. We’re still ed thinking about 64-bit micros—for ex- Constraints. Hardware people need- working through 64/32. Do we have any ample, the experimental systems built ed to build 64/32-bit CPUs at the right other problems like that? Are 64-bit by DEC (Digital Equipment Corpora- time—neither too early (extra cost, no CPUs enough to help the “Unix 2038” tion). MIPS Computer Systems decided market), nor too late (competition, an- problem, or do we need to be working by late 1988 that its next design must be gry customers). Existing 32-bit binaries harder on that? Will we run out of 64-bit a true 64-bit CPU, and announced the needed to run on upward-compatible systems, and what will we do then? Will R4000 in 1991. Many people thought 64/32-bit systems, and they could be ex- IPv6 be implemented widely enough MIPS was crazy or at least premature. I pected to coexist forever, because many soon enough? thought the system came just barely in would never need to be 64 bits. Hence, All of these are examples of long- time to develop software to match in- 32 bits could not be a temporary com- lived problems for which modest fore- creasing DRAM, and I wrote an article patibility feature to be quickly discard- sight may save later toil and trouble. to explain why.5 The issues have not ed in later chips. But software is like politics: Sometimes changed very much since then. Software designers needed to agree we wait until a problem is really painful N-bit CPU. By long custom, an N-bit on whole sets of standards; build dual- before we fix it. CPU implements an ISA (instruction DEC PDP-11/20 16-bit, 16-bit addressing (64KB total) IBM S/370 family virtual memory, 24-bit Algol 68 addresses, but multiple user includes long long address spaces allowed 1966 1967 1968 1969 1970 46 COMMUNICATIONS OF THE ACM | JANUary 2009 | VOL. 52 | NO. 1 practice set architecture) with N-bit integer reg- of virtual memory. For many programs Of course, increasing AM in a multi- isters and N (or nearly N) address bits, PM can differ greatly according to the tasking system is still useful in improv- ignoring sizes of buses or floating-point input, but of course PM ≤ VL. ing the number of memory-resident registers. Many 32-bit ISAs have 64- or The RL (real limit) is visible to the tasks or reducing paging overhead, 80-bit floating-point registers and im- operating system and is usually lim- even if each task is still limited by VL. plementations with 8-, 16-, 32-, 64-, or ited by the width of physical address Operating system code is always simpli- 128-bit buses. Sometimes marketers buses. Sometimes mapping hardware fied if it can directly and simply address have gotten this confused. I use the is used to extend RL beyond a too- all installed memory without having to term 64/32-bit here to differentiate the small “natural” limit (as happened in manage extra memory maps, bank reg- newer microprocessors from the older PDP-11s, described later). Installed AM isters, and so on. 64-bit word-oriented supercomputers, (actual memory) is less visible to user Running out of address bits has a as the software issues are somewhat programs and varies among machines long history. different. In the same sense, the Intel without needing different versions of 80386 might have been called a 32/16- the user program. Mainframes, Minicomputers, bit CPU, as it retained complete sup- Most commonly, VL ≥ RL ≥ AM. Some Microprocessors port for the earlier 16-bit model. programs burn virtual address space The oft-quoted George Santayana is apro- Why 2N-bits? People sometimes for convenience and actually perform pos here: “Those who do not remember want wider-word computers to improve acceptably when PM >> AM: I’ve seen the past are condemned to repeat it.” performance for parallel bit operations cases where 4:1 still worked, as a result Mainframes. IBM S/360 mainframes or data movement. If one needs a 2N- of good locality. File mapping can in- (circa 1964; see accompanying Chro- bit operation (add, multiply, and so on), crease that ratio further and still work. nology sidebar) had 32-bit registers, each can be done in one instruction On the other hand, some programs run of which only 24 bits were used in ad- on a 2N-bit CPU, but requires longer poorly whenever PM > AM, confirming dressing, for a limit of 16MB of core sequences on an N-bit CPU.