
Bachelor Informatica

Simulating a ripple carry adder in SystemC: an overview of an under-represented

Brasso Kattenberg

June 16, 2020 Informatica — Universiteit van Amsterdam

Supervisor(s): Anthony van Inge

Signed: Signees

Abstract

Binary computers are ubiquitous, but progress is slowing down. [23] This paper poses the balanced ternary base as an alternative and evaluates its merits compared to binary with regard to transistor usage (and therefore latency and power usage) and construction complexity. A ternary trit represents more information than a binary bit by a factor of roughly 1.58. An overview of the timeline of this particular signed-digit base is given, as well as a comparison between binary and (balanced) ternary adder circuits. A balanced ternary ripple carry adder is simulated using SystemC to aid the evaluation.

Quote

Your computer uses ones and zeros to represent data. There’s no real reason for the basic unit of information in a computer to be only a one or zero, though. It’s a historical choice that is common because of convention, like driving on one side of the road or having right-hand threads on bolts and screws. In fact, computers can be more efficient if they’re built using different systems. Base 3, or ternary, computing is more efficient at computation and actually makes the design of the computer easier. (...)

Hackaday

Contents

1 Introduction
  1.1 Research questions

2 Timeline overview of past pursuits

3 Theoretical advantages of the balanced ternary radix
  3.1 Radix economy
    3.1.1 Positional numeral system
    3.1.2 Ideal radix
  3.2 Negative numbers and representation
  3.3 Truncating-rounding equivalence
  3.4 Addition and multiplication tables
  3.5 Extra: calculating to and from balanced ternary

4 Balanced Ternary logic
  4.1 Single-input gates
  4.2 Dual-input gates

5 Simulating a balanced ternary ripple carry adder
  5.1 Making the choice for simulation with SystemC
  5.2 Balanced ternary ripple carry adder: design

6 Balanced ternary gate design
  6.1 Types of MOSFET
  6.2 Balanced ternary gates
    6.2.1 Substituting the any-gate
  6.3 Binary MOSFET-based gates

7 Results and discussion
  7.1 Simulation results
  7.2 Critical paths
    7.2.1 Binary critical path
    7.2.2 Balanced ternary critical path
    7.2.3 Comparison
    7.2.4 Delay: number of MOSFETs and OpAmps
  7.3 Component cost
    7.3.1 Total binary ripple carry adder cost
    7.3.2 Total balanced ternary ripple carry adder cost
  7.4 Sublinear addition

8 Conclusions and future research
  8.1 Latency
  8.2 Component count
  8.3 Final thoughts
    8.3.1 Logic levels and supply voltage
  8.4 Closing remarks

CHAPTER 1 Introduction

Moore’s law is a well-known indicator of progress in computing capability. For years, Moore’s law has been said to have ceased to apply, and various metrics have been shown to be flat-lining. [23] Furthermore, semiconductor fabrication facilities are having increasing difficulty producing ever smaller transistors. At the same time, the power efficiency benefits the industry is accustomed to with every smaller production node are decreasing as well, primarily due to increased leakage current and parasitic electrical effects at smaller scales. The increased density exacerbates issues such as interference and quantum tunnelling. Even endurance is at stake: life expectancy was long considered a non-issue, since electrical lifespan usually far outstrips practical lifespan. [5] Moreover, the high resolution required for such small components requires multiple passes in the masking process, increasing manufacturing time, cost, and error margin. Current chip production is impressively small; since the 1970s device fabrication has improved by 3 orders of magnitude: from 10 micrometer ($1 \times 10^{-5}$ m) to around 10 nanometer ($1 \times 10^{-8}$ m). TSMC and Samsung currently produce 7nm chips, and have 5nm and 3nm on their roadmaps for the next two years. Past assumptions of physical ’hard’ limits have been proven pessimistic by these achievements, but these sizes are nearing the size of a silicon atom, which has an atomic radius of 111 picometer, making its diameter 0.222 nanometer. To alleviate the problems of going ever smaller, the industry is trying materials other than silicon, such as graphene, and is expanding chips from planar designs into the third dimension, which presents additional challenges with heat removal and tracing inter-planar connections.
All in all, with all the problems the industry faces and the seemingly impending crossroads that computer technology will arrive at, now might be an opportune moment to re-evaluate avenues other than the aforementioned electrical engineering or materials science. Why not take another look at the literal basis of computing: the radix we use for number representation? Binary is fairly ideal in a number of aspects, such as simplicity and noise resilience. However, the ternary radix, and in particular the balanced ternary radix, has several distinct advantages:

• Trits represent $\frac{\log(3)}{\log(2)} \approx 1.58$ times more information than bits.
• Balanced radices have native representation of negative numbers, obviating the need for complement notation.
• Balanced radices present equivalence in truncating and rounding fractions.
• The ternary radix is the closest integer base to Euler’s number.

Such claims warrant investigation. Numerous parties have investigated them over time, which indicates recurring interest; there must be good reasons that compelled all these efforts. These efforts are briefly covered in the next chapter, after which this paper goes into each of the aforementioned advantages, culminating in a discussion of properties aided by a simulated balanced ternary ripple carry adder.

1.1 Research questions

Based on these claims, the following research question and sub-questions have been formulated; they are elaborated on in later sections.

1. Can a (balanced) ternary architecture be more efficient than a binary equivalent?
   (a) What latency would a (balanced) ternary adder have?
       i. What is the critical path for a balanced ternary ripple carry adder?
       ii. What is the critical path for a binary ripple carry adder?
       iii. Is a sub-linear complexity ternary adder possible?
   (b) What effect does a (balanced) ternary radix have on component count?

CHAPTER 2 Timeline overview of past pursuits

Chronologically, the earliest suggested [13] appearance of signed digit arithmetic is in the Hindu Vedic texts known as the Vedas from the Vedic period (India, 1500-500 BCE). [6] Balanced ternary specifically is mentioned in the 1544 Arithmetica Integra by Michael Stifel. [27] There the numbers 1-40 are represented as sums of (positive and negative) powers of three, indicating counting with balanced ternary.

Charles Babbage’s place in the history of computing is fairly well-known, with the designs he made for his difference engine (1822) and analytical engine (1837). [8] Babbage was ahead of his time: the difference engine was built over a century later, in 1991, by the London Science Museum, decades after the first electronic computers. The Analytical engine was never built, but efforts into researching and potentially building the Turing-complete design started in 2010, led by John Graham-Cumming (https://en.wikipedia.org/wiki/Analytical_Engine). Unfortunately, the early digital computer pioneers (ENIAC) were unaware of Babbage’s efforts and as such several architectural innovations had to be re-invented.

Lesser-known than Charles Babbage is Thomas Fowler, who invented and built a wooden computer based on balanced ternary. His machine did not survive, but a replica was built in 1999-2000. [12] [30]

Figure 2.1: Fowler’s machine. stained glass window, St. Michaels Church [Torrington, Devon]

Claude E. Shannon wrote in 1950 about balanced ternary (and other signed-digit symmetrical systems). [25] In 1952, Grosch proposed using balanced ternary for the Whirlwind II project. [21] [22] [10]

Figure 2.2: reconstruction of Fowler’s machine, by Glusker, Vass & Hogan

In 1958, during the Sputnik era, a group at Moscow State University led by Brusentsov built an electronic ternary computer called the Setun, named after a nearby river. [21] 50 machines were produced between 1959 and 1965. It was both economical (27,500 rubles) and reliable (17 years of continuous operation, 3 components replaced in year 1) despite the sometimes harsh operating climates. It was also deemed power efficient, but this was not quantified. [19] Although heralded for its architectural simplicity, it was unfortunately constructed with two binary components for every trit. [7] Its successor, the Setun-70 from 1970, can be considered the last ternary computer. [10]

Figure 2.3: original Setun 1958

Figure 2.4: Setun, serially produced model

Figure 2.5: Setun-70

In 1969, Donald Knuth wrote about balanced ternary in ’The Art of Computer Programming’, calling it "perhaps the prettiest number system of all". [17] In 1973, Gideon Frieder at the State University of New York, Buffalo implemented the TERNAC, an emulation in FORTRAN to validate the feasibility of a ternary computer. It featured both fixed and floating point arithmetic. It was deemed successful: its speed and price were on the order of those of binary computers. [11] [22] [10]

In 2014 Jessie Tank started working on an FPGA implementation of a balanced ternary CPU [28], which she presented in 2016. [29] Simultaneously, Tank is involved in the cryptocurrency IOTA, which launched in 2015. IOTA formerly used a ternary hash called curl-p, specifically to present an application of ternary computing for the upcoming JINN processor. However, due to a security flaw [14], the homegrown ternary hash curl-p was temporarily replaced with keccak (converted to ternary) and is now (being?) replaced with a hash algorithm called Troika, developed in cooperation with the Danish cybersecurity company CYBERCRYPT A/S, with a 1-year 200,000 euro bounty. [9] Since 2017, Thomas Leathers has worked on an open-source project called SBTCVM (simple balanced ternary computer virtual machine). [18] Recently (2019), efforts have been made to produce "the first ternary computer in 50 years", called the TRIADOR. [26]

Figure 2.6: TRIADOR

CHAPTER 3 Theoretical advantages of the balanced ternary radix

This chapter discusses the posited advantages of a (balanced) ternary radix that motivated the research questions.

3.1 Radix economy

3.1.1 Positional numeral system

Any non-negative integer can be uniquely represented in positional number notation, regardless of base. That is, any element $N \in \mathbb{N}$ can be represented as the sum of $N$'s digits $d_i$ multiplied by the appropriate powers of the radix $r$:

$$\forall N \in \mathbb{N} : N = \sum_{i=0}^{n-1} d_i r^i$$

3.1.2 Ideal radix

Given that this works for any radix, one could wonder what the ideal radix could be. The notion of an ideal radix assumes that some radices are more suitable than others, by some metric in some situation. Allegedly, most people across the world primarily use base-10 due to convenient counting with 10 fingers across two hands. Computers famously use base-2, due to its simplicity and the convenient mapping onto Boolean algebra. One metric by which we can judge radices is called the radix economy. The radix economy for a particular number N is the product of the number of digits n and the radix r in the above formula. Or, if you will, the radix economy is a cost function describing an area, where the number of digits of a number is the length and the base is the width. More formally, the radix economy E for a particular number N in base b is given as:

$$E(b, N) = b \lfloor \log_b(N) + 1 \rfloor$$

For large $N$ this can be simplified to:

$$E(b, N) \approx b \cdot \log_b(N) = b \frac{\ln(N)}{\ln(b)}$$

This function has a minimum at $b = e$, Euler’s number. This means base e is the base with the lowest average radix economy. Furthermore, 3 is the integer base with the lowest average radix economy. Note that, since radix economy depends on N, and the average radix economy therefore depends on the range of N, some bases are more efficient for particular ranges or numbers. For instance, base 2 is more efficient than base 3 for a minor range of N: 8487 distinct numbers. [13] Additionally, although base-1 is the worst base in general, it is the most efficient base for the range [1, 8].
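The definition of E(b, N) above can be sketched in Python to compare averages over a range. This is a minimal illustration, not part of the original experiment: the function names are made up here, and the treatment of base 1 as tally marks (so that E(1, N) = N) is an assumption, since the logarithm in the formula is undefined for b = 1.

```python
def digit_count(n: int, base: int) -> int:
    """Number of digits needed to write the positive integer n in the given base."""
    if base == 1:
        return n  # assumption: unary is written as n tally marks
    count = 0
    while n > 0:
        n //= base
        count += 1
    return count


def radix_economy(base: int, n: int) -> int:
    """E(b, N) = b * floor(log_b(N) + 1), computed with integer arithmetic."""
    return base * digit_count(n, base)


def average_economy(base: int, upper: int) -> float:
    """Mean radix economy over the range [1, upper]."""
    return sum(radix_economy(base, n) for n in range(1, upper + 1)) / upper


if __name__ == "__main__":
    for base in (1, 2, 3, 4, 10):
        print(base, average_economy(base, 8), average_economy(base, 10**4))
```

Integer division is used instead of floating-point logarithms so that the digit count is exact at powers of the base.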

Figure 3.1: radix economies (lower is better) for r = 2, r = 3 and r = e; horizontal axis: capacity of number system, $r^w$ ($\times 10^6$); vertical axis: measure of complexity, $rw$. Courtesy of [13]

3.2 Negative numbers and representation

Decimal uses an additional sign to denote negative numbers, the minus symbol (e.g. −5), and binary uses two’s complement to represent negative numbers, where the most significant bit indicates a negative number. Using two’s complement presents the advantage of allowing subtraction to be performed via addition of a two’s complement, i.e. negative, number. Note that a similar scheme works for other bases too, such as 10’s complement in decimal. However, balanced numeral systems are different. Balanced bases are integer bases where the base/radix is an odd number, i.e. of the form 2k + 1. In those bases the digits range from −k to +k, so they are balanced (centered) around 0. These bases can represent negative numbers natively, so to speak. If we take balanced ternary for example, the decimal number −5 can be coded directly as a string of trits (from ternary digit): the digits (−1, +1, +1), since −9 + 3 + 1 = −5. Where regular ternary has the digits {0, 1, 2}, balanced ternary has {−1, 0, 1}. There are several notations for the digits, particularly the negative one: the letters T, F, and U for True, False, and Unknown; mixtures of these and the familiar 0 and 1; or alternate glyphs. The SETUN documentation used a rotated 1 for negative one:

$$\{-1, 0, 1\} = \{\rotatebox[origin=c]{180}{1}, 0, 1\}$$

Relatively common is the overhead bar or vinculum (known from Boolean logical negation): $\{\bar{1}, 0, 1\}$. Most common seems to be {−, 0, +}.

3.3 Truncating-rounding equivalence

An important aspect of balanced radices is that their number representations enable fractions to be automatically rounded in the event of truncation. [17] For unbalanced radices such as binary (or decimal) this is not the case. This is perhaps best explained through an example: the binary string b0.111 (most significant digit first) denotes $\frac{7}{8}$, or 0.875 in decimal. When b0.111 is truncated to b0.1 it represents $\frac{1}{2}$, whereas b1.0 (incidentally 1.0 in decimal too) is closer. For balanced ternary, this is not the case: 0.+++ denotes 0.481, slightly lower than $\frac{1}{2}$, and when truncated to 0.+ it represents $\frac{1}{3}$, which is what rounding would result in as well; that is certainly closer than the next closest alternative, +.− (in decimal: $\frac{2}{3}$).
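The example above can be checked exhaustively for short trit strings. The following is a minimal sketch using exact fractions; the helper names are hypothetical and only fractional parts (0.xxx) are considered.

```python
from fractions import Fraction
from itertools import product

TRIT_VALUE = {"-": -1, "0": 0, "+": 1}


def fraction_value(trits: str) -> Fraction:
    """Value of the balanced ternary fraction 0.<trits>, most significant trit first."""
    return sum(
        (Fraction(TRIT_VALUE[t], 3 ** (i + 1)) for i, t in enumerate(trits)),
        Fraction(0),
    )


def round_to_trits(x: Fraction, k: int) -> Fraction:
    """Nearest multiple of 3**-k to x (ties cannot occur for finite trit strings)."""
    scale = 3 ** k
    return Fraction(round(x * scale), scale)


def truncation_equals_rounding(length: int) -> bool:
    """Check every trit string of the given length and every truncation point."""
    for digits in product("-0+", repeat=length):
        trits = "".join(digits)
        exact = fraction_value(trits)
        for k in range(length + 1):
            if fraction_value(trits[:k]) != round_to_trits(exact, k):
                return False
    return True


if __name__ == "__main__":
    print(fraction_value("+++"), fraction_value("+"))
    print(truncation_equals_rounding(4))
```

The check works because the discarded tail is always smaller in magnitude than half of the last kept position.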

3.4 Addition and multiplication tables

Unlike decimal, binary has a very simple multiplication table, without carries. This is significant because it reduces multiplication complexity. Balanced ternary shares this feature, but unbalanced ternary does not: 2 ∗ 2 = 4 results in a digit of 1 with a carry of 1 (3 + 1 = 4).

Table 3.1: addition and multiplication tables

ADD |  −    0    +         MUL |  −    0    +
----+---------------       ----+---------------
 −  |  −+   −    0          −  |  +    0    −
 0  |  −    0    +          0  |  0    0    0
 +  |  0    +    +−         +  |  −    0    +

3.5 Extra: calculating to and from balanced ternary

In order to facilitate potential adoption, an alternative computer would do well to interoperate with the incumbent radices of choice. That means converting to and from (in this case) balanced ternary is vital, and not only for comprehension. To convert from balanced ternary, use the formula from 3.1.1. To convert to balanced ternary, perform this algorithm:

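    def to_balanced_ternary(n):
        """Return integer n as a trit string, most significant trit first."""
        if n == 0:
            return "0"
        tritstring = ""
        while n != 0:
            remainder = n % 3        # Python's % always yields 0, 1 or 2
            if remainder == 0:
                tritstring = "0" + tritstring
            elif remainder == 1:
                tritstring = "+" + tritstring
            else:                    # remainder 2 is the digit -1 plus a carry
                tritstring = "-" + tritstring
            n = (n + 1) // 3         # floor division, also valid for negative n
        return tritstring

With floored modulo and division, as in Python, the same update also handles negative n. In languages where the remainder truncates toward zero, it is probably easier to just treat your number as positive and negate all the symbols afterwards.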
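For the reverse direction, the formula from 3.1.1 can be evaluated digit by digit, and the negation shortcut mentioned above amounts to swapping the + and − symbols. A minimal sketch, with illustrative function names that are not taken from the original work:

```python
TRIT_VALUE = {"-": -1, "0": 0, "+": 1}


def from_balanced_ternary(trits: str) -> int:
    """Evaluate sum(d_i * 3**i) with Horner's scheme, most significant trit first."""
    value = 0
    for trit in trits:
        value = value * 3 + TRIT_VALUE[trit]
    return value


def negate(trits: str) -> str:
    """Negate a balanced ternary number by swapping '+' and '-'."""
    return trits.translate(str.maketrans("+-", "-+"))


if __name__ == "__main__":
    print(from_balanced_ternary("+--"))          # 9 - 3 - 1
    print(from_balanced_ternary(negate("+--")))  # the same magnitude, negated
```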

CHAPTER 4 Balanced Ternary logic

Ternary has significantly more possible logic gates than binary. Binary has $2^2 = 4$ possible single-input gates: always on, always off, identity (buffer), and not, the only noteworthy one.

4.1 Single-input gates

Ternary has $3^3 = 27$ possible monadic gates. Of those, the noteworthy ones are the inverters, the cycles (increment and decrement), and the decoders. Inverters come in three types: standard (STI), negative (NTI), and positive (PTI). [15] The STI is the most useful, since it is all it takes to negate a balanced ternary number.

Table 4.1: Inverter- and cycle gates

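input | STI  NTI  PTI
------+--------------
  −   |  +    +    +
  0   |  0    −    +
  +   |  −    −    −

input | increment  decrement
------+---------------------
  −   |     0          +
  0   |     +          −
  +   |     −          0

input | is false?  is unknown?  is true?
------+---------------------------------
  −   |     +           −          −
  0   |     −           +          −
  +   |     −           −          +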
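The single-input gates in the table can be written as lookup tables. The sketch below assumes trits are encoded as the Python integers −1, 0 and +1, an encoding chosen here purely for illustration; it shows that applying the STI to every trit negates a number.

```python
from itertools import product

# Lookup tables copied from the inverter and cycle tables above.
STI = {-1: +1, 0: 0, +1: -1}
NTI = {-1: +1, 0: -1, +1: -1}
PTI = {-1: +1, 0: +1, +1: -1}
INCREMENT = {-1: 0, 0: +1, +1: -1}
DECREMENT = {-1: +1, 0: -1, +1: 0}

# Every monadic gate is one choice of output per input value: 3**3 in total.
ALL_MONADIC_GATES = list(product((-1, 0, +1), repeat=3))


def value(trits):
    """Integer value of a trit list, most significant trit first."""
    total = 0
    for trit in trits:
        total = total * 3 + trit
    return total


def negate_number(trits):
    """Negate a balanced ternary number by passing every trit through an STI."""
    return [STI[trit] for trit in trits]


if __name__ == "__main__":
    five = [+1, -1, -1]
    print(value(five), value(negate_number(five)))
```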

4.2 Dual-input gates

Two ternary inputs make for $3^2 = 9$ possible input combinations, as shown by these 3x3 tables, and every output value can take 3 possible values too. Therefore $3^{(3^2)} = 19683$ dual-input gates are possible. Many are uninteresting, but some gates are parallels to binary gates such as and, or, or the Boolean implication. In binary, gates such as nand and nor are universal, meaning you can assemble these gates in formations to produce any other gate’s output. For ternary there also exist universal gates; in fact, out of 19683 possible gates, 3773 are universal. [16]

Table 4.2: minimum/and, maximum/or, implication

MIN | −  0  +      MAX | −  0  +      IMP | −  0  +
----+---------     ----+---------     ----+---------
 −  | −  −  −       −  | −  0  +       −  | +  0  −
 0  | −  0  0       0  | 0  0  +       0  | +  0  0
 +  | −  0  +       +  | +  +  +       +  | +  +  +

Table 4.3: antimin/nand, antimax/nor

AMIN | −  0  +     AMAX | −  0  +
-----+---------    -----+---------
  −  | +  +  +       −  | +  0  −
  0  | +  0  0       0  | 0  0  −
  +  | +  0  −       +  | −  −  −

The sum, consensus, and accept-anything gates below will prove useful in the simulation, since we can construct our half- and full-adders from them. The sum gate in particular outputs the modulo-3 sum of its inputs. Note that it is essentially the same table as the previously mentioned addition table but without the carries, which are in turn essentially the same as the consensus.

Table 4.4: mod-3 sum, consensus, and accept anything

SUM | −  0  +      CONS | −  0  +      ANY | −  0  +
----+---------     -----+---------     ----+---------
 −  | +  −  0        −  | −  0  0       −  | −  −  0
 0  | −  0  +        0  | 0  0  0       0  | −  0  +
 +  | 0  +  −        +  | 0  0  +       +  | 0  +  +

Perhaps due to the sheer number of dual-input ternary gates, it seems hard to identify which gates are useful. D. Jones mentions that the useful consensus gate seems hardly utilised. [15] Also note the symmetry in most of these tables, indicating that these gates are commutative.
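The gates from the tables above can be expressed as small functions, which makes the stated properties easy to check. This is a sketch under the assumption that trits are encoded as the integers −1, 0 and +1; the function names are illustrative.

```python
from itertools import product

TRITS = (-1, 0, +1)


def t_min(a, b):
    """Minimum, the ternary parallel of AND."""
    return min(a, b)


def t_max(a, b):
    """Maximum, the ternary parallel of OR."""
    return max(a, b)


def t_sum(a, b):
    """Modulo-3 sum, mapped back into the balanced range -1..+1."""
    return (a + b + 1) % 3 - 1


def t_cons(a, b):
    """Consensus: the common value if both inputs agree, otherwise 0."""
    return a if a == b else 0


def t_any(a, b):
    """Accept anything: the sign of a + b."""
    return max(-1, min(+1, a + b))


def is_commutative(gate):
    return all(gate(a, b) == gate(b, a) for a, b in product(TRITS, repeat=2))


if __name__ == "__main__":
    print(3 ** (3 ** 2))  # number of possible dual-input gates
    for gate in (t_min, t_max, t_sum, t_cons, t_any):
        print(gate.__name__, is_commutative(gate))
```

For every input pair, a + b equals 3 · cons(a, b) + sum(a, b), which is why sum and consensus together behave as the digit and carry of the addition table.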

CHAPTER 5 Simulating a balanced ternary ripple carry adder

5.1 Making the choice for simulation with SystemC

Evaluating the merits of a balanced ternary architecture can be done in a number of ways: through simulation, hardware implementation, or formal verification. Both implementation and formal verification would be time- and cost-prohibitive. Fortunately, simulation does not hinder our prospects of evaluating ternary architectures. Due to the large discrepancy in availability and capability between ternary and binary devices, simulation can provide a fairer comparison by eliminating binary components’ vast competitive edge afforded by decades of engineering pursuit.

Within the field of simulation, multiple approaches arise as well. In terms of available tools there are the classic hardware description languages VHDL and Verilog, and there are high-level synthesis (HLS) tools that generate these HDLs, many among them costing thousands of US dollars in licensing fees. [https://www.element14.com/community/groups/fpga-group/blog/2014/10/28/alternatives-to-vhdlverilog-for-hardware-design] Fortunately, one such HLS tool, SystemC by Accellera, is both open-source (Apache licence) and qualified: SystemC was adopted as an IEEE standard in 2005, and has been updated since. [4] [IEEE Std 1666-2005, 1666-2011, 1666.1-2016] Furthermore, it seems hardware designs in SystemC can be synthesised into Verilog or VHDL using specialty compilers when adhering to restrictions on SystemC functionality.

In terms of methodology there are time-driven and (discrete) event-driven simulation. SystemC allows both, but the latter is unsuitable since latency is a vital consideration for our experiment.

5.2 Balanced ternary ripple carry adder: design

Using the aforementioned sum, cons, and any gates we can construct a balanced ternary halfadder that resembles a binary halfadder. [15]

Figure 5.1: binary halfadder (inputs A and B, outputs S and C), courtesy of inductiveload

The balanced ternary version has a gate delay of 1 for both the sum and the carry-out, just like the binary counterpart.

Figure 5.2: balanced ternary halfadder (inputs A and B feed a sum gate producing Si and a cons gate producing Cout)

Then, by combining two half-adders and an accept-anything gate, we can construct a full-adder. Instead of the accept-anything gate, another sum gate would work too, but the accept-anything gate is simpler and therefore preferred because it could be implemented with fewer transistors. [15]

Figure 5.3: balanced ternary fulladder (inputs Ai, Bi and Ci; two sum gates, two cons gates and one any gate; outputs Si and Ci+1)

Again, for comparison, the binary equivalent:

Figure 5.4: binary fulladder (inputs A, B and Cin; outputs S and Cout), courtesy of inductiveload

Finally, by stringing together a halfadder and several fulladders, we can construct a ripple carry adder:

Figure 5.5: 9-trit balanced ternary ripple carry adder (a halfadder HA for A0 and B0, followed by fulladders FA1 to FA8 for A1..A8 and B1..B8; carries C1 to C8 ripple between the stages; outputs S0 to S8 and overflow)
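The structure described in this section can be mirrored in a purely functional sketch: a half-adder from sum and cons, a full-adder from two half-adders and an any gate, and a chain of these. This is not the SystemC model and it ignores timing; trits are assumed to be the integers −1, 0 and +1, stored least significant first, and all names are illustrative.

```python
def t_sum(a, b):
    return (a + b + 1) % 3 - 1          # modulo-3 sum in the range -1..+1


def t_cons(a, b):
    return a if a == b else 0           # consensus


def t_any(a, b):
    return max(-1, min(+1, a + b))      # accept anything


def half_adder(a, b):
    return t_sum(a, b), t_cons(a, b)    # (sum trit, carry trit)


def full_adder(a, b, carry_in):
    partial, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial, carry_in)
    return total, t_any(carry1, carry2)


def ripple_carry_add(a_trits, b_trits):
    """Add two equally long trit lists (least significant first)."""
    first, carry = half_adder(a_trits[0], b_trits[0])
    sums = [first]
    for a, b in zip(a_trits[1:], b_trits[1:]):
        trit, carry = full_adder(a, b, carry)
        sums.append(trit)
    return sums, carry                  # carry is the overflow trit


def to_trits(n, width):
    """Balanced ternary digits of n, least significant first."""
    trits = []
    for _ in range(width):
        digit = (n + 1) % 3 - 1
        trits.append(digit)
        n = (n - digit) // 3
    return trits


def from_trits(trits):
    return sum(trit * 3 ** i for i, trit in enumerate(trits))


if __name__ == "__main__":
    sums, overflow = ripple_carry_add(to_trits(9841, 9), to_trits(1, 9))
    print(sums, overflow, from_trits(sums) + overflow * 3 ** 9)
```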

CHAPTER 6 Balanced ternary gate design

6.1 Types of MOSFET

A MOSFET (Metal Oxide Semiconductor Field Effect Transistor) is a device that primarily functions as a control mechanism for current between source and drain, based on a voltage potential at its gate. In other words, it is a kind of switch that is turned on/off electrically. MOSFETs come in two channel types, N-type and P-type, depending on how they are constructed. The difference is made by adding impurities (a dopant) to the semiconductor, a process known as doping. Depending on the dopant, doping creates a surplus of electrons or a shortage of electrons (holes) in the semiconductor, typically a lattice structure of silicon. These change the conductivity characteristics of the MOSFET. [24] Additionally, MOSFETs can be split into enhancement and depletion types. These types indicate whether a transistor is normally-off (enhancement-mode) or normally-on (depletion-mode) at zero gate-source voltage. These classifications are related: a transistor is in enhancement- or depletion-mode depending on its channel type and input voltage polarity. An N-type (MOS)FET enters enhancement-mode when crossing a positive threshold voltage, and depletion-mode with a negative voltage. Likewise, but vice versa, a P-type MOSFET is in enhancement-mode with a negative voltage, and in depletion-mode with a positive voltage. These 4 combinations and their notation can be seen in figure 6.1, below.

6.2 Balanced ternary gates

The previous section alludes to the potential construction of a balanced ternary design using the negative (voltage) operating principles of MOSFETs. This is indeed possible: ’Mechanical Advantage’ published a number of balanced ternary gate designs using complementary p- and n-type MOSFETs as a Hackaday project. [1] (The fact that the design uses complementary MOSFETs is beneficial for power usage.) The two gates necessary for a ripple carry adder, SUM and CONS, are included in figures 6.2 and 6.3 below.

6.2.1 Substituting the any-gate

Note: although the ANY gate is missing from these designs, this does not prevent construction of a balanced ternary full-adder. As Dr. Jones [15] mentions, there are two input combinations in the section of the balanced ternary full-adder that produces the carry-out (the ANY-gate) that cannot occur, and are therefore inconsequential for the output. That means the ANY-gate in the full-adder can be substituted by 8 other (possibly more complex) gates (9 candidates in total) including, fortunately, the SUM gate. This is shown in figure 6.4.
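The substitution argument can be verified by enumeration: list every pair of trits that can reach the carry-combining gate and compare ANY and SUM on those pairs only. A minimal sketch, assuming trits encoded as −1, 0 and +1 and the full-adder structure of figure 5.3; the names are illustrative.

```python
from itertools import product

TRITS = (-1, 0, +1)


def t_sum(a, b):
    return (a + b + 1) % 3 - 1


def t_cons(a, b):
    return a if a == b else 0


def t_any(a, b):
    return max(-1, min(+1, a + b))


def carry_gate_inputs(a, b, carry_in):
    """The two carries that feed the final gate of the full-adder."""
    carry1 = t_cons(a, b)
    carry2 = t_cons(t_sum(a, b), carry_in)
    return carry1, carry2


# All input pairs the final gate can ever see, over all 27 adder inputs.
REACHABLE = {carry_gate_inputs(a, b, c) for a, b, c in product(TRITS, repeat=3)}

# Pairs that never occur, and the number of gates that agree with ANY elsewhere.
UNREACHABLE = set(product(TRITS, repeat=2)) - REACHABLE
CANDIDATE_GATES = 3 ** len(UNREACHABLE)

if __name__ == "__main__":
    print(sorted(UNREACHABLE), CANDIDATE_GATES)
    print(all(t_any(x, y) == t_sum(x, y) for x, y in REACHABLE))
```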

Figure 6.1: MOSFET types, courtesy of link

6.3 Binary MOSFET-based gates

For completeness, and to compare component cost later on, here are the necessary binary gates implemented with MOSFETs. In the designs of figures 6.5 and 6.6, the and and or gates are implemented by inverting the nand and nor parts of the design. We can substitute the part of the binary fulladder that produces the carry-out (the OR of the two ANDs) with NANDs, as shown in figure 6.8. This substitution reduces the number of necessary transistors in the carry-block from 18 to 12.

Figure 6.2: Sum gate, design courtesy of [1]

Figure 6.3: Consensus gate, design courtesy of [1]

Figure 6.4: balanced ternary fulladder, alternative to 5.3 without any-gate (inputs Ai, Bi and Ci; three sum gates and two cons gates; outputs Si and Ci+1)

Figure 6.5: CMOS design of Binary (N)AND, courtesy of [3]

Figure 6.6: CMOS design of Binary (N)OR, courtesy of [3]

Figure 6.7: CMOS design of Binary XOR, courtesy of wikipedia

Figure 6.8: Binary fulladder, alternative carry-block using NAND (inputs A, B and Cin; outputs S and Cout)

CHAPTER 7 Results and discussion

7.1 Simulation results

The experiment was set up to measure the latency of adding the numbers $9841_{10}$, i.e. "+++++++++", and $1_{10}$, i.e. "00000000+" (most significant trit first), representing a worst-case delay since every stage has a carry-out. The simulation assumes that the delays for sum, cons, and any are identical. The SystemC simulation of the 9-trit balanced ternary ripple carry adder yields a latency of 32 timesteps. This is precisely the expected latency. The 9th (last) result trit, S8, depends on 8 preceding carry trits, the first of which comes from the halfadder, which produces its carry-out after 1 cons gate delay. The other seven, C2 through C8, come from full-adders one through seven, each having 2 gate delays: 1 cons + 1 any.

Consider the first fulladder: the consensus after the carry-in also depends on the sum of A1 and B1. If we assume that the length of wiring, and therefore the delay, is greater between modules than within them, then it takes longer for the halfadder’s carry-out (the consensus of A0 and B0) to arrive at the fulladder’s carry-in consensus gate than it takes to calculate the sum of A1 and B1 in the fulladder itself. (This includes the assumption that $T_{sum} = T_{cons} = T_{any}$.) Therefore, calculating the carry-out of the first fulladder is not time-dependent on the sum of A1 and B1. For all subsequent full-adders, two through eight, the time difference between calculating the sum of Ai and Bi versus Ci only becomes larger. In other words, the simulated critical path is:

$$(n_{trits} - 1) \cdot (T_{cons} + T_{any}) + T_{cons}$$

to calculate the carry-out or overflow trit of a balanced ternary ripple carry adder. A sum trit is known 1 gate delay sooner than the carry-out, since the carry-out also has to go through an any-gate whereas the sum trit does not. This means S8 is known after $(9 - 1) \cdot 2 + 1 - 1 = 16$ gate delays. By arbitrarily choosing the gate delay to be 2ns to account for transistor propagation delay, S8 is known after $2 \cdot 16 = 32$ ’nanoseconds’. This unit of time is really a dimensionless number of time-steps: the time resolution and delays in SystemC are chosen freely and are not based on any experimentally derived measurement, so the unit can be ignored. This delay of 32 timesteps, as determined by the critical path, is perfectly reflected in the graph below. The graph is produced from the value-change-dump of the simulation. Since the simulation uses 2-bit binary numbers to represent the signed digits, including the negative-one state, it was necessary to parse the tracefile (.vcd) that the simulation outputs with a custom (Python) script. The last row, S8, is the most noteworthy: the signal starts out as 0, since that is how the results array was initialised; then after 2 gate delays (t=4) it changes to positive, reflecting the addition of trits A8 + B8 = (1 + 0) = 1, to ultimately settle on negative after all the carry-ins trickle through.

Figure 7.1: graphed tracefile of simulated addition
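The expected latency can also be derived with a small static timing sketch that propagates worst-case arrival times through the adder structure of figure 5.5. This is not the SystemC simulation itself: it assumes all inputs are ready at t = 0, ignores wiring delay, and uses a gate delay of 2 timesteps as in the experiment; the function name is illustrative.

```python
def arrival_times(n_trits, t_sum=2, t_cons=2, t_any=2):
    """Worst-case settling time of each sum trit and of the overflow trit."""
    sums = [t_sum]                     # S0 comes straight from the half-adder
    carry = t_cons                     # C1 comes from the half-adder's cons gate
    for _ in range(1, n_trits):
        inner_sum = t_sum              # sum(Ai, Bi), inputs ready at t = 0
        inner_cons = t_cons            # cons(Ai, Bi)
        ready = max(inner_sum, carry)  # both inputs of the second half-adder
        sums.append(ready + t_sum)
        carry = max(inner_cons, ready + t_cons) + t_any
    return sums, carry


if __name__ == "__main__":
    sums, overflow = arrival_times(9)
    print(sums[-1], overflow)
```

With equal gate delays, the overflow time matches the closed form for the critical path given above, and the last sum trit settles one gate delay earlier.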

7.2 Critical paths

7.2.1 Binary critical path

The critical path in a binary ripple carry adder is almost identical. There is 1 (and-)gate delay for the carry-out in a binary halfadder. There are 2 nand gate delays for every subsequent full-adder. Therefore, the critical path is:

$$(n_{bits} - 1) \cdot (2 \cdot T_{nand}) + T_{and}$$

7.2.2 Balanced ternary critical path

The balanced ternary critical path was previously established to be:

$$(n_{trits} - 1) \cdot (T_{cons} + T_{any}) + T_{cons}$$

7.2.3 Comparison

From the above critical paths, it seems they are identical. However, note that a binary ripple carry adder needs more stages than a balanced ternary ripple carry adder to add the same number, since it takes more bits than trits to represent the same number. In fact, the same experiment would need $n_{bits} = \lceil \log_2(3^{n_{trits}}) \rceil$ bits. For the experiment’s chosen 9 trits, this means an equivalent binary ripple carry adder would take $\lceil \log_2(3^9) \rceil = 15$ bits. Therefore, since the latency per stage is identical and binary needs more stages, a balanced ternary ripple carry adder has a latency advantage.

With this we can answer two of the sub-questions: 1(a)ii and 1(a)i.
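The stage comparison can be reproduced with a few lines of integer arithmetic. This sketch assumes unit gate delays and two gate delays per full-adder stage, as in the critical paths above; the function names are illustrative, and `int.bit_length` is used to avoid floating-point logarithms.

```python
def equivalent_bits(n_trits: int) -> int:
    """ceil(log2(3**n_trits)): bits needed to cover all 3**n_trits values."""
    return (3 ** n_trits - 1).bit_length()


def ripple_gate_delays(n_digits: int) -> int:
    """Carry-out delay in gate delays: one for the half-adder, two per full-adder."""
    return (n_digits - 1) * 2 + 1


if __name__ == "__main__":
    n_trits = 9
    n_bits = equivalent_bits(n_trits)
    print(n_bits, ripple_gate_delays(n_trits), ripple_gate_delays(n_bits))
```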

7.2.4 Delay: number of MOSFETs and OpAmps

As can be observed from figures 6.5, 6.2 and 6.3, the component costs for the balanced ternary and binary designs are as listed in table 7.1. With that, we can express the critical path delay of a binary ripple carry adder,

$$(n_{bits} - 1) \cdot (2 \cdot T_{nand}) + T_{and}$$

in terms of MOSFET time-delay as:

$$((n_{bits} - 1) \cdot (2 \cdot 4) + 6) \cdot T_{mosfet}$$

For balanced ternary, to express the critical path in numbers of components, we ought to revisit an earlier assumption. The cons function in the halfadder is now lower-cost than the sum: just 2 OpAmps rather than 3 MOSFETs + 4 OpAmps. Unless the wiring latency is larger than the difference between sum and cons, the critical path in the ripple carry adder changes slightly: in the first fulladder, rather than being time-dependent on the carry-out of the halfadder, its own sum dominates. With that, the critical path in terms of gates changes to:

$$(n_{trits} - 1) \cdot (T_{cons} + T_{sum|any}) + T_{sum}$$

Expressed in numbers of MOSFETs and OpAmps, that is $n_{trits} \cdot 3$ MOSFETs + $(n_{trits} - 1) \cdot (4 + 2) + 4$ OpAmps. Together:

$$(n_{trits} - 1) \cdot (2 + 3 + 4) + (3 + 4)$$

Now we can see that the critical path latency favours binary for the same number of digits, rather than being identical as it is in numbers of gates.

Table 7.1: component cost per gate, #(MOSFETs + OpAmps)

Binary gate   #devices      Balanced ternary gate   #devices
XOR           6             SUM                     3+4
AND           6             CONS                    0+2
NAND          4

7.3 Component cost

7.3.1 Total binary ripple carry adder cost

To construct a binary ripple carry adder we need 1 halfadder and $n_{bits} - 1$ fulladders. Every halfadder needs 1 xor + 1 and; every fulladder needs 2 xors and 3 nands. In terms of gates, that comes down to 1 and + $2 \cdot (n_{bits} - 1) + 1$ xor + $3 \cdot (n_{bits} - 1)$ nand. In terms of MOSFETs, since and and xor each cost 6 MOSFETs, every halfadder needs $2 \cdot 6 = 12$ MOSFETs and every fulladder needs $2 \cdot 6 + 3 \cdot 4 = 24$ MOSFETs. In total that is $24 \cdot (n_{bits} - 1) + 12$ MOSFETs to build an n-bit binary ripple carry adder.

7.3.2 Total balanced ternary ripple carry adder cost

To construct a balanced ternary ripple carry adder, you need 1 halfadder and $n_{trits} - 1$ fulladders. Every halfadder needs 1 sum + 1 cons gate, and every fulladder needs 2 cons + 3 sum gates, i.e. $2 \cdot (n_{trits} - 1) + 1$ cons + $3 \cdot (n_{trits} - 1) + 1$ sum gates in total. In terms of MOSFETs and OpAmps, every halfadder needs 3 MOSFETs and $(2 + 4) = 6$ OpAmps; every fulladder needs $3 \cdot 3 = 9$ MOSFETs and $2 \cdot 2 + 3 \cdot 4 = 16$ OpAmps. In total that means $9 \cdot (n_{trits} - 1) + 3$ MOSFETs and $16 \cdot (n_{trits} - 1) + 6$ OpAmps, for a grand total of:

$$25 \cdot (n_{trits} - 1) + 9$$

components.
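The two totals can be tabulated for adders of equal expressiveness. The sketch below only restates the per-gate costs of table 7.1 and the gate counts given in this section; the constant and function names are illustrative, and MOSFETs and OpAmps are counted as one component each, as in the text.

```python
# Per-gate device counts from table 7.1.
XOR, AND, NAND = 6, 6, 4            # MOSFETs
SUM = 3 + 4                         # 3 MOSFETs + 4 OpAmps
CONS = 0 + 2                        # 2 OpAmps


def binary_rca_cost(n_bits: int) -> int:
    """MOSFETs in a binary ripple carry adder: 1 half-adder, n-1 full-adders."""
    half_adder = XOR + AND
    full_adder = 2 * XOR + 3 * NAND
    return half_adder + (n_bits - 1) * full_adder


def ternary_rca_cost(n_trits: int) -> int:
    """Components in a balanced ternary ripple carry adder (SUM replaces ANY)."""
    half_adder = SUM + CONS
    full_adder = 3 * SUM + 2 * CONS
    return half_adder + (n_trits - 1) * full_adder


if __name__ == "__main__":
    print(binary_rca_cost(15), ternary_rca_cost(9))
```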

7.4 Sublinear addition

The former comparison is somewhat contrived: balanced ternary is favoured merely because of its denser per-trit number representation, since the latency is linear in n. In other words, the comparison was made with addition that has a time complexity of $O(n)$, which is hardly the state of the art, at least for binary. Binary adders can make use of the fact that, depending on the inputs, a carry-out can be forced; the carry-out is then independent of the carry-in and can therefore be predicted. Using a carry look-ahead circuit, an adder can be split up in halves. This can be done recursively, yielding an adder that computes in $\log(n)$ time. Now the question is: can a similar thing be done for (balanced) ternary? Dr. Jones [15] has laid out the idea for an unbalanced carry look-ahead adder; it required a factor 1.62 more logic, even when accounting for the ≈ 1.58x benefit of trits over bits. An alternative [2] proposes a carry-select adder, which is, simply put, an adder that is split up (tree-wise or not) by duplicating (triplicating) the latter part to compute all 2 or 3 possible sums regardless of carry-in, and then, once the carry-in is known, selecting the correct result. This can also yield a $\log(n)$ time adder, at the cost of vastly more components. To answer sub-question 1(a)iii: yes, it is possible to design a sub-linear, $\log(n)$ adder.

CHAPTER 8 Conclusions and future research

Now that the sub research questions have been answered, we can discuss the main research questions: 1b and 1a.

8.1 Latency

To revisit latency: for binary and balanced ternary ripple carry adders of equal expressiveness, the following holds. They are equal in gate delays. In number of devices, the ratio of 8 vs 9 between the numbers of devices in the critical path, which multiply with the n-factor (the number of bits/trits), does not outweigh the $\frac{\log(3)}{\log(2)}$ factor. In addition, we have described a possibility for sub-linear addition using a carry-select circuit. With that, we can conclude that balanced ternary can outperform binary in latency, at the cost of increased component count.
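As a worked check of this claim, assume that 8 is the binary and 9 the balanced ternary number of critical-path devices per digit (the ordering is inferred from the sentence above). Equal expressiveness means $2^{n_{bits}} = 3^{n_{trits}}$, so $n_{bits} = \frac{\log(3)}{\log(2)} \cdot n_{trits} \approx 1.585 \cdot n_{trits}$, and the ratio of critical-path devices becomes:

$$\frac{9 \cdot n_{trits}}{8 \cdot n_{bits}} = \frac{9}{8} \cdot \frac{\log(2)}{\log(3)} \approx 1.125 \cdot 0.631 \approx 0.71$$

Under these assumptions the balanced ternary critical path contains roughly 71% as many devices as the binary one, because the per-digit penalty of $9/8 = 1.125$ is smaller than the digit-count advantage of about 1.585.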

8.2 Component count

It is hard to say exactly what effect a balanced ternary radix will have on component count. As in the latency case, the total component costs for ripple carry adders in binary vs. balanced ternary differ by a factor of 24 vs 25, which, although a close win for binary, definitely does not outweigh the added $\frac{\log(3)}{\log(2)}$ factor for binary. However, the need for a carry-select adder to compete in computation speed, with sublinear rather than linear addition, causes component counts to explode, tripling for every stage. There are $\log_3(n_{trits})$ such stages of carry-select, each having 3 times as many components. We can then assume the component cost to be roughly a factor of $\frac{\log(2)}{\log(3)} \cdot (3 \cdot \log_3(n_{trits}))$ more than binary.

8.3 Final thoughts

The difference in the effort that has gone into designing optimised circuits and components for the two radices cannot be overstated. Perhaps a carry look-ahead for balanced ternary is possible? Perhaps more efficient designs for balanced ternary gates exist? For instance, the accept-any gate that was replaced with the sum gate could cost fewer components. To conclude and answer the main research question 1: yes, based on this research a balanced ternary computer can be more efficient, with lower latency and less intercomponent wiring due to the lower number of trits necessary, despite perhaps vastly more intracomponent wiring if high duplication for carry-select is necessary. Beyond these factors, let’s not forget that separate rounding operations can be omitted in balanced ternary, as well as conversion to two’s complement.

8.3.1 Logic levels and supply voltage

Given a single supply voltage and reference, e.g. 1 V and 0 V, unbalanced ternary subdivides this range into three logic levels: high, middle, and low, e.g. 1 V, 0.5 V, and 0 V. This makes unbalanced ternary less noise resistant, since there is more uncertainty about which logic level a given voltage belongs to; i.e. the margins are smaller. Balanced ternary avoids this by utilising negative voltages as well, so the voltage range between logic levels remains just as large as in binary.

8.4 Closing remarks

In terms of ethical considerations, there is not much to be said about this research. Philosophical consensus holds that computers are not considered moral agents, and any computers that could be considered morally responsible belong more to the domain of artificial intelligence. [20] Therefore, research into computers that are neither moral agents nor morally aware cannot itself be deemed immoral. With fundamental research, attempting to predict how the research will be applied later on is largely futile. Despite this, researchers do have the moral responsibility to think about potential consequences of their research. To that end, the results of this research produce the following consideration: possible efficiencies, which translate to lower resource usage when realised, are hard to object to. Finally, requests for source code and other (source) files are welcome.

Bibliography

[1] M. Advantage. Tern - ternary logic circuits, 2019. https://hackaday.io/project/6284-tern-ternary-logic-circuits.
[2] M. Advantage. Ternary computing menagerie, 2019. https://hackaday.io/project/164907-ternary-computing-menagerie.
[3] Allaboutcircuits. Cmos gate circuitry. https://www.allaboutcircuits.com/textbook/digital/chpt-3/cmos-gate-circuitry/.
[4] I. S. Association et al. Ieee standard systemc(r) language reference manual. IEEE Std 1666-2005, pages 1–423, March 2006.
[5] B. Bailey. Aging problems at 5nm and below, 2020. https://semiengineering.com/aging-problems-at-5nm-and-below/.
[6] Bhāratī Kṛṣṇa Tīrtha. Vedic Mathematics, Or Sixteen Simple Mathematical Formulae from the Vedas. Hindu Vishvavidyalaya Sanskrit Publication Board, Banaras Hindu University, 1965. http://www.ms.uky.edu/~sohum/ma330/files/manuscripts/Tirthaji_S.B.K.,_Agarwala_V.S.-Vedic_mathematics_or_sixteen_simple_mathematical_formulae_from_the_Vedas-Orient_Book_Distributors_1981.pdf.
[7] N. P. Brusentsov and J. R. Alvarez. Ternary computers: The setun and the setun 70. In IFIP Conference on Perspectives on Soviet and Russian Computing, pages 74–80. Springer, 2006. https://link.springer.com/content/pdf/10.1007%2F978-3-642-22816-2_10.pdf.
[8] M. Campbell-Kelly. Computer: A History of the Information Machine. Taylor & Francis, 2018.
[9] CYBERCRYPT. Troika a ternary hash function, 2018. https://www.cyber-crypt.com/troika/.
[10] A. V. Drogin, S. A. Sheypak, and V. V. Shilov. Ieee computer society computer history competition (chc’61) project. http://ternary.3neko.ru/about.html.
[11] G. Frieder and Luk. Ternary computers: part 2: emulation of a ternary computer. In Conference record of the 5th annual workshop on Microprogramming, pages 86–89, 1972.
[12] M. Glusker. The ternary calculating machine of thomas fowler, 2005. http://www.mortati.com/glusker/fowler/recon.htm.
[13] B. Hayes. Computing science: Third base. American scientist, 89(6):490–494, 2001. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.6976&rep=rep1&type=pdf.
[14] E. Heilman, N. Narula, G. Tanzer, J. Lovejoy, M. Colavita, M. Virza, and T. Dryja. Cryptanalysis of curl-p and other attacks on the iota cryptocurrency. IACR Cryptology ePrint Archive, 2019:344, 2019. https://eprint.iacr.org/2019/344.pdf.
[15] D. W. Jones. The ternary manifesto, 2012. The University of Iowa, Department of Computer Science. http://homepage.divms.uiowa.edu/~jones/ternary/.
[16] C. Gidney. Exploring universal ternary gates, 2013. http://twistedoakstudios.com/blog/Post7878_exploring-universal-ternary-gates.
[17] D. E. Knuth. Art of computer programming, volume 2: Seminumerical algorithms. Addison-Wesley Professional, 2014.
[18] T. Leathers. Simple balanced ternary computer virtual machine - a virtual base 3 computer, 2017. https://github.com/SBTCVM.
[19] B. N. Malinovsky. Nikolay brusentsov - the creator of the trinary computer. In Essays on history of Computer Science and Technology in Ukraine, 1998. https://web.archive.org/web/20080117173113/http://www.icfcst.kiev.ua/Museum/Brusentsov.html.
[20] M. Noorman. Computing and moral responsibility. 2012.
[21] D. C. Rine. Setun’s reflections, 2005. Professor Emeritus of Computer Science, Volgenau School of Information Technology and Engineering. https://www.researchgate.net/file.PostFileLoader.html?id=54d0128dd685cca83b8b45fa&assetKey=AS%3A273688870490112%401442263919288.
[22] D. C. Rine. Computer science and multiple-valued logic: theory and applications. Elsevier, 2014.
[23] K. Rupp. 42 years of microprocessor trend data, 2018. https://www.karlrupp.net/2018/02/42-years-of-microprocessor-trend-data/.
[24] N. J. Sand. Physical properties of a mosfet, 2019. https://www.norwegiancreations.com/2019/01/physical-properties-of-a-mosfet/.
[25] C. Shannon. A symmetrical notation for numbers. The American Mathematical Monthly, 57(2):90–93, 1950. https://www.math.arizona.edu/~bhallmark/shannon.pdf.
[26] D. V. Sokolov. Triador: The only ternary computer made for real in the past 50 years, 2020. https://hackaday.io/project/28579-homebrew-ternary-computer.
[27] M. Stifel. Arithmetica integra, 1544. https://archive.org/stream/bub_gb_ywkW9hDd7IIC#page/n85/mode/2up.
[28] J. Tank. Base 3 - ternary computer from scratch, 2014. https://hackaday.io/project/1043-base-3-ternary-computer-from-scratch.
[29] J. Tank. Building a base 3 computer. Hackaday superconference, 2016. https://www.youtube.com/watch?v=EbJMtJq20NY.
[30] P. Vass. The Power of Three: Thomas Fowler, Devon’s Forgotten Genius. Boundstone Books, 2016. http://www.boundstonebooks.co.uk/the-power-of-three.html.
