A Fully-Integrated Four-way Outphasing Architecture in Heterogeneously Integrated CMOS/GaN Process Technologies

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Matthew LaRue, B.S.E.E., M.S.

Graduate Program in Electrical and Computer Engineering

The Ohio State University

2018

Dissertation Committee:

Dr. Waleed Khalil, Advisor

Dr. Ayman Fayed

Dr. Steven Bibyk

© Copyright by

Matthew LaRue

2018

Abstract

The growth of cellular and wireless communications over the last two decades has led to unprecedented congestion of the radio frequency (RF) spectrum. New frequency bands and bandwidth allocations are constantly being opened up to alleviate this congestion, but the demand for wireless data is increasing faster than spectrum availability. Despite fundamental hardware limitations, the complex and rapidly changing wireless environment necessitates three key requirements for RF transmitters: frequency agility, the capability of transmitting complex modulation schemes, and power efficiency.

In this work, a four-way outphasing architecture is developed for the generation of complex modulated waveforms across a wide RF frequency range. The proposed architecture offers an adjacent-channel leakage ratio (ACLR) improvement equivalent to a 2-bit increase in phase resolution compared to traditional two-way outphasing architectures. In addition, this architecture is more resilient to PA amplitude and timing mismatch.

This architecture is implemented in DARPA’s Diverse Accessible Heterogeneous Integration (DAHI) process technology, featuring the heterogeneous integration of 45 nm CMOS SOI and 0.2 µm GaN process technologies. The fabricated transmitter achieves greater than 33 dBm output power and a peak transmitter efficiency of greater than 41%.

I dedicate this work to my fiancée Rachel for being by my side throughout my PhD.

Acknowledgments

I would like to thank my parents for always believing in me, supporting my education, and providing me with guidance as I face difficult decisions in life.

To all my friends and family, thank you for all the support you have given me over the years.

To all of the members of my research group, thank you for the sharing of technical knowledge and camaraderie over the years. Thank you, Dr. Shane Smith, for all the help you have been. I would especially like to thank Dr. Jamin McCue and Dr. Lucas Duncan for all the time they took to teach me how to design circuits and for providing an example for me to follow as I advanced through my education.

Thank you to the researchers at the Air Force Research Labs for all the assistance you provided for my research.

Thank you to DARPA, the Air Force Office of Scientific Research, and the NASA Space Technology Research Fellowship for funding and supporting my research.

To my professors at Valparaiso University: Dr. Will, Dr. Tougaw, Dr. Budnik, Dr. Hart, Dr. Kraft, Dr. Johnson, and Dr. Olejniczak. Thank you for providing me with strong fundamentals in electrical engineering and encouraging my pursuit of education and learning.

There are two individuals I would like to specifically thank for what they have done for me. The first is Dr. Brian Dupaix. His technical guidance throughout the duration of my time at Ohio State has helped me navigate my research and all the aspects of circuit design. I would also like to thank him for all the personal support and guidance he has offered me over the years.

Finally, I would like to thank my advisor, Dr. Waleed Khalil. He saw my potential even when I knew next to nothing about circuit design and has spent countless hours over the past 6 years to help me realize this potential. Thank you for setting a high standard for technical knowledge and quality of work that I will strive to meet throughout my career.

To all the people that directly and indirectly contributed to my work that I have forgotten to mention, thank you for your contributions.

–Thank You.

Vita

2012 ...... B.S.E.E., Valparaiso University

2012-2016 ...... NASA Space Technology Research Fellow

2016-2017 ...... Ohio State University Fellow

2017 ...... M.S., The Ohio State University

2017-present ...... Graduate Research Associate, The Ohio State University

Publications

Research Publications

M. LaRue, T. Barton, M. Belz, et al., “A Multifunction Transmitter based on a Fully-Digital CMOS/GaN Architecture in DAHI Technology”. GOMACTech, March 2018.

L. Duncan, B. Dupaix, M. LaRue, et al., “A 10b DC-to-20GHz Multiple-Return-to-Zero DAC with >48dB SFDR”. IEEE Journal of Solid-State Circuits, Nov. 2017.

M. LaRue, B. Dupaix, S. Rashid, et al., “A Fully-Integrated S/C Transmitter in 45nm CMOS/0.2µm GaN Heterogeneous Technology”. IEEE Compound IC Symposium, Oct. 2017.

L. Duncan, B. Dupaix, M. LaRue, et al., “A 10b DC-to-20GHz Multiple-Return-to-Zero DAC with >48dB SFDR”. International Solid-State Circuits Conference, Feb. 2017.

S. Rashid, B. Dupaix, M. LaRue, et al., “A Wide-Band Complementary Digital Driver for Pulse Modulated Single-Ended and Differential S/C Band Class-E PAs in 130nm GaAs Technology”. IEEE Compound Semiconductor IC Symposium, Oct. 2016.

E. Alwan, S. Balasubramanian, M. LaRue, et al., “Coding-Based Ultra-Wideband Digital Beamformer with Significant Hardware Reduction”. Springer Journal of Analog Integrated Circuits and Signal Processing, July 2013.

M. LaRue, D. Tougaw, J. Will, “Stray Charge in Quantum-dot Cellular Automata: A Validation of the Intercellular Hartree Approximation”. IEEE Transactions on Nanotechnology, March 2013.

Fields of Study

Major Field: Electrical and Computer Engineering

Specialization: Analog and RF Electronics

Table of Contents

Page

Abstract ...... ii

Dedication ...... iii

Acknowledgments ...... iv

Vita ...... vi

List of Tables ...... xi

List of Figures ...... xii

1. Introduction ...... 1

1.1 Transmitter Architectures ...... 2
1.1.1 Analog Transmitter Architectures ...... 3
1.1.2 Outphasing Transmitter Architectures ...... 6
1.2 Heterogeneous Integration ...... 8
1.3 Research Overview ...... 10
1.4 Outline ...... 11

2. Modulation ...... 12

2.1 SOQPSK ...... 12
2.1.1 SOQPSK Overview ...... 13
2.1.2 SOQPSK Specifications ...... 17
2.2 64-QAM OFDM - LTE ...... 19
2.2.1 LTE Overview ...... 20
2.2.2 LTE Specifications ...... 22

3. Outphasing ...... 25

3.1 Outphasing History ...... 25
3.2 Outphasing Overview ...... 28
3.2.1 Two-way Outphasing Overview ...... 28
3.2.2 Four-way Outphasing Overview ...... 31
3.2.3 Receiver Systems ...... 35
3.3 Recent Advances ...... 35
3.3.1 Digital Phase Modulators ...... 37
3.3.2 Four-way Non-isolating Outphasing Combiners ...... 43
3.4 Outphasing Performance ...... 45
3.4.1 Outphasing Quantization ...... 46
3.4.2 Outphasing Power Efficiency ...... 49
3.4.3 PA and Timing Mismatch ...... 51
3.4.4 Summary ...... 56

4. Phase 1 Implementation ...... 57

4.1 Architecture ...... 58
4.2 Packaging and Measurement ...... 59
4.2.1 Packaging ...... 59
4.2.2 Measured Results ...... 62
4.3 Comparison and Key Takeaways ...... 75

5. Phase 2 Implementation ...... 77

5.1 Phase Modulator Implementation ...... 77
5.1.1 Reconfigurable DLL ...... 80
5.1.2 Glitch-Free Multiplexer ...... 83
5.1.3 Fine Delay ...... 86
5.1.4 Very-Fine Delay ...... 87
5.2 Amplification Implementation ...... 89
5.2.1 CMOS to GaN Driver ...... 90
5.2.2 Three-stage GaN PA ...... 92
5.3 Packaging and Measurement ...... 93
5.4 Conclusions and Takeaways ...... 96

6. Conclusions and Future Work ...... 98

6.1 Work Summary and Conclusions ...... 98
6.2 Future Work ...... 99
6.3 Final Thoughts ...... 100

Bibliography ...... 101

List of Tables

Table Page

1.1 Standard RF frequency bands [1] ...... 10

2.1 Comparison of SOQPSK and 64-QAM LTE ...... 24

3.1 Comparison of Chireix and LINC outphasing ...... 28

3.2 Time mismatch percent to time mismatch in picoseconds conversion at various frequencies ...... 55

3.3 Outphasing transmitter 20 MHz 64-QAM LTE simulation results . . 56

4.1 Transmitter implementation overview ...... 57

4.2 Phase 1 IC packaging and testing overview ...... 62

4.3 Measured transmitter power breakdown ...... 64

4.4 Test condition overview for different PA measurements ...... 65

4.5 Measured single-channel and combined PA performance ...... 69

4.6 Measured modulated outphasing performance ...... 70

4.7 Performance comparison of Phase 1 IC to recent works ...... 75

5.1 Transmitter implementation overview ...... 78

5.2 Phase 2 IC packaging and testing overview ...... 96

List of Figures

Figure Page

1.1 Simplified a) linear and b) nonlinear PA operation ...... 2

1.2 Block diagrams for a) Homodyne, b) envelope tracking, and c) polar transmitter architectures ...... 5

1.3 Outphasing transmitter block diagram ...... 6

1.4 Outphasing vectors showing a) in-phase and b) out-of-phase vector addition ...... 7

1.5 a) DAHI cross-section showing thermal and electrical heterogeneous interconnects (HICs) and thermal dissipation paths and b) scanning electron microscope image of InP HIC from [2] ...... 9

1.6 Transmitter block diagram showing heterogeneous interconnects (HICs) between the integrated dies ...... 9

2.1 Block diagram for a phase-quantized SOQPSK transmitter ...... 14

2.2 SOQPSK pulse-shaping filter phase response ...... 14

2.3 SOQPSK pulse-shaping filter phase response ...... 16

2.4 3-bit phase-quantized 20 Mbps SOQPSK signal spectral components . 16

2.5 20 Mbps SOQPSK spectrum for 2-8-bit phase quantization showing 1st and 2nd adjacent frequency bands ...... 18

2.6 Simulated ACLR1 and ACLR2 for 20 Mbps SOQPSK waveform . . . 19

2.7 a) Wideband single-carrier vs b) multiple orthogonal narrowband subcarriers ...... 20

2.8 Block diagram for 64-QAM OFDM LTE transmitter ...... 21

2.9 Simulated spectrum for 20 MHz 64-QAM LTE waveform ...... 22

2.10 Vectors demonstrating EVM concept ...... 23

3.1 Chireix outphasing concept from original 1935 publication [3] . . . . . 26

3.2 LINC outphasing concept from 1974 publication [4] ...... 27

3.3 Polar and Cartesian form relationship ...... 29

3.4 Two-way outphasing transmitter block diagram ...... 29

3.5 Two-way outphasing vectors with angle names ...... 30

3.6 Outphasing amplitude as a function of outphasing angle θ ...... 31

3.7 Four-way outphasing transmitter block diagram ...... 32

3.8 Four-way outphasing vectors with angle names ...... 33

3.9 Outphasing amplitude as a function of outphasing angle θ and ϕ . . . 34

3.10 Receiver block diagram ...... 34

3.11 Outcomes of phase select data changing at three different transition times (TTs) ...... 38

3.12 20 Mbps SOQPSK ACLR2 increase due to mux glitches as a function of the RF frequency (fRF) to oversampled baseband data rate (fOBB) ratio for 7-10-bit quantization ...... 40

3.13 Glitch-free multiplexer architecture presented in [5] ...... 41

3.14 Glitch-free multiplexer architecture presented in [6] ...... 42

3.15 PA loading for Chireix and four-way non-isolating combiners as phase difference is swept [7] ...... 44

3.16 6-bit phase quantized amplitude for a) two-way and b) four-way outphasing architectures ...... 46

3.17 Quantization algorithm for four-way outphasing ...... 47

3.18 Simulated 20 MHz LTE ACLR for quantized two-way and four-way outphasing architectures ...... 48

3.19 Outphasing power efficiency and 20 MHz 64-QAM LTE probability density function (PDF) ...... 50

3.20 a) Ideal two-way outphasing vectors and b) two-way outphasing vectors with 20% PA mismatch ...... 51

3.21 Simulated 20 MHz 64-QAM LTE ACLR in the presence of PA mismatch for four-way and two-way outphasing transmitters. Each data point is the average of 400 independent simulations ...... 52

3.22 Simulated 20 MHz 64-QAM LTE ACLR in the presence of time mismatch between outphasing channels. Solid lines are for 2-bit additional phase resolution over dotted lines ...... 55

4.1 Phase 1 IC block diagram ...... 59

4.2 Phase 1 IC die photo ...... 60

4.3 Phase 1 6”x4” printed circuit board with IC ...... 60

4.4 Wirebonded Phase 1 IC showing wirebond capacitors ...... 61

4.5 Measured Phase 1 output power and efficiency ...... 64

4.6 Measured Phase 1 output power ...... 66

4.7 Measured Phase 1 peak power efficiency ...... 67

4.8 Measured 10 Mbps SOQPSK spectrum ...... 68

4.9 Measured 5 Mbps 16-QAM spectrum ...... 71

4.10 Measured 5 Mbps 16-QAM far-out spectrum ...... 71

4.11 Measured 5 Mbps 16-QAM constellation diagram ...... 72

4.12 Measured pulsed radar near-in spectrum ...... 74

4.13 Measured pulsed radar far-out spectrum ...... 74

5.1 Phase 2 IC block diagram ...... 78

5.2 Phase modulator implementation showing four primary circuit blocks 79

5.3 Transmitter low, middle, and ranges and resolution . . 80

5.4 Reconfigurable DLL for 4-bit coarse phase resolution ...... 81

5.5 Frequency range of the reconfigurable DLL ...... 82

5.6 Frequency-agile glitch-free multiplexer architecture and timing diagrams 84

5.7 Three-stage fine delay ...... 86

5.8 Two-stage very fine delay ...... 88

5.9 CMOS to GaN driver and three-stage GaN PA ...... 89

5.10 CMOS to GaN driver ...... 90

5.11 CMOS to GaN buffer voltage gain with and without inductive peaking 91

5.12 Three-stage GaN PA ...... 92

5.13 Phase 2 integrated circuit die photo ...... 94

5.14 Wirebonded Phase 1 IC showing wirebond capacitors ...... 95

5.15 Measured Phase 2 output power and efficiency ...... 97

Chapter 1: Introduction

The growth of cellular and wireless communications over the last two decades has led to unprecedented congestion of the radio frequency (RF) spectrum. New frequency bands and bandwidth allocations are constantly being opened up to alleviate this congestion, but the demand for wireless data is increasing faster than spectrum availability. To better utilize the available spectrum, higher-order modulation schemes such as quadrature amplitude modulation (QAM) and orthogonal frequency-division multiplexing (OFDM) are being implemented. Because of their high peak-to-average power ratios (PAPRs), these modulation schemes require highly linear power amplifiers (PAs) to meet the spectral mask and error vector magnitude (EVM) limits of current wireless standards. Unfortunately, linear PAs have an inherent trade-off between linearity and power efficiency, resulting in low transmitter power efficiency. Despite this fundamental hardware limitation, the complex and rapidly changing wireless environment necessitates three key requirements for RF transmitters: frequency agility, the capability of transmitting complex modulation schemes, and power efficiency.

[Figure 1.1 panels: a) Linear PA, Pout = Pin + Gain, with the compression region and dynamic range marked; b) Nonlinear PA, Pout = constant. Both panels plot Pout versus Pin.]

Figure 1.1: Simplified a) linear and b) nonlinear PA operation

1.1 Transmitter Architectures

All three of these key transmitter requirements directly depend on the choice of PA. PAs are generally divided into two groups: linear (class-A, -AB, -B, and -C) and nonlinear (class-D, -E, and -F). While all of these PAs can be made wideband to satisfy the frequency-agility requirement, there is a direct trade-off between the capability of transmitting complex modulation schemes and power efficiency. Complex modulation schemes contain both amplitude and phase modulation, so the PA needs to linearly produce an amplified version of the input signal (i.e., Pout = Pin + Gain), as shown in Fig. 1.1a). This type of PA only achieves its peak efficiency at the high end of its output range, but the high PAPR of advanced waveforms means the PA typically operates 6-10 dB below its peak output power, resulting in low average power efficiency. This linearity vs. efficiency trade-off is exacerbated by the use of power back-off, which further increases the PA’s linearity by decreasing the input power to avoid the gain compression region.
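As a rough illustration of why 6-10 dB of back-off is so costly, the sketch below evaluates the ideal textbook drain-efficiency models for class-A and class-B amplifiers. These models and the function name `backoff_efficiency` are assumptions introduced here for illustration; they are not taken from this work's measurements or simulations.

```python
import math

def backoff_efficiency(backoff_db, pa_class="B"):
    """Ideal drain efficiency when operating backoff_db below peak output power.

    Assumed textbook models (not from this work):
      class-A: efficiency = 50% * (Pout / Pmax)
      class-B: efficiency = (pi / 4) * sqrt(Pout / Pmax)
    """
    power_ratio = 10 ** (-backoff_db / 10)  # Pout / Pmax as a linear ratio
    if pa_class == "A":
        return 0.5 * power_ratio
    if pa_class == "B":
        return (math.pi / 4) * math.sqrt(power_ratio)
    raise ValueError("pa_class must be 'A' or 'B'")

# Tabulate efficiency at peak power and at 6 dB and 10 dB back-off.
for backoff in (0, 6, 10):
    eff_a = backoff_efficiency(backoff, "A")
    eff_b = backoff_efficiency(backoff, "B")
    print(f"{backoff:2d} dB back-off: class-A {eff_a:.1%}, class-B {eff_b:.1%}")
```

Under these idealized models, even a class-B stage falls well below half of its peak efficiency by 10 dB of back-off, which is the motivation for the efficiency-enhancement architectures discussed in the rest of this chapter.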

An alternative to a linear PA architecture is a highly nonlinear switch-mode PA. In these amplifiers, the power transistor is driven deep into the ohmic (triode) region and behaves as a switch that is controlled by an RF input signal. Because of this switching operation, these PAs are highly nonlinear (they achieve the same output power regardless of the PA input power), as shown in Fig. 1.1b). This switching operation results in high power efficiency, with greater than 50% power-added efficiency (PAE) achieved in the literature. Despite this high power efficiency, these PA architectures are not used for complex modulation schemes because they are not capable of amplitude modulation. This lack of amplitude modulation makes them suitable only for constant-envelope applications such as radar, FM radio, and older wireless standards like GSM.

1.1.1 Analog Transmitter Architectures

With a conceptual understanding of linear/nonlinear PA operation, an analysis of waveforms, frequency ranges, and power efficiency for traditional analog transmitter architectures used for complex modulation schemes can be conducted. Single-band homodyne and heterodyne transmitter architectures (Fig. 1.2a)) have long served as the default transmitter architecture because they are proven and reliable. The in-phase (I) and quadrature (Q) components of the baseband waveform are generated by digital-to-analog converters (DACs), upconverted to the RF frequency by mixers, combined, and then amplified by a linear PA before being transmitted by the antenna. While these architectures are capable of transmitting wide-bandwidth complex modulation schemes, they deploy many RF components and multiple filters, making the architecture unsuitable for operation over multiple frequency bands without frequency-tunable components. Lastly, the architecture requires many components such as data converters, mixers, voltage-controlled oscillators, and a highly-linear PA, resulting in low transmitter power efficiency.

Envelope tracking transmitter architectures (Fig. 1.2b)) have been developed to improve transmitter power efficiency. These transmitters extract the envelope information of the waveform and use a supply modulator to dynamically modulate the PA’s supply voltage. This technique is quite effective at improving PA power efficiency, but it carries several downsides. First, the high gate capacitance of the large supply modulation transistor limits the supply modulation bandwidth, which in turn limits the transmitter bandwidth. Second, timing mismatch between the envelope and RF signal paths can limit the spectral mask and EVM performance of this transmitter.

Polar transmitter architectures (Fig. 1.2c)), also known as envelope elimination and restoration (EER) transmitters [8], are another transmitter architecture aiming to improve transmitter power efficiency by extracting the envelope of the RF signal. Unlike envelope tracking architectures, where the RF signal path still contains amplitude modulation, polar transmitters provide only phase modulation through the RF path and all the amplitude (envelope) modulation through the supply modulator path. Only having phase modulation in the RF path enables the use of nonlinear switch-mode PAs to amplify the signal, increasing the power efficiency of the transmitter. Because the amplitude modulation is only produced by the supply modulator, a highly-linear supply modulator is required to accurately generate the amplitude modulation. In addition, precise timing alignment between the amplitude and phase signal paths is critical for transmitter performance, but the different circuitry in the amplitude and phase paths makes this timing alignment difficult to achieve.

[Figure 1.2 panels: a) Homodyne: I and Q channel DACs, LO, PA; b) Envelope tracking: envelope path through a supply modulator, I/Q RF upconversion, PA; c) Polar: CORDIC, envelope path through a supply modulator, phase path through RF upconversion, PA.]

Figure 1.2: Block diagrams for a) Homodyne, b) envelope tracking, and c) polar transmitter architectures

[Figure 1.3 blocks: LO driving two phase modulators (phase1, phase2), each followed by a PA.]

Figure 1.3: Outphasing transmitter block diagram

In summary, techniques like envelope tracking and EER are capable of improving transmitter power efficiency for complex modulation schemes at the expense of having to implement supply modulation. This supply modulation is circuit area intensive, can limit the modulation bandwidth, and requires precise timing alignment to not degrade transmitter spectral purity.

1.1.2 Outphasing Transmitter Architectures

In 1935, Henri Chireix [3] proposed a transmitter architecture to overcome the efficiency/linearity trade-off without the overhead of supply modulation. This architecture, coined outphasing, combines the phase-modulated outputs of multiple nonlinear switch-mode PAs, as shown in Fig. 1.3. Based on the relative phase between the two channels, the combined output contains both amplitude and phase modulation. This outphasing operation can be represented by a vector addition operation; when the two signals are in-phase (Fig. 1.4a)) they add to create a large output amplitude, and when they are out-of-phase (Fig. 1.4b)) the vectors add to create a decreased amplitude.

[Figure 1.4 panels a) and b): vectors phase1 and phase2 and their vector sum, output.]

Figure 1.4: Outphasing vectors showing a) in-phase and b) out-of-phase vector addition
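The vector addition described above can be sketched numerically. The snippet below is a minimal illustration that assumes two ideal, equal-amplitude channels and lossless summation; the function name `outphase_two_way` and the symbol `theta` for the outphasing angle are labels chosen here for illustration, not the implementation used in this work.

```python
import cmath
import math

def outphase_two_way(amplitude, phase):
    """Split a target phasor into two constant-envelope phasors.

    amplitude: normalized target amplitude, 0 <= amplitude <= 1
    phase:     target phase in radians
    Each returned phasor has magnitude 0.5, so the sum peaks at 1.
    """
    theta = math.acos(amplitude)                # outphasing angle (assumed name)
    s1 = 0.5 * cmath.exp(1j * (phase + theta))  # channel 1: phase advanced
    s2 = 0.5 * cmath.exp(1j * (phase - theta))  # channel 2: phase retarded
    return s1, s2

# In-phase vectors (theta = 0) give full amplitude; a larger theta gives less.
for target in (1.0, 0.5, 0.0):
    s1, s2 = outphase_two_way(target, math.radians(30))
    print(f"target {target:.1f}: |s1| = {abs(s1):.2f}, |s2| = {abs(s2):.2f}, "
          f"|s1 + s2| = {abs(s1 + s2):.2f}")
```

Both channels keep a constant magnitude for every target amplitude, which is what allows each one to be amplified by a nonlinear switch-mode PA.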

Although the outphasing architecture has been around for over 80 years, it is seeing a resurgence in popularity due to advances in CMOS process nodes. These nodes allow the entire transmitter to be integrated onto a single die, decreasing the transmitter size compared to multi-chip approaches. In particular, the switching speeds of these nodes allow for the implementation of digitally-intensive phase modulators capable of generating sub-picosecond timing resolution while only consuming tens of milliwatts of power. Although these nodes are ideal for the phase modulator, the PA output power is limited by the low supply voltage and the current handling capability of the metal stack.

Alternatively, implementing the transmitter in a III-V process technology such as Gallium Nitride (GaN) would allow better PA performance due to the high drain voltage, power density, and RF-optimized metal stack of these processes. Despite these benefits, fully-integrated transmitters are not typically implemented in these technologies because only depletion-mode devices are available. This causes waveform modulation to be analog-intensive, resulting in the modulator being more narrowband and less power efficient than digitally-intensive modulators designed in CMOS. While multiple chips can be integrated on a printed circuit board to simultaneously achieve the benefits of both process technologies, this approach increases the interconnect parasitics, form factor, and power consumption.

1.2 Heterogeneous Integration

To decrease these interconnect parasitics while keeping the performance benefits of the individual technologies, DARPA’s Diverse Accessible Heterogeneous Integration (DAHI) program [9] uses a silicon CMOS IC as a substrate on top of which III-V chiplets are integrated, with both electrical and thermal heterogeneous interconnects (HICs) between the technologies. This integration requires no changes to the CMOS process and occurs after the back-end-of-line (BEOL) processing [2], allowing designers to rely on the existing highly-developed CMOS device and interconnect models.

Fig. 1.5a) shows the cross-section of the DAHI process with GaN and Indium Phosphide (InP) chiplets integrated on top of the CMOS chip. The HICs connect the chips, with thermal HICs utilizing the entire CMOS metal stack to allow better thermal dissipation to the backside of the CMOS and electrical HICs only going to the top metal layers so they can be used for signal interconnect. Fig. 1.5b) shows a scanning electron microscope image of an InP HIC, showing the InP chiplet, HIC, and CMOS chip.

[Figure 1.5 labels: a) thermal HICs and electrical HICs; b) HIC.]

Figure 1.5: a) DAHI cross-section showing thermal and electrical heterogeneous interconnects (HICs) and thermal dissipation paths and b) scanning electron microscope image of InP HIC from [2]

[Figure 1.6 blocks: LO driving two CMOS phase modulators, each followed by a CMOS to GaN driver, an HIC, and a GaN PA, with the PA outputs combined at Out.]

Figure 1.6: Transmitter block diagram showing heterogeneous interconnects (HICs) between the integrated dies

Table 1.1: Standard RF frequency bands [1]

Band Designation    Nominal Frequency Range
S                   2-4 GHz
C                   4-8 GHz
X                   8-12 GHz

Fig. 1.6 shows an example of an outphasing architecture implemented in the DAHI process. Conceptually this block diagram is the same as the basic outphasing architecture presented in Fig. 1.3, but additional circuitry is needed to interface the process technologies. The phase modulators are not able to generate the large voltage swing needed to drive a GaN PA, so a CMOS driver is needed to generate the large swing and overcome the parasitics associated with driving the GaN gate through the HICs.

1.3 Research Overview

This research aims to develop a novel four-way outphasing architecture to improve the performance of outphasing architectures transmitting high-PAPR modulation schemes such as 20 MHz 64-QAM LTE. This architecture is implemented in a 2.2-10.4 GHz transmitter, continuously covering the S, C, and lower X frequency bands (Table 1.1). This design is fabricated in the DAHI process technology, featuring the heterogeneous integration of GlobalFoundries 45 nm SOI CMOS and 0.2 µm GaN processes. By changing the output combining network, this transmitter can be tested in both traditional two-way and four-way outphasing modes, allowing for direct performance comparison of the two architectures. A reconfigurable delay line is implemented to allow for multiband operation, and a novel glitch-free phase selection multiplexer circuit has been developed. Lastly, an analysis of performance degradation due to phase quantization, multiplexer glitches, and amplitude/timing mismatch is presented.

1.4 Outline

This work begins in Chapter 2 with an analysis of the SOQPSK and LTE modulation schemes that will be used to characterize the performance of the transmitter. Next, Chapter 3 develops the theoretical background of the four-way outphasing transmitter and determines key specifications for the implemented transmitter. Chapters 4 and 5 present the first and second transmitter prototypes and compare their performance to state-of-the-art designs. Lastly, Chapter 6 summarizes this work and discusses future research in four-way outphasing transmitters.

Chapter 2: Modulation

Understanding the target modulation schemes is critical for deriving hardware requirements for a transmitter architecture. Section 2.1 overviews the shaped-offset quadrature phase-shift keying (SOQPSK) waveform that will be used to characterize the performance for simple phase-only modulation and proposes a specification to quantify the performance. Section 2.2 overviews the 20 MHz 64-QAM OFDM LTE waveform that will be used to quantify system performance with complex modulation schemes and presents the EVM and ACLR specifications used to quantify it.

2.1 SOQPSK

SOQPSK-TG is a constant-envelope modulation scheme used by the military telemetry community and is defined by the IRIG 106-09 standard [10]. SOQPSK is a general term encompassing an entire family of modulation schemes, but the -TG variant requires a specific impulse filter response that is defined by the standard. For the sake of brevity, the general term SOQPSK is used throughout this document to refer to the specific SOQPSK-TG implementation. Section 2.1.1 gives an overview of an SOQPSK transmitter and presents a theoretical analysis of quantization noise in phase-modulated transmitters, and Section 2.1.2 analyzes the SOQPSK spectrum and develops a quantifiable SOQPSK specification to assist in transmitter simulation and measurement.

2.1.1 SOQPSK Overview

Fig. 2.1 shows the block diagram for a phase-quantized SOQPSK transmitter. The input bitstream is first differentially encoded into the I and Q waveforms, which are then precoded into the ternary impulse series α. α can take the values {1, 0, −1}: α = 1 advances the carrier phase by 90°, α = 0 keeps the same carrier phase, and α = −1 decreases the carrier phase by 90°. This impulse series is then upsampled (with “0” padding) and filtered by the pulse-shaping filter, with the filter phase response shown in Fig. 2.2. Unlike typical communications schemes, which avoid inter-symbol interference (ISI), the filter response allows ISI by causing each phase transition to take two bit periods (Tb) to complete. This decreases the signal bandwidth to 0.78 of the bitrate, but requires the use of trellis detection [11] in the receiver for optimal data recovery. This filtered waveform is integrated to produce the phase signal θ, which is then quantized by an n-bit quantizer to produce the quantized phase signal θQ. θQ is used as the phase-select input signal of a digital phase modulator, which modulates the sinusoidal RF input signal. The output of the phase modulator is then amplified with a gain of G and transmitted. The transmitted signal, y(t), is defined as

$$y(t) = G\cos\left(\omega_{RF}\,t + \theta_Q(t)\right). \qquad (2.1)$$
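The quantization step and (2.1) can be sketched as follows. This is a minimal illustration that assumes a uniform, round-to-nearest n-bit quantizer over one full carrier cycle; the source does not state whether rounding or truncation is used, and the function names below are hypothetical.

```python
import math

def quantize_phase(theta, n_bits):
    """Map a phase in radians to the nearest of 2**n_bits uniform phase codes.

    Returns (code, theta_q): the integer phase-select code and the quantized
    phase in radians. Round-to-nearest is an assumption made for this sketch.
    """
    levels = 2 ** n_bits
    step = 2 * math.pi / levels          # phase step of one LSB
    code = round(theta / step) % levels  # wrap onto 0 .. levels - 1
    return code, code * step

def transmit_sample(t, f_rf, theta, n_bits, gain=1.0):
    """Evaluate y(t) = G cos(w_RF t + theta_Q(t)) from (2.1) at one instant."""
    _, theta_q = quantize_phase(theta, n_bits)
    return gain * math.cos(2 * math.pi * f_rf * t + theta_q)

# A 3-bit quantizer has 45-degree steps, so a 100-degree phase maps to code 2.
code, theta_q = quantize_phase(math.radians(100), 3)
print(code, round(math.degrees(theta_q), 1))
```

With this quantizer, the phase error never exceeds half of one step, which is the bound on ∆q used when reasoning about the quantization noise floor.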

[Figure 2.1 signal chain: bitstream, differential encode (I, Q), precode (α), upsample, convolve with the pulse-shaping filter and integrate (θ), n-bit quantize (θQ[(n-1):0]), phase modulator driven by cos(ωRF t) with output cos(ωRF t + θQ(t)), gain G, output y(t) = G cos(ωRF t + θQ(t)).]

Figure 2.1: Block diagram for a phase-quantized SOQPSK transmitter

[Figure 2.2 plot: phase response with vertical-axis ticks from 0 to 90 in steps of 22.5, versus normalized time (t / Tb) from -4 to 4.]

Figure 2.2: SOQPSK pulse-shaping filter phase response

In order to examine the effect of phase quantization on SOQPSK waveforms, (2.1) should be expanded, following the analysis presented in [12]. First, the quantized phase signal θQ is broken down into an ideal non-quantized component θ and a quantization error component ∆q, resulting in

$$\theta_Q(t) = \theta(t) + \Delta_q(t). \qquad (2.2)$$

Substituting (2.2) into (2.1) (and removing the PA gain constant G for simplicity), the transmitter output is represented by

$$y(t) = \cos\left[\omega_{RF}\,t + \theta(t) + \Delta_q(t)\right]. \qquad (2.3)$$

Using the trigonometric identity cos (x + y) = cos (x) cos (y) − sin (x) sin (y), where x = ωRF t + θ(t) and y = ∆q(t), and rearranging, (2.3) is re-written as

$$y(t) = \underbrace{\cos\left[\Delta_q(t)\right]}_{\text{Scaling Factor}}\;\overbrace{\cos\left[\omega_{RF}\,t + \theta(t)\right]}^{\text{Ideal Output Vector}} \;-\; \underbrace{\sin\left[\Delta_q(t)\right]}_{\text{Scaling Factor}}\;\overbrace{\sin\left[\omega_{RF}\,t + \theta(t)\right]}^{\text{Error Vector}} \qquad (2.4)$$

Fig. 2.3 shows the vector representation of (2.4). The quantized output consists of two vectors: an amplitude-scaled version of the ideal non-quantized output and an orthogonal error vector. The summation of these vectors produces a constant-envelope signal, verifying that constant-envelope operation is achieved with quantized phase.
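The decomposition can be checked numerically. The sketch below applies the identity cos(x + y) = cos(x)cos(y) − sin(x)sin(y) with x = ωRF t + θ(t) and y = ∆q(t), and confirms that the scaled ideal vector and the orthogonal error vector always combine to unit magnitude. The function names are hypothetical and the sample values are arbitrary.

```python
import math

def decompose(carrier_phase, dq):
    """Split cos(carrier_phase + dq) into its ideal and error terms.

    carrier_phase stands for w_RF * t + theta(t); dq is the quantization error.
    """
    ideal = math.cos(dq) * math.cos(carrier_phase)   # scaled ideal output
    error = -math.sin(dq) * math.sin(carrier_phase)  # orthogonal error term
    return ideal, error

def envelope(dq):
    """Magnitude of the vector sum of the two orthogonal scaling factors."""
    return math.hypot(math.cos(dq), math.sin(dq))

# Arbitrary carrier phase, with errors up to half an LSB of a 3-bit quantizer.
x = 1.2
for dq in (0.0, math.pi / 16, math.pi / 8):
    ideal, error = decompose(x, dq)
    print(f"dq = {dq:.3f}: sum = {ideal + error:.6f}, "
          f"direct = {math.cos(x + dq):.6f}, envelope = {envelope(dq):.6f}")
```

The envelope stays at one for any ∆q because the two scaling factors are the cosine and sine of the same angle, so quantizing the phase moves power from the ideal term into the error term without changing the total.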

Fig. 2.4 shows the simulated spectrum for a 3-bit phase-quantized 20 megabits per second (Mbps) SOQPSK signal. By ignoring the cos[∆q(t)] scaling factor on the ideal non-quantized signal component of (2.4), the signal can be broken into two components: the non-quantized ideal signal cos[ωRF t + θ(t)] and the orthogonal error vector component sin[∆q(t)] sin[ωRF t + θ(t)]. As the figure shows, the ideal component is responsible for the signal’s main lobe, and the error signal is responsible for the spectral noise floor. Because of the signal modulation and the nonlinear sine operation on the phase quantization error ∆q(t), the spectral noise floor due to quantization is colored rather than white, the latter being typical of DAC-based waveform generation [13].
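To get a feel for how the total power in the error component scales with resolution, the sketch below computes the mean of sin²(∆q) under the assumption that ∆q is uniformly distributed over plus or minus half of one phase step. That distribution is an assumption made only for this illustration, and the result is the total error power relative to a unit carrier, not its spectral shape and not an ACLR value.

```python
import math

def error_power_db(n_bits):
    """Mean error-vector power in dB relative to a unit-amplitude carrier.

    Assumes dq ~ Uniform(-a, a) with a = pi / 2**n_bits (half of one LSB).
    Closed form: E[sin^2(dq)] = 1/2 - sin(2a) / (4a), roughly a**2 / 3.
    """
    a = math.pi / 2 ** n_bits
    mean_square = 0.5 - math.sin(2 * a) / (4 * a)
    return 10 * math.log10(mean_square)

# Each additional bit of phase resolution lowers the error power by about 6 dB.
for n in range(3, 9):
    print(f"{n}-bit: {error_power_db(n):.1f} dB")
```

Under this assumption, each added bit lowers the total error power by roughly 6 dB. How that power is distributed across frequency depends on the modulation, as described above.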

[Figure 2.3 vectors: ideal output, scaled ideal output, error vector, and quantized output.]

Figure 2.3: SOQPSK pulse-shaping filter phase response

[Figure 2.4 panels: phase-quantized signal; non-quantized ideal signal component; orthogonal error vector signal component. Each panel plots the spectrum, with vertical-axis ticks from -30 to -80, versus frequency offset (MHz) from -60 to 60.]

Figure 2.4: 3-bit phase-quantized 20 Mbps SOQPSK signal spectral components

2.1.2 SOQPSK Specifications

Quantifiable waveform specifications are needed to determine the effect of quantization and other non-idealities on SOQPSK waveforms. Traditional phase-modulated signals, such as the Gaussian minimum-shift keying (GMSK) signals used in GSM cellular networks [14], use time-domain specifications such as peak/RMS phase error to quantify signal performance, but the intentional ISI introduced by the pulse-shaping filter makes these metrics inadequate for quantifying the performance of SOQPSK waveforms.

Instead of time-domain specifications, frequency-domain specifications can be used. The SOQPSK standard gives a spectral mask specification, which can readily be used to determine the necessary phase quantization. Fig. 2.5 shows the simulated spectrum of a 20 Mbps SOQPSK signal with 2- to 8-bit phase quantization, along with the spectral mask limit. This simulation shows that a minimum of 5-bit resolution is required to meet the spectral mask requirement, but several additional bits should be implemented to account for other non-idealities such as glitches and phase error due to process variation.

In order to quantify spectral performance, the development of an adjacent-channel leakage ratio (ACLR) for SOQPSK is proposed. ACLR is a specification used by many other modulation schemes and is defined as the ratio of the total adjacent channel power to the in-band power. With knowledge of the spectral mask and adjacent channel offset, an ACLR specification can be developed. For 20 Mbps SOQPSK, each channel is 15.6 MHz wide (99% power bandwidth of the SOQPSK signal) and adjacent channels are offset by 22 MHz, as shown in Fig. 2.5. Due to the colored nature of the quantization noise floor, both the first and second adjacent channels

[Spectrum plot with 2-bit, 4-bit, 6-bit, and 8-bit traces against Frequency Offset (MHz); the In-band, 1st Adj., and 2nd Adj. channel regions are marked]

Figure 2.5: 20 Mbps SOQPSK spectrum for 2-8-bit phase quantization showing 1st and 2nd adjacent frequency bands

[Plot of ACLR1, ACLR2, and the Limit against Quantization Bits]

Figure 2.6: Simulated ACLR1 and ACLR2 for 20 Mbps SOQPSK waveform

need to be considered when determining mask compliance. The ratio of the first adjacent channel power to the in-band power is defined as ACLR1, and the ratio of the second adjacent channel power to the in-band power is defined as ACLR2. By comparing the spectral mask in these regions to the in-band mask, a -29.8 dBc limit can be calculated. Fig. 2.6 shows the simulated ACLR1 and ACLR2 for 2- to 10-bit phase quantization, showing that a minimum of 5-bit resolution is required to meet the -29.8 dBc limit. This agrees with the 5-bit resolution requirement determined by direct inspection of the spectrum and mask. Note that the main lobe extends into the first adjacent channel (Fig. 2.5), limiting ACLR1 to -50 dBc. Because of this, ACLR2 is a better metric for quantization noise for SOQPSK modulators with greater than 7-bit resolution.
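
As an illustration of how the ACLR definition above can be evaluated from a simulated spectrum, the sketch below integrates a power spectral density over the in-band and adjacent channels. It uses the 15.6 MHz channel width and 22 MHz offset quoted above. It assumes that the "total adjacent channel power" sums the upper and lower channels and that the second adjacent channel sits at twice the offset; the function names and the synthetic spectrum are hypothetical.

```python
import math

def band_power(freqs_mhz, psd_linear, center_mhz, width_mhz):
    """Sum the linear PSD bins whose frequency falls inside one channel."""
    lo, hi = center_mhz - width_mhz / 2, center_mhz + width_mhz / 2
    return sum(p for f, p in zip(freqs_mhz, psd_linear) if lo <= f <= hi)

def aclr_dbc(freqs_mhz, psd_linear, adj_index, width_mhz=15.6, offset_mhz=22.0):
    """ACLR in dBc for the adj_index-th adjacent channel pair (1 or 2).

    Assumption: upper and lower adjacent channels are summed.
    """
    in_band = band_power(freqs_mhz, psd_linear, 0.0, width_mhz)
    upper = band_power(freqs_mhz, psd_linear, adj_index * offset_mhz, width_mhz)
    lower = band_power(freqs_mhz, psd_linear, -adj_index * offset_mhz, width_mhz)
    return 10 * math.log10((upper + lower) / in_band)

# Synthetic spectrum: flat in-band PSD with a floor 40 dB lower everywhere else.
freqs = [i / 10 for i in range(-600, 601)]
psd = [1.0 if abs(f) <= 7.8 else 1e-4 for f in freqs]
aclr1 = aclr_dbc(freqs, psd, 1)
aclr2 = aclr_dbc(freqs, psd, 2)
```

For this flat synthetic spectrum both values land near -37 dBc, which is the -40 dB floor raised by roughly 3 dB because two adjacent channels are summed.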

2.2 64-QAM OFDM - LTE

With a specification developed for simple phase modulation only waveforms, it is now necessary to determine specifications for complex waveforms that require both amplitude and phase modulation. For this analysis, a 20 MHz 64-QAM orthogonal

[Sketch of a) one wideband carrier and b) multiple narrowband sub-carriers, plotted against Frequency]

Figure 2.7: a) Wideband single-carrier vs b) multiple orthogonal narrowband sub-carriers

frequency-division multiplexing (OFDM) long-term evolution (LTE) waveform, as used by 4G cellular networks, is used. This waveform was selected because the prevalence of cellular communications makes it a popular, in-demand waveform, and because its very high PAPR makes it difficult to amplify efficiently.

2.2.1 LTE Overview

OFDM waveforms are typically used for high-bandwidth (and thus high-datarate) modulation. Traditional modulation schemes would synthesize a single high-bandwidth carrier for high-bandwidth modulation (Fig. 2.7a)), but this approach is susceptible to multipath interference, making data recovery at the receiver difficult. OFDM breaks this wide-bandwidth carrier into multiple narrowband sub-carriers (Fig. 2.7b)), making it less susceptible to multipath interference. Note that the sub-carriers are spaced orthogonally, meaning that they are spaced so the peak of each sub-carrier coincides with the nulls of the other sub-carriers, avoiding sub-carrier

[Block diagram blocks: Bitstream, Serial to Parallel, 64-QAM Mapping, sub-carrier mixers cos(ω1t) ... cos(ωnt) producing subc1 ... subcn, Inv FFT, RF upconvert with cos(ωRFt), and gain G]

Figure 2.8: Block diagram for 64-QAM OFDM LTE transmitter

interference. For LTE, the sub-carrier spacing is 15 kHz, so a 20 MHz 64-QAM modulated signal would consist of 1024 sub-carriers with an aggregate datarate of 92 Mbps [15].

Fig. 2.8 shows the block diagram of a 64-QAM LTE transmitter with n sub-carriers. First, the serial bitstream is broken into n parallel bitstreams. The parallel bitstreams then undergo 64-QAM mapping and are digitally mixed in the frequency domain with their respective sub-carrier frequencies. These frequency-domain signals are then combined and undergo an inverse fast Fourier transform (IFFT) to create the time-domain version of the signal. This signal is then upconverted to the RF carrier frequency, amplified, and transmitted.
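
The baseband portion of this chain can be sketched in a few lines. The example below is a simplified illustration only. It assumes a natural-binary 64-QAM mapping rather than the Gray-coded mapping of the LTE standard, omits the cyclic prefix and guard sub-carriers, and uses a naive inverse DFT in place of an optimized IFFT. All function names are hypothetical.

```python
import cmath

def qam64_map(bits6):
    """Map 6 bits to a 64-QAM point (natural binary, not Gray coded)."""
    i_level = 2 * (bits6[0] * 4 + bits6[1] * 2 + bits6[2]) - 7
    q_level = 2 * (bits6[3] * 4 + bits6[4] * 2 + bits6[5]) - 7
    return complex(i_level, q_level)

def inverse_dft(symbols):
    """Naive O(n^2) inverse DFT; an IFFT computes the same result faster."""
    n = len(symbols)
    return [sum(s * cmath.exp(2j * cmath.pi * k * t / n)
                for k, s in enumerate(symbols)) / n
            for t in range(n)]

def ofdm_symbol(bits, n_subcarriers):
    """Serial-to-parallel conversion, 64-QAM mapping, then inverse DFT."""
    groups = [bits[6 * k:6 * k + 6] for k in range(n_subcarriers)]
    return inverse_dft([qam64_map(g) for g in groups])

# Example: 48 arbitrary bits spread over 8 sub-carriers.
bits = [int(b) for b in format(0xB5C319A7E04D, "048b")]
time_samples = ofdm_symbol(bits, 8)
```

Each entry of the inverse DFT input corresponds to one sub-carrier, so the frequency-domain mixing and combining steps of Fig. 2.8 reduce to placing each mapped symbol in its own bin.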

The LTE waveform consists of multiple independent sub-carriers modulated by random data, so they typically add to produce a relatively constant output power. Unfortunately, there are instants in time when all the sub-carriers add constructively, producing a high peak output power. This results in LTE waveforms having a high PAPR of around 14 dB. While digital clipping can be used to decrease the peak output power at the expense of spectral purity and EVM, the resulting PAPR is still greater than 10 dB.
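
The constructive-addition effect can be illustrated with a worst-case toy signal. The sketch below assumes equal-amplitude, unmodulated sub-carriers that all peak at the same instant, which is not an LTE waveform; the helper names are hypothetical.

```python
import cmath
import math

def papr_db(samples):
    """Peak-to-average power ratio of a complex baseband signal, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

def aligned_tones(n_tones, n_samples):
    """Sum of n_tones equal-amplitude sub-carriers that all peak at t = 0."""
    return [sum(cmath.exp(2j * cmath.pi * k * t / n_samples) for k in range(n_tones))
            for t in range(n_samples)]

single = papr_db(aligned_tones(1, 64))    # one constant-envelope tone
sixteen = papr_db(aligned_tones(16, 64))  # sixteen tones adding in phase
```

A single tone has a PAPR of 0 dB, while N aligned tones reach 10 log10(N) dB (about 12 dB for N = 16), which shows how quickly the peak grows with the sub-carrier count.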

[Spectrum plot against Frequency Offset (MHz)]

Figure 2.9: Simulated spectrum for 20 MHz 64-QAM LTE waveform

2.2.2 LTE Specifications

Unlike SOQPSK, the LTE standard defines multiple specifications that can be used to quantify performance. Fig. 2.9 shows the simulated spectrum for a 20 MHz 64-QAM LTE signal with the spectral mask overlaid. The LTE standard defines an ACLR specification of -44.2 dBc, with the spectrum shown in Fig. 2.9 achieving an ACLR1 of -62.7 dBc and an ACLR2 of -63.2 dBc.

The LTE standard also specifies an 8% average EVM limit for 64-QAM OFDM. When a transmitted signal is received, the received constellation points will ideally coincide with the ideal constellation points, but transmitter/receiver imperfections cause the received points to shift. Fig. 2.10 demonstrates the concept of EVM, which is the ratio of the root mean squared (RMS) error vector (the distance between the received

[Constellation sketch on I/Q axes showing the Ideal Point, the Received Point, the error vector Verror, and the reference vector Vref]

Figure 2.10: Vectors demonstrating EVM concept

and ideal constellation point) to the RMS reference vector (the distance from the origin to the ideal constellation point). EVM is defined as:

\[
\mathrm{EVM} = \frac{V_{error,RMS}}{V_{ref,RMS}} \tag{2.5}
\]

where

\[
V_{error,RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\sqrt{(I_{ref}(i) - I_{rec}(i))^{2} + (Q_{ref}(i) - Q_{rec}(i))^{2}}\right)^{2}} \tag{2.6}
\]

and

\[
V_{ref,RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\sqrt{(I_{ref}(i))^{2} + (Q_{ref}(i))^{2}}\right)^{2}} \tag{2.7}
\]

For OFDM modulation schemes, the EVM is calculated for each sub-carrier and aggregated to get the EVM for the entire signal. Transmitter simulations show that the 8% EVM limit for 64-QAM LTE signals is actually quite lenient, with the typical simulated quantized waveform having an EVM of only 3%. Simulations show that an LTE waveform with system impairments will violate the -44.2 dBc ACLR limit well before the 8% EVM limit. Because of this, the remainder of this document will focus

Table 2.1: Comparison of SOQPSK and 64-QAM LTE

                  SOQPSK       64-QAM LTE
Modulation Type   Phase-only   Amp and Phase
PAPR              0 dB         >10 dB
ACLR1 Spec        -29.8 dBc    -44.2 dBc
ACLR2 Spec        -29.8 dBc    -44.2 dBc
EVM Spec          N/A          8%

on the spectrum/ACLR performance of the waveform when analyzing transmitter performance.
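
For reference, the EVM definition in (2.5)-(2.7) maps directly onto a short function. This is a minimal sketch in which constellation points are represented as complex numbers I + jQ; the function name and the four-point test constellation are illustrative assumptions, not data from this work.

```python
import math

def evm(ref_points, rec_points):
    """EVM as the ratio of RMS error vector to RMS reference vector."""
    n = len(ref_points)
    v_error_rms = math.sqrt(
        sum(abs(ref - rec) ** 2 for ref, rec in zip(ref_points, rec_points)) / n)
    v_ref_rms = math.sqrt(sum(abs(ref) ** 2 for ref in ref_points) / n)
    return v_error_rms / v_ref_rms

# Example: every received point is shifted by 0.1 along the I axis.
ideal = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
received = [p + 0.1 for p in ideal]
evm_percent = 100 * evm(ideal, received)
```

Here the RMS error is 0.1 and the RMS reference magnitude is √2, giving an EVM of about 7.07%.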

Table 2.1 gives an overview of the modulation schemes and target specifications presented in this chapter.

Chapter 3: Outphasing

Now that an overview of the modulation schemes has been outlined, it is possible to present the outphasing concept and propose the novel four-way outphasing architecture. First, Section 3.1 overviews the history of the outphasing concept. Next, Section 3.2 develops the communications theory behind two-way and four-way outphasing transmitter architectures and verifies that outphasing transmitters are compatible with traditional Homodyne/Heterodyne receiver architectures. Section 3.3 discusses recent advances in the outphasing field. Lastly, Section 3.4 compares the performance of two-way and four-way outphasing architectures using the modulation specifications discussed in Chapter 2.

3.1 Outphasing History

The term outphasing was originally coined by Henri Chireix in 1935 [3]. Although his paper describes a transmitter operating with tube amplifiers instead of transistors, the underlying concept is remarkably similar to modern outphasing systems. His approach was designed for AM radio transmitters, with the goal of improving power efficiency and decreasing power supply requirements. Instead of modulating the anode voltage with a low-frequency high-power amplifier (envelope tracking) or “acting on the grids of the high-frequency power amplifier” (as in class-A, -B, and -C power

Figure 3.1: Chireix outphasing concept from original 1935 publication [3]

amplifiers), modulation can also be achieved by varying the load impedance by introducing a phase difference between two power amplifiers and using a passive combiner network to combine their outputs (Fig. 3.1). This outphasing architecture gained commercial success in 1956 when it was implemented in RCA’s Ampliphase transmitters, which dominated the AM broadcast radio market for the next 15 years [16, 17].

D.C. Cox presented a more modern version of the outphasing concept in his 1974 paper describing “linear amplification with nonlinear components” (LINC) [4], with the block diagram for his approach shown in Fig. 3.2. His work focused on the implementation of the component separator, which takes the modulated amplitude and phase signals and generates the two phase-modulated signals that are amplified and combined.

While the architectures presented by Henri Chireix and D.C. Cox are both outphasing architectures, there is a key distinction that separates them. Henri Chireix’s approach uses a non-isolating power combiner, so the PAs see efficiency-increasing load modulation from the other PA. The reactive component of the load impedance

Figure 3.2: LINC outphasing concept from 1974 publication [4]

varies greatly as the phase difference between the channels changes, resulting in this architecture only achieving around 10 dB dynamic range. This non-isolating outphasing approach is typically referred to as Chireix outphasing. The outphasing approach proposed by D.C. Cox uses an isolating (Wilkinson) power combiner. With this approach, each PA sees a 50 Ω load regardless of the phase difference between the channels. This decreases the transmitter power efficiency because there is no load modulation, resulting in a power efficiency that is linearly related to output power. The advantage of this approach is that the transmitter dynamic range increases significantly. Instead of depending on the PA loading, the dynamic range depends on the output power match between the two PA channels, since low output power is achieved by the channels canceling each other out. These isolating outphasing transmitters are referred to as LINC outphasing and can typically achieve 25 dB dynamic range. Table 3.1 gives a brief comparison of Chireix and LINC outphasing approaches.

Table 3.1: Comparison of Chireix and LINC outphasing

                     Chireix Outphasing   LINC Outphasing
Combiner Type        Non-isolating        Isolating
RF Frequency Range   <500 MHz             >1 GHz
Dynamic Range        <10 dB               25 dB
Power Efficiency     High                 Low

3.2 Outphasing Overview

3.2.1 Two-way Outphasing Overview

Modern modulation schemes are typically represented in the Cartesian coordinate system in the form y(t) = I(t) cos(ωRF t) + Q(t) sin(ωRF t). Using well-known transformations, the same signal can be converted to the polar form:

y(t) = A(t) cos[ωRF t − ψ(t)] (3.1)

Fig. 3.3 shows the vector representation of this transformation.

Fig. 3.4 shows the block diagram for a two-way outphasing transmitter. First, the amplitude A(t) is decomposed into the outphasing angle, denoted θ(t). Two independent phase modulators are fed the same RF tone, but have different phase select signals, [−ψ(t) + θ(t)] and [−ψ(t) − θ(t)]. The outputs of the phase modulators are then amplified by a factor G/2, producing the outphasing vectors y1 and y2. These outphasing vectors are defined as:

\[
\begin{aligned}
y_1(t) &= \frac{G}{2}\cos[\omega_{RF} t - \psi(t) + \theta(t)] \\
y_2(t) &= \frac{G}{2}\cos[\omega_{RF} t - \psi(t) - \theta(t)].
\end{aligned}
\tag{3.2}
\]

A passive combiner is then used to add the two outphasing signals together, producing the signal y. This combining can be represented by a vector addition

[Vector diagram on I/Q axes showing y with amplitude A and angle ψ]

Figure 3.3: Polar and Cartesian form relationship

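[Block diagram: the RF tone cos[ωRFt] feeds two phase modulators with phase select signals [−ψ(t) + θ(t)] and [−ψ(t) − θ(t)]; each output is amplified by G/2 to give y1 = G/2 cos[ωRFt − ψ(t) + θ(t)] and y2 = G/2 cos[ωRFt − ψ(t) − θ(t)], which combine to y = G cos[θ(t)] cos[ωRFt − ψ(t)]]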

Figure 3.4: Two-way outphasing transmitter block diagram

[Vector diagram showing y1 and y2 offset by +θ and −θ about the angle ψ, summing to y]

Figure 3.5: Two-way outphasing vectors with angle names

operation, as shown in Fig. 3.5. Using the trigonometric identity cos(u ± v) = cos(u) cos(v) ∓ sin(u)sin(v) where u = ωRF t − ψ(t) and v = θ(t), the output of the combiner becomes

y(t) = G cos[θ(t)] cos[ωRF t − ψ(t)] (3.3)

By comparing equations (3.1) and (3.3), it is clear that the outphasing output is the polar form of a modulated signal with

A(t) = cos[θ(t)] ⇒ θ(t) = arccos[A(t)] (3.4)

Figure 3.6 shows a plot of this amplitude vs outphasing angle relationship. Although this nonlinear relationship can be easily corrected with predistortion, this nonlinearity will cause undesirable effects in quantized outphasing implementations, which will be discussed in Section 3.4.
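
The relationships in (3.1)-(3.4) can be verified with a short numerical sketch. The code below assumes a normalized gain G = 1 and an input amplitude no greater than 1; the function names are hypothetical and the clamp on the amplitude is an added safeguard against rounding, not part of the derivation.

```python
import math

def two_way_angles(i, q):
    """Return (psi, theta) for a normalized I/Q sample with amplitude <= 1."""
    amplitude = min(1.0, math.hypot(i, q))
    psi = math.atan2(q, i)        # polar phase, so that I = A cos(psi), Q = A sin(psi)
    theta = math.acos(amplitude)  # outphasing angle from (3.4)
    return psi, theta

def combine_two_way(carrier_phase, psi, theta, gain=1.0):
    """Sum the two outphasing vectors of (3.2); carrier_phase stands for w_RF * t."""
    y1 = gain / 2 * math.cos(carrier_phase - psi + theta)
    y2 = gain / 2 * math.cos(carrier_phase - psi - theta)
    return y1 + y2

# Example: I = 0.3, Q = 0.4 gives A = 0.5 and theta = pi/3.
psi, theta = two_way_angles(0.3, 0.4)
combined = combine_two_way(1.0, psi, theta)
cartesian = 0.3 * math.cos(1.0) + 0.4 * math.sin(1.0)
```

The combined output matches the Cartesian form I cos(ωRF t) + Q sin(ωRF t) at every carrier phase, even though each branch has a constant envelope.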

[Plot of Amplitude against the outphasing angle θ]

Figure 3.6: Outphasing amplitude as a function of outphasing angle θ

3.2.2 Four-way Outphasing Overview

Four-way outphasing transmitters are conceptually very similar to two-way outphasing transmitters, except the number of channels is extended to four, as shown in Fig. 3.7. This requires four phase select signals, [−ψ(t) + θ(t) + ϕ(t)], [−ψ(t) + θ(t) − ϕ(t)], [−ψ(t) − θ(t) + ϕ(t)], and [−ψ(t) − θ(t) − ϕ(t)], producing the four outphasing vectors y1, y2, y3, and y4 shown in Fig. 3.8. Both θ(t) and ϕ(t) are considered outphasing angles for four-way outphasing transmitters. The outputs of the phase modulators are then amplified by a factor G/4, producing the signals

\[
\begin{aligned}
y_1(t) &= \frac{G}{4}\cos[\omega_{RF} t - \psi(t) + \theta(t) + \phi(t)] \\
y_2(t) &= \frac{G}{4}\cos[\omega_{RF} t - \psi(t) + \theta(t) - \phi(t)] \\
y_3(t) &= \frac{G}{4}\cos[\omega_{RF} t - \psi(t) - \theta(t) + \phi(t)] \\
y_4(t) &= \frac{G}{4}\cos[\omega_{RF} t - \psi(t) - \theta(t) - \phi(t)].
\end{aligned}
\tag{3.5}
\]

[Block diagram: the RF tone cos[ωRFt] feeds four phase modulators with phase select signals [−ψ(t) + θ(t) + φ(t)], [−ψ(t) + θ(t) − φ(t)], [−ψ(t) − θ(t) + φ(t)], and [−ψ(t) − θ(t) − φ(t)]; each output is amplified by G/4 to give y1 through y4, which combine to y = G cos[θ(t)] cos[φ(t)] cos[ωRFt − ψ(t)]]

Figure 3.7: Four-way outphasing transmitter block diagram

[Vector diagram showing y1, y2, y3, and y4 offset by ±θ and ±φ about the angle ψ, summing to yout]

Figure 3.8: Four-way outphasing vectors with angle names

A four-way passive combiner is then used to add the four outphasing signals together. Using the trigonometric identity cos(u ± v) = cos(u) cos(v) ∓ sin(u)sin(v) twice, the output of the combiner becomes

y(t) = G cos[θ(t)] cos[ϕ(t)] cos[ωRF t − ψ(t)] (3.6)

By comparing equations (3.1) and (3.6), the outphasing output is the polar form of the modulated signal with

A(t) = cos[θ(t)] cos[ϕ(t)] (3.7)

Fig. 3.9 shows a plot of this amplitude vs outphasing angle relationship for four-way outphasing. Unlike two-way outphasing, where the amplitude is controlled only by θ(t), both θ(t) and ϕ(t) control the amplitude. This results in A(t) having no unique solution, so θ(t) and ϕ(t) can be selected to optimize system performance metrics such as ACLR and EVM, as presented in Section 3.4.
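
Both the combining result in (3.6) and the non-uniqueness of the angle pair can be checked numerically. The sketch below assumes a normalized gain G = 1; the function name and the chosen angles are illustrative.

```python
import math

def combine_four_way(carrier_phase, psi, theta, phi, gain=1.0):
    """Sum the four outphasing vectors of (3.5); carrier_phase stands for w_RF * t."""
    base = carrier_phase - psi
    offsets = (theta + phi, theta - phi, -theta + phi, -theta - phi)
    return sum(gain / 4 * math.cos(base + off) for off in offsets)

# Two different angle pairs that both produce an amplitude of 0.5.
wt, psi = 1.2, 0.3
out_a = combine_four_way(wt, psi, math.pi / 3, 0.0)
out_b = combine_four_way(wt, psi, 0.0, math.pi / 3)
expected = 0.5 * math.cos(wt - psi)
```

Both angle pairs give the same output sample, which is the freedom the quantization algorithm of Section 3.4 exploits when choosing θ and ϕ.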

[Plot of Amplitude against the outphasing angle θ for increasing φ]

Figure 3.9: Outphasing amplitude as a function of outphasing angle θ and ϕ

[Block diagram: the input A(t) cos[θ(t)] cos[ωRFt − ψ(t)] is mixed with cos[ωRFt] to form the I Channel, A(t) cos[ψ(t)], and with sin[ωRFt] to form the Q Channel, A(t) sin[ψ(t)]]

Figure 3.10: Receiver block diagram

3.2.3 Receiver Systems

One more important thing to verify is that outphasing transmitters are able to operate with traditional Homodyne/Heterodyne receiver architectures without any special modifications to the receiver hardware/software. Fig. 3.10 shows a simplified receiver block diagram with the polar form of the transmitted RF signal as its input. The received polar signal is mixed with cos(ωRF t) and sin(ωRF t) to create the I and Q channels:

\[
\begin{aligned}
\text{I Channel: } & A(t)\cos(\omega_{RF} t - \psi(t)) \cdot \cos(\omega_{RF} t) = \frac{A(t)}{2}\left[\cos(-\psi(t)) + \cos(2\omega_{RF} t - \psi(t))\right] \\
\text{Q Channel: } & A(t)\cos(\omega_{RF} t - \psi(t)) \cdot \sin(\omega_{RF} t) = \frac{A(t)}{2}\left[-\sin(-\psi(t)) + \sin(2\omega_{RF} t - \psi(t))\right]
\end{aligned}
\tag{3.8}
\]

The channels then undergo lowpass filtering to extract the baseband I/Q data. Using the trig identities cos (−x) = cos (x) and sin (−x) = − sin (x), the recovered I/Q signals are, to within the constant factor of 1/2:

\[
\begin{aligned}
I(t) &= A(t)\cos[\psi(t)] \\
Q(t) &= A(t)\sin[\psi(t)].
\end{aligned}
\tag{3.9}
\]

These are the expected I/Q waveforms, demonstrating that no special modifications are needed for a Homodyne/Heterodyne receiver to operate with an outphasing transmitter instead of a traditional transmitter architecture.
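
A quick numerical check of the downconversion is sketched below. It assumes A and ψ are constant over one carrier period, models the lowpass filter as an average over exactly one period, and applies a gain of 2 to undo the factor of 1/2 from mixing. The function name and the sample count are illustrative.

```python
import math

def downconvert(amplitude, psi, n_points=1000):
    """Mix A cos(w t - psi) with cos and sin, then average over one carrier period."""
    i_sum = q_sum = 0.0
    for k in range(n_points):
        wt = 2 * math.pi * k / n_points
        rx = amplitude * math.cos(wt - psi)
        i_sum += rx * math.cos(wt)
        q_sum += rx * math.sin(wt)
    # The factor of 2 compensates the 1/2 produced by the mixing products.
    return 2 * i_sum / n_points, 2 * q_sum / n_points

i_rec, q_rec = downconvert(0.8, 0.7)
```

The averaged outputs equal A cos(ψ) and A sin(ψ), since the double-frequency terms integrate to zero over a full period.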

3.3 Recent Advances

Now that the history and theory of outphasing operation have been presented, it is possible to analyze recent advances in the outphasing field. Advancements in silicon CMOS process technologies are making outphasing an attractive alternative to traditional Homodyne/Heterodyne architectures. As process nodes continue to shrink, two general trends can be observed: increased device switching speed and decreased supply voltage. Homodyne/Heterodyne transmitters use DACs to synthesize the baseband waveform, which produce an amplitude-modulated signal. As the supply voltage decreases with process scaling, less voltage headroom is available for the amplitude-modulated signal, which either decreases the output power of the DAC or requires a separate supply voltage to be used for the DAC core.

Instead of generating the modulation in the amplitude domain, generating the modulation in the phase domain is an attractive alternative for advanced CMOS nodes. With the improvement in device switching speed as process nodes shrink, it is possible to achieve sub-picosecond timing resolution with digitally intensive phase modulator architectures. This timing resolution allows for high phase-resolution modulation at lower RF operating frequencies (1-5 GHz) and makes phase modulation at higher RF frequencies (5+ GHz) possible. Section 3.3.1 gives an overview of some of these digitally-intensive phase modulators.

The last recent advancement in outphasing of note is the development of non-isolating four-way outphasing combiners. Previously, combining more than two channels required the use of an isolating power combiner, as in LINC outphasing architectures, which limits the power efficiency achievable with the architecture. Section 3.3.2 provides detail on the four-way outphasing combiners presented in literature.

3.3.1 Digital Phase Modulators

New digitally-intensive phase modulators can generate signals with sub-picosecond delay resolution while decreasing the power consumption compared to analog phase modulators. In addition, the digital nature of these modulators allows for modulation bandwidths in the hundreds of megahertz, while previous analog phase modulators based on phase-locked loops (PLLs) can only achieve modulation bandwidths in the tens of megahertz.

These modulators typically use a segmented approach to meet their phase resolution requirements. Segmentation is a technique, borrowed from the DAC community, that uses different circuit architectures to implement the most significant bits (MSBs) and least significant bits (LSBs) of the modulation. This approach is well suited for phase modulators because of the large time range that needs to be covered. The MSBs represent time delays in the hundreds of picoseconds, while the LSBs represent time delays in the single or sub-picosecond range.

The authors of [5] present a great example of this segmented approach. For their target 8-bit phase resolution with a 2.4 GHz RF carrier, the 4-bit coarse phase resolution comes from a delay-locked loop (DLL) and the 4-bit fine resolution comes from switching capacitive loads in and out on a series of inverters.
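
To put numbers on the time scales involved, the sketch below converts phase resolution into delay steps, assuming the phase codes divide one RF period uniformly. The helper name is hypothetical, and the values are simple arithmetic rather than figures reported in [5].

```python
def delay_step_ps(f_rf_hz, n_bits):
    """Delay of one phase code, in picoseconds, for n_bits spanning one RF period."""
    period_ps = 1e12 / f_rf_hz
    return period_ps / (1 << n_bits)

rf_period_ps = 1e12 / 2.4e9              # about 416.7 ps at 2.4 GHz
coarse_step_ps = delay_step_ps(2.4e9, 4) # 4-bit coarse step, about 26.0 ps
fine_step_ps = delay_step_ps(2.4e9, 8)   # full 8-bit step, about 1.63 ps
```

The ratio between the full RF period and the finest step is 256, which illustrates why a single delay structure struggles to cover the whole range.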

Glitch-Free Multiplexers

The DLL-based coarse delay architectures lead to a non-ideality known as a glitch. Since the RF frequency and phase select data rate are not related and the DLL outputs span a full 360◦, the phase select data can change at any time during

[Timing diagram: a mux selects between DLLout[0] and DLLout[1]; for the phase select transition DLLout[0] → DLLout[1], Muxout shows pulse-width distortion at TT A, a glitch at TT B, and a safe transition at TT C]

Figure 3.11: Outcomes of phase select data changing at three different transition times (TTs)

an RF period. This results in three possible scenarios when a simple multiplexer (mux) is used, as shown in Fig. 3.11. Initially, the phase select data is selecting tap DLLout[0], but then the phase select data changes at three possible transition times (TTs) (denoted TT A, TT B, and TT C) to select tap DLLout[1]. If the transition happens at TT A, the mux output undergoes pulse-width distortion. A transition at TT B causes a narrow “glitch” pulse to occur at the output. Transitions at TT C are considered safe transitions because there is no duty-cycle distortion or glitch. Both pulse-width distortion and glitches are non-idealities that should be avoided because they can significantly decrease the spectral purity of the output waveform. Recently, mux architectures that circumvent both pulse-width distortion and glitch non-idealities have been demonstrated in [5, 6]. While these techniques address both non-idealities, they are commonly referred to as glitch-free muxes.
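
The three outcomes can be reproduced with a very simple behavioral model. The sketch below is an assumption-laden illustration: it treats the DLL taps as ideal 50% duty-cycle square waves and the mux as switching instantaneously. It labels a switch made while both taps are high as pulse-width distortion, a switch made while the taps differ as a glitch, and a switch made while both are low as safe. This mapping onto TT A, TT B, and TT C is an interpretation, and the function names are hypothetical.

```python
def tap_level(t, tap_delay, period):
    """Level of an ideal 50% duty-cycle tap that rises at tap_delay (mod period)."""
    return ((t - tap_delay) % period) < period / 2

def classify_transition(t_switch, old_delay, new_delay, period):
    """Classify what an ideal mux output does when the selection changes at t_switch."""
    old_high = tap_level(t_switch, old_delay, period)
    new_high = tap_level(t_switch, new_delay, period)
    if old_high and new_high:
        return "pulse-width distortion"  # the output pulse is stretched or shortened
    if old_high != new_high:
        return "glitch"                  # the output toggles at the switching instant
    return "safe"                        # both taps low, so no extra edge is created

# Example: taps spaced by 1/16 of a normalized RF period.
period, step = 1.0, 1.0 / 16
outcomes = [classify_transition(t, 0.0, step, period) for t in (0.03, 0.25, 0.75)]
```

In this model a switch at three quarters of a period after the rising edge of the old tap is classified as safe, since both taps are low at that instant.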

Although [5, 6] have implemented glitch-free multiplexers to circumvent mux glitches, no published works have quantified the effect of these glitches. To quantify them, the ACLR2 for a 20 Mbps SOQPSK signal has been simulated with and without glitches, with the result shown in Fig. 3.12. Because glitches only occur when the phase select data changes, the likelihood of a glitch occurring is a function of the ratio of the RF frequency (fRF) to the oversampled baseband data rate (fOBB). This simulation shows that the ACLR2 for high-resolution modulators is limited to approximately -48 dBc for low fRF/fOBB ratios, regardless of the implemented phase resolution. Recently published phase modulators have an fRF/fOBB of around 6 with 8-bit phase resolution, which would result in a 3 dB ACLR2 increase if they were used to generate an SOQPSK waveform.

[Plot of ACLR2 against fRF/fOBB for 7-bit, 8-bit, 9-bit, and 10-bit quantization, comparing glitch-free operation with operation with glitches]

Figure 3.12: 20 Mbps SOQPSK ACLR2 increase due to mux glitches as a function of the RF frequency (fRF) to oversampled baseband data rate (fOBB) ratio for 7-10-bit quantization

Fig. 3.13 and Fig. 3.14 overview the glitch-free mux architectures previously demonstrated in literature, both of which limit the maximum possible phase transition to 90◦ to ensure glitch-free transitions. In [5] (Fig. 3.13), a DLL that produces 37.5% duty-cycle pulses is used with a phase select mux implemented with an AND-OR architecture, allowing multiple DLL outputs to be simultaneously selected, with the output of the multiplexer (denoted Muxint) being the logical OR of all the selected DLL outputs. The rising and falling edges of Muxint are used to clock in the new phase select bits D[15:0]. These bits are encoded so that all taps between the current and next selected taps are enabled when a transition occurs. Fig. 3.13 shows a transition from DLL[0] to DLL[2], resulting in DLL[0], DLL[1], and DLL[2] being simultaneously selected and ORed by the mux. This scheme produces a glitch-free

[Block and timing diagram: B[10:7] is encoded into D[15:0], which gates DLLout[15:0] in the mux; a dual-edge-triggered flip-flop clocked by Muxint updates the selection, and an SR latch with delay τd produces Muxout. During a transition from DLLout[0] to DLLout[2], DLLout[0:2] are selected together; the 37.5% duty-cycle taps give a duty-cycle-distorted Muxint, and the SR latch restores the 50% duty cycle at Muxout]

Figure 3.13: Glitch-free multiplexer architecture presented in [5]

[Block and timing diagram: Muxout is delayed by τd1 and clocks B[10:7] into a flip-flop, whose output is delayed by τd2 to form Bdel[10:7], the mux control for DLLout[15:0]. For a transition from DLLout[0] to DLLout[2], the delayed rising edge of Muxout clocks in B[10:7], and the mux control data changes τd1 + τd2 after the Muxout rising edge]

Figure 3.14: Glitch-free multiplexer architecture presented in [6]

signal at Muxint at the expense of duty-cycle distortion. An SR-latch with a fixed half RF period delay (denoted τd) is then used to restore the mux output, Muxout, to a 50% duty-cycle.

Fig. 3.14 shows the glitch-free mux presented in [6], where the authors limit the maximum phase transition to 90◦ to ensure glitch-free operation. When the maximum transition is limited to 90◦, a glitch-free transition can be made 3/4 of an RF period after the rising edge of the selected DLL output because all signals within the transition range are low at that time. First, the mux output, Muxout, is delayed by a 1/4 RF period delay τd1 and clocks in the phase control bits B[10:7]. This clocked data is then delayed by a half RF period delay τd2 to produce the delayed mux control data, denoted Bdel[10:7], that causes the mux to transition to the new tap. These delays cause the mux control signals to transition τd1 + τd2 = 3/4 period after the rising edge of Muxout, ensuring a glitch-free transition.

While both of these glitch-free mux implementations work at a single RF frequency, they are unsuitable for a transmitter that covers the wide range of delay-line lengths required to support multiple frequency bands, since they depend on the generation of a fixed delay τdx. For instance, a 1/2 RF period delay is required in [5] to restore the 50% duty-cycle, resulting in duty-cycle distortion as the RF operating frequency is varied. Meanwhile, a 3/4 RF period delay is required in [6] to ensure the mux control data transitions at the desired time, leading to glitches and duty-cycle distortion if the frequency changes. While tunable delays can be utilized to allow these architectures to operate across a wider frequency range, this would require excess power, area, and per-part calibration to ensure glitch-free operation.

3.3.2 Four-way Non-isolating Outphasing Combiners

For traditional two-way Chireix outphasing combiners, the two PAs each experience loading from the other PA. As the phase difference between the PAs is swept, both the real and imaginary portions of the PA load impedance change. The Chireix outphasing combiner contains reactive components to cancel out the imaginary portion of the PA loading, but this cancellation is perfect at only two phase differences. This results in the reactive variation shown in Fig. 3.15, which limits the dynamic range of the combiner to around 6-10 dB [18]. This makes these combiners unsuitable for complex waveforms with high peak-to-average power ratios like 64-QAM LTE.

Figure 3.15: PA loading for Chireix and four-way non-isolating combiners as phase difference is swept [7]

Four-way non-isolating outphasing combiners have recently been developed that overcome this limitation. With the additional paths, more reactive components can be used and the phase relation between the PAs can be selected to minimize the reactive variance, causing a mostly-real PA loading, shown in Fig. 3.15, as the phase is swept. Microstrip, , and lumped-element implementations have been demonstrated in literature [7, 19, 20], with an output power range of 10-12 dB. The improved dynamic range and power efficiency of these four-way combiners make outphasing a more attractive option for efficiently transmitting high PAPR waveforms.

3.4 Outphasing Performance

Now that the basics of outphasing operation and recent advances have been presented, the performance of the proposed four-way outphasing architecture can be analyzed. First, the outphasing combiner should be selected. Although a non-isolating combiner would increase the transmitter power efficiency, the large target RF frequency range of the transmitter requires an isolating combiner to be implemented.

With the combiner selected, it is possible to simulate the transmitter performance to set system requirements. First, Section 3.4.1 simulates the performance of a 20 MHz 64-QAM LTE signal to set the phase quantization requirement, with the performance of both two-way and four-way outphasing simulated to show the improvement of the four-way architecture. Next, Section 3.4.2 presents the expected transmitter power efficiency. Last, Section 3.4.3 presents the degradation of transmitter performance due to timing and PA mismatch.

[Two plots of quantized amplitude (0 to 1) against Outphasing Angle (radians), from 0 to π/2: a) two-way and b) four-way, with curves for increasing φ]

Figure 3.16: 6-bit phase quantized amplitude for a) two-way and b) four-way outphasing architectures

3.4.1 Outphasing Quantization

The first two-way and four-way outphasing characteristic to analyze is the transmitter performance in the presence of phase quantization. Fig. 3.16 shows the amplitude vs outphasing angle relationships from Fig. 3.6 and Fig. 3.9 with 6-bit phase quantization. Because of the A = cos(θ) relationship for two-way outphasing, there is non-linear amplitude quantization in Fig. 3.16a) even though there is linear phase quantization. This non-linearity causes problems at low amplitude levels because of the coarseness of the amplitude step. In total, an n-bit phase-quantized two-way architecture will have 2^(n−2) + 1 amplitude levels available.

Alternately, the amplitude of a four-way outphasing architecture is controlled by two variables, with A = cos(θ) cos(ϕ). This results in 2^(2(n−2)) + 1 amplitude levels being available, resulting in the dramatic increase in quantization levels shown in Fig.

[Plot of amplitude (0 to 1) against Outphasing Angle (radians), from 0 to π/2, for Target Amp = 0.6. Step 1: select cos(θ) > Amp level. Step 2: select nearest φ]

Figure 3.17: Quantization algorithm for four-way outphasing

3.16b). To better quantify the increase in amplitude levels, consider an 8-bit quantized transmitter. For a two-way architecture, this results in only 65 amplitude levels; a four-way architecture with the same quantization will have 4097 levels available.
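
The level counts follow directly from the two expressions above. The short sketch below evaluates them; the function names are hypothetical. The enumeration at the end only confirms the two-way count, assuming the outphasing angle takes the phase codes between 0 and π/2 inclusive.

```python
import math

def two_way_levels(n_bits):
    """Amplitude levels for n-bit two-way outphasing: 2^(n-2) + 1."""
    return 2 ** (n_bits - 2) + 1

def four_way_levels(n_bits):
    """Amplitude levels for n-bit four-way outphasing: 2^(2(n-2)) + 1."""
    return 2 ** (2 * (n_bits - 2)) + 1

# Enumerate cos(theta) over the phase codes spanning 0 to pi/2 for 8 bits.
step = 2 * math.pi / 2 ** 8
enumerated = len({round(math.cos(k * step), 12) for k in range(2 ** 6 + 1)})

levels_8bit = (two_way_levels(8), four_way_levels(8))
```

For 8 bits the functions return 65 and 4097, matching the counts quoted above.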

The first problem in quantizing a waveform with all these additional amplitude levels is selecting the values of θ and ϕ, especially since the solution for A = cos(θ) cos(ϕ) is non-unique. It is desirable to decrease how often θ and ϕ change, in order to decrease the bandwidth of the phase signal and the likelihood of glitches, so the quantization algorithm was developed based on these criteria. First, θ is selected so that cos(θ) is the first amplitude level greater than the target amplitude level. Second, the value of ϕ is selected to minimize the amplitude error, using a non-linear mid-tread quantizer algorithm. Fig. 3.17 demonstrates this algorithm quantizing the amplitude of 0.6. First, θ = 9π/32 is selected because it is the first θ

47 -10 Four-way Two-way -20 LTE Limit

-30

-40

-50

-60

-70 4 6 8 10 12 Quantization Bits

Figure 3.18: Simulated 20 MHz LTE ACLR for quantized two-way and four-way outphasing architectures

value that cos(θ) > 0.6. Next, ϕ = 3π/32 is selected because it is the closest value for cos(θ) cos(ϕ) = 0.6.
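The two-step selection described above can be sketched in a few lines. This is only one reading of the algorithm: it assumes a uniform phase grid with a step of 2π/2^n restricted to 0 to π/2, and it replaces the non-linear mid-tread quantizer with a brute-force nearest-level search, which should pick the same ϕ but is not the original implementation. The function name and return format are hypothetical.

```python
import math


def quantize_four_way(target: float, n_bits: int = 6):
    """Return (theta_index, phi_index, amplitude) for a target amplitude in [0, 1]."""
    step = 2 * math.pi / 2 ** n_bits      # phase step (pi/32 for 6 bits)
    n_steps = 2 ** (n_bits - 2)           # grid indices 0..n_steps span 0..pi/2
    grid = range(n_steps + 1)
    # Step 1: largest theta whose cos(theta) still exceeds the target, i.e. the
    # first amplitude level above the target (falls back to theta = 0).
    theta_idx = max((k for k in grid if math.cos(k * step) > target), default=0)
    cos_theta = math.cos(theta_idx * step)
    # Step 2: phi that minimizes |cos(theta) cos(phi) - target| (assumed search).
    phi_idx = min(grid, key=lambda k: abs(cos_theta * math.cos(k * step) - target))
    return theta_idx, phi_idx, cos_theta * math.cos(phi_idx * step)


if __name__ == "__main__":
    print(quantize_four_way(0.6, n_bits=6))
```

With a target of 0.6 and 6-bit quantization, the indices correspond to θ = 9π/32 and ϕ = 3π/32, matching the example in Fig. 3.17.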

With the quantization algorithm developed, the ACLR for a 20 MHz LTE waveform was simulated for both two-way and four-way outphasing architectures. As shown in Fig. 3.18, the two-way architecture requires 9-bit quantization to meet the -44.2 dBc ACLR limit, but the four-way architecture requires two fewer phase quantization bits to meet the limit. This two-bit improvement increases the minimum phase quantization step by a factor of four, significantly improving the ease of phase modulator implementation.

3.4.2 Outphasing Power Efficiency

The efficiency of a LINC outphasing transmitter architecture is limited by the use of the isolating Wilkinson power combiner, trading off power efficiency for increased RF frequency range and dynamic range. For a four-way outphasing architecture, the efficiency, denoted η, is calculated as:

\[
\eta = \overbrace{\frac{P_{out1} + P_{out2} + P_{out3} + P_{out4}}{P_{DC1} + P_{DC2} + P_{DC3} + P_{DC4}}}^{\text{Channel Efficiency}} \cdot \overbrace{\frac{P_{out}\,\eta_{combiner}}{P_{out1} + P_{out2} + P_{out3} + P_{out4}}}^{\text{Combiner Efficiency}} \tag{3.10}
\]

This efficiency has two primary components. The first component is the total efficiency of the four outphasing channels, consisting of the total output power divided by the total DC power consumption of the four channels. The second component is the combiner efficiency, which is the ratio of the ideal output power to total input power times the efficiency of the combiner, ηcombiner. (3.10) can be simplified to:

\[
\eta = \frac{P_{out}\,\eta_{combiner}}{P_{DC1} + P_{DC2} + P_{DC3} + P_{DC4}} = \frac{P_{out}\,\eta_{combiner}}{P_{DC,total}} \tag{3.11}
\]

Both ηcombiner and PDC, total are constant as the output power changes, resulting in the efficiency being linearly related to the transmitter output power. Fig. 3.19 shows the efficiency curve for a four-way outphasing transmitter with ηcombiner = 80% and Pout/PDC, total = 50%, representing a power efficiency of 50% for each of the channels. Note that the x-axis is in dB because this is typical for transmitter output power plots, even though using a non-linear x-axis masks the linear relationship between efficiency and output power.

Figure 3.19: Outphasing power efficiency and 20 MHz 64-QAM LTE probability density function (PDF)

To calculate a transmitter’s power efficiency when transmitting a specific waveform, the probability density function (PDF) for the waveform needs to be known. The PDF represents how likely the waveform is to be at each output power level. The black curve in Fig. 3.19 is the simulated PDF for a 20 MHz 64-QAM LTE signal with 3.5 dB clipping, having a PAPR of 10.2 dB. By multiplying the PDF and efficiency curves and integrating over the entire output power range, an estimated transmitter power efficiency of 4.9% is calculated.
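The PDF-weighted efficiency calculation can be sketched as a numerical integration. The sketch assumes that the efficiency is linear in output power with a 40% peak (80% combiner efficiency times 50% channel efficiency), that the PDF is given per dB on a power axis relative to peak, and that trapezoidal integration is adequate. The triangular PDF in the usage example is hypothetical and is not the simulated LTE PDF, so the printed value is not expected to equal 4.9%.

```python
def efficiency(power_db: float, eta_peak: float = 0.40) -> float:
    """Efficiency at a power level given in dB relative to peak (0 dB = peak)."""
    return eta_peak * 10 ** (power_db / 10)


def average_efficiency(power_db, pdf, eta_peak: float = 0.40) -> float:
    """Trapezoidal integral of pdf * efficiency, normalized by the integral of pdf."""
    weighted = 0.0
    area = 0.0
    for i in range(len(power_db) - 1):
        width = power_db[i + 1] - power_db[i]
        lo = pdf[i] * efficiency(power_db[i], eta_peak)
        hi = pdf[i + 1] * efficiency(power_db[i + 1], eta_peak)
        weighted += 0.5 * (lo + hi) * width
        area += 0.5 * (pdf[i] + pdf[i + 1]) * width
    return weighted / area


if __name__ == "__main__":
    # Hypothetical triangular PDF centered 10 dB below peak (illustration only).
    levels = [-40 + 0.5 * k for k in range(81)]
    density = [max(0.0, 1 - abs(p + 10) / 10) for p in levels]
    print(average_efficiency(levels, density))
```

Substituting the simulated waveform PDF for the placeholder density would give the waveform-specific estimate.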

Although efficiency equation (3.11) was derived for a four-way outphasing architecture, it also holds true for two-way LINC outphasing architectures. This is because PDC and ηcombiner are constant as Pout is swept for both two-way and four-way outphasing. Assuming the same ηcombiner, the two-way and four-way outphasing architectures have the same 4.9% estimated power efficiency for 20 MHz 64-QAM LTE signals.

Figure 3.20: a) Ideal two-way outphasing vectors and b) two-way outphasing vectors with 20% PA mismatch

3.4.3 PA and Timing Mismatch

Outphasing transmitters rely on the combination of multiple channels to produce the modulated output. Because multiple channels are used, it is important to analyze the effect of mismatch between the channels. There are two key mismatches that need to be quantified: amplitude mismatch in the PAs and timing mismatch between the outphasing channels.

PA mismatch

PA mismatch is caused by variations in geometry and material properties that occur during semiconductor processing. For outphasing architectures, this causes the amplitudes of the outphasing vectors to be unequal. As shown in Fig. 3.20, this mismatch results in both amplitude and phase error in the combined vector. This mismatch will be modeled as a normally distributed variable αx with a mean of 1 and a standard deviation of σ. This creates the two-way outphasing vectors

\[
\begin{aligned}
y_1(t) &= \alpha_1 \frac{G}{2} \cos[\omega_{RF} t - \psi(t) + \theta(t)] \\
y_2(t) &= \alpha_2 \frac{G}{2} \cos[\omega_{RF} t - \psi(t) - \theta(t)]
\end{aligned} \tag{3.12}
\]

and the four-way outphasing vectors

\[
\begin{aligned}
y_1(t) &= \alpha_1 \frac{G}{4} \cos[\omega_{RF} t - \psi(t) + \theta(t) + \varphi(t)] \\
y_2(t) &= \alpha_2 \frac{G}{4} \cos[\omega_{RF} t - \psi(t) + \theta(t) - \varphi(t)] \\
y_3(t) &= \alpha_3 \frac{G}{4} \cos[\omega_{RF} t - \psi(t) - \theta(t) + \varphi(t)] \\
y_4(t) &= \alpha_4 \frac{G}{4} \cos[\omega_{RF} t - \psi(t) - \theta(t) - \varphi(t)].
\end{aligned} \tag{3.13}
\]

Figure 3.21: Simulated 20 MHz 64-QAM LTE ACLR in the presence of PA mismatch (3σ mismatch in %) for four-way and two-way outphasing transmitters. Each data point is the average of 400 independent simulations.

A 20 MHz LTE signal was simulated with two- and four-way outphasing architectures with PA amplitude mismatch. The four-way architecture is 8-bit phase quantized and the two-way architecture is 10-bit quantized so that the architectures have the same ACLR in the absence of mismatch. As shown in Fig. 3.21, the four-way architecture can withstand a 3σ amplitude mismatch of 6.7%, which is a 15.5% improvement over the two-way architecture’s 5.8% 3σ limit. This improved resilience to PA mismatch can be attributed to the decreased contribution of each channel in a four-way outphasing architecture. As shown by (3.12), each channel for a two-way architecture is scaled by 1/2, while (3.13) shows each channel for a four-way architecture is scaled by 1/4, decreasing its overall contribution.
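The amplitude and phase error caused by PA mismatch can be illustrated with the phasor form of (3.12). The sketch below drops the common carrier and ψ(t) terms and represents the two channels as complex phasors; the function name, the default gain of 1, and the example values of α are assumptions made for illustration only.

```python
import cmath
import math


def two_way_sum(theta: float, alpha1: float = 1.0, alpha2: float = 1.0,
                gain: float = 1.0):
    """Return (amplitude, phase) of y1 + y2 for two-way outphasing phasors."""
    y1 = alpha1 * gain / 2 * cmath.exp(1j * theta)    # channel rotated by +theta
    y2 = alpha2 * gain / 2 * cmath.exp(-1j * theta)   # channel rotated by -theta
    y = y1 + y2
    return abs(y), cmath.phase(y)


if __name__ == "__main__":
    theta = math.pi / 4
    print(two_way_sum(theta))                 # matched PAs: amplitude cos(theta), zero phase
    print(two_way_sum(theta, alpha2=0.8))     # 20% mismatch on one channel
```

With matched channels the sum has amplitude cos(θ) and no phase error. With unequal α values the imaginary parts no longer cancel, which gives the combined amplitude and phase error sketched in Fig. 3.20b).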

It should be noted that this PA mismatch due to process variation can be corrected. Both the gate and drain voltage bias can be adjusted to ensure each PA has the same output power. The one downside of this approach is that the transmitter performance is limited by the performance of the worst PA, because the output power of the other PAs will need to be decreased to match the output power of the worst PA.

Timing mismatch

Timing mismatch between the outphasing channels is another nonideality that needs to be analyzed to determine outphasing performance. Ideally, all of the channels should be aligned in time, but mismatch in signal routing, PA phase distortion, and the phase response of the outphasing combiner can cause a static time mismatch. A time mismatch variable ρx will be used to represent this mismatch. For two-way outphasing, this creates

\[
\begin{aligned}
y_1(t) &= \frac{G}{2} \cos[\omega_{RF} t - \psi(t) + \theta(t) + \rho_1] \\
y_2(t) &= \frac{G}{2} \cos[\omega_{RF} t - \psi(t) - \theta(t) + \rho_2]
\end{aligned} \tag{3.14}
\]

and for four-way outphasing

\[
\begin{aligned}
y_1(t) &= \frac{G}{4} \cos[\omega_{RF} t - \psi(t) + \theta(t) + \varphi(t) + \rho_1] \\
y_2(t) &= \frac{G}{4} \cos[\omega_{RF} t - \psi(t) + \theta(t) - \varphi(t) + \rho_2] \\
y_3(t) &= \frac{G}{4} \cos[\omega_{RF} t - \psi(t) - \theta(t) + \varphi(t) + \rho_3] \\
y_4(t) &= \frac{G}{4} \cos[\omega_{RF} t - \psi(t) - \theta(t) - \varphi(t) + \rho_4].
\end{aligned} \tag{3.15}
\]

A 20 MHz 64-QAM LTE signal was simulated for both two- and four-way outphasing architectures with time mismatch introduced on a single outphasing channel, with the results shown in Fig. 3.22. The mismatch was swept from 0 to 0.8% of an RF period, with the two-way architecture requiring less than a 0.2% mismatch and the four-way architecture requiring less than a 0.4% mismatch. The dotted lines are for 8-/10-bit phase resolution, and the solid curves are for phase resolution increased by 2 bits to 10-/12-bit. Although the higher phase resolution has a 12 dB better ACLR than the lower phase resolution, its ACLR degrades more rapidly as the mismatch increases, resulting in the same mismatch resilience as the lower-resolution phase modulation.

Since time mismatch in percent of an RF period is not a very intuitive metric, Table 3.2 shows the conversion to mismatch in picoseconds at various frequencies. To meet the 0.4% limit at greater than 4 GHz, sub-picosecond delay mismatch is required. Fortunately, the fine phase resolution of the phase modulators can compensate for the delay mismatch. A phase modulator with the required 8-bit phase resolution has a resolution equal to 0.39% of an RF period, and a 10-bit phase resolution has less than 0.1% RF period resolution. If the time mismatch is measured, the phase modulator can add/subtract delay from a channel to achieve alignment with the other channels. Chapter 4 demonstrates how this can be accomplished during testing.
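The conversion between percent of an RF period and absolute time, as well as the phase-modulator step size quoted above, can be checked with a short sketch. The function names are illustrative; the only assumptions are that the RF period is 1000/f ps for f in GHz and that an n-bit modulator divides one period into 2^n equal steps.

```python
def mismatch_ps(freq_ghz: float, percent: float) -> float:
    """Time mismatch in picoseconds for a given percent of one RF period."""
    period_ps = 1000.0 / freq_ghz
    return period_ps * percent / 100.0


def phase_step_percent(n_bits: int) -> float:
    """Phase-modulator step size as a percent of one RF period."""
    return 100.0 / 2 ** n_bits


if __name__ == "__main__":
    for f in (1, 2, 4, 10):
        print(f, mismatch_ps(f, 0.2), mismatch_ps(f, 0.4))
    print(phase_step_percent(8), phase_step_percent(10))
```

At 4 GHz the 0.4% limit corresponds to 1 ps, and the 8-bit and 10-bit step sizes evaluate to roughly 0.39% and just under 0.1% of an RF period.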

Figure 3.22: Simulated 20 MHz 64-QAM LTE ACLR in the presence of time mismatch (in % of an RF period) between outphasing channels, for four-way and two-way architectures against the LTE limit. Solid lines are for 2-bit additional phase resolution over dotted lines

Table 3.2: Time mismatch percent to time mismatch in picoseconds conversion at various frequencies

Frequency (GHz)   Period (ps)   0.1% (ps)   0.2% (ps)   0.4% (ps)   0.6% (ps)
1                 1000          1           2           4           6
2                 500           0.5         1           2           3
3                 333           0.333       0.666       1.332       1.998
4                 250           0.250       0.500       1           1.5
5                 200           0.200       0.400       0.800       1.200
7.5               133           0.133       0.266       0.532       0.798
10                100           0.100       0.200       0.400       0.600

Table 3.3: Outphasing transmitter 20 MHz 64-QAM LTE simulation results

Architecture                   Two-way Outphasing   Four-way Outphasing
Minimum Phase Resolution       9-bit                7-bit
Power Efficiency               4.9%                 4.9%
Maximum Amplitude Mismatch     5.8%                 6.7%
Maximum Time Mismatch          0.2%                 0.4%

3.4.4 Summary

This chapter has given an overview of the history of outphasing, including recent advances such as digitally-intensive phase modulators and multiway outphasing combiners. The theory behind outphasing operation was presented for two-way outphasing architectures, and the four-way outphasing architecture was developed. Simulations were conducted for two-way and four-way outphasing architectures transmitting a 20 MHz 64-QAM LTE waveform, with a summary of the simulation results presented in Table 3.3. The results of these simulations set the minimum system specifications needed to implement the transmitter, and show the performance benefits of the four-way outphasing architecture.

Chapter 4: Phase 1 Implementation

The proposed outphasing architecture was fabricated in two phases, with the first phase serving as a proof-of-concept to verify operation of key circuit blocks and the second phase implementing the full four-way architecture. Both phases will be implemented in the DAHI process technology, featuring the heterogeneous integration of 45nm SOI CMOS and 0.2µm GaN. Table 4.1 gives an overview of the target specifications for both implementation phases.

Table 4.1: Transmitter implementation overview

                                 Phase 1              Phase 2
Transmitter channels             2                    4
RF frequency range               3.5-5.5 GHz          3.5-5.5 GHz
Single-channel output power      35 dBm               35 dBm
CMOS frequency range             2.2-5.2 GHz          2.2-10.4 GHz
Minimum phase resolution         6-bit                8-bit
Maximum phase resolution         7-bit                10-bit
Oversampled baseband datarate    400 MHz              400 MHz
Data transmission                Low-speed parallel   High-speed serial

As a proof-of-concept, Phase 1 will only implement two phase-modulated channels to decrease the complexity and required die area. This limits the transmitter to a two-way outphasing configuration, which limits its performance. Phase 1 will also implement a lower-resolution 6-7-bit phase modulator. While this phase modulator will not meet the minimum 9-bit phase resolution required for 20 MHz 64-QAM LTE waveforms with a two-way outphasing architecture, it will meet the 5-bit minimum resolution requirement for SOQPSK waveforms. In the two-way outphasing configuration, it will be tested with a simple 16-QAM waveform to determine modulation performance.

Specifics on the operation of each circuit block are deferred to the discussion of the Phase 2 implementation (Chapter 5) to avoid redundancy in this document. Instead, Section 4.1 will provide a high-level overview of each circuit block. Section 4.2 will discuss the packaging and testing of the Phase 1 design, and Section 4.3 will compare the measured results to other state-of-the-art works and discuss key takeaways from the Phase 1 IC.

4.1 Architecture

Fig. 4.1 shows the block diagram for the Phase 1 IC. This IC features two independent phase-modulated channels. The IC is fed an external RF reference tone at twice the desired RF output frequency, with an on-chip current-mode logic (CML) divide-by-2 used to provide a low phase noise RF reference tone at the RF output frequency. Serial peripheral interface (SPI) and modulated phase select data for the phase modulators are externally supplied to a synthesized digital block, which provides configuration and modulation data to the phase modulators. To decrease complexity, the phase select data is transmitted in two 8-bit parallel buses (400 MHz).

Figure 4.1: Phase 1 IC block diagram

The phase modulators are reconfigurable to operate across 2.2-5.2 GHz. 7-bit phase resolution is achieved at 2.2-4 GHz, while only 6-bit phase resolution is achieved at 4-5.2 GHz. The phase modulator output is amplified by the CMOS to GaN Drivers to drive the GaN PA. The GaN PAs consist of three stages: a CML pre-driver, a push-pull driver, and a Class-E PA.

4.2 Packaging and Measurement

4.2.1 Packaging

The Phase 1 integrated circuit is fabricated in the DAHI process technology, with the die photo shown in Fig. 4.2. This integrated circuit is mounted on a custom 6”x4” printed circuit board (PCB) test fixture to enable its testing, with the PCB shown in Fig. 4.3. The DAHI process necessitates PCB packaging because there are pads on both the lower CMOS and upper GaN dies, and a significant number of wirebonds (approximately 140) are required for all of the inputs/outputs (I/O).

Figure 4.2: Phase 1 IC die photo (4 mm x 4 mm)

Figure 4.3: Phase 1 6”x4” printed circuit board with IC

Figure 4.4: Wirebonded Phase 1 IC showing wirebond capacitors

This PCB packaging has drawbacks compared to probing. The first major drawback is the design and manufacturing time required for the custom PCB. The design requires several months and the fabrication, assembly, and wirebonding typically take a month. This design time also delays the development of the field-programmable gate array (FPGA) code needed to transmit the SPI configuration and phase select data.

The IC is wirebonded to the PCB, which introduces parasitic inductance and resistance to the I/O. For some I/O, such as the low-frequency phase select data and SPI, these wirebond parasitics are relatively inconsequential, but for the gate and drain biases of the PA these parasitics can degrade the transmitter performance. To counteract this, capacitors are placed near the IC for the critical wirebonds to be bonded to/from, as shown in Fig. 4.4. This decreases the wirebond length, and the capacitance removes high-frequency noise.

Table 4.2: Phase 1 IC packaging and testing overview

PCB Number   Fully Functioning   Comments
A-1          No                  Encapsulation/testing damaged IC/packaging
A-2          No                  Encapsulation/testing damaged IC/packaging
A-3          No                  Encapsulation/testing damaged IC/packaging
A-4          No                  Encapsulation/testing damaged IC/packaging
B-1          Yes
B-2          No                  Short on CMOS to GaN driver bias
B-3          Yes
C-1          No                  Oscillation on GaN output
C-2          No                  Short on GaN bias
C-3          Not tested

4.2.2 Measured Results

A total of ten integrated Phase 1 ICs were packaged in three groups, with an overview given in Table 4.2. The three groups are designated by letter code A/B/C, and the specific PCB is represented by the number. Overall, the yield on the packaged components was poor, with only two of the five tested components working. Other ICs fabricated in this particular DAHI run also achieved poor yield, but they were probed, removing the possibility of errors during packaging. The poor yield achieved in Phase 1 can be attributed to both poor IC yield and packaging problems.

All four boards in the A packaging group were fully encapsulated due to a miscommunication with the wirebonding company. As a result, when the PA was powered on the wirebond capacitors were damaged. The capacitors are rated to 50 V and current compliance limits were set on the power supplies, so it was surprising that the capacitors were damaged. The specific cause of the damage was never determined.

Neither the B nor the C packaging group was encapsulated. Of the five tested PCBs in these groups, only two worked. Two of the boards had shorts on bias lines; the wirebonds were visually inspected, but no obvious shorts were apparent. Board C-1 was functioning, but the GaN PA output was significantly lower than expected and showed significant oscillations. Boards B-1 and B-3 were both working, with B-3 achieving the best performance. The measured results reported throughout the remainder of this section are for board B-3.

Single-Channel Performance

Fig. 4.5 shows the measured output power and efficiency for the Phase 1 IC for a non-modulated waveform. It achieved a peak output power of 32.9 dBm and a peak transmitter efficiency of 41.3% at 4 GHz. It should be noted that the quoted efficiency is the entire transmitter efficiency, including all the power consumption from the CMOS and GaN chips. Table 4.3 shows the measured DC power consumption and percent of total power consumed for each of the three primary components. The PA dominates the transmitter power consumption, consuming more than 92% of the DC power across the entire output frequency range. Note that board and cabling losses have been calibrated out of the measured results, but the effect of the wirebonds has not been calibrated out.
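The percentage breakdown in Table 4.3 and the definition of total transmitter efficiency can be reproduced with a short sketch. The DC power values are taken from Table 4.3; the helper names are illustrative, and the efficiency function simply divides RF output power by total DC power, which is assumed to match the transmitter efficiency definition used here.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000


def transmitter_efficiency(p_out_dbm: float, p_dc_watts: float) -> float:
    """Total transmitter efficiency in percent: RF output power over DC power."""
    return 100 * dbm_to_watts(p_out_dbm) / p_dc_watts


# Measured DC power per block, in watts (Table 4.3)
dc_power_w = {
    "CMOS Phase Modulator": 0.0625,
    "CMOS to GaN Driver": 0.2409,
    "Three-Stage GaN PA": 4.45,
}
total_dc_w = sum(dc_power_w.values())
shares = {name: 100 * p / total_dc_w for name, p in dc_power_w.items()}

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {share:.1f} %")
```

The computed shares round to the 1.3%, 5.1%, and 93.6% listed in the table.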

Figure 4.5: Measured Phase 1 output power and efficiency

Table 4.3: Measured transmitter power breakdown

Component               DC Power    % of Total
CMOS Phase Modulator    62.5 mW     1.3 %
CMOS to GaN Driver      240.9 mW    5.1 %
Three-Stage GaN PA      4.45 W      93.6 %

Fig. 4.6 shows the measured output power and Fig. 4.7 shows the measured efficiency for three different levels of transmitter integration. The differences between the three measurements are summarized in Table 4.4. The solid blue line is the fully integrated transmitter measurement from Fig. 4.5, using a 25 V stage-three drain bias. The dotted red line shows the measured results from one of the integrated PAs, but instead of being packaged on a PCB, the GaN PA was probed and driven from an off-chip signal generation source. This measurement used a 28 V stage-three drain bias. The dashed black line shows the measured results from a non-integrated GaN PA with a 28 V drain bias, which was also probed.

Table 4.4: Test condition overview for different PA measurements

                Packaging   Input source         Drain bias   Efficiency
Integrated Tx   PCB         CMOS chip            25 V         Tx
Integrated PA   Probed      External amplifier   28 V         PA only
GaN PA          Probed      Signal generator     28 V         PA only

The output power of the integrated PA was similar to that of the integrated transmitter, despite using a higher drain voltage. The GaN PA achieved a 35 dBm output power, which is 2 dB higher than the integrated transmitter and integrated PA. The two probed measurements also show less variation with operating frequency than the wirebonded transmitter, which can partially be attributed to the wirebond parasitics.

The GaN PA was able to achieve higher output power because an RF signal generator was used to drive the PA. This enables the input power to be significantly increased to ensure proper switching operation throughout the three-stage PA, resulting in the high output power.

Figure 4.6: Measured Phase 1 output power (vs. frequency in GHz for the integrated Tx, integrated PA, and GaN PA)

The peak efficiency of the integrated transmitter includes the CMOS power consumption, while the other two only include the GaN PA power consumption. As shown in Table 4.3, the efficiency of these two will be approximately 10% lower if the rest of the transmitter (phase modulator and CMOS to GaN driver) is included. Despite the handicap of the extra power consumption from the phase modulator and CMOS to GaN driver, the integrated transmitter achieved the highest peak efficiency. This can be attributed to only using a 25 V PA drain bias, decreasing the peak output power but improving the power efficiency.

It should be noted that the measured performance of the GaN PA showed significant die-to-die and channel-to-channel variance. While the variance was partially corrected by adjusting gate and drain biases, it was not possible to fully match the performance of different PAs. The results presented are for the PAs with the best performance.

Figure 4.7: Measured Phase 1 peak power efficiency (vs. frequency in GHz for the integrated Tx, integrated PA, and GaN PA)

SOQPSK

Next, the single-channel performance with a 10 MHz SOQPSK waveform was measured. Originally, the spectrum did not meet the spectral mask requirements for the SOQPSK waveform despite the modulator having 6-bit phase resolution, with only 5-bit resolution ideally needed to meet the spectral mask. When the measured spectrum is compared to the simulated spectrum in Fig. 2.5, the shape of the spectral noise floor is different. The raised shoulders of the spectrum’s main lobe are characteristic of non-linear phase steps rather than phase quantization. The coarse phase steps were measured with an oscilloscope and the phase-select signal was predistorted to correct for the phase nonlinearity. As Fig. 4.8 shows, the calibrated waveform met the spectral mask limit for this waveform. These measurements were conducted before the ACLR specification in Chapter 2 was developed, so the ACLR was not measured.

Figure 4.8: Measured 10 Mbps SOQPSK spectrum (uncalibrated and calibrated, vs. offset in MHz)

Outphasing Performance

After the performance of a single channel was measured, two channels were combined to enable outphasing testing. Table 4.5 shows a summary of the outphasing setup. The RF frequency was set to 4 GHz to operate the transmitter at its peak power efficiency from the single-channel testing. The next step was to adjust the biasing of the two channels so they have equal output power. This was done by setting the gate biases equal and adjusting the drain biases until each channel had a measured output power of 31.6 dBm. To achieve this, one PA had a 20.1 V drain bias while the other had a 25 V drain bias. This drastic drain voltage difference can be attributed to the large threshold voltage variation in this GaN process. Because the final stage of the PA dominates the transmitter power efficiency, the transmitter efficiency varies with the drain biasing, with one channel achieving 38.4% power efficiency and the other only achieving 33.1% power efficiency.

Table 4.5: Measured single-channel and combined PA performance

                         PA1      PA2      Combined
Output Power (dBm)       31.6     31.6     33.2
Transmitter Efficiency   38.4 %   33.1 %   25.4 %
Drain Voltage (V)        20.1     25

After the output power was adjusted, the two channels were combined by a Wilkinson power combiner. The phase of one of the channels was then adjusted until the peak possible output power was achieved. Instead of the ideal 3 dB increase in output power, combining the two channels only produced a 1.6 dB power increase. The missing 1.4 dB can be attributed to the insertion loss of the combiner and to only having 6-bit phase resolution, which makes it difficult to guarantee the two channels are completely in-phase. Because the combining did not result in a 3 dB power increase, the peak combined transmitter efficiency was only 25.4%. The phase difference between the two channels was swept to find the maximum and minimum output power, resulting in a measured dynamic range of 23.8 dB.
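The combining shortfall quoted above can be checked with simple dB arithmetic. The sketch assumes that an ideal lossless combiner with perfectly in-phase, equal-power inputs delivers the sum of the two input powers; the variable and function names are illustrative.

```python
import math


def dbm_to_mw(p_dbm: float) -> float:
    """Convert dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def mw_to_dbm(p_mw: float) -> float:
    """Convert milliwatts to dBm."""
    return 10 * math.log10(p_mw)


channel_dbm = 31.6    # measured per-channel output power
combined_dbm = 33.2   # measured combined output power

ideal_dbm = mw_to_dbm(2 * dbm_to_mw(channel_dbm))   # ideal in-phase combining
shortfall_db = ideal_dbm - combined_dbm

if __name__ == "__main__":
    print(f"ideal: {ideal_dbm:.1f} dBm, shortfall: {shortfall_db:.1f} dB")
```

The ideal combined power evaluates to about 34.6 dBm, so the measured 33.2 dBm is about 1.4 dB short of ideal, in line with the figures in the text.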

The outphasing transmitter was then fed modulation data for a 16-QAM and pulsed radar waveform, with the results shown in Table 4.6. The following sections will further analyze the 16-QAM and radar waveforms and transmitter performance.

Table 4.6: Measured modulated outphasing performance

                             Combined   16-QAM   Radar
Average Output Power (dBm)   33.2       28.49    23.2
PAPR (dB)                    0          5.4      10
Transmitter Efficiency       25.4 %     8.6 %    2.5 %

16-QAM

Fig. 4.9 shows the measured near-in spectrum and Fig. 4.10 shows the far-out spectrum of a 5 Mbps 16-QAM waveform with a 5.4 dB PAPR.

Due to the relatively coarse phase resolution, the noise floor for the near-in spectrum shown in Fig. 4.9 is quite high, with a signal-to-noise ratio of 24 dB measured by the vector signal analyzer. In addition, the individual phase modulators were calibrated as discussed in Section 4.2.2, so the raised shoulders on the sides of the main lobe can be partially attributed to phase nonlinearity. Because a specific standard-defined modulation was not used, there is no spectral mask specification to compare to the measured spectrum. For the far-out spectrum in Fig. 4.10, the sinc response due to the zero-order-hold is apparent, as well as signal images at integer multiples of the 80 MHz phase data sample clock.

Fig. 4.11 shows the measured 16-QAM constellation diagram. Despite the coarse phase resolution, the constellation diagram is quite clean. A 4.3% rms EVM was measured with a peak EVM of 10.7%.
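EVM is quoted here in percent, while the comparison in Table 4.7 lists it in dB. A minimal conversion sketch is shown below, assuming the usual convention that EVM in dB is 20·log10 of the rms amplitude ratio; the function name is illustrative.

```python
import math


def evm_percent_to_db(evm_percent: float) -> float:
    """Convert an EVM given in percent to dB (20*log10 of the amplitude ratio)."""
    return 20 * math.log10(evm_percent / 100)


if __name__ == "__main__":
    print(f"{evm_percent_to_db(4.3):.1f} dB")    # rms EVM
    print(f"{evm_percent_to_db(10.7):.1f} dB")   # peak EVM
```

Under this convention the 4.3% rms EVM corresponds to about -27.3 dB.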

Figure 4.9: Measured 5 Mbps 16-QAM spectrum

Figure 4.10: Measured 5 Mbps 16-QAM far-out spectrum

Figure 4.11: Measured 5 Mbps 16-QAM constellation diagram

Pulsed Radar

The final waveform that was tested was a 10% duty-cycle pulsed radar waveform with a 1.56 kHz pulse repetition frequency (PRF) (640 µs pulse repetition period). This means that for every pulse period, the transmitter transmits at maximum power for 10% of the time, and for the remaining 90% of the time no signal is transmitted. This is achieved by adjusting the phase between the two signals, with the signals in-phase for full power and 180° apart for no output power. This results in a PAPR of 10 dB and a transmitter efficiency of 2.5%.

Fig. 4.12 shows the measured spectrum for this waveform, showing the expected sinc response due to the zero-order-hold at ten times the PRF, producing nulls at 15.6 kHz. This sinc response continues out in frequency, as shown in the far-out spectrum in Fig. 4.13.

The only nonideality of note in this measured spectrum is the 5 dB spur that appears at 0 kHz offset. Simulations showed that this spur is caused by the 23.8 dB dynamic range of the transmitter. This dynamic range indicates that instead of the two PAs completely canceling each other out when 180° out-of-phase, they instead output an amplitude that is 6.5% of the peak amplitude, resulting in the spur. Elimination of this spur would require increasing the transmitter dynamic range through better PA output power matching and increased phase resolution.
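The pulsed radar figures quoted in this section are related by simple arithmetic, which the sketch below reproduces. It assumes an ideal rectangular pulse, a DC power that does not change between the on and off states (consistent with (3.11)), and an exact PRF of 1562.5 Hz behind the quoted 1.56 kHz and 640 µs. The function name and dictionary keys are illustrative.

```python
import math


def pulsed_radar_metrics(prf_hz: float, duty: float,
                         peak_eff_percent: float, dynamic_range_db: float):
    """Return PAPR, first spectral null, average efficiency, and residual amplitude."""
    pulse_width_s = duty / prf_hz
    return {
        "papr_db": 10 * math.log10(1 / duty),            # peak-to-average power ratio
        "first_null_hz": 1 / pulse_width_s,              # sinc null spacing of the pulse
        "avg_eff_percent": peak_eff_percent * duty,      # constant DC power assumed
        "residual_amplitude": 10 ** (-dynamic_range_db / 20),
    }


if __name__ == "__main__":
    for key, value in pulsed_radar_metrics(1562.5, 0.10, 25.4, 23.8).items():
        print(key, round(value, 4))
```

With the measured 25.4% peak efficiency and 23.8 dB dynamic range, this gives a 10 dB PAPR, nulls spaced by about 15.6 kHz, roughly 2.5% average efficiency, and a residual amplitude of about 6.5% of the peak.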

Figure 4.12: Measured pulsed radar near-in spectrum

Figure 4.13: Measured pulsed radar far-out spectrum

Table 4.7: Performance comparison of Phase 1 IC to recent works

                    This Work       JSSC 2012 [21]   JSSC 2012 [5]   JSSC 2016 [6]
Process             45nm CMOS/      45nm CMOS        32nm CMOS       130nm CMOS
                    0.2 µm GaN
Architecture        Outphasing      Outphasing       Outphasing      PWM
Integration         Full            PA Only          Full            Full
Frequency (GHz)     2.2-5.4         1.5-3.2          2.4             1.8-2.5
Glitch-Free Mux     No              No               Yes             Yes
Modulation          16-QAM          64-QAM OFDM      64-QAM OFDM     64-QAM OFDM
PAPR (dB)           5.4             6.7              6.7             7.3
EVM (dB)            -27.3           -25              -25             -25.5
Peak Power (dBm)    33.2            31.5             26              25.6
Peak Tx Eff (%)     25.4            27*              NA              34
Avg Tx Eff (%)      8.6             16*              18.6            13

*PA efficiency only

4.3 Comparison and Key Takeaways

Table 4.7 shows a comparison of the Phase 1 implementation to other recently published outphasing and RF pulse-width modulation (PWM) works. Because of the GaN PA, the Phase 1 IC achieves at least 1.7 dB higher peak output power than the other works. With a low-loss Wilkinson combiner, the output power will increase, as will the peak and average transmitter efficiency (Tx Eff).

There are several key takeaways from this Phase 1 IC. The first is that the measured results, although lower than expected, show that implementing a fully-integrated architecture in the DAHI technology can achieve higher output power than CMOS-only transmitters. Second, the key circuit blocks (the phase modulator, CMOS to GaN driver, and GaN PA) have all been proven. Lastly, the combination of IC yield and packaging resulted in few working parts. This packaging proved to be especially difficult because of the density of I/O and wirebonds for the Phase 1 IC.

Chapter 5: Phase 2 Implementation

The Phase 2 implementation improves upon the circuit blocks implemented in Phase 1 and implements four transmitter channels. Table 5.1 presents an overview of the improvements of the Phase 2 implementation, with the block diagram shown in Fig. 5.1. Through reconfiguration, the CMOS portion of the chip will be able to support operation from 2.2-10.4 GHz, with 8-10-bit phase resolution offered across this range. Section 5.1 provides the details of the phase modulator implementation.

To reduce the I/O overhead for the phase select data, the data will be transmitted over high-speed low-voltage differential signaling (LVDS) channels instead of the low-speed parallel data used in Phase 1. The CMOS to GaN driver and GaN PA are similar to the Phase 1 implementation, with details given in Section 5.2. Packaging and measurement of the Phase 2 implementation will be discussed in Section 5.3, and conclusions and takeaways will be discussed in Section 5.4.

5.1 Phase Modulator Implementation

Table 5.1: Transmitter implementation overview

                                 Phase 1              Phase 2
Transmitter channels             2                    4
RF frequency range               3.5-5.5 GHz          3.5-5.5 GHz
Single-channel output power      35 dBm               35 dBm
CMOS frequency range             2.2-5.2 GHz          2.2-10.4 GHz
Minimum phase resolution         6-bit                8-bit
Maximum phase resolution         7-bit                10-bit
Oversampled baseband datarate    400 MHz              400 MHz
Data transmission                Low-speed parallel   High-speed serial

Figure 5.1: Phase 2 IC block diagram

Based on the 20 MHz 64-QAM LTE simulations presented in Chapter 3, an 8- to 10-bit phase resolution was chosen for the transmitter. Fig. 5.2 shows a detailed block diagram for the digitally-intensive reconfigurable CMOS phase modulator used to achieve this resolution. It operates across 2.2-10.4 GHz with three frequency ranges: low, middle, and high (Fig. 5.3). The low range offers 10-bit resolution across 2.2-4.8 GHz, the middle range offers 9-bit resolution across 3.2-8 GHz, and the high range offers 8-bit resolution across 6.2-10.4 GHz. These ranges are designed to overlap to compensate for changes in delay due to process corners and mismatch.

Figure 5.2: Phase modulator implementation showing four primary circuit blocks (reconfigurable DLL with config[3:0] and outputs DLL[15:0]; glitch-free mux controlled by B[10:7]; fine delay controlled by B[6:2]; very fine delay controlled by B[1:0])

This phase resolution is achieved using a segmented approach with three stages: a reconfigurable delay-locked loop (DLL) and glitch-free multiplexer (mux) for 4-bit coarse resolution, a 4-bit fine delay block (with an additional bit to increase the fine delay span) implemented with capacitively loaded inverters, and a 2-bit very-fine delay block implemented with current-starved inverters. At higher frequencies the reconfigurable DLL provides less than 4-bit phase resolution, resulting in the decreased resolution in the middle and high frequency ranges.
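The segmentation can be illustrated with a short sketch that splits the 11-bit phase select word into the three fields used by the stages (B[10:7] for the mux, B[6:2] for the fine delay, and B[1:0] for the very-fine delay, following the bit assignments in Fig. 5.2). The function name and the dictionary keys are hypothetical and are used only for illustration.

```python
def split_phase_word(b: int) -> dict:
    """Split an 11-bit phase select word B[10:0] into its three segments.

    Field boundaries follow the bit assignments in Fig. 5.2; the function
    and key names are illustrative only.
    """
    if not 0 <= b < 2 ** 11:
        raise ValueError("phase select word must fit in 11 bits")
    return {
        "coarse": (b >> 7) & 0xF,    # B[10:7]: selects one of up to 16 DLL taps
        "fine": (b >> 2) & 0x1F,     # B[6:2]: 4-bit fine delay plus the span-extension bit
        "very_fine": b & 0x3,        # B[1:0]: 2-bit very-fine delay
    }


if __name__ == "__main__":
    print(split_phase_word(0b1010_11001_10))
```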

The 11 phase select bits B[10:0] are used to control the modulation of the RF input signal. Note that 11 phase select bits are needed even though a maximum of only 10-bit resolution is achieved. The 11th bit is used to increase the span of the fine delay to compensate for gaps in phase coverage caused by frequency tuning and process corners, as explained in Section 5.1.3. These bits are received over an LVDS data channel and change dynamically at the oversampled baseband data rate, which is up to 400 MHz. SPI registers provide static configuration data to the DLL to reconfigure it based on the desired RF operating frequency.

Figure 5.3: Transmitter low, middle, and high frequency ranges and resolution

5.1.1 Reconfigurable DLL

Fig. 5.4 shows the block diagram of the reconfigurable DLL used for 4-bit coarse delay. The DLL’s primary function is to match the time delay of the selectable-length inverter chain to one RF clock period. The RF input signal is externally supplied at twice the desired RF carrier frequency, and a current-mode logic (CML) divide-by-two circuit then provides the DLL with a low phase-noise input signal at the desired RF frequency. The output of the CML divide-by-two is the DLL input signal, which is fed to a delay-line consisting of 32 current-starved inverters.

Each current-starved inverter consists of an inverter with analog-biased NMOS and PMOS devices that control the amount of switching current available, increasing or decreasing the delay of the inverter. Each current-starved inverter drives the subsequent current-starved inverter and a separate branch inverter that is used for DLL locking or as one of the DLLout signals.

Figure 5.4: Reconfigurable DLL for 4-bit coarse phase resolution

To enable DLL locking, a true single-phase clock (TSPC) D flip-flop based phase detector [22] encodes the phase difference between the reference and delay-line output in the widths of the “up” and “down” pulses. The fully-differential charge pumps and loop filters adjust the current-starved inverter biases, VctrlNMOS and VctrlPMOS, changing the total delay of the delay-line to align the rising edge of the reference signal with the delay-line output signal. Simulations show each current-starved inverter has a delay range of 10.5-19.6ps, requiring the number of current-starved inverters in the delay line to be adjusted to ensure there is only one RF period of delay in the delay-line. The config[3:0] signal statically configures the frequency-select mux (MuxFSel) to adjust the length of the delay line by selecting the output of the 8th, 12th, 16th, 20th, 24th, 28th, or 32nd current-starved inverter to serve as the delay-line output the DLL is locked to. This delay-line length selection results in the frequency ranges shown in Fig. 5.5. The frequency ranges were designed to have a minimum of 25% overlap to ensure continuous frequency coverage in the presence of device mismatch and process corners, which can increase or decrease the delay of each current-starved inverter.

Figure 5.5: Frequency range of the reconfigurable DLL
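As a rough illustration of why the delay-line length must be selectable, the sketch below estimates the frequencies at which N inverters can span exactly one RF period, given the simulated 10.5-19.6ps per-inverter delay range. This is a first-order estimate only: it assumes the full delay range is usable and ignores mux and branch-inverter loading, so the values it produces are not the designed ranges of Fig. 5.5. The function names are hypothetical.

```python
INVERTER_DELAY_MIN_PS = 10.5   # simulated minimum delay per current-starved inverter
INVERTER_DELAY_MAX_PS = 19.6   # simulated maximum delay per current-starved inverter
TAP_OPTIONS = (8, 12, 16, 20, 24, 28, 32)  # selectable delay-line lengths


def lock_range_ghz(num_inverters: int) -> tuple:
    """First-order (f_min, f_max) in GHz where N inverters span one RF period."""
    f_min = 1e3 / (num_inverters * INVERTER_DELAY_MAX_PS)  # 1/ps equals 1e3 GHz
    f_max = 1e3 / (num_inverters * INVERTER_DELAY_MIN_PS)
    return f_min, f_max


def taps_covering(f_ghz: float) -> list:
    """Delay-line lengths whose first-order lock range contains f_ghz."""
    return [n for n in TAP_OPTIONS
            if lock_range_ghz(n)[0] <= f_ghz <= lock_range_ghz(n)[1]]


if __name__ == "__main__":
    for n in TAP_OPTIONS:
        lo, hi = lock_range_ghz(n)
        print(f"{n:2d} inverters: {lo:5.2f} to {hi:5.2f} GHz")
    print("candidate lengths at 5 GHz:", taps_covering(5.0))
```

Under these assumptions a 16-inverter delay line is one of the candidate lengths at 5 GHz, which agrees with the 5 GHz example discussed in this section.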

The reconfiguration of the DLL for multiple frequency bands via the frequency-select mux results in extra delay in the delay-line output. This creates a delay mismatch between the reference and delay-line output signals, which results in significant phase error in the DLLout signals. To minimize this delay mismatch, a replica mux, MuxRep, is placed in the path of the reference signal to match the delay of MuxFSel. Without MuxRep, there is a 45ps delay mismatch between the reference and delay-line output. To demonstrate the necessity of this delay matching, consider the DLL operating with a 200ps period (5 GHz) input signal. At this frequency, the DLL is locked to the output of the 16th current-starved inverter, allowing 3-bit coarse resolution generated on DLLout[7:0]. The 45ps delay mismatch causes the delay-line to produce 155ps instead of the desired 200ps delay, resulting in only 279° of phase coverage and producing a 5° + n × 10.1° phase error on DLLout[7:0], where n is the bit index of the DLLout signal. With MuxRep and matched routing lengths for the signals, there is less than a 0.5ps delay mismatch between the reference and delay-line output paths, resulting in a maximum phase error of 0.5° and degrading the ACLR by only 0.1dBc.
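The 279° coverage and the roughly 10.1° per-tap error growth in the 5 GHz example follow from simple arithmetic, sketched below. The sketch assumes the delay-line span shrinks by exactly the mismatch and that the taps remain evenly spaced; it reproduces only the per-tap slope, not the fixed 5° offset quoted in the text. The function name is hypothetical.

```python
def mismatch_phase_error(period_ps: float, mismatch_ps: float, num_taps: int) -> tuple:
    """Return (phase coverage in degrees, error growth per tap in degrees).

    Assumes the delay line spans (period - mismatch) instead of one full
    period and that the taps stay evenly spaced across that span.
    """
    actual_span_ps = period_ps - mismatch_ps
    coverage_deg = 360.0 * actual_span_ps / period_ps
    ideal_step_deg = 360.0 / num_taps
    actual_step_deg = coverage_deg / num_taps
    return coverage_deg, ideal_step_deg - actual_step_deg


if __name__ == "__main__":
    coverage, step_error = mismatch_phase_error(period_ps=200.0, mismatch_ps=45.0, num_taps=8)
    print(f"coverage = {coverage:.1f} deg, error per tap = {step_error:.3f} deg")
```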

5.1.2 Glitch-Free Multiplexer

For a glitch-free multiplexer to work across a wide frequency range, it is imperative to use an architecture that does not rely on time delays for glitch-free operation. To develop such a mux, the conditions for a glitch-free transition should be considered. As shown by the transition at TTC in Fig. 3.11, a glitch-free transition can occur only if both the currently selected tap and the next selected tap are low (logical “0”). The implemented glitch-free mux architecture, shown in Fig. 5.6, uses a pipelined datapath to allow both the current and future phase select data (Bcurr[10:7] and Bnext[10:7], respectively) to be simultaneously available, allowing the mux to determine if the glitch-free transition condition is met before it makes the transition.

Two independent muxes are implemented: one receives the phase select data from Bcurr[10:7] and produces the Muxout signal, and the other receives the phase select data from Bnext[10:7] and produces the Outnext signal. Muxout and Outnext undergo a logical NOR operation, producing a signal on OutNOR that goes high (logical “1”) when both Muxout and Outnext are low, indicating that a glitch-free transition can be made.

Figure 5.6: Frequency-agile glitch-free multiplexer architecture and timing diagrams

While OutNOR could be directly used to clock in the new phase select data for both muxes, this signal transitions at the RF frequency (2.2-10.4 GHz) even though the mux select data B[10:7] only changes at the oversampled baseband clock (clkOBB) frequency (400 MHz). It is therefore desirable to decrease the frequency of the glitch-free clock to the oversampled baseband rate to lower the dynamic power consumption of the circuit while still maintaining glitch-free operation. To achieve this, a datachange signal is generated that goes high when Bcurr[10:7] and Bnext[10:7] change, which only occurs when B[10:7] is different from one oversampled baseband period to the next. OutNOR and this datachange signal are combined by a logical AND gate, producing the glitch-free clock signal (clkGF), which clocks in the new phase select data. ClkGF transitioning high also resets datachange so it can transition high the next time the data changes. Since B[10:7] are the most significant bits of the phase select signal, they don’t typically change every symbol. This results in datachange, and therefore clkGF, transitioning at a rate even lower than the clkOBB frequency, further decreasing the dynamic power dissipation of the glitch-free mux.
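The clkGF logic described above can be sketched as a small behavioral model. This is an idealized, sampled-time illustration with zero gate delay, not the implemented circuit: the waveform generator, the sample-based timing, and all function names are assumptions made for the example. The model raises datachange when a new tap is requested and reports the first sample at which both Muxout and Outnext are low, which is where clkGF would pulse.

```python
def tap_waveform(num_samples: int, period: int, delay: int) -> list:
    """Ideal 50% duty-cycle square wave, `period` samples per RF cycle, delayed by `delay` samples."""
    return [1 if ((i - delay) % period) < period // 2 else 0
            for i in range(num_samples)]


def glitch_free_switch_index(mux_out: list, out_next: list, request_index: int):
    """Return the first sample at which clkGF = OutNOR AND datachange goes high.

    mux_out is the currently selected tap, out_next is the next tap, and
    request_index is the sample at which the select data changes
    (datachange goes high). Gate delays are ignored in this sketch.
    """
    data_change = False
    for i, (cur, nxt) in enumerate(zip(mux_out, out_next)):
        if i >= request_index:
            data_change = True               # Bcurr/Bnext have changed
        out_nor = not (cur or nxt)           # high only when both taps are low
        if out_nor and data_change:          # clkGF pulse; datachange is reset here
            return i
    return None


if __name__ == "__main__":
    current_tap = tap_waveform(64, period=16, delay=0)
    next_tap = tap_waveform(64, period=16, delay=4)
    print(glitch_free_switch_index(current_tap, next_tap, request_index=2))
```

In this example the request arrives while the current tap is high, so the switch is deferred until both taps are low; a request that arrives while both taps are already low is served immediately.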

The ability of this architecture to make large glitch-free transitions at higher frequencies is ultimately limited by the delay in the phase select data clocking path, as shown in Fig. 5.6 (blue). This critical path delay can be calculated by

$$t_{delay} = t_{mux} + t_{NOR} + t_{AND} + t_{Dff,clk \rightarrow Q} + t_{mux,setup}. \qquad (5.1)$$

The maximum possible phase transition is then a function of this delay and the RF operating frequency, and can be represented by

$$phase_{max} = \left( \frac{1}{2} - t_{delay} \times f_{RF} \right) \times 360^{\circ}. \qquad (5.2)$$

In the 45nm CMOS technology the simulated tdelay is 41ps, resulting in a maximum phase transition of 147° at 2.2 GHz and 62° at 8 GHz. Above 8 GHz glitch-free operation is not supported because the delay-line only provides 2-bit resolution (90°), which is beyond the maximum glitch-free phase transition at these frequencies. It should be noted that the maximum phase transitions at lower frequencies are larger than the maximum transitions of the glitch-free mux architectures in [5, 6] because those architectures limit the maximum transition to 90° to ensure glitch-free operation.

Figure 5.7: Three-stage fine delay
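The quoted limits can be checked numerically by evaluating Eq. (5.2) with the simulated 41ps critical-path delay. The short sketch below does this; the function name is hypothetical, and the only inputs are the values stated in the text.

```python
T_DELAY_S = 41e-12  # simulated critical-path delay in the 45nm CMOS technology


def max_glitch_free_phase_deg(f_rf_hz: float, t_delay_s: float = T_DELAY_S) -> float:
    """Maximum glitch-free phase transition from Eq. (5.2), in degrees."""
    return (0.5 - t_delay_s * f_rf_hz) * 360.0


if __name__ == "__main__":
    for f_ghz in (2.2, 8.0, 10.4):
        phase = max_glitch_free_phase_deg(f_ghz * 1e9)
        print(f"{f_ghz:4.1f} GHz: {phase:5.1f} deg")
```

At 10.4 GHz the result is well below the 90° coarse step available from the 2-bit delay line, which is consistent with glitch-free operation not being supported above 8 GHz.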

5.1.3 Fine Delay

As shown in Fig. 5.7, the 4-bit fine delay is implemented using three stages of capacitively loaded inverters, with an additional bit implemented to increase the fine delay span. Each stage consists of an inverter with 8 unary sized capacitors that can be switched onto its output. The 5-bit dynamic phase select data B[6:2] is decoded to produce the signal D[22:0], which switches the capacitive loads in and out. The capacitors are sized so that each produces a 1.5ps delay, and the decoded control signal D[22:0] produces 1.5ps, 3ps, 6ps, 12ps, and an additional 12ps delays. This additional delay is needed to increase the delay line span by 50% to provide adequate delay for low frequency operation. For example, at 2.2 GHz the coarse delay produces outputs that are spaced by 28ps. Without the extra 12ps delay the fine delay only covers 22.5ps, leaving a 5.5ps gap in phase coverage. The extra delay increases the fine delay span to 34.5ps, ensuring that full phase coverage is achieved.
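The span arithmetic above can be made concrete with a small decoder sketch. The text states only that B[6:2] is decoded into D[22:0] and that the resulting delays are 1.5ps, 3ps, 6ps, 12ps, and an additional 12ps; the mapping of individual bits to those weights and the thermometer ordering of D[22:0] used here are assumptions for illustration, as is the function name.

```python
UNIT_DELAY_PS = 1.5                 # delay added by one unit capacitor
CAPS_PER_BIT = (1, 2, 4, 8, 8)      # assumed weights for B[2], B[3], B[4], B[5], B[6]


def decode_fine_delay(b_fine: int) -> tuple:
    """Decode the 5-bit fine word B[6:2] into (D[22:0] as an int, delay in ps).

    D is modeled as a thermometer code with one bit per unit capacitor;
    the real decoder ordering is not specified in the text.
    """
    if not 0 <= b_fine < 32:
        raise ValueError("fine delay word must fit in 5 bits")
    caps_on = sum(w for i, w in enumerate(CAPS_PER_BIT) if (b_fine >> i) & 1)
    d_word = (1 << caps_on) - 1      # caps_on ones in the low bits
    return d_word, caps_on * UNIT_DELAY_PS


if __name__ == "__main__":
    coarse_step_ps = 1e3 / 2.2 / 16              # 16 DLL taps at 2.2 GHz
    _, span_without_extra = decode_fine_delay(0b01111)
    _, span_with_extra = decode_fine_delay(0b11111)
    print(f"coarse step {coarse_step_ps:.1f} ps, "
          f"fine span {span_without_extra} ps without and {span_with_extra} ps with the extra bit")
```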

Each capacitively loaded inverter is followed by a non-loaded inverter to restore the edge rate of the signal and improve the linearity of the fine delay step. Careful consideration is given to the layout of the capacitors to improve the monotonicity of the delay step. The 45nm process technology only features vertical natural (i.e. MOM) capacitors, which have larger capacitance variation than more controlled capacitors such as MIM capacitors. To improve the capacitor matching, they are laid out in an array with dummy capacitors placed around their periphery, and they are placed on the metal layers that have the lowest variation. A Monte Carlo simulation across process and mismatch shows a 3σ delay variation increase of 130fs due to the vertical natural capacitor variation.

5.1.4 Very-Fine Delay

The 2-bit very-fine delay block is implemented using two stages of current-starved inverters as shown in Fig. 5.8. Both stages are identical except that the transistors are sized differently in each stage in order to produce the 750fs and 375fs delays needed to achieve 8-bit resolution at 10.4 GHz. Transistors M1 and M2 are always on to supply switching current to the inverter. To modulate the delay, phase select data B[1:0] turns transistors M3 and M4 on/off to increase/decrease the amount of current available for the main inverter switching. Because the delay depends on the sizing of the transistors, these stages are susceptible to delay variation due to sizing mismatch and process corners. Monte Carlo simulations across process and mismatch show a 3σ delay variation increase of 93fs and 64fs for the first and second very-fine delay stage, respectively.

Figure 5.8: Two-stage very fine delay
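The 375fs step follows from dividing one RF period into 2^N phase steps. The sketch below evaluates that relation at the 8-bit, 10.4 GHz operating point quoted above; the function name is hypothetical.

```python
def lsb_time_fs(f_rf_ghz: float, bits: int) -> float:
    """Time step in femtoseconds of one phase LSB: RF period divided by 2**bits."""
    period_fs = 1e6 / f_rf_ghz   # 1 / (1 GHz) = 1 ns = 1e6 fs
    return period_fs / 2 ** bits


if __name__ == "__main__":
    lsb = lsb_time_fs(10.4, 8)
    print(f"LSB at 10.4 GHz, 8-bit: {lsb:.1f} fs (two LSBs: {2 * lsb:.1f} fs)")
```

One and two LSBs at this operating point correspond closely to the 375fs and 750fs delays of the two very-fine stages.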

Figure 5.9: CMOS to GaN driver and three-stage GaN PA

5.2 Amplification Implementation

Unlike fully-integrated CMOS transmitters where both the modulator and PA reside in the same process technology, the proposed design has the challenge of interfacing dissimilar device technologies. The GaN class-E PA requires 5V swing for proper switching performance, but the thick-gate 45nm CMOS devices have a breakdown voltage of only 1.5V. To overcome this technology interfacing problem, the architecture shown in Fig. 5.9 was developed. The phase-modulated RF signal is fed to the CMOS to GaN drivers, which amplify the 1V square-wave CMOS signal to the 5V differential square-wave signal needed to drive the first stage of the GaN amplifier while maintaining the signal’s phase resolution fidelity. The GaN amplifier consists of three stages: a differential CML pre-driver, a push-pull inverting driver, and a class-E GaN PA.

Figure 5.10: CMOS to GaN driver

5.2.1 CMOS to GaN Driver

A CML CMOS to GaN driver is used to drive the 0-1V differential CMOS signal up to a differential signal with 2.5V single-ended swing. The thin-gate input devices, M1 and M2, are utilized for high speed switching operation, commutating the tail current through R1 and R2. The input signal is AC coupled to the input devices to allow M1/M2 to be biased independently of the input signal. The values of the current and the resistors are chosen for the desired 2.5V output signal swing. Thick-gate cascode devices, M3 and M4, are used to prevent breakdown of the input switches.

The issue of die interface parasitics also has to be overcome by this circuit. The DAHI technology offers heterogeneous interconnects (HICs) [9] between the dies, offering approximately 500fH of inductance instead of the higher parasitics associated with traditional interfacing technologies like wirebonds. Despite offering lower parasitics, they can still limit the RF frequency range of the transmitter. The ideal input signal for the GaN class-E PA is a square wave, so the buffer needs to amplify the fundamental frequency component and the odd harmonics of the CMOS input signal. While the CMOS phase modulator is designed to cover 2.2-10.4 GHz, the maximum frequency of the GaN PA is 6 GHz. This requires the driver to provide gain up to 30 GHz in order to amplify the fundamental, 3rd, and 5th harmonics. To meet this bandwidth requirement in the presence of the die interface parasitics, peaking inductors are used. Fig. 5.11 shows the simulated voltage gain of the circuit with and without the peaking inductors. They increase the 3 dB bandwidth from 19.8 GHz to 39.6 GHz, allowing the buffer to drive high-frequency square-wave signals.

Figure 5.11: CMOS to GaN buffer voltage gain with and without inductive peaking

One advantage of developing a fully-integrated transmitter is that 50Ω matching is not required to interface the driver stages. The CMOS to GaN driver instead implements a 43Ω resistance in the pull-up path. This lower resistance decreases the RC time constant of the pull-up path, allowing the output to have faster edges and provide gain at higher frequencies.
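Two first-order numbers for the driver can be worked out from the values given in this section, as sketched below: the odd harmonics of a 6 GHz square wave, compared against the simulated 3 dB bandwidths, and an idealized tail current for a 2.5V swing across the 43Ω pull-up. The tail current is not stated in the text; the estimate assumes the full tail current is steered through one pull-up resistor and that the swing equals I × R, so it should be read as an order-of-magnitude illustration only. The function names are hypothetical.

```python
def odd_harmonics_ghz(f0_ghz: float, highest_order: int = 5) -> list:
    """Fundamental and odd harmonics of a square wave up to highest_order."""
    return [k * f0_ghz for k in range(1, highest_order + 1, 2)]


def ideal_tail_current_ma(swing_v: float, load_ohm: float) -> float:
    """Idealized CML tail current, assuming swing = I * R with full current steering."""
    return 1e3 * swing_v / load_ohm


if __name__ == "__main__":
    harmonics = odd_harmonics_ghz(6.0)
    for bw_ghz, label in ((19.8, "without peaking"), (39.6, "with peaking")):
        inside = [f for f in harmonics if f <= bw_ghz]
        print(f"{label}: harmonics within the 3 dB bandwidth: {inside}")
    print(f"idealized tail current: {ideal_tail_current_ma(2.5, 43.0):.1f} mA")
```

Under this simple view, only the fundamental and 3rd harmonic of a 6 GHz signal fall inside the 19.8 GHz bandwidth, while the 39.6 GHz bandwidth with peaking also includes the 5th harmonic at 30 GHz.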

Figure 5.12: Three-stage GaN PA

5.2.2 Three-stage GaN PA

To condition the differential output of the CMOS to GaN driver to the 5V single-ended swing needed to drive the class-E PA, two GaN pre-driver stages are implemented (Fig. 5.9). First is a GaN CML pre-driver mimicking the CMOS counterpart, which increases the 2.5V output swing of the CMOS CML buffer to the 5V needed to drive the subsequent GaN stages. Note that no cascoding is necessary in the GaN CML pre-driver because of the much greater breakdown voltages of the heterojunction FETs. Minimum-sized devices are chosen for Q1 and Q2 to reduce the capacitive loading on the CMOS driver.

The next critical aspect of the design is to maintain the phase fidelity of the signal while efficiently providing the edge rate necessary to drive the large capacitance of the class-E PA. In CMOS this can be achieved using complementary devices (PMOS and NMOS), but in III-V technologies like GaN only depletion-mode n-type devices are available. To circumvent this problem, the design takes advantage of the readily available differential signal. The differential output of the CML pre-driver drives the push-pull driver, which works like an inverting buffer [23] and provides a very high edge-rate square-wave signal to the following class-E PA. This differential signal drives the pull-up and pull-down switches, QPU and QPD, to ensure that only one is on at a time, preventing shoot-through currents that would degrade the power efficiency. This push-pull topology also converts the differential signal to the 5V single-ended signal needed to drive the class-E PA.

The final stage is a continuous class-E switch-mode power amplifier [24]. L1 and C1 were selected to provide a wideband fundamental 50Ω load match, while series inductor L2 and shunt capacitor C2 were selected to provide the second-harmonic termination independent of the fundamental matching. The resulting final stage PA achieves both high output power and power efficiency across a wide output frequency range.

5.3 Packaging and Measurement

The Phase 2 integrated circuit is fabricated in the DAHI process technology, with the die photo shown in Fig. 5.13. A similar packaging approach to Phase 1 was used, with a custom PCB designed and the IC wirebonded to the PCB. Fig. 5.14 shows the wirebonded IC. The wirebond capacitors were once again used to decrease the wirebond inductance on the GaN PS biases.

One group of three ICs has been packaged, with the results shown in Table 5.2. Similar to the Phase 1 integration, low yield was achieved. Of the twelve total transmitter channels, only three were functioning. The main PA device of one of these three working channels was damaged when the drain was biased at the designed value of 28 V. The non-working channels have been visually inspected, but no errors are readily apparent.

Figure 5.13: Phase 2 integrated circuit die photo

Figure 5.14: Wirebonded Phase 1 IC showing wirebond capacitors

Because only one channel per board is working, only single-channel measurements can be conducted. Fig. 5.15 shows the measured output power and efficiency for the Phase 2 IC. Unfortunately, the density of wirebonds on the left side of the chip significantly increased the wirebond inductance, only allowing the RF input to operate up to 3 GHz. The PA is designed for 3.5-5.5 GHz operation, so both the output power and transmitter efficiency are quite low.

Table 5.2: Phase 2 IC packaging and testing overview

PCB Number   Working Channels   Comments
A-1          1                  Shorts in GaN driver gate and drain biases
A-2          2                  Shorts on GaN pre-driver and PA drain biases; PA on working channel destroyed at 28 V drain bias
A-3          0                  Short on PA gate biases and additional control circuitry

A second revision of the PCB has been designed to resolve this issue. The RF trace is moved closer to the chip, and only two of the transmitter channels will be wirebonded to decrease the wirebond density. While this will not allow for a four-way outphasing configuration, it gives a higher probability of success for single-channel measurements. Given the low yield achieved in both the Phase 1 and Phase 2 implementations, it is very unlikely that all four channels would work even if they were all wirebonded.

5.4 Conclusions and Takeaways

Due to IC yield and packaging difficulties, the final testing results are not currently available. Re-packaging and testing are currently underway, with the hope that wirebonding only two channels will increase the yield and enable testing at the PA’s target 3.5-5.5 GHz frequency range.

Figure 5.15: Measured Phase 2 output power and efficiency

Chapter 6: Conclusions and Future Work

6.1 Work Summary and Conclusions

This work describes the history, theory, development, and testing of a four-way outphasing transmitter. This novel transmitter architecture offers improved amplitude quantization compared to traditional two-way outphasing architectures, allowing for an ACLR improvement equivalent to a 2-bit phase resolution increase. In addition, this architecture is more resilient to PA amplitude and timing mismatch than two-way outphasing architectures.

This transmitter is fabricated in the DAHI process technology, featuring the heterogeneous integration of 45nm CMOS SOI and 0.2 µm GaN process technologies. This integration allows for the entire transmitter to be implemented on a single chip while simultaneously achieving the high-fidelity phase modulation of the CMOS and high output power of the GaN.

The transmitter is implemented in two phases. The Phase 1 implementation features a low resolution phase modulator and only implements two transmitter channels. This transmitter achieves a single-channel output power of 32.93 dBm and a peak transmitter efficiency of 41.32%. To the best of my knowledge, it is the first fully-integrated transmitter published consisting of heterogeneously integrated CMOS and GaN process technologies.

The Phase 2 transmitter improves on the design of the Phase 1 IC. An 8-10-bit phase modulator is implemented to operate across 2.2-10.4 GHz. A novel frequency-agile glitch-free mux architecture is also implemented in the phase modulator to enable glitch-free transitions up to 8 GHz. This Phase 2 IC is currently being re-packaged, with testing to continue over the upcoming weeks.

6.2 Future Work

In the short-term, testing of the re-packaged Phase 2 IC is a priority for proving the functionality of the IC. If higher yield is achieved, attempts may be made to once again test the full four-channel architecture.

The future of this four-way transmitter architecture depends on four fully functioning transmitter channels being implemented on a single integrated circuit. Future work on this architecture should focus on techniques to more reliably fabricate and package the IC. Although the integration of CMOS and GaN achieves higher output powers, work on a CMOS-only implementation would result in a lower-risk route for further refinement of the architecture. Not only would this approach circumvent the yield problems of heterogeneous integration, but it would also allow for the use of flip-chip packaging to avoid the issues with wirebonding.

Another critical area for future work is the four-way combiner. Current four-way combiners have major drawbacks; power-efficient non-isolating combiners have a small RF frequency range, and isolating power combiners have low power efficiency. Both of these architectures need to be further studied to allow for a transmitter that can simultaneously cover a large RF frequency range while achieving high-efficiency operation.

6.3 Final Thoughts

Altogether, this work gives a glimpse at the potential of the four-way outphasing architecture and advanced heterogeneous integration. With future work, this four-way outphasing transmitter architecture may offer an attractive alternative to achieve high-fidelity modulation across a large RF frequency range. The fine timing resolution available with advanced CMOS nodes makes phase-based modulation an attractive alternative to amplitude modulation as technology nodes continue to scale. Heterogeneous integration will increase the popularity of architectures, such as outphasing, that can leverage the benefits of multiple semiconductor technologies. As the integration techniques and yield continue to improve, heterogeneous integration could become a commercially viable option for wireless communications.

Bibliography

[1] IEEE Standard Letter Designations for Radar-Frequency Bands, IEEE Standard 521-2002, 2002.

[2] A. Gutierrez-Aitken et al., “Diverse Accessible Heterogeneous Integration (DAHI) at Northrop Grumman Aerospace Systems (NGAS),” IEEE Compound Semiconductor Integrated Circuit Symp., Oct. 2014.

[3] H. Chireix, “High Power Outphasing Modulation,” Proceedings of the Institute of Radio Engineers, vol. 23, no. 11, pp. 1370–1392, Nov. 1935.

[4] D. Cox, “Linear Amplification with Nonlinear Components,” IEEE Transactions on Communications, vol. 22, no. 12, pp. 1942–1945, Dec. 1974.

[5] A. Ravi et al., “A 2.4-GHz 20–40-MHz Channel WLAN Digital Outphasing Transmitter Utilizing a Delay-Based Wideband Phase Modulator in 32-nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 47, no. 12, pp. 3184–3196, Dec. 2012.

[6] K. Cho and R. Gharpurey, “A digitally intensive transmitter/PA using RF-PWM with carrier switching in 130nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 51, no. 5, pp. 1188–1199, May 2016.

[7] T. Barton and D. Perreault, “Four-way microstrip-based power combining for microwave outphasing power amplifiers,” IEEE Transactions on Circuits and Systems I, vol. 61, no. 10, pp. 2987–2998, Oct. 2014.

[8] E. McCune, “Envelope tracking or polar: which is it?” IEEE Microwave Magazine, vol. 13, no. 4, pp. 34–56, May 2012.

[9] D. Green et al., “A revolution on the horizon from DARPA: heterogeneous integration for revolutionary microwave/millimeter-wave circuits at DARPA: progress and future directions,” IEEE Microwave Magazine, vol. 18, no. 2, pp. 44–59, Feb. 2017.

[10] Range Commanders Council Telemetry Group, “IRIG Standard 106-09: Telemetry Standards,” Secretariat, Range Commanders Council, White Sands Missile Range, 2009.

[11] T. Hill, “Advanced modulation techniques for telemetry,” International Telemetering Conference, Oct. 2011.

[12] H. Gheidi et al., “A 1-3 GHz delta-sigma-based closed-loop fully digital phase modulator in 45-nm CMOS SOI,” IEEE Journal of Solid-State Circuits, vol. 52, no. 5, pp. 1185–1195, May 2017.

[13] W. Bennett, “Spectra of Quantized Signals,” Bell System Technical Journal, vol. 27, no. 3, pp. 446–472, July 1948.

[14] European Telecommunications Standards Institute (ETSI), “3GPP Specification 45,” 2016.

[15] L. Frenzel. (2013, Jan.) An Introduction to LTE Advanced: The Real 4G. [Online]. Available: http://www.electronicdesign.com/4g/introduction-lte-advanced-real-4g

[16] T. Lee, Planar Microwave Engineering: A Practical Guide to Theory, Measurements, and Circuits. Cambridge, U.K.: Cambridge University Press, 2004, pp. 659–662.

[17] D. Musson, “Ampliphase.. for Economical Super-Power AM Transmitters,” Broadcast News, vol. 119, pp. 24–29, Feb. 1964.

[18] T. Barton, “Not Just a Phase: Outphasing Power Amplifiers,” IEEE Microwave Magazine, vol. 17, no. 2, pp. 18–31, Feb. 2016.

[19] T. Barton, J. Dawson, and D. Perreault, “Experimental validation of a four-way outphasing combiner for microwave power amplification,” IEEE Microwave and Wireless Components Letters, vol. 23, no. 1, pp. 28–30, Jan. 2013.

[20] T. Barton, A. Jurkov, P. Pednekar, and D. Perreault, “Transmission-line-based multi-way lossless power combining and outphasing power amplifier system,” Proceedings of IEEE MTT-S Microwave Symposium, pp. 1–4, June 2014.

[21] W. T. et al., “A transformer-combined 31.5 dBm outphasing power amplifier in 45 nm LP CMOS with dynamic power control for backoff power efficiency enhancement,” IEEE Journal of Solid-State Circuits, vol. 47, no. 7, pp. 1646–1658, July 2012.

[22] W.-H. Lee, J.-D. Cho, and S.-D. Lee, “A high speed and low power phase-frequency detector and charge-pump,” Proc. Asia South Pacific Des. Autom. Conf., vol. 1, pp. 269–272, Jan. 1999.

[23] S. Rashid et al., “A wide-band complementary digital Drive for pulse modulated single-ended and differential S/C bands Class-E PAs in 130nm GaAs technology,” IEEE Compound Semiconductor Integrated Circuits Symp., Oct. 2016.

[24] M. Ozen et al., “Continuous class-E power amplifier modes,” IEEE TCAS II: Express Briefs, vol. 59, no. 11, pp. 731–735, Nov. 2012.