Jörg Fuhrmann

A Digital Power Amplifier in 28 nm CMOS for LTE Applications

FAU Studien aus der Elektrotechnik

Volume 6

Series editor:

Prof. Dr. Günter Roppenecker

Jörg Fuhrmann

A Digital Power Amplifier in 28 nm CMOS for LTE Applications

Erlangen FAU University Press 2016

Bibliographic information of the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

This work, including all of its parts, is protected by copyright. The rights to all content remain with the respective authors. The content may be used under the Creative Commons license BY-NC-ND.

The complete content of the book is available as a PDF via the OPUS server of the Friedrich-Alexander-Universität Erlangen-Nürnberg: https://opus4.kobv.de/opus4-fau/home

Publisher and distribution: FAU University Press, Universitätsstraße 4, 91054 Erlangen

Printing: docupoint GmbH

ISBN: 978-3-944057-94-1 (print edition)
eISBN: 978-3-944057-95-8 (online edition)
ISSN: 2363-8699

A Digital Power Amplifier in 28 nm CMOS for LTE Applications

Ein digitaler Leistungsverstärker in 28 nm-CMOS für LTE-Anwendungen

Submitted to the Faculty of Engineering (Technische Fakultät) of the Friedrich-Alexander-Universität Erlangen-Nürnberg for the degree of DOKTOR-INGENIEUR by Jörg Fuhrmann from Nürnberg

Approved as a dissertation by the Faculty of Engineering of the Friedrich-Alexander-Universität Erlangen-Nürnberg

Date of the oral examination: 25.04.2016

Chair of the doctoral committee: Prof. Dr. rer.-nat. Peter Greil
1st reviewer: Prof. Dr.-Ing. Dr.-Ing. habil. Robert Weigel
2nd reviewer: Prof. Dr. techn. Harald Pretl

Acknowledgment

First of all I thank Prof. Dr. techn. Harald Pretl for his guidance, constructive discussions and his open door policy. He always found the time to support and listen to me, even in the most stressful phases of his own work. He constantly encouraged me and he paved the way for this work by setting up a great working environment and a good network inside the company. I deeply thank him and I could not imagine a better supervisor.

I thank my Ph.D. colleague Patrick Oßmann for the constructive and good team work. I thank Krzysztof Dufrêne, who started this work with us and laid a great foundation. I thank Moreira José, whose constructive support, especially during the hard phases of this project, was a key element to the success of the designs. I thank my office colleague Anas Saudi for the nice working environment that he created and for his unconditional support in the laboratory.

I thank my managers Thomas Greifeneder and Volker Neubauer for their support and open door policy during the years. I thank Stephan Leuschner, Michael Fulde, Ofir Degani, Jonas Fritzin, Thomas Bauernfeind, Thomas Buggler, Jan Zaleski, Alexander Klinkan, Dirk Friedrich, Simon Grünberger, Daniel Gruber, Sven Hampel, Dejan Teodorovic, Alexander Huber and all other members of Danube Mobile Communications Engineering (DMCE) and Intel® for their contributions to this work. I thank them for the nice working atmosphere and for all the extra hours and effort they spent on discussions and support.

I thank my professor Prof. Dr.-Ing. Dr.-Ing. habil. Robert Weigel for providing me the opportunity to do my Ph.D. thesis and for his guidance during this time. I also thank all the other colleagues of the Department for Engineering at the Friedrich-Alexander-Universität Erlangen-Nürnberg for the friendly atmosphere on my occasional visits.

I especially thank my parents Anton and Birgit Fuhrmann and my whole family, who constantly supported me during my whole life and who have always encouraged me to pursue my goals. I thank my girlfriend Amalia Lorca Ballestrín for her patience, support and understanding during the last years. I had a great time with her and she always gave me new strength to continue. I thank my friends for the nice moments we had during the rare free time while I was working on my thesis.

I thank you all very much for your individual contributions and support. The last years were a great and pleasant experience for me. I have enjoyed being with all of you and I hope that many good years will follow.

- Jörg Fuhrmann

Abstract

The further development of the mobile communication standard towards the 4th generation (4G) long term evolution (LTE) and the simultaneous development of technology standards create new challenges that have to be met when designing power amplifiers (PAs). The downscaling of complementary metal-oxide-semiconductor (CMOS) integrated circuits (ICs) according to Moore’s law makes the overall transceiver system more compact and reduces the required chip area. Recently, a fully integrated CMOS power amplifier was included in a single-chip 3rd generation (3G) high speed packet access (HSPA) transceiver with a radio frequency digital to analog converter (RFDAC). For further integration, the RFDAC and PA can be merged into a digital PA (DPA). The signal can be generated using polar modulation (PM), which allows amplitude and phase to be considered separately. The current summing digital power amplifier (CSDPA) is one solution for fully integrated CMOS PM architectures. A CSDPA can be implemented as a switched power amplifier architecture with an inverse class-D PA, which can theoretically achieve high efficiency. This makes it a promising candidate for fully integrated circuits.

To prove the capability of 28 nm CMOS technology to provide watt-level output power, a linear class-AB PA is designed. A DPA is implemented to merge the design further and, in addition, to enable the testing of more advanced designs. Both designs are implemented in a front-end-of-line (FEOL) 28 nm CMOS technology. The designs use 7 copper layers and 1 aluminum layer as back-end-of-line (BEOL). The fully integrated circuits include on-chip matching, biasing and electrostatic discharge (ESD) protection. To overcome the voltage stress on a single transistor, the designs are implemented as a triple stack with a feedback path from the drain of the upper transistor to its gate. This reduces the voltage stress on each transistor and increases the reliability.


The circuits are measured with sinusoidal signals to determine the output power and efficiency. For linearity characterization, the standard of the 3rd generation partnership project (3GPP) is used. Universal terrestrial radio access (UTRA) and evolved UTRA (E-UTRA) adjacent channel leakage power ratio (ACLR) and error vector magnitude (EVM) are tested using LTE physical uplink shared channel (PUSCH) orthogonal frequency-division multiplexing (OFDM) quadrature phase-shift keying (QPSK)/16 quadrature amplitude modulation (16-QAM) test signals with 1.4-20 MHz bandwidth (BW) at the required channel power (CHP).

The linear stand-alone PA is designed for LTE frequency division duplex (FDD) band 1. The bare bumped die measures 1.88 × 0.51 mm² and is directly soldered onto the printed circuit board (PCB). In pulsed measurements, a maximum power-added efficiency (PAE) of 35.2 %, a drain efficiency ηd of 39.5 %, a gain of 15.5 dB and a maximum output power Pmax of 31.7 dBm are achieved at 1.83 GHz with a 3.2 V supply voltage. The use of digital predistortion (DPD) is shown in the frequency spectrum of a fully allocated band 1 PUSCH 16-QAM OFDM LTE signal with 15 MHz bandwidth (LTE-15). The LTE requirements for the BWs of 1.4-20 MHz are measured with fully allocated band 1 PUSCH QPSK signals. The required EVM of 17.5 %, UTRA ACLR of −33 dBc and E-UTRA ACLR of −30 dBc are fulfilled with the use of DPD for all BWs.

The monolithic fully integrated DPA, implemented in a single-chip LTE transceiver system, was designed for LTE FDD band 7 and LTE time division duplex (TDD) bands 38, 40 and 41. The implementation has been optimized for operation in the 2.3-2.7 GHz range. It is implemented as a digital polar transmitter (DPT) that is directly connected to the digital front end (DFE). The DFE converts the IQ modulated signal data to a polar modulated signal. The modulated phase information is contained in the local oscillator (LO) signal. The amplitude information is decoded by a segmented 15 bit field. The 10 most significant bits (MSBs) are thermometer decoded to ensure monotonicity. The 5 least significant bits (LSBs) are realized as binary weighted cells to reduce complexity. The signal’s amplitude and phase information is combined again inside the DPA, which already provides the required LTE output power without further amplification. The output of the DPA is matched by a transformer. The transformer also acts as a balun and transforms the differential signal into a single-ended one. By using an inverse class-D PA design for the unit cells (UCs) in the cell field, their outputs can be shorted and connected to an output matching network (OMN). This results in a compact implementation. The inductive element of the matching network is merged into the OMN. The transformer is divided into one on-chip winding and a secondary winding in the package. Since the secondary winding of the transformer is realized in an extra redistribution layer (RDL) inside the package, the on-chip copper metal lines can be used for the primary winding. The primary winding is implemented in the thickest metal layer and in the aluminum layer to ensure good conductivity and reduced insertion loss. This results in an improved quality factor of the OMN. The output of the transformer is connected to a 50 Ω output load on the PCB. The center tap of the transformer is connected to the 2.5 V power supply.

The DPA has an area of 0.61 × 0.5 mm². The continuous wave (CW) measurements show a Pmax of 31.2 dBm and a maximum ηd of 34.3 %. The dynamic range (DR) of the DPA is 87.9 dB. The E-UTRA ACLR for band 7, at the required CHP of 26 dBm, is 26.9 dBc for an LTE signal with 5 MHz bandwidth (LTE-5) and 27.4 dBc for an LTE signal with 10 MHz bandwidth (LTE-10). The EVM requirements are fulfilled for all measurements. The duplex noise (DN) at 26 dBm CHP is −140.7 dBc/Hz for LTE-5 and −138.3 dBc/Hz for LTE-10.


Kurzfassung

The further development of the mobile communication standard up to today's fourth generation (4G) Long Term Evolution (LTE) and the simultaneous advancement of the technology standard create new challenges in the design of a power amplifier (PA) that have to be met. The scaling of integrated circuits (ICs) with complementary metal-oxide-semiconductor (CMOS) logic according to Moore's law makes the entire transceiver more compact and reduces the required chip area. Recently, a fully integrated CMOS PA was integrated on a single chip together with a radio frequency digital-to-analog converter (RFDAC) for a third generation (3G) High Speed Packet Access (HSPA) transceiver. To integrate the concept further, the RFDAC and the PA can be merged into a digital PA (DPA). The signal can be generated by means of polar modulation (PM), which allows amplitude and phase to be considered separately. The current summing DPA (CSDPA) is one solution for fully integrated CMOS PM architectures. A CSDPA can be implemented as a switched PA architecture with an inverse class-D PA. The theoretically achievable high efficiency makes it a promising candidate for fully integrated circuits.

To verify the capability of the 28 nm CMOS technology to provide output power in the watt range, a class-AB PA was designed. A DPA is built to merge the design further and, in addition, to make it possible to test more advanced designs. Both designs are implemented in a 28 nm technology for the active devices (front-end-of-line, FEOL). Seven copper layers and one aluminum layer are used as metallization (back-end-of-line, BEOL). The fully integrated circuits include an on-chip matching network, the bias setting and electrostatic discharge (ESD) protection. To overcome the high voltage stress on a single transistor, the designs are implemented as a triple stack with a feedback path from the drain of the upper transistor to its gate. This reduces the voltage stress on each individual transistor and thereby increases the reliability.

The circuits are excited with sinusoidal signals to determine their output power and efficiency. For the linearity characterization, the standard of the 3rd Generation Partnership Project (3GPP) is used. Universal Terrestrial Radio Access (UTRA) and Evolved UTRA (E-UTRA) adjacent channel leakage power ratio (ACLR) and error vector magnitude (EVM) are tested using LTE Physical Uplink Shared Channel (PUSCH) orthogonal frequency-division multiplexing (OFDM) quadrature phase-shift keying (QPSK)/16 quadrature amplitude modulation (16-QAM) test signals with 1.4-20 MHz bandwidth (BW) at the required channel power (CHP).

The stand-alone linear PA is designed for LTE frequency division duplex (FDD) band 1. The unpackaged die measures 1.88 × 0.51 mm² and is soldered directly onto a printed circuit board (PCB). In pulsed measurements, a maximum power-added efficiency (PAE) of 35.2 %, a drain efficiency ηd of 39.5 %, a gain of 15.5 dB and a maximum output power Pmax of 31.7 dBm are achieved at 1.83 GHz with a 3.2 V supply voltage. The use of digital predistortion (DPD) is shown by means of a fully allocated LTE-15 band 1 PUSCH 16-QAM OFDM signal. The LTE requirements for BWs of 1.4-20 MHz are measured with fully allocated band 1 PUSCH QPSK signals. The required EVM of 17.5 %, the UTRA ACLR of −33 dBc and the E-UTRA ACLR of −30 dBc are met for all BWs when DPD is used.

The monolithic, fully integrated DPA, which is implemented on a single LTE transceiver chip, is designed for LTE FDD band 7 and the LTE time division duplex (TDD) bands 38, 40 and 41 and thus for an operating range of 2.3-2.7 GHz. The implementation is realized as a digital polar transmitter (DPT) that is directly connected to the digital front-end (DFE). The DFE converts the IQ-modulated signal data into a polar-modulated signal. The modulated phase information is contained in the local oscillator (LO) signal. The amplitude information is decoded in a segmented 15-bit field. The 10 most significant bits (MSBs) are thermometer decoded to ensure monotonicity. The 5 least significant bits (LSBs) remain binary decoded to reduce complexity. The signal amplitude and the phase information are recombined inside the DPA. The required LTE output power is reached without additional amplification. The output of the DPA is matched with a transformer. The transformer simultaneously serves as a balun that converts the differential output into a single-ended one. By using an inverse class-D PA as the unit cell (UC) in the cell field, the outputs can be shorted and connected to the output matching network (OMN). This results in a compact implementation. The inductive element of the matching network is merged into the OMN. The transformer is split into one winding on the chip and the secondary winding in the package. Since the secondary winding of the transformer is realized in an additional redistribution layer (RDL) inside the package, the copper layers on the chip can be used for the primary winding. The primary winding is implemented in the upper, low-resistance metal layers and the aluminum layer to guarantee good conductivity and thus lower losses. This results in an improved quality factor of the OMN. The output of the transformer is then terminated with a 50 Ω output load on the PCB. The center tap of the transformer is connected to a 2.5 V supply voltage.

The DPA has an area of 0.61 × 0.5 mm². The continuous wave (CW) measurements show a Pmax of 31.2 dBm and a maximum ηd of 34.3 %. The dynamic range (DR) of the DPA is 87.9 dB. At the required CHP of 26 dBm, the E-UTRA ACLR for band 7 is 26.9 dBc for an LTE signal with 5 MHz bandwidth (LTE-5) and 27.4 dBc for an LTE signal with 10 MHz bandwidth (LTE-10). The EVM requirements were met for all measurements. The duplex noise (DN) at 26 dBm CHP is −140.7 dBc/Hz for LTE-5 and −138.3 dBc/Hz for LTE-10.


Acronyms

16-QAM  16 quadrature amplitude modulation
2G  2nd generation
3G  3rd generation
3GPP  3rd generation partnership project
4G  4th generation
64-QAM  64 quadrature amplitude modulation
ACLR  adjacent channel leakage power ratio
ADP  adaptive digital predistortion
AM  amplitude modulation
AM-AM  amplitude to amplitude
AM-PM  amplitude to phase distortion
ASM  antenna switch module
BEOL  back-end-of-line
BO  backoff
BPSK  binary phase-shift keying
BW  bandwidth
CA  carrier-aggregation
CCDF  complementary cumulative distribution function
CF  crest factor
CFR  crest factor reduction
CG  common gate
CHBW  channel bandwidth
CHP  channel power
CLK  digital clock
CMCD  current-mode class-D
CMOS  complementary metal oxide semiconductor
CP  cyclic prefix
CS  common source
CSDAC  current summing DAC
CSDPA  current summing DPA
CW  continuous wave
DAC  digital-to-analog converter
DAT  distributed active transformer
DB  dynamic biasing
DC  direct current
DFE  digital front-end
DN  duplex noise
DNL  differential nonlinearity
DPA  digital power amplifier
DPD  digital predistortion
DPT  digital polar transmitter
DPWM  digital pulse-width-modulation
DR  dynamic range
DRAM  dynamic random access memory
DSP  digital signal processor
DUT  device under test
E-UTRA  evolved universal terrestrial radio access
EEC  efficiency enhancement circuit
EER  envelope elimination restoration
EM  electromagnetic
ESD  electrostatic discharge
ET  envelope tracking
EVM  error vector magnitude
FDD  frequency division duplex
FDMA  frequency division multiple access
FE  front-end
FEOL  front-end-of-line
FM  frequency modulation
G  gain
GSM  global system for mobile communications
HBT  heterojunction bipolar transistor
IC  integrated circuit
IMN  input matching network
INL  integral nonlinearity
IoT  internet of things
IQ  in-phase and quadrature
ISI  intersymbol interference
ITRS  international technology roadmap for semiconductors
LDO  low-dropout regulator
LINC  linear amplification with nonlinear components
LO  local oscillator
LOX  inverse local oscillator
LP  low pass
LSB  least significant bit
LTE  long term evolution
LTE-1  LTE signal with 1.4 MHz bandwidth
LTE-10  LTE signal with 10 MHz bandwidth
LTE-15  LTE signal with 15 MHz bandwidth
LTE-20  LTE signal with 20 MHz bandwidth
LTE-5  LTE signal with 5 MHz bandwidth
LUT  look up table
LVT  low voltage transistors
M2M  machine-to-machine
MOS  metal oxide semiconductor
MOSFET  metal oxide semiconductor field effect transistor
MPR  maximum power reduction
MSB  most significant bit
MtM  more than Moore
nMOS  n-channel metal oxide semiconductor
OFDM  orthogonal frequency-division multiplexing
OFDMA  orthogonal frequency-division multiple access
OMN  output matching network
OOB  out of band
PA  power amplifier
PADAC  power amplifier digital to analog converter
PAE  power-added efficiency
PAPR  peak-to-average power ratio
PAR  peak-to-average ratio
PCB  printed circuit board
PEC  power enhancement circuit
PER  power enhancement ratio
PLL  phase-locked loop
PM  phase modulation
pMOS  p-channel metal oxide semiconductor
PUSCH  physical uplink shared channel
PVT  process voltage temperature
QPSK  quadrature phase-shift keying
RB  resource block
RBW  resolution bandwidth
RC  random column
RF  radio frequency
RFDAC  radio frequency digital to analog converter
RL  return loss
RRC  root-raised-cosine
RW  random walk
RX  receiver
SC  subcarrier
SC-FDMA  single-carrier FDMA
SCPA  switched capacitor power amplifier
SDPA  summing DPA
SMA  SubMiniature version A
SMPA  switched-mode power amplifier
SNR  signal to noise ratio
SoC  system on chip
TC  test case
TDD  time division duplex
TG  transmission gate
TX  transmitter
UC  unit cell
UE  user equipment
UMTS  universal mobile telecommunications system
UTRA  universal terrestrial radio access
VLSI  very-large-scale integration
VSDPA  voltage summing DPA
VSWR  voltage standing wave ratio
WCDMA  wideband code division multiple access
WCR  worst case room
WLAN  wireless local area network
WPAN  wireless personal area network
ZCDS  zero-current derivative switching
ZCS  zero-current switching
ZVDS  zero-voltage derivative switching
ZVS  zero-voltage switching

Symbols

A(t)  time dependent amplitude
Amax  maximum amplitude
Arms  root mean square of amplitude
Cdec  decoupling capacitance
Cfb  feedback capacitance
Cgd  gate drain capacitance
Cgs  gate source capacitance
Cin  input capacitance
Cm  matching capacitance
Coff  off-chip capacitance
Cox  gate oxide capacitance
Cpar  parasitic capacitance
Cs  shunt capacitance
C  capacitance
Γgm  reflection coefficient at gm stack
I(t)  inphase component of a signal
Idc  direct current
Idd,PA  supply current of a power amplifier
Imax  maximum current
In  current for the unit cell n
Ireplica  replica current
Irf  radio frequency current
L  inductance
L  gate length
Pant  output power at antenna
Pdc  direct power
Pin  input power
Pmax  maximum power
Pmin  minimum power
Pout  output power
Psat  saturated output power
Q(t)  quadrature component of a signal
Qind  quality factor inductance
Qp  quality factor primary inductance
Qs  quality factor secondary inductance
Rload  load resistance
Ron  resistance of a conducting transistor
Rpar  parasitic resistance
S21  S-parameter, input to output
Tin  input transformer
Tout  output transformer
Vbias,1  first bias voltage
Vbias,2  second bias voltage
Vd1  voltage at the drain of the first transistor
Vdd,PA  supply voltage of a power amplifier
Vdd  supply voltage
Vd,sat  knee voltage
Vds  drain source voltage
Vg1  voltage at the gate of the first transistor
Vg2  voltage at the gate of the second transistor
Vgd  gate drain voltage
Vin  input voltage
Vout  output voltage
Vox  oxide voltage
Vrf  radio frequency voltage
Vss  supply ground
Vth  threshold voltage
W  gate width
Z  impedance
a(t)  time dependent amplitude before amplification
αchain  chain losses
λ  channel modulation
ηd  drain efficiency
δt  time difference
ϵn  error for the unit cell n
foob  frequency
fsample  sampling frequency
f  frequency
is1  current through first transistor
is2  current through second transistor
k  coupling factor
Ī  mean value of current
µ  charge mobility
V̄in  negative input voltage
np  number of primary winding
ns  number of secondary winding
ηoa  overall efficiency
ϕ(t)  time dependent phase
ϕ  phase of a signal
ψb  time difference
τ  time delay
v1  drain source voltage first transistor


Contents

Abstract

Kurzfassung

1 Introduction
    1.1 State-of-the-Art
    1.2 Theoretical Design Concepts
    1.3 DPA Theory
    1.4 Combiner Techniques
    1.5 Power Amplifier Classes
        1.5.1 Class-D and Inverse Class-D PA
        1.5.2 Class-E and Inverse Class-E PA
        1.5.3 Class-F and Inverse Class-F PA
        1.5.4 Losses and Output Power
    1.6 Linearization Concepts
        1.6.1 Outphasing
        1.6.2 Envelope Elimination and Restoration
        1.6.3 Digital Polar Transmitter
        1.6.4 Summing Digital Power Amplifier
    1.7 Watt-Level Output Power
    1.8 Modulated Signals
    1.9 Summary of DPAs
    1.10 Motivation

2 Specifications
    2.1 Power Amplifier Basics
    2.2 LTE Signal
    2.3 LTE Specification
        2.3.1 Temperature
        2.3.2 Output Power
        2.3.3 Maximum Power Reduction
        2.3.4 VSWR
        2.3.5 Operating Band
        2.3.6 EVM Requirements
        2.3.7 ACLR Requirements
        2.3.8 Power Clipping
        2.3.9 Resolution

3 Linear Power Amplifier
    3.1 Fundamentals
    3.2 Design Considerations
    3.3 Circuit Design
    3.4 DC Simulation Results
    3.5 Silicon Implementation
    3.6 Measurement Setup
    3.7 DC Characterization
    3.8 DC Breakdown Measurements
    3.9 Single Tone Measurements
        3.9.1 AM-AM and AM-PM Measurements
        3.9.2 Saturated Output Power
    3.10 LTE Measurements
        3.10.1 Output Spectrum
        3.10.2 LTE-20 Band 1 16-QAM
        3.10.3 LTE-1 to LTE-20 Band 1 QPSK
    3.11 Comparison

4 Digital Power Amplifier
    4.1 DPA Design
        4.1.1 Inverse Class-D
        4.1.2 Stacked Inverse Class-D
    4.2 Theoretical Error Sources
        4.2.1 Quantization Error
        4.2.2 Amplitude Mismatch
        4.2.3 Driving Stage Mismatch
        4.2.4 Output Combining Mismatch
        4.2.5 Timing Mismatch
    4.3 AM-AM and AM-PM Distortion
    4.4 Matrix Controlling
        4.4.1 1-D Switching Schemes
        4.4.2 2-D Switching Schemes
    4.5 Theoretical Error Cancellation
        4.5.1 Amplitude Variation Error
        4.5.2 Timing Mismatch Error
    4.6 Decoder
        4.6.1 Motivation
        4.6.2 Decoder Design
        4.6.3 Decoder Layout
    4.7 Cell Field Layout
        4.7.1 Layout Consideration
    4.8 DPA Simulation
        4.8.1 Simulation Setup
        4.8.2 Transient Voltage Output
        4.8.3 Output Power, Drain and Overall Efficiency
        4.8.4 Output Voltage
        4.8.5 Output Phase
    4.9 Variable LO Load
    4.10 Silicon Implementation
    4.11 CW Measurements
        4.11.1 Measurement Setup
        4.11.2 CW Output Power and Drain Efficiency
        4.11.3 AM-AM and AM-PM
        4.11.4 CW Output Power over Frequency
        4.11.5 CW Output Power over Supply Voltage
    4.12 Simulations vs. Measurements
    4.13 LTE Measurements
        4.13.1 ACLR
        4.13.2 EVM
        4.13.3 Drain Efficiency
        4.13.4 Spectrum
        4.13.5 Full Span
    4.14 Failure Causes
        4.14.1 Simulation
        4.14.2 Measurement

5 Conclusion


Chapter 1 Introduction

Since the invention of the transistor in 1947, industry as well as mobile communication has changed drastically [1]. Moore’s law motivated the downscaling of complementary metal oxide semiconductor (CMOS) technology and led to the further development of integrated circuits (ICs) [2]. Later, technology scaling brought new inventions such as the Intel® 45 nm high-k metal gate silicon technology and the first demonstrated 32 nm logic process. Besides this technology scaling, further developments of non-digital functions led to the expression more than Moore (MtM). These developments are silicon based technologies, but they do not scale in the same way [3].

At the same time as the transistor was invented, the theoretical foundations for communication were laid. To this day, mobile communication systems have been developed up to the 4th generation (4G). The 2nd generation (2G) was represented by global system for mobile communications (GSM) and the 3rd generation (3G) by universal mobile telecommunications system (UMTS) [4, 5]. Today’s 4G long term evolution (LTE) is already available to users on the market. Based on this standard, internet of things (IoT) and machine-to-machine (M2M) create another market for ICs [6, 7].

It is crucial to design all mobile devices for high efficiency, and thus for low power consumption, to guarantee a long operating time. Due to the high output power that is required by the 3rd generation partnership project (3GPP), the transmitter (TX), and here especially the power amplifier (PA), is a major power consumer. For the overall efficiency it is therefore of high interest to design this component carefully. Besides high power efficiency, fully integrated designs also make the overall design more compact. It is therefore of interest to further develop existing PAs for later technology nodes and the next generation of mobile communication [8].

In the past years, the mobile communication standard has evolved to 4G. New devices with applications that use high definition video transfers and on-demand services require a high data transfer rate. LTE is currently available as the state-of-the-art standard and will be further developed to LTE-Advanced. The high output power for mobile communication standards, compared to wireless local area network (WLAN) or Bluetooth, requires additional effort and care when designing a PA. Since this block is crucial to the overall system performance, a special focus on the PA design is mandatory.

1.1 State-of-the-Art

A summary of published CMOS PAs that achieve the watt-level output power required for LTE was presented recently [9]. As mentioned before, the PA is a crucial part for the overall efficiency in a transceiver system. Therefore, a special focus on its design is mandatory. The high peak-to-average ratio (PAR) of orthogonal frequency-division multiplexing (OFDM) for LTE forces the PA to operate in backoff (BO), while the highest efficiency is achieved at saturated output power. Additional metrics such as adjacent channel leakage power ratio (ACLR) and error vector magnitude (EVM) define the linearity that a PA has to achieve when transmitting a signal. This is important for low bit error rates and spectral coexistence with other wireless technology standards such as Bluetooth and WLAN.

It was shown that linear class-AB PAs can meet the 3GPP LTE specifications even in the latest nanometer technology nodes [10]. Because linear designs suffer from poor efficiency even in theory, switched amplifier designs such as class-D, -E and -F become very attractive [11]. On the one hand, these switched amplifiers have a theoretical efficiency of 100 %, but on the other hand, it is impossible to linearly modulate the amplitude of a signal without further design improvements. Outphasing, linear amplification with nonlinear components (LINC), envelope elimination restoration (EER), digital polar transmitter (DPT) or switched-mode power amplifier (SMPA) are architectures that linearize these highly non-linear PAs.
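The backoff requirement described in this section can be illustrated numerically. The following is a minimal sketch, not a 3GPP-conformant LTE test signal: it assumes a single OFDM symbol with 64 QPSK-modulated subcarriers at critical sampling, and the function names `ofdm_symbol` and `papr_db` are hypothetical. It estimates the PAR of the complex envelope, which indicates roughly how far below the saturated output power the average power has to stay if the signal peaks are not to be clipped.

```python
import cmath
import math
import random


def ofdm_symbol(num_subcarriers, rng):
    """Time-domain samples of one OFDM symbol with random QPSK subcarriers."""
    qpsk = [complex(a, b) / math.sqrt(2) for a in (-1, 1) for b in (-1, 1)]
    freq = [rng.choice(qpsk) for _ in range(num_subcarriers)]
    n = num_subcarriers
    # Naive inverse DFT; adequate for a small illustrative example.
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
        for m in range(n)
    ]


def papr_db(samples):
    """Peak-to-average power ratio of a sampled complex envelope, in dB."""
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


rng = random.Random(0)  # fixed seed so the example is repeatable (assumption)
symbol = ofdm_symbol(64, rng)
print(f"PAR of one 64-subcarrier QPSK symbol: {papr_db(symbol):.1f} dB")
```

For N equal-power subcarriers this ratio is bounded by 10·log10(N) dB, about 18 dB for N = 64, while a constant-envelope signal gives 0 dB. A real LTE uplink signal differs in modulation details, oversampling and filtering, so the printed value is only indicative.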


Recently, integrated radio frequency (RF) CMOS PAs became the focus of attention to compete with III-V heterojunction bipolar transistor (HBT) based PAs in the mobile handset market [12]. For a full system on chip (SoC) in connection with very-large-scale integration (VLSI), it is important to design the PA in the same technology as the digital front-end (DFE) [13]. Transceivers have been further developed towards all-digital transceiver architectures, and architectures were presented that use a radio frequency digital to analog converter (RFDAC) in 65 nm CMOS technology with an integrated PA [14–16]. It has also been shown that full transceivers can be implemented in CMOS for all kinds of wireless communication systems. Table 1.1 shows a selection of reported publications on Bluetooth [17], wireless personal area network (WPAN) [18], WLAN [19], LTE [20] and LTE-Advanced [21].

Table 1.1: Selected CMOS Transceiver Implementations for Wireless Communication Systems

Modulation     Process [nm]  Frequency [GHz]  Supply [V]  Area [mm²]  Ref.
LTE-Advanced   90            1.95             2.7         3.69        [21]
LTE            40            2.7              2.5         13          [20]
WLAN           180           5.2              1.8         17.2        [19]
WPAN           180           2.4              1.8         7.84        [18]
Bluetooth      250           2.4              2.7         4.0 × 4.5   [17]

For more compact design solutions, the digital-to-analog conversion and the amplification can be implemented together by merging the digital-to-analog converter (DAC) with the PA into a digital power amplifier (DPA). DPAs are promising candidates for compact designs and full SoC solutions with good efficiency. Different DPA concepts can be considered to fulfill the system requirements in the best possible way.

This chapter presents the theoretical design concepts needed to fulfill the 3GPP requirements as well as recently presented implementations. Section 1.2 explains the 3GPP requirements for LTE and the challenges for low CMOS technology nodes. The theory of DPAs and their specification is explained in Section 1.3. Section 1.4 shows different power combining techniques that were used in recent publications to increase the output power, together with the theoretical efficiencies of the most commonly used combiner techniques. Section 1.5 and Section 1.6 summarize the theoretically highly efficient but also highly nonlinear SMPAs and their linearization techniques. Section 1.7 and Section 1.8 summarize different architectures; their design concepts and technology nodes are discussed with a focus on watt-level output power. For modulated signals the focus is on ACLR and in-band power for LTE. The chapter closes with a summary of PA designs and the motivation for choosing an inverse class-D PA for the implementation.

1.2 Theoretical Design Concepts

For the modulation of signals in wireless communication, quadrature phase-shift keying (QPSK), 16 quadrature amplitude modulation (16-QAM) or 64 quadrature amplitude modulation (64-QAM) are used. These can be generated either directly by in-phase and quadrature (IQ) modulation or by phase modulation (PM), which converts the signal into an amplitude A(t) and a phase ϕ(t) as

\[ A(t) = \sqrt{I(t)^2 + Q(t)^2}, \qquad \phi(t) = \tan^{-1}\!\left(\frac{Q(t)}{I(t)}\right). \tag{1.1} \]

The amplitude and phase can be combined, which results in the time-dependent signal s(t) as

\[ s(t) = A(t)\,\sin(\omega t + \phi(t)). \tag{1.2} \]

The conversion from I(t) and Q(t) to A(t) increases the envelope signal bandwidth (BW) [22]. To modulate this signal, different designs have been developed, such as EER, DPT, outphasing [23] and summing DPAs (SDPAs), which sum either the voltage or the current [24].
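The conversion of Eq. (1.1) and the reconstruction of Eq. (1.2) can be sketched in a few lines of Python. This is only a minimal illustration: the function names and the sample values are hypothetical, and `math.atan2` is used instead of a plain arctangent so that the phase is resolved in all four quadrants, which is an assumption that goes slightly beyond Eq. (1.1).

```python
import math

def iq_to_polar(i, q):
    """Return (A, phi) for one IQ sample, following Eq. (1.1)."""
    amplitude = math.hypot(i, q)   # sqrt(I^2 + Q^2)
    phase = math.atan2(q, i)       # four-quadrant form of arctan(Q/I) (assumption)
    return amplitude, phase

def polar_signal(amplitude, phase, omega, t):
    """Evaluate s(t) = A sin(omega*t + phi), Eq. (1.2)."""
    return amplitude * math.sin(omega * t + phase)

# Hypothetical sample with equal in-phase and quadrature components.
A, phi = iq_to_polar(1.0, 1.0)
print(A, math.degrees(phi))
```

With this phase convention the polar signal equals I sin(ωt) + Q cos(ωt), which gives a quick consistency check for any sample.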

In cellular systems the spectral requirements are given in-band as EVM and out of band (OOB) as ACLR. Table 1.2 displays a selection of the LTE requirements specified by 3GPP [25]. The required output power for LTE is 23 dBm, with a maximum power reduction of 1-2 dB depending on the modulation and resource block (RB) allocation of the design.

Table 1.2: Selected 3GPP LTE Linearity Requirements

Requirement        LTE     Unit
EVM     QPSK       17.5    %
        16QAM      12.5    %
ACLR    E-UTRA     30      dBc
        UTRA       33      dBc

Due to chain losses and peak-to-average power ratio (PAPR), the watt level, or 30 dBm, is one of the considered design requirements for PAs, as shown in

\[ P_{out}[\mathrm{W}] = 10^{P_{out}[\mathrm{dBm}]/10} \cdot 1\,\mathrm{mW}. \tag{1.3} \]

Pout at a resistor load Rload can be calculated, for a sinusoidal signal with the amplitude Vout, as

\[ P_{out} = \frac{1}{2}\,\frac{V_{out}^2}{R_{load}}. \tag{1.4} \]

In Table 1.3 the gate oxide thickness for the different technologies is given as specified by the international technology roadmap for semiconductors (ITRS) [26]. It can be seen that the presented classes cannot reach the required output power on their own in low CMOS technology nodes [9]. Therefore, most designs combine differential push-pull stages with a preamplifier and a power combining architecture [27]. Predriver stages can be used to meet the stringent rise and fall times needed to reduce amplitude-to-phase (AM-PM) distortion [28]. These stages have to be designed carefully so that they do not dissipate too much power. To minimize the driver power it is better to use a small number of cascaded drivers with large fan-out to drive the large PA devices than more drivers with smaller fan-out [28]. Taking these components into account can considerably lower the overall efficiency of a design [29]. Voltage switching classes have the additional disadvantage that stacked designs need two driving stages to drive the inverter. A non-overlapping clock has to be implemented between the two stages so as not to lose efficiency. Inverse class-D, -E or -F PAs can simply be stacked because no inverter is needed. Especially in lower technology nodes, where the voltage that drops across the PA is still too high for a single transistor, a stacked design is used to distribute the voltage drop across more transistors [30, 31].

Table 1.3: Selected ITRS Specifications for Thick Oxide and Thin Oxide Transistors in Deep Nanometer CMOS Technology

              Node    Voltage Supply   Tox     Gate Length
              [nm]    [V]              [nm]    [nm]
Thin Oxide    65      1.2              2       53
              45      1                1.5     32
              28      0.95             1.1     20
Thick Oxide   65      2.5              5       250
              45      1.8              3       180
              28      1.8              3       180

In Fig. 1.1 a stack of three transistors is shown. The bottom transistor N1 operates in common source (CS) mode and is driven by the signal Vrf. The transistors N2 and N3 are implemented as common gate (CG) stages and biased by Vbias,1 and Vbias,2. It has already been shown that in deep nanometer CMOS technology LTE output power can be achieved using a triple stack design in a linear class-AB PA [10]. The bulks of the transistors are connected to their sources so that the bulk potential changes dynamically and the voltage stress of RF signals is distributed. Furthermore, a capacitor Cfb can be implemented that generates a feedback from the drain to further reduce the drain voltage stress that occurs at the upper transistor N3. With an n times higher output voltage Vout, the same Pout can be achieved using a load Rload that is n² times higher, with n being the number of transistors used in the stack [9]. This might be especially of interest if Rload would otherwise have to be made as small as the parasitic resistance and capacitance at the output of the stage. In that case the losses can be significant enough to degrade the whole performance of the design due to voltage drops.

Figure 1.1: Transistor level diagram of a stacked transistor structure.
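As a rough numerical illustration of Eq. (1.3), Eq. (1.4) and the n² load scaling, the following Python sketch computes the load resistance needed for 30 dBm at different stack heights. The per-transistor amplitude of 1 V and the function names are hypothetical values chosen for illustration only, not figures taken from the cited designs.

```python
def dbm_to_watt(p_dbm):
    """Eq. (1.3): convert a power level from dBm to watt."""
    return 10 ** (p_dbm / 10) * 1e-3

def load_for_power(p_watt, v_out):
    """Eq. (1.4) solved for Rload, for a sinusoid with amplitude v_out."""
    return v_out ** 2 / (2 * p_watt)

p_target = dbm_to_watt(30.0)   # watt-level target
v_single = 1.0                 # hypothetical amplitude per transistor, in volt

for n in (1, 2, 3):
    # n stacked transistors allow an n times higher output amplitude
    r_load = load_for_power(p_target, n * v_single)
    print(f"n={n}: Vout={n * v_single:.1f} V, Rload={r_load:.2f} ohm")
```

Under these assumptions the required load grows with n², which shows why stacking moves Rload away from the range of the parasitic resistances.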

1.3 DPA Theory

Not all of the conventional specifications for PAs can simply be transferred to DPA designs. Table 1.4 shows a selection of PA characterization parameters. Looking at the equations for the gain (G) or the power-added efficiency (PAE) of a PA, the question arises how these quantities can be calculated for a DPA. Unlike Pout, the input signal power Pin cannot be measured as in purely analog designs, because there is no analog modulated input signal.

Table 1.4: Selection of DPA Parameters

Characterization   Unit   Equation
G                  dB     $10 \log_{10}\left(P_{out}/P_{in}\right)$
PAE                %      $100 \cdot (P_{out} - P_{in})/P_{dc}$
ηd                 %      $100 \cdot P_{out}/P_{dc}$
INL                LSB    $\max\left\{(V_n - V_{n,ideal})/V_{LSB}\right\}$
DNL                LSB    $\max\left\{(V_{n+1} - V_n)/V_{LSB} - 1\right\}$

The power that is dissipated in an ideal digital design with the supply voltage Vdd can be calculated as

\[ P_{in} = f\,C_{in}\,V_{dd}^2. \tag{1.5} \]

Cin is the input capacitance that has to be driven, and the frequency f describes how often this capacitance is charged and discharged [32]. Since Pin usually has no independent power supply, it is impossible to measure it. For this reason the PAE and gain are usually not provided and cannot be compared. Another approach is to describe Pin as the total digital input power, which takes local oscillator (LO) and clock generation into account. The resulting figure is then referred to as the total efficiency [33]. ηd is independent of Pin if Pdc is measured for the PA only.

On the other hand, additional aspects have to be considered that were not relevant in designing a classical PA. Because the output power is generated by controlled unit cells (UCs) instead of linear devices, the transfer characteristic of the PA changes from continuous to discrete values. The bit resolution and the integral nonlinearity (INL) and differential nonlinearity (DNL) errors should therefore also be reported to fully describe a DPA.

Table 1.5 shows the bit resolution, INL, DNL, Pout and frequency of different presented DPAs. A current summing DPA (CSDPA) with a unit class-E amplifier can operate at frequencies up to 47 GHz. Another design with 10 bit resolution and segmented unit cells achieves INL/DNL values of 2.43/3.2 least significant bit (LSB) by using predistortion. One has to take into account that predistortion reduces the effective resolution. The good matching characteristics of the CMOS capacitors in a voltage summing DPA (VSDPA) result in a DNL of 0.5 LSB and an INL below 3 LSB.

Table 1.5: Comparison of Bit, INL and DNL

Resolution   INL     DNL     Pout    Frequency   Year   Ref.
[Bit]        [LSB]   [LSB]   [dBm]   [GHz]       [a]
3            -       0.45    28.9    47          2015   [34]
10           2.43    3.2     25.2    2           2009   [35]
6            <3      0.5     25.2    2.25        2011   [32]
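To make the INL and DNL figures above more concrete, the following Python sketch evaluates both quantities for a list of measured output levels, using the usual data-converter definitions. The endpoint-fit reference line, the function name and the example levels are assumptions made for illustration; published DPAs may use a different reference, for example a best-fit line.

```python
def dnl_inl(levels):
    """Return (DNL, INL) lists in LSB for measured output levels.

    Assumption: the ideal transfer line passes through the first and
    the last measured level (endpoint fit).
    """
    n_steps = len(levels) - 1
    v_lsb = (levels[-1] - levels[0]) / n_steps
    # DNL: deviation of each step from one ideal LSB
    dnl = [(levels[k + 1] - levels[k]) / v_lsb - 1 for k in range(n_steps)]
    # INL: deviation of each level from the ideal line
    inl = [(levels[k] - (levels[0] + k * v_lsb)) / v_lsb
           for k in range(len(levels))]
    return dnl, inl

# Hypothetical 5-level transfer characteristic with one misplaced level.
dnl, inl = dnl_inl([0.0, 1.0, 2.5, 3.0, 4.0])
print(max(abs(x) for x in dnl), max(abs(x) for x in inl))
```

The worst-case magnitudes of the two lists correspond to the single INL and DNL numbers that are typically reported.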

1.4 Combiner Techniques

Figure 1.2a and 1.2b show that power combining can be categorized into voltage (series) combining and current (parallel) combining architectures [36]. Using a series or parallel combiner with transformers has the advantage of isolating each power amplifier stage from the output, which makes the design more independent. The load at the PA output can be adapted by using the transformation ratio np:ns. Another advantage of the transformer design is the integration of the matching components into the transformer [37].

Figure 1.2: Block level diagram of a transformer based series combiner (a), a parallel transformer combiner (b), a parallel λ/4 combiner (c) and a parallel combiner with lumped elements as distributed LC matching (d).

A distributed active transformer (DAT) is a series combiner that was introduced to overcome the problems of impedance transformation with an LC matching network. An LC matching network can be used to transform the impedance seen at the output of a PA and thereby increase the output power. This technique is very lossy due to the highly conductive substrate and the thin metal and dielectric layers. For smaller technology nodes these losses increase and degrade the output power and efficiency [27]. The figure-8 combiner was introduced as a series combiner architecture whose layout improves the coupling between primary and secondary winding, improves the quality factor by reducing unwanted capacitive coupling and current crowding, and occupies a smaller area. One drawback of the figure-8 design is that its turn ratio is limited and that it is not completely symmetrical by itself; therefore a symmetric and an octagonal design were developed [38,39].

A parallel power combiner can be implemented as an interleaved transformer using two primary windings and one secondary winding [40], as a λ/4 combiner as shown in Figure 1.2c, or as an equivalent LC network as in Figure 1.2d. It can also be implemented as a Wilkinson power combiner or as a Chireix combiner. The Wilkinson combiner isolates the two outputs from each other by a resistor [23], which results in good linearity because each amplifier sees a fixed output load. On the other hand, this degrades the efficiency due to power losses. The Chireix combiner has an additional phase shift due to an impedance jB [41]; it is therefore more efficient than the Wilkinson combiner over a wider phase range [42] and has better efficiency in BO. The compensation elements are chosen for a specific frequency, which makes the combiner less attractive for multiple frequency bands. Furthermore, it degrades the linearity of designs that use load-sensitive PAs [43].

It is also possible to combine current and voltage combiners. In a DPA, the UCs can be combined at the output, and blocks of UCs can additionally be combined by a series combiner [44]. This has the advantage that the load can be modulated dynamically, so that a high efficiency is maintained across all amplitude levels.

Table 1.6 summarizes the efficiency equations of a transformer, a Chireix combiner and an LC matching network.
For a transformer that is used to match the output of a PA, an additional load and tuning capacitance is needed. For this design an optimum efficiency was derived assuming a fixed relation between the input inductance of the primary winding and that of the secondary winding, depending on the transformation ratio. Qp and Qs are the quality factors of the primary and secondary winding and k is the coupling factor. The efficiency of the whole design depends on these three parameters. Chireix combiners use a reactive element to cancel the difference in the load impedance seen by each PA. From the equation it can be derived that a second maximum exists in BO. Its position is set by the phase ϕ and, like the overall efficiency, depends on the susceptance BC of the inductor and the load resistance Rload. The Chireix combiner reaches its maxima at sin(2ϕ) = RLBC. For an LC network the efficiency depends on the quality factor Qind of the coil and the power enhancement ratio (PER) E, which in turn depends on the impedance transformation ratio and the efficiency of the transformation network. Further studies on efficiency include the impact of scaling PA elements in power combining circuits, which makes the efficiency calculations for fully integrated designs more elaborate [39].

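Table 1.6: Efficiency Models for Power Combiner

Power combiner   Efficiency                                                                                                                      Ref.
Transformer      $\dfrac{1}{1 + \dfrac{2}{Q_p Q_s k^2} + 2\sqrt{\dfrac{1}{Q_p Q_s k^2}\left(1 + \dfrac{1}{Q_p Q_s k^2}\right)}}$                 [27]
Chireix          $\dfrac{1}{\sqrt{1 + \dfrac{1}{4}\left(\dfrac{\sin(2\phi) - R_L B_C}{\sin^2(\phi)}\right)^2}}$                                  [41]
LC               $1 - \dfrac{\sqrt{E - 1}}{Q_{ind}}$                                                                                             [27]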
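The three efficiency models can be evaluated numerically with a short Python sketch. The formulas below are the reading of Table 1.6 assumed here (the transformer and LC expressions as commonly quoted from [27], the Chireix expression written as a power factor that reaches its maxima at sin(2ϕ) = RLBC); the function names and example values are hypothetical.

```python
import math

def transformer_efficiency(q_p, q_s, k):
    """Transformer efficiency for winding quality factors q_p, q_s and coupling k."""
    x = 1.0 / (q_p * q_s * k ** 2)
    return 1.0 / (1.0 + 2.0 * x + 2.0 * math.sqrt(x * (1.0 + x)))

def chireix_efficiency(phi, rl_bc):
    """Chireix combiner efficiency for outphasing angle phi (0 < phi <= pi/2)."""
    ratio = (math.sin(2.0 * phi) - rl_bc) / (2.0 * math.sin(phi) ** 2)
    return 1.0 / math.sqrt(1.0 + ratio ** 2)

def lc_efficiency(per, q_ind):
    """LC matching efficiency for power enhancement ratio per >= 1."""
    return 1.0 - math.sqrt(per - 1.0) / q_ind

# Example values (hypothetical): k = 0.9, Qs = 10, RL*BC = 0.7, Qind = 10.
print(transformer_efficiency(10.0, 10.0, 0.9))
print(chireix_efficiency(0.5 * math.asin(0.7), 0.7))
print(lc_efficiency(10.0, 10.0))
```

Sweeping `phi` from small angles up to π/2 for a fixed `rl_bc` below 1 shows two efficiency maxima, one near full power and one in backoff, under the assumed Chireix expression.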

The efficiency of the transformer for different secondary quality factors is plotted in Figure 1.3. For a given coupling factor k, the efficiency of the whole transformer depends on low losses in the primary and secondary windings and consequently on good quality factors.

Figure 1.3: Efficiency diagram of a transformer with k=0.9 and fixed Qs (curves for Qs = 5, 10 and 15; efficiency [%] versus primary quality factor on a logarithmic axis from 10^0 to 10^2).

The advantage of the second efficiency maximum in BO of a Chireix combiner is shown in Figure 1.4. By adapting the product RLBC, the position of the maxima and the efficiency drop between them can be adjusted. This is an advantage when designing a PA for technologies with a high PAPR. To make the model more realistic, it was further developed to include source losses [41].

Figure 1.4: Efficiency diagram of a Chireix combiner (curves for RLBC = 0.4, 0.7 and 1; efficiency [%] versus output power backoff from −30 dB to 0 dB).

The extended model shows that, due to these resistors, the valley between the two maxima drops further, which could result in a worse overall efficiency for signals with a high PAPR.

The efficiency of an LC combiner with different quality factors Qind of the inductance is shown in Figure 1.5. The PER is the product of the impedance transformation ratio and its efficiency. It can be seen that the overall efficiency ηoa drops as the PER rises and as the losses increase. This is therefore important for networks with lossy on-chip components [27].

1.5 Power Amplifier Classes

SMPA classes such as class-D, -E or -F are promising candidates for integrated PA solutions because of their high theoretical efficiency. In addition to these classes, their inverse class-D, inverse class-E and inverse class-F implementations can be used. In the inverse designs the voltage and current characteristics change from a square to a sinusoidal wave and vice versa. Therefore, the designs can be implemented in series (voltage summing) or in parallel (current summing).

Figure 1.5: Efficiency diagram of an LC combiner (curves for Qind = 5, 10 and 15; efficiency [%] versus power enhancement ratio from 0 to 100).

1.5.1 Class-D and Inverse Class-D PA

Figure 1.6 shows the concept of a class-D PA. The inverter structure switches the voltage at the input of the LC matching network. Ideally a rectangular voltage waveform is generated, whose fundamental is then filtered by the matching network, resulting in a sinusoidal current at the load Rload. When designing the driving stage for the inverter, special care should be taken that no short from the supply Vdd to ground occurs during the switching time, as this results in additional power dissipation and efficiency degradation. This can be avoided by a more complex design with non-overlapping clocks. For designs whose supply voltage is high compared with the voltage rating of the low-voltage CMOS transistors, the inverter based structure has the disadvantage that both transistor types have to be stacked, because the drain-to-source voltage Vds ranges from zero to the supply voltage Vdd. Therefore, additional level shifting is needed to switch the p-channel metal oxide semiconductor (pMOS) transistors. Since the rise and fall times of the PA have a direct impact on the linearity and efficiency of the design, a good compromise for the driver stage has to be found [33].

Figure 1.6: Transistor level diagram of a class-D PA.

An inverse class-D PA, also known as current-mode class-D (CMCD) [45], uses only one parallel LC tank, which can be absorbed into the output transformer network and makes the design more compact. Additionally, the parallel capacitor allows the device parasitics to be absorbed into the network. Zero-voltage switching (ZVS) can be used, but the efficiency decreases due to losses in the tank [46]. To avoid these losses, zero-current switching (ZCS) can be implemented [47].

1.5.2 Class-E and Inverse Class-E PA

Figure 1.7 shows the transistor level diagram of a class-E PA. The parasitic device capacitance Cs is absorbed into the matching network. Vds depends on the lumped elements and exceeds the supply voltage Vdd. An efficiency degradation occurs when the transistor switches from open to short circuit: if charge is still stored in the transistor capacitance, it is dissipated through the switch. As in class-D, ZVS can be used to compensate this degradation because it drives the voltage to zero before the transistor conducts [48].

Figure 1.7: Transistor level diagram of a class-E PA.

An inverse class-E PA has a lower peak switching voltage, a lower inductance value and a higher peak output power than a class-E PA. Its disadvantage is that an additional inductance is needed that cannot be included in the matching network, which makes it less compact [49].

1.5.3 Class-F and Inverse Class-F PA

Figure 1.8 shows the transistor level diagram of a class-F PA. The class-F PA combines a class-B stage with additional harmonic resonators. By isolating harmonics, the class-F PA increases its efficiency.

Figure 1.8: Transistor level diagram of a class-F PA.

It is well known that isolating only the third harmonic improves the efficiency from 78 % for a class-B amplifier to 88 %. Theoretically the class-F amplifier can achieve 100 % efficiency, but only with an infinite number of harmonics [2]. As in class-E designs, the drain-source voltage Vds depends on the lumped elements and exceeds the supply voltage Vdd. Instead of filtering the odd harmonics, an inverse class-F cancels the even harmonics [50]. Therefore, the implementation and efficiency tradeoff is the same as mentioned above.
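The two efficiency figures quoted above can be reproduced from standard textbook results with a short Python sketch: an ideal class-B stage has a drain efficiency of π/4, and maximally flat third-harmonic peaking raises the fundamental voltage swing by a factor of 9/8. The function names are hypothetical, and the maximally flat waveform is an assumption about which third-harmonic case is meant.

```python
import math

def class_b_efficiency():
    """Ideal class-B drain efficiency, pi/4."""
    return math.pi / 4

def class_f_third_harmonic_efficiency():
    """Class-F efficiency with third-harmonic peaking only.

    Assumes a maximally flat drain voltage, which increases the
    fundamental voltage swing by 9/8 relative to class-B.
    """
    return 9 / 8 * class_b_efficiency()

print(f"class-B: {100 * class_b_efficiency():.1f} %")
print(f"class-F (3rd harmonic): {100 * class_f_third_harmonic_efficiency():.1f} %")
```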


Class-F PAs are cited to have much better waveform metrics than class-E but are said to be unrealizable in the strong switching case [48]. Therefore, a hybrid class-EF design was implemented that combines the benefits of class-E, such as integration of the transistor parasitic capacitance, exact time-domain switching solutions and ZVS operation, with inverse class-F to improve the waveforms and thereby the performance of the overall design [44].

1.5.4 Losses and Output Power

To avoid losses during the switching states, ZVS and ZCS and their derivatives, zero-voltage derivative switching (ZVDS) and zero-current derivative switching (ZCDS), can be implemented. ZVS eliminates the switching losses by discharging the transistor capacitance before the next switching state begins. ZCS is the equivalent for inductance losses. For a class-D PA the dead time between the two switching states is crucial to achieve ZVS [51]. It was mentioned that the output capacitance of the class-D is the dominant loss at gigahertz frequencies [45]. For class-E designs this state is always achieved. Nevertheless, at gigahertz frequencies this approach was cited to be less effective due to uncertainties in the duty cycle, nonlinear capacitance and other parasitic components [45]. Class-E is considered soft switching, in contrast to class-D [2]. With ZVDS any deviation results in a lower power loss. Class-F and inverse class-F are also designs that can achieve ZVS [45]. Besides ZVS, ZCS can be considered in inverse class-D PAs to eliminate the losses in the inductance. However, it was stated that ZCS is less important than ZVS [45]. For the inverse class-E both ZCS and ZCDS can be achieved [52].

Table 1.7 shows the output power equations and the drain voltage stresses that occur for the different SMPAs. The highest output power is achieved by an inverse class-E PA; it is about eight times higher than that of the standard class-D design. The push-pull architectures of class-D and its inverse achieve the same output power as class-F and inverse class-F with cancellation of an infinite number of harmonics. With the push-pull design the power for class-D increases by a factor of four. The highest voltage stress occurs for class-E. Compared to its inverse, class-E achieves three times less output power and even generates a higher voltage stress for the transistor. Class-D generates the lowest drain-to-source voltage stress because the voltage at the input is a rectangular signal generated from the supply voltage.

Table 1.7: Output Power for Different SMPA Classes

Class      D              D⁻¹      E                 E⁻¹               F              F⁻¹
Pout ¹     $2/\pi^2$ ²    n.a. ³   $8/(\pi^2+4)$     $(\pi^2+4)/8$     $8/\pi^2$ ⁴    $\pi^2/8$ ⁵
           ≈ 0.203        n.a.     0.577             1.734             0.811          1.234
Ref.       [53]           [53]     [54]              [54]              [2]            [2]
Vds/Vdd    1              π        ≈ 3.562 ⁶         ≈ 2.862 ⁷         2              π
Ref.       [2]            [45]     [2]               [2]               [2]            [2]

¹ Pout normalized to $V_{dd}^2/R_{load}$
² push-pull: $8/\pi^2 \approx 0.811$
³ push-pull: $\pi^2/8 \approx 1.234$
⁴ all harmonics; for 3rd harmonic only: $81/128 \approx 0.633$
⁵ all harmonics
⁶ $2\pi\left[\pi/2 - \arctan(\pi/2)\right]$
⁷ $\sqrt{\pi^2/4 + 1} + 1$
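The closed-form entries of Table 1.7 can be checked numerically with a few lines of Python. The dictionary labels are hypothetical names for the table columns and footnotes; the expressions are the ones read from the table, normalized to Vdd²/Rload for the output power and to Vdd for the voltage stress.

```python
import math

PI2 = math.pi ** 2

# Normalized output power Pout / (Vdd^2 / Rload)
p_out = {
    "class-D": 2 / PI2,
    "class-D push-pull": 8 / PI2,
    "inverse class-D push-pull": PI2 / 8,
    "class-E": 8 / (PI2 + 4),
    "inverse class-E": (PI2 + 4) / 8,
    "class-F, all harmonics": 8 / PI2,
    "class-F, 3rd harmonic only": 81 / 128,
    "inverse class-F, all harmonics": PI2 / 8,
}

# Normalized peak drain voltage Vds / Vdd for the two class-E variants
v_stress = {
    "class-E": 2 * math.pi * (math.pi / 2 - math.atan(math.pi / 2)),
    "inverse class-E": math.sqrt(PI2 / 4 + 1) + 1,
}

for name, value in {**p_out, **v_stress}.items():
    print(f"{name:32s}{value:.3f}")

# Ratios discussed in the text
print(p_out["inverse class-E"] / p_out["class-D"])
print(p_out["class-D push-pull"] / p_out["class-D"])
```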

1.6 Linearization Concepts

All SMPAs have the disadvantage of being highly non-linear, so these classes need to be linearized. This section presents different design architectures that are capable of achieving this linearization.

1.6.1 Outphasing

Outphasing was invented by Chireix and is a well-known concept that is used for IQ modulated signals. In recent years it has also been referred to as LINC. Figure 1.9 shows the concept of outphasing. Two phase-shifted signals with the phases ϕ1 and ϕ2 and the constant amplitudes a1 and a2 are amplified independently in their paths and then combined at the output.

Figure 1.9: Block level diagram of the outphasing concept.

The output signals of the upper and lower signal path, with the amplitudes a1 and a2, sum up to the modulated amplitude A(t) and phase ϕ(t), which can be expressed as

\[ A^2(t) = a_1^2 + a_2^2 + 2 a_1 a_2 \cos(\phi_1 - \phi_2), \]
\[ \phi(t) = \tan^{-1}\!\left(\frac{a_1 \sin(\phi_1) + a_2 \sin(\phi_2)}{a_1 \cos(\phi_1) + a_2 \cos(\phi_2)}\right). \tag{1.6} \]
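The combination of the two outphasing paths can be checked with a small Python sketch based on the standard identity for the sum of two sinusoids of equal frequency, in which the phase expression has sine terms in the numerator and cosine terms in the denominator. The function name and the example values are hypothetical, and `math.atan2` is used as a four-quadrant arctangent, which is an assumption made for robustness.

```python
import math

def outphasing_sum(a1, phi1, a2, phi2):
    """Amplitude and phase of a1*sin(wt + phi1) + a2*sin(wt + phi2)."""
    amplitude = math.sqrt(a1 ** 2 + a2 ** 2
                          + 2 * a1 * a2 * math.cos(phi1 - phi2))
    phase = math.atan2(a1 * math.sin(phi1) + a2 * math.sin(phi2),
                       a1 * math.cos(phi1) + a2 * math.cos(phi2))
    return amplitude, phase

# Equal amplitudes with opposite phases: the outphasing angle sets the amplitude.
for theta_deg in (0, 30, 60, 90):
    theta = math.radians(theta_deg)
    A, phi = outphasing_sum(1.0, theta, 1.0, -theta)
    print(f"theta={theta_deg:2d} deg -> A={A:.3f}, phi={phi:.3f} rad")
```

For equal amplitudes a and phases ±θ the result reduces to A = 2a cos(θ) with zero output phase, which is the familiar outphasing amplitude control.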

The design itself thus requires a power combining stage. For a high output power, which is one crucial criterion in designing CMOS PAs in low technology nodes, this is a way to relax the maximum design requirements of a single power amplifier [55]. Using a class-D PA in an outphasing architecture has the advantage that the output voltage is independent of the load impedance, which makes it an ideal candidate for a non-isolating matching network [56]. For a class-E PA the ZVS characteristic depends on the load impedance. Therefore, a combiner with isolation is needed, which degrades the efficiency in BO. An asymmetric combining technique was described that maintains the efficiency despite the mentioned combining [57]. Normally the power of a class-E PA is limited by the supply voltage that can be used for a stacked design. By implementing a power enhancement circuit (PEC), the maximum peak voltage at the drain can be reduced, which allows a higher supply voltage and thus a higher possible output power [58]. Furthermore, an efficiency enhancement circuit (EEC) was developed to improve the efficiency in backoff. An additional inductor was added that forms a resonant circuit with the matching capacitance; it provides a high impedance at the carrier frequency, which reduces power dissipation and increases the efficiency.

1.6.2 Envelope Elimination and Restoration

Figure 1.10 shows the EER technique, which was developed by Kahn. This technique is a PM architecture. EER is envelope tracking (ET) with switched components that combines the efficiency of ET in BO with an efficient switched PA. The supply voltage is adapted to the required output power, which shifts the operating point of the PA towards its saturation region, where it operates more efficiently. The amplitude a(t) of the signal is detected and amplified by an auxiliary amplifier that modulates the envelope A(t) of the main PA. This keeps the PA always in saturation and thus in its highly efficient region. Therefore, the achievable efficiency is higher in theory. It was stated that EER is more sensitive to delay mismatch and therefore requires a higher integration effort [59]. Additionally, it was mentioned that EER is more difficult to implement for wide signal BWs such as OFDM [60]. The main causes of non-linearity in EER are a delay between the amplitude and phase path, low-pass filtering of the envelope signal and AM-PM distortion [29].

Figure 1.10: Block level diagram of EER and hybrid EER without limiter.

In this design an additional block is needed to shape the envelope. Additional design effort is therefore required to guarantee that the second PA does not degrade the overall performance of the system and is also capable of operating at the envelope frequency.


Some years ago it was said that most high-level PAs use a class-S amplifier as envelope modulator and that they achieve a high efficiency over a wide dynamic range in practice, but that the BW is limited to 10 MHz in IC implementations. Therefore, class-G and split band modulators were suggested for wideband applications [11]. The oversampling ratio of the PA has to be high enough to achieve signal accuracy, which limits the achievable signal BW [61]. It was stated that such a design cannot be implemented for a BW higher than 10 MHz [39]. The limiter in the RF path is a problem for some wide dynamic range OFDM signals [60]. Due to these BW limitations it can be difficult to realize EER for higher BW applications like LTE-Advanced, which requires up to 100 MHz with carrier aggregation (CA).

In order to relax the stringent RF and envelope BW requirements, hybrid structures were proposed [22]. In a hybrid structure the limiter is removed, so that the RF signal arriving at the input of the PA still contains the envelope a(t) and phase ϕ(t) modulation. Additional advantages besides the lower BW requirements are a higher gain, and thus better efficiency, and a lower sensitivity to mismatch. On the other hand, an amplitude modulation technique has to be implemented in the PA, which can make the design more complex. In later years different implementations were presented. One uses an all-pass network to achieve higher efficiency over a wider BW [62]. In another, the envelope amplifier consists of an op-amp as voltage source and a buck converter with inductor as current source that is controlled by a feedback path with a hysteresis comparator [63]. This design was later developed further for the hybrid EER architecture. Conventional digital predistortion can be used to linearize the PA [22]. To align the RF and envelope paths, an adaptive time alignment can be implemented [64]. Problems in the spectral domain arise from the frequency response of the amplitude path, which generates errors in the frequency domain, or from envelope distortion occurring within a drain modulated PA [65].

1.6.3 Digital Polar Transmitter

In general the term DPT, and consequently DPA, is used when the digital-to-analog conversion and the envelope tracking are done in the same step. This can be seen in Figure 1.11, which shows the block level diagram of a digital polar transmitter design. The digital signal is used to modulate the amplitude A(t) and the phase ϕ(t) of an efficient SMPA with a digital pulse-width modulation (DPWM) or the LO, respectively [66,67].

Figure 1.11: Block level diagram of a DPT.

The amplitude envelope block can be implemented, for example, as a buck converter. This block is then controlled by a digital signal processor (DSP). The digital implementation can be used to noise-shape the signal and thus reduce the near-band quantization noise. To synchronize the amplitude and phase paths, a delay controller can be used [67].

1.6.4 Summing Digital Power Amplifier

Figure 1.12 shows the design of an SDPA. The concept either converts the amplitude A(t) directly with binary weighted cells or uses a decoder that converts the binary input code to a thermometer code. The decoder then switches UCs, which guarantees monotonicity of the output. Using a binary modulation with badly matched cells can lead to monotonicity errors. The phase ϕ(t) is modulated by the clock at the input of each cell. The clock distribution is an important design consideration to avoid timing errors that create distortion. The resulting amplitude A(t) of a segmented implementation can be expressed as the sum of the thermometer decoded UCs and the binary weighted cells. The unit amplitudes an(t) and the binary amplitudes bm(t) result in

\[ A(t) = \sum_{n=1}^{N} a_n(t) + \sum_{m=1}^{M} b_m(t). \tag{1.7} \]
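The segmented amplitude summation of Eq. (1.7) can be illustrated with a short Python sketch. It assumes, purely for illustration, that the lower bits of the input code drive binary weighted cells and that the upper bits are thermometer decoded into identical unit cells; the split, the function name and the ideal cell weights are hypothetical.

```python
def segmented_amplitude(code, n_binary_bits, lsb=1.0):
    """Ideal output amplitude of a segmented SDPA for a given input code.

    Assumption: the n_binary_bits LSBs switch binary weighted cells,
    the remaining MSBs are thermometer decoded into equal unit cells.
    """
    binary_part = code & ((1 << n_binary_bits) - 1)
    active_units = code >> n_binary_bits          # thermometer count
    unit_weight = lsb * (1 << n_binary_bits)      # weight of one unit cell
    a_units = [unit_weight] * active_units        # terms a_n in Eq. (1.7)
    b_cells = [lsb * (1 << m)                     # terms b_m in Eq. (1.7)
               for m in range(n_binary_bits)
               if (binary_part >> m) & 1]
    return sum(a_units) + sum(b_cells)

# 6 bit example: 2 binary bits, 4 MSBs decoded into up to 15 unit cells.
print([segmented_amplitude(code, 2) for code in range(8)])
```

With ideal weights the output equals the input code times the LSB; mismatch could be modelled by perturbing the individual entries of `a_units` and `b_cells`.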

Figure 1.12: Block level diagram of SDPA.

There are two possible implementations of SDPAs. Firstly, VSDPAs collect the voltages that are generated by the unit cells. One design architecture that became popular in recent years is the class-D like switched capacitor power amplifier (SCPA) [32]. For technology nodes with low breakdown voltage, the design therefore has the same inverter based disadvantages at high required output power. The output power is obtained for a supply voltage Vdd and an output resistance Rload; the design uses the good capacitor matching to sweep the output power according to

\[ P_{out} = \frac{2}{\pi^2}\left(\frac{n}{N}\right)^2 \frac{V_{dd}^2}{R_{load}}. \tag{1.8} \]

Due to the switching, high-order harmonics are generated in the output spectrum that have to be filtered by the matching network [32]. A transformer based PA with class-E/F operation was proposed to overcome the class-D PA losses in the parasitic capacitance at higher frequencies and thus improve the efficiency. In addition, a duty cycle tuner was implemented that allows the tradeoff between linearity and efficiency to be selected [44]. Secondly, currents can be summed using Kirchhoff's law. CSDPAs have the advantage that higher output powers can be achieved in low voltage technologies by shorting the outputs of the different cells. Therefore, no complex power combining network is needed as for VSDPAs [66].
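Eq. (1.8) can be evaluated with a few lines of Python to see how the SCPA output power scales with the input code. The sketch assumes that n is the number of switched unit capacitors out of N in total, which is the usual reading for this architecture; the supply voltage, load resistance and array size are hypothetical example values.

```python
import math

def scpa_output_power(n, n_total, v_dd, r_load):
    """Eq. (1.8): SCPA output power in watt.

    Assumption: n of the n_total unit capacitors are switched.
    """
    return 2 / math.pi ** 2 * (n / n_total) ** 2 * v_dd ** 2 / r_load

def watt_to_dbm(p_watt):
    """Convert watt to dBm (inverse of Eq. (1.3))."""
    return 10 * math.log10(p_watt / 1e-3)

# Hypothetical example: 64 unit capacitors, 1.8 V supply, 2 ohm load.
for n in (64, 32, 16):
    p = scpa_output_power(n, 64, 1.8, 2.0)
    print(f"n={n:2d}: {p:.4f} W = {watt_to_dbm(p):.1f} dBm")
```

Halving the number of switched capacitors lowers the output power by a factor of four, that is by about 6 dB, since the output amplitude scales linearly with n/N.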


1.7 Watt-Level Output Power

Figure 1.13 presents recently published DPAs in different CMOS technology nodes over output power. The technology nodes predicted by the ITRS for the contacted half pitch of dynamic random access memory (DRAM) are included to compare the predicted availability of CMOS nodes with the published designs [26]. It can be seen that watt-level DPAs can be implemented in technology nodes down to 45 nm. Nevertheless, larger CMOS nodes can also be used to increase the maximum allowable voltage at the transistor, so that a higher output power can be achieved with the same number of transistors. Further implementations in other technology nodes were presented for an output power of around 25 dBm. The question arises whether these designs can also reach watt-level output power, either directly or by increasing the supply voltage.

[Figure: year / ITRS technology node (2007/65 nm, 2010/45 nm, 2013/32 nm, 2016/22 nm) plotted over output power from 24 dBm to 32 dBm. Marked designs: 130 nm [35], 32 nm [28], 150 nm [67], 45 nm [69], 90 nm [24], 65 nm [44], 180 nm [70], 40 nm [68].]

Figure 1.13: DPA comparison of area and Pout in different technology nodes.

Comparing smaller with larger technology nodes, it can be seen that in almost all designs the smaller technology node needs a bigger die area for the same output power. One possible explanation is the more complex implementation of these designs. Table 1.8 compiles PAs that were implemented for different output power levels. Assuming a comparable supply voltage and CMOS technology node, the output power of outphasing topologies can be increased by 6 dB, i.e. the output voltage can be doubled, by increasing the number of combiner stages [28,69]. DPT achieves an equivalent output power with only one combiner stage and uses the inverse (current mode) class-D instead of a class-D PA. The fact that the design uses a larger technology node alleviates the voltage stress [67]. For EER and CSDPA, voltage mode class-E designs were presented that achieved 27.8 dBm and 25.2 dBm with a single transistor stage [29, 35]. Like outphasing, these designs might also achieve watt-level output power if the number of output stages were increased. For VSDPA the question arises why SCPA designs do not achieve a higher output power although the number of output matching stages is increased [24,71]. A power amplifier with four times as many output stages only had a 1.8 dB higher output power for a comparable class, the same technology node and the same number of stacked transistors. Since no supply voltage was cited, a difference in supply voltage might be one reason. Other explanations could be the losses in the series combiner or poorly matched transistor stacks due to the impedance transformation.
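The 6 dB figure for a doubled output voltage follows directly from the quadratic relation between voltage and power. The short sketch below checks this arithmetic, assuming an unchanged load resistance; the helper names are hypothetical, and the 25.3 dBm starting value is the output power of the 32 nm outphasing design [28].

```python
import math


def dbm_to_watt(p_dbm):
    """Convert a power in dBm to watts."""
    return 1e-3 * 10.0 ** (p_dbm / 10.0)


def watt_to_dbm(p_watt):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_watt / 1e-3)


def power_gain_from_voltage_db(voltage_factor):
    """Power increase in dB when the output voltage is scaled into the same load."""
    return 20.0 * math.log10(voltage_factor)


gain_db = power_gain_from_voltage_db(2.0)
print(round(gain_db, 2))            # power increase for a doubled voltage
print(round(25.3 + gain_db, 2))     # 32 nm outphasing PA [28] plus one doubling
print(dbm_to_watt(30.0))            # watt-level corresponds to 30 dBm
```

Under these assumptions, one voltage doubling on top of 25.3 dBm lands close to the 31.5 dBm reported for the 45 nm outphasing design [69], although the two designs use different technology nodes.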

1.8 Modulated Signals

The theoretically very efficient designs are more difficult to implement in practice. Ideally they are memoryless and nonlinear with a high degree of accuracy. However, process variation, coupling effects and the additional complexity of designs in small technology nodes, such as stacked transistors, short channel effects and several power combining stages, make the designs more complex, which can additionally contribute to mismatch. This mismatch in the amplifier stages results in gain and phase imbalance that leads to nonlinearity and spectral distortion [72]. Despite the discussed linearity drawbacks of Chireix combiners, class-D can provide acceptable linearity. Digital predistortion based on Volterra models, Hammerstein models or a look-up table (LUT) is used to correct the signals. Since these methods were developed for linear PAs, they have to be adapted for outphasing. A new behavioral model structure with model-based phase-only predistortion that compensates amplitude as well as phase errors was presented in [43].
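To make the LUT idea concrete, the following is a minimal sketch of a memoryless, amplitude-only LUT predistorter. It is not the phase-only method of [43] and not the predistortion used in any cited design; the tanh AM-AM curve, the table size and the function names are assumptions made purely for illustration.

```python
import math


def pa_am_am(x):
    """Hypothetical memoryless, compressive AM-AM curve (not a measured PA)."""
    return math.tanh(x)


def build_inverse_lut(size, max_out):
    """LUT entry i holds the PA input that produces the output i * max_out / (size - 1)."""
    lut = []
    for i in range(size):
        target = i * max_out / (size - 1)
        lo, hi = 0.0, 10.0                  # bisection on the monotonic AM-AM curve
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if pa_am_am(mid) < target:
                lo = mid
            else:
                hi = mid
        lut.append(0.5 * (lo + hi))
    return lut


def predistort(amplitude, lut, max_out):
    """Look up the predistorted drive level with linear interpolation.

    The amplitude is assumed to lie between 0 and max_out.
    """
    pos = amplitude / max_out * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)
    frac = pos - i
    return lut[i] + frac * (lut[i + 1] - lut[i])


MAX_OUT = 0.9
LUT = build_inverse_lut(256, MAX_OUT)
for a in (0.2, 0.5, 0.8):
    # wanted amplitude, output without DPD, output with DPD
    print(a, round(pa_am_am(a), 4), round(pa_am_am(predistort(a, LUT, MAX_OUT)), 4))
```

The table stores the inverse of the amplifier curve, so the cascade of predistorter and amplifier is approximately linear up to MAX_OUT. A practical DPD would also have to correct AM-PM and memory effects.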

Table 1.9 shows a comparison of implementations for modulated signals. Of the discussed architectures, only EER and outphasing implementations were found that were presented for LTE with 20 MHz BW and achieve the ACLR required for evolved universal terrestrial radio access (E-UTRA). Outphasing achieved very good E-UTRA ACLR values with the use of digital predistortion (DPD) for a 64-QAM modulated signal [58]. EER can fulfill the linearity requirements for E-UTRA and universal terrestrial radio access (UTRA) without a predistortion technique [73] and is therefore backward compatible with 3G. DPT was presented for a wideband code division multiple access (WCDMA) signal and achieved the UTRA ACLR values at a lower frequency. The channel power (CHP) of the CSDPA is below the value required by 3GPP. Furthermore, it uses adaptive digital predistortion (ADP) to fulfill the requirements.

1.9 Summary of DPAs

In this chapter the theoretical concept for designing a PA was presented with a special focus on Pout, 3GPP requirements, technology nodes and reliability. Different combiner techniques that were used in recent publications to increase Pout were presented, with a special focus on the theoretically achievable efficiency. SMPA concepts and architectures to linearize these highly nonlinear concepts were shown. The DPA theory was discussed, and recently presented implementations for watt-level output power and modulated signals were analyzed with a focus on technology node and Pout. It was shown that SMPAs in linearization architectures can achieve watt-level output power even in deep nanometer technology nodes. Furthermore, implementations were shown that achieve the required in-band output power for LTE and fulfill the required ACLR specifications.

1.10 Motivation

Watt-level output power can be achieved by stacking transistors in deep nanometer technology nodes. CSDPA is a promising approach to provide this output power since the drains of the transistors can be connected without any further power combining concept. The currents are summed and delivered to the load. This avoids a design with many transformers for power combining, which might reduce the efficiency due to transformer losses [74]. Nevertheless, to alleviate the voltage stress for one stack, the design can be built as a differential design, which increases Vout by a factor of two. To combine the two paths a transformer is needed. It transforms the output impedance that is seen by the stack with a ratio of 1 : n [27]. This transformation also results in an increased Pout. Inverse class-D can be used as UC. It generates a similar Vds as inverse class-E but has the advantage that the resonant components can be merged into the transformer.

Table 1.8: Comparison of Output Power for Different Implemented Architectures

Concept      Class             CMOS Node   Pout [dBm]   Ref.
Outphasing   Class-D           45 nm       31.5         [69]
Outphasing   Class-D           32 nm       25.3         [28]
EER          Class-E           180 nm      27.8         [29]
DPT          inverse Class-D   150 nm      31           [67]
CSDPA        Class-E           130 nm      25.2         [35]
VSDPA        Class-D           90 nm       25.2         [24]
VSDPA        Class-D           90 nm       27           [71]
VSDPA        Class-E/F         65 nm       25.6         [44]

Table 1.9: Comparison of implementations for modulated signals

Concept      Modulation               DPD    Ref.
EER          LTE, 20 MHz BW           No     [73]
Outphasing   LTE, 20 MHz BW, 64-QAM   DPD    [58]
DPT          WCDMA                    No     [67]
CSDPA        WiMAX                    ADP    [35]


Bibliography

[1] B. Hoefflinger, Chips 2020: A guide to the future of nanoelectronics, ser. The Frontiers Collection. Springer, 2012.

[2] P. Reynaert and M. Steyaert, RF power amplifiers for mobile communications, ser. Analog Circuits and Signal Processing. Springer Netherlands, 2006.

[3] G. Zhang and A. van Roosmalen, More than Moore: Creating high value micro/nanoelectronics systems. Springer US, 2010.

[4] H. Viswanathan and M. Weldon, “The past, present, and future of mobile communications,” Bell Labs Technical Journal, vol. 19, pp. 8–21, 2014.

[5] M. Sauter, From GSM to LTE: An introduction to mobile networks and mobile broadband, ser. Wiley Online Library: Books. Wiley, 2010.

[6] S. Mukhopadhyay, Internet of Things: Challenges and opportunities, ser. Smart Sensors, Measurement and Instrumentation. Springer International Publishing, 2014.

[7] G. Wu, S. Talwar, K. Johnsson, N. Himayat, and K. Johnson, “M2M: From mobile to embedded internet,” Communications Magazine, IEEE, vol. 49, no. 4, pp. 36–43, April 2011.

[8] S. Leuschner, J.-E. Mueller, and H. Klar, “A 1.8 GHz wideband stacked-cascode CMOS power amplifier for WCDMA applications in 65 nm standard CMOS,” in Radio Frequency Integrated Circuits Symposium (RFIC), 2011 IEEE, pp. 1–4, June 2011.


[9] T. Johansson and J. Fritzin, “A review of watt-level CMOS RF power amplifiers,” Microwave Theory and Techniques, IEEE Transactions on, vol. 62, no. 1, pp. 111–124, Jan 2014.

[10] J. Fuhrmann, P. Ossmann, K. Dufrene, H. Pretl, and R. Weigel, “A 28 nm standard CMOS watt-level power amplifier for LTE applications,” in Power Amplifiers for Wireless and Radio Applications (PAWR), 2015 IEEE Topical Conference on, pp. 1–3, Jan 2015.

[11] F. Raab, P. Asbeck, S. Cripps, P. Kenington, Z. Popovic, N. Pothecary, J. Sevic, and N. Sokal, “Power amplifiers and transmitters for RF and microwave,” Microwave Theory and Techniques, IEEE Transactions on, vol. 50, no. 3, pp. 814–826, Mar 2002.

[12] S. Kousai, K. Onizuka, S. Hu, H. Wang, and A. Hajimiri, “A new wave of CMOS power amplifier innovations: Fusing digital and analog techniques with large signal RF operations,” in Custom Integrated Circuits Conference (CICC), 2014 IEEE Proceedings of the, pp. 1–8, Sept 2014.

[13] M. Steyaert, B. De Muer, P. Leroux, M. Borremans, and K. Mertens, “Low-voltage low-power CMOS RF transceiver design,” Microwave Theory and Techniques, IEEE Transactions on, vol. 50, no. 1, pp. 281–287, Jan 2002.

[14] W. Ali-Ahmad, “Radio transceiver architectures and design issues for wideband cellular systems,” in Radio-Frequency Integration Technology: Integrated Circuits for Wideband Communication and Wireless Sensor Networks, 2005. Proceedings. 2005 IEEE International Workshop on, pp. 21–25, Nov 2005.

[15] Z. Boos, A. Menkhoff, F. Kuttner, M. Schimper, J. Moreira, H. Geltinger, T. Gossmann, P. Pfann, A. Belitzer, and T. Bauernfeind, “A fully digital multimode polar transmitter employing 17b RF DAC in 3G mode,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, pp. 376–378, Feb 2011.

[16] G. Li Puma, S. Marsili, C. Reindl, Y. Dai, E. Thaller, K. Getta, A. Ishak Loza, M. Feltgen, M. Wiedenhaus, V. Christ, M. Caterini, T. Schoenauer, S. van Waasen, and S. Heinen, “Digital polar transmitter architecture suitable for multi-core SoC integration in 65 nm CMOS technology,” in Ph.D. Research in Microelectronics and Electronics (PRIME), 2012 8th Conference on, pp. 1–4, June 2012.

[17] S.-W. Lee, K.-Y. Lee, E. Song, Y.-J. Jung, H. Jeong, J.-M. Kim, H.-J. Lim, J.-W. Lee, J. Park, K. Lee, S.-I. Chae, D.-K. Jeong, and W. Kim, “A single-chip 2.4 GHz direct-conversion CMOS transceiver with GFSK modem for Bluetooth application,” in VLSI Circuits, 2001. Digest of Technical Papers. 2001 Symposium on, pp. 245–246, June 2001.

[18] Y.-I. Kwon, S.-G. Park, T.-J. Park, K.-S. Cho, and H.-Y. Lee, “An ultra low-power CMOS transceiver using various low-power techniques for LR-WPAN applications,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 59, no. 2, pp. 324–336, Feb 2012.

[19] T. Maeda, N. Matsuno, S. Hori, T. Yamase, T. Tokairin, K. Yanagisawa, H. Yano, R. Walkington, K. Numata, N. Yoshida, Y. Takahashi, and N. Yoshida, “A low-power dual-band triple-mode WLAN CMOS transceiver,” Solid-State Circuits, IEEE Journal of, vol. 41, no. 11, pp. 2481–2490, Nov 2006.

[20] T. Georgantas, K. Vavelidis, N. Haralabidis, S. Bouras, I. Vassiliou, C. Kapnistis, Y. Kokolakis, H. Peyravi, G. Theodoratos, K. Vryssas, N. Kanakaris, C. Kokozidis, S. Kavadias, S. Plevridis, P. Mudge, I. Elgorriaga, A. Kyranas, S. Liolis, E. Kytonaki, G. Konstantopoulos, P. Robogiannakis, K. Tsilipanos, M. Margaras, P. Betzios, R. Magoon, N. Bouras, M. Rofougaran, and R. Rofougaran, “A 13 mm2 40 nm multiband GSM/EDGE/HSPA+/TDSCDMA/LTE transceiver,” in Solid-State Circuits Conference - (ISSCC), 2015 IEEE International, pp. 1–3, Feb 2015.

[21] P. Rakers, M. Alam, D. Newman, K. Hausmann, D. Schwartz, M. Rahman, and M. Kirschenmann, “Multi-mode cellular transceivers for LTE and LTE-Advanced,” in Custom Integrated Circuits Conference (CICC), 2014 IEEE Proceedings of the, pp. 1–8, Sept 2014.

[22] F. Wang, D. Kimball, J. Popp, A. Yang, D. Lie, P. Asbeck, and L. Larson, “An improved power-added efficiency 19 dBm hybrid envelope elimination and restoration power amplifier for 802.11g WLAN applications,” Microwave Theory and Techniques, IEEE Transactions on, vol. 54, no. 12, pp. 4086–4099, Dec 2006.

[23] S. Lee and S. Nam, “A CMOS outphasing power amplifier with integrated single-ended Chireix combiner,” Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 57, no. 6, pp. 411–415, June 2010.

[24] J. Walling, S.-M. Yoo, and D. Allstot, “Digital power amplifier: A new way to exploit the switched-capacitor circuit,” Communications Magazine, IEEE, vol. 50, no. 4, pp. 145–151, April 2012.

[25] LTE; Evolved universal terrestrial radio access (E-UTRA); User equipment (UE) radio transmission and reception, (3GPP TS 36.101 version 11.6.0 Release 11), International Technology Roadmap for Semiconductors (ITRS) Std., 2013.

[26] Radio frequency and analog/mixed-signal technologies for wireless communications, International Technology Roadmap for Semiconductors (ITRS) Std., 2007.

[27] I. Aoki, S. Kee, D. Rutledge, and A. Hajimiri, “Distributed active transformer-a new power combining and impedance transformation technique,” Microwave Theory and Techniques, IEEE Transactions on, vol. 50, no. 1, pp. 316–331, Jan 2002.

[28] H. Xu, Y. Palaskas, A. Ravi, M. Sajadieh, M. El-Tanani, and K. Soumyanath, “A flip-chip packaged 25.3 dBm class-D outphasing power amplifier in 32 nm CMOS for WLAN application,” Solid-State Circuits, IEEE Journal of, vol. 46, no. 7, pp. 1596–1605, July 2011.

[29] P. Reynaert and M. Steyaert, “A 1.75 GHz polar modulated CMOS RF power amplifier for GSM-EDGE,” Solid-State Circuits, IEEE Journal of, vol. 40, no. 12, pp. 2598–2608, Dec 2005.

[30] H. Ruiz and R. Perez, Linear CMOS RF power amplifiers: A complete design workflow, ser. Electrical engineering. Springer, 2013.

[31] O. Lee, J. Han, K. H. An, D. H. Lee, K.-S. Lee, S. Hong, and C.-H. Lee, “A charging acceleration technique for highly efficient cascode class-E CMOS power amplifiers,” Solid-State Circuits, IEEE Journal of, vol. 45, no. 10, pp. 2184–2197, Oct 2010.

[32] S.-M. Yoo, J. Walling, E. C. Woo, B. Jann, and D. Allstot, “A switched capacitor RF power amplifier,” Solid-State Circuits, IEEE Journal of, vol. 46, no. 12, pp. 2977–2987, Dec 2011.

[33] H. Xu, Y. Palaskas, A. Ravi, and K. Soumyanath, “A highly linear 25 dBm outphasing power amplifier in 32 nm CMOS for WLAN application,” in ESSCIRC, 2010 Proceedings of the, pp. 306–309, Sept 2010.

[34] K. Datta and H. Hashemi, “A 29 dBm 18.5 amplifier with dynamic load modulation,” in Solid-State Circuits Conference - (ISSCC), 2015 IEEE International, pp. 1–3, Feb 2015.

[35] C. Presti, F. Carrara, A. Scuderi, P. Asbeck, and G. Palmisano, “A 25 dBm digitally modulated CMOS power amplifier for WCDMA/EDGE/OFDM with adaptive digital predistortion and efficient power control,” Solid-State Circuits, IEEE Journal of, vol. 44, no. 7, pp. 1883–1896, July 2009.

[36] K. H. An, O. Lee, H. Kim, D. H. Lee, J. Han, K. S. Yang, Y. Kim, J. J. Chang, W. Woo, C.-H. Lee, H. Kim, and J. Laskar, “Power combining transformer techniques for fully integrated CMOS power amplifiers,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 5, pp. 1064–1075, May 2008.

[37] D. Chowdhury, S. Thyagarajan, L. Ye, E. Alon, and A. Niknejad, “A fully integrated efficient CMOS inverse class-D power amplifier for digital polar transmitters,” Solid-State Circuits, IEEE Journal of, vol. 47, no. 5, pp. 1113–1122, May 2012.

[38] D. Chowdhury, C. Hull, O. Degani, Y. Wang, and A. Niknejad, “A fully integrated dual-mode highly linear 2.4 GHz CMOS power amplifier for 4G WiMax applications,” Solid-State Circuits, IEEE Journal of, vol. 44, no. 12, pp. 3393–3402, Dec 2009.

[39] A. Pye and M. Hella, “Analysis and optimization of transformer-based series power combining for reconfigurable power amplifiers,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 58, no. 1, pp. 37–50, Jan 2011.

[40] O. Lee, K. An, H. Kim, D. Lee, J. Han, K. Yang, C.-H. Lee, H. Kim, and J. Laskar, “Analysis and design of fully integrated high-power parallel-circuit class-E CMOS power amplifiers,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 57, no. 3, pp. 725–734, March 2010.

[41] I. Hakala, D. Choi, L. Gharavi, N. Kajakine, J. Koskela, and R. Kaunisto, “A 2.14 GHz Chireix outphasing transmitter,” Microwave Theory and Techniques, IEEE Transactions on, vol. 53, no. 6, pp. 2129–2138, June 2005.

[42] M. El-Asmar, A. Birafane, A. Kouki, and A. El-Rai, “Optimal combiner design for outphasing RF amplification systems,” in Advances in Computational Tools for Engineering Applications (ACTEA), 2012 2nd International Conference on, pp. 176–181, Dec 2012.

[43] Y. Jung, J. Fritzin, M. Enqvist, and A. Alvandpour, “Least-squares phase predistortion of a +30 dBm class-D outphasing RF PA in 65 nm CMOS,” Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 60, no. 7, pp. 1915–1928, July 2013.

[44] H. Wang and H. Hashemi, “A 0.5-6 GHz 25.6 dBm fully integrated digital power amplifier in 65 nm CMOS,” pp. 409–412, June 2014.

[45] H. Kobayashi, J. Hinrichs, and P. Asbeck, “Current-mode class-D power amplifiers for high-efficiency RF applications,” Microwave Theory and Techniques, IEEE Transactions on, vol. 49, no. 12, pp. 2480–2485, Dec 2001.

[46] D. Chowdhury, L. Ye, E. Alon, and A. Niknejad, “A 2.4 GHz mixed-signal polar power amplifier with low-power integrated filtering in 65 nm CMOS,” in Custom Integrated Circuits Conference (CICC), 2010 IEEE, pp. 1–4, Sept 2010.

[47] H. Kobayashi, J. Hinrichs, and P. Asbeck, “Current mode class-D power amplifiers for high efficiency RF applications,” in Microwave Symposium Digest, 2001 IEEE MTT-S International, vol. 2, pp. 939–942, May 2001.

[48] S. Kee, I. Aoki, A. Hajimiri, and D. Rutledge, “The class-E/F family of ZVS switching amplifiers,” Microwave Theory and Techniques, IEEE Transactions on, vol. 51, no. 6, pp. 1677–1690, June 2003.

[49] M.-W. Lee, S.-H. Kam, and Y.-H. Jeong, “A highly efficient dual-band inverse class-E power amplifier with double CRLH-TLs for LTE and WCDMA applications,” in Microwave Conference Proceedings (APMC), 2011 Asia-Pacific, pp. 514–517, Dec 2011.

[50] A. Al Tanany, A. Sayed, and G. Boeck, “Design of class-F-1 power amplifier using GaN pHEMT for industrial applications,” in Microwave Conference, 2009 German, pp. 1–4, March 2009.

[51] S.-A. El-Hamamsy, “Design of high-efficiency RF class-D power amplifier,” Power Electronics, IEEE Transactions on, vol. 9, no. 3, pp. 297–308, May 1994.

[52] T. Mury and V. Fusco, “Analysis of the effect of finite d.c. blocking capacitance and finite d.c. feed inductance on the performance of inverse class-E amplifiers,” Circuits, Devices and Systems, IEE Proceedings -, vol. 153, no. 2, pp. 129–135, April 2006.

[53] H. Krauss, C. Bostian, and F. Raab, Solid state radio engineering. Wiley, 1980.

[54] T. Mury and V. Fusco, “Series-L/parallel-tuned class-E power amplifier analysis,” in Microwave Conference, 2005 European, vol. 1, pp. 1–4, Oct 2005.

[55] S. Cripps, RF power amplifiers for wireless communications, ser. Artech House microwave library. Artech House, 1999.

[56] J. Yao and S. Long, “Power amplifier selection for LINC applications,” Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 53, no. 8, pp. 763–767, Aug 2006.

[57] R. Beltran, F. Raab, and A. Velazquez, “HF outphasing transmitter using class-E power amplifiers,” in Microwave Symposium Digest, 2009. MTT ’09. IEEE MTT-S International, pp. 757–760, June 2009.

[58] A. Banerjee, R. Hezar, L. Ding, N. Schemm, and B. Haroun, “A 29.5 dBm class-E outphasing RF power amplifier with performance enhancement circuits in 45 nm CMOS,” in European Solid State Circuits Conference (ESSCIRC), ESSCIRC 2014 - 40th, pp. 467–470, Sept 2014.

[59] D. Kang, B. Park, C. Zhao, D. Kim, J. Kim, Y. Cho, S. Jin, H. Jin, and B. Kim, “A 34 % PAE, 26 dBm output power envelope-tracking CMOS power amplifier for 10 MHz BW LTE applications,” in Microwave Symposium Digest (MTT), 2012 IEEE MTT-S International, pp. 1–3, June 2012.

[60] F. Wang, D. Kimball, D. Lie, P. Asbeck, and L. Larson, “A monolithic high-efficiency 2.4 GHz 20 dBm SiGe BiCMOS envelope tracking OFDM power amplifier,” Solid-State Circuits, IEEE Journal of, vol. 42, no. 6, pp. 1271–1281, June 2007.

[61] S. Mann, M. Beach, P. Warr, and J. McGeehan, “Increasing the talk-time of mobile radios with efficient linear transmitter architectures,” Electronics Communication Engineering Journal, vol. 13, no. 2, pp. 65–76, Apr 2001.

[62] J.-H. Chen, K. U-yen, and J. Kenney, “An envelope elimination and restoration power amplifier using a CMOS dynamic power supply circuit,” in Microwave Symposium Digest, 2004 IEEE MTT-S International, vol. 3, pp. 1519–1522, June 2004.

[63] F. Wang, D. Kimball, J. Popp, A. Yang, D. Lie, P. Asbeck, and L. Larson, “Wideband envelope elimination and restoration power amplifier with high efficiency wideband envelope amplifier for WLAN 802.11g applications,” in Microwave Symposium Digest, 2005 IEEE MTT-S International, pp. 1–4, June 2005.

[64] F. Wang, A. Yang, D. Kimball, L. Larson, and P. Asbeck, “Design of wide-bandwidth envelope tracking power amplifiers for OFDM applications,” Microwave Theory and Techniques, IEEE Transactions on, vol. 53, no. 4, pp. 1244–1255, April 2005.

[65] P. Fedorenko and J. Kenney, “Analysis and suppression of memory effects in envelope elimination and restoration (EER) power amplifiers,” in Microwave Symposium, 2007. IEEE/MTT-S International, pp. 1453–1456, June 2007.

[66] A. Kavousian, D. Su, M. Hekmat, A. Shirvani, and B. Wooley, “A digitally modulated polar CMOS power amplifier with a 20 MHz channel bandwidth,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 10, pp. 2251–2258, Oct 2008.

[67] T. Nakatani, J. Rode, D. Kimball, L. Larson, and P. Asbeck, “Digitally controlled polar transmitter using a watt-class current-mode class-D CMOS power amplifier and Guanella reverse Balun for handset applications,” Solid-State Circuits, IEEE Journal of, vol. 47, no. 5, pp. 1104–1112, May 2012.

[68] E. Kaymaksut and P. Reynaert, “A dual-mode transformer-based Doherty LTE power amplifier in 40 nm CMOS,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International, pp. 64–65, Feb 2014.

[69] W. Tai, H. Xu, A. Ravi, H. Lakdawala, O. Bochobza-Degani, L. Carley, and Y. Palaskas, “A transformer-combined 31.5 dBm outphasing power amplifier in 45 nm LP CMOS with dynamic power control for backoff power efficiency enhancement,” Solid-State Circuits, IEEE Journal of, vol. 47, no. 7, pp. 1646–1658, July 2012.

[70] W.-Y. Kim, H. S. Son, J. H. Kim, J. Y. Jang, I. Y. Oh, and C. S. Park, “A CMOS envelope-tracking transmitter with an on-chip common-gate voltage modulation linearizer,” Microwave and Wireless Components Letters, IEEE, vol. 24, no. 6, pp. 406–408, June 2014.

[71] S.-M. Yoo, J. Walling, E.-C. Woo, and D. Allstot, “A power-combined switched-capacitor power amplifier in 90 nm CMOS,” in Radio Frequency Integrated Circuits Symposium (RFIC), 2011 IEEE, pp. 1–4, June 2011.

[72] X. Zhang, L. Larson, and P. Asbeck, Design of linear RF outphasing power amplifiers, ser. Artech House microwave library. Artech House, 2003.

[73] K. Oishi, E. Yoshida, Y. Sakai, H. Takauchi, Y. Kawano, N. Shirai, H. Kano, M. Kudo, T. Murakami, T. Tamura, S. Kawai, S. Yamaura, K. Suto, H. Yamazaki, and T. Mori, “A 1.95 GHz fully integrated envelope elimination and restoration CMOS power amplifier with envelope/phase generator and timing aligner for WCDMA and LTE,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International, pp. 60–61, Feb 2014.

[74] J. Long, “Monolithic transformers for silicon RF IC design,” Solid-State Circuits, IEEE Journal of, vol. 35, no. 9, pp. 1368–1382, Sept 2000.

Chapter 2 Specifications

For the design of a transmitter system, 3GPP specifies several requirements regarding output power and linearity [1]. The PA is a crucial part of the overall system performance since it has a direct impact on these requirements. In the case of the DPA it is the stage where the conversion from the digital to the analog domain takes place and therefore the first stage where nonlinearities, and thus amplitude to amplitude distortion (AM-AM) and AM-PM, occur. The requirements can be divided into signal performance requirements, e.g. linearity and power, and design performance requirements, such as area, efficiency and reliability. The output power is the power that a TX chain has to deliver to the antenna. The linearity is divided into the in-band requirement, which is defined by the EVM, and the maximum out-of-band emission, which is defined by the ACLR. For the linearity it is also of interest whether a linearization technique is used and how much impact it has on the performance in terms of linearity, area and efficiency.

2.1 Power Amplifier Basics

The efficiency of a PA can be described by the PAE, the drain efficiency ηd and the overall efficiency ηoa [2]. The PAE is defined as the difference between the output power Pout and the input power Pin divided by the DC power consumption Pdc:

$$ \mathrm{PAE} = \frac{P_\mathrm{out} - P_\mathrm{in}}{P_\mathrm{dc}}. \tag{2.1} $$


The output or drain efficiency ηd is defined as the ratio of Pout to Pdc:

$$ \eta_\mathrm{d} = \frac{P_\mathrm{out}}{P_\mathrm{dc}}. \tag{2.2} $$
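The two definitions in Eq. (2.1) and Eq. (2.2) can be compared numerically with a small sketch. The power values are hypothetical and only serve to show that the PAE is always lower than the drain efficiency when a nonzero input power is needed.

```python
def drain_efficiency(p_out, p_dc):
    """Drain efficiency according to Eq. (2.2); all powers in watts."""
    return p_out / p_dc


def power_added_efficiency(p_out, p_in, p_dc):
    """PAE according to Eq. (2.1); all powers in watts."""
    return (p_out - p_in) / p_dc


# Hypothetical operating point: 1 W output, 50 mW drive, 2.5 W from the supply.
P_OUT, P_IN, P_DC = 1.0, 0.05, 2.5
print(drain_efficiency(P_OUT, P_DC))
print(power_added_efficiency(P_OUT, P_IN, P_DC))
```

The gap between the two numbers shrinks as the gain of the stage increases, because the input power then becomes negligible compared to the output power.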

The overall efficiency ηoa is defined like ηd but additionally considers the input power Pin that is dissipated at the input to drive the PA. It can also include the power that is dissipated to drive additional blocks, e.g. the clock stage or the power supply modulation. In the following ηoa is defined as

$$ \eta_\mathrm{oa} = \frac{P_\mathrm{out}}{P_\mathrm{dc} + P_\mathrm{in}}. \tag{2.3} $$

The operating power gain G is the amplification of the input signal to the output and is given as

$$ G = 10 \log_{10} \left( \frac{P_\mathrm{out}}{P_\mathrm{in}} \right). \tag{2.4} $$

For a DPA, some specifications of a DAC additionally have to be taken into account. The DNL is a measure of how accurately the DAC works at every single step. It is measured in LSBs and is defined at step n as

$$ \mathrm{DNL}_n = \frac{V_{n+1} - V_n}{V_\mathrm{LSB}} - 1. \tag{2.5} $$

It is common to state the maximum value max DNL as the DNL value. The INL is a measure that describes the difference between the ideal and the real transfer function. It is also measured in LSBs and defined at step n as

$$ \mathrm{INL}_n = \frac{V_n - V_{n,\mathrm{ideal}}}{V_\mathrm{LSB}} = \sum_{k=1}^{n} \mathrm{DNL}_k. \tag{2.6} $$

It is common to state the maximum value max INL as the INL value. For the characterization of test signals the following specifications are important. The crest factor (CF) of a signal is defined as the ratio of the maximum amplitude Amax to the root mean square amplitude Arms:

$$ \mathrm{CF} = \frac{A_\mathrm{max}}{A_\mathrm{rms}}. \tag{2.7} $$
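Returning to the DAC measures in Eq. (2.5) and Eq. (2.6), the following sketch evaluates DNL and INL for a short list of output levels. The level values are hypothetical, the code uses zero-based indices, and the ideal transfer function is assumed to start at the first measured level, so the running sum is shifted by one index compared to the notation of Eq. (2.6).

```python
def dnl(levels, v_lsb):
    """DNL per step in LSBs: step size between neighbors minus one ideal LSB."""
    return [(levels[n + 1] - levels[n]) / v_lsb - 1.0
            for n in range(len(levels) - 1)]


def inl(levels, v_lsb):
    """INL per code in LSBs, relative to an ideal line anchored at levels[0]."""
    return [(levels[n] - (levels[0] + n * v_lsb)) / v_lsb
            for n in range(len(levels))]


# Hypothetical measured output levels of a 5-level converter with V_LSB = 1.
LEVELS = [0.0, 1.1, 2.0, 3.2, 4.0]
dnl_values = dnl(LEVELS, 1.0)
inl_values = inl(LEVELS, 1.0)
print([round(d, 3) for d in dnl_values])
print([round(i, 3) for i in inl_values])
print(max(abs(d) for d in dnl_values), max(abs(i) for i in inl_values))
```

In this convention the INL at code n equals the sum of the DNL values of all steps below it, which mirrors the cumulative relation in Eq. (2.6).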


The PAPR describes the ratio of maximum to average in the power domain. It is related to the CF and can be calculated as

$$ \mathrm{PAPR} = 10 \log_{10} \left( \mathrm{CF}^2 \right) = 20 \log_{10} \left( \frac{A_\mathrm{max}}{A_\mathrm{rms}} \right). \tag{2.8} $$

The test signals are used to specify the performance of a PA. The EVM is a measure of the in-band signal quality. The root mean square EVM is defined as

$$ \mathrm{EVM}_\mathrm{rms} = \left( \frac{\frac{1}{N} \sum_{r=1}^{N} \left| S_{\mathrm{ideal},r} - S_{\mathrm{meas},r} \right|^2}{\frac{1}{N} \sum_{r=1}^{N} \left| S_{\mathrm{ideal},r} \right|^2} \right)^{1/2}. \tag{2.9} $$

S_ideal,r is the ideal symbol and S_meas,r is the measured symbol in a stream of N symbols [3]. The CHP can be described as

$$ \mathrm{CHP} = 10 \log_{10} \left[ \frac{\mathrm{CHBW}}{\mathrm{RBW}} \cdot \frac{1}{k_n N} \sum_{i=1}^{N} 10^{\frac{P_i - a_i(\mathrm{RRC})}{10}} \right]. \tag{2.10} $$

CHBW is the channel BW, RBW the resolution BW, k_n the correction factor for the noise BW, N the number of pixels within the channel, P_i the level represented by pixel i and a_i(RRC) the attenuation of the 3GPP root-raised-cosine (RRC) filter at pixel i [4]. The ACLR is the ratio of the transmitted power CHP_c to the power CHP_ac that is measured after a receiver filter in the adjacent channel and can be expressed as

$$ \mathrm{ACLR} = 10 \log_{10} \left( \frac{\mathrm{CHP}_\mathrm{c}}{\mathrm{CHP}_\mathrm{ac}} \right). \tag{2.11} $$

Figures 2.1a to 2.1d show the four modulation schemes that are specified: binary phase-shift keying (BPSK), QPSK, 16-QAM and 64-QAM. The simplest modulation is BPSK, which carries only 1 bit/symbol but is the most robust since only two constellation points have to be distinguished. 64-QAM on the other hand carries 6 bit/symbol, but since 64 constellation points have to be distinguished the requirements on the system are higher. The specifications for LTE are given for BPSK, QPSK, 16-QAM and 64-QAM. It will be shown later that the requirements for LTE change depending on the modulation scheme.
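The EVM definition in Eq. (2.9) and the bits per symbol of the four modulation schemes can be checked with a short sketch. The QPSK constellation and the 5 % gain error are hypothetical test values, and the function names are chosen only for this illustration.

```python
import math


def evm_rms(ideal, measured):
    """Root mean square EVM according to Eq. (2.9) for complex symbols."""
    n = len(ideal)
    error_power = sum(abs(si - sm) ** 2 for si, sm in zip(ideal, measured)) / n
    reference_power = sum(abs(si) ** 2 for si in ideal) / n
    return math.sqrt(error_power / reference_power)


def bits_per_symbol(constellation_size):
    """Number of bits carried by one symbol of an M-point constellation."""
    return int(round(math.log2(constellation_size)))


# Hypothetical test case: QPSK symbols with a pure 5 % gain error.
QPSK = [complex(i, q) for i in (-1, 1) for q in (-1, 1)]
measured = [1.05 * s for s in QPSK]
print(evm_rms(QPSK, measured))
print([bits_per_symbol(m) for m in (2, 4, 16, 64)])
```

A pure gain error of 5 % gives an EVM of 5 %, since every error vector is 5 % of the corresponding ideal symbol. Phase errors and noise would add to this in a real measurement.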

[Figure: constellation diagrams in the I/Q plane for (a) BPSK, (b) QPSK, (c) 16-QAM and (d) 64-QAM.]

Figure 2.1: Diagram of the signal modulation using BPSK (a), QPSK (b), 16-QAM (c) or 64-QAM (d).

The EVM that results from an error in the transmission is illustrated in Figure 2.2. The measured signal deviates in amplitude and phase from the ideal signal, which results in a different location in the constellation diagram. The error can be expressed as a magnitude error and a phase error that together result in an error vector. The error vector itself can also be expressed by its magnitude and phase.

[Figure: I/Q plane showing the ideal signal, the measured signal, the magnitude error, the phase error, the error vector and the phase of the error vector.]

Figure 2.2: Diagram of the EVM definition.


2.2 LTE Signal

Figure 2.3 shows the diagram of the OFDM time-frequency multiplexing. The smallest unit in the LTE signal is a subcarrier (SC) with 15 kHz BW. 12 SCs form one RB with a BW of 180 kHz. The RB is the smallest entity that can be scheduled in the frequency domain. LTE allows six different channel BWs of 1.4, 3, 5, 10, 15 and 20 MHz, which consist of at most 6, 15, 25, 50, 75 and 100 RBs, respectively. With carrier aggregation, different bands can be combined to increase the BW beyond the maximum of 20 MHz for a single carrier. The data is allocated to the user equipments (UEs) in units of RBs. Any RB can be assigned to a UE [5, 6]. Every allocation for a UE lasts at least one subframe, i.e. two slots. Consequently, for the maximum BW of 20 MHz, 100 RBs can be transferred each millisecond.
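The relation between subcarriers, RBs and channel BW can be verified with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the constant and function names are hypothetical.

```python
SUBCARRIER_SPACING_HZ = 15_000
SUBCARRIERS_PER_RB = 12

# Channel BW in MHz mapped to the maximum number of RBs, as listed in the text.
MAX_RBS = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}


def rb_bandwidth_hz():
    """Bandwidth of one RB: 12 subcarriers with 15 kHz spacing."""
    return SUBCARRIERS_PER_RB * SUBCARRIER_SPACING_HZ


def occupied_bandwidth_hz(num_rbs):
    """Bandwidth that is occupied by a given number of RBs."""
    return num_rbs * rb_bandwidth_hz()


for channel_mhz, num_rbs in MAX_RBS.items():
    occupied_mhz = occupied_bandwidth_hz(num_rbs) / 1e6
    print(channel_mhz, num_rbs, occupied_mhz)
```

For the 20 MHz channel the 100 RBs occupy 18 MHz, so part of the channel BW remains unoccupied at the band edges.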

[Figure: time-frequency grid with the axes frequency and time. Resource blocks are assigned to UEs; the subcarrier spacing is 15 kHz, 1 slot = 0.5 ms and 1 subframe = 1 ms.]

Figure 2.3: Diagram of the OFDM time-frequency multiplexing.

Figure 2.4 presents the basic concept of an orthogonal frequency-division multiple access (OFDMA) signal. OFDMA consists of OFDM and frequency-division multiple access (FDMA). The important property of the signal is its mathematical orthogonality, which shows in the zero crossings: at the peak of each subcarrier all other subcarriers are zero. The signal is represented in the frequency domain by the function sin(x)/x. A cyclic prefix (CP) is added to each OFDM symbol as a guard interval to make the system more robust against intersymbol interference (ISI). OFDMA is used for the downlink. For the uplink, single-carrier FDMA (SC-FDMA) is used. SC-FDMA reduces the PAPR and therefore eases the design of the PA [7].

[Figure: normalized amplitude over frequency for overlapping sin(x)/x subcarrier spectra.]

Figure 2.4: Diagram of OFDM with 15 kHz signal spacing.

2.3 LTE Specification

Figure 2.5 shows the TX chain of the transceiver from the DPA to the antenna. After the DPA, a notch filter is needed to suppress the spectral components that would otherwise be emitted into an adjacent band, e.g. WLAN. Afterward, a more relaxed low-pass filter is needed to filter the harmonics. The receiver (RX)/TX switch is needed in the time division duplex (TDD) architecture to switch between the receiver and transmitter paths.

Figure 2.5: Block diagram of the TX chain from the DPA to the antenna. (Blocks in order: DPA, notch filter, harmonic filter, RX/TX switch, coupler.)


The last stage before the signal arrives at the antenna is a directional coupler that allows measurements of the emitted power.

In Table 2.1 the characteristics of the different components are shown. The post-PA attenuation of 0.1 dB is an estimate based on experience. The notch filter is needed in particular to protect the WLAN band and is normally characterized in the range of -20 to +85 °C. The filter should be designed for bands 40 and 41 and would therefore be specified from 2300 MHz to 2690 MHz. Notice that band 38 is already included in band 41. However, since the attenuation of this filter is very high, it cannot be used. For the low-pass (LP) filter, a filter for bands 38, 40 and 41 has to be used. Usually the center frequency should be around 2.5 GHz and the pass band range should be ±200 MHz. The temperature range has to be between -40 and +85 °C. An example for an LP is provided below. Since the notch filter and the LP are next to each other, one might consider merging the two components. For the antenna switch module (ASM), the coupler and the antenna connector, the values are the same for all three bands. After summing up the typical and the worst case room (WCR) values, the transition loss ranges from 3.83 dB to 4.00 dB. To meet the maximum output power given in Table 2.3 with a DPA that delivers at most 27 dBm, a loss of only 2.56 dB is acceptable for the notch and LP filter together.

Table 2.1: Component Losses at Different Frequencies

Band  Case  Post PA  Notch & LP Filter  ASM & Coupler & SW_mech  ∑
38    typ.  0.1 dB   2.56 dB            1.27 dB                  3.83 dB
38    WCR   0.1 dB   2.56 dB            1.34 dB                  4.00 dB
40    typ.  0.1 dB   2.56 dB            1.27 dB                  3.83 dB
40    WCR   0.1 dB   2.56 dB            1.34 dB                  4.00 dB
41    typ.  0.1 dB   2.56 dB            1.27 dB                  3.83 dB
41    WCR   0.1 dB   2.56 dB            1.34 dB                  4.00 dB
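The filter budget quoted above can be retraced with a few lines of arithmetic. The following is a minimal sketch, assuming that the budget is simply the difference between the DPA and antenna power minus the remaining worst-case losses; the function name and its parameters are hypothetical and not taken from the original work.

```python
def allowed_filter_loss_db(dpa_max_dbm, antenna_max_dbm, post_pa_db, asm_coupler_db):
    """Loss budget left for the notch and LP filter (powers in dBm, losses in dB)."""
    total_budget_db = dpa_max_dbm - antenna_max_dbm  # total chain loss the DPA can cover
    return total_budget_db - post_pa_db - asm_coupler_db


# Values from Table 2.1 (WCR row) and Table 2.3
budget = allowed_filter_loss_db(27.0, 23.0, 0.1, 1.34)
print(f"Allowed notch + LP filter loss: {budget:.2f} dB")
```

With the worst-case ASM and coupler loss, this leaves the 2.56 dB that Table 2.1 assigns to the two filters.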

2.3.1 Temperature

In the following, a selection of the requirements given by 3GPP is presented. All specifications stated for the UE have to be fulfilled in the temperature range from -10 to +55 °C.


Table 2.2: Temperature Specification

             min. Temperature  max. Temperature
Temperature  −10 °C            +55 °C

2.3.2 Output Power

In Table 2.3 the specifications for the DPA are derived from the 3GPP specifications for LTE. As can be seen in Figure 2.5, there are several components between the DPA and the antenna that attenuate the signal. For these blocks, additional insertion losses have to be considered and compensated. The values apply to all six basic BW options of 1.4, 3, 5, 10, 15 and 20 MHz. Notice that the entire power range from -40 dBm to 23 dBm must be covered. When the TX is switched off, it should not transmit more than -50 dBm. Depending on the transmitted power level, 3GPP allows some tolerances: 2 dB for the maximum output power and 7 dB for the minimum output power.

Table 2.3: Output Power Specifications for LTE

                    Antenna   Chain losses  DPA       Tolerance
Max. Output Power   23 dBm    4 dB          27 dBm    2 dB
Min. Output Power   −40 dBm   n.a.          −40 dBm   7 dB
Transmit OFF Power  −50 dBm   n.a.          −50 dBm   < 0 dB

For the values between the minimum and maximum output power, the output power is defined within certain tolerance boundaries. The tolerance values for the specific output power at the antenna are given in Table 2.4.

2.3.3 Maximum Power Reduction

The maximum power reduction (MPR) for power class 3 can be seen in Table 2.5. Depending on the channel bandwidth (CHBW) and the RB allocation, an MPR of 1 or 2 dB is allowed. For example, for a QPSK signal with 5 MHz BW and more than 8 RBs, an MPR of at most 1 dB is allowed. For a 16-QAM signal with 5 MHz BW and at most 8 RBs the MPR is also at most 1 dB, but with more than 8 RBs it is at most 2 dB.


Table 2.4: Output Power Tolerance

PCMAX [dBm]         Tolerance T(PCMAX) [dB]
21 ≤ PCMAX ≤ 23     2.0
20 ≤ PCMAX < 21     2.5
19 ≤ PCMAX < 20     3.5
18 ≤ PCMAX < 19     4.0
13 ≤ PCMAX < 18     5.0
8 ≤ PCMAX < 13      6.0
−40 ≤ PCMAX < 8     7.0
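The tolerance bands of Table 2.4 lend themselves to a small lookup. The sketch below is only an illustration of how the table can be read; the names `TOLERANCE_TABLE` and `power_tolerance_db` are hypothetical.

```python
# (lower bound of PCMAX in dBm, tolerance in dB), highest range first, as in Table 2.4
TOLERANCE_TABLE = [
    (21, 2.0), (20, 2.5), (19, 3.5), (18, 4.0),
    (13, 5.0), (8, 6.0), (-40, 7.0),
]


def power_tolerance_db(pcmax_dbm):
    """Return the allowed output power tolerance for a configured PCMAX."""
    if not -40 <= pcmax_dbm <= 23:
        raise ValueError("PCMAX outside the specified range of -40 to 23 dBm")
    for lower_bound_dbm, tolerance_db in TOLERANCE_TABLE:
        if pcmax_dbm >= lower_bound_dbm:
            return tolerance_db


print(power_tolerance_db(23))   # maximum output power
print(power_tolerance_db(-40))  # minimum output power
```

The two printed values correspond to the tolerances listed for the maximum and minimum output power in Table 2.3.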

Table 2.5: Maximum Power Reduction in Power Class 3 for Different LTE RB Allocations and BW

            CHBW / Transmission BW (NRB)
Modulation  1.4 MHz  3.0 MHz  5 MHz  10 MHz  15 MHz  20 MHz  MPR (dB)
QPSK        >5       >4       >8     >12     >16     >18     ≤ 1
16-QAM      ≤5       ≤4       ≤8     ≤12     ≤16     ≤18     ≤ 1
16-QAM      >5       >4       >8     >12     >16     >18     ≤ 2
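Table 2.5 can be read as a threshold on the number of allocated RBs per CHBW. The following sketch encodes that reading; the function name is hypothetical, and returning 0 dB for QPSK allocations at or below the threshold is an assumption, since the table does not list that case.

```python
# RB threshold per channel bandwidth in MHz, taken from Table 2.5
RB_THRESHOLD = {1.4: 5, 3.0: 4, 5.0: 8, 10.0: 12, 15.0: 16, 20.0: 18}


def max_power_reduction_db(modulation, chbw_mhz, n_rb):
    """Upper limit of the MPR in dB for power class 3."""
    wide_allocation = n_rb > RB_THRESHOLD[chbw_mhz]
    if modulation == "QPSK":
        # Assumption: no MPR for QPSK allocations not listed in the table
        return 1.0 if wide_allocation else 0.0
    if modulation == "16-QAM":
        return 2.0 if wide_allocation else 1.0
    raise ValueError(f"unsupported modulation: {modulation}")


print(max_power_reduction_db("QPSK", 5.0, 9))
print(max_power_reduction_db("16-QAM", 5.0, 8))
print(max_power_reduction_db("16-QAM", 5.0, 9))
```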

2.3.4 VSWR

The 3GPP specifications assume a load of 50 Ω at the antenna connector. In reality, this impedance can vary due to antenna mismatch. Therefore, the performance specifications have to be met for a voltage standing wave ratio (VSWR) of 3:1 at the antenna port. For robustness, the PA should be designed for a VSWR of 10:1 at the antenna port.

Table 2.6: VSWR Specification at the Antenna

       Performance  Robustness
VSWR   3:1          10:1

Considering the additional components between the antenna and the PA we can calculate the VSWR at the output of the PA. The relation for the return loss (RL) is given as

RL = −20 log |Γ|. (2.12)


The VSWR calculated at the output of the PA is given in Table 2.7. Considering the performance and robustness requirements for the typical and worst case scenarios, the PA has to be specified for a VSWR between 1.50:1 and 2.03:1.

Table 2.7: VSWR Specification at the PA

Band  Case  Performance  Robustness
38    typ.  1.52:1       2.03:1
38    WCR   1.50:1       2.00:1
40    typ.  1.52:1       2.03:1
40    WCR   1.50:1       2.00:1
41    typ.  1.52:1       2.03:1
41    WCR   1.50:1       2.00:1
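One common way to translate a VSWR at the antenna into a VSWR at the PA is to improve the return loss of equation 2.12 by twice the one-way loss of the chain, because the reflected wave passes the lossy components twice. The sketch below assumes this method and uses the summed losses of Table 2.1; the original text does not spell out the calculation, and the function name is hypothetical.

```python
import math


def vswr_at_pa(vswr_antenna, one_way_loss_db):
    """VSWR seen by the PA behind a matched attenuation of one_way_loss_db."""
    gamma_antenna = (vswr_antenna - 1.0) / (vswr_antenna + 1.0)
    rl_antenna_db = -20.0 * math.log10(gamma_antenna)  # equation 2.12
    rl_pa_db = rl_antenna_db + 2.0 * one_way_loss_db    # loss is traversed twice
    gamma_pa = 10.0 ** (-rl_pa_db / 20.0)
    return (1.0 + gamma_pa) / (1.0 - gamma_pa)


for case, loss_db in (("typ.", 3.83), ("WCR", 4.00)):
    print(case, f"{vswr_at_pa(3.0, loss_db):.2f}:1")
```

Under this assumption the performance column of Table 2.7 is reproduced for both cases, while the robustness case of 10:1 lands near 2.0:1.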

2.3.5 Operating Band

In Table 2.8 the selection of the used bands is shown. The linear PA was designed for frequency division duplex (FDD) band 1. The DPA was designed for the FDD bands 7 and 30 and the TDD bands 38, 40 and 41. Therefore, the linear PA needs to be designed for a center frequency of 1.95 GHz in the uplink channel. The DPA needs to be designed for the frequency range of 2.3-2.7 GHz. The duplex distance between TX and RX is 120 MHz for band 7. Notice that the lower band of WLAN operates at 2.4 GHz. This has to be taken into account for coexistence measurements.

Table 2.8: Selection of E-UTRA Operating Bands

E-UTRA operating band  Uplink (UL) operating band  Downlink (DL) operating band  Duplex mode  TX - RX
1                      1920 MHz - 1980 MHz         2110 MHz - 2170 MHz           FDD          190 MHz
7                      2500 MHz - 2570 MHz         2620 MHz - 2690 MHz           FDD          120 MHz
30                     2305 MHz - 2315 MHz         2350 MHz - 2360 MHz           FDD          45 MHz
38                     2570 MHz - 2620 MHz         2570 MHz - 2620 MHz           TDD          -
40                     2300 MHz - 2400 MHz         2300 MHz - 2400 MHz           TDD          -
41                     2496 MHz - 2690 MHz         2496 MHz - 2690 MHz           TDD          -

2.3.6 EVM Requirements

The EVM requirements are the same for all bands and BWs but depend on the specified modulation. For BPSK and QPSK an EVM of 17.5 % is required. For 16-QAM and 64-QAM the requirements are 12.5 % and 8 %, respectively.

Table 2.9: EVM 3GPP Requirements for Different Modulations

         BPSK  QPSK  16-QAM  64-QAM
EVM [%]  17.5  17.5  12.5    8
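A measured constellation can be checked against the limits of Table 2.9 as sketched below. The sketch assumes the widely used root mean square EVM definition, normalized to the mean power of the ideal symbols; the function names and the test constellation with a 10 % gain error are purely illustrative.

```python
import math

# Limits from Table 2.9 in percent
EVM_LIMIT_PERCENT = {"BPSK": 17.5, "QPSK": 17.5, "16-QAM": 12.5, "64-QAM": 8.0}


def evm_rms_percent(measured, ideal):
    """RMS EVM in percent, normalized to the mean power of the ideal symbols."""
    error_power = sum(abs(m - i) ** 2 for m, i in zip(measured, ideal))
    reference_power = sum(abs(i) ** 2 for i in ideal)
    return 100.0 * math.sqrt(error_power / reference_power)


def meets_evm_limit(measured, ideal, modulation):
    return evm_rms_percent(measured, ideal) <= EVM_LIMIT_PERCENT[modulation]


ideal = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
measured = [symbol * 1.1 for symbol in ideal]  # hypothetical 10 % gain error
print(f"EVM = {evm_rms_percent(measured, ideal):.1f} %")
print(meets_evm_limit(measured, ideal, "QPSK"), meets_evm_limit(measured, ideal, "64-QAM"))
```

In this example the same error passes the QPSK limit but fails the tighter 64-QAM limit.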

2.3.7 ACLR Requirements

The transmitter RF spectrum is given in Figure 2.6. The emissions can be divided into OOB and spurious emissions. It can be seen that an OOB domain ΔfOOB lies between the CHBW and the spurious domain on each side and that the E-UTRA band reaches into the spurious domain.

Figure 2.6: Transmitter RF spectrum. (Regions along the frequency axis: spurious domain, OOB domain ΔfOOB, channel bandwidth with RBs, OOB domain ΔfOOB, spurious domain; the E-UTRA band is marked below.)

The boundary between the E-UTRA OOB and the spurious emission domain depends on the BW. Table 2.10 shows the OOB boundary for each CHBW. It ranges from 2.8 MHz for 1.4 MHz CHBW to 25 MHz for 20 MHz CHBW.


Table 2.10: Boundary between E-UTRA OOB and Spurious Emission Domain

Channel BW [MHz]         1.4  3.0  5   10  15  20
OOB boundary foob [MHz]  2.8  6    10  15  20  25

The ACLR requirements for the E-UTRA channel inside of foob are shown in Figure 2.7 for E-UTRA and UTRA. It can be seen that foob depends on the CHBW but the UTRA channels are fixed.

Figure 2.7: OOB emission mask and E-UTRA channel. (Within ΔfOOB next to the E-UTRA channel with its RBs: the E-UTRA ACLR, UTRA ACLR1 and UTRA ACLR2 measurement channels.)

In Table 2.11 the general requirements for E-UTRA ACLR and UTRA ACLR are given. The minimum E-UTRA ACLR, first UTRA ACLR and second UTRA ACLR are 30, 33 and 36 dB, respectively, for all BWs for which they are specified.

Table 2.11: General Requirements for E-UTRA and UTRA ACLR

CHBW         1.4 MHz  3.0 MHz  5 MHz  10 MHz  15 MHz  20 MHz
E-UTRA ACLR  30 dB    30 dB    30 dB  30 dB   30 dB   30 dB
UTRA ACLR1   33 dB    33 dB    33 dB  33 dB   33 dB   33 dB
UTRA ACLR2   -        -        36 dB  36 dB   36 dB   36 dB

The OOB emission is limited by the spectrum emission mask. The general E-UTRA spectrum emission mask for 20 MHz is given in Figure 2.8. For the first 1 MHz next to the channel edge a measurement BW of 30 kHz is specified. Beyond this first 1 MHz, both above and below the channel, the measurement BW is 1 MHz.


Figure 2.8: The general E-UTRA spectrum emission mask for 20 MHz. (Axes: power level from -30 dBm to 0 dBm over frequency offset from -30 MHz to +30 MHz, with the RBs in the center.)

The spurious emission limits are given in Table 2.12. They are defined for frequencies above and below OOB.

Table 2.12: Spurious Emissions Limits

Frequency Range                 Maximum Level  Measurement BW
9 kHz ≤ f < 150 kHz             -36 dBm        1 kHz
150 kHz ≤ f < 30 MHz            -36 dBm        10 kHz
30 MHz ≤ f < 1000 MHz           -36 dBm        100 kHz
1 GHz ≤ f < 12.75 GHz           -30 dBm        1 MHz
12.75 GHz ≤ f < 5th harmonic    -30 dBm        1 MHz

2.3.8 Power Clipping

If the PA cannot fully deliver the output power required by the signal, power clipping results. Clipping always produces undesired spurious emissions. To represent an ideal PA, classical power clipping was chosen [8]. In the following, the ACLR for E-UTRA and UTRA and the root mean square EVM are shown for different test cases (TCs). Notice that pseudo random numbers are used to generate the signal, so that the peak value of the signal can vary. To further specify the signal, the PAPR, the CF and the CF that contains 99.9 % of the samples can be calculated using the complementary cumulative distribution function (CCDF) [8]. For each TC, the power clipping at which the first violation of any spectral requirement occurs is stated.

Figure 2.9: The classical clipping for power reduction. (Axes: output amplitude over input amplitude; curves without clipping and with classical clipping.)
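The transfer curve of Figure 2.9 can be sketched as a hard limit on the magnitude of each complex baseband sample while the phase is left untouched. The sketch below assumes this usual form of classical clipping and assumes that a power clipping of X dB places the clipping level X dB below the signal peak; both function names are hypothetical.

```python
import cmath


def classical_clip(samples, max_amplitude):
    """Limit the magnitude of each complex sample and keep its phase."""
    clipped = []
    for sample in samples:
        if abs(sample) > max_amplitude:
            sample = cmath.rect(max_amplitude, cmath.phase(sample))
        clipped.append(sample)
    return clipped


def clip_level_from_backoff(peak_amplitude, clipping_db):
    """Amplitude that lies clipping_db below the peak (assumed definition)."""
    return peak_amplitude * 10.0 ** (-clipping_db / 20.0)


signal = [3 + 4j, 0.5j]  # illustrative samples, peak magnitude 5
level = clip_level_from_backoff(5.0, 6.0)
print(level, classical_clip(signal, level))
```

Samples below the clipping level pass unchanged, which corresponds to the linear part of the curve in Figure 2.9.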

In Table 2.13 the results for LTE physical uplink shared channel (PUSCH) TC signals with a length of 10 slots and 16-QAM symbol modulation are shown. Depending on the BW, the PAPR of the signals is in the range of 7.84 dB to 8.78 dB. The CF for 100 % of the samples lies between 2.47 and 2.75, and for 99.9 % between 2.08 and 2.10. Due to the pseudo random generation, the results might differ slightly from the worst case scenarios. It can be seen that for an undistorted signal and the given TCs the power clipping has to be less than 5 dB to fulfill the spectral requirements.

Table 2.13: Spurious Emission Results for Different Test Signals

BW [MHz]  PAPR [dB]  CF 100%  CF 99.9%  First Violation at Power Clipping [dB]
1.4       7.84       2.47     2.08      5.5
3         8.45       2.65     2.10      5.5
5         8.50       2.66     2.10      5.0
10        8.60       2.69     2.10      5.5
15        8.66       2.71     2.10      5.5
20        8.78       2.75     2.10      6
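The PAPR and the 100 % CF columns of Table 2.13 are linked by the usual conversion between a power ratio in dB and an amplitude ratio. A short sketch, with a hypothetical function name, shows the relation for the tabulated PAPR values:

```python
def crest_factor(papr_db):
    """Amplitude crest factor that corresponds to a PAPR given in dB."""
    return 10.0 ** (papr_db / 20.0)


# PAPR values per BW in MHz from Table 2.13
PAPR_DB = {1.4: 7.84, 3: 8.45, 5: 8.50, 10: 8.60, 15: 8.66, 20: 8.78}

for bw_mhz, papr_db in PAPR_DB.items():
    print(f"{bw_mhz} MHz: CF = {crest_factor(papr_db):.2f}")
```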


Figure 2.10 contains the simulation results for the signal with 20 MHz BW. Under power clipping, the upper and lower channels behave identically. Both channels degrade in the same manner and violate the specifications at the same clipping level for all leakage ratios. It can be seen that in this simulation the E-UTRA ACLR and UTRA ACLR requirements are violated if the signal clips at more than 6 dB below Pmax. The in-band signal is more robust against clipping and does not violate the EVM requirements even at a clipping of 8 dB.

Figure 2.10: Simulation of E-UTRA ACLR and UTRA ACLR 1/2 OOB emissions and EVM. (Panels: (a) E-UTRA ACLR OOB emission, (b) UTRA ACLR 1 OOB emission and (c) UTRA ACLR 2 OOB emission, each in dBc for the lower and upper channel, and (d) EVM in %; all plotted over the power reduction from 0 to 8 dB.)

2.3.9 Resolution

For the later design implementation, the resolution of the DAC is one key consideration. This resolution has an impact on the design of the decoder as well as on the transistors in the UC. For a sinusoidal signal and a resolution of N bits, the signal to noise ratio (SNR) in dB is given as

$$\mathrm{SNR} = 6.02\,N + 1.76. \tag{2.13}$$

For any other signal, the SNR can be calculated using the CF, an additional factor and the N bit resolution. In this case the SNR can be calculated as

$$\mathrm{SNR} = 6.02\,N + 4.77 - 20\log\left(\frac{A_{max}}{A_{rms}}\right). \tag{2.14}$$

Considering Table 2.13 and 2 dB power clipping, an average CF of 2 can be assumed for an LTE signal. To achieve the required ACLR of 36 dBc, a minimum resolution of 7 bits is needed, as can be calculated by

$$\frac{36 - 4.77 + 6}{6.02} \approx 7 \le N. \tag{2.15}$$

Equation 2.14 can be extended to equation 2.16, where an oversampling factor, calculated from the sampling frequency fsample and the BW, is considered [9]. This oversampling factor can reduce the total number of bits if the SNR is the limiting factor.

$$\mathrm{SNR} = 6.02\,N + 4.77 - 20\log\left(\frac{A_{max}}{A_{rms}}\right) + 10\log\left(\frac{f_{sample}}{2\,BW}\right) \tag{2.16}$$
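Equations 2.14 to 2.16 can be solved for N and rounded up to the next integer. The following sketch does this under the values used above; the function name and the sampling frequency in the second call are hypothetical and serve only to illustrate the oversampling term.

```python
import math


def min_bits_for_snr(snr_db, crest_factor, f_sample_hz=None, bw_hz=None):
    """Smallest integer N that satisfies equation 2.14 or, with oversampling, 2.16."""
    oversampling_gain_db = 0.0
    if f_sample_hz is not None and bw_hz is not None:
        oversampling_gain_db = 10.0 * math.log10(f_sample_hz / (2.0 * bw_hz))
    n_bits = (snr_db - 4.77 + 20.0 * math.log10(crest_factor) - oversampling_gain_db) / 6.02
    return math.ceil(n_bits)


print(min_bits_for_snr(36.0, 2.0))                # equation 2.15
print(min_bits_for_snr(36.0, 2.0, 160e6, 20e6))   # assumed 160 MHz sampling, 20 MHz BW
```

The first call reproduces the 7 bits of equation 2.15; the second shows how a fourfold oversampling ratio lowers the requirement by about one bit.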

Figure 2.11 shows the needed resolution considering the required dynamic range (DR) of Pout. For a maximum output power Pmax of 32 dBm and a minimum output power Pmin of -40 dBm, at least 12 bits are needed if an equidistant voltage step size for each UC is assumed. Table 2.14 shows the required bits calculated according to the SNR and Pout. It can be seen that, according to these calculations, the minimum required power is the limiting factor.

Figure 2.11: Simulation of required resolution for minimum Pout. (Axes: output power from -60 to 20 dBm over the code in unit cells from 10^0 to 10^5, for resolutions of 1 to 15 bits.)

Table 2.14: Required Design Resolution

Limitation  Bits
SNR         7
Power       12
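The power-limited entry of Table 2.14 can be estimated from the dynamic range alone. The sketch below assumes equidistant voltage steps and approximates the full scale as 2^N unit steps, so that each bit contributes about 6.02 dB; the function name is hypothetical.

```python
import math


def min_bits_for_dynamic_range(p_max_dbm, p_min_dbm):
    """Bits needed so that one voltage step lies at or below p_min_dbm."""
    dynamic_range_db = p_max_dbm - p_min_dbm
    db_per_bit = 20.0 * math.log10(2.0)  # voltage doubles with every bit
    return math.ceil(dynamic_range_db / db_per_bit)


print(min_bits_for_dynamic_range(32.0, -40.0))
```

For the 72 dB between 32 dBm and -40 dBm this yields the 12 bits listed in the table.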

Because of additional RX band noise requirements and the reduction of the effective number of bits due to DPD, the number of bits should nevertheless be increased [10].


Bibliography

[1] LTE evolved universal terrestrial radio access (E-UTRA) user equipment (UE) radio transmission and reception, (3GPP TS 36.101 version 11.6.0 Release 11), International Technology Roadmap for Semiconductors (ITRS) Std., 2013.

[2] P. Reynaert and M. Steyaert, RF power amplifiers for mobile communications, ser. Analog Circuits and Signal Processing. Springer Netherlands, 2006.

[3] EVM calculation for broadband modulated signals, Electromagnetics Division, National Institute of Standards and Technology, 2004.

[4] Application note, Rohde & Schwarz, Measurement of adjacent channel leakage power on 3GPP W-CDMA signals with the FSP.

[5] E. Dahlman, S. Parkvall, and J. Skold, 4G: LTE/LTE-Advanced for mobile broadband. Elsevier Science, 2011.

[6] C. Geßner, Long-Term Evolution: A concise introduction to LTE and its measurement requirements. Rohde & Schwarz, 2011.

[7] N. Soltani, “Comparison of single-carrier FDMA vs. OFDMA as 3GPP Long-Term Evolution uplink,” in EE359 Project, 2009.

[8] D. Guel and J. Palicot, “Analysis and comparison of clipping techniques for OFDM peak-to-average power ratio reduction,” in 2009 16th International Conference on Digital Signal Processing, pp. 1–6, July 2009.

[9] W. Kester and Analog Devices, Inc., Data conversion handbook, ser. Analog Devices series. Elsevier, 2005.


[10] C. Presti, F. Carrara, A. Scuderi, P. Asbeck, and G. Palmisano, “A 25 dBm digitally modulated CMOS power amplifier for WCDMA/EDGE/OFDM with adaptive digital predistortion and efficient power control,” IEEE Journal of Solid-State Circuits, vol. 44, no. 7, pp. 1883–1896, July 2009.

Chapter 3 Linear Power Amplifier

CMOS technology is well known for its outstanding characteristics in digital circuits. Nevertheless, the first investigations to evaluate the potential of metal oxide semiconductor (MOS) technologies in analog circuits were already made a few decades ago [1]. For high integration and full SoC solutions, both the analog and the digital part must be implemented on the same chip. In recent years, it was shown that a PA can be built in standard CMOS technology nodes of 90 nm, 65 nm, 45 nm and 28 nm [2–5]. Furthermore, it was shown that, despite their low breakdown voltages, CMOS devices can reach the high output voltages required for cellular devices by using stacked designs [6]. To overcome the limitation of the upper transistor and to fully exploit the voltage range of the bottom transistors, a feedback path can be implemented as a PA self-biasing solution. This increases the reliability by allowing larger signal swings before hot carrier degradation is encountered [7].

This chapter shows that a standard 28 nm CMOS technology is able to generate watt-level Pout and additionally fulfills the 3GPP linearity requirements for the latest 4G LTE standard [8]. Measurements were taken for different kinds of predistortion to gain more information about the designed circuit. The chapter is divided into theoretical, simulation and measurement sections. The first sections focus on the fundamentals of linear PA design and the mathematical calculations. Afterward, the circuit design is described on transistor level and direct current (DC) simulation results of the triple stack are presented to show its basic behavior. The on-chip design implementation is explained on silicon level. The setup used for the later measurements is explained. I-V characterizations and DC breakdown measurements illustrate the basic behavior and limits of the design. For further characterization, single tone and LTE measurements are presented. The chapter concludes by comparing the design with other CMOS class-AB PAs in different technology nodes.

3.1 Fundamentals

The term linear PA refers to class-A, -B, -AB and -C PAs. Figure 3.1 shows the basic structure of a linear PA. The bias current Idc of the transistor flows from the supply through the inductance L to ground. This defines the bias setting of the transistor and thereby its characteristics. The modulated RF signal is generated at the drain of the transistor. It is blocked by the inductance and flows through the capacitance C to the output. The classes differ in the bias point at which the PA operates and thus in the shape of the output signal.

Figure 3.1: Schematic of a simplified linear PA. (Elements: supply Vdd, bias current Idc, inductance L, capacitance C, input Vin, load Ropt.)

Depending on the bias point, the linearity and the efficiency of a PA change [9, 10]. The change of the bias point can be expressed by the conduction angle ϕ, and the output power Pout can be expressed as

$$P_{out} = \frac{1}{2}\,(V_{dd} - V_{d,sat})\,\frac{I_{max}}{2\pi}\,(\phi - \sin(\phi)). \tag{3.1}$$


Vdd is the supply voltage, Vd,sat the knee voltage and Imax the maximum current at the saturation point. The drain efficiency ηd also depends on these parameters and can be calculated as

$$\eta_d = \frac{V_{dd} - V_{d,sat}}{V_{dd}}\;\frac{\phi - \sin(\phi)}{4\left(\sin\left(\frac{\phi}{2}\right) - \frac{\phi}{2}\cos\left(\frac{\phi}{2}\right)\right)}. \tag{3.2}$$

The bias current Idc is an essential indication of the operating mode and thus of the class definition of the PA. It also depends on the conduction angle ϕ as well as on Imax and can be calculated as

$$I_{dc} = \frac{I_{max}}{2\pi}\;\frac{2\sin\left(\frac{\phi}{2}\right) - \phi\cos\left(\frac{\phi}{2}\right)}{1 - \cos\left(\frac{\phi}{2}\right)}. \tag{3.3}$$

Another important factor in the design of a linear PA is the harmonics that are produced. Since they do not contain any signal information and are unwanted in the spectrum because they can interfere with other signals, they can be considered a measure of nonlinearity. The n-th harmonic can be calculated as

$$I_n = \frac{1}{\pi}\int_{-\phi/2}^{\phi/2} \frac{I_{max}}{1 - \cos\left(\frac{\phi}{2}\right)}\left(\cos(\alpha) - \cos\left(\frac{\phi}{2}\right)\right)\cos(n\alpha)\,d\alpha. \tag{3.4}$$

In Table 3.1 the theoretical efficiencies that can be achieved by class-A to class-C PAs are shown together with the corresponding conduction angles. A class-A PA has its bias point in the middle of the signal swing so that no distortions occur due to clipping of the signal. It is therefore very linear and ideally produces no harmonics. Its theoretically achievable efficiency of 50 %, on the other hand, is very low compared to the other classes. The class-B PA shifts its bias point to zero and so only transfers signals above this bias point. It can achieve up to 78.5 %. The most efficient linear PA is the class-C PA with a theoretical efficiency of 100 %. This higher efficiency results from the fact that a class-B or class-C design has no quiescent current.

Table 3.1: Linear class-A to class-C Comparison

Mode  Bias Point  Quiescent Current  Conduction Angle  Maximum Efficiency
A     0.5         0.5                2π                50 %
AB    0-0.5       0-0.5              π - 2π            50-78.5 %
B     0           0                  π                 78.5 %
C     <0          0                  0 - π             100 %

Figure 3.2 depicts the aforementioned theoretically achievable output power and efficiency of a linear PA over the conduction angle ϕ. For the normalized values it can be seen that the delivered output power reaches its maximum at a conduction angle of 2π. The efficiency, on the other hand, is lowest at this point. If the conduction angle ϕ is reduced, the transferred portion of the sinusoidal signal is also reduced and the output power decreases. Since no bias current flows and the conduction angle decreases, the efficiency eventually increases. When the efficiency reaches 100 %, no power is delivered anymore.
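The dependence of the efficiency on the conduction angle can be evaluated numerically from equation 3.2. The sketch below is a direct transcription of that equation; the function name is hypothetical, and the default arguments assume an ideal device with zero knee voltage.

```python
import math


def drain_efficiency(phi, v_dd=1.0, v_dsat=0.0):
    """Equation 3.2; phi is the conduction angle in rad with 0 < phi <= 2*pi."""
    half = phi / 2.0
    shape = (phi - math.sin(phi)) / (4.0 * (math.sin(half) - half * math.cos(half)))
    return (v_dd - v_dsat) / v_dd * shape


for name, phi in (("class A", 2.0 * math.pi), ("class B", math.pi), ("deep class C", 0.1)):
    print(f"{name}: {100.0 * drain_efficiency(phi):.1f} %")
```

For an ideal device the class-A and class-B values evaluate to 1/2 and π/4, and the efficiency approaches 100 % as the conduction angle goes to zero.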

Figure 3.2: Power and efficiency plot of a linear PA depending on the conduction angle ϕ. (Axes: normalized output power and efficiency from 0 to 1 over the conduction angle from 0 to 2π rad, with class C below π, class B at π and class A at 2π.)

In Figure 3.3 the DC component, the fundamental and the harmonics for class-A to class-C operation are shown over the conduction angle. It can be seen that with a decreasing conduction angle the bias current decreases faster than the fundamental, which results in a higher efficiency. The harmonics of the PA are plotted up to the third order. A class-A PA has no harmonics. Their values increase as the conduction angle is reduced from this region until they reach their maximum in the class-C area, where they then decrease again together with the fundamental. Here the system is most efficient but delivers no more output power.


Figure 3.3: DC, fundamental and n-th order harmonic plot of a linear PA depending on the conduction angle ϕ. (Axes: amplitude over the conduction angle from 0 to 2π rad; curves for DC, the fundamental and the 1st, 2nd and 3rd harmonic.)

In Figure 3.4 the I-V characteristic of an n-channel metal oxide semiconductor (nMOS) transistor is shown. For this plot the Shichman-Hodges transistor equations are used. The width W, the length L and the mobility µ of the transistor are combined into a single factor of 0.1. A threshold voltage Vth of 0.3 V is chosen.
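A curve family of this kind can be generated with the standard Shichman-Hodges (level 1) equations. The sketch below assumes that the factor 0.1 stands for µ·Cox·W/L in A/V² and that channel length modulation is neglected (λ = 0); neither assumption is stated in the text, and the function name is hypothetical.

```python
def shichman_hodges_ids(v_gs, v_ds, k=0.1, v_th=0.3, lam=0.0):
    """Drain current in A; k is assumed to be mu*Cox*W/L in A/V^2."""
    v_ov = v_gs - v_th  # overdrive voltage
    if v_ov <= 0.0:
        return 0.0  # cut-off
    if v_ds < v_ov:
        # linear (triode) region
        return k * (v_ov * v_ds - 0.5 * v_ds ** 2) * (1.0 + lam * v_ds)
    # saturation region
    return 0.5 * k * v_ov ** 2 * (1.0 + lam * v_ds)


for v_gs in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"Vgs = {v_gs} V: Ids(Vds = 5 V) = {shichman_hodges_ids(v_gs, 5.0):.4f} A")
```

With these assumptions the saturation current for Vgs = 3.0 V stays below 0.4 A.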

Figure 3.4: I-V-characteristics of an nMOS transistor. (Axes: Ids from 0 to 0.4 A over Vds from 0 to 5 V; curves for Vgs = 0.5 V to 3.0 V in steps of 0.5 V and the boundary Vgs − Vth.)


3.2 Design Considerations

Figure 3.5 shows the block diagram of a transceiver output stage. In the TX the signal is modulated in baseband, up-sampled and converted from digital to analog. This signal has to be amplified to achieve the required output power Pant at the antenna. Additional losses for filters, duplexer and coupler in the front-end (FE) chain, that attenuate the signal, have to be added.

Figure 3.5: Block diagram of a transceiver output stage. (Blocks in order: TX, PA with output power Pout,PA, FE with output power Pout,ant.)

In 3GPP, the required maximum output power for LTE at the antenna, Pant, is 23 dBm. For an estimated chain loss αchain of 3 dB and a signal PAPR of 6 dB, an additional 9 dB have to be added for the design. This results in a total power of 32 dBm according to

$$P_{out,PA} = P_{ant} + \alpha_{chain} + PAPR = 32\,\mathrm{dBm}. \tag{3.5}$$

This corresponds to a peak-to-peak voltage, at an output load RL of 50 Ω, of

$$V_{pp,PA} = 2\sqrt{2\,P_{out,PA}\,R_L} \approx 25\,\mathrm{V}. \tag{3.6}$$

This voltage is transferred from the PA core to the output by a factor n of approximately 2:3. By using a differential design, it can be divided by another factor of 2. The maximum voltage that the core has to be able to generate can be expressed by

$$V_{max,Core} = \frac{1}{2}\,n\,V_{pp} \approx 9\,\mathrm{V}. \tag{3.7}$$

For a single transistor in 28 nm CMOS technology this voltage exceeds the breakdown voltage. Therefore, the voltage is distributed over three thick oxide transistors. For an equal distribution, the stack can be regulated by the bias voltages of the gates. The drain voltage Vd,n−1 of the lower transistor n−1 depends on the gate setting and the threshold voltage of the next transistor n in the stack and can be calculated as

$$V_{d,n-1} = V_{g,n} - V_{th,n}. \tag{3.8}$$

Since the internal nodes of the triple stack cannot be measured on the chip, it is important to estimate the threshold voltage with

$$V_{th} = V_{ox} + 2\psi_b. \tag{3.9}$$

Vox is the voltage over the gate oxide and ψb is the necessary bulk potential to bring the transistor into strong inversion [11].
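The power and voltage budget of equations 3.5 to 3.7 can be retraced numerically. The sketch below uses hypothetical function names and takes the transfer factor n as exactly 2/3, which is an assumption since the text only gives it approximately.

```python
import math


def dbm_to_watt(p_dbm):
    return 10.0 ** ((p_dbm - 30.0) / 10.0)


def pa_peak_to_peak_voltage(p_out_dbm, r_load=50.0):
    """Equation 3.6: peak-to-peak voltage of a sine wave with the given power."""
    return 2.0 * math.sqrt(2.0 * dbm_to_watt(p_out_dbm) * r_load)


p_out_pa_dbm = 23.0 + 3.0 + 6.0          # equation 3.5
v_pp = pa_peak_to_peak_voltage(p_out_pa_dbm)
v_core = 0.5 * (2.0 / 3.0) * v_pp        # equation 3.7 with n assumed as 2/3
print(f"Pout,PA = {p_out_pa_dbm} dBm, Vpp = {v_pp:.1f} V, Vmax,Core = {v_core:.1f} V")
```

The result for the core voltage lies between 8 and 9 V, in the vicinity of the approximately 9 V quoted above; the exact value depends on the transfer factor.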

3.3 Circuit Design

In Figure 3.6 the transistor level schematic of the linear class-AB PA is shown. To obtain a full sinusoidal output signal, the PA is designed as a differential push-pull amplifier. This stage is the core of the PA and consists of two stacks, each containing three transistors. The triple stack is biased by an on-chip bias network that contains a replica stage to cancel process, voltage and temperature (PVT) effects [12]. This stage transforms a current into a bias voltage and is therefore also robust against voltage drops in the supply path. Furthermore, an electrostatic discharge (ESD) protection circuit was implemented at the supply input pads to protect the chip. The balun and additional matching components Coff are placed off-chip. The input matching network (IMN) consists of a transformer Tin and matching capacitors Cm that connect the RF input of the PA to the pads of the printed circuit board (PCB). A conjugate matching of the reflection coefficient Γgm was done to present a load of Z0 = 50 Ω at the input of the transformer. The output is matched to the load by an output matching network (OMN) consisting of Tout and Cm. The OMN was designed with a power match as described in [10] and additionally converts the differential signal back to a single-ended one. A capacitor Cfb was added from the drain of the top transistors N3 to the gate to ensure dynamic feedback from the drain at the output to the gate [7, 13]. Furthermore, the bulks of the transistors N3 and N2 are connected to the source of N2 to dynamically increase the bulk potential and thus relieve the RF voltage stress at the drain of transistor N3 [3].

Figure 3.6: Schematic of the triple stack PA design for a differential input signal applied at the gates of the bottom transistors, with on-chip IMN, OMN, bias supply and ESD protection circuit, and with off-chip balun and matching capacitors for load matching.


To decouple the high current flow Irf that occurs at 1 W output power, a capacitor Cdec was placed between the center tap of the transformer and the ground supply. Together with the transformer inductance LT,out, it provides an on-chip RF short loop for the current at the frequency f, which can be calculated as

$$f = \frac{1}{2\pi\sqrt{(L_{T,out}/2)\,C_{dec}}}. \tag{3.10}$$

Figure 3.7 depicts the equivalent transistor level schematic of the triple stack for DC characterizations on the PCB. Due to the test board interconnections, the gate and drain of the upper transistor N3 are DC-shorted, so that N3 can be considered a diode-connected metal oxide semiconductor field effect transistor (MOSFET). The bias voltage Vg1 for the bottom transistor N1 can either be directly connected to a supply or be provided by the on-chip replica stage. The gate voltage Vg2 for the transistor N2 is also provided by the bias stage.

Figure 3.7: Triple stack of the class-AB PA with shorted gate and drain of the top transistor and two voltage inputs for the gate of the bottom transistor. (Nodes: Vdd, Vd3/Vg3 at N3, Vd2/Vg2 at N2, Vd1/Vg1 at N1, with Vg1 supplied by either Vreplica or Vpcb.)

Since the diode-connected transistor is always in saturation with Vds = Vgs > Vgs − Vth, the triple stack has a switch-on voltage at Vth that can be expressed by the Shichman-Hodges transistor equations [14] as

$$V_{diode,N3} = V_{th} + \sqrt{\frac{I_{ds}}{\frac{1}{2}\,\mu_n C_{ox}\,\frac{W}{L}\,(1 + \lambda V_{th})}}. \tag{3.11}$$

The transistor N2 works in common-gate and transistor N1 in common-source mode. The I-V characteristic of this triple stack is a mixture of differently behaving transistors. The drain current of transistors N1 and N2 follows the equations of the linear and saturation regions. The supply settings for the triple stack can be set at the gates of N2 and N1. For N1 two supply connections are possible. One is a direct connection to the PCB and is therefore regulated by a supply generator. The other is an on-chip bias generation that consists of a replica stage.

3.4 DC Simulation Results

It is not possible to directly measure the voltages inside the PA core. To show the voltages at each node of the triple stack, a DC sweep simulation is performed. In Figure 3.8 the voltages of the stack nodes are plotted over the supply voltage Vdd,PA of the PA. Since the drain and gate of the top transistor are shorted, its gate-source and drain-source voltage drops remain the same. The bias voltage Vg2 of the middle transistor is also static, so the drain voltage Vd1 of the bottom transistor remains the same as well. Vds and Vgd of N2 are the only voltages that increase. This is therefore the reason for the breakdown of the transistor stack, as shown in the following DC breakdown measurement.

Figure 3.8: Simulation of DC voltages of the triple stack with directly connected gates. (Axes: DC voltage from 0 to 6 V over Vdd,PA from 3 to 6 V; curves for Vd3/Vg3, Vd2/Vs3, Vg2, Vs2, Vg1 and Vss.)

3.5 Silicon Implementation

In Figure 3.9 the bumped bare die is presented. The whole die has an area of 1.88 mm × 1.4 mm, while the active part, consisting of the on-chip matching and biasing circuits, only has an area of 1.05 mm × 0.51 mm. The compact design needs a very good ground plane to decrease the ohmic resistance at the grounded source of the bottom transistor. This has two effects. Firstly, it reduces the voltage that drops over this resistance. Secondly, the good metal connection acts as a heat sink that improves the thermal conduction off the chip, thereby reducing performance degradation. The die was soldered directly onto the PCB to be able to investigate further self-heating effects [15].

Figure 3.9: Picture of the 28 nm bumped bare die.

The front-end-of-line (FEOL) layers are implemented in a triple well technology [16]. Triple well designs isolate the p-well from the substrate, making it possible to bias the bulk of the transistor at a desired voltage. Another benefit of the isolation is the reduction of noise that can be generated by substrate currents. The transistors of the triple stack are implemented as thick oxide transistors. The transistor gates of low voltage CMOS technologies use high-k material to counteract the oxide shrinking that comes with technology scaling [17]. The back-end-of-line (BEOL) has seven copper metal layers and one aluminum layer. The output transformer is realized in metal layers 6 and 7 and aluminum for low insertion losses.

3.6 Measurement Setup

The die was tested on a PCB as shown in Figure 3.10. It is placed in the center of the board. One SubMiniature version A (SMA) connector is placed on the side of the balun for the signal generator and one on the other side to connect a spectrum analyzer. On the right side, terminal strips can be bypassed to configure the supply setting. Several samples have been measured on different boards to verify the accuracy of the presented results over process corners.

Figure 3.10: Test board with SMA connector, balun, matching components and supply connectors.

CMOS PAs have the disadvantage of being non-linear. One method to compensate for this is to use a DPD at the input to predistort the signal. The well-known memory polynomial predistortion [18] is used as DPD for the LTE modulated measurements described later and can be expressed as

$$ z(n) = \sum_{k=1}^{K} \sum_{q=0}^{Q} a_{kq}\, x(n-q)\, |x(n-q)|^{k-1}. \tag{3.12} $$

The input is represented by x(n). It is multiplied by the coefficients akq and results in the output z(n). K is the non-linearity order and Q is the memory length. The predistorter is thus capable of different levels of predistortion, which can give further information about the implemented design, such as memory effects or the order of non-linearity. In Figure 3.11 the block diagram of the measurement setup for the device under test (DUT) is shown. The signal generator modulates the signal at the input of the PCB. The signal is then amplified by the DUT. Afterward, the signal analyzer receives the amplified and distorted signal. The data can then be analyzed and predistorted by the DPD. The order of predistortion is set manually.
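The memory polynomial of Equation (3.12) can be written down directly as a double sum. The following is a minimal sketch, assuming complex baseband samples in a Python list and treating samples before the start of the sequence as zero. The function name `memory_polynomial` and the coefficient values are hypothetical and only serve as illustration; they are not the coefficients used in the measurements.

```python
def memory_polynomial(x, a):
    """Evaluate z(n) = sum_k sum_q a[k-1][q] * x(n-q) * |x(n-q)|**(k-1).

    x: list of complex input samples
    a: K rows (non-linearity order) by Q+1 columns (memory taps)
    """
    K = len(a)
    Q = len(a[0]) - 1
    z = []
    for n in range(len(x)):
        acc = 0j
        for k in range(1, K + 1):
            for q in range(Q + 1):
                if n - q < 0:
                    continue  # assumption: samples before n = 0 are zero
                xd = x[n - q]
                acc += a[k - 1][q] * xd * abs(xd) ** (k - 1)
        z.append(acc)
    return z


# Hypothetical coefficients with K = 3 and Q = 1: linear term plus a
# small third-order term, no memory contribution.
a = [[1.0, 0.0], [0.0, 0.0], [0.1, 0.0]]
x = [0.5 + 0.0j, 1.0j, -1.0 + 0.0j]
z = memory_polynomial(x, a)
```

Setting all columns with q > 0 to zero, as in this example, corresponds to a memoryless polynomial predistorter. Non-zero entries in those columns add the memory cancellation.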

Figure 3.11: Block diagram of the measurement setup for the DUT. (Blocks: signal generator, DUT, signal analyzer and DPD.)

3.7 DC Characterization

In Figure 3.12 the I-V characteristics of the DUT with replica supply at the bottom transistor can be seen. In the replica stage a current that is provided through the PCB is converted into a voltage. The drain current Idd,PA is plotted over the drain voltage Vdd,PA. Additionally, the replica current Ireplica and thus the bias point of the transistor stack was swept to show the DC output characteristics of the PA. The threshold voltage Vth is around 0.4 V. The drop of Idd,PA at the knee voltage is due to a regulation of the replica and a resulting decrease of the bias voltage at the gate of the bottom transistor. No further investigation was done to explain this effect at the output of the replica stage. Increasing Ireplica by 50 µA increases Idd,PA by around 25 mA.

Figure 3.12: I-V measurements of the PA by using a replica stage to bias the bottom transistor. (Plot: supply current [A] over supply voltage [V] for Ireplica from 150 µA to 550 µA in steps of 50 µA.)

In Figure 3.13 the output characteristics for the directly connected PA stack are shown. It can be seen that by increasing the gate voltage of the bottom transistor by 50 mV the saturated output current increases proportionally. Similar to the previous plot, Vth is around 0.4 V. The impact of bypassing the replica stage can be seen around the knee voltage and in the saturation area. The channel length modulation effects of the transistor stack let Idd,PA increase over Vdd,PA [19].

Figure 3.13: I-V measurements of the PA by directly sweeping Vg1. (Plot: supply current [A] over supply voltage [V] for Vg1 from 450 mV to 650 mV in steps of 50 mV.)

3.8 DC Breakdown Measurements

The bias voltages for the breakdown measurements were set to 600 mV for Vg1 and to 2.4 V for Vg2. The voltages were directly connected to the gates of the triple stack to avoid any influence of the bias stages. Figure 3.14 shows the breakdown voltage of the stack. The device breaks down at 6.2 V, where the supply source runs into its current limit. The device is not destroyed by the breakdown, as the measured characteristics before and after show. Therefore, a gate oxide breakdown can be excluded. Possible reasons for this breakdown are an avalanche breakdown, a punch-through from the source to the drain, or a diode breakdown in the well that leads to a switched-on bipolar transistor [20]. To check the reliability in terms of degradation that might be caused by electron trapping, long-term measurements are necessary. As seen in Figure 3.8, for a sweep over Vdd,PA the voltage only increases over transistor N2. Therefore, with an adaptation of Vg2 the overall breakdown point of the triple stack is higher.

Figure 3.14: Measurement of DC breakdown for the triple stack. (Plot: supply current [A] over supply voltage [V], before and after breakdown.)

3.9 Single Tone Measurements

In Figure 3.15 the output power Pout, PAE and ηd are shown for a pulsed signal at 1.83 GHz and a supply voltage of 3.1 V. The saturated output power Psat is 31.7 dBm. At 6 dB BO, which corresponds to 26 dBm output power, the gain is 15.5 dB and PAE and ηd are both 25 %. The maximum PAE is 35.2 % at 20 dBm Pout. For higher Pout a higher Pin is needed. This difference decreases when the PA is in saturation. Due to a smaller Pdc the PAE increases nevertheless. At the peak value the DC consumption decreases less than the difference of Pout and Pin, which also degrades the PAE. The ηd still increases at this level of Pin until it reaches its maximum of 39.5 % at Psat.
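The relation between the two efficiency figures can be illustrated with the usual definitions ηd = Pout/Pdc and PAE = (Pout − Pin)/Pdc. The sketch below is an illustration only: the function names are hypothetical, and the DC power is not a measured value but is back-calculated from the quoted ηd of 25 % at 26 dBm.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000


def efficiencies(pout_dbm, gain_db, pdc_watt):
    """Return (drain efficiency, PAE) for a given operating point."""
    pout = dbm_to_watt(pout_dbm)
    pin = dbm_to_watt(pout_dbm - gain_db)  # input power follows from the gain
    eta_d = pout / pdc_watt
    pae = (pout - pin) / pdc_watt
    return eta_d, pae


# Operating point at 6 dB BO from the text: 26 dBm output power, 15.5 dB gain.
# Assumption: Pdc is chosen such that eta_d equals the quoted 25 %.
pdc = dbm_to_watt(26.0) / 0.25
eta_d, pae = efficiencies(26.0, 15.5, pdc)
```

With 15.5 dB gain the input power is less than 3 % of the output power, so PAE and ηd differ by less than one percentage point at this operating point.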

Figure 3.15: Pulsed measurements of Pout, PAE and ηd over Pin. (Plot: output power [dBm] and efficiency [%] over input power [dBm].)

In Figure 3.16 the efficiency is shown over output power. The plot shows the typical poor efficiency in the linear region compared to saturation. Since the PA is mostly driven in BO for modulated signals, the maximum efficiency is only a first-order comparison. Another important figure is the efficiency at the operating point, which is roughly at 6 dB BO for LTE. For other modulated signals it is also interesting to see the impact of a small or large PAPR on the performance. In the output power range from 20 to 30 dBm a linear increase of 2.5 %/dB can be observed. The efficiency degradation at higher Pout is assumed to be caused by clipping of the signal, which generates unwanted distortion and leads to a higher Pdc.

Figure 3.16: Pulsed measurements of PAE and ηd over Pout. (Plot: efficiency [%] over output power [dBm].)

3.9.1 AM-AM and AM-PM Measurements

In Figure 3.17 the AM-AM and AM-PM curves of the PA can be seen. The measurements were done at a Vdd of 3.1 V and at 1.83 GHz. The parameter S21 is shown as phase and magnitude over input power. The PA has a linear gain of 15.5 dB up to -5 dBm input power. From this point it decreases by 0.1 dB/dB up to 0 dBm input power. Afterward, it decreases almost linearly by 0.2 dB/dB. The same behavior can be seen for the phase shift, which decreases linearly by 0.7 °/dB.

Figure 3.17: Measurements of the S21-parameter over Pin for AM-AM and AM-PM characterization. (Plot: S21 magnitude [dB] and S21 phase [°] over input power [dBm].)

In Figure 3.18 the AM-AM characteristic over output power is shown for measurements done at the same frequency and supply voltage as before. It can be seen that the gain is now constant until Pout is 27 dBm. The difference can be explained by the pulsed signals that are used in this measurement setup. Pulsed signals do not only have an impact on the maximum output power in saturation but also on the behavior in BO [15]. This reduces the PA performance for modulated signals in efficiency and linearity. Furthermore, it is shown that the PA has a smooth transition of linear gain when the amplification changes from linear class-A mode to the clipping region Vss of the class-AB PA.

Figure 3.18: Pulsed measurement of gain over Pout. (Plot: gain [dB] over output power [dBm].)

3.9.2 Saturated Output Power

In Figure 3.19 the saturated output power is swept over frequency. A continuous wave (CW) signal at different drain voltages was used to measure Psat. It can be seen that Psat increases with the drain voltage. Nevertheless, the frequency behavior of the PA is the same for all measurements and therefore independent of the drain voltage. For all supply voltages, Pmax over frequency is at 1.83 GHz.

Figure 3.19: Measurements of Psat over frequency. (Plot: output power [dBm] over frequency from 1.6 GHz to 2 GHz for Vdd of 2.9 V, 3.0 V and 3.1 V.)

3.10 LTE Measurements

All LTE measurements were done for band 1 at the center frequency of 1.95 GHz. The supply voltage of the tests is 3.1 V. Table 3.2 summarizes the 3GPP requirements for LTE that were focused on. A Pout of 23 dBm at the antenna is required. Depending on the modulation scheme and the resource block allocation, Pout can be reduced by an MPR of 1-2 dB. The in-band linearity requires an EVM of 17.5 % and 12.5 % for QPSK and 16-QAM, respectively. The ACLR is defined as 33 dBc for UTRA and 30 dBc for E-UTRA.

Table 3.2: Selected 3GPP LTE Output Power and Linearity Requirements

Requirement        Value   Unit
Pout at antenna    23      dBm
MPR                1-2     dB
EVM, QPSK          17.5    %
EVM, 16-QAM        12.5    %
ACLR, E-UTRA       30      dBc
ACLR, UTRA         33      dBc
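The limits of Table 3.2 can be collected in a small pass/fail check. This is a sketch under stated assumptions: the function name `check_requirements` and the example values are hypothetical and not measurement results, ACLR is passed as a positive ratio in dBc (the plots show it with a negative sign), and the output power limit is simply taken as 23 dBm minus the allowed MPR.

```python
# Limits as listed in Table 3.2
EVM_LIMIT_PERCENT = {"QPSK": 17.5, "16-QAM": 12.5}
ACLR_LIMIT_DBC = {"E-UTRA": 30.0, "UTRA": 33.0}
POUT_ANTENNA_DBM = 23.0


def check_requirements(modulation, pout_dbm, mpr_db, evm_percent, aclr_dbc):
    """Return a dict of pass/fail flags for one measurement.

    aclr_dbc maps "E-UTRA" and "UTRA" to positive ratios in dBc.
    """
    results = {
        "pout": pout_dbm >= POUT_ANTENNA_DBM - mpr_db,
        "evm": evm_percent <= EVM_LIMIT_PERCENT[modulation],
    }
    for kind, limit in ACLR_LIMIT_DBC.items():
        results["aclr_" + kind] = aclr_dbc[kind] >= limit
    return results


# Hypothetical example values, chosen only to exercise the check
example = check_requirements(
    "QPSK",
    pout_dbm=22.0,
    mpr_db=1.0,
    evm_percent=7.5,
    aclr_dbc={"E-UTRA": 31.0, "UTRA": 32.0},
)
```

In this example the output power, EVM and E-UTRA ACLR pass, while the UTRA ACLR of 32 dBc misses the 33 dBc limit.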

3.10.1 Output Spectrum

Figure 3.20 shows the in-band and out-of-band characteristics for a band 1 LTE signal with 15 MHz bandwidth (LTE-15), a fully allocated 16-QAM OFDM PUSCH test signal at a center frequency of 1.95 GHz. The measurements are taken at a CHP of 25.1 dBm due to the high CF. The impact of DPD usage can be seen in the spectral output. Without DPD the spectral emission requirement is violated, especially in the upper band. Furthermore, the upper and lower band show different ACLR values. With DPD and memory polynomial cancellation the upper and lower band become more similar and the emissions in the upper band are reduced. Note that the resolution bandwidth (RBW) was 30 kHz for the whole measurement range. The required LTE mask was adapted accordingly. The emissions might relax when measuring with 1 MHz RBW as specified.

Figure 3.20: Measured output spectrum with LTE mask of a fully allocated band 1 LTE-15 16-QAM OFDM PUSCH test signal with and without DPD. (Plot: power [dBm] over frequency from 1.93 GHz to 1.97 GHz, DPD off and DPD on with memory.)

3.10.2 LTE-20 Band 1 16-QAM

In Figure 3.21 the ACLR characteristics of a band 1 LTE signal with 20 MHz bandwidth (LTE-20), a fully allocated 16-QAM OFDM PUSCH test signal, are shown over Pout. It can be seen that without DPD the required E-UTRA ACLR limit is already violated at an output power of 19.2 dBm, which is almost 7 dB too low. With a polynomial DPD and memory cancellation a Pout of almost 26 dBm, the targeted maximum output power, can be achieved while still fulfilling the ACLR requirements.

Figure 3.21: ACLR measurement for band 1 LTE-20 16-QAM OFDM PUSCH fully allocated test signal. (Plot: ACLR [dBc] over output power [dBm], DPD off and DPD on with memory.)

In Figure 3.22 the CF of the signal described before is plotted over Pout. While the signal has a constant CF of 8 dB without DPD, with DPD the CF is up to 17 dB at 26 dBm Pout. Since the required Pout was reached at this point, the device was not stressed further to avoid degradation or even gate oxide breakdown.

Figure 3.22: Measurement of CF for band 1 LTE-20 OFDM 16-QAM PUSCH full allocation. (Plot: crest factor [dB] over output power [dBm], DPD off and DPD on with memory.)

3.10.3 LTE-1 to LTE-20 Band 1 QPSK

In Figures 3.23-3.26 the selected 3GPP measurements for the LTE output power, ACLR and EVM are presented. The measurements use a feedback loop to increase the input signal power as much as necessary to reach the required Pout. The measurements were taken for LTE signals with 1.4 MHz bandwidth (LTE-1) up to LTE-20 with a fully allocated band 1 QPSK OFDM PUSCH signal.

Output Power

In Figure 3.23 Pout is shown. For a fully allocated QPSK modulated signal an MPR of 1 dB is allowed. The subsequent measurements are all done at this output power level. It can be seen that without DPD the output power is met in almost all cases. With DPD, Pout has to be reduced so as not to overstress the bottom transistor, as described before. Compared to the measurements without DPD, these measurements were taken at 0.2 dB less output power.

Figure 3.23: Pout measurement for band 1 LTE-1 to LTE-20 QPSK OFDM PUSCH with full allocation. (Plot: output power [dBm] over bandwidth LTE-1.4 to LTE-20, DPD off and DPD on with memory.)

EVM

In Figure 3.24 the EVM of the QPSK modulated signals is shown over the LTE BWs. The in-band requirements are achieved with and without the use of DPD. For the LTE-20 signal there is still a margin of 10 % to the limit of 17.5 %. In this case a crest factor reduction (CFR) algorithm can be included to increase Pout at the cost of EVM. This measure improves the overall performance of the design [21].

Figure 3.24: EVM measurement for band 1 LTE-1 to LTE-20 QPSK OFDM PUSCH with full allocation. (Plot: error vector magnitude [%] over bandwidth LTE-1.4 to LTE-20, DPD off and DPD on with memory.)

ACLR UTRA & E-UTRA

In Figures 3.25 and 3.26 the ACLR values for E-UTRA and UTRA are shown. It can be seen that without DPD the upper and lower spectrum differ by up to 10 dB. It is also visible that without DPD the requirements cannot be fulfilled. With DPD the upper and lower spectrum become more equal and the required ACLR values are met up to an LTE signal with 10 MHz bandwidth (LTE-10). If memory cancellation is added, the ACLR values are met up to LTE-20. It can be seen that the non-linearity and the memory effects strongly depend on the BW.

Figure 3.25: Measurement of E-UTRA ACLR for band 1 LTE-1 to LTE-20 QPSK OFDM PUSCH with full allocation. (Plot: E-UTRA ACLR [dBc] over bandwidth LTE-1.4 to LTE-20 for DPD off, DPD on, and DPD on with memory, each for n and p.)

Figure 3.26: Measurement of UTRA ACLR for band 1 LTE-1 to LTE-20 QPSK OFDM PUSCH with full allocation. (Plot: UTRA ACLR [dBc] over bandwidth LTE-1.4 to LTE-20 for DPD off, DPD on, and DPD on with memory, each for n and p.)

Drain Efficiency

The ηd can be seen in Figure 3.27. Regardless of the BW and the usage of DPD, ηd is constant at around 22 %. Therefore, linearity is gained with DPD without losing efficiency inside the tested system.

Figure 3.27: Measurement of ηd for band 1 LTE-1 to LTE-20 QPSK OFDM PUSCH with full allocation. (Plot: drain efficiency [%] over bandwidth LTE-1.4 to LTE-20, DPD off and DPD on with memory.)

3.11 Comparison

Table 3.3 compares this design with other published PAs. This work is implemented in the smallest technology node, 28 nm, and is one of the designs with the highest saturated output power. The design with the same saturated output power achieves it with the same supply voltage. This work needs only half of the area, but operates at a lower frequency and uses a triple stack. The PAE is in the middle of what was presented: it is 7 % lower than the highest but 8.5 % higher than the lowest. The gain is comparable to other one-stage designs but is only approximately half of the gain of the designs with a predriver. All designs have the IMN and OMN on-chip.

Table 3.3: Comparison of CMOS Class-AB PA. Rows: Technology Node [nm], Area [mm²], Frequency [GHz], Psat [dBm], PAE [%], Gain [dB], Supply [V], Stages, Stack, Matching, Standard.

this work, 2015: 1 3 28 3.2 15.5 ON 31.7 1.85 35.2 LTE 2.63
EuMC [27], 2014: 1 3 15 5.4 2.0 180 ON n.a. 42.2 28.6
PAWR [26], 2012: 5 2 2 31 37 90 3.3 2.5 4G WCDMA ON 31.8
MTT [25], 2012: 2 2 2 28 90 0.9 ON 3.33 LTE 25.8 29.4
ISSCC [24], 2011: 2 2 2 32 28 1.8 ON n.a. 31.9 19.5 2.75
RFIC [23], 2006 and MTT [22], 2006: 1 3 19 35 23 3.3 2.4 180 ON n.a. 2 2 3.3 ON 26.7 WLAN n.a. 2.52 5 26.5 25.5 180

Bibliography

[1] D. Hodges, P. Gray, and R. Brodersen, “Potential of MOS technologies for analog integrated circuits,” in Solid State Circuits Conference, 1977. ESSCIRC ’77. 3rd European, pp. 43–47, Sept 1977.

[2] D. Chowdhury, C. Hull, O. Degani, Y. Wang, and A. Niknejad, “A fully integrated dual-mode highly linear 2.4 GHz CMOS power amplifier for 4G WiMax applications,” Solid-State Circuits, IEEE Journal of, vol. 44, no. 12, pp. 3393–3402, Dec 2009.

[3] S. Leuschner, S. Pinarello, U. Hodel, J.-E. Mueller, and H. Klar, “A 31 dBm, high ruggedness power amplifier in 65 nm standard CMOS with high efficiency stacked-cascode stages,” in Radio Frequency Integrated Circuits Symposium (RFIC), 2010 IEEE, pp. 395–398, May 2010.

[4] I. Sarkas, A. Balteanu, E. Dacquay, A. Tomkins, and S. Voinigescu, “A 45 nm SOI CMOS class-D mm-wave PA with >10 Vpp differential swing,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, pp. 88–90, Feb 2012.

[5] J. Fuhrmann, P. Ossmann, K. Dufrene, H. Pretl, and R. Weigel, “A 28 nm standard CMOS watt-level power amplifier for LTE applications,” in Power Amplifiers for Wireless and Radio Applications (PAWR), 2015 IEEE Topical Conference on, pp. 1–3, Jan 2015.

[6] A. K. Ezzeddine, H. C. Huang, R. S. Howell, H. C. Nathanson, and N. G. Paraskevopoulos, “CMOS PA for wireless applications,” IEEE Radio and Wireless Symposium, p. 2, 2007.

[7] T. Sowlati and D. Leenaerts, “A 2.4 GHz 0.18 um CMOS self-biased cascode power amplifier,” Solid-State Circuits, IEEE Journal of, vol. 38, no. 8, pp. 1318–1324, Aug 2003.

[8] LTE evolved universal terrestrial radio access (E-UTRA) user equipment (UE) radio transmission and reception, (3GPP TS 36.101 version 11.6.0 Release 11), International Technology Roadmap for Semiconductors (ITRS) Std., 2013.

[9] P. Reynaert and M. Steyaert, RF power amplifiers for mobile communications, ser. Analog Circuits and Signal Processing. Springer, 2006.

[10] S. Cripps, RF power amplifiers for wireless communications, ser. Artech House microwave library. Artech House, 2006.

[11] B. Baliga, Fundamentals of power semiconductor devices. Springer US, 2010.

[12] P. Ossmann, J. Fuhrmann, J. Moreira, H. Pretl, and A. Springer, “A circuit technique to compensate PVT variations in a 28 nm CMOS cascode power amplifier,” in Microwave Conference (GeMiC), 2015 German, pp. 131–134, March 2015.

[13] D. Chowdhury, C. Hull, O. Degani, P. Goyal, Y. Wang, and A. Niknejad, “A single-chip highly linear 2.4 GHz 30 dBm power amplifier in 90 nm CMOS,” in Solid-State Circuits Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International, pp. 378–379,379a, Feb 2009.

[14] C. Kok and W. Tam, CMOS voltage references: An analytical and practical perspective. Wiley, 2012.

[15] P. Ossmann, J. Fuhrmann, J. Moreira, H. Pretl, and A. Springer, “A measurement method to mitigate temperature effects in nanometer CMOS RF power amplifiers,” in Microelectronics (Austrochip), 22nd Austrian Workshop on, pp. 1–5, Oct 2014.

[16] R. Baker, CMOS: Circuit design, layout, and simulation, ser. IEEE Press Series on Microelectronic Systems. Wiley, 2011.

[17] H. Huff and D. Gilmer, High Dielectric Constant Materials: VLSI MOSFET Applications, ser. Springer Series in Advanced Microelectronics. Springer Berlin Heidelberg, 2006.

[18] L. Ding, G. Zhou, D. Morgan, Z. Ma, J. Kenney, J. Kim, and C. Giardina, “Memory polynomial predistorter based on the indirect learning architecture,” in Global Telecommunications Conference, 2002. GLOBECOM ’02. IEEE, vol. 1, pp. 967–971 vol.1, Nov 2002.

[19] W. Chen, The electrical engineering handbook. Elsevier Science, 2004.

[20] I. Aoki, S. Kee, R. Magoon, R. Aparicio, F. Bohn, J. Zachan, G. Hatcher, D. McClymont, and A. Hajimiri, “A fully integrated quad-band GSM/GPRS CMOS power amplifier,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 12, pp. 2747–2758, Dec 2008.

[21] O. Degani, F. Cossoy, S. Shahaf, D. Chowdhury, C. Hull, C. Emanuel, and R. Shmuel, “A 90 nm CMOS power amplifier for 802.16e (WiMAX) applications,” in Radio Frequency Integrated Circuits Symposium, 2009. RFIC 2009. IEEE, pp. 373–376, June 2009.

[22] H. Solar, R. Berenguer, I. Adin, U. Alvarado, and I. Cendoya, “A fully integrated 26.5 dBm CMOS power amplifier for IEEE 802.11a WLAN standard with on-chip power inductors,” in Microwave Symposium Digest, 2006. IEEE MTT-S International, pp. 1875–1878, June 2006.

[23] H.-S. Oh, C.-S. Kim, H. Yu, and C. Kim, “A fully integrated +23 dBm CMOS triple cascode linear power amplifier with inner-parallel power control scheme,” in Radio Frequency Integrated Circuits (RFIC) Symposium, 2006 IEEE, pp. 4 pp.–, June 2006.

[24] Y. Tan, H. Xu, M. El-Tanani, S. Taylor, and H. Lakdawala, “A flip-chip packaged 1.8 V 28 dBm class-AB power amplifier with shielded concentric transformers in 32 nm SoC CMOS,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, pp. 426–428, Feb 2011.

[25] B. Francois and P. Reynaert, “A fully integrated watt-level linear 900-MHz CMOS RF power amplifier for LTE applications,” Microwave Theory and Techniques, IEEE Transactions on, vol. 60, no. 6, pp. 1878–1885, June 2012.

[26] O. Degani, A. Israel, F. Cossoy, V. Volokotin, K. Levy, E. Schwartz, E. Papir, M. Arbiv, I. Refaeli, D. Gidony, and S. Rivel, “A 90 nm CMOS PA module for 4G applications with embedded PVT gain compensation circuit,” in Power Amplifiers for Wireless and Radio Applications (PAWR), 2012 IEEE Topical Conference on, pp. 25–28, Jan 2012.

[27] K. Terajima, K. Fujii, T. Sonoda, T. Takagi, E. Nakayama, S. Kameda, N. Suematsu, and K. Tsubouchi, “A 2.0 GHz CMOS triple cascode push-pull power amplifier with second harmonic injection for linearity enhancement,” in European Microwave Conference (EuMC), 2014 44th, pp. 1265–1268, Oct 2014.

Chapter 4 Digital Power Amplifier

Figure 4.1 shows the block diagram of a TX chain with DAC and PA. The digital signal is processed in the DFE. It is then up-sampled to the desired output frequency and converted from digital to analog. The analog signal is then amplified by a PA. For further integration the design can be made more compact by merging the DAC and PA. This results in a DAC that is capable of delivering the output power and the spectral performance that are necessary to fulfill the 3GPP requirements for LTE.

Figure 4.1: Block diagram of a TX chain. (Blocks: DSP with I and Q paths, DAC and PA.)

This chapter discusses the design of the DPA and its on-chip implementation. Several theoretical error sources that might lead to distortions are approached. The design is further analyzed using transient and harmonic balance simulations. 3GPP requirements are tested using on-chip generated LTE test signals.

4.1 DPA Design

The SDPA is a promising approach for a highly efficient and compact design because it uses highly efficient UCs. Figure 4.2 shows the block diagram of a SDPA. The amplitude is modulated in the DFE and controls the UCs of the DPA. The phase of the signal is modulated onto the LO, which is directly connected to the gate of the UC. All UCs are connected at the output to sum up the currents. At the output a matching network terminates the design and transforms the 50 Ω load to the output of the PA stage.

Figure 4.2: Block diagram of a SDPA with DFE and OMN. (Blocks: DFE, decoder, matching and Rload; the DFE provides the amplitude $A = \sqrt{I(t)^2 + Q(t)^2}$ and the phase $\phi = \tan^{-1}(Q(t)/I(t))$.)
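The conversion from I/Q samples to the amplitude and phase used by the polar architecture can be sketched as follows. The function names, the full-scale value and the 10 bit amplitude code are assumptions for illustration. `math.atan2` is used instead of a plain arctangent of Q/I so that the phase is resolved in all four quadrants and I = 0 does not cause a division by zero.

```python
import math


def iq_to_polar(i, q):
    """Return amplitude A = sqrt(I^2 + Q^2) and phase phi of one I/Q sample."""
    amplitude = math.hypot(i, q)
    phase = math.atan2(q, i)  # four-quadrant version of arctan(Q / I)
    return amplitude, phase


def amplitude_to_code(amplitude, full_scale, n_bits):
    """Map an amplitude to an unsigned integer code (hypothetical scaling)."""
    max_code = 2 ** n_bits - 1
    code = round(amplitude / full_scale * max_code)
    return max(0, min(max_code, code))  # clip to the valid code range


amplitude, phase = iq_to_polar(3.0, 4.0)
code = amplitude_to_code(amplitude, full_scale=5.0, n_bits=10)
```

The amplitude code would drive the decoder, while the phase would be applied to the LO.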

In Figure 4.3 the block diagram of the CSDPA is shown. The decoder connects the amplitude that is generated in the DFE with the cell field. Between the decoder and the cell field an interface is placed that synchronizes the single-ended clocked data of the decoder with the LO of the analog domain. The LO is distributed to 32 drivers. This reduces the load that has to be driven by a single buffer. Each buffer drives the 32 UCs of one row. To avoid timing mismatches between the buffers, special care has to be taken in the layout. The parasitic resistance and capacitance have to be small enough to make the timing difference negligible compared to the clock period. Each UC can be individually enabled or disabled by the decoder through the signal EN and a logic block inside the UC.

Figure 4.3: Block diagram of the bottom part of the CSDPA. (Labels: decoder, LO, driver, logic, cell field, UC, CS and EN.)

Figure 4.4 shows the enable logic inside each UC. Depending on the binary to thermometer decoding, the AND gate in each UC can transmit or block the LO. The bottom transistor is placed directly after the AND gate. The positive upper path P is changed immediately with the low phase of the LO. The differential path N is changed half an LO cycle later to ensure that the signal changes at the time when the negative LO is low. The transmission gate (TG) ensures this half-cycle delay.

Figure 4.4: Block diagram of the LO signal enabling inside each UC. (Labels: EN, paths P and N, and TG.)
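The enable scheme can be illustrated with a small model of a 32 × 32 cell field. This is a sketch under assumptions: the row-by-row fill order and the function names are hypothetical, since the exact decoder mapping is not specified here. It only shows that a thermometer decoding enables exactly as many UCs as the amplitude code demands, and that each UC gates the LO with its EN bit.

```python
ROWS, COLS = 32, 32  # 32 drivers, each driving 32 UCs


def enable_matrix(code):
    """Return a ROWS x COLS matrix of EN bits with exactly `code` cells set.

    Assumption: cells are filled row by row (thermometer style).
    """
    if not 0 <= code <= ROWS * COLS:
        raise ValueError("code out of range")
    full_rows, remainder = divmod(code, COLS)
    matrix = []
    for r in range(ROWS):
        if r < full_rows:
            row = [1] * COLS
        elif r == full_rows:
            row = [1] * remainder + [0] * (COLS - remainder)
        else:
            row = [0] * COLS
        matrix.append(row)
    return matrix


def cell_output(lo, en):
    """AND gate inside a UC: the LO is passed only if EN is set."""
    return lo & en


en = enable_matrix(100)
enabled_cells = sum(sum(row) for row in en)
```

Because the mapping is monotonic, increasing the code by one switches on exactly one additional cell and never switches a cell off.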

4.1.1 Inverse Class-D

In Figure 4.5 the schematic of an inverse class-D PA is shown. It is referred to as inverse or current mode class-D because the voltage and current behavior is interchanged. A class-D amplifier switches a rectangular voltage that generates a sinusoidal current. For the inverse class-D this behavior is reversed [1]. The voltages Vin and V̄in control the bottom transistors that switch the currents is1 and is2.

Figure 4.5: Schematic of an inverse class-D PA. (Elements: Vdd, two chokes Lchoke, resonant tank with L and C, Ropt, drain voltages v1 and v2, switch currents is1 and is2, inputs Vin and V̄in.)

The rectangular transistor current of an inverse class-D PA was derived as a Fourier series that depends on the resistance Ron and the drain-source voltage v1 of the transistor:

$$ i_1 = \frac{v_1}{R_{on}} \left[ 0.5 + \frac{2}{\pi} \sum_{k\ \mathrm{odd}} \frac{\sin(k\phi)}{k} \right]. \tag{4.1} $$
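The bracketed term of Equation (4.1) is the Fourier series of a 0/1 square wave, which can be checked numerically with a truncated sum. The sketch below uses hypothetical function names and an arbitrary truncation at k = 199; it is meant as a plausibility check, not as part of the derivation.

```python
import math


def square_wave_partial_sum(phi, k_max):
    """Truncated series 0.5 + (2/pi) * sum over odd k of sin(k*phi)/k."""
    total = 0.5
    for k in range(1, k_max + 1, 2):  # odd harmonics only
        total += (2 / math.pi) * math.sin(k * phi) / k
    return total


def transistor_current(v1, r_on, phi, k_max=199):
    """Current i1 according to Equation (4.1) with a truncated series."""
    return v1 / r_on * square_wave_partial_sum(phi, k_max)


on_level = square_wave_partial_sum(math.pi / 2, 199)       # middle of the on phase
off_level = square_wave_partial_sum(3 * math.pi / 2, 199)  # middle of the off phase
```

In the middle of the on phase the sum approaches 1 and in the middle of the off phase it approaches 0, so the current switches between v1/Ron and zero.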

The resulting voltage at the load Rload is expressed as the difference of the two drain-source voltages with the phase ϕ and the amplitudes p and q as

$$ v_2 - v_1 = p \cos(\phi) + q \sin(\phi). \tag{4.2} $$

Considering these equations and the resonant tank, a more accurate voltage-current relation can be derived [2].

In Figure 4.6 the basic current-voltage relation of the inverse class-D amplifier is shown. In contrast to a class-D design, a rectangular current instead of a voltage is switched. It is generated by the transistors, which work as switches, and eventually results in a half-sinusoidal voltage waveform. By connecting and switching several cells at the same time, the currents can be summed up and the voltage at the output increases proportionally.

Figure 4.6: I-V characteristics of an inverse class-D PA. (Plot: normalized voltage V and current I over normalized switching time from T/2 to 3T/2.)

4.1.2 Stacked Inverse Class-D

In Figure 4.7 the schematic of the implemented CSDPA is shown. The basic inverse class-D concept is realized as a stack of three transistors to distribute the voltage stress of the design, as discussed in a previous section. The bottom transistor is a thin oxide transistor and the two stack transistors are thick oxide transistors. Thin oxide transistors show a better RF performance than thick oxide transistors, but thick oxide transistors are more robust against higher voltages. The transistors are biased by the two bias voltages Vbias,1 and Vbias,2. The bottom transistors are separated into UCs that are controlled by the decoder. The period of the rectangular current wave is set by the LO signal at the gate of each bottom transistor. The transistors are connected at the drain and determine the overall width and thus the current of the DPA. The amplitude modulation is done by controlling this current flow. The resonant tank that generates the fundamental waveform consists of a capacitance C and an inductance L that is divided into L/2 for the differential design. One benefit of designing an inverse class-D PA is that the resonant tank can be integrated into the output transformer [1]. This transformer also converts the differential signal into a single-ended one. The capacitance C and the primary winding of the transformer are implemented on chip. The secondary winding is implemented in the package. This saves metal area on chip that can be used for the primary winding and thus improves its quality factor. The primary and secondary windings are coupled by the factor k.

Figure 4.7: Schematic of a stacked inverse class-D PA with transformer. (Elements: Vdd, primary winding L/2 + L/2 with capacitance C, coupling factor k, secondary winding L, Rload, bias voltages Vbias,1 and Vbias,2, inputs LOp and LOn.)

4.2 Theoretical Error Sources

While converting and amplifying a signal, errors can occur that might degrade the quality of the signal. Some errors already occur in theory, e.g. the quantization error. Others are physically based errors that can distort the signal due to a parasitic capacitance or inductance. This section presents a selection of error sources that have to be considered in the design or layout.

4.2.1 Quantization Error

One limitation in the output spectrum is the quantization noise that a DAC passes on to the following stages and that limits the SNR. For a full-scale sinusoidal wave and a bit resolution N this results in the well-known equation

$$ \mathrm{SNR\,[dB]} = 6.02N + 1.76. \tag{4.3} $$

Other well-known errors of a DAC are offset, DNL and INL. These errors are caused by the mismatch of different cells during the manufacturing process.
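Equation (4.3) can be checked with a short simulation that quantizes a full-scale sine wave and compares signal and error power. This is a minimal sketch with hypothetical function names; the uniform rounding quantizer, the number of samples and the choice of a frequency that is incommensurate with the sample rate are assumptions of the example.

```python
import math


def ideal_snr_db(n_bits):
    """SNR of an ideal N bit quantizer for a full-scale sine, Equation (4.3)."""
    return 6.02 * n_bits + 1.76


def simulated_snr_db(n_bits, n_samples=100_000):
    """Quantize a full-scale sine wave and measure the resulting SNR."""
    step = 2.0 / 2 ** n_bits            # quantization step for the range [-1, 1]
    ratio = (math.sqrt(5) - 1) / 2      # irrational frequency ratio, avoids repeating samples
    signal_power = 0.0
    noise_power = 0.0
    for k in range(n_samples):
        x = math.sin(2 * math.pi * ratio * k)
        xq = step * math.floor(x / step + 0.5)  # round to the nearest level
        signal_power += x * x
        noise_power += (x - xq) ** 2
    return 10 * math.log10(signal_power / noise_power)


snr_ideal = ideal_snr_db(8)
snr_sim = simulated_snr_db(8)
```

For moderate resolutions the simulated value is expected to lie close to the ideal one, since the quantization error then behaves approximately like uniformly distributed noise.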

4.2.2 Amplitude Mismatch

The amplitude of the output signal depends on the amplitude of the current that is generated in each cell. For an inverse class-D amplifier the transistors are always driven in saturation. The saturation drain current of a short channel transistor is given as

$$ I_{d,sat} = \frac{1}{2} \mu C_{ox} \frac{W}{L} (V_{gs} - V_{th})^2 \left(1 + \lambda (V_{ds} - V_{ds,sat})\right). \tag{4.4} $$

It can be seen that for high output power the variation of Vds and the channel length modulation λ have an impact on the saturated output current. Besides this variable change of the output current, a mismatch of FEOL parameters results in an unequal summation of the UC currents. Process variation has already been investigated several times. Equation 4.4 shows the impact of process variations, which result in a variation of the width W, length L, gate oxide capacitance Cox and threshold voltage Vth, on the drain current [3]. Furthermore, it was shown that different process steps result in different error types [4] and can strongly influence the performance of the DPA. The control of the cell field, which switches the different UCs on and off, is an additional aspect that has to be considered. Different methods have already been suggested to increase the yield of analog to digital converters [5]. For one-dimensional designs the Typ_A and Typ_B methods were presented. For 2-D designs there is an additional dimension that has to be considered. The basic idea of these designs is to eliminate or reduce a given error that is caused by process steps or layout mismatch. The summed-up current results in a total current of

$$ I_N = N \bar{I} + \bar{I} \sum_{n=1}^{N} \epsilon_n, \tag{4.5} $$

where Ī is the average current that an ideally matched UC would deliver, ϵn is the positive or negative relative error by which the n-th UC differs from the ideal value and IN is the resulting output current.
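Equation (4.5) can be illustrated with a small Monte Carlo sketch. The Gaussian distribution of the relative errors, the standard deviation of 1 % and the unit current of 1 mA are assumptions made for the example and are not taken from the design; the function names are hypothetical.

```python
import random


def total_current(i_avg, errors):
    """Total current according to Equation (4.5): N*I + I*sum(eps_n)."""
    n = len(errors)
    return n * i_avg + i_avg * sum(errors)


def simulate(n_cells, i_avg, sigma, seed=0):
    """Draw relative errors and compare the direct sum with Equation (4.5)."""
    rng = random.Random(seed)
    errors = [rng.gauss(0.0, sigma) for _ in range(n_cells)]  # assumed Gaussian mismatch
    cell_currents = [i_avg * (1 + e) for e in errors]
    return sum(cell_currents), total_current(i_avg, errors)


direct, formula = simulate(n_cells=1024, i_avg=1e-3, sigma=0.01)
```

Both results agree up to floating-point rounding. Since the errors have random signs, the deviation of the total current from N·Ī grows more slowly than the number of cells.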

4.2.3 Driving Stage Mismatch

Figure 4.8 shows the equivalent block diagram of the input stage of the matrix cell field. At the beginning of each row an inverter drives the whole line. The wire resistance and the gate capacitances form an RC ladder. Because the gates see different impedances, the voltages V1 and V2 have different rise and fall times, which results in a delay between the switching times.

Figure 4.8: Equivalent block diagram of the input stage (elements: V0, R1, Cin, V1, R2, Cin, V2).
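The different delays at V1 and V2 can be estimated with the Elmore approximation for an RC ladder. This is a sketch only: the Elmore model is a substituted first-order technique that the text does not name, and the resistance and capacitance values are hypothetical, not taken from the design.

```python
def elmore_delays(resistances, capacitances):
    """Elmore delay at every node of an RC ladder.

    resistances[i] is the series resistance in front of node i and
    capacitances[i] is the capacitance from node i to ground.
    """
    delays = []
    accumulated = 0.0
    for i, r in enumerate(resistances):
        downstream_c = sum(capacitances[i:])  # all capacitance charged through r
        accumulated += r * downstream_c
        delays.append(accumulated)
    return delays


# Hypothetical values: 10 ohm wire segments and 1 fF gate capacitances.
R = [10.0, 10.0]
C = [1e-15, 1e-15]
tau_v1, tau_v2 = elmore_delays(R, C)
print(f"tau(V1) = {tau_v1 * 1e15:.0f} fs, tau(V2) = {tau_v2 * 1e15:.0f} fs")
```

The node further down the line always sees the larger delay, which is the switching-time skew described above.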

4.2.4 Output Combining Mismatch

Figure 4.9 presents the equivalent block diagram of the output. In contrast to the input stage, it is now assumed that the UCs act as current switches, so the currents are added and not the voltages. Each of these UCs has to drive a different output impedance Z. The driving cells are controlled by the decoder, which thereby switches the combination of UCs. At the output of the inverter stages a series resonance circuit is placed to generate the wanted output sinusoid. The currents are combined according to Kirchhoff's law.

Figure 4.9: Equivalent block diagram of the output stage (UC currents I1 and I1 + I2, line impedances ZT1 and ZT2, RS, Cin, Csub, resonant tank with LM and CM, load RL).

4.2.5 Timing Mismatch

In the previous sections, sources of timing delays δt in a switching matrix were shown. A timing mismatch on chip results in a superposition of different rectangular signals, as can be seen in Figure 4.10.

Figure 4.10: Overlap of two timing delayed rectangular signals (normalized magnitude over time, second signal delayed by δt).

The resulting rectangular function is the sum of two rectangular functions. Due to the time delay a two-step function instead of a one-step function is generated, which can be expressed as

\[ f_{rect}(t) = \begin{cases} 0, & \text{for } \frac{2n}{2f} + \delta t < |t| < \frac{2n+1}{2f} \\[4pt] 1, & \text{for } \frac{2n+1}{2f} \le |t| < \frac{2n+1}{2f} + \delta t \ \text{ and } \ \frac{2(n+1)}{2f} < |t| \le \frac{2(n+1)}{2f} + \delta t \\[4pt] 2, & \text{for } \frac{2n+1}{2f} + \delta t \le |t| \le \frac{2(n+1)}{2f} - \delta t. \end{cases} \tag{4.6} \]


Each rectangular waveform can be represented by a Fourier series. Since the waveform is periodic and odd, only the sinusoidal terms remain:

\[ f(x) = \frac{4}{\pi} \sum_{n=1,3,\dots} \frac{1}{n} \sin(nx). \tag{4.7} \]

It can be seen that the superposition results in a reduced amplitude as well as in a phase shift of the overall signal. Every sum of sinusoidal signals of the same frequency can be expressed as

\[ \begin{aligned} &\sin(\omega t + \varphi_1) + \sin(\omega t + \varphi_2) + \dots + \sin(\omega t + \varphi_n) \\ &\quad= \sin(\omega t)\bigl(\cos\varphi_1 + \cos\varphi_2 + \dots + \cos\varphi_n\bigr) + \cos(\omega t)\bigl(\sin\varphi_1 + \sin\varphi_2 + \dots + \sin\varphi_n\bigr) \\ &\quad= A\sin(\omega t) + B\cos(\omega t) \\ &\quad= \sqrt{A^2 + B^2}\,\sin\!\left(\omega t + \tan^{-1}\frac{B}{A}\right). \end{aligned} \tag{4.8} \]
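Equation (4.8) can be checked numerically with a few lines of code. The following is a minimal sketch that assumes unit-amplitude signals of equal frequency. The function name and the 10° example delay are hypothetical.

```python
import math


def combine_phasors(phases_rad):
    """Amplitude and phase of sum_i sin(wt + phi_i), following Equation (4.8)."""
    a = sum(math.cos(p) for p in phases_rad)  # coefficient A of sin(wt)
    b = sum(math.sin(p) for p in phases_rad)  # coefficient B of cos(wt)
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a)  # atan2 keeps the correct quadrant, unlike atan(B/A)
    return amplitude, phase


# Two unit-amplitude signals, the second one shifted by a hypothetical 10 degrees.
amp, phase = combine_phasors([0.0, math.radians(10.0)])
print(f"amplitude = {amp:.4f} (ideal: 2), phase = {math.degrees(phase):.2f} deg")
```

For two signals the combined amplitude is 2·cos(Δφ/2) and the phase is Δφ/2, so a timing skew reduces the amplitude and shifts the phase at the same time.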

4.3 AM-AM and AM-PM Distortion

Any nonlinearity discussed before will eventually result in AM-AM or AM-PM distortion. Figure 4.11 shows a model of the bottom transistor with a parasitic series resistance Rpar. The parasitic capacitance Cpar represents the metal connection and disabled transistors. The upper transistor is biased with Vbias,1 and has to be chosen wide enough to provide sufficient current to the bottom transistor.

Figure 4.11: Model of the bottom transistor with series resistance.

Figure 4.12 shows the distortion of the rectangular current waveform in the stack because of increased transistor width. The bottom transistor is swept between a nominal value and 1/20 of it. It can be seen that with an increasing number of conducting transistors the amplitude is more distorted and the phase is shifted [6].

Figure 4.12: Simulation of AM-AM and AM-PM distortion due to stack resistance (current [A] over one period for T = 1, 15/20, 10/20, 5/20 and 1/20).

4.4 Matrix Controlling

An essential question for a DPA or any other matrix-based design is which control scheme should be used. The simplest way is to switch the whole matrix on and off. In this case synchronous timing has to be achieved to avoid distortions. For a dynamic modulation of the cell field, the matching of the different cells, and therefore their contribution, can vary in addition to the timing. Especially in advanced technology nodes, mismatch can affect the overall performance [7]. For a current summing DAC (CSDAC) the current provided by a single UC can be expressed as

\[ I_n = \bar{I}\,(1 + \epsilon_n), \tag{4.9} \]

where Ī is the mean value of all current sources and ϵn is the error of UC n, which produces the current In. With this, the INL and DNL of the DPA can be calculated. The switching scheme defines the sequence in which the UCs are switched on or off.
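How DNL and INL follow from Equation (4.9) can be sketched as below. The sketch assumes that one LSB equals the mean cell current and that the INL is referenced to the ideal line through zero. The error values are hypothetical.

```python
from itertools import accumulate


def dnl_inl(errors_in_switch_order):
    """DNL and INL in LSB for relative unit-cell errors eps_n (Equation 4.9).

    errors_in_switch_order[k] is eps of the cell that is switched on at code k + 1.
    With one LSB equal to the mean cell current, step k deviates from 1 LSB by
    exactly eps_n, and the INL is the running sum of these deviations.
    """
    dnl = list(errors_in_switch_order)
    inl = list(accumulate(dnl))
    return dnl, inl


# Hypothetical four-cell example with zero-mean errors.
eps = [0.02, -0.01, 0.01, -0.02]
dnl, inl = dnl_inl(eps)
print("DNL:", dnl)
print("INL:", [round(v, 3) for v in inl])
```

Reordering the entries of `eps` leaves the set of DNL values unchanged but changes the running sum, which is why the switching sequence acts on the INL.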

4.4.1 1-D Switching Schemes

The sequential scheme shown in Figure 4.13 simply switches on one cell after another. This means that any kind of mismatch error that occurs on chip is summed up cell by cell [5, 8].

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Figure 4.13: 1-dimensional sequential scheme.

The conventional symmetrical switching scheme, on the other hand, compensates a linear error. Figure 4.14 depicts this scheme, which starts to sum up the UCs in the middle of the row and continues sequentially to the sides.

15 13 11 9 7 5 3 1 2 4 6 8 10 12 14 16
Figure 4.14: 1-dimensional conventional symmetrical scheme.

The hierarchical symmetrical scheme, shown in Figure 4.15, can compensate linear and symmetrical errors. It first starts on one side and then jumps to the other side. To compensate a linear or symmetrical error it always needs four iterations.

14 10 6 2 1 5 9 13 15 11 7 3 4 8 12 16
Figure 4.15: 1-dimensional hierarchical symmetrical scheme Type A.

The switching scheme shown before is called Type A and the one illustrated in Figure 4.16 is called Type B. Like Type A, it compensates linear and symmetrical errors. In contrast to Type A, this switching scheme jumps around the middle. Linear and symmetrical errors around the middle are therefore compensated with every second step.

15 11 7 3 1 5 9 13 14 10 6 2 4 8 12 16
Figure 4.16: 1-dimensional hierarchical symmetrical scheme Type B.
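The effect of the four 1-D schemes on a linear gradient can be reproduced with a short script. The sequences are read from Figures 4.13 to 4.16. The zero-mean gradient with a step of one arbitrary unit per cell and the helper name are assumptions made for illustration.

```python
from itertools import accumulate

# Switching order per cell position (left to right), read from Figures 4.13 to 4.16.
SCHEMES = {
    "sequential": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
    "conventional symmetrical": [15, 13, 11, 9, 7, 5, 3, 1, 2, 4, 6, 8, 10, 12, 14, 16],
    "hierarchical Type A": [14, 10, 6, 2, 1, 5, 9, 13, 15, 11, 7, 3, 4, 8, 12, 16],
    "hierarchical Type B": [15, 11, 7, 3, 1, 5, 9, 13, 14, 10, 6, 2, 4, 8, 12, 16],
}


def max_abs_inl(order_at_position):
    """Largest |INL| for a zero-mean linear gradient with unit step per cell."""
    n = len(order_at_position)
    # Error of the cell at position p is p - (n - 1) / 2, keyed by switching order.
    error = {order: pos - (n - 1) / 2 for pos, order in enumerate(order_at_position)}
    running = accumulate(error[k] for k in range(1, n + 1))
    return max(abs(v) for v in running)


for name, order in SCHEMES.items():
    print(f"{name:26s} max |INL| = {max_abs_inl(order)} gradient steps")
```

With this error model the symmetrical schemes return to zero accumulated error after every second or fourth cell, whereas the sequential scheme accumulates the error up to mid code.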


4.4.2 2-D Switching Schemes

For a 2-dimensional gradient the 1-dimensional switching schemes can be reused and extended to a 2-dimensional structure by simply copying the row and stacking it multiple times. The random column (RC) scheme is shown in Figure 4.17. As in one dimension, this switching scheme is especially suited for linear and symmetrical gradients that are parallel to the switching scheme. Since the DNL ideally depends only on the matching and not on the switching, the INL is the quantity that can be influenced.

5 1 3 7 8 4 2 6

13 9 11 15 16 12 10 14

21 17 19 23 24 20 18 22

29 25 27 31 32 28 26 30

Figure 4.17: Switching diagram of RC.
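The direction dependence of the RC scheme can be illustrated with the 4 × 8 example of Figure 4.17. The sketch below uses two hypothetical zero-mean linear gradients with unit step, one changing from column to column and one changing from row to row.

```python
from itertools import accumulate

# Switching order of the 4 x 8 example in Figure 4.17.
RC_ORDER = [
    [5, 1, 3, 7, 8, 4, 2, 6],
    [13, 9, 11, 15, 16, 12, 10, 14],
    [21, 17, 19, 23, 24, 20, 18, 22],
    [29, 25, 27, 31, 32, 28, 26, 30],
]
ROWS, COLS = len(RC_ORDER), len(RC_ORDER[0])


def column_gradient(row, col):
    """Hypothetical error that changes linearly from column to column."""
    return col - (COLS - 1) / 2


def row_gradient(row, col):
    """Hypothetical error that changes linearly from row to row."""
    return row - (ROWS - 1) / 2


def max_abs_inl(order, error_of):
    """Largest |INL| when the cells are enabled in ascending switching order."""
    cells = sorted(
        (order[r][c], error_of(r, c)) for r in range(ROWS) for c in range(COLS)
    )
    return max(abs(v) for v in accumulate(err for _, err in cells))


print("gradient across columns:", max_abs_inl(RC_ORDER, column_gradient))
print("gradient across rows:   ", max_abs_inl(RC_ORDER, row_gradient))
```

In this small example the column gradient is cancelled after every second cell, while the row gradient accumulates until half of the rows are switched on.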

Figure 4.18 shows the scheme of the random walk (RW) [9]. The idea of this scheme is to cancel local as well as global mismatches. Each local block consists of a 4 × 4 matrix of UCs. The local sequence jumps around its center in such a way that it best cancels any kind of error.

Figure 4.18: Switching diagram of RW (global block labels by row: P B F K, D N H M, L J A O, G C R E; local cell numbers in reading order: 12 10 6 2 8 1 14 4 5 15 0 9 3 7 11 13).


For the global scheme a similar pseudo-random switching sequence is used. The switching scheme starts by activating the first cell of a local block. This cell is then sequentially switched on in all other 4 × 4 blocks. When this cell is activated in every block, the next cell is activated in the first block. This sequence is repeated until all 16 UCs in every block are active. Due to process variations different mismatch errors occur [10, 11]. The distribution can have different causes and is therefore unpredictable [3, 12]. Figure 4.19 shows four different kinds of gradients over a matrix that represent different on-chip matching errors that might occur.
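The nesting of the global and local sequences described above can be sketched as follows. The block labels and the local order used here are placeholders; the actual orders are those of Figure 4.18, and the function name is hypothetical.

```python
def random_walk_sequence(global_order, local_order):
    """Activation sequence as (block, cell) pairs for a random walk style scheme.

    global_order lists the local blocks in the order in which they are visited,
    local_order lists the cell positions inside a block in activation order.
    The same cell is enabled in every block before the next cell is started.
    """
    return [(block, cell) for cell in local_order for block in global_order]


# Placeholder orders for 16 blocks of 16 cells each.
blocks = list("ABCDEFGHJKLMNOPR")  # 16 block labels, visited alphabetically
cells = list(range(16))            # placeholder local order 0..15

sequence = random_walk_sequence(blocks, cells)
print(sequence[:3], "...", len(sequence), "activations")
```

Because consecutive activations always land in different blocks, a global gradient is sampled across the whole matrix before any local position is repeated.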

Figure 4.19: Diagram of different gradients that represent the UCs difference across the matrix (magnitude over column and row): (a) linear gradient, (b) rotated linear gradient, (c) symmetrical gradient, (d) decentralized peak gradient.


Linear, quadratic and joint gradients can be used to model the error [8]. Figure 4.19a shows a linear gradient. In Figure 4.19b the linear gradient is rotated so that the peak is placed in the corner. Figure 4.19c has a symmetrical gradient with a peak in the center and Figure 4.19d has a decentralized bump. For these different gradients the transfer curve of the DAC deviates from that of an ideal DAC. The INL and DNL can now be calculated and compared. Since the DNL value does not change, as discussed before, only the INL is considered.

Figure 4.20 shows the resulting transfer functions for the RC scheme when it switches the cell fields of Figure 4.19. Figure 4.20a is the result of a cell field with linear gradient. It can be seen that with every second UC that is switched active the DAC is equal to an ideal DAC. This kind of switching scheme is intended for linear or symmetrical gradients that are perpendicular to the switching scheme. If the gradient is turned by 45°, the scheme can no longer compensate the difference, which results in an INL. For a symmetrical gradient the same conclusion as for the linear gradient can be drawn. For a gradient with a decentralized bump the INL is shifted away from half code.

Figure 4.20: DAC transfer functions for RC with different gradients (normalized amplitude over code in unit cells, DAC compared with an ideal DAC): (a) linear gradient, (b) rotated linear gradient, (c) symmetrical gradient, (d) decentralized peak gradient.

Figure 4.21 shows the transfer functions of the DAC for the RW switching scheme. The same gradients are used as before. Compared to the RC scheme, no preference for any gradient can be seen. The transfer functions look similar for all gradients, which makes the transfer function more independent of the gradient and easier to predict. The main characteristic and difference to RC is that the overall INL is distributed into smaller deviations around the ideal transfer function.

Figure 4.21: DAC transfer functions for RW with different gradients (normalized amplitude over code in unit cells, DAC compared with an ideal DAC): (a) linear gradient, (b) rotated linear gradient, (c) symmetrical gradient, (d) decentralized peak gradient.

4.5 Theoretical Error Cancellation

In Figure 4.22 a linear gradient for the cell field is shown. The cells in one column have the same characteristics. The cells in every row have a linear mismatch. This error distribution is used to show the impact of the switching schemes for amplitude and phase distortion.

Figure 4.22: 3-D diagram of a linear gradient error (magnitude over row and column).

4.5.1 Amplitude Variation Error

The disadvantage of this control mode is that a gradient error is only cancelled in one dimension, whereas a linear perpendicular error is cancelled very well. If a fixed error ϵn from one row to the next is assumed for the given sequence, and if first all cells are switched on that are above or below the average, this results in the worst possible INL. For the given cell field and gradient this occurs when half of the UCs are switched on and all of them are either above or below the average. With M as the total number of columns and N as the total number of rows, the maximum INL for a given cell field N × M is

\[ \mathrm{INL}_{max} = \epsilon_{max} = \sum_{n=1}^{N/2}\sum_{m=1}^{M} \epsilon_{n,m} = M\,\frac{\epsilon_a}{2}\left(1 + 2 + \dots + \frac{N}{2}\right) = M\,\frac{\epsilon_a}{2}\,\frac{N+2}{2}\,\frac{N}{4}. \tag{4.10} \]

Assuming the same fixed error ϵn as before, the worst case after one sequence is that all cells above the average are at the maximum distance in a local matrix and all cells below are close to the average value. The maximum INL can be described as

\[ \begin{aligned} \mathrm{INL}_{max} = \epsilon_{max} &= \max\left(\sum_{n=1}^{M \times N} \epsilon_{n,x}\right) \\ &= \frac{\epsilon_a}{2}\left[4\,\frac{M}{4} + 8\,\frac{M}{4} + \dots + \frac{N}{2}\,\frac{M}{4} - \frac{M}{4} - 5\,\frac{M}{4} - \dots - \left(\frac{N}{2} - 3\right)\frac{M}{4}\right] \\ &= M\,\frac{\epsilon_a}{2}\,\frac{3N}{32}. \end{aligned} \tag{4.11} \]

M is the number of columns of global matrices, N the number of rows and x is the local selection of a UC. Considering only a static mismatch, the DNL error is the same for both schemes. For dynamic errors, such as might occur due to voltage drop or short channel effects, the DNL changes cannot be considered equal because they are the sum of static and dynamic errors under different conditions. For a 32 × 32 matrix and a linear gradient that ranges from −0.155 LSB to 0.155 LSB, the resulting INL is displayed in Figure 4.23. It can be seen that for this linear gradient the RW scheme has a higher INL than the RC scheme, which cancels the error after every second cell.


Figure 4.23: INL of RC and RW for a linear gradient amplitude error (INL [LSB] over code in unit cells).

4.5.2 Timing Mismatch Error

For a given linear phase mismatch ϕ of 3.1° from the first cell of each row to the last cell, the amplitude and phase distortion can be calculated. Figure 4.24 shows the amplitude difference that results from the summation order of each control scheme.

Figure 4.24: Amplitude difference of RC and RW due to timing mismatch (normalized amplitude over code in unit cells).

Figure 4.25 plots the phase distortion for both switching schemes. The phases are normalized to the first phase. When all cells are switched on, the overall phase distortion is the same for both schemes.

Figure 4.25: Phase distortion of RC and RW due to timing mismatch (phase [°] over code in unit cells).

4.6 Decoder

For CSDACs different architectures to weight the digital to analog conversion have been discussed. There exist three architectures: the binary, the unary and the segmented architecture, in which both are combined. The advantage of the binary architecture is its simple implementation, because no decoder is needed and every cell only has to be double the size of the previous one. This results in a simple and power saving architecture. The disadvantages are that monotonicity is not ensured and glitches can occur. These disadvantages are covered by the unary architecture. Since the cells are weighted by switching a sequence of unary cells on and off, monotonicity is implicit. For higher bit resolutions, however, the decoding becomes more and more complex. Therefore, the segmented design can be chosen. For the LSBs a simpler and more efficient design can be chosen. For the most significant bits (MSBs) a more complex design might be required to fulfill the linearity requirements [13]. The decoder has to work in the required frequency range of 2.3-2.7 GHz to meet the LTE requirements described in a previous chapter. The basic requirements of the decoder are that it works reliably at these frequencies and guarantees a synchronous output of the LO as well as of the decoded data. Additionally, the decoder has to be able to drive the UCs inside the matrix. This can be achieved by using driver cells after the output ports. The decoder is the last fully digital design in our power amplifier digital to analog converter (PADAC). The whole digital part is realized with low voltage transistors (LVT), which work at a specified supply voltage of 1.1 V.
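A segmented conversion with binary LSBs and thermometer-coded MSBs can be sketched as below. The split into 5 binary and 10 unary bits follows the bit widths used in this chapter. The mapping of the bits onto an integer input code and the function names are assumptions made for illustration and do not describe the implemented logic.

```python
def segmented_decode(code, n_binary=5, n_unary=10):
    """Split an input code into binary LSB flags and a thermometer code.

    Returns (binary_bits, thermometer), where binary_bits[i] has weight 2**i and
    thermometer holds 2**n_unary - 1 flags for equally weighted unary cells.
    """
    if not 0 <= code < 2 ** (n_binary + n_unary):
        raise ValueError("code out of range")
    lsb_part = code & (2 ** n_binary - 1)
    msb_part = code >> n_binary
    binary_bits = [(lsb_part >> i) & 1 for i in range(n_binary)]
    thermometer = [1 if k < msb_part else 0 for k in range(2 ** n_unary - 1)]
    return binary_bits, thermometer


def output_weight(code, n_binary=5, n_unary=10):
    """Analog weight in LSB; each unary cell is assumed to be worth 2**n_binary LSB."""
    bits, thermo = segmented_decode(code, n_binary, n_unary)
    return sum(b << i for i, b in enumerate(bits)) + (2 ** n_binary) * sum(thermo)


bits, thermo = segmented_decode(64)
print("binary bits:", bits, "unary cells on:", sum(thermo))
```

Because a higher code only ever adds unary cells, the MSB segment is monotonic by construction, while the binary segment keeps the decoder small.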

4.6.1 Motivation

The theoretical concept of matrix controlling was described in the previous section. For further on-chip investigations the two different control schemes were implemented in the decoder. The decoder was built using the automated digital design flow. This has the benefit that the layout can be generated automatically and extracted for timing analysis. These timing analyses can be done over PVT and are therefore faster than manual analog characterizations.

4.6.2 Decoder Design

Figure 4.26 shows the implementation of the decoder. The interfaces to the DFE as well as to the analog cell field are synchronized by a digital clock (CLK), which assures timing accuracy over PVT. Inside the logic block the two decoding schemes RC and RW are implemented and can be selected by Dec_Mode_Sel. Dec_ON is used to enable and disable the decoder. The 5 bit binary data (Binary) are also clocked and delivered to the output. The binary 10 bit thermometer data is decoded according to the selected decoding scheme. According to this decoding the UCs can be activated inside the cell field. Row activates the row, Col the column and Cell a block of cells. An entire block of cells can be switched on by All_n. The differential LO and inverse local oscillator (LOX) signals contain the PM. To ensure that the data input into each UC is reliable, a delay of half a clock cycle between clock and data is generated at the output of the decoder. Furthermore, the input data is first synchronized with a register in order to define specific requirements for the previous stage. The internal data are synchronized again at the output to ensure that all data leave the decoder synchronously towards the UCs and to be able to specify a defined delay between the data and the clock.

Figure 4.26: Block level diagram of the decoder (inputs CLK, Dec_ON, Dec_Mode_Sel, Thermo 10b and Binary 5b; input register, logic and output register; outputs All_n, Cell, Col, Row and Bin).

For RW four cores, each consisting of 256 UCs, are needed to obtain the required 1024 UCs. To alleviate the driver requirements for each output port, especially for the columns, the decoder is designed in a T-shape. Therefore, the cell field is divided into two 8 × 8 matrices. The output ports of the decoder and their later position in the layout can be seen in Figure 4.27.

4.6.3 Decoder Layout

To ensure optimal timing and synchronicity the decoder layout is done in the digital design environment. The clock tree inside the tool is generated according to Figure 4.28. It is well known that this structure is optimal for the clock distribution with respect to synchronicity. Ideally every path has the same length and therefore the same resistance. In a symmetrical design all paths also have the same capacitance. This results in an equal time delay τ. It can be seen that this structure is not suitable for minimum routing. In fact, with every additional iteration the total length rises by a factor of √2 [14].

Figure 4.27: Equivalent layout block diagram of the implemented decoder.

Figure 4.28: Block diagram of the clock distribution in the H-Tree structure.

4.7 Cell Field Layout

The layout is one of the most important design steps because it is the step in which ideal mathematical calculations and transistor model simulations are combined with physical parasitics. Each wire mismatch or additional capacitance can produce an additional clock mismatch. Figure 4.29 shows a piece of metal that represents a piece of wire. An important factor in designing a wire for high current is the area of its cross section and the material-specific requirements.

Figure 4.29: Nominal current density in a given metal cross section (metal wire with length, width and height).

For copper, reliability requires that the current density does not exceed 10^6 A/cm^2 or 10^−2 A/µm^2 in the temperature range from −55 to 125 °C [15, 16]. Table 4.1 shows the current density defined for copper.

Table 4.1: Nominal Current Density for Copper

Material    Current Density [A/cm^2]    Current Density [A/µm^2]
Copper      10^6                        10^−2

4.7.1 Layout Consideration

To start the layout a top-down approach is preferred. For a compact design the minimum space between two top layers is chosen as well as the minimum width of an M7 layer and an M7 to aluminum via. These metal layers are then connected to M6 and distributed inside the cell field. The ground plane can be used to shield the signal paths and the LO signal. The FEOL placement is also kept as compact as possible. The AND structures are placed between the logic enable block and the transistors of the P and N path. They combine the LO and the logical input signal and thus control the gates of the transistors. The simulation results show that a peak current of 1.5 A flows in the design. The total width of M7 for each P and N path multiplied by the height of the metal results in a cross section of 250 µm^2. Dividing the peak current by this cross-sectional area results in a current density of 6 · 10^−3 A/µm^2. Comparing this with the permitted 10^−2 A/µm^2 proves that the design is compliant with the requirements.
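The current density check can be written out as a short calculation. The numbers are the ones quoted in this section. Only the variable names are introduced here for illustration.

```python
PEAK_CURRENT_A = 1.5       # simulated peak current
CROSS_SECTION_UM2 = 250.0  # M7 cross section per path in um^2
LIMIT_A_PER_CM2 = 1e6      # nominal limit for copper from Table 4.1

UM2_PER_CM2 = 1e8          # 1 cm = 1e4 um, so 1 cm^2 = 1e8 um^2
limit_a_per_um2 = LIMIT_A_PER_CM2 / UM2_PER_CM2
density = PEAK_CURRENT_A / CROSS_SECTION_UM2
margin = limit_a_per_um2 / density

print(f"limit   = {limit_a_per_um2:.1e} A/um^2")
print(f"density = {density:.1e} A/um^2")
print(f"margin  = {margin:.2f}x")
```

The unit conversion also confirms that 10^6 A/cm^2 and 10^−2 A/µm^2 describe the same limit.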

4.8 DPA Simulation

This section shows the simulation setup and the basic simulations that prove the concept of the DPA. The simulations show the transient voltage behavior at the gate of the bottom transistors when a cell is enabled or disabled. Furthermore, the CW output signal over code is shown. This characterizes the behavior of the design in terms of output power, phase and efficiency.

4.8.1 Simulation Setup

Figure 4.30 shows the simulation setup. The simulations were done using an equivalent transistor model of the decoder that was automatically generated by the digital design flow and a transistor model of the stack. The LO buffers are included as well as the LO distribution to the interfaces. Each column, which consists of 32 UCs, is driven by a buffer in the interface. For the OMN an electromagnetic (EM) simulation was done to extract an S-parameter file that was later used to describe the behavior of the layout. Unless stated otherwise, the simulation model does not use any parasitic extractions or modulations. The simulation is done for a supply voltage of 2.5 V for the analog part at the center tap of the transformer. In the digital domain a supply of 1.1 V was used.

Figure 4.30: Block diagram of the simulation setup (input, stack, output).

4.8.2 Transient Voltage Output

In Figure 4.31 the transient simulation shows the output voltage response to a rectangular binary input signal that jumps between the code word of 5 bit and 10 bit and back. A UC that is enabled and disabled during this period represents the basic behavior of all UCs. When the cell is active, the LO signal is passed and drives the gate of the bottom transistor, which contributes to the amplitude Vout of the output signal. The complementary differential LOX signal path is enabled half an LO period later.

Figure 4.31: Transient simulation of the LO signal at one bottom transistor of the stack and the resulting Vout (LO+, LO− and Vout, voltage [V] over time [ns]).

4.8.3 Output Power, Drain and Overall Efficiency

Figure 4.32 shows Pout over the five binary bits and the transition to the first two thermometer decoded cells. Pout ranges from −49.9 dBm to −20.4 dBm for the first 5 bits, which results in a DR of 29.5 dB. Between code 33 and 63 it rises from −20.0 dBm to −14.4 dBm. It can be seen that Pout increases monotonically over the binary cells as well as over the transitions from the binary cells to the first two thermometer decoded cells. This shows a good matching among the binary cells as well as a good matching at the transition from the binary cells to the thermometer cells.

Figure 4.32: Simulation of Pout over five binary bits and the transitions to the first two UCs (output power [dBm] over code in unit cells).

Figure 4.33 shows Pout, ηd and ηoa over the thermometer cells for both switching schemes RC and RW. Considering only the thermometer cells, Pout of the DPA ranges from −20.2 dBm to 32.4 dBm. Taking the output power of the binary bits, shown in the plot above, into account, this results in a total DR of 82.3 dB. The ηd and ηoa are in the range of 0.01 % to 46.8 % and 0.01 % to 45.5 %, respectively.
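The dynamic range figures quoted above follow directly from the simulated power levels. The helper names in this sketch are hypothetical, and the conversion to watts is added only for orientation.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000


def dynamic_range_db(p_min_dbm, p_max_dbm):
    """Dynamic range as the difference of two power levels in dB."""
    return p_max_dbm - p_min_dbm


P_MIN_BINARY = -49.9  # lowest simulated Pout of the binary cells
P_MAX_BINARY = -20.4  # highest simulated Pout of the binary cells
P_MAX_THERMO = 32.4   # simulated Pout at full thermometer code

print(f"binary DR: {dynamic_range_db(P_MIN_BINARY, P_MAX_BINARY):.1f} dB")
print(f"total DR:  {dynamic_range_db(P_MIN_BINARY, P_MAX_THERMO):.1f} dB")
print(f"peak Pout: {dbm_to_watt(P_MAX_THERMO):.2f} W")
```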


Figure 4.33: Simulation of Pout, ηd and ηoa over thermometer code (output power [dBm] and efficiency [%] for RC and RW over code in unit cells).

Figure 4.34 shows the efficiencies, seen in the figure above, over Pout. The maximum ηd is 46.8 % at full input code. At 3 dB BO the efficiency decreases to 32.5 % and at 6 dB BO to 22.5 %.

Figure 4.34: Simulation of ηd and ηoa over Pout (efficiency [%] for RC and RW over output power [dBm]).

4.8.4 Output Voltage

Figure 4.35 shows Vout over the five binary bits and the first two transitions from binary to thermometer cells at code 32 and 64. For the first 5 bits the voltage ranges from 0.001 V to 0.030 V. From step 32 to 64 the voltage increases from 0.031 V to 0.061 V.

Figure 4.35: Simulation of Vout over code (output voltage [V] over code in unit cells).

Figure 4.36 shows Vout over the 10 bit thermometer code. The maximum output voltage that is achieved in the simulation is 13.2 V. It can be seen that the transfer characteristics are identical for RC and RW.

Figure 4.36: Simulation of Vout over thermometer code (output voltage [V] for RC and RW over code in unit cells).
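The simulated maximum of 13.2 V can be related to the simulated maximum output power of 32.4 dBm. The sketch assumes a sinusoidal output into a 50 Ω load and interprets Vout as the peak amplitude. Both are assumptions, since the load and the voltage definition are not stated for the simulation.

```python
import math


def peak_voltage_from_dbm(p_dbm, r_load=50.0):
    """Peak amplitude of a sine wave that delivers p_dbm into r_load (P = V^2 / (2 R))."""
    p_watt = 10 ** (p_dbm / 10) / 1000
    return math.sqrt(2 * r_load * p_watt)


v_peak = peak_voltage_from_dbm(32.4)
print(f"peak voltage for 32.4 dBm into 50 ohm: {v_peak:.2f} V")
```

Under these assumptions the result lies within a few tens of millivolts of the quoted 13.2 V, so the voltage and power plots are consistent with each other.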

4.8.5 Output Phase

Figure 4.37 shows the phase over the five binary bits. The phase has a maximum deviation of 16.7° at code word 32. This is the transition from binary to thermometer code. Considering only the five LSBs, the maximum difference is 6.1° at code 16.

Figure 4.37: Simulation of the phase over code for the five binary bits (phase [°] over code in unit cells).

In Figure 4.38 the phase of the output signal is simulated and compared for RC and RW. It can be seen that the maximum difference of the output phase of the RW signal is approximately 0.28° higher than the phase generated by RC.

Figure 4.38: Simulation of the output phase difference between RC and RW (phase [°] over code in unit cells).

4.9 Variable LO Load

Figure 4.39 shows the schematic of an AND gate. The two transistors that are controlled by the signal S work as a switch and either block or transmit the LO. The LO signal has to charge and discharge the gate drain capacitance Cgd and the gate source capacitance Cgs of the pMOS and nMOS transistors. If the cell is disabled, some charge is stored in the drain capacitance at the node between the two bottom transistors. According to C = Q/V, the charge of the capacitance Cgs that has to be recharged by the LO changes. This results in a variable row capacitance that depends on the number of enabled cells.

Figure 4.39: Schematic level of the LO driven AND gate.

Figure 4.40 shows the voltage difference that occurs at the output due to the changed load of the LO. The maximum difference is 0.013 V. It can be seen that this difference has no significant impact on the overall output voltage.

Figure 4.40: Simulation of the output voltage difference between RC and RW (voltage [V] over code in unit cells).

Figure 4.41 shows the phase over code for RW and RC. By changing the gate connection of the LO to the bottom nMOS transistor the phase difference can be reduced.

Figure 4.41: Simulation of phase over code for the ten thermometer bits (phase [°] for RC and RW over code in unit cells).

4.10 Silicon Implementation

Figure 4.42 shows the die photo of the inverse class-D CSDPA. The decoder and the bottom transistor of the stack are on the left side. The remaining two transistors connect the input block with the OMN. At the top and the bottom of the core a heat sink block was placed in order to connect the substrate with the package. This provides a good thermal conductance and is intended to reduce performance degradation due to heat development on chip. The DPA measures 610 µm × 500 µm. The BEOL has seven metal layers in copper and one in aluminum. A sense path at the center tap of the transformer allows an accurate voltage setting.

4.11 CW Measurements

In this section different measurements are done to investigate the DPA performance. The design was implemented for the LTE FDD band 7 and the TDD bands 38, 40 and 41. First the core board with the chip as DUT is shown. Next, CW measurements are done at the center frequency of LTE band 40 to characterize the PA performance. Unlike in simulation, no ηoa can be measured since the complete DFE, and therefore the test signal generation, is integrated on chip. The DFE uses the same power supply as the digital part of the DPA, which makes it impossible to measure the power consumption independently. Afterward, LTE measurements are done for band 7 to show the performance for modulated signals.

Figure 4.42: Die photo of the inverse class-D CSDPA.

4.11.1 Measurement Setup

Figure 4.43 shows a picture of the packaged chip on the PCB. The DPA is directly connected to the DFE, so all signal modulations are produced directly on chip. At the output the signal is routed on the PCB to an SMA connector. There it can be connected either to a power meter for CW measurements or to a signal analyzer for LTE measurements. A voltage sense path at the supply bump of the chip can be read out in order to measure the drain voltage without losses directly at the chip bump. For all measurements a supply voltage of 2.5 V is set. RC is used as the default setting.

Figure 4.43: Chip photograph on PCB compared to a cent coin.

Figure 4.44 depicts the measurement setup for the DUT. The test signal is generated directly on chip and then amplified by the integrated DPA. The output of the PCB is connected to a load tuner in order to match the impedance to 50 Ω. The output is then connected to either a power meter or a spectrum analyzer, depending on the measurement. For CW measurements the power meter is used because of its higher accuracy.

Figure 4.44: Block diagram of the measurement setup for the DUT (interface, DUT, load tuner Γ, power meter or spectrum analyzer).


4.11.2 CW Output Power and Drain Efficiency

Figure 4.45 illustrates Pout for the first five binary bits. The output power has a DR of 33.6 dB from −56.7 dBm to −23.1 dBm. It therefore meets the requirement of a minimum Pout of −40 dBm as well as the −50 dBm transmit OFF power. Due to the good matching of the binary cells, Pout increases monotonically.

Figure 4.45: Measurements of Pout for the binary cells at the center frequency of LTE band 40 (output power [dBm] over code in unit cells).

Figure 4.46 presents the drain efficiency ηd over the input code of the five binary bits. It can be seen that in this range the efficiency increases monotonically, but the DPA is very inefficient despite its low power consumption.

Figure 4.46: Measurements of ηd for the binary cells at the center frequency of LTE band 40 (efficiency [%] over code in unit cells).

Figure 4.47 shows the output power and efficiency measurements for RC and RW at the center frequency of LTE band 40. The maximum output power of 31.2 dBm is achieved at full code. At input code 287 the DPA is in 6 dB BO. The maximum ηd is 34.3 % at full code. For both switching schemes the output powers and efficiencies are identical.

[Plot: output power [dBm] and efficiency [%] over code [unit cells]; curves: Pout RC, ηd RC, Pout RW, ηd RW]

Figure 4.47: Measurements of Pout and ηd at the center frequency of LTE band 40.
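For orientation, the DC power and supply current implied by the quoted maximum values can be estimated from the definition ηd = Pout/PDC. The following sketch assumes the nominal 2.5 V supply; the voltage at the bump is lower under load, so the current is only a rough estimate, and the function names are illustrative.

```python
def dbm_to_w(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0

def dc_power_w(p_out_dbm, eta_d_percent):
    """DC power implied by output power and drain efficiency (eta_d = Pout / Pdc)."""
    return dbm_to_w(p_out_dbm) / (eta_d_percent / 100.0)

V_DD = 2.5  # nominal supply in V (assumption: voltage drop at the bump neglected)

p_out = dbm_to_w(31.2)         # maximum CW output power from the text
p_dc = dc_power_w(31.2, 34.3)  # maximum drain efficiency from the text
i_dd = p_dc / V_DD
print(f"Pout = {p_out:.2f} W, Pdc = {p_dc:.2f} W, Idd = {i_dd:.2f} A")
```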

Figure 4.48 displays ηd over Pout for the ten MSBs. The maximum ηd is 34.3 % at 31.0 dBm. At 3 dB BO the efficiency decreases to 24.0 % and at 6 dB BO to 16 %. The efficiency drops further until it gets close to 0 % at 0 dBm output power.

[Plot: efficiency [%] over output power [dBm]; curves: ηd RC, ηd RW]

Figure 4.48: Measurements of ηd over Pout at the center frequency of LTE band 40.

Figure 4.49 shows Pout over the full code. The whole power range of the DPA extends from -53 dBm for one binary cell to a maximum output power of 31 dBm. This results in a total DR of the DPA of 84 dB.

[Plot: output power [dBm] over code [unit cells], logarithmic code axis]

Figure 4.49: Measurements of Pout over full core at the center frequency of LTE band 40.

4.11.3 AM-AM and AM-PM

Figure 4.50 shows Vout up to input code 32 together with an ideal linear voltage.

[Plot: voltage [V] (axis scaled by 10^−2) over code [unit cells]; curves: Vout, Vout Ideal]

Figure 4.50: Measurements of Vout for the binary cells at the center frequency of LTE band 40.

The voltage is in the range between 0.0005 V and 0.0224 V. It can be seen that the voltage rises linearly by approximately 0.8 mV per code step. Figure 4.51 shows the INL of the binary bits for Vout and the transition to the first thermometer bit at input code 32. The maximum INL is 0.64 LSB at the transition from full binary code to the first thermometer cell.

[Plot: INL [binary LSB] over code [unit cells]]

Figure 4.51: Measurement of INL error for binary bits.

Figure 4.52 shows the DNL of the binary cells over input code. The maximum DNL is −0.64 LSB at the transition to the thermometer code. It can be seen that the largest contributions come from the change from the first to the second binary cell and from the binary cells to the thermometer cell. The binary-to-thermometer transition in particular can be further investigated and optimized.

[Plot: DNL [binary LSB] over code [unit cells]]

Figure 4.52: Measurement of DNL error for binary bits.

Figure 4.53 shows the AM-AM of the thermometer bits for RC and RW. The slightly increased voltage for RW, as discussed in a further section, can be seen when half of the core is active. In addition to the measured AM-AM, an ideal AM-AM is shown to illustrate the INL of the DPA.

[Plot: voltage [V] over code [unit cells]; curves: Vout RC, Vout RW, Vout Ideal]

Figure 4.53: Diagram of AM-AM for center frequency of LTE band 40.

Figure 4.54 shows the INL over code of the thermometer bits for RC and RW. Consistent with the previous plot, both schemes show the same characteristic behavior. The maximum INL is 283 LSB for RC and 290 LSB for RW.

[Plot: INL [thermo LSB] over code [unit cells]; curves: INL RC, INL RW]

Figure 4.54: Measurement of INL error for thermometer bits.


Figure 4.55 shows the DNL of the thermometer bits for RC and RW. The maximum DNL is 1.30 LSB at input code 1 for RC and 1.31 LSB at input code 3 for RW. Note that the smaller variations in the plot could be caused by measurement inaccuracy.

[Plot: DNL [thermo LSB] over code [unit cells]; curves: DNL RC, DNL RW]

Figure 4.55: Measurement of DNL error for thermometer bits.
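The INL and DNL curves in this section can in principle be derived from a measured Vout-over-code sweep. The sketch below assumes an end-point fit, i.e. one LSB is the average step between the first and the last code of the sweep. The text does not state which reference line was used for the reported values, so this is one possible definition and not necessarily the one applied here. The example curve is synthetic.

```python
import math

def dnl_inl_lsb(v_out):
    """Return (dnl, inl) in LSB for a measured transfer curve.

    Assumes an end-point fit: 1 LSB is the average step between the
    first and the last code of the sweep.
    """
    n = len(v_out)
    lsb = (v_out[-1] - v_out[0]) / (n - 1)
    # DNL: deviation of each code step from the ideal 1 LSB step
    dnl = [(b - a) / lsb - 1.0 for a, b in zip(v_out, v_out[1:])]
    # INL: deviation of each point from the straight end-point line
    inl = [(v - v_out[0]) / lsb - k for k, v in enumerate(v_out)]
    return dnl, inl

# Synthetic, compressive transfer curve (hypothetical data, not measured)
v_meas = [12.0 * math.tanh(1.5 * c / 1023) for c in range(1024)]
dnl, inl = dnl_inl_lsb(v_meas)
print(f"max |DNL| = {max(abs(d) for d in dnl):.2f} LSB")
print(f"max |INL| = {max(abs(i) for i in inl):.1f} LSB")
```

With this definition the INL at each code equals the running sum of the DNL values up to that code, so a small but systematic DNL can accumulate into a large INL, as is typical for a compressive AM-AM curve.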

Figure 4.56 shows the AM-AM for RC over the entire 15 bits. Compared to the measurements with static binary bits, the non-monotonic behavior of the binary cells distorts the output transfer function. For future use, it might be considered to use the binary cells only at low Vout.

[Plot: voltage [V] over code [unit cells] (code axis scaled by 10^4)]

Figure 4.56: Diagram of AM-AM for center frequency of LTE band 40.


Figure 4.57 shows the INL of the full code for RC. The maximum INL is 9134 LSB. The basic behavior of the plot is similar to that of the plot using only the thermometer-decoded cells, but the transitions in between become more random.

[Plot: INL [LSB] (axis scaled by 10^4) over code [unit cells] (code axis scaled by 10^4)]

Figure 4.57: Measurement of INL error for full code.

Figure 4.58 shows the DNL of the full code for RC. The maximum DNL is 18.51 LSB at the input code 31954 and the minimum is -26.79 LSB at 31743. The tendency, when only the thermometer decoded cells were considered, cannot be seen anymore.

[Plot: DNL [LSB] over code [unit cells] (code axis scaled by 10^4)]

Figure 4.58: Measurement of DNL error for full code.


The AM-PM is shown in Figure 4.59. The maximum phase difference swept over the UCs is 49.5° for the rising edge of RC and 49.9° for RW. From input code 100 on it decreases by 1° per 15 UCs until half code, where the slope is reduced to 1° per 25 UCs. To measure the phase difference, a synchronized rectangular stimulus was generated on chip.

[Plot: phase [°] over code [unit cells]; curves: RE RC, FE RC, RE RW, FE RW]

Figure 4.59: Measurement of phase for RC and RW.

Figure 4.60 shows the supply voltage over code. It can be seen that the supply voltage Vdd drops with increasing input code. This results in a lower supply for the DPA and a decrease of Pout.

[Plot: supply voltage [V] over code [unit cells]; curves: Vdd RC, Vdd RW]

Figure 4.60: Measurements of Vdd for the center frequency of LTE band 40 for RC.


4.11.4 CW Output Power over Frequency

Figure 4.61 displays Pout over frequency. The load tuner is set to an optimal load at 2.35 GHz. The DPA achieves an almost constant Pout of 31.4 dBm over the BW of 100 MHz.

[Plot: output power [dBm] and efficiency [%] over frequency [GHz]; curves: Pout, ηd]

Figure 4.61: Measurement of Pmax and ηd over frequency.

4.11.5 CW Output Power over Supply Voltage

Figure 4.62 shows Pout over Vdd. Vdd was measured at the bump. It can be seen that Pout increases with the supply from 26.16 dBm at 1.5 V to 31.8 dBm at 2.7 V. The drain efficiency ηd reaches its maximum of 36.5 % at 2.1 V. At 2.4 V the efficiency is 36.08 % and Pout is 31.0 dBm.

[Plot: output power [dBm] and efficiency [%] over supply voltage [V]; curves: Pout, ηd]

Figure 4.62: Measurement of Pout and ηd for RC over Vdd.
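As a rough plausibility check, which is not part of the original measurements, the measured increase of Pout between 1.5 V and 2.7 V can be compared with the increase expected if Pout scaled with the square of Vdd into a fixed load. The square-law behavior is an assumption made for this sketch, and the function name is illustrative.

```python
import math

def v2_scaling_db(v_low, v_high):
    """Power increase in dB if Pout scaled with Vdd squared (fixed load)."""
    return 20.0 * math.log10(v_high / v_low)

measured_db = 31.8 - 26.16             # measured increase from 1.5 V to 2.7 V
expected_db = v2_scaling_db(1.5, 2.7)  # ideal square-law increase
print(f"measured: {measured_db:.2f} dB, square law: {expected_db:.2f} dB")
```

The measured increase is roughly 0.5 dB larger than the square-law estimate, so the square law should be read as a coarse approximation for this stage only.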

4.12 Simulations vs. Measurements

Figure 4.63 shows Pout for the simulation and measurement of RC with the binary cells inactive. The minimum measured Pout is -23.0 dBm and Pmax is 31.0 dBm. For the simulation the values are -20.2 dBm and 32.4 dBm, respectively, which results in 8.4 dB difference for a single UC and 1.4 dB at full code. This difference can be explained by the parasitics that are missing in the simulation. Furthermore, the measurements are CW measurements, which degrade the performance of the DPA due to heat development on chip.

[Plot: output power [dBm] over code [unit cells], with an inset zoom for codes 800 to 1,000; curves: Pout Sim. RC, Pout Meas. RC]

Figure 4.63: Difference between simulation and measurement of Pout.

Figure 4.64 shows ηd for the simulation and measurement of RC. At the maximum measured output power, the efficiency differs by 5.34 %. Assuming the same Pout for the measurements and the same constant efficiency difference as at Pmax, the efficiency at 32.4 dBm would be 41.5 %. In BO the difference of ηd decreases to 3.46 % at 29 dBm and 2.53 % at 26 dBm. At 20 dBm the difference is 1 % and decreases further with reduced Pout.

[Plot: efficiency [%] over output power [dBm]; curves: ηd Sim. RC, ηd Meas. RC]

Figure 4.64: Difference between simulation and measurement of ηd.

Figure 4.65 shows Vout for the simulation and measurement of RC. The difference in Vout, especially at full code, is beneficial for the design. The voltage difference at full code is 2 V. This reduces the maximum voltage stress of each transistor and therefore increases the reliability of the whole design.

[Plot: output voltage [V] over code [unit cells]; curves: Vout Sim. RC, Vout Meas. RC]

Figure 4.65: Difference between simulation and measurement of Vout.

Figure 4.66 shows ϕ for the simulation and measurement. The phase difference of the measurements is around 48°. For a simulation without parasitics the maximum deviation is 4.4°. After reaching the maximum, both curves decrease monotonically. The large discrepancy between simulation and measurement requires further investigation. Additionally, the major contributor to the high phase difference of the design has to be found.

[Plot: phase [°] over code [unit cells]; curves: Phase Sim., Phase Meas.]

Figure 4.66: Difference between simulation and measurement of ϕ.

4.13 LTE Measurements

The measurements are done with 2.50 V at the supply, which results in 2.48 V at the bump for an LTE signal with 5 MHz bandwidth (LTE-5) at 26 dBm CHP and 2.47 V for LTE-10. The OFDM QPSK PUSCH signals are generated on chip and analyzed with a spectrum analyzer. The ACLR, EVM and efficiency measurements are taken at the center frequency of the TDD band 40 at 2.35 GHz. The spectral analysis, such as duplex noise (DN), is done for FDD LTE band 7 at 2.535 GHz. Note that no predistortion is used.

4.13.1 ACLR

Figure 4.67 shows the ACLR values for RC and RW over Pout. At a CHP of 26.5 dBm the ACLR for RC is -27.5 dBc and -26.7 dBc for RW. At the required CHP of 26 dBm the ACLR of RC has a difference of 2.0 dB to the required 30 dBc. For RW the difference is 2.2 dB. The ACLR requirements are met at a CHP of 24.5 dBm for RC and 24.0 dBm for RW.

[Plot: ACLR [dBc] over output power [dBm]; curves: ACLR1,l RC, ACLR1,u RC, ACLR1,l RW, ACLR1,u RW, ACLR limit]

Figure 4.67: ACLR measurements of an LTE-5 PUSCH signal over Pout.
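For reference, an ACLR value such as those plotted here is the ratio of the power integrated in an adjacent channel to the power integrated in the wanted channel. The following sketch illustrates this on a sampled power spectral density. The 5 MHz channel offset and the 4.5 MHz integration bandwidth are assumed here as the E-UTRA ACLR settings for LTE-5; the function names and the synthetic spectrum are illustrative and do not reproduce the internal procedure of the spectrum analyzer.

```python
import math

def band_power_mw(freqs_hz, psd_dbm_hz, f_lo, f_hi):
    """Integrate a sampled PSD (dBm/Hz) over [f_lo, f_hi], trapezoidal rule."""
    total = 0.0
    for k in range(len(freqs_hz) - 1):
        f0, f1 = freqs_hz[k], freqs_hz[k + 1]
        if f0 >= f_lo and f1 <= f_hi:  # use only segments fully inside the band
            p0 = 10.0 ** (psd_dbm_hz[k] / 10.0)
            p1 = 10.0 ** (psd_dbm_hz[k + 1] / 10.0)
            total += 0.5 * (p0 + p1) * (f1 - f0)
    return total

def aclr_dbc(freqs_hz, psd_dbm_hz, f_c, offset_hz=5e6, meas_bw_hz=4.5e6):
    """Return (lower, upper) adjacent channel power relative to the channel."""
    half = meas_bw_hz / 2.0
    p_ch = band_power_mw(freqs_hz, psd_dbm_hz, f_c - half, f_c + half)
    result = []
    for sign in (-1.0, 1.0):
        f_adj = f_c + sign * offset_hz
        p_adj = band_power_mw(freqs_hz, psd_dbm_hz, f_adj - half, f_adj + half)
        result.append(10.0 * math.log10(p_adj / p_ch))
    return tuple(result)

# Synthetic spectrum: flat channel, adjacent regions 30 dB lower (hypothetical)
F_C = 2_350_000_000
freqs = [F_C - 10_000_000 + 50_000 * k for k in range(401)]
psd = [-40.0 if abs(f - F_C) <= 2_500_000 else -70.0 for f in freqs]
lower, upper = aclr_dbc(freqs, psd, F_C)
print(f"ACLR lower = {lower:.1f} dBc, upper = {upper:.1f} dBc")
```

In this synthetic case both adjacent channels sit exactly at the 30 dBc requirement. A measured spectrum would give different values for the lower and upper side, as in the curves ACLR1,l and ACLR1,u.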

Figure 4.68 depicts the ACLR values for RC and RW over Pout. At an increased CHP of 26.5 dBm the ACLR is -26.4 dBc for RC. At the required CHP of 26 dBm the ACLR of RC misses the required 30 dBc by 3.1 dB. The required 30 dBc are met at 22.5 dBm. Compared to LTE-5 the ACLR is 5 dB higher. The ACLR requirements are never met for RW. Further investigations have to be done to explain this behavior.

[Plot: ACLR [dBc] over output power [dBm]; curves: ACLR1,l RC, ACLR1,u RC, ACLR1,l RW, ACLR1,u RW, ACLR limit]

Figure 4.68: ACLR measurements of an LTE-10 PUSCH signal over Pout.

4.13.2 EVM

Figure 4.69 shows the EVM for RC and RW over Pout. In both cases the EVM values are below the required 17.5 %. At the increased CHP of 26.5 dBm the EVM for RC is 7.6 % and for RW 8.2 %. At the required CHP of 26 dBm the EVM for RC is 7.9 % and for RW 8.0 %. For both switching schemes the EVM decreases until 20 dBm. As for the ACLR values, both switching schemes behave similarly.

[Plot: EVM [%] over output power [dBm]; curves: EVM RC, EVM RW, EVM limit]

Figure 4.69: EVM measurements of an LTE-5 PUSCH signal over Pout.
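The EVM values reported here are the rms magnitude of the error vector between measured and ideal constellation points, normalized to the rms magnitude of the ideal points. A minimal sketch of this definition is given below. It omits the synchronization and equalization steps that an LTE signal analyzer performs before evaluating the EVM, and the sample data are synthetic.

```python
import math

def evm_percent(measured, reference):
    """RMS EVM in percent, normalized to the RMS reference magnitude."""
    err = sum(abs(m - r) ** 2 for m, r in zip(measured, reference))
    ref = sum(abs(r) ** 2 for r in reference)
    return 100.0 * math.sqrt(err / ref)

# Ideal QPSK points and a synthetic measurement with 8 % amplitude error
s = 1.0 / math.sqrt(2.0)
qpsk = [complex(s, s), complex(-s, s), complex(-s, -s), complex(s, -s)]
measured = [1.08 * p for p in qpsk]
evm = evm_percent(measured, qpsk)
print(f"EVM = {evm:.1f} % (limit: 17.5 %)")
```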

Figure 4.70 shows the EVM for RC and RW over Pout. In both cases the EVM values are below the required 17.5 %. At the increased output power, the EVM for RC is 8.3 % and 9.3 % for RW, respectively. At the increased CHP of 26.5 dBm the EVM for RC is 8.3 % and for RW 9.2 %. At the required CHP of 26 dBm the EVM for RC is 8.2 % and for RW 9.2 %. For both switching schemes the EVM remains constant with a difference of 1 %.

[Plot: EVM [%] over output power [dBm]; curves: EVM RC, EVM RW, EVM limit]

Figure 4.70: EVM measurements of an LTE-10 PUSCH signal over Pout.

4.13.3 Drain Efficiency

Figure 4.71 shows the efficiency over Pout for the LTE-5 signal modulated with RC and RW. The efficiency increases with increased Pout. At the maximum required CHP of 26 dBm the efficiency is 18.9 %. At 3 dB and 6 dB BO the efficiency drops to 13.6 % and 9.3 %. At 27.5 dBm the efficiency increases up to 23.6 %. For a lower Pout the efficiency drops further. At 15.5 dBm the efficiency is below 5 %. For RW the same efficiency curve can be seen as for RC. This shows that the efficiency for LTE-5 is independent of the two modulation schemes.

[Plot: efficiency [%] over output power [dBm]; curves: ηd RC, ηd RW]

Figure 4.71: Measurement of efficiency over output power for an LTE-5 signal.

Figure 4.72 shows the efficiency over Pout for the LTE-10 signal modulated with RC and RW. The efficiency increases with increased Pout. At the maximum required CHP of 26 dBm the efficiency is 18.7 %. At 3 dB and 6 dB BO the efficiency drops to 13.6 % and 9.4 %. At 27.5 dBm the efficiency increases up to 23.9 %. For lower Pout the efficiency drops further. It is below 5 % at 15.5 dBm. As in LTE-5, ηd has the same behavior for both switching schemes.

[Plot: efficiency [%] over output power [dBm]; curves: ηd RC, ηd RW]

Figure 4.72: Measurement of efficiency over output power for an LTE-10 signal.

4.13.4 Spectrum

Figure 4.73 shows the setup for the DN measurements. The DN is measured in the receive band of the same channel to show how much noise is produced by the DPA. The setup consists of the DUT, which receives the digitally modulated test signal from the interface and converts it to an analog output signal. The DUT is terminated by a load tuner that matches the output to a 50 Ω load. The first measurement is done without the notch filter: the spectrum analyzer measures the in-band power as well as the ACLR and EVM. The second measurement includes the notch filter with 90 MHz BW to attenuate the signal and noise, so that the noise floor and the spectral emissions can be measured. The RBW is 100 kHz. Afterward, both measurements are combined. For the measurements, band 7 LTE-5/LTE-10 OFDM QPSK PUSCH signals are used.

[Block diagram: Interface, DUT, load tuner (Γ), notch filter, spectrum analyzer]

Figure 4.73: Block level diagram of the noise measurement setup.
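A noise reading taken in a given RBW can be referred to the channel power and normalized to 1 Hz to obtain a value in dBc/Hz. The sketch below shows this conversion under simplifying assumptions: the noise is treated as flat within the RBW, analyzer-specific corrections for noise-like signals are ignored, and `path_loss_db` is a hypothetical parameter for the insertion loss of the filter path at the measured offset. The example reading is invented for illustration.

```python
import math

def noise_dbc_per_hz(noise_dbm_in_rbw, rbw_hz, chp_dbm, path_loss_db=0.0):
    """Convert a noise reading in a given RBW to dBc/Hz relative to the CHP."""
    # Refer the reading back to the DUT output and normalize to 1 Hz
    density_dbm_hz = noise_dbm_in_rbw + path_loss_db - 10.0 * math.log10(rbw_hz)
    return density_dbm_hz - chp_dbm

# Hypothetical reading: -64.0 dBm in a 100 kHz RBW at a CHP of 26 dBm
dn = noise_dbc_per_hz(-64.0, 100e3, 26.0)
print(f"DN = {dn:.1f} dBc/Hz")
```

The 100 kHz RBW alone accounts for 50 dB of the conversion, which is why the RBW has to be stated together with any noise floor reading.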

Spectrum LTE-5

Figure 4.74 depicts the normalized output spectrum of an LTE-5 signal at different CHPs. Compared to the CHPs in BO, the figure shows that at the required CHP of 26 dBm the OOB emissions increase and so degrade the E-UTRA ACLR values, as shown before. The noise floor beyond 100 MHz above and below the center frequency does not change significantly, which is important for DN measurements. By further increasing Pout to 29 dBm, the clipping of the signal can be seen as increased spectral emissions. This increases the noise level inside the notch filter's BW. It can be seen that the noise floor is increased by the distortions and would be even higher without the notch filter. Spurs produced in the DFE create images of the signal at multiples of 78 MHz.

[Plot: relative spectral power [dBc/Hz] over frequency [GHz]; curves: CHP = 29 dBm, 26 dBm, 22 dBm, 17 dBm; marker: DN]

Figure 4.74: Measurements of the output spectrum of an LTE-5 QPSK OFDM PUSCH signal for RC.

Figure 4.75 illustrates the LTE-5 signal at different CHPs for RW. Even in BO at 17 dBm the signal has a highly increased noise level inside the 90 MHz filter BW. The noise next to the signal band only decreases slightly, resulting in a slowly improving ACLR value, as shown before. Above the notch filter's BW the noise increases in BO, which results in a worse DN.

[Plot: relative spectral power [dBc/Hz] over frequency [GHz]; curves: CHP = 29 dBm, 26 dBm, 22 dBm, 17 dBm; marker: DN]

Figure 4.75: Measurements of the output spectrum of an LTE-5 QPSK OFDM PUSCH signal for RW.

Spectrum LTE-10

Figure 4.76 depicts the normalized output spectrum of an LTE-10 signal at different CHPs. At the required Pout of 26 dBm the OOB emissions increase significantly compared to Pout in BO, although the noise floor beyond 100 MHz above and below the center frequency is not increased and so does not degrade the noise to CHP ratio in the duplex receive band. At 27 dBm CHP the signal begins to clip and to be significantly distorted. This results in a broader and increased noise level around the channel. Compared to the LTE-5 spectrum, LTE-10 has a broader region of increased noise next to the CHBW.

[Plot: relative spectral power [dBc/Hz] over frequency [GHz]; curves: CHP = 27 dBm, 26 dBm, 22 dBm, 17 dBm; marker: DN]

Figure 4.76: Measurements of the output spectrum of an LTE-10 QPSK OFDM PUSCH signal for RC.

Figure 4.77 illustrates the spectral measurements for LTE-10 RW. As already seen for LTE-5, this scheme increases the noise level significantly compared to RC. In the OOB range the noise decreases in the BO range from 26 dBm to 22 dBm, and so do the ACLR values. However, the noise does not decrease any further in the range from 22 dBm to 17 dBm, so the ACLR values stay constant. Outside the OOB range the noise increases in BO, which might be due to two different noise sources. This might be caused by a higher impact of frequency modulation (FM). Especially in this frequency range FM is significant, as will be seen in the next section. The increased noise level in BO also degrades the DN.

[Plot: relative spectral power [dBc/Hz] over frequency [GHz]; curves: CHP = 27 dBm, 26 dBm, 22 dBm, 17 dBm; marker: DN]

Figure 4.77: Measurements of the output spectrum of an LTE-10 QPSK OFDM PUSCH signal for RW.

Spectrum CW, AM, FM

Figure 4.78 shows the receive band noise relative to CHP of a CW only, an amplitude modulation (AM) only, an FM only and a composite signal. The composite signal has a CHP of 25.8 dBm. The other signals are generated with these default settings by switching off the corresponding modulation. For the AM, the RC scheme was used to modulate the QPSK PUSCH signal. The measurement frequency is the center frequency of band 7 for the signals without FM. For the signals with FM, LTE-5 OFDM band 7 is used. It can be seen that the CW signal has the lowest noise floor. No modulation is used and the signal's amplitude is constant. Therefore, modulation does not contribute to the noise floor, and this measurement can be taken as a reference. With this signal the spurs caused by the clock frequency of the phase-locked loop (PLL) can be seen. Additionally, it can be seen that the AM causes an increased noise floor and a broader coil around the center frequency. FM increases the noise floor only slightly compared to AM. The composite signal has the same noise floor as the AM only signal and a smaller in-band distortion than AM and FM.

[Plot: relative spectral power [dBc/Hz] over frequency [GHz]; curves: CW only, AM only, FM only, Composite; marker: DN]

Figure 4.78: Measurements of the output spectrum of a CW, AM, FM and a composite signal.

Table 4.2 summarizes the values of the duplex noise at 120 MHz distance for the aforementioned modulations. As seen from the spectrum, the lowest noise relative to the channel power is achieved by the CW signal. FM increases the values next to in-band but only slightly increases the noise for the duplex receiver by 2.5 dB. AM has the strongest impact on the duplex noise and so also reduces the channel to noise power ratio by 15.7 dB to -140.1 dBc/Hz.

Table 4.2: Measured Duplex Receive Band Noise for CW, AM, FM and Composite

Measurement    Duplex Noise
CW only        -155.8 dBc/Hz
AM only        -139.5 dBc/Hz
Composite      -140.1 dBc/Hz

Duplex Noise

Figure 4.79 depicts the DN over Pout for an LTE-5 QPSK OFDM PUSCH test signal. At the CHP of 25.74 dBm the DN is -140.7 dBc/Hz for RC and -136.7 dBc/Hz for RW. In BO the noise can be slightly improved for RC to -141.3 dBc/Hz. By further reducing Pout the noise level does not decrease and so the noise relative to the CHP increases again.

[Plot: duplex noise [dBc/Hz] over output power [dBm]; curves: NF RC, NF RW]

Figure 4.79: Duplex noise measurements of an LTE-5 PUSCH signal over Pout.

Figure 4.80 shows the DN over Pout for an LTE-10 QPSK OFDM PUSCH test signal. At the CHP of 25.80 dBm the DN is -138.3 dBc/Hz for RC and -133.8 dBc/Hz for RW. In 3 dB BO the DN is 0.2 dB better. With decreased CHP the DN starts to increase, as already seen for LTE-5.

[Plot: duplex noise [dBc/Hz] over output power [dBm]; curves: NF RC, NF RW]

Figure 4.80: Duplex noise measurements of an LTE-10 PUSCH signal over Pout.

4.13.5 Full Span

Figure 4.81 shows a full span spectrum over 8 GHz of a fully allocated LTE-5 band 40 signal, modulated by RC and RW at a CHP of 26 dBm. The second and third harmonic distortions can be seen at 4.7 GHz and 7.05 GHz, respectively. For an RBW of 30 kHz the noise level is constant around -65 dBm. RC and RW show the same noise behavior over the entire spectrum.

[Plot: spectral power [dBm] over frequency [GHz]; curves: RC, RW; markers: 2nd and 3rd harmonic]

Figure 4.81: Full span plot for an LTE-5 signal.

Figure 4.82 shows the full span spectrum over 8 GHz of a fully allocated LTE-10 band 40 signal, modulated by RC and RW at the required CHP of 26 dBm. The second and third harmonic of the fundamental signal can be seen at 4.7 GHz and 7.05 GHz. For an RBW of 30 kHz the noise level is constant around -65 dBm. It can be seen that for RC the noise next to the channel band is increased compared to RW. Further investigations to describe this effect have to be done.

[Plot: spectral power [dBm] over frequency [GHz]; curves: RC, RW; markers: 2nd and 3rd harmonic]

Figure 4.82: Full span plot for an LTE-10 signal.

4.14 Failure Causes

A practical error source found on the chip was the high resistance of the digital supply path inside the decoder. If the current drawn by the decoder is high enough, the supply voltage for the decoder drops along the supply line. Since the decoder is connected to the supply only at the bottom side, the voltage drop can be regarded as linear across it. This results in a lower supply voltage for the digital unit blocks inside the decoder. If the voltage drops below a certain threshold, the cells no longer function and remain inactive.

4.14.1 Simulation

Figure 4.83 shows the transient simulation of the current that flows inside the decoder during its operation. The root mean square value of the current settles around 6.35 mA. This current is distributed and therefore does not flow over the entire supply line, which results in an unequal voltage drop distribution. Nevertheless, it can be taken as a first order approximation to explain the resulting effects.

[Plot: current [A] (axis scaled by 10^−3) over time [ns]]

Figure 4.83: Simulation of the current in the digital supply path to the decoder.

It can be seen in Figure 4.84 that the voltage drops from the ideal 1.1 V to 850 mV. As a result, the transistors are no longer fully switched on. The voltage drops due to parasitics in the supply path. The simulations were done with a resistance and an inductance at the supply source. The values of the parasitics were calculated for the furthest point in the decoder.

[Plots: voltage [V] over time [ns], panels (a) Full Sweep and (b) Zoom; curves: V CLK, V CLK at FF]

Figure 4.84: Simulation of the voltage drop inside the decoder due to parasitics at the supply path.
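Neglecting the inductance and the distributed nature of the current, the simulated drop from 1.1 V to 850 mV at an rms current of 6.35 mA implies a first order effective supply resistance. A linear drop profile then indicates which rows fall below a given minimum voltage. The following sketch makes these simplifications explicit. The number of rows (32) and the 0.9 V threshold are assumptions chosen for illustration; neither is taken from the simulation.

```python
def effective_resistance_ohm(v_nominal, v_dropped, i_rms):
    """Lumped first order resistance that explains the observed voltage drop."""
    return (v_nominal - v_dropped) / i_rms

def row_voltages(v_supply, v_drop_total, n_rows):
    """Linear drop profile; row 0 is closest to the supply connection."""
    return [v_supply - v_drop_total * r / (n_rows - 1) for r in range(n_rows)]

# Values from the simulation: 1.1 V nominal, 850 mV at the furthest point, 6.35 mA rms
r_eff = effective_resistance_ohm(1.1, 0.85, 6.35e-3)

N_ROWS = 32        # assumption: number of decoder rows
V_THRESHOLD = 0.9  # assumption: hypothetical minimum voltage for correct switching
rows = row_voltages(1.1, 1.1 - 0.85, N_ROWS)
failing = [r for r, v in enumerate(rows) if v < V_THRESHOLD]
print(f"R_eff = {r_eff:.1f} ohm, failing rows: {failing}")
```

In this simplified picture the rows furthest from the supply connection fail first, and a higher supply voltage shifts the whole profile upward so that fewer rows fall below the threshold.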

4.14.2 Measurement

It is not possible to do an independent current consumption measurement for the digital part of the DPA. Since the digital part has no independent supply but uses the same supply as the DFE, only the total current consumption can be measured. The current is measured with enabled and disabled decoder. The difference of both measurements represents the current that is used by the decoder. When the decoder is disabled the current consumption is around 7 mA lower. Compared to the simulated 6.35 mA, this value can be used as a first order value to explain the effects.

Figure 4.85 shows Pout of the DPA with different low-dropout regulator (LDO) voltages for the RC modulated decoder. Due to the symmetric design of the decoder the voltage drop can be regarded as linear. At input code 192 the voltage drop in the upper part of the decoder is so high that the transistors cannot be switched on or off anymore. Therefore, Pout stays the same for 32 cells until the next row becomes active in the lower part of the decoder, which is closer to the supply.

[Plot: output power [dBm] over thermo code [unit cells]; curves: LDO 1.2 V, LDO 1.3 V, LDO 1.4 V]

Figure 4.85: Measurement of Pout over code for different LDO voltages.

Figure 4.86 shows the same plot in logarithmic scale. Additionally it can be seen that Pout increases linearly up to the point where approximately one third of the UCs are active. Here the supply connection is closer to the bottom side. Since RC switches the rows continuously, the failing rows can be easily identified.

[Plot: output power [dBm] over thermo code [unit cells], logarithmic code axis; curves: LDO 1.2 V, LDO 1.3 V, LDO 1.4 V]

Figure 4.86: Logarithmic plot of the previous measurement.

Figure 4.87 shows Vout for the MSBs of the design. It can be seen that even for an LDO voltage of 1.3 V, UCs are not switched on at the maximum output.

[Plot: output voltage [V] over thermo code [unit cells]; curves: LDO 1.2 V, LDO 1.3 V, LDO 1.4 V]

Figure 4.87: Measurement of Vout over code for different LDO voltages.

148 Bibliography

[1] D. Chowdhury, L. Ye, E. Alon, and A. Niknejad, “An Efficient Mixed-Signal 2.4-GHz Polar Power Amplifier in 65-nm CMOS Technology,” Solid-State Circuits, IEEE Journal of, vol. 46, no. 8, pp. 1796–1809, Aug 2011.

[2] D. Chowdhury, S. Thyagarajan, L. Ye, E. Alon, and A. Niknejad, “A Fully-Integrated Efficient CMOS Inverse Class-D Power Amplifier for Digital Polar Transmitters,” Solid-State Circuits, IEEE Journal of, vol. 47, no. 5, pp. 1113–1122, May 2012.

[3] J. Bastos, M. Steyaert, and W. Sansen, “A high yield 12-bit 250-MS/s CMOS D/A converter,” in Custom Integrated Circuits Conference, 1996., Proceedings of the IEEE 1996, pp. 431–434, May 1996.

[4] K. Lakshmikumar, R. Hadaway, and M. Copeland, “Characterisa- tion and modeling of mismatch in MOS transistors for precision analog design,” Solid-State Circuits, IEEE Journal of, vol. 21, no. 6, pp. 1057–1066, Dec 1986.

[5] W. Sansen, Analog design essentials, ser. Analog circuits and signal processsing series. Springer, 2006, no. Bd. 1.

[6] S.-M. Yoo, J. Walling, E. C. Woo, B. Jann, and D. Allstot, “A Switched-Capacitor RF Power Amplifier,” Solid-State Circuits, IEEE Journal of, vol. 46, no. 12, pp. 2977–2987, Dec 2011.

[7] C. Michael and M. Ismail, “Statistical modeling of device mismatch for analog MOS integrated circuits,” Solid-State Circuits, IEEE Jour- nal of, vol. 27, no. 2, pp. 154–166, Feb 1992.

[8] Y. Cong and R. Geiger, “Switching sequence optimization for gra- dient error compensation in thermometer-decoded DAC arrays,”

149 Bibliography

Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on, vol. 47, no. 7, pp. 585–595, Jul 2000.

[9] M. Karimian, S. Hashemi, A. Naderi, and M. Sawan, “Impact of gra- dient error on switching sequence in high-accuracy thermometer- decoded current-steering DACs,” in Circuits and Systems (ISCAS), 2012 IEEE International Symposium on, pp. 1279–1282, May 2012.

[10] J.-B. Shyu, G. Temes, and F. Krummenacher, “Random error ef- fects in matched MOS capacitors and current sources,” Solid-State Circuits, IEEE Journal of, vol. 19, no. 6, pp. 948–956, Dec 1984.

[11] G. Van Der Plas, J. Vandenbussche, W. Sansen, M. Steyaert, and G. Gielen, “A 14-bit intrinsic accuracy Q2 random walk CMOS DAC,” Solid-State Circuits, IEEE Journal of, vol. 34, no. 12, pp. 1708–1718, Dec 1999.

[12] M. Pelgrom, A. C. Duinmaijer, and A. Welbers, “Matching prop- erties of MOS transistors,” Solid-State Circuits, IEEE Journal of, vol. 24, no. 5, pp. 1433–1439, Oct 1989.

[13] A. Van Den Bosch, M. Borremans, M. Steyaert, and W. Sansen, “A 10-bit 1-GSample/s Nyquist current-steering CMOS D/A con- verter,” Solid-State Circuits, IEEE Journal of, vol. 36, no. 3, pp. 315–324, Mar 2001.

[14] W.-K. Loo, K.-S. Tan, and Y.-K. Teh, “A study and design of CMOS H-Tree clock distribution network in system-on-chip,” in ASIC, 2009. ASICON ’09. IEEE 8th International Conference on, pp. 411–414, Oct 2009.

[15] D. Wolpert and P. Ampadu, Managing Temperature Effects in Nanoscale Adaptive Systems. Springer, 2011.

[16] Interconnect, International Technology Roadmap for Semicon- ductors (ITRS) Std., 2007.

Chapter 5 Conclusion

Two designs have been implemented and tested using LTE test signals. First, a linear class-AB PA was designed to prove that watt-level output power can be achieved in 28 nm CMOS technology using a triple transistor stack.

The 28 nm CMOS class-AB PA was designed and it was proven by measurements that it is able to meet selected 3GPP LTE requirements. Compared to former CMOS technology nodes, it was shown that it is also possible in 28 nm CMOS to achieve the required Pout by using a triple stack design. For pulsed measurements a PAE of 35.2 % and an ηd of 39.5 % were presented. For LTE-1 to LTE-20 band 1 QPSK PUSCH test signals the 3GPP requirements of 17.5 % for EVM and the ACLR limits of -30/-33 dBc for E-UTRA/UTRA were met by using DPD. Higher Pout and better linearity can be achieved using CFR. For the LTE-1 to LTE-20 band 1 QPSK PUSCH test signals the ηd was around 22 %. The integrated design including on-chip IMN and OMN has a total area of 1.05 mm × 0.51 mm.

After the proof of concept of the linear PA, a CSDPA was implemented to merge the DAC with the PA and thus become more compact. For CW measurements the maximum Pout was 31.2 dBm and the maximum ηd 34.3 %. LTE measurements were done with a QPSK OFDM PUSCH LTE band 7 signal for 5 and 10 MHz BW. At the required CHP of 26 dBm the ACLR was 26.9 dBc for LTE-5 and 27.4 dBc for LTE-10. The required -30 dBc for E-UTRA were met at 21.7 dBm CHP and 22.0 dBm, respectively. EVM requirements were met for all test cases. The DN at 26 dBm CHP for LTE band 7 is -140.7 dBc/Hz for LTE-5 and -138.3 dBc/Hz for LTE-10. The fully integrated design has a total area of 0.61 mm × 0.5 mm.

In future work, techniques to increase the DPA performance can be included. To increase the efficiency in BO, an efficiency enhancement circuit can be implemented [1, 2]. For spectral improvement, the ACLR values can be reduced by using DPD. To ensure the reliability of the design, long-term measurements can be done to show the robustness or possible degradation at higher Pout.

152 Bibliography

[1] G. Liu, P. Haldi, T.-J. K. Liu, and A. Niknejad, “Fully integrated CMOS power amplifier with efficiency enhancement at power back-off,” IEEE Journal of Solid-State Circuits, vol. 43, no. 3, pp. 600–609, March 2008.

[2] A. Tuffery, N. Deltimple, E. Kerherve, V. Knopik, and P. Cathelin, “CMOS fully integrated reconfigurable power amplifier with efficiency enhancement for LTE applications,” Electronics Letters, vol. 51, no. 2, pp. 181–183, 2015.

List of Figures

1.1 Transistor level diagram of a stacked transistor
1.2 Block level diagrams of different combiners
1.3 Efficiency diagram of a transformer
1.4 Efficiency diagram of a Chireix combiner
1.5 Efficiency diagram of an LC combiner
1.6 Transistor level diagram of a class-D PA
1.7 Transistor level diagram of a class-E PA
1.8 Transistor level diagram of a class-F PA
1.9 Block level diagram of the outphasing concept
1.10 Block level diagram of EER and hybrid EER
1.11 Block level diagram of a DPT
1.12 Block level diagram of SDPA
1.13 DPA comparison in different technology nodes
2.1 Diagram of the signal modulation using BPSK (a), QPSK (b), 16-QAM (c) or 64-QAM (d)
2.2 Diagram of the EVM definition
2.3 Diagram of the OFDM time-frequency multiplexing
2.4 Diagram of OFDM with 15 kHz signal spacing
2.5 Block diagram of the TX chain
2.6 Transmitter RF spectrum
2.7 OOB emission mask and E-UTRA channel
2.8 General E-UTRA spectrum emission mask
2.9 Classical clipping for power reduction
2.10 Simulation of ACLR emissions and EVM
2.11 Simulation of required resolution for Pout
3.1 Schematic of a simplified linear PA
3.2 Power and efficiency plot of a linear PA
3.3 Fundamental and harmonic plot of a linear PA

3.4 I-V characteristics of an nMOS transistor
3.5 Block diagram of a transceiver output stage
3.6 Schematic of triple stack PA design
3.7 Triple stack of the class-AB PA
3.8 DC voltage simulation of the triple stack
3.9 Picture of the 28 nm bumped bare die
3.10 Test board with components
3.11 Block diagram of the measurement setup
3.12 I-V measurements of the PA with replica stage
3.13 I-V measurements of the PA
3.14 Measurement of DC breakdown for the triple stack
3.15 Pulsed measurements over Pin
3.16 Pulsed measurements over Pout
3.17 Measurements of S21-parameter
3.18 Pulsed measurement of gain over Pout
3.19 Measurements of Psat over frequency
3.20 Measured output spectrum with LTE of LTE-15
3.21 ACLR measurement for band 1 LTE-20
3.22 Measurement of CF for band 1 LTE-20
3.23 Pout measurement for band 1 LTE-1 to LTE-20
3.24 EVM measurement for band 1 LTE-1 to LTE-20
3.25 Measurement of E-UTRA ACLR for band 1 LTE-1 to LTE-20
3.26 Measurement of UTRA ACLR for band 1 LTE-1 to LTE-20
3.27 Measurement of ηd for band 1 LTE-1 to LTE-20
4.1 Block diagram of a TX chain
4.2 Block diagram of a SDPA with DFE and OMN
4.3 Block diagram of the bottom part of the CSDPA
4.4 Block diagram of the LO signal enabling
4.5 Schematic of an inverse class-D PA
4.6 I-V characteristics of an inverse class-D PA
4.7 Schematic of a stacked inverse class-D PA
4.8 Equivalent block diagram of the input stage
4.9 Equivalent block diagram of the output stage
4.10 Overlap of two timing delayed rectangular signals
4.11 Model of the bottom transistor with series resistance
4.12 Simulation of AM-AM and AM-PM distortion
4.13 1-dimensional sequential scheme
4.14 1-dimensional conventional symmetrical scheme

4.15 1-dimensional hierarchical symmetrical scheme Type A
4.16 1-dimensional hierarchical symmetrical scheme Type B
4.17 Switching diagram of RC
4.18 Switching diagram of RW
4.19 Diagram of different gradients
4.20 DAC transfer functions for RC with different gradients
4.21 DAC transfer functions for RW with different gradients
4.22 3-D diagram of a linear gradient error
4.23 INL of RC and RW for a linear gradient amplitude error
4.24 Amplitude difference due to timing mismatch
4.25 Phase distortion due to timing mismatch
4.26 Block level diagram of the decoder
4.27 Equivalent layout block diagram of the decoder
4.28 Block diagram of the clock distribution
4.29 Nominal current density in a given metal cross section
4.30 Block diagram of the simulation setup
4.31 Transient simulation of the LO signal
4.32 Simulation of Pout over binary bits
4.33 Simulation Pout, ηd and ηoa over thermometer code
4.34 Simulation of ηd and ηoa
4.35 Simulation of Vout over code
4.36 Simulation of Pout, ηd and ηoa over code
4.37 Simulation of the phase over the binary bits
4.38 Simulation of phase difference between RC and RW
4.39 Schematic level of the LO driven AND gate
4.40 Simulation of the voltage difference between RC and RW
4.41 Simulation of phase over the thermometer bits
4.42 Die photo of the inverse class-D CSDPA
4.43 Chip photograph on PCB
4.44 Block diagram of the measurement setup
4.45 Measurements of Pout for the binary cells
4.46 Measurements of ηd for the binary cells
4.47 Measurements of Pout and ηd
4.48 Measurements of ηd over Pout
4.49 Measurements of Pout over full core
4.50 Measurements of Vout for the binary cells
4.51 Measurement of INL error for binary bits
4.52 Measurement of DNL error for binary bits
4.53 Diagram of AM-AM for center frequency of LTE band 40

4.54 Measurement of INL error for thermometer bits
4.55 Measurement of DNL error for thermometer bits
4.56 Diagram of AM-AM for center frequency of LTE band 40
4.57 Measurement of INL error for full code
4.58 Measurement of DNL error for full code
4.59 Measurement of phase for RC and RW
4.60 Measurements of Vdd for RC
4.61 Measurement of Pmax and ηd over frequency
4.62 Measurement of Pout and ηd for RC over Vdd
4.63 Difference between simulation and measurement of Pout
4.64 Difference between simulation and measurement of ηd
4.65 Difference between simulation and measurement of Vout
4.66 Difference between simulation and measurement of ϕ
4.67 ACLR measurements of an LTE-5 signal over Pout
4.68 ACLR measurements of an LTE-10 signal over Pout
4.69 EVM measurements of an LTE-5 signal over Pout
4.70 EVM measurements of an LTE-10 signal over Pout
4.71 Measurement of efficiency over output power for LTE-5
4.72 Measurement of efficiency over output power for LTE-10
4.73 Block level diagram of the noise measurement setup
4.74 Measurements of the output spectrum of LTE-5 for RC
4.75 Measurements of the output spectrum of LTE-5 for RW
4.76 Measurements of the output spectrum of LTE-10 for RC
4.77 Measurements of the output spectrum of LTE-10 for RW
4.78 Measurements of the output spectrum of a CW, AM, FM and a composite signal
4.79 Duplex noise measurements of LTE-5 over Pout
4.80 Duplex noise measurements of LTE-10 over Pout
4.81 Full span plot for an LTE-5 signal
4.82 Full span plot for an LTE-10 signal
4.83 Simulation of the current in the digital supply path
4.84 Simulation of the voltage drop inside the decoder
4.85 Measurement of Pout for different LDO voltages
4.86 Logarithmic plot of the previous measurement
4.87 Measurement of Vout for different LDO voltages

List of Tables

1.1 CMOS Transceiver Implementations for Wireless Systems
1.2 Selected 3GPP LTE Linearity Requirements
1.3 Selected ITRS Specifications
1.4 Selection of DPA Parameters
1.5 Comparison of Bit, INL and DNL
1.6 Efficiency Models for Power Combiner
1.7 Output Power for Different SMPA Classes
1.8 Comparison of Output Power for Different Architectures
1.9 Comparison of Different DPA Implementations for LTE
2.1 Component Losses at Different Frequencies
2.2 Temperature Specification
2.3 Output Power Specifications for LTE
2.4 Output Power Tolerance
2.5 Maximum Power Reduction
2.6 VSWR Specification at the Antenna
2.7 VSWR Specification at the PA
2.8 Selection of E-UTRA Operating Bands
2.9 EVM 3GPP Requirements
2.10 OOB Boundary
2.11 General Requirements for E-UTRA and UTRA ACLR
2.12 Spurious Emissions Limits
2.13 Spurious Emission Results for Different Test Signals
2.14 Required Design Resolution
3.1 Linear class-A to class-C Comparison
3.2 Selected 3GPP LTE Requirements
3.3 Comparison of CMOS Class-AB PA
4.1 Nominal Current Density for Copper
4.2 Measured Duplex Receive Band Noise
