EURASIP Journal on Applied Signal Processing

Advanced Signal Processing for Digital Subscriber Lines

Guest Editors: Raphael Cendrillon, Iain Collings, Tomas Nordström, Frank Sjöberg, Michail Tsatsanis, and Wei Yu

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

This is a special issue published in volume 2006 of “EURASIP Journal on Applied Signal Processing.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Editor-in-Chief Ali H. Sayed, University of California, USA

Associate Editors: Kenneth Barner, USA; Søren Holdt Jensen, Denmark; Vitor H. Nascimento, Brazil; Mauro Barni, Italy; Mark Kahrs, USA; Sven Nordholm, Australia; Richard Barton, USA; Thomas Kaiser, Germany; Douglas O’Shaughnessy, Canada; Ati Baskurt, France; Moon Gi Kang, South Korea; Montse Pardas, Spain; Kostas Berberidis, Greece; Matti Karjalainen, Finland; Wilfried Philips, Belgium; Jose C. Bermudez, Brazil; Walter Kellermann, Germany; Vincent Poor, USA; Enis Cetin, Turkey; Joerg Kliewer, USA; Ioannis Psaromiligkos, Canada; Jonathon Chambers, UK; Lisimachos P. Kondi, USA; Phillip Regalia, France; Benoit Champagne, Canada; Alex Kot, Singapore; Markus Rupp, Austria; Joe Chen, USA; Vikram Krishnamurthy, Canada; Bill Sandham, UK; Liang-Gee Chen, Taiwan; Tan Lee, Hong Kong; Bulent Sankur, Turkey; Huaiyu Dai, USA; Geert Leus, The Netherlands; Erchin Serpedin, USA; Satya Dharanipragada, USA; Bernard C. Levy, USA; Dirk Slock, France; Frank Ehlers, Italy; Ta-Hsin Li, USA; Yap-Peng Tan, Singapore; Sharon Gannot, Israel; Mark Liao, Taiwan; Dimitrios Tzovaras, Greece; Fulvio Gini, Italy; Yuan-Pei Lin, Taiwan; Hugo Van hamme, Belgium; Irene Gu, Sweden; Shoji Makino, Japan; Bernhard Wess, Austria; Peter Handel, Sweden; Stephen Marshall, UK; Douglas Williams, USA; R. Heusdens, The Netherlands; C. Mecklenbräuker, Austria; Roger Woods, UK; Ulrich Heute, Germany; Gloria Menegaz, Italy; Jar-Ferr Yang, Taiwan; Arden Huang, USA; Ricardo Merched, Brazil; Abdelhak M. Zoubir, Germany; Jiri Jan, Czech Republic; Rafael Molina, Spain; Sudharman K. Jayaweera, USA; Marc Moonen, Belgium

Contents

Advanced Signal Processing for Digital Subscriber Lines, Raphael Cendrillon, Iain Collings, Tomas Nordström, Frank Sjöberg, Michail Tsatsanis, and Wei Yu Volume 2006 (2006), Article ID 32476, 3 pages

The Worst-Case Interference in DSL Systems Employing Dynamic Spectrum Management, Mark H. Brady and John M. Cioffi Volume 2006 (2006), Article ID 78524, 11 pages

Joint Multiuser Detection and Optimal Spectrum Balancing for Digital Subscriber Lines, Vincent M. K. Chan and Wei Yu Volume 2006 (2006), Article ID 80941, 13 pages

Spectrally Compatible Iterative Water Filling, Jan Verlinden, Etienne Van den Bogaert, Tom Bostoen, Francesca Zanier, Marco Luise, Raphael Cendrillon, and Marc Moonen Volume 2006 (2006), Article ID 58380, 10 pages

The Normalized-Rate Iterative Algorithm: A Practical Dynamic Spectrum Management Method for DSL, Driton Statovci, Tomas Nordström, and Rickard Nilsson Volume 2006 (2006), Article ID 95175, 17 pages

ADSL Transceivers Applying DSM and Their Nonstationary Noise Robustness, Etienne Van den Bogaert, Tom Bostoen, Jan Verlinden, Raphael Cendrillon, and Marc Moonen Volume 2006 (2006), Article ID 67686, 8 pages

Analysis of Iterative Waterfilling Algorithm for Multiuser Power Control in Digital Subscriber Lines, Zhi-Quan Luo and Jong-Shi Pang Volume 2006 (2006), Article ID 24012, 10 pages

Alien Crosstalk Cancellation for Multipair Systems, George Ginis and Chia-Ning Peng Volume 2006 (2006), Article ID 16828, 12 pages

Crosstalk Models for Short VDSL2 Lines from Measured 30 MHz Data, E. Karipidis, N. Sidiropoulos, A. Leshem, Li Youming, R. Tarafi, and M. Ouzzif Volume 2006 (2006), Article ID 85859, 9 pages

Error Sign Feedback as an Alternative to Pilots for the Tracking of FEXT Transfer Functions in Downstream VDSL, J. Louveaux and A.-J. van der Veen Volume 2006 (2006), Article ID 94105, 14 pages

Iterative Refinement Methods for Time-Domain Equalizer Design, Güner Arslan, Biao Lu, Lloyd D. Clark, and Brian L. Evans Volume 2006 (2006), Article ID 43154, 12 pages

Near-Capacity Coding for Discrete Multitone Systems with Impulse Noise, Masoud Ardakani, Frank R. Kschischang, and Wei Yu Volume 2006 (2006), Article ID 98738, 10 pages

Fine-Granularity Loading Schemes Using Adaptive Reed-Solomon Coding for xDSL-DMT Systems, Saswat Panigrahi and Tho Le-Ngoc Volume 2006 (2006), Article ID 65716, 13 pages

Intra-Symbol Windowing for Egress Reduction in DMT Transmitters, Gert Cuypers, Koen Vanbleu, Geert Ysebaert, and Marc Moonen Volume 2006 (2006), Article ID 70387, 9 pages

Designing Tone Reservation PAR Reduction, Niklas Andgart, Per Ödling, Albin Johansson, and Per Ola Börjesson Volume 2006 (2006), Article ID 38237, 14 pages

Cosine-Modulated Multitone for Very-High-Speed Digital Subscriber Lines, Lekun Lin and Behrouz Farhang-Boroujeny Volume 2006 (2006), Article ID 19329, 16 pages

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 32476, Pages 1–3
DOI 10.1155/ASP/2006/32476

Editorial Advanced Signal Processing for Digital Subscriber Lines

Raphael Cendrillon,1 Iain Collings,2 Tomas Nordström,3 Frank Sjöberg,4 Michail Tsatsanis,5 and Wei Yu6

1 Marvell Hong Kong Ltd., Hong Kong
2 CSIRO Information Communication Technologies Center, Australia
3 Telecommunications Research Center Vienna (ftw.), Donau-City-Straße 1, 1220 Vienna, Austria
4 Division of Signal Processing, Luleå University of Technology, and Upzide Labs, Luleå, Sweden
5 Aktino Inc., Irvine, California, USA
6 Electrical and Computer Engineering Department, University of Toronto, 10 King’s College Road, Toronto, ON Canada, M5S 3G4

Received 27 January 2006; Accepted 27 January 2006

Copyright © 2006 Raphael Cendrillon et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The recent deployment of digital subscriber line (DSL) technology around the world is rapidly making broadband access for the mass consumer market a reality. The ever-growing customer demand for higher data rates has been fueled by the popularity of applications like peer-to-peer (P2P) file-sharing networks, video streaming, and high-definition television (HDTV). DSL technology allows telephone operators to get maximum leverage out of their existing infrastructure by delivering broadband access over existing twisted-pair telephone lines. At the heart of DSL lies a plethora of signal processing techniques which enable such high-speed transmission to be achieved over a medium originally designed with only voice-band transmission in mind. These advanced signal processing techniques address many challenges that exist in DSL networks today, such as near-end and far-end crosstalk (NEXT/FEXT), impulse noise, peak-to-average-power ratio (PAR), intersymbol and intercarrier interference (ISI/ICI), radio-frequency interference (RFI), and so forth. The goal of this special issue is to discuss the state of the art and recent advances in signal processing techniques for DSL.

The special issue consists of fifteen papers on a range of topics. The first set of papers focuses on the area of dynamic spectrum management (DSM). In a conventional DSL deployment, the transmit spectrum for all modems in a bundle is fixed to a predetermined level. As DSL deployment becomes increasingly heterogeneous, crosstalk produced by modems under a fixed spectrum can be a source of significant interference. Dynamic spectrum management aims to improve the data rates and reaches of conventional DSL systems by adaptively varying transmit power-spectral density according to geographic locations and the crosstalk channel characteristics of the subscribers in each bundle. This issue contains six papers on DSM. In “The worst-case interference in DSL systems employing dynamic spectrum management,” Brady and Cioffi answer the question of what the worst-case crosstalk interference is for a given DSL line. They characterize the performance of the system under the worst-case noise using a game theory technique. In “Joint multiuser detection and optimal spectrum balancing for digital subscriber lines,” Chan and Yu study the optimal spectrum management technique for a scenario in which crosstalk may also be partially cancelled using advanced crosstalk cancellation techniques. In the next three papers, practical spectrum management techniques are investigated. In “Spectrally compatible iterative water-filling,” Verlinden et al. study a system in which the spectral allocation scheme is constrained by additional spectrum compatibility requirements and propose a new scheme based on an earlier algorithm called iterative water-filling. In “The normalized-rate iterative algorithm: a practical dynamic spectrum management method for DSL,” Statovci et al. propose a new low-complexity technique for spectrum balancing and frequency partition in a DSL bundle. In “ADSL transceivers applying DSM and their nonstationary noise robustness,” Van den Bogaert et al. report the performance of practical transceivers implementing dynamic spectrum management and study their robustness against nonstationary noise. Finally, from a theoretical perspective, the paper “Analysis of iterative water-filling algorithm for multiuser power control in digital subscriber lines,” by Luo and Pang, takes a new look at the iterative water-filling algorithm and gives a novel interpretation of the algorithm based on optimization theory.

The next paper in the special issue deals with crosstalk cancellation. In a DSL deployment, when coordination among the transmit- or receive-modems is possible, further data improvement may be obtained via crosstalk cancellation. In the paper “Alien crosstalk cancellation for multipair digital subscriber line systems,” Ginis and Peng give an overview of this area and propose a new crosstalk cancellation technique that takes advantage of the noise correlation among the multiple receivers.

The practical success of dynamic spectrum management and crosstalk cancellation depends very much on how accurately crosstalk channels may be modeled and identified in practice. Two papers of this special issue address this area. In “Crosstalk models for short VDSL2 lines from measured 30 MHz data,” Karipidis et al. propose measurement-based crosstalk models for VDSL. In “Error sign feedback as an alternative to pilots for the tracking of FEXT transfer functions in downstream VDSL,” Louveaux and Van der Veen propose new ways of identifying the crosstalk channel using a novel feedback scheme.

Equalization and coding continue to be important research topics in DSL. In the area of time-domain equalization (TEQ), the paper “Iterative refinement methods for time-domain equalizer design” by Arslan et al. proposes a new method to reduce the implementation complexity of the TEQ. In the area of error-correcting coding for the DSL system, the paper “Near capacity coding for discrete multitone (DMT) systems with impulse noise” by Ardakani et al. proposes a methodology for the design of the newly emerged low-density parity-check (LDPC) codes for a DMT system, while addressing the practical DSL deployment issue of impulse noise. In “Fine-granularity loading schemes using adaptive Reed-Solomon coding for xDSL-DMT systems,” Panigrahi and Le-Ngoc propose a joint design of bit-loading and error-correcting code, and characterize the performance gain made possible by fractional bit-loading.

The final set of three papers in this special issue deals with the area of modulation and transmitter design. The design of the transmit window to minimize egress is studied in the paper by Cuypers et al., “Intra-symbol windowing for egress reduction in DMT transmitters.” The peak-to-average-power ratio is another important transmitter design issue for DMT systems. This is taken up in the paper “Designing tone reservation PAR reduction” by Andgart et al. DMT is not the only possible multicarrier modulation scheme for DSL. An alternative is proposed and studied in the paper “Cosine modulated multitone for very high-speed digital subscriber lines” by Lin and Farhang-Boroujeny.

The continued growth of digital subscriber line technology worldwide is in part fueled by rapid advances in signal processing techniques. We hope that the readers will enjoy the collection of papers on this timely topic.

Finally, we wish to take this opportunity to acknowledge and to thank all anonymous reviewers, without whom the success of this special issue would not have been possible.

Raphael Cendrillon
Iain Collings
Tomas Nordström
Frank Sjöberg
Michail Tsatsanis
Wei Yu

Raphael Cendrillon was born in Melbourne, Australia, in 1978. He received an Electrical Engineering degree (honours first class) from the University of Queensland, Australia, in 1999, and a Ph.D. in electrical engineering at the Katholieke Universiteit Leuven, Belgium, in 2004. His Ph.D. was awarded Summa Cum Laude with congratulations of the jury, an honor given to the top 5% of Ph.D. graduates. His research focuses on the application of multiuser communication theory to xDSL. In 2002, he was a Visiting Scholar at the Information Systems Laboratory, , with Prof. John Cioffi. In 2005, Dr. Cendrillon was a postdoctoral Research Fellow at the University of Queensland, Australia. During this period, he was also a Visiting Research Fellow at Princeton University with Prof. Mung Chiang. He is now a senior DSP engineer with Marvell Hong Kong Ltd. He was awarded the Alcatel Bell Scientific Prize in 2004; IEEE Travel Grants in 2003, 2004, and 2005; the K.U. Leuven Bursary for Advanced Foreign Scholars in 2004; and the UniQuest Trailblazer Prize for Commercialization in 2005.

Iain Collings was born in Melbourne, Australia, in 1970. He received the B.E. degree in electrical and electronic engineering from the University of Melbourne, in 1992, and the Ph.D. degree in systems engineering from the Australian National University, in 1995. Currently he is a Science Leader in the CSIRO Information Communication Technologies Centre, Australia. Prior to this he was an Associate Professor at the University of Sydney (1999–2005); a Lecturer at the University of Melbourne (1996–1999); and a Research Fellow in the Australian Cooperative Research Centre for Sensor Signal and Information Processing (1995). His current research interests include mobile digital communications and broadband digital subscriber line communications; more specifically, synchronization, channel estimation, equalization, and multicarrier modulation, for time-varying and frequency-selective channels. He currently serves as an Editor for the IEEE Transactions on Wireless Communications, and as a Guest Editor for the EURASIP Journal on Advanced Signal Processing. He has also served as the Vice Chair of the Technical Program Committee for IEEE Vehicular Technology Conf. (Spring) 2006, as well as serving on a number of other TPCs and organizing committees of international conferences. He is also a founding organizer of the Australian Communication Theory Workshops 2000–2006.

Tomas Nordström was born in Härnösand, Sweden, in 1963. He received the M.S.E.E. degree in 1988, the Licentiate degree in 1991, and the Ph.D. degree in 1995, all from Luleå University of Technology, Sweden. Currently, he is a key researcher and project manager at the Telecommunications Research Center Vienna (ftw). During 1995 and 1996, he was an Assistant Professor at Luleå University of Technology researching computer architectures, neural networks, and signal processing. Between 1996 and 1999, he was with Telia Research (the research branch of the Swedish incumbent telephone operator), where he developed broadband Internet communication over twisted copper pairs. He was instrumental in the development of the Zipper-VDSL concept (contributed to the standardization of VDSL in ETSI, ANSI, and ITU) and in the design of the Zipper-VDSL prototype modems. In December 1999, he joined ftw, where he is the project Manager of the “Broadband wireline access” group. At ftw he has worked with various aspects of wireline communications like simulation of xDSL systems, cable measurements, RFI suppression, and exploiting the common-mode signal in xDSL. Currently his research is focused on “active copper resource management” which includes researching optimized usage of cable bundles and dynamic spectrum management.

Wei Yu received the B.A.S. degree in computer engineering and mathematics from the University of Waterloo, Waterloo, ON, Canada, in 1997, and M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, Calif, USA, in 1998 and 2002, respectively. Since 2002, he has been an Assistant Professor with the Electrical and Computer Engineering Department at the University of Toronto, Toronto, ON, Canada, where he also holds a Canada Research Chair. His main research interests include multiuser information theory, coding, optimization, wireline communications, and wireless broadband access networks. He is currently an Associate Editor for the IEEE Transactions on Wireless Communications.

Frank Sjöberg was born in Umeå, Sweden, in 1970. He received the M.S. degree in computer science and the Ph.D. degree in signal processing from Luleå University of Technology, Luleå, Sweden, in 1995 and 2000, respectively. Between 2000 and 2002, he was with Telia Research AB, Luleå, Sweden. He currently works as a Researcher at Upzide Labs, and holds a position as an Assistant Professor in the Division of Signal Processing, Luleå University of Technology. His primary research interests are statistical signal processing and digital communication, with emphasis on wireline systems.

Michail Tsatsanis is a founder and Chief Scientist of Aktino, a company developing next generation DSL transceiver technology. Prior to that he was with Voyan Technology, where he served as Chief Scientist and Chief Technical Officer. From 1995 to 2000, he was with Stevens Institute of Technology, NJ, where he served as an Associate Professor in Electrical Engineering. He is the author of more than 80 peer reviewed papers, three book chapters, and several patents. At Aktino he is leading a technology team that was first to successfully implement and produce a MIMO vectored transceiver in the DSL space. He has received a number of distinctions including the National Science Foundation CAREER Award and two IEEE Best Paper Awards. He has served the IEEE in various capacities including the position of Associate Editor for two IEEE Transactions and Chair of workshop organizing committees. He holds M.S. and Ph.D. degrees in electrical engineering from the University of Virginia.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 78524, Pages 1–11
DOI 10.1155/ASP/2006/78524

The Worst-Case Interference in DSL Systems Employing Dynamic Spectrum Management

Mark H. Brady and John M. Cioffi

Department of Electrical Engineering, Stanford University, Stanford, CA 94305-9515, USA

Received 1 December 2004; Revised 28 July 2005; Accepted 31 July 2005

Dynamic spectrum management (DSM) has been proposed to achieve next-generation rates on digital subscriber lines (DSL). Because the copper twisted-pair plant is an interference-constrained environment, the multiuser performance and spectral compatibility of DSM schemes are of primary concern in such systems. While the analysis of multiuser interference has been standardized for current static spectrum-management (SSM) techniques, at present no corresponding standard DSM analysis has been established. This paper examines a multiuser spectrum-allocation problem and formulates a lower bound to the achievable rate of a DSL modem that is tight in the presence of the worst-case interference. A game-theoretic analysis shows that the rate-maximizing strategy under the worst-case interference (WCI) in the DSM setting corresponds to a Nash equilibrium in pure strategies of a certain strictly competitive game. A Nash equilibrium is shown to exist under very mild conditions, and the rate-adaptive waterfilling algorithm is demonstrated to give the optimal strategy in response to the WCI under a frequency-division (FDM) condition. Numerical results are presented for two important scenarios: an upstream VDSL deployment exhibiting the near-far effect, and an ADSL RT deployment with long CO lines. The results show that the performance improvement of DSM over SSM techniques in these channels can be preserved by appropriate distributed power control, even in worst-case interference environments.

Copyright © 2006 M. H. Brady and J. M. Cioffi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

In recent years, increased demands on data rates and competition from other services have led to the development of new high-speed transmission standards for digital subscriber line (DSL) modems. Dynamic spectrum management (DSM) is emerging as a key component in next-generation DSL standards. In DSM, spectrum is allocated adaptively in response to channel and interference conditions, allowing mitigation of interference and best use of the channel. As multiuser interference is the primary limiting factor to DSL performance, the potential for rate improvement by exploiting its structure is substantial.

DSM contrasts with current DSL practice, known as static spectrum management (SSM). In SSM, masks are imposed on transmit power spectral densities (PSDs) to bound the amount of crosstalk induced in other lines sharing the same binder group [1]. As SSM masks are fixed for all loop configurations, they can often be far from optimal or even prudent spectrum usage in typical deployments. Standardized tests for “spectral compatibility” [1] assess “new technology” by defining PSD masks and examining the impact on standardized systems using the 99th-percentile crosstalk scenario. Such methods are useful when a reasonable estimate of the spectrum of all users can be assumed a priori. However, if spectrum is instead allocated dynamically, not only is this knowledge not available a priori, but also, because of loop unbundling, other users’ spectrum may not be known even during operation. Spectral compatibility between different operators using DSM is a primary concern because new pathologies may arise with adaptive operation. Moreover, it is not unreasonable to suspect that each competing service provider sharing a binder would perform DSM in a greedy fashion, at the possible expense of other providers’ users. However, in DSM, a worst-case interference analysis based on maximum allowable PSDs is overly pessimistic, so existing spectral compatibility techniques cannot be fruitfully employed. A new paradigm is needed to assess the impact of DSM on multiuser performance of the overall system.

1.1. Prior results

The capacity region of the AWGN interference channel (IC) is in general unknown, even for the 2-user case [2]. Communication in the presence of hostile interference has been studied from a game-theoretic perspective in numerous
applications, for example, [3, 4]. A simple and relevant IC achievable region is that attained by treating interference as noise [5]. Capacity results for frequency-selective interference channels satisfying the strong interference condition are also known [6].

DSM algorithms have been proposed for the cases of distributed and centralized control scenarios. This paper considers what has been termed “Level 0–2 DSM” [7], wherein cooperation may be allowed to manage spectrum, but not for multiuser encoding and decoding. A centralized DSM center controlling multiple lines offers both higher potential performance and improved management capabilities [8]. Distributed DSM schemes based on the iterative waterfilling (IW) algorithm [9] have been presented. IW has also been studied from a game-theoretic viewpoint [10]. Numerous algorithms for centralized DSM have been proposed. Reference [11] presents a technique to maximize users’ weighted sum-rate. Rate maximization subject to frequency-division and fixed-rate proportions between users has been considered [12]. Optimal [13] and suboptimal [14] algorithms to minimize transmit power have been studied.

An extensive suite of literature on upstream power-backoff techniques to mitigate the “near-far” problem has been developed for static spectrum-management systems [13, 15–17]. A power-backoff algorithm for DSM systems implementing iterative waterfilling has been proposed [18].

In current DSL standards, upstream and downstream transmissions use either distinct frequency bands or shared bands. In the latter case, “echo” is created between upstream and downstream transmissions [9]. As analog hybrid circuits do not provide sufficient isolation, echo mitigation is essential in practical systems [19]. Numerous echo-cancellation structures have been proposed for DSL transceivers [20–22].

1.2. Outline

This paper formulates the achievable rate of a single “victim” modem in the presence of the worst-case interference from other interfering lines in the same binder group. The performance under the WCI is a guaranteed-achievable rate that can be used, for example, in studying multiuser performance of DSM strategies and establishing spectral compatibility of DSM systems.

Section 2 defines the channel and system models. The WCI problem is formalized and studied in Section 3 from a game-theoretic viewpoint. Certain properties of the Nash equilibrium of this game are explored. Section 4 considers numerical examples in VDSL and ADSL systems. Concluding remarks are made in Section 5.

A word on notation: vectors are written in boldface, where $v_k$ denotes the $k$th element of the vector $\mathbf{v}$, and $\mathbf{v} \succeq 0$ denotes that each element is nonnegative. The notation $\mathbf{v}^{(n)}$ denotes a vector corresponding to tone $n$. For the symmetric matrix $X$, $X \succeq 0$ denotes that $X$ is positive semidefinite. $\mathbf{1}$ is a column vector with each element equal to 1. $\mathrm{int}(X)$ denotes the (topological) interior, $\mathrm{cl}(X)$ the closure, and $\partial X$ the boundary of the set $X$.

2. SYSTEM MODEL

2.1. Channel model

A copper twisted-pair DSL binder is modelled as a frequency-selective multiuser Gaussian interference channel [9, 23]. The binder contains a total of $L + 1$ twisted pairs, with one DSL line per twisted pair, as shown in Figure 1. The effect of NEXT and FEXT interferences generated by $L$ “interfering” users that generate crosstalk into one “victim” user is considered. This coupling is illustrated for downstream transmission in Figure 1.

Figure 1: Illustration of loop plant environment showing downstream FEXT and NEXT from user 1. The victim user is shown at the bottom.

2.2. DSL modem model

2.2.1. Modem architecture

The standardized [24] discrete-multitone (DMT)-based modulation scheme is employed, so that transmission over the frequency-selective channel may be decoupled into $N$ independent subcarriers or tones. Both FDM and overlapping bandplans are considered. As overlapping bandplans require echo cancellation that is imperfect in practice, error that is introduced acts as a form of interference and is of concern. Echo-cancellation error is modelled presuming a prevalent echo-cancellation structure utilizing a joint time-frequency LMS algorithm [19] is employed (other models may be more applicable to different echo-cancellation structures). Using the terminology of [19], let $\mu$ denote the LMS adaptive step size parameter. The “excess MSE” for a given tone is modelled [25, equation
(12.74)] as proportional to the product of the LMS adaptive step size parameter $\mu$ and the transmit power on that tone. The constant of proportionality is absorbed by defining $\beta$ as the ratio of excess MSE to transmitted energy on a given tone.

2.2.2. Achievable rate region

This section discusses an achievable rate region for a DSL modem based on the preceding channel and system model. The following analysis applies to both upstream and downstream transmissions. For specificity, the following refers to downstream transmission: first, consider the case where echo cancellation is employed. Denote the victim modem’s downstream transmit power on tone $n$, $n \in \{1, \ldots, N\}$, as $x_n$. Let element $l$, $l \in \{1, \ldots, L\}$, of the vector $\mathbf{y}^{(n)} \in \mathbb{R}^{2L}_{+}$ denote the downstream transmit power of interfering modem $l$ on tone $n$. Similarly, let element $l$, $l \in \{L+1, \ldots, 2L\}$, of $\mathbf{y}^{(n)}$ denote the upstream transmit power of interfering user $l - L$. Define element $l$, $l \in \{1, \ldots, L\}$, of the row vector $\mathbf{h}^{(n)} \in \mathbb{R}^{2L}_{+}$ as the FEXT power gain from interfering user $l$ on tone $n$ (necessarily, $\mathbf{h}^{(n)} \succeq 0$). Similarly, define element $l$, $l \in \{L+1, \ldots, 2L\}$, of $\mathbf{h}^{(n)}$ to be the NEXT power gain from interfering user $l - L$. Let element $n$ of $\mathbf{h} \in \mathbb{R}^{N}_{+}$ denote the victim line’s insertion gain on tone $n$ ($h_n \geq 0$).

Independent AWGN (thermal noise) with power $\sigma_n^2 > 0$ is present on tone $n$. Let $\beta_n$ denote the echo-cancellation ratio on tone $n$ as described above. Echo-cancellation error is treated as AWGN. Let $\Gamma$ denote the SNR gap-to-capacity [9]. Then the following bit loading is achievable on tone $n$ [9]:

$$b_n = \log\left(1 + \frac{h_n x_n}{\Gamma\left(\mathbf{h}^{(n)} \mathbf{y}^{(n)} + \beta x_n + \sigma_n^2\right)}\right). \tag{1}$$

Observe that if $h_n = 0$, then it is necessarily the case that $b_n = 0$, implying that tone $n$ is never loaded. Thus, in the sequel, $h_n > 0$ for all $n \in \{1, \ldots, N\}$ is considered without loss of generality by removing those tones with zero direct gain ($h_n = 0$). Defining $\alpha_n = \Gamma/h_n$, $\beta_n = \Gamma\beta/h_n$, and $N_n = \Gamma\sigma_n^2/h_n$, and substituting gives

$$b_n = \log\left(1 + \frac{x_n}{\alpha_n \mathbf{h}^{(n)} \mathbf{y}^{(n)} + \beta_n x_n + N_n}\right). \tag{2}$$

Because $\Gamma \geq 1$, it follows that $\alpha_n \geq 0$, $\beta_n \geq 0$, and $N_n > 0$.

2.2.3. Achievable rate region for FDM

When an FDM scheme is employed, NEXT and echo cancellation are eliminated because transmission and reception occur on distinct frequencies. As a common configuration in ADSL and VDSL standards [9], this represents the important special case of the preceding model, where $\beta_n = 0$ (due to no echo cancellation) and $h_l^{(n)} = 0$ for all $n$, $L + 1 \leq l \leq 2L$ (due to frequency division). Additional technical results will be shown to hold in the FDM setting, as detailed in Section 3.

3. THE WORST-CASE INTERFERENCE

3.1. Game-theoretic characterization of the WCI

This section introduces and motivates the concept of the worst-case interference (WCI). Suppose that a “victim” modem desires to keep its data rate at some level. Such a scenario is commonplace as carriers widely offer DSL service at fixed data rates. The objective is to bound the impact that multiuser interference can have on this victim modem, thereby determining whether service may be guaranteed. To this end, one considers interferences that are the most harmful in the sense of minimizing the achievable rate of a “victim” modem. However, it is not clear what form such interferences might take, nor how they might be best responded to.

Examining this problem from the standpoint of game theory leads to substantial insight. Consider a worst-case interference game where one player jointly optimizes the spectrum of all the interfering modems, irrespective of the data rate they achieve in doing so, to cause the most deleterious interference to the victim modem. Thus in this game, all the interfering modems act as one player, while the victim modem acts as the other player, with the channel and noise known to all. Although such an arrangement may appear pathological, it will be shown numerically that such a situation is quite close to what occurs in certain loop topologies. Neither is assuming such coordination of the interferers unreasonable in practice as under “Level 2” DSM [7, 8], each collocated carrier may individually coordinate its own lines, nor may collocated equipment be centrally controlled by a competing carrier. Channels may be estimated in the field, approximated by standardized models [9], and in the future, potentially published by operators [26].

A Nash equilibrium in this game may be interpreted as characterizing a worst-case interference as an optimal response (power-allocation policy) to it. The structure of the Nash equilibrium lends insight into the problem as well as suggesting techniques that may be implemented in practical systems.

3.2. Formalization of the WCI game

Consider the following two-player game: let Player 1 control the spectrum allocation of the victim modem, and let Player 2 control the spectrum allocations of all the interfering modems. Referring again to downstream transmission for specificity, let the total (sum) downstream power of the victim modem $\sum_n x_n$ be upper bounded by $P^x$, where 0

disregarding all unusable tones $n$ for which $C^x_n = 0$. Similarly for Player 2, consider per-line power constraints $0 \prec P^y \prec \infty$, where the total downstream power of the $l$th interfering modem, $l \in \{1,\dots,L\}$, is upper bounded by the $l$th element of $P^y \in \mathbb{R}^{2L}_{++}$ and the total upstream power of interfering modem $l$ is upper bounded by element $l+L$ of $P^y$. Further, consider positive power constraints $C^{y,(n)} \in \mathbb{R}^{2L}_{++}$ for $n = 1,\dots,N$ such that $y^{(n)} \preceq C^{y,(n)}$ for each $n$; any such power constraints equal to zero may be equivalently enforced by zeroing the respective element(s) of $\{h^{(n)}\}$.

The strategy set of Player 1 is the set of all feasible power allocations for the victim modem, $S_1 = \{x : 0 \preceq x \preceq C^x,\ \mathbf{1}^T x \le P^x\}$, and the strategy set of Player 2 is the set of all feasible power allocations for the interfering modems, $S_2 = \{[y^{(1)},\dots,y^{(N)}] : 0 \preceq y^{(n)} \preceq C^{y,(n)},\ n = 1,\dots,N,\ [y^{(1)},\dots,y^{(N)}]\mathbf{1} \preceq P^y\}$. Define $S = S_1 \times S_2$. This is a strictly competitive or zero sum two-player game $(S_1, S_2, J)$, where the objective function $J : S \to \mathbb{R}_+$ is defined to be the achievable data rate of the victim user:

$$ J\left(x, [y^{(1)},\dots,y^{(N)}]\right) = \sum_{n=1}^{N} \log\left(1 + \frac{x_n}{\alpha_n h^{(n)} y^{(n)} + \beta_n x_n + N_n}\right). \tag{3} $$

The game $G = (S_1, S_2, J)$ is defined to be the worst-case interference game.

3.3. Derivation of Nash equilibrium conditions

A Nash equilibrium in pure strategies in the WCI game $G$ is defined to be any saddle point $(\bar{x}, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]) \in S$ satisfying

$$ J\left(x, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]\right) \le J\left(\bar{x}, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]\right) \tag{4} $$

$$ \le J\left(\bar{x}, [y^{(1)},\dots,y^{(N)}]\right), \tag{5} $$

for all $x \in S_1$, $[y^{(1)},\dots,y^{(N)}] \in S_2$. Condition (5) immediately implies the claim that Player 1's rate at a Nash equilibrium of $G$ lower bounds the achievable rate with any other feasible interference profile. This bound also extends to other settings: in the noncooperative IW game [10], a (possibly non-unique) Nash equilibrium is known to always exist in pure strategies; condition (5) again yields a lower bound rate at every Nash equilibrium of the IW game for the line corresponding to Player 1.

It is now shown that a Nash equilibrium of $G$ always exists due to certain properties of the objective and strategy sets. First, the convex-concave structure of the objective is established.

Theorem 1. If $\alpha \ge 0$, $\beta \ge 0$, $\gamma > 0$, $h \in \mathbb{R}^{2L}_{+}$, and $\alpha$, $\beta$, $\gamma$, $h$ are bounded, then the function $g : \mathbb{R}_+ \times \mathbb{R}^{2L}_{+} \to \mathbb{R}_+$ defined by

$$ g(x, y) = \log\left(1 + \frac{x}{\alpha h^T y + \beta x + \gamma}\right) \tag{6} $$

is strictly concave in $x$ and is convex in $y$.

Proof. It is first shown that $f : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$, $f(x, \eta) = \log((1+\beta)x + \alpha\eta + \gamma) - \log(\alpha\eta + \beta x + \gamma)$ is convex in $\eta$ and strictly concave in $x$. It is sufficient [27] to show that for all $x \ge 0$, it holds that $\partial^2 f/\partial\eta^2 \ge 0$ on the interval $(-\epsilon, \infty)$ for some $\epsilon > 0$, and similarly for all $\eta \ge 0$ that $\partial^2 f/\partial x^2 < 0$ on the interval $(-\epsilon, \infty)$ for some $\epsilon > 0$. By differentiating and simplifying,

$$ \frac{\partial f}{\partial x} = \frac{\alpha\eta + \gamma}{(\alpha\eta + \beta x + \gamma)\left((\beta+1)x + \alpha\eta + \gamma\right)}, \tag{7} $$

$$ \frac{\partial^2 f}{\partial x^2} = \frac{(\alpha\eta + \gamma)\left(2\beta(\beta+1)x + (2\beta+1)(\alpha\eta + \gamma)\right)}{-(\alpha\eta + \beta x + \gamma)^2\left(\alpha\eta + (\beta+1)x + \gamma\right)^2} < 0, \tag{8} $$

$$ \frac{\partial f}{\partial \eta} = -\frac{\alpha x}{(\alpha\eta + \beta x + \gamma)\left(\alpha\eta + (\beta+1)x + \gamma\right)}, \tag{9} $$

$$ \frac{\partial^2 f}{\partial \eta^2} = \frac{\alpha^2 x\left(2\alpha\eta + (2\beta+1)x + 2\gamma\right)}{(\alpha\eta + \beta x + \gamma)^2\left(\alpha\eta + (\beta+1)x + \gamma\right)^2} \ge 0, \tag{10} $$

where $\epsilon = \gamma/(4\beta(\beta+1))$ in (8), $\epsilon = \gamma/(2\alpha)$ when $\alpha > 0$, and $\epsilon = 1$ when $\alpha = 0$ in (10). For all $(x, y) \in \mathbb{R}_+ \times \mathbb{R}^{2L}_{+}$, it must be that $h^T y \ge 0$. Thus $g(x, y) = f(x, h^T y)$. By the affine mapping composition property [27], it follows that $g(x, y)$ is convex in $y$ and strictly concave in $x$.

Because the objective (3) is a sum of functions that are strictly concave in $x_n$ and convex in $y^{(n)}$, $J$ is strictly concave in $x$ and convex in $[y^{(1)},\dots,y^{(N)}]$.

Theorem 2. The WCI game $G$ has a Nash equilibrium existing in pure strategies, and a value $R^*$.

Proof. Because $S_1 \subset \mathbb{R}^N$ and $S_2 \subset \mathbb{R}^{2LN}$ are closed and bounded, by the Heine-Borel theorem, they are both compact. Also, the objective is a composition of continuous functions, hence continuous, and $J$ is strictly concave in $x$ and convex in $[y^{(1)},\dots,y^{(N)}]$. The conditions of [28, Theorem 4.4] are thus satisfied, and therefore a pure-strategy saddle point exists. Note that the saddle point need not be unique, in general. Because a saddle point exists in pure strategies, the game has a value [28, Theorem 4.1], which will be denoted as $R^*$. Thus,

$$ \max_{x \in S_1}\ \min_{[y^{(1)},\dots,y^{(N)}] \in S_2} J = \min_{[y^{(1)},\dots,y^{(N)}] \in S_2}\ \max_{x \in S_1} J = R^*. \tag{11} $$

3.4. Structure of the worst-case interference

The previous section showed that under very general conditions, a Nash equilibrium exists. However, it is not immediately clear whether there exists a unique Nash equilibrium, or whether Nash equilibria of the WCI game might possess any simplifying structure.

The former question may be addressed by considering the following example: $N = 2$, $L = 2$, $h^{(1)} = h^{(2)} = [1\ 1\ 0\ 0]$, $P^x = 1$, $P^y = [1\ 1]^T$, $N_1 = N_2 > 0$, $\alpha_1 = \alpha_2 = 1$, $\Gamma = 1$, and suppose that the FDM condition is satisfied and the per-tone power constraints are redundant. Then it may be readily verified by symmetry arguments that with $x = [1/2\ \ 1/2]^T$, both $y^{(1)} = [1\ 0\ 0\ 0]^T$, $y^{(2)} = [0\ 1\ 0\ 0]^T$ and $y^{(1)} = y^{(2)} = [1/2\ \ 1/2\ \ 0\ \ 0]^T$ (and convex combinations thereof) form saddle points $(x, [y^{(1)}\ y^{(2)}])$. Thus, Player 2 may have an uncountably infinite number of optimal strategies even under the FDM condition, and hence the saddle point need not be unique in general.

Given that the Nash equilibrium is not generally unique, its structure is explored in the following results. Some basic intuition is first established showing that "waterfilling" is Player 1's optimal strategy in response to the interference induced at a given Nash equilibrium where the FDM condition holds and the individual-tone constraints are inactive.

Theorem 3. Let $(\bar{x}, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}])$ be a Nash equilibrium of the WCI game $G$. If the FDM condition holds for $G$ and $C^x_n \ge P^x$ for all $n$, then the Nash equilibrium strategy of Player 1 (namely, $\bar{x}$) is given by "waterfilling" against the combined noise and interference $\alpha_n h^{(n)} \bar{y}^{(n)} + N_n$ from Player 2.

Proof. Let $(\bar{x}, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}])$ be any saddle point of $J$. The condition $C^x \succeq \mathbf{1}P^x$ ensures that the per-tone constraints are trivially satisfied whenever the power constraint ($P^x$) is. Evaluating the right-hand side of (11), if $\beta_n = 0$ (from the FDM assumption), then

$$ R^* = \max_{x \in S_1} \sum_{n=1}^{N} \log\left(1 + \frac{x_n}{\alpha_n h^{(n)} \bar{y}^{(n)} + N_n}\right). \tag{12} $$

The optimization problem (12) is seen to be precisely the same as single-user rate maximization with parallel Gaussian channels [23], and hence the (modified) waterfilling spectrum is optimal and unique (for fixed $[\bar{y}^{(1)},\dots,\bar{y}^{(N)}]$). In particular, the modified AWGN noise level on tone $n$ is seen to be $\alpha_n h^{(n)} \bar{y}^{(n)} + N_n$. This is the same modified noise level used in the rate-adaptive IW algorithm [9].

Considering the structure of the general WCI game $G$, it is possible to establish uniqueness of Player 1's optimal strategy and strong properties of Player 2's optimal strategy. Henceforth, the set of all Nash equilibria of $G$ is denoted by $\mathcal{P}$.

Theorem 4. The Nash equilibrium strategy of Player 1 is unique; that is, there exists some $\bar{x} \in S_1$ such that for each $(x, [y^{(1)},\dots,y^{(N)}]) \in \mathcal{P}$, it is the case that $x = \bar{x}$. Moreover, for Player 2, the induced "active" interference at each Nash equilibrium is unique; in particular, $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]), (\hat{x}, [\hat{y}^{(1)},\dots,\hat{y}^{(N)}]) \in \mathcal{P}$ imply that $\alpha_n h^{(n)} \tilde{y}^{(n)} = \alpha_n h^{(n)} \hat{y}^{(n)}$ for each $n \in \{1,\dots,N\}$ satisfying $\bar{x}_n > 0$.

Proof. To show that Player 1's optimal strategy is identical for all Nash equilibria, consider the saddle points $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{P}$ and $(\hat{x}, [\hat{y}^{(1)},\dots,\hat{y}^{(N)}]) \in \mathcal{P}$, which are not necessarily distinct. By Theorem 1 and separability over tones, the objective (3) is strictly concave in $x$, and therefore has a unique maximizer [27], namely $\tilde{x}$, when one fixes $[y^{(1)},\dots,y^{(N)}] = [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]$. Observe that $(\hat{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{P}$ by the exchangeability property of saddle points [28]. Consequently, $\hat{x}$ is also the unique maximizer of (3) for $[y^{(1)},\dots,y^{(N)}] = [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]$. This implies that $\tilde{x} = \hat{x}$. Taking $\bar{x} = \tilde{x}$ establishes the result.

To show the second claim, define $I = \{i : \bar{x}_i > 0\}$, where $\bar{x}$ is the unique Nash equilibrium strategy of Player 1 as per the first claim, and suppose that there exists a nonempty set $D = \{n \in I : \alpha_n h^{(n)} \tilde{y}^{(n)} \ne \alpha_n h^{(n)} \hat{y}^{(n)}\}$. Consider $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{P}$ and $(\hat{x}, [\hat{y}^{(1)},\dots,\hat{y}^{(N)}]) \in \mathcal{P}$, where $\tilde{x} = \hat{x} = \bar{x}$. Define $S_2 \ni [\check{y}^{(1)},\dots,\check{y}^{(N)}] = (1/2)[\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}] + (1/2)[\hat{y}^{(1)},\dots,\hat{y}^{(N)}]$. The function $g : \mathbb{R}^N_+ \to \mathbb{R}_+$ defined by

$$ g(i_1,\dots,i_N) = \sum_{n=1}^{N} \log\left(1 + \frac{\bar{x}_n}{i_n + \beta_n \bar{x}_n + N_n}\right) \tag{13} $$

is convex in each variable $i_n$ and strictly convex in each variable $i_n$ for which $n \in I$ due to (10). By the fact that $\emptyset \ne D \subset I$ and the convexity properties, it follows that $g([\alpha_1 h^{(1)}\check{y}^{(1)},\dots,\alpha_N h^{(N)}\check{y}^{(N)}]) < (1/2)\,g([\alpha_1 h^{(1)}\tilde{y}^{(1)},\dots,\alpha_N h^{(N)}\tilde{y}^{(N)}]) + (1/2)\,g([\alpha_1 h^{(1)}\hat{y}^{(1)},\dots,\alpha_N h^{(N)}\hat{y}^{(N)}])$, and consequently that

$$ J\left(\bar{x}, [\check{y}^{(1)},\dots,\check{y}^{(N)}]\right) < \frac{1}{2} J\left(\bar{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]\right) + \frac{1}{2} J\left(\bar{x}, [\hat{y}^{(1)},\dots,\hat{y}^{(N)}]\right) = R^*, \tag{14} $$

which contradicts (5). Therefore $D = \emptyset$.

As a corollary, Theorem 4 implies that the "interference profile" $\alpha_n h^{(n)} y^{(n)} + \beta_n x_n + N_n$ is invariant on each active tone $\{n : x_n > 0\}$ at every Nash equilibrium. Even though the Nash equilibrium need not be unique, one therefore has a strong sense in which to speak of a worst-case interference profile that is most deleterious to Player 1. It is possible to strengthen Theorem 4 by restricting attention to the FDM setting: in Theorem 5, it is shown that in this case the structure of $\mathcal{P}$ is polyhedral. Moreover, once one has obtained a single Nash equilibrium point, the set of all Nash equilibria may be readily deduced. This implies that the set of worst-case interference profiles may be explicitly computed by practitioners for use in offline system design or dynamic operation.

Theorem 5. If the FDM condition is satisfied, then the set $\mathcal{P}$ of all Nash equilibria of the WCI game $G$ is a polytope.⁴

Proof. The result is proven by constructing a polytope $\mathcal{Q}$ and subsequently showing that $\mathcal{P} = \mathcal{Q}$. To construct $\mathcal{Q}$, take any $(\bar{x}, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]) \in \mathcal{P}$ (such a point must exist by Theorem 2). Define $D = \{n : \bar{x}_n = 0\}$, $E = \{n : 0 < \bar{x}_n < C^x_n\}$, $F = \{n : \bar{x}_n = C^x_n\}$, and $I = E \cup F$. Equation (4) holds that

⁴ Different definitions of polytopes exist in the literature; this paper defines a polytope as the bounded intersection of a finite number of half-spaces [27].

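$\bar{x}$ must be an optimum solution of the convex optimization problem:

$$ \max_{x} \sum_{n=1}^{N} \log\left(1 + \frac{x_n}{\alpha_n h^{(n)} \bar{y}^{(n)} + N_n}\right), \tag{15} $$

$$ \text{subject to}\quad x_n \ge 0,\quad n = 1,\dots,N, \tag{16} $$

$$ \sum_{n} x_n \le P^x, \tag{17} $$

$$ C^x \succeq x. \tag{18} $$

Associate Lagrangian dual variables $\lambda \in \mathbb{R}$ and $\nu \in \mathbb{R}^N$ with constraints (17) and (18), respectively. Because the objective is concave in $x$ and Slater's constraint qualification condition is satisfied [27], the Karush-Kuhn-Tucker (KKT) conditions are necessary and sufficient for optimality (for fixed $[y^{(1)},\dots,y^{(N)}] = [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]$):

$$ \frac{1}{\alpha_n h^{(n)} \bar{y}^{(n)} + x_n + N_n} - \lambda \le 0,\quad \nu_n = 0 \quad \text{if } x_n = 0, \tag{19} $$

$$ \frac{1}{\alpha_n h^{(n)} \bar{y}^{(n)} + x_n + N_n} - \lambda = 0,\quad \nu_n = 0 \quad \text{if } 0 < x_n < C^x_n, \tag{20} $$

$$ \frac{1}{\alpha_n h^{(n)} \bar{y}^{(n)} + x_n + N_n} - \lambda - \nu_n = 0 \quad \text{if } x_n = C^x_n, \tag{21} $$

$$ \lambda\left(\sum_{n} x_n - P^x\right) = 0,\quad x \in S_1,\quad \lambda \ge 0,\quad \nu \succeq 0. \tag{22} $$

Suppose that the KKT conditions are satisfied by the triplet $(\bar{x}, \lambda^0, \nu^0)$. The triplet $(\bar{x}, \lambda^0, \nu^0)$ need not be unique, in general. However, the first element is unique (by Theorem 4), and thus it remains to be seen whether the ordered pair $(\lambda^0, \nu^0)$ is unique. If $E \ne \emptyset$, then the pair is unique. To see this, consider $n_0 \in E$, which by (20) uniquely determines $\lambda^0$ and along with (19) and (21) uniquely determines $\nu^0$. Because $1/(\alpha_{n_0} h^{(n_0)} \bar{y}^{(n_0)} + x_{n_0} + N_{n_0}) > 0$ for all $x \in S_1$, on account of (20) it must be that $\lambda^0 > 0$. In this case, we define $\bar{\lambda} = \lambda^0$ and $\bar{\nu} = \nu^0$.

In the event that $E = \emptyset$, observe that because the objective (15) is strictly increasing in $x$, it must be that $I \ne \emptyset$. Also, because $E \subset E \cup F = I \ne \emptyset$, one has $F \ne \emptyset$. Define

$$ \bar{\lambda} = \lambda^0 + \min_{m \in F} \nu^0_m, \tag{23} $$

$$ \bar{\nu}_n = \begin{cases} \nu^0_n - \min_{m \in F} \nu^0_m, & n \in F, \\ 0 & \text{else.} \end{cases} \tag{24} $$

It may be readily verified that $(\bar{x}, \bar{\lambda}, \bar{\nu})$ also satisfies the KKT conditions. Observe that by (24), $\bar{\nu}_n = 0$ for at least one $n \in I$. Because $1/(\alpha_n h^{(n)} \bar{y}^{(n)} + x_n + N_n) > 0$ for all $n \in I \ne \emptyset$, $x \in S_1$, (21) implies that $\bar{\lambda} > 0$. It is therefore the case that the triplet $(\bar{x}, \bar{\lambda}, \bar{\nu})$ satisfies the KKT conditions and $\bar{\lambda} > 0$ whether $E = \emptyset$ or $E \ne \emptyset$.

For each $n \in D$, define $\varphi_n$ as the solution of the equation $1/(\bar{x}_n + \varphi) = \bar{\lambda} + \bar{\nu}_n$, namely $\varphi_n = 1/\bar{\lambda} - \bar{x}_n$. Define the polytope

$$ \mathcal{Q} = \left\{ \left(x, [y^{(1)},\dots,y^{(N)}]\right) \in S :\ x = \bar{x},\ \ \alpha_n h^{(n)} y^{(n)} = \alpha_n h^{(n)} \bar{y}^{(n)}\ \ \forall n \in I,\ \ \alpha_n h^{(n)} y^{(n)} + N_n \ge \varphi_n\ \ \forall n \in D \right\}. \tag{25} $$

It remains to be shown that $\mathcal{P} = \mathcal{Q}$; it is first argued that $\mathcal{Q} \subset \mathcal{P}$. Recall that $(\bar{x}, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]) \in \mathcal{P}$ was used to construct $\mathcal{Q}$, and consider any $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{Q}$. Note that $\tilde{x} = \bar{x}$ by construction of $\mathcal{Q}$. The inequality (5) requires that $[\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]$ be an optimum solution of the convex optimization problem:

$$ \min_{[y^{(1)},\dots,y^{(N)}]} \sum_{n=1}^{N} \log\left(1 + \frac{\bar{x}_n}{\alpha_n h^{(n)} y^{(n)} + \beta_n \bar{x}_n + N_n}\right) \quad \text{subject to}\quad [y^{(1)},\dots,y^{(N)}] \in S_2. \tag{26} $$

However since by Theorem 4, $\alpha_n h^{(n)} \tilde{y}^{(n)} = \alpha_n h^{(n)} \bar{y}^{(n)}$ for all $n \in I$, the objective value is equal, and hence (5) is satisfied. Equation (4) is equivalent to requiring the KKT conditions (19)–(22) to be satisfied for some ordered pair $(\lambda, \nu)$, where $x = \bar{x}$ and $[y^{(1)},\dots,y^{(N)}] = [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]$ are fixed. It is now argued that the choice of $(\lambda, \nu) = (\bar{\lambda}, \bar{\nu})$ satisfies the conditions. For each $n \in \{1,\dots,N\}$, if $n \in D$, then $\alpha_n h^{(n)} \tilde{y}^{(n)} + N_n \ge \varphi_n$ implies that $(\bar{x}_n + \alpha_n h^{(n)} \tilde{y}^{(n)} + N_n)^{-1} - \bar{\lambda} \le 0$ by monotonicity of $1/(x + a)$ in $x \ge 0$ for $a > 0$. If $n \in I$, then $\alpha_n h^{(n)} \tilde{y}^{(n)} + N_n = \alpha_n h^{(n)} \bar{y}^{(n)} + N_n$ by construction of $\mathcal{Q}$, and accordingly (20) or (21) is satisfied. Because both (4) and (5) are satisfied, it follows by definition that $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{P}$, and hence $\mathcal{Q} \subset \mathcal{P}$.

It is now argued that $\mathcal{P} \subset \mathcal{Q}$. Recall that $(\bar{x}, [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]) \in \mathcal{P}$ was used to construct $\mathcal{Q}$ and consider any $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{P}$. By Theorem 4, $\tilde{x} = \bar{x}$. Also by Theorem 4, one has $\alpha_n h^{(n)} \tilde{y}^{(n)} = \alpha_n h^{(n)} \bar{y}^{(n)}$ for all $n \in I$, and therefore it remains only to prove that $\alpha_n h^{(n)} \tilde{y}^{(n)} + N_n \ge \varphi_n$ for all $n \in D$.

Because $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{P}$, there must exist a pair $(\tilde{\lambda}^0, \tilde{\nu}^0)$ such that the triplet $(\bar{x}, \tilde{\lambda}^0, \tilde{\nu}^0)$ satisfies the KKT conditions (for fixed $[y^{(1)},\dots,y^{(N)}] = [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]$).

In the event that $E \ne \emptyset$, define $\tilde{\lambda} = \tilde{\lambda}^0$ and $\tilde{\nu} = \tilde{\nu}^0$. Clearly, the triplet $(\bar{x}, \tilde{\lambda}, \tilde{\nu})$ also satisfies the same KKT conditions. Observe by Theorem 4 that because for $n \in E$ one has $\alpha_n h^{(n)} \tilde{y}^{(n)} = \alpha_n h^{(n)} \bar{y}^{(n)}$, it follows by (20) that $\tilde{\lambda} = \bar{\lambda}$.

In the event that $E = \emptyset$, observe that because $\emptyset = E \subset I \ne \emptyset$, we have $F = I - E \ne \emptyset$. Define

$$ \tilde{\lambda} = \tilde{\lambda}^0 + \min_{m \in F} \tilde{\nu}^0_m, \tag{27} $$

$$ \tilde{\nu}_n = \begin{cases} \tilde{\nu}^0_n - \min_{m \in F} \tilde{\nu}^0_m, & n \in F, \\ 0 & \text{else.} \end{cases} \tag{28} $$

It may be readily verified that $(\bar{x}, \tilde{\lambda}, \tilde{\nu})$ satisfies the KKT conditions (for fixed $[y^{(1)},\dots,y^{(N)}] = [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]$). By (28), there must exist some $n' \in F$ such that $\tilde{\nu}_{n'} = 0$. Similarly, recall that there must exist some $m' \in F$ such that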

$\bar{\nu}_{m'} = 0$. It is now argued that there exists some $m \in F$ such that both $\tilde{\nu}_m = 0$ and $\bar{\nu}_m = 0$. In particular, let $m = m'$. Then by (21) and the fact that the triplet $(\bar{x}, \bar{\lambda}, \bar{\nu})$ satisfies the KKT conditions for $[y^{(1)},\dots,y^{(N)}] = [\bar{y}^{(1)},\dots,\bar{y}^{(N)}]$, one has $1/(\alpha_m h^{(m)} \bar{y}^{(m)} + \bar{x}_m + N_m) \le 1/(\alpha_n h^{(n)} \bar{y}^{(n)} + \bar{x}_n + N_n)$ for all $n \in F$. However, $\alpha_n h^{(n)} \bar{y}^{(n)} = \alpha_n h^{(n)} \tilde{y}^{(n)}$ for all $n \in F$, and therefore $1/(\alpha_m h^{(m)} \tilde{y}^{(m)} + \bar{x}_m + N_m) \le 1/(\alpha_n h^{(n)} \tilde{y}^{(n)} + \bar{x}_n + N_n)$ for all $n \in F$. This (along with the fact that $\tilde{\nu}_{n'} = 0$ for some $n' \in F$) implies that $\tilde{\nu}_m = 0$. Then, (21) for this choice of $m$ implies that $\tilde{\lambda} = \bar{\lambda}$.

Because it is always the case that $\tilde{\lambda} = \bar{\lambda}$, the triplet $(\bar{x}, \bar{\lambda}, \tilde{\nu})$ satisfies the KKT conditions (for $[y^{(1)},\dots,y^{(N)}] = [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]$). Therefore, $1/(\alpha_n h^{(n)} \tilde{y}^{(n)} + \bar{x}_n + N_n) - \bar{\lambda} \le 0$ for all $n \in D$ implies that $\alpha_n h^{(n)} \tilde{y}^{(n)} + N_n \ge \varphi_n$ for all $n \in D$. Thus $(\tilde{x}, [\tilde{y}^{(1)},\dots,\tilde{y}^{(N)}]) \in \mathcal{Q}$.

3.5. Numerical computation of the saddle point

In order to apply the WCI bound in practical settings, it is necessary to develop numerical algorithms to solve for Nash equilibrium strategies and $R^*$. The methodology considered herein is that of interior-point optimization techniques such as the "infeasible start Newton method" [27, Section 10.3]. The general approach of interior-point techniques is to replace the (power and positivity) constraints with barrier functions that become large as the (power and positivity) constraints become tight. By making the increase in the barrier functions progressively sharper, one solves a sequence of problems whose solutions converge to a Nash equilibrium of $G$. We now formally cast the problem (11) in the interior-point setting and argue that it satisfies certain necessary properties needed for convergence. Logarithmic barrier functions are employed to enforce the positivity and power constraints and a Newton-step central path algorithm is used to compute $R^*$ to arbitrary accuracy [27].

Let the central path parameter be denoted by $t \in \mathbb{R}_{++}$ and define $\tilde{S}_1 = \operatorname{int}(S_1)$, $\tilde{S}_2 = \operatorname{int}(S_2)$, and $\tilde{J} : \tilde{S}_1 \times \tilde{S}_2 \to \mathbb{R}_+$, where

$$ \begin{aligned} \tilde{J}\left(x, [y^{(1)},\dots,y^{(N)}]\right) ={} & t^{-1}\log\left(P^x - \sum_{n=1}^{N} x_n\right) + t^{-1}\sum_{n=1}^{N}\log\left(C^x_n - x_n\right) \\ & + \sum_{n=1}^{N}\left[\log\left(1 + \frac{x_n}{\alpha_n h^{(n)} y^{(n)} + \beta_n x_n + N_n}\right) + t^{-1}\log x_n - t^{-1}\sum_{l=1}^{2L}\left(\log y^{(n)}_l + \log\left(C^{y,(n)}_l - y^{(n)}_l\right)\right)\right] \\ & - t^{-1}\sum_{l=1}^{2L}\log\left(P^y_l - \sum_{n=1}^{N} y^{(n)}_l\right). \end{aligned} \tag{29} $$

To establish convergence, it is necessary only to show that $\tilde{J}$ satisfies the following sufficient conditions [27, Section 10.3.4]: that the sublevel sets of $\|\nabla \tilde{J}\|_2$ are closed, and that the Hessian of $\tilde{J}$ is Lipschitz continuous with bounded inverse.

The partial derivatives of $\tilde{J}$,

$$ \begin{aligned} \frac{\partial \tilde{J}}{\partial x_n} ={} & \frac{\alpha_n h^{(n)} y^{(n)} + N_n}{\left(\beta_n x_n + \alpha_n h^{(n)} y^{(n)} + N_n\right)\left((1+\beta_n) x_n + \alpha_n h^{(n)} y^{(n)} + N_n\right)} + \frac{1}{t x_n} - \frac{1}{t\left(P^x - \mathbf{1}^T x\right)} - \frac{1}{t\left(C^x_n - x_n\right)}, \\ \frac{\partial \tilde{J}}{\partial y^{(n)}_m} ={} & \frac{\alpha_n h^{(n)}_m}{(1+\beta_n) x_n + \alpha_n h^{(n)} y^{(n)} + N_n} - \frac{\alpha_n h^{(n)}_m}{\beta_n x_n + \alpha_n h^{(n)} y^{(n)} + N_n} - \frac{1}{t y^{(n)}_m} + \frac{1}{t\left(C^{y,(n)}_m - y^{(n)}_m\right)} + \frac{1}{t\left(P^y_m - \sum_{n=1}^{N} y^{(n)}_m\right)}, \end{aligned} \tag{30} $$

are continuous on $\tilde{S}_1 \times \tilde{S}_2$, implying by continuity of the norm that $\|\nabla \tilde{J}\|_2$ is continuous on $\tilde{S}_1 \times \tilde{S}_2$. Consequently, the sublevel sets $S_\alpha$ for each $\alpha \in \mathbb{R}$,

$$ S_\alpha = \left\{ \left(x, [y^{(1)},\dots,y^{(N)}]\right) \in \tilde{S}_1 \times \tilde{S}_2 :\ \left\|\nabla \tilde{J}\left(x, [y^{(1)},\dots,y^{(N)}]\right)\right\|_2 \le \alpha \right\}, \tag{31} $$

are closed relative to $\tilde{S}_1 \times \tilde{S}_2$. To show that $S_\alpha$ is closed, suppose that $\{z_n\}$ is any sequence in $S_\alpha$ with $z_n \to z$. If $z \in \tilde{S}_1 \times \tilde{S}_2 = \operatorname{int}(S_1 \times S_2)$, then $z \in S_\alpha$ by relative closure. Therefore, it remains only to observe that there does not exist any $z_n \to z$ with $z \in \partial\operatorname{cl}(\tilde{S}_1 \times \tilde{S}_2)$. This follows from examining (30), where it can be seen that $\|\nabla \tilde{J}(z_n)\|_2$ increases without bound for any such $z_n \to z$. This contradicts the assumption that $\{z_n\}$ is a sequence in $S_\alpha$.

In order to show for arbitrary $\alpha \in \mathbb{R}$ that the Hessian is Lipschitz continuous on $S_\alpha$, it is enough to show that each element of $\nabla^2 \tilde{J}$ is continuously differentiable on $S_\alpha$. The partial derivatives of (30) may be readily computed⁵ and seen to be continuous functions on $S_\alpha \subset \tilde{S}_1 \times \tilde{S}_2$. However, $\tilde{S}_1 \times \tilde{S}_2$ is bounded, therefore $S_\alpha$ is also bounded (and closed), hence compact. Therefore, each partial derivative of $\nabla^2 \tilde{J}$, as a continuous function on a compact set, is bounded. Finally, the bounded inverse condition on the Hessian follows from the fact that the barrier functions are strictly concave in $x$ and strictly convex in $[y^{(1)},\dots,y^{(N)}]$. In particular, computation of the Hessian reveals that $\nabla^2_x \tilde{J} \preceq -\left(t^{-1}/(P^x)^2\right) I$ and $\nabla^2_{[y^{(1)},\dots,y^{(N)}]} \tilde{J} \succeq \left(t^{-1}/\max_i (P^y)_i^2\right) I$ on $\tilde{S}_1 \times \tilde{S}_2$, and hence $S_\alpha$.

4. SIMULATION RESULTS

The scope of the WCI analysis extends generally to DMT-based DSL systems. This section examines two particular cases that are deployed prevalently: VDSL and ADSL. In VDSL, a prominent interference issue is the upstream

⁵ The expressions are lengthy and omitted for space.
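As a sanity check on the interior-point formulation of Section 3.5, the sketch below evaluates a barrier-augmented objective of the form (29) and compares the analytic partial derivatives of (30) against central finite differences. It is only an illustration: the function names, the dictionary layout `p`, and every numeric value are assumptions chosen for the example (two tones, two interferer entries), not parameters from the paper, and natural logarithms are assumed throughout.

```python
import math

def _noise(n, y, p):
    # alpha_n * h^(n) y^(n) + N_n for tone n
    return p["alpha"][n] * sum(hl * yl for hl, yl in zip(p["h"][n], y[n])) + p["N"][n]

def barrier_objective(x, y, t, p):
    """Barrier-augmented objective in the form of (29); y[n][l] is entry l of y^(n)."""
    tones, entries = range(len(x)), range(len(p["Py"]))
    val = math.log(p["Px"] - sum(x)) / t
    for n in tones:
        a = _noise(n, y, p)
        val += math.log(1.0 + x[n] / (a + p["beta"][n] * x[n]))
        val += (math.log(x[n]) + math.log(p["Cx"][n] - x[n])) / t
        val -= sum(math.log(y[n][l]) + math.log(p["Cy"][n][l] - y[n][l])
                   for l in entries) / t
    for l in entries:
        val -= math.log(p["Py"][l] - sum(y[n][l] for n in tones)) / t
    return val

def dJ_dx(n, x, y, t, p):
    """Analytic partial derivative with respect to x_n, first line of (30)."""
    a, b = _noise(n, y, p), p["beta"][n]
    rate = a / ((b * x[n] + a) * ((1.0 + b) * x[n] + a))
    return (rate + 1.0 / (t * x[n]) - 1.0 / (t * (p["Px"] - sum(x)))
            - 1.0 / (t * (p["Cx"][n] - x[n])))

def dJ_dy(n, m, x, y, t, p):
    """Analytic partial derivative with respect to y_m^(n), second line of (30)."""
    a, b = _noise(n, y, p), p["beta"][n]
    g = p["alpha"][n] * p["h"][n][m]
    rate = g / ((1.0 + b) * x[n] + a) - g / (b * x[n] + a)
    used = sum(row[m] for row in y)  # total power of entry m over all tones
    return (rate - 1.0 / (t * y[n][m]) + 1.0 / (t * (p["Cy"][n][m] - y[n][m]))
            + 1.0 / (t * (p["Py"][m] - used)))

# Hypothetical toy instance (all values are illustrative assumptions).
p = {"alpha": [1.0, 0.5], "beta": [0.1, 0.2], "N": [0.1, 0.2],
     "h": [[0.5, 0.3], [0.4, 0.2]], "Px": 1.0, "Cx": [0.8, 0.8],
     "Py": [1.0, 1.0], "Cy": [[0.9, 0.9], [0.9, 0.9]]}
x0, y0, t0 = [0.3, 0.4], [[0.2, 0.1], [0.3, 0.2]], 10.0

def fd_x(n, eps=1e-6):
    xp, xm = list(x0), list(x0)
    xp[n] += eps
    xm[n] -= eps
    return (barrier_objective(xp, y0, t0, p) - barrier_objective(xm, y0, t0, p)) / (2 * eps)

def fd_y(n, m, eps=1e-6):
    yp, ym = [list(r) for r in y0], [list(r) for r in y0]
    yp[n][m] += eps
    ym[n][m] -= eps
    return (barrier_objective(x0, yp, t0, p) - barrier_objective(x0, ym, t0, p)) / (2 * eps)
```

A Newton-type central path method would use these derivatives (and the Hessian, omitted here) to step toward a stationary point for increasing `t`; the sketch only verifies that objective and gradient agree.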

[Figure 2 diagram labels: CO; 19 × 300 m lines; 10 × 1200 m lines; 1 × variable-length line.]

Figure 2: Binder configuration for upstream VDSL simulations (not to scale). The dashed line is of varying lengths.
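Since the simulations rely on waterfilling (the core of rate-adaptive IW), a minimal sketch of Player 1's waterfilling response from Theorem 3 may be useful. It assumes the FDM case ($\beta_n = 0$), inactive per-tone caps, and base-2 logarithms; the function names and the noise value $N_1 = N_2 = 0.1$ are assumptions for illustration. The data reproduce the two-tone example of Section 3.4, where two different interferer strategies induce the same interference and hence the same response $x = [1/2\ \ 1/2]^T$.

```python
import math

def induced_noise(alpha, h, y, N):
    """Per-tone modified noise level alpha_n * h^(n) y^(n) + N_n."""
    return [a * sum(hl * yl for hl, yl in zip(hn, yn)) + Nn
            for a, hn, yn, Nn in zip(alpha, h, y, N)]

def waterfill(noise, total_power, iters=100):
    """Allocate total_power over tones by bisection on the water level."""
    lo, hi = min(noise), max(noise) + total_power  # bracket for the water level
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        used = sum(max(mu - v, 0.0) for v in noise)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    mu = 0.5 * (lo + hi)
    return [max(mu - v, 0.0) for v in noise]

def rate(x, noise):
    """Sum rate in bits per symbol, as in (12) with base-2 logarithms."""
    return sum(math.log2(1.0 + xn / v) for xn, v in zip(x, noise))

# Two-tone example of Section 3.4 (N_1 = N_2 = 0.1 is an assumed value).
alpha = [1.0, 1.0]
h = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
N = [0.1, 0.1]
y_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # each interferer uses one tone
y_b = [[0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]  # both interferers split evenly

noise_a = induced_noise(alpha, h, y_a, N)
noise_b = induced_noise(alpha, h, y_b, N)
x_star = waterfill(noise_a, total_power=1.0)
```

Both `noise_a` and `noise_b` equal 1.1 on each tone, so the victim's waterfilling allocation and rate are identical for the two interferer strategies, which is the non-uniqueness the example in Section 3.4 points out.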

near-far effect, which is caused by crosstalk from short ("near") lines FEXT coupling into longer ("far") lines. In ADSL, the issue of RT FEXT injection into longer CO lines is similarly of concern. Numerical results for these sample deployments demonstrate the practicality of the WCI analysis and show surprising commonalities between the different scenarios. In all simulations, the interior-point technique is used with an error tolerance of less than 0.1%.

[Figure 3 plot: rate (MBps) versus victim line length (m); curves: WCI lower rate bound ($R_d^*$), full-power rate-adaptive IW.]

Figure 3: Achievable rates in upstream VDSL as a function of victim line length (200–1000 m).

4.1. VDSL upstream

The WCI rate bound is first applied to two different upstream VDSL scenarios exhibiting the near-far effect. The binder configuration is illustrated in Figure 2. For all simulations, 19 × 300 m lines, 10 × 1200 m lines, and one line of varying length occupy the binder of 24 AWG twisted pairs. The FTTEx M2 (998 FDM) bandplan is employed with HAM bands notched and the usual PSD constraints removed. Tones below 138 kHz are disabled for ADSL compatibility, and the normal PSD masks are not applied. The FDM condition is satisfied for this configuration, hence $\beta_n = 0$. For $10^{-7}$ BER, assume coding gain of 3 dB, with 6 dB margin, thus $\Gamma = 12.5$ dB. Each line is limited to 14.5 dBm power ($P^x = 14.5$ dBm, $P^y = \mathbf{1} \cdot 14.5$ dBm).

4.1.1. WCI rate as a function of line length

First, consider the WCI rate bound when the variable-length line is the victim line (Player 1). Numerical results are shown in Figure 3, where a lower bound rate as well as the rate obtained when all lines execute full-power rate-adaptive (RA) IW are plotted as a function of victim line length. Note that full-power RA IW is quite different from fixed-margin (FM) IW, where power is minimized while achieving a fixed rate and margin [18]. To investigate practical bit loading constraints numerically, RA IW with discrete bit constraints [9] is executed on the victim modem assuming the WCI (11). Player 1's achieved rate with discrete bit loading is plotted as $R_d^*$. Evidently, $R_d^* \le R^*$, and therefore $R_d^*$ is also a lower bound to the achievable rate under the WCI.

Observe that for most line lengths, the rate achieved by RA IW is fairly close to the WCI bound, particularly near 200 m and 900 m. For intermediate lengths (≈ 650 m), rate-adaptive IW can perform up to ≈ 75% better than the WCI bound, though the absolute difference is small. As a corollary, the interference generated by IW in this configuration is deleterious in the sense that it is close in rate to the WCI saddle point. This finding is consistent with results [11] showing that other centralized DSM strategies can significantly outperform IW in such cases. Furthermore, fixed-margin (FM) IW can also be seen to perform significantly better than the WCI bound when rates are adjudicated reasonably [18].

4.1.2. WCI rate as a function of PBO

Motivated by the results of the previous section showing that the full-power WCI rate bound can decrease precipitously as loop length increases, the efficacy of upstream power backoff (UPBO) at mitigating this effect is considered. This section examines a simple power-backoff strategy in the form of power-constrained RA IW for Level 0–1 DSM. Though the use of RA IW is retained, an effect similar to fixed-margin (rate-constrained) IW [18] is induced by imposing various tighter sum power constraints. In particular, the variable-length line is set to length 300 m, and (sum) power backoff is imposed on all (20) 300 m lines with full power retained on the (10) 1200 m lines. By taking the victim line to be one of the 300 m lines, the 300 m WCI curve in Figure 4 is generated, yielding a lower bound to the achievable data rate for all 300 m lines in the binder. The 1200 m WCI curve represents the case where the victim modem is instead taken to be one of the 1200 m lines. To compare standardized SSM techniques to DSM, the rates achieved using the SSM VDSL

UPBO masking technique defined for the noise A environment [29] are illustrated by dashed horizontal lines.

[Figure 4 plot: rate (MBps) versus 300 m power constraint (dBm); curves: 1200 m WCI bound ($R_d^*$), 300 m WCI bound ($R_d^*$), 300 m RA IW rate, 1200 m RA IW rate, 1200 m ref. PBO rate, 300 m ref. PBO rate.]

Figure 4: Achievable rates in upstream VDSL as a function of short-line (300 m) power backoff.

The results illustrate that a tradeoff exists between the rates of the short and long lines. Examining the 1200 m lines, the proposed technique improves both the RA IW-achieved and WCI bounds significantly up to approximately −30 dBm, with diminishing returns for further PBO as the 300 m line FEXT no longer dominates the interference profile. However, further PBO decreases the achievable rates of the 300 m lines, as expected. The WCI bound is again fairly tight. Thus by employing such a simple PBO scheme with Level 1 DSM, one can dynamically control the tradeoff between short and long lines to best match desired operating conditions, that is, operating with guaranteed ≈ 4 MBps on the 1200 m lines and ≈ 7.75 MBps on the 300 m lines. In this example, the SSM technique achieves approximately the same performance as this simple DSM technique at one tradeoff point (≈ −22 dB PBO).

4.2. ADSL downstream with remote terminals (RTs)

The WCI rate bound is also applicable to ADSL. This section considers an RT ADSL configuration as illustrated in Figure 5. For all simulations, 25 ADSL lines are located 2000 m from a fiber-fed RT 4000 m from the CO. Additionally, 5 × 5000 m lines are present in the binder. The FDM ADSL standard [30] parameters are assumed. As in the VDSL simulations, $\Gamma = 12.5$ dB. Each line is limited to 20.4 dBm downstream power ($P^x = 20.4$ dBm, $P^y = \mathbf{1} \cdot 20.4$ dBm), and the standard PSD masks are neglected.

[Figure 5 diagram labels: CO; RT at 4000 m from the CO; 25 × 6000 m lines (2000 m beyond the RT); 5 × 5000 m lines.]

Figure 5: Binder configuration for downstream RT ADSL simulations (not to scale). A common RT is used for each line.

A common problem of such configurations is that the signal from the CO to the non-RT (7000 m) modems will be saturated by FEXT from the RT lines. As in the VDSL example, the efficacy of (sum) power backoff for the RT lines as a means of improving the rate of the CO lines is studied. Figure 6 shows the dependence of rates on the level of power backoff (relative to 20.4 dBm) for the RT lines. The horizontal lines represent the performance obtained by SSM with the standardized PSD masks.

[Figure 6 plot: rate (MBps) versus remote terminal PBO (from 20.4 dBm nominal); curves: 6000 m WCI bound ($R_d^*$), 5000 m WCI bound ($R_d^*$), 6000 m RA IW rate, 5000 m RA IW rate, 5000 m ref. PBO rate, 6000 m ref. PBO rate.]

Figure 6: Achievable rates in downstream ADSL as a function of RT line power backoff (relative to 20.4 dBm nominal TX power).

The WCI bound is reasonably close to actual power-controlled RA IW performance on both RT and CO lines. Figure 7 shows the spectrum adopted at the (approximate) Nash equilibrium, as well as the power allocation chosen by discrete IW against the noise induced by Player 2, yielding $R_d^*$ (in discrete IW, tones above 47 are not used because they correspond to fractional bit loadings). The simulation shows that Player 1's interference is dominated by interference from the RT modems; these modems induce a "kindred-like" noise while the CO lines concentrate their power at low frequencies. Also illustrated by example is that the Player 2 optimal strategy may be highly frequency-selective, and therefore the existing interference analysis technique of setting tight PSD masks for each modem cannot capture the WCI unless the masks are set very high.⁶ As in VDSL, a wide range of useful operating points may be attained; for example, it is possible (through proper power control) to guarantee 3 MBps service on all lines, whereas this rate point was far from being feasible with SSM or with full-power rate-adaptive IW. However

⁶ Doing so would consistently overestimate interference power, and underestimate achievable DSM performance.
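The relation $R_d^* \le R^*$ between discrete and continuous bit loading, and the remark that tones with fractional loadings go unused, can be illustrated with a small sketch. Rounding each tone down to an integer number of bits is an assumed simplification here (the discrete IW procedure of [9] is not reproduced), the 15-bit cap is a hypothetical parameter, and the power and noise values are made up.

```python
import math

def continuous_bits(x, noise):
    """Per-tone bit loading log2(1 + x_n / noise_n), where noise_n is the
    gap-normalized noise-plus-interference level on tone n."""
    return [math.log2(1.0 + xn / v) for xn, v in zip(x, noise)]

def discrete_bits(x, noise, max_bits=15):
    """Round each tone down to an integer number of bits (assumed rule);
    tones that would carry less than one bit carry nothing."""
    return [min(max_bits, math.floor(b)) for b in continuous_bits(x, noise)]

# Illustrative values only: three tones with increasing noise level.
x = [1.0, 1.0, 1.0]
noise = [0.1, 0.5, 2.0]

cont = continuous_bits(x, noise)   # fractional loadings
disc = discrete_bits(x, noise)     # integer loadings, never larger than cont
```

With these values the third tone supports only a fraction of a bit, so its discrete loading is zero and the discrete sum rate falls below the continuous one.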

without any power backoff, the performance of RA IW and the WCI bound is near that of SSM, showing the key role of power control in obtaining DSM gains in this setting.

[Figure 7 plot: DS PSD (dBm/Hz) versus tone index; curves: discrete IW against WCI ($R_d^*$), Player 1 Nash eq. strategy, Player 2 Nash eq. strategy.]

Figure 7: Spectral allocations $(x, [y^{(1)},\dots,y^{(N)}])$ of players 1 and 2 for the rightmost lower (0 dB PBO) operating point in Figure 6, where player 1 is a CO line. Note that the RT line spectrum overlaps $x$ on most tones.

5. CONCLUSION

This paper has studied the worst-case interference encountered when deploying Level 0–2 DSM techniques for next-generation DSL. A game-theoretic analysis has shown that under mild conditions, a pure-strategy Nash equilibrium exists in the WCI game, and can be computed using standard optimization techniques. The Nash equilibrium provides a useful lower bound to the achievable rate for a DSL modem employing DSM under any power-constrained interference profile. Furthermore, the structure of the Nash equilibrium reveals that for FDM systems, IW is optimal in a maximin sense.

The WCI bound was applied to a Level 0–1 upstream near-far VDSL scenario and was found to be numerically tight. The utility of a simple DSM UPBO strategy employing RA IW was compared to SSM UPBO, where it was found that control of rate tradeoffs is possible with DSM, which may allow significantly preferable operating rates. A similar tradeoff was observed in RT ADSL systems, where CO line performance benefits significantly from proper power control. These results suggest that the parameter of transmit power is important to DSM performance, in the sense that proper power control can beget large performance gains in this setting.

ACKNOWLEDGMENT

The research was supported by NSF under Contract CNS-0427677 and by the Stanford Graduate Fellowship Program.

REFERENCES

[1] "Spectrum management for loop transmission systems," ANSI Std. T1.417, 2002.
[2] I. Sason, "On achievable rate regions for the Gaussian interference channel," IEEE Transactions on Information Theory, vol. 50, no. 6, pp. 1345–1356, 2004.
[3] W. C. Peng, Some communication jamming games, Ph.D. thesis, University of Southern California, Los Angeles, Calif, USA, 1986.
[4] M. V. Hegde, W. E. Stark, and D. Teneketzis, "On the capacity of channels with unknown interference," IEEE Transactions on Information Theory, vol. 35, no. 4, pp. 770–783, 1989.
[5] T. Han and K. Kobayashi, "A new achievable rate region for the interference channel," IEEE Transactions on Information Theory, vol. 27, no. 1, pp. 49–60, 1981.
[6] S. T. Chung and J. M. Cioffi, "The capacity region of frequency-selective Gaussian interference channels under strong interference," in Proceedings of IEEE International Conference on Communications (ICC '03), vol. 4, pp. 2753–2757, Anchorage, Alaska, USA, May 2003.
[7] K. B. Song, S. T. Chung, G. Ginis, and J. M. Cioffi, "Dynamic spectrum management for next-generation DSL systems," IEEE Communications Magazine, vol. 40, no. 10, pp. 101–109, 2002.
[8] K. J. Kerpez, D. L. Waring, S. Galli, J. Dixon, and P. Madon, "Advanced DSL management," IEEE Communications Magazine, vol. 41, no. 9, pp. 116–123, 2003.
[9] T. Starr, M. Sorbara, J. M. Cioffi, and P. J. Silverman, DSL Advances, Prentice-Hall PTR, Upper Saddle River, NJ, USA, 2003.
[10] S. T. Chung, S. J. Kim, J. Lee, and J. M. Cioffi, "A game-theoretic approach to power allocation in frequency-selective Gaussian interference channels," in Proceedings of IEEE International Symposium on Information Theory (ISIT '03), pp. 316–316, Pacifico Yokohama, Kanagawa, Japan, June–July 2003.
[11] R. Cendrillon, M. Moonen, J. Verlinden, T. Bostoen, and W. Yu, "Optimal multiuser spectrum management for digital subscriber lines," in Proceedings of IEEE International Conference on Communications (ICC '04), vol. 1, pp. 1–5, Paris, France, June 2004.
[12] D. Statovci and T. Nordstrom, "Adaptive subcarrier allocation, power control, and power allocation for multiuser FDD-DMT systems," in Proceedings of IEEE International Conference on Communications (ICC '04), vol. 1, pp. 11–15, Paris, France, June 2004.
[13] G. Cherubini, "Optimum upstream power back-off and multiuser detection for VDSL," in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM '01), vol. 1, pp. 375–380, San Antonio, Tex, USA, November 2001.
[14] J. Lee, R. V. Sonalkar, and J. M. Cioffi, "Multi-user discrete bit-loading for DMT-based DSL systems," in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM '02), vol. 2, pp. 1259–1263, Taipei, Taiwan, November 2002.
[15] K. S. Jacobsen, "Methods of upstream power backoff on very high speed digital subscriber lines," IEEE Communications Magazine, vol. 39, no. 3, pp. 210–216, 2001.
[16] S. Schelstraete, "Defining upstream power backoff for VDSL," IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1064–1074, 2002.
[17] K.-M. Kang and G.-H. Im, "Upstream power back-off method for VDSL transmission systems," IEE Electronics Letters, vol. 39, no. 7, pp. 634–635, 2003.

[18] W. Yu, G. Ginis, and J. M. Cioffi, “Distributed multiuser power control for digital subscriber lines,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, 2002.
[19] M. Ho, J. M. Cioffi, and J. A. C. Bingham, “Discrete multitone echo cancelation,” IEEE Transactions on Communications, vol. 44, no. 7, pp. 817–825, 1996.
[20] K. Van Acker, M. Moonen, and T. Pollet, “Per-tone echo cancellation for DMT-based systems,” IEEE Transactions on Communications, vol. 51, no. 9, pp. 1582–1590, 2003.
[21] G. Ysebaert, K. Vanbleu, G. Cuypers, M. Moonen, and J. Verlinden, “Echo cancellation for discrete multitone frame-asynchronous ADSL transceivers,” in Proceedings of IEEE International Conference on Communications (ICC ’03), vol. 4, pp. 2421–2425, Anchorage, Alaska, USA, May 2003.
[22] D. C. Jones, “Frequency domain echo cancellation for discrete multitone asymmetric digital subscriber line transceivers,” IEEE Transactions on Communications, vol. 43, no. 2–4, pp. 1663–1672, 1995.
[23] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, New York, NY, USA, 1991.
[24] S. Schelstraete, Ed., “Very high speed digital subscriber lines, part 3: Multicarrier modulation (MCM) specification,” ANSI Std. T1.424, 2002.
[25] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, USA, 1985.
[26] J. M. Cioffi, “Incentive-based spectrum management,” T1.E1 Contribution 2004/480R2, August 2004.
[27] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.
[28] T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory, Academic Press, New York, NY, USA, 1982.
[29] “Very high speed digital subscriber lines, part 1: Metallic interface,” ANSI T1.424 (Draft), February 2004.
[30] ITU-T Recommendation G.992.1, “Asymmetric digital subscriber line (ADSL) transceivers,” ITU, June 1999.

Mark H. Brady received his B.S.E.E. degree in 2001 from the University of Illinois at Urbana-Champaign, and his M.S.E.E. degree from Stanford University in 2003. He is presently a doctoral candidate at Stanford University under the supervision of Professor John Cioffi. His research interests include DSL systems, optimization theory, and information theory.

John M. Cioffi received his B.S.E.E. degree in 1978 from University of Illinois and he received his Ph.D.E.E. degree in 1984 from Stanford University. He was with Bell Laboratories from 1978 to 1984 and with IBM Research from 1984 to 1986. He has been a Professor of electrical engineering at Stanford University since 1986. He founded Amati Com. Corp. in 1991 (purchased by TI in 1997) and was Officer/Director from 1991 to 1997. He currently is on the Board of Directors of Marvell, ASSIA, Inc. (Chair), Teranetics, and ClariPhy. He is on the Advisory Board of Portview Ventures and Wavion. His specific interests are in the area of high-performance digital transmission. He is the holder of Hitachi America Professorship in Electrical Engineering at Stanford (2002); he is a Member of the National Academy of Engineering (2001); IEEE Kobayashi Medal (2001); IEEE Millennium Medal (2000); IEEE Fellow (1996); IEE J.J. Thomson Medal (2000); 1999 University of Illinois Outstanding Alumnus, 1991 IEEE Communications Magazine Best Paper; 1995 ANSI T1 Outstanding Achievement Award; NSF Presidential Investigator (1987–1992), ISSLS 2004 Outstanding Paper Award. He has published over 250 papers and holds over 40 patents.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 80941, Pages 1–13
DOI 10.1155/ASP/2006/80941

Joint Multiuser Detection and Optimal Spectrum Balancing for Digital Subscriber Lines

Vincent M. K. Chan and Wei Yu

The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada M5S 3G4

Received 1 December 2004; Revised 27 April 2005; Accepted 8 July 2006

In a digital subscriber line (DSL) system with strong crosstalk, the detection and cancellation of interference signals have the potential to improve the overall data rate performance. However, as DSL crosstalk channels are highly frequency selective and multiuser detection is suitable only when crosstalk is strong, the set of frequency tones in which multiuser detection may be used must be carefully chosen. Further, this problem of tone selection is highly coupled with the transmit power spectra of both direct and interfering signals, so the optimal solution requires the tone selection problem to be solved jointly with the multiuser spectrum optimization problem. The main idea of this paper is that the above joint optimization may be done efficiently using a dual decomposition technique similar to that of the optimal spectrum balancing algorithm. Simulations show that multiuser detection can increase the bit rate performance in a remotely deployed ADSL environment. Rate improvement is also observed when near-end crosstalk is estimated and cancelled in a VDSL environment with overlapping upstream and downstream frequency bands.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

Crosstalk noise is a major limiting factor in wideband digital subscriber line (DSL) systems. Current research has focused on dynamic spectrum management (DSM) techniques for mitigating the effect of crosstalk [1]. The goal of DSM is to facilitate cooperation among mutually interfering lines in a binder. Cooperation may be implemented at two different levels. Power spectral density (PSD) level cooperation allows the optimal set of power spectral densities to be computed for each line in the binder so that the effect of mutual interference is minimized. In this case, multiple transmitters in a DSL binder operate independently, but at mutually accommodating PSD levels. The class of algorithms that are capable of computing the best set of PSDs is called spectrum balancing algorithms (e.g., [2, 3]).

When cooperation is possible, not only at the PSD level, but also at the transmission signal level, the multiline DSL binder can then be truly designed as a multiple-input multiple-output (MIMO) system where multiuser detection algorithms can be implemented [4]. In this case, each line has the full knowledge of the transmitted signal from neighboring lines, and crosstalk can be completely cancelled. The capacity of a DSL binder with signal-level cooperation represents the ultimate capacity limit for DSL systems.

This paper explores a different form of cooperation that lies between the PSD-level and the signal-level cooperation described above. The algorithms described in this paper are most applicable to DSL configurations where the crosstalk channels are heavily unbalanced. For example, in a downstream ADSL deployment with an optical network unit (ONU), some remote terminals (RT) served from the central office (CO) can be located much closer to a nearby ONU than to their own CO. In this case, the crosstalk emitted by the ONU can overwhelm the intended transmission from the CO. Hence, the crosstalk channel can be even stronger than the direct channel.

Signal-level cooperation is often not possible in the case described above. This is true for ADSL systems where the transmitters and the receivers are not physically colocated. In this case, PSD-level cooperation, although capable of producing a large gain as compared to the current practice of static spectrum management, is still not theoretically the best possible. The main point of this paper is that multiuser detection and crosstalk cancellation can bring further improvements to the system performance in these scenarios even when signal-level cooperation is not possible.

One of the main contributions of this paper that enables crosstalk cancellation in systems with no signal-level cooperation is the idea of joint spectrum optimization and multiuser detection. Intuitively, crosstalk cancellation is effective only when the crosstalk signal is strong. In DSL systems, the crosstalk channels are usually more severe at high frequency tones. The crosstalk channel in the low frequency band is often too weak for crosstalk detection. Thus, multiuser detection must be carried out only at a selective set of tones for optimal performance. Further, the magnitude of crosstalk at each tone depends also on the transmit power spectra of the neighboring line at that tone. Hence, the problems of tone selection and optimal multiuser spectrum balancing are strongly coupled. The main novelty of this paper is a method that determines the optimal transmit spectra jointly with the optimal tone selection for multiuser detection. The algorithm is based on the idea of dual optimization, recently applied to the optimal spectrum balancing problem in [3, 5] and its low-complexity version described in [6]. As the results of this paper show, multiuser detection can bring further improvement to the performance of the overall system beyond that of optimal spectral balancing alone without the need for additional cooperation.

The ideas of crosstalk cancellation and power allocation have been considered separately in the past. For example, [7] proposed a maximum-likelihood multiuser detector (ML-MUD) that considers all possible combinations of the interference signals and determines the most likely combination given the received signals. Alternatively, in an interference cancelling multiuser detector (IC-MUD), interference from adjacent users can be estimated, reconstructed, and subtracted from the received signal. It is shown in [8] that this type of interference cancelling scheme can achieve a substantial performance gain for near-end crosstalk cancellation. In terms of power allocation, [9] proposed an efficient method for allocating power in DSL systems with multiuser detection. However, crosstalk is assumed to be strong and crosstalk cancellation is performed in all channels. Hence, none of the previous work considers the joint optimization of bit/power allocation and crosstalk cancellation. The main contribution of this paper is to show that such a joint optimization can be done in a numerically efficient way.

While the crosstalk cancellation schemes mentioned in the above paragraphs involve full detection of the interference signal, this paper explores the possibility of performing partial detection as well. The idea of partial detection stems from the classical information theoretical treatment of interference channel capacity. The largest achievable rate region for a Gaussian interference channel is described in [10, 11]. The main idea of [10, 11] is that the detection and subtraction of the interfering signal is useful and that partial detection can further expand the rate region offered by complete detection. However, information theoretical results deal with frequency-flat channels only. This paper investigates the best achievable rate region for frequency-selective channels where the optimal power allocation across the frequency is of crucial importance.

The following assumptions are made in the rest of the paper. Perfect knowledge of channel state information of the direct and crosstalk channels is assumed. PSD-level coordination between CO and ONU is assumed to be available for computing the best set of power spectra. The multiuser detection scheme used in the algorithm is of the interference cancelling type, in which the interfering signals are either detected fully or partially. Implementing this type of detection requires the assumption that the multiuser detector can perfectly synchronize with the interfering users, for example, using schemes described in [12, 13]. Discrete multitone modulation (DMT) is assumed. Proper insertion of the cyclic prefix and suffix is assumed to ensure orthogonality between the DMT subchannels.

2. OPTIMAL SPECTRUM BALANCING ALGORITHMS

Before addressing the multiuser detection problem, it is useful to review the spectrum optimization problem without multiuser detection and to outline an existing algorithm called optimal spectrum balancing (OSB). The OSB algorithm solves the spectrum optimization problem in a computationally manageable fashion. It is a crucial ingredient for the joint multiuser detection and spectrum balancing algorithm to be described later.

2.1. The spectrum optimization problem

In a $K$-user DSL bundle, the objective of spectrum optimization is to maximize the weighted sum-rate of all participating users given an individual power constraint for each user. Given $P_k$, the power constraint for user $k$, and a set of weights $w_k$ such that $\sum_{k=1}^{K} w_k = 1$, the goal of optimization is to find the set of $S_k^n$, which is the subchannel power for user $k$ in tone $n$, that maximizes the weighted sum of transmission rates of all users. Mathematically, the problem can be written as follows:

$$\max_{\{S_1^n,\ldots,S_K^n\}_{n=1}^{N}} \; \sum_{k=1}^{K} w_k R_k \quad \text{s.t.} \quad \hat{P}_k \le P_k \;\; \forall k, \tag{1}$$

where $\hat{P}_k$ is the total power used by user $k$ (the hat is used here only to distinguish it from the constraint $P_k$), $R_k$ is the total rate achieved by user $k$, and $N$ is the number of frequency tones in the DMT system. Solving (1) for all combinations of $w_k$ gives the achievable rate region of the system. The design variables in this problem are the $S_k^n$'s, subject to the constraints

$$\hat{P}_k = \Delta f \sum_{n=1}^{N} S_k^n \le P_k \tag{2}$$

and $S_k^n \ge 0$ for all $k$, $n$, where $\Delta f$ is the frequency width of the DMT tones. Since DMT modulation facilitates independent data transmission on each tone, $R_k$ in (1) can be calculated as $R_k = (1/T)\sum_{n=1}^{N} b_k^n$, where $T$ is the symbol period and $b_k^n$ denotes the achievable bit rate for user $k$ in tone $n$ given by

$$b_k^n = \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|h_k^n|^2 S_k^n}{\sigma_k^n + \sum_{i\neq k} |\alpha_{i,k}^n|^2 S_i^n}\right) \right\rfloor. \tag{3}$$

Here, $\Gamma$ is the SNR gap, $\sigma_k^n$ is the channel noise variance for user $k$ in tone $n$, $h_k^n$ is the direct channel transfer function for user $k$ in tone $n$, and $\alpha_{i,k}^n$ is the crosstalk transfer function from the $i$th user to the $k$th user in tone $n$.

The following assumptions are made in the above rate calculation. First, discrete bit-loading is assumed, meaning that the number of bits loaded into each tone is restricted to integer values. Second, a transmitted signal from one user is always treated as noise for all other users. The possibility of crosstalk cancellation and multiuser detection is disregarded. Third, intertone interference caused by channel propagation delay and unsynchronous DMT blocks is neglected. This assumption is reasonable as long as the intertone interference is minimized in practical frame-synchronous systems implementing zipper-like modulation [12, 13]. With the last two assumptions, the signal received by user $k$ contains crosstalk interference from all other users on a tone-by-tone basis.

2.2. Optimal spectrum balancing

The main difficulty of the spectrum optimization problem (1) is that $R_k$ is a nonconvex function of $S_k^n$. As the optimization is coupled over frequency by the power constraints, solving this problem with a brute-force approach involves searching through all possible bit allocations on all frequency tones. This requires a complexity that is exponential in $N$, where $N = 256$ for ADSL and $N = 4096$ for VDSL systems. Clearly, this is computationally intractable in a practical implementation.

To reduce the computational complexity, the OSB algorithm proposed in [3] uses the idea of dual decomposition and solves the problem in the Lagrangian dual-domain. The main idea is to form the dual of the original problem and to decompose the dual problem on a tone-by-tone basis. The dual problem is the optimization $\min_{\lambda_1,\ldots,\lambda_K} g(\lambda_1,\ldots,\lambda_K)$ subject to $\lambda_k \ge 0$. Hence, solving the dual problem consists of evaluating the dual objective $g(\lambda_1,\ldots,\lambda_K)$ for fixed $\{\lambda_1,\ldots,\lambda_K\}$ and minimizing $g(\lambda_1,\ldots,\lambda_K)$ over nonnegative $\lambda_k$'s.

The evaluation of $g(\lambda_1,\ldots,\lambda_K)$ can be simplified by decomposing the dual objective as follows:

$$\begin{aligned} g(\lambda_1,\ldots,\lambda_K) &= \max_{\{S_1^n,\ldots,S_K^n\}_{n=1}^{N}} \; \sum_{k=1}^{K} w_k R_k - \sum_{k=1}^{K} \lambda_k\big(\hat{P}_k - P_k\big) \\ &= \sum_{n=1}^{N} \max_{S_1^n,\ldots,S_K^n} \; \sum_{k=1}^{K} \big(w_k b_k^n - \lambda_k S_k^n\big) + \sum_{k=1}^{K} \lambda_k P_k, \end{aligned} \tag{4}$$

where, in the second line, the constant factors $1/T$ and $\Delta f$ are taken to be absorbed into $w_k$ and $\lambda_k$.

The function $g(\lambda_1,\ldots,\lambda_K)$ can be decoupled into $N$ per-tone maximization problems. Since discrete bit-loading is assumed, each subproblem becomes discrete and the search space becomes finite. Hence, each of the $N$ maximizations over $\{S_1^n,\ldots,S_K^n\}$ can be solved by an exhaustive search over all possible combinations of $\{b_1^n,\ldots,b_K^n\}$ instead. Let the maximum number of bits on each tone be $B$. The exhaustive search involves $B^K$ combinations. For each combination, the corresponding $\{S_1^n,\ldots,S_K^n\}$ may be calculated by inverting (3), and the one maximizing the Lagrangian as in (4) may be found. As the maximization can be done on each tone individually, the complexity of evaluating $g(\lambda_1,\ldots,\lambda_K)$ is $O(NB^K)$, which is linear rather than exponential in $N$.

The minimization of $g(\lambda_1,\ldots,\lambda_K)$ can be efficiently solved using a subgradient search method. The idea is to keep adjusting $\{\lambda_1,\ldots,\lambda_K\}$ in proportion to a subgradient. Global optimum is always attainable because the dual problem is convex. It is pointed out in [5] that a subgradient of $g(\lambda_1,\ldots,\lambda_K)$ is $\mathbf{P} - \Delta f \sum_{n=1}^{N} \mathbf{S}_n$, where $\mathbf{P} = [P_1 \cdots P_K]^T$ and $\mathbf{S}_n = [S_1^n \cdots S_K^n]^T$. Using this subgradient corresponds to increasing $\lambda_k$ if $\Delta f \sum_{n=1}^{N} S_k^n \ge P_k$ and decreasing $\lambda_k$ if $\Delta f \sum_{n=1}^{N} S_k^n \le P_k$. The complexity of the subgradient search is polynomial in $K$. Thus, the overall complexity of the OSB algorithm is kept at $O(NB^K)$.

The optimal spectrum balancing algorithm works for the following reason. In general, for nonconvex optimization problems, solving the dual problem provides only an upper bound to the primal problem. The difference between the primal optimum and the dual optimum is called the duality gap. From dual optimization theory, the duality gap is zero if the primal problem is convex, that is,

$$\max_{S_1^n,\ldots,S_K^n} \; \sum_{k=1}^{K} w_k R_k = \min_{\lambda_1,\ldots,\lambda_K} g(\lambda_1,\ldots,\lambda_K). \tag{5}$$

It turns out that for the spectrum optimization problem in DMT systems, the duality gap is zero even though the primal problem is nonconvex [5]. The reason is that all DMT-based systems satisfy a so-called time-sharing property which essentially transforms the nonconvex objective function into a convex function. More precisely, given the total power of two power allocation schemes $P_x$, $P_y$, let $R(P)$ denote the maximum rate achievable using $P$. The requirement of the time-sharing property is that every intermediate rate $vR(P_x) + (1-v)R(P_y)$ must be achievable using $vP_x + (1-v)P_y$ (where $0 \le v \le 1$ is the time-sharing variable). The time-sharing property ensures that $R(P)$ is concave in $P$, which in turn ensures the zero duality gap.

The DMT systems satisfy the time-sharing property whenever the frequency tone spacing is small. In this case, the intermediate rate can be achieved by interleaving the frequency tones of the two original power allocations corresponding to $R(P_x)$ and $R(P_y)$. The approximation is accurate as long as $N$ is sufficiently large, which is true for practical DSL systems.

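The per-tone exhaustive search and the subgradient update of Section 2.2 can be illustrated with a small two-user sketch. This is not the authors' implementation: the function names, the tone tuple layout `(|h1|^2, |h2|^2, |alpha21|^2, |alpha12|^2, noise1, noise2)`, the constant step size, and the fixed iteration count are all assumptions made for illustration. For each integer bit pair the sketch inverts the rate formula (3) by solving the two linear equations it implies for the powers, then keeps the pair that maximizes the per-tone Lagrangian in (4).

```python
import itertools


def invert_rates(b1, b2, tone, gap=1.0):
    """Smallest powers (S1, S2) that load (b1, b2) bits on one tone when
    crosstalk is treated as noise, i.e. rate formula (3) solved for S.
    Returns None if the bit pair cannot be supported at any power."""
    g1, g2, a21, a12, n1, n2 = tone  # hypothetical layout, see lead-in
    t1 = gap * (2.0 ** b1 - 1.0)     # required SINR of user 1
    t2 = gap * (2.0 ** b2 - 1.0)     # required SINR of user 2
    det = g1 * g2 - t1 * t2 * a21 * a12
    if det <= 0.0:
        return None
    s1 = t1 * (n1 * g2 + a21 * t2 * n2) / det
    s2 = t2 * (n2 * g1 + a12 * t1 * n1) / det
    return s1, s2


def best_on_tone(w, lam, tone, max_bits, gap=1.0):
    """Exhaustive per-tone maximization of sum_k (w_k b_k - lam_k S_k)."""
    best = None
    for b1, b2 in itertools.product(range(max_bits + 1), repeat=2):
        powers = invert_rates(b1, b2, tone, gap)
        if powers is None:
            continue
        value = w[0] * b1 + w[1] * b2 - lam[0] * powers[0] - lam[1] * powers[1]
        if best is None or value > best[0]:
            best = (value, (b1, b2), powers)
    return best


def osb_two_users(tones, w, p_max, max_bits=8, gap=1.0,
                  step=1.0, iters=100, delta_f=1.0):
    """Subgradient search on (lam1, lam2); step and iters are illustrative."""
    lam = [1.0, 1.0]
    alloc, used = [], [0.0, 0.0]
    for _ in range(iters):
        # Evaluate g(lam) tone by tone, as in the decomposition (4).
        alloc = [best_on_tone(w, lam, tone, max_bits, gap) for tone in tones]
        used = [delta_f * sum(a[2][k] for a in alloc) for k in range(2)]
        # Raise lam_k when user k exceeds its budget, lower it otherwise.
        lam = [max(0.0, lam[k] + step * (used[k] - p_max[k])) for k in range(2)]
    return lam, alloc, used
```

The returned allocation is the one evaluated at the multipliers of the last iteration; a practical implementation would use a diminishing step size and a stopping rule based on the power constraints.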
2.3. Iterative spectrum balancing

Although the complexity of OSB is linear in $N$, the optimization within each tone, namely $\max_{S_1^n,\ldots,S_K^n} \sum_{k=1}^{K} (w_k b_k^n - \lambda_k S_k^n)$, has exponential complexity in $K$. To further reduce this complexity, an approximate near-optimal iterative spectrum balancing (ISB) algorithm is devised in [6]. The main idea of the ISB is to evaluate (4) approximately by iteratively optimizing $\sum_{k=1}^{K} (w_k b_k^n - \lambda_k S_k^n)$ on a user-by-user basis. Specifically, the following maximization is performed repeatedly until convergence:

$$\max_{S_K^n} \cdots \max_{S_2^n} \max_{S_1^n} \; \sum_{k=1}^{K} \big(w_k b_k^n - \lambda_k S_k^n\big). \tag{6}$$

Hence, the algorithm first optimizes $S_1^n$ while keeping $S_2^n,\ldots,S_K^n$ fixed, then optimizes $S_2^n$ keeping all other $S_k^n$ fixed, then $S_3^n,\ldots,S_K^n$, then $S_1^n, S_2^n, \ldots$, and so on. Convergence is guaranteed because the objective function is nondecreasing in each iteration. Although not globally optimal, simulation shows that this scheme provides a near-optimal performance as compared to OSB for many practical channels.

The major advantage that ISB offers over OSB is that its computational complexity is polynomial in the number of users (and linear in the number of tones as before). OSB is not practical when the number of users is large. However, ISB can be applied to a large number of users while providing a substantial performance gain over conventional methods such as iterative water-filling [2].

3. JOINT MULTIUSER DETECTION AND OPTIMAL SPECTRUM BALANCING

In both spectrum balancing algorithms, as described in the previous section, crosstalk from adjacent users is always regarded as noise. This is near-optimal when the crosstalk channel gains, $\alpha_{i,k}^n$ for $i \neq k$, are small. In many practical circumstances, however, an interfering transmitter can be located very close to the receiver of a neighboring user, for example, see Figure 1. In this case, crosstalk cancellation schemes as described in the following sections may potentially bring additional performance gains. The discussion in this section is restricted to the detection of far-end crosstalk (FEXT). Near-end crosstalk (NEXT) cancellation will be addressed later.

[Figure 1 diagram; labels: CO, ONU, User 1, User 2, $l_1$, $l_2$, $c_1$, $c_2$, strong crosstalk, downstream transmission.]

Figure 1: Loop topology for 2-user ADSL downstream.

3.1. Full detection of the interfering user

The main idea proposed in this paper is that multiuser detectors (MUD) can be applied in conjunction with spectrum optimization in situations such as that in Figure 1. A multiuser detector at the receiver of user 1 works by first detecting and subtracting the signal from user 2 in the received signal, then detecting the signal from user 1. Implementation of this scheme requires error-free decoding of user 2 at user 1. Thus, the bit rate of user 2 is restricted by the quality of the crosstalk channel. Therefore,

$$\begin{aligned} \tilde{b}_1^n &= \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|h_1^n|^2 S_1^n}{\sigma_1^n}\right) \right\rfloor, \\ \tilde{b}_2^n &= \min\!\left( \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|\alpha_{2,1}^n|^2 S_2^n}{\sigma_1^n + |h_1^n|^2 S_1^n}\right) \right\rfloor, \; \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|h_2^n|^2 S_2^n}{\sigma_2^n + |\alpha_{1,2}^n|^2 S_1^n}\right) \right\rfloor \right) \end{aligned} \tag{7}$$

is an achievable rate pair. Note the removal of the $|\alpha_{2,1}^n|^2 S_2^n$ term in the noise of $\tilde{b}_1^n$ due to crosstalk cancellation. Thus, $\tilde{b}_1^n$ is now larger than before. However, to ensure that the signal carrying $\tilde{b}_2^n$ can be cancelled by the first user, $\tilde{b}_2^n$ is now the minimum of the rate allowed by the crosstalk channel, $\lfloor \log_2(1 + (1/\Gamma)\cdot(|\alpha_{2,1}^n|^2 S_2^n/(\sigma_1^n + |h_1^n|^2 S_1^n))) \rfloor$, and the rate of the direct channel, $\lfloor \log_2(1 + (1/\Gamma)\cdot(|h_2^n|^2 S_2^n/(\sigma_2^n + |\alpha_{1,2}^n|^2 S_1^n))) \rfloor$.

Since channel gains are frequency selective, not every tone in the crosstalk channel is suitable for multiuser detection. Good quality crosstalk channels, or channels with large $\alpha_{2,1}^n$, only reside in the high frequencies where the crosstalk coupling between lines is strong. Thus, the multiuser detection scheme is effective when it is applied only to high frequency tones. Making such a tone selection for multiuser detection is not trivial but important for achieving the optimal weighted sum-rate.

This paper proposes a method that jointly determines the optimal tone selection and optimal spectrum in an efficient manner. The method is based on the dual decomposition idea of the OSB algorithm. For any tone $n$, multiuser detection at receiver 1 can be enabled or disabled. This provides an alternative mapping function from $\{S_1^n,\ldots,S_K^n\}$ to $\{b_1^n,\ldots,b_K^n\}$. The choice between the two for each tone is the one that maximizes $g(\lambda_1,\ldots,\lambda_K)$. When $K = 2$, (4) can be modified as follows:

$$g(\lambda_1,\lambda_2) = \sum_{n=1}^{N} \max_{S_1^n,S_2^n} \left[ \max\!\left( \sum_{k=1}^{2} w_k b_k^n, \; \sum_{k=1}^{2} w_k \tilde{b}_k^n \right) - \sum_{k=1}^{2} \lambda_k S_k^n \right] + \sum_{k=1}^{2} \lambda_k P_k. \tag{8}$$

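The per-tone choice in (8) between treating crosstalk as noise and cancelling it at receiver 1 can be sketched as follows. This is an illustrative two-user example under stated assumptions, not the authors' code: the helper names, the mode labels `"noise"` and `"mud"`, and the tone tuple layout `(|h1|^2, |h2|^2, |alpha21|^2, |alpha12|^2, noise1, noise2)` are hypothetical. For each integer bit pair, the sketch computes the least power needed under the no-detection rates (3) and under the full-detection rates (7), then keeps whichever mode and bit pair give the largest per-tone Lagrangian.

```python
import itertools


def powers_crosstalk_as_noise(b1, b2, tone, gap=1.0):
    """Invert (3): least (S1, S2) for (b1, b2) bits with no detection."""
    g1, g2, a21, a12, n1, n2 = tone  # hypothetical layout, see lead-in
    t1 = gap * (2.0 ** b1 - 1.0)
    t2 = gap * (2.0 ** b2 - 1.0)
    det = g1 * g2 - t1 * t2 * a21 * a12
    if det <= 0.0:
        return None
    return (t1 * (n1 * g2 + a21 * t2 * n2) / det,
            t2 * (n2 * g1 + a12 * t1 * n1) / det)


def powers_full_detection(b1, b2, tone, gap=1.0):
    """Invert (7): least (S1, S2) when receiver 1 decodes and removes user 2."""
    g1, g2, a21, a12, n1, n2 = tone
    t1 = gap * (2.0 ** b1 - 1.0)
    t2 = gap * (2.0 ** b2 - 1.0)
    s1 = t1 * n1 / g1                 # user 1 sees no crosstalk after cancellation
    if t2 == 0.0:
        return s1, 0.0
    if a21 <= 0.0:
        return None                   # user 2 cannot be decoded at receiver 1
    # User 2 must be decodable both at receiver 1 and at its own receiver.
    s2 = max(t2 * (n1 + g1 * s1) / a21, t2 * (n2 + a12 * s1) / g2)
    return s1, s2


def best_mode_on_tone(w, lam, tone, max_bits, gap=1.0):
    """Per-tone term of (8): search bit pairs and both MUD modes."""
    modes = (("noise", powers_crosstalk_as_noise), ("mud", powers_full_detection))
    best = None
    for b1, b2 in itertools.product(range(max_bits + 1), repeat=2):
        for name, solve in modes:
            powers = solve(b1, b2, tone, gap)
            if powers is None:
                continue
            value = (w[0] * b1 + w[1] * b2
                     - lam[0] * powers[0] - lam[1] * powers[1])
            if best is None or value > best[0]:
                best = (value, name, (b1, b2), powers)
    return best
```

With a strong crosstalk gain into receiver 1 the search tends to select the `"mud"` mode, while with weak crosstalk it falls back to `"noise"`; the search space is twice that of plain OSB, so the order of complexity in the number of tones is unchanged.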
[Figure 2 diagram: user 1 transmits $S_1^n$; user 2 transmits $\beta_n S_2^n$, which is handled by the MUD at user 1, and $(1-\beta_n) S_2^n$; downstream transmission.]

Figure 2: Partial interference detection for 2-user ADSL downstream.

The set $\{S_1^n, S_2^n\}$ obtained at the minimum of $g(\lambda_1,\lambda_2)$ is the optimal power spectra, and the choice of $b_k^n$ or $\tilde{b}_k^n$ in the inner maximization determines the MUD mode for tone $n$. Similar to optimal spectrum balancing, the search for the optimal $\{S_1^n, S_2^n\}$ can be performed by searching for the optimal $\{b_1^n, b_2^n\}$ or $\{\tilde{b}_1^n, \tilde{b}_2^n\}$ and inverting (3) or (7), respectively, to obtain the corresponding $\{S_1^n, S_2^n\}$. Although an extra maximization computation is required when multiuser detection is taken into account, the order of complexity remains at $O(NB^2)$.

As in the case of OSB, the joint multiuser detection and optimal spectrum balancing algorithm works by minimizing $g(\lambda_1,\lambda_2)$ over all $\lambda_k$'s using a subgradient algorithm. When $N \to \infty$, in which case the time-sharing property of the DMT system holds, global optimality of this algorithm is guaranteed, as shown in the following theorem.

Theorem 1. The joint multiuser detection and optimal spectrum balancing algorithm achieves global optimality in the spectrum optimization problem (1) as $N \to \infty$.

Proof. The frequency tone spacing approaches zero as $N \to \infty$. In this case, the DMT system can achieve the time-sharing property by using the frequency tone interleaving scheme described in Section 2.2. This reduces the duality gap to zero. Hence, global optimality can be achieved by minimizing the dual objective $g(\lambda_1,\lambda_2)$. Since the dual objective is always convex, global optimum can always be reached by using a subgradient search.

This proof of global optimality is similar to that of the OSB algorithm. The inclusion of the alternative mapping $\{\tilde{b}_k^n\}$ does not affect the convexity of the primal objective with respect to the power constraint. As long as $g(\lambda_1,\lambda_2)$ is evaluated by maximizing over all $\{b_1^n, b_2^n\}$ and $\{\tilde{b}_1^n, \tilde{b}_2^n\}$, global optimum can be reached by minimizing $g(\lambda_1,\lambda_2)$.

For a general 2-user interference channel, an MUD can be installed at both, either, or neither of the receivers, resulting in a total of four options. However, the placement of MUD can often be easily determined for practical channels given the channel lengths. Referring to Figure 1, simulation experience shows that an MUD at user $i$ is effective only if $c_i/l_i < 1$. Clearly, it is not possible that both $c_1$ [...]

The above method for finding the optimal power spectrum with MUD at the receivers can be extended to more than two users. However, the algorithm does become more complex. With two users, as in the previous example, there are only two modes for the MUD: either cancelling or ignoring the crosstalk. If instead there are $S$ users connecting to the CO and $T$ users connecting to the ONU in Figure 1, there are $ST$ cancellable strong crosstalk channels, giving a total of $2^{ST}$ MUD modes. To lower the complexity, an upper limit should be imposed on the number of crosstalk channels considered for cancellation while the rest of the crosstalk channels should be ignored for cancellation. Choosing which crosstalk should be ignored depends on the actual channel configurations. Nevertheless, once the choice of cancellable crosstalk is made offline, the joint multiuser detection and OSB algorithm determines the optimal spectra efficiently.

So far, the type of multiuser detection described involves fully resolving the signals transmitted from the interfering user. Intuitively, this imposes a strict upper bound on the bit rate of user 2. To relax this restriction, a scheme that involves only partial detection of the interfering user is introduced in the next section.

3.2. Partial detection of the interfering user

In a 2-user interference channel, partial detection of the signal from user 2 at user 1 on tone $n$ works by first partitioning the bitstream at transmitter 2 and then allocating $\beta_n S_2^n$ and $(1-\beta_n) S_2^n$ to the two streams. Here, $\beta_n$ denotes the fraction of signal power at user 2 intended for multiuser detection. The two bitstreams are modulated separately and transmitted through the same channel, as illustrated in Figure 2.

One possible scheme for implementing bitstream partitioning is nested signal constellation. Suppose $b_{2,\beta}^n$ are the bits resulting from $\beta_n S_2^n$, which are designed for multiuser detection by user 1, and $b_{2,\bar{\beta}}^n$ are the undetected bits resulting from $(1-\beta_n) S_2^n$. The $b_{2,\beta}^n$ bits are first modulated in a $2^{b_{2,\beta}^n}$-point constellation. Each signal point is yet another constellation [...]

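$$\begin{aligned} \tilde{b}_1^n(\beta_n) &= \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|h_1^n|^2 S_1^n}{\sigma_1^n + |\alpha_{2,1}^n|^2 (1-\beta_n) S_2^n}\right) \right\rfloor, \\ \tilde{b}_2^n(\beta_n) &= \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|h_2^n|^2 (1-\beta_n) S_2^n}{\sigma_2^n + |\alpha_{1,2}^n|^2 S_1^n}\right) \right\rfloor \\ &\quad + \min\!\left( \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|\alpha_{2,1}^n|^2 \beta_n S_2^n}{\sigma_1^n + |h_1^n|^2 S_1^n + |\alpha_{2,1}^n|^2 (1-\beta_n) S_2^n}\right) \right\rfloor, \right. \\ &\qquad\qquad \left. \left\lfloor \log_2\!\left(1 + \frac{1}{\Gamma}\cdot\frac{|h_2^n|^2 \beta_n S_2^n}{\sigma_2^n + |\alpha_{1,2}^n|^2 S_1^n + |h_2^n|^2 (1-\beta_n) S_2^n}\right) \right\rfloor \right). \end{aligned} \tag{9}$$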

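As a quick numerical companion to (9), the sketch below evaluates the partial-detection rate pair for given powers and a given power split. It is a minimal illustration under assumptions: the function and argument names are hypothetical, the channel quantities are passed as squared magnitudes, and the `integer_bits` switch (which applies the floor of discrete bit-loading) is added only so that the limiting cases can be compared without rounding.

```python
import math


def _rate(signal, noise, gap=1.0, integer_bits=True):
    """log2(1 + SINR / gap), floored when discrete bit-loading is assumed."""
    bits = math.log2(1.0 + signal / (gap * noise))
    return math.floor(bits) if integer_bits else bits


def partial_detection_rates(s1, s2, beta, g1, g2, a21, a12, n1, n2,
                            gap=1.0, integer_bits=True):
    """Rate pair of (9) on one tone.

    g1, g2   : |h1|^2, |h2|^2 (direct channel gains)
    a21, a12 : |alpha_{2,1}|^2, |alpha_{1,2}|^2 (crosstalk gains)
    n1, n2   : noise variances at receivers 1 and 2
    beta     : fraction of user 2's power that receiver 1 detects
    """
    detected = beta * s2          # part of user 2 decoded and removed at receiver 1
    hidden = (1.0 - beta) * s2    # part of user 2 that stays as noise at receiver 1
    b1 = _rate(g1 * s1, n1 + a21 * hidden, gap, integer_bits)
    b2_hidden = _rate(g2 * hidden, n2 + a12 * s1, gap, integer_bits)
    b2_at_rx1 = _rate(a21 * detected, n1 + g1 * s1 + a21 * hidden,
                      gap, integer_bits)
    b2_at_rx2 = _rate(g2 * detected, n2 + a12 * s1 + g2 * hidden,
                      gap, integer_bits)
    return b1, b2_hidden + min(b2_at_rx1, b2_at_rx2)
```

Setting `beta=0.0` gives back the no-detection rates of (3), and `beta=1.0` gives the full-detection rates of (7), which is a convenient check on any reimplementation.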
In (9), $\beta_n$ represents a continuum between no multiuser detection at user 1 and full detection of user 2. When $\beta_n = 0$, (9) can be reduced to (3). Similarly, when $\beta_n = 1$, (9) can be reduced to (7).

Similar to the case of full detection, incorporating (9) into the OSB algorithm requires solving $N$ per-tone maximizations of the dual objective over $\{S_1^n, \ldots, S_K^n\}$ and $\beta_n$. The dual objective for a 2-user system becomes

\[
g(\lambda_1, \lambda_2) = \sum_{n=1}^{N} \max_{S_1^n, S_2^n, \beta_n} \left[ \sum_{k=1}^{2} w_k b_k^n(\beta_n) - \sum_{k=1}^{2} \lambda_k S_k^n \right] + \sum_{k=1}^{2} \lambda_k P_k. \tag{10}
\]

An exhaustive search over $\{S_1^n, S_2^n, \beta_n\}$ is feasible because we only allow integer bitstream partitioning at user 2. Then, the search space of $\{S_1^n, S_2^n, \beta_n\}$ is equivalent to that of $\{b_1^n, b_{2,\bar{\beta}}^n, b_{2,\beta}^n\}$. The complexity of this scheme for 2-user systems becomes $O(NB^3)$.

Since the optimization space includes the cases of $\beta_n = 0$ and $\beta_n = 1$, this partial detection scheme performs at least as well as full detection. However, simulation results show that the option of partial detection only provides marginal performance gain for DSL systems. Given the increase in transceiver complexity involved, allowing partial detection is not necessary for DSL systems.

4. JOINT MULTIUSER DETECTION AND ITERATIVE SPECTRUM BALANCING

The complexity of evaluating $g(\lambda_1, \ldots, \lambda_K)$ in the optimal spectrum balancing algorithm may be reduced by applying ISB, the iterative (and near-optimal) approach. A similar approach can be applied when multiuser detection is considered. The following section describes a scheme that works with a 2-user system operating downstream transmission as in Figure 1 when only full detection is considered.

The algorithm involves evaluating $g(\lambda_1, \lambda_2)$ from (8) in an iterative fashion. For a fixed set of $\lambda_k$'s, $g(\lambda_1, \lambda_2)$ is maximized over $S_1^n$ while holding $S_2^n$ constant. Then the maximization is performed over $S_2^n$, and this continues between $S_1^n$ and $S_2^n$ until it converges. This means that the following per-tone maximization problems will be carried out alternately:

\[
\begin{aligned}
&\max_{S_1^n} \left[ \max\left\{ \sum_{k=1}^{2} w_k b_k^n,\ \sum_{k=1}^{2} w_k \tilde{b}_k^n \right\} - \lambda_1 S_1^n \right], \\
&\max_{S_2^n} \left[ \max\left\{ \sum_{k=1}^{2} w_k b_k^n,\ \sum_{k=1}^{2} w_k \tilde{b}_k^n \right\} - \lambda_2 S_2^n \right].
\end{aligned}
\tag{11}
\]

Same as in the case of ISB, this iterative algorithm always converges because $g(\lambda_1, \lambda_2)$ is nondecreasing for each iteration. In terms of implementation, the maximization over $S_k^n$ can be once again performed by maximizing over $b_k^n$. Although this iterative technique cannot retain the linear complexity of ISB due to the exponentially growing number of MUD modes, this technique has drastically reduced the complexity from that of the joint multiuser detection and OSB algorithm.

The idea of running ISB with multiuser detection can be extended to systems with more than 2 users. However, the multiuser detection scheme becomes much more complex when $K$ is large. In general, there are $2\binom{K}{2}$ crosstalk channels in a $K$-user frequency-division duplex system. Although only the strong crosstalk requires participation in the multiuser detection scheme, the number of MUD modes still increases drastically with $K$. Hence, the number of crosstalk channels considered for cancellation must be limited for complexity concerns.

5. NEAR-END CROSSTALK CANCELLATION IN FULL DUPLEX DSL SYSTEMS

In traditional DSL system design, upstream and downstream transmissions are usually separated with a frequency-division duplex scheme in order to avoid near-end crosstalk. With multiuser detection, near-end crosstalk can potentially be detected and cancelled. This gives rise to the possibility of a full duplex DSL system.

Consider a 2-user VDSL system as shown in Figure 3 in which both upstream and downstream transmission takes place simultaneously in the same frequency band. There are a total of four transmitters. The joint spectrum balancing and multiuser detection algorithm described in the previous

section can be directly applied to this case by considering an equivalent 4-user system with 8 crosstalk channels.

Let $S_1^n$ and $S_2^n$ be the downstream transmission powers for users 1 and 2, respectively. Let $S_3^n$, $S_4^n$ be the upstream transmission powers for users 1 and 2. Let the FEXT channels be $\alpha_{1,2}^n$, $\alpha_{2,1}^n$, $\alpha_{3,4}^n$, $\alpha_{4,3}^n$, and the NEXT channels be $\alpha_{1,4}^n$, $\alpha_{4,1}^n$, $\alpha_{2,3}^n$, $\alpha_{3,2}^n$. Assume perfect echo cancellation. So, the rest of the $\alpha_{i,k}^n$'s are also zero. The equivalent 4-user system has 8 crosstalk channels, and thus $2^8$ MUD modes. However, the rate equation for a particular user is primarily affected by only 2 of the crosstalk channels, one of them being NEXT and the other being FEXT. For example, the bit rate $b_1^n$ derived from $S_1^n$ is only affected by FEXT from $\alpha_{2,1}^n$ and NEXT from $\alpha_{4,1}^n$. In addition, the assumption that FEXT is much smaller than NEXT in the configuration of Figure 3 can be safely taken. Hence, crosstalk cancellation from only one NEXT channel should be considered.

Figure 3: Loop topology for 2-user full duplex VDSL. (Diagram labels: CO, RT, user 1 on loop l1, user 2 on loop l2, strong NEXT at both ends, FEXT, full duplex transmission.)

Figure 4: Channel response of 12 kft direct channels and 1 kft crosstalk channel. (Axes: magnitude (dB) versus frequency (kHz), 0 to 1200 kHz. Curves: direct channel l1, l2; crosstalk channel c1.)
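The channel and mode counts quoted in Sections 4 and 5 can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes (consistently with the text) that each crosstalk channel is either cancelled or not, so that $M$ channels give $2^M$ MUD modes, and that perfect echo cancellation removes the two couplings between the upstream and downstream directions of the same line.

```python
from math import comb


def fdd_crosstalk_channels(num_users: int) -> int:
    """Directed crosstalk channels among K transmitters: 2 * C(K, 2)."""
    return 2 * comb(num_users, 2)


def full_duplex_crosstalk_channels(num_lines: int) -> int:
    """Crosstalk channels of the equivalent (2 * num_lines)-user system.

    Each line contributes one downstream and one upstream transmitter.
    The two couplings between the directions of the same line are echo,
    assumed perfectly cancelled, so 2 * num_lines channels are removed.
    """
    equivalent_users = 2 * num_lines
    return fdd_crosstalk_channels(equivalent_users) - 2 * num_lines


def mud_modes(num_channels: int) -> int:
    """Each crosstalk channel is either cancelled or not (assumption)."""
    return 2 ** num_channels


if __name__ == "__main__":
    channels = full_duplex_crosstalk_channels(2)
    print("crosstalk channels:", channels)
    print("MUD modes:", mud_modes(channels))
```

For the 2-line full duplex case this reproduces the 8 crosstalk channels (4 FEXT and 4 NEXT) and the $2^8$ modes mentioned above, which illustrates why restricting cancellation to a single dominant NEXT channel per receiver keeps the search tractable.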

Simulation results in the next section show rate improvement when NEXT cancellation is performed in a 2-user VDSL full duplex system. This suggests potential grounds for improvement of the current VDSL system with a fixed nonoverlapping bandplan for upstream and downstream.

6. SIMULATIONS

This section illustrates the improvement in bit rate with multiuser detection. The performances of the joint optimal spectrum balancing and the joint iterative spectrum balancing algorithms are simulated in DSL binders. For all simulations except where specified, a target error probability of $10^{-7}$ with about 3 dB coding gain and 6 dB noise margin is used. The DSL lines are 26-gauge twisted pairs for all cases.

Figure 5: Achievable rate region for 2-user ADSL downstream using OSB/ISB and the joint multiuser detection algorithm for gap = 12 dB. (Axes: user 2 downstream data rate (Mbps) versus user 1 downstream data rate (Mbps). Curves: OSB; OSB with MUD; OSB with partial MUD; ISB (user 1 → 2 order); ISB with MUD (user 1 → 2 order); ISB (user 2 → 1 order); ISB with MUD (user 2 → 1 order).)

6.1. ADSL downstream

A 2-user ADSL downstream scenario as shown in Figure 1 with $l_1 = l_2 = 12$ kft and $c_1 = 1$ kft is simulated. The crosstalk from transmitter 2 to receiver 1 is large due to the close distance between them. The FEXT channel is simulated using standard methods. It represents the 99% worst-case crosstalk scenario. Figure 4 shows the strength of the direct and crosstalk channels. As clearly illustrated in the figure, crosstalk is weak at low frequency but it overwhelms the direct channel at high frequency. Thus, multiuser detection should only be performed at high frequency tones. Note that the ideal tone selection for crosstalk cancellation depends on not only the channel response, but also the transmission power of the interfering user. The joint multiuser detection algorithms proposed in this paper solve the coupling problem of tone selection and power allocation simultaneously in an efficient manner.

Figure 5 shows the achievable rate increase offered by the joint multiuser detection algorithm. When OSB is performed

[Figure 6, panels (a) to (d): power allocation (dBm) versus frequency (kHz), 0 to 1000 kHz; (a) and (c) show user 1, (b) and (d) show user 2.]

Figure 6: Power allocations for 2-user ADSL downstream: (a) and (b) with optimal spectrum balancing alone and (c) and (d) with the joint multiuser detection algorithm. (a) and (b) correspond to the rate pair R1 = 4.1120 Mbps, R2 = 2.6040 Mbps; (c) and (d) correspond to the rate pair R1 = 4.1440 Mbps, R2 = 2.9680 Mbps. The dotted line denotes the frequency band in which multiuser detection is applied. Full interference detection is assumed.

with multiuser detection, a 14% increase for one user or 7% for both users can be observed. For example, without multiuser detection (4.1120 Mbps, 2.6040 Mbps) is achievable; with multiuser detection it is increased to (4.1440 Mbps, 2.9680 Mbps). The corresponding power allocations for both users at these rates are illustrated in Figure 6. Note that in high frequency bands, frequency-division multiplexing for the two users is enforced when MUD is off. On the other hand, joint multiuser detection and spectrum balancing allows both users to transmit data even when crosstalk is severe at high frequency. The extra bits transmitted in this region contribute to the overall bit rate increase.

The following further observations can be made. As mentioned in previous sections, the rate region offered by partial detection is nearly identical to that of full detection. Thus, enabling partial detection of the interfering user results in no noticeable gain from that already achieved by full detection. As shown in Figure 5, the ISB rate regions appear to be close to the OSB rate regions. For ISB, there is a choice of user ordering when performing the maximization in (6) and (11). A different choice of ordering slightly alters the rate regions. Interestingly, the difference between the two orderings decreases when multiuser detection is performed. Figure 7 shows the power allocation for both users for the

[Figure 7, panels (a) to (d): power allocation (dBm) versus frequency (kHz), 0 to 1000 kHz; (a) and (c) show user 1, (b) and (d) show user 2.]

Figure 7: Power allocations for 2-user ADSL downstream: (a) and (b) with iterative optimal spectrum balancing alone and (c) and (d) with the joint multiuser detection algorithm. (a) and (b) correspond to the rate pair R1 = 3.5400 Mbps, R2 = 2.8920 Mbps; (c) and (d) correspond to the rate pair R1 = 3.4800 Mbps, R2 = 3.5400 Mbps. The dotted line denotes the frequency band in which multiuser detection is applied.

user 1 → 2 order. Multiuser detection increases the rates from (3.5400 Mbps, 2.8920 Mbps) to (3.4800 Mbps, 3.5400 Mbps) in this case. The power spectra are similar to those resulting from ISB. With more than 2 users, however, the benefit of multiuser detection turns out to be smaller.

Figure 8 illustrates the relationship between the length of the crosstalk channel and the bit rate increase with multiuser detection. The same scenario as depicted in Figure 1 is examined, but with a range of common ADSL line lengths. Both direct channel lengths $l_1$ and $l_2$ are assumed to be constant in all cases. In general, the performance gain decreases when the ratio $c_1/l_1$ increases. The maximum gain also increases with the length of the direct channel so that an 8.5% increase for both users or 17% increase for one user is possible.

The above simulations are done with an SNR gap of 6 dB and a margin of 6 dB. Figure 9 shows the performance gain of multiuser detection when the gap and margin are 0 dB. In this case, the benefit of multiuser detection goes as high as 18% for both users or 36% for one user. Thus, the benefit of multiuser detection increases when the gap is lowered.

6.2. VDSL full duplex

The next set of simulations is for a 2-user VDSL system, as shown in Figure 3, with full duplex transmission.
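The per-user gains implied by the rate pairs quoted for Figures 6 and 7 can be recomputed directly. The following sketch uses only the rate pairs stated in the text; the helper name `percent_gain` is hypothetical. It shows that, at these operating points, almost all of the gain goes to user 2 (about 14% in the OSB example), while user 1's rate changes only slightly.

```python
def percent_gain(before_mbps: float, after_mbps: float) -> float:
    """Relative rate change in percent."""
    return 100.0 * (after_mbps - before_mbps) / before_mbps


# (R1, R2) in Mbps without and with multiuser detection, as quoted in the text.
RATE_PAIRS = {
    "OSB (Figure 6)": ((4.1120, 2.6040), (4.1440, 2.9680)),
    "ISB, user 1 -> 2 order (Figure 7)": ((3.5400, 2.8920), (3.4800, 3.5400)),
}

if __name__ == "__main__":
    for label, (without_mud, with_mud) in RATE_PAIRS.items():
        gains = [percent_gain(b, a) for b, a in zip(without_mud, with_mud)]
        print(f"{label}: user 1 {gains[0]:+.1f}%, user 2 {gains[1]:+.1f}%")
```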

Overlapping spectra are allowed between upstream and downstream transmissions. The length of channel two, $l_2$, is fixed at 2.5 kft while the length of channel one, $l_1$, varies between 1.5 and 3.7 kft. The system is transformed into an equivalent 4-user system as described previously. Only ISB has been attempted for this scenario because running OSB for a 4-user system is too computationally intensive. Moreover, since the optimization involves the power spectra of an equivalent of four users, the capacity region is four-dimensional, which is difficult to visualize. Alternatively, Figure 10 illustrates the performance gain of multiuser detection when all 4 transmission bit rates are equal. It is found that the performance gain is largest, 22% for all users, when $l_1$ is close to 2.5 kft. The reason is that NEXT is strongest when the two channels have equal lengths. In this condition, allowing crosstalk cancellation mitigates the effect of NEXT drastically. Interestingly, the benefit of multiuser detection fades when the difference between $l_1$ and $l_2$ increases to 1 kft.

The power spectra for each transmission with and without multiuser detection are shown in Figures 11 and 12, respectively. The channel lengths $l_1$, $l_2$ are 2.7 kft and 2.5 kft. The dotted lines in Figure 11 denote the frequency bands in which multiuser detection is turned on. Without multiuser detection, it is interesting to see that user 1 downstream and user 2 upstream (and similarly user 1 upstream and user 2 downstream) operate in a frequency-division multiplex (FDM) mode. For these two pairs, FDM is optimal because NEXT is too strong for overlapping spectra to occur. With multiuser detection, the cancellation of NEXT becomes a possibility. In this case, overlapping spectra may now be allowed. The extra bits resulting from the overlapping spectra contribute to the performance gain that multiuser detection offers. Note that optimal power spectra are very different from the conventional bandplan where frequency-division duplex is used to separate upstream and downstream.

Figure 8: Percentage bit rate increase as a function of the direct and crosstalk channel lengths in a 2-user ADSL downstream. (Axes: percentage bit rate increase for both users versus ratio of crosstalk to direct channel length $c_1/l_1$. Curves: 9, 10, 11, 12, 13, 14, and 15 kft direct channels.)

Figure 9: Achievable rate region for 2-user ADSL downstream using OSB/ISB and the joint multiuser detection algorithm for gap = 0 dB. (Axes: user 2 downstream data rate (Mbps) versus user 1 downstream data rate (Mbps). Curves: OSB; OSB with MUD; OSB with partial MUD; ISB (user 1 → 2 order); ISB with MUD (user 1 → 2 order); ISB (user 2 → 1 order); ISB with MUD (user 2 → 1 order).)

Figure 10: Achievable common bit rate as a function of $l_1$ when $l_2$ is fixed at 2.5 kft in a 2-user VDSL full duplex environment. The bit rates of both users in both upstream and downstream directions are kept to be equal. (Axes: common data rate of all users (Mbps) versus length of channel 1, $l_1$ (ft). Curves: ISB; ISB with MUD.)

[Figure 11, panels (a) to (d): transmit power (dBm) versus frequency (kHz), 0 to 15000 kHz; (a) user 1 downstream, (b) user 2 downstream, (c) user 1 upstream, (d) user 2 upstream.]

Figure 11: Power allocations for 2-user VDSL downstream with the joint multiuser detection and iterative spectrum balancing algorithm. The channel lengths are set to l1 = 2.7 kft and l2 = 2.5 kft. The four resulting bit rates are equal. The dotted line denotes the frequency band for which the generated NEXT is cancelled by the neighbor user. Full interference detection is assumed.

7. CONCLUSIONS

This paper investigates the benefit of multiuser detection and crosstalk cancellation in digital subscriber line systems. Computationally efficient schemes which determine the optimal transmit power spectra and tone selection for multiuser detection are proposed. Multiuser detection is shown to bring a performance gain beyond that offered by existing methods. In particular, crosstalk cancellation can be combined with the optimal spectral balancing algorithm for determining the optimal power spectra. Multiuser detection can also be incorporated into the iterative spectral balancing algorithm to deal with complexity concerns when the number of users is large. The possibility of partial detection of the interfering signal has been explored but simulation results show marginal performance gain.

An interesting immediate application of multiuser detection is on VDSL full duplex systems where the interference caused by NEXT is large. Existing systems use frequency-division multiplex to separate the frequency bands for downstream and upstream transmissions. Simulations in this paper suggest that performance gain can be achieved by applying NEXT cancellation to overlapping upstream and downstream bands and at the same time optimally allocating power to minimize the effect of crosstalk.

[Figure 12, panels (a) to (d): transmit power (dBm) versus frequency (kHz), 0 to 15000 kHz; (a) user 1 downstream, (b) user 2 downstream, (c) user 1 upstream, (d) user 2 upstream.]

Figure 12: Power allocations for 2-user VDSL full duplex system with iterative optimal spectrum balancing alone. The channel lengths are set to l1 = 2.7 kft and l2 = 2.5 kft, respectively. The four resulting bit rates are equal.

ACKNOWLEDGMENTS

This work was supported by Bell Canada University Laboratories, Communication and Information Technology Ontario (CITO), and the Natural Science and Engineering Research Council (NSERC) of Canada. This paper has been presented in part at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Philadelphia, 2005.

REFERENCES

[1] K. B. Song, S. T. Chung, G. Ginis, and J. M. Cioffi, “Dynamic spectrum management for next-generation DSL systems,” IEEE Communications Magazine, vol. 40, no. 10, pp. 101–109, 2002.
[2] W. Yu, G. Ginis, and J. M. Cioffi, “Distributed multiuser power control for digital subscriber lines,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, 2002.
[3] R. Cendrillon, W. Yu, M. Moonen, J. Verlinden, and T. Bostoen, “Optimal spectrum balancing for digital subscriber lines,” to appear in IEEE Trans. Commun., 2006.
[4] G. Ginis and J. M. Cioffi, “Vectored transmission for digital subscriber line systems,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1085–1104, 2002.
[5] W. Yu, R. Lui, and R. Cendrillon, “Dual optimization methods for multiuser orthogonal frequency division multiplex systems,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’04), vol. 1, pp. 225–229, Dallas, Tex, USA, November–December 2004.
[6] R. Lui and W. Yu, “Low-complexity near-optimal spectrum balancing for digital subscriber lines,” to appear in IEEE International Conference on Communications (ICC ’05), Seoul, Korea, May 2005.
[7] H. Dai and H. V. Poor, “Crosstalk mitigation in DMT VDSL with impulse noise,” IEEE Transactions on Circuits and Systems Part I: Fundamental Theory and Applications, vol. 48, no. 10, pp. 1205–1213, 2001.
[8] C. Zeng and J. M. Cioffi, “Near-end crosstalk mitigation in ADSL systems,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 949–958, 2002.
[9] S. T. Chung and J. M. Cioffi, “The capacity region of frequency-selective Gaussian interference channels under strong interference,” in Proc. IEEE International Conference on Communications (ICC ’03), vol. 4, pp. 2753–2757, Anchorage, Alaska, USA, May 2003.
[10] T. S. Han and K. Kobayashi, “A new achievable rate region for the interference channel,” IEEE Transactions on Information Theory, vol. 27, no. 1, pp. 49–60, 1981.
[11] A. B. Carleial, “Interference channels,” IEEE Transactions on Information Theory, vol. 24, no. 1, pp. 60–70, 1978.
[12] F. Sjöberg, M. Isaksson, R. Nilsson, P. Ödling, S. K. Wilson, and P. O. Börjesson, “Zipper: a duplex method for VDSL based on DMT,” IEEE Transactions on Communications, vol. 47, no. 8, pp. 1245–1252, 1999.
[13] R. Nilsson, F. Sjöberg, M. Isaksson, J. M. Cioffi, and S. K. Wilson, “Autonomous synchronization of a DMT-VDSL system in unbundled networks,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1055–1063, 2002.

Vincent M. K. Chan received the B.S. degree in computer engineering from the University of Waterloo, Waterloo, Ontario, Canada, in 2003. He is currently working towards the M.S. degree in electrical engineering at the University of Toronto, Toronto, Ontario, Canada. Through the cooperative education in the University of Waterloo, he worked at Genesis Microchip Inc., Broadcom Corp., and International Business Machines (IBM) Corp. His research interests are in the general areas of communication systems and signal processing for digital communications. His current research focuses on multiuser detection and spectrum balancing techniques for digital subscriber lines.

Wei Yu received the B.S. degree in computer engineering and mathematics from the University of Waterloo, Waterloo, Ontario, Canada, in 1997, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, California, USA, in 1998 and 2002, respectively. Since 2002, he has been an Assistant Professor with the Electrical and Computer Engineering Department at the University of Toronto, Toronto, Ontario, Canada, where he also holds a Canada Research Chair. His main research interests include multiuser information theory, coding, optimization, wireless communications, and broadband access networks. He is currently an Associate Editor for IEEE Transactions on Wireless Communications.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 58380, Pages 1–10
DOI 10.1155/ASP/2006/58380

Spectrally Compatible Iterative Water Filling

Jan Verlinden,1 Etienne Van den Bogaert,2 Tom Bostoen,1 Francesca Zanier,3 Marco Luise,4 Raphael Cendrillon,5 and Marc Moonen6

1 Access Networks Division, Alcatel, Francis Wellesplein 1, Antwerpen 2018, Belgium
2 Research & Innovation Department, Alcatel, Francis Wellesplein 1, Antwerpen 2018, Belgium
3 Department of Telecommunication Engineering, University of Pisa, 56122 Pisa, Italy
4 Department of Information Engineering, University of Pisa, 56122 Pisa, Italy
5 School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, QLD 4072, Australia
6 Department of Electrical Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Leuven-Heverlee 3001, Belgium

Received 2 December 2004; Revised 1 July 2005; Accepted 12 July 2005

Until now static spectrum management has ensured that DSL lines in the same cable are spectrally compatible under worst-case crosstalk conditions. Recently dynamic spectrum management (DSM) has been proposed, aiming at an increased capacity utilization by adapting the transmit spectra of DSL lines to the actual crosstalk interference. In this paper, a new DSM method for downstream ADSL is derived from the well-known iterative water-filling (IWF) algorithm. The amount of boosting of this new DSM method is limited, such that it is spectrally compatible with ADSL. Hence it is referred to as spectrally compatible iterative water filling (SC-IWF). This paper focuses on the performance gains of SC-IWF. This method is an autonomous DSM method (DSM level 1) and it will be investigated under various noise conditions together with two other DSM level-1 algorithms, namely, the iterative water-filling algorithm and flat power back-off (flat PBO).

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

More and more users today have broadband access to the Internet based on ADSL (asymmetric digital subscriber line) technology. However, still not all users can have this service, because they are located too far from the central office. People who have broadband access today prefer to have even higher data rates. DSM is a technique that tries to optimize the rates and reach for all users in a network. As such a telecom operator can offer broadband access to as many customers as possible and provide them with the optimal data rates.

Rate and reach for a certain DSL service are limited by crosstalk. Crosstalk is noise that comes from other DSL lines in the same binder. Due to electromagnetic coupling, the signal of one line indeed interferes with the signal of the other line.

DSM tries to optimize the spectrum that is used for a certain line, such that it does not allocate a higher power spectral density (PSD) than necessary to achieve the requested bit rate. For DSM, several algorithms exist. In this paper we will focus on DSM level 1, that is, autonomous algorithms that do not need coordination from a central agent. We will develop a new DSM method for downstream ADSL, referred to as spectrally compatible iterative water filling (SC-IWF); see also [1].

We will only apply DSM to the downstream direction (from the Internet to the user), because crosstalk coupling is negligible in the ADSL upstream band.

Whereas static spectrum management (SSM) will always consider the worst-case noise environment to design the maximum allowed PSD masks, DSM will use the actual measured noise and adapt its spectrum accordingly.

So, unlike the standardized test procedures for SSM [2], we will not focus only on the performance of a single line in a network with fixed noise. Instead, because of the dynamic PSD used by DSM, and because of the effect of crosstalk of one line on another line, we will consider multiple lines. Since we do not consider worst-case crosstalk, we will quantify the improvements in a real-life network.

The paper is organized as follows. In Section 2 the various DSM algorithms are described and SC-IWF is derived from IWF. In Section 3 rates-reach simulations are presented for a single line under test. In Section 4 a description of the network (cable plant) is given. In Section 5 the simulation results are presented. Finally, in Section 6 we present the conclusion of this paper.

2. DSM ALGORITHMS

2.1. Iterative water filling or full boosting

2.1.1. General properties

A well-known autonomous DSM algorithm is iterative water filling (IWF); see also [3]. The behavior of IWF and the resulting PSDs has already been investigated in several other papers [3, 4]. The main properties of IWF are the fact that only useful tones are switched on (all other tones are switched off) and that both boosting and power back-off are used. The amount of boosting is limited by the total power that can be transmitted.

For modems that use DMT (discrete multitone) modulation, like ADSL modems, one can prove that the PSDs obtained with IWF are almost flat [5]. This will be explained in the following paragraphs.

2.1.2. PSDs of IWF

IWF is obtained as a result of a bit rate maximization of multiple users, where each modem optimizes its own PSD under the assumption that the noise coming from other modems is constant. Since the other modems will also change their PSD, the noise measured by the first modem is not constant and it will also change its PSD. This results in an iteration of multiple modems applying the water-filling procedure, hence its name iterative water filling.

When using a Shannon capacity model for a DMT modem, the bit rate of the first modem (assuming a two-user system) is described as

\[
R_1 = R_S \sum_k b_k^1, \tag{1}
\]

\[
b_k^1 = \log_2\left(1 + \frac{\mathrm{SNR}_1(k)}{\Gamma_1}\right) = \log_2\left(1 + \frac{h_{11}^2(k) S_1(k)}{\Gamma_1\left(N_1(k) + h_{12}^2(k) S_2(k)\right)}\right) \tag{2}
\]

with $R_1$ the bit rate of user 1, $R_S$ the symbol rate, $k$ the tone index, $\mathrm{SNR}(k)$ the signal-to-noise ratio (SNR) on tone $k$, $\Gamma_1$ the SNR gap including noise margin and coding gain, $S_i(k)$ the transmit PSD of user $i$ on tone $k$, $h_{11}(k)$ the channel transfer function for user 1, $h_{12}(k)$ the crosstalk channel transfer function from user 2 to user 1, and $N_1(k)$ all the noises different from the crosstalk from user 2.

Maximizing the bit rate of both users leads to the following cost function, while taking into account the total power limitation of each user (constrained optimization problem):

\[
\begin{aligned}
J\big(S_1(k), S_2(k)\big) ={}& \sum_k \log_2\left(1 + \frac{h_{11}^2(k) \cdot S_1(k)}{\Gamma_1\left(N_1(k) + h_{12}^2(k) \cdot S_2(k)\right)}\right) \\
&+ \sum_k \log_2\left(1 + \frac{h_{22}^2(k) \cdot S_2(k)}{\Gamma_2\left(N_2(k) + h_{21}^2(k) \cdot S_1(k)\right)}\right) \\
&+ \lambda_1 \cdot \left(P_1 - \sum_k S_1(k)\right) + \lambda_2 \cdot \left(P_2 - \sum_k S_2(k)\right)
\end{aligned}
\tag{3}
\]

with $P_i$ the maximum power of user $i$, and $\lambda_i$ the Lagrange multiplier of user $i$.

Assuming that the noise from other modems is constant (as highlighted in the first paragraph), one can fix the PSD in the crosstalk terms and formula (3) can be rewritten as

\[
\begin{aligned}
J\big(S_1(k), S_2(k)\big) ={}& \sum_k \log_2\left(1 + \frac{h_{11}^2(k) \cdot S_1(k)}{\Gamma_1 \tilde{N}_1(k)}\right) + \sum_k \log_2\left(1 + \frac{h_{22}^2(k) \cdot S_2(k)}{\Gamma_2 \tilde{N}_2(k)}\right) \\
&+ \lambda_1 \cdot \left(P_1 - \sum_k S_1(k)\right) + \lambda_2 \cdot \left(P_2 - \sum_k S_2(k)\right)
\end{aligned}
\tag{4}
\]

with $\tilde{N}_1(k) = N_1(k) + h_{12}^2(k) \cdot S_2^{\mathrm{fix}}(k)$ and $\tilde{N}_2(k) = N_2(k) + h_{21}^2(k) \cdot S_1^{\mathrm{fix}}(k)$.

The cost function can now be decoupled in two independent cost functions for each user. This leads to the following solution for $S_i(k)$:

\[
S_i(k) = \left[\frac{1}{\lambda_i \ln 2} - \frac{\Gamma_i \tilde{N}_i(k)}{h_{ii}^2(k)}\right]^+ \tag{5}
\]

with $[x]^+ = \max(0, x)$.

This procedure is applied in an iterative fashion until $S_1^{\mathrm{fix}}(k) = S_1(k)$ and $S_2^{\mathrm{fix}}(k) = S_2(k)$ [3].

This derivation assumes a rate adaptive system with a fixed amount of power. However, by controlling the power $P_i$, one can control the resulting bit rate. In that case, IWF is used in a power-adaptive perspective. In the remainder of this paper, IWF will be used in such a power-adaptive perspective, where the amount of power depends on the desired bit rate. In the simulations an additional iteration is used to determine the power that corresponds to a certain bit rate.

2.1.3. The PSDs of IWF are almost flat

Since the so-called water-filling level $\lambda_i$ is independent of the tone index $k$, the minimum and maximum values of $S_i(k)$ in formula (5) are determined by the second term:

\[
\frac{\Gamma_i \tilde{N}_i(k)}{h_{ii}^2(k)}. \tag{6}
\]

When this second term is very large, then the SNR for user $i$ will be smaller. Considering (2), one can observe that for DMT systems (like ADSL) on each tone $k$ an SNR equal to or larger than $\Gamma_1$ must be available, in order to load at least 1 bit, so the signal $S_1(k)$ needs to be at least as large as the term (6). This allows us to derive the minimum value for $S_1(k)$:

\[
S_1^{\min}(k) = \frac{\Gamma_1 \tilde{N}_1(k)}{h_{11}^2(k)} = \frac{1}{2 \lambda_1 \ln 2}. \tag{7}
\]

When the term (6) is very small, we obtain the maximum value for $S_1(k)$:

\[
S_1^{\max}(k) = \frac{1}{\lambda_1 \ln 2}. \tag{8}
\]
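The water-filling solution (5) and the iteration described in Section 2.1.2 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the simulation code used in the paper: the function and parameter names are hypothetical, the water level $1/(\lambda_i \ln 2)$ is found by bisection so that the total power budget is met, the two users are updated one after the other for a fixed number of rounds, and the minimum 1-bit-per-tone rule and flat-PSD approximation used in the paper's simulations are not enforced. The gain arguments denote squared channel magnitudes $h^2(k)$.

```python
def water_fill(noise_floor, total_power, steps=200):
    """Single-user water-filling, formula (5).

    noise_floor[k] stands for Gamma_i * N~_i(k) / h_ii^2(k). The returned
    PSD is S(k) = max(0, level - noise_floor[k]); 'level' plays the role of
    1 / (lambda_i * ln 2) and is found by bisection so that sum(S) matches
    total_power.
    """
    lo, hi = 0.0, max(noise_floor) + total_power
    for _ in range(steps):
        level = 0.5 * (lo + hi)
        if sum(max(0.0, level - f) for f in noise_floor) > total_power:
            hi = level
        else:
            lo = level
    return [max(0.0, lo - f) for f in noise_floor]


def iterative_water_filling(g11, g22, g12, g21, n1, n2,
                            gap1, gap2, p1, p2, rounds=50):
    """Two-user IWF: each user water-fills against the other's fixed PSD."""
    tones = len(g11)
    s1 = [0.0] * tones
    s2 = [0.0] * tones
    for _ in range(rounds):
        # User 1 treats the crosstalk from the current S2 as fixed noise.
        floor1 = [gap1 * (n1[k] + g12[k] * s2[k]) / g11[k] for k in range(tones)]
        s1 = water_fill(floor1, p1)
        # User 2 then does the same against the updated S1.
        floor2 = [gap2 * (n2[k] + g21[k] * s1[k]) / g22[k] for k in range(tones)]
        s2 = water_fill(floor2, p2)
    return s1, s2


if __name__ == "__main__":
    # Toy four-tone example with made-up channel values.
    direct = [1.0, 0.8, 0.5, 0.2]
    cross = [0.01, 0.02, 0.05, 0.10]
    noise = [1e-3] * 4
    s1, s2 = iterative_water_filling(direct, direct, cross, cross,
                                     noise, noise, 4.0, 4.0, 1.0, 1.0)
    print("S1:", [round(x, 4) for x in s1])
    print("S2:", [round(x, 4) for x in s2])
```

In the power-adaptive use described above, an outer loop (not shown) would adjust `p1` and `p2` until each user reaches its target bit rate.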

Comparing formulas (7) and (8) shows that the difference between $S_1^{\min}$ and $S_1^{\max}$ is only a factor 2, which means that the difference in PSD level is only 3 dB. This means that the transmit PSD can be approximated by a flat PSD for all the used tones. The small PSD ripple obtained when calculating the PSD with IWF does not have a significant effect on the bit rate. This has been observed in simulation, both for integer (using greedy algorithm) and continuous bit loadings, as long as the requirement of minimum 1 bit per tone is taken into account. ADSL modems in the field use integer bit loadings and will anyhow adapt their PSD within a range of −2.5 dB to +2.5 dB around the average PSD value (see also [6]). This mechanism is also known as the $g_i$ gain scaling mechanism. Note that this average PSD value can change in time over a much larger range than this ±2.5 dB. Hence no PSD shaping is required for IWF. The tones for which the SNR is not high enough are shut off. If the noise varies over time, the average PSD level will vary and the actual tones that are used (tones that are switched on) can change.

Basically, IWF consists in shutting off tones, boosting for long lines, and power back-off for short lines, always using a PSD that is almost flat.

During simulations always a flat PSD is used (both during the convergence period and at the end), ignoring the gain scaling factors and using a continuous bit loading with minimum 1 bit per tone.

2.2. Spectrally compatible IWF

2.2.1. Spectrum management and compatibility

The T1E1.4 Working Group of the T1 Committee (ANSI) has adopted a “Spectrum Management for Loop Transmission Systems” standard [7]. This standard provides spectrum management requirements and recommendations for the administration of services and technologies that use metallic subscriber loop cables. Spectrum management is the administration of the loop plant in a way that provides spectral compatibility for services and technologies that use pairs in the same cable.

In order to achieve spectral compatibility, the ingress energy that transfers into a loop pair from services and transmission system technologies on other pairs in the same cable must not cause an unacceptable degradation of performance of the DSL service of the loop under consideration. In addition, the egress energy from a particular loop pair must not transfer into other pairs in a manner that causes an unacceptable degradation in the performance of services and technologies on those pairs.

There are basically two ways to ensure spectral compatibility with the existing protected services: Method A and Method B. Method A consists of a series of fixed masks (management classes). In order to encourage innovation, Method B was proposed. This method provides a generic analytical method (instead of a fixed set of masks) for determining spectral compatibility. This method is more complicated than Method A, and consists in fact of a series of tests, which are done on a technology-by-technology basis. There are several “basis systems” defined, and for each one a specific test is performed.

2.2.2. Spectral compatibility with ADSL

In order to implement the Method B tests, a software tool (in Matlab) has been developed, following the specifications of [7]. In order to check the accuracy of the tool, the results have been compared with the Telcordia tool (Telcordia DSL Spectral Compatibility Computer; see [8]). The comparison has been made for an extensive number of cases, and there is a very good agreement between the results of both tools.

Alcatel’s tool works as follows: a “new service” is defined corresponding to a PSD (power spectral density) in upstream and downstream. Then, by using a user interface one selects the basis systems (protected services) to be tested with. For spectral compatibility, throughout this contribution, only spectral compatibility with ADSL will be tested.

The reason why only ADSL is protected in the design of these new masks can be found in [9]. In this paper the probability of such a spectral incompatibility is calculated. This calculation is based on the fact that the worst-case crosstalk model only occurs for 1% of the lines, that there is only a problem if these systems are both installed in the same binder, and so forth. According to [9], it will probably be better (from a business point of view) to replace the few HDSL lines on which there is a spectral compatibility problem, with fiber, than to reduce the reach of ADSL and to reduce the number of customers that can be served (with ADSL) by imposing very strict limitations on the PSD of ADSL. This is why only spectral compatibility with ADSL is taken into account (since one can assume that ADSL will always be present in the binder, and as such always needs protection).

2.2.3. Design of spectrally compatible PSD masks

The original downstream ADSL mask is presented in Figure 1 with a solid line. It spans from 138 kHz up to approximately 1.1 MHz, and has a level of −40 dBm/Hz.

Since the PSD of IWF can be approximated by a rectangular mask, only rectangular PSD masks are considered for the design of spectrally compatible PSD masks.

For this design, a two-dimensional search space is explored: in a first dimension the PSD level is increased (indicated with dash-dotted line), in a second dimension the maximum frequency of the mask is reduced (dashed line). The minimum frequency of the mask is kept fixed.

The selection of a one-dimensional set of new masks is based on two constraints: firstly there is the total transmit power constraint of ADSL that needs to be respected (19.85 dBm) and secondly the masks need to be spectrally compatible with ADSL. Obviously, all the new masks will exceed the Method A spectral compatibility requirements (since they exceed the −40 dBm PSD limitation), and therefore Method B will be employed for ensuring spectral compatibility.
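The total transmit power constraint can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: it takes the ADSL tone spacing of 4.3125 kHz, assumes that the rectangular mask starts at tone 32 (138 kHz) and that every tone up to the chosen last tone is used, and spreads the 19.85 dBm budget evenly over those tones. The function names are hypothetical and are not taken from the Matlab tool mentioned in the text.

```python
import math

TONE_SPACING_HZ = 4312.5   # ADSL DMT tone spacing
MAX_TX_POWER_DBM = 19.85   # total transmit power constraint quoted in the text
FIRST_TONE = 32            # assumption: 32 * 4312.5 Hz = 138 kHz


def power_limited_psd_dbm_hz(last_tone, first_tone=FIRST_TONE,
                             total_power_dbm=MAX_TX_POWER_DBM):
    """Flat PSD level (dBm/Hz) that spends the whole power budget
    on the tones first_tone..last_tone (inclusive)."""
    n_tones = last_tone - first_tone + 1
    if n_tones <= 0:
        raise ValueError("mask must contain at least one tone")
    bandwidth_hz = n_tones * TONE_SPACING_HZ
    return total_power_dbm - 10.0 * math.log10(bandwidth_hz)


def boost_db(last_tone, nominal_psd_dbm_hz=-40.0):
    """Boost above the nominal ADSL level allowed by the power budget alone."""
    return power_limited_psd_dbm_hz(last_tone) - nominal_psd_dbm_hz


if __name__ == "__main__":
    for last in (255, 143, 87):
        print(last, round(power_limited_psd_dbm_hz(last), 2), round(boost_db(last), 2))
```

Under these assumptions the full mask (tones 32 to 255) comes out at about −40 dBm/Hz, and halving the number of used tones raises the power-limited level by about 3 dB. The spectral compatibility constraint of Method B is not modelled here and can only lower the allowed level further.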

Figure 1: Search for spectrally compatible PSD masks. (Axes: PSD in dBm/Hz versus tones, 0 to 250. Legend: original ADSL mask; variation of the PSD level; variation of the highest tone.)

Figure 3: PSDs for spectrally compatible boosting (protecting ADSL services). (Axes: PSD in dBm/Hz versus tones, 0 to 250. Legend: corner point for maximum PSD level; example PSD 1; example PSD 2. The curve of corner points is annotated with the active constraint: spectral compatibility or power limitation.)

From all masks with the same maximum frequency, one mask will be selected. It will need to comply with the two constraints mentioned above and from all the masks fulfilling these constraints, the one with the highest level of boosting will be selected, since this mask will result in the highest bit rate.

All the above procedures were automated in a Matlab script, interfacing with Alcatel’s Method B tool.

Sidelobes have been calculated for the PSD of the new system (see Figure 2). With these sidelobes the effect of energy spreading of the DMT modulation is taken into account. It is very important to model the energy on the tones that have been switched off, because the energy on such tones due to sidelobes can still give significant crosstalk for other systems. The sidelobes have been calculated using matrix A from [10, Appendix IV].

Figure 2: Example of a PSD, considered in the search algorithm, including sidelobes. (Axes: PSD in dBm/Hz versus tones, 0 to 250.)

The resulting spectrally compatible masks are presented in Figure 3. The corner point indicates for each mask (defined by a maximum frequency) the maximum transmit PSD level. Two examples (dashed and dash-dotted rectangles) indicate how the curve of corner points should be used.

In the figure, it is also indicated which constraint determines the PSD level: the maximum transmit power limitation or the spectral compatibility limitation. The maximum transmit power limitation is calculated by setting equal PSDs on all tones that are used. The spectral compatibility limitation is based on Method B of [7] as described before. This limitation indicates, for each tone k, the highest PSD that is still allowed, when using a rectangular mask that starts at the minimum tone (at 138 kHz) and ends at tone k.

2.3. Flat power back off

Since boosting is not allowed by all operators (T1.417 is an ANSI standard, only valid in North America), and since the spectral compatibility test is only protecting ADSL, also a flat PBO algorithm is used for comparison. This algorithm has the same behavior as IWF, but it will never exceed the maximum level of −40 dBm/Hz, which is currently defined for ADSL in [6].

3. RATE-REACH SIMULATION RESULTS

3.1. Introduction

We will first present the rate-reach curves for the various algorithms. For these rate-reach curves, only one modem under test is considered. Since the simulations consist only of one modem under test, it will be easier to interpret the results. This is important because later on in this paper the results of DSM in a network with a large number of modems will be presented. In order to understand better how such a DSM modem works under fixed noise conditions, we need to investigate its behavior in a controlled noise environment.

3.2. Rate-reach simulation results under AWGN noise conditions

In Figure 4 one can see the rate-reach curve of the various DSM algorithms. DSM has been applied to one line only, in an environment with AWGN at −140 dBm/Hz.

Figure 4: Rate-reach curve for the various DSM algorithms under AWGN of −140 dBm/Hz. (Axes: bit rate in Mbps versus loop length in m, 2000 to 7000 m. Curves: flat PBO; spectrally compatible IWF; IWF.)

One can clearly see that there is only a difference between IWF and flat PBO on the longer loops. On these long loops, not all tones are used any more and so IWF will reuse the power from these tones to apply boosting. This boosting will result in the improved bit rates.

The SC-IWF algorithm first behaves as flat PBO on the medium length loops, and coincides with IWF for the long loops. This can be explained by closely looking at Figure 3. There one can observe that when a large number of tones are used, SC-IWF is limited by the spectral compatibility constraint. But as more tones become unavailable (due to the longer loop length and corresponding channel attenuation), SC-IWF will be only limited by the power constraint and will in fact have the same PSD as full IWF. Hence, for the longer loops, IWF and SC-IWF will have the same PSD and corresponding bit rate.

3.3. Rate-reach simulation results under ETSI-FB noise conditions

In Figure 5 one can see the rate-reach curve of the various DSM algorithms under ETSI-FB noise (see [10–12]), consisting of 10 ISDN (=ISDN-BRA, 2B1Q), 4 HDSL, 15 SHDSL (=SDSL) disturbers, and 15 legacy ADSL disturbers. The bit rates are much lower than under AWGN −140 dBm/Hz noise.

In Figure 5(a), the mask selection of SC-IWF is based on the “maximum tone.” This means that the algorithm will search for the highest tone that can still be used and will select the corresponding mask from Figure 3, using this maximum tone as the corner point.

In Figure 5(b), the mask selection is based on the maximum bit rate. This means that in the receiver (which determines the selection of the mask) the bit rate for all masks from Figure 3 is calculated and the mask that results in the highest bit rate is selected. Since the ETSI-FB noise is strongly shaped, and will cause a gap of tones (tones which cannot be used), optimal mask selection, based on maximization of the bit rate, is indeed seen to lead to higher bit rates. The drop in bit rate, observed in Figure 5(a), can be explained as follows. When there are only a few tones available after the gap of useless tones, then it makes sense to switch off also these tones, and use a higher PSD on the tones before the gap.

4. NETWORK WIDE APPLICATION

4.1. Realistic model of the network

An important difference between the results in this paper and other simulation results (in standards or other papers) is the fact that we use a realistic network topology in which we investigate the gains of DSM. Other papers usually work with either fixed noise or only a limited number of DSM modems (e.g., two) on a fixed binder setup.

However, the main gain for DSM comes from the fact that DSM modems adapt their PSD according to the loop and noise characteristics. For example, a modem on a short line will apply power back-off to reduce the noise for other users, whereas a modem on a long line will apply boosting in order to fully exploit the few tones that are still useful. Because of these changing PSDs, the noise from DSM modems is in fact not fixed and depends very much on the actual topology. So in order to investigate the gains of DSM in case it would be deployed on a large scale (more than 2 modems in a binder), it is important to have a realistic model of the network.

We use a network model of an average North American network and an average European network. The loop length distributions from the central office (CO) for both of them are presented in Figure 6. In the North American network the loops are longer on the average. For the simulations a wire gauge of 26 AWG is used.

4.2. Noise environment

4.2.1. Introduction

The performance on a DSL line is limited mainly by the loop length and the noise environment. So the actual choice of the noise sources is rather important. In this analysis only the noise from crosstalk is considered.

Figure 5: Rate-reach curve for the various DSM algorithms under ETSI-FB noise. (Panels (a) and (b). Axes: bit rate in Mbps versus loop length in m, 2500 to 4500 m. Curves: flat PBO; spectrally compatible IWF; IWF.)
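The difference between the two mask-selection rules compared in Figure 5 can be sketched with a toy example. Everything numerical below is hypothetical: the ten-tone channel, the 10 dB SNR gap, and the list of corner points are chosen only for round numbers and do not come from the simulations. The sketch uses a flat PSD per mask and a continuous bit loading with minimum 1 bit per tone, and it expresses the rate in bits per DMT symbol.

```python
import math

GAP_DB = 10.0  # hypothetical SNR gap, chosen for round numbers


def tone_bits(psd_db, gnr_db, gap_db=GAP_DB):
    """Continuous bit loading for one tone; tones below 1 bit are switched off."""
    snr_db = psd_db + gnr_db
    bits = math.log2(1.0 + 10.0 ** ((snr_db - gap_db) / 10.0))
    return bits if bits >= 1.0 else 0.0


def mask_bits(mask, gnr_db):
    """Bits per DMT symbol for a rectangular mask given as (last_tone, psd_db)."""
    last_tone, psd_db = mask
    return sum(tone_bits(psd_db, g) for g in gnr_db[:last_tone + 1])


def select_by_max_tone(masks, gnr_db):
    """Pick the mask whose corner point is the highest tone that is still usable."""
    for mask in sorted(masks, reverse=True):
        last_tone, psd_db = mask
        if tone_bits(psd_db, gnr_db[last_tone]) > 0.0:
            return mask
    return None


def select_by_max_rate(masks, gnr_db):
    """Pick the mask that gives the highest bit rate."""
    return max(masks, key=lambda m: mask_bits(m, gnr_db))


# Hypothetical corner points: a lower last tone allows a higher PSD level (dB).
MASKS = [(9, 0.0), (8, 1.0), (7, 2.0), (6, 3.0), (5, 4.0)]
# Hypothetical gain-to-noise ratio per tone (dB): good tones, a gap, weak tones.
GNR_DB = [30.0] * 6 + [-20.0] * 2 + [14.0] * 2

if __name__ == "__main__":
    for rule in (select_by_max_tone, select_by_max_rate):
        chosen = rule(MASKS, GNR_DB)
        print(rule.__name__, chosen, round(mask_bits(chosen, GNR_DB), 2))
```

In this toy channel the maximum-tone rule keeps the two weak tones after the gap and therefore has to use the lowest PSD level, whereas the maximum-bit-rate rule switches them off and boosts the tones before the gap, which gives more bits per symbol.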

Figure 6: Loop length distribution (of homes extending out of CO). (Axes: percentage of loops, 0% to 30%, versus loop length in 1 km bins from 0–1 km to 7–8 km. Series: North American average; European average.)

This crosstalk noise is divided into two main categories: near-end crosstalk (NEXT) and far-end crosstalk (FEXT). NEXT is noise coming from a modem that is located at the same end of the line: if the victim modem is located in the central office (CO), then also the disturber is located in the CO. FEXT is noise coming from a modem located at the other end of the line: if the victim modem is located at the CO, then the disturber is located at the customer premises (CPE).

4.2.2. Noise sources

All simulations have been performed with 16 ADSL modems that apply DSM. This number originates from the ETSI-FB model. As already mentioned in Section 3.3, there are 10 ISDN lines, 4 HDSL lines, 15 SHDSL lines, 15 ADSL lines, and 1 line under test in this model. So, in total there are 16 ADSL modems in this ETSI-FB model.

For the tests we will define the following noise scenarios: (a) AWGN noise at −140 dBm/Hz, (b) pure self-crosstalk noise (16 legacy ADSL lines, not applying DSM) derived from ETSI-FD, and (c) the noise from ETSI-FB (10 ISDN, 4 HDSL, 15 SHDSL).

In fact, noise scenario (a) corresponds to a case with all lines in a binder doing DSM, scenario (b) corresponds to 50% of the lines applying DSM, and for scenario (c) the effect of alien crosstalkers on DSM is investigated.

4.3. Bandwidth tier bit rates

4.3.1. Bit rates in function of services

Because of loop attenuation, the maximum achievable bit rate on a line will decrease with the loop length. So, customers on short loops can have a higher bit rate than customers on a long loop. However, since operators want to offer a package of services like high-speed Internet (HSI), voice over IP (VoIP), and video on demand, they will limit the achievable bit rate to the bit rate that is actually required for a certain package of services.

For the simulations, the following profiles have been chosen (see also Figure 7):

(i) Tier 0: 256 kbps for broadband Internet connectivity,
(ii) Tier 1: 1.5 Mbps for high-speed Internet connectivity,
(iii) Tier 2: 3.5 Mbps for high-speed Internet connectivity and one 2 Mbps video channel,
(iv) Tier 3: 5.5 Mbps for high-speed Internet, and two 2 Mbps video channels.

Unlike an operator, who installs a profile depending on the subscription of the customer, we will always enable the highest possible profile on a line.

Figure 7: Tier bit rates. (Diagram of the ADSL coverage from the CO, divided into a green, a grey, and a red zone, with the bandwidth services Tier 0: 256 kbps, Tier 1: 1.5 Mbps, Tier 2: 3.5 Mbps, and Tier 3: 5.5 Mbps.)

Figure 8: Reach improvement by applying DSM for 16 ADSL modems in Europe with AWGN −140 dBm/Hz. (Bar chart of loop reach in m, 0 to 8000 m, per tier for no DSM, flat PBO, SC-IWF, and IWF. Gain in m as listed in the figure:)

Tier        Flat PBO   SC-IWF   IWF
5.5 Mbps        37        37      37
3.5 Mbps         0       100     197
1.5 Mbps         0        53     334
256 kbps         0       971     971

Figure 9: Reach improvement by applying DSM for 16 ADSL modems in Europe with 16 ADSL crosstalkers. (Bar chart of loop reach in m, 0 to 8000 m, per tier for no DSM, flat PBO, SC-IWF, and IWF. Gain in m as listed in the figure:)

Tier        Flat PBO   SC-IWF   IWF
5.5 Mbps       115       115     115
3.5 Mbps        83        83     145
1.5 Mbps        75       191     377
256 kbps        89      1047    1047

4.3.2. Effect of tier bit rates for DSM

The profile that is selected for a line is always lower than the maximum achievable bit rate (otherwise this profile would not be possible). The amount of excess bit rate is converted by DSM into power back-off, such that the crosstalk for other users is reduced.
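The profile selection used in the simulations (always enable the highest possible profile) and the resulting excess bit rate can be written down in a few lines. This is a minimal sketch: the tier rates are those listed above, while the function name and the returned tuple are hypothetical, and the conversion of the excess bit rate into power back-off is not modelled.

```python
# Tier bit rates from the text, in kbps, ordered from lowest to highest.
TIERS_KBPS = [("Tier 0", 256), ("Tier 1", 1500), ("Tier 2", 3500), ("Tier 3", 5500)]


def select_tier(achievable_kbps):
    """Return (name, tier rate, excess rate) for the highest tier that does not
    exceed the achievable bit rate, or None if even Tier 0 is out of reach."""
    best = None
    for name, rate_kbps in TIERS_KBPS:
        if rate_kbps <= achievable_kbps:
            best = (name, rate_kbps, achievable_kbps - rate_kbps)
    return best


if __name__ == "__main__":
    for rate in (200, 900, 4000, 7200):
        print(rate, select_tier(rate))
```

For a line that can achieve 4000 kbps, for example, this selects Tier 2 and leaves 500 kbps of excess bit rate, which is the quantity that DSM converts into power back-off.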

5. SIMULATION RESULTS

5.1. Introduction

In order to obtain results for a particular network, several simulations have been performed for this network. For each scenario, 30 binders have been generated and the downstream bit rates for all modems in a binder have been analyzed statistically for these 30 binders.

Each binder consists of loops that have been selected randomly from the loop distribution. Per binder, there are 16 ADSL modems applying DSM and several other lines acting as noise sources (the amount and type of disturbers depend on the noise scenario). So, per noise scenario, the downstream bit rates of 480 ADSL modems applying DSM are analyzed.

Reach curves are used to evaluate the performance gains of DSM.

5.2. Reach curves per tier bit rate for the various DSM algorithms

In this section, the results from the simulations with the noises from Section 4 are presented. First the results for the European network are presented (Figures 8, 9, and 10), and then the results for the North American network are given (Figures 11, 12, and 13).

As can be seen in Figures 8 and 9, it turns out that adding 16 ADSL self-crosstalkers is not really a problem for the ADSL lines under test. The loop reach and the gains for white noise and for ADSL noise are comparable.

For the 1.5 Mbps service, there is only a substantial gain for IWF. Considering the average loop attenuation and the requested bit rate for this service, we know that still a lot of tones are useful. From Figure 3 we know that in this case, the boosting of the SC-IWF is very limited, and therefore there are almost no gains any more for this type of DSM.

For the 256 kbps service, only a few tones are active on very long loops, so the amount of boosting for IWF and for SC-IWF is the same. Also the gains are identical.

The noise mix from ETSI-FB (Figure 10) has a big impact on the loop reach and it also reduces the gains, especially for the SC-IWF, using the 256 kbps service. Due to the high noise level, the reach is reduced very much. For these shorter loops, the higher tones are still useful. As such, the amount of boosting that is allowed for the SC-IWF is reduced and also the gains.
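The binder generation of Section 5.1 (30 binders of 16 DSM modems each, with loops drawn randomly from the loop length distribution) can be sketched as follows. The histogram weights are hypothetical placeholders and are not read from Figure 6; the uniform draw inside each 1 km bin and the fixed seed are also assumptions made only to keep the sketch reproducible.

```python
import random

# 1 km bins as in Figure 6; the weights are hypothetical, not the published data.
LOOP_BINS_KM = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8)]
HYPOTHETICAL_WEIGHTS = [5, 15, 25, 25, 15, 10, 4, 1]


def generate_binders(n_binders=30, modems_per_binder=16, seed=0):
    """Return a list of binders, each a list of loop lengths in metres."""
    rng = random.Random(seed)
    binders = []
    for _ in range(n_binders):
        bins = rng.choices(LOOP_BINS_KM, weights=HYPOTHETICAL_WEIGHTS,
                           k=modems_per_binder)
        # Assumption: loop lengths are uniform inside the selected bin.
        binders.append([rng.uniform(lo, hi) * 1000.0 for lo, hi in bins])
    return binders


if __name__ == "__main__":
    binders = generate_binders()
    print(len(binders), sum(len(b) for b in binders))
```

With the defaults this yields 30 binders and 480 loop lengths in total, which matches the number of DSM modems analyzed per noise scenario.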

Figure 10: Reach improvement by applying DSM for 16 ADSL modems in Europe with 10 ISDN, 4 HDSL, and 15 SHDSL crosstalkers. (Bar chart of loop reach in m, 0 to 8000 m, per tier for no DSM, flat PBO, SC-IWF, and IWF. Gain in m as listed in the figure:)

Tier        Flat PBO   SC-IWF   IWF
5.5 Mbps        75        75      75
3.5 Mbps         0         0       0
1.5 Mbps         0         0      97
256 kbps         0       424     889

Figure 11: Reach improvement by applying DSM for 16 ADSL modems in North America with AWGN −140 dBm/Hz. (Same layout. Gain in m as listed in the figure:)

Tier        Flat PBO   SC-IWF   IWF
5.5 Mbps        69        61      55
3.5 Mbps        15        15      85
1.5 Mbps         0        48     307
256 kbps         0       597     776

Figure 12: Reach improvement by applying DSM for 16 ADSL modems in North America with 16 ADSL crosstalkers. (Same layout. Gain in m as listed in the figure:)

Tier        Flat PBO   SC-IWF   IWF
5.5 Mbps        67        67      67
3.5 Mbps        18        18     102
1.5 Mbps        15        79     285
256 kbps         1       628     718

Figure 13: Reach improvement by applying DSM for 16 ADSL modems in North America with 10 ISDN, 4 HDSL, and 15 SHDSL crosstalkers. (Same layout. Gain in m as listed in the figure:)

Tier        Flat PBO   SC-IWF   IWF
5.5 Mbps        22        22      22
3.5 Mbps         0         0       0
1.5 Mbps         0         0     125
256 kbps         0       225     551

Comparing the results for the North American network (Figures 11, 12, and 13) with the results for the European network (Figures 8, 9, and 10), one can see that in general the DSM reach improvements for the European network are larger. The reason is that there are more short loops in the European network that can apply power back-off (reduction of crosstalk). The general trends however are the same, even though in both networks the loop distributions are very different.

5.3. Comparing the effect of the various noises for the reach performance

In the previous section, the noises from ETSI-FB (ISDN, HDSL, and SHDSL) were treated as a whole. In this section the contribution of the individual components of ETSI-FB is briefly investigated.

In Figure 14, one can see the performance gains of IWF on the North American network for each contribution of the ETSI-FB noises separately. The ISDN noise is clearly the least important noise source as the reach with ISDN noise only is much larger than with the other noise components.

HDSL and SHDSL (in the respective quantities of 4 and 15) have comparable impact on the reach. Both noises generate a lot of NEXT, which becomes the dominant noise source.

More detailed analysis shows that a single SHDSL noise source causes less degradation than a single HDSL noise source. However, the ETSI-FB model assumes that there will be more SHDSL disturbers in a network than HDSL disturbers.

It also turns out that in case there would be only 1 HDSL disturber instead of 4, the reach (averaged over all services) is approximately 250 m larger. In case there would be only 1 SHDSL disturber instead of 15, the reach would be 300 m larger. This means that by reducing the number of HDSL and SHDSL disturbers in a network, the reach can also be improved.

Figure 14: DSM performance for noise contributions of ETSI-FB. (Bar chart of loop reach in m, 0 to 7000 m, for no DSM and IWF, per tier (5.5 Mbps, 3.5 Mbps, 1.5 Mbps, 256 kbps) and per noise contribution: ETSI-FB, 15 SHDSL, 4 HDSL, 10 ISDN.)

6. CONCLUSION

In this paper we have investigated and compared existing algorithms for DSM level 1 with a new so-called spectrally compatible iterative water-filling (SC-IWF) algorithm. The results are based on statistical simulations in average European and North American networks. Different noise sources have been used as disturbers for the DSM lines under test.

When comparing the various algorithms it turns out that flat power back off (flat PBO) only gives a small gain. This gain is most significant on the high bit rate profiles in a network with a large number of short lines.

SC-IWF (spectrally compatible with ADSL) gives a lot more gain and can perform almost as well as IWF, when working with the low bit rate profile. Gains are most significant for the long loops.

The gains of SC-IWF are reduced however in case a lot of tones are needed, either because of the profile (high bit rate) or because of the limited reach (high noise from NEXT disturbers). NEXT interference from HDSL and SHDSL in particular is found to have a large impact on the performance of ADSL, both with and without DSM.

REFERENCES

[1] R. Suciu, E. Van den Bogaert, J. Verlinden, and T. Bostoen, “Insuring spectral compatibility of Iterative Water-Filling,” in Proceedings of 12th European Signal Processing Conference (EUSIPCO ’04), pp. 1209–1212, Vienna, Austria, September 2004.
[2] ITU-T Recommendation G.996.1, Test procedures for digital subscriber line (DSL) transceivers, March 2003.
[3] W. Yu, G. Ginis, and J. M. Cioffi, “Distributed multiuser power control for digital subscriber lines,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, 2002.
[4] R. Cendrillon, W. Yu, J. Verlinden, T. Bostoen, and M. Moonen, “Optimal multi-user spectrum management for digital subscriber lines,” in IEEE International Communications conference (ICC ’04), Paris, France, 2004.
[5] J. Verlinden, The target PSD obtained with Iterative Water-Filling is almost flat, ANSI Contribution T1E1.4/2003-295, 2003.
[6] ITU-T Recommendation G.992.1, Asymmetric Digital Subscriber Line (ADSL) Transceivers, December 2003.
[7] ANSI Standard T1.417-2003, Spectrum Management for Loop Transmission Systems, December 2003.
[8] The Telcordia DSL Spectral Compatibility Computer, http://net3.argreenhouse.com:8080/dsl-test/index.htm.
[9] J. M. Cioffi, Incentive-based spectrum management, ANSI T1E1.4/2004-480R2.
[10] ITU-T Recommendation G.992.5, Asymmetric Digital Subscriber Line (ADSL) Transceivers - Extended Bandwidth ADSL2 (ADSL2plus), January 2005.
[11] ETSI TS 101 388 V1.3.1, ADSL: European Specific Requirements, section 5.3.4, May 2002.
[12] ETSI STC TM6(98)10, Laboratory Performance Tests for xDSL Systems, section B2, February 2001.

Jan Verlinden received a degree in electrical engineering in 2000 from the KULeuven, Belgium. He is currently a Member of the DSL Experts Team of the Access Network Division in Antwerp, Belgium. He joined the Research and Innovation Division of Alcatel in September 2000, where he focussed on echo canceller techniques. From 2002 on, he has focussed on DSM. As such he participated in the VDSL Olympics by introducing DSM into the VDSL prototype. Within the DSL Experts Team, he is currently studying emerging DSL physical layer technologies. He also contributes to ANSI NIPP-NAI standardization.

Etienne Van den Bogaert is a DSL Research Engineer at the Research & Innovation Department in Antwerp, Belgium. In this function, he is investigating next-generation DSL technologies and their applications. His current main research interest is dynamic spectrum management (DSM): algorithms, impact on performance and stability, and practical implementation.

Tom Bostoen received the M.S. degree in physical engineering from Ghent University, Ghent, Belgium, in 1998. Tom Bostoen is currently Product Manager of the 5530 Network Analyzer at the Access Networks Division of Alcatel in Antwerp, Belgium. In his previous function he was a Project Manager of the DSL physical layer research project at the Research & Innovation Department. Before that, he studied single-ended line testing (SELT) as a Research Engineer in the same department and contributed to ITU G.selt standardization.

Francesca Zanier received the degree in telecommunication engineering from Pisa University, Italy, in 2004. She joined the Research and Innovation Division of Alcatel, Antwerp, Belgium, in 2003 for her thesis project on DSM. After winning the “Piaggio Talent Recruitment Project 2004”, she worked one year in the Planning and Control Division of Piaggio Pontedera, Pisa (Italy). Recently she has joined the Department of Information Engineering as a PhD student at the University of Pisa (Italy). Her research interest is on digital signal processing.

Marco Luise is a Full Professor of telecommunications at the University of Pisa, Italy. After receiving his M.S. and Ph.D. degrees in electronic engineering from the University of Pisa, he was a Research Fellow of the European Space Agency (ESA) at ESTEC Noordwijk, the Netherlands, and a Research Scientist of the Italian National Research Council (CNR), at the CSMDR Pisa. Professor Luise cochaired four editions of the Tyrrhenian International Workshop on Digital Communications, and in 1998 was the General Chairman of the URSI Symposium ISSSE’98. He has been the Technical Cochairman of the 7th International Workshop on Digital Signal Processing Techniques for Space Communications and of the Conference European Wireless 2002. He will be the General Chairman of EUSIPCO 2006 in Florence. A Senior Member of the IEEE, Professor Luise served as Editor for synchronization of the IEEE Transactions on Communications, and Editor for communications theory of the European Transactions on Telecommunications. He has authored more than 100 publications in international journals and contributions to major international conferences, and holds a few international patents. His main research interests lie in the area of wireless communications, with particular emphasis on CDMA/multicarrier signals and satellite communications and positioning.

Raphael Cendrillon was born in Melbourne, Australia, in 1978. He received the Electrical Engineering degree (first-class honors) from the University of Queensland, Australia, in 1999, and the Ph.D. degree in electrical engineering from the Katholieke Universiteit Leuven, Belgium, in 2004. His Ph.D. degree was awarded summa cum laude with congratulations of the jury, an honor given to the top 5% of Ph.D. graduates. His research focuses on the application of multiuser communication theory to xDSL. In 2002 he was a Visiting Scholar at the Information Systems Laboratory, Stanford University, with Professor John Cioffi. Since 2005 Dr. Cendrillon has been a Postdoctoral Research Fellow at the University of Queensland, Australia. His work in xDSL is done in close collaboration with Alcatel Research and Innovation, Belgium, for which he was awarded the Alcatel Bell Scientific Prize in 2004. He was also the recipient of an IEEE Travel Grant in 2003 and 2004, and the KULeuven Bursary for Advanced Foreign Scholars in 2004.

Marc Moonen received the Electrical Engineering degree and the Ph.D. degree in applied sciences from Katholieke Universiteit Leuven, Leuven, Belgium, in 1986 and 1990, respectively. Since 2004 he has been a Full Professor at the Electrical Engineering Department of Katholieke Universiteit Leuven, where he is currently heading a research team of 16 Ph.D. candidates and Postdocs, working in the area of signal processing for digital communications, wireless communications, DSL, and audio signal processing. He received the 1994 KULeuven Research Council Award, the 1997 Alcatel Bell (Belgium) Award (with Piet Vandaele), the 2004 Alcatel Bell (Belgium) Award (with Raphael Cendrillon), and was a 1997 “Laureate of the Belgium Royal Academy of Science.” He was Chairman of the IEEE Benelux Signal Processing Chapter (1998–2002), and has been a EURASIP AdCom Member (European Association for Signal, Speech, and Image Processing) since 2000. He has been a Member of the Editorial Board of IEEE Transactions on Circuits and Systems II (2002-2003). He has been Editor-in-Chief for the EURASIP Journal on Applied Signal Processing since 2003, and is a Member of the Editorial Board of Integration, the VLSI Journal, EURASIP Journal on Wireless Communications and Networking, and IEEE Signal Processing Magazine.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 95175, Pages 1–17
DOI 10.1155/ASP/2006/95175

The Normalized-Rate Iterative Algorithm: A Practical Dynamic Spectrum Management Method for DSL

Driton Statovci, Tomas Nordström, and Rickard Nilsson

Telecommunications Research Center Vienna (ftw.), Donau-City-Straße 1, A-1220 Vienna, Austria

Received 17 December 2004; Revised 1 June 2005; Accepted 2 June 2005

We present a practical solution for dynamic spectrum management (DSM) in digital subscriber line systems: the normalized-rate iterative algorithm (NRIA). Supported by a novel optimization problem formulation, the NRIA is the only DSM algorithm that jointly addresses spectrum balancing for frequency division duplexing systems and power allocation for the users sharing a common cable bundle. With a focus on being implementable rather than obtaining the highest possible theoretical performance, the NRIA is designed to efficiently solve the DSM optimization problem with the operators’ business models in mind. This is achieved with the help of two types of parameters: the desired network asymmetry and the desired user priorities. The NRIA is a centralized DSM algorithm based on the iterative water-filling algorithm (IWFA) for finding efficient power allocations, but extends the IWFA by finding the achievable bitrates and by optimizing the bandplan. It is compared with three other DSM proposals: the IWFA, the optimal spectrum balancing algorithm (OSBA), and the bidirectional IWFA (bi-IWFA). We show that the NRIA achieves better bitrate performance than the IWFA and the bi-IWFA. It can even achieve performance almost as good as the OSBA, but with dramatically lower requirements on complexity. Additionally, the NRIA can achieve bitrate combinations that cannot be supported by any other DSM algorithm.

Copyright © 2006 Driton Statovci et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

In the development of currently deployed digital subscriber line (DSL) systems, a single user scenario was assumed with worst case crosstalk models. This passive strategy was motivated by the goal of maximizing the robustness of DSL systems. In practice, however, it often leads to overly conservative performance figures and sometimes even to failures to deliver a specific DSL service. This is often due to a poorly optimized resource allocation among different loops in a cable bundle, combined with unmotivated high noise margins, which result in too pessimistic bitrates. Furthermore, if the system environment changes in a practical scenario, for example, where unmodelled noise sources appear, the initial robustness of a static deployed DSL system easily breaks down.

With an active approach to copper bundle resource management, the cable resources can be more efficiently shared among the users. Combined with more accurate crosstalk figures obtained from accurate online cable measurements, higher and more balanced bitrates can be achieved on most loops. In the literature, this is often referred to as dynamic spectrum management (DSM), although usually only active power control over a predefined static spectrum bandplan between the downstream and upstream is considered.

In this paper, we describe a practically applicable DSM method for DSL called the normalized-rate iterative algorithm (NRIA) [1–3]. The NRIA is a centralized DSM algorithm based on the iterative water-filling algorithm (IWFA) [4] for finding good power allocations, but it extends the IWFA by automatically deriving achievable bitrates and searching for an optimized bandplan.

The NRIA jointly balances the spectrum between the downstream and upstream directions, that is, it finds an effective common bandplan for frequency division duplexing (FDD) DSLs, and performs power allocation for each line in a cable bundle. The NRIA is suboptimal in the sense that the power allocation is based on the IWFA, and the frequency allocation is based on an ad hoc solution.

Compared to other DSM methods, the NRIA has essentially two major advantages: high performance and low complexity. Since the NRIA optimizes the bandplan, better performance can be achieved than with the IWFA, which uses a static (fixed) bandplan. Compared to the optimal spectrum balancing algorithm (OSBA) [5–7], which in theory can deliver the highest bitrates for a given bandplan, the NRIA can in practice achieve almost as good performance but with a dramatically lower requirement on complexity. This is crucial especially for the more realistic cases where the number of loops in a bundle is more than a couple. In this case, since the complexity of the OSBA grows exponentially with the number of loops, it effectively fails to deliver any result in a reasonable time. Furthermore, due to the extended capability of searching for an optimized bandplan, the NRIA supports several downstream and upstream bitrate combinations that cannot be supported by any other DSM algorithm.

When using the IWFA the achievable bitrates need to be specified in advance (before running the algorithm). Since this is difficult, the NRIA has an important advantage as it finds the achievable bitrates automatically and needs no prior knowledge about them. Furthermore, tractable operating points for desirable business models can easily be achieved, because the NRIA has a parameter for selecting the desired downstream and upstream asymmetry and parameters for selecting the user priorities.

As with the OSBA, a potential drawback of the NRIA, however, is that it is a centralized algorithm operated by a common DSM agent. Nevertheless, in practice a DSM agent may always be necessary since the users’ bitrates that can be supported by a distributed DSM algorithm like the IWFA must be calculated by a central agent.

The paper is organized as follows: Section 2 describes the system model; Section 3 describes some fundamental bitrate relations used in Section 4, which formulates the multiuser optimization problem that we consider; Section 5 describes the NRIA as a suboptimal but practical solution to the problem; Section 6 presents simulation results of the NRIA with comparisons to the IWFA, the OSBA, and the bidirectional IWFA [8]; and Section 7 summarizes the major findings presented in this paper.

Figure 1: A distributed multiuser DSL environment with customer premises equipment (CPE) connected to a central office (CO) as well as to a cabinet. SMC denotes the spectrum management center. DSLAM denotes DSL access multiplexer. LT and NT denote line termination and network termination sides, respectively.

modulation (DMT), offers this flexibility. Since Zipper-DMT is also part of current VDSL standards, we assume that it is used. Furthermore, full network synchronization is also assumed [10] in order to avoid any efficiency loss due to silent guard bands between the downstream and upstream subbands, to avoid flexibility loss in frequency planning [11], and to make crosstalk noise on different subcarriers independent.¹

The NRIA relies on a spectrum management center (SMC), as shown in Figure 1. Firstly, the SMC collects all channel characteristics in the network from individual modems, including the crosstalk channels, during an offline period. Secondly, it runs the NRIA in order to find a common D-FDD bandplan and individual power transmit spectra for each modem. Finally, these parameters are returned by the SMC to every modem before they start to operate.

2.
downstream and upstream asymmetry and parameters for selecting the user priorities. As with the OSBA, a potential drawback of the NRIA, ff however, is that it is a centralized algorithm operated by a modulation (DMT), o ers this flexibility. Since Zipper-DMT common DSM agent. Nevertheless, in practice a DSM agent is also part of current VDSL standards, we assume that it is may always be necessary since the users’ bitrates that can be used. Furthermore, full network synchronization is also as- ffi supported by a distributed DSM algorithm like the IWFA sumed [10]inordertoavoidanye ciency loss due to silent must be calculated by a central agent. guard bands between the downstream and upstream sub- The paper is organized as follows: Section 2 describes the bands, to avoid flexibility loss in frequency planning [11], ff system model; Section 3 describes some fundamental bitrate and to make crosstalk noise on di erent subcarriers indepen- 1 relations used in Section 4, which formulates the multiuser dent. optimization problem that we consider; Section 5 describes The NRIA relies on a spectrum management center (S- the NRIA as a suboptimal but practical solution to the prob- MC), as shown in Figure 1. Firstly, the SMC collects all chan- lem; Section 6 presents simulation results of the NRIA with nel characteristics in the network from individual modems, ffl comparisons to the IWFA, the OSBA, and the bidirectional including the crosstalk channels, during an o ine period. IWFA [8]; and Section 7 summarizes the major findings pre- Secondly, it runs the NRIA in order to find a common D- sented in this paper. FDD bandplan and individual power transmit spectra for each modem. Finally, these parameters are returned by the SMC to every modem before they start to operate. 2. 
SYSTEM MODEL With network synchronization of a Zipper-DMT system, a received symbol after the DFT demodulator on subcarrier n Figure 1 shows a typical network scenario that the NRIA is n for user u, Yu , can be expressed as designed to handle. Specifically, it is assumed that both ends of the cable can be distributed; at the line termination (LT) U ffi n n n n n n side, the loops can be connected to a central o ce (CO) Yu = HuuXu + HuvXv + Vu , u ∈{1, ..., U},(1) as well as a cabinet; and at the network termination (NT) v=1 v u side, the loops are connected to customer premises equip- = ment (CPE), which are usually distributed in space. Xn Xn u The crucial assumption with this network model is that where u and v are the transmitted symbols of user and v n V n all loops have unique crosstalk couplings between each other. user on subcarrier ,respectively. u is the background u n Hn This assumption is valid in practice. Even if some cables are noise of user on subcarrier . uv is the channel transfer v u collocated at one end, or even at both ends, they will still have function from user to user , that is, it represents either the v = u different crosstalk couplings due to other differences such as direct channel (with ), or far-end crosstalk (FEXT). in the twists, the locations of the loops within a bundle, and Note that with synchronization and D-FDD, that is, syn- the loop termination at the NT and LT sides. For example, it chronous Zipper, the near end crosstalk (NEXT) is com- is not always the case that the longest loop in a network has pletely eliminated through orthogonality, regardless of the the poorest DSL channel conditions. selected bandplan (downstream and upstream subcarriers). To make efficient dynamic spectral balancing possible, high flexibility in selecting the transmission spectra is needed 1 by the DSL systems. 
Multicarrier modulation combined Without network synchronization, signal energy assigned by one user on one subcarrier will also leak over to neighboring subcarriers for the other with digital frequency division duplexing (D-FDD), like the users, due to the asynchrony, and thus appear as nonorthogonal near-end Zipper [9] duplexing method based on discrete multitone crosstalk and far-end crosstalk. Driton Statovci et al. 3

Therefore, the desired symbol $X_u^n$ is only disturbed by FEXT, which originates from all other users on the corresponding subcarrier $n$, and by the background noise.

The number of bits in a DMT symbol for user $u$ in the upstream transmission direction is

$$R_{u,\mathrm{US}} = \sum_{n \in I_{\mathrm{US}}} R_u^n, \qquad (2)$$

where $R_u^n$ is the number of bits for user $u$ on subcarrier $n$ and $I_{\mathrm{US}}$ represents the set of upstream subcarrier indices. The number of downstream bits, $R_{u,\mathrm{DS}}$, is derived correspondingly. To calculate the number of bits that are transmitted per second, $R_{u,\mathrm{US}}$ is multiplied by the number of DMT symbols that are transmitted in one second. The NRIA calculates an optimized D-FDD bandplan as one part of the DSM process. That is, it finds $R_{u,\mathrm{DS}}$ and $R_{u,\mathrm{US}}$ by iteratively redirecting subcarriers to the downstream or upstream direction.

Let us denote the squared magnitude of the channel transfer function from user $v$ to $u$ on subcarrier $n$ by

$$\mathcal{H}_{uv}^n = \left| H_{uv}^n \right|^2. \qquad (3)$$

Based on the Shannon capacity formula, the number of bits loaded on subcarrier $n$ by user $u$, for two-dimensional symbols, is

$$R_u^n = \log_2\!\left(1 + \frac{\mathcal{H}_{uu}^n P_u^n}{\Gamma N_u^n}\right), \qquad (4)$$

where $P_u^n$ denotes the power spectral density (PSD) of the signal. The PSD specifies the signal power allocation versus frequency. $N_u^n$ denotes the PSD of the noise on subcarrier $n$. $\Gamma$ is the signal-to-noise ratio (SNR) gap, which for a given bit error rate and signal constellation represents the loss compared to the Shannon channel capacity. The PSD of noise is calculated by

$$N_u^n = \sum_{\substack{v=1 \\ v \neq u}}^{U} \mathcal{H}_{uv}^n P_v^n + P_V^n, \qquad (5)$$

where $P_V^n$ denotes the PSD of the background noise on subcarrier $n$.

3. BITRATE RELATIONS USED BY THE NRIA

In this section, we define some simple but usable bitrate relations in order to describe part of the problem with the NRIA that we aim to solve in Section 5.

The NRIA uses a predefined asymmetry parameter $a$ that specifies the ratio between the total desired downstream and upstream bitrates

$$a = \frac{\sum_{u=1}^{U} R_{u,\mathrm{DS}}}{\sum_{u=1}^{U} R_{u,\mathrm{US}}}. \qquad (6)$$

Two "special cases" arise when $a = 0$ and $a = \infty$. For $a = 0$, the total cable capacity is assigned to the upstream transmission direction; thus, we transmit only in the upstream. For $a = \infty$, the total cable capacity is assigned to the downstream transmission direction; thus, we transmit only in the downstream.

For a given transmission direction dir, with $\mathrm{dir} \in \{\mathrm{DS}, \mathrm{US}\}$, we do not know a priori which bitrates can be supported by each user. Therefore, we assign to each user a priority value $\alpha_{u,\mathrm{dir}}$, which specifies how much of the total cable capacity (in a certain transmission direction) will be assigned to user $u$. Hence, we specify the relation between the user priorities and the user bitrates as

$$\frac{R_{1,\mathrm{dir}}}{\alpha_{1,\mathrm{dir}}} = \frac{R_{2,\mathrm{dir}}}{\alpha_{2,\mathrm{dir}}} = \cdots = \frac{R_{U,\mathrm{dir}}}{\alpha_{U,\mathrm{dir}}}, \qquad (7)$$

with

$$\sum_{u=1}^{U} \alpha_{u,\mathrm{dir}} = 1. \qquad (8)$$

A "special case" arises when $\alpha_{u,\mathrm{dir}} = 0$. In this case, the user is not transmitting in the particular direction dir; thus, it is removed from (7).

It can be shown that the downstream and upstream bitrates for each user are related by

$$R_{u,\mathrm{DS}} = a \cdot \frac{\alpha_{u,\mathrm{DS}}}{\alpha_{u,\mathrm{US}}} \cdot R_{u,\mathrm{US}}, \quad \text{for } u = 1, 2, \ldots, U. \qquad (9)$$

Let us illustrate these parameters with a hypothetical but simple two-user example, consisting of one private user and one business user. First, let us assume that the two-pair cable has a total capacity of 18 Mbps (which is in reality unknown), and let us assume that we have a business model that specifies the asymmetry $a = 2$ between the downstream and upstream directions. From the formulas above, we now have $R_{1,\mathrm{DS}} + R_{2,\mathrm{DS}} = 12$ and $R_{1,\mathrm{US}} + R_{2,\mathrm{US}} = 6$. Next, let us assign the downstream priorities $\alpha_{1,\mathrm{DS}} = 1/3$ to the private user and $\alpha_{2,\mathrm{DS}} = 2/3$ to the business user, which gives downstream rates of 4 Mbps and 8 Mbps to the private user and to the business user, respectively.

Similarly, in the upstream let us assign $\alpha_{1,\mathrm{US}} = 1/6$ (private) and $\alpha_{2,\mathrm{US}} = 5/6$ (business), which gives 1 Mbps ($= R_{1,\mathrm{US}}$) and 5 Mbps ($= R_{2,\mathrm{US}}$) to the private user and to the business user, respectively. It can easily be verified that the users' bitrates and priority values fulfill (7) and (8), respectively. Table 1 summarizes the different parameter values given in this example.

Table 1: Example: relations between different bitrates and users' priority values (bitrates in Mbps).

            User priorities        User bitrates          Norm. bitrates
User u      α_{u,DS}   α_{u,US}    R_{u,DS}   R_{u,US}    R_{u,DS}/α_{u,DS}   R_{u,US}/α_{u,US}
1           1/3        1/6         4          1           12                  6
2           2/3        5/6         8          5           12                  6
Σ           1          1           12         6           —                   —

In an actual network scenario, we do not know beforehand which bitrates can be supported. However, the NRIA uses the given asymmetry parameter $a$ and the user priority values $\alpha_{u,\mathrm{dir}}$ to find the desired operating point (i.e., the bitrates of all users), since the quantities represented by these parameters are always related through (6), (7), and (8).
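The two-user example of this section can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name `split_bitrates` and its argument names are hypothetical, and the total capacity is treated as known, which the text notes is not the case in a real network.

```python
from fractions import Fraction


def split_bitrates(total_capacity, a, alpha_ds, alpha_us):
    """Split a known total capacity according to relations (6)-(8).

    Hypothetical helper for illustration only; in practice the total
    capacity is unknown and the NRIA has to find the bitrates itself.
    """
    # (8): the priorities in each direction must sum to one.
    assert sum(alpha_ds) == 1 and sum(alpha_us) == 1
    # (6): total_ds = a * total_us, and both totals add up to the capacity.
    total_us = Fraction(total_capacity) / (1 + a)
    total_ds = a * total_us
    # (7): R_u / alpha_u is the same for every user within one direction.
    rates_ds = [alpha * total_ds for alpha in alpha_ds]
    rates_us = [alpha * total_us for alpha in alpha_us]
    return rates_ds, rates_us


# Values of the example: 18 Mbps in total, asymmetry a = 2.
alpha_ds = [Fraction(1, 3), Fraction(2, 3)]  # private user, business user
alpha_us = [Fraction(1, 6), Fraction(5, 6)]
rates_ds, rates_us = split_bitrates(18, 2, alpha_ds, alpha_us)

# Relation (9) links the two directions for each user.
for r_ds, r_us, a_ds, a_us in zip(rates_ds, rates_us, alpha_ds, alpha_us):
    assert r_ds == 2 * (a_ds / a_us) * r_us

print([float(r) for r in rates_ds], [float(r) for r in rates_us])
```

Exact fractions are used so that the equalities in (7) to (9) can be checked without rounding concerns.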

4. PROBLEM FORMULATION

The IWFA and the OSBA assume a fixed D-FDD bandplan. This makes it difficult to balance the bitrates between the downstream and upstream transmission directions. As a result, one transmission direction often achieves much higher bitrates than those we want to offer, while at the same time the desired bitrates cannot be offered in the other direction. Furthermore, in a particular direction, the IWFA assumes that the target bitrates of all users are known a priori and that they are achievable. The OSBA, on the other hand, uses some form of exhaustive search, which is time consuming, to find the desired operating point. Therefore, with these problems in mind we take a different approach with the NRIA.

As mentioned earlier, the NRIA aims to jointly optimize a bandplan and the power allocation for all users. That is, the NRIA selects the downstream and upstream subcarriers common for all users, represented by the sets $I_{\mathrm{DS}}$ and $I_{\mathrm{US}}$, with $I_{\mathrm{DS}} \cap I_{\mathrm{US}} = \emptyset$. Hence, the users' downstream and upstream bitrates will depend on each other, a property that is often desirable for practical business models. The dependency between the downstream and upstream bitrates guides the NRIA to desirable operating points, as in the example given in Section 3. Furthermore, to jointly optimize the power allocation among all users, two vectors are to be found for each user, specifying the power allocation in the downstream direction, $\mathbf{P}_{u,\mathrm{DS}} = [P_{u,\mathrm{DS}}^{0}, P_{u,\mathrm{DS}}^{1}, \ldots, P_{u,\mathrm{DS}}^{N-1}]$, and in the upstream direction, $\mathbf{P}_{u,\mathrm{US}} = [P_{u,\mathrm{US}}^{0}, P_{u,\mathrm{US}}^{1}, \ldots, P_{u,\mathrm{US}}^{N-1}]$. In addition, each user should satisfy total power constraints: $0 \le \sum_n P_{u,\mathrm{DS}}^{n} \le P_{u,\mathrm{DS}}^{\max}$ and $0 \le \sum_n P_{u,\mathrm{US}}^{n} \le P_{u,\mathrm{US}}^{\max}$, where $P_{u,\mathrm{DS}}^{\max}$ and $P_{u,\mathrm{US}}^{\max}$ denote the maximum total power allowed for user $u$ in the downstream and upstream directions, respectively. Usually the maximum total power constraint is selected the same for all users.

We aim to jointly maximize the bitrates in the downstream and in the upstream for all users under the constraints that the bitrates should satisfy the predefined relations (6), (7), and (8). Without these constraints, a search for a maximized total bitrate would lead to a situation where the users close to the CO (or cabinet) are given very high bitrates at the expense of the distant users, who will get very low bitrates or no DSL service at all.

When formulating the optimization problem, it is convenient to use two indicators for each subcarrier, $\beta_{\mathrm{DS}}^{n}$ and $\beta_{\mathrm{US}}^{n}$, which specify the transmission direction. Due to the D-FDD Zipper type transmission scheme (see footnote 2), the subcarrier indicators fulfill $\beta_{\mathrm{DS}}^{n} = 1 - \beta_{\mathrm{US}}^{n}$ for $n = 0, 1, \ldots, N-1$. For the upstream transmission direction, the relation between $\beta_{\mathrm{US}}$ and $I_{\mathrm{US}}$ is given by

$$\beta_{\mathrm{US}}^{n} = \begin{cases} 1, & \text{if } n \in I_{\mathrm{US}}, \\ 0, & \text{otherwise.} \end{cases} \qquad (10)$$

Footnote 2: Without loss of generality, we do not consider the silent (unused) subcarriers.

A corresponding relation holds for the downstream direction. Using these indicators, (2) can be written as

$$R_{u,\mathrm{US}} = \sum_{n \in I_{\mathrm{US}}} R_u^n = \sum_{n=0}^{N-1} \beta_{\mathrm{US}}^{n} R_u^n. \qquad (11)$$

The optimization problem can now be formulated as follows:

$$\text{maximize} \quad \sum_{u=1}^{U} \left( R_{u,\mathrm{DS}} + R_{u,\mathrm{US}} \right), \qquad (12a)$$
$$\text{subject to} \quad \sum_{u=1}^{U} R_{u,\mathrm{DS}} = a \sum_{u=1}^{U} R_{u,\mathrm{US}}, \qquad (12b)$$
$$\frac{R_{1,\mathrm{DS}}}{\alpha_{1,\mathrm{DS}}} = \frac{R_{2,\mathrm{DS}}}{\alpha_{2,\mathrm{DS}}} = \cdots = \frac{R_{U,\mathrm{DS}}}{\alpha_{U,\mathrm{DS}}}, \qquad (12c)$$
$$\frac{R_{1,\mathrm{US}}}{\alpha_{1,\mathrm{US}}} = \frac{R_{2,\mathrm{US}}}{\alpha_{2,\mathrm{US}}} = \cdots = \frac{R_{U,\mathrm{US}}}{\alpha_{U,\mathrm{US}}}, \qquad (12d)$$
$$\sum_{u=1}^{U} \alpha_{u,\mathrm{DS}} = 1, \quad \sum_{u=1}^{U} \alpha_{u,\mathrm{US}} = 1, \qquad (12e)$$
$$\sum_{n=0}^{N-1} \beta_{\mathrm{DS}}^{n} P_{u,\mathrm{DS}}^{n} \le P_{u,\mathrm{DS}}^{\max}, \quad u = 1, 2, \ldots, U, \qquad (12f)$$
$$\sum_{n=0}^{N-1} \beta_{\mathrm{US}}^{n} P_{u,\mathrm{US}}^{n} \le P_{u,\mathrm{US}}^{\max}, \quad u = 1, 2, \ldots, U, \qquad (12g)$$
$$\beta_{\mathrm{DS}}^{n} = 1 - \beta_{\mathrm{US}}^{n}, \quad n = 0, 1, \ldots, N-1, \qquad (12h)$$
$$\beta_{\mathrm{DS}}^{n}, \beta_{\mathrm{US}}^{n} \in \{0, 1\}, \qquad (12i)$$
$$P_{u,\mathrm{DS}}^{n}, P_{u,\mathrm{US}}^{n} \in \mathbb{R}_{0}^{+}. \qquad (12j)$$

The asymmetry parameter $a$ and the user priority values $\alpha$ are all constants (and a designer's choice) for the NRIA, as explained in Section 3. A PSD constraint is often given for practical implementations of DSL modems. If this is the case, the allowed power range $[0, \mathbb{R}^{+}]$, where $\mathbb{R}^{+}$ denotes the real positive numbers, in (12j) should be replaced with $[0, \ldots, P_u^{n,\max}]$, where $P_u^{n,\max}$ denotes the maximum PSD level allowed for user $u$ on subcarrier $n$ for a given transmission direction. In practice, the PSD mask constraint is usually the same for all DSL systems of the same type.

The optimization problem (12) involves binary variables from (12i) and continuous variables for the PSDs. Furthermore, we have nonlinear relations between binary and continuous variables in the objective (12a) as well as in the constraints (12b), (12c), and (12d), which are related also through (11). Therefore, (12) is a mixed-integer nonlinear optimization problem [12], which in general is very challenging from a computational point of view. Even for a fixed downstream and upstream subcarrier allocation, the objective function (12a) and the constraints (12b), (12c), and (12d) are neither convex nor concave with respect to the users' power allocations. Thus, this type of optimization problem is not solvable with existing algorithms [12, 13].

In theory, it is possible to exhaustively try out all possible combinations of subcarrier allocations and, for each allocation, to try all possible combinations of PSD mask realizations for all users. However, the number of combinations is tremendously high and practically infeasible. For instance, in VDSL Zipper-DMT [14–16] with 4096 subcarriers this results in $2^{4096}$ possible combinations of subcarrier allocations. The number of possible combinations of PSD mask realizations of all users for a particular transmission direction is $(R^{n,\max} + 1)^{N_{\mathrm{dir}} \cdot U}$, where $R^{n,\max}$ denotes the maximum number of bits that can be loaded on a subcarrier, and $N_{\mathrm{dir}}$ is the number of subcarriers used in a particular transmission direction. Thus, a rather typical case with $R^{n,\max} = 15$ bits, $U = 10$ users, and $N_{\mathrm{dir}} = 2048$ upstream subcarriers has $16^{20480}$ possible PSD mask realizations.

For a similar optimization problem but with a fixed subcarrier allocation, a dual decomposition method has been suggested [5–7]. In particular, the OSBA has reduced the search space for possible PSD mask realizations and has linear complexity in the number of subcarriers $N_{\mathrm{dir}}$. However, the OSBA still has a complexity that increases exponentially with the number of users $U$, making it too complex for most of the DSL access network scenarios that are found in practice. In the following section, we propose the normalized-rate iterative algorithm that solves the formulated optimization problem in a suboptimal way.

5. THE NORMALIZED-RATE ITERATIVE ALGORITHM

Preset Values
    $a_{\mathrm{Target}}$, $\alpha_{1,\mathrm{DS}}, \ldots, \alpha_{U,\mathrm{DS}}$, $\alpha_{1,\mathrm{US}}, \ldots, \alpha_{U,\mathrm{US}}$, $K$, $M_{\mathrm{DS}}$, $M_{\mathrm{US}}$

Initialize
    $\mathcal{U}_{\mathrm{DS}}$ = UserOrdering(DS)
    $\mathcal{U}_{\mathrm{US}}$ = UserOrdering(US)
    $I_{\mathrm{DS}}, I_{\mathrm{US}}$ = InitialBandPlan($K$)

Main Function
    repeat
        $\forall u$: $\mathbf{P}_{u,\mathrm{DS}} = 0$, $\mathbf{P}_{u,\mathrm{US}} = 0$    {Set PSD masks to zero}
        $T_{\mathrm{DS}} = \infty$, $T_{\mathrm{US}} = \infty$    {Set targets to infinity}
        $R_{\mathrm{DS}}, \mathbf{P}_{\mathrm{DS}}$ = CalcRatesPSDs(DS, $\mathbf{P}_{\mathrm{DS}}$, $T_{\mathrm{DS}}$, $M_{\mathrm{DS}}$)
        $R_{\mathrm{US}}, \mathbf{P}_{\mathrm{US}}$ = CalcRatesPSDs(US, $\mathbf{P}_{\mathrm{US}}$, $T_{\mathrm{US}}$, $M_{\mathrm{US}}$)
        $a = \sum_{u=1}^{U} R_{u,\mathrm{DS}} \big/ \sum_{u=1}^{U} R_{u,\mathrm{US}}$
        $I_{\mathrm{DS}}, I_{\mathrm{US}}$ = ChangeBandPlan($a$, $a_{\mathrm{Target}}$, $I_{\mathrm{DS}}$, $I_{\mathrm{US}}$)
    until $a$ has reached the desired accuracy $a_{\mathrm{Target}}$ or the maximal number of iterations in the outer stage, $O_{\max}$, has been examined.

CalcRatesPSDs Function
    $R_{\mathrm{dir}}, \mathbf{P}_{\mathrm{dir}}$ = CalcRatesPSDs(dir, $\mathbf{P}_{\mathrm{dir}}$, $T_{\mathrm{dir}}$, $M_{\mathrm{dir}}$)
    $i = 1$
    repeat
        for $u \in \mathcal{U}_{\mathrm{dir}}$ do
            for $n \in I_{\mathrm{dir}}$ do
                $N_{u,\mathrm{dir}}^{n} = \sum_{v=1, v \neq u}^{U} \mathcal{H}_{uv}^{n} P_{v,\mathrm{dir}}^{n} + P_{V}^{n}$
            end for
            $R_{u,\mathrm{dir}}, \mathbf{P}_{u,\mathrm{dir}}$ = ModifiedFM-WF($N_{u,\mathrm{dir}}$, $T_{\mathrm{dir}}$, $P_{u,\mathrm{dir}}^{\max}$)
            $R_{\mathrm{dir}}(i) = R_{u,\mathrm{dir}} / \alpha_{u,\mathrm{dir}}$
            i

The normalized-rate iterative algorithm (NRIA) consists of two levels of nested iterations: an outer stage that searches for an optimized downstream and upstream subcarrier allo-

function ModifiedFM-WF as it is an inherent property of any type of water-filling (bit-loading) algorithm.

Figure 2: An example of initial subcarrier allocation, where the total number of available subcarriers N is divided into K = 4 subbands.

5.1. Algorithmic details

User ordering. Due to the estimation of the target bitrate in each iteration of the inner stage, the user ordering over which the NRIA iterates becomes important for the convergence speed of the algorithm. To speed up convergence, the users should first be arranged in decreasing priority order and the users within the same priority group should be arranged in order of decreasing line attenuation. The NRIA performs this ordering independently for both transmission directions. An extensive analysis with objectives and requirements for such an ordering is given in [3].

Initial bandplan. The NRIA partitions the available spectrum into $K$ subbands, with $N/K$ subcarriers in each. In order to simplify the description but without loss of generality, we assume that both $N$ and $K$ are powers of two. When there are some unused or silent subcarriers, they are simply zeroed in the algorithm and kept outside the optimization process. An example with four subbands is shown in Figure 2. It is also possible to start with an upstream subband in the low frequencies. However, in practice, we usually start with a downstream subband to be spectrally compatible with asymmetric DSL (ADSL) downstream transmission.

Change bandplan. For a given subcarrier allocation, the inner stage of the NRIA calculates the bitrates for all users in the downstream and upstream directions. Then, depending on the achieved bitrates of all users and the predefined asymmetry $a$, the NRIA performs a binary search within the subbands for a new subcarrier allocation. This is performed with the constraint given in (12b). There are three cases:

(c1) $\sum_{u=1}^{U} R_{u,\mathrm{DS}} > a \sum_{u=1}^{U} R_{u,\mathrm{US}}$, which indicates that more subcarriers should be assigned in the upstream direction,
(c2) $\sum_{u=1}^{U} R_{u,\mathrm{DS}}$

to the right or to the left as shown in Figure 3. The subcarrier allocation is the same for all users, since FDD is considered.

Figure 3: Illustration of the search for a subcarrier allocation.

The number of iterations in the outer stage depends on the number of subcarriers per subband and on the predefined accuracy with which the inequalities (c1) and (c2) should be fulfilled. However, the maximum number of iterations $O_{\max}$ in the outer stage can always be determined in advance, and this depends only on the number of subcarriers per subband, $N/K$. Therefore, due to the binary search, $O_{\max} = \log_2(N/K) + 1$. For example, when the number of subcarriers is $N = 4096$, as in VDSL, and when we select $K = 8$ subbands, then $O_{\max} = \log_2(512) + 1 = 10$.

Modified fixed-margin water-filling (FM-WF) algorithm. The water-filling (bit-loading) algorithm used in the inner stage is a modified version of the FM-WF algorithm [17]. The FM-WF algorithm uses only the power needed to achieve a predefined target bitrate. As described, we do not know a priori if a specific target bitrate can be supported for a given maximum total power. Therefore, we have modified the fixed-margin water-filling algorithm as follows: if the target bitrate can be supported, then only the power needed to support that bitrate is used; otherwise, the maximum total power is used and the supported bitrate is calculated.

The pseudocode of the modified FM-WF algorithm, when the continuous bit-loading is used as a water-filling al-
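The rule described for the modified FM-WF algorithm (use only the power needed when the target is reachable, otherwise use the maximum total power and report the supported bitrate) can be sketched in Python for a single user with continuous bit-loading. This is a minimal sketch under stated assumptions, not the authors' pseudocode: the function names are hypothetical, each subcarrier is described by an assumed noise-to-gain value g_n = Γ·N_u^n / |H_uu^n|² so that (4) becomes log2(1 + P^n/g_n), and the bisection on the water level is one possible way to meet a target rather than the method of [17].

```python
import math


def water_level(g, power):
    """Water level mu such that sum(max(0, mu - g_n)) equals `power`."""
    gs = sorted(g)
    for k in range(len(gs), 0, -1):
        # Try to fill the k subcarriers with the smallest g_n.
        mu = (power + sum(gs[:k])) / k
        if mu > gs[k - 1]:
            return mu
    return gs[0]  # no power available: nothing is filled


def rate(g, mu):
    """Bits per DMT symbol for water level mu (continuous bit-loading)."""
    return sum(math.log2(mu / gn) for gn in g if mu > gn)


def modified_fm_wf(g, target_bits, p_max):
    """Return (supported bits, per-subcarrier powers).

    g           : assumed noise-to-gain values, Gamma * N / |H|^2 (all > 0)
    target_bits : target bitrate in bits per symbol (may be math.inf)
    p_max       : maximum total power of the user
    """
    mu_hi = water_level(g, p_max)
    if rate(g, mu_hi) <= target_bits:
        # Target not exceeded even at full power: use the maximum power.
        mu = mu_hi
    else:
        # Target reachable: lower the water level until it is just met.
        mu_lo = min(g)
        for _ in range(100):
            mid = 0.5 * (mu_lo + mu_hi)
            if rate(g, mid) < target_bits:
                mu_lo = mid
            else:
                mu_hi = mid
        mu = mu_hi
    powers = [max(0.0, mu - gn) for gn in g]
    return rate(g, mu), powers


g = [1.0, 2.0, 4.0]                                   # hypothetical values
bits_max, p_full = modified_fm_wf(g, math.inf, 5.0)   # unknown target
bits_fix, p_part = modified_fm_wf(g, 1.5, 5.0)        # reachable target
print(bits_max, sum(p_full), bits_fix, sum(p_part))
```

Passing an infinite target mirrors the "set targets to infinity" step of the main function: the call then simply returns the bitrate supported at maximum total power.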

To achieve a bit error rate of $10^{-7}$ we have assumed an SNR gap of $\Gamma = 12.3$ dB. It is derived as $\Gamma = \Gamma_{\mathrm{Mod}} + \Gamma_{\mathrm{Noise}} - \Gamma_{\mathrm{Code}} = 9.8 + 6 - 3.5 = 12.3$ dB, where $\Gamma_{\mathrm{Mod}}$ denotes the modulation gap, which for a quadrature amplitude modulation signal constellation is 9.8 dB [17]; $\Gamma_{\mathrm{Noise}}$ denotes the noise margin, which is assumed to be 6 dB; and $\Gamma_{\mathrm{Code}}$ denotes the coding gain, which is set to 3.5 dB.

$K = 8$ downstream and upstream subbands are typically sufficient to achieve the desirable bitrates [1], and this value is also used in this paper. As suggested in Section 5, we select $M_{\mathrm{DS}} = M_{\mathrm{US}} = U$.

6.1. Rate regions

To compare the performance of the NRIA with the IWFA and the OSBA we will use the rate region concept, which characterizes all possible bitrate combinations among the users. Due to the fixed subcarrier allocation, the downstream and upstream rate regions of the IWFA and the OSBA are independent and $U$-dimensional. For example, in a two-user case the rate regions of the IWFA and the OSBA can be plotted in a two-dimensional space, as shown in Figure 6, and any pair of bitrates can independently be selected from the downstream and upstream rate regions. Furthermore, any pair of bitrates that lies inside the rate regions can be supported by the IWFA. With the OSBA, however, any pair of bitrates that lies on the rate region boundaries can be found, because only such pairs maximize the weighted sum of the bitrates [6].

Figure 6: Illustration of rate regions of the IWFA and OSBA.

Since the NRIA searches for an optimized downstream and upstream subcarrier allocation, the downstream and upstream rate regions become dependent. For a two-user case the NRIA finds two pairs of downstream and upstream bitrates, $(R_{1,\mathrm{DS}}, R_{2,\mathrm{DS}})$ and $(R_{1,\mathrm{US}}, R_{2,\mathrm{US}})$, which are related by three independent parameters: $a$, $\alpha_{1,\mathrm{DS}}$, and $\alpha_{1,\mathrm{US}}$, as described in Section 3. Thus, the NRIA supported rate regions for the two-user case are five-dimensional. This is difficult to visualize and to compare with the two-dimensional rate regions of the IWFA and the OSBA. Furthermore, the NRIA finds only those pairs of bitrates that lie on the rate region boundaries, because only those pairs maximize the sum of downstream and upstream bitrates and satisfy the relations defined in (9).

One way to compare the NRIA with the IWFA and the OSBA is to calculate the parameters needed by the NRIA from the two-dimensional rate regions spanned by the IWFA and the OSBA. For example, let us select the pair of bitrates at point A for the downstream and the pair of bitrates at point B for the upstream as shown in Figure 6. For these two pairs of bitrates we can calculate the asymmetry parameter $a$ and the users' priority values $\alpha_{u,\mathrm{DS}}$ and $\alpha_{u,\mathrm{US}}$ needed in the NRIA by using (6) and (7). We can repeat this for any two pairs of downstream and upstream bitrates and draw the corresponding downstream and upstream rate regions of the NRIA. However, this strategy excludes a large portion of the bitrates that are supported by the NRIA but not by the IWFA and the OSBA. Hence, such a comparison would become quite skewed.

Instead, we will use another way to compare the three algorithms. We will assume equal downstream and upstream user priorities for the NRIA, which for the two-user case from (9) yields

$$R_{u,\mathrm{DS}} = a R_{u,\mathrm{US}}, \quad \text{for } u = 1, 2. \qquad (15)$$

Under this assumption, for a fixed asymmetry parameter value $a$, we can plot the rate regions of the NRIA in two-dimensional space. However, note that the NRIA will now only support those downstream and upstream bitrates for which $\alpha_{u,\mathrm{US}} = \alpha_{u,\mathrm{DS}}$. Thus, we calculate two pairs of upstream and downstream bitrates, $(R_{1,\mathrm{US}}, R_{2,\mathrm{US}})$ and $(R_{1,\mathrm{DS}}, R_{2,\mathrm{DS}}) = (aR_{1,\mathrm{US}}, aR_{2,\mathrm{US}})$, respectively. As a result, depending on $a$, the downstream bitrates are either expanded or contracted compared to the upstream bitrates. That is, the two pairs of downstream and upstream bitrates lie on a line that also crosses the origin of the bitrate axes. This line will be included in some plots to better illustrate the bitrate relations between the different algorithms.

Figure 7: An example of rate regions of the NRIA for the two-user case when $\alpha_{1,\mathrm{DS}} = \alpha_{1,\mathrm{US}}$ and $\alpha_{2,\mathrm{DS}} = \alpha_{2,\mathrm{US}}$, and for asymmetry parameter value $a = 2$.
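The first comparison strategy of this section, deriving the NRIA parameters from one downstream and one upstream operating point by means of (6) and (7), can be sketched as follows. The function name and the two bitrate pairs are hypothetical and chosen only for illustration; they are not simulation results from this paper.

```python
def nria_parameters(rates_ds, rates_us):
    """Derive a and the priorities from two bitrate pairs via (6)-(8)."""
    total_ds = sum(rates_ds)
    total_us = sum(rates_us)
    a = total_ds / total_us                       # relation (6)
    # (7) with (8): each priority is the user's share of the total.
    alpha_ds = [r / total_ds for r in rates_ds]
    alpha_us = [r / total_us for r in rates_us]
    return a, alpha_ds, alpha_us


# Hypothetical operating points in Mbps (user 1, user 2).
point_ds = (40.0, 20.0)   # a downstream pair, playing the role of point A
point_us = (30.0, 10.0)   # an upstream pair, playing the role of point B
a, alpha_ds, alpha_us = nria_parameters(point_ds, point_us)

# Relation (9) reproduces the downstream bitrates from the upstream ones.
rebuilt_ds = [a * (ad / au) * r
              for ad, au, r in zip(alpha_ds, alpha_us, point_us)]
print(a, alpha_ds, alpha_us, rebuilt_ds)
```

When the two points happen to have the same user shares in both directions, the priorities coincide and (9) collapses to the simpler relation (15).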

Figure 7 shows an example of the rate regions of the NRIA for the asymmetry parameter value set to $a = 2$. In the same plot a pair of downstream bitrates at point C and a pair of upstream bitrates at point D, which lie on a line that also crosses the origin, are also shown. Note that for the symmetric case, with $a = 1$, C = D and the corresponding rate regions coincide.

6.2. Comparison of the NRIA with the IWFA

In this section, we compare the performance of the NRIA with the iterative water-filling algorithm (IWFA) [4]. The IWFA assumes a fixed frequency bandplan. Therefore, for all simulations concerning the IWFA we will use one of the standardized frequency bandplans: the bandplan "997," without guard bands, with the corresponding downstream and upstream subcarriers

$$I_{\mathrm{DS}} = \{32 \cdots 695,\ 1183 \cdots 1634\},$$
$$I_{\mathrm{US}} = \{696 \cdots 1182,\ 1635 \cdots 2782\}. \qquad (16)$$

Figure 8 shows the downstream and upstream rate regions for the IWFA and for the NRIA with $a = 1$, that is, the NRIA forces symmetric bitrates for each user. To compare the NRIA with the IWFA we need to find the corresponding symmetric bitrates for the IWFA. They are located where the boundaries of the downstream and upstream IWFA rate regions intersect.

Figure 8: Downstream and upstream rate regions of IWFA and NRIA for $a = 1$, $\alpha_{1,\mathrm{DS}} = \alpha_{1,\mathrm{US}}$, and $\alpha_{2,\mathrm{DS}} = \alpha_{2,\mathrm{US}}$. (Axes: $u_1$ bitrate (Mbps) versus $u_2$ bitrate (Mbps); curves: IWFA: DS, IWFA: US, NRIA: DS = US ($a = 1$).)

The bitrate figures for the symmetric case are summarized in Table 2. We see that if we fix the bitrate of user $u_1$ to 53.35 Mbps, as achieved by the IWFA, the NRIA can increase the bitrate of $u_2$ from 10.36 to 12.80 Mbps (point A, an increase of 23%). If we instead fix the bitrate of $u_2$ at 10.36 Mbps, the NRIA can increase the bitrate of $u_1$ from 53.35 to 57.50 Mbps (point B, an increase of 8%). The gain is smaller for the latter case, since $u_1$ disturbs $u_2$ more than vice versa, due to the upstream "near-far" problem [4].

Table 2: Comparison of the NRIA and the IWFA, symmetric bitrates ($a = 1.00$).

Algorithm   Direction   User u1 (Mbps)   User u2 (Mbps)   Increase (%)
IWFA        DS/US       53.35            10.36            —
NRIA (A)    DS/US       53.35            12.80            23.5
NRIA (B)    DS/US       57.50            10.36            7.78
NRIA (C)    DS/US       56.46            10.96            5.82

For distributed DSL access networks in general, decreasing one user's bitrate does not necessarily increase another user's bitrate correspondingly. Therefore, in a third comparison the users' bitrate relations of the NRIA and the IWFA are set equal:

$$\left.\frac{R_1}{R_2}\right|_{\mathrm{IWFA}} = \left.\frac{R_1}{R_2}\right|_{\mathrm{NRIA}} = \frac{\alpha_1}{\alpha_2}. \qquad (17)$$

This is depicted in Figure 8 at point C, where the dashed line (corresponding to (17)) intersects with the NRIA's rate region boundary. For this case, a total bitrate increase of about 6% is achieved with the NRIA compared to the IWFA.

For a fixed subcarrier allocation (when only the inner stage of the NRIA is used), the NRIA cannot outperform the IWFA since the inner stage of the NRIA is based on the IWFA. However, in this case the NRIA can be used to calculate the set of maximum achievable users' bitrates for the IWFA. The alternative, to calculate the sets of achievable bitrates in the IWFA by exhaustively testing all possible maximum total power constraints [4], requires much higher computational complexity.

We also compare the performance of the NRIA with the IWFA when the asymmetry parameter value is $a = 1.25$, that is, higher downstream than upstream bitrates. Again, we should compare the NRIA with the IWFA for which $a = 1.25$. In Figure 9 the bitrates for this case are found at the intersections of the dashed line with the IWFA downstream and upstream rate region boundaries. These bitrates are summarized in Table 3, and it can be verified that they satisfy the equal priority relations:

$$\left.\frac{R_{1,\mathrm{DS}}}{R_{2,\mathrm{DS}}}\right|_{\mathrm{IWFA}} = \left.\frac{R_{1,\mathrm{DS}}}{R_{2,\mathrm{DS}}}\right|_{\mathrm{NRIA}} = \frac{\alpha_{1,\mathrm{DS}}}{\alpha_{2,\mathrm{DS}}},$$
$$\left.\frac{R_{1,\mathrm{US}}}{R_{2,\mathrm{US}}}\right|_{\mathrm{IWFA}} = \left.\frac{R_{1,\mathrm{US}}}{R_{2,\mathrm{US}}}\right|_{\mathrm{NRIA}} = \frac{\alpha_{1,\mathrm{US}}}{\alpha_{2,\mathrm{US}}}. \qquad (18)$$

We see that the NRIA achieves an increase of more than 12% in each transmission direction. The corresponding downstream and upstream transmit PSDs of the IWFA and the NRIA when $a = 1.25$ are shown in Figures 10 and 11.

The NRIA optimized downstream and upstream subcarrier allocation for the analyzed two-user network scenario with $K = 8$ subbands, $a = 1.25$ asymmetry, and the bitrates given in Table 3 is

$$I_{\mathrm{DS}} = \{32 \cdots 477,\ 1025 \cdots 1501,\ 2049 \cdots 2525\},$$
$$I_{\mathrm{US}} = \{478 \cdots 1024,\ 1502 \cdots 2048,\ 2526 \cdots 2782\}. \qquad (19)$$
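As a quick consistency check, the subcarrier sets in (16) and (19) can be expanded and compared in Python. The sketch below only uses the index ranges printed in those equations; the helper name `expand` is hypothetical, and the ranges are assumed to be inclusive at both ends.

```python
def expand(ranges):
    """Expand inclusive (first, last) index ranges into a set of subcarriers."""
    return {n for first, last in ranges for n in range(first, last + 1)}


# Fixed bandplan "997" from (16), used for the IWFA.
ds_997 = expand([(32, 695), (1183, 1634)])
us_997 = expand([(696, 1182), (1635, 2782)])

# Bandplan from (19), found by the NRIA for a = 1.25.
ds_nria = expand([(32, 477), (1025, 1501), (2049, 2525)])
us_nria = expand([(478, 1024), (1502, 2048), (2526, 2782)])

used = set(range(32, 2783))  # subcarriers 32, ..., 2782
for ds, us in ((ds_997, us_997), (ds_nria, us_nria)):
    assert ds.isdisjoint(us)   # I_DS and I_US must not overlap
    assert ds | us == used     # together they cover the used range

print(len(ds_997), len(us_997))    # subcarrier counts for (16)
print(len(ds_nria), len(us_nria))  # subcarrier counts for (19)
```

Both bandplans partition the same 2751 used subcarriers; they differ in how many subcarriers each direction receives and in where those subcarriers lie in frequency.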

Figure 9: Downstream and upstream rate regions of the IWFA and NRIA for $a = 1.25$, $\alpha_{1,\mathrm{DS}} = \alpha_{1,\mathrm{US}}$, and $\alpha_{2,\mathrm{DS}} = \alpha_{2,\mathrm{US}}$. (Axes: $u_1$ bitrate (Mbps) versus $u_2$ bitrate (Mbps); curves: IWFA: DS, IWFA: US, NRIA: DS ($a = 1.25$), NRIA: US ($a = 1.25$).)

Table 3: Comparison of the NRIA and the IWFA, asymmetric bitrates ($a = 1.25$).

Algorithm   Direction   User u1 (Mbps)   User u2 (Mbps)   Increase (%)
IWFA        DS          41.25            21.42            —
IWFA        US          33.25            16.88            —
NRIA        DS          46.60            23.88            12.4
NRIA        US          37.24            19.19            12.5

Note that although we have selected eight subbands (four for each transmission direction), this has been reduced to six subbands (three for each transmission direction) since only subcarriers in the range $\{32, \ldots, 2782\}$ are used (out of 4096 in total).

The transmit PSDs of the NRIA and the IWFA are nonsmooth due to the integer bit-loading algorithm. However, the PSDs of the NRIA and the IWFA are more or less flat over the used frequency spectrum. The PSDs of the IWFA are almost flat, as shown in [20, 21], when the integer bit-loading algorithm is used. Because the inner stage of the NRIA is based on the IWFA, the PSDs generated by the NRIA are likewise almost flat when the integer bit-loading algorithm is used.

The NRIA achieves better performance due to the better optimized bandplan and power allocations found by the NRIA, compared to the fixed bandplan and the power allocations found by the IWFA. Let us analyze this for a particular case; the same reasoning can be applied to other cases. Consider the second user and the upstream transmission direction. From Figure 11 it can be seen that the NRIA and the IWFA utilize quite similar bandwidths and that the PSDs also have similar levels. However, compared to the IWFA, the NRIA utilizes low frequencies where the channel conditions typically are better (the noise-to-channel-gain ratio is low). This results in higher supported bitrates by the NRIA compared to the IWFA.

6.3. Comparison of the NRIA with the OSBA

In this section, we compare the NRIA with the optimal spectrum balancing algorithm (OSBA) and use the same two-user scenario as before (cf. Figure 5). Furthermore, as the OSBA assumes a fixed bandplan, we use the same subcarrier allocation as used in Section 6.2 for the IWFA.

Figure 12 shows the downstream and upstream rate regions for the NRIA with symmetric bitrates ($a = 1$) and for the OSBA with varying asymmetry. The corresponding symmetric bitrates for the OSBA are given in Table 4. To allow user $u_1$ to have 52.32 Mbps with the NRIA, as achieved by the OSBA, user $u_2$ can only have 13.55 Mbps (point A, a loss of 33%). Alternatively, to allow $u_2$ to have 20.13 Mbps with the NRIA, as achieved by the OSBA, $u_1$ can have only 43.37 Mbps (point B, a loss of 17%). Finally, when we want to have the same bitrate relations with the NRIA as with the OSBA, then the bitrates given in Table 4 can be supported, which corresponds to a loss of 11% (point C in Figure 12).

If we instead compare the performance of the OSBA with the NRIA for the asymmetry parameter value set to $a = 1.25$, the downstream and upstream bitrates given in Table 5 can be supported. For this case the NRIA suffers a loss of less than 5% in both transmission directions.

These results are not surprising since the OSBA can in theory deliver the highest bitrates for a given bandplan. However, the OSBA has a much higher computational complexity compared to the NRIA. We have shown in [3] that when the Levin-Campello bit-loading algorithm is used, the computational complexity of the NRIA is

$$C_{\mathrm{NRIA}} = \mathcal{O}\big(O \cdot i_{\mathrm{DS}} \cdot N_{\mathrm{DS}} (U + 1)\big) + \mathcal{O}\big(O \cdot i_{\mathrm{US}} \cdot N_{\mathrm{US}} (U + 1)\big), \qquad (20)$$

where $\mathcal{O}$ denotes the complexity order; $O$ denotes the number of iterations in the outer stage; $U$ denotes the number of users; $i_{\mathrm{DS}}$ and $i_{\mathrm{US}}$ denote the number of iterations in the downstream and upstream inner stages, respectively; $N_{\mathrm{DS}}$ denotes the average number of subcarriers in the downstream assigned over all downstream outer stage iterations; and $N_{\mathrm{US}}$ denotes the average number of subcarriers in the upstream assigned over all upstream outer stage iterations.

The complexity of the OSBA for a particular transmission direction is [6]

$$C_{\mathrm{OSBA}} = \mathcal{O}\big(N_{\mathrm{dir}}\, U \left(R^{n,\max} + 1\right)^{U} 33^{U}\big), \qquad (21)$$

where $N_{\mathrm{dir}}$ denotes the number of subcarriers assigned in a particular transmission direction and $R^{n,\max}$ denotes the maximum number of bits that can be loaded on a subcarrier. For both transmission directions based on (21), the complexity of the OSBA is

$$C_{\mathrm{OSBA}} = \mathcal{O}\big(N_{\mathrm{DS}}\, U \left(R^{n,\max} + 1\right)^{U} 33^{U}\big) + \mathcal{O}\big(N_{\mathrm{US}}\, U \left(R^{n,\max} + 1\right)^{U} 33^{U}\big). \qquad (22)$$

First we analyze the complexity of the NRIA and the OSBA for the two-user case ($U = 2$). For the NRIA the maximum
Consider the For both transmission directions based on (21), the complex- second user and the upstream transmission direction. From ity of the OSBA is Figure 11 it can be shown that the NRIA and the IWFA utilize quite similar bandwidths and that the PSDs also have similar n,max U U COSBA = O NDSU R +1 33 levels. However, compared to the IWFA, the NRIA utilizes (22) n,max U U low frequencies where the channel conditions typically are + O NUSU R +1 33 . better (the noise-to-channel-gain ratio is low). This results in higher supported bitrates by the NRIA compared to the First we analyze the complexity of the NRIA and the OSBA IWFA. for the two-user case (U = 2). For the NRIA the maximum Driton Statovci et al. 11

Figure 10: Downstream transmit PSDs of the NRIA and the IWFA for users' bitrates given in Table 3. (a) u1 downstream PSDs; (b) u2 downstream PSDs. (Axes: subcarrier index versus PSD (dBm/Hz); curves: NRIA, IWFA.)

Figure 11: Upstream transmit PSDs of the NRIA and the IWFA for users' bitrates given in Table 3. (a) u1 upstream PSDs; (b) u2 upstream PSDs. (Axes: subcarrier index versus PSD (dBm/Hz); curves: NRIA, IWFA.)

number of outer iterations is $O_{\max} = 10$, when K = 8 subbands and N = 4096 subcarriers. The expected number of downstream and upstream inner stage iterations for the two-user case to achieve the desired accuracy is smaller than 50; thus, $i_{\mathrm{DS}} < 50$ and $i_{\mathrm{US}} < 50$. This statement is confirmed in [3]. Substituting $O = 10$ and $i_{\mathrm{DS}} = i_{\mathrm{US}} = 50$ into (20), so that $O \cdot i \cdot (U+1) = 10 \cdot 50 \cdot 3 = 1.5 \cdot 10^{3}$, yields

$$C_{\mathrm{NRIA}} = \mathcal{O}\big(1.5 \cdot 10^{3} N_{\mathrm{DS}}\big) + \mathcal{O}\big(1.5 \cdot 10^{3} N_{\mathrm{US}}\big). \tag{23}$$

Correspondingly for the OSBA, after substituting $R^{n,\max} = 15$ into (22), so that $U (R^{n,\max}+1)^{U} 33^{U} = 2 \cdot 16^{2} \cdot 33^{2} \approx 557.6 \cdot 10^{3}$, we get

$$C_{\mathrm{OSBA}} = \mathcal{O}\big(557.6 \cdot 10^{3} N_{\mathrm{DS}}\big) + \mathcal{O}\big(557.6 \cdot 10^{3} N_{\mathrm{US}}\big). \tag{24}$$

From (23) and (24) we can conclude that even for the two-user case the complexity of the OSBA is much higher than the complexity of the NRIA. When the number of users increases, it is obvious from (20) and (22) that the computational complexity of the OSBA increases faster than that of the NRIA. This is because the complexity of the NRIA increases linearly with the number of users, whereas the complexity of the OSBA increases exponentially with the number of users.

Another advantage of the NRIA compared to the OSBA is that the NRIA supports downstream and upstream users' bitrate combinations that the OSBA cannot, due to the optimized downstream and upstream subcarrier allocation. Table 6 summarizes some bitrate combinations that can be supported by the NRIA but not by the OSBA for the selected bandplan "997." Note that these bitrate combinations

Figure 12: Down- and upstream rate regions of the OSBA and NRIA for a = 1, α1,DS = α1,US, and α2,DS = α2,US. (Axes: u1 bitrate (Mbps) versus u2 bitrate (Mbps); curves: OSBA: DS, OSBA: US, NRIA: DS = US (a = 1); marked points: A, B, C.)

Table 4: Comparison of the NRIA with the OSBA, symmetric bitrates (a = 1.00).

| Algorithm | Direction | User u1 (Mbps) | User u2 (Mbps) | Loss (%) |
|---|---|---|---|---|
| OSBA | DS/US | 52.32 | 20.13 | - |
| NRIA (A) | DS/US | 52.32 | 13.55 | 33.3 |
| NRIA (B) | DS/US | 43.37 | 20.13 | 17.2 |
| NRIA (C) | DS/US | 46.52 | 17.90 | 11.1 |

Table 5: Comparison of the NRIA with the OSBA, asymmetric bitrates (a = 1.25).

| Algorithm | Direction | User u1 (Mbps) | User u2 (Mbps) | Loss (%) |
|---|---|---|---|---|
| OSBA | DS | 40.95 | 27.00 | - |
| OSBA | US | 32.76 | 21.60 | - |
| NRIA | DS | 38.95 | 25.68 | 4.9 |
| NRIA | US | 31.29 | 20.63 | 4.5 |

Table 6: Some bitrate combinations that can be supported by the NRIA but not by the OSBA (corresponding to Figures 12 and 13).

| Asymmetry | Direction | User u1 (Mbps) | User u2 (Mbps) |
|---|---|---|---|
| a = 1.00 | DS/US | 25.0 | 25.0 |
| a = 1.00 | DS/US | 35.0 | 23.0 |
| a = 1.00 | DS/US | 15.0 | 26.0 |
| a = 1.25 | DS | 25.0 | 28.0 |
| a = 1.25 | US | 20.0 | 22.5 |
| a = 1.25 | DS | 13.3 | 29.3 |
| a = 1.25 | US | 10.2 | 23.3 |

are generated under the constraint of equal downstream and upstream user priorities.

Figure 13: Down- and upstream rate regions of the OSBA and NRIA for a = 1.25, α1,DS = α1,US, and α2,DS = α2,US. (Axes: u1 bitrate (Mbps) versus u2 bitrate (Mbps); curves: OSBA: DS, OSBA: US, NRIA: DS (a = 1.25), NRIA: US (a = 1.25).)

Figures 14 and 15 show the downstream and upstream transmit PSDs of the OSBA and the NRIA corresponding to asymmetry a = 1.25 and the bitrates given in Table 5. The optimized D-FDD bandplan found by the NRIA for this two-user scenario is

$$I_{\mathrm{DS}} = \{32 \cdots 493,\ 1025 \cdots 1517,\ 2049 \cdots 2541\}, \qquad I_{\mathrm{US}} = \{494 \cdots 1024,\ 1518 \cdots 2048,\ 2542 \cdots 2782\}. \tag{25}$$

The downstream and upstream PSDs of the NRIA have similar shapes to those shown when comparing the NRIA for a = 1.25 with the IWFA. However, the downstream and upstream PSDs generated by the OSBA have completely different shapes compared to those generated by the IWFA.

Figure 14 shows that the OSBA will partially reduce the transmit power for user u1 in the downstream direction. The PSD in the high frequencies of the first downstream subband and in the low frequencies of the second downstream subband is reduced. Therefore, the users are not disturbing each other significantly. In the upstream direction, as can be seen in Figure 15, the PSDs of both users for the OSBA do not overlap at all. They are flat, because the transmitters see a more or less flat noise-to-channel-gain ratio (N/H, cf. Section 2) over all used subcarriers. For a complete comparison, Figure 16 shows the rate regions of all three algorithms (IWFA, OSBA, and NRIA with asymmetry a = 1.25).

6.4. Comparison of the NRIA with the bidirectional IWFA

The NRIA, which optimizes the bandplan, always performs better than the IWFA, which assumes a fixed bandplan. As an alternative to the IWFA, Cioffi [8] has suggested comparing the NRIA with the bidirectional IWFA (bi-IWFA). The bi-IWFA does not fix the bandplan, but assumes an echo-cancelled transmission scheme and lets the IWFA decide for each loop which subcarriers should be used exclusively for

Figure 14: Downstream transmit PSDs of the NRIA and the OSBA for users' bitrates given in Table 5. (a) u1 downstream PSDs; (b) u2 downstream PSDs. (Axes: subcarrier index versus PSD (dBm/Hz); curves: NRIA, OSBA.)

Figure 15: Upstream transmit PSDs of the NRIA and the OSBA for users' bitrates given in Table 5. (a) u1 upstream PSDs; (b) u2 upstream PSDs. (Axes: subcarrier index versus PSD (dBm/Hz); curves: NRIA, OSBA.)

downstream or upstream and which should be used simultaneously for both transmission directions.

The simulation scenario used to compare the NRIA with the bi-IWFA is shown in Figure 17. We have selected a network scenario with four users, because in a two-user case an echo-cancellation (EC) scheme will usually outperform any other algorithm that assumes FDD transmission. The reason for this is that the gain achieved with EC is higher than the loss from the NEXT noise of a single disturber.

When the number of users U > 2, it was proved in [22] that the IWFA converges and has a unique Nash equilibrium if

$$\max\, \Gamma \frac{H_{uv}^{n}}{H_{uu}^{n}} < \frac{1}{U-1}, \quad \forall n, u, v, \text{ and } u \neq v, \tag{26}$$

where $\Gamma$, $H_{uv}^{n}$, and $H_{uu}^{n}$ are the same as defined in Section 2. Note, however, that here $H_{uv}^{n}$ can be NEXT and FEXT couplings. For the network scenario in Figure 17 the criterion (26) is not always fulfilled due to high NEXT couplings. Hence, the bi-IWFA might not have a unique Nash equilibrium [4, 22]. For this scenario, we have observed that the performance of the bi-IWFA depends on the user ordering during the iterations. Therefore, we have performed simulations with two iteration orderings: u1-u2-u3-u4 and u4-u3-u2-u1.

For these simulations we have searched for symmetric and equal bitrates for all users. The simulation results are summarized in Figure 18. When the bi-IWFA is deployed, bitrates of approximately 26.2 Mbps and 24.6 Mbps can be supported by each user for the iteration orders u4-u3-u2-u1 and

u1-u2-u3-u4, respectively. With the NRIA a bitrate of more than 31.6 Mbps can be achieved by all users in each transmission direction. Thus, for this case a bitrate increase of more than 20% is achieved with the NRIA for each user compared to the bi-IWFA with the best iteration order u4-u3-u2-u1. This simulation is also performed with the IWFA using the bandplan "997," as described in Section 6.2. Figure 18 shows that the IWFA performs better than the bi-IWFA for the given scenario. The reason for this can be explained as follows: due to the several NEXT couplings the crosstalk noise is high; however, the noise level at the receivers is not so high that the algorithm decides for FDD transmission when the transmit PSDs of all transmitters are "moderately" low. As Chung [7] recognized, in these environments the IWFA shows significantly worse performance compared to the case where the PSDs have high levels.

Figure 16: Down- and upstream rate regions of the IWFA, OSBA, and NRIA for a = 1.25, α1,DS = α1,US, and α2,DS = α2,US. (Axes: u1 bitrate (Mbps) versus u2 bitrate (Mbps); curves: OSBA: DS, OSBA: US, NRIA: DS (a = 1.25), NRIA: US (a = 1.25), IWFA: DS, IWFA: US.)

Figure 17: Network scenario used to compare the NRIA and the bi-IWFA. (Four users connected to the CO/Cabinet with loop lengths u1: 100 m, u2: 400 m, u3: 600 m, u4: 900 m.)

Figure 18: Users' downstream and upstream supported bitrates for NRIA and bi-IWFA for the network scenario shown in Figure 17. (Axes: user index versus user bitrates (Mbps); curves: NRIA: DS = US, bi-IWFA: DS = US (4-3-2-1), bi-IWFA: DS = US (1-2-3-4), IWFA: DS, IWFA: US.)

The downstream and upstream transmit PSDs of the bi-IWFA for the iteration order u4-u3-u2-u1 are shown in Figure 19. The transmit PSDs for the iteration order u1-u2-u3-u4 are not included, but were found to have quite different shapes. Figure 19 shows that only user u4 utilizes the maximum total power for the upstream direction. Therefore, u4 determines the maximum bitrates of all other users. From the PSDs we can recognize three main regions: frequencies lower than approximately 6.2 MHz (subcarrier 1438) are used simultaneously for both transmission directions; frequencies between 6.2 MHz and 10.4 MHz (subcarrier 2405) are used only for downstream transmission; and frequencies from 10.4 MHz to 12 MHz (subcarrier 2782) are used simultaneously for both transmission directions. Figure 20 shows that, in contrast to the bi-IWFA, the NRIA allows user u4 to utilize the maximum total power in both transmission directions.

7. SUMMARY

Dynamic spectrum management (DSM) should use as many degrees of freedom as possible in order to optimize the utilization of cable resources for DSL more efficiently. Previously proposed DSM algorithms for frequency division duplexing (FDD) systems consider only user power allocation for a fixed downstream and upstream frequency bandplan, although a dynamic spectrum often helps to utilize the cable capacity more efficiently.

In this paper, we presented a novel centralized DSM algorithm for DSL: the normalized-rate iterative algorithm (NRIA). The NRIA is the only algorithm that jointly optimizes the bandplan for FDD systems and power allocations for all users in a common cable bundle.

The multiuser optimization performed by the NRIA is based on a unique problem formulation that has a strong practical advantage. It is based on two types of parameters, which bridge the gap between the operators' DSL business models and the DSM: the desired user priorities and the desired network asymmetry. The NRIA offers high performance in combination with low computational complexity, since it is designed to be practically implementable rather than just obtaining the highest theoretical performance.

An inner iteration stage of the NRIA is based on the seemingly practical iterative water-filling algorithm (IWFA) [4] for finding efficient users' power allocations. However, the NRIA extends the IWFA by automatically finding the

Figure 19: (a) The downstream and (b) upstream transmit PSDs of the bi-IWFA for users' bitrates shown in Figure 18 when the iteration order u4-u3-u2-u1 has been selected. (Axes: subcarrier index versus PSDs (dBm/Hz); curves: u1, u2, u3, u4.)

Figure 20: (a) The downstream and (b) upstream transmit PSDs of the NRIA for users' bitrate values shown in Figure 18. (Axes: subcarrier index versus PSD (dBm/Hz); curves: u1, u2, u3, u4.)

users' bitrates that are actually achievable. This is accomplished by taking advantage of the property that the NRIA is centralized, which enables user cooperation. Furthermore, an outer iteration stage uses a simple but effective search strategy for finding an effective bandplan. All these practical advantages combined make the NRIA attractive also for networks comprised of many DSL users.

Simulations showed that the NRIA achieves better bitrate performance than both the IWFA and the bidirectional IWFA [8]. The NRIA can achieve almost as good performance as the optimal spectrum balancing algorithm [5], but with much lower requirements on complexity. However, by utilizing the additional feature of an optimized bandplan, the NRIA can offer bitrate combinations, that is, DSL services, that cannot be offered by any other DSM algorithm.

APPENDIX

A. VALIDITY OF POSTULATE 1

In this appendix we will discuss the validity of Postulate 1 given in Section 5.1, "check the convergence point," stating: "Consider a multiuser D-FDD transmission system operating in an interference channel where each receiver considers the crosstalk signal as noise. For such a multiuser system the sum of the user bitrates increases when the power of each user increases."
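The statement of Postulate 1 can be probed numerically for a single subcarrier and two users. The following is a minimal sketch under stated assumptions, not the authors' simulation: it assumes the usual gap-approximation rate, log2(1 + H·P / (Γ·(crosstalk + noise))), and the channel gains, noise power, and Γ = 12 dB are made-up illustrative values. The function name `sum_rate` is hypothetical.

```python
import math


def sum_rate(p1, p2, h11=1.0, h22=1.0, h12=1e-3, h21=1e-3,
             p_noise=1e-6, gap_db=12.0):
    """Sum of two users' bits on one subcarrier, crosstalk treated as noise.

    All default gains, the noise power, and the SNR gap are illustrative
    assumptions, not values taken from the paper.
    """
    gap = 10 ** (gap_db / 10)
    r1 = math.log2(1 + h11 * p1 / (gap * (h12 * p2 + p_noise)))
    r2 = math.log2(1 + h22 * p2 / (gap * (h21 * p1 + p_noise)))
    return r1 + r2


base = sum_rate(1e-3, 1e-3)
equal_boost = sum_rate(2e-3, 2e-3)      # both users double their power
unequal_boost = sum_rate(2e-3, 1.5e-3)  # users increase by different factors

print(base < equal_boost, base < unequal_boost)
# prints: True True
```

With the background noise set close to zero, the gain from raising both powers shrinks toward nothing, since signal and crosstalk then scale together.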

Using Theorem A.1 below, we show that Postulate 1 is true for the case when all subcarriers are utilized by all users under the assumption that the receivers operate with high signal-to-noise ratio (SNR). In our experience, Postulate 1 is also true when not all subcarriers are used. Furthermore, in [23] it is shown that for a theoretical two-user Gaussian interference channel and different coupling values the sum of the bitrates is increased by increasing the power of the two users.

Theorem A.1. Assume that receivers operate with high SNR. If all subcarriers are utilized by all users, the users consider the crosstalk signal as Gaussian noise, and none of the users utilize the maximum total power, then the sum of the user bitrates always increases when the power of each user increases. That is,

$$\sum_{u} \tilde{R}_{u} > \sum_{u} R_{u}, \tag{A.1}$$

where $\tilde{R}_{u}$ and $R_{u}$ denote the bitrates of user $u$ with and without power increase, respectively.

Proof. Without loss of generality we prove Theorem A.1 for two users. The proof for the case with more than two users is a generalization of this case. First, (A.1) can be written as

$$\tilde{R}_{1} + \tilde{R}_{2} > R_{1} + R_{2}. \tag{A.2}$$

The power allocations that correspond to $R_{1}$ and $R_{2}$ are $P_{1} = [P_{1}^{0}, P_{1}^{1}, \ldots, P_{1}^{N-1}]$ and $P_{2} = [P_{2}^{0}, P_{2}^{1}, \ldots, P_{2}^{N-1}]$, respectively. In the same way we denote the increased power levels that correspond to $\tilde{R}_{1}$ and $\tilde{R}_{2}$ by $\tilde{P}_{1} = [\tilde{P}_{1}^{0}, \tilde{P}_{1}^{1}, \ldots, \tilde{P}_{1}^{N-1}]$ and $\tilde{P}_{2} = [\tilde{P}_{2}^{0}, \tilde{P}_{2}^{1}, \ldots, \tilde{P}_{2}^{N-1}]$. We will now proceed to show that Theorem A.1 follows as a consequence of the increase in $P_{1}$ and $P_{2}$.

First we note that increasing the bits in each subcarrier individually also increases their sum. That is, it is enough to study a particular subcarrier $n$ in (A.2):

$$\tilde{R}_{1}^{n} + \tilde{R}_{2}^{n} > R_{1}^{n} + R_{2}^{n}, \tag{A.3}$$

where all the bitrates are calculated based on (4).

The SNR at the receivers is much greater than one over all subcarriers due to our assumption that receivers operate with high SNR (this assumption is always true for the subcarriers that are utilized for data transmission). Under this assumption we can expand (A.3) using (4) and (5) to

$$\log_2 \frac{H_{11}^{n} \tilde{P}_{1}^{n}}{\Gamma \big(H_{12}^{n} \tilde{P}_{2}^{n} + P_{V}^{n}\big)} + \log_2 \frac{H_{22}^{n} \tilde{P}_{2}^{n}}{\Gamma \big(H_{21}^{n} \tilde{P}_{1}^{n} + P_{V}^{n}\big)} > \log_2 \frac{H_{11}^{n} P_{1}^{n}}{\Gamma \big(H_{12}^{n} P_{2}^{n} + P_{V}^{n}\big)} + \log_2 \frac{H_{22}^{n} P_{2}^{n}}{\Gamma \big(H_{21}^{n} P_{1}^{n} + P_{V}^{n}\big)}. \tag{A.4}$$

Using the properties of the logarithm we can rewrite this as

$$\frac{\tilde{P}_{1}^{n} \tilde{P}_{2}^{n}}{\big(H_{12}^{n} \tilde{P}_{2}^{n} + P_{V}^{n}\big)\big(H_{21}^{n} \tilde{P}_{1}^{n} + P_{V}^{n}\big)} > \frac{P_{1}^{n} P_{2}^{n}}{\big(H_{12}^{n} P_{2}^{n} + P_{V}^{n}\big)\big(H_{21}^{n} P_{1}^{n} + P_{V}^{n}\big)}, \tag{A.5}$$

or equivalently,

$$\frac{\tilde{P}_{1}^{n}}{P_{1}^{n}} \cdot \frac{\tilde{P}_{2}^{n}}{P_{2}^{n}} > \frac{H_{21}^{n} \tilde{P}_{1}^{n} + P_{V}^{n}}{H_{21}^{n} P_{1}^{n} + P_{V}^{n}} \cdot \frac{H_{12}^{n} \tilde{P}_{2}^{n} + P_{V}^{n}}{H_{12}^{n} P_{2}^{n} + P_{V}^{n}}. \tag{A.6}$$

In (A.6) we can identify the part that relates to the first user as

$$\frac{\tilde{P}_{1}^{n}}{P_{1}^{n}} > \frac{H_{21}^{n} \tilde{P}_{1}^{n} + P_{V}^{n}}{H_{21}^{n} P_{1}^{n} + P_{V}^{n}}. \tag{A.7}$$

From (A.7) we can derive

$$\tilde{P}_{1}^{n} > P_{1}^{n}. \tag{A.8}$$

A corresponding relation is found for the second user and, since this is true for both users, (A.6) is always true. Thus, the left-hand side of (A.2) is always larger than the right-hand side.

Note that when $H_{21}^{n} \tilde{P}_{1}^{n} \gg P_{V}^{n}$ and $H_{21}^{n} P_{1}^{n} \gg P_{V}^{n}$ the left-hand side of (A.7) is only slightly larger than the right-hand side. This means that there is only a minor increase in the sum of users' bitrates when the power of the signal is increased. Furthermore, when $P_{V}^{n} = 0$, both sides in (A.7) are equal. Thus, there is no increase in the sum of the user bitrates when the power of each user increases. However, this is not important for communication over copper wires, because $P_{V}^{n}$ is never zero due to the thermal noise on copper, external noise sources such as radio noise, and also alien noise from the other DSL systems not included in the optimization process.

ACKNOWLEDGMENT

This work was partially financed by the Austrian Kplus program.

REFERENCES

[1] D. Statovci and T. Nordström, "Adaptive subcarrier allocation, power control, and power allocation for multiuser FDD-DMT systems," in Proceedings of IEEE International Conference on Communications (ICC '04), vol. 1, pp. 11–15, Paris, France, June 2004.
[2] D. Statovci and T. Nordström, "Adaptive resource allocation in multiuser FDD-DMT systems," in Proceedings of 12th European Signal Processing Conference (EUSIPCO '04), pp. 1213–1216, Vienna, Austria, September 2004.
[3] D. Statovci, Adaptive resource allocation for multi-user digital subscriber lines, Ph.D. thesis, Vienna University of Technology, Vienna, Austria, July 2005.
[4] W. Yu, G. Ginis, and J. M. Cioffi, "Distributed multiuser power control for digital subscriber lines," IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, 2002.
[5] R. Cendrillon, M. Moonen, J. Verlinden, T. Bostoen, and W. Yu, "Optimal multiuser spectrum management for digital subscriber lines," in Proceedings of IEEE International Conference on Communications (ICC '04), vol. 1, pp. 1–5, Paris, France, June 2004.

[6] R. Cendrillon, Multi-user signal and spectra co-ordination for digital subscriber lines, Ph.D. thesis, Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium, December 2004.
[7] S. T. Chung and J. M. Cioffi, "Rate and power control in a two-user multicarrier channel with no coordination: the optimal scheme versus a suboptimal method," IEEE Transactions on Communications, vol. 51, no. 11, pp. 1768–1772, 2003.
[8] J. M. Cioffi, "Use of bi-directional iterative water-filling," Personal communication, Vienna, Austria, September 2004.
[9] F. Sjöberg, M. Isaksson, R. Nilsson, P. Ödling, S. K. Wilson, and P. O. Börjesson, "Zipper: a duplex method for VDSL based on DMT," IEEE Transactions on Communications, vol. 47, no. 8, pp. 1245–1252, 1999.
[10] R. Nilsson, F. Sjöberg, M. Isaksson, J. M. Cioffi, and S. K. Wilson, "Autonomous synchronization of a DMT-VDSL system in unbundled networks," IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1055–1063, 2002.
[11] F. Sjöberg, R. Nilsson, M. Isaksson, P. Ödling, and P. O. Börjesson, "Asynchronous Zipper," in Proceedings of IEEE International Conference on Communications (ICC '99), vol. 1, pp. 231–235, Vancouver, BC, Canada, June 1999.
[12] C. A. Floudas, Nonlinear and Mixed-Integer Optimization: Fundamentals and Applications, Oxford University Press, New York, NY, USA, 1995.
[13] I. E. Grossmann, "Review of nonlinear mixed-integer and disjunctive programming techniques," Optimization and Engineering, vol. 3, no. 3, pp. 227–252, 2002.
[14] ETSI, "Transmission and Multiplexing (TM); Access transmission systems on metallic access cables; Very high speed Digital Subscriber Line (VDSL); Part 1: Functional requirements," ETSI Standard TS 101 270-1, Version 1.3.1, July 2003.
[15] ETSI, "Transmission and Multiplexing (TM); Access transmission systems on metallic access cables; Very high speed Digital Subscriber Line (VDSL); Part 2: Transceiver specification," ETSI Standard TS 101 270-2, Version 1.2.1, July 2003.
[16] ANSI, "Very-high bit-rate Digital Subscriber Lines (VDSL) Metallic Interface, Part 3: Technical Specification of a Multi-Carrier Modulation Transceiver," ANSI Draft Standard T1.424/Trial-Use Part3, November 2000.
[17] T. Starr, M. Sorbara, J. M. Cioffi, and P. Silverman, DSL Advances, Prentice-Hall, Upper Saddle River, NJ, USA, 2003.
[18] J. Campello, "Practical bit loading for DMT," in Proceedings of IEEE International Conference on Communications (ICC '99), vol. 2, pp. 801–805, Vancouver, BC, Canada, June 1999.
[19] R. F. M. van den Brink, "Cable reference models for simulating metallic access networks," ETSI/STC TM6 contribution 970p02r3, June 1998.
[20] J. Verlinden, "The target PSD obtained with iterative water-filling is almost flat," ANSI T1E1.4 contribution 2003-295, December 2003.
[21] R. Suciu, E. Van den Bogaert, J. Verlinden, and T. Bostoen, "Insuring spectral compatibility of iterative water-filling," in Proceedings of 12th European Signal Processing Conference (EUSIPCO '04), vol. 1, pp. 1209–1212, Vienna, Austria, September 2004.
[22] S. T. Chung, S. J. Kim, J. Lee, and J. M. Cioffi, "A game-theoretic approach to power allocation in frequency-selective Gaussian interference channels," in Proceedings of IEEE International Symposium on Information Theory (ISIT '03), pp. 316–316, Yokohama, Japan, June–July 2003.
[23] M. H. M. Costa, "On the Gaussian interference channel," IEEE Transactions on Information Theory, vol. 31, no. 5, pp. 607–615, 1985.

Driton Statovci was born in Batllavë, Kosova, in 1972. He received the "Inxh. Dipl." degree (equivalent to a master's degree) in 1996 from the University of Prishtina, Faculty of Electrical Engineering, Stream of Informatics and Telecommunications. He received the Ph.D. degree in 2005 from Vienna University of Technology, Austria. From August 2000 until January 2002, he joined Ahead Communications Systems. At the same time, he was delegated by Ahead Communications Systems as a Researcher at the Telecommunications Research Center Vienna (ftw.), working on the Broadband Access over Wire project. At Ahead Communications Systems he worked in system design for transmitting data and voice over DSL access networks. In February 2002, he joined the Telecommunications Research Center Vienna (ftw.), Austria. Currently he is working as Researcher in the "Active Copper Resource Management, ARM-Cu" project. His current research interests include multiuser transmission theory and optimization of resource utilization in wireline and wireless communications.

Tomas Nordström was born in Härnösand, Sweden, in 1963. He received the M.S.E.E. degree in 1988, the Licentiate degree in 1991, and the Ph.D. degree in 1995, all from Luleå University of Technology, Sweden. Currently, he is a Key Researcher and Project Manager at the Telecommunications Research Center Vienna (ftw.). During 1995 and 1996, he was an Assistant Professor at Luleå University of Technology researching computer architectures, neural networks, and signal processing. Between 1996 and 1999, he was with Telia Research AB (the research branch of the Swedish incumbent telephone operator) where he developed broadband Internet communication over twisted copper pairs. He was instrumental in the development of the Zipper-VDSL concept (contributed to the standardization of VDSL in ETSI, ANSI, and ITU) and in the design of the Zipper-VDSL prototype modems. In addition, he was Telia's National Expert on speaker verification. In December 1999, he joined the Telecommunications Research Center Vienna (ftw.), where he is the Project Manager of the "Broadband Wireline Access" group. At FTW, he has worked with various aspects of wireline communications like simulation of xDSL systems, cable measurements, RFI suppression, exploiting the common-mode signal in xDSL, and dynamic spectrum management. He is also a Consultant covering design, development, and deployment of DSL modems.

Rickard Nilsson was born in Umeå, Sweden, in 1971. He received the M.S.E.E. degree in 1996, the Licentiate of engineering in 1999, and the Ph.D. degree in signal processing in January 2002, all from Luleå University of Technology, Luleå, Sweden. Between 1997 and 2001, he worked cooperatively with Telia Research AB, Sweden, by contributing to the development and standardization of the DMT-Zipper method for VDSL. During the fall semester in 1999, he was a Guest Researcher at Stanford University, USA. In May 2002, he moved from northern Sweden to Austria and joined the Telecommunications Research Center Vienna (ftw.). His research interests include multicarrier modulation, statistical signal processing, low-complexity algorithms, and interference suppression and synchronization.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 67686, Pages 1–8
DOI 10.1155/ASP/2006/67686

ADSL Transceivers Applying DSM and Their Nonstationary Noise Robustness

Etienne Van den Bogaert,1 Tom Bostoen,2 Jan Verlinden,2 Raphael Cendrillon,3 and Marc Moonen4

1 Research & Innovation Department of Alcatel, Francis Wellesplein 1, 2018 Antwerpen, Belgium
2 Access Networks Division of Alcatel, Francis Wellesplein 1, 2018 Antwerpen, Belgium
3 School of Information Technology & Electrical Engineering, University of Queensland, Brisbane, QLD 4072, Australia
4 Department of Electrical Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001 Leuven-Heverlee, Belgium

Received 10 December 2004; Revised 10 May 2005; Accepted 18 May 2005

Dynamic spectrum management (DSM) comprises a new set of techniques for multiuser power allocation and/or detection in digital subscriber line (DSL) networks. At the Alcatel Research and Innovation Labs, we have recently developed a DSM test bed, which allows the performance of DSM algorithms to be evaluated in practice. With this test bed, we have evaluated the performance of a DSM level-1 algorithm known as iterative water-filling in an ADSL scenario. This paper describes the results of, on the one hand, the performance gains achieved with iterative water-filling, and, on the other hand, the nonstationary noise robustness of DSM-enabled ADSL modems. It will be shown that DSM trades off nonstationary noise robustness for performance improvements. A new bit swap procedure is then introduced to increase the noise robustness when applying DSM.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

DSL deployment is evolving to, on the one hand, ever higher bit rates enabling video services over DSL, and, on the other hand, increased reach to enlarge the customer base. Higher bit rates as well as increased reach can either be obtained by deploying remote terminals (RTs) or by applying dynamic spectrum management (DSM) techniques [1, 2]. The latter technology can provide rate/reach improvements on the shorter term, because it only requires software adaptations, whereas RT deployment involves heavy investments, and hence is rather suited for the longer term.

DSM is an adaptive form of spectrum management [3] and is based on automatic detection of interference caused by crosstalk. From this perspective, the entire twisted-pair binder is considered as a shared resource and the overall bit rate is optimized. This optimization can be done in different ways, depending on the level of coordination between the multiple DSL lines. We remark that the name "dynamic spectrum management" originates from adaptive multiuser power allocation techniques, but the meaning of the term DSM has widened to include also multiuser detection techniques.

A distinction is made between DSM at levels 0, 1, 2, and 3 according to the degree of coordination. Level-0 DSM means there is no coordination between the lines. DSM at level 1 means that the bit rates are reported to and controlled by a spectrum management centre (SMC). The actual transmit PSDs are computed in each transceiver, hence the multiuser power control is distributed. At level 2, the received signal and noise power spectral densities (PSDs) are reported to the SMC and the transmit PSDs are controlled by the SMC [4]. Both level 1 and 2 gains in rate and reach originate from adaptive multiuser power allocation techniques, resulting in crosstalk avoidance. Finally, level 3 is the highest DSM level, at which all colocated transceivers jointly process the received symbols for upstream transmission and the transmit symbols for downstream transmission [5]. At this level, the gains originate from multiuser detection techniques based on either crosstalk cancellation or crosstalk precompensation.

In this paper, we concentrate on DSM at level 1, and in particular on a specific DSM algorithm called iterative water-filling [2], as well as a simplified version thereof. In Sections 2 and 3, we first review DSL channel properties and distributed multiuser power allocation before detailing the practical implementation of iterative water-filling on DSL modems. In Section 4, the real-life performance of iterative water-filling is demonstrated in an ADSL scenario, showing data-rate gains of up to 500% in realistic settings. Finally, in Section 5, some questions are raised about DSM trading off nonstationary noise robustness for performance. The nonstationary noise robustness is further investigated and a new bit swap procedure for enhanced noise robustness is proposed, showing substantial improvements.

2. THE DSL CHANNEL MODEL AND BIT LOADING

We focus on DSL modems using discrete multitone (DMT) modulation, as adopted, for example, in the ADSL standard [6]. The bit loading is calculated on a per-tone basis, as given by (1) for a two-user case, and depends on the signal-to-noise ratio (SNR) at the receiver:

\[
b_1^k = \log_2\left(1 + \frac{\mathrm{SNR}_1(k)}{\Gamma_1}\right)
      = \log_2\left(1 + \frac{S_1(k)\cdot h_{11}^2(k)}{\Gamma_1\left(N_1(k) + S_2(k)\cdot h_{12}^2(k)\right)}\right). \tag{1}
\]

In (1), $k$ represents the tone index, $N_1(k)$ denotes all the noises other than self-crosstalk, and $\Gamma_1 \approx 12$ dB is the SNR gap including noise margin and coding gain. The SNR gap to achieve a bit error rate (BER) of $10^{-7}$ is approximately equal to 9.75 dB. Adding to this a noise margin of 6 dB and subtracting a coding gain of 3.75 dB, one obtains an overall value of 12 dB for $\Gamma_1$. $S_i(k)$ denotes the transmit PSD of user $i$ on tone $k$, $h_{11}(k)$ represents the direct channel transfer function of user 1, and $h_{12}(k)$ denotes the crosstalk channel transfer function from user 2 to user 1.

The bit loading given by (1) allows the modem to adapt to changing line conditions by dynamically varying the constellation used on each tone. Moreover, (1) tells us that the bit loading for user 1 depends on the crosstalk coming from the other users. If the crosstalk increases on a particular carrier, fewer bits can be put on this carrier. The same is true for the other users, where the crosstalk coming from user 1 interferes with the signal of the other users. To illustrate the importance of crosstalk, an example of measured channel transfer functions for a 1400 m section of a 0.4 mm 4-quad France Telecom cable is shown in Figure 1. The far-end crosstalk (FEXT) will be, in this case, on average equal to −120 dBm/Hz, as the nominal transmit PSD of ADSL modems is equal to −40 dBm/Hz.

[Figure 1 plot: gain (dB) versus frequency (MHz); curves: direct channel, FEXT from 2 to 1, FEXT from 3 to 1, FEXT from 4 to 1.]

Figure 1: Direct and FEXT channel transfer functions of a 1400 m section of a 4-quad 0.4 mm France Telecom cable.

3. MULTIUSER POWER ALLOCATION

The goal of multiuser power allocation is to optimize the overall bit rate while all transceivers are also subject to a total power constraint. This constrained optimization problem is given by (2) for the two-user case:

\[
\max_{S_1(k),\,S_2(k)} R \quad \text{s.t.} \quad
\begin{cases}
\sum_k S_1(k) \le P_1, \\
\sum_k S_2(k) \le P_2,
\end{cases} \tag{2}
\]

with $R = \sum_k b_1^k + \sum_k b_2^k$, the rate sum.

This constrained optimization problem can be solved by means of the Lagrangian, which is equal to

\[
\begin{aligned}
J\left(S_1(k), S_2(k)\right)
 ={}& \sum_k \log_2\left(1 + \frac{S_1(k)\cdot h_{11}^2(k)}{\Gamma_1\left(N_1(k) + S_2(k)\cdot h_{12}^2(k)\right)}\right) \\
 &+ \sum_k \log_2\left(1 + \frac{S_2(k)\cdot h_{22}^2(k)}{\Gamma_2\left(N_2(k) + S_1(k)\cdot h_{21}^2(k)\right)}\right) \\
 &+ \lambda_1\cdot\left(P_1 - \sum_k S_1(k)\right) + \lambda_2\cdot\left(P_2 - \sum_k S_2(k)\right).
\end{aligned} \tag{3}
\]

Equation (3) is the sum of the bit rates of both users together with the Lagrange multipliers taking into account the total power constraint of both users. This is a non-convex optimization problem. Hence, finding an optimum requires an exponential complexity in $K$, with $K$ the total number of tones. In recent work [7], numerically tractable ways of solving this problem through use of a dual decomposition have been developed. Whereas this algorithm demonstrates large performance gains, it is centralized and requires the existence of a spectrum management centre (SMC). In this work, we focus on a distributed algorithm which does not require an SMC. This algorithm is known as iterative water-filling [2].

Iterative water-filling can be derived by first making the assumption that the crosstalk noise is temporarily constant and that it can be incorporated in the term representing the background noise. This results in a simplified Lagrangian, equation (4), with the optimum given by (5):

\[
\begin{aligned}
J\left(S_1(k), S_2(k)\right)
 ={}& \sum_k \log_2\left(1 + \frac{S_1(k)\cdot h_{11}^2(k)}{\Gamma_1 N_1(k)}\right)
   + \sum_k \log_2\left(1 + \frac{S_2(k)\cdot h_{22}^2(k)}{\Gamma_2 N_2(k)}\right) \\
 &+ \lambda_1\cdot\left(P_1 - \sum_k S_1(k)\right) + \lambda_2\cdot\left(P_2 - \sum_k S_2(k)\right),
\end{aligned} \tag{4}
\]

\[
S_1(k) = \left[\frac{1}{\lambda_1 \ln 2} - \frac{\Gamma_1 N_1(k)}{h_{11}^2(k)}\right]^+, \qquad
S_2(k) = \left[\frac{1}{\lambda_2 \ln 2} - \frac{\Gamma_2 N_2(k)}{h_{22}^2(k)}\right]^+, \tag{5}
\]

where $[x]^+ = \max(0, x)$.

The iterative water-filling solution is then obtained by replacing the background noise with the total noise in (5), leading to

\[
S_1(k) = \left[\frac{1}{\lambda_1 \ln 2} - \frac{\Gamma_1\left(N_1(k) + S_2(k)\cdot h_{12}^2(k)\right)}{h_{11}^2(k)}\right]^+, \qquad
S_2(k) = \left[\frac{1}{\lambda_2 \ln 2} - \frac{\Gamma_2\left(N_2(k) + S_1(k)\cdot h_{21}^2(k)\right)}{h_{22}^2(k)}\right]^+. \tag{6}
\]

Assuming the crosstalk noise to be constant is not valid when considering a larger time window. So, each time the crosstalk noise changes, the modems will adapt to this time-varying noise environment and adapt their transmit PSD. This means that there will be an iteration of modems applying water-filling, which explains the name "iterative water-filling." Applying these power allocation formulas iteratively is proved to converge to a so-called Nash equilibrium [2].

From (1), it follows that, to have one bit on a carrier, the SNR must be at least as large as $\Gamma_1$. Combining this with (6), the transmit PSD on tones loaded with 1 bit will be given by

\[
S_1^{\min}(k) = \frac{\Gamma_1\left(N_1(k) + S_2(k)\cdot h_{12}^2(k)\right)}{h_{11}^2(k)} = \frac{1}{2\lambda_1 \ln 2}. \tag{7}
\]

On the other hand, the transmit PSD on tones with very low noise-to-channel ratio (NCR) will be approximated by (8). As a conclusion, the transmit PSD is seen to vary by at most 3 dB, since the minimum level (7) is exactly half of the level (8):

\[
S_1(k) = \frac{1}{\lambda_1 \ln 2}. \tag{8}
\]

The water-filled transmit PSD can then be approximated by one PSD level for all usable tones, equal to the total power divided by the useful transmit bandwidth. The simplicity of this water-filling approximation decreases the power allocation complexity of DSM applied at level 1. Although the complexity of water-filling as such is not that high, this approximation has one clear advantage: existing ADSL implementations (which all use flat PSD allocation) can be used for DSM level 1 by just controlling their average PSD level. Figure 2 shows the rate regions of water-filling and the flat PSD approximation, respectively. As can be seen from the figure, the difference in performance is negligible. The simulation scenario is the same as the scenario shown in Figure 3, which will be explained in the next section.

[Figure 2 plot: far-end bit rate (Mbps) versus near-end bit rate (Mbps); curves: flat PSD approximation, iterative water-filling.]

Figure 2: Iterative water-filling and flat PSD rate regions.

Note that the resulting iterative procedure is straightforwardly generalized to the $N$-user case.

Finally, an important aspect is that a DSL transceiver can be operated in 3 so-called adaptation modes. In rate-adaptive (RA) mode, the transceiver uses all available power to maximize the bit rate, while maintaining a fixed noise margin. Similarly, in margin-adaptive (MA) mode, the transceiver uses all available power to maximize the noise margin, while maintaining a fixed bit rate. Finally, in power-adaptive (PA) mode, the transceiver minimizes the power consumption, while maintaining a fixed bit rate and noise margin. Currently, most DSL lines are operated in MA mode, which means that a lot of power is wasted on the short loops, also generating unnecessary crosstalk on the longer loops. DSM at level 1 proposes to switch all DSL transceivers to PA mode; this means that a DSL transceiver connected to a short loop will apply power back-off (PBO) in order to minimize its power. Furthermore, it is also proposed to abandon the idea of using spectral masks to ensure spectral compatibility with other DSL services, but only to restrict the total power. Hence, a DSL transceiver connected to a long loop would be allowed to reallocate power from the higher tones, which are then not used, to the lower tones, a technique called boosting.

4. DSM PERFORMANCE

Figure 3 shows a block diagram of the DSM (level 1) demonstrator at Alcatel Research and Innovation Labs, which has provided the results shown in Figures 4 and 5. The demonstrator is based on ADSL modems and a mixed deployment

[Figure 3 block diagram: a 5000 m CO line (pair 1, xTU-C1 to xTU-R1) and a 2000 m RT line (pair 3); further transceiver labels: oTU-C1, oTU-R1, oTU-R2, xTU-C2, xTU-R2.]

Figure 3: DSM demonstrator at Alcatel Research & Innovation Labs implementing 1 long CO line of 5000 m, 1 short RT line of 2000 m, and a distance CO-RT of 3000 m.
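The two-line setting of Figure 3 is the kind of scenario in which the update (6) is iterated. The following is only a minimal sketch of that iteration, assuming the rate-adaptive form with a total power constraint per user (the demonstrator itself runs in PA mode with a flat PSD approximation). The function names, the bisection on the water level $1/(\lambda \ln 2)$, and all channel, noise, and power values are hypothetical and chosen for illustration; they are not measurements from the demonstrator.

```python
def water_fill(ncr, power, iters=100):
    """Return S(k) = [mu - ncr(k)]^+ with sum_k S(k) = power (bisection on mu)."""
    # The water level mu = 1 / (lambda ln 2) is bracketed by [lo, hi].
    lo, hi = 0.0, max(ncr) + power
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if sum(max(0.0, mu - n) for n in ncr) > power:
            hi = mu
        else:
            lo = mu
    return [max(0.0, lo - n) for n in ncr]


def iterative_water_filling(h2, noise, power, gamma, rounds=50):
    """h2[i][j][k]: squared gain from transmitter j into receiver i on tone k."""
    users, tones = len(power), len(noise[0])
    psd = [[0.0] * tones for _ in range(users)]
    for _ in range(rounds):
        for i in range(users):
            # Gamma * (background noise + crosstalk) / direct gain, as in (6).
            ncr = [
                gamma * (noise[i][k] + sum(psd[j][k] * h2[i][j][k]
                                           for j in range(users) if j != i))
                / h2[i][i][k]
                for k in range(tones)
            ]
            psd[i] = water_fill(ncr, power[i])
    return psd


# Hypothetical 2-user, 4-tone scenario (all numbers are illustrative only).
GAMMA = 10.0 ** (12.0 / 10.0)  # 12 dB SNR gap in linear scale
H2 = [
    [[1.0, 0.8, 0.5, 0.2], [0.001, 0.002, 0.003, 0.004]],
    [[0.0005, 0.001, 0.0015, 0.002], [1.0, 0.9, 0.7, 0.4]],
]
NOISE = [[1e-3] * 4, [1e-3] * 4]
POWER = [1.0, 1.0]
PSD = iterative_water_filling(H2, NOISE, POWER, GAMMA)
```

Each user treats the other user's current PSD as part of its noise and water-fills on its own, so no central coordination is needed; with the weak crosstalk assumed here the PSDs settle after a few rounds.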

[Figure 4 plot: PSD (dBm/Hz) versus frequency (Hz, ×10^5); curves: Tx PSD with DSM, NCR with DSM, Tx PSD without DSM, NCR without DSM.]

Figure 4: Downstream ADSL transmit power spectral density (PSD) (solid) of the ATU-C transmitting over the 5000 m loop, together with the noise-to-channel ratio (dotted). Average PSD with DSM=−35.6 dBm/Hz and average PSD without DSM=−40 dBm/Hz.
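Because Figure 4 shows the transmit PSD and the NCR on the same dBm/Hz axis, the bit loading of (1) on a tone follows from the vertical distance between the two curves. The sketch below illustrates this under stated assumptions: the NCR is taken to include the crosstalk, constellations are not rounded to integers, and the tone NCR of −52 dBm/Hz is a hypothetical value, not read from the figure. Only the 12 dB gap and the two average PSD levels come from the text.

```python
import math

# SNR gap from Section 2: 9.75 dB (BER 1e-7) + 6 dB margin - 3.75 dB coding gain.
GAMMA_DB = 9.75 + 6.0 - 3.75  # 12 dB


def bits_per_tone(tx_psd_dbm_hz, ncr_dbm_hz, gamma_db=GAMMA_DB):
    """Unrounded bit loading of (1): log2(1 + SNR / Gamma), with dB inputs."""
    snr_db = tx_psd_dbm_hz - ncr_dbm_hz
    return math.log2(1.0 + 10.0 ** ((snr_db - gamma_db) / 10.0))


# Hypothetical tone with an NCR of -52 dBm/Hz (illustrative value only).
ncr = -52.0
without_dsm = bits_per_tone(-40.0, ncr)  # average PSD without DSM (caption)
with_dsm = bits_per_tone(-35.6, ncr)     # average PSD with DSM (caption)
boost_db = -35.6 - (-40.0)               # average boost on the long loop
```

On this hypothetical tone the SNR without DSM equals the gap, which gives exactly one bit; the 4.4 dB average boost raises the loading to roughly 1.9 bits.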

of central office (CO) distributed and remote terminal (RT) distributed lines in the same cable binder.

The demonstrator allows switching from normal mode to DSM mode for downstream only. DSM is only applied to the downstream PSD as the upstream does not suffer significantly from crosstalk. In DSM mode, some modem parameters are switched to ensure PA operation, and in addition the ADSL transceivers switch from a normal modem software build to a DSM modem software build. Some changes have been made to the modem software to allow DSM at level 1, where the water-filling is approximated by a flat PSD.

The changes in the software consist of, in the first place, expanding the range of the average relative gain from initialization to showtime¹ from (0, −12) dB to (6, −20.5) dB. This means that a larger power back-off and boosting are made possible. A second topic of software changes concerns the sync symbols in showtime. Once in showtime, the modems react to upcoming and disappearing noises coming from neighbouring lines. When a modem starts up with a high noise level due to many disturbers, the transmit PSD is calculated to achieve the SNR necessary to attain the target bit rate. If the noise then decreases due to neighbouring lines becoming inactive, the modem will automatically decrease its transmit PSD as the SNR is higher than needed. As the transmit PSD of the sync symbols may not change during showtime, it has to be low enough compared to the transmit PSD of the data symbols to avoid intersymbol interference (ISI) from the sync symbols into the data symbols. This can be achieved either by ensuring a low transmit PSD of the sync symbols or by adapting the transmit PSD of the sync symbols according to the data symbol transmit PSD variation.

¹ Showtime is the state in either ATU-C or ATU-R reached after all initialization and training is completed, in which user data is transmitted [6].

The demonstrator shows a significant bit rate increase on the long CO loop. This results from, on the one hand, power back-off on the short RT loop and, on the other hand, boosting on the long CO loop. Figure 4 illustrates this boosting on the long loop. The figure also shows the noise-to-channel ratio (NCR), depicted with dotted lines.

Without DSM, only 256 kbps is achieved on the long CO loop while the short RT loop operates at 4 Mbps. With DSM, not less than 1344 kbps is achieved on the long CO loop with still 4 Mbps on the short RT loop. This is an increase of over 400%. For a more general scenario with two long CO loops together with two short RT loops, the bit rates increase even more, namely from 208 kbps to 1280 kbps, an increase of over 500%.

Figure 5 depicts the rate region for the short and long lines with and without DSM. It is clear that DSM allows extending the rate region substantially. Note that these results merely indicate the potential of DSM. The results achievable in the field will depend on the actual noise environment and loop length distribution.

[Figure 5 plot: long-loops bit rate (Mbps) versus short-loops bit rate (Mbps).]

Figure 5: Rate region for the short- and long-loops scenario: without DSM (dotted, circles) and with DSM (solid, plusses).

Although these results look very promising, iterative water-filling also has a number of drawbacks. Firstly, as shown in Figure 4, iterative water-filling results in boosting on the long loops. Boosting implies breaking the spectral mask constraints, hence spectral compatibility with other services is not assured. Spectrally compatible DSM has been investigated by means of the American spectrum management standard [3] method B compliance [8]. Method B ensures spectral compatibility of a new technology not by imposing a spectral mask, but by ensuring that the new technology does not harm the specified basis systems. This is verified by computing the impact on, for example, the bit rate of these basis systems.

A second important drawback of iterative water-filling is the fact that DSM reduces the noise margin on the short line significantly compared to the current deployment. The lines are then operated in PA mode with 6 dB noise margin, which means that, if, for example, a new DSL line is activated, the short line could go out of sync due to the large noise level change. We therefore implemented a new ADSL overhead channel (AOC) message enabling the modem to request a quick $g_i$ boost (QB). This quick $g_i$ boost message is a very short message from the Rx modem to the Tx modem asking for an increase in PSD on all active tones. It makes it possible for the modems to react quickly to rapidly increasing noises such as a new upcoming disturber. The short length of the message decreases the probability of corrupt reception [9], and as such enhances the stability. The nonstationary noise-robustness results are detailed in the next section.

5. NONSTATIONARY NOISE ROBUSTNESS

Robustness of a DSL modem against nonstationary noise translates to stability on the level of the DSL link and higher protocol layer communication links. Hence, a good robustness is a key to the development of a stable network and satisfied customers.

In this section, nonstationary noise robustness is investigated by injecting time-varying noise on the line. To show DSM gains, one typically needs multiple active DSL lines in a binder, but for the sake of simplicity, only one DSL line is taken into account here and the nonstationary noise is emulated. As DSM is only applied to downstream transmission in the case of ADSL, the noise injection happens only at the customer premises equipment (CPE) side. Many parameters play a role in the noise-robustness measurement: loop length, bit rate, noise margin, injected noise level, noise level change, and so forth, but, as can be seen in the next section, the results show that the key parameters are the noise margin, power back-off, changing noise level, and number of active tones. Indeed, the nonstationary noise robustness is by definition the robustness against the changing noise level. However, the study will also show that the level of power back-off influences the results. In this study, the spectral shape of the noise has been kept flat over the entire bandwidth.

5.1. Noise-robustness measurements

DSM, that is, PA mode of operation, is achieved by provisioning the modems with a target bit rate and a maximum additional noise margin set to zero. The target noise margin is set to 6 dB and the only noise robustness the modems have left beside this noise margin is the bit swap procedure. Unfortunately, the bit swap protocol is limited to a maximum of 6 swaps per message [6]. Furthermore, the bit swap is done over the ADSL overhead channel (AOC) with at least 800 milliseconds between every two bit swap messages. Both restrictions limit the achievable noise-increase recovery. The measurement results for DSM, when all tones are loaded with bits, are shown in Figure 6 and labelled as "DSM without QB." The label "DSM with QB" is explained further. The modems are DSM-enabled prototypes and can apply power back-off up to 20.5 dB, in comparison with conventional ADSL1 modems, which are limited to a 12 dB power back-off. The figure shows the maximum noise increase an ADSL transceiver can handle without resynchronization versus the power back-off level.

Figure 6 shows us that conventional ADSL1 modems operating at fixed margin (DSM without QB) can only recover from a maximum noise increase of 7.5 dB. Indeed, the maximum power back-off for a conventional ADSL1 modem is

20 20

18 18

16 16

14 14

12 12 Noise increase (dB)

Noise increase (dB) 10 10

8 8 6 0 10 15 0 5 0 2 4 6 8 101214161820 Power back-off (dB) Power back-off (dB) DSM with QB DSM with QB DSM without QB DSM without QB Margin adaptive (MA) Margin adaptive (MA)

Figure 6: Noise-robustness measurements with all tones active. Figure 7: Noise-robustness measurement on long loop (40 tones active). limited to 12 dB, for which the figure shows that the maxi- maximum bit rate, be it a low bit rate, and hence no power mum noise increase is equal to 7.5 dB. If the power back-off back-off can be applied. can go up to 20.5 dB, a better noise robustness is obtained up to 13 dB. The explanation can be found in the difference be- 5.2. Improved noise robustness with quick gi boost tween initialization and showtime. During initialization, the modem is training and transmits at −40 dBm/Hz (somewhat Applying DSM with a slow bit swap algorithm makes it im- lower if politeness is applied). During the training period, a possible for the modem to adapt to quick noise or channel noise measurement is performed on which the modem will changes. Therefore, we have introduced a quick gi boost mes- compute the power back-off value assuming that the noise sage, that is, a very short message from the Rx to the Tx mo- level will remain constant. The noise measurement not only dem asking to boost all carriers with a certain gain included comprises the background noise and crosstalk noise, but also in the message. The noise robustness increases thanks to two signal-related noises such as intersymbol interference (ISI), factors. First, the message is very short. This lowers the prob- intercarrier interference (ICI), and noise inherent to the mo- ability of corrupt message reception [9]. Second, the transmit dem. This means that if a modem performs a large power PSD of all carriers is increased at once. In our experimen- back-off, the total noise will decrease also. This is why the tal setup, the noise increase is flat, for which a flat quick gi noise margin, when entering showtime, is slightly larger than boost gives great benefit. 
But even for shaped noise increases, 6 dB and increasing with increasing power back-off,which aquickgi boost makes it possible to recover very fast from a thus results in a slightly better noise robustness. negative noise margin. The finetuning to restore the noise The figure also shows the comparison with margin- margin to the same level for all carriers happens then with adaptive operation, which is equivalent to no DSM. In this the traditional bit swap mechanism. case, the x-axis has to be seen as additional noise margin In Figures 6 and 7, the possible noise increases versus instead of power back-off. Indeed, the noise margin is then power back-off is denoted as ‘DSM with QB’. As can be seen not decreased as no power back-off is applied. As expected, from the figure, the noise robustness is better by up to 3 dB the robustness against a sudden noise increase grows linearly compared to DSM without quick gi boost. DSM with quick gi with the noise margin. boost can be said to be as stable as fixed power operation up In case only a few tones are loaded with bits, the modems to 4 dB power back-off. So a modem with the quick gi boost operating in DSM mode perform better against nonstation- can easily apply 4 dB power back-off: if extra noise forces ary noise. Indeed, with only few active tones, the bit swap the modem to increase its power, then the quick gi boost is can increase the transmit PSD faster than when many tones fast enough such that this line has the same noise robust- are active. The results when 40 tones are active are shown ness as a line that does not apply power back-off.Oncemore in Figure 7. However, this study focuses on high bit rates power back-off is applied, the noise robustness decreases. with modems applying power back-off. 
Modems with only This means that there is clearly a tradeoff between the per- a few active tones are most of the time operated at the line’s formance gains of applying power back-off (as indicated in Etienne Van den Bogaert et al. 7

Section 4) and the robustness. A quick $g_i$ boost can improve the robustness (by 4 dB) of a system applying power back-off, but there is still a difference in robustness between a system applying power back-off and a system that does not apply power back-off at all (MA mode of operation).

There are 2 reasons for this relatively small robustness when DSM is applied. First, the noise measurement within the modem is not accurate when large noise increases occur. Indeed, QAM demodulation at the receiver is based on hard decisions and the process of corrupt symbol detection is shown in Figure 8. After the decision is made, right or wrong, the difference between the demodulated symbol and the received signal is measured as noise. This means that the measured noise will never exceed the maximum distance from demodulated to received symbol within one decision region, which is $(\sqrt{2}/2)\cdot d$. As a consequence, the modem measures a noise increase smaller than it actually is, and therefore makes a request for a quick $g_i$ boost that is too small compared to the noise.

[Figure 8 diagram: transmitted symbol $S_1$, received point lying in the decision region of $S_2$, symbol spacing $d$.]

Figure 8: QAM hard-decision symbol detection and its impact on noise measurement.

Second, the AOC protocol carrying the bit swap is slow, such that no more than two consecutive quick $g_i$ boost messages can be carried out before the modem goes out of showtime. It has to be noted that AOC and bit swap are improved in ADSL2, but still no quick $g_i$ boost message is implemented.

No investigation has been done on mutual interference and stability of several modems applying quick $g_i$ boosts. Although the convergence of iterative water-filling has been proven theoretically in [2], the stability of many modems applying DSM together with quick $g_i$ boosts is still an open issue. Indeed, a quick $g_i$ boost suddenly changes the transmit PSD on one line, but this results in a sudden change of the crosstalk seen by neighbouring lines, hence potentially triggering a quick $g_i$ boost on the neighbouring lines.

6. CONCLUSION

In this paper, the practical implementation of distributed multiuser power allocation, more specifically iterative water-filling, is investigated together with its performance on the cable farm at Alcatel Research and Innovation Labs. We show that when water-filling is applied to discrete multitone systems, it is possible to approximate the obtained PSDs by flat transmit spectra, hence significantly reducing the complexity of the power allocation algorithm.

The results of iterative water-filling implemented on ADSL modems show a significant performance increase compared to the deployment mode currently used by the operators (margin-adaptive mode). The rate region is plotted for a particular deployment case showing the huge advantages of iterative water-filling.

As a consequence of the overall bit-rate optimization, some DSL lines decrease their transmit power to mitigate the induced interference on neighbouring lines. Hence, the noise margin is decreased to improve the performance.

As the noise margin acts as a protection against nonstationary noise, we investigated the impact of DSM on nonstationary noise robustness. The results show clearly that DSM trades off nonstationary noise robustness for performance.

Finally, a new scheme is proposed to improve the nonstationary noise robustness when applying DSM. It speeds up the ADSL overhead channel by requesting a quick $g_i$ boost, and therefore improves the nonstationary noise robustness by approximately 3 dB.

ACKNOWLEDGMENTS

This work was supported in part by the IWT MIDAS Project (030032), Multistandard integrated devices for broadband DSL access and in-home powerline communications. Part of this work has been presented at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2004 and the IASTED Communication Systems and Networks Conference (CSN), September 2004.

REFERENCES

[1] T. Starr, M. Sorbara, J. M. Cioffi, and P. J. Silverman, DSL Advances, Prentice Hall Communications Engineering and Emerging Technologies Series, Prentice Hall PTR, Upper Saddle River, NJ, USA, 2002.
[2] W. Yu, G. Ginis, and J. M. Cioffi, "Distributed multiuser power control for digital subscriber lines," IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, 2002.
[3] "Spectrum management for loop transmission systems, Issue 2," ANSI, ANSI-T1.417, November 2002.
[4] K. J. Kerpez, D. L. Waring, S. Galli, J. Dixon, and P. Madon, "Advanced DSL management," IEEE Communications Magazine, vol. 41, no. 9, pp. 116–123, 2003.
[5] G. Ginis and J. M. Cioffi, "Vectored transmission for digital subscriber line systems," IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1085–1104, 2002.
[6] "Asymmetrical digital subscriber line (ADSL) transceivers," ITU Std G.992.1, 1999.
[7] "Asymmetrical digital subscriber line (ADSL) transceivers," ITU Std G.992.1, 2003.
[8] R. Suciu, E. Van den Bogaert, J. Verlinden, and T. Bostoen, "Ensuring spectral compatibility of DSM," in Proceedings of the EURASIP 12th European Signal Processing Conference (EUSIPCO '04), Vienna, Austria, September 2004.

[9] J. Verlinden, E. Van den Bogaert, T. Bostoen, R. Cendrillon, and M. Moonen, "Protecting the robustness of ADSL and VDSL DMT modems when applying DSM," in Proceedings of IEEE International Zurich Seminar on Communications (IZS '04), pp. 140–143, Zurich, Switzerland, February 2004.

Etienne Van den Bogaert is a DSL Research Engineer at the Research & Innovation Department in Antwerp, Belgium. In this function, he is investigating next-generation DSL technologies and their applications. His current main research interest is dynamic spectrum management (DSM): algorithms, impact on performance and stability, and practical implementation.

Tom Bostoen received the M.S. degree in physical engineering from Ghent University, Ghent, Belgium, in 1998. Tom Bostoen is currently the Product Manager of the 5530 network analyzer at the Access Networks Division of Alcatel in Antwerp, Belgium. In his previous function, he was the Project Manager of the DSL Physical Layer Research Project at the Research & Innovation Department. Before that, he studied single-ended line testing (SELT) as a Research Engineer in the same department and contributed to ITU G.selt standardization.

Jan Verlinden received a degree in electrical engineering in 2000 from the KULeuven, Belgium. He is currently Member of the DSL Experts Team of the Access Networks Division in Antwerp, Belgium. He joined the Research & Innovation Division of Alcatel in September 2000, where he focussed on echo canceller techniques. From 2002 on, he has focussed on DSM. As such, he participated in the VDSL Olympics by introducing DSM into the VDSL prototype. Within the DSL Experts Team, he is currently studying emerging DSL physical layer technologies. He also contributes to ANSI NIPP-NAI standardization.

Raphael Cendrillon was born in Melbourne, Australia, in 1978. He received an Electrical Engineering degree (first class honours) from the University of Queensland, Australia, in 1999, and a Ph.D. degree in electrical engineering at the Katholieke Universiteit Leuven, Belgium in 2004. His Ph.D. was awarded summa cum laude with congratulations of the jury, an honor given to the top 5% of Ph.D. graduates. His research focuses on the application of multiuser communication theory to xDSL. In 2002, he was a Visiting Scholar at the Information Systems Laboratory, Stanford University with Professor John Cioffi. Since 2005, Dr. Cendrillon has been a PostDoctoral Research Fellow at the University of Queensland, Australia. His work in xDSL is done in close collaboration with Alcatel Research & Innovation Department, Belgium, for which he was awarded the Alcatel Bell Scientific Prize in 2004. He was also the recipient of an IEEE Travel Grant in 2003 and 2004, and the KU Leuven Bursary for Advanced Foreign Scholars in 2004.

Marc Moonen received the Electrical Engineering degree and the Ph.D. degree in applied sciences from Katholieke Universiteit Leuven, Leuven, Belgium, in 1986 and 1990, respectively. Since 2004, he is a Full Professor at the Electrical Engineering Department of Katholieke Universiteit Leuven, where he is currently heading a research team of 16 Ph.D. candidates and postdocs, working in the area of signal processing for digital communications, wireless communications, DSL, and audio signal processing. He received the 1994 KU Leuven Research Council Award, the 1997 Alcatel Bell (Belgium) Award (with Piet Vandaele), the 2004 Alcatel Bell (Belgium) Award (with Raphael Cendrillon), and was a 1997 "Laureate of the Belgium Royal Academy of Science." He was the Chairman of the IEEE Benelux Signal Processing Chapter (1998–2002), and is currently a EURASIP AdCom Member (European Association for Signal, Speech and Image Processing, from 2000 up till now). He has been a Member of the editorial board of IEEE Transactions on Circuits and Systems II (2002–2003). He is currently Editor-in-Chief for the EURASIP Journal on Applied Signal Processing from 2003 up till now, and a Member of the editorial board of Integration, the VLSI Journal, EURASIP Journal on Wireless Communications and Networking, and IEEE Signal Processing Magazine.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 24012, Pages 1–10
DOI 10.1155/ASP/2006/24012

Analysis of Iterative Waterfilling Algorithm for Multiuser Power Control in Digital Subscriber Lines

Zhi-Quan Luo1 and Jong-Shi Pang2

1 Department of Electrical and Computer Engineering, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455, USA
2 Department of Mathematical Sciences and Department of Decision Sciences and Engineering Systems, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, USA

Received 3 December 2004; Revised 19 July 2005; Accepted 22 July 2005

We present an equivalent linear complementarity problem (LCP) formulation of the noncooperative Nash game resulting from the DSL power control problem. Based on this LCP reformulation, we establish the linear convergence of the popular distributed iterative waterfilling algorithm (IWFA) for arbitrary symmetric interference environment and for certain asymmetric channel conditions with any number of users. In the case of symmetric interference crosstalk coefficients, we show that the users of IWFA in fact, unknowingly but willingly, cooperate to minimize a common quadratic cost function whose gradient measures the received signal power from all users. This is surprising since the DSL users in the IWFA have no intention to cooperate as each maximizes its own rate to reach a Nash equilibrium. In the case of asymmetric coefficients, the convergence of the IWFA is due to a contraction property of the iterates. In addition, the LCP reformulation enables us to solve the DSL power control problem under no restrictions on the interference coefficients using existing LCP algorithms, for example, Lemke's method. Indeed, we use the latter method to benchmark the empirical performance of IWFA in the presence of strong crosstalk interference.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

In modern DSL systems, all users share the same frequency band and crosstalk is known to be the dominant source of interference. Since the conventional interference cancellation schemes require access to all users’ signals from different vendors in a bundled cable, they are difficult to implement in an unbundled service environment. An alternative strategy for reducing crosstalk interference and increasing system throughput is power control whereby interference is controlled (rather than cancelled) through the judicious choice of power allocations across frequency. This strategy does not require vendor collaboration and can be easily implemented to mitigate the effect of crosstalk interference and maximize total throughput.

A typical measure of system throughput is the sum of all users’ rates. Unfortunately the problem of maximizing the sum rate subject to individual power constraints turns out to be nonconvex with many local maxima [1]. To obtain a global optimal power allocation solution, a simulated annealing method was proposed in [2]; however, this method suffers from slow convergence and lacks a rigorous analysis. More recently, a dual decomposition approach [3] was developed to solve the nonconvex rate maximization problem, whose complexity was claimed by the authors to be linear in terms of the number of frequency tones but exponential in the number of users. Notice that all of these approaches require a centralized implementation whereby a spectrum management center collects all the channel and noise information, calculates rate-maximizing power spectra vectors, and sends them to individual users for implementation.

In a departure from this centralized framework, Yu et al. [4] proposed a distributed game-theoretic approach for the power control problem. The key observation is that each DSL user’s data rate is a concave function of its own power spectra vector when the interfering users’ power vectors are fixed. Letting each user locally measure the interference plus noise levels and greedily allocate its power to maximize its own rate gives rise to a noncooperative Nash game (called DSL game hereafter) [4, 5]. The resulting distributed power control scheme is known as the iterative waterfilling algorithm (IWFA) and has become a popular candidate for the dynamic spectrum management standard for future DSL systems.

Despite its popularity and its apparent convergent behavior in extensive computer simulations, IWFA has only been theoretically shown to converge in limited cases where the crosstalk interferences are weak [6] and/or the number of users is two [4]. The goal of this paper is to present a convergence analysis of IWFA in more realistic channel settings and for arbitrary number of users. Our approach is based on a key new result that establishes a simple reformulation of the noncooperative Nash game (resulting from the distributed power control problem) as a linear complementarity problem (LCP) of the “copositive-plus” type [7]. Based on this equivalent LCP reformulation, we establish the linear convergence of IWFA for arbitrary symmetric interference environment as well as for diagonally dominant asymmetric channel conditions with any number of users. Moreover, in the case of symmetric interference crosstalk coefficients, we show a surprising result that the users of IWFA in fact, unknowingly but willingly, cooperate to minimize a common quadratic cost function whose gradient measures the total received signal power from all users, subject to the constraints that each user must allocate all of its budgeted power across the frequency tones. This “virtual collaborating behavior” is unexpected since the DSL users in IWFA never have any intention nor incentives to cooperate as each simply maximizes its own rate to reach a Nash equilibrium. Another major advantage of this LCP reformulation is that it opens up the possibility to solve the DSL power control problem using the existing well-developed algorithms for LCP, for example, Lemke’s method [7, 8]. The latter method requires no restriction on the interference coefficients and therefore can be used to benchmark the performance of IWFA, especially in the presence of strong crosstalk interference which leads to multiple Nash equilibrium solutions. In contrast, there has been no theoretical proof of convergence (to an equilibrium solution) for the IWFA under general interference conditions.

Our current work was partly inspired by the recent work of [9] which presented a nonlinear complementarity problem (NCP) formulation of the DSL game using the Karush-Kuhn-Tucker (KKT) optimality condition for each user’s own rate maximization problem. Such an NCP approach can be implemented in a distributed manner despite the need for some small amount of coordination among the DSL users through a spectrum management center. It was shown [9] that the resulting NCP belongs to the $P_0$ class under certain conditions on the crosstalk interference coefficients among the users relative to the various frequency tones. It was further shown that, under the same conditions, the solution to the NCP is “B-regular” [10]; as a consequence, the NCP can be solved in this case by a host of Newton-type methods as described in Chapter 9 of the latter monograph. In contrast to [9], our present work shows that the DSL game is basically a linear problem. This simple result has important consequences as we will see.

The rest of this paper is organized as follows. In Section 2, we present the Nash game formulation of the DSL power control problem and develop an equivalent mixed LCP formulation, based on which we obtain a new uniqueness result of the Nash equilibrium solution to the game. In Section 3, we convert the mixed LCP formulation of the DSL game into a standard LCP and show that the well-known Lemke method will successfully compute a Nash equilibrium of the DSL game, under essentially no conditions on the interference and noise coefficients. Section 4 is devoted to the convergence analysis of the IWFA where we apply an existing convergence theory for a symmetric LCP and the contraction principle in the asymmetric case to show the linear convergence of IWFA under two sets of channel conditions. These convergence results significantly enhance those of [4, 6] by allowing arbitrary number of users and more realistic channel conditions. Section 5 reports simulation results of Lemke’s algorithm and IWFA. It is observed that the IWFA delivers robust convergent behavior under all simulated channel conditions and achieves superior sum rate performance. Section 6 gives some concluding remarks and suggestions for future work. A brief summary of the LCP and its extension to an affine variational inequality (AVI) is presented in an Appendix.

2. LCP FORMULATION

Let there be $m$ DSL users who wish to communicate with a central office in an uplink multiaccess channel. Let $n$ denote the total number of frequency tones available to the DSL users. Each user $i$ has its own power budget described by the feasible set

$$\mathcal{P}^i = \Big\{ p^i \in \mathbb{R}^n \;\Big|\; 0 \le p^i_k \le \mathrm{CAP}^i_k \ \ \forall k = 1, \dots, n, \ \ \sum_{k=1}^{n} p^i_k \le P^i_{\max} \Big\} \tag{1}$$

for some positive constants $\mathrm{CAP}^i_k$ and $P^i_{\max}$, where $p^i = (p^i_1, p^i_2, \dots, p^i_n)$ denotes the power spectra vector of user $i$ with $p^i_k$ signifying the power allocated to frequency tone $k$. In this model, we allow $\mathrm{CAP}^i_k \le \infty$. To avoid triviality, we assume throughout the paper that

$$P^i_{\max} < \sum_{k=1}^{n} \mathrm{CAP}^i_k, \tag{2}$$

which ensures that the budget constraint $\sum_{k=1}^{n} p^i_k \le P^i_{\max}$ is not redundant.

Taking $p^j_k$ for $j \ne i$ as fixed, IWFA lets user $i$ solve the following concave maximization problem in the variables $p^i_k$ for $k = 1, \dots, n$:

$$\begin{aligned} &\text{maximize} && f_i(p^1, \dots, p^m) \equiv \sum_{k=1}^{n} \log\Big( 1 + \frac{p^i_k}{\sigma^i_k + \sum_{j \ne i} \alpha^{ij}_k p^j_k} \Big) \\ &\text{subject to} && p^i \in \mathcal{P}^i, \end{aligned} \tag{3}$$

where $\sigma^i_k$ are positive scalars and $\alpha^{ij}_k$ are nonnegative scalars for all $i \ne j$ and all $k$ representing noise power spectra and channel crosstalk coefficients, respectively. A Nash equilibrium of the DSL game is a tuple of strategies $p^* \equiv (p^{*,i})_{i=1}^{m}$ such that, for every $i = 1, \dots, m$, $p^{*,i} \in \mathcal{P}^i$ and

$$f_i(p^{*,1}, \dots, p^{*,i-1}, p^{*,i}, p^{*,i+1}, \dots, p^{*,m}) \ge f_i(p^{*,1}, \dots, p^{*,i-1}, p^{i}, p^{*,i+1}, \dots, p^{*,m}) \quad \forall p^i \in \mathcal{P}^i. \tag{4}$$

The existence of such an equilibrium power vector $p^*$ is well known. Subsequently, we will give some new sufficient conditions for $p^*$ to be unique; see Proposition 2. Our main goal in the paper pertains to the computation of $p^*$. Throughout the paper, we let $\alpha^{ii}_k = 1$ for all $i$ and $k$.

Letting $u_i$ be the multiplier of the inequality $\sum_{k=1}^{n} p^i_k \le P^i_{\max}$, and $\gamma^i_k$ be the multiplier of the upper bound constraint $p^i_k \le \mathrm{CAP}^i_k$, we can write down the KKT conditions for user $i$’s problem (3) as follows (where $a \perp b$ means that the two scalars (or vectors) $a$ and $b$ are orthogonal):

$$\begin{aligned} 0 \le p^i_k &\perp -\frac{1}{\sigma^i_k + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k} + u_i + \gamma^i_k \ge 0 \quad \forall k = 1, \dots, n, \\ 0 \le u_i &\perp P^i_{\max} - \sum_{k=1}^{n} p^i_k \ge 0, \\ 0 \le \gamma^i_k &\perp \mathrm{CAP}^i_k - p^i_k \ge 0 \quad \forall k = 1, \dots, n. \end{aligned} \tag{5}$$

Although the above KKT system is nonlinear, Proposition 1 shows that, under the assumption (2), the system is equivalent to a mixed linear complementarity system (see the Appendix for a discussion on the LCP).

Proposition 1. Suppose that (2) holds. The system (5) is equivalent to

$$\begin{aligned} 0 \le p^i_k &\perp \sigma^i_k + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k + v_i + \varphi^i_k \ge 0 \quad \forall k = 1, \dots, n, \\ v_i \ \text{free}, &\quad P^i_{\max} - \sum_{k=1}^{n} p^i_k = 0, \\ 0 \le \varphi^i_k &\perp \mathrm{CAP}^i_k - p^i_k \ge 0 \quad \forall k = 1, \dots, n. \end{aligned} \tag{6}$$

Proof. Let $(p^i_k, u_i, \gamma^i_k)$ satisfy (5). We must have

$$\sigma^i_k + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k > 0 \quad \forall k = 1, \dots, n. \tag{7}$$

We claim that $u_i > 0$. Indeed, if $u_i = 0$, then

$$\gamma^i_k \ge \frac{1}{\sigma^i_k + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k} > 0 \quad \forall k = 1, \dots, n, \tag{8}$$

which implies $p^i_k = \mathrm{CAP}^i_k$ for all $k = 1, \dots, n$. Thus

$$P^i_{\max} \ge \sum_{k=1}^{n} p^i_k = \sum_{k=1}^{n} \mathrm{CAP}^i_k, \tag{9}$$

which contradicts (2). Hence to get a solution to (6), it suffices to define

$$v_i \equiv -\frac{1}{u_i}, \qquad \varphi^i_k \equiv \frac{\gamma^i_k \big( \sigma^i_k + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k \big)}{u_i}. \tag{10}$$

Conversely, suppose that $(p^i_k, v_i, \varphi^i_k)$ satisfies (6). We must have $v_i < 0$; otherwise, complementarity yields $p^i_k = 0$ for all $k = 1, \dots, n$, which contradicts the equality constraint. Consequently, letting

$$u_i \equiv -\frac{1}{v_i}, \qquad \gamma^i_k \equiv -\frac{\varphi^i_k}{v_i \big( \sigma^i_k + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k \big)}, \tag{11}$$

we easily see that (5) holds.

In turn, the mixed LCP (6) is the KKT condition of the AVI defined by the affine mapping $p \equiv (p^i)_{i=1}^{m} \in \mathbb{R}^{mn} \mapsto \sigma + Mp \in \mathbb{R}^{mn}$ and the polyhedron $X \equiv \prod_{i=1}^{m} \widehat{\mathcal{P}}^i$, where $\sigma \equiv (\sigma^i)_{i=1}^{m}$ with $\sigma^i$ being the $n$-dimensional noise power vector $(\sigma^i_k)_{k=1}^{n}$ for user $i$, $M$ is the block partitioned matrix $(M^{ij})_{i,j=1}^{m}$ with each $M^{ij} \equiv \mathrm{Diag}(\alpha^{ij}_k)_{k=1}^{n}$ being the $n \times n$ diagonal matrix of power interferences (note: $M^{ii}$ is an identity matrix), and

$$\widehat{\mathcal{P}}^i \equiv \Big\{ p^i \in \mathbb{R}^n \;\Big|\; 0 \le p^i_k \le \mathrm{CAP}^i_k \ \ \forall k = 1, \dots, n, \ \ \sum_{k=1}^{n} p^i_k = P^i_{\max} \Big\}. \tag{12}$$

(See the Appendix for a discussion on the AVI.) Consequently, the tuple $p$ is a Nash equilibrium to the DSL game if and only if $p \in X$ and

$$(p' - p)^T (\sigma + Mp) \ge 0 \quad \forall p' \in X. \tag{13}$$

We denote this AVI by the triple $(X, \sigma, M)$. Among its consequences, the above AVI reformulation of the DSL game enables us to obtain some new sufficient conditions for the uniqueness of a Nash equilibrium solution. To present these conditions, we define the $m \times m$ matrix $B = [b_{ij}]$ by

$$b_{ij} \equiv \max_{1 \le k \le n} \alpha^{ij}_k \quad \forall i, j = 1, \dots, m. \tag{14}$$

Note that $b_{ii} = 1$. In what follows, we review some background results in matrix theory, which can be found in [7]. Let $B_{\rm dia}$, $B_{\rm low}$, and $B_{\rm upp}$ be the diagonal, strictly lower, and strictly upper triangular parts of $B$, respectively. Since $\alpha^{ij}_k$ are all nonnegative, $B$ is a nonnegative matrix. Hence $B_{\rm dia} - B_{\rm low}$ is a “Z-matrix”; that is, all its off-diagonal entries are nonpositive. Since all principal minors of $B_{\rm dia} - B_{\rm low}$ are equal to one, $B_{\rm dia} - B_{\rm low}$ is a “P-matrix,” and thus a “Minkowski matrix” (also known as an “M-matrix”). It follows that $(B_{\rm dia} - B_{\rm low})^{-1}$ exists and is a nonnegative matrix. Therefore, so is the matrix $\Upsilon \equiv (B_{\rm dia} - B_{\rm low})^{-1} B_{\rm upp}$. Let $\rho(\Upsilon)$ denote the spectral radius of $\Upsilon$, which is equal to its largest eigenvalue, by the well-known Perron-Frobenius theorem for nonnegative matrices. The matrix

$$\bar{B} \equiv B_{\rm dia} - B_{\rm low} - B_{\rm upp} \tag{15}$$

is the “comparison matrix” of $B$. Notice that $\bar{B}$ is also a Z-matrix. The matrix $B$ is called an H-matrix if $\bar{B}$ is also a P-matrix. There are many characterizations for the latter condition to hold; we mention two of these: (a) $\rho(\Upsilon) < 1$ and (b) for every nonzero vector $x \in \mathbb{R}^m$, there exists an index $i$ such that $x_i (\bar{B} x)_i > 0$.
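The quantities in (14) and (15) are straightforward to compute from the crosstalk coefficients, which makes the condition $\rho(\Upsilon) < 1$ easy to test for a given channel. The sketch below assumes the coefficients are stored in an array `alpha` of shape `(m, m, n)` with `alpha[i, j, k]` standing for $\alpha^{ij}_k$ and `alpha[i, i, k] = 1`; the function name and array layout are illustrative assumptions.

```python
import numpy as np

def spectral_radius_upsilon(alpha):
    """Return rho(Upsilon) for Upsilon = (B_dia - B_low)^{-1} B_upp.

    alpha: array of shape (m, m, n), alpha[i, j, k] = alpha^{ij}_k,
    with alpha[i, i, k] = 1 so that B has a unit diagonal.
    """
    B = alpha.max(axis=2)          # b_ij = max over tones k, as in (14)
    B_dia = np.diag(np.diag(B))    # diagonal part
    B_low = np.tril(B, k=-1)       # strictly lower triangular part
    B_upp = np.triu(B, k=1)        # strictly upper triangular part
    # Solve (B_dia - B_low) Upsilon = B_upp instead of forming the inverse.
    upsilon = np.linalg.solve(B_dia - B_low, B_upp)
    return float(np.max(np.abs(np.linalg.eigvals(upsilon))))
```

For two users the matrix $\Upsilon$ has eigenvalues $0$ and $b_{12} b_{21}$, so the condition reduces to $b_{12} b_{21} < 1$.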

For each $k = 1, \dots, n$, we call the $m \times m$ matrix $M_k$, where

$$(M_k)_{ij} \equiv \alpha^{ij}_k \quad \forall i, j = 1, \dots, m, \tag{16}$$

a tone matrix. Notice that the matrix $M$ in the AVI $(X, \sigma, M)$ is a principal rearrangement of the block diagonal matrix with $M_k$ as its diagonal blocks for $k = 1, \dots, n$. This rearrangement simply amounts to the alternative grouping of the tuple $p$ by tones, instead of users as done above.

Proposition 2. Suppose that

$$\max_{1 \le i \le m} \sum_{k=1}^{n} \sum_{j=1}^{m} \alpha^{ij}_k p^i_k p^j_k > 0 \quad \forall p \equiv (p^i)_{i=1}^{m} \ne 0. \tag{17}$$

There exists a unique Nash equilibrium to the DSL game. In particular, this holds if either one of the following two conditions is satisfied:
(a) for every $k = 1, \dots, n$, the tone matrix $M_k$ is positive definite;
(b) $\rho(\Upsilon) < 1$.

Proof. As $X$ is the Cartesian product of the sets $\widehat{\mathcal{P}}^i$, it follows that the AVI $(X, \sigma, M)$ has a unique solution if $M$ has the “uniform P property” relative to the Cartesian structure of $X$; see [10]. This property says that for any nonzero tuple $p \equiv (p^i)_{i=1}^{m}$,

$$\max_{1 \le i \le m} (p^i)^T \sum_{j=1}^{m} M^{ij} p^j > 0. \tag{18}$$

Since $M^{ij} = \mathrm{Diag}(\alpha^{ij}_k)_{k=1}^{n}$, the above condition is precisely (17). Under condition (a), the matrix $M$ is positive definite because it is a principal rearrangement of $\mathrm{Diag}(M_k)_{k=1}^{n}$. It is easy to verify that

$$p^T M p = \sum_{i=1}^{m} \sum_{k=1}^{n} \sum_{j=1}^{m} \alpha^{ij}_k p^i_k p^j_k. \tag{19}$$

Hence condition (a) implies (17). To show that condition (b) also implies (17), write

$$\begin{aligned} \sum_{j=1}^{m} \sum_{k=1}^{n} \alpha^{ij}_k p^i_k p^j_k &= \sum_{k=1}^{n} (p^i_k)^2 + \sum_{j \ne i} \sum_{k=1}^{n} \alpha^{ij}_k p^i_k p^j_k \\ &\ge \sum_{k=1}^{n} (p^i_k)^2 - \sum_{j \ne i} \Big| \sum_{k=1}^{n} \alpha^{ij}_k p^i_k p^j_k \Big| \\ &\ge \sum_{k=1}^{n} (p^i_k)^2 - \sum_{j \ne i} \Big( \sum_{k=1}^{n} (p^i_k)^2 \Big)^{1/2} \Big( \sum_{k=1}^{n} (\alpha^{ij}_k p^j_k)^2 \Big)^{1/2} \\ &\ge \sum_{k=1}^{n} (p^i_k)^2 - \sum_{j \ne i} \Big( \max_{1 \le k \le n} \alpha^{ij}_k \Big) \Big( \sum_{k=1}^{n} (p^i_k)^2 \Big)^{1/2} \Big( \sum_{k=1}^{n} (p^j_k)^2 \Big)^{1/2} \\ &= \Big( \sum_{k=1}^{n} (p^i_k)^2 \Big)^{1/2} \sum_{j=1}^{m} \bar{b}_{ij} \Big( \sum_{k=1}^{n} (p^j_k)^2 \Big)^{1/2}, \end{aligned} \tag{20}$$

where the first and third inequality are obvious and the second is due to the Cauchy-Schwarz inequality. Hence letting

$$q_i \equiv \Big( \sum_{k=1}^{n} (p^i_k)^2 \Big)^{1/2}, \tag{21}$$

we have

$$\sum_{j=1}^{m} \sum_{k=1}^{n} \alpha^{ij}_k p^i_k p^j_k \ge q_i \sum_{j=1}^{m} \bar{b}_{ij} q_j = q_i (\bar{B} q)_i \quad \forall i = 1, \dots, m. \tag{22}$$

By what has been mentioned above, condition (b) implies

$$\max_{1 \le i \le m} q_i (\bar{B} q)_i > 0, \tag{23}$$

because $q$ is obviously a nonzero vector; thus (17) holds.

Proposition 2 significantly extends the current existence and uniqueness result of [4–6] which required $0 \le \alpha^{ij}_k \le 1/n$ for all $i \ne j$ and all $k$. Under the latter condition, it can be shown that the symmetric part of each tone matrix $M_k$, $(1/2)(M_k + M_k^T)$, is strictly diagonally dominant; hence each $M_k$ is positive definite. The condition $\rho(\Upsilon) < 1$ is quite broad; for instance, it includes the case where each matrix $M_k$ is “strictly quasi-diagonally dominant,” that is, where for each $k$, there exist positive scalars $d^j_k$ such that

$$d^i_k > \sum_{j \ne i} \alpha^{ij}_k d^j_k \quad \forall i = 1, \dots, m. \tag{24}$$

In Section 4, we will see that the condition $\rho(\Upsilon) < 1$ is responsible for the convergence of the IWFA with asymmetric interference coefficients.

As another application of the AVI formulation of the DSL game, we show that if each tone matrix $M_k$ is positive semidefinite (but not definite), it is still possible to say something about the uniqueness of certain quantities.

Proposition 3. Suppose that the tone matrices $M_k$, for $k = 1, \dots, n$, are all positive semidefinite. Then the set of DSL Nash equilibria is a convex polyhedron; moreover, the quantities

$$\sum_{j=1}^{m} \big( \alpha^{ij}_k + \alpha^{ji}_k \big) p^j_k, \quad \forall i = 1, \dots, m; \ k = 1, \dots, n, \tag{25}$$

are constants among all Nash equilibria.

Proof. Under the given assumption, the matrix $M$ is positive semidefinite. As such, the polyhedrality of the set of Nash equilibria follows from the well-known monotone AVI theory [10]. Furthermore, in this case, the vector $(M + M^T)p$ is a constant among all such equilibria $p$. By unwrapping the structure of the matrix $M$, the desired constancy of the displayed quantities follows readily.

We can interpret $(\alpha^{ij}_k + \alpha^{ji}_k)/2$ as the “average interference coefficient” between user $i$ and user $j$ at frequency $k$. In this way, the invariant quantity $(1/2) \sum_{j=1}^{m} (\alpha^{ij}_k + \alpha^{ji}_k) p^j_k$ represents the average of signal and interference power received and caused by user $i$ across all frequency tones.

3. SOLUTION BY LEMKE’S METHOD

We next discuss the solution of the mixed LCP (6) by the well-known Lemke method [7]. Since this method has a robust theory of convergence, its solution can be used as a benchmark to evaluate the empirical performance of IWFA; see Section 5. For convenience, let us first convert the problem (6) into a standard LCP. Let

$$w^i_k \equiv \sigma^i_k + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k + v_i + \varphi^i_k \quad \forall k = 1, \dots, n, \tag{26}$$

from which we obtain, considering $k = 1$ and substituting $p^j_1 = P^j_{\max} - \sum_{k=2}^{n} p^j_k$ for all $j = 1, \dots, m$,

$$\begin{aligned} v_i &= -\sigma^i_1 + w^i_1 - \sum_{j=1}^{m} \alpha^{ij}_1 p^j_1 - \varphi^i_1 \\ &= -\sigma^i_1 + w^i_1 - \sum_{j=1}^{m} \alpha^{ij}_1 \Big( P^j_{\max} - \sum_{k=2}^{n} p^j_k \Big) - \varphi^i_1 \\ &= -\sigma^i_1 - \sum_{j=1}^{m} \alpha^{ij}_1 P^j_{\max} + w^i_1 + \sum_{j=1}^{m} \sum_{k=2}^{n} \alpha^{ij}_1 p^j_k - \varphi^i_1. \end{aligned} \tag{27}$$

Substituting this into the expression of $w^i_k$ for $k \ge 2$, we deduce

$$\begin{aligned} w^i_k &\equiv \sigma^i_k - \sigma^i_1 - \sum_{j=1}^{m} \alpha^{ij}_1 P^j_{\max} + w^i_1 + \sum_{j=1}^{m} \alpha^{ij}_k p^j_k + \sum_{j=1}^{m} \sum_{\ell=2}^{n} \alpha^{ij}_1 p^j_\ell + \varphi^i_k - \varphi^i_1 \\ &= \widetilde{\sigma}^i_k + w^i_1 + \sum_{j=1}^{m} \sum_{\ell=2}^{n} \big( \alpha^{ij}_1 + \alpha^{ij}_\ell \delta_{k\ell} \big) p^j_\ell + \varphi^i_k - \varphi^i_1, \end{aligned} \tag{28}$$

where $\delta_{k\ell}$ is Kronecker delta, that is,

$$\delta_{k\ell} \equiv \begin{cases} 1 & \text{if } k = \ell, \\ 0 & \text{otherwise}, \end{cases} \qquad \widetilde{\sigma}^i_k \equiv \sigma^i_k - \sigma^i_1 - \sum_{j=1}^{m} \alpha^{ij}_1 P^j_{\max} \quad \forall k = 2, \dots, n. \tag{29}$$

Consequently, the concatenation of the system (6) for all $i = 1, \dots, m$ is equivalent to the following: for all $i = 1, \dots, m$ and all $k = 2, \dots, n$,

$$\begin{aligned} 0 \le p^i_k &\perp w^i_k = \widetilde{\sigma}^i_k + \sum_{j=1}^{m} \sum_{\ell=2}^{n} \big( \alpha^{ij}_1 + \alpha^{ij}_\ell \delta_{k\ell} \big) p^j_\ell + w^i_1 + \varphi^i_k - \varphi^i_1 \ge 0, \\ 0 \le w^i_1 &\perp p^i_1 = P^i_{\max} - \sum_{k=2}^{n} p^i_k \ge 0, \\ 0 \le \varphi^i_k &\perp \mathrm{CAP}^i_k - p^i_k \ge 0, \\ 0 \le \varphi^i_1 &\perp \mathrm{CAP}^i_1 - P^i_{\max} + \sum_{k=2}^{n} p^i_k \ge 0. \end{aligned} \tag{30}$$

The above is an LCP of the standard type

$$0 \le z \perp \mathbf{q} + \mathbf{M} z \ge 0, \tag{31}$$

where the constant vector $\mathbf{q}$ is given by

$$\mathbf{q} \equiv \begin{pmatrix} \widetilde{\sigma}^i_k : \ i = 1, \dots, m; \ k = 2, \dots, n \\ P^i_{\max} : \ i = 1, \dots, m \\ \mathrm{CAP}^i_k : \ i = 1, \dots, m; \ k = 2, \dots, n \\ \mathrm{CAP}^i_1 - P^i_{\max} : \ i = 1, \dots, m \end{pmatrix}, \tag{32}$$

$z$ is the vector of variables:

$$z \equiv \begin{pmatrix} p^i_k : \ i = 1, \dots, m; \ k = 2, \dots, n \\ w^i_1 : \ i = 1, \dots, m \\ \varphi^i_k : \ i = 1, \dots, m; \ k = 2, \dots, n \\ \varphi^i_1 : \ i = 1, \dots, m \end{pmatrix}, \tag{33}$$

and the matrix $\mathbf{M}$, partitioned in accordance with the vectors $\mathbf{q}$ and $z$, is of the form

$$\mathbf{M} \equiv \begin{bmatrix} \widetilde{M} & N & I & -N \\ -N^T & 0 & 0 & 0 \\ -I & 0 & 0 & 0 \\ N^T & 0 & 0 & 0 \end{bmatrix}, \tag{34}$$

where the leading principal submatrix $\widetilde{M}$ is a nonnegative (albeit asymmetric) matrix with positive diagonals and $N$ is a special nonnegative matrix. (The details of the matrices $\widetilde{M}$ and $N$ are not important except for the distinctive features mentioned here.) Based on (34), it follows that the matrix $\mathbf{M}$ is copositive-plus (i.e., $z^T \mathbf{M} z \ge 0$ for all $z \ge 0$, and [$z \ge 0$, $z^T \mathbf{M} z = 0$] implies $(\mathbf{M} + \mathbf{M}^T) z = 0$). Consequently, Lemke’s algorithm can successfully compute a solution to the LCP (31) provided that this LCP is feasible; see [7]. But the latter feasibility condition trivially holds by the nonemptiness of the sets $\widehat{\mathcal{P}}^i$ for $i = 1, \dots, m$, which is a blanket assumption that we have made. Summarizing this discussion, we obtain the following result.

Theorem 1. Suppose that (2) holds and that $\widehat{\mathcal{P}}^i \ne \emptyset$ for all $i = 1, \dots, m$. For all nonnegative coefficients $\alpha^{ij}_k$, $i \ne j$, and all positive $\sigma^i_k$, there exists a Nash equilibrium solution which can be obtained by Lemke’s algorithm applied to the LCP (31) with $\mathbf{q}$ and $\mathbf{M}$ given by (32) and (34), respectively.
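Whatever solver produces a candidate vector $z$ for a standard LCP of the form (31), the three defining conditions can be checked directly. The sketch below is not Lemke's method; it is only a hypothetical helper (the name `solves_lcp` and the tolerance are assumptions) that tests nonnegativity of $z$, nonnegativity of $\mathbf{q} + \mathbf{M}z$, and complementarity.

```python
import numpy as np

def solves_lcp(M, q, z, tol=1e-9):
    """Check 0 <= z, q + M z >= 0 and z^T (q + M z) = 0 up to a tolerance."""
    w = q + M @ z
    nonneg_z = z.min() >= -tol          # z >= 0
    nonneg_w = w.min() >= -tol          # q + M z >= 0
    complementary = abs(z @ w) <= tol   # z and w are orthogonal
    return bool(nonneg_z and nonneg_w and complementary)
```

A check of this kind is one way to decide whether a pivoting method or an iterative method has stopped at an equilibrium or merely hit its iteration limit.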

This existence result extends that of [4] which required the condition that $\max_k \{ \alpha^{21}_k \alpha^{12}_k \} < 1$ and was only for the two user case.

4. CONVERGENCE ANALYSIS OF THE IWFA

The LCP formulation (31) of the DSL game, where each user’s variables associated with tone 1 are eliminated, facilitates the computation of a Nash equilibrium by Lemke’s method (see Section 5 for numerical results). Nevertheless, for the convergence analysis of the IWFA, it would be convenient to return to the AVI $(X, \sigma, M)$, where all variables are left in the formulation. It is well known [10] that the latter AVI is equivalent to the fixed-point equations: for all $i = 1, \dots, m$,

$$p^i = \Big[ p^i - \sigma^i - \sum_{j=1}^{m} M^{ij} p^j \Big]_{\widehat{\mathcal{P}}^i} = \Big[ -\sigma^i - \sum_{j \ne i} M^{ij} p^j \Big]_{\widehat{\mathcal{P}}^i}, \tag{35}$$

where $[\cdot]_{\widehat{\mathcal{P}}^i}$ denotes the Euclidean projection operator onto $\widehat{\mathcal{P}}^i$, that is,

$$[x]_{\widehat{\mathcal{P}}^i} = \operatorname*{argmin}_{p^i \in \widehat{\mathcal{P}}^i} \| x - p^i \|. \tag{36}$$

As briefly described in Section 2, the IWFA [4–6] is a distributed algorithm for solving the DSL game; it has the attractive feature of not requiring the coordination of the DSL users. In fact, each DSL user $i$ simply maximizes its rate $f_i(p^1, \dots, p^m)$ on the feasible set $\mathcal{P}^i$ by adjusting its own power vector $p^i$ while assuming other users’ powers are fixed but unknown. In so doing, user $i$ measures the aggregated interference powers,

$$\sum_{j \ne i} \big( M^{ij} p^j \big)_k = \sum_{j \ne i} \alpha^{ij}_k p^j_k \quad \forall k, \tag{37}$$

locally without the specific knowledge of other users’ power allocations $p^j$ or crosstalk coefficients $\alpha^{ij}_k$, $j \ne i$. Such aggregated interference powers are sufficient for user $i$ to carry out its own rate maximization (3).

Specifically, the iterative waterfilling method can be described as follows: at each iteration, user $i$ measures the aggregated interferences and updates the new iterate by

$$(p^i)^{\rm new} = \Big[ -\sigma^i - \underbrace{\Big( \sum_{j=1}^{i-1} M^{ij} (p^j)^{\rm new} + \sum_{j=i+1}^{m} M^{ij} (p^j)^{\rm old} \Big)}_{\text{aggregated interferences}} \Big]_{\widehat{\mathcal{P}}^i}. \tag{38}$$

In other words, user $i$ simply projects the negative of the aggregated interferences plus the noise power vector onto the polyhedral set $\widehat{\mathcal{P}}^i$. This simple geometric interpretation of the IWFA is key to its convergence analysis, which we separate into two cases: symmetric and nonsymmetric interferences.

Symmetric interferences

When the DSL users are symmetrically located, the corresponding interference coefficients are symmetric: $\alpha^{ij}_k = \alpha^{ji}_k$ for all $i, j, k$. In this case, it follows that $M^{ij} = M^{ji}$ for all $i, j$. Hence the matrix $M$ is symmetric. Consequently, the mixed LCP (6) is precisely the KKT condition for the following quadratic program (QP):

$$\begin{aligned} &\text{minimize} && g(p) \equiv \frac{1}{2} p^T M p + \sum_{i=1}^{m} (\sigma^i)^T p^i \\ &\text{subject to} && p = (p^i)_{i=1}^{m} \in \prod_{i=1}^{m} \widehat{\mathcal{P}}^i. \end{aligned} \tag{39}$$

Notice that the gradient of $g(p)$ measures precisely the total received signal power by every user at each frequency. Moreover, the set of Nash equilibrium points for the noncooperative rate maximization game (3) corresponds exactly to the set of stationary points of the quadratic minimization problem (39), which is not necessarily convex because the matrix $M$ is not positive semidefinite in general. More importantly, the IWFA (38) can be viewed as a block Gauss-Seidel coordinate descent iteration to solve the QP (39). As such, its convergence behavior can be established by appealing to the following general convergence result for the Gauss-Seidel algorithm [11, Proposition 3.4].

Proposition 4. Consider the following quadratic minimization problem:

$$\begin{aligned} &\text{minimize} && \theta(x_1, x_2, \dots, x_n) \\ &\text{subject to} && x_i \in X_i \quad \forall i = 1, 2, \dots, n, \end{aligned} \tag{40}$$

with each $X_i$ being a given polyhedral set. Suppose that $X = X_1 \times X_2 \times \cdots \times X_n$ is nonempty and that $\theta$ is strongly convex in each variable $x_i$. Let $\bar{X}$ denote the set of stationary points of (40) and let $x^0, x^1, x^2, \dots$ be a sequence of iterates generated by the following fixed-point iteration:

$$x_i^{r+1} = \Big[ x_i^{r+1} - \nabla_{x_i} \theta\big( x_1^{r+1}, x_2^{r+1}, \dots, x_i^{r+1}, x_{i+1}^{r}, \dots, x_n^{r} \big) \Big]_{X_i}. \tag{41}$$

Then $\{x^r\}$ converges linearly to an element of $\bar{X}$ and $\{\theta(x^r)\}$ converges linearly and monotonically.

Under the following identifications:

$$x_i \equiv p^i, \qquad X_i \equiv \widehat{\mathcal{P}}^i, \qquad \theta(x) \equiv g(p), \tag{42}$$

iteration (38) is precisely (41). Since $M^{ii}$ is the identity matrix for each $i$, it follows that the quadratic function $g(p)$ is strongly convex in each variable $p^i$. Thus, we can invoke Proposition 4 to conclude the following.

Corollary 1. If the interference coefficients are symmetric, that is, $\alpha^{ij}_k = \alpha^{ji}_k$ for all $i, j, k$, then the iterates $\{p^\nu \equiv (p^{\nu,i})_{i=1}^{m}\}$ generated by the IWFA converge linearly to a Nash equilibrium point of the noncooperative DSL game. Moreover, $\{g(p^\nu)\}$ converges linearly and monotonically.

Notice that in the original IWFA, each user acts greedily to maximize its own rate without coordination. What is surprising is that this seemingly totally distributed algorithm can in fact be viewed equivalently as a coordinate descent algorithm for the minimization of a single quadratic function. In other words, the users actually collaborate, implicitly and willingly, to minimize a common quadratic objective function $g(p)$ whose gradient corresponds to precisely the total received signal power by every user at each frequency. This important insight is the key to the convergence of the IWFA in the symmetric case.

If signal attenuation increases deterministically with the propagation distance, then the symmetric interference assumption used in the above analysis translates directly to the situation that the DSL users are symmetrically located: they are of the same distance to the central office (base station). Such an assumption is obviously idealistic from a practical standpoint. Nonetheless, our analysis of IWFA for this idealized situation may still shed some light on the general behavior of IWFA under arbitrary interferences.

Asymmetric interferences

In general, the DSL users may not be symmetrically located. In this case, the interference matrix $M$ is not symmetric and the aggregated interference power vectors cannot be viewed as the gradient of a scalar function. Thus, Proposition 4 is no longer applicable. More importantly, there is now a lack of an obvious objective function which serves as a monitor for the progress of the IWFA, making the convergence analysis of this algorithm less straightforward. Nevertheless, it is still possible to establish the convergence of the IWFA by imposing the spectral radius condition $\rho(\Upsilon) < 1$ introduced in Proposition 2.

Theorem 2. Suppose that $\rho(\Upsilon) < 1$. Then the iterates $\{p^\nu \equiv (p^{\nu,i})_{i=1}^{m}\}$ generated by the IWFA converge linearly to the unique Nash equilibrium of the DSL game.

Proof. Our proof is by a vector contraction argument; see [7]. Let $p^* \equiv (p^{*,i})_{i=1}^{m}$ be the unique Nash equilibrium solution, which satisfies

$$p^{*,i} = \Big[ p^{*,i} - \sigma^i - \sum_{j=1}^{m} M^{ij} p^{*,j} \Big]_{\widehat{\mathcal{P}}^i} = \Big[ -\sigma^i - \sum_{j \ne i} M^{ij} p^{*,j} \Big]_{\widehat{\mathcal{P}}^i} \quad \forall i = 1, \dots, m. \tag{43}$$

For each $\nu$, we have

$$p^{\nu+1,i} = \Big[ -\sigma^i - \Big( \sum_{j=1}^{i-1} M^{ij} p^{\nu+1,j} + \sum_{j=i+1}^{m} M^{ij} p^{\nu,j} \Big) \Big]_{\widehat{\mathcal{P}}^i} \quad \forall i = 1, \dots, m. \tag{44}$$

Let $\|\cdot\|$ denote the Euclidean norm in $\mathbb{R}^n$. By the nonexpansiveness property of the projection operator (i.e., $\| [x]_{\widehat{\mathcal{P}}^i} - [y]_{\widehat{\mathcal{P}}^i} \| \le \| x - y \|$ for all $x, y$), we have, for all $i = 1, \dots, m$,

$$\begin{aligned} \| p^{\nu+1,i} - p^{*,i} \| &= \Big\| \Big[ -\sigma^i - \Big( \sum_{j=1}^{i-1} M^{ij} p^{\nu+1,j} + \sum_{j=i+1}^{m} M^{ij} p^{\nu,j} \Big) \Big]_{\widehat{\mathcal{P}}^i} - \Big[ -\sigma^i - \Big( \sum_{j=1}^{i-1} M^{ij} p^{*,j} + \sum_{j=i+1}^{m} M^{ij} p^{*,j} \Big) \Big]_{\widehat{\mathcal{P}}^i} \Big\| \\ &\le \Big\| \sum_{j=1}^{i-1} M^{ij} \big( p^{\nu+1,j} - p^{*,j} \big) + \sum_{j=i+1}^{m} M^{ij} \big( p^{\nu,j} - p^{*,j} \big) \Big\| \\ &\le \sum_{j=1}^{i-1} \big\| M^{ij} \big( p^{\nu+1,j} - p^{*,j} \big) \big\| + \sum_{j=i+1}^{m} \big\| M^{ij} \big( p^{\nu,j} - p^{*,j} \big) \big\| \\ &\le \sum_{j=1}^{i-1} b_{ij} \| p^{\nu+1,j} - p^{*,j} \| + \sum_{j=i+1}^{m} b_{ij} \| p^{\nu,j} - p^{*,j} \|. \end{aligned} \tag{45}$$

Hence,

$$\sum_{j=1}^{i} \bar{b}_{ij} \| p^{\nu+1,j} - p^{*,j} \| \le \sum_{j=i+1}^{m} b_{ij} \| p^{\nu,j} - p^{*,j} \|, \tag{46}$$

where $\bar{B} = [\bar{b}_{ij}]$ is defined by (15). Letting $e^\nu \equiv (e^\nu_i)_{i=1}^{m}$ with $e^\nu_i \equiv \| p^{\nu,i} - p^{*,i} \|$ and concatenating the above inequalities for all $i = 1, \dots, m$, we deduce

$$(B_{\rm dia} - B_{\rm low}) e^{\nu+1} \le B_{\rm upp} e^{\nu}, \tag{47}$$

which implies

$$0 \le e^{\nu+1} \le (B_{\rm dia} - B_{\rm low})^{-1} B_{\rm upp} e^{\nu} = \Upsilon e^{\nu} \quad \forall \nu, \tag{48}$$

where we have used the fact that $(B_{\rm dia} - B_{\rm low})^{-1}$ is nonnegative entry-wise; see the discussion preceding Proposition 2. Since $\rho(\Upsilon) < 1$, the above inequality implies that the sequence of error vectors $\{e^\nu\}$ contracts according to a certain norm. Consequently, the sequence converges to zero, implying that the sequence of waterfilling iterates $\{p^\nu\}$ converges linearly to the unique solution $p^*$ of the DSL game.

Theorem 2 strengthens the existing convergence results [4, 6]. Specifically, the condition required for convergence is weaker. In particular, it can be seen that the strong diagonal dominance condition ($\alpha^{ij}_k \le 1/(m-1)$) required in [6] and the respective condition for the two user case [4] both imply the condition $\rho(\Upsilon) < 1$. Thus, Theorem 2 covers the convergence for a broader class of DSL problems.

5. NUMERICAL SIMULATIONS

In this section, we present some computer simulation results comparing the convergence behavior of IWFA with Lemke’s algorithm under various channel conditions and system load (i.e., number of users). In all simulated cases, the channel background noise levels $\sigma^i_k$ are chosen randomly from the

interval $(0, 0.1/(m-1))$ with the uniform distribution, and the total power budgets $P^i_{\max}$ are chosen uniformly from the interval $(n/2, n)$. All sum rates are averaged over 100 independent runs. The IWFA and Lemke’s method are both implemented on a Pentium 4 (1.6 GHz) PC using Matlab 6.5 running under Windows XP. For IWFA, we set a maximum of 400 iteration cycles (among all users), while the maximum number of pivoting steps for Lemke’s method is set to be $\min(1000, 25mn)$.

Table 1: Average sum rate: two user case.

| $n$ | Lemke, $\alpha^{12}_k, \alpha^{21}_k \in (0, 1)$ | IWFA, $\alpha^{12}_k, \alpha^{21}_k \in (0, 1)$ | Lemke, $\alpha^{12}_k, \alpha^{21}_k \in (0, 1.5)$ | IWFA, $\alpha^{12}_k, \alpha^{21}_k \in (0, 1.5)$ |
|---|---|---|---|---|
| 256 | 704 | 698 | 829.73 | 826.5787 |
| 512 | 1.402 × 10^3 | 1.398 × 10^3 | 1.6555 × 10^3 | 1.6333 × 10^3 |
| 1024 | 2.786 × 10^3 | 2.811 × 10^3 | 3.3028 × 10^3 | 3.2968 × 10^3 |

Table 2: Average sum rate: m = 10 user case.

| $n$ | Lemke, $\alpha^{ij}_k \in (0, 1/(m-1))$ | IWFA, $\alpha^{ij}_k \in (0, 1/(m-1))$ |
|---|---|---|
| 256 | 2.8216 × 10^3 | 2.824 × 10^3 |
| 512 | 5.6464 × 10^3 | 5.6457 × 10^3 |
| 1024 | 1.1284 × 10^4 | 1.1296 × 10^4 |

Table 1 reports the achieved sum rates (averaged over 100 independent runs) of Lemke’s method and IWFA for 2 users and various numbers $n$ of frequency tones. In this case we have chosen crosstalk coefficients $\{\alpha^{ij}_k\}$ from the intervals $(0, 1)$ and $(0, 1.5)$, respectively, for all $k$, and all $i, j$. This represents strong crosstalk interference scenarios. According to the table, the average rates achieved by both algorithms are comparable (within 2%), suggesting that the IWFA is capable of computing approximate Nash solutions with high sum rates. Moreover, the results show that stronger interference actually leads to Nash solutions with higher overall sum rates. This seems to indicate that the well-known Braess paradox [12] exists in DSL games. (In fact, using the QP characterization of the Nash game (cf. Section 4), it is possible to construct simple examples whereby more transmission power for individual users does not necessarily lead to Nash solutions with higher sum rate.)

For the case with more ($m = 10$) users, the situation is similar, as shown in Table 2. Indeed, when $\alpha^{ij}_k \in (0, 1/(m-1))$, the condition for the uniqueness of the Nash solution is satisfied and the two methods have identical performance. The results in both tables show that IWFA solutions are comparable in quality to the respective solutions generated by the Lemke method. The difference in the solution qualities is due to the finite termination criteria we have used in both algorithms, which can cause either algorithm to stop before an equilibrium solution is found.

6. CONCLUSIONS

In this paper we reformulate the DSL Nash game (resulting from the distributed implementation of IWFA) as an equivalent LCP, and apply the rich theory for LCP to analyze the convergence behavior of IWFA. Our analysis not only significantly strengthens the existing convergence results, but also yields surprising insight on IWFA. In particular, in the case of symmetric interference, the users of IWFA in fact collaborate unknowingly to minimize a common quadratic cost, even though their original intention is to maximize their individual rates. Moreover, the LCP reformulation makes it possible to solve the DSL game with existing LCP solvers, such as Lemke’s method. With the latter as a benchmark, we show via computer simulations that IWFA tends to converge to good Nash solutions with high sum rates. Our theoretical and simulation work affirms the potential of IWFA as a promising candidate for the dynamic power spectra management in DSL environment.

Several extensions of current work are possible. For example, under either the diagonal dominance condition of $\rho(\Upsilon) < 1$ or the symmetric interference condition, one can establish the linear convergence of a distributed (partially) asynchronous implementation of IWFA. In particular, for the diagonal dominance case, one can use a contraction argument similar to that in [13, page 493], while for the symmetric interference case, one can use an error bound technique [14] to bound the distance from the iterates to the solution set of the QP (39). Asynchronous implementation is interesting from a practical standpoint since it does not require the DSL users to coordinate the timing of their power spectra updates.

As a future work, we are interested in further analyzing the behavior of IWFA under no assumptions on the crosstalk coefficients. Perhaps the compactness of the feasible set and the nonnegativity of the crosstalk coefficients already ensure the convergence of IWFA, or at least eliminate the possibility of finite limit cycles. These issues and the design of an efficient optimal power allocation algorithm for the nonconvex sum rate maximization problem are interesting topics for future research.

APPENDIX

BACKGROUND ON LCPs AND AVIs

In this appendix, we briefly summarize some technical background related to the linear complementarity problems and affine variational inequalities. For a comprehensive treatment of these problems, the readers are referred to the two monographs [7, 10].

Unifying linear and quadratic programs and many related problems, the LCP is an inequality system with no objective function to be optimized. Specifically, let $M$ be a given square matrix of order $n \times n$ and $q$ a column vector in $\mathbb{R}^n$. The LCP associated with $(q, M)$ (denoted as LCP$(q, M)$) is to find $x \in \mathbb{R}^n$ such that

$$x \ge 0, \qquad Mx + q \ge 0, \qquad x^T (Mx + q) = 0. \tag{A.1}$$

Let Sol$(q, M)$ denote the solution set of LCP$(q, M)$. It is known that Sol$(q, M)$ is in general equal to a finite union of polyhedral sets. If $M$ is positive semidefinite (not necessarily symmetric), then we say that the corresponding LCP

· Rn is monotone; in this case, the solution set Sol(q, M)isconvex where [ ]+ denotes projection to + and α>0isanycon- (and polyhedral). If M is symmetric, it can be easily seen that stant. This suggests the following simple iterative scheme to LCP(q, M) corresponds exactly to the KKT conditions for the compute a solution of LCP(q, M): for a given stepsize α>0 following QP: and an initial iterate x0 ≥ 0, 1 r+1 = r − r = minimize f (x) ≡ xT Mx + qT x x x α Mx + q +, r 1, 2, .... (A.7) 2 (A.2) subject to x ≥ 0. Thisiterativeschemeiscalledthegradient projection algo- rithm. If {xr } converges, then the limit must be a solution Therefore, the stationary points of above QP are precisely the of LCP(q, M). More generally, we can split the matrix M as solutions of the LCP(q, M). Moreover, the gradient vector M = B + C for some matrices B and C and generate a se- ∇ f (x) can be shown to be constant on each of the polyhedral quence according to piece of Sol(q, M). (If M is in addition positive semidefinite, ∇ r+1 = r+1 − r+1 r = then Sol(q, M) consists of one polyhedral piece, so f (x) x x α Bx + Cx + q +, r 1, 2, .... (A.8) is constant over Sol(q, M).) When M is not symmetric, the above QP equivalence no longer holds. Instead, we can asso- Again, if the sequence {xr } converges, then its limit must be ciate with the LCP(q, M) the following alternate QP: an element of Sol(q, M). The aforementioned gradient pro- jection is a special matrix splitting algorithm with B ≡ I/α minimize xT (q + Mx) and C ≡ M − I/α.IfB is taken to be the lower triangular part (A.3) subject to q + Mx ≥ 0, x ≥ 0. (including the diagonal) of M while C is taken to be the strict upper triangular part of M, then the resulting matrix split- In this case, a vector x is a global minimizer of (A.3)witha ting algorithm simply corresponds to the well-known Gauss- zero objective value if and only if x ∈ Sol(q, M). 
Unlike the Seidel method for LCP.In general, to ensure convergence, the symmetric case, the KKT points of (A.3) are not necessarily matrix splitting M = B + C must satisfy certain conditions. the solutions of LCP(q, M). For example, if M is symmetric, B and B − C are both posi- The LCP can also be used to model a linear program (LP) tive definite, then the iterates generated by the resulting ma- via duality. Indeed, the following LP: trix splitting algorithm converges linearly to an element of Sol(q, M). minimize cT x Much of the theory and algorithms for the LCP can be (A.4) subject to Ax ≥ b, x ≥ 0 extended to the AVI of the following form: given the polyhe- dron, is equivalent to the LCP(q, M)with n P ≡ x ∈ R : Ax ≥ b ,(A.9) c 0 −AT q ≡ , M ≡ ,(A.5) ∗ ∈ P −b A 0 find x such that − ∗ T ∗ ≥ ∀ ∈ P where the matrix M is skew-symmetric, thus positive (x x ) (q + Mx ) 0 x . (A.10) semidefinite. Within this framework, LCP(q, M) simply corresponds to the There are many algorithms developed for solving an LCP. = = Among them, Lemke’s method is perhaps the most versatile case where A I and b 0. The solution set of an AVI is due to its weak requirements for convergence. Algorithmi- also the union of a finite number of polyhedral sets, which cally, Lemke’s method is a pivoting algorithm, much like the becomes a single (convex) polyhedron when M is positive celebrated simplex method for an LP. As such, it is a finite semidefinite (the monotone case). In general, a vector x solves method but suffers from exponential worst case complexity. the above AVI if and only if x satisfies the following fixed Nonetheless, its simplicity and superior average performance point equation: have made it a popular choice in practice. x = x − α(Mx + q) , (A.11) For monotone LCPs, we can also use interior point algo- P rithms which offer polynomial complexity [15]. 
These algo- where [·]P denotes the orthogonal projection operator onto rithms exploit the positive semidefiniteness of M and typi- P . Similar to the case of LCP, we can devise matrix splitting cally require only a small number of iterations, albeit every algorithms for solving the above AVI: iteration requires the solution of a system of linear equations × r+1 r+1 r+1 r of size n n. In the absence of monotonicity, interior point x = x − α Bx + Cx + q P , r = 1, 2, ..., algorithms are not guaranteed to converge. (A.12) Another popular class of iterative algorithms for solving LCPs consists of the matrix splitting algorithms, which are where M = B + C is a splitting of matrix M. Under condi- based on the observation that a vector x ∈ Sol(q, M)ifand tions similar to those for the LCP, we can also establish linear only if x satisfies the following fixed point equation: convergence of the matrix splitting algorithms for solving a symmetric AVI (i.e., M = MT ) provided a solution exists; see = − x x α(Mx + q) +,(A.6)[11]. 10 EURASIP Journal on Applied Signal Processing

ACKNOWLEDGMENTS

We wish to thank Nobuo Yamashita for making his IWFA code available and Michael Ferris for helping with the Lemke code in the simulation work reported in this paper. The research of the first author is supported in part by the Natural Sciences and Engineering Research Council of Canada, Grant no. OPG0090391, by the Canada Research Chair Program, and by the National Science Foundation under Grant DMS-0312416. The research of the second author is supported in part by the National Science Foundation under Grants CCR-0098013 and CCR-0353073.

REFERENCES

[1] K. B. Song, S. T. Chung, G. Ginis, and J. M. Cioffi, “Dynamic spectrum management for next-generation DSL systems,” IEEE Communications Magazine, vol. 40, no. 10, pp. 101–109, 2002.
[2] G. Cherubini, E. Eleftheriou, and S. Olcer, “On the optimality of power back-off methods,” American National Standards Institute, ANSI-T1E1.4/235, August 2000.
[3] R. Cendrillon, M. Moonen, J. Verliden, T. Bostoen, and W. Yu, “Optimal multiuser spectrum management for digital subscriber lines,” in Proceedings of IEEE International Conference on Communications (ICC ’04), vol. 1, pp. 1–5, Paris, France, June 2004.
[4] W. Yu, G. Ginis, and J. M. Cioffi, “Distributed multiuser power control for digital subscriber lines,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, 2002.
[5] S. T. Chung, S. J. Kim, J. Lee, and J. M. Cioffi, “A game-theoretic approach to power allocation in frequency-selective Gaussian interference channels,” in Proceedings of IEEE International Symposium on Information Theory (ISIT ’03), pp. 316–316, Pacifico Yokohama, Kanagawa, Japan, June–July 2003.
[6] S. T. Chung, “Transmission schemes for frequency selective Gaussian interference channels,” Doctoral dissertation, Department of Electrical Engineering, Stanford University, Stanford, Calif, USA, 2003.
[7] R. W. Cottle, J.-S. Pang, and R. E. Stone, The Linear Complementarity Problem, Academic Press, Boston, Mass, USA, 1992.
[8] C. E. Lemke, “Bimatrix equilibrium points and mathematical programming,” Management Science, vol. 11, no. 7, pp. 681–689, 1965.
[9] N. Yamashita and Z.-Q. Luo, “A nonlinear complementarity approach to multiuser power control for digital subscriber lines,” Optimization Methods and Software, vol. 19, no. 5, pp. 633–652, 2004.
[10] F. Facchinei and J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Springer, New York, NY, USA, 2003.
[11] Z.-Q. Luo and P. Tseng, “Error bounds and convergence analysis of feasible descent methods: A general approach,” Annals of Operations Research, vol. 46, pp. 157–178, 1993.
[12] D. Braess, “Über ein Paradoxon aus der Verkehrsplanung,” Unternehmensforschung, vol. 12, pp. 258–268, 1968.
[13] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Prentice-Hall, Englewood Cliffs, NJ, USA, 1989.
[14] Z.-Q. Luo and P. Tseng, “On the rate of convergence of a distributed asynchronous routing algorithm,” IEEE Transactions on Automatic Control, vol. 39, no. 5, pp. 1123–1129, 1994.
[15] M. Kojima, N. Megiddo, T. Noma, and A. Yoshise, A Unified Approach to Interior Point Algorithms for Linear Complementarity Problems, vol. 538 of Lecture Notes in Computer Science, Springer, Berlin, Germany, 1991.

Zhi-Quan Luo received the B.S. degree in mathematics from Peking University, China, in 1984. During the academic year of 1984 to 1985, he was with Nankai Institute of Mathematics, Tianjin, China. From 1985 to 1989, he studied at the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, where he received the Ph.D. degree in operations research. In 1989, he joined the Department of Electrical and Computer Engineering, McMaster University, Hamilton, Canada, where he became a Professor in 1998 and held the Canada Research Chair in information processing since 2001. Starting April 2003, he has been a Professor with the Department of Electrical and Computer Engineering and holds an endowed ADC Research Chair in wireless telecommunications with the Digital Technology Center at the University of Minnesota. His research interests lie in the union of large-scale optimization, information theory and coding, data communications, and signal processing. Professor Luo is a Member of SIAM and MPS. He is a recipient of the 2004 IEEE Signal Processing Society’s Best Paper Award, and has held editorial positions for several international journals including SIAM Journal on Optimization, Mathematics of Computation, Mathematics of Operations Research, and IEEE Transactions on Signal Processing.

Jong-Shi Pang received a Ph.D. degree in operations research from Stanford University and is presently the Margaret A. Darrin Distinguished Professor in applied mathematics at Rensselaer Polytechnic Institute in Troy, New York. Prior to this position, he taught at The Johns Hopkins University, The University of Texas at Dallas, and Carnegie-Mellon University. He has received several awards and honors, most notably the George B. Dantzig Prize in 2003, jointly awarded by the Mathematical Programming Society and the Society for Industrial and Applied Mathematics, and the 1994 Lanchester Prize by the Institute for Operations Research and Management Science. He is an ISI highly cited author in the mathematics category. His research interests are in continuous optimization and equilibrium programming and their applications in engineering, economics, and finance. Among the current projects, he is studying various extensions of the basic Nash equilibrium problem, including the Stackelberg game and its multileader generalization, and the dynamic version of the Nash problem. The mathematical tool for the latter problem is a new class of dynamical systems known as differential variational inequalities, which provides a powerful framework for dealing with applications that involve dynamics, unilateral constraints, and mode switches.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 16828, Pages 1–12
DOI 10.1155/ASP/2006/16828

Alien Crosstalk Cancellation for Multipair Digital Subscriber Line Systems

George Ginis and Chia-Ning Peng

Broadband Communications Group, Incorporated, 2043 Samaritan Drive, San Jose, CA, USA

Received 30 November 2004; Revised 18 February 2005; Accepted 4 May 2005

An overview of alien crosstalk cancellation for DSL systems with multiple pairs is presented here. It is shown that when a common crosstalk source affects the receivers of multiple pairs, the noise exhibits a certain correlation among the pairs. In a DMT system, the frequency-domain noise samples are most strongly correlated between pairs when they correspond to the same tone. Thus, noise decorrelation algorithms applied independently for each tone can provide significant performance enhancements. Three possible methods are described for noise decorrelation: one is suitable for two-sided coordination and two are suited for receiver coordination among the pairs. It is theoretically proven that the data-rate performance of these three methods is identical from the perspective of the sum rate over all pairs. Simulation results corresponding to an ADSL2+ two-pair system with a T1 disturber are presented to illustrate the noise correlation property and to indicate the potential performance benefits.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

Digital subscriber line (DSL) transmission is typically constrained by crosstalk interference. As DSL technology advances and processing power increases, the interest in techniques for crosstalk reduction has increased. Most of these techniques are based on multiple-input multiple-output (MIMO) system representations, and it is common in the DSL literature to refer to them as vectoring methods and to describe a DSL system employing such methods as a vectored system. This paper gives an overview of a specific class of vectoring methods that aim to suppress interference arising from crosstalk sources that lie outside the vectored system. The effect of such sources is typically referred to as alien or out-of-domain crosstalk. These concepts are further explained in the paragraphs that follow.

Vectored systems are comprised of multiple twisted pairs and the corresponding transceiver modules. The multiple twisted pairs are represented as a MIMO channel. In a vectored system, joint signal processing (or MIMO) techniques are employed on one or both sides of the MIMO channel. Vectoring techniques that require joint signal processing at both sides fall under the category of two-sided coordination. Vectoring techniques that only require joint signal processing at one side are described as coordinated reception or coordinated transmission, depending on the side where the joint signal processing takes place.

An alternative categorization of vectoring methods relates to the type of crosstalk that is being cancelled. The crosstalk interference affecting transmission of information on a pair belonging to the vectored system can originate from two kinds of sources. Crosstalk may originate from a transmitter belonging to the vectored system, or it may originate from a transmitter outside the vectored system. The first kind is defined as in-domain crosstalk, while the second kind is defined as alien or out-of-domain crosstalk.

One of the earliest references to MIMO methods for DSL is [1], where both transmitter and receiver methods were described. The concept of alien crosstalk cancellation was described extensively in [2, 3]. This early research was later followed by work focusing on the discrete multitone (DMT) modulation method. Methods for coordinated reception/transmission of in-domain crosstalk cancellation were described in [4, 5], which involved MIMO decision feedback at the receiver, and MIMO precoding at the transmitter. Two-sided coordination for in-domain crosstalk cancellation was proposed in [6]. References [7, 8] showed that significant performance benefits are still possible with simplified vectoring techniques that achieve partial cancellation. Linear MIMO signal processing was shown in [9, 10] to be near-optimal for in-domain crosstalk cancellation. A general theory of MIMO for DSL and an overview of its applications are given in [11]. An alternative approach towards crosstalk cancellation which exploits the common-mode signal of DSL pairs and which does not rely on joint signal processing of signals from multiple pairs was presented in [12].

Vectoring methods have attracted significant interest from the DSL industry. Such methods have been presented and discussed within DSL standardization bodies such as the Network Access Interfaces (NAI) Committee of the Alliance for Telecommunications Industry Solutions (ATIS). These discussions have partly been motivated by the application of DSL bonding, where multiple pairs are used to transport a single data stream, and which offers the possibility to implement two-sided coordination. Contribution [13] presented information on the concept of alien crosstalk cancellation using two-sided coordination. The benefits of vectoring were analyzed using real measured channels in [14]. Alien crosstalk cancellation is also described in detail in [15].

This paper presents the theory of alien crosstalk cancellation, and gives an overview of such MIMO methods. The system model is given in Section 2, where the assumption of DMT transmission is made and the property of noise correlation among tones is illustrated. Section 3 describes a number of approaches for alien crosstalk cancellation (also referred to as MIMO noise decorrelation), and shows their equivalence from an information-theory standpoint. The approach of receiver coordination with noise prediction for DMT has not been previously proposed in the literature. Section 4 shows simulation results for alien crosstalk cancellation, and Section 5 draws some final conclusions.

The following notation is used in the rest of this paper. Bold letters indicate vectors, lowercase is used for time-domain signals, and uppercase is used for frequency-domain signals. The superscripts $T$ and $*$ denote the transpose and conjugate transpose operations, respectively.

2. SYSTEM MODEL

2.1. Channel model

The DSL environment for a vectored system is shown in Figure 1. It is assumed that the vectored system consists of $L$ twisted pairs, and that crosstalk originates from $M$ sources that lie outside the vectored system. These sources may give rise to either far-end crosstalk (FEXT) or near-end crosstalk (NEXT), but this fact has no consequence for the analysis that follows. The important observation is that alien crosstalk affects the vectored pairs in a common way. This implies that the received noise has a certain degree of correlation among the vectored pairs. Exploiting this noise correlation is the fundamental concept for alien crosstalk cancellation.

Figure 1: DSL crosstalk environment. (Crosstalk sources 1, ..., M couple alien crosstalk into lines 1, ..., L, which connect the transmitters to the receivers.)

Using vector notation, the output signal on pair $i$ can be expressed as

$$\mathbf{y}_i = \mathbf{H}_i^c \mathbf{x}_i^p + \mathbf{n}_i + \sum_{k=1}^{M} \mathbf{A}_{i,k}^c \mathbf{z}_k^p, \tag{1}$$

where $\mathbf{y}_i$ is a column vector containing a block of $N$ received samples, $\mathbf{x}_i^p$ is $(N + \nu) \times 1$ and holds the transmitted samples on pair $i$, $\mathbf{n}_i$ is $N \times 1$ and contains the Gaussian-distributed noise samples received on pair $i$, and $\mathbf{z}_k^p$ is $(N + \xi) \times 1$ and holds the transmitted samples of crosstalk source $k$. The samples contained in these vectors are ordered from the most recent to the least recent. The noise correlation matrix can be assumed to be $\mathbf{R}_{\mathbf{n}_i \mathbf{n}_i} = \mathbf{I}$ with no loss of generality. The matrix $\mathbf{H}_i^c$ is $N \times (N + \nu)$ and represents the convolution matrix corresponding to the channel of pair $i$. The matrix $\mathbf{A}_{i,k}^c$ is $N \times (N + \xi)$ and represents the convolution matrix of the crosstalk coupling channel from source $k$ to victim pair $i$. The parameters $\nu$ and $\xi$ denote the length (in number of samples) of the nonzero part of the impulse response of the direct channel and of the alien crosstalk coupling, respectively. The sampling rates and sampling times are assumed to be identical for all pairs. It is noted that the direct and the indirect channels are here assumed to include the time-domain filtering of the transmitter and receiver modems.

2.2. DMT transmission and synchronization

DMT transmission is here considered, since it is by far the most popular transmission method in DSL. It is assumed that each DMT symbol includes both a cyclic prefix (CP) and a cyclic suffix (CS) [16]. It is also assumed that timing advance is employed at the customer premises' modems as defined in the very-high-speed digital subscriber line (VDSL) recommendation [17].

A timing diagram is shown in Figure 2 to illustrate the concept of synchronized DMT transmission among multiple pairs. The time scale corresponds to the vertical axis. Two pairs are assumed and both transmission directions are considered. LT-TX1 and LT-TX2 indicate the transmitters at the line termination (LT) or central office's side for the two pairs. NT-TX1 and NT-TX2 indicate the transmitters at the network termination (NT) or customer premises' side for the two pairs. The corresponding receivers are denoted as NT-RX1, NT-RX2, LT-RX1, and LT-RX2. The transmission is shown for the duration of a single DMT block. The following example assumptions are made. The channel memory is equal to 2 units of time, and the CP is set equal to 2 units of time. The propagation delay of pair 1 is 2 units of time, and the propagation delay of pair 2 is 3 units of time. The maximum propagation delay is equal to 3 units of time, so the CS is also set to 3 units of time. The timing advance at the NT side equals the propagation delay of the corresponding pair.

Figure 2: Example of synchronized lines. (Timing diagram with time on the vertical axis, from −4 to 18 units, showing the timing advance, CP, received block, and CS for downstream transmission (LT-TX1/NT-RX1, LT-TX2/NT-RX2) and upstream transmission (NT-TX1/LT-RX1, NT-TX2/LT-RX2) on lines 1 and 2.)

In the above example, the block transmission of the main part of the DMT symbol starts at time 0 for the LT transmitters. The transmission of the CP starts at time −2. The timing advance at the NT side means that the block transmission at the NT side is advanced by an amount of time equal to the propagation delay of the line. This results in DMT symbols at the NT side starting transmission at the same time as those at the LT side. Evidently, the DMT blocks are synchronized across all transmitters.

Next, the receiver timing is examined. It is a requirement for the DMT blocks to be synchronized across the receivers, in order to fully exploit noise correlation among the receivers. The existence of the cyclic suffix makes this possible: the thick vertical line segments show the block of samples that are extracted at the receivers. For NT-RX2, the received block includes all the samples of the main part of the DMT symbol. For NT-RX1, the received block includes part of the CS, in order to coincide in time with the received block of NT-RX2. Similarly, the received blocks at the LT side are chosen to coincide in time. The inclusion of part of the CS implies only a phase change for the channel model. A mathematical representation of this scheme is given next.

The inclusion of the CP and the CS can be expressed as

$$\mathbf{x}_i^p = \mathbf{P}_i \mathbf{x}_i, \tag{2}$$

where $\mathbf{x}_i$ is an $N \times 1$ vector containing the DMT block samples to be transmitted excluding the CP and the CS. The parameter $N$ is chosen to be equal to twice the number of DMT tones. The matrix $\mathbf{P}_i$ is $(N + \nu) \times N$ and models the addition of the CP and CS. Letting the length of the CP be equal to $\nu$, the length of the CS that is included in the received block be equal to $\mu_i$, and with the assumption that $\mu_i < \nu$, the matrix is

$$\mathbf{P}_i = \begin{bmatrix} \mathbf{O}_{\mu_i \times (N - \mu_i)} & \mathbf{I}_{\mu_i \times \mu_i} \\ \multicolumn{2}{c}{\mathbf{I}_{N \times N}} \\ \mathbf{I}_{(\nu - \mu_i) \times (\nu - \mu_i)} & \mathbf{O}_{(\nu - \mu_i) \times (N - \nu + \mu_i)} \end{bmatrix}. \tag{3}$$

Additionally, the FFT and IFFT operations are represented as

$$\mathbf{Y}_i = \mathbf{Q} \mathbf{y}_i, \qquad \mathbf{x}_i = \mathbf{Q}^* \mathbf{X}_i, \tag{4}$$

where $\mathbf{Y}_i$ is the FFT output at the receiver, and $\mathbf{X}_i$ is the IFFT input at the transmitter. The matrices $\mathbf{Q}$ and $\mathbf{Q}^*$ correspond to the FFT and IFFT transforms, respectively.

From (1), (2), and (4), a new expression is obtained for the received samples of user $i$ at the FFT output:

$$\mathbf{Y}_i = \mathbf{Q} \mathbf{H}_i^c \mathbf{P}_i \mathbf{Q}^* \mathbf{X}_i + \mathbf{Q} \mathbf{n}_i + \sum_{k=1}^{M} \mathbf{Q} \mathbf{A}_{i,k}^c \mathbf{z}_k^p \tag{5}$$

$$= \mathbf{Q} \mathbf{H}_i \mathbf{Q}^* \mathbf{X}_i + \tilde{\mathbf{N}}_i + \sum_{k=1}^{M} \tilde{\mathbf{A}}_{i,k} \tag{6}$$

$$= \mathbf{\Lambda}_i \mathbf{X}_i + \tilde{\mathbf{N}}_i + \tilde{\mathbf{A}}_i. \tag{7}$$

Note that $\mathbf{H}_i = \mathbf{H}_i^c \mathbf{P}_i$ is a circulant matrix, which makes $\mathbf{\Lambda}_i = \mathbf{Q} \mathbf{H}_i \mathbf{Q}^*$ a diagonal matrix. The correlation matrix of the new noise term is $\mathbf{R}_{\tilde{\mathbf{N}}_i \tilde{\mathbf{N}}_i} = \mathbf{I}$. The choice of $\mu_i$ in the definition of $\mathbf{P}_i$ depends on the timing of the received block. It can be shown that this choice only affects the phase of the diagonal elements of $\mathbf{\Lambda}_i$. The next subsection investigates the term $\tilde{\mathbf{A}}_i$, which represents the alien crosstalk.

2.3. Alien crosstalk correlation −40

In (7), the alien crosstalk for user i is defined as −50 M = Ai Ai,k −60 =1 k (8) M = c p −70 QAi,kzk . k=1 With the assumption that crosstalk sources are indepen- PSD (dBm/Hz) −80 dent, the correlation matrix of alien crosstalk of user i is − M 90 = c p p∗ c∗ ∗ RAiAi QAi,kE zk zk Ai,k Q ,(9) k=1 −100 0 50 100 150 200 250 300 350 400 450 500 while the correlation matrix of alien crosstalk of user i with Tone index that of user j is M Figure 3: Power spectral density of T1 transmitter. = c p p∗ c∗ ∗ RAiAj QAi,kE zk zk Aj,kQ . (10) k=1 Thereisaspecialcasethatdeservessomeattention.Set- 2.4. Example ting ξ =ν and zN−k =z−k,fork = 1, ..., ν, results in p = An example is next presented to illustrate the noise correla- zk Pzk, (11) tion property in practical conditions. It is assumed that there = where P is as in (3)withμi 0. These conditions correspond is a single crosstalk source from a T1 transmitter, which af- to the special case where the alien crosstalk sources are DMT fects the downstream receivers of 2 ADSL2+ systems [18]. transmitters with the same sampling rates, same block sizes, Other noise sources (e.g., background noise) are here ig- and same CP and CS as the vectored system. Also, the trans- nored. mission of the alien crosstalk sources must be such that the The bandwidth of the ADSL2+ system is 2.208 MHz, DMT symbol boundaries are synchronized with the received where 512 tones with a 4.3125 kHz spacing are used. The DMT symbols of the vectored system. This special case does power spectral density (PSD) of the T1 transmitter over this not correspond to a practical situation, but it does provide frequency region is shown in Figure 3. some intuition. It is assumed that the ADSL2+ receivers are affected by Assuming that the transmitted samples of the alien cross- NEXT interference originating from the T1 transmitter. Ac- ∗ = E talk are white, there exists E(zkzk ) z,kI. 
(Note that any tual crosstalk measurements were used to calculate the result- transmitter spectrum shaping of the alien crosstalk sources ing effect, which were obtained through Stanford University. can be incorporated in the alien crosstalk coupling.) Then, The measurements were made over 300 m of 26AWG cable. the correlation matrix of the alien crosstalk of user i with that Two sets of coupling measurements are here used, with set of user j becomes 1 representing the coupling between pairs 2 and 6, and set M 2 representing the coupling between pairs 2 and 11 within = c ∗ ∗ c∗ ∗ RAiAj QAi,kPE zkzk P Aj,kQ a single binder. In what follows, it is assumed that set 1 ex- k=1 presses the coupling between the crosstalk source and vec- M tored pair 1, while set 2 expresses the coupling between the = ∗ ∗ ∗ QAi,kE zkzk Aj,kQ (12) crosstalk source and vectored pair 2. The magnitude of the k=1 NEXT coupling is shown in Figure 4. M For the numerical computations that follow, (9)and(10) = E ΛA ΛA∗ z,k i,k j,k , are used. The frequency responses of the crosstalk coupling k=1 are converted to impulse responses and shortened to cap- ΛA = ∗ where i,k QAi,kQ is a diagonal matrix. This expres- ture 99.9% of the impulse energy, thus determining the el- c p p∗ sion shows that for this special case, noise correlation exists ements of Ai,k.TheE(zk zk ) correlation matrix is derived only when the same tone is examined for different users. In by computing the correlation of the transmitted samples of other words, there is no correlation between noise of differ- the crosstalk source. For this computation, the transmit spec- ent users, when that noise corresponds to different tones. trum shaping is modeled through an FIR filter with 81 sam- When the above special case does not hold, noise on some ples. tone of one user may be correlated with noise on another Figure 5 shows a curve representing the magnitude of tone of a second user. 
This is attributed to the effect of FFT the noise variance of pair 1, and five curves representing the spreading, and it is known that it can be controlled through magnitude of the correlation between noise on a specific tone appropriate receiver windowing. and other tones of pair 1. In the five curves representing the G. Ginis and C.-N. Peng 5

−50 Finally, Figure 7 presents a curve with the magnitude of the correlation between noise of pair 1 and of pair 2 for the −55 same tone, and five curves with the magnitude of the corre- −60 lation between noise on a specific tone of pair 1 and noise on tones of pair 2. For the five curves where a specific tone of −65 pair 1 is used as a reference, the tone index is chosen as 100, 200, 300, 400, and 500, respectively. It is observed that, on the −70 one hand, the correlation between different tones of different −75 pairs is relatively weak. On the other hand, the correlation between pairs 1 and 2 for the same tone is significant and −80 comparable to the noise variances of the two pairs. This indi- cates that methods exploiting the noise correlation between −85 Magnitude of frequency response (dB) pairs within the same tone are of interest, while methods for ff −90 noise decorrelation among di erent tones are of much lesser 0 50 100 150 200 250 300 350 400 450 500 use. Tone index Crosstalk coupling for pair 1 3. NOISE DECORRELATION Crosstalk coupling for pair 2 3.1. Multiuser model

Figure 4: Crosstalk coupling for the 2 vectored pairs. Equation (7) is here repeated for convenience,

= Λ  Yi iXi + Ni + Ai. (13) −80

−90 In the above equation, the received frequency-domain sam- ples of user i are given, where vector samples correspond to −100 different tones, and i = 1, ..., L. Since Λi is diagonal, this ex- −110 pression can also be given as

−120 = Λ  Yi,n i,nXi,n + Ni,n + Ai,n, (14) −130

− where n = 1, ..., N is the tone index, and where Yi = 140    Magnitude (dBm/Hz) ··· T = ··· T = ··· T [Yi,1 Yi,N ] , Xi [Xi,1 Xi,N ] , Ni [Ni,1 Ni,N ] , − T 150 Ai = [Ai,1 ···Ai,N ] ,andΛi,n are the diagonal elements of Λ −160 i. Reorganizing (14) into vectors corresponding to a spe- −170 0 50 100 150 200 250 300 350 400 450 500 cific tone, the following expression is obtained: Tone index Zn = TnWn + Nn, n = 1, ..., N, (15) Noise variance across tones Noise correlation between tone 100 and other tones = ··· T = ··· T = Noise correlation between tone 200 and other tones where Zn [Y1,n YL,n] , Wn [X1,n XL,n] ,andNn T Noise correlation between tone 300 and other tones [N1,n ···NL,n] . Also, Tn is a diagonal matrix with elements Noise correlation between tone 400 and other tones Λ1,n, ..., ΛL,n.ThetermNn includes noise components cor- Noise correlation between tone 500 and other tones  responding to both background noise Ni,n and alien crosstalk

Ai,n. In the expression above, RNnNn is in general nondiagonal, Figure 5: Noise correlation for user 1. as explained in Section 2.

correlation, the reference is chosen to be tones 100, 200, 300, 400, and 500, respectively. The noise variance corresponds to the dashed curve, and the other curves can be easily recognized by noting the point where they coincide with the noise variance curve. The main observation from these results is that the noise among tones is indeed correlated to some extent, but that such correlation is relatively weak.

Figure 6 shows similar curves corresponding to user 2. Again, it is observed that noise correlation between tones is relatively weak.

Figure 6: Noise correlation for user 2. (Magnitude in dBm/Hz versus tone index; curves: noise variance across tones, and noise correlation between each of tones 100, 200, 300, 400, and 500 and the other tones.)

3.2. Two-sided coordination

Two-sided coordination for DSL has been proposed in several instances. Reference [1] first showed the concept for DSL and presented transmitter and receiver design methods. In [2], the idea of exploiting noise correlation with two-sided coordination was first analyzed. Also, [13, 15] described an approach for alien crosstalk cancellation for DMT-based systems with two-sided coordination.

The analysis of this section essentially makes use of the methodology of [19], when applied to DSL. Starting from (15), it is noted that the noise correlation matrix has the following Cholesky decomposition:

$$R_{N_nN_n} = G_n R_{E_nE_n} G_n^*, \tag{16}$$

where $G_n$ is a lower-triangular matrix with diagonal elements equal to 1, and $R_{E_nE_n}$ is a diagonal matrix with positive elements.

The singular value decomposition (SVD) of the equivalent channel of tone $n$ after noise decorrelation is expressed as

$$\begin{align}
\tilde{T}_n &= R_{N_nN_n}^{-1/2} T_n \tag{17}\\
&= R_{E_nE_n}^{-1/2} G_n^{-1} T_n \tag{18}\\
&= U_n \Sigma_n V_n^*, \tag{19}
\end{align}$$

where $U_n$ and $V_n$ are unitary matrices, and $\Sigma_n$ is diagonal with positive elements in decreasing order. (Square-root factorization other than Cholesky can also be used to obtain the equivalent channel after noise decorrelation.)

Then, channel diagonalization is achieved by the following operations. At the transmitter,

$$W_n = V_n \tilde{W}_n, \tag{20}$$

where $\tilde{W}_n$ is the input of the operation. At the receiver,

$$\tilde{Z}_n = U_n^* R_{N_nN_n}^{-1/2} Z_n, \tag{21}$$

where $\tilde{Z}_n$ is the output of the operation. Combining (15), (19), (20), and (21),

$$\tilde{Z}_n = \Sigma_n \tilde{W}_n + \tilde{N}_n, \tag{22}$$

where $\tilde{N}_n = U_n^* R_{N_nN_n}^{-1/2} N_n$ and $R_{\tilde{N}_n\tilde{N}_n} = I$, so that the noise samples are no longer correlated. The transmitter and receiver structures are shown in Figure 8.

Next, it is shown that on a per-tone basis, the maximum achievable data-rate sum with this method is equal to the capacity of the channel. The capacity of the vector channel of (15) is defined as the maximum of the mutual information between $Z_n$ and $W_n$ [21]. With $W_n$ having a Gaussian distribution, the mutual information is expressed as

$$\begin{align}
I(W_n; Z_n) &= H(Z_n) - H(Z_n \mid W_n) \notag\\
&= \frac{1}{2}\log_2 \frac{|R_{Z_nZ_n}|}{|R_{N_nN_n}|} \notag\\
&= \frac{1}{2}\log_2 \frac{|T_n R_{W_nW_n} T_n^* + R_{N_nN_n}|}{|R_{N_nN_n}|}, \tag{23}
\end{align}$$

where $R_{Z_nZ_n}$ and $R_{W_nW_n}$ are the correlation matrices of $Z_n$ and $W_n$, respectively.

The transmitter and receiver operations of (20) and (21) represent 1-1 transformations. Therefore, the mutual information of the channel of (22) equals the mutual information of the original channel,

$$I(\tilde{W}_n; \tilde{Z}_n) = I(W_n; Z_n). \tag{24}$$

Finally, the mutual information can be expressed as

$$\begin{align}
I(\tilde{W}_n; \tilde{Z}_n) &= \frac{1}{2}\log_2 \frac{|R_{\tilde{Z}_n\tilde{Z}_n}|}{|R_{\tilde{N}_n\tilde{N}_n}|} \notag\\
&= \frac{1}{2}\log_2 \frac{|\Sigma_n R_{\tilde{W}_n\tilde{W}_n} \Sigma_n^* + R_{\tilde{N}_n\tilde{N}_n}|}{|R_{\tilde{N}_n\tilde{N}_n}|}, \tag{25}
\end{align}$$

where $R_{\tilde{Z}_n\tilde{Z}_n}$ and $R_{\tilde{W}_n\tilde{W}_n}$ are the correlation matrices of $\tilde{Z}_n$ and $\tilde{W}_n$, respectively. By making use of Hadamard's inequality,

$$I(\tilde{W}_n; \tilde{Z}_n) \le \frac{1}{2}\sum_{k=1}^{L} \log_2\left(1 + \rho_{n,k}^2 \mathcal{E}_{n,k}\right), \tag{26}$$

where $\rho_{n,k}$ are the diagonal elements of $\Sigma_n$, and $\mathcal{E}_{n,k}$ are the diagonal elements of $R_{\tilde{W}_n\tilde{W}_n}$. Equality holds when the off-diagonal elements of $R_{\tilde{W}_n\tilde{W}_n}$ equal 0.

Thus, transmission optimization becomes the problem of power allocation to parallel channels, which has the well-known water-filling solution [22]. It is clear from above that two-sided coordination is capable of reaching the maximum achievable data rate.

3.3. Receiver coordination with decision-feedback structure

Alien crosstalk cancellation with receiver coordination was investigated in [3], where both decision-feedback approaches and noise-predictive approaches were proposed to achieve noise decorrelation. A general framework for MIMO receivers was presented in [20] as the generalized DFE (GDFE). The GDFE can be shown to contain as subcases many popular MIMO techniques. The application of the GDFE for

Figure 7: Noise correlation between user 1 and user 2. (Magnitude in dBm/Hz versus tone index; curves: noise correlation between users 1 and 2 for the same tone, and noise correlation between each of tones 100, 200, 300, 400, and 500 of user 1 and the tones of user 2.)
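The noise-decorrelation and SVD steps of (16)-(22) in Section 3.2 can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: the channel values, the noise model (one common interferer plus background noise), and all variable names are illustrative assumptions. It uses the Cholesky factor $C = G_n R_{E_nE_n}^{1/2}$ as the square root of $R_{N_nN_n}$.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 3  # number of coordinated pairs (illustrative assumption)

# Diagonal per-tone channel T_n, hypothetical values
T = np.diag(rng.normal(size=L) + 1j * rng.normal(size=L))

# Hypothetical correlated noise: one common interferer plus background noise,
# giving a Hermitian positive-definite, nondiagonal R_NN
B = rng.normal(size=(L, 1)) + 1j * rng.normal(size=(L, 1))
R_nn = B @ B.conj().T + 0.1 * np.eye(L)

# Cholesky-based square root: R_NN = C C^*, so C^{-1} decorrelates the noise
C = np.linalg.cholesky(R_nn)
C_inv = np.linalg.inv(C)

# SVD of the equivalent channel after noise decorrelation, as in (17)-(19)
U, s, Vh = np.linalg.svd(C_inv @ T)
V = Vh.conj().T


def transmit(w_tilde):
    """Transmitter operation (20): W_n = V_n * W~_n."""
    return V @ w_tilde


def receive(z):
    """Receiver operation (21): Z~_n = U_n^* R_NN^{-1/2} Z_n."""
    return U.conj().T @ C_inv @ z


# Correlation matrix of the noise after the receiver operation
A = U.conj().T @ C_inv
R_tilde = A @ R_nn @ A.conj().T
```

In the noise-free case, `receive(T @ transmit(w))` returns `s * w`, that is, each input sees only its own singular value as in (22), and `R_tilde` is the identity matrix up to rounding.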

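Figure 8: SVD transmission. (Block diagram: in the transmitter for alien crosstalk cancellation, the inputs $\tilde{W}_{n,1}, \ldots, \tilde{W}_{n,L}$ are multiplied by $V_n$ to give $W_{n,1}, \ldots, W_{n,L}$; the channel and time-domain signal processing are represented by $T_n$; in the receiver for alien crosstalk cancellation, $Z_{n,1}, \ldots, Z_{n,L}$ are multiplied by $U_n^* R_{N_nN_n}^{-1/2}$ and the outputs $\tilde{Z}_{n,1}, \ldots, \tilde{Z}_{n,L}$ are passed to the per-pair decoders.)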
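Once the channel has been diagonalized as in Figure 8, the bound (26) is maximized by allocating power over parallel channels with gains $\rho_{n,k}^2$. A minimal sketch of the classical water-filling rule follows. It is an illustration under simplifying assumptions (a single total-power constraint, no SNR gap, no cap on bits per tone), and the function name `water_fill` is hypothetical rather than taken from the paper.

```python
import numpy as np


def water_fill(gains, total_power):
    """Powers E_k >= 0 maximizing sum(log2(1 + gains[k] * E_k))
    subject to sum(E_k) == total_power (classical water-filling)."""
    gains = np.asarray(gains, dtype=float)
    inv = 1.0 / gains                 # "floor" height of each channel
    order = np.argsort(inv)           # strongest channel first
    inv_sorted = inv[order]
    # Drop the weakest channels until the water level exceeds every used floor
    for m in range(len(gains), 0, -1):
        level = (total_power + inv_sorted[:m].sum()) / m
        if level > inv_sorted[m - 1]:
            break
    powers = np.zeros_like(gains)
    powers[order[:m]] = level - inv_sorted[:m]
    return powers


powers = water_fill([4.0, 1.0, 0.1], total_power=1.0)
```

With gains 4, 1, and 0.1 and unit total power, the weakest channel receives no power and the water level settles at 1.125, so the other two channels receive 0.875 and 0.125.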

DSL vectoring was proposed in [4] with the objective of in-domain FEXT cancellation. Also, the DSL multiuser theory given in [11] describes the use of the GDFE for the purpose of noise decorrelation.

The analysis of this section is based on the GDFE formulation. Starting from (15), the feedforward operation of the GDFE is expressed as

$$\begin{align}
Z'_n &= G_n^{-1} Z_n \notag\\
&= G_n^{-1} T_n W_n + E_n, \tag{27}
\end{align}$$

where $Z'_n$ is the output of the operation, $G_n$ is as given in (16), and $E_n = G_n^{-1} N_n$ is the resulting error term, whose correlation matrix $R_{E_nE_n}$ is the diagonal matrix of (16). Although the feedforward operation decorrelates the noise, it introduces interference among the in-domain pairs, since $G_n^{-1} T_n$ is nondiagonal. To eliminate this interference, a feedback operation is applied,

$$\begin{align}
\tilde{Z}_n &= Z'_n - \left(G_n^{-1} T_n - D_n\right) \hat{W}_n \tag{28}\\
&= Z'_n - F_n \hat{W}_n \tag{29}\\
&= T_n W_n + E_n, \tag{30}
\end{align}$$

Figure 9: Generalized decision feedback. (Block diagram: the channel and time-domain signal processing $T_n$ map $W_{n,1}, \ldots, W_{n,L}$ to $Z_{n,1}, \ldots, Z_{n,L}$; the receiver for alien crosstalk cancellation applies $G_n^{-1}$ and then, for each pair $k$, adds the feedback terms formed from the previously decoded symbols weighted by $f_n^{(k,1)}, \ldots, f_n^{(k,k-1)}$ before the decoder.)
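The feedforward and feedback operations (27)-(30) of the decision-feedback receiver in Figure 9 can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the channel values, the noise model, the QPSK slicer passed as `decide`, and the function name `gdfe` are assumptions. The unit-diagonal factor $G_n$ of (16) is obtained by rescaling the columns of the ordinary Cholesky factor.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 3
T = np.diag([1.0, 0.8, 0.5]).astype(complex)    # hypothetical diagonal channel T_n

# Hypothetical correlated noise covariance, factored as R_NN = G R_EE G^* (16)
B = rng.normal(size=(L, 2)) + 1j * rng.normal(size=(L, 2))
R_nn = B @ B.conj().T + 0.05 * np.eye(L)
C = np.linalg.cholesky(R_nn)                     # R_NN = C C^*
d = np.real(np.diag(C))
G = C / d                                        # divide column k by C[k, k]: unit diagonal
R_ee = np.diag(d ** 2)                           # diagonal, positive elements

G_inv = np.linalg.inv(G)
GT = G_inv @ T
F = GT - np.diag(np.diag(GT))                    # strictly lower-triangular F_n of (29)


def gdfe(z, decide):
    """Feedforward (27), then successive feedback (28)-(30), pair by pair."""
    z_ff = G_inv @ z
    w_hat = np.zeros(L, dtype=complex)
    z_out = np.zeros(L, dtype=complex)
    for k in range(L):
        # Remove interference from the already-decoded pairs 0..k-1
        z_out[k] = z_ff[k] - F[k, :k] @ w_hat[:k]
        w_hat[k] = decide(z_out[k] / T[k, k])
    return z_out, w_hat


def qpsk_slicer(x):
    """Hypothetical decision device for symbols in {+1, -1} + j{+1, -1}."""
    return np.sign(x.real) + 1j * np.sign(x.imag)
```

When all decisions are correct, the output equals $T_n W_n + E_n$ as in (30), so each pair sees only its own direct channel and an error term that is uncorrelated with those of the other pairs.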

In (28)-(30), $\tilde{Z}_n$ is the output of the operation, and $D_n$ is a diagonal matrix whose elements are the diagonal elements of $G_n^{-1} T_n$. The vector $\hat{W}_n$ represents the decoded symbols, which are here assumed to be always correct.

The transmitter and receiver structures are shown in Figure 9.

Next, the mutual information is computed. The mutual information of the original channel can be expressed as

$$\begin{align}
I(W_n; Z_n) &= I(W_n; G_n Z'_n) \tag{31}\\
&= I(W_n; Z'_n) \tag{32}\\
&= H(Z'_n) - H(Z'_n \mid W_n) \tag{33}\\
&= H(Z'_n) - H(Z'_n - F_n W_n \mid W_n) \tag{34}\\
&= I(W_n; Z'_n - F_n W_n) \tag{35}\\
&= I(W_n; \tilde{Z}_n), \tag{36}
\end{align}$$

where (31) holds because multiplication by $G_n$ represents a 1-1 transformation, and (34) is obvious by the definition of conditional entropy [21]. The above formulas show that the GDFE scheme does not reduce the mutual information of the channel.

Finally, a simple expression is given for the mutual information based on (30):

$$\begin{align}
I(W_n; \tilde{Z}_n) &= \frac{1}{2}\log_2 \frac{|T_n R_{W_nW_n} T_n^* + R_{E_nE_n}|}{|R_{E_nE_n}|} \tag{37}\\
&\le \frac{1}{2}\sum_{k=1}^{L} \log_2\left(1 + \frac{|\tau_{n,k}|^2 \mathcal{E}_{n,k}}{\sigma_{n,k}^2}\right), \tag{38}
\end{align}$$

where $\tau_{n,k}$ are the diagonal elements of $T_n$, $\mathcal{E}_{n,k}$ are the diagonal elements of $R_{W_nW_n}$, and $\sigma_{n,k}^2$ are the diagonal elements of $R_{E_nE_n}$. The inequality relation follows from Hadamard's inequality, and equality holds when the off-diagonal elements of $R_{W_nW_n}$ equal 0. Thus, the maximization of the sum capacity can be achieved by solving the power allocation problem on parallel channels.

A note on the decoding order of the vectored pairs can be made. It can be observed that changing the decoding order is equivalent to premultiplying $W_n$ and $Z_n$ by a permutation matrix. Such an operation indeed changes the values of $\tau_{n,k}$ and $\sigma_{n,k}$; however, it is easy to see that the mutual information as defined above is not affected. This implies that the decoding order affects the rate of each pair but not the highest possible sum rate.

3.4. Receiver coordination with noise-prediction structure

An alternative to the decision-feedback structure is next described. Instead of decorrelating the noise with the feedforward section and then cancelling the interference by subtracting weighted estimates of the symbols, the receiver noise can be directly decorrelated by estimating and subtracting the error of the decoder. This is similar to the well-known concept of a noise-predictive decision-feedback equalizer [23].

From (16), the noise vector is expressed as

$$N_n = G_n E_n, \tag{39}$$

where $E_n$ is called the innovations vector. The lower-triangular property of the $G_n$ matrix naturally leads to the following noise decorrelation procedure. First, $N_{n,1} = E_{n,1}$, which is easily found by subtracting the decoder output of pair 1 from the decoder input of pair 1. Then,

$$N_{n,2} = E_{n,2} + g_n^{(2,1)} E_{n,1} \iff E_{n,2} = N_{n,2} - g_n^{(2,1)} E_{n,1}, \tag{40}$$

where $g_n^{(2,1)}$ is the element of $G_n$ at row 2 and column 1. So, the noise term of pair 2 can be decorrelated by subtracting $g_n^{(2,1)} E_{n,1}$ from the received signal of pair 2. Then, $E_{n,2}$ is estimated by subtracting the decoder output of pair 2 from

the decoder input of pair 2. This process continues for the remaining pairs, where in each iteration the previously estimated errors $E_{n,1}, \ldots, E_{n,k-1}$ are weighted and subtracted from the received signal of pair $k$.

Starting with (15), the noise-prediction operation is expressed as

$$\begin{align}
\tilde{Z}_n &= Z_n + (I - G_n) E_n \tag{41}\\
&= T_n W_n + N_n + (I - G_n) E_n \tag{42}\\
&= T_n W_n + E_n, \tag{43}
\end{align}$$

where it is seen that the resulting noise term is uncorrelated, and $T_n$, $E_n$ are the same as in (30).

The transmitter and receiver structures are shown in Figure 10.

An interesting advantage of this scheme is the lack of the feedforward section. It should be noted that there are several possibilities for the computation of the error $E_n$. The DSL decoder consists of multiple stages, and the error can be computed based on the output of any of these stages.

For the purpose of gaining intuition, an alternative approach is here used to arrive at (38). Using the chain rule, the mutual information of the channel is expressed as

$$I(W_n; Z_n) = \sum_{k=1}^{L} I(W_{n,k}; Z_n \mid W_{n,1}, \ldots, W_{n,k-1}). \tag{45}$$

A second use of the chain rule on each individual term of the above sum yields

$$I(W_{n,k}; Z_n \mid W_{n,1}, \ldots, W_{n,k-1}) = \sum_{m=1}^{L} I(W_{n,k}; Z_{n,m} \mid W_{n,1}, \ldots, W_{n,k-1}, Z_{n,1}, \ldots, Z_{n,m-1}). \tag{46}$$

The terms of this sum are next individually investigated. For m

Figure 10: Error whitening. (Block diagram: the channel and time-domain signal processing $T_n$ map $W_{n,1}, \ldots, W_{n,L}$ to $Z_{n,1}, \ldots, Z_{n,L}$; in the receiver for alien crosstalk cancellation, the decoder output of each pair $k$ is scaled by $\tau_{n,k}$ and subtracted from the decoder input to form the error estimate, and the error estimates of pairs $1, \ldots, k-1$, weighted by $g_n^{(k,1)}, \ldots, g_n^{(k,k-1)}$, are subtracted from $Z_{n,k}$ before decoding.)
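The error-whitening receiver of Figure 10, that is, the successive procedure of (39)-(43), can be sketched as follows. This is a minimal illustration under the assumption of correct decisions and a known $G_n$; the function name, the example values, and the QPSK slicer are hypothetical and not taken from the paper.

```python
import numpy as np


def noise_prediction_receiver(z, tau, G, decide):
    """Successively estimate the innovations E_{n,k} and subtract their
    weighted sum from the received signal of the next pair, as in (40)-(43).

    z    : received per-tone vector Z_n (length L)
    tau  : diagonal elements of T_n
    G    : unit-diagonal lower-triangular factor G_n of (16)
    decide : decision device mapping a scaled sample to a symbol
    """
    num_pairs = len(z)
    e_hat = np.zeros(num_pairs, dtype=complex)
    w_hat = np.zeros(num_pairs, dtype=complex)
    for k in range(num_pairs):
        # Decoder input: received sample minus the predicted noise
        # sum_{m<k} g^{(k,m)} E_{n,m}
        z_in = z[k] - G[k, :k] @ e_hat[:k]
        w_hat[k] = decide(z_in / tau[k])
        # Error estimate: decoder input minus the re-scaled decoder output
        e_hat[k] = z_in - tau[k] * w_hat[k]
    return w_hat, e_hat


def qpsk_slicer(x):
    """Hypothetical decision device for symbols in {+1, -1} + j{+1, -1}."""
    return np.sign(x.real) + 1j * np.sign(x.imag)


# Illustrative values (assumptions, not measured data)
tau = np.array([1.0, 0.8, 0.5], dtype=complex)
G = np.array([[1, 0, 0],
              [0.5, 1, 0],
              [-0.3 + 0.2j, 0.4, 1]], dtype=complex)
w = np.array([1 + 1j, -1 + 1j, 1 - 1j])
e = np.array([0.02 - 0.01j, -0.03 + 0.02j, 0.01 + 0.04j])
z = tau * w + G @ e                  # Z_n = T_n W_n + N_n with N_n = G_n E_n
w_hat, e_hat = noise_prediction_receiver(z, tau, G, qpsk_slicer)
```

Unlike the decision-feedback structure, no feedforward matrix is applied to the received vector; only the strictly lower-triangular part of $G_n$ is used, which reflects the advantage noted in the text.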

about $E_{n,m}$. For $m > k$,

$$\begin{align}
& I(W_{n,k}; Z_{n,m} \mid W_{n,1}, \ldots, W_{n,k-1}, Z_{n,1}, \ldots, Z_{n,m-1}) \tag{51}\\
&\quad = H(W_{n,k} \mid W_{n,1}, \ldots, W_{n,k-1}, Z_{n,1}, \ldots, Z_{n,m-1}) \notag\\
&\qquad - H(W_{n,k} \mid W_{n,1}, \ldots, W_{n,k-1}, Z_{n,1}, \ldots, Z_{n,m-1}, Z_{n,m}) \tag{52}\\
&\quad = 0, \tag{53}
\end{align}$$

where (53) holds because the conditioning on $Z_{n,m}$ offers no additional information on $W_{n,k}$. Then, for $m = k$,

$$\begin{align}
& I(W_{n,k}; Z_{n,k} \mid W_{n,1}, \ldots, W_{n,k-1}, Z_{n,1}, \ldots, Z_{n,k-1}) \tag{54}\\
&\quad = I\Big(W_{n,k}; Z_{n,k} - \sum_{m=1}^{k-1} f_n^{(k,m)} E_{n,m} \,\Big|\, W_{n,1}, \ldots, W_{n,k-1}, Z_{n,1}, \ldots, Z_{n,k-1}\Big) \tag{55}\\
&\quad = I(W_{n,k}; \tilde{Z}_{n,k}), \tag{56}
\end{align}$$

where $f_n^{(k,m)}$ is the element of $F_n$ on row $k$ and column $m$, and (55) follows from the fact that the conditioning fully determines $E_{n,m}$ for $m = 1, \ldots, k-1$. Thus, it is found that

$$\begin{align}
I(W_n; Z_n) &= \sum_{k=1}^{L} I(W_{n,k}; \tilde{Z}_{n,k}) \notag\\
&\le \frac{1}{2}\sum_{k=1}^{L} \log_2\left(1 + \frac{|\tau_{n,k}|^2 \mathcal{E}_{n,k}}{\sigma_{n,k}^2}\right). \tag{57}
\end{align}$$

Again, it is noted that changing the decoding order affects the rate of each individual pair, but does not change the sum rate.

4. SIMULATION RESULTS

Figure 11: FEXT coupling for the 2 vectored pairs at 300 m. (Magnitude of frequency response in dB versus tone index; curves: FEXT coupling for pair 1 at 300 m, FEXT coupling for pair 2 at 300 m.)

Table 1: Simulation parameters.

| Parameter | Value |
| --- | --- |
| Number of DMT tones | 512 |
| Tone width | 4.3125 kHz |
| Symbol rate | 4 kHz |
| Coding gain | 6 dB |
| Noise margin | 6 dB |
| SNR gap | 9.8 dB |
| Maximum power | 20.4 dBmW |
| Cable type | 26-Gauge |
| Source/load impedance | 100 Ohm |
| Max information bits per tone | 14 |
| Background noise | −140 dBmW/Hz |
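The conversion from per-tone SNR to bits per tone and to a data-rate estimate, as used in frequency-domain simulations of this kind, can be sketched with the Table 1 parameters. Two points are assumptions of this sketch rather than statements of the paper: the effective gap is taken as SNR gap plus noise margin minus coding gain (the conventional gap approximation), and the number of bits per tone is rounded down to an integer.

```python
import math

# Values from Table 1
SNR_GAP_DB = 9.8
NOISE_MARGIN_DB = 6.0
CODING_GAIN_DB = 6.0
MAX_BITS_PER_TONE = 14
SYMBOL_RATE_HZ = 4000


def bits_per_tone(snr_db):
    """Integer bits carried by one tone under the gap approximation (assumed)."""
    gap_db = SNR_GAP_DB + NOISE_MARGIN_DB - CODING_GAIN_DB
    snr_over_gap = 10 ** ((snr_db - gap_db) / 10)
    bits = math.floor(math.log2(1 + snr_over_gap))
    return min(bits, MAX_BITS_PER_TONE)


def data_rate_bps(snr_db_per_tone):
    """Data-rate estimate: symbol rate times the total number of bits per symbol."""
    return SYMBOL_RATE_HZ * sum(bits_per_tone(s) for s in snr_db_per_tone)


# Three hypothetical tones: moderate SNR, very high SNR (hits the cap), unusable
rate = data_rate_bps([20.0, 60.0, -20.0])
```

For the three hypothetical tones above, the loads are 3, 14 (capped), and 0 bits, giving 68000 bit/s at the 4 kHz symbol rate. The cap explains why gains from noise decorrelation shrink on short loops, where many tones already carry the maximum number of bits.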

In this section, simulation results are shown to demonstrate the performance benefits that can be realized with alien noise cancellation. These are generated by "frequency-domain" simulations, where the SNR is computed for each tone by using the knowledge of the channel and of the transmitted PSD, and by evaluating the receiver noise level. The SNR per tone is then converted to the number of bits per tone, which leads to an estimate of the achievable data rate. In the following, the simulation assumes the error-whitening structure of Figure 10. But, as noted in Section 3, the maximum achievable data-rate sum over all vectored pairs is the same regardless of the specific noise decorrelation method.

Similarly to the example of Section 2.4, downstream communication is considered, where the vectored system consists of two pairs and there is a single T1 disturber (upstream and downstream) that causes interference. Again, the vectored pairs employ ADSL2+ for communication and use the PSD mask of Annex A [18]. The NEXT coupling coefficients and the T1 PSD are the same as in the example. Additionally, FEXT coupling from the T1 disturber to the pairs is taken into account. The FEXT coupling was obtained from measurements obtained with the same setup as that for the NEXT measurements. The magnitude of the FEXT coupling is shown in Figure 11. Since these measurements were obtained with a 300 m loop, the FEXT coupling was normalized for other loop lengths using the well-known rule according to which the FEXT PSD is proportional to the loop length and to the squared magnitude of the channel transfer function. Note that no in-domain crosstalk has been included in the simulation. The rest of the simulation assumptions are shown in Table 1.

Figure 12 shows the noise PSDs of the two receivers for a 3 km loop. The noise of pair 1 is unaffected by the noise decorrelation algorithm. The noise of pair 2 is significantly mitigated after the application of the algorithm. Note that the frequencies where the noise of pair 2 is not greatly reduced correspond to those frequencies where pairs 1 and 2 have weak noise correlation.

Next, Figure 13 shows the bit distributions of the two receivers for the same case. As expected, the bit distribution of pair 1 is unchanged, but pair 2 is capable of transmitting

more bits per tone after noise decorrelation as a result of the higher receiver SNR.

Finally, Figure 14 shows the rate-reach curves of the 2 pairs and also the sum rate for loops between 1 and 4 km. Of course, the rate-reach curve of pair 1 is the same before and after noise decorrelation, but the rate increase of pair 2 is significant over almost all loop lengths. In the shortest loops, the improvements become smaller mainly because of the cap on the number of bits that can be transported on each tone.

Performance gains over a wide range of loops have also been observed with other NEXT interferers that affect a significant portion of the downstream or upstream band. For FEXT interferers (e.g., FEXT from a similar system), the gains are considerable primarily in short enough loops, where FEXT is the dominant noise source.

Figure 12: Noise PSD before and after noise decorrelation with 3000 m loop. (Noise PSD in dBm/Hz versus tone index; curves: pair 1 before/after decorrelation, pair 2 before decorrelation, pair 2 after decorrelation.)

Figure 13: Bit distributions before and after noise decorrelation with 3000 m loop. (Number of bits per tone versus tone index; curves: pair 1 before/after decorrelation, pair 2 before decorrelation, pair 2 after decorrelation.)

Figure 14: Rate-reach curves before and after noise decorrelation. (Downstream data rate in Mbps versus loop length in m; curves: pair 1 before/after decorrelation, pair 2 before decorrelation, pair 2 after decorrelation, sum rate before decorrelation, sum rate after decorrelation.)

5. CONCLUSIONS

This paper gave an overview of the subject of alien crosstalk cancellation for DSL systems employing multiple pairs. First, it was shown that a common interference source leads to noise correlation among the victim DSL pairs, and that such correlation is strongest within the same tones. Then, specific methods were presented to perform crosstalk cancellation by exploiting the noise correlation property. It was theoretically shown that the methods have equivalent performance with respect to the sum rate over the pairs. Last, performance simulation results were presented to illustrate the potential benefits.

It is worth making some additional comments about the practical application of noise decorrelation. Although not presented in this paper, the noise decorrelation methods require the existence of additional algorithms for initialization (before the DSL link is established) and for adaptation (after the DSL link is operational). The initialization algorithms aim at producing estimates of the noise correlation, which was simply assumed to be known in advance in this paper. The adaptation (or updating) algorithms are needed since it is generally expected that crosstalk interference varies over time, either due to slow variations of the crosstalk coupling, or due to crosstalk sources becoming active or inactive.
the cap on the number of bits that can be transported on each Finally, the results presented in [24] demonstrated that tone. in-domain crosstalk cancellation can be simplified by careful 12 EURASIP Journal on Applied Signal Processing choice of the tones and pairs over which joint signal process- [14] K. Schneider and L. Sandstrom, “MIMO vs. SISO capacity on ing is applied. This previous conclusion indicates that similar twisted pair loops,” T1E1.4 committee, contribution 2002-259, selection methods can also be applied for alien crosstalk can- November 2002. cellation, thus offering the possibility of complexity reduc- [15] M. Tsatsanis, M. A. Erickson, and S. Shah, “Multiline trans- tion. mission in communication systems,” International application published under the Patent Cooperation Treaty, international application number PCT/US03/18004. ACKNOWLEDGMENT [16] F. Sjoberg,¨ M. Isaksson, R. Nilsson, P. Odling,¨ S. K. Wilson, and The authors would like to thank Bin Lee of Stanford Univer- P. O. B orjesson,¨ “Zipper: a duplex method for VDSL based on DMT,” IEEE Transactions on Communications,vol.47,no.8, sity for providing the crosstalk measurements. pp. 1245–1252, 1999. [17] ITU-T Recommendation G.993.1, Very high speed digital sub- REFERENCES scriber line, June 2004. [18] ITU-T Recommendation G.992.5, Asymmetrical Digital Sub- [1] M. L. Honig, K. Steiglitz, and B. Gopinath, “Multichannel scriber Line (ADSL) transceivers—Extended bandwidth ADSL2 signal processing for data communications in the presence (ADSL2plus+), May 2003. of crosstalk,” IEEE Transactions on Communications, vol. 38, [19] G. G. Raleigh and J. M. Cioffi, “Spatio-temporal coding for no. 4, pp. 551–558, 1990. wireless communication,” IEEE Transactions on Communica- [2] J. W. Lechleider, “Coordinated transmission for two-pair digi- tions, vol. 46, no. 3, pp. 357–366, 1998. tal subscriber lines,” IEEE Journal on Selected Areas in Commu- [20] J. M. Cioffi and G. D. 
Forney Jr., “Generalized decision- nications, vol. 9, no. 6, pp. 920–930, 1991. feedback equalization for packet transmission with ISI and [3] J. Huber and R. Fischer, “Dynamically coordinated reception Gaussian noise,” in Communications, Computation, Control, of multiple signals in correlated noise,” in Proceedings of IEEE and Signal Processing: A Tribute to , A. Paulraj, International Symposium on Information Theory (ISIT ’94),pp. V. P. Roychowdhury, and C. D. Schaper, Eds., chapter 4, pp. 132–132, Trondheim, Norway, June–July 1994. 79–127, Kluwer Academic, Boston, Mass, USA, 1997. [4] G. Ginis and J. M. Cioffi, “Vectored-DMT: a FEXT canceling [21] T. M. Cover and J. A. Thomas, Elements of Information Theory, modulation scheme for coordinating users,” in Proceedings of John Wiley & Sons, New York, NY, USA, 1991. IEEE International Conference on Communications (ICC ’01), [22] S. Kasturia, J. T. Aslanis, and J. M. Cioffi, “Vector coding for vol. 1, pp. 305–309, Helsinki, Finland, June 2001. partial response channels,” IEEE Transactions on Information [5] G. Ginis and J. M. Cioffi, “Vectored transmission for digi- Theory, vol. 36, no. 4, pp. 741–762, 1990. tal subscriber line systems,” IEEE Journal on Selected Areas in [23] J. G. Proakis, Digital Communications, McGraw-Hill, New Communications, vol. 20, no. 5, pp. 1085–1104, 2002. York, NY, USA, 3rd edition, 1995. [6] G. Taubock¨ and W. Henkel, “MIMO systems in the subscriber- [24] R. Cendrillon, M. Moonen, G. Ginis, K. Van Acker, T. Bostoen, line network,” in Proceedings of 5th International OFDM- and P. Vandaele, “Partial crosstalk cancellation exploiting line Workshop (InOWo ’00), pp. 18.1–18.3, Hamburg, Germany, and tone selection in upstream VDSL,” in Proceedings of September 2000. 6th Baiona Workshop on Signal Processing in Communications [7] R. Cendrillon, G. Ginis, M. Moonen, K. Van Acker, T. Bostoen, (Baiona ’03), Baiona, Spain, September 2003. and P. 
Vandaele, “Partial crosstalk precompensation in down- stream VDSL,” Signal Processing, vol. 84, no. 11, pp. 2005– 2019, 2004. George Ginis receivedtheDiplomainelec- [8] R. Cendrillon, M. Moonen, G. Ginis, K. Van Acker, T. trical and computer engineering from the Bostoen, and P. Vandaele, “Partial crosstalk cancellation for National Technical University of Athens, upstream VDSL,” EURASIP Journal on Applied Signal Process- Athens, Greece, in 1997, and the M.S. ing, vol. 2004, no. 10, pp. 1520–1535, 2004. and Ph.D. degrees in electrical engineering [9] R. Cendrillon, M. Moonen, E. Van den Bogaert, and G. Ginis, from Stanford University, Stanford, Calif, “The linear zero-forcing crosstalk canceller is near-optimal in in 1998 and 2002, respectively. He is cur- DSL channels,” in Proceedings of IEEE Global Telecommuni- rently a Member Group Technical Staff with cations Conference (GLOBECOM ’04), vol. 4, pp. 2334–2338, DSP Systems of Texas Instruments Incopo- Dallas, Tex, USA, November–December 2004. rated. His research interests include mul- [10] R. Cendrillon, G. Ginis, M. Moonen, J. Verlinden, and T. tiuser transmission theory, interference mitigation, and their ap- Bostoen, “Improved linear crosstalk precompensation for plications to wireline and wireless communications. DSL,” in Proceedings of IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP ’04), vol. 4, pp. Chia-Ning Peng was born in Taipei, Tai- 1053–1056, Montreal, Quebec, Canada, May 2004. wan, in 1968. He received his B.S. degree in [11] T. Starr, M. Sorbara, J. M. Cioffi, and P. J. Silverman, DSL Ad- electrical engineering from National Taiwan vances, Prentice-Hall, Englewood Cliffs, NJ, USA, 2003. University in 1990 and the M.S. and Ph.D. [12] T. Magesacher, P. 
Odling,P.O.B¨ orjesson,¨ et al., “On the ca- degrees in electrical engineering from the pacity of the copper cable channel using the common mode,” University of Michigan, Ann Arbor, Mich, in Proceedings of IEEE Global Telecommunications Conference USA, in 1994 and 1998, respectively. His (GLOBECOM ’02), vol. 2, pp. 1269–1273, Taipei, Taiwan, research interests include error-correction Novemeber 2002. codes, digital communications theory and [13] M. Tsatsanis, “Vectoring techniques for multi-line 10MDSL application, and applied signal processing. systems,” T1E1.4 committee, contribution 2002-196, August He now works for Texas Instruments Incorporated in the area of 2002. residential gateways. Hindawi Publishing Corporation EURASIP Journal on Applied Signal Processing Volume 2006, Article ID 85859, Pages 1–9 DOI 10.1155/ASP/2006/85859

Crosstalk Models for Short VDSL2 Lines from Measured 30 MHz Data

E. Karipidis,1 N. Sidiropoulos,1 A. Leshem,2 Li Youming,2 R. Tarafi,3 and M. Ouzzif3

1 Department of Electronic and Computer Engineering, Technical University of Crete, 73100 Chania, Crete, Greece
2 Department of Electrical Engineering, Bar-Ilan University, 52900 Ramat-Gan, Israel
3 France Telecom R&D, 22307 Lannion, France

Received 30 November 2004; Revised 25 April 2005; Accepted 2 August 2005

In recent years, there has been a growing interest in hybrid fiber-copper access solutions, as in fiber to the basement (FTTB) and fiber to the curb/cabinet (FTTC). The twisted pair segment in these architectures is in the range of a few hundred meters, thus supporting transmission over tens of MHz. This paper provides crosstalk models derived from measured data for quad cable, lengths between 75 and 590 meters, and frequencies up to 30 MHz. The results indicate that the log-normal statistical model (with a simple parametric law for the frequency-dependent mean) fits well up to 30 MHz for both FEXT and NEXT. This extends earlier log-normal statistical modeling and validation results for NEXT over bandwidths in the order of a few MHz. The fitted crosstalk power spectra are useful for modem design and simulation. Insertion loss, phase, and impulse response duration characteristics of the direct channels are also provided.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION log-normal model for the marginal distribution of both NEXT and FEXT is validated, extending earlier results [3, 4]. Hybrid fiber-copper access solutions, such as fiber to the Finally certain key fitted model parameters are provided, basement (FTTB) and fiber to the curb/cabinet (FTTC), en- which are important for system development and service tail twisted pair segments in the order of a few hundred provisioning. meters—thus supporting transmission over up to 30 MHz. The rest of this paper is structured as follows. Section 2 Very-high bit-rate digital subscriber line (VDSL) and the provides a concise description of the measurement process emerging VDSL2 draft are the pertinent high-speed trans- and associated apparatus, while Section 3 reviews the ba- mission modalities for these lengths. This scenario is very sic parametric models for IL, NEXT, and FEXT. Section 4 different from the typical asymmetric digital subscriber line presents the main results: fitted models for the crosstalk spec- (ADSL) or high bit-rate digital subscriber line (HDSL) envi- tra plus model validation (Sections 4.1, 4.2). Section 4 also ronment. For the shortest loops, for example, the shape of the provides useful data regarding IL (Section 4.3), and the phase far-end crosstalk (FEXT) power spectrum can be expected and essential duration of the direct channels (Sections 4.4, to be similar to the shape of the near-end crosstalk (NEXT) 4.5). Conclusions are drawn in Section 5. power spectrum; while it is a priori unclear that NEXT and FEXT models [3, 4] developed and fitted to ADSL/HDSL bandwidths, will hold up over a much wider bandwidth. 2. 
DESCRIPTION OF THE CHANNEL This paper describes the results of an extensive channel MEASUREMENT PROCESS AND APPARATUS measurement campaign conducted by France Telecom R&D, and associated data analysis undertaken by the authors in or- IL, NEXT, and FEXT were measured for different lengths of der to better understand the properties of these very short 0.4 mm gauge S88.28.4 cable, comprising 14 quads (14 × 2 = copper channels. A large number of FEXT, NEXT, and in- 28 loops) [7]. The measured lengths were 75, 150, 300, and sertion loss (IL) channels were measured and analyzed, for 590 meters. A network analyzer (NA) was employed in the lengths ranging from 75 to 590 meters and bandwidth up measurement process. A power splitter was used to inject half to 30 MHz. The main contribution is three-fold. First, the of the source power to the cable, while the other half was simple parametric models in [3] are tested and validated diverted to the reference input R of the NA. The output of over the target lengths and range of frequencies. Second, the the measured channel was connected to input A of the NA, 2 EURASIP Journal on Applied Signal Processing and the ratio A/R was recorded. When measuring crosstalk 3.1. Insertion loss between pairs i and j,pairsi and j were terminated using 120 ohm resistances; all other pairs in the binder were left The magnitude squared of insertion loss obeys a simple para- open-circuit. metric model [3]   √ An impedance transformer (balun) was used to connect HIL( f , l)2 = e−2αl f ,(1) the measured pair with the measurement device. The refer- ence for the baluns is North Hills 0302BB (10 kHz–60 MHz), where f is the frequency in Hz, l is the length of the channel, except for FEXT and IL for 300 and 590 meters, for which and α is a constant. In dB, the reference is North Hills 413BF (100 kHz–100 MHz). 
Prior    HIL f l  = β l f to taking actual measurements, a calibration procedure was 20 log10 ( , ) ( ) ,(2) employed to offset the combined effect of the baluns and the wherewehavedefinedβ(l) =−20αl log (e). coaxial cables. 10 Three different network analyzers were used, depending on cable length. 3.2. NEXT

= NEXT can be modeled as [3, 4] (i) 75 meters. HP8753ES, resolution bandwidth 20 Hz.    2 / (ii) 150 meters. HP8751A, resolution bandwidth = 20 Hz. HN( f ) = Kf3 2,(3) (iii) 300 and 590 meters. HP4395A, resolution bandwidth where K is a log-normal random variable. In dB, = 100 Hz.   HN f  = K f 20 log10 ( ) 10 log10( ) + 15 log10( ), (4) For all the measurements, the setup was as follows. K where now 10 log10( ) is a normal random variable. It |HN f | (i) Source power = 15 dBm. follows that 20 log10 ( ) is a normal variable, with (ii) Start frequency = 10 kHz. frequency-dependent mean. Lin [6] has shown that 10 log (K) can be better mod- = 10 (iii) Stop frequency 30 MHz. eled by a gamma distribution, under certain conditions. In (iv) Number of points = 801. particular, a gamma distribution can better fit the tails of the (v) Frequency sweep scale = logarithmic. empirical distribution. On the other hand, the normal distri- bution is simpler and widely used in this context, because it Fifteen dBm was the maximum source power available  in the fits quite well. 28 = lab. For each measured length, all possible (i.e., 2 378) crosstalk channels in the binder were actually measured. In 3.3. FEXT addition to NEXT and FEXT, IL and phase for the 28 direct channels were also measured. FEXT can be modeled as [3]     Due to the fact that measurements were taken in log- HF( f , l)2 = K(l) f 2HIL( f , l)2,(5) arithmic frequency scale, there was a need to interpolate the measured data over a linear frequency scale. For each where K(l) is a log-normal random variable, which now de- measured channel, shape-preserving piecewise cubic (Her- pends on length, l. In dB and using (2), mite) interpolation of the log-scale amplitude of the fre-      20 log HF ( f , l) = 10 log K(l) + β(l) f quency samples was used, to obtain 6955 equispaced fre- 10 10 (6) quency samples (spacing = 4.3125 kHz) from the 801 mea- f +20log10( ), sured log-scale frequency samples. 
The choice of frequency K l sweep scale (linear versus logarithmic) hinges on a number where now 10 log10( ( )) is a normal random variable, |HF f l | of factors. A logarithmic scale packs higher sample density and thus 20 log10 ( , ) is a normal variable too, with in the lower frequencies, wherein NEXT and FEXT typically frequency-dependent mean. exhibit faster variation with frequency, and can be relatively close to the measurement error floor. In this case, a loga- 4. RESULTS rithmic frequency sweep naturally yields more reliable inter- polated channel estimates in the lower frequencies. On the 4.1. Fitted cross-spectra and log-normal other hand, this comes at the expense of lower sample den- model validation sity in the higher frequencies. Results for NEXT are presented first; FEXT follows, in or- der of increasing loop length. The NEXT power spectrum 3. MODELING OF COPPER CHANNELS is approximately independent of loop length for the lengths considered,1 as can be verified from the fitted parameter in A good overview of twisted pair channel models can be found in [3] (see also [4–6]). A summary of the most pertinent facts follows. 1 NEXT generally depends on loop length, see [1]. E. Karipidis et al. 3

Figure 1: Measured mean power and fitted model for NEXT, 300 m (mean std = 9.5 dB). [Plot: power (dB) versus frequency (MHz); measured mean power and fitted model $-158.4 + 15\log_{10}(f)$.]

Figure 3: Histogram of the mean-centered power for NEXT, 300 m. [Plot: probability versus (dB); measured histogram and fitted Gaussian model.]

Figure 2: Deviation from Gaussian pdf for NEXT, 300 m. [Normal probability plot: probability versus (dB); for Gaussian, plot should be a straight line.]

Figure 12. For brevity, detailed plots are therefore only provided for 300 meter NEXT. There are two plots per channel type and length considered. The first shows the measured mean log-power of all available channels of the given type, and the associated fitted model, as a function of frequency. As per Section 3, we use the following parametric model for the mean NEXT log-power:

$$E\left[20\log_{10}|H_N(f)|\right] \approx c_1 + 15\log_{10}(f), \tag{7}$$

where $c_1 = E[10\log_{10}(K)]$. The parameter $c_1$ is fitted to the model as follows. First, $E[20\log_{10}|H_N(f)|]$ is replaced by its sample estimate, $\mu_s(f)$. Then, the sought parameter is fitted to $\mu_s(f)$ in a least-squares (LS) sense. That is, $c_1$ is chosen to minimize

$$\sum_f \left(\mu_s(f) - \left(c_1 + 15\log_{10}(f)\right)\right)^2, \tag{8}$$

yielding $c_1$ equal to the mean of $\mu_s(f) - 15\log_{10}(f)$. The situation is similar for FEXT, except that this time the parametric mean regression model is

$$E\left[20\log_{10}|H_F(f,l)|\right] \approx c_1(l) + c_2(l) f + 20\log_{10}(f), \tag{9}$$

where $c_1(l) = E[10\log_{10}(K(l))]$ is now length-dependent, and $c_2(l) \equiv \beta(l)$, as per the associated discussion in Section 3. Fitting the two parameters is a standard linear LS problem. The fitted curve is plotted along with $\mu_s(f)$ in the first of each pair of plots corresponding to each type of channel. The standard deviation (std) of the channel's log-power response is found to be approximately constant over the entire 30 MHz frequency band; its average value is reported in the caption of the respective mean power plot.

After frequency-dependent mean removal ("centering" or "detrending") using the fitted parametric model, the residual frequency samples should behave like zero-mean normal random variables, if the log-normal model of the marginal distribution is correct. In the second plot of each pair, the validity of this assumption is assessed, by a so-called normal probability plot, which is produced using Matlab's normplot routine. The purpose of a normal probability plot is to graphically assess whether the data could come from a normal distribution. If so, the normal probability plot should be linear. Other distributions will introduce curvature in the plot. The normal probability plot helps in assessing deviations from normality, especially in the tails of the distribution. For 300 m NEXT, a third figure has been included showing a histogram of the mean-centered log-power responses, accumulated across all channels of the given type

Figure 4: Measured mean power and fitted model for FEXT, 75 m (mean std = 9 dB). [Plot: power (dB) versus frequency (MHz); measured mean power and fitted model $-195.2 + 20\log_{10}(f) - 0.0015 f$.]

Figure 5: Deviation from Gaussian pdf for FEXT, 75 m. [Normal probability plot: probability versus (dB); for Gaussian, plot should be a straight line.]

Figure 6: Measured mean power and fitted model for FEXT, 150 m (mean std = 9 dB). [Plot: power (dB) versus frequency (MHz); measured mean power and fitted model $-192.6 + 20\log_{10}(f) - 0.0035 f$.]

Figure 7: Deviation from Gaussian pdf for FEXT, 150 m. [Normal probability plot: probability versus (dB); for Gaussian, plot should be a straight line.]

and across all frequencies. A Gaussian probability density function has been fitted to the said data (not the histogram per se), and overlaid on top of the same plot. Gaussian fitting is performed in the maximum likelihood (ML) sense, which boils down to using the sample estimate of the variance of the centered data. This figure helps to assess (deviation from) normality, however tail inconsistencies are relatively hard to detect this way. For this reason, and for the sake of brevity, we are only showing normal probability plots for the FEXT channels.

NEXT plots for 300 meters are presented in Figures 1, 2, and 3. Figure 2 indicates that the normal distribution is a reasonable approximation, while a gamma distribution could be used to further improve the fit of the tails [6]. Plots for FEXT are shown in Figure pairs 4-5, 6-7, 8-9, and 10-11, for 75, 150, 300, and 590 meters, respectively.

The results indicate that the simple parametric models in [3] describe sufficiently well the mean log-power of the crosstalk channels, except for the 590 m FEXT case, where there is a noticeable deviation of the fitted model from the

Figure 8: Measured mean power and fitted model for FEXT, 300 m (mean std = 8.8 dB). [Plot: power (dB) versus frequency (MHz); measured mean power and fitted model $-187.6 + 20\log_{10}(f) - 0.0081 f$.]

Figure 9: Deviation from Gaussian pdf for FEXT, 300 m. [Normal probability plot: probability versus (dB); for Gaussian, plot should be a straight line.]

Figure 10: Measured mean power and fitted model for FEXT, 590 m (mean std = 11.2 dB). [Plot: power (dB) versus frequency (MHz); measured mean power and two fitted models, $-185.9 + 20\log_{10}(f) - 0.0171 f$ and $-127.3 + 10.04\log_{10}(f) - 0.014 f$.]

Figure 11: Deviation from Gaussian pdf for FEXT, 590 m. [Normal probability plot: probability versus (dB); for Gaussian, plot should be a straight line.]

measured mean power, as high as 3 dB in the frequencies approximately up to 2 MHz (see Figure 10). In order to obtain a better fit, we can generalize the model of (5) by relaxing the $f^2$ term to $f^{\gamma(l)}$, where $\gamma(l)$ is a length-dependent parameter. Then, (6) becomes

$$20\log_{10}|H_F(f,l)| = 10\log_{10}K(l) + \beta(l) f + 10\gamma(l)\log_{10}(f), \tag{10}$$

and the parametric mean regression model becomes

$$E\left[20\log_{10}|H_F(f,l)|\right] \approx c_1(l) + c_2(l) f + c_3(l)\log_{10}(f), \tag{11}$$

where $c_3(l) \equiv 10\gamma(l)$. That is, we are effectively introducing a third degree of freedom. The resulting profile and parameters of this fit are reported along with the original ones in Figure 10 for comparison purposes.
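The least-squares fits behind (7)–(9) and (11) can be illustrated with a short sketch. This is not the authors' code (the text only mentions Matlab). The function names, the frequency grid, and the numerical values below are hypothetical, and the normal equations are solved directly only because the systems are tiny.

```python
import math


def solve_linear(a, b):
    """Solve a small square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(basis, target):
    """Minimize sum_k (target[k] - sum_j coef[j] * basis[j][k])^2 via the normal equations."""
    gram = [[sum(x * y for x, y in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(x * t for x, t in zip(bi, target)) for bi in basis]
    return solve_linear(gram, rhs)


def fit_next(freqs, mu_s):
    """Model (7): mu_s(f) ~ c1 + 15 log10(f). The LS solution of (8) is a plain average."""
    residual = [m - 15.0 * math.log10(f) for f, m in zip(freqs, mu_s)]
    return sum(residual) / len(residual)


def fit_fext(freqs, mu_s):
    """Model (9): mu_s(f) ~ c1 + c2 f + 20 log10(f). Returns (c1, c2)."""
    target = [m - 20.0 * math.log10(f) for f, m in zip(freqs, mu_s)]
    c1, c2 = least_squares([[1.0] * len(freqs), list(freqs)], target)
    return c1, c2


def fit_fext_generalized(freqs, mu_s):
    """Model (11): mu_s(f) ~ c1 + c2 f + c3 log10(f). Returns (c1, c2, c3)."""
    basis = [[1.0] * len(freqs), list(freqs), [math.log10(f) for f in freqs]]
    return tuple(least_squares(basis, list(mu_s)))


# Hypothetical example: a synthetic mean log-power curve on an assumed grid
# (units are arbitrary here; the values are illustrative, not measured data).
freqs = [0.1 * (k + 1) for k in range(300)]
mu_s = [-90.0 - 0.5 * f + 20.0 * math.log10(f) for f in freqs]
c1, c2 = fit_fext(freqs, mu_s)
```

With measured data, `mu_s` would hold the sample mean log-power across channels at each frequency. The three-parameter fit differs from the two-parameter one only in treating the coefficient of $\log_{10}(f)$ as free instead of fixing it to 20.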

Figure 12: Fitted regression parameter $c_1$. [Plot: parameter $c_1$ versus loop length $L$ (m); model parameter $c_1$ (NEXT) with constant fit $c_1 = -158.7$; model parameter $c_1$ (FEXT) with line fit $c_1(L) = 0.018 L - 195.2$.]

Figure 14: Measured mean power of direct channel and fitted model. (Insertion loss.) [Plot: power (dB) versus frequency (MHz); fitted models: 75 m, $-0.0020 f$; 150 m, $-0.0042 f$; 300 m, $-0.0082 f$; 590 m, $-0.0183 f$.]
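The normality check of Section 4.1 (ML Gaussian fit of the centered data and a normal probability plot) can be sketched as follows. This is a minimal illustration under stated assumptions, not a reimplementation of Matlab's normplot: the plotting positions $(i + 0.5)/n$ are one common choice, the correlation coefficient is used here only as a convenient numerical summary of how straight the plot is, and all function names and data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def gaussian_ml_fit(samples):
    """ML Gaussian fit: sample mean and (1/n) sample variance of the data."""
    mean = fmean(samples)
    variance = fmean([(x - mean) ** 2 for x in samples])
    return mean, variance


def normal_probability_points(samples):
    """Pairs (standard-normal quantile, ordered sample), i.e. the points of the plot."""
    ordered = sorted(samples)
    n = len(ordered)
    std_normal = NormalDist()
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(quantiles, ordered))


def linearity(points):
    """Pearson correlation of the plot points; close to 1 when the plot is a straight line."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical centered residuals in dB (synthetic, standard deviation chosen for illustration).
rng = random.Random(0)
residuals_db = [rng.gauss(0.0, 9.0) for _ in range(2000)]
mean_db, var_db = gaussian_ml_fit(residuals_db)
straightness = linearity(normal_probability_points(residuals_db))
```

Curvature in the tails of the plotted points, which lowers the straightness measure, is what would motivate a heavier-tailed or skewed alternative such as the gamma model mentioned in the text.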

Figure 13: Fitted regression parameter $c_2$. [Plot: parameter $c_2$ ($\times 10^{-3}$) versus loop length $L$ (m); model parameter $c_2$ (FEXT) with line fit $c_2 = -0.000030 L + 0.00095$; model parameter $c_2$ (IL) with line fit $c_2 = -0.000032 L + 0.00065$.]

4.2. Fitted regression parameters versus length

The fitted frequency-dependent mean model parameters are also plotted in Figures 12 and 13, versus length. For NEXT, $c_1 \approx -158.7$ ($-165.4$ for Kerpez's model [4]) independent of length, as expected. For FEXT, both parameters show a nice affine dependence on length. In Figure 13 the fitted parameter $c_2(l) \equiv \beta(l)$ of the frequency-dependent mean model for the direct channel is shown to be an affine function of length as well.

4.3. Insertion loss

Figure 14 shows the sample mean IL (in dB) and the associated fitted model, for all four lengths. Notice that the usable bandwidth indeed extends to 30 MHz for the shortest (75 m) loop, but is effectively limited to about 7.5 MHz for the longest (590 m) loop considered. At that point, the loop's IL drops under −50 dB. Figure 13 shows the dependence on loop length of the model parameter $c_2(l) \equiv \beta(l)$ in (2).

4.4. Phase of direct channels

Figure 15 shows the unwrapped phase of all 28 direct channels, for 75, 150, 300, and 590 meters. Note that the (unwrapped) phase is approximately linear.

4.5. Impulse response duration

One parameter that is important from the viewpoint of modem design is the duration of the impulse response of the direct channel. For a multicarrier line code, this affects both the length of the cyclic prefix, and the number of taps (and thus cost and complexity) of the time-domain channel shortening equalizer (TEQ). We plot the dB magnitude of the direct channel's impulse response in Figures 16 and 17, for length 75 and 150 meters, respectively. The 99% energy breakpoint (the "essential duration" that contains 99%

Figure 15: Unwrapped phase of all direct channels. [Plot: unwrapped phase (rad) versus frequency (MHz); curves for 75 m, 150 m, 300 m, and 590 m.]

Figure 16: Magnitude-squared of direct channel's impulse response, 75 m. [Plot: power (dB) versus time (μs); magnitude-squared impulse response; 99% energy breakpoint at 0.412 μs.]

Figure 17: Magnitude-squared of direct channel's impulse response, 150 m. [Plot: power (dB) versus time (μs); magnitude-squared impulse response; 99% energy breakpoint at 0.932 μs.]

of the total energy) is also shown on each figure. The impulse responses were calculated via Riemann sum approximation² of the inverse continuous-time Fourier transform of the interpolated frequency samples, using conjugate folding for the negative frequencies. Note that this approximation introduces aliasing error in the tails of the estimated impulse response. This is unavoidable, because we work with samples of the continuous-time Fourier transform, and the impulse responses are not sufficiently time-limited; thus time-domain aliasing is introduced as per the sampling theorem. This prohibits reliable estimation of, for example, the 99.99% energy breakpoint. The 99% energy breakpoint, on the other hand, is at least 18 times lower than the period of the aliased impulse response, and thus can be reliably estimated.

² For computational savings, this can be implemented via the (inverse) FFT.

5. CONCLUSIONS

Simple parametric crosstalk models are useful tools in VDSL system engineering. The evolution towards FTTC/FTTB architectures implies shorter twisted pair segments, and correspondingly wider usable system bandwidth. This brings up the issue of whether or not existing models for NEXT and FEXT are valid in the FTTC/FTTB scenario.

An extensive measurement campaign was undertaken in order to address this question. An important conclusion of the ensuing analysis is that the simple log-normal statistical models in [3] capture the essential aspects of both NEXT and FEXT over the extended range of frequencies considered. Intuition regarding the behavior of FEXT for the shortest loops has been confirmed by analysis. A number of useful fitted model parameters were also provided.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers for their insightful comments. This work was supported by the EU-FP6 under U-BROAD STREP contract 506790.

REFERENCES

[1] "Spectrum Management for Loop Transmission Systems," ANSI Standard T1.417-2003, Section A.3.2.1.

[2] E. Karipidis, N. Sidiropoulos, A. Leshem, and L. Youming, "Experimental evaluation of capacity statistics for short VDSL loops," IEEE Transactions on Communications, vol. 53, no. 7, pp. 1119–1122, 2005.
[3] J.-J. Werner, "The HDSL environment [high bit rate digital subscriber line]," IEEE Journal on Selected Areas in Communications, vol. 9, no. 6, pp. 785–800, 1991.
[4] K. Kerpez, "Models for the numbers of NEXT disturbers and NEXT loss," Contribution number T1E1.4/99-471, October 1999, available at http://contributions.atis.org/UPLOAD/NIPP/NAI/9E144710.pdf.
[5] A. Leshem, "Multichannel noise models for DSL I: Near end crosstalk," Contribution T1E1.4/2001-227, September 2001, available at http://contributions.atis.org/UPLOAD/NIPP/NAI/1E142270.zip.
[6] S. H. Lin, "Statistical behaviour of multipair crosstalk," Bell System Technical Journal, vol. 59, no. 6, pp. 955–974, 1980.
[7] Norme Française # NF C 93-527-2, July 1991.

E. Karipidis received the Diploma in electrical and computer engineering from the Aristotle University of Thessaloniki, Greece, and the M.S. degree in communications engineering from the Technical University of Munich, in 2001 and 2003, respectively. He worked as an intern from February 2002 to October 2002 in Siemens ICM, and from December 2002 to November 2003 in the Wireless Solutions Lab of DoCoMo Euro-Labs, both in Munich, Germany. He is currently a Ph.D. candidate in the Telecommunications Division, Department of Electronic and Computer Engineering, Technical University of Crete, Chania, Greece. His broad research interests are in the area of signal processing for communications, with current emphasis on MIMO VDSL systems, convex optimization, and applications in transmit precoding for wireline and wireless systems. He is Member of the Technical Chamber of Greece and Student Member of the IEEE.

N. Sidiropoulos received the Diploma in electrical engineering from the Aristotle University of Thessaloniki, Greece, and M.S. and Ph.D. degrees in electrical engineering from the University of Maryland at College Park (UMCP), in 1988, 1990, and 1992, respectively. He has been an Assistant Professor in the Department of Electrical Engineering, University of Virginia (1997–1999), and Associate Professor in the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis (2000–2002). Since 2002, he is a professor in the Department of Electronic and Computer Engineering, Technical University of Crete, Chania, Crete, Greece. His current research interests are in signal processing for communications, and multiway analysis. He is Vice-Chair of the Signal Processing for Communications Technical Committee (SPCOM-TC), and Member of the Sensor Array and Multichannel Processing Technical Committee (SAMTC) of the IEEE SP Society, and Associate Editor for IEEE Transactions on Signal Processing (2000-). He received the U.S. NSF/CAREER Award in June 1998, and an IEEE Signal Processing Society Best Paper Award in 2001. He is an active consultant for industry in the areas of frequency hopping systems and signal processing for xDSL modems.

A. Leshem received the B.S. degree (cum laude) in mathematics and physics, the M.S. degree (cum laude) in mathematics, and the Ph.D. degree in mathematics all from the Hebrew University, Jerusalem, Israel, in 1986, 1990, and 1998, respectively. From 1998 to 2000, he was with Faculty of Information Technology and Systems, Delft University of Technology, the Netherlands, as a postdoctoral fellow working on algorithms for reduction of terrestrial electromagnetic interference in radio-astronomical radio-telescope antenna arrays and signal processing for communication. From 2000 to 2003, he was Director of advanced technologies with Metalink Broadband. He was responsible for research and development of new DSL and wireless MIMO modem technologies. From 2000 to 2002, he was also a Visiting Researcher at Delft University of Technology. Since October 2002, he has been a Senior Lecturer in the new School of Electrical and Computer Engineering, at Bar-Ilan University. From 2003 to 2005, he also was Technical Manager of the U-BROAD consortium developing technologies to provide 100 Mbps and beyond over copper lines. His main research interests include transmission over copper lines including multiuser and multichannel transmission techniques, array and statistical signal processing with applications to multiple-element sensor arrays in radio-astronomy and wireless communications, radio-astronomical imaging methods, set theory, logic and foundations of mathematics.

Li Youming received the B.S. degree in computational mathematics from Lan Zhou University, Lan Zhou, China, in 1985, the M.S. degree in computational mathematics from Xi'an Jiaotong University in 1988, and the Ph.D. degree in electrical engineering from Xi Dian University. From 1988 to 1998, he worked in the Department of Applied Mathematics, Xidian University, where he was an Associate Professor. From 1999 to 2000, he was a Research Fellow in the School of EEE, Nanyang Technological University. From 2001 to 2003, he joined DSO National Laboratories, Singapore. From 2001 to 2004, he was a postdoctoral research fellow in School of Engineering, Bar-Ilan University, Israel. He is now working in the Faculty of Information Science and Engineering, Ningbo University. His research interests are in the areas of statistical signal processing and its application in wireline and wireless communications and radar.

R. Tarafi was born on October 20, 1968. He is an Engineer at the Ecole Nationale d'Ingénieurs de Brest (ENIB). In 1998, he received the title of Docteur of the University of Brest. He joined the National Research Center of France Telecom in 1998, where he is in charge of modelization and investigation studies related to the EMC of the France Telecom telecommunication network.

M. Ouzzif received the Engineering degree as well as the M.S. degree in electrical engineering from INSA (Institut National des Sciences Appliquées) of Rennes in 2000 and the Ph.D. degree in electronics from INSA in 2004. Since November 2000, she has been with FranceTelecom R&D. Her current interests include multiuser transmissions and their application to wireline communications.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 94105, Pages 1–14
DOI 10.1155/ASP/2006/94105

Error Sign Feedback as an Alternative to Pilots for the Tracking of FEXT Transfer Functions in Downstream VDSL

J. Louveaux and A.-J. van der Veen

Delft University of Technology, 2600AA Delft, The Netherlands

Received 1 December 2004; Revised 11 August 2005; Accepted 22 August 2005

With increasing bandwidths and decreasing loop lengths, crosstalk becomes the main impairment in VDSL systems. For downstream communication, crosstalk precompensation techniques have been designed to cope with this issue by using the collocation of the transmitters. These techniques naturally need an accurate estimation of the crosstalk channel impulse responses. We investigate the issue of tracking these channels. Due to the lack of coordination between the receivers, and because the amplitude levels of the remaining interference from crosstalk after precompensation are very low, blind estimation schemes are inefficient in this case. So some part of the upstream or downstream bit rate needs to be used to help the estimation. In this paper, we design a new algorithm to try to limit the bandwidth used for the estimation purpose by exploiting the collocation at the transmitter side. The principle is to use feedback from the receiver to the transmitter instead of using pilots in the downstream signal. It is justified by computing the Cramer-Rao lower bound on the estimation error variance and showing that, for the levels of power in consideration, and for a given bit rate used to help the estimation, this bound is effectively lower for the proposed scheme. A simple algorithm based on the maximum likelihood is proposed. Its performance is analyzed in detail and is compared to a classical scheme using pilot symbols. Finally, an improved but more complex version is proposed to approach the performance bound.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

Future DSL systems such as VDSL (very high-bit-rate digital subscriber line) evolve towards shorter loops thanks to the increasing development of optical fiber infrastructure. This allows the use of higher bandwidths, typically from 10 to as high as 30 MHz for very short loops. At these high frequencies and low attenuation channels, the FEXT (far-end crosstalk) becomes the main degradation in the system, higher than additive noise. In order to overcome this issue, multiuser detectors can be designed [1] when the receivers are coordinated, that is, when the receivers have access to the signals coming from all the different lines. However, in typical downstream VDSL systems, the receivers will not be coordinated. For this reason, a number of precancellation techniques have been designed to decrease the effect of FEXT [2–4] using the coordination at the CO (central office) and assuming no coordination at the receiver side. These systems are quite different than in the MIMO wireless case because each receiver can only use the signal from its own line. So each receiver essentially sees a MISO channel. In addition, the physical characteristics of the VDSL channel ensure that the useful signal, which is the one transmitted on the line, is of much higher amplitude than the crosstalk. This also has to be taken into account in the design of the precanceller. For more information on the precancellation design, see previous references or [5–7].

All these precancellation schemes rely on a good estimation of the crosstalk channels between the various pairs of users (or equivalently pairs of lines). So the issue of crosstalk channel estimation has to be solved to be able to use those schemes. In this paper, we investigate the issue of tracking of these channel estimates. Copper wires generally have static channel impulse responses, but they can still vary slowly, for example, due to temperature changes. So in order to guarantee a constant behavior of the crosstalk mitigation technique, some kind of tracking of the channel estimates is necessary. Due to the lack of coordination between the CPEs (customer premise equipments, i.e., the users' receivers), the downstream channel estimation appears to be a much more complicated task than the upstream channel estimation. So we focus on downstream in this paper. There are basically two characteristics of the system that make the downstream crosstalk channel estimation difficult. First, because of the non-coordination, each receiver can only use the signal from its own line to perform its estimation and has no information on the symbols transmitted to the other users. Furthermore, due to the presence of the crosstalk mitigation techniques, the power of the signal corresponding to the other users becomes very low at the receiver of one user. In other terms,

Figure 1: Principle of the proposed estimation structure. [Block diagram with labels: symbols, CO transmitter, channel, CPE, limited feedback information, FEXT tracking, FEXT channels information.]

the crosstalk impulse responses that need to be tracked are of very low amplitude with respect to the noise. So the downstream channel estimation appears as the joint estimation of multiple channels of very low amplitude corresponding to multiple independent sources (the different users' signal). This is a very difficult issue.

Blind techniques, such as the ones presented in [8, 9], are not practical in this context. They are useful for the estimation of the main transmission coefficient, that is, the direct transmission on the line itself. But concerning the crosstalk the low amplitude level with respect to the noise prevents from achieving reasonable performance. The easiest way to solve the problem would be to use a set of pilot symbols, sent periodically, to perform the tracking of the downstream channels at the CPEs. Many solutions exist in this framework [10, 11]. However, as the VDSL standards usually do not assume the use of preamble bits or periodically transmitted training sequences, it is necessary to use part of the useful bit rate as pilot symbols. In addition, the information about the estimates needs to be sent back to the CO periodically to perform an update of the crosstalk mitigating transmission scheme. So this may lead to a large amount of bit rate usage. In order to try to limit the quantity of bit rate needed for the tracking, we propose another method which takes advantage of the coordination that is present at the transmitter (CO).

The principle of the proposed algorithm (see Figure 1) is to send back to the CO some very limited amount of information about the signal received at the CPEs. Now thanks to the coordination at the CO, all symbols transmitted to all different lines are known, and that additional information can be used for the estimation. Furthermore, since the estimation is performed at the CO itself, feedback of the channel estimates is no longer needed. The algorithm is presented in this paper and it is compared through simulations to a simple solution using pilot symbols. It is shown that the proposed solution performs better for a given amount of bandwidth usage.

The issue of limiting the quantity of feedback for channel estimation has already been investigated in the MIMO wireless context in [12] and several other papers. However the problem considered here turns out to be very different. Indeed, in [12], the focus is on the feedback of the channel information to the transmitter. It is assumed that the estimation itself has been performed already. Here, the focus is on the estimation process and on limiting the total overhead (both pilots and feedback) associated with the estimation process.

Note that we consider a DMT-based transmission and we focus on a simple algorithm that is working on a per tone basis. So we do not take into account the correlation between the tones, but it could be done in the same way as it is done with pilot schemes [10, 11], by performing the estimation on a limited number of tones and then interpolating between the estimated tones using the correlation across frequencies. Besides, we do not make use of the samples available in the cyclic extension [13].

The paper is organized as follows. First, the system model and the issue investigated are described. In Section 3, the proposed algorithm is derived. In Section 4, the Cramer-Rao bound for the proposed structure is investigated and compared to the use of pilot symbols, in order to show that the proposed scheme is indeed potentially superior. In Section 5, the performance of the proposed scheme is analyzed both theoretically and with simulations. Finally, an improved, but more complex, algorithm is proposed in Section 6. The basic algorithm has already been presented in [14] and a few simulations results have been shown. In this paper, we additionally provide a theoretical justification based on the Cramer-Rao bound, we provide a more detailed analysis of the performance both analytically and with extensive simulations. Finally, we also show how the algorithm can be improved to approach the performance bound.

2. SYSTEM MODEL

We consider the estimation of the downstream crosstalk channels in a DSL environment. DMT modulation is assumed. It is also assumed that the cyclic prefix is long enough and the different users are transmitted synchronously from the CO so that the channel (including crosstalk) is free of intersymbol interference and intercarrier interference. Hence, for a given tone, the channel model is written as

$$\tilde{\mathbf{y}} = \mathbf{C}\mathbf{x} + \mathbf{n}, \tag{1}$$

where $\mathbf{x}$, $\tilde{\mathbf{y}}$ are the vectors of transmitted and received samples, respectively,¹ for the different users (or equivalently, on the different lines), $\mathbf{C}$ is the channel matrix, and $\mathbf{n}$ is the vector of noise samples at the different receivers (CPEs). In this paper, we focus on one fixed tone. The same developments can be done independently for each tone (or a subset of the tones if the frequency-domain correlation is used). The additive noise is assumed to be Gaussian and white with independent elements. The noise variance for user (receiver) $i$ is denoted by $\sigma_{n,i}^2$. In the model (1), the diagonal elements of $\mathbf{C}$ correspond to the line transmission (also called direct channel later in this paper), the off-diagonal elements correspond to crosstalk. We assume $N$ users, the channel matrix $\mathbf{C}$ is thus $N \times N$. It must be noted that the channel model considered here is supposed to take into account all the operations from the DMT modulation, through the channel, and until the input of the decision device. This thus includes the channel shortening, the cyclic extension operations, possible equalization and may even incorporate, for instance, some alien crosstalk suppression schemes at the receiver. The precoder (or precanceller) can be viewed as an additional layer working on top of all these operations.

¹ The notation $\tilde{\mathbf{y}}$ is used here because the actual observations used will be a slightly modified version of this (see later).

2.1. Precoder

Because the receivers (CPEs) are not collocated, each one of them can only use one received signal $\tilde{y}_i$ for detection and/or estimation purposes. In order to mitigate the effect of FEXT, it is assumed that the CO uses some kind of precoder. We assume a linear precoder as presented in [2] and later improved in [4].

The CO designs a matrix $\mathbf{F}$ such that $\mathbf{C}\mathbf{F}$ is diagonal,

$$\mathbf{F} = \mathbf{C}^{-1}\mathbf{C}_d, \tag{2}$$

where $\mathbf{C}_d$ represents the diagonal matrix formed by keeping only the diagonal elements of $\mathbf{C}$, and sends

$$\mathbf{x} = \mathbf{F}\mathbf{u} \tag{3}$$

on the different lines, where $\mathbf{u}$ are the transmitted information symbols for the different users. Thanks to the precoder design, the received samples for one user suffer from little interference from other users. Regarding the transmitted symbols, it is assumed that all the users have the same transmitted power, and we therefore normalize the symbol variance to $\sigma_u^2 = 1$ for all users without loss of generality. The sizes of the user constellations are different however. They are adapted to the SNR (signal-to-noise ratio) available on the given tone by the various users, in such a way that the bit error rate is maintained below $10^{-7}$ for each user. In order to simplify the notations, the symbols are assumed to be real throughout the paper, but the extension to complex symbols is straightforward.

2.2. Initialization procedure and tracking issue

In this paper, we focus on the issue of tracking the crosstalk channel coefficients. Hence it is assumed that some initial estimate of the crosstalk channel has been obtained during the initialization phase. Here is a little description of a possible way of handling this initialization. First, the DMT initialization is performed. Then transmission can start at a lower rate, without any crosstalk cancellation, considering crosstalk as noise. During this first part, some coarse estimation of the crosstalk channel can be performed, for instance using pilot symbols. The method proposed here would also be able to perform this coarse estimation. However, for reasons explained later, it might not be as efficient in the initialization phase. The precoder can then be computed and transmission can start at the highest rate. Then, the channel is changing slowly, for example due to changes in temperature, or possibly due to changes in the alien crosstalk environment if such a cancellation scheme is used. Equivalently, the initial estimate might just be inaccurate. Therefore, the precoder might not diagonalize the channel perfectly and the remaining interference due to crosstalk might increase around the same power level as the additive noise, thereby decreasing the performance. Mathematically, this means that the matrix $\mathbf{C}\mathbf{F}$ in the received signal expression

$$\tilde{\mathbf{y}} = \mathbf{C}\mathbf{x} + \mathbf{n} = \mathbf{C}\mathbf{F}\mathbf{u} + \mathbf{n} \tag{4}$$

is not perfectly diagonal. In order to update the precoder and recover a low level of interference, some estimation (or tracking) of the nondiagonal elements of this matrix is necessary. In the remainder of this paper, we call these values the interference coefficients. They correspond to the interference between lines that remains due to a mismatch between the precoder and the actual channel and are thus generally of low amplitude. We will refer to channel coefficients to denote the crosstalk coefficients of the channel (matrix $\mathbf{C}$) before the precoder is applied.

3. PROPOSED ALGORITHM

3.1. Algorithm derivation

In this section, the proposed estimation algorithm is derived in detail. The solution (Figure 1) investigated here is to allow a limited feedback from the various users about their received samples. This information is collected at the CO and the channel estimation is performed there. It is important to limit drastically the information that is sent back in order to keep an acceptable usage of the upstream bit rate. Even with a limited amount of feedback, and since the CO knows perfectly what was sent on the different lines (the samples $\mathbf{x}$ and the symbols $\mathbf{u}$), the channel estimation is possible.

It is first assumed that the direct channel coefficients (diagonal ones) are estimated perfectly at the receivers (this can be done easily with a decision-directed scheme since the power of the useful signal is high). After detection, the contribution of the corresponding user's symbol is subtracted at the receiver, only remaining with the crosstalk interference and the noise. We call this quantity (crosstalk + noise) the symbol error. The receivers send back the sign of this symbol error, so that the smallest possible amount of the upstream bit rate is used: 1 bit. We focus on real-valued symbols here. The extension to complex symbols can easily be done by splitting the complex values in real and imaginary parts, feeding back the sign of both quantities.

Mathematically, $K$ DMT blocks are stacked up (still focusing on one tone only) in the following way:

$$\mathbf{X} = \left[\mathbf{x}^0 \;\cdots\; \mathbf{x}^{K-1}\right], \tag{5}$$

where $\mathbf{x}^k$ denotes the vector of transmitted samples for block $k$. The matrices $\mathbf{U}$, $\tilde{\mathbf{Y}}$, and $\mathbf{N}$ are built similarly. $K$ is the number of observations used by the algorithm. Since VDSL channels are varying slowly, this number can be quite large in practice. The channel model and precoding operations are rewritten as

$$\tilde{\mathbf{Y}} = \mathbf{C}\mathbf{X} + \mathbf{N}, \qquad \mathbf{X} = \mathbf{F}\mathbf{U}. \tag{6}$$

At the receivers, the diagonal elements of $\mathbf{C}\mathbf{F}$ are assumed to be estimated perfectly, and the symbols transmitted to the corresponding users are also assumed to be detected perfectly. Their contribution is then subtracted to obtain the so-called symbol errors

$$\mathbf{Y} = \tilde{\mathbf{Y}} - \{\mathbf{C}\mathbf{F}\}_d \mathbf{U} \tag{7}$$
$$\phantom{\mathbf{Y}} = \left(\mathbf{C}\mathbf{F} - \{\mathbf{C}\mathbf{F}\}_d\right)\mathbf{U} + \mathbf{N} \tag{8}$$
$$\phantom{\mathbf{Y}} = \mathbf{H}\mathbf{U} + \mathbf{N}, \tag{9}$$

where the last line defines a new matrix $\mathbf{H}$ with zeros on the diagonal. We call it the interference matrix. It represents the residual interference at the output of the receiver in presence of the precoding scheme, and its elements are thus of low amplitude. The nondiagonal elements are the so-called interference coefficients. This is the matrix that will be estimated at the CO by the algorithm.

The algorithm is based on the ML (maximum likelihood) principle. We denote by $\mathbf{Z} = \operatorname{sign}(\mathbf{Y})$, the set of received signs of the symbol errors coming from the different lines. They are the observations on which the estimation will be based.

probability is

$$P\left(\operatorname{sign}(y_i^k) = z_i^k \mid \mathbf{H}, \mathbf{U}\right) = Q\!\left(\frac{-z_i^k \mathbf{h}_i \mathbf{u}^k}{\sigma_{n,i}}\right), \qquad \Lambda(\mathbf{H}) = \prod_{k=0}^{K-1}\prod_{i=0}^{N-1} Q\!\left(\frac{-z_i^k \mathbf{h}_i \mathbf{u}^k}{\sigma_{n,i}}\right), \tag{11}$$

where $\mathbf{h}_i$ is the $i$th row of $\mathbf{H}$, $\mathbf{u}^k$ is the $k$th column of $\mathbf{U}$, and where

$$Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\,dt. \tag{12}$$

The tracking algorithm is obtained by taking the derivative of the likelihood function, and performing a simplified steepest descent procedure. The gradient of the likelihood function is given by

$$\frac{\partial \Lambda(\mathbf{H})}{\partial \mathbf{h}_i} = \Lambda(\mathbf{H}) \sum_{k=0}^{K-1} z_i^k\, \mathbf{u}^{kT}\, \frac{1}{\sqrt{2\pi\sigma_{n,i}^2}}\, \frac{e^{-(\mathbf{h}_i \mathbf{u}^k)^2 / 2\sigma_{n,i}^2}}{Q\!\left(-z_i^k \mathbf{h}_i \mathbf{u}^k / \sigma_{n,i}\right)}. \tag{13}$$

The proposed basic tracking algorithm computes the corresponding term of the gradient for each new received sample (each block $k$) and adapts the coefficients estimates in the direction of the gradient. In other words, it realizes the sum over $k$ in (13) by adapting progressively for each new coming sample (except that the interference coefficient estimates $\mathbf{h}_i$ are changing slowly). It is important to keep the weightings that depend on the sample $k$ (i.e., the big fraction) because it contains the information on the relative importance of each term of the gradient. The common factor can be removed of course, and incorporated in the stepsize. Finally, the following algorithm is provided:

$$\mathbf{h}_i^{k+1} = \mathbf{h}_i^k + \mu\, z_i^k\, D\!\left(\frac{-z_i^k \mathbf{h}_i^k \mathbf{u}^k}{\sigma_{n,i}}\right) \cdot \mathbf{u}^{kT}, \tag{14}$$
where hi denotes the current estimate at block of row μ The error sign sample received from user i for block k is of the interference matrix H, is the stepsize which can be k k k k denoted by zi (similarly for yi , ui ,andni ). It is assumed chosen to tune the properties of the algorithm, and where that the noise variance of each receiver is known at the CO. e−x2/2 D x = √ . This will be necessary in the computation of the algorithm ( ) πQ x (15) as shown later. The noise variance at receiver i is denoted by 2 ( ) σ2 ffi n,i. The likelihood of a set of interference coe cients can be The tracking algorithm (14) appears to be similar to an LMS written as algorithm, or more precisely to the sign LMS [15]. However it is very different because, in the sign-LMS algorithm, the K−1 N−1 k k sign operation is taken on the “prediction error” computed Λ(H) = P sign yi = zi | H, U , (10) k k k=0 i=0 between the observation yi and the predicted version hi u, based on the estimation. In our case, the sign is directly ap- k k k where P(sign(yi ) = zi | H, U) denotes the conditional prob- plied on the symbol error yi and the “prediction error” is ability on the value of some error sign sample, given the not available. As can be seen in (14), it is replaced here by transmitted symbols and given the set of interference coef- some more complicated expression. Consequently, the be- ficients. Note that the estimation can be performed inde- havior and performance of this algorithm can be expected pendently for each line as the interference coefficients re- to be very different. lated to one line only impact the received samples from the Finally, the ultimate goal is to adapt the precoder to the corresponding line. However, for generality, the matrix for- changes in the channel. To achieve this, the diagonal coeffi- malism is kept here. For one specific error sign sample, the cients of the matrix CF (direct channel coefficients), which J. Louveaux and A.-J. 
are easy to estimate at the CPEs, have to be sent back periodically as well. This allows the CO to reconstruct CF and hence C, and then to compute the new precoder with (2).

3.2. Comparison with pilot symbols

In order to verify the behavior of the proposed algorithm, it will be compared to an estimation method based on pilot symbols. We assume the use of an LMS algorithm at each receiver, using the different pilots to estimate the interference coefficients. Hence it is also an iterative algorithm but it is performed at the receivers instead of the CO. The adaptation can be written as

\[ \hat{h}_i^{k+1} = \hat{h}_i^k + \mu_{\mathrm{LMS}} \big( \tilde{y}_i^k - \hat{h}_i^k u^k \big) (u^k)^T. \tag{16} \]

The symbols u^k are the pilots. Now, the purpose of the comparison is to evaluate which algorithm consumes the smallest amount of bit rate for the estimation purpose, or equivalently, which has the best performance for a given bit rate usage. Hence the bit rate usage of the two different methods is computed in this section. The proposed algorithm uses one bit of the upstream for each feedback of a symbol error. So, for K transmitted symbols and N users, the bit rate usage of the proposed method for the estimation of all the coefficients is KN bits. The LMS solution using pilots consumes the downstream bit rate of the pilots, as well as some additional upstream bit rate needed to feed back the value of the estimated channel coefficients. Here we neglect this feedback, but this is of course an additional overhead with respect to the proposed method. The downstream bit rate used by the pilots actually depends on the constellation size of the symbols they replace, and thus on the SNR of the corresponding tone for the different users. If we denote by b_i the number of bits that could be transmitted on the tone of interest for user i, and by K_LMS the number of pilot symbols transmitted, the total amount of downstream bit rate used by the pilots is K_{\mathrm{LMS}} \sum_{i=0}^{N-1} b_i. It is assumed that the consumed bit rates on upstream and on downstream are treated equally. Then, a fair comparison between the two methods can be done when the same number of bits is consumed in both cases (for one precoder update), that is, when K N = K_{\mathrm{LMS}} \sum_{i=0}^{N-1} b_i. So in practice, the number of symbols K will be higher in the proposed method (constellation sizes can go up to 1024 depending on the available SNR on the corresponding tone). The actual bit rate usage for the estimation purpose is of course dependent on the update rate of the crosstalk model, which will be the same for the two methods and has no further influence on the performance.

As an additional comment, it can be pointed out that a system where the symbol errors \tilde{y}_i^k are fed back in full precision to the transmitter would actually have access to the same information as the system using pilot symbols (except that the information is available at the transmission side instead of the receiving side). Such a system would therefore be able to provide performance equal to that of estimation methods based on pilots. However, the feedback in full precision is much more demanding in terms of consumed bit rate than the same amount of pilots (unless the constellation sizes are very high) and such a system is thus not worthwhile in practice.

4. CRAMER-RAO BOUND

In this section, the CRB (Cramer-Rao lower bound) associated with the proposed estimation structure (i.e., using the sign feedback) is investigated. Then, it is compared to the CRB of the estimation performed using pilot symbols. The objective is to show that, for a fixed number of bits used (as either feedback or pilot symbols), and in presence of high noise, the proposed structure has a higher potential than the pilot-based estimation (the CRB is lower). This thus provides a theoretical justification for the proposed approach.

Regarding the CRB computation, the two basic differences between the two schemes are the following.

(i) The "sign" scheme only uses the sign of the observation while the "pilot" scheme can use the full observations y to make the estimations.
(ii) In counterpart, the "pilot" scheme needs to transmit pilots instead of full symbols, corresponding to multiple bits, while the "sign" scheme only uses one bit per symbol in feedback (see previous section).

As in our case the observation interval is long,² the so-called modified CRB (MCRB) can be used [16] and provides a very good approximation to the true CRB. For the estimation of some set of parameters \Theta, using observations y and with a set of nuisance parameters U, the modified Cramer-Rao lower bound on the variance of any unbiased estimator for one parameter \theta_m is given by

\[ \sigma_{\theta_m}^2 \ge - \frac{1}{E_U\!\left[ E_n\!\left[ \partial^2 \ln P(y \mid \Theta, U) / \partial \theta_m^2 \right] \right]}, \tag{17} \]

where E_n[·] denotes the expectation with respect to the noise. This is a lower bound looser than the true CRB. But when the number of observations is very large, as is the case here, it gets tight thanks to the fact that the Fisher information matrix is almost diagonal and tightly distributed.

4.1. Modified CRB for pilot symbols

We first compute the MCRB for the simple pilot scheme. This corresponds to a classical DA (data aided) scheme. The model (9) applies, but we focus on one row of H only:

\[ \tilde{y}_i = h_i U + n_i, \tag{18} \]

where \tilde{y}_i denotes the row vector obtained from \tilde{Y} by taking only the received samples for user i. Assuming the noise is white and Gaussian, it follows that

\[ P(\tilde{y}_i \mid h_i, U) = \prod_{k=0}^{K_{\mathrm{LMS}}-1} \frac{1}{\sqrt{2\pi\sigma_{n,i}^2}} \, e^{-(\tilde{y}_i^k - h_i u^k)^2 / 2\sigma_{n,i}^2}, \qquad \frac{\partial^2 \ln P(\tilde{y}_i \mid h_i, U)}{\partial h_{i,m}^2} = - \frac{1}{\sigma_{n,i}^2} \sum_{k=0}^{K_{\mathrm{LMS}}-1} (u_m^k)^2. \tag{19} \]

Finally, the lower bound is obtained as

\[ \sigma_{h_{i,m}}^2 \ge \frac{\sigma_{n,i}^2}{K_{\mathrm{LMS}} \, \sigma_u^2} \triangleq \sigma_{h_{i,m},\min,\mathrm{pilot}}^2, \tag{20} \]

where h_{i,m} denotes the element of H on the ith row and the mth column, and where \sigma_u^2 denotes the symbol variance, normalized to \sigma_u^2 = 1 in this paper.

4.2. Modified CRB for the proposed scheme

For the proposed scheme, the channel model is again given by (9) and the observations used at the CO for the estimation are Z = sign(\tilde{Y}). We focus on one row of the interference matrix (i.e., on one user only). For simplification of the equations, we define the normalized interference coefficients for row i as

\[ \bar{h}_i \triangleq \frac{h_i}{\sqrt{\sigma_{n,i}^2}}. \tag{21} \]

Note that they are just used for notation; we are of course still interested in the variance on the estimation of the true interference coefficients.

The probability distribution of the observations is written as

\[ \ln P(z_i \mid h_i, U) = \sum_{k=0}^{K-1} \ln Q\big( -z_i^k \bar{h}_i u^k \big). \tag{22} \]

Then

\[ \frac{\partial^2 \ln P(z_i \mid h_i, U)}{\partial h_{i,m}^2} = - \sum_{k=0}^{K-1} \frac{(u_m^k)^2}{\sigma_{n,i}^2} \, D\big( -z_i^k \bar{h}_i u^k \big) \cdot \Big[ z_i^k \bar{h}_i u^k + D\big( -z_i^k \bar{h}_i u^k \big) \Big], \]

\[ E_n\!\left[ \frac{\partial^2 \ln P(z_i \mid h_i, U)}{\partial h_{i,m}^2} \right] = - \sum_{k=0}^{K-1} \frac{(u_m^k)^2}{\sigma_{n,i}^2} \, \bar{h}_i u^k \, \frac{e^{-(\bar{h}_i u^k)^2/2}}{\sqrt{2\pi}} \, E_n\!\left[ \frac{z_i^k}{Q(-z_i^k \bar{h}_i u^k)} \right] - \sum_{k=0}^{K-1} \frac{(u_m^k)^2}{\sigma_{n,i}^2} \left( \frac{e^{-(\bar{h}_i u^k)^2/2}}{\sqrt{2\pi}} \right)^{\!2} E_n\!\left[ \frac{1}{Q^2(-z_i^k \bar{h}_i u^k)} \right]. \tag{23} \]

Computing the expectations

\[ E_n\!\left[ \frac{z_i^k}{Q(-z_i^k \bar{h}_i u^k)} \right] = \frac{P(z_i^k = 1 \mid h_i, u^k)}{Q(-\bar{h}_i u^k)} + (-1) \frac{P(z_i^k = -1 \mid h_i, u^k)}{Q(\bar{h}_i u^k)} = 1 - 1 = 0, \qquad E_n\!\left[ \frac{1}{Q^2(-z_i^k \bar{h}_i u^k)} \right] = \frac{1}{Q(-\bar{h}_i u^k)} + \frac{1}{Q(\bar{h}_i u^k)}, \tag{24} \]

it becomes

\[ - E_n\!\left[ \frac{\partial^2 \ln P(z_i \mid h_i, U)}{\partial h_{i,m}^2} \right] = \sum_{k=0}^{K-1} \frac{(u_m^k)^2}{\sigma_{n,i}^2} \left( \frac{e^{-(\bar{h}_i u^k)^2/2}}{\sqrt{2\pi}} \right)^{\!2} \left[ \frac{1}{Q(\bar{h}_i u^k)} + \frac{1}{Q(-\bar{h}_i u^k)} \right]. \tag{25} \]

The modified CRB is thus

\[ \sigma_{h_{i,m},\min,\mathrm{sign}}^2 = \frac{\sigma_{n,i}^2}{K \, E_u\!\left[ u_m^2 \, \dfrac{e^{-(\bar{h}_i u)^2/2}}{\sqrt{2\pi}} \big( D(\bar{h}_i u) + D(-\bar{h}_i u) \big) \right]}, \tag{26} \]

where u is a random vector of transmitted symbols for one block. The expectation in (26) is not tractable analytically so it is computed numerically. It must be noted that it is clearly dependent on the various parameters: the constellation sizes of the different users, the interference coefficients themselves, and of course the noise variance. Now, another interesting value to compute is the gain (or loss) of our method with respect to the use of pilot symbols. It can be obtained by computing the ratio between the two CRBs. Since the symbol variance can be assumed equal to 1 without loss of generality, it follows that

\[ G_{i,m} \triangleq \frac{\sigma_{h_{i,m},\min,\mathrm{pilot}}^2}{\sigma_{h_{i,m},\min,\mathrm{sign}}^2} = E_u\!\left[ u_m^2 \, \frac{e^{-(\bar{h}_i u)^2/2}}{\sqrt{2\pi}} \big( D(\bar{h}_i u) + D(-\bar{h}_i u) \big) \right]. \tag{27} \]

This represents the "gain" of the proposed method (using sign feedback) with respect to the use of pilot symbols for the estimation of interference coefficient h_{i,m} for an identical number of symbols sent (i.e., for fixed K = K_LMS). The gain is dependent on the interference coefficients and may be different for all coefficients h_{i,m}. As defined here, the gain should always be smaller than 1 since the pilot scheme always has more information available. However, as mentioned earlier, a fair comparison should be done for an identical number of bits used. In that case, the gain becomes

\[ G_{i,m,\mathrm{fixed\,\#bits}} = G_{i,m} \, \frac{\sum_{i=0}^{N-1} b_i}{N}, \tag{28} \]

² This is required due to the high level of noise with respect to the interference coefficients to estimate, and this is possible since the channel variations are slow.
where b_i denotes the number of bits transmitted per symbol for user i. So if G_{i,m} is not too small, the gain (28) can become much larger than 1. It can be observed that this gain is only dependent on the constellation sizes of the different users and on the normalized interference coefficients.

4.3. Comparison

The gain (28) and the MCRB are evaluated in this section. Both are however dependent on the true interference coefficients (the vector h_i). So, in order to get some valuable result, the MCRB and the gain are averaged over several realizations of the channel with a fixed interference power. Mathematically, it is assumed that the interference coefficients h_{i,m} are Gaussian distributed, but are then proportionally corrected to satisfy \sum_m h_{i,m}^2 = P_{\mathrm{interf}} for some constant power of interference P_interf. Figure 2 shows the average gain (28) of the proposed (sign) scheme as a function of the ratio between the interference level (P_interf) and the noise variance \sigma_{n,i}^2, and for constellation sizes of 16. Each result is averaged over 3000 realizations³ of the channel as described above. It can be seen that the gain is always decreasing for increasing interference coefficients (or decreasing noise variance). It can also be seen that the gain is indeed higher than 1 for reasonable cases: it does not seem reasonable to allow the interference, which is due to changes in the channel, to go significantly above the noise as it would unacceptably decrease the performance. So this shows that for a given bit rate usage, a well-designed estimator is likely to perform better in the proposed scheme than with pilot symbols. This confirms the results obtained previously. The figure also shows that the interest of the proposed structure is limited to situations where the interference⁴ is about the same level as the noise or lower. For high interference-to-noise ratio, the traditional pilot schemes are likely to perform better.

[Plot: G_sign, fixed # bits versus interference-to-noise ratio (P_interf/\sigma_{n,i}^2), logarithmic horizontal axis.]

Figure 2: Average gain of the sign method as a function of the ratio between the power of the interference coefficients to estimate P_interf and the noise variance. The constellation sizes are 16.

For illustration, Figure 3 shows the MCRB (variance) of the proposed (sign) scheme as a function of the noise variance for a given set of interference coefficients.

5. EVALUATION OF PERFORMANCE

5.1. Relation between estimation variance and transmission performance

One drawback of the Cramer-Rao bound is that it provides a performance evaluation of the channel estimation in terms of error variance. But, in practice, the purpose of our estimation is to be able to compute a refined precoder and finally get better SNIRs for transmission on the different lines. So, in this section, we show how to relate the estimation performance, in terms of variance, to the achievable SNIRs on the different lines after the refined precoding. This is done using a few assumptions, and it is later shown by simulations that the obtained relation is closely followed.

The precoder may be written as

\[ F = \hat{C}^{-1} C_d, \tag{29} \]

where \hat{C} is the estimation of the channel matrix C available at the transmitter. We write

\[ \hat{C} = C + E F_{\mathrm{old}}^{-1}, \tag{30} \]

where E is the estimation error matrix on the interference matrix H, and F_old is the old precoder, needed to compute the estimate of the channel matrix C from the estimate of the interference matrix. It is assumed that the error matrix E is a zero mean Gaussian random matrix with i.i.d. elements having variance \sigma_e^2. Although the proposed estimation scheme may result in correlation between the errors, it is reasonable to assume that, using a large number of samples, this correlation may vanish. The estimation error variances may also not be the same for the different interference coefficients, but in practice, it appears that the differences are not large, so this approximation is acceptable. This is confirmed by the simulation results and partly by the performance analysis in the next section. The inverse of \hat{C} is approximated as

\[ \hat{C}^{-1} \approx C^{-1} - C^{-1} E F_{\mathrm{old}}^{-1} C^{-1}. \tag{31} \]

So the vector of received samples is

\[ y = C F u + n = C_d u - E F_{\mathrm{old}}^{-1} C^{-1} C_d u + n \tag{32} \]

and the vector of symbol estimates at the receivers is

\[ \hat{u} = C_d^{-1} y = \big( I - C_d^{-1} E F_{\mathrm{old}}^{-1} C^{-1} C_d \big) u + C_d^{-1} n. \tag{33} \]

There is an additional ISI term

\[ u_{\mathrm{ISI}} = C_d^{-1} E F_{\mathrm{old}}^{-1} C^{-1} C_d u. \tag{34} \]

Thanks to the independence of the estimation errors on the different interference coefficients, it can be shown that the ISI covariance matrix R_{\mathrm{ISI}} = E[u_{\mathrm{ISI}} u_{\mathrm{ISI}}^T] is diagonal (i.e., the ISI terms are not correlated). Indeed, using the i.i.d. assumption on the elements of E, it can easily be shown that, for any matrix A,

\[ E\big[ E A E^T \big] = \sigma_e^2 \, \mathrm{Tr}\{A\} \, I, \tag{35} \]

where I is the identity matrix. Since the symbols from the different users are also assumed independent, with fixed symbol variance \sigma_u^2, the covariance matrix of the ISI is

\[ R_{\mathrm{ISI}} = \sigma_u^2 \sigma_e^2 \, \mathrm{Tr}\big\{ F_{\mathrm{old}}^{-1} C^{-1} C_d C_d^T C^{-T} F_{\mathrm{old}}^{-T} \big\} \, C_d^{-1} C_d^{-T}. \tag{36} \]

It is a diagonal matrix. Now, in order to compute (36), the estimations are replaced by the true value, and furthermore, due to the diagonal dominance of the channel matrix C, the trace in (36) is well approximated by N. So, finally,

\[ R_{\mathrm{ISI}} \approx N \sigma_u^2 \sigma_e^2 \, C_d^{-1} C_d^{-T}. \tag{37} \]

This provides the power of interference present after the update of the precoding on the different lines when the interference coefficients (before the update) are estimated with a variance \sigma_e^2. The value of the power provided by (37) is normalized for a useful signal of power \sigma_u^2. It can thus be directly translated in terms of SIR or SNIR.

5.2. Steady-state performance analysis

In this section, we investigate the performance of the algorithm itself. Thanks to the relation given in the previous section, it is now sufficient to investigate the performance of the proposed adaptive algorithm in terms of the error variance \sigma_e^2. The steady-state error variance is computed in this section, using a method similar to [15]. Let us consider only one line here, so the subscript i (user index) is temporarily dropped for legibility. First, the following definition of the estimation error vector is used:

\[ \tilde{h}^k = \hat{h}^k - h. \tag{38} \]

The adaptation rule (14) is obviously unchanged when it is written for \tilde{h}^k instead of \hat{h}^k. The square norm of the adaptation rule (in \tilde{h}^k) is written:

\[ \big\| \tilde{h}^{k+1} \big\|^2 = \big\| \tilde{h}^k \big\|^2 + 2\mu \, z^k D\!\left( -\frac{z^k \hat{h}^k u^k}{\sigma_n} \right) \tilde{h}^k u^k + \mu^2 D^2\!\left( -\frac{z^k \hat{h}^k u^k}{\sigma_n} \right) \big\| u^k \big\|^2. \tag{39} \]

Then, the expectation is taken. In steady state, it is assumed that E[\|\tilde{h}^k\|^2] = E[\|\tilde{h}^{k+1}\|^2], so it follows that

\[ E\!\left[ z^k D\!\left( -\frac{z^k \hat{h}^k u^k}{\sigma_n} \right) \tilde{h}^k u^k \right] = - \frac{\mu}{2} \, E\!\left[ D^2\!\left( -\frac{z^k \hat{h}^k u^k}{\sigma_n} \right) \big\| u^k \big\|^2 \right]. \tag{40} \]

This expectation is taken over all noise samples and all symbols. Clearly, \hat{h}^k is influenced by all past noise samples and past symbols. But only z^k is dependent on the noise at the current time n^k. So the expectation can be first carried out with respect to n^k with fixed \hat{h}^k and u^k:

\[ E_{n^k}\!\left[ z^k D\!\left( -\frac{z^k \hat{h}^k u^k}{\sigma_n} \right) \right] = Q\!\left( -\frac{h u^k}{\sigma_n} \right) D\!\left( -\frac{\hat{h}^k u^k}{\sigma_n} \right) - Q\!\left( \frac{h u^k}{\sigma_n} \right) D\!\left( \frac{\hat{h}^k u^k}{\sigma_n} \right). \tag{41} \]

Now, it is assumed that \hat{h}^k = \tilde{h}^k + h is close to h and a Taylor approximation is applied around the true interference coefficients such that

\[ D\!\left( \frac{\hat{h}^k u^k}{\sigma_n} \right) \approx D\!\left( \frac{h u^k}{\sigma_n} \right) + \frac{\tilde{h}^k u^k}{\sigma_n} \, \dot{D}\!\left( \frac{h u^k}{\sigma_n} \right), \tag{42} \]

where \dot{D}(x) denotes the derivative of D(x). It follows, after some simple computations, that

\[ E_{n^k}\!\left[ z^k D\!\left( -\frac{z^k \hat{h}^k u^k}{\sigma_n} \right) \right] \approx - \frac{\tilde{h}^k u^k}{\sigma_n} \, \frac{e^{-(h u^k)^2 / 2\sigma_n^2}}{\sqrt{2\pi}} \left[ D\!\left( -\frac{h u^k}{\sigma_n} \right) + D\!\left( \frac{h u^k}{\sigma_n} \right) \right]. \tag{43} \]

On the other hand,

\[ E_{n^k}\!\left[ D^2\!\left( -\frac{z^k \hat{h}^k u^k}{\sigma_n} \right) \right] \approx Q\!\left( -\frac{h u^k}{\sigma_n} \right) D^2\!\left( -\frac{h u^k}{\sigma_n} \right) + Q\!\left( \frac{h u^k}{\sigma_n} \right) D^2\!\left( \frac{h u^k}{\sigma_n} \right) \tag{44} \]

by assuming⁵ \hat{h}^k \approx h. Finally, by inserting (43) and (44) into (40), the following is obtained:

\[ E\!\left[ \frac{(\tilde{h}^k u^k)^2}{\sigma_n} \, \frac{e^{-(h u^k)^2 / 2\sigma_n^2}}{\sqrt{2\pi}} \left( D\!\left( -\frac{h u^k}{\sigma_n} \right) + D\!\left( \frac{h u^k}{\sigma_n} \right) \right) \right] = \frac{\mu}{2} \, E\!\left[ \frac{e^{-(h u^k)^2 / 2\sigma_n^2}}{\sqrt{2\pi}} \left( D\!\left( -\frac{h u^k}{\sigma_n} \right) + D\!\left( \frac{h u^k}{\sigma_n} \right) \right) \big\| u^k \big\|^2 \right]. \tag{45} \]

³ Note that because the MCRB is inversely proportional to the gain (28), we actually compute the inverse of the average of the inverse of the gains, that is, the so-called harmonic mean. It provides a slightly lower value than a direct average of the gain. Also note that, for a given ratio, the gains corresponding to the various channel realizations usually differ only by 1-2 dB from the mean value.

⁴ Or, in a more general context, the power of the signal for which the channel needs to be estimated.

⁵ It is not necessary this time to use a Taylor approximation because the Taylor correction is much smaller than the 0-order value.
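The passage from (41) to (43) and from (44) to (45) rests on a few elementary properties of Q(x) and D(x): Q(x)D(x) equals the Gaussian density, the derivative of D is D(x)(D(x) − x), and the zero-order term of (41) cancels. The following Python sketch checks these numerically. It is only an illustration under stated assumptions: the function names (`phi`, `D_dot`, `lhs_41`, `rhs_43`) are hypothetical, the argument `a` stands for the normalized quantity h u^k / \sigma_n, and `eps` stands for the normalized error \tilde{h}^k u^k / \sigma_n.

```python
import math


def Q(x):
    """Gaussian tail function of (12), written with the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def phi(x):
    """Standard Gaussian density, the factor e^{-x^2/2} / sqrt(2*pi) appearing in (43)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def D(x):
    """Weighting function of (15)."""
    return phi(x) / Q(x)


def D_dot(x):
    """Closed-form derivative of D, obtained by differentiating phi(x) / Q(x)."""
    return D(x) * (D(x) - x)


def lhs_41(a, eps):
    """Right-hand side of (41) with true normalized value a and estimate a + eps."""
    return Q(-a) * D(-(a + eps)) - Q(a) * D(a + eps)


def rhs_43(a, eps):
    """First-order approximation (43), with eps the normalized estimation error."""
    return -eps * phi(a) * (D(-a) + D(a))


if __name__ == "__main__":
    for a in (-1.5, -0.3, 0.0, 0.7, 2.0):
        exact = lhs_41(a, 1e-3)
        approx = rhs_43(a, 1e-3)
        print(f"a={a:+.1f}  (41)={exact:+.6e}  (43)={approx:+.6e}")
```

The same identity Q(x)D(x) = phi(x) is what turns the right-hand side of (44) into the factor phi · (D(−x) + D(x)) that appears on both sides of (45).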

[Plot: CRB (variance) versus noise variance, both axes logarithmic; curves: sign method, pilot symbols.]

Figure 3: Modified CRB for a fixed set of interference coefficients, as a function of the noise variance. Interference coefficients are 1e-6 · [−1.4 2.1 −40.9 69.2] (P_interf = 6.5 · 10^{−9}), K = 50 000, the constellation sizes are 32.

Now, since \tilde{h}^k only depends on past noise samples and symbols, it is independent of u^k, and the relation can be rewritten as

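\[ \frac{1}{\sigma_n} \, E\!\left[ \tilde{h}^k \, E_{u^k}\!\left[ \frac{e^{-(h u^k)^2 / 2\sigma_n^2}}{\sqrt{2\pi}} \left( D\!\left( -\frac{h u^k}{\sigma_n} \right) + D\!\left( \frac{h u^k}{\sigma_n} \right) \right) u^k (u^k)^T \right] (\tilde{h}^k)^T \right] = \frac{\mu}{2} \, E_{u^k}\!\left[ \frac{e^{-(h u^k)^2 / 2\sigma_n^2}}{\sqrt{2\pi}} \left( D\!\left( -\frac{h u^k}{\sigma_n} \right) + D\!\left( \frac{h u^k}{\sigma_n} \right) \right) \big\| u^k \big\|^2 \right]. \tag{46} \]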
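For small constellations, the two expectations over u^k in (46) can be evaluated exactly by enumerating every symbol vector rather than by sampling. The sketch below does this under assumptions that are not taken from the paper: unit-variance M-PAM symbols that are independent across users, a normalized coefficient vector `h_bar` equal to h/\sigma_n, and hypothetical function names (`pam_levels`, `weight`, `steady_state_terms`). It returns the matrix-valued inner expectation and the scalar expectation, so that the size of the off-diagonal entries can be inspected directly.

```python
import itertools
import math


def Q(x):
    """Gaussian tail function of (12)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def phi(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def D(x):
    """Weighting function of (15)."""
    return phi(x) / Q(x)


def pam_levels(order):
    """Real M-PAM levels scaled to unit average power (assumed constellation)."""
    raw = [2 * j - (order - 1) for j in range(order)]
    scale = math.sqrt(sum(v * v for v in raw) / order)
    return [v / scale for v in raw]


def weight(a):
    """Common scalar factor of (46) for a normalized argument a = h u / sigma_n."""
    return phi(a) * (D(-a) + D(a))


def steady_state_terms(h_bar, order=4):
    """Exact expectations over u: the matrix E[w u u^T] and the scalar E[w ||u||^2]."""
    n = len(h_bar)
    levels = pam_levels(order)
    prob = 1.0 / (order ** n)  # all symbol vectors are equally likely
    matrix = [[0.0] * n for _ in range(n)]
    scalar = 0.0
    for u in itertools.product(levels, repeat=n):
        a = sum(hb * um for hb, um in zip(h_bar, u))
        w = weight(a) * prob
        for r in range(n):
            for c in range(n):
                matrix[r][c] += w * u[r] * u[c]
        scalar += w * sum(v * v for v in u)
    return matrix, scalar


if __name__ == "__main__":
    matrix, scalar = steady_state_terms([0.0, 0.3, -0.2])
    for row in matrix:
        print(["%.4f" % v for v in row])
    print("scalar term:", "%.4f" % scalar)
```

By construction the trace of the matrix equals the scalar term, and for `h_bar` equal to zero the matrix reduces to (2/π) times the identity, which is the familiar loss factor of a one-bit observation.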

Thanks to the approximations, most of the expectations remaining are taken on u^k only (h is the true interference vector and is fixed), which is much more tractable. Note that the inner expectation on the left-hand term is a matrix while the expectation on the right-hand term is a scalar. This matrix equation, for a fixed interference vector, characterizes the behavior of the various estimates in steady state. Defining the covariance matrix of the estimation error (the superscript k is dropped because it corresponds to the steady-state behavior, but we reintroduce the subscript i corresponding to the line of interest),

\[ R_{e,i} = E\big[ (\tilde{h}_i^{\infty})^T \, \tilde{h}_i^{\infty} \big], \tag{47} \]

and defining

\[ A_{1,i} = E_u\!\left[ \frac{e^{-(\bar{h}_i u)^2/2}}{\sqrt{2\pi}} \big( D(-\bar{h}_i u) + D(\bar{h}_i u) \big) \, u \, u^T \right], \tag{48} \]

\[ a_{2,i} = E_u\!\left[ \frac{e^{-(\bar{h}_i u)^2/2}}{\sqrt{2\pi}} \big( D(-\bar{h}_i u) + D(\bar{h}_i u) \big) \, \| u \|^2 \right], \tag{49} \]

the matrix equation (46) can be rewritten in the simpler form

\[ \mathrm{Tr}\big\{ R_{e,i} A_{1,i} \big\} = \frac{\mu}{2} \sqrt{\sigma_{n,i}^2} \; a_{2,i}. \tag{50} \]

It is readily seen that the definitions (48) and (49) are very similar to the gain definition (27). We have

\[ A_{1,i} \approx \mathrm{diag}\big( G_{i,0} \; \cdots \; G_{i,N-1} \big), \tag{51} \]

where diag(·) denotes the diagonal matrix formed with the given elements. The matrix A_{1,i} can be shown to be approximately diagonal, although the nondiagonal elements are not exactly zero. The diagonal elements are the gains defined in (27). Furthermore,

\[ a_{2,i} = \sum_{m=0}^{N-1} G_{i,m}. \tag{52} \]

In practice, both R_{e,i} and A_{1,i} are approximately diagonal, so the nondiagonal elements can be neglected in (50), and the performance can finally be described by

\[ \sum_{m=0}^{N-1} G_{i,m} \, \sigma_{e,i,m}^2 = \frac{\mu}{2} \sqrt{\sigma_{n,i}^2} \, \sum_{m=0}^{N-1} G_{i,m}, \tag{53} \]

where \sigma_{e,i,m}^2 is the estimation error variance for interference coefficient h_{i,m}. If we further assume that all the estimates corresponding to one line i have the same error variance, it follows that

\[ \sigma_{e,i}^2 = \frac{\mu}{2} \sqrt{\sigma_{n,i}^2}. \tag{54} \]

So, finally, we obtain a very simple expression of the estimation error variance that can be achieved by the algorithm. As we can see, the assumption made in the previous section that the estimation error variances of all interference coefficients are equal between the different lines (\sigma_{e,i}^2 equal for all i) is coherent with this result if the noise variances at the various CPEs are the same. The analysis does not provide any justification for the assumption that the estimation error variances are equal within one line i, as (53) only provides information about the sum (or a weighted sum) of the variances for that line. However, this assumption was verified to be acceptable by simulations.

[Plot: estimation error variance versus P_interf/\sigma_n^2, both axes logarithmic; curves: sign method, LMS with pilots, theoretical performance, CRB.]

Figure 4: Estimation error variance \sigma_e^2 averaged over 1000 simulations, and averaged over all interference coefficients. The stepsizes are kept fixed: \mu = 5 · 10^{−8}, \mu_LMS = 5 · 10^{−4}. The constellation sizes are adjusted according to the SNR. Comparison with the theoretical variance predicted by the analysis and with the CRB.
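The steady-state prediction, an error variance of (\mu/2) times the noise standard deviation, can be compared with a direct simulation of the sign-feedback update (14) on a single line. The sketch below is a toy setting and not the paper's simulation setup: it assumes binary (±1) unit-variance symbols, three interference coefficients, a noise standard deviation of 1, and hypothetical names (`simulate_sign_tracking`, `burn_in`). The measured value is a time average of the squared error after an initial transient, so it should be expected to land near the prediction rather than exactly on it.

```python
import math
import random


def Q(x):
    """Gaussian tail function of (12)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def D(x):
    """Weighting function of (15)."""
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * Q(x))


def simulate_sign_tracking(h, sigma_n, mu, num_blocks, burn_in, seed=0):
    """Run update (14) for one line and return the final estimate and the
    per-coefficient squared error averaged over the blocks after burn_in."""
    rng = random.Random(seed)
    n = len(h)
    h_hat = [0.0] * n
    acc = 0.0
    count = 0
    for k in range(num_blocks):
        # Transmitted symbols of the interfering users (assumed binary, unit variance).
        u = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Symbol error at the receiver: residual interference plus noise, as in (9).
        y = sum(hm * um for hm, um in zip(h, u)) + rng.gauss(0.0, sigma_n)
        # One bit of feedback: the sign of the symbol error.
        z = 1.0 if y >= 0.0 else -1.0
        # Update (14), computed at the transmitter side from z and the known u.
        pred = sum(hm * um for hm, um in zip(h_hat, u))
        step = mu * z * D(-z * pred / sigma_n)
        h_hat = [hm + step * um for hm, um in zip(h_hat, u)]
        if k >= burn_in:
            acc += sum((a - b) ** 2 for a, b in zip(h_hat, h)) / n
            count += 1
    return h_hat, acc / count


if __name__ == "__main__":
    h_true = [0.3, -0.2, 0.1]
    sigma_n, mu = 1.0, 0.01
    h_hat, measured = simulate_sign_tracking(h_true, sigma_n, mu, 60000, 5000, seed=1)
    print("final estimate:", ["%.3f" % v for v in h_hat])
    print("measured error variance:", measured)
    print("predicted (mu/2) * sigma_n:", 0.5 * mu * sigma_n)
```

Repeating the run with a different `sigma_n` at fixed `mu` gives a quick way to observe how the error variance scales with the noise level.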

5.3. Simulation results

The simulations are performed for N = 5 lines, and hence 4 interfering users. The insertion loss and FEXT transfer functions used here come from a set of measurements conducted by France Telecom R&D, which include both the amplitude and phase. A detailed analysis of the measurements is given in [17]. The values used here correspond to a cable of length 300 m, and a tone at frequency around 10 MHz. The other parameters are set according to the standards [18, 19]: the transmitted PSD is limited at −60 dBm/Hz and the noise PSD is −140 dBm/Hz. However, in order to consider different SNR situations, various values around −140 dBm/Hz will be considered. For the computation of the constellation sizes, a target error probability of 10^{−7} is considered with a coding gain of 3 dB and a noise margin of 6 dB.

The first set of simulations aims at comparing the average performance of the proposed method with a classical LMS method. The stepsizes for the proposed algorithm and for the LMS are adjusted so as to provide similar convergence speeds. Several noise variances are investigated. For each one, a set of 1000 simulations is run. Each simulation uses a block of K = 60 000 symbols. The output of the algorithm is taken at the end of the K blocks and the performance (in terms of the estimation error, SIR and SNIR) is averaged over the 1000 simulations. Note that the constellation sizes are always adjusted according to the available SNR on the line. Figure 4 provides the estimation error variance, averaged over all coefficients, for various noise variances (solid line). It is compared to the performance of the LMS using pilots for the same simulation setup (dashed line) but, for a fair comparison, with a lower K_LMS (see Section 3.2). The results are presented as a function of the ratio between P_{\mathrm{interf}} = \sum_m h_{i,m}^2 and \sigma_n^2, that is, an interference-to-noise ratio. It is clear that, for low interference-to-noise ratio, the proposed method provides better performance. On the contrary, when the ratio becomes large (the noise is low or the power of interference too high), the algorithm does not perform well with respect to the LMS algorithm. The reason is that, for lower noise, the sign of the symbol error no longer provides enough information on the amplitude of the interference coefficients. In conclusion, this algorithm is well suited when the noise is approximately of the same amplitude as the remaining interference from crosstalk. So this is perfectly suited to the issue of interest, since, because of the precoding, the interference coefficients that we try to estimate are usually lower than the noise.

For the same set of simulations, Figure 5 shows the performance of the transmission after computing a new precoder with the available estimations. The average SIR (signal-to-interference ratio) and the average SNIR (signal-to-noise-and-interference ratio) before and after the updated precoder are compared. The bottom curve is always the value before the updated precoder and the top curve is the corresponding result after the updated precoder. The results are presented as a function of the SNR that would be available if the interference was totally removed. The figure shows the good performance obtained by the estimation technique. The resulting precoder decreases the interference to at least 10 dB below the noise. Using the SNIR, the corresponding throughput loss for the given tone can be computed for both methods: the proposed one and the LMS method using pilots. As expected, the proposed method brings some gain when the interference-to-noise ratio is low. The bit rate loss can be up to 3-4 times

[Plots: (a) SIR (dB) and (b) SNIR (dB), each versus SNR for user 1 (dB); curves for users 1, 2, and 3.]

Figure 5: (a) Signal-to-interference ratio for different users, averaged over 1000 simulations. (b) Signal-to-noise-and-interference ratio averaged over the same set of simulations. The bottom curve is the result before the updated precoder and the upper curve is the result after the updated precoder.

lower. This bit rate loss is given in Figure 6 using, as an example, a system with 4096 tones and 20 MHz bandwidth. For these values,⁶ the throughput on the studied tone would be on the order of 50–100 kbps depending on the noise variance.

Now, these simulation results can be compared to the analytical developments presented in Section 5. First, the relation (37) between the error variance \sigma_e^2 (averaged over all interference coefficients) and the corresponding achievable SIR after precoding needs to be verified. For the set of simulations described above, using the error variance from Figure 4 and putting it into (37), the results, in terms of SIR, are provided in Figure 7. It shows a very good correspondence with the averaged SIR from Figure 5 obtained through the simulations by actually computing the precoder for each channel estimation.

Then, we can compare the analytical evaluation (54) of the estimation error variance \sigma_e^2 to the simulations. Figure 4 shows the theoretical error variance from (54) in dotted lines. It appears that the prediction provides a very good approximation of the achievable estimation error variance. The difference is probably due to non-steady-state behavior. As a matter of fact, the stepsize is usually taken as the smallest value that provides a good convergence on the limited number of samples K available. Hence, even though the convergence is acceptable, the steady-state behavior is usually not reached on that limited number of samples.

It is interesting to note that, for fixed stepsizes, the error variance for the proposed algorithm is proportional to the square root of the noise variance (54) while the error variance of the LMS is directly proportional to the noise variance [15]. This is predicted by the analytical results and is very well verified in the simulations: the two curves are almost straight lines with different slopes. This behavior also explains why the proposed method is mainly attractive for a low interference-to-noise ratio.

⁶ Another assumption on these would only change the results by a fixed factor.

6. ITERATIVE APPROXIMATE ML ALGORITHM

The biggest drawback of the adaptive algorithm presented above is the necessity to adequately choose the stepsize. As in any LMS-like algorithm, the performance is sensitive to the choice of the stepsize \mu, see (54), and the error variance can be reduced by decreasing \mu. However the convergence speed is also dependent on the stepsize, and decreasing it tends to slow down the convergence of the algorithm. Hence, the achievable performance is usually still several dB from the CRB due to the fact that the convergence has to be assured on a limited number of samples.

In order to achieve better performance, a block algorithm is presented in this section. The principle is to try to approach the true ML estimate of the interference coefficients given the entire blocks of K symbols. So, instead of simplifying the steepest descent algorithm, as it is done in Section 3, the full gradient, involving the summation over all K observations, is used at each step in the iteration. The iterative algorithm can

Figure 6: Throughput loss for one tone corresponding to the obtained signal-to-noise-and-interference ratio (using standard formulas), for a system with 4096 tones and 20 MHz bandwidth. (Axes: rate loss on one tone (bps) versus $P_{\mathrm{interf}}/\sigma_n^2$; curves: sign method, LMS with pilots.)

Figure 7: Comparison between the true averaged SIR (after updating the precoder) and the theoretical SIR given the estimation error variance, for user 1. (Axes: SIR (dB) versus $P_{\mathrm{interf}}/\sigma_n^2$; curves: true value, theoretical.)

simply be written as ⎛ ⎞ K− n n n 1 k k +1 k zi hi u k T ⎝ ⎠ − hi = hi + μzi D − u . (55) 10 10 σ2 k=0 n,i

Note that, now, for each iteration $n$, the summation is done over all $k$. The stepsize can be kept the same as in the LMS-like adaptive algorithm from Section 3. The LMS-like algorithm can be used once on the whole block to provide the initial estimate of the iterative ML algorithm.

The simulations show that this algorithm usually converges in approximately 10 iterations. The convergence can easily be observed through the amplitude of the corrections. Figure 8 shows the results (error variance) of the iterative ML procedure, with 10 iterations following the initial estimation based on the adaptive LMS-like algorithm. It appears that the results are very close to the CRB, as could be expected from an ML algorithm using a large number of observations $K$. This confirms that the iterative procedure converges sufficiently in around 10 iterations.

Figure 8: Results of the iterative ML algorithm (error variance) for 10 iterations, compared to the CRB and other methods. (Axes: estimation error variance versus $P_{\mathrm{interf}}/\sigma_n^2$; curves: iterative ML, theoretical sign adaptive, LMS with pilots, CRB.)

7. CONCLUSIONS

We have proposed a new scheme for the tracking of FEXT channel coefficients in downstream VDSL. This scheme is intended for systems with uncoordinated receivers and coordination at the transmitter, using some kind of precoding scheme to remove the influence of FEXT. The principle is to feed back some limited amount of information about the received signals from the receivers to the transmitter in order to allow the estimation of the crosstalk channels at the transmitter (where the precoder needs to be computed). We have proposed a tracking algorithm, based on the maximum likelihood principle, using the sign of the “symbol error” as feedback. We have computed the Cramer-Rao bound and shown that, for a given number of bits to be used as feedback or pilots, the proposed structure exhibits a better potential when the ratio between the interference power and the noise power is low. The simulation results and the analysis have confirmed that the method performs better than a classical scheme using pilot symbols when this ratio is low, which is the case in the problem of interest, due to the presence of the precoder. Analytical results that have been provided closely approximate the performance of the scheme. Finally, an improved version that has been presented exhibits a performance close to the Cramer-Rao bound. It must be noted that all the results presented here have been provided for real symbols, but the algorithm can easily be extended to complex symbols.

ACKNOWLEDGMENTS

We would like to thank the U-BROAD project partners and in particular the people from France Telecom R&D Labs for providing the crosstalk measurements used in this paper. This research was supported in part by the Commission of the EC under Contract FP6 IST1-506790 (U-BROAD). Parts of this paper were presented at the International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, Pa, USA, March 2005.

REFERENCES

[1] G. Latsoudas and N. D. Sidiropoulos, “On the performance of certain fixed-complexity multiuser detectors in FEXT-limited vectored DSL systems,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05), vol. 3, pp. 889–892, Philadelphia, Pa, USA, March 2005.
[2] R. Cendrillon, M. Moonen, J. Verlinden, T. Bostoen, and G. Ginis, “Improved linear crosstalk precompensation for DSL,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’04), vol. 4, pp. 1053–1056, Montreal, Quebec, Canada, May 2004.
[3] G. Ginis and J. M. Cioffi, “Vectored transmission for digital subscriber line systems,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1085–1104, 2002.
[4] A. Leshem and L. Youming, “A low complexity coordinated FEXT cancellation for VDSL,” in Proceedings of 11th IEEE International Conference on Electronics, Circuits and Systems (ICECS ’04), pp. 338–341, Tel-Aviv, Israel, December 2004.
[5] M. L. Honig, P. Crespo, and K. Steiglitz, “Suppression of near- and far-end crosstalk by linear pre- and post-filtering,” IEEE Journal on Selected Areas in Communications, vol. 10, no. 3, pp. 614–629, 1992.
[6] G. Ginis and J. M. Cioffi, “A multi-user precoding scheme achieving crosstalk cancellation with application to DSL systems,” in Proceedings of the 34th Asilomar Conference on Signals, Systems and Computers (ACSSC ’00), vol. 2, pp. 1627–1631, Pacific Grove, Calif, USA, October 2000.
[7] R. Cendrillon, M. Moonen, E. Van den Bogaert, and G. Ginis, “The linear zero-forcing crosstalk canceller is near-optimal in DSL channels,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’04), vol. 4, pp. 2334–2338, Dallas, Tex, USA, November–December 2004.
[8] L. Tong and S. Perreau, “Multichannel blind identification: from subspace to maximum likelihood methods,” Proceedings of the IEEE, vol. 86, no. 10, pp. 1951–1968, 1998.
[9] Y. Zeng and T.-S. Ng, “A semi-blind channel estimation method for multiuser multiantenna OFDM systems,” IEEE Transactions on Signal Processing, vol. 52, no. 5, pp. 1419–1429, 2004.
[10] M. Morelli and U. Mengali, “A comparison of pilot-aided channel estimation methods for OFDM systems,” IEEE Transactions on Signal Processing, vol. 49, no. 12, pp. 3065–3073, 2001.
[11] S. Coleri, M. Ergen, A. Puri, and A. Bahai, “Channel estimation techniques based on pilot arrangement in OFDM systems,” IEEE Transactions on Broadcasting, vol. 48, no. 3, pp. 223–229, 2002.
[12] D. J. Love and R. W. Heath Jr., “Limited feedback precoding for spatial multiplexing systems,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’03), vol. 4, pp. 1857–1861, San Francisco, Calif, USA, December 2003.
[13] R. W. Heath Jr. and G. B. Giannakis, “Exploiting input cyclostationarity for blind channel identification in OFDM systems,” IEEE Transactions on Signal Processing, vol. 47, no. 3, pp. 848–856, 1999.
[14] J. Louveaux and A.-J. van der Veen, “Downstream VDSL channel tracking using limited feedback for crosstalk precompensated schemes,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05), vol. 3, pp. 337–340, Philadelphia, Pa, USA, March 2005.
[15] A. H. Sayed, Fundamentals of Adaptive Filtering, John Wiley & Sons, Hoboken, NJ, USA, 2003.
[16] M. Moeneclaey, “On the true and the modified Cramer-Rao bounds for the estimation of a scalar parameter in the presence of nuisance parameters,” IEEE Transactions on Communications, vol. 46, no. 11, pp. 1536–1544, 1998.
[17] N. D. Sidiropoulos, E. Karipidis, A. Leshem, and L. Youming, “Statistical characterization and modelling of the copper physical channel,” Tech. Rep., Deliverable D2.1. EU-FP6 STREP project U-BROAD No. 506790, 2004, http://www.metalinkbb.com/site/app/UBoard Publications.asp?year=2004
[18] ETSI, “Transmission and Multiplexing (TM); Access transmission systems on metallic access cables; Very high speed Digital Subscriber Line (VDSL); Part 2: Transceiver specification,” ETSI TS 101 270-2, 2000.
[19] J. A. C. Bingham, ADSL, VDSL, and Multicarrier Modulation, John Wiley & Sons, New York, NY, USA, 2000.

J. Louveaux received the Electrical Engineering degree and the Ph.D. degree from the Université Catholique de Louvain (UCL), Louvain-la-Neuve, Belgium, in 1996 and 2000, respectively. From 2000 to 2001, he was a Visiting Scholar in the Electrical Engineering Department at Stanford University, Calif. He is currently a Postdoctoral Researcher at the Delft University of Technology, The Netherlands. His research interests are in signal processing for digital communications, mainly synchronization issues and high-bit-rate transmission over wired channels. His current specific interests are in crosstalk cancellation techniques in DSL systems. He has served as an Associate Editor for the IEEE Communications Letters since 2003. He is corecipient of the “Prix Biennal Siemens 2000” and the “Prix Scientifique Alcatel 2005.”

A.-J. van der Veen was born in The Netherlands in 1966. He graduated (cum laude) from the Department of Electrical Engineering, Delft University of Technology, in 1988, and received the Ph.D. degree (cum laude) from the same institute in 1993. Throughout 1994, he was a postdoctoral scholar at Stanford University, in the Scientific Computing/Computational Mathematics group and in the Information Systems Lab. At present, he is a Full Professor in the Signal Processing group of DIMES, Delft University of Technology. He is the recipient of a 1994 and a 1997 IEEE SPS Young Author Paper Award, and was an Associate Editor for the IEEE Transactions on Signal Processing (1998–2001), Chairman of the IEEE SPS SPCOM Technical Committee (2002–2004), and Editor-in-Chief of the IEEE Signal Processing Letters (2002–2005). He is currently the Editor-in-Chief of the IEEE Transactions on Signal Processing. His research interests are in the general area of system theory applied to signal processing, and in particular algebraic methods for array signal processing and signal processing for communications.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 43154, Pages 1–12
DOI 10.1155/ASP/2006/43154

Iterative Refinement Methods for Time-Domain Equalizer Design

Güner Arslan,1 Biao Lu,2 Lloyd D. Clark,3, 4 and Brian L. Evans5

1 Silicon Laboratories, Corporate Headquarters, 7000 West William Cannon Drive, Austin, TX 78735, USA
2 Schlumberger Sugar Land Product Center, 110 Schlumberger Drive, Sugar Land, TX 77478, USA
3 Schlumberger Austin Systems Center, 8311 N FM 620 Road, Austin, TX 78726, USA
4 TICOM Geomatics, 9130 Jollyville Road, Austin, TX 78759, USA
5 Department of Electrical and Computer Engineering, The University of Texas, Austin, TX 78712-1084, USA

Received 1 December 2004; Revised 23 May 2005; Accepted 2 August 2005

Commonly used time domain equalizer (TEQ) design methods have been recently unified as an optimization problem involving an objective function in the form of a Rayleigh quotient. The direct generalized eigenvalue solution relies on matrix decompositions. To reduce implementation complexity, we propose an iterative refinement approach in which the TEQ length starts at two taps and increases by one tap at each iteration. Each iteration involves matrix-vector multiplications and vector additions with 2 × 2 matrices and two-element vectors. At each iteration, the optimization of the objective function either improves or the approach terminates. The iterative refinement approach provides a range of communication performance versus implementation complexity tradeoffs for any TEQ method that fits the Rayleigh quotient framework. We apply the proposed approach to three such TEQ design methods: maximum shortening signal-to-noise ratio, minimum intersymbol interference, and minimum delay spread.

Copyright © 2006 Güner Arslan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Multicarrier modulation is a widely used modulation method for reliable high-speed communication. Discrete multitone (DMT) modulation is a popular variant of multicarrier modulation that has been standardized for asymmetric and very high-speed digital subscriber loops (ADSL and VDSL, resp.) [1]. In these applications, a guard sequence known as the cyclic prefix is prepended to each symbol to help the receiver eliminate intersymbol interference (ISI) and perform symbol recovery.

A DMT symbol consists of $N$ samples, and the cyclic prefix is a copy of the last $\nu$ samples of the symbol. The length of the channel impulse response has to be less than or equal to $(\nu + 1)$ samples in order for all ISI to be eliminated. Using a cyclic prefix, however, reduces the channel throughput of a DMT transceiver by a factor of $\nu/(N + \nu)$. Therefore, it is desirable to choose $\nu$ as small as possible.

The ADSL and VDSL standards set $\nu$ to be $N/16$. In the field, however, ADSL and VDSL channel impulse responses can exceed $N/16$ samples. It is up to the equalizer in the receiver to shorten the channel impulse response and to correct for frequency distortion in the shortened channel. These two equalization tasks may be decoupled or combined [2]. In a decoupled approach, the equalizer is a cascade of a time-domain equalizer (TEQ) to shorten the channel, a fast Fourier transform (FFT) to perform multicarrier demodulation, and a frequency-domain equalizer (FEQ) to invert the frequency response of the shortened channel [3]. These three operations are linear. Combined equalization approaches exploit the linearity by either moving the TEQ into the FEQ to yield per-tone equalizers [4], or moving the FEQ into the TEQ to yield complex-valued time-domain equalizer filter banks [5]. Combined equalization approaches yield higher data rates than decoupled approaches for the downstream ADSL case [2].

A TEQ is generally implemented as a finite impulse response (FIR) filter placed at the receiver. The cascade of the channel impulse response and the TEQ forms an effective channel impulse response with length of $\nu + 1$ samples, as shown in Figure 1. (In the case of ADSL, the channel impulse response is actually shortened to $\nu$ samples.) Various design criteria resulting in many different design methods have been proposed to calculate the TEQ coefficients [3, 6–8]. These four cited design methods can be unified as an optimization problem involving a Rayleigh quotient [2]. The generalized eigenvalue solution using matrix decompositions

is in general not practical to implement in real-time on programmable digital signal processors.

Figure 1: Example of the channel impulse response (carrier serving area loop 1), and the shortened channel impulse response obtained with a 16-tap TEQ designed with a maximum shortening signal-to-noise ratio (MSSNR) method. (Axes: amplitude, scaled by $10^{-3}$, versus discrete time; curves: original channel, shortened channel.)

Instead, iterative design methods could be applied. The iterative method could fix the TEQ length, $N_w$, and use gradient descent based on the Rayleigh quotient formulation to iterate towards an optimal answer [9]. The step size must be chosen with care, and scaling (normalization) may be needed at each iteration. Although each iteration depends on matrix-vector multiplications and vector additions involving $N_w \times N_w$ matrices and vectors of length $N_w$, matrix decompositions are avoided.

We propose an iterative refinement approach in which the TEQ length starts at two taps and increases by one tap at each iteration. A maximum TEQ length may be set. Other stopping criteria include the cases in which there is no significant improvement in the objective function over the previous iteration, and cases in which the objective function value has degraded over the previous iteration. Hence, the approach will improve the design at each iteration until it terminates. No step size needs to be chosen and no scaling is needed. Each iteration involves matrix-vector multiplications and vector additions, but only with 2 × 2 matrices and two-element vectors. Provided that the proposed approach completes its initialization step, it can be terminated at any time and a useful TEQ will result. Hence, our approach scales with the available computational resources.

We apply the iterative refinement approach to the objective functions of three different TEQ design methods: maximum shortening signal-to-noise ratio (MSSNR) [6], minimum intersymbol interference (min-ISI) [7], and minimum delay spread (MDS) [8] methods. For each TEQ design method, we develop two iterative refinement algorithms. The divide-and-conquer Rayleigh quotient (DC-Rayleigh) algorithm uses the objective function in Rayleigh quotient form. The divide-and-conquer eigenvector algorithm (DC-eigenvector) optimizes the numerator of the objective function subject to a constraint involving the TEQ. The DC-eigenvector algorithm will have lower implementation complexity than the DC-Rayleigh algorithm, which in turn will have significantly lower complexity than the originally reported TEQ design method.

The rest of the paper is organized as follows. Section 2 summarizes the three TEQ design methods of interest with their objective functions. Section 3 derives the closed-form solutions for the DC-Rayleigh and DC-eigenvector methods. Section 4 applies the DC-Rayleigh and DC-eigenvector methods to three TEQ design methods. Section 5 shows detailed simulation results for the proposed methods. Section 6 concludes the paper.

2. BACKGROUND

In this section, we summarize three existing TEQ design methods and the objective functions they optimize. All methods assume that $\nu$ is the length of the cyclic prefix, that the equalized or effective channel impulse response has a total delay of $\Delta$ samples, and that perfect knowledge of the channel impulse response is available. In ADSL and VDSL, the channel impulse response can be estimated during training. During training, the discrete Fourier transform (DFT) of the channel impulse response is estimated, from which we can obtain the channel impulse response estimate. The effect of channel estimation error on the following TEQ design methods has been quantified in [10].

2.1. The maximum shortening signal-to-noise ratio method

Melsa et al. [6] approach the TEQ design as solely a channel shortening problem. They define a shortening signal-to-noise ratio (SSNR) and derive the optimal TEQ in terms of maximizing SSNR, which is the ratio of the energy inside a window of $(\nu + 1)$ samples starting at sample $(\Delta + 1)$ to the energy outside the same window of the shortened channel impulse response. An ideal shortened channel impulse response would be zero-valued outside the window in order to yield zero ISI and infinite SSNR. The assumption is that the larger the SSNR, the closer the shortened channel impulse response is to the ideal. However, optimizing SSNR is not necessarily equivalent to maximizing bitrate or minimizing bit-error rate; it is only an approximation that makes the TEQ design problem mathematically tractable. Although this method ignores all noise components, simulation results show that it performs comparably well to other methods that take noise into account [2].

Let us define the effective or shortened channel impulse response $h_{\mathrm{eff}}(k)$ as

$$h_{\mathrm{eff}}(k) = h(k) * w(k), \tag{1}$$

where $h(k)$ is the channel impulse response of length $L_h$, $w(k)$ represents the $N_w$ TEQ coefficients, and “$*$” denotes linear convolution. We can represent $h_{\mathrm{eff}}(k)$ in vector form as

$$\mathbf{h}_{\mathrm{eff}} = \left[h_{\mathrm{eff}}(1),\, h_{\mathrm{eff}}(2),\, \ldots,\, h_{\mathrm{eff}}(L_h + N_w - 1)\right]^T. \tag{2}$$

The goal is to choose the TEQ coefficients such that the energy of the effective channel impulse response $\mathbf{h}_{\mathrm{eff}}$ mostly concentrates inside a window with length $\nu + 1$, one sample longer than the cyclic prefix. To accomplish this goal, we split $\mathbf{h}_{\mathrm{eff}}$ into two parts, $\mathbf{h}_{\mathrm{win}}$ and $\mathbf{h}_{\mathrm{wall}}$, which represent samples of the effective channel impulse response inside and outside the window [6], respectively:

$$\begin{aligned}
\mathbf{h}_{\mathrm{win}} &= \left[h_{\mathrm{eff}}(\Delta+1),\, h_{\mathrm{eff}}(\Delta+2),\, \ldots,\, h_{\mathrm{eff}}(\Delta+\nu+1)\right]^T, \\
\mathbf{h}_{\mathrm{wall}} &= \left[h_{\mathrm{eff}}(1),\, \ldots,\, h_{\mathrm{eff}}(\Delta),\, h_{\mathrm{eff}}(\Delta+\nu+2),\, \ldots,\, h_{\mathrm{eff}}(L_h+N_w-1)\right]^T.
\end{aligned} \tag{3}$$

The samples in $\mathbf{h}_{\mathrm{wall}}$ include the samples before the window and the samples after the window. The SSNR objective function [6] is defined as

$$\mathrm{SSNR} = 10 \log_{10} \frac{\text{Energy in } \mathbf{h}_{\mathrm{win}}}{\text{Energy in } \mathbf{h}_{\mathrm{wall}}}, \tag{4}$$

where $\mathbf{h}_{\mathrm{win}}$ and $\mathbf{h}_{\mathrm{wall}}$ can be written in matrix form as shown in the following:

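$$\underbrace{\begin{bmatrix} h_{\mathrm{eff}}(\Delta+1) \\ h_{\mathrm{eff}}(\Delta+2) \\ \vdots \\ h_{\mathrm{eff}}(\Delta+\nu+1) \end{bmatrix}}_{\mathbf{h}_{\mathrm{win}}}
=
\underbrace{\begin{bmatrix}
h(\Delta+1) & h(\Delta) & \cdots & h(\Delta-N_w+2) \\
h(\Delta+2) & h(\Delta+1) & \cdots & h(\Delta-N_w+3) \\
\vdots & \vdots & \ddots & \vdots \\
h(\Delta+\nu+1) & h(\Delta+\nu) & \cdots & h(\Delta+\nu-N_w+2)
\end{bmatrix}}_{\mathbf{H}_{\mathrm{win}}}
\underbrace{\begin{bmatrix} w(0) \\ w(1) \\ \vdots \\ w(N_w-1) \end{bmatrix}}_{\mathbf{w}} \tag{5}$$

$$\underbrace{\begin{bmatrix} h_{\mathrm{eff}}(1) \\ \vdots \\ h_{\mathrm{eff}}(\Delta) \\ h_{\mathrm{eff}}(\Delta+\nu+2) \\ \vdots \\ h_{\mathrm{eff}}(L_h+N_w-1) \end{bmatrix}}_{\mathbf{h}_{\mathrm{wall}}}
=
\underbrace{\begin{bmatrix}
h(1) & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
h(\Delta) & h(\Delta-1) & \cdots & h(\Delta-N_w+1) \\
h(\Delta+\nu+2) & h(\Delta+\nu+1) & \cdots & h(\Delta+\nu-N_w+3) \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & h(L_h-1)
\end{bmatrix}}_{\mathbf{H}_{\mathrm{wall}}}
\underbrace{\begin{bmatrix} w(0) \\ w(1) \\ \vdots \\ w(N_w-1) \end{bmatrix}}_{\mathbf{w}}. \tag{6}$$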
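As an illustration of (1) through (6), the following minimal sketch builds the rows of the convolution matrix, splits them into the window and wall parts, and evaluates the SSNR of (4). It is only a sketch under stated assumptions: indices are 0-based (so the window covers rows `delay` to `delay + nu`), the function names are hypothetical, and the channel and TEQ values are made up for the example rather than taken from the paper.

```python
import math

def convolve(h, w):
    """Linear convolution of two sequences, as in (1)."""
    out = [0.0] * (len(h) + len(w) - 1)
    for i, hi in enumerate(h):
        for j, wj in enumerate(w):
            out[i + j] += hi * wj
    return out

def convolution_rows(h, n_taps):
    """Rows of the convolution matrix: row k dotted with w gives h_eff[k]."""
    rows = []
    for k in range(len(h) + n_taps - 1):
        rows.append([h[k - j] if 0 <= k - j < len(h) else 0.0
                     for j in range(n_taps)])
    return rows

def split_window(rows, delay, nu):
    """Split rows into H_win (rows delay..delay+nu, 0-based) and H_wall (the rest)."""
    win = rows[delay:delay + nu + 1]
    wall = rows[:delay] + rows[delay + nu + 1:]
    return win, wall

def energy(rows, w):
    """Energy of the vector rows @ w, i.e. w^T (rows^T rows) w."""
    return sum(sum(r * wj for r, wj in zip(row, w)) ** 2 for row in rows)

def ssnr_db(h, w, delay, nu):
    """Shortening SNR of (4): window energy over wall energy, in dB."""
    win, wall = split_window(convolution_rows(h, len(w)), delay, nu)
    return 10.0 * math.log10(energy(win, w) / energy(wall, w))

# Hypothetical values chosen only for illustration.
h = [0.1, 1.0, 0.6, 0.36, 0.2, 0.1]
w = [1.0, -0.6]
print("SSNR without TEQ (dB):", ssnr_db(h, [1.0, 0.0], delay=1, nu=1))
print("SSNR with two-tap TEQ (dB):", ssnr_db(h, w, delay=1, nu=1))
```

The matrices are kept as plain lists of rows so that the correspondence with the row structure of (5) and (6) stays visible; a practical implementation would more likely accumulate the window and wall energies directly from the convolved sequence.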

The energy of $\mathbf{h}_{\mathrm{win}}$ and $\mathbf{h}_{\mathrm{wall}}$ in (4) can be written as

$$\begin{aligned}
\mathbf{h}_{\mathrm{wall}}^T \mathbf{h}_{\mathrm{wall}} &= \mathbf{w}^T \mathbf{H}_{\mathrm{wall}}^T \mathbf{H}_{\mathrm{wall}} \mathbf{w} = \mathbf{w}^T \mathbf{A} \mathbf{w}, \\
\mathbf{h}_{\mathrm{win}}^T \mathbf{h}_{\mathrm{win}} &= \mathbf{w}^T \mathbf{H}_{\mathrm{win}}^T \mathbf{H}_{\mathrm{win}} \mathbf{w} = \mathbf{w}^T \mathbf{B} \mathbf{w},
\end{aligned} \tag{7}$$

where the $N_w \times N_w$ matrices are defined as

$$\mathbf{A} = \mathbf{H}_{\mathrm{wall}}^T \mathbf{H}_{\mathrm{wall}}, \qquad \mathbf{B} = \mathbf{H}_{\mathrm{win}}^T \mathbf{H}_{\mathrm{win}}. \tag{8}$$

Note that both $\mathbf{A}$ and $\mathbf{B}$ are real, symmetric and positive definite (excluding the case of ideal equalization which is not possible in practice) by definition. SSNR can then be written in compact form as

$$\mathrm{SSNR} = 10 \log_{10} \frac{\mathbf{w}^T \mathbf{B} \mathbf{w}}{\mathbf{w}^T \mathbf{A} \mathbf{w}}. \tag{9}$$

This form is known as the Rayleigh quotient. The optimal shortening method would find $\mathbf{w}$ to minimize $\mathbf{w}^T \mathbf{A} \mathbf{w}$ while satisfying $\mathbf{w}^T \mathbf{B} \mathbf{w} = 1$ [6]. Solving this problem via the Lagrange multiplier method easily yields the solution that $\mathbf{w}$ should be chosen as the generalized eigenvector of $\mathbf{B}$ and $\mathbf{A}$ corresponding to the largest generalized eigenvalue [11].

The approach in [6] to find the solutions is based on the assumption that $\mathbf{B}$ is positive definite so that it has a Cholesky decomposition $\mathbf{B} = \sqrt{\mathbf{B}}\sqrt{\mathbf{B}}^T$. Then, $\mathbf{l}_{\min}$ is computed as the eigenvector associated with the smallest eigenvalue of the matrix $(\sqrt{\mathbf{B}})^{-1} \mathbf{A} (\sqrt{\mathbf{B}}^T)^{-1}$. Finally, $\mathbf{w}_{\mathrm{opt}} = (\sqrt{\mathbf{B}}^T)^{-1} \mathbf{l}_{\min}$.

A more complicated method in [6] applies when $\mathbf{B}$ is singular. In order to prevent $\mathbf{B}$ from being singular, Yin and Yue [12] suggest an objective function to maximize $\mathbf{w}^T \mathbf{B} \mathbf{w}$ while satisfying the constraint $\mathbf{w}^T \mathbf{A} \mathbf{w} = 1$. In this case, they assume that $\mathbf{A}$ is positive definite since they perform a Cholesky decomposition on $\mathbf{A}$. Both cases [6, 12] require a Cholesky decomposition, an eigendecomposition, and a matrix inversion of an $N_w \times N_w$ matrix to find $\mathbf{w}_{\mathrm{opt}}$.

2.2. The minimum intersymbol interference method

Arslan et al. [7] propose a TEQ design method that can be viewed as a generalization of the MSSNR method. The minimum intersymbol interference (min-ISI) method is also a simplified version of the maximum bitrate TEQ [7] design method that directly optimizes bitrate based on a subchannel SNR model:

$$\mathrm{SNR}_i = \frac{S_{x,i} \left|C_{\mathrm{signal},i}\right|^2}{S_{n,i} \left|C_{\mathrm{noise},i}\right|^2 + S_{x,i} \left|C_{\mathrm{ISI},i}\right|^2}, \tag{10}$$

where $S_{x,i}$, $S_{n,i}$, $C_{\mathrm{signal},i}$, $C_{\mathrm{noise},i}$, and $C_{\mathrm{ISI},i}$ are the signal power, noise power, signal path gain, noise path gain, and ISI path gain in the $i$th subchannel, respectively. The min-ISI method makes use of the observation that the ISI term in the subchannel SNR model is the dominant factor limiting bitrate; hence, minimizing ISI alone would be a viable alternative to the Maximum Bit Rate (MBR) method [7] that otherwise requires nonlinear optimization to calculate the TEQ taps. The objective function for the min-ISI method can also be written as a Rayleigh quotient with matrices $\mathbf{A}$ and $\mathbf{B}$ defined as:

$$\mathbf{A} = \mathbf{H}^T \mathbf{D}^T \left( \sum_{i \in R} \mathbf{f}_i^H S_{x,i} \mathbf{f}_i \right) \mathbf{D} \mathbf{H}, \qquad \mathbf{B} = \mathbf{H}_{\mathrm{win}}^T \mathbf{H}_{\mathrm{win}}, \tag{11}$$

where $\mathbf{H}$ is the channel convolution matrix, $\mathbf{H}_{\mathrm{win}}$ is defined in (5), $\mathbf{f}_i$ is the $i$th row of the $N \times N$ DFT matrix, and $\mathbf{D}$ is a diagonal matrix where the diagonal is defined as

$$g_k = \begin{cases} 0, & \Delta + 1 \le k \le \Delta + \nu + 1, \\ 1, & \text{otherwise.} \end{cases} \tag{12}$$

Compared to the MSSNR method, the min-ISI method holds the energy inside the window of size $(\nu + 1)$ constant while minimizing a frequency-weighted form of the energy outside the window. The frequency weighting is based on the signal energy at a given frequency bin, which can be thought of as the ISI energy. The weighting can also be chosen to take channel noise into account by replacing $S_{x,i}$ in (11) with $S_{x,i}/S_{n,i}$, where $S_{n,i}$ is the noise power in subchannel $i$. This weighting function emphasizes the placement of ISI in the frequencies with high SNR (low noise power). A small amount of ISI power in subchannels with low noise power can reduce the overall SNR dramatically. In subchannels with low SNR, however, the noise power is large enough to dominate the ISI power such that the effect of ISI power on the SNR is negligible.

2.3. The minimum delay spread method

Schur and Speidel [8] propose another approach to shorten a channel impulse response, which can be described as a variation of the MSSNR method. The idea behind this approach is to minimize the square of the delay spread of the effective channel impulse response, which is defined as

$$D = \sqrt{\frac{1}{E} \sum_{n=0}^{L_c} (n - \bar{n})^2 \left|c[n]\right|^2}, \tag{13}$$

where $\mathbf{c}$ is the effective channel impulse response defined as $\mathbf{c} = \mathbf{H}\mathbf{w}$, $L_c$ is the length of the effective channel impulse response, $E = \mathbf{c}^T \mathbf{c}$ is the total energy in the effective channel impulse response, and $\bar{n}$ is the predefined “center of mass.”

If we think of the MSSNR method as weighting the samples of the effective channel impulse response with zero inside the window of size $\nu + 1$ and one elsewhere, the MDS method, on the other hand, weights all samples with the square distance from the “center of mass,” which has a similar function to the $\Delta$ delay parameter in the MSSNR or min-ISI methods. The objective function of this method is the square delay spread, which can be written as a Rayleigh quotient with $\mathbf{A}$ and $\mathbf{B}$ defined as

$$\mathbf{A} = \mathbf{H}^T \mathbf{Q} \mathbf{H}, \qquad \mathbf{B} = \mathbf{H}^T \mathbf{H}, \tag{14}$$

where $\mathbf{Q}$ is a diagonal matrix with the diagonal made of the vector $[(0 - \bar{n})^2, (1 - \bar{n})^2, \ldots, (L_w + L_h - 1 - \bar{n})^2]$, and $\bar{n}$ is the “center of mass.”

3. DIVIDE-AND-CONQUER METHODS

Each method in the previous section requires a Rayleigh quotient to be optimized. The solution to this optimization problem is a generalized eigenvector of the two matrices. Computing the generalized eigenvectors is a computationally challenging task that requires a heavy computational burden and careful scaling to prevent singularities in the matrix computations. In this section, we propose two suboptimal methods called the divide-and-conquer Rayleigh-quotient (DC-Rayleigh) and divide-and-conquer eigenvector (DC-eigenvector) methods that can be used with most objective functions that can be written as a Rayleigh quotient. The proposed DC methods divide the calculation of an $N_w$-tap TEQ into smaller problems of finding two-tap TEQs, one per iteration. A unit-tap constraint is placed on each two-tap TEQ. The proposed methods are computationally efficient and do not require any advanced matrix computation that could cause singularity problems.

3.1. Divide-and-conquer Rayleigh quotient method

The goal is to optimize an objective function of the form

$$J = \frac{\mathbf{w}^T \mathbf{A} \mathbf{w}}{\mathbf{w}^T \mathbf{B} \mathbf{w}}. \tag{15}$$

At the $i$th iteration, $\mathbf{w}_i$ is a $2 \times 1$ vector (a two-tap equalizer), and $\mathbf{A}_i$ and $\mathbf{B}_i$ are $2 \times 2$ matrices. Assuming a unit-tap constraint on each $\mathbf{w}_i$:

$$\mathbf{w}_i = \left[1,\; g_i\right]^T, \tag{16}$$

the objective function becomes

$$J_i = \frac{\mathbf{w}_i^T \mathbf{A}_i \mathbf{w}_i}{\mathbf{w}_i^T \mathbf{B}_i \mathbf{w}_i}
= \frac{\begin{bmatrix} 1 & g_i \end{bmatrix} \begin{bmatrix} a_{1,i} & a_{2,i} \\ a_{2,i} & a_{3,i} \end{bmatrix} \begin{bmatrix} 1 \\ g_i \end{bmatrix}}{\begin{bmatrix} 1 & g_i \end{bmatrix} \begin{bmatrix} b_{1,i} & b_{2,i} \\ b_{2,i} & b_{3,i} \end{bmatrix} \begin{bmatrix} 1 \\ g_i \end{bmatrix}}
= \frac{a_{1,i} + 2 a_{2,i} g_i + a_{3,i} g_i^2}{b_{1,i} + 2 b_{2,i} g_i + b_{3,i} g_i^2}. \tag{17}$$

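Table 1: DC-Rayleigh algorithm steps and complexity analysis for MSSNR, min-ISI, and MDS TEQ design. Only step 3.1 differs among the TEQ methods.

| Step | Description | Multiplications | Additions | Divisions | Square root |
|---|---|---|---|---|---|
| 1 | Initialize $w_{\mathrm{TEQ}} = [1]$ | - | - | - | - |
| 2 | Initialize $h_0 = h$ | - | - | - | - |
| 3 | Repeat for $i = 1, \ldots, N_w - 1$ | - | - | - | - |
| 3.1 MSSNR | $A_i$ (28) and $B_i$ (29) | $2(L_h + i + 1)$ | $2(L_h + i)$ | - | - |
| 3.1 min-ISI | $A_i$ (30) and $B_i$ (29) | $2(L_h + i + N + 2)$ | $2(L_h + i + N + 1)$ | - | - |
| 3.1 MDS | $A_i$ (32) and $B_i$ (33) | $5(L_h + i) + 4$ | $5(L_h + i)$ | - | - |
| 3.2 | $g_{i,1}$ and $g_{i,2}$ from (19) | 14 | 8 | 2 | 1 |
| 3.3 | $J$ in (25) for $g_{i,1}$ and $g_{i,2}$ | 12 | 8 | 2 | - |
| 3.4 | $w_{\mathrm{TEQ}} = w_{\mathrm{TEQ}} * w_i$ | $(i - 1)$ | $(i - 1)$ | - | - |
| 3.5 | $h_i = h_{i-1} * w_{i-1}$ | $(L_h + i)$ | $(L_h + i)$ | - | - |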
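The steps of Table 1 can be sketched for the MSSNR case as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: indices are 0-based, the function names are hypothetical, the window sums of (28) and the wall sums of (29) are evaluated on the current intermediate channel before the new two-tap stage is applied, and the degenerate cases (a vanishing quadratic coefficient in (18), or a slightly negative discriminant caused by rounding) are handled in an ad hoc way that the paper does not discuss.

```python
import math

def convolve(x, y):
    """Linear convolution of two sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

def quad_terms(h, indices):
    """(c1, c2, c3) with sum over k of (h[k] + g*h[k-1])^2 = c1 + 2*c2*g + c3*g^2."""
    def at(k):
        return h[k] if 0 <= k < len(h) else 0.0
    c1 = sum(at(k) ** 2 for k in indices)
    c2 = sum(at(k) * at(k - 1) for k in indices)
    c3 = sum(at(k - 1) ** 2 for k in indices)
    return c1, c2, c3

def ratio(a, b, g):
    """Two-tap objective (17) for w_i = [1, g]."""
    return (a[0] + 2 * a[1] * g + a[2] * g * g) / (b[0] + 2 * b[1] * g + b[2] * g * g)

def best_tap(a, b):
    """Stationary points of (17) from the quadratic (18); return the maximizer."""
    alpha = a[2] * b[1] - a[1] * b[2]
    beta = a[2] * b[0] - a[0] * b[2]
    c = a[1] * b[0] - a[0] * b[1]
    if abs(alpha) < 1e-15:                        # assumption: treat (18) as linear
        roots = [-c / beta] if abs(beta) > 1e-15 else [0.0]
    else:
        gamma = max(beta * beta - 4 * alpha * c, 0.0)    # (20), clipped at zero
        roots = [(-beta + s * math.sqrt(gamma)) / (2 * alpha) for s in (1, -1)]
    return max(roots, key=lambda g: ratio(a, b, g))

def dc_rayleigh_mssnr(h, n_taps, delay, nu):
    """Return (w_teq, h_eff) after n_taps - 1 two-tap refinement stages."""
    w_teq, h_cur = [1.0], list(h)                 # steps 1 and 2
    for _ in range(n_taps - 1):                   # step 3
        n_eff = len(h_cur) + 1                    # length after a two-tap stage
        win = [k for k in range(n_eff) if delay <= k <= delay + nu]
        wall = [k for k in range(n_eff) if not delay <= k <= delay + nu]
        a = quad_terms(h_cur, win)                # step 3.1, window energy terms
        b = quad_terms(h_cur, wall)               # step 3.1, wall energy terms
        g = best_tap(a, b)                        # steps 3.2 and 3.3
        w_teq = convolve(w_teq, [1.0, g])         # step 3.4
        h_cur = convolve(h_cur, [1.0, g])         # step 3.5
    return w_teq, h_cur

# Hypothetical channel chosen only for illustration.
w, h_eff = dc_rayleigh_mssnr([0.1, 1.0, 0.6, 0.36, 0.2, 0.1], n_taps=4, delay=1, nu=1)
print("TEQ taps:", w)
```

Because choosing $g_i = 0$ leaves the window and wall energies unchanged, picking the better stationary point of (17) cannot lower the window-to-wall energy ratio from one stage to the next in the non-degenerate case, which is the behavior the iterative refinement approach relies on.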

The assumption that matrix $\mathbf{B}$ is positive definite prevents the denominator in (17) from going to zero for any value of $g_i$. Inspection of the $\mathbf{B}$ matrices in the three objective functions in the previous section will show that all are symmetric and positive definite by definition.

Differentiating $J_i$ in (17) with respect to $g_i$, setting the derivative to zero, and simplifying the result leads to

$$\left(a_{3,i} b_{2,i} - a_{2,i} b_{3,i}\right) g_i^2 + \left(a_{3,i} b_{1,i} - a_{1,i} b_{3,i}\right) g_i + \left(a_{2,i} b_{1,i} - a_{1,i} b_{2,i}\right) = 0. \tag{18}$$

The solutions to the quadratic function of $g_i$ in (18) are

$$g_{i,(1,2)} = \frac{-\left(a_{3,i} b_{1,i} - a_{1,i} b_{3,i}\right) \pm \sqrt{\gamma}}{2\left(a_{3,i} b_{2,i} - a_{2,i} b_{3,i}\right)}, \tag{19}$$

where $\gamma$ is

$$\gamma = \left(a_{3,i} b_{1,i} - a_{1,i} b_{3,i}\right)^2 - 4\left(a_{3,i} b_{2,i} - a_{2,i} b_{3,i}\right)\left(a_{2,i} b_{1,i} - a_{1,i} b_{2,i}\right). \tag{20}$$

We choose the value of $g_i$ among $\{g_{i,1}, g_{i,2}\}$ in (19) that gives the optimal value for $J_i$. Once the value for $g_i$ is chosen, we have a two-tap TEQ $\mathbf{w}_i$ that maximizes the given objective.

Our goal is to maximize the objective for an $N_w$-tap TEQ. After the first iteration, we convolve the calculated two-tap TEQ with the channel impulse response $h$ to obtain an intermediate effective channel impulse response $h_1$. Assuming that this newly calculated intermediate effective channel impulse response is our new channel, we repeat the above procedure and calculate a new two-tap TEQ and a new intermediate channel impulse response. This process is repeated $N_w - 1$ times so that we have $g_i$, and hence $\mathbf{w}_i$, for $i = 1, \ldots, N_w - 1$. The $N_w$-tap TEQ can then be obtained by convolving all two-tap TEQs together:

$$w(k) = w_1(k) * w_2(k) * \cdots * w_{N_w - 1}(k), \tag{21}$$

where $w_i(k)$ is the two-tap TEQ obtained at the $i$th iteration. Table 1 summarizes the steps of the DC-Rayleigh method.

We can also design a two-tap equalizer with a unit-norm constraint (UNC) as

$$\mathbf{w}_i = \left[\sin\theta_i,\; \cos\theta_i\right]^T. \tag{22}$$

By factoring out $\sin\theta_i$, we can rewrite (22) to obtain

$$\mathbf{w}_i = \left[\sin\theta_i,\; \cos\theta_i\right]^T = \sin\theta_i \left[1,\; \frac{\cos\theta_i}{\sin\theta_i}\right]^T = \sin\theta_i \left[1,\; \eta_i\right]^T. \tag{23}$$

If we substitute (23) into (17), then the $\sin\theta_i$ term would cancel out, which would give the same result as (19).

3.2. Divide-and-conquer eigenvector method

The DC-Rayleigh method finds a suboptimal solution of an objective function described as a Rayleigh quotient. In many cases, however, the denominator term of the objective function is constrained to prevent the trivial all-zero TEQ solution. For example, in the MSSNR and min-ISI methods the denominator term is to constrain the energy inside the window of length $\nu + 1$. In the MDS method the denominator is constraining the total energy in the effective channel impulse response. The DC-Rayleigh method already places a unit-tap constraint on each two-tap TEQ, which prevents the trivial solution.

The DC-eigenvector method is developed to drop the denominator term from the objective function and optimize the numerator only in order to prevent over-constraining the solution space. The problem is reduced to optimizing the quadratic objective function

$$J_i = \mathbf{w}_i^T \mathbf{A}_i \mathbf{w}_i. \tag{24}$$

We apply the same idea in optimizing this objective function by defining a two-tap TEQ as in (16) and rewriting the objective function at the $i$th iteration as

$$J_i = \mathbf{w}_i^T \mathbf{A}_i \mathbf{w}_i = \begin{bmatrix} 1 & g_i \end{bmatrix} \begin{bmatrix} a_{1,i} & a_{2,i} \\ a_{2,i} & a_{3,i} \end{bmatrix} \begin{bmatrix} 1 \\ g_i \end{bmatrix} = a_{1,i} + 2 a_{2,i} g_i + a_{3,i} g_i^2. \tag{25}$$

Differentiating $J_i$ in (25) with respect to $g_i$ and setting the derivative to zero gives

$$g_i = -\frac{a_{2,i}}{a_{3,i}}. \tag{26}$$

Once again we obtain the optimal solution for a two-tap TEQ. Repeating this process $N_w - 1$ times and convolving the resulting two-tap TEQs together, we obtain the $N_w$-tap TEQ.

Note that the DC-Rayleigh method requires the calculation of all entries of both $\mathbf{A}$ and $\mathbf{B}$ matrices at every iteration, but the DC-eigenvector method only requires two entries of the $\mathbf{A}$ matrix to be computed in every iteration. The DC-eigenvector method also does not require a square root operation, which further reduces the computational complexity and is more suitable for real-time implementation on a programmable digital signal processor.

Similar to the DC-Rayleigh method, we can derive the unit-norm constrained DC-eigenvector by replacing $\mathbf{w}_i$ in (25) by (23) to obtain

$$J_{i,\mathrm{UNC}} = \mathbf{w}_i^T \mathbf{A}_i \mathbf{w}_i = \sin\theta_i \begin{bmatrix} 1 & \eta_i \end{bmatrix} \begin{bmatrix} a_{1,i} & a_{2,i} \\ a_{2,i} & a_{3,i} \end{bmatrix} \begin{bmatrix} 1 \\ \eta_i \end{bmatrix} \sin\theta_i = \sin^2\theta_i \left(a_{1,i} + 2 a_{2,i} \eta_i + a_{3,i} \eta_i^2\right), \tag{27}$$

which will make $\eta_i$ equal to $g_i$ in (26) after we solve for $\eta_i$. Therefore, both unit-tap constraint and unit-norm constraint in DC-Rayleigh and DC-eigenvector methods should yield the same performance.

4. APPLICATION OF DIVIDE-AND-CONQUER METHODS

This section gives detailed derivations on how the MSSNR, min-ISI, and MDS objective functions can be used in conjunction with DC-Rayleigh and DC-eigenvector methods. Tables 1 and 2 describe the steps and quantify the computations per iteration for the DC-Rayleigh and DC-eigenvector methods, respectively. Note that only the calculation of the $\mathbf{A}_i$ and $\mathbf{B}_i$ matrices differs between methods.

The delay parameter ($\Delta$ in the min-ISI and MSSNR methods or $\bar{n}$ in the MDS method) is still an important parameter in the DC methods although it does not appear in the derivation of the DC methods themselves.

4.1. Application to MSSNR

In the case of MSSNR, the $\mathbf{A}_i$ and $\mathbf{B}_i$ matrices used at iteration $i$ are written as

$$\mathbf{A}_i = \mathbf{H}_{i,\mathrm{win}}^T \mathbf{H}_{i,\mathrm{win}} = \begin{bmatrix} a_{1,i} & a_{2,i} \\ a_{2,i} & a_{3,i} \end{bmatrix}
= \begin{bmatrix}
\displaystyle\sum_{k=\Delta+1}^{\Delta+\nu+1} h_i^2(k) & \displaystyle\sum_{k=\Delta+1}^{\Delta+\nu+1} h_i(k) h_i(k-1) \\
\displaystyle\sum_{k=\Delta+1}^{\Delta+\nu+1} h_i(k) h_i(k-1) & \displaystyle\sum_{k=\Delta+1}^{\Delta+\nu+1} h_i^2(k-1)
\end{bmatrix}, \tag{28}$$

$$\mathbf{B}_i = \mathbf{H}_{i,\mathrm{wall}}^T \mathbf{H}_{i,\mathrm{wall}} = \begin{bmatrix} b_{1,i} & b_{2,i} \\ b_{2,i} & b_{3,i} \end{bmatrix}
= \begin{bmatrix}
\displaystyle\sum_{k \in S} h_i^2(k) & \displaystyle\sum_{k \in S} h_i(k) h_i(k-1) \\
\displaystyle\sum_{k \in S} h_i(k) h_i(k-1) & \displaystyle\sum_{k \in S} h_i^2(k-1)
\end{bmatrix}, \tag{29}$$

where $h_i(k) = h_{i-1}(k) * w_i(k)$ for $i = 1, \ldots, N_w - 1$, $k = 0, \ldots, L_h + i - 1$, and $h_0(k)$ is the original channel impulse response. The convolution to obtain the new intermediate channel impulse response is simplified by the fact that the first tap of the two-tap TEQ is always set to one; hence, only one multiplication and one addition are required to calculate each tap of the new intermediate impulse response. Also note that $a_{3,i}$ and $b_{3,i}$ are closely related to $a_{1,i}$ and $b_{1,i}$, respectively, in that they differ in only two elements of the sums and hence can be derived from each other without recomputing the square of the sums.

4.2. Application to min-ISI

In the case of min-ISI, the $\mathbf{B}_i$ matrix is the same as given in (29) and the elements of $\mathbf{A}_i$ are defined as

$$\begin{aligned}
a_{1,i} &= \sum_{k \in R} \left( \sum_{n=0}^{N-1} h_i(n) s(k-n) \right)^2, \\
a_{2,i} &= \sum_{k \in R} \left( \sum_{n=0}^{N-1} h_i(n) s(k-n) \right) \left( \sum_{n=0}^{N-1} h_i(n) s(k-1-n) \right), \\
a_{3,i} &= \sum_{k \in R} \left( \sum_{n=0}^{N-1} h_i(n) s(k-1-n) \right)^2,
\end{aligned} \tag{30}$$

where $S = \{1, \ldots, \Delta, \Delta + \nu + 2, \ldots, N\}$ and $s(n)$ is the time-domain equivalent of the frequency-domain weighting func-
This param- tion Sx,i andisdefinedas eter is embedded in the Ai and Bi matrices, as it was in the N−1 original methods. In [2], a range of 15–35 for the delay pa- = j(2π/N)in ± s(n) Sx,ie . (31) rameter caused a change in achieved bitrate of less than 1% i=0 for MDS, ±2% for MSSNR, and ±5% for min-ISI methods. A reasonable initial guess for the delay parameter is the cyclic The application of DC methods to the min-ISI objective prefix length ν (i.e., 32 for downstream ADSL). When using function requires first the calculation of the time-domain the DC methods, a delay search could be performed during weighting function s(n), which can be performed with an the first iteration. N-point inverse FFT. This calculation needs to be done only Guner¨ Arslan et al. 7

Table 2: Implementation complexity of DC-eigenvector algorithms for MSSNR, min-ISI, and MDS TEQ design methods. Only step 3.1 differs among the TEQ methods. Step Description Multiplications Additions Divisions Square root

1 Initialize wTEQ = [1]— — ——

2 Initialize h0 = h ————

3FixΔ.Fori = 1, ..., Nw − 1— — ——

3.1 MSSNR Ai (28)(Lh + i − ν)(Lh + i − ν − 1) — —

3.1 min-ISI Ai (30)(Lh + i − ν)+2(N +1) (Lh + i − ν − 1) + 2(N +1) — —

3.1 MDS Ai (32)3(Lh + i)+2 3(Lh + i)—— 3.2 g (26)— —1—

3.3 wTEQ = wTEQ ∗ wi (i − 1) (i − 1) — —

3.4 hi = hi−1 ∗ wi−1 (Lh + i)(Lh + i)——

once and not for every iteration. However the inner sums comparison of computational complexity, we replace the in (30) are required for every iteration which adds to the eigenvalue decomposition in the original methods by the it- computational complexity compared to the MSSNR objec- erative power method [13] since only the dominant eigen- tive function. As a side benefit, the DC methods get around a value and its corresponding eigenvector are needed. We as- restriction of the min-ISI method that the TEQ length could sume 10 iterations for the power method as in [14]. exceed the CP length by designing two taps at a time. Table 3 summarizes the original TEQ design methods and their computational costs, whereas Tables 1 and 2 sum- 4.3. Application to MDS marize the DC-Rayleigh and DC-eigenvector methods and their computational complexity, respectively. For the MDS objective functions, the Ai and Bi matrix ele- Both proposed methods have reduced computational ments are defined as complexity when compared to the original methods and are better suited for real-time implementation on digital signal Lc processors because they avoid any matrix calculations that a = (k − n)2h2(k), 1,i i require careful scaling. The complexity gap between the orig- k=0 inal and proposed methods increases with increasing Nw be- Lc = − − − − cause the dominant cost savings are from the matrix opera- a2,i (k n)(k 1 n)hi(k)hi(k 1), (32) × k=0 tions performed on Nw Nw matrices in the original meth- ods. Lc a = (k − 1)(k − 1 − n)2h2, Table 4 lists the computational complexity for each of the 3,i i = k=0 methods for a moderate length TEQ of size Nw 16 and N = L = 512, and ν = 32, by assuming 10 iterations in the Lc h = 2 power method for the original methods. The largest com- b1,i hi (k), k=0 plexity reduction is 24% and 15% for the MSSNR objective

Lc function for the DC-Rayleigh and DC-eigenvector methods, b2,i = hi(k)hi(k − 1), (33) respectively. Percentage savings in all cases would increase for k=0 longer TEQs.

Lc = 2 − b3,i hi (k 1). 5. SIMULATION RESULTS k=0 We showed in the previous section that the divide-and- As with the MSSNR method the calculation of both a3,i and conquer methods, especially the DC-eigenvector method, b3,i can be based on a1,i and b1,i, respectively, to avoid recal- have significant complexity savings over the original meth- culating the sum of squares. Even with these savings, how- ods. In this section, we present simulation results to analyze ever, the MDS method requires all sums to be over the entire the bitrate performance of the proposed methods. It is worth length of the intermediate channel impulse response dou- noting that the DC-Rayleigh method communication per- bling the computational complexity compared to the former formance is bounded above by the performance of the origi- two methods. nal method used because it optimizes the same function but two taps at a time. Since we fix the previous taps at every 4.4. Comparison of computational complexity iteration, we are not guaranteed to obtain the optimal solu- tion. It is not possible to determine the upper-bound perfor- We compare the computation complexity of the applica- mance for the DC-eigenvector method in terms of the origi- tion of both the DC-Rayleigh and DC-eigenvector meth- nal methods since the DC-eigenvector method uses different ods to all three objective functions in this section. For a fair constraints compared to the original methods. 8 EURASIP Journal on Applied Signal Processing
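The complexity savings quoted above can be checked by summing the per-iteration multiplication counts of Table 2 over $i = 1, \ldots, N_w - 1$. The following is a minimal sketch under stated assumptions: the function name `dc_eigenvector_multiplications` and the dictionary `ORIGINAL` are hypothetical, the per-step counts are read from steps 3.1, 3.3, and 3.4 of Table 2, and the reference counts for the original methods are copied from Table 4.

```python
def dc_eigenvector_multiplications(method, Nw=16, Lh=512, N=512, nu=32):
    """Sum the multiplication counts of Table 2 over all Nw - 1 iterations.

    Only steps 3.1, 3.3, and 3.4 contain multiplications; step 3.2 is a
    single division and is not counted here.
    """
    total = 0
    for i in range(1, Nw):  # iterations i = 1, ..., Nw - 1
        if method == "MSSNR":
            step_3_1 = Lh + i - nu
        elif method == "min-ISI":
            step_3_1 = (Lh + i - nu) + 2 * (N + 1)
        elif method == "MDS":
            step_3_1 = 3 * (Lh + i) + 2
        else:
            raise ValueError(f"unknown method: {method}")
        step_3_3 = i - 1   # extend the accumulated TEQ by one tap
        step_3_4 = Lh + i  # update the intermediate channel response
        total += step_3_1 + step_3_3 + step_3_4
    return total


# Multiplication counts of the original methods, copied from Table 4.
ORIGINAL = {"MSSNR": 101808, "min-ISI": 110016, "MDS": 111007}

for name, reference in ORIGINAL.items():
    count = dc_eigenvector_multiplications(name)
    print(f"{name}: {count} multiplications, "
          f"{100 * count / reference:.0f}% of the original method")
```

With the Table 4 parameters ($N_w = 16$, $N = L_h = 512$, $\nu = 32$) the sums come to 15 225, 30 615, and 31 335 multiplications, which match the DC-eigenvector rows of Table 4.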

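Table 3: Implementation complexity of the MSSNR, min-ISI, and MDS methods.

| Step | Description | Multiplications | Additions | Divisions |
|---|---|---|---|---|
| 1 | $A$ and $B$ for MSSNR (8) | $N_w(L_h + 2N_w - 2)$ | $N_w(L_h + 2N_w - 3)$ | — |
| 1 | $A$ and $B$ for min-ISI (11) | $N_w(L_h + 2N_w - 1 + N)$ | $N_w(L_h + 2N_w - 2 + N)$ | — |
| 1 | $A$ and $B$ for MDS (14) | $2N_w(L_h + 2N_w - 2) + L_h + N_w - 1$ | $2N_w(L_h + 2N_w - 3) + L_h + N_w - 1$ | — |
| 2 | Cholesky decomposition $\sqrt{B}$ | $N_w^3$ | $N_w^3$ | — |
| 3 | $(\sqrt{B})^{-1}$ [11] | $(5N_w^3 + N_w)/3$ | $(5N_w^3 + N_w)/3$ | — |
| 4 | $C = (\sqrt{B})^{-1} A (\sqrt{B}^T)^{-1}$ | $2N_w^3$ | $2N_w^2(N_w - 1)$ | — |
| 6 | Power method to find eigenvector corresponding to the minimum eigenvalue of $C$ | | | |
| 6.1 | Calculate $C^{-1}$ [11] | $(5N_w^3 + N_w)/3$ | $(5N_w^3 + N_w)/3$ | 0 |
| 6.2 | Initialize $l^{(0)}$ | — | — | — |
| 6.3 | $z^{(k)} = C^{-1} l^{(k-1)}$ | $N_w^2$ | $(N_w - 1)N_w$ | 0 |
| 6.4 | $l_{\mathrm{opt}}^{(k)} = z^{(k)} / \lVert z^{(k)} \rVert$ | $N_w$ | $N_w - 1$ | $N_w$ |
| 6.5 | $\lambda^{(k)} = [l^{(k)}]^T C^{-1} l^{(k)}$ | $(N_w + 1)N_w$ | $N_w^2 - 1$ | 0 |
| 6.6 | If $\lvert \lambda^{(k)} - \lambda^{(k-1)} \rvert >$ threshold, go to step 6.3 | — | — | — |
| 7 | $w_{\mathrm{opt}} = (\sqrt{B}^T)^{-1} l_{\mathrm{opt}}^{(k)}$ | $N_w^2$ | $(N_w - 1)N_w$ | 0 |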
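Steps 6.2 to 6.6 of Table 3 describe a standard power iteration. The sketch below is only an illustration of those steps in plain Python; the names `mat_vec` and `power_method`, the default threshold, and the iteration cap are assumptions and are not taken from the paper. The matrix `M` plays the role of $C^{-1}$, so the dominant eigenvector found here corresponds to the minimum eigenvalue of $C$.

```python
import math


def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def power_method(M, start, threshold=1e-12, max_iter=100):
    """Power iteration following steps 6.2-6.6 of Table 3.

    M     : matrix whose dominant eigenpair is sought (C^{-1} in Table 3)
    start : initial vector l^(0), assumed not orthogonal to the solution
    Returns the eigenvalue estimate and the unit-norm eigenvector estimate.
    """
    vec = list(start)                                    # step 6.2
    lam_prev = None
    lam = None
    for _ in range(max_iter):
        z = mat_vec(M, vec)                              # step 6.3
        norm = math.sqrt(sum(x * x for x in z))
        vec = [x / norm for x in z]                      # step 6.4
        lam = sum(x * y for x, y in zip(vec, mat_vec(M, vec)))  # step 6.5
        if lam_prev is not None and abs(lam - lam_prev) <= threshold:
            break                                        # step 6.6
        lam_prev = lam
    return lam, vec


# Small symmetric example with eigenvalues 3 and 1.
eigenvalue, eigenvector = power_method([[2.0, 1.0], [1.0, 2.0]], [1.0, 0.0])
print(eigenvalue, eigenvector)
```

The complexity comparison in the paper fixes the number of iterations at 10 rather than testing a threshold; passing `max_iter=10` with `threshold=0.0` would mimic that choice.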

Table 4: Tradeoff between bitrate performance and complexity (multiplications) of the original and the two divide-and-conquer variations of each TEQ design method for $\nu = 32$, $N_w = 16$, and $L_h = 512$. Complexity of original methods assumes that the power method is run for 10 iterations.

| Method | Bitrate (Mbps) | Complexity | Bitrate | Complexity |
|---|---|---|---|---|
| MSSNR original | 7.96 | 101 808 | 100% | 100% |
| MSSNR Rayleigh | 7.34 | 23 925 | 92% | 24% |
| MSSNR eigenvector | 7.74 | 15 225 | 97% | 15% |
| Min-ISI original | 8.02 | 110 016 | 100% | 100% |
| Min-ISI Rayleigh | 7.40 | 39 315 | 92% | 36% |
| Min-ISI eigenvector | 7.70 | 30 615 | 96% | 28% |
| MDS original | 7.72 | 111 007 | 100% | 100% |
| MDS Rayleigh | 7.45 | 47 355 | 96% | 43% |
| MDS eigenvector | 7.58 | 31 335 | 98% | 28% |

All simulations are based on the commonly used eight carrier-serving-area (CSA) loops that were obtained from the UT Austin Matlab DMTTEQ Toolbox [15]. The CSA loops are placed in cascade with two fifth-order high-pass Chebyshev filters. The first filter has a turn-on frequency of 4.8 kHz and simulates the effect of the splitter that separates the voice-band from the data-band. The second filter is used to separate the upstream from the downstream in frequency division multiplexing at a frequency of 138 kHz.

A transmit signal power of 26 dBm on a 100 Ω load is assumed. The thermal noise is modeled as white Gaussian noise with −140 dBm/Hz spectral power. Near-end crosstalk noise is introduced for 8 ISDN disturbers as described in the ADSL specifications [1]. All methods make use of an ideal estimate of the channel impulse response.

The delay parameter is chosen based on a heuristic search that gives satisfactory performance with minimal complexity [14]. The method uses a rectangular window of size $\nu + 1$ that is slid over the original channel impulse response so that the SSNR as defined in (4) is maximized. The index of the first nonzero sample of the window is chosen as the delay parameter.

Simulations are carried out for 300 DMT symbols carrying quadrature phase-shift keying (QPSK) signals in all subchannels, except for subchannel 0 (voiceband) which is not used for data, subchannels 1–5 (ISDN band) which are not used for data, and subchannel $N/2 + 1$ which cannot carry complex symbols. At the receiver, an FIR TEQ filters the received noisy signal, and passes its output through the cyclic prefix removal block and the FFT. A one-tap FEQ per subchannel rotates the symbols in each subchannel. (We designed the FEQ tap on each subchannel to be optimal in a mean-squared error sense.) The rotated symbols are then compared to the transmitted symbols in each subchannel, and the difference is the error signal, from which the receiver SNR is calculated. Based on the SNR in each subchannel, we calculate the total bitrate achievable for the given TEQ by using

\[
b_{\mathrm{DMT}} = \sum_{i \in R} \log_2\left(1 + \frac{\mathrm{SNR}_i}{\Gamma}\right), \tag{34}
\]

where $R$ is a set of indices for all used subchannels, $\mathrm{SNR}_i$ is the SNR in subchannel $i$, and the SNR gap $\Gamma$ is 10.8 dB [16].

Figure 1 shows the impulse response of CSA loop 1 and the shortened or effective impulse response obtained with a 16-tap TEQ designed with the proposed MSSNR-DC-Rayleigh method. Figure 2 compares the performance of all nine methods for all 8 CSA loops with the matched-filter bound (MFB) and the case where no TEQ is used at all. The motivation of using a TEQ is apparent due to the gap in communication performance when no TEQ is used. When comparing the original methods among each other in achieved bitrate, the MSSNR and min-ISI methods perform closely while outperforming the MDS method. All methods seem to perform relatively close to the MFB although other TEQ design methods that get even closer to the MFB exist [2]. As expected, the proposed suboptimal divide-and-conquer methods generally perform worse than the original methods. However, on CSA loops 6 and 8, the proposed MDS-DC-eigenvector method actually outperforms the original MDS method. This could be expected since none of the methods directly optimize bitrates but alternative objective functions such as the delay spread in this case. Since optimizing the delay spread is not equivalent to optimizing the bitrate, one can sometimes expect the bitrate to increase while the delay spread decreases.

[Figure 2 plot: bitrate (Mbps) versus CSA loop (1–8) for MFB, MSSNR, MINISI, MDS, MSSNR-RQ, MINISI-RQ, MDS-RQ, MSSNR-EV, MINISI-EV, MDS-EV, and NO-TEQ.]

Figure 2: Performance of all methods on 8 CSA loops with TEQ length $N_w = 16$, symbol length $N = 512$, channel length $L_h = 512$, and cyclic prefix length $\nu = 32$.

Another observation from Figure 2 is that most of the time, the DC-eigenvector method outperforms the DC-Rayleigh method. At first thought, one might think that the DC-Rayleigh method should perform better because it solves the original objective functions as opposed to a simplified one, as the DC-eigenvector method does. As mentioned in Section 3.2, the DC-Rayleigh method may be over-constraining the solution due to the new constraint on the first tap of the two-tap equalizers designed at every iteration. For all three original methods in this paper, the denominator of the Rayleigh quotient serves mostly as a constraint to prevent the all-zero trivial solution for the TEQ. The divide-and-conquer methods already have a unit-tap constraint built into them. Thus, removing the original constraint expands the solution space, and the DC-eigenvector method is able to find better solutions most of the time while having lower implementation complexity. Taking into account the reduction in computational complexity when compared to the DC-Rayleigh method, the DC-eigenvector method seems to be a better choice in general.

The primary motivation of the proposed DC methods is to reduce the computational complexity and avoid matrix computations that require careful scaling in return for some communication performance loss. Figure 3(a) maps all methods referred to in this paper onto a two-dimensional space with one axis representing the computational complexity and the other communication performance. The best solutions would lie in the lower right corner of this map where performance is maximal and computational complexity is minimal. This plot is obtained by averaging the bitrate numbers obtained for each method on each of the eight CSA loops. We see from this plot that the MSSNR-DC-eigenvector method is the choice when computational complexity is the major deciding factor. If, however, communication performance is the only factor, then the original min-ISI or the MSSNR methods seem to be the best choice. Complexity of the original methods assumes that the power method is run for 10 iterations. One could easily argue that the performance gap between the proposed DC methods and the original methods is so small (on the order of 0.5 Mbps) that the extra complexity and implementation hardship due to matrix operations is not justified. This plot also reveals that for all DC-Rayleigh methods there exists a method that gets better performance with lower computational complexity. A similar argument holds for the min-ISI objective function which seems not to perform as well as the other two objective functions when DC methods are applied. The MSSNR-DC-eigenvector method gives on average better performance with less complexity compared to the min-ISI when DC methods are applied.

Figures 3(b)–3(d) show performance for all methods under varying numbers of TEQ taps. The graphs show that most methods settle around an upper-bound performance with a 10-tap TEQ. The DC-Rayleigh methods actually reduce bitrate performance with increasing numbers of taps. This again could be explained by the fact that none of these methods directly optimize the bitrate. It turns out that the DC-Rayleigh method tends to use the additional freedom of more TEQ taps in the wrong direction in terms of bitrate performance.

[Figure 3 plots: (a) complexity (number of multiplications, ×10^4) versus bitrate (Mbps) for MSSNR, MINISI, MDS and their -RQ and -EV variants; (b) achievable bitrate (Mbps) versus number of TEQ taps ($N_w$) for MSSNR, MSSNR-EV, MSSNR-RQ; (c) the same for MINISI, MINISI-EV, MINISI-RQ; (d) the same for MDS, MDS-EV, MDS-RQ.]

Figure 3: With symbol length $N = 512$ and channel length $L_h = 512$, communication performance versus (a) implementation complexity for all methods with TEQ length $N_w = 16$ and cyclic prefix length $\nu = 32$, where the bitrates are taken as the average over all eight CSA loops; (b) TEQ length for MSSNR methods; (c) TEQ length for min-ISI methods; and (d) TEQ length for MDS methods. EV means the DC-eigenvector method and RQ means the DC-Rayleigh method.

Finally we analyze the effect of channel estimation error on each method. The channel estimation error is modeled as additive white Gaussian noise on the ideal (real) channel impulse response. The noisy channel estimate is used in the calculations of the TEQ coefficients with each method while the performance estimation is done using the real channel impulse response.

As shown in Figure 4, the performance of all methods increases with increasing SNR of the channel estimation error. For SNRs higher than 80 dB the original methods outperform the iterative methods by a similar margin as with an ideal channel, so the noise is too low to have an effect on the results. The performance gap between the original and DC methods increases for SNRs lower than 80 dB–90 dB. The worst-case additional performance loss of the DC methods over the original methods is around 3% for the MSSNR and min-ISI-based DC methods and about 14% for the MDS-based DC methods. So we can conclude that the MDS method is more sensitive to channel estimation errors when used in conjunction with the proposed DC methods. This conclusion also agrees with the results in [10].

Although the DC-Rayleigh method follows the trend of the original method with drastic performance reductions at lower SNRs, the DC-eigenvector method delivers about 30–40% of the peak bitrate even with bad channel estimates. This again may be explained by the fact that the DC-eigenvector methods are less constrained and hence have a larger space to find a better solution even with noisy channel estimates. The original methods as well as the DC-Rayleigh methods practically stop working at low SNR situations, delivering only about 10% of the peak bitrate with 20 dB estimation noise.

6. CONCLUSION

The design of a time-domain equalizer (TEQ) for discrete multitone modulation has been studied extensively and a number of methods can deliver bitrates close to the upper bound of achievable performance. Many of these high-performance methods can mathematically be classified as an optimization of a Rayleigh quotient, which requires computationally intensive matrix decompositions to solve directly. The focus of this paper is to reduce the computational complexity by avoiding matrix decompositions. We propose an iterative refinement approach in which the TEQ length starts at two taps and increases by one tap at each iteration.
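The two closed-form two-tap designs and the tap-by-tap growth of the TEQ can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the inputs are the scalar entries $(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$ of the symmetric $2 \times 2$ matrices $A_i$ and $B_i$, the coefficient of $g_i^2$ in (18) is assumed to be nonzero, and whether the Rayleigh quotient should be maximized or minimized is left to the caller.

```python
import math


def rayleigh_quotient(a, b, g):
    """J(g) for w = [1, g]^T with symmetric 2x2 A and B given by their entries."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a1 + 2 * a2 * g + a3 * g * g) / (b1 + 2 * b2 * g + b3 * g * g)


def two_tap_rayleigh(a, b, maximize=True):
    """Solve the quadratic (18) via (19)-(20) and keep the better root."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    qa = a3 * b2 - a2 * b3               # coefficient of g^2 (assumed nonzero)
    qb = a3 * b1 - a1 * b3               # coefficient of g
    qc = a2 * b1 - a1 * b2               # constant term
    gamma = qb * qb - 4.0 * qa * qc      # discriminant (20)
    roots = [(-qb + s * math.sqrt(gamma)) / (2.0 * qa) for s in (1.0, -1.0)]
    pick = max if maximize else min
    return pick(roots, key=lambda g: rayleigh_quotient(a, b, g))


def two_tap_eigenvector(a):
    """Stationary point (26) of the quadratic (25); a minimum when a3 > 0."""
    _, a2, a3 = a
    return -a2 / a3


def convolve(x, y):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def cascade(gains):
    """Combine two-tap TEQs [1, g_i] into one TEQ as in (21)."""
    w = [1.0]
    for g in gains:
        w = convolve(w, [1.0, g])
    return w


g = two_tap_rayleigh((2.0, 0.5, 1.0), (1.0, 0.2, 3.0))
print(g, rayleigh_quotient((2.0, 0.5, 1.0), (1.0, 0.2, 3.0), g))
print(cascade([0.5, -0.25]))  # three taps from two two-tap stages
```

Each pass through `cascade` adds one tap, so $N_w - 1$ two-tap stages give an $N_w$-tap TEQ. In a full design loop the entries of $A_i$ and $B_i$ would be recomputed from the intermediate channel response after every stage, as in Section 4.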

[Figure 4 plots: achievable bitrate (Mbps) versus channel estimation SNR (dB), 20 to 100 dB, for (a) MSSNR, MSSNR-EV, MSSNR-RQ; (b) MINISI, MINISI-EV, MINISI-RQ; (c) MDS, MDS-EV, MDS-RQ.]

Figure 4: Performance of all methods on CSA loop 1 with TEQ length Nw = 16, symbol length N = 512, channel length Lh = 512, cyclic prefix length ν = 32, and SNR of channel estimation error. Estimation error is modeled as additive white Gaussian noise. Performance is averaged over 10 runs.
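A noisy channel estimate of the kind used for Figure 4 could be generated as sketched below. This is an assumption-laden illustration: the function names are hypothetical, and the channel estimation SNR is taken here to be the ratio of the average power of the impulse response samples to the noise variance, a definition the text does not spell out. The exponentially decaying test response is synthetic and stands in for a CSA loop.

```python
import math
import random


def noisy_channel_estimate(h, snr_db, rng):
    """Add white Gaussian noise to h at the requested estimation SNR.

    Assumption: SNR = (mean of h[k]^2) / (noise variance).
    """
    signal_power = sum(x * x for x in h) / len(h)
    noise_std = math.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    return [x + rng.gauss(0.0, noise_std) for x in h]


def estimation_snr_db(h, h_est):
    """Measured SNR of an estimate relative to the true response."""
    signal = sum(x * x for x in h)
    error = sum((e - x) ** 2 for e, x in zip(h_est, h))
    return 10.0 * math.log10(signal / error)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic 512-tap response used only for illustration.
h_true = [math.exp(-k / 50.0) * math.cos(0.3 * k) for k in range(512)]
h_noisy = noisy_channel_estimate(h_true, 40.0, rng)
print(f"measured estimation SNR: {estimation_snr_db(h_true, h_noisy):.1f} dB")
```

In an experiment along the lines of Figure 4, the TEQ would be designed from `h_noisy`, the bitrate would be evaluated with `h_true`, and the result would be averaged over several noise realizations (10 in the paper).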

The first method is the divide-and-conquer Rayleigh quotient (DC-Rayleigh) method. The DC-Rayleigh method gives an approximate solution to the Rayleigh quotient optimization problem. Our simulation results show that the proposed DC-Rayleigh method gives close to ideal performance with reduced computational complexity. The fact that the proposed DC-Rayleigh method introduces an additional unit-tap constraint on the solutions motivates us to further simplify the TEQ design methods by dropping the divisor term from the Rayleigh quotient. This yields quadratic cost functions with eigenvector solutions. The second method is the divide-and-conquer eigenvector method (DC-eigenvector), which solves the eigenvalue problem approximately with further reduced complexity.

We apply both divide-and-conquer methods to optimize the objective functions of three different TEQ design methods. The methods are the maximum shortening signal-to-noise ratio (MSSNR), minimum intersymbol interference (min-ISI), and minimum delay spread (MDS). Complexity analysis and simulation results show that the proposed methods reduce the computational complexity of the original methods with minor performance degradation. In fact, the proposed iterative refinement approach provides a range of communication performance versus implementation complexity tradeoffs for any TEQ method that fits the Rayleigh quotient framework. The measure of communication performance depends on the objective function used by the TEQ method.

REFERENCES

[1] “ANSI T1.413-1995, Network and customer installation interfaces: Asymmetrical digital subscriber line (ADSL) metallic interface,” printed from: Digital Subscriber Line Technology by T. Starr, J. M. Cioffi, and P. J. Silverman, Prentice-Hall, 1999.
[2] R. K. Martin, K. Vanbleu, M. Ding, et al., “Unification and evaluation of equalization structures and design algorithms for discrete multitone modulation systems,” IEEE Transactions Signal Processing, vol. 53, no. 10, part 1, pp. 3880–3894, 2005.
[3] J. S. Chow and J. M. Cioffi, “A cost-effective maximum likelihood receiver for multicarrier systems,” in Proceedings of IEEE International Conference on Communications (ICC ’92), vol. 2, pp. 948–952, Chicago, Ill, USA, June 1992.
[4] K. Van Acker, G. Leus, M. Moonen, O. van de Wiel, and T. Pollet, “Per tone equalization for DMT-based systems,” IEEE Transactions on Communications, vol. 49, no. 1, pp. 109–119, 2001.
[5] M. Ding, Z. Shen, and B. L. Evans, “An achievable performance upper bound for discrete multitone equalization,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’04), vol. 4, pp. 2297–2301, Dallas, Tex, USA, November–December 2004.
[6] P. J. W. Melsa, R. C. Younce, and C. E. Rohrs, “Impulse response shortening for discrete multitone transceivers,” IEEE Transactions on Communications, vol. 44, no. 12, pp. 1662–1672, 1996.
[7] G. Arslan, B. L. Evans, and S. Kiaei, “Equalization for discrete multitone transceivers to maximize bit rate,” IEEE Transactions Signal Processing, vol. 49, no. 12, pp. 3123–3135, 2001.
[8] R. Schur and J. Speidel, “An efficient equalization method to minimize delay spread in OFDM/DMT systems,” in Proceedings of IEEE International Conference on Communications (ICC ’01), vol. 5, pp. 1481–1485, Helsinki, Finland, June 2001.
[9] R. K. Martin, K. Vanbleu, M. Ding, et al., “Implementation complexity and communication performance tradeoffs in discrete multitone modulation equalizers,” to appear in IEEE Transactions Signal Processing, http://www.ece.utexas.edu/ bevans/papers/2005/equalizationII.
[10] M. Ding, B. L. Evans, and I. Wong, “Effect of channel estimation error on bit rate performance of time domain equalizers,” in Proceedings of 38th IEEE Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 2056–2060, Pacific Grove, Calif, USA, November 2004.
[11] G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, Md, USA, 3rd edition, 1996.
[12] C. Yin and G. Yue, “Optimal impulse response shortening for discrete multitone transceivers,” IEE Electronics Letters, vol. 34, no. 1, pp. 35–36, 1998.
[13] J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, Pa, USA, 1997.
[14] B. Lu, L. D. Clark, G. Arslan, and B. L. Evans, “Fast time-domain equalization for discrete multitone modulation systems,” in Proceedings of IEEE Digital Signal Processing Workshop, Hunt, Tex, USA, October 2000.
[15] G. Arslan, M. Ding, B. Lu, M. Milosevic, Z. Shen, and B. L. Evans, “MATLAB DMTTEQ Toolbox 3.1,” 2003, Available at: http://www.ece.utexas.edu/∼bevans/projects/adsl/dmtteq/dmtteq.html.
[16] N. Al-Dhahir and J. M. Cioffi, “A bandwidth-optimized reduced-complexity equalized multicarrier transceiver,” IEEE Transactions on Communications, vol. 45, no. 8, pp. 948–956, 1997.

Güner Arslan received his Ph.D. degree in electrical engineering (2000) at the University of Texas at Austin in Austin, Texas, USA. His dissertation was entitled Equalization for Discrete Multitone Transceivers. He received his M.S. degree in electronics and communications engineering from Yildiz Technical University, Istanbul, Turkey in 1998, and his B.S. degree in electronics and communications engineering as Valedictorian of his class from Yildiz University, Kocaeli, Turkey in 1994. He is currently a Senior Systems Design Engineer in the wireless product division of Silicon Laboratories based in Austin, Texas, USA. He is also an Adjunct Faculty in the Electrical and Computer Engineering Department at the University of Texas at Austin. His research interests are in digital signal processing, communications systems, and embedded real-time digital signal processing. He is a Member of IEEE Signal Processing and Communication Societies.

Biao Lu received her Ph.D. degree in electrical engineering (2000) and M.S.E.E. degree (1997) from the University of Texas at Austin in Austin, Texas, USA. Her dissertation was entitled Wireline Channel Estimation and Equalization. She received her B.S. degree in biomedical engineering (1992) from the Capital Institute of Medicine in Beijing, China. Since 2000, she has been with Schlumberger in Houston, Texas, USA, where she is currently a Senior Software Engineer in telemetry systems. Her research interests include signal processing, image processing, neural networks, and embedded systems.

Lloyd D. Clark received his B.S., M.S., and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology in 1984, 1986, and 1990, respectively. From 1990 to 2003, he held various positions at the Schlumberger Austin Technology Center in Austin, Texas, USA, including Principal Engineer and Research Scientist. While at Schlumberger, he designed and developed wireline telemetry systems for well logging applications for the oil field, as well as wireless metering systems. Since 2004, he has been a Principal Scientist with Ticom Geomatics in Austin, Texas, USA, where he has been the technical lead on several wireless geolocation projects. He holds several patents, has published several technical papers, and has coadvised graduate students at both MIT and the University of Texas at Austin.

Brian L. Evans is Professor of Electrical and Computer Engineering at the University of Texas at Austin in Austin, Texas, USA. His B.S.E.E.C.S. (1987) degree is from the Rose-Hulman Institute of Technology in Terre Haute, Indiana, USA, and his M.S.E.E. (1988) and Ph.D.E.E. (1993) degrees are from the Georgia Institute of Technology in Atlanta, Georgia, USA. From 1993 to 1996, he was a Postdoctoral Researcher at the University of California, Berkeley, in design automation for embedded digital systems. At UT Austin, his research group develops signal quality bounds, optimal algorithms, low-complexity algorithms, and real-time embedded software of high-quality image halftoning for desktop printers, smart image acquisition for digital still cameras, high-bitrate equalizers for multicarrier ADSL receivers, and resource allocation for multiuser OFDM base stations. He is the architect of the Signals and Systems Pack for Mathematica. He received a 1997 US National Science Foundation CAREER Award.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 98738, Pages 1–10
DOI 10.1155/ASP/2006/98738

Near-Capacity Coding for Discrete Multitone Systems with Impulse Noise

Masoud Ardakani,1 Frank R. Kschischang,2 and Wei Yu2

1 Department of Electrical and Computer Engineering, University of Alberta, ECERF Building, Edmonton, AB, Canada, T6G 2V4
2 Department of Electrical and Computer Engineering, University of Toronto, 10 King’s College Road, Toronto, ON, Canada M5S 3G4

Received 1 December 2004; Revised 28 April 2005; Accepted 9 June 2005

We consider the design of near-capacity-achieving error-correcting codes for a discrete multitone (DMT) system in the presence of both additive white Gaussian noise and impulse noise. Impulse noise is one of the main channel impairments for digital subscriber lines (DSL). One way to combat impulse noise is to detect the presence of the impulses and to declare an erasure when an impulse occurs. In this paper, we propose a coding system based on low-density parity-check (LDPC) codes and bit-interleaved coded modulation that is capable of taking advantage of the knowledge of erasures. We show that by carefully choosing the degree distribution of an irregular LDPC code, both the additive noise and the erasures can be handled by a single code, thus eliminating the need for an outer code. Such a system can perform close to the capacity of the channel and, for the same redundancy, is significantly more immune to the impulse noise than existing methods based on an outer Reed-Solomon (RS) code. The proposed method has a lower implementation complexity than the concatenated coding approach.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

The design of error control codes for discrete multitone (DMT) systems is of great interest for applications such as digital subscriber lines (DSL) [1–6]. In a DMT system, different constellations may be used in different tones and nonbinary high-order modulation formats are typically used. Current asymmetric DSL (ADSL) standards use a trellis-coded modulation scheme concatenated with an outer Reed-Solomon (RS) code. The inner trellis code provides a coding gain for an additive white Gaussian noise (AWGN) channel, and the outer RS code offers additional noise immunity, especially against impulse noise.

This paper is motivated by the phenomenal success of turbo and low-density parity-check (LDPC) codes in the past decade. It is now possible to design codes that perform within a fraction of a decibel (dB) from the Shannon limit in an additive white Gaussian noise channel. However, the use of turbo codes and LDPC codes in DMT systems is not yet widespread. This is in part due to the fact that the effect of impulse noise on turbo or LDPC codes has not yet been studied in depth. Impulse noise is one of the main channel impairments in DSL. The main focus of this paper is the design of LDPC codes for a DMT system in an impulse-noise environment.

In their original forms, both turbo and LDPC codes are binary codes. In the low signal-to-noise-ratio (SNR) regime, where binary modulation is spectrally efficient, binary turbo codes and binary low-density parity-check (LDPC) codes can effectively approach the capacity of many channels (e.g., [7–11]). A DMT system, however, often includes higher-order constellations that are not necessarily binary. This necessitates the use of nonbinary alphabets or the use of multilevel coding techniques. Theoretically, multilevel coding [12], when combined with capacity-achieving binary codes, is capable of achieving the capacity of higher-order modulation. In practice, schemes such as bit-interleaved coded modulation (BICM) [13] can offer a performance very close to the capacity of higher-order modulation.

The use of multilevel coding for DMT systems has been studied by various authors for DSL and power-line communication channels [2, 4–6]. For example, in [5] a regular high-rate LDPC code is used for error correction in a DSL transmission system, achieving a coding gain of 6 dB at a symbol error rate (SER) of 10^−7. In [2], turbo codes have been used for ADSL and a coding gain of 6 dB at an SER of 10^−6 (equivalent to approximately 6.8 dB at an SER of 10^−7) is reported. In [6], the idea of using LDPC codes together with coded modulation is used, although the issue of code optimization is not addressed. In [4], irregular LDPC

x n codes—optimized for Gaussian channels—are used for data + 1 − ε transmission over power-line channels and an average cod- x ing gain of about 7.5dBatanSERof10−7 is achieved. These + coding gains can be compared with a 5.5 dB gain which can ε be achieved by a 512-state trellis code on an AWGN channel E at an SER of 10−7 [14]. n The above-mentioned systems all assume an additive Gaussian noise model. The effect of impulsive noise is ei- ther ignored, or not considered in the code design. How- Figure 1: Channel model: the concatenation of an AWGN channel ever, impulse noise is a major channel impairment in digital with an erasure channel. subscriber lines. A large impulse often causes an entire DMT symbol to be corrupted [15] which can seriously deteriorate overall system performance. Current DSL standards rely on an LDPC code optimization technique is described. The extensive interleaving and an outer RS code to provide im- optimization technique is capable of finding LDPC codes pulse protection. Much work has been done (e.g., see [15–20] that perform close to the capacity of the effective “bit- and the references therein) on the characteristics of impulse channel.” Section 5 provides simulation results and compari- noise and the use of RS codes for impulse protection. ff son with existing solutions. Finally, conclusions are drawn in One way to combat the e ect of impulse noise is the idea Section 6. of impulse detection and erasure. As an impulse is typically a large signal that occurs for an extended period of time and across a large number of frequency tones, reliable detection 2. CHANNEL MODEL AND PROBLEM STATEMENT of impulses is often possible. When an impulse is detected, an Throughout this work we assume that the DMT symbols that erasure can then be declared. The declaration of erasure pro- are corrupted by impulsive noise are known to the receiver, vides side information to the decoder of the error-correcting ff and can therefore be replaced by erasures. 
In [19]aprocess code and e ectively increases its error-correcting capabil- called erasing is used to detect the impulse noise and to in- ity. For example, in a system with an RS code, the error- dicate the location of the potentially corrupted data in an correcting capability can be increased by a factor of two when RS block. This way, the error-correcting capability of the RS erasures are declared [18, 19]. Under the erasure model, the code is increased by a factor of two. There are many different impulse channel is essentially a concatenation of an additive methods for detecting the impulse noise at the receiver, for noise channel and an erasure channel. example, [18–20]. The detection of impulse noise is usually LDPC codes have excellent performance on both additive based on the observation that when a noise burst occurs, the noise channels (e.g., AWGN) and the binary erasure channel squared distance between several of the received signals in a (BEC) alone. However, the design and performance of these DMT symbol and the constellation points exceeds a thresh- codes over channels with both additive noise and erasures old. Therefore, using square distance-based methods, the im- have not—to the best of our knowledge—been studied. This pulsive noise can be replaced with an erasure in the channel. motivates us to study the design and the analysis of LDPC Figure 1 shows the channel model that we consider in this codes over a channel model seen in DMT systems in impul- study. The input to this channel, x, is a point selected from sive environments. a constellation A ⊂ C of size 2l. The output, y ∈ C ∪{E}, In this study, we rigorously analyze an impulse-noise of this channel is either erasure E with probability  or x + n channel model and design an irregular LDPC code specifi- with probability 1 − ,wheren is assumed to be a sample of cally for the impulse noise channel. 
Our main contribution a complex zero-mean circularly-symmetric white Gaussian is a design methodology that enables the design of capacity- noise with variance 2σ2. The erasure output of this channel approaching LDPC codes for channels with both additive model reflects the effect of the impulse noise and the additive Gaussian noise and erasures. Such an optimized LDPC code, noise reflects the effect of the AWGN channel. when combined with bit-interleaved coded modulation, out- Erasures can be naturally incorporated in the message- performs both conventional systems with RS codes and sys- passing decoding of LDPC codes (see Section 3.2). However, tems with LDPC codes optimized for AWGN channel alone it is not clear that a code optimized for an AWGN channel in a DMT system. In fact, such a design can effectively handle would also remain optimized for a channel with both AWGN both the additive Gaussian noise and the erasures introduced and erasures. One of the objectives of this paper is to design by the DMT channel, thus eliminating the need for an outer LDPC codes specifically for such channels. code. This results in hardware—and software—complexity Since in this work we use multilevel coding, the trans- savingsaswellashighercoderates. mission of the signal will be modelled as the transmission of The rest of the paper is organized as follows. In Section 2, the bits of the binary labels used for labelling different con- the impulse-noise channel model is introduced. In Section 3, stellation points. Thus, we are most interested in the charac- a review of multilevel coding and density evolution analy- teristics of the channels seen by the label bits. We call these sisforLDPCcodesisprovided.InSection 4, a coding struc- channels bit-channels, and will study them after a brief re- ture based on BICM and LDPC coding is established; the view of multilevel coding and a description of our proposed characteristics of the effective “bit-channel” are studied; and system. 
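The squared-distance test described in this section can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the detector of [18–20]: the function names, the use of one constellation for all tones, and the tuning parameters `distance_threshold` and `min_count` are hypothetical and introduced here only for the example.

```python
def min_squared_distance(y, constellation):
    """Squared Euclidean distance from y to the nearest constellation point."""
    return min(abs(y - a) ** 2 for a in constellation)


def is_impulse_hit(received_tones, constellation, distance_threshold, min_count):
    """Declare an erasure for a DMT symbol when at least `min_count` tones lie
    farther (in squared distance) than `distance_threshold` from every
    constellation point. Both parameters are hypothetical tuning knobs."""
    far_tones = sum(
        1
        for y in received_tones
        if min_squared_distance(y, constellation) > distance_threshold
    )
    return far_tones >= min_count


# Toy example: 4-QAM on every tone (an assumption; practical DMT systems
# load different constellations on different tones).
QAM4 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
clean_symbol = [1.1 + 0.9j, -0.95 - 1.05j, 1.0 - 1.2j, -1.1 + 1.0j]
hit_symbol = [5.0 + 4.0j, -6.0 - 3.5j, 0.9 - 1.1j, 4.2 - 5.0j]

clean_flag = is_impulse_hit(clean_symbol, QAM4, distance_threshold=1.0, min_count=2)
hit_flag = is_impulse_hit(hit_symbol, QAM4, distance_threshold=1.0, min_count=2)
```

In this toy setting the mildly perturbed symbol is kept and the heavily perturbed one is flagged, after which all of its label bits would be treated as erased.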
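The channel model of Figure 1 can also be written as a few lines of Python. The sketch below is only an illustration: the function name, the use of `None` to stand for the erasure symbol $E$, and the per-point (rather than per-DMT-symbol) erasure draw are assumptions made here for brevity.

```python
import random

ERASURE = None  # stands in for the erasure symbol E of Figure 1


def impulse_awgn_channel(x, sigma, eps, rng):
    """Return ERASURE with probability eps; otherwise return x + n, where n is
    complex Gaussian noise with variance sigma**2 per real dimension
    (total variance 2 * sigma**2)."""
    if rng.random() < eps:
        return ERASURE
    noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
    return x + noise


rng = random.Random(0)          # fixed seed so the run is repeatable
sent = [1 + 1j] * 20000         # repeatedly send one constellation point
received = [impulse_awgn_channel(x, sigma=0.5, eps=0.06, rng=rng) for x in sent]
erased_fraction = sum(y is ERASURE for y in received) / len(received)
```

With a long enough run, `erased_fraction` should settle near the chosen erasure rate, while the surviving outputs are noisy copies of the input.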

3. BACKGROUND

In this section we briefly review the idea of multilevel coding, the structure of LDPC codes, and their analysis techniques.

3.1. Multilevel coding

The main idea of multilevel coding [12] is to label each point of a nonbinary constellation $\mathcal{A} = \{a_0, a_1, \ldots, a_{N-1}\}$ of size $N = 2^l$ with a binary address $b = (b_0, b_1, \ldots, b_{l-1})$ and use binary codes to protect each address bit $b_i$ by an individual binary code $C_i$ at level $i$ [21].

Let $A$ and $Y$ represent the random variables corresponding to the transmitted and received signals, respectively. Also consider $(b_0, b_1, \ldots, b_{l-1})$ as the vector of random variables corresponding to the address bits. Since a one-to-one mapping between the address vector and the constellation points exists, we have

\[ I(Y; A) = I(Y; b_0, b_1, \ldots, b_{l-1}) \tag{1} \]

which, using the mutual information chain rule, can be written as

\[ I(Y; A) = I(Y; b_0) + I(Y; b_1 \mid b_0) + \cdots + I(Y; b_{l-1} \mid b_0, b_1, \ldots, b_{l-2}). \tag{2} \]

This decomposition of the mutual information implies that transmission of the signal over the physical channel can be thought of as parallel transmission of individual bits. Due to the labelling scheme, each address bit sees an effective channel (bit-channel) whose capacity can be different from the capacity of other bit-channels. An individual code should be used for each bit-channel, and it is known that if each code achieves the capacity of its bit-channel, the total capacity of the channel is achieved [21]. A complete study of multilevel coding and the structure of the encoder and decoder can be found in [22, 23].

One drawback of capacity-approaching systems using multilevel coding is that they require different codes with different rates for each bit-channel. This problem can be solved using BICM. In BICM, after using gray labelling, the address bits of the labels are interleaved. Hence, instead of using individual codes at different levels, a single code can be used for all the address bits. BICM can approach the channel capacity very closely [13], but, unlike multilevel coding, cannot achieve capacity. What makes BICM attractive is its lower software/hardware complexity at the expense of very minor performance loss.

3.2. LDPC codes

An LDPC code is a linear block code with a sparse parity-check matrix that can be represented by a bipartite graph [24, 25]. Figure 2 shows one such graphical representation.

Figure 2: A bipartite graph representing a parity-check code (variable nodes $v_1, \ldots, v_7$; check nodes $c_1, \ldots, c_4$).

A variable node $v_j$ (represented by a circle) is a binary variable from the alphabet $\{0, 1\}$ and a check node $c_i$ (represented by a square) is an even parity constraint on its neighboring variable nodes, so that in a valid codeword we have, for all the checks $c_i$,

\[ \bigoplus_{j : v_j \in n(c_i)} v_j = 0, \tag{3} \]

where $n(c_i)$ is the set of all the neighbors of $c_i$ and $\oplus$ represents modulo-two sum.

LDPC codes, depending on their structure, are classified as being regular or irregular. In a regular code, all variable nodes are of a fixed degree and all the check nodes are of another fixed degree. Irregular LDPC codes can significantly outperform regular codes [26]. By carefully designing the irregularity in the graph, codes which perform very close to the capacity can be found [9, 11, 26, 27].

An ensemble of irregular codes can be defined by its variable and check-node-degree distributions. The degree distribution at the variable/check side represents the percentage of edges incident on variable/check nodes with different degrees. As the block length of a randomly chosen code approaches infinity, the performance of the code depends only on its degree distributions [9]; therefore, the code design problem translates to the problem of finding good variable and check-degree distributions. In practice, when a finite-length code is used, the larger the block length, the closer the performance of the code to the predicted asymptotic behavior. In the remainder of this paper, we refer to the variable-node-degree distribution as $\lambda = \{\lambda_2, \lambda_3, \ldots, \lambda_{d_v}\}$, where $\lambda_i$ is the fraction of edges incident on variable nodes of degree $i$ and $d_v$ is the maximum variable-node degree of the code. Similarly we use $\rho = \{\rho_2, \rho_3, \ldots, \rho_{d_c}\}$ for the check-degree distribution.

The decoding of LDPC codes is done by iterative message-passing algorithms. Different types of messages exist, but log-likelihood-ratio (LLR) messages are the most common ones. The LLR for a bit $b$ is defined as

\[ \mathrm{LLR}(b) = \ln \frac{\Pr(b = 0)}{\Pr(b = 1)}. \tag{4} \]

Notice that in the case of an erasure, $\mathrm{LLR}(b) = 0$.

Many different message-passing algorithms exist [9, 25], among which the sum-product algorithm is the most accurate one [25]. Assuming that LLR messages are used, and following the notation of [25], the update rules of the sum-product algorithm at a check node $c$ and a neighboring variable node $v$ are

\[ \mu_{c \to v} = 2 \tanh^{-1}\Bigl( \prod_{h \in n(c) - \{v\}} \tanh \frac{\mu_{h \to c}}{2} \Bigr), \qquad \mu_{v \to c} = \mu_{\mathrm{ch} \to v} + \sum_{y \in n(v) - \{c\}} \mu_{y \to v}, \tag{5} \]

where $\mu_{h \to c}$ represents a message from a variable node $h$ to the check node $c$, $\mu_{y \to v}$ represents a message from a check node $y$ to the variable node $v$, and $\mu_{\mathrm{ch} \to v}$ is the channel message to the variable node $v$. In this paper, we assume that our LDPC decoder uses the sum-product algorithm.

Erasure LLR messages are zero and are processed like other messages in the decoder. Hence, they are naturally incorporated in the decoding, requiring no modification to the decoder.

3.3. LDPC code analysis and design

The analysis of the decoder's performance can be done by density evolution [10] or discrete density evolution [27]. The main idea of the analysis is to track the evolution of the probability density function (PDF) of the messages in the decoder. The inputs to this algorithm are the code variable-node- and check-node-degree distributions, the update rules at the variable and check nodes, and the PDF of LLR messages from the channel. The output of the algorithm is the PDF of LLR messages after each iteration.

One of the pillars of density evolution analysis of LDPC codes for binary transmission is the assumption that the all-zero codeword (all-$\{+1\}$ channel-word) is transmitted. This assumption is valid because the performance of the code is independent of the transmitted codeword [10]. Therefore, under this assumption we can associate an error rate to a message PDF, that is, the negative tail of the PDF. Thus, one can investigate whether or not the error rate of messages (the negative tail of the LLR density) approaches zero as the number of iterations goes to infinity.

For higher-order modulation, however, assuming that the all-zero codeword is transmitted is not valid. This is because different constellation points might have different error rates. In Section 4.3 we will show how this problem can be tackled and an LLR density whose negative tail corresponds to the message error rate can be found, without the need for an all-zero codeword assumption.

Using density evolution as a probe, one can search for variable-node- and check-node-degree distributions that provide performance approaching capacity [27]. The inputs to this optimization program are the message update rules of the decoder at the variable and check nodes as well as the channel LLR PDF and the output is the optimized degree distribution.

There are many papers on such degree optimizations for various channel models. Probably the one most related to our study is [28], where component LDPC codes are designed in a multilevel coding structure. Our work is different in two ways: first, we use BICM and hence we need to design one single LDPC code; second, we consider the effect of impulse noise. Using one LDPC code has many practical benefits, for example, for the same delay, it allows for a longer block-length (hence better performance); it also reduces hardware and software complexity.

4. SYSTEM STRUCTURE

Figures 3 and 4 show the block diagrams of the proposed system at the transmitter and the receiver, respectively. At the transmitter, the constellation mapper uses $l$ bits from the output of the LDPC encoder to select a point from the constellation for each subchannel of the DMT system. After assigning data to all subchannels, this complex vector is mapped to a DMT symbol using an inverse discrete Fourier transform (IDFT).

Figure 3: The block diagram of the encoder of the proposed system.

Figure 4: The block diagram of the decoder of the proposed system.

At the receiver, a DMT symbol is first converted to a complex vector using the discrete Fourier transform (DFT). Then, for each tone, $l$ LLR values associated with the $l$ transmitted bits are computed and buffered. Once all LLR values are ready, they are passed to the LDPC decoder.

Here, we assume that the length of the LDPC code is an integer multiple of the length of the DMT symbols. Even if we relax this assumption (due to the time-varying number of bits loaded into the DMT symbols), we can simply put as many DMT symbols as possible in the LDPC structure and fill its remaining portion with zeros. Typically, the length of an LDPC codeword is much longer than the length of a DMT symbol. Thus, filling the remaining portion of the LDPC code with zeros has no significant effect on the system performance.

4.1. Channel capacity

The physical channel model for a DMT system in an impulse environment is as shown in Figure 1. In this section, we consider instead the transmission of the bits of the constellation labels and study the resulting bit-channel. Since the binary LDPC decoder receives its initial LLRs from these bit-channels, we first investigate the distribution of the LLRs of the bit-channels. We then characterize the bit-channel capacity, which provides a metric with which the actual achieved code rates may be compared.

As shown in [29], the mutual information between the input and the output of an AWGN channel, when the input signals are equally likely chosen from a constellation $\mathcal{A}$ of size $2^l$, is

\[ C_{\mathcal{A}} = \log_2 2^l - \frac{1}{2^l} \sum_{k=0}^{2^l - 1} E\left[ \log_2 \sum_{i=0}^{2^l - 1} \exp\left( -\frac{|a_k + w - a_i|^2 - |w|^2}{2\sigma^2} \right) \right], \tag{6} \]

where $a_k$ is the $k$th point on the constellation, and $w$ represents samples of a complex Gaussian noise with variance $2\sigma^2$.

Our channel model is a concatenation of such an AWGN channel with an erasure channel whose erasure rate is $\varepsilon$. Therefore the channel capacity is

\[ C_{\mathcal{A}, \varepsilon} = (1 - \varepsilon)\, C_{\mathcal{A}}. \tag{7} \]

This capacity should be normalized per bit to reflect the average capacity of bit-channels. Thus, (7) results in bit-channels whose average capacity $C_b$ is

\[ C_b = \frac{1}{l} (1 - \varepsilon)\, C_{\mathcal{A}}. \tag{8} \]

In this work we use BICM. Although BICM cannot achieve the capacity (8), it can approach the capacity very closely, as shown in [13]. At high SNRs the capacities of BICM and multilevel coding are almost equal.

Figure 5 shows the plot of $C_b$ versus SNR. It also shows the capacity of BICM. It can be seen that for SNRs exceeding 6 dB, the gap from the capacity of BICM to the capacity of multilevel coding is minor. It should be mentioned that here SNR is defined as $10 \log_{10}(E_s / \sigma^2)$, where $E_s$ is the average energy of constellation per real dimension and $2\sigma^2$ is the variance of the complex Gaussian noise. The erasure rate of the channel is assumed to be 0.06, hence at high SNR the value of $C_b$ approaches $1 - \varepsilon = 0.94$.

Figure 5: The average bit-capacity for a gray-labelled 16-QAM at different SNRs with an erasure rate of $\varepsilon = 0.06$ (curves: multilevel coding, BICM; axes: $C_b$ versus SNR in dB).

4.2. Density of bit-channel LLRs

While an erasure in the actual channel translates to an erasure in all bit-channels, a Gaussian noise on the actual channel does not necessarily translate to Gaussian noise on the bit-channels. The effective noise of each bit-channel highly depends on the constellation labelling.

We can easily find the LLRs for different bit-channels. The LLR value for $b_i$ can be computed as

\[ \mathrm{LLR}(b_i) = \ln \frac{\Pr(b_i = 0 \mid y)}{\Pr(b_i = 1 \mid y)} = \begin{cases} 0 & \text{if } y = E, \\[4pt] \ln \dfrac{\sum_{a_j \in \mathcal{A}(b_i = 0)} e^{-|a_j - y|^2 / 2\sigma^2}}{\sum_{a_j \in \mathcal{A}(b_i = 1)} e^{-|a_j - y|^2 / 2\sigma^2}} & \text{otherwise,} \end{cases} \tag{9} \]

where $y$ is the received signal, $a_j$ represents a point of the transmitted constellation, $2\sigma^2$ is the Gaussian noise variance, and $\mathcal{A}(b_i = 0)$ represents the subconstellation of $\mathcal{A}$ with the address bit $b_i$ equal to zero.

In BICM, since the bits are interleaved, the density of LLRs that the LDPC code sees is a mixture of LLR densities for different bit-channels. For a gray-labelled 16-QAM constellation this density is shown in Figure 6. In this figure, the erasure probability is 0.06 and the Gaussian channel SNR is 10 dB. Notice that since $\Pr(y = E) = \varepsilon$, the density of LLRs has an impulse of weight $\varepsilon$ at zero. Another interesting observation is that the density is not symmetric with respect to the vertical axis. To see why, consider the gray-labelled 16-QAM constellation of Figure 7 and notice that the value of $\mathrm{LLR}(b_0)$ can grow unboundedly large (for received signals far to the right or to the left), but approaches its minimum for the received signals close to the vertical axis. Therefore, the LLR PDF of $b_0$ (and as a result the mixture of LLR densities) is not symmetric with respect to the vertical axis.

Figure 6: The density of LLRs for a 16-QAM gray-labelled constellation at SNR = 10 dB.

Figure 7: A gray-labelled 16-QAM constellation ($b_3 b_2 b_1 b_0$).

0000 0001 0101 0100
0010 0011 0111 0110
1010 1011 1111 1110
1000 1001 1101 1100

4.3. All-zero codeword assumption

For the design and analysis of binary LDPC codes over symmetric binary-input channels, it is usually assumed that the all-zero codeword (equivalent to all-$\{+1\}$ channel-word) is transmitted. In fact, the density of LLR messages associated with the all-$\{+1\}$ channel-word is essential for the design procedures which are based on density evolution.

In our case, however, assuming that the all-zero codeword is transmitted is not valid because the points of the constellation are not all equivalent. To tackle this problem, we first notice that for an LDPC code of length $N$ and dimension $K$, an information sequence of $K$ bits, when $K$ is large, is a typical Bernoulli sequence with $p = 1/2$. While not strictly true, it is reasonable to assume that the encoded sequence is also a typical Bernoulli sequence. Therefore, in the encoded codeword, all binary sequences of length $l$ appear with the same probability. Hence, for one typical LDPC codeword, all the constellation points are transmitted the same number of times. As a result, we can assume that almost all the LDPC codewords are received with the same quality at the receiver. That is to say, the performance of the decoder is independent of the codeword, as long as a typical codeword is transmitted.

Notice that we can always add a random dither to the codewords to make them typical Bernoulli sequences, but it seems reasonable that such a dither is not required in practice. Indeed, we will see later that the codes designed based on the assumption that all constellation points are transmitted equally likely perform very close to the predicted behavior without the need for a random dither.

Now that all the codewords are assumed to be received with the same quality at the receiver, the performance of the LDPC decoder is independent of the codeword. Therefore, we only need to translate each LLR to its equivalent value corresponding to the all-$\{+1\}$ channel-word. In other words, for every bit which is a “1,” we define the LLR to be $\ln(\Pr(1)/\Pr(0))$ and for every bit which is a “0” we use the conventional definition of $\ln(\Pr(0)/\Pr(1))$. This way, a positive LLR carries a correct belief, and (once more) the negative tail of the LLR density shows the LLR error rate. This equivalent LLR density may be computed for all bit-channels and for every constellation point. We may then mix these densities uniformly to produce a modified LLR which is suitable for code optimization.

The density of these modified LLRs is shown in Figure 8. A negative LLR here reflects an error in a hard decision decoder (which was not the case in Figure 6). For a given constellation, a given labelling scheme, and a given noise variance, we can find such a modified LLR density to be used as input to a density evolution program for degree distribution optimization.

Figure 8: The density of modified LLRs for a 16-QAM gray-labelled constellation at SNR = 10 dB.

4.4. Code design

So far, we have discussed the use of BICM and LDPC codes for DMT systems in an impulsive environment. We now focus on the design of capacity-approaching LDPC codes on such channels. As mentioned before, the optimization algorithm should take the density of LLR messages and the update rules of the decoder as the input and should give an optimized degree distribution. Thus, given the channel conditions, that is, SNR of the Gaussian part of the channel and $\varepsilon$ of the erasure part, the purpose of our optimization is to find the highest rate LDPC code whose convergence is guaranteed at these channel conditions.

Different methods are used for finding optimized degree distributions [10, 27]. In this work, we use a method which is based on linear programming and reflects our goal of finding the maximum rate for a given channel. We start with a degree distribution for which convergence to zero error rate is guaranteed. In the density evolution program, at each iteration, we save $p_{\mathrm{in}}$, that is, the error rate of the LLR messages at the input of the iteration (the input messages to the check nodes) and also $p_{\mathrm{out},i}$, that is, the error rate of the LLR messages at the output of variable nodes of degree $i$ (for all values of $i$). A plot of $p_{\mathrm{out},i}$ versus $p_{\mathrm{in}}$ for all values of $i$ is equivalent to the elementary $p_{\mathrm{in}} - p_{\mathrm{out}}$ extrinsic information transfer (EXIT) charts of [30], that is, EXIT charts corresponding to different variable degrees. For more information on EXIT charts, please see [31].

It is shown in [30] that for a fixed check-degree distribution, a linear combination of elementary $p_{\mathrm{in}} - p_{\mathrm{out}}$ EXIT charts with a set of weights $\lambda = \{\lambda_2, \ldots, \lambda_{d_v}\}$, such that for all $i$, $\lambda_i \ge 0$ and $\sum_i \lambda_i = 1$, results in the EXIT chart of an irregular code whose degree distribution is determined by $\lambda$, that is, for all $p_{\mathrm{in}}$,

\[ p_{\mathrm{out}} = f(p_{\mathrm{in}}) = \sum_i \lambda_i f_i(p_{\mathrm{in}}), \tag{10} \]

where $f(p_{\mathrm{in}})$ is the EXIT chart of the irregular code and $f_i(p_{\mathrm{in}})$ is the elementary EXIT chart associated with variable nodes of degree $i$.

Therefore, a linear program similar to that of [30] can be used to find an optimized degree distribution. The linear program of [30] takes a set of elementary EXIT charts, a desired convergence behavior (given by a curve $h(p_{\mathrm{in}})$), and finds the variable-node-degree distribution $\lambda$ which maximizes the code rate and for all $p_{\mathrm{in}}$ keeps the $p_{\mathrm{out}}$ of the irregular code less than or equal to $h(p_{\mathrm{in}})$. Here, we have to make sure that the change made in the degree distribution is small, since the elementary EXIT charts found here are valid only for the degree distribution for which the density evolution program is executed.

Assuming that making a small change in the degree distribution of the code will not significantly change the elementary EXIT charts, our design procedure is as follows. First, we find $f(p_{\mathrm{in}})$, the current EXIT chart of the code (for which convergence was observed). Then, we make a small change in this EXIT chart to make it closer to the line $p_{\mathrm{out}} = p_{\mathrm{in}}$ and feed it back as $h(p_{\mathrm{in}})$ to the linear program to find another degree distribution whose rate is maximum and whose convergence behavior is better than $h(p_{\mathrm{in}})$. It is shown in [30] that by moving towards $h(p_{\mathrm{in}}) = p_{\mathrm{in}}$ we achieve higher code rates. Using the updated degree distribution, we run the density evolution program to update the elementary EXIT charts and repeat the linear program to update the degree distribution. This optimization is iterated until the desired convergence behavior is $h(p_{\mathrm{in}}) \approx p_{\mathrm{in}}$.

5. SIMULATION RESULTS

To show the success of our proposed system, we have designed two irregular codes for 16-QAM signaling at two different channel conditions. The first code is designed at SNR = 10 dB and $\varepsilon = 0.06$, and the second one at SNR = 7 dB and $\varepsilon = 0.06$. At these channel conditions, the capacity of BICM is almost 0.744 bits/symbol (per bit-channel) and 0.570 bits/symbol, respectively. Our codes, whose degree distributions can be found in Table 1, have rates of 0.740 bits/symbol and 0.566 bits/symbol, respectively. The maximum variable degree allowed in these designs is 80 for code 1 and 100 for code 2, and their check degree is chosen to be regular. The capacity (normalized per bit-channel) of BICM with 16-QAM signaling on our channel model with $\varepsilon = 0.06$ is 0.740 at about 9.91 dB and is 0.566 at about 6.92 dB, hence in both cases a gap of about only 0.08-0.09 dB from the Shannon limit exists.

Table 1: Code design for 16-QAM signalling.

Degree sequence                                 Code 1        Code 2
$d_1, \lambda_{d_1}$                            2, 0.1448     2, 0.1561
$d_2, \lambda_{d_2}$                            3, 0.1767     3, 0.1790
$d_3, \lambda_{d_3}$                            7, 0.1434     7, 0.0786
$d_4, \lambda_{d_4}$                            8, 0.0819     8, 0.1404
$d_5, \lambda_{d_5}$                            21, 0.1015    22, 0.0004
$d_6, \lambda_{d_6}$                            22, 0.0256    23, 0.0309
$d_7, \lambda_{d_7}$                            24, 0.0606    24, 0.1656
$d_8, \lambda_{d_8}$                            25, 0.0273    100, 0.2490
$d_9, \lambda_{d_9}$                            80, 0.2382    —
$d_c$                                           22            13
Channel conditions: $\varepsilon = 0.06$, SNR   10 dB         7 dB
Rate                                            0.740         0.566
Approx. gap to the Shannon limit                0.09 dB       0.08 dB

We have computed the threshold of a code of rate 0.74 optimized for the AWGN channel (with $d_c = 24$ and maximum $d_v = 100$) on our channel. The degree distribution of the code is selected from the optimized codes reported in [32]. We first verified the reported threshold of this code on a Gaussian channel. Then, assuming an erasure rate of $\varepsilon = 0.06$ and 16-QAM signalling, we found the threshold of convergence of this code to be at about 10.2 dB or almost 0.29 dB away from the capacity.

It can be concluded that although the performance of codes designed for the AWGN is fairly good on our channel model, further improvement in the order of 0.2 dB is possible. Similar results have been observed in [33], where irregular codes are designed for the Rayleigh channel. In fact, the performance of the codes designed for the AWGN channel on a Rayleigh channel is about 0.2 dB worse than those designed for the Rayleigh channel.
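The rates listed in Table 1 can be cross-checked from the tabulated degree distributions. The sketch below uses the standard design-rate expression for edge-perspective degree distributions, $R = 1 - (\sum_j \rho_j / j) / (\sum_i \lambda_i / i)$, specialized to a check-regular code as stated in Section 5. The function name `design_rate` and the dictionary layout are illustrative choices made here, not part of the original design software.

```python
def design_rate(lam, check_degree):
    """Design rate of a check-regular LDPC ensemble.

    lam maps variable-node degree i to the fraction of edges lambda_i
    (edge perspective); check_degree is the single check-node degree d_c.
    """
    total = sum(lam.values())
    if abs(total - 1.0) > 1e-3:
        raise ValueError("edge fractions should sum to 1, got %.4f" % total)
    variable_term = sum(frac / deg for deg, frac in lam.items())
    check_term = 1.0 / check_degree
    return 1.0 - check_term / variable_term


# Degree distributions copied from Table 1.
CODE_1 = {2: 0.1448, 3: 0.1767, 7: 0.1434, 8: 0.0819, 21: 0.1015,
          22: 0.0256, 24: 0.0606, 25: 0.0273, 80: 0.2382}
CODE_2 = {2: 0.1561, 3: 0.1790, 7: 0.0786, 8: 0.1404, 22: 0.0004,
          23: 0.0309, 24: 0.1656, 100: 0.2490}

rate_1 = design_rate(CODE_1, check_degree=22)
rate_2 = design_rate(CODE_2, check_degree=13)
```

Both values come out close to the rates reported in Table 1; small differences are to be expected because the tabulated fractions are rounded to four decimal places.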

Figure 9: Performance comparison of LDPC codes in a system based on 16-QAM signaling and BICM in the presence of impulse noise (axes: BER versus SNRnorm in dB; curves: Code 1 (optimized for BICM), optimized code for AWGN; markers: capacity of BICM, threshold of Code 1, threshold of the optimized code for AWGN).

We have also simulated the performance of our rate 0.74 code and have compared it with the above-mentioned rate 0.74 code optimized for AWGN. Both codes have a block length of 30 000. Figure 9 shows the results of our comparison. The erasure rate for simulation is $\varepsilon = 0.06$ and a maximum of 100 iterations is allowed. As expected, our code has a performance that is about 0.2 dB better than the code designed for AWGN channel. The code designed for AWGN is slightly more complex because it has a higher check degree and hence its decoding complexity is about 10% higher. The decoding complexity for both codes is comparable with a 512-state trellis code (see [4] for a detailed study of complexity and comparison with the complexity of 512-state trellis code).

With this code length and code rate, and assuming an ADSL system with a bit rate of 1 Mbits/s, the buffer-fill delay is

\[ t_{\mathrm{bf}} = \frac{30\,000 \times 0.74}{1\,000\,000} = 22.2 \text{ milliseconds}. \tag{11} \]

Therefore, neglecting the encoding/decoding delay and also the transmission delay, the total delay of the system can be approximated to be about 45 milliseconds. For ADSL systems with higher data bit rates, longer block-length codes can be used and hence, performance closer to the threshold can be expected.

Although $\mathrm{SNR}_{\mathrm{norm}}$, defined as $\mathrm{SNR}_{\mathrm{norm}} = \mathrm{SNR} / (2^R - 1)$ [14], is usually a better measure than SNR, this paper uses SNR because our design goal is to maximize the code rate for a given channel condition, and prior to the design the code rate is not known (however, Figure 9 uses $\mathrm{SNR}_{\mathrm{norm}}$, since the rate is known).

The conventional solution based on trellis codes and outer RS codes is known to have a gap of a few dB from the Shannon limit. Also notice that a (255, 239) RS code (of a rate a little bit less than 0.94) can correct a maximum of 16 erased bytes. Assuming that each byte has at most one erased bit and that the bit erasure probability is $\varepsilon$, we can define $p = 1 - (1 - \varepsilon)^8$ as the byte erasure probability. To guarantee an error rate of less than $10^{-6}$, $p$ should take a value for which

\[ \sum_{i=17}^{255} \binom{255}{i} p^i (1 - p)^{255 - i} < 10^{-6}. \tag{12} \]

This results in an error-correction capability of $p < 0.0159$ and hence $\varepsilon < 2.003 \times 10^{-3}$. In other words, about 0.2% error can overwhelm an RS code that introduces more than 6% rate loss to the system. Our code, however, can correct up to 6% erasures. This erasure correction capability is achieved by a rate loss of 6% introduced by the channel model whose capacity at high SNR is 0.94.

6. CONCLUSIONS

In this work, we have shown that a carefully optimized LDPC code is a promising candidate coding approach in DMT systems with impulsive noise. To illustrate the potential of this approach, we have shown how to optimize an LDPC code at a fixed SNR and for a fixed constellation size. Further research is required to address practical DMT systems that require flexible coding to accommodate a variety of SNRs and hence a variety of constellation sizes at different tones. Indeed, it is shown in [4] that a single LDPC code can perform effectively in a DMT system if a fixed number of bits from each constellation is assigned to it.

From Figure 9, it can be seen that both LDPC codes show an error floor at bit error rates near $10^{-6}$. Error floors in LDPC codes are mainly due to the presence of short cycles in the graph structure of the code. This will be a less serious problem for long block-length codes. Many solutions have been proposed to handle the error floor issue. One easy solution is to use a very high-rate, low-complexity outer BCH code (e.g., a double-error correcting BCH code). However, there also exist LDPC codes which are carefully structured to push the error floor to a much lower BER ($10^{-8}$ and lower) [34, 35]. Using such LDPC codes, the need for an outer code is completely eliminated.

REFERENCES

[1] T. N. Zogakis, J. T. Aslanis Jr., and J. M. Cioffi, “Analysis of a concatenated coding scheme for a discrete multitone modulation system,” in Proceedings of IEEE Military Communications Conference (MILCOM ’94), vol. 2, pp. 433–437, Fort Monmouth, NJ, USA, October 1994.
[2] L. Zhang and A. Yongacoglu, “Turbo coding in ADSL DMT systems,” in Proceedings of IEEE International Conference on Communications (ICC ’01), vol. 1, pp. 151–155, Helsinki, Finland, June 2001.

[3] Z. Cai, K. R. Subramanian, and L. Zhang, “DMT scheme with multidimensional turbo trellis code,” Electronics Letters, vol. 36, no. 4, pp. 334–335, 2000.
[4] M. Ardakani, T. Esmailian, and F. R. Kschischang, “Near-capacity coding in multicarrier modulation systems,” IEEE Transactions on Communications, vol. 52, no. 11, pp. 1880–1889, 2004.
[5] E. Eleftheriou and S. Ölçer, “Low-density parity-check codes for digital subscriber lines,” in Proceedings of IEEE International Conference on Communications (ICC ’02), vol. 3, pp. 1752–1757, New York, NY, USA, April–May 2002.
[6] T. Cooklev, M. Tzannes, and A. Friedman, “Low-density parity-check coded modulation for ADSL,” Temporary Document BI-081, ITU-Telecommunication Standardization Sector, Geneva, Switzerland, October 2000.
[7] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo-codes. 1,” in Proceedings of IEEE International Conference on Communications (ICC ’93), vol. 2, pp. 1064–1070, Geneva, Switzerland, May 1993.
[8] D. Divsalar and F. Pollara, “On the design of turbo codes,” TDA Progr. Rep. 42-123, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, Calif, USA, 1995.
[9] T. J. Richardson and R. L. Urbanke, “The capacity of low-density parity-check codes under message-passing decoding,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 599–618, 2001.
[10] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, “Design of capacity-approaching irregular low-density parity-check codes,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 619–637, 2001.
[11] A. Shokrollahi, “Capacity-achieving sequences,” in Codes, Systems, and Graphical Models, B. Marcus and J. Rosenthal, Eds., IMA Volumes in Mathematics and Its Applications, no. 123, pp. 153–166, Springer, New York, NY, USA, 2000.
[12] H. Imai and S. Hirakawa, “A new multilevel coding method using error-correcting codes,” IEEE Transactions on Information Theory, vol. 23, no. 3, pp. 371–377, 1977.
[13] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” IEEE Transactions on Information Theory, vol. 44, no. 3, pp. 927–946, 1998.
[14] G. D. Forney Jr. and G. Ungerboeck, “Modulation and coding for linear Gaussian channels,” IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2384–2415, 1998.
[15] W. Yu, D. Toumpakaris, J. M. Cioffi, D. Gardan, and F. Gauthier, “Performance of asymmetric digital subscriber lines in an impulse noise environment,” IEEE Transactions on Communications, vol. 51, no. 10, pp. 1653–1657, 2003.
[16] M. Barton, “Impulse noise performance of an asymmetric digital subscriber lines passband transmission system,” IEEE Transactions on Communications, vol. 43, no. 234, pp. 1337–1340, 1995.
[17] K. J. Kerpez and A. M. Gottlieb, “The error performance of digital subscriber lines in the presence of impulse noise,” IEEE Transactions on Communications, vol. 43, no. 5, pp. 1902–1905, 1995.
[18] D. Toumpakaris, W. Yu, J. M. Cioffi, D. Gardan, and M. Ouzzif, “A byte-erasure method for improved impulse immunity in DSL systems using soft information from an inner code,” in Proceedings of IEEE International Conference on Communications (ICC ’03), vol. 4, pp. 2431–2435, Anchorage, Alaska, USA, May 2003.
[19] D. Toumpakaris, J. M. Cioffi, D. Gardan, and M. Ouzzif, “A square distance-based byte-erasure method for reduced-delay protection of DSL systems from non-stationary interference,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’03), vol. 4, pp. 2114–2119, San Francisco, Calif, USA, December 2003.
[20] P. S. Chow, Bandwidth optimized digital transmission techniques for spectrally shaped channels with impulse noise, Ph.D. thesis, Department of Electrical Engineering, Stanford University, Stanford, Calif, USA, May 1993.
[21] U. Wachsmann, R. F. H. Fischer, and J. B. Huber, “Multilevel codes: theoretical concepts and practical design rules,” IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1361–1391, 1999.
[22] G. J. Pottie and D. P. Taylor, “Multilevel codes based on partitioning,” IEEE Transactions on Information Theory, vol. 35, no. 1, pp. 87–98, 1989.
[23] A. R. Calderbank, “Multilevel codes and multistage decoding,” IEEE Transactions on Communications, vol. 37, no. 3, pp. 222–229, 1989.
[24] R. M. Tanner, “A recursive approach to low complexity codes,” IEEE Transactions on Information Theory, vol. 27, no. 5, pp. 533–547, 1981.
[25] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
[26] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman, “Efficient erasure correcting codes,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 569–584, 2001.
[27] S.-Y. Chung, G. D. Forney Jr., T. J. Richardson, and R. Urbanke, “On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit,” IEEE Communications Letters, vol. 5, no. 2, pp. 58–60, 2001.
[28] J. Hou, P. H. Siegel, L. B. Milstein, and D. Pfister, “Multilevel coding with low-density parity-check component codes,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’01), vol. 2, pp. 1016–1020, San Antonio, Tex, USA, November 2001.
[29] G. Ungerboeck, “Channel coding with multilevel/phase signals,” IEEE Transactions on Information Theory, vol. 28, no. 1, pp. 55–67, 1982.
[30] M. Ardakani, T. H. Chan, and F. R. Kschischang, “EXIT-chart properties of the highest-rate LDPC code with desired convergence behavior,” IEEE Communications Letters, vol. 9, no. 1, pp. 52–54, 2005.
[31] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Transactions on Communications, vol. 49, no. 10, pp. 1727–1737, 2001.
[32] http://lthcwww.epfl.ch/research/ldpcopt/.
[33] J. Hou, P. H. Siegel, and L. B. Milstein, “Performance analysis and code optimization of low density parity-check codes on Rayleigh fading channels,” IEEE Journal on Selected Areas in Communications, vol. 19, no. 5, pp. 924–934, 2001.
[34] Flarion Inc., “Vector-LDPC Coding Solution Data Sheet”, http://www.flarion.com/products/overviews/Vector-LDPC Product Overview.pdf.
[35] T. Tian, C. Jones, J. D. Villasenor, and R. D. Wesel, “Construction of irregular LDPC codes with low error floors,” in Proceedings of IEEE International Conference on Communications (ICC ’03), vol. 5, pp. 3125–3129, Anchorage, Alaska, USA, May 2003.

Masoud Ardakani received the B.S. degree from Isfahan University of Technology in 1994, the M.S. degree from Tehran University in 1997, and the Ph.D. degree from the University of Toronto in 2004, all in electrical engineering. He was a Postdoctoral Fellow at the University of Toronto from 2004 to 2005. From 1997 to 1999, he was with the Electrical and Computer Engineering Research Center, Isfahan, Iran. Currently, he is an Assistant Professor of electrical and computer engineering at the University of Alberta, where he holds an Informatics Circle of Research Excellence (iCore) Junior Research Chair in wireless communications. His research interests are in the general area of digital communications, codes defined on graphs associated with iterative decoding, and MIMO systems.

Frank R. Kschischang received the B.A.Sc. degree (with honors) from the University of British Columbia in 1985, and the M.A.Sc. and Ph.D. degrees from the University of Toronto in 1988 and 1991, respectively, all in electrical engineering. He is a Professor of electrical and computer engineering and Canada Research Chair in communication algorithms at the University of Toronto, where he has been a faculty member since 1991. During 1997-1998 he was a Visitor at the Massachusetts Institute of Technology, Cambridge, Mass, and in 2005 at the Swiss Federal Institute of Technology, Zurich, Switzerland. His research interests are focused on the area of coding techniques, primarily on soft-decision decoding algorithms, trellis structure of codes, codes defined on graphs, and iterative decoders. Dr. Kschischang is a recipient of the Ontario Premier’s Research Excellence Award. From October 1997 to October 2000, he served as the Associate Editor for coding theory for the IEEE Transactions on Information Theory. He was Technical Program Cochair of the 2004 IEEE International Symposium on Information Theory held in Chicago.

Wei Yu received the B.A.Sc. degree in computer engineering and mathematics from the University of Waterloo, Waterloo, Ontario, Canada, in 1997, and M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, Calif, USA, in 1998 and 2002, respectively. Since 2002, he has been an Assistant Professor with the Electrical and Computer Engineering Department, University of Toronto, Toronto, Ontario, Canada, where he also holds a Canada Research Chair. His main research interests include multiuser information theory, coding, optimization, wireless communications, and broadband access networks.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 65716, Pages 1–13
DOI 10.1155/ASP/2006/65716

Fine-Granularity Loading Schemes Using Adaptive Reed-Solomon Coding for xDSL-DMT Systems

Saswat Panigrahi and Tho Le-Ngoc

Department of Electrical and Computer Engineering, McGill University, 3480 University Street, Montréal, QC, Canada H3A 2A7

Received 29 November 2004; Revised 9 May 2005; Accepted 22 July 2005

While most existing loading algorithms for xDSL-DMT systems strive for the optimal energy distribution to maximize their rate, the number of bits loaded onto each subcarrier is constrained to be an integer, and the associated granularity losses can represent a significant percentage of the achievable data rate, especially in the presence of the peak-power constraint. To recover these losses, we propose a fine-granularity loading scheme using joint optimization of adaptive modulation and flexible coding parameters based on programmable Reed-Solomon (RS) codes and a bit-error probability criterion. Illustrative examples of applications to VDSL-DMT systems indicate that the proposed scheme can offer a rate increase of about 20% in most cases as compared to various existing integer-bit-loading algorithms. This improvement is in good agreement with the theoretical estimates developed to quantify the granularity loss.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

Discrete multitone (DMT) modulation [1] has been widely used in xDSL applications such as asymmetric DSL (ADSL) [2] by the American National Standards Institute (ANSI) and the European Telecommunications Standard Institute (ETSI) and more recently for VDSL [3] by ANSI. A loading strategy is used for dynamic subcarrier rate and power allocation for given channel conditions, system constraints, and performance requirements.

For a multichannel total-power constrained problem, the optimal power distribution has long been known to be the “water-filling” distribution [4]. However, the derivation tacitly assumes infinite granularity, while most of the known modulation schemes support only an integer number of bits per symbol. It was initially observed in [5, 6] that most of the granularity losses due to the integer number of bits per symbol could be recovered by rounding off rates to integers and scaling energies accordingly after starting with a water-filling [6] or flat on/off [5] energy distribution. However, the freedom for such rescaling is considerably reduced in the presence of a peak-power constraint.

The peak-power constraint [7, 8] arises from spectrum compatibility requirements to enable coexistence among multiple users and diverse services. When the peak-power constraint is far stricter than the total-power constraint, as is often the case in VDSL-DMT, there is hardly any room left for maneuverability (or rescaling) in the energy domain (to recover lapses in the bit domain), and significant losses in the achievable data rates of integer-bit algorithms are observed.

Because these losses amount to a significant percentage of the supported information rate, we tackle the integer-bit granularity problem through bit-error-rate-based joint optimization of adaptive modulation and flexible RS(n, k) coding on each subcarrier, which can provide a wide range of fine choices in code rate and error-correction capability.

The remainder of the paper is organized as follows. Section 2 presents the overall optimization problem formulation and inferences from related literature about granularity. Section 3 develops a quantification of granularity loss based on the relative strictness of the peak-power and total-power constraints. Section 4 describes the proposed adaptive Reed-Solomon-based fine-granularity loading (ARSFGL) scheme. Section 5 presents the illustrative results for various VDSL-DMT scenarios and concluding remarks are made in Section 6.

2. POWER, INTEGER-BIT CONSTRAINTS, AND GRANULARITY LOSS

Consider an xDSL-DMT system with $N$ subcarriers. Let $\varepsilon_j$ be the controllable transmitted power spectral density (PSD) and $\rho_j$ be the normalized channel signal-to-noise ratio when $\varepsilon_j = 1$ over the $j$th subcarrier, that is, $\rho_j$ is the ratio of the squared channel transfer function to the noise PSD over the

$j$th subcarrier. The noise includes both crosstalk and thermal additive white Gaussian noise (AWGN). The intercarrier spacing $\Delta f$ is assumed to be small enough for all the aforementioned PSDs to be nearly flat over each subcarrier.

The subcarrier rate function $b(\sigma_j)$ is defined as the maximum information rate in bits per symbol that can be supported at the received SNR of $\sigma_j = \rho_j \varepsilon_j$ while keeping the error probability at or below a specified target value. The objective function of the overall rate maximization problem is the total supported rate:

$$R = \sum_{j=1}^{N} b(\rho_j \varepsilon_j). \tag{1}$$

The traditional total-power constraint for the nontrivial power distribution can be expressed as

$$\Delta f \cdot \sum_{j=1}^{N} \varepsilon_j \le E_{\mathrm{budget}} \quad \text{for } \varepsilon_j \ge 0,\ 1 \le j \le N. \tag{2}$$

In addition, many practical systems have a limitation on the maximum transmit PSD. This implies the peak-power constraints:

$$\varepsilon_j \le \varepsilon_j^{\max}, \quad 1 \le j \le N, \tag{3}$$

where $\{\varepsilon_j^{\max}\}_{j=1}^{N}$ is specified by the admissible transmit PSD mask, for example, SMClass3 in [8] or M1FTTCab in [3]. The subcarrier-specific rate function can be expressed as

$$b(\sigma_j) = r_j \left\lfloor \log_2\!\left(1 + \frac{\sigma_j}{\Gamma_j}\right) \right\rfloor, \tag{4}$$

where $r_j$ is the coding rate and $\Gamma_j$ is the SNR gap determined by the performance of the modulation and coding schemes in use. The floor operation (i.e., $\lfloor x \rfloor = m$ for the largest integer $m \le x$) arises from the integer-bit constraint, since we try to find the largest integer number of bits per symbol that would satisfy the error rate target at an SNR of $\sigma_j$. When the same FEC coding is applied for all subcarriers, that is, $r_j = r$, this floor operation restricts the subcarrier rate to steps of $nr$, where $n$ is an integer (i.e., the integer-bit constraint), and $R = r \sum_{j=1}^{N} \lfloor \log_2(1 + \sigma_j/\Gamma_j) \rfloor$.

Loading algorithms whose objective is to maximize the rate (1) are called rate-adaptive (RA) loading algorithms. The total-power only (TPO) constrained problem specified by (1) and (2) leads to the classical water-filling solution. Most RA algorithms [5, 6, 9, 10] addressed the TPO problem with the integer-bit constraint. The more practical total and peak-power (TPP) constrained problem, that is, (1), (2), and (3), with the integer-bit constraint was considered in [5, 7, 11].

Both the greedy method in [9] and the Lagrangean method in [10] lead to the optimal solution for the integer-bit TPO problem, the latter being much more computationally effective than the former. In [5, 6], an SNR-gap function [1] based method is proposed, which, in the initial stage, gives a continuous bit distribution resulting from a flat on/off and water-filling energy distribution, respectively. The difference between the rates resulting from these two energy distributions was seen to be only 2%. To achieve negligible degradation due to the integer-bit constraint, both methods use bit-rounding and proper energy adjustment only for the TPO case.

In [5, Section 4.3.4], an ad hoc extension for the TPP problem is presented by capping the bit round-off and the final energy rescaling with a maximal bit distribution and the peak energy constraint, respectively. A more formal treatment of the problem is presented in [7]. At first, the problem is solved without the integer-bit constraint for the general case of a continuously differentiable, strictly increasing, and strictly concave rate function. This solution is reproduced below with minor notational changes for easy reference:

$$\text{if } \Delta f \cdot \sum_{i=1}^{N} \varepsilon_i^{\max} \le E_{\mathrm{budget}}, \text{ then } \varepsilon_j = \varepsilon_j^{\max}. \tag{5}$$

If $\Delta f \cdot \sum_{i=1}^{N} \varepsilon_i^{\max} > E_{\mathrm{budget}}$, then

$$\varepsilon_j = \varepsilon_j(\lambda) = \begin{cases} \varepsilon_j^{\max}, & \text{if } \lambda \le \rho_j b_\sigma(\rho_j \varepsilon_j^{\max}), \\ \dfrac{1}{\rho_j}\, b_\sigma^{-1}\!\left(\dfrac{\lambda}{\rho_j}\right), & \text{if } \rho_j b_\sigma(\rho_j \varepsilon_j^{\max}) \le \lambda \le \rho_j b_\sigma(0), \\ 0, & \text{if } \lambda \ge \rho_j b_\sigma(0), \end{cases} \tag{6}$$

where $b_\sigma(\sigma) = \partial b(\sigma)/\partial \sigma$ and $b_\sigma^{-1}(\cdot)$ is the inverse of $b_\sigma(\cdot)$. The parameter $\lambda$ is the solution to

$$\Delta f \cdot \sum_{j=1}^{N} \varepsilon_j(\lambda) = E_{\mathrm{budget}}. \tag{7}$$

When (5) is satisfied, the energy distribution is independent of the rate function. Consequently, the peak-power constraint completely dominates and the total-power constraint is trivially satisfied. In the rest of this paper, we will refer to this case as the peak-power only (PPO) case, and by TPP we will mean the case when the inequality in (6) is satisfied, that is, both total-power and peak-power constraints play a role. A suboptimal algorithm for the TPP case with the integer-bit constraint is presented in [7, Table IV]. The optimal algorithm (in terms of rate achieved) for the TPP case with the integer-bit constraint is presented in [11].

The granularity loss in [7] is reported to be between 6–12% of the rate conveyed for the ADSL-TPP case, which is significant as compared to the variation of only 0.2–4% in the achievable rates of most existing integer-bit algorithms for the TPO case [12, Figure 4]. It is also higher than what would be expected from the 0.2 dB margin difference due to granularity reported for the ADSL-TPO case in [5, 13]. This leads us to believe that granularity losses grow with the strictness of the peak-power constraint. Hence, we examine the case of VDSL-DMT, for which the peak-power constraint is known to be particularly strict and the number of subcarriers is large.
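The continuous solution (5)–(7) can be sketched in Python for one concrete rate function. The sketch below assumes the gap-based form $b(\sigma) = \log_2(1 + \sigma/\Gamma)$ without the floor, for which the middle branch of (6) reduces to a water level minus $\Gamma/\rho_j$; the function name `capped_waterfill`, its parameters, and the use of bisection are illustrative choices, not taken from [7].

```python
def capped_waterfill(rho, eps_max, e_budget, delta_f=1.0, gap=1.0, iters=200):
    """Continuous TPP allocation for the assumed rate b(s) = log2(1 + s/gap).

    rho      -- normalized channel SNRs rho_j (linear)
    eps_max  -- PSD mask eps_j^max per subcarrier
    e_budget -- total-power budget E_budget
    """
    # PPO case, eq. (5): the mask alone already fits inside the budget.
    if delta_f * sum(eps_max) <= e_budget:
        return list(eps_max)

    def alloc(level):
        # For this b(.), stationarity gives eps_j = level - gap/rho_j;
        # clipping to [0, eps_max] reproduces the three branches of eq. (6).
        return [min(max(level - gap / r, 0.0), m) for r, m in zip(rho, eps_max)]

    lo = 0.0                                              # nothing loaded
    hi = max(m + gap / r for r, m in zip(rho, eps_max))   # every tone at its cap
    for _ in range(iters):                                # bisection on eq. (7)
        mid = 0.5 * (lo + hi)
        if delta_f * sum(alloc(mid)) > e_budget:
            hi = mid
        else:
            lo = mid
    return alloc(lo)
```

Here the water level plays the role of $1/(\lambda \ln 2)$ in (6). Calling the function with a budget larger than the summed mask returns the mask itself, which is the PPO case described above.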

3. QUANTIFICATION OF GRANULARITY LOSS

Let $\Omega \triangleq \{ j \in \{1, 2, \ldots, N\} : \varepsilon_j > 0 \} = \Omega_1 \cup \Omega_2$ where $\Omega_1 \cap \Omega_2 = \emptyset$, $\Omega_1 \triangleq \{ j \in \Omega : b^{-1}(\lceil b(\rho_j \varepsilon_j) \rceil) > \rho_j \varepsilon_j^{\max} \}$, $\Omega_2 \triangleq \{ j \in \Omega : b^{-1}(\lceil b(\rho_j \varepsilon_j) \rceil) \le \rho_j \varepsilon_j^{\max} \}$, and $\lceil x \rceil$ represents the ceiling operation (i.e., $\lceil x \rceil = n$ where $n$ is the smallest integer such that $x \le n$). It follows that $N_\Omega = N_{\Omega_1} + N_{\Omega_2}$ where $N_\Omega$, $N_{\Omega_1}$, and $N_{\Omega_2}$ are the cardinalities of the sets $\Omega$, $\Omega_1$, and $\Omega_2$, respectively. $\Omega$ represents the set of nontrivially loaded subcarriers and $\Omega_1$ is the set of subcarriers in which ceiling the noninteger-bit $b(\rho_j \varepsilon_j)$ would cause the corresponding energy allocation to violate the peak-power constraint, that is, $\varepsilon_j > \varepsilon_j^{\max}$. Thus the only possibility to satisfy both the integer-bit and peak-power constraints in such a scenario is using the floor operation $\lfloor b(\sigma_j) \rfloor$. Hence the granularity loss for the $j$th subcarrier is

$$\partial b_j^G = b(\sigma_j) - \lfloor b(\sigma_j) \rfloor, \quad \forall j \in \Omega_1. \tag{8}$$

For subcarriers where rounding is possible without violation of the peak-power constraint:

$$\partial b_j^G = b(\sigma_j) - \operatorname{round}(b(\sigma_j)), \quad \forall j \in \Omega_2. \tag{9}$$

In both cases $\partial b_j^G$ can be treated as a quantization error with a quantization step of 1. Since the variable to be quantized, $b(\sigma_j)$, has a much larger range (up to 15 bits/symbol) than the quantization step, the granularity loss $\partial b_j^G$ can be considered as a uniformly distributed random variable (see [14, page 194]),

$$\partial b_j^G \sim U[0, 1), \quad \forall j \in \Omega_1; \qquad \partial b_j^G \sim U\!\left[-\tfrac{1}{2}, \tfrac{1}{2}\right), \quad \forall j \in \Omega_2. \tag{10}$$

The random variable representing the total granularity loss is $\partial b^G = \sum_{i \in \Omega} \partial b_i^G$ with its average

$$\overline{\partial b^G} = E(\partial b^G) = \sum_{i \in \Omega_1} E(\partial b_i^G) + \sum_{i \in \Omega_2} E(\partial b_i^G) = N_{\Omega_1} \cdot \frac{1}{2} + N_{\Omega_2} \cdot 0 = \frac{N_{\Omega_1}}{2} = \frac{\eta N_\Omega}{2}, \quad \text{where } 0 \le \eta = \frac{N_{\Omega_1}}{N_\Omega} \le 1, \tag{11}$$

where $E(\cdot)$ in (11) represents the stochastic expectation operator. The ratio $\eta$ can be estimated as follows.

(i) TPO Case: In this case, by definition, there is no peak-power constraint, or $\varepsilon_j^{\max} = \infty$ for all $j$, that is, $\Omega_1 = \emptyset$ and $N_{\Omega_1} = 0$, $\eta = 0$. Also, due to the denominator being $\infty$, $E_{\mathrm{budget}} / (\Delta f \cdot \sum_{i=1}^{N} \varepsilon_i^{\max}) = 0$. Thus the average granularity loss is nearly zero, as observed in [5, 6, 13].

(ii) PPO Case: In this case, $\varepsilon_j = \varepsilon_j^{\max}$ for all $j \in \Omega$, that is, $\Omega_1 = \Omega$ and $\Omega_2 = \emptyset$. Thus $N_{\Omega_1} = N_\Omega$ and $\overline{\partial b^G}_{\mathrm{PPO}} = N_\Omega / 2$, $\eta = 1$. $N_\Omega$ is fairly large in xDSL applications (e.g., more than 1000 in VDSL-DMT). Also from (5), $E_{\mathrm{budget}} / (\Delta f \cdot \sum_{i=1}^{N} \varepsilon_i^{\max}) \ge 1$.

(iii) TPP Case: For the TPP case, the analysis of $\eta$ is more involved and depends on the specific scenario. However, observing the values of $\eta$ in the TPO and PPO cases, which act as the boundaries of the TPP case, and its monotonic nature, we can consider the following approximation:

$$\eta \approx \left[ \frac{E_{\mathrm{budget}}}{\Delta f \cdot \sum_{i \in \Omega} \varepsilon_i^{\max}} \right]^{1}, \qquad [x]^{l} \triangleq \begin{cases} l, & x > l, \\ x, & x \le l. \end{cases} \tag{12}$$

$\eta$ represents the relative strictness of the total-to-peak-power constraint, and we can expect that as $\eta$ increases due to a stricter peak-power constraint, granularity losses will be higher. It is worthwhile to note that for a general TPP case, as channel conditions worsen, $\Omega$ shrinks, thereby reducing the denominator of $\eta$. Eventually $\eta$ will increase to 1 and the TPP case will reduce to a PPO case, and all previous inferences will apply. In the VDSL-DMT application, $\eta$ is seen to be fairly close to 1 in most cases and $N_\Omega$ is large; thus the granularity loss is expected to be a fairly significant percentage of the supported rate.

4. ADAPTIVE REED-SOLOMON-BASED FINE-GRANULARITY LOADING SCHEME

In the current VDSL system [3] (footnote 1), as shown in Figure 1, there is only one fixed-rate RS(n, k) encoder with $n = 255$ and $k = 239$ in the PMS-TC layer, and the bit and energy allocation are carried out only in the PMD layer.

The RS(255, 239) coding is applied to bits that can be transmitted in various subcarriers. The coding channel is assumed to be a binary symmetric channel (BSC) with the crossover bit-error probability $P_{e,\mathrm{ch}}$, where $P_{e,\mathrm{ch}}$ represents the BER averaged over all $N$ subcarrier DMT modems. The final system performance is represented by the post-decoding bit-error probability of the RS(n, k) code over GF($2^m$) [15, Equations 4.23, 4.24]:

$$P_{e,\mathrm{dec}}(P_{e,\mathrm{ch}}, n, k) \le \frac{2^{m-1}}{2^m - 1} \sum_{i=t+1}^{n} \frac{i + t}{n} \binom{n}{i} P^i (1 - P)^{n-i}, \quad \text{where } P = 1 - (1 - P_{e,\mathrm{ch}})^m,\ t = \left\lfloor \frac{n - k}{2} \right\rfloor. \tag{13}$$

The above upper bound is less than 0.1 dB away from the exact BER [16].

For RS(255, 239) with $m = 8$, $n = 255$, $k = 239$, and $t = 8$, to achieve $P_{e,\mathrm{dec}} \le 10^{-7}$, we need $P_{e,\mathrm{ch}} < 10^{-3}$ ($5.65 \times 10^{-4}$ to be precise). This is ensured indirectly and approximately using the SNR gap method. Since only M-QAM is used, the uncoded SNR gap for $P_{e,\mathrm{dec}} \le 10^{-7}$ is nearly 9.75 dB for a large range of $M$, and the RS(255, 239) code is assumed to provide a uniform coding gain $\gamma_c = 3.75$ dB. Thus $\Gamma_j = \Gamma = 9.75 - \gamma_c$ [17] and the code rate $r_j = r = 239/255$ in (4).

Footnote 1: ADSL has a similar structure.
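As a rough numerical check of (13), the bound can be evaluated directly and inverted by bisection to find the largest channel BER that still meets the $10^{-7}$ target. The function names `rs_post_decoding_ber` and `max_channel_ber` are hypothetical, and the bisection is only one convenient way to invert the bound, relying on the bound growing with $P_{e,\mathrm{ch}}$.

```python
import math

def rs_post_decoding_ber(pe_ch, n=255, k=239, m=8):
    """Upper bound (13) on the post-decoding BER of RS(n, k) over GF(2^m)."""
    t = (n - k) // 2                          # correctable symbol errors
    p_sym = 1.0 - (1.0 - pe_ch) ** m          # symbol-error probability P
    tail = sum(
        (i + t) / n * math.comb(n, i) * p_sym**i * (1.0 - p_sym) ** (n - i)
        for i in range(t + 1, n + 1)
    )
    return 2 ** (m - 1) / (2**m - 1) * tail

def max_channel_ber(target=1e-7, n=255, k=239, m=8):
    """Largest P_e,ch in [0, 0.5] whose bound (13) stays at or below target."""
    lo, hi = 0.0, 0.5
    for _ in range(100):                      # bisection on a monotone bound
        mid = 0.5 * (lo + hi)
        if rs_post_decoding_ber(mid, n, k, m) <= target:
            lo = mid
        else:
            hi = mid
    return lo
```

For the default RS(255, 239) parameters, `max_channel_ber()` should land close to the $5.65 \times 10^{-4}$ quoted above, while `rs_post_decoding_ber(1e-3)` exceeds the target.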

[Figure 1 block diagram, from top to bottom: TPS-TC; PMS-TC (scrambler/descrambler, header MUX/DEMUX, FEC RS(n, k) codec, interleaver/deinterleaver, MUX/DEMUX); PMD (data encoder/decoder, modulator/demodulator, multicarrier modulation/demodulation, cyclic extension/strip cyclic prefix, windowing/undo windowing); analog front-end (AFE); transmission medium (channel).]

PMD: physical medium-dependent. PMS-TC: physical media specific transmission convergence. TPS-TC: transport protocol specific transmission convergence. FEC: forward error correction, RS(255, 239).

Figure 1: Functional diagram of PMD and PMS-TC sublayers in current VDSL-DMT system.

[Figure 2 block diagram: input bit stream, variable RS(255, k) encoder, adaptive M-QAM modulator, and input energy scaling (transmitter), followed by the channel gain and AWGN (channel).]

Figure 2: Subcarrier transmission model.

In the proposed adaptive RS-based fine-granularity loading (ARSFGL) scheme, instead of using a fixed-rate RS(n, k) code for all subcarriers, we assume a variable rate RS(n, $k_i$) code for each subcarrier #i. This can be implemented by replacing the fixed-rate RS codec in Figure 1 with a single programmable RS(255, k) codec [18, 19], which operates on a per-subcarrier basis. Framing and buffering in MUX/DEMUX (Figure 1) will be modified accordingly to support this per-subcarrier RS codec operation, and interleaving may not be required since independence of error patterns is maintained before decoding, unlike in [3]. The loading algorithm provides the allocated rates (i.e., $k_i$ and the number of bits/symbol) and power as follows.

4.1. Rate allocation

Figure 2 depicts the equivalent model representing the transmission operation for each subcarrier. The complex symbol output of the M-QAM modulator is scaled to an input PSD level of $\varepsilon_j$ to achieve the overall received SNR, $\sigma_j = \varepsilon_j \rho_j$, corresponding to the $M_j$-QAM demodulator and RS(n, $k_j$) decoder bit-error probabilities of $P_{e,\mathrm{ch}}(M_j, \sigma_j)$ and $P_{e,\mathrm{dec}}[P_{e,\mathrm{ch}}(M_j, \sigma_j), n, k_j]$, respectively. Our optimization problem is formulated as follows:

$$\underset{k_j, M_j}{\text{maximize}} \quad b(\sigma_j) = \frac{k_j}{n} \times \log_2 M_j, \tag{14}$$

$$\text{constraints:} \quad k_j = 1, 3, 5, \ldots, n, \qquad \log_2 M_j = 1, 2, 3, \ldots, \qquad P_{e,\mathrm{dec}}\big[P_{e,\mathrm{ch}}(M_j, \sigma_j), n, k_j\big] \le 10^{-7}. \tag{15}$$

$P_{e,\mathrm{dec}}[P_{e,\mathrm{ch}}(M_j, \sigma_j), n, k_j]$ is obtained from (13) with $k = k_j$ and $P_{e,\mathrm{ch}} = P_{e,\mathrm{ch}}(M_j, \sigma_j)$. $P_{e,\mathrm{ch}}(M_j, \sigma_j)$ is the BER of $M_j$-QAM in AWGN channels, that is, for odd $\log_2 M_j$ with cross-QAM using impure Gray encoding [20],

$$P_{e,\mathrm{ch}}(M, \sigma) \approx \frac{G_{p,M} N_M}{\log_2 M} \cdot Q\!\left( \sqrt{\frac{2\sigma}{C_{p,M}}} \right), \tag{16}$$

where $G_{p,M}$, $N_M$, and $C_{p,M}$ represent the Gray penalty, number of nearest neighbors, and packing coefficients, respectively. For validation purposes, we simulated cross-constellations constructed from the above scheme and we observe that (16) gives an accurate estimate of BER for all cross-constellations from $2^5, 2^7, \ldots, 2^{15}$ for BERs below 0.07.

For even $\log_2 M_j$ with square-QAM using perfect Gray encoding [21],

$$P_{e,\mathrm{ch}}(M, \sigma) = \frac{2}{\log_2 M} \sum_{s=1}^{\log_2 \sqrt{M}} P(s, \sigma), \tag{17}$$

where

$$P(s, \sigma) = \frac{1}{\sqrt{M}} \sum_{i=0}^{(1 - 2^{-s})\sqrt{M} - 1} \left\{ (-1)^{\left\lfloor i \cdot 2^{s-1} / \sqrt{M} \right\rfloor} \left( 2^{s-1} - \left\lfloor \frac{i \cdot 2^{s-1}}{\sqrt{M}} + \frac{1}{2} \right\rfloor \right) \times \operatorname{erfc}\!\left( (2i + 1) \sqrt{\frac{3\sigma}{2(M - 1)}} \right) \right\}. \tag{18}$$

Note that $b(\sigma_j)$ increases monotonically with $k_j$ and $M_j$, and that $P_{e,\mathrm{ch}}(M_j, \sigma_j)$ and $P_{e,\mathrm{dec}}[P_{e,\mathrm{ch}}(M_j, \sigma_j), n, k_j]$ increase monotonically with $M_j$ and $k_j$, respectively. Thus we can search for $M_j$ and $k_j$ in a sequential manner. At first, $M_j$ is found to be within the limits specified by the uncoded case and the ideal Shannon limit, that is, $\log_2(1 + \sigma_j/\Gamma) \le \log_2 M_j \le \log_2(1 + \sigma_j)$. We then search for $k_j$ in descending order, that is, from $n$ to $(n - 2)$, $(n - 4)$, ..., until $P_{e,\mathrm{dec}}[P_{e,\mathrm{ch}}(M_j, \sigma_j), n, k_j] \le 10^{-7}$. The optimum values for $k_j$ and $M_j$ for a given $\sigma_j$ can also be precalculated and stored in a table such as Table 1 so that the search for $k_j$ and $M_j$ can be done by table lookup.

Table 1: Example of rate lookup table.

σ (dB)   Optimum k_j (1–255)   Optimum log2(M_j)
30.0     245                   8
30.5     247                   8
31.0     249                   8
31.5     251                   8
32.0     229                   9
32.5     235                   9
33.0     239                   9
33.5     223                   10
34.0     229                   10
34.5     235                   10
35.0     239                   10

The optimized rate function (14) of the proposed ARSFGL is plotted along with that of the integer-bit-loading for VDSL in Figure 3. The finer granularity and inherent gains in rate can be clearly seen. The gains stem from the fact that while $k$, and hence $P_{e,\mathrm{ch}}$, are fixed in the existing VDSL schemes, the proposed ARSFGL scheme varies $P_{e,\mathrm{ch}}(M_j, \sigma_j)$, jointly optimizing the adaptive coding and modulation schemes to achieve the maximum information rate. The gain in rate offered by the proposed ARSFGL is larger at higher SNR because the proposed ARSFGL uses the bit-error probability (BER) criterion while the existing VDSL loading scheme is based on symbol-error probability (SER) [1]. As SNR increases, higher $M$ can be used and the difference between BER and SER becomes significant. Hence the BER-based ARSFGL is closer to the constraint $P_{e,\mathrm{dec}} \le 10^{-7}$. Another reason for choosing the BER-based scheme is that for the choice of RS(255, $k_j$) on each subcarrier, the input BER $P_{e,\mathrm{ch}}(M_j, \sigma_j)$ is a more meaningful quantity than the $M_j$-ary SER (see (13)).

Footnote 2: Based on [3], no other code than RS is assumed in Figure 3. When additional or higher-performance coding is used, the gap between the Shannon limit and both curves in Figure 3 would be reduced by the same amount due to the additional coding gain. However, the granularity loss would remain the same.

4.2. Energy allocation

As can be seen from Figure 3, the ARSFGL rate function is nondecreasing and can provide near-continuous rate adaptation. These conditions are sufficient for (5) to be satisfied (footnote 3). Thus, for the PPO case, the optimal power allocation will be the PSD constraint. For the TPP case, however, the energy allocation depends on the rate function. Note that the solution for a continuously differentiable, strictly increasing, and strictly concave rate function is already available in (6) and (7). Furthermore, the ARSFGL rate function is close to having the above properties. Therefore, we consider the rate function

Footnote 3: It is straightforward to verify (5) to hold for a continuous and increasing case. For the continuous and nondecreasing case, the only change is that (5) is no longer the unique optimum and solutions with smaller total energy might exist.
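The rate search of Section 4.1 can be sketched in Python for the square-QAM branch only. The sketch evaluates (17)–(18) for even $\log_2 M$, plugs the result into the bound (13), and steps $k$ downward over odd values until the $10^{-7}$ target is met. The cross-QAM branch (16) is left out because the values of $G_{p,M}$, $N_M$, and $C_{p,M}$ are not listed in the text, the constellation size is supplied by the caller rather than searched, and all function names are hypothetical.

```python
import math

def square_qam_ber(log2_m, snr):
    """Eqs. (17)-(18): BER of Gray-coded square M-QAM at linear SNR `snr`."""
    root = 2 ** (log2_m // 2)                    # sqrt(M); log2_m must be even
    arg = math.sqrt(3.0 * snr / (2.0 * (2**log2_m - 1)))
    total = 0.0
    for s in range(1, log2_m // 2 + 1):
        upper = root - root // 2**s              # (1 - 2^-s) * sqrt(M)
        for i in range(upper):
            q = i * 2 ** (s - 1)
            sign = -1.0 if (q // root) % 2 else 1.0
            # floor(q/root + 1/2), done in integer arithmetic
            weight = 2 ** (s - 1) - (2 * q + root) // (2 * root)
            total += sign * weight * math.erfc((2 * i + 1) * arg)
    return 2.0 * total / (log2_m * root)

def rs_bound(pe_ch, n, k, m=8):
    """Eq. (13): upper bound on the post-decoding BER of RS(n, k)."""
    t = (n - k) // 2
    p = 1.0 - (1.0 - pe_ch) ** m
    tail = sum((i + t) / n * math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(t + 1, n + 1))
    return 2 ** (m - 1) / (2**m - 1) * tail

def best_k(log2_m, snr_db, n=255, target=1e-7):
    """Largest odd k meeting the target for a given square constellation."""
    pe_ch = square_qam_ber(log2_m, 10.0 ** (snr_db / 10.0))
    for k in range(n, 0, -2):                    # n, n-2, n-4, ...
        if rs_bound(pe_ch, n, k) <= target:
            return k
    return None                                  # target not reachable
```

The values returned by `best_k(8, 30.0)` and `best_k(8, 31.0)` can be compared with the 30.0 dB and 31.0 dB rows of Table 1; a full lookup table would additionally loop over the admissible values of $\log_2 M_j$ and keep the pair that maximizes (14).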

[Figure 3 plot: rate (bits) versus SNR (dB), showing the Shannon capacity log2(1 + σ), the ARSFGL rate, and the integer-bit VDSL rate, with the unachievable region marked.]

Figure 3: ARSFGL performance: rate versus SNR.

approximated by method proposed in [7] with the following minor changes f B = Δ f · to suit our notation and special usage of (19), ( ) εmax be(σ) = α log (βσ + γ . (19) N B − γ/βρ j − E b ≡ {γ/βρ } 2 j=1[ j ]0 budget, 0 min1≤i≤N j ,and max b1 ≡ max1≤i≤N {εj + γ/βρj}. b0 and b1 are the limits of the The approximation4 is achieved by curve-fitting and the secant-based search. After incorporating these changes, the values of α = 0.9597, β = 0.2736, and γ = 0.8232 yield a pseudocode in [7, Table I] can be directly used. It is worth- mean-squared error of less than 0.0076 bits. while to note that by virtue of providing near-continuous From (6)and(7)withbe(σ) as the rate function, the final rate adaptation, a secondary iterative procedure characteris- solution to the TPP energy allocation problem is tic to integer-bit algorithms (e.g., bit-rounding and energy- adjustment in [5, 6] or bisection search in [7]) is not neces- εmax B − γ j sary. Thus the energy allocation for ARSFGL is simpler. εj = , (20) βρj 0 5. ILLUSTRATIVE EXAMPLES FOR APPLICATION TO where VDSL-DMT SYSTEMS ⎧ ⎪ ⎨⎪m, x ≥ m, We consider the 4 transmit PSDs specified for VDSL-DMT x m  x

Table 2: Occurrence of PPO and TPP cases in VDSL-DMT.

PSD       | Upstream Δf·Σ_i ε_i^max (dBm) | Upstream E_budget (dBm) | Downstream Δf·Σ_i ε_i^max (dBm) | Downstream E_budget (dBm)
M1 FTTCab | 6.94                          | 14.5                    | 8.39                            | 11.5
M2 FTTCab | 13.26                         | 14.5                    | 14.47                           | 11.5
M1 FTTEx  | 6.94                          | 14.5                    | 20.54                           | 14.5
M2 FTTEx  | 13.26                         | 14.5                    | 21.52                           | 14.5

Table 3: Simulation parameters.

Parameter             | Value
Number of subcarriers | 4096
Cyclic prefix length  | 640 samples
US carriers           | U1: 870–1205, U2: 1972–2782
DS carriers           | D1: 33–869, D2: 1206–1972
Loop and basic noise  | Loop 1 with AWGN (−140 dBm/Hz) + 20 VDSL xTalkers

Case-specific parameter | PPO             | TPP
Direction               | Upstream        | Downstream
Total-power constraint  | 14.5 dBm        | 11.5 dBm
Tx PSD constraint       | M1 FTTCab       | M2 FTTCab
Additional noise        | + Alien noise A | + Alien noise F

For PPO cases, (5) indicates that the energy allocation is independent of the rate allocation function. Thus all existing algorithms would result in the same solution because they strive for optimization in the energy domain, and in this case the energy distribution is completely decided by the peak-power constraint. The received SNR profile resulting from any loading algorithm would be $\sigma_j = \varepsilon_j^{\max} \rho_j$.

The general simulation parameters and those specific to the PPO case are presented in Table 3. This configuration resembles Test case-1 in [22], except that we do not fix the data rate at 10 Mbps and instead study its variation over a wide range of loop lengths. The received SNR profile $\{\sigma_i\}_{i=1}^{N}$ and the rate allocation over the subcarriers for this configuration at 2400 ft are presented in Figures 4(a) and 4(b), respectively. The resulting data rates offered by the integer-bit-loading algorithm and the proposed ARSFGL scheme are 10.94 Mbps and 13.41 Mbps⁵, respectively. In other words, the proposed ARSFGL scheme provides an increase in rate of 22.6% (= 13.41/10.94 − 1).

The rate-reach curves for different schemes are presented in Figure 4(c). Any integer-bit-loading algorithm would result in this same distribution as shown for the coded and uncoded cases. The proposed ARSFGL offers a much better reach-rate curve than the integer-bit-loading algorithm. The "theoretical expectation" curve is generated by adding $\partial b_G^{\mathrm{PPO}}$, that is, (11) with η = 1, to the reach-rate curve of the integer-bit-loading algorithm for the coded case at each reach value. The ARSFGL curve closely follows the "theoretical expectation" for distances longer than 1800 ft. However, for distances shorter than 1800 ft, the ARSFGL curve is noticeably better due to the improvements arising from BER-based loading. Shorter distances allow higher SNR and hence higher $M_j$. Therefore, the BER-based improvement is more pronounced, as previously discussed (Figure 3). The improvements offered by the proposed ARSFGL are 23.6%, 27.5%, and 70% at loop lengths of 2500 ft, 3600 ft, and 4000 ft, respectively.

⁴ The approximation is done only for the purpose of energy allocation so that (5)–(7) can be directly used. However, the rate allocation following this energy allocation is done using the lookup table.

⁵ It is worth noting that to achieve this increased rate with the integer-bit-loading algorithm, a coding gain of 8.6 dB would be required, assuming 1 bit redundancy per subcarrier, characteristic of TTCM schemes [23].

5.2. Evaluation of TPP cases

In TPP cases, the peak-power constraint is less stringent than in the PPO case, and hence there is some room for maneuverability in the energy domain to recover some of the granularity losses.

The simulation parameters specific to the TPP case are presented in Table 3. This configuration resembles test case-25 in [22], except that we do not fix the data rate at 22 Mbps and instead study its variation over a wide range of loop lengths. The channel SNR $\rho_j$ for the above configuration and a loop length of 2100 ft is shown in Figure 5(a). In Figure 5(b), the PSD constraint in the form of the M2 FTTCab mask is presented along with the transmit PSD allocated by the ARSFGL scheme and the integer-bit scheme by Baccarelli [7]. The integer-bit scheme leads to a sawtooth distribution, which deviates on both sides of the smooth distribution of the ARSFGL scheme. In Figure 5(c), the resulting bit distributions are presented. Unlike in the PPO case (where $\Omega_2 = \emptyset$), here we observe sets of subcarriers (belonging to $\Omega_2$) where the integer-bit scheme is able to allocate more bits than the ARSFGL scheme due to the sawtooth nature of the energy distribution. This is what we have referred to as recovery of granularity loss through energy readjustment in earlier parts of the paper. It can be seen that, in subcarriers 33–300, where the M2 mask is particularly stringent at −60 dBm/Hz, the ARSFGL scheme is always able to allocate more bits, just as in PPO cases. These subcarriers form a part of the set $\Omega_1$.

The rate-reach curves are presented in Figure 6(a). The rates achieved by the ARSFGL scheme for the TPP case are compared with 3 integer-bit algorithms: Chow's TPP algorithm [5, Section 4.3.4], Baccarelli's (suboptimal) integer-bit

[Figure 4 plots. (a) RSNR (dB) versus subcarrier number; (b) rate (bits) versus subcarrier number for ARSFGL and integer-bit loading; (c) rate (Mbps) versus reach (feet) for ARSFGL, uncoded integer-bit, VDSL integer-bit, and ARSFGL theory.]

Figure 4: ARSFGL performance for PPO case. (a) Channel signal-to-noise ratio, ρ, at 2400 ft, (b) sample rate allocation, b(σj), at 2400 ft, and (c) rate versus reach.

[Figure 5 plots, all at different subcarriers (0–2000). (a) CSNR (dB); (b) PSD (dBm/Hz) for ARSFGL, Baccarelli integer-bit, and the M2 FTTCab mask, with an inset detail over subcarriers 300–800; (c) rate (bits) for ARSFGL and Baccarelli integer-bit.]

Figure 5: ARSFGL performance for TPP case. (a) Channel signal-to-noise ratio, ρ, at different subcarriers for 2100 ft, (b) transmit PSD εj at different subcarriers for 2100 ft, and (c) bit distribution b(ρj εj) at different subcarriers for 2100 ft.

algorithm, and the matroid optimal integer-bit algorithm⁶ in [11]. For easier comparison of schemes, the percentage improvements over Chow's algorithm have been presented in Figure 6(b). From Figure 6(a), we can see that, on average, the ARSFGL scheme provides about 2 Mbps improvement over the integer-bit schemes for loops shorter than 5500 ft. As expected from (12), for loops longer than 4700 ft, η becomes 1 and this case reduces to a PPO case as shown in Figure 6(b), with all the 3 integer-bit schemes giving exactly the same performance. As reach increases, both the granularity loss (which depends on $N_\Omega$) and the rate are reduced. However, the reduction in rate is much faster than that in $N_\Omega$ (and hence in the granularity loss). Since the proposed ARSFGL draws most of its improvement from the granularity loss, its percentage of improvement increases with reach, as shown in Figure 6(b).

⁶ In [11], an optimal solution to the integer-bit TPP problem was proposed. The optimality was proven using the matroid structure of the underlying combinatorial optimization problem. The optimality of the algorithm makes it valuable for benchmarking (in the context of our paper, for the ARSFGL scheme), because this is the best any integer-bit scheme (simple or complicated) can do. However, we must also note that the rate achieved by the algorithm in [7], though suboptimal, is very close to the optimal rate achieved by [11]. This is observed in Figure 6 for VDSL cases and was also seen in [11, Table 4].

[Figure 6 plots, both versus reach (feet), 2000–9000 ft. (a) Rate (Mbps); (b) increase from Chow (%). Curves in both panels: ARSFGL, ARSFGL theory, Chow integer-bit, matroid optimal integer-bit, and Baccarelli integer-bit (suboptimal).]

Figure 6: Performance of various schemes for TPP. (a) Rate-reach curves and (b) percent increase in data rate as compared to Chow’s algorithm.

The theoretical curves are generated by adding $\partial b_G$ from (12) to the rate provided by the matroid optimal integer-bit algorithm at different reach values. It is observed that the rate-reach curve of the ARSFGL closely follows the theoretical expectations, and thereby the assumption on η in Section 3 is validated.

5.3. Evaluation of TPO case

Though the TPO case does not occur in practice, it has been presented here for the sake of completeness. The hypothetical TPO scenario is constructed by removing the PSD constraint from the TPP configuration shown in Table 3.

The power and rate allocation results of Leke's algorithm [6] and the proposed ARSFGL scheme are shown in Figures 7(a) and 7(b), respectively, for a 2400 ft loop.

Figure 8 shows the percentage increase in rate as compared to Chow's algorithm [5] versus loop length offered by the Leke [6], Baccarelli [7], and the optimal (greedy) integer-bit Hughes-Hartogs (HH) [9] algorithms and the proposed ARSFGL scheme. It indicates that the rate increase offered by the Leke, Baccarelli, and Hughes-Hartogs algorithms is less than 1%, while the proposed ARSFGL scheme can provide 4–6% rate improvement for distances up to 7000 ft. This improvement is explained by the fact that though in Section 3, we have assumed bit-rounding to be an unbiased operation

[Figure 7 plots, both versus subcarrier number (0–2000). (a) Tx PSD (dBm/Hz) for Leke integer-bit and ARSFGL; (b) rate (bits) for ARSFGL and Leke integer-bit.]

Figure 7: Power and rate allocation for TPO case and 2400 ft loop. (a) Transmit PSD εj at different subcarriers, and (b) bit distribution b(ρj εj) at different subcarriers.

for simplifying the analysis, rounding up a bit always costs more in terms of energy than rounding down for the same difference due to the logarithmic (concave) nature of the rate function. This bias leads to the granularity loss being positive even for the TPO case due to the $\Omega_2$ set of subcarriers. However, in the PPO and TPP cases, as we observed, this effect is strongly dominated by the loss due to $\Omega_1$.

[Figure 8 plot. Percent increase in rate from Chow (Mbps) versus reach (feet), 2000–9000 ft. Curves: Chow, Baccarelli, Leke, HH, and ARSFGL.]

Figure 8: Performance of various schemes for TPO case.

5.4. Applicability to dynamic spectrum management

The above results and analysis have been presented for the case when spectrum management is performed through specification of spectral masks for all users, which is the currently standardized form of spectrum management in ADSL [2, 8] and VDSL [3], known as static spectrum management. Dynamic spectrum management (DSM) techniques have been recently introduced to improve the reach-rate performance of xDSL, for example, [24, 25]. In a DSM case, the peak-power constraint still occurs, although it is more implicit, and granularity loss still exists. For example, in [26], it was observed that, for a 24-AWG scenario consisting of 4 loops of 600 m and 4 loops of 1200 m, when the 1200 m loops are constrained to achieve a minimum of 5 Mbps, the 600 m loops using iterated water-filling (IWF) [27] can achieve 3.4 Mbps and 7.7 Mbps with integer-bit-loading and ideal continuous bit-loading, respectively. When optimum spectrum management (OSM) is used in the same scenario, the 600 m loops achieve 13 Mbps and 15 Mbps with integer-bit-loading and ideal continuous bit-loading, respectively. With its near-continuous bit-loading nature, the proposed ARSFGL scheme could be used to recover most of such large granularity losses, that is, to approach the rates offered by ideal continuous bit-loading. Furthermore, given the sawtooth and discrete nature of the integer-bit distribution, multiple Nash equilibria might exist, and oscillations around these also seem likely when IWF is used with an integer-bit-loading algorithm. This problem could also be mitigated to a large extent by using the ARSFGL scheme.

6. CONCLUSIONS

We examined the granularity loss due to the integer-bit restriction, which can contribute a significant percentage to the reduction of achievable data rates, especially in peak-power constrained cases, and developed a fine-granularity BER-based loading scheme to recover these losses. This is done by jointly optimizing the coding rate of a programmable RS(n, k) code and the bit and energy allocation on each subcarrier. Illustrative examples of applications to VDSL-DMT systems indicate that the proposed scheme outperforms various existing integer-bit-loading algorithms with an increase in rate of about 20% in most cases. This is a large rate increase as compared to the variation in achievable rates of less than 4% between various existing integer-bit-loading algorithms. This improvement is in good agreement with the theoretical estimates developed to quantify the granularity loss. The theoretical estimates also present an insight into how the granularity losses increase with rising strictness in the peak-power constraint, in comparison to the total-power constraint, and with the number of subcarriers in use. Although the illustrative results are for the currently standardized static spectrum management, it is expected that, with its near-continuous bit-loading nature, the proposed scheme can also be used to recover potential granularity losses that exist in dynamic spectrum management (DSM) cases.

ACKNOWLEDGMENTS

This work is partially supported by a National Sciences and Engineering Research Council Collaborative Research and Development Grant with Laboratoires universitaires Bell. The authors wish to thank Mr. Martino Freda and Mr. Nestor Couras, Electrical and Computer Engineering Department, McGill University, Montreal, Canada, for their indispensable help in channel model construction; and Dr. Ioannis Psaromiligkos, Dr. Jan Bajcsy, and Dr. Harry Leib, all from Electrical and Computer Engineering Department, McGill University, Montreal, Canada, for their insightful discussions. The authors also acknowledge the comments by an anonymous reviewer, who helped to better highlight the improvements of the proposed scheme.

REFERENCES

[1] J. M. Cioffi, “A Multicarrier Primer,” ANSI T1E1.4 Committee Contribution, pp. 91–157, November 1991.
[2] “Asymmetric Digital Subscriber Line (ADSL) Metallic Interface,” ANSI Std. T1.413-1998, 1998.
[3] “Very-high Speed Digital Subscriber Lines (VDSL) Metallic Interface,” ANSI Std. T1E1.4/2003-210R5, 2003.
[4] R. G. Gallager, Information Theory and Reliable Communication, John Wiley & Sons, New York, NY, USA, 1968.
[5] P. S. Chow, “Bandwidth optimized digital transmission techniques for spectrally shaped channels with impulse noise,” Ph.D. dissertation, Stanford University, Stanford, Calif, USA, 1993.
[6] A. Leke and J. M. Cioffi, “A maximum rate loading algorithm for discrete multitone modulation systems,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’97), vol. 3, pp. 1514–1518, Phoenix, Ariz, USA, November 1997.
[7] E. Baccarelli, A. Fasano, and M. Biagi, “Novel efficient bit-loading algorithms for peak-energy-limited ADSL-type multicarrier systems,” IEEE Transactions on Signal Processing, vol. 50, no. 5, pp. 1237–1247, 2002.
[8] “Spectrum Management for Loop Transmission Systems,” ANSI Std. T1.417-2001, January 2001.
[9] D. Hughes-Hartogs, “Ensemble modem structure for imperfect transmission media,” U.S. Patents nos. 4,679,227 (July 1987), 4,731,816 (March 1988) and 4,833,706 (May 1989).
[10] B. S. Krongold, K. Ramchandran, and D. L. Jones, “Computationally efficient optimal power allocation algorithms for multicarrier communication systems,” IEEE Transactions on Communications, vol. 48, no. 1, pp. 23–27, 2000.
[11] E. Baccarelli and M. Biagi, “Optimal integer bit-loading for multicarrier ADSL systems subject to spectral-compatibility limits,” Signal Processing, vol. 84, no. 4, pp. 729–741, 2004.
[12] J. Jang, K. B. Lee, and Y.-H. Lee, “Transmit power and bit allocations for OFDM systems in a fading channel,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’03), vol. 2, pp. 858–862, San Francisco, Calif, USA, December 2003.
[13] P. S. Chow, J. M. Cioffi, and J. A. C. Bingham, “A practical discrete multitone transceiver loading algorithm for data transmission over spectrally shaped channels,” IEEE Transactions on Communications, vol. 43, no. 234, pp. 773–775, 1995.
[14] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, USA, 2nd edition, 1998, (Sec. 4.8.3. Analysis of Quantization Errors).
[15] R. H. Moroles-Zaragoza, The Art of Error Control Coding, John Wiley & Sons, New York, NY, USA, 2000.
[16] L. Zhang, C. Gao, and Z. Cao, “Exact analysis of bit error rate of maximum-distance-separable codes,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’00), vol. 2, pp. 816–819, San Francisco, Calif, USA, November–December 2000.
[17] Telcordia Technologies, “Proposed Bit Rates for Spectral Compatibility with VDSL,” ANSI T1E1.4 Committee Contribution, T1E1.4/2002-159R2, August 2002.
[18] M. A. Hasan and V. K. Bhargava, “Architecture for a low complexity rate-adaptive Reed-Solomon encoder,” IEEE Transactions on Computers, vol. 44, no. 7, pp. 938–942, 1995.

[19] Y. R. Shayan and T. Le-Ngoc, “A cellular structure for a versatile Reed-Solomon decoder,” IEEE Transactions on Computers, vol. 46, no. 1, pp. 80–85, 1997.
[20] J. G. Smith, “Odd-bit quadrature amplitude-shift keying,” IEEE Transactions on Communications, vol. 23, no. 3, pp. 385–389, 1975.
[21] K. Cho and D. Yoon, “On the general BER expression of one- and two-dimensional amplitude modulations,” IEEE Transactions on Communications, vol. 50, no. 7, pp. 1074–1080, 2002.
[22] “VDSL Test Specification for VDSL Olympics,” ANSI T1E1.4 Contribution, T1E1.4/2003-036R4, February 2003.
[23] J. P. Lauer and J. M. Cioffi, “A turbo trellis coded discrete multitone transmission system,” in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM ’99), vol. 5, pp. 2581–2585, Rio de Janeiro, Brazil, December 1999.
[24] K. J. Kerpez, D. L. Waring, S. Galli, J. Dixon, and P. Madon, “Advanced DSL management,” IEEE Communications Magazine, vol. 41, no. 9, pp. 116–123, 2003.
[25] K. B. Song, S. T. Chung, G. Ginis, and J. M. Cioffi, “Dynamic spectrum management for next-generation DSL systems,” IEEE Communications Magazine, vol. 40, no. 10, pp. 101–109, 2002.
[26] R. Cendrillon, W. Yu, M. Moonen, J. Verlinden, and T. Bostoen, “Optimal multi-user spectrum management for digital subscriber lines,” to appear in IEEE Transactions on Communications, http://www.comm.toronto.edu/~weiyu/osm.pdf.
[27] W. Yu, G. Ginis, and J. M. Cioffi, “Distributed multiuser power control for digital subscriber lines,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, 2002.

Saswat Panigrahi received the B.Tech. degree in electrical engineering from the Indian Institute of Technology (IIT), Kanpur, India in 2003. He is currently finishing his M.Eng. degree in electrical engineering at McGill University, Montréal, QC, Canada. His current research interests include multicarrier systems, coding theory, and optimization.

Tho Le-Ngoc obtained his B.Eng. degree (with distinction) in electrical engineering in 1976, his M.Eng. degree in microprocessor applications in 1978 from McGill University, Montreal, and his Ph.D. degree in digital communications in 1983 from the University of Ottawa, Canada. From 1977 to 1982, he was with Spar Aerospace Limited and was involved in the development and design of satellite communications systems. From 1982 to 1985, he was an Engineering Manager of the Radio Group in the Department of Development Engineering of SR Telecom Inc. and developed the new point-to-multipoint DA-TDMA/TDM subscriber radio system SR500. From 1985 to 2000, he was a Professor at the Department of Electrical and Computer Engineering of Concordia University. Since 2000, he has been with the Department of Electrical and Computer Engineering of McGill University. His research interest is in the area of broadband digital communications with a special emphasis on modulation, coding, and multiple-access techniques. He is a Senior Member of the Ordre des ingénieurs du Québec, a Fellow of the Institute of Electrical and Electronics Engineers (IEEE), a Fellow of the Engineering Institute of Canada (EIC), and a Fellow of the Canadian Academy of Engineering (CAE). He is the recipient of the 2004 Canadian Award in Telecommunications Research and the recipient of the IEEE Canada Fessenden Award 2005.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 70387, Pages 1–9
DOI 10.1155/ASP/2006/70387

Intra-Symbol Windowing for Egress Reduction in DMT Transmitters

Gert Cuypers,1 Koen Vanbleu,2 Geert Ysebaert,3 and Marc Moonen1

1 ESAT/SCD-SISTA, Katholieke Universiteit Leuven, 3001 Heverlee, Belgium
2 Broadcom Corporation, 2800 Mechelen, Belgium
3 Alcatel Bell, 2018 Antwerp, Belgium

Received 28 December 2004; Revised 20 July 2005; Accepted 22 July 2005

Discrete multitone (DMT) uses an inverse discrete Fourier transform (IDFT) to modulate data on the carriers. The high sidelobes of the IDFT filter bank used can lead to spurious emissions (egress) in unauthorized frequency bands. Applying a window function within the DMT symbol can alleviate this. However, window functions either require additional redundancy or will introduce distortions that are generally not easy to compensate for. In this paper, a special class of window functions is constructed that corresponds to a precoding at the transmitter. These do not require any additional redundancy and need only a modest amount of additional processing at the receiver. The results can be used to increase the spectral containment of DMT-based wired communications such as ADSL and VDSL (i.e., asymmetric, resp., very-high-bitrate digital subscriber loop).

Copyright © 2006 Gert Cuypers et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Discrete Fourier transform (DFT-) based modulation techniques [1] have become increasingly popular for high-speed communications systems. In the wireless context, for example, for the digital transmission of audio and video, this is usually referred to as orthogonal frequency-division multiplexing (OFDM). Its wired counterpart has been dubbed discrete multitone (DMT), and is employed, for example, for digital subscriber loop (DSL) systems, such as asymmetric DSL (ADSL) and very-high-bitrate DSL (VDSL).

A high bandwidth efficiency is achieved by dividing the available bandwidth into small frequency bands centered around carriers (tones). These carriers are individually modulated in the frequency domain, using the inverse DFT (IDFT). A cyclic prefix (CP) is added to the resulting block of time-domain samples by copying the last few samples and putting them in front of the symbol [2]. This extended block is parallel-to-serialized, passed to a digital-to-analog (DA) convertor, and then transmitted over the channel. At the receiver, the signal is sampled and serial-to-parallelized again. The part corresponding to the CP is discarded, and the remainder is demodulated using the DFT.

In case the order of the channel impulse response does not exceed the CP length by more than one, equalization can be done easily using a one-tap frequency-domain equalizer (FEQ) for each tone, correcting the phase shift and attenuation at each tone individually. When the channel impulse response is longer than the CP, the transmission suffers from intercarrier interference (ICI) and intersymbol interference (ISI), requiring more complex receivers, for example, a per-tone equalizer (PTEQ) [3]. The windowing technique presented in this article is irrespective of the equalization technique used but can be combined with the PTEQ in a very elegant way.

In addition to a CP, VDSL systems can also use a cyclic suffix (CS). The difference between the CP and CS is irrelevant to this article; therefore, they will be treated as one (larger) CP. More importantly, the presence of the CP influences the spectrum of the transmit signal, as will be shown later.

While DMT seems attractive because of its flexibility towards spectrum control, the high sidelobe levels associated with the DFT filter bank form a serious impediment, resulting in an energy transfer between in-band and out-of-band signals. This contributes to the crosstalk, for example, between different pairs in a binder, especially for next-generation DSL systems using dynamic spectrum management (DSM), where the transmit band is variable [4]. Moreover, because the twisted pair acts as an antenna [5], there exists a coupling with air signals. The narrowband signals from, for example, an AM broadcast station can

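[Figure 1 block diagram. Transmitter: subsymbols X_0^(k), ..., X_{N−1}^(k) → IDFT → add CP → P/S → D/A; channel H with additive white Gaussian noise (AWGN); receiver: A/D → S/P → DFT → PTEQ → outputs Z_0^(k), ..., Z_{N−1}^(k). Reference points α, β, and γ are marked on the diagram.]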

Figure 1: Basic DMT system (refer to text for α to γ).

be picked up by the receiver and, due to the sidelobes, be smeared out over a broad frequency tone range. This problem has been recognized, and various schemes have been developed to tackle it (see [6–8]). On the other hand, the same poor spectral containment of transmitted signals makes it difficult to meet egress norms; for example, the ITU norm [9] specifies that the transmit power of VDSL should be lowered by 20 dB in the amateur radio bands. Controlling egress is usually done in the frequency domain by combining neighbouring IDFT inputs (such as in [10]) or, equivalently, by abandoning the DFT altogether and reverting to other filter banks, such as, for example, in [11].

Another approach would be to apply an appropriate time-domain window (see [12] for an overview) at the transmitter. Unfortunately, the application of nonrectangular windows destroys the orthogonality between the tones, resulting in ICI. In [13], a windowed VDSL system is proposed, where the window is applied to additional cyclic continuations of the DMT symbol to prevent distorting the symbol itself.

The technique proposed in this article avoids the overhead resulting from such additional symbol extension by applying the window directly to the DMT symbol, that is, without adding additional guard bands. This windowing is observed to correspond to a precoding operation at the transmitter. Obviously, this alters the frequency content at each carrier, such that a correction at the receiver is needed. While this compensation is generally nontrivial [14], we construct a class of windows that can be compensated for with only a minor amount of additional computations at the receiver.

When investigating transmit windowing techniques, it is important to have an accurate description of the transmit spectrum of DMT/OFDM signals. Although DMT and OFDM are commonplace, a lot of misconception and confusion seem to exist with regard to the nature of their transmit signal spectrum. When working on sampled channel data, the continuous-time character of the line signals is transparent, and therefore usually neglected. However, it is important to realize that the behaviour in between the sample points can be of great importance [15]. The analog signal will generally exceed the sampled points' reach, possibly leading to unnoticed clipping, and hence out-of-band radiation.

Therefore, Section 2 starts by describing the spectrum of the classical DMT signal. The novel windowing system is then presented in Section 3. Section 4 covers the simulation results. Finally, in Section 5, conclusions are presented.

2. DMT TRANSMIT SIGNAL SPECTRUM

Consider the DMT system of Figure 1, with DFT-size $N$ and a CP length $\nu$, resulting in a symbol length $L = N + \nu$. The symbol index is $k$, and $\mathbf{X}^{(k)} = [X_0^{(k)} \cdots X_{N-1}^{(k)}]^T$ holds the complex subsymbols at tones $i$, $i = 0 : N-1$. In a baseband system, such as ADSL, the time-domain signal is real-valued, requiring that $X_i^{(k)} = X_{N-i}^{(k)*}$. The corresponding discrete time-domain sample vector (at point α in Figure 1) is equal to

\[
\mathbf{x}^{(k)} = \left[ x^{(k)}[0], \ldots, x^{(k)}[L-1] \right]^T, \qquad
x^{(k)}[n] = \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} X_i^{(k)} e^{j(2\pi i/N)(n-\nu)}, \quad n = 0, \ldots, L-1. \tag{1}
\]

Note that the CP is automatically present, due to the periodicity of the complex exponentials. The total discrete time-domain sample stream $x[n]$ is obtained as a concatenation of the individual symbols $\mathbf{x}^{(k)}$. Interpolation of these samples yields the continuous time-domain signal $s(t)$, given by

\[
s(t) = \int_{\tau=-\infty}^{\infty} v(\tau - t) \sum_{n=-\infty}^{\infty} \delta(t - nT)\, x[n]\, d\tau, \qquad
x[n] = \frac{1}{\sqrt{N}} \sum_{k=-\infty}^{\infty} \sum_{i=0}^{N-1} X_i^{(k)} e^{j(2\pi i/N)(n-\nu-kL)}\, w_{r,s}[n-kL], \tag{2}
\]

with $\delta(t)$ the Dirac impulse function, $T$ the sampling period, $w_{r,s}[n]$ a (rectangular, sampled) discrete time-domain window, $w_{r,s}[n] = 1$ for $0 \le n \le L-1$ and zero elsewhere, and $v(t)$ an interpolation function.

The shape of the DMT spectrum will now be derived by construction, starting from a single symbol with only one active carrier at DC. This result will be extended to a succession of symbols with all carriers excited. After this, the influence of time-domain windowing will be investigated in Section 3.

Assume a single DMT symbol, having a duration $L = N + \nu$, in which only the DC component is excited (e.g., with unit value), in other words,

\[
X_i^{(k)} = \begin{cases} 1, & i = 0,\ k = 0, \\ 0, & \text{elsewhere.} \end{cases} \tag{3}
\]

The corresponding discrete time-domain signal is a sequence of $L$ identical pulses, which is equivalent to a multiplication of a rectangular window and an impulse train (Figure 2). A

[Figure 2 plot. Amplitude (0 to 1) versus time t, with T, L − 1, and L marked on the time axis. Curves: rectangular window w_r(t), interpolated window w_i(t), sampled window w_{r,s}(t), and the next symbol.]

Figure 2: The first (DC only) symbol as a sampled rectangular window, and a possible next symbol.

[Figure 4 plot. PSD versus frequency, with widths ∼ N^−1 and ∼ (N + ν)^−1 marked. Curves: prefixless system, single prefixless tone, prefix system, and single tone with prefix.]

Figure 4: The cyclic prefix in DMT systems leads to a toothed spectrum exhibiting valleys in between the tones.
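The toothed spectrum of Figure 4 can be reproduced numerically from the closed form of the sampled rectangular window spectrum, (5), and the PSD sum, (6). The sketch below is illustrative only and rests on assumptions not made in the text: an ideal interpolation filter with V(f) = 1, equal variance on all N tones (ignoring the special treatment of DC and the Nyquist tone), T = 1, and hypothetical function names. It also evaluates the error made when the sampled-window spectrum is replaced by a sinc.

```python
import numpy as np


def sampled_window_mag(f, L, T=1.0):
    """|W_rs(f)| = |sin(pi*L*T*f) / sin(pi*T*f)|, with the limit value L at the poles."""
    f = np.atleast_1d(np.asarray(f, dtype=float))
    num = np.sin(np.pi * L * T * f)
    den = np.sin(np.pi * T * f)
    out = np.full(f.shape, float(L))
    ok = np.abs(den) > 1e-12
    out[ok] = np.abs(num[ok] / den[ok])
    return out


def dmt_psd(f, N, nu, T=1.0):
    """PSD sum over all tones with unit variances and V(f) = 1 (both are assumptions)."""
    L = N + nu
    f = np.atleast_1d(np.asarray(f, dtype=float))
    tones = np.arange(N) / (N * T)
    return sum(sampled_window_mag(f - fi, L, T) ** 2 for fi in tones)


def valley_depth_db(N, nu, tone=10):
    """PSD halfway between two tones relative to the PSD at a tone centre, in dB."""
    centre = dmt_psd(tone / N, N, nu)[0]
    midway = dmt_psd((tone + 0.5) / N, N, nu)[0]
    return 10.0 * np.log10(midway / centre)


def sinc_error_db(f, T=1.0):
    """Underestimate, in dB, when |W_rs(f)| is replaced by |sin(pi*L*T*f) / (pi*T*f)|."""
    x = np.pi * T * f
    return 20.0 * np.log10(x / np.sin(x))


depth_no_prefix = valley_depth_db(N=64, nu=0)     # flat spectrum under these assumptions
depth_with_prefix = valley_depth_db(N=64, nu=16)  # valleys appear between the tones
edge_error = sinc_error_db(0.5)                   # evaluated at f = 1/(2T)
```

Under these simplifications the prefixless spectrum comes out flat, while a prefix of ν = 16 on N = 64 tones opens valleys a few dB deep halfway between the tones.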

The final DA conversion consists of a lowpass filtering L with v(t), such that only the frequencies between −1/T and 1/T are withheld. In the case of an ideal lowpass filter, this is equivalent to a time-domain interpolation with a sinc func- |·| tion, resulting in wi(t), as shown in Figure 2. Note that the continuous behaviour in between the sampled values is far from constant. 0 This result can now be extended to describe a succes- − 1/(2T)01/(2LT)1/(2T) f sion of multiple symbols (k = 0, 1, ...), with all tones (i = − (k) 0, 1, ..., N 1) excited. Assume that the Xi have a variance E| (k)|2 = 2 |Wr,s( f )| Xi σi , and are uncorrelated. The power spectral den- | | Wr ( f ) sity (PSD) S( f )ofs(t) can then be described as

Figure 3: Spectrum of the continuous and sampled rectangular N−1 2 window. = 2 − i · S( f ) σi Wr,s f V( f ) ,(6) i=0 NT rectangular window wr (t) extending from t = 0tot = L has a modulated sinc as its Fourier transform with V( f ) the frequency characteristic of the interpolation filter v(t) (an example of this is shown in Section 4). sin(πLf ) ν = Wr ( f ) = . exp(− jπLf ). (4) Only in the case where the prefix is omitted ( 0) and πf 2 = 2 the variances σi σ are equal for all tones (except DC and the Nyquist frequency, having only σ2/2), this spectrum is The multiplication of this w (t) with a sequence of pulses r more or less flat. In general, the CP results in a toothed spec- with period T results in the spectrum W ( f ) being convolved r trum. Indeed, because the symbols are lengthened by the CP, with a pulse train with period 2π/T. The original sinc spec- the PSD of the individual tones is narrowed compared to trum W ( f )andtheconvolvedspectrumW ( f ) are repre- r r,s the intertone distance, such that “valleys” (or “teeth”) ap- sented in Figure 3.Here,W ( f ) is periodic with a period r,s pear in between the tone frequencies. This is demonstrated in 1/T. Surprisingly, this can be expressed analytically as [16] Figure 4, where a detail of the spectrum of a prefixless DMT sin(πLT f ) system (ν = 0) is compared to a system using a prefix. W ( f ) = exp(− jπLf ). (5) r,s sin(πT f ) 3. TRANSMITTER WINDOWING In literature, Wr,s( f ) is sometimes approximated by a sinc. While this approximation is suitable for some applications, Practical lowpass filters are not infinitely steep, such that it leads to an underestimate of the (possible egress) energy some small signal components above the Nyquist frequency in nonexcited frequency bands. More specifically, from (5), it will remain. The out-of-band performance is then largely de- is clear that this leads to a maximum error of 3.9 dB around pendent on the quality of these filters (and possible clipping f =±1/2T. 
in further analog stages). On the other hand, the in-band transitions (e.g., for suppression of VDSL in the amateur radio bands) can only be sharpened by the application of a window function on the entire time-domain symbol. To achieve this, the rectangular window $w_{r,s}[n]$ is replaced by another one having faster decaying sidelobes. This new window

$$\mathbf{w} = \begin{bmatrix} w(0) & \cdots & w(L-1) \end{bmatrix}^T \tag{7}$$

is applied at point α in Figure 1. In the next paragraph we impose constraints on $\mathbf{w}$ to construct a class of window functions that are easy to compensate for at the receiver.

3.1. Derivation of the window structure

To preserve the cyclic structure of the transmitted symbols, needed for an easy equalization, we impose the cyclic constraint

$$w(n) = w(n + N), \quad n = 0, \ldots, \nu - 1. \tag{8}$$

As a result, instead of applying the window $\mathbf{w}$ at point α (Figure 1), one can also apply the window

$$\mathbf{g} = \begin{bmatrix} g(0) & \cdots & g(N-1) \end{bmatrix}^T = \begin{bmatrix} w(\nu) & \cdots & w(N+\nu-1) \end{bmatrix}^T \tag{9}$$

at point β. Let $\mathbf{G}$ be a diagonal matrix with $\mathbf{g}$ as its diagonal. After defining $\mathcal{I}_N$ the IDFT-matrix of size $N$, the vector of windowed samples $\mathbf{x}_w^{(k)}$ at point β (before the application of the CP) can be written as

$$\mathbf{x}_w^{(k)} = \underbrace{\begin{bmatrix} g(0) & 0 & \cdots & 0 \\ 0 & g(1) & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & g(N-1) \end{bmatrix}}_{\mathbf{G}} \mathcal{I}_N \cdot \mathbf{X}^{(k)}. \tag{10}$$

As the product of a diagonal matrix and the IDFT-matrix is equal to the product of the IDFT-matrix and a circulant matrix, we can rewrite (10) as

$$\mathbf{x}_w^{(k)} = \mathcal{I}_N \underbrace{\begin{bmatrix} c(0) & c(1) & \cdots & c(N-1) \\ c(N-1) & c(0) & \ddots & c(N-2) \\ \vdots & \ddots & \ddots & \vdots \\ c(1) & \cdots & \cdots & c(0) \end{bmatrix}}_{\mathbf{C}} \cdot \mathbf{X}^{(k)}. \tag{11}$$

The circulant matrix $\mathbf{C}$ (“C” for coding) is fully defined by its first row $\mathbf{c}^T$, with

$$\mathbf{c} = \begin{bmatrix} c(0) & \cdots & c(N-1) \end{bmatrix}^T = \mathcal{I}_N \cdot \mathbf{g}, \tag{12}$$

that is, the IDFT of $\mathbf{g}$. The transition from (10) to (11) is more than mathematical trickery. Looking at the DMT-scheme incorporating transmitter windowing of Figure 5, it becomes clear that the windowing operation in the time domain is equivalent to the multiplication of the subsymbol vector $\mathbf{X}^{(k)}$ with a (pre-)coding matrix $\mathbf{C}$. Compensating for the window at the receiver is now identical to a decoding in the frequency domain, which is done by multiplication with the decoding matrix $\mathbf{D} = \mathbf{C}^{-1}$ (“D” for decoding), leaving the rest of the signal path (equalization, etc.) unaltered. Thus, appealing windows should not only satisfy the constraint (8), but preferably also give rise to a sparse decoding matrix $\mathbf{D}$. We will now further investigate the nature of such windows.

Figure 5: Transmitter windowing translates to symbol precoding. (Block diagram: subsymbols $X_0^{(k)}, \ldots, X_{N-1}^{(k)}$, coding $\mathbf{C}$, IDFT, add CP, P/S, channel $H$ with AWGN, S/P, DFT, PTEQ, decoding $\mathbf{D}$, outputs $Z_0^{(k)}, \ldots, Z_{N-1}^{(k)}$; the coding block is shown as equivalent to an IDFT followed by the window $\mathbf{g}$.)

Being the inverse of a circulant matrix, $\mathbf{D}$ is also circulant. We denote the first row of $\mathbf{D}$ as

$$\mathbf{d}^T = \begin{bmatrix} d(0) & \cdots & d(N-1) \end{bmatrix}. \tag{13}$$

Define $\mathcal{F}_N$ the DFT-matrix of size $N$, and

$$\mathbf{f} = \begin{bmatrix} f(0) & \cdots & f(N-1) \end{bmatrix}^T = \mathcal{F}_N \cdot \mathbf{d}. \tag{14}$$

It is now possible to associate to $\mathbf{D}$ a diagonal matrix $\mathbf{F}$, having on its diagonal the elements of $\mathbf{f}$. The following relations now hold:

(i) $\mathbf{C}$ and $\mathbf{D}$ are circulant, with $\mathbf{C}^{-1} = \mathbf{D}$, and have as a first row $\mathbf{c}^T$ and $\mathbf{d}^T$, respectively;
(ii) $\mathbf{G}$ and $\mathbf{F}$ are diagonal, with diagonals $\mathbf{g}$ and $\mathbf{f}$;
(iii) $\mathbf{c} = \mathcal{I}_N \cdot \mathbf{g}$;
(iv) $\mathbf{d} = \mathcal{I}_N \cdot \mathbf{f}$.

From this, we can conclude that $\mathbf{F} = \mathcal{I}_N \cdot \mathbf{D} \cdot \mathcal{F}_N = \mathcal{I}_N \cdot \mathbf{C}^{-1} \cdot \mathcal{F}_N = (\mathcal{I}_N \cdot \mathbf{C} \cdot \mathcal{F}_N)^{-1} = \mathbf{G}^{-1}$. In other words,

$$g(n) = f(n)^{-1}, \quad n = 0, \ldots, N - 1. \tag{15}$$

Since $\mathbf{g}$ is real-valued, so is $\mathbf{f}$. Consequently, $\mathbf{d}$ is the IDFT of a real-valued vector. Because of the IDFT's symmetry properties, the first and middle elements of $\mathbf{d}$ are real-valued, and all other nonzero elements appear in complex conjugate pairs.
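The equivalence between (10) and (11), and the inversion relation (15), can be checked numerically on a small example. The following is only a sketch: the size `N = 8`, the window values in `g`, the subsymbols in `X`, and all helper names are illustrative assumptions, and the IDFT is assumed to carry the $1/N$ normalisation.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive DFT (F_N); with inverse=True the IDFT (I_N), including the 1/N factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def circulant(first_row):
    """Circulant matrix whose row i is the first row shifted right by i, as in (11)."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def matvec(a, x):
    """Plain matrix-vector product."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def max_abs_diff(u, v):
    return max(abs(p - q) for p, q in zip(u, v))

N = 8                                                               # illustrative size
g = [1.0 + 0.4 * math.cos(2 * math.pi * n / N) for n in range(N)]   # real, nonzero window
X = [complex(k + 1, (-1) ** k) for k in range(N)]                   # arbitrary subsymbols

C = circulant(dft(g, inverse=True))   # first row c = I_N g, eq. (12)
f = [1.0 / g_n for g_n in g]          # eq. (15)
D = circulant(dft(f, inverse=True))   # first row d = I_N f

windowed = [g_n * s for g_n, s in zip(g, dft(X, inverse=True))]   # G I_N X, eq. (10)
precoded = dft(matvec(C, X), inverse=True)                        # I_N C X, eq. (11)
decoded = matvec(D, matvec(C, X))                                 # D C X, should equal X

err_precoding = max_abs_diff(windowed, precoded)
err_decoding = max_abs_diff(decoded, X)
print(err_precoding, err_decoding)
```

Both printed errors should be at the level of floating-point rounding, which illustrates that windowing in the time domain and circulant precoding in the frequency domain are two views of the same operation.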

We can now distinguish between three cases.

(i) A general $\mathbf{d}$ (nonsparse).

(ii) A maximally sparse $\mathbf{d}$ (with only three nonzero elements) is as follows:

$$d(n) = \begin{cases} a, & n = 0, \\ b \cdot e^{j\phi}, & n = l, \\ b \cdot e^{-j\phi}, & n = N - l, \\ 0, & n \notin \{0, l, N - l\}, \end{cases} \tag{16}$$

with

$$a, b \text{ real}, \qquad \phi \text{ real} \in [-\pi, \pi], \qquad l \text{ integer} \in [1, N-1], \tag{17}$$

so that

$$\mathbf{D} = \begin{bmatrix} a & 0 & \cdots & 0 & b \cdot e^{j\phi} & 0 & \cdots \\ 0 & \ddots & & & & \ddots & \\ \vdots & & \ddots & & & & \\ 0 & & & \ddots & & & \\ b \cdot e^{-j\phi} & & & & \ddots & & \\ 0 & \ddots & & & & \ddots & 0 \\ \vdots & & & & & 0 & a \end{bmatrix} \tag{18}$$

is a sparse matrix. In practice, this means that $\mathbf{f}$ ($\mathbf{f} = \mathcal{F}_N \cdot \mathbf{d}$) takes the form of a generalized raised cosine function. The different parameters influencing $\mathbf{f}$ are the pedestal height $a$, the frequency and amplitude of the sinusoidal part $l$ and $b$, and $\phi$ determining the position of the peak(s).

(iii) Intermediate structures. Obviously, multiple complex pairs can be included (hence 5, 7, ... nonzero elements in $\mathbf{d}$), possibly leading to more powerful windows. A tradeoff should be made between the window quality and the complexity of the decoding.

3.2. Determining the window parameters

Returning to the original goal of egress reduction, we now need to choose $\mathbf{w}$ such that an improved sidelobe characteristic is obtained. For the rectangular window, the width of the mainlobe is equal to $\omega_s = 2 \cdot \pi/(N + \nu)$. Note that this decreases with increasing CP length. As a general design criterion, we specify that the power outside the mainlobe $\omega_s = 2 \cdot \pi/(N + \nu)$ should be as low as possible. Assuming that the total energy is kept constant, this is equivalent to maximizing the energy $\rho$ within the mainlobe [17], that is, maximizing

$$\rho = \int_0^{\omega_s} \left| W\!\left(e^{j\omega}\right) \right|^2 \frac{d\omega}{\pi}, \tag{19}$$

with

$$W(z) = \mathbf{w}^T \mathbf{e}(z), \tag{20}$$

$$\mathbf{e}(z) = \begin{bmatrix} 1 & z & \cdots & z^{N+\nu-1} \end{bmatrix}^T \tag{21}$$

under unit-energy constraint

$$\mathbf{w}^T \cdot \mathbf{w} = 1. \tag{22}$$

Equation (19) can be written as

$$\rho = \mathbf{w}^T \left( \int_0^{\omega_s} \mathbf{e}^{*}\!\left(e^{j\omega}\right) \mathbf{e}^{T}\!\left(e^{j\omega}\right) \frac{d\omega}{\pi} \right) \mathbf{w} \tag{23}$$

$$= \mathbf{w}^T \cdot \mathbf{Q} \cdot \mathbf{w}, \tag{24}$$

where $\mathbf{Q}$ has $(m, n)$th entry

$$q_{mn} = \int_0^{\omega_s} \cos\big((m - n)\omega\big) \frac{d\omega}{\pi} = \frac{\sin\big((m - n)\omega_s\big)}{(m - n)\pi}, \quad 0 \le m, n \le N + \nu - 1. \tag{25}$$

To enforce the cyclic structure (8), (24) is transformed into a problem in $\mathbf{g}$. After defining

$$\mathbf{P} = \begin{bmatrix} \mathbf{O}_{\nu \times (N-\nu)} & \mathbf{I}_{\nu \times \nu} \\ \multicolumn{2}{c}{\mathbf{I}_{N \times N}} \end{bmatrix}, \tag{26}$$

with $\mathbf{O}_{m \times n}$ and $\mathbf{I}_{m \times n}$ the all-zero and identity matrix of size $m \times n$, (24) can be written as

$$\rho = \mathbf{g}^T \cdot \mathbf{P}^T \cdot \mathbf{Q} \cdot \mathbf{P} \cdot \mathbf{g}, \tag{27}$$

and the unit-norm constraint becomes

$$\mathbf{g}^T \cdot \mathbf{P}^T \cdot \mathbf{P} \cdot \mathbf{g} = 1. \tag{28}$$

We can now again distinguish between three cases.

(i) A general $\mathbf{d}$ (nonsparse)

The maximization of (27) satisfying (28) can be rewritten as a generalized eigenvalue problem:

$$\mathbf{P}^T \mathbf{Q} \mathbf{P} \, \mathbf{g} = \lambda \, \mathbf{P}^T \mathbf{P} \, \mathbf{g}, \tag{29}$$

and the optimal vector $\mathbf{g}_{\text{opt}}$ is equal to the eigenvector corresponding to the largest eigenvalue of $(\mathbf{P}^T \mathbf{P})^{-1} \mathbf{P}^T \mathbf{Q} \mathbf{P}$. The optimal $\mathbf{w}$ is now equal to

$$\mathbf{w}_{\text{opt}} = \mathbf{P} \cdot \mathbf{g}_{\text{opt}}. \tag{30}$$

Note that $\mathbf{w}_{\text{opt}}$ is only dependent on the (chosen) width of the mainlobe.

(ii) A maximally sparse $\mathbf{d}$

To obtain the optimal sparse decoding matrix $\mathbf{D}$, we have to determine the parameters $a$, $b$, $\phi$, and $l$ from (16) optimizing (27)-(28), with $\mathbf{f} = \mathcal{F}_N \cdot \mathbf{d}$ and $g(n) = f(n)^{-1}$, $n = 0, \ldots, N - 1$. We will use $l = 1$, and $\phi$ such that $\mathbf{w}$ is symmetrical (i.e., $\phi = -\nu\pi/N$). Due to the unit-energy constraint, only one of $a$ or $b$ can be chosen freely. This leads to a one-dimensional optimization problem in either $a$ or $b$. Because only three nonzero coefficients are present in $\mathbf{d}$, we denote this optimal (sparse) solution as $\mathbf{w}_{3,\text{opt}}$.

(iii) Intermediate structures

For the intermediate structures, multiple (5, 7, ...) nonzero elements are present in $\mathbf{d}$, leading to $\mathbf{w}_{5,\text{opt}}$, $\mathbf{w}_{7,\text{opt}}$, .... These structures offer a tradeoff between egress reduction and computational complexity. The corresponding optimal windows are found using numerical optimization.

3.3. Modification of the equalizer

In the previous sections, it has been shown that the classical DMT structure can be modified to incorporate an encoding ($\mathbf{C}$) and a decoding ($\mathbf{D}$) to reduce the spectral leakage. The influence on the transmission itself was not mentioned so far and will now be investigated.

(i) Approach-1: cascaded equalization and decoding

In case the equalization of the received (encoded) symbols is perfect and in the absence of noise (i.e., if the dashed rectangle in Figure 5 is equal to an identity matrix), it is obvious that the decoding will result in the original symbols. Because $\mathbf{D} = \mathbf{C}^{-1}$, it can be considered to be a decorrelator or zero-forcing equalizer (ZFE). Unfortunately, in practical situations such a ZFE can enhance the noise. Moreover, it is not immediately clear how the equalizer itself (e.g., a PTEQ) should be designed in this case. Clearly this approach is not optimal.

(ii) Approach-2: integrated per-tone equalization and decoding

It turns out that the PTEQ can easily be modified to overcome both of the problems mentioned. To understand this, we first take a look at the structure of the original PTEQ (for details on its derivation, see [3]). An ordinary $T$-tap PTEQ for tone $i$ operates on received sample blocks of length $N + T - 1$ and makes a linear combination of the $i$th output bin of a DFT and $T - 1$ so-called difference terms which are common for all tones.

For the case of a maximally sparse $\mathbf{d}$ (16), the subsequent decoding ($\mathbf{D}$) amounts to a linear combination of three of the PTEQ outputs. The result is now a linear combination of the difference terms and three output bins of the DFT.

The decoder and the PTEQ can now be easily combined by making one linear combination of the difference terms and three output bins of the DFT. This effectively increases the number of taps by two (for each tone), but solves both our problems.

(a) The PTEQ design criterion remains unchanged, only the number of inputs changes. Usually the PTEQ is designed to minimize the mean square error (MMSE) between the output and a known transmitted constellation point.
(b) The decoding is part of the equalizer and no longer represents a ZFE, such that noise enhancement is avoided.

Obviously, selecting a $\mathbf{d}$ with additional nonzero elements will lead to an equalizer with an increased number of inputs, but it is based on the same principle. For the remainder of the article, we assume approach-2 is used.

The difference between approach-1 and approach-2 is illustrated in Figure 6.

4. SIMULATION RESULTS

4.1. Influence on the egress

Three windows are presented: the minimal window $\mathbf{w}_{3,\text{opt}}$ described by 3 nonzero coefficients in $\mathbf{d}$ (16), a slightly more complex window $\mathbf{w}_{5,\text{opt}}$, for which $\mathbf{d}$ contains 5 nonzero coefficients, and the optimal window $\mathbf{w}_{\text{opt}}$ based on (29) and with nonsparse decoding.

The simulations have been done for a VDSL system. There are 2048 carriers ($N = 4096$), the prefix length is CP = 320 (see [18, page 22]). The sampling frequency is 17 664 kHz, the tone spacing is 4.3125 kHz.

In Figure 7, the shapes of the rectangular window, $\mathbf{w}_{3,\text{opt}}$, $\mathbf{w}_{5,\text{opt}}$, and $\mathbf{w}_{\text{opt}}$ are shown. To illustrate the egress reduction, the spectra are compared for a VDSL scenario based on the power spectral density mask Pcab.P.M1 from [9]. The most important features are that the frequencies between 3000 kHz and 5200 kHz and above 7050 kHz are reserved for upstream communications (see [18, page 17]), and that the power is lowered by 20 dB in the amateur radio bands, from 1810 kHz to 2000 kHz, and from 7000 kHz to 7100 kHz (see [9, page 35]). The results are shown in Figures 8 and 9, showing a detail around the first amateur radio band. It is interesting to note that the spectrum is less toothed (the “valleys” in between the tones are less pronounced). Moreover, there is a significant egress reduction, especially around the band edge (about 5 dB), achieved without adding any additional (redundant) cyclic extension. Obviously it would be possible to combine this method with such extensions.

Note that the sidelobe suppression of this technique in itself is not sufficient to allow the use of all tones up to the forbidden band. Other measures are necessary, such as leaving some tones unused close to the band edge. Note however that the number of unused (lost) tones will be lower than in case a rectangular window is used.

4.2. Influence on the transmission

As mentioned before, the PTEQ is usually designed according to an MMSE criterion. The exact solution to this problem requires a channel model and is very computationally demanding. Therefore, practical implementations generally use an adaptive scheme and a number of training symbols. To make a fair comparison, however, we prefer the exact MMSE solution over an approach which relies on the convergence of the adaptive scheme. To reduce the simulation complexity, we then select an ADSL scenario. It can be expected that the obtained results are readily applicable to VDSL too.

More specifically, the simulations are done for an ADSL downstream scenario over a standard loop T1.601#13, with $N = 512$, $\nu = 32$, and using tones 38 to 256. The transmit power is −40 dBm/Hz and additive white Gaussian noise of −140 dBm/Hz was assumed.
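As a side note on the window design of Section 3.2, the generalized eigenvalue problem (29)-(30) can be sketched on a toy problem size. This is a hedged illustration rather than the simulation code behind the reported results: it assumes that a plain power iteration is an acceptable substitute for a full eigensolver, uses the fact that $\mathbf{P}^T\mathbf{P}$ is diagonal for the $\mathbf{P}$ of (26), takes the diagonal entries of $\mathbf{Q}$ as the limit $\omega_s/\pi$, and uses illustrative sizes (`n_fft = 16`, `nu = 4`) and helper names.

```python
import math

def apply_p(g, nu):
    """w = P g: the last nu entries of g (the cyclic prefix) followed by g itself."""
    return g[len(g) - nu:] + g

def apply_pt(v, nu):
    """P^T v: fold the prefix samples back onto the last nu entries."""
    out = list(v[nu:])
    n = len(out)
    for i in range(nu):
        out[n - nu + i] += v[i]
    return out

def apply_q(w, ws):
    """Q w with q_mn from (25); the m == n entries use the limit ws / pi."""
    out = []
    for m in range(len(w)):
        acc = 0.0
        for n in range(len(w)):
            k = m - n
            q = ws / math.pi if k == 0 else math.sin(k * ws) / (k * math.pi)
            acc += q * w[n]
        out.append(acc)
    return out

def mainlobe_energy(w, ws):
    """rho = w^T Q w, eq. (24)."""
    return sum(w_m * qw_m for w_m, qw_m in zip(w, apply_q(w, ws)))

def design_window(n_fft, nu, iters=300):
    """Power iteration towards the dominant generalized eigenvector of (29)."""
    ws = 2 * math.pi / (n_fft + nu)
    s = [1.0] * (n_fft - nu) + [2.0] * nu          # diagonal of P^T P
    g = [1.0] * n_fft                              # start from the rectangular window
    for _ in range(iters):
        a_g = apply_pt(apply_q(apply_p(g, nu), ws), nu)           # P^T Q P g
        g = [x / d for x, d in zip(a_g, s)]                       # (P^T P)^{-1} P^T Q P g
        norm = math.sqrt(sum(d * x * x for x, d in zip(g, s)))    # enforce (28)
        g = [x / norm for x in g]
    return apply_p(g, nu), ws                      # w_opt = P g_opt, eq. (30)

n_fft, nu = 16, 4                                  # toy sizes, not the VDSL dimensions
w_opt, ws = design_window(n_fft, nu)
w_rect = [1.0 / math.sqrt(n_fft + nu)] * (n_fft + nu)
rho_opt = mainlobe_energy(w_opt, ws)
rho_rect = mainlobe_energy(w_rect, ws)
print(rho_rect, rho_opt)
```

Because the iteration starts from the rectangular window and the underlying matrix is positive semidefinite, the mainlobe energy of the resulting window should not fall below that of the rectangular window, while the cyclic constraint (8) holds by construction.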

Figure 6: In approach-1 (left) the linear combiners (LC) of the PTEQ and the decoder are separated. In approach-2 (right) they are combined. (Block diagrams: (a) the received samples feed the DFT and the difference terms into the PTEQs for tones $i - 1$, $i$, and $i + 1$, whose equalized encoded tones feed the decoder for tone $i$; (b) the received samples feed the DFT and the difference terms into a combined PTEQ and decoder for tone $i$, which delivers the equalized decoded tones.)
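The three-bin structure of the decoder for a maximally sparse $\mathbf{d}$ can be illustrated for the idealised case of a distortion-free channel with perfect equalization. The sketch below is not the PTEQ of [3]; it only shows that the decoding step combines three DFT outputs per tone. The values of `N`, `l`, `a`, `b`, and `phi` are illustrative assumptions (with $a > 2|b|$ so that the window stays finite and positive), and the IDFT is assumed to carry the $1/N$ factor.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive DFT; with inverse=True the IDFT, including the 1/N factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

N, l = 8, 1
a, b, phi = 1.0, 0.2, 0.3     # illustrative parameters of the sparse d in (16)

# Sparse first row of D, eq. (16).
d = [0j] * N
d[0] = a
d[l] = b * cmath.exp(1j * phi)
d[N - l] = b * cmath.exp(-1j * phi)

# f = F_N d is a real raised cosine on a pedestal; the window is its inverse, eq. (15).
f = [a + 2 * b * math.cos(2 * math.pi * l * n / N - phi) for n in range(N)]
f_err = max(abs(u - v) for u, v in zip(dft(d), f))
g = [1.0 / f_n for f_n in f]

# Transmitter: window the IDFT output. Receiver: DFT of the undistorted samples.
X = [complex(k - 3, 2 - k) for k in range(N)]
x_w = [g_n * s for g_n, s in zip(g, dft(X, inverse=True))]
Y = dft(x_w)

# Decoder: each tone is a combination of three DFT bins (row i of the circulant D).
Z = [a * Y[i]
     + b * cmath.exp(1j * phi) * Y[(i + l) % N]
     + b * cmath.exp(-1j * phi) * Y[(i - l) % N]
     for i in range(N)]

decode_err = max(abs(z - x) for z, x in zip(Z, X))
print(f_err, decode_err)
```

In approach-2 these three bins would be weighted by equalizer coefficients obtained from the MMSE design rather than by the fixed values $a$ and $b\,e^{\pm j\phi}$ used here.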

Figure 7: The shape of the rectangular window as well as $W_{3,\text{opt}}$, $W_{5,\text{opt}}$, and $W_{\text{opt}}$. (Curves: $w_{r,s}$, $w_{3,\text{opt}}$, $w_{5,\text{opt}}$, $w_{\text{opt}}$; horizontal axis from 0 to 4416 with the CP marked up to 320.)

Figure 8: Spectrum of the rectangular window, $W_{3,\text{opt}}$, $W_{5,\text{opt}}$, and $W_{\text{opt}}$. (Curves: $w_{r,s}$, $w_{3,\text{opt}}$, $w_{5,\text{opt}}$, $w_{\text{opt}}$; horizontal axis from 0 to 9000, vertical axis from −140 to −80.)

A system using a rectangular window at the transmitter and an ordinary PTEQ at the receiver is compared to a system using $W_{3,\text{opt}}$ (for ADSL dimensions) and approach-2 at the receiver. Note that this modified equalizer has the same number of taps as the ordinary PTEQ, implying that it uses 2 difference terms less, because these taps are assigned to the two additional DFT outputs.

The results are shown in Figure 10. For $T = 3$, the performance of the proposed technique is significantly lower than that of the rectangular window combined with an ordinary PTEQ. This comes as no surprise because no taps are available for the difference terms, and the equalization is therefore poor. As the number of taps is increased, both techniques are very comparable.

5. CONCLUSION AND FURTHER WORK

A novel transmitter windowing technique for DMT has been proposed, which does not rely on an additional cyclic extension of the symbol. This inevitably introduces a distortion of the signal. For a special class of windows, this distortion can be described as a precoding operation for which the decoding at the receiver can be done easily. In the simplest case, the window function can be described as the pointwise inversion

of a raised cosine window. More complex windows can also be described, but the advantage of the easy decoding then gradually vanishes. Furthermore, formulas are provided to calculate the optimal window, and this is illustrated for the VDSL case.

The decoding at the receiver can be combined with a per-tone equalizer in a very elegant way by taking into account additional DFT outputs. The effect on the transmission was illustrated for an ADSL scenario.

Future work will focus on a selective windowing of the tones in the vicinity of an unauthorized band, and the combination of the proposed technique with windowing in a cyclic extension of the symbol. Also the tradeoff between decoder complexity and egress should be further studied, as well as the interaction between the transmitter window and a channel equalizer using windowing at the receiver.

Figure 9: Spectrum of the rectangular window, $W_{3,\text{opt}}$, $W_{5,\text{opt}}$, and $W_{\text{opt}}$ (detail of amateur radio band). (Curves: $w_{r,s}$, $w_{3,\text{opt}}$, $w_{5,\text{opt}}$, $w_{\text{opt}}$; horizontal axis from 1800 to 1920, vertical axis from −110 to −90.)

Figure 10: Comparison between the rectangular window using an ordinary PTEQ and the $W_{3,\text{opt}}$ window using approach-2. (SNR (dB) versus tone index, for the rectangular window and $w_{3,\text{opt}}$ with $T = 3$, $T = 7$, and $T = 11$.)

ACKNOWLEDGMENTS

This research work was carried out at the ESAT Laboratory of the Katholieke Universiteit Leuven, in the frame of Belgian Programme on Interuniversity Attraction Poles, initiated by the Belgian Federal Science Policy Office IUAP P5/22 (“Dynamical systems and control: computation, identification and modelling”) and P5/11 (“Mobile multimedia communication systems and networks”), the Concerted Research Action GOA-AMBioRICS Research Project FWO no. G.0196.02 (“Design of efficient communication techniques for wireless time-dispersive multi-user MIMO systems”), CELTIC/IWT Project 040049: “BANITS” (Broadband Access Networks Integrated Telecommunications) and was partially sponsored by Alcatel Bell. The authors wish to thank the reviewers for their valuable comments and suggestions.

REFERENCES

[1] S. B. Weinstein and P. M. Ebert, “Data transmission by frequency-division multiplexing using the discrete Fourier transform,” IEEE Transactions on Communications, vol. 19, no. 5, pp. 628–634, 1971.
[2] A. Peled and A. Ruiz, “Frequency domain data transmission using reduced computational complexity algorithms,” in Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’80), vol. 5, pp. 964–967, Denver, Colo, USA, April 1980.
[3] K. Van Acker, G. Leus, M. Moonen, O. van de Wiel, and T. Pollet, “Per tone equalization for DMT-based systems,” IEEE Transactions on Communications, vol. 49, no. 1, pp. 109–119, 2001.
[4] K. B. Song, S. T. Chung, G. Ginis, and J. M. Cioffi, “Dynamic spectrum management for next-generation DSL systems,” IEEE Communications Magazine, vol. 40, no. 10, pp. 101–109, 2002.
[5] R. Stolle, “Electromagnetic coupling of twisted pair cables,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 883–892, 2002.
[6] A. J. Redfern, “Receiver window design for multicarrier communication systems,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1029–1036, 2002.
[7] S. Kapoor and S. Nedic, “Interference suppression in DMT receivers using windowing,” in Proceedings IEEE International Conference on Communications (ICC ’00), vol. 2, pp. 778–782, New Orleans, La, USA, June 2000.
[8] G. Cuypers, G. Ysebaert, M. Moonen, and P. Vandaele, “Combining per tone equalization and windowing in DMT receivers,” in Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’02), vol. 3, pp. 2341–2344, Orlando, Fla, USA, May 2002.
[9] ETSI, “Transmission and Multiplexing (TM); Access transmission systems on metallic access cables; Very high speed Digital Subscriber Line (VDSL); part 1: Functional requirements,” TS 101 270-1 V1.2.1 (1999-10), October 1999.
[10] K. W. Martin, “Small side-lobe filter design for multitone data-communication applications,” IEEE Transactions on Circuits and Systems—Part II: Analog and Digital Signal Processing, vol. 45, no. 8, pp. 1155–1161, 1998.
[11] G. Cherubini, E. Eleftheriou, and S. Ölçer, “Filtered multitone modulation for VDSL,” in Proceedings IEEE Global Telecommunications Conference (GLOBECOM ’99), vol. 2, pp. 1139–1144, Rio de Janeiro, Brazil, December 1999.

[12] F. J. Harris, “On the use of windows for harmonic analysis with the discrete Fourier transform,” Proceedings of the IEEE, vol. 66, no. 1, pp. 51–83, 1978.
[13] F. Sjöberg, R. Nilsson, M. Isaksson, P. Ödling, and P. O. Börjesson, “Asynchronous Zipper [subscriber line duplex method],” in Proceedings IEEE International Conference on Communications (ICC ’99), vol. 1, pp. 231–235, Vancouver, BC, Canada, June 1999.
[14] Y.-P. Lin and S.-M. Phoong, “Window designs for DFT-based multicarrier systems,” IEEE Transactions on Signal Processing, vol. 53, no. 3, pp. 1015–1024, 2005.
[15] H. Minn, C. Tellambura, and V. K. Bhargava, “On the peak factors of sampled and continuous signals,” IEEE Communications Letters, vol. 5, no. 4, pp. 129–131, 2001.
[16] A. D. Poularikas, Handbook of Formulas and Tables for Signal Processing, CRC Press/IEEE, Boca Raton, Fla, USA, 1998.
[17] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, Englewood Cliffs, NJ, USA, 1st edition, 1993.
[18] ETSI, “Vdsl: Transceiver specification,” TS 101 270-2 V1.1.1 (2001-02), 2001.

Gert Cuypers was born in Leuven, Belgium, in 1975. In 1998 he received the Master's degree in electrical engineering from the Katholieke Universiteit Leuven (KULeuven), Leuven, Belgium. Currently he is pursuing the Ph.D. degree at the Department of Electrical Engineering (ESAT), Leuven, Belgium, under the supervision of Marc Moonen. From 1999 to 2003, he was supported by the Flemish Institute for Scientific and Technological Research in Industry (IWT). At the moment he teaches at the Leuven Engineering School (Groep T), Leuven, Belgium. His interests are in digital communications and RF technology. His amateur radio call sign is ON4DSP.

Koen Vanbleu was born in Bonheiden, Belgium, in 1976. He received the Master's and Ph.D. degrees in electrical engineering from the Katholieke Universiteit Leuven (KULeuven), Leuven, Belgium, in 1999 and 2004, respectively. From 1999 to 2003, he was supported by the Fonds voor Wetenschappelijk Onderzoek (FWO) Vlaanderen. At the moment he works for Broadcom (Belgium).

Geert Ysebaert was born in Leuven, Belgium, in 1976. He received the Master's and the Ph.D. degrees in electrical engineering from the Katholieke Universiteit Leuven (KULeuven), Leuven, Belgium, in 1999 and 2004, respectively. From 1999 to 2003, he was supported by the Flemish Institute for Scientific and Technological Research in Industry (IWT). In September 2004, he joined the DSL Experts Team at Alcatel Bell, where he is involved in dynamic spectrum management (DSM), single ended line testing (SELT), and quality of service (QoS) for DSL. He is married to Ilse and has a baby named Roan.

Marc Moonen received the Electrical Engineering degree and the Ph.D. degree in applied sciences from Katholieke Universiteit Leuven, Leuven, Belgium, in 1986 and 1990, respectively. Since 2004 he is a Full Professor at the Electrical Engineering Department of Katholieke Universiteit Leuven, where he is currently heading a research team of 16 Ph.D. candidates and postdocs, working in the area of signal processing for digital communications, wireless communications, DSL, and audio signal processing. He received the 1994 K.U. Leuven Research Council Award, the 1997 Alcatel Bell (Belgium) Award (with Piet Vandaele), the 2004 Alcatel Bell (Belgium) Award (with Raphael Cendrillon), and was a 1997 “Laureate of the Belgium Royal Academy of Science”. He was the Chairman of the IEEE Benelux Signal Processing Chapter (1998–2002), and is currently a EURASIP AdCom Member (European Association for Signal, Speech and Image Processing, from 2000 till now). He has been a Member of the Editorial Board of “IEEE Transactions on Circuits and Systems II” (2002–2003). He is currently the Editor-in-Chief for the “EURASIP Journal on Applied Signal Processing” (from 2003 till now), and a Member of the Editorial Board of “Integration, the VLSI Journal”, “EURASIP Journal on Wireless Communications and Networking”, and “IEEE Signal Processing Magazine”.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 38237, Pages 1–14
DOI 10.1155/ASP/2006/38237

Designing Tone Reservation PAR Reduction

Niklas Andgart,1 Per Ödling,1 Albin Johansson,2 and Per Ola Börjesson1

1 Signal Processing Group, Department of Information Technology, Lund University, Box 118, SE-221 00 Lund, Sweden
2 Ericsson AB, 126 25 Stockholm, Sweden

Received 20 December 2004; Revised 27 May 2005; Accepted 8 July 2005

Tone reservation peak-to-average ratio (PAR) reduction is an established area when it comes to bringing down signal peaks in multicarrier (DMT or OFDM) systems. When designing such a system, some questions often arise about PAR reduction. Is it worth the effort? How much can it give? How much does it give depending on the parameter choices? With this paper, we attempt to answer these questions without resorting to extensive simulations for every system and every parameter choice. From a specification of the allowed spectrum, for instance prescribed by a standard, including a PSD-mask and a number of tones, we analytically predict achievable PAR levels, and thus implicitly suggest parameter choices. We use the ADSL2 and ADSL2+ systems as design examples.

Copyright © 2006 Niklas Andgart et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

With discrete multitone modulation (DMT) as the dominating modulation scheme in digital subscriber line (DSL) systems, there is a problem with high signal amplitudes. This is caused by several independent sequences adding up to a signal that approximately will adhere to a Gaussian distribution and is commonly referred to as a high peak-to-average ratio (PAR). Several methods have been presented to alleviate this problem [1–7].

We focus on the tone reservation method, which has been presented in [1, 2], with further improvements in [8–12]. Following the constraints set up by the standards, the achievable performance is limited, and can be determined by mathematical analysis in combination with some sound engineering assumptions. Construction of a system where the designer is unaware of the limitations will likely lead to a severe violation of the power spectral density (PSD) mask, or to a worse performance than what could be expected. This is illustrated in Figure 1, where the target PAR level is of significant importance. Aiming at a too low PAR level will lead to a violation of the PSD, or to a much worse result if the PSD is somehow enforced. In this paper we explain this relationship and develop means to predict what can be done when applying tone reservation PAR reduction to a practical DMT system. The aim is to produce results that are valid without having to run extensive simulations for each individual case. Hence, we will look at a number of bounds and engineering approximations that will tell us what can be done in the complete system.

When a system designer is contemplating whether it is worthwhile to include PAR reduction in a system or not, it normally requires a lot of work to develop a simulation chain in order to evaluate the potential gain. With the results presented here, the “worthwhile or not” question can easily be answered in an afternoon by a skilled engineer. A simulation chain would then only be developed when needed, that is, for the precise determination of PAR reduction parameter values. A practicing engineer, who only at a later stage would like to enjoy the theory, could for now skip reading Sections 2 and 3, and move directly to Section 4.

Earlier work has discussed the existence and effect of a PSD bound for tone reservation [13] and algorithms suitable for implementing it [12]. This paper aims at explaining in which situations this PSD constraint is an issue. It also intends to show what levels of PSD constraints and expected PAR reduction performance can be used in a system design.

The paper starts out with discussing practical standardised systems and what would be proper engineering assumptions. In Section 2, the system and its requirements are defined and a set of theoretical results necessary for the analysis is given. Thereafter, Section 3 analyses what impact this has on reduction performance. The results from this section are summarised in Section 4, where practical instructions for using them are given in a “how-to” style. Section 5 applies the results to an ADSL2 system and also extends the analysis to include ADSL2+ systems.

2. A SYSTEM IN PRACTICE

The Gaussian distribution of the transmit signal [14] implies a possibility of very high peak amplitudes. This may cause the signal to be clipped or, if the amplitude span of the line driver is increased, lead to high power dissipation. This is especially the case with the commonly used class AB line drivers, where the power dissipation in the line driver is directly dependent on the supply voltage. Notably, the line driver, or power amplifier, is responsible for the major part (see footnote 1) of the total power consumption in many communication systems, for example, DSL systems [15].

Footnote 1: Anything between 50%–80% could be considered normal, where the numbers have been increasing over time as the digital system parts have become more and more efficient.

Figure 1: Relationship between target PAR and the resulting PAR level for reduction with 6 tones, with and without constraints on average reduction PSD. The narrow optimum and its lack of tolerance for design errors illustrate the nonintuitive behaviour of PAR reduction under PSD constraints. It also emphasizes the need of design methods such as those given in this paper. (Resulting PAR (dB) versus target PAR (dB); curves: with average PSD constraint, without PSD constraint.)

2.1. PAR reduction in standardized systems

As a first example system, we will look at an ADSL2 system using an FFT size of 512 in the downstream direction. Since the lowest part of the frequency band, covering the first 32 tones, is used for analogue telephony and upstream transmission, only tones 33–255 are available for downstream data transmission. These parameters are the same as for the well-established ADSL1 system [16], with the ADSL2 standard [17] more closely defining the requirements on subcarriers that could be used for PAR reduction. We quote the most relevant text in Appendix A. The ADSL2 standard is also a base for the ADSL2+ standard [18], which follows similar specifications. The main difference is the double downstream bandwidth (tones 33–511), and that ADSL2+ systems thereby operate with an FFT size of 1024 in the downstream direction.

In the ADSL2 standard, the PSD mask on the reduction tones is set to −10 dB relative to the PSD for the data tones [17], see also Appendix A. The question now is how the PSD should be properly measured. Since the PSD commonly is averaged over time, the instantaneous PSD may be allowed to vary between symbols, with certain symbols exceeding the average PSD constraint. We will consider two extreme cases of averaging time, as well as a reasonable intermediate point.

One extreme would be to average over a long time. Thereby, we could sometimes allow high instantaneous PSD levels, if we most of the time use little or no power. This interpretation of the definition restricts only the average level, without imposing any restrictions on individual symbols. This can generate occasional large reduction signals, with the average still being below the limit. There are reasons why this may be undesirable, for instance, large amplitudes on the reduction tones could generate intermodulation interference to neighbouring data-carrying subcarriers by exciting nonlinearities in the line driver. Other users or other systems in the same cable bundle can also be affected through crosstalk. Although this interpretation of the PSD limitation is standard compliant, it is neither neighbour friendly, nor necessary according to the spirit of the standard.

The other extreme is when the PSD of each particular symbol has to conform with the −10 dB limitation. This results in a very strict peak PSD constraint and a quite low PAR reduction performance.

A compromise between these two cases, with a limit on average PSD as well as a looser limit on maximum instantaneous PSD, would give a standard-compliant system that is not overly inefficient but still friendly to neighbouring users. The value of the peak PSD limit can be obtained from the acceptable amount of disturbance that can be put on other users.

2.2. Tone reservation

The goal with all PAR reduction methods is to generate a transmit signal that has a low amplitude swing. Different approaches to PAR reduction exist, but only a few are viable alternatives to include in standardised DSL systems. The schemes possible to use are those that are transparent to the receiver side, meaning that the receiver does not have to know about the existence of PAR reduction, nor which method is being used.

One of these viable methods is the tone reservation method [1, 2], which adds a reduction signal $c[n]$ to the data signal $x[n]$, see Figure 2. The goal is to make the resulting signal $\bar{x}[n] = x[n] + c[n]$ have a lower amplitude span than before. The PAR is defined as

$$\mathrm{PAR}\{\bar{x}\} = \frac{\max_n \left| x[n] + c[n] \right|^2}{\sigma^2}, \tag{1}$$

where the peak power is compared to the average power before the PAR reduction is applied, $\sigma^2 = E[|x[n]|^2]$.

The reduction signal $c[n]$ is constructed of a set of reserved subcarriers, which are not used for data transmission. These may be tones that cannot transmit data reliably, or tones that are explicitly reserved for PAR reduction. Naturally, the reduction performance will increase as the number of reserved tones is increased. At the same time, excluding too many tones from the set of data-carrying tones will

reduce the data capacity of the system unnecessarily [19]. Thus, it is of interest to have, already at an early stage of a system design, some knowledge about the reduction capabilities for a certain number of tones in order to balance this tradeoff.

[Figure 2 block diagram: the data signal x[n] and the output c[n] of the PAR reducer are added to form x̄[n].]

Figure 2: A PAR reduction signal, c[n], is added to the data signal x[n] to counteract the peaks in x[n]. The signal c[n] is a function of the data signal x[n], and is constructed from a small subset of tones.

In matrix form, we define the signal model as

$$ \bar{x}_L = x_L + c_L = x_L + \check{Q}_L \check{C}. \tag{2} $$

We let the length-$NL$ vector $x_L$ denote the data signal of one symbol block and $c_L$ the reduction signal of the same length. The FFT size is denoted by $N$ and the number $L$ represents the oversampling factor, which we introduce in order to have better control of the continuous-time PAR. The construction of $c_L$ from the reserved tones is written as $c_L = \check{Q}_L \check{C}$, where $\check{Q}_L$ is an $NL \times 2U$ matrix of sine and cosine basis vectors with frequencies specified by the $U$ reserved tones $t_1, \ldots, t_U$ [12].

2.3. Optimization criteria

Having defined the reduction model in (2) above, we now formulate what the reduction algorithm should aim at. The most common approach in PAR reduction is basically to reduce the signal as much as possible. In the real-valued baseband environment of a DSL system, this can be formulated as a linear programme [2]:

$$ \begin{aligned} &\underset{\check{C}}{\text{minimize}} && \gamma \\ &\text{subject to} && \left| x_L + \check{Q}_L \check{C} \right| \le \gamma\sigma. \end{aligned} \tag{3} $$

The inequality compares each vector element to the right-hand side scalar $\gamma\sigma$, which is the level of the highest signal peaks.

When assigning a certain target PAR level to the system, the algorithm can be told not to put any effort into reducing the peak level further down than this level. Since what is important in practice is to avoid signal clipping through overloading the line driver or clipping in the D/A converter, there is no reason to reduce the PAR of already acceptable symbols. Additionally, reducing the peak level further may cause the power on the reduction tones to increase to undesired levels. We can then define the optimisation criterion as minimising the added reduction power given a certain target crest factor $\gamma_{\mathrm{target}}$, or target PAR level $\gamma_{\mathrm{target}}^2$ [15]:

$$ \begin{aligned} &\underset{\check{C}}{\text{minimize}} && \check{C}^T \check{C} \\ &\text{subject to} && \left| x_L + \check{Q}_L \check{C} \right| \le \gamma_{\mathrm{target}}\sigma. \end{aligned} \tag{4} $$

This quadratic programme does not always have feasible solutions since it cannot be guaranteed that the target PAR level $\gamma_{\mathrm{target}}^2$ is achievable. Thus, we choose to minimise the PAR level if the target PAR is not reached.

To take the maximum allowed PSD level into consideration, a set of quadratic constraints can be added to (3) and (4):

$$ \check{C}_{l,\sin}^2 + \check{C}_{l,\cos}^2 \le A_{l,\max}^2, \tag{5} $$

where $A_{l,\max}$ denotes the maximum magnitude for reduction tone $t_l$, with $l \in \{1 \cdots U\}$, and $\check{C}_{l,\sin}$ and $\check{C}_{l,\cos}$ denote the sine and cosine weights on a certain reduction tone. With this set of quadratic constraints, (3) and (4) are no longer linear or quadratic programmes. However, they are still quadratically constrained problems, which are convex and thereby reasonably easy to solve.

What will be studied now is the combination of the problem formulations in (3) and (4).

Step 1. First, try to solve (4) with the additional constraints in (5) to obtain the target PAR with as little added power as possible.

Step 2. If this fails, solve (3) and (5) to minimize the PAR level.

A suitable algorithm for practical implementation is the active-set algorithm [9, 12, 20]. Solving for the minimum PAR, this algorithm will converge to a solution close to the optimal solution already after a few iterations, and the PSD constraints can easily be incorporated [12].

2.4. Expression for PAR for an unreduced signal

In order to analyse what is achievable with the PAR reduction algorithm, the distribution of the PAR for an unreduced signal has to be derived. We here focus on the symbol clip probability, defined as the probability that the maximum sample value in a full DMT frame is above a certain level. This reflects the probability that a symbol is distorted during transmission. Our choice of definition is commonly used in the literature, although it would also be possible to view the clip probability on a per-sample basis.

Assuming that the signal after the IFFT is Gaussian IID, $x[n] \in N(0, \sigma^2)$, the sample clip probability at level $\gamma\sigma$ is

$$ P_1(\gamma) = \mathrm{Prob}\left( |x[n]| > \gamma\sigma \right) = 2Q(\gamma), \tag{6} $$

where the $Q(\cdot)$ function denotes the tail probability for a Gaussian random variable:

$$ Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt. \tag{7} $$
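As a quick numerical check of (6) and (7), the following minimal Python sketch evaluates the Gaussian tail probability through the standard-library complementary error function, using the identity Q(x) = erfc(x/√2)/2. The function names are illustrative only and are not taken from the paper.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) of (7), written with erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def sample_clip_probability(gamma: float) -> float:
    """Two-sided sample clip probability P_1(gamma) = 2 Q(gamma) of (6)."""
    return 2.0 * q_function(gamma)


if __name__ == "__main__":
    for par_db in (10.0, 12.0, 14.5):
        gamma = 10.0 ** (par_db / 20.0)  # crest factor for a PAR level in dB
        print(f"{par_db:5.1f} dB -> P1 = {sample_clip_probability(gamma):.3e}")
```

At a crest factor of about 5.33 the sample clip probability comes out close to 10^-7, the per-sample limit referred to later in this section.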

[Plots for Figures 3 and 4. Figure 3: clip probability, P(PAR > clip level), on a logarithmic vertical axis versus clip level (dB), 7 to 16 dB. Figure 4: probability on a logarithmic vertical axis versus clip level (dB), 7 to 16 dB.]

Figure 3 legend: theory, continuous-time; theory, without oversampling; simulations, 1–16 times oversampling from left to right; theory, sample clip probability.

Figure 3: Symbol clip probability for an unreduced signal of length N = 512, evaluated at different levels of oversampling by sinc interpolation. The cross-marks show the theoretical values for a white Gaussian signal without oversampling and the circles show the calculated values for a lowpass continuous-time white signal using Rice's formula. The solid lines show, from left to right, the simulated clip probability for a signal without oversampling, and with 2, 4, 8, and 16 times oversampling, respectively. After four to eight times oversampling, the calculated results for the continuous-time signal describe the data signal closely. The dashed line shows the sample clip probability, which is the same for continuous-time signals as for signals without oversampling. The dotted lines show the translation between sample and symbol clip probabilities. Starting at the sample clip probability $10^{-7}$, we see that this corresponds to a clip level of $\gamma_{\mathrm{unred}} = 14.5$ dB, which for the continuous-time signal corresponds to the symbol clip probability $p_{\mathrm{symclip}} = 2 \cdot 10^{-4}$.

Figure 4 legend: one clip per symbol; at least one clip per symbol; at least two clips per symbol.

Figure 4: Probabilities of different numbers of clips, for a symbol length of N = 512. In the shaded area to the left, we will almost always have one or many clips. In the rightmost shaded area, there will most often be one or zero clips.

The symbol clip probability for the symbol of $N$ IID samples is straightforward to calculate [2]:

$$ P_N(\gamma) = 1 - \mathrm{Prob}\left( \text{all } |x[n]| < \gamma\sigma \right) = 1 - \left( 1 - P_1(\gamma) \right)^N = 1 - \left( 1 - 2Q(\gamma) \right)^N. \tag{8} $$

For a signal oversampled using sinc interpolation, the sample clip probability is identical to the critically sampled case. The symbol clip probability is harder to derive, because the signal is no longer IID. However, assuming that the signal is continuous-time band-limited Gaussian noise makes it possible to derive the clip probability using Rice's formula. The derivation is given in Appendix B, and the resulting expression is

$$ P_S(\gamma) = \mathrm{Prob}(\text{clip at } \gamma\sigma) = 1 - \exp\left( -\frac{N}{\sqrt{3}} e^{-\gamma^2/2} \right). \tag{9} $$

Figure 3 shows the sample and symbol clip probabilities for different levels of oversampling, from $L = 1$ to $L = 16$, for a system with a symbol length of $N = 512$. As seen from the figure, with increasing oversampling, the expression (9) gives an excellent match.

In Figure 4 we plot some qualitative results of when PAR reduction is needed. The solid line is the same as the theoretical values marked with circles in Figure 3, showing the probability of at least one clip in a symbol from (9). Using (B.12) in Appendix B, we plot two more lines showing the probabilities of at least two clips, and exactly one clip, respectively. We see that in the leftmost shaded region, the reduction algorithm will have to be active for almost every symbol. In the rightmost region, there will very seldom be more than one peak exceeding a given level. We will use these results to judge when the bounds developed in Section 3 are reasonably tight.

Moving on, while we focus on the symbol clip probability, the ADSL standard [16] is based on unreduced signals and sets the limit on clip probability on a per-sample basis. The signal is restricted to be clipped no more than the fraction $10^{-7}$ of the time. Translating this to a symbol clip probability is not straightforward, since there can be one or many clips in a clipped symbol. However, considering an unreduced signal, we can use the expression for the sample clip probability in (6) to get the acceptable clip level:

$$ \gamma_{\mathrm{unred}} = Q^{-1}\left( \frac{10^{-7}}{2} \right) = 5.33 \quad (14.5\ \text{dB}). \tag{10} $$

Then, we can use (9) with this clip level to get a value of the symbol clip probability:

$$ p_{\mathrm{symclip}} = 1 - \exp\left( -\frac{N}{\sqrt{3}} e^{-5.33^2/2} \right), \tag{11} $$

approximately $2 \cdot 10^{-4}$ for $N = 512$. This translation is also shown with the dotted lines in Figure 3. This $\gamma_{\mathrm{unred}}$ value will be used as a reference in Figures 6, 7, 8, 11, and 12, and $p_{\mathrm{symclip}}$ is shown as the baseline reference in Figure 9.

[Figure 5 plot. Vertical axis: power, from 0 to A²/2; horizontal axis: crest factor (PAR level in linear scale), 1 to 5.]

Figure 5: Illustration of (12), showing minimum instantaneous power on a reduction tone as a function of peak magnitude, with a target PAR level of 10 dB, $\gamma_{\mathrm{target}} = 3.16$. The axes are shown in linear scale. Aiming at the target PAR level of 10 dB, the reduction algorithm will not be active below this value. Above $\gamma_{\mathrm{target}} + UA/\sigma = 4.16$ (12.4 dB), the algorithm will output the maximum power $A^2/2$.

3. PERFORMANCE PREDICTION WITH BOUNDS

Our aim is to see, or rather predict, what PAR level we can achieve with the tone reservation approach and PSD restrictions. We present material that allows this to be done analytically, without having to resort to extensive system simulations, which are often quite complicated when it comes to PAR reduction. Based on the optimisation criteria and the distribution of the unreduced signal from the previous section, bounds for the achievable PAR level given a certain system using a certain number of tones will be derived. For easy practical use of these bounds, the outcome of this section will be summarised [...]

3.1. Limitations imposed by a maximum average PSD

We will now analyse what can be done under the −10 dB per tone PSD limitation on the reduction signal compared to the data signal. Most of the symbols do not have very high signal peaks, and thus do not need a large reduction signal to pass under the clip level. We will assign a target PAR level to the algorithm, and not put any effort into reducing the signal further down. We will also define a maximum magnitude per reduction tone. It can be expected that with a high target PAR level, only a few symbols will need reduction and the maximum tone magnitude can be set high while still keeping the average PSD below the limit. On the other hand, lowering the target PAR will demand reduction of more symbols, and the maximum reduction magnitude has to be kept lower in order not to violate the average PSD constraint. This lower reduction magnitude will, in turn, make it difficult to achieve the target PAR. We will predict the best balance of the target PAR level $\gamma_{\mathrm{target}}^2$ (how often reduction is used) and allowed reduction magnitude $A$ (to stay below the PSD limit).

3.1.1. Allowed magnitude

We start by predicting the average power needed to reduce the signal down to the target PAR level $\gamma_{\mathrm{target}}^2$. Then we can assign a maximum tone magnitude to avoid a too high average power. For an individual symbol, we use the following lower bound on reduction tone magnitude. Assume that the signal has a maximum peak with magnitude $x_{\max}$. To reduce the signal down to a PAR of $\gamma_{\mathrm{target}}^2$ requires at least a total reduction tone magnitude of $x_{\max} - \gamma_{\mathrm{target}}\sigma$.

Proof. Only looking at the highest peak of the signal, we see that letting all reduction tones be in phase at this sample, with a total magnitude of $x_{\max} - \gamma_{\mathrm{target}}\sigma$ in the counter-phase direction, will reduce the signal down to $\gamma_{\mathrm{target}}\sigma$, with PAR $\gamma_{\mathrm{target}}^2$. All other scenarios, such as taking into consideration other samples in the original signal, and the possibility of generating new peaks, would need a larger reduction signal.

For an individual symbol with peak magnitude $x_{\max}$, we need at least the total magnitude $x_{\max} - \gamma_{\mathrm{target}}\sigma$, spread over the $U$ tones. With the same PSD constraints on all tones, we would like to have the maximum tone magnitude as low as possible, which means having the same magnitude $(x_{\max} - \gamma_{\mathrm{target}}\sigma)/U$ on all reduction tones. This could also be seen as a lower bound, or perhaps rather a best case, for a PSD-friendly average reduction power spread over the tones. Depending on the peak magnitude $x_{\max}$, the instantaneous reduction power on a certain tone will at least be

$$ g(x_{\max}) = \begin{cases} 0, & x_{\max} \le \gamma_{\mathrm{target}}\sigma, \\ \dfrac{1}{2} \left( \dfrac{x_{\max} - \gamma_{\mathrm{target}}\sigma}{U} \right)^2, & \gamma_{\mathrm{target}}\sigma < x_{\max} \le \gamma_{\mathrm{target}}\sigma + UA, \\ \dfrac{A^2}{2}, & x_{\max} > \gamma_{\mathrm{target}}\sigma + UA, \end{cases} \tag{12} $$

where $A$ denotes the maximum allowed reduction magnitude per tone, see also Figure 5. Following Step 2 of the optimisation criterion given in Section 2.3, for peak levels above $\gamma_{\mathrm{target}}\sigma + UA$, the algorithm can only output this large reduction signal, and the target PAR will thereby not be achieved.

To evaluate the minimum average PSD for a tone, we calculate the expected value of (12). For this, we need the density function $f(\gamma)$ for the normalised peak magnitude $x_{\max}/\sigma$, based on the clip probability from (9):

$$ f(\gamma) = \frac{\partial F(\gamma)}{\partial \gamma} = \frac{\partial \left( 1 - P_S(\gamma) \right)}{\partial \gamma} = \frac{\partial}{\partial \gamma} \exp\left( -\frac{N}{\sqrt{3}} e^{-\gamma^2/2} \right) = \frac{N\gamma}{\sqrt{3}} \exp\left( -\frac{N}{\sqrt{3}} e^{-\gamma^2/2} \right) e^{-\gamma^2/2}. \tag{13} $$
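The quantities in (9) to (13) are straightforward to evaluate numerically. The sketch below is a minimal illustration using only the Python standard library: it assumes N = 512 and σ = 1, and the function names are illustrative rather than the authors'. The inverse Q function in (10) is obtained from `statistics.NormalDist.inv_cdf`.

```python
import math
from statistics import NormalDist

SQRT3 = math.sqrt(3.0)


def symbol_clip_probability(gamma, n_fft=512):
    """P_S(gamma) of (9) for a continuous-time band-limited Gaussian signal."""
    return 1.0 - math.exp(-(n_fft / SQRT3) * math.exp(-gamma * gamma / 2.0))


def unreduced_clip_level(sample_clip_prob=1e-7):
    """gamma_unred of (10): Q^{-1}(p/2), using Q^{-1}(q) = inv_cdf(1 - q)."""
    return NormalDist().inv_cdf(1.0 - sample_clip_prob / 2.0)


def peak_density(gamma, n_fft=512):
    """Density f(gamma) of the normalised symbol peak, (13)."""
    h = (n_fft / SQRT3) * math.exp(-gamma * gamma / 2.0)
    return gamma * h * math.exp(-h)


def min_tone_power(x_max, gamma_target, n_tones, a_max, sigma=1.0):
    """Minimum instantaneous power g(x_max) on one reduction tone, (12)."""
    excess = x_max - gamma_target * sigma
    if excess <= 0.0:
        return 0.0                            # no reduction needed
    if excess <= n_tones * a_max:
        return 0.5 * (excess / n_tones) ** 2  # peak shared equally by U tones
    return 0.5 * a_max ** 2                   # magnitude limit A reached


if __name__ == "__main__":
    gamma_unred = unreduced_clip_level()
    print(f"gamma_unred = {gamma_unred:.2f} "
          f"({20.0 * math.log10(gamma_unred):.1f} dB)")
    print(f"p_symclip   = {symbol_clip_probability(gamma_unred):.2e}")
```

With these defaults the script should reproduce a crest factor near 5.33 and a symbol clip probability near 2·10^-4, in line with (10) and (11).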

[Plots for Figures 6 and 7. Vertical axes: reduction tone magnitude relative to data tones (dB), −10 to 15 dB; horizontal axes: target PAR level (dB), 7 to 15 dB. Curves are labelled for 3, 6, 12, and 24 tones, and the unreduced PAR is marked.]

Figure 6 legend: allowed magnitude; required magnitude; +4.8 dB constraint; −10 dB constraint.

Figure 6: The thick solid lines starting from the bottom-left corner and bending upwards show the maximum reduction tone magnitude without exceeding the average PSD level, when aiming at the target PAR level shown on the horizontal axis. The thin solid lines starting from the bottom right show what reduction tone magnitude is needed to possibly achieve the target PAR level on the horizontal axis. The two horizontal lines at +4.8 dB and −10 dB show the limitations when having a maximum peak PSD, as described in Section 3.2.

Figure 7 legend: allowed magnitude; required magnitude; simulations, random placement.

Figure 7: The bounds from Figure 6, shown together with simulations with a random placement of tones. Since the bounds only account for a single signal peak, the simulation results move somewhat upwards and to the right.

Then we can calculate a lower bound on the average reduction power on each tone as the expected value of the minimum instantaneous power:

$$ \begin{aligned} \mathrm{Power}_{\mathrm{red}} &\ge \int_0^{\infty} g(\gamma\sigma) f(\gamma)\, d\gamma \\ &= \int_{\gamma_{\mathrm{target}}}^{\gamma_{\mathrm{target}} + UA/\sigma} \frac{1}{2} \left( \frac{\gamma\sigma - \gamma_{\mathrm{target}}\sigma}{U} \right)^2 f(\gamma)\, d\gamma + \int_{\gamma_{\mathrm{target}} + UA/\sigma}^{\infty} \frac{A^2}{2} f(\gamma)\, d\gamma \\ &= \frac{1}{2U^2} \int_{\gamma_{\mathrm{target}}}^{\gamma_{\mathrm{target}} + UA/\sigma} \left( \gamma\sigma - \gamma_{\mathrm{target}}\sigma \right)^2 f(\gamma)\, d\gamma + \frac{A^2}{2} P_S\left( \gamma_{\mathrm{target}} + \frac{UA}{\sigma} \right). \end{aligned} \tag{14} $$

In addition, to conform to the PSD limitation of a reduction tone PSD of 10 dB below the data-tone PSD, we can calculate the allowed average power $\mathrm{Power}_{\mathrm{red}}$ on a reduction tone as

$$ \mathrm{Power}_{\mathrm{red}} \le 10^{-10/10}\, \mathrm{Power}_{\mathrm{data}} = 10^{-1} \frac{\sigma^2}{U_0 - U}, \tag{15} $$

where $U_0$ is the number of tones originally available for data transmission before introducing tone reservation. The average power on the data tones, $\mathrm{Power}_{\mathrm{data}}$, is obtained by dividing the total signal power $\sigma^2$ by the number of tones used for data transmission. We assume that the power on the data tones fills the PSD mask completely. It is now possible to use the two $\mathrm{Power}_{\mathrm{red}}$ expressions in (14) and (15) to solve for the maximum value of $A$. The thick solid lines, starting in the bottom left and bending upwards, in Figure 6 show the resulting values of the maximum magnitude $A$ as a function of a certain target PAR level. The four lines correspond, from right to left, to systems with 3, 6, 12, and 24 tones, respectively. Choosing a target PAR level, we can read out the highest value we could set the maximum tone magnitude $A$ to in order not to exceed the PSD limit on average. Thus, the allowed area is to the right of, or below, the thick solid lines.

For very low target PAR levels in the system, almost all symbols will need to be reduced. In order not to exceed the constraint on the average PSD, we have to put a strict constraint on the maximum tone amplitude, $A$. For low target PAR levels, it can be seen that the thick solid lines for the peak constraint $A$ converge to the average constraint at −10 dB.

On the other hand, if we aim at a not too low PAR, fewer symbols will need reduction (cf. Figure 4). Then the reduction signal can be allowed to be much stronger on occasional symbols without violating the average PSD limit. In Figure 6, the curves are bending strongly upwards at certain PAR levels. Choosing a target PAR level a bit above this value will allow a strong reduction signal, and thus a high probability of attaining the target.

3.1.2. Required magnitude

The curves developed in Section 3.1.1 and shown in Figure 6 tell us how much power we can put on the reduction tones without exceeding the PSD mask on average. However, the thick lines in Figure 6 give only a bound on an allowable region for combinations of the target PAR level $\gamma_{\mathrm{target}}^2$ and maximum reduction tone magnitude. They say nothing about what is achievable, that is, they do not guarantee that the target PAR level can, or will be, reached. Here we can reuse the lower bound on reduction tone magnitude from Section 3.1.1. Based on this bound and the distribution of the unreduced signal peak from (9), we can calculate what reduction tone magnitude is at least needed for a certain level of reduction at a certain clip probability:

$$ A \ge \frac{\left( \gamma_{\mathrm{unred}} - \gamma_{\mathrm{target}} \right) \sigma}{U}, \tag{16} $$

where $\gamma_{\mathrm{unred}}$ is the crest factor for the unreduced signal at the clip probability $p_{\mathrm{symclip}} = 2 \cdot 10^{-4}$, as described in Section 2.4.

The thin solid lines starting from the bottom right in Figure 6 show this tone magnitude as a function of target PAR level, for 3, 6, 12, and 24 tones. The rightmost vertical line shows the PAR of the unreduced signal, $\gamma_{\mathrm{unred}}^2$, which is about 14.5 dB at the clip probability $2 \cdot 10^{-4}$. To be able to reduce the signal level down to the target PAR on the horizontal axis, we need at least the amount of tone magnitude specified by the thin solid lines.

For each number of tones we then have two different lower bounds. The limit on average PSD gives a maximum allowed tone magnitude, and (16) gives a value of the minimum needed magnitude. Both bounds give a lower limit on PAR level and the allowable area is to the right of both curves. If it were attainable, the best point would be given by the intersection of the bounds, marked with a circle in the figure for each number of tones.

The solid lines in Figure 6 show the allowed and required magnitude based on derivations only assuming one peak. Here Figure 4 comes to our aid. There we can see that the one-peak-only assumption almost always holds above 12 dB. To evaluate the derivations, simulation results are shown in Figure 7 with thin dashed lines next to the bounds. Looking at the thick curves for allowed tone magnitude, we see that for a low number of tones, the simulations closely follow the bound. For a higher number of tones, we cannot allow ourselves as high a magnitude as the bound suggests. The reason is that with a high number of tones, we can work at low PAR levels, which means that we may have several peaks (cf. the leftmost grey area in Figure 4). Then the reduction algorithm has to spend power on reducing many peaks, and not one single peak, which is the situation the bound describes. Also, for the thin bound lines we see that the bounds are tighter at higher PAR levels. Reduction to low levels will generate signals with many peaks, again deviating from the basic case the bound is based on. Considering the simulations, the intersection points move upwards, to a higher magnitude per tone, rather than rightwards, to a higher PAR. The bound gives a good indication of the PAR, while it may show a too low required tone magnitude. The bounds will be further evaluated with simulations in Section 5.

[Figure 8 plot. Vertical axis: resulting PAR (dB), 8 to 16 dB; horizontal axis: target PAR (dB), 7 to 16 dB. Curves are labelled for 3, 6, 12, and 24 tones.]

Figure 8: Relationship between target PAR and the effective resulting PAR level (given that the average PSD level is enforced), for reduction with 3, 6, 12, and 24 tones. The curve for 6 tones is a bold solid line. The dotted horizontal lines show the simulated value for reduction without PSD constraints, for the same number of tones. The asterisks show the combined −10 dB average PSD bound and +4.8 dB peak PSD bound based on FEXT calculations as described in Section 3.2. The 12 and 24 tone cases are not constrained by the +4.8 dB peak PSD bound.

3.1.3. Influence of target PSD

The balancing of target PAR level and allowed maximum amplitude is difficult. As an example, consider a system with six reduction tones. From Figure 6 we see that the optimal point is 11.4 dB. Let us assume that the designer chooses a target PAR level of 11.0 dB, which is below this point. Following the dotted line at 11.0 dB in Figure 6 up to the thick line corresponding to the allowed magnitude, we see that the maximum allowed reduction tone magnitude is −2.8 dB compared to the data tones. If the algorithm really aims at the 11.0 dB level, the PSD mask will be violated. Otherwise, the reduction tone magnitude has to be limited to −2.8 dB. Following the dotted line along this level to the thin 6-tone curve to the right shows us that we can at most reach a resulting PAR of 13.8 dB, or a crest factor of 4.90.

This relationship between target and resulting PAR while conforming to the PSD mask is shown in Figure 8. As described in the previous example, we can get the maximum $A$ value based on a certain target PAR. Then we can use (16) to get the bound on the resulting PAR:

$$ \gamma \ge \gamma_{\mathrm{unred}} - \frac{UA}{\sigma}, \tag{17} $$

shown on the vertical axis for a clip probability of $2 \cdot 10^{-4}$. From the figure we see that there is a sharp optimum for each choice of number of tones. Aiming a little bit lower than this optimum, the resulting PAR level is severely increased (or in practice, the average PSD would be violated). The upper horizontal line shows the PAR when having no reduction at all, and the four lower lines show simulation results of what

could be achieved when not having any PSD constraints. The optimum points discussed above have higher PAR than these levels, which shows that the system performance is primarily limited by the constraint in PSD levels. Having no PSD constraints gives results dependent on the number and placement of the reduction tones.

Evaluating the bound in Figure 8 for different clip probabilities will give new values of the optimum points. These optimum points can then be plotted in a clip probability curve, commonly shown in PAR reduction papers; see the thick line in Figure 9.

[Figure 9 plot. Vertical axis: clip probability, P(symbol PAR > clip level), logarithmic scale; horizontal axis: clip level (dB), 7 to 16 dB. Legend: bound with average constraint; unreduced signal; −10 dB constraint; +4.8 dB constraint.]

Figure 9: Clip probability curve for reduction with 6 tones. The thick line shows the bound for reduction under the average constraint for any parameter choice. The rightmost dash-dotted curve shows the PAR for an unreduced signal according to (9), and the dashed curve shows the bound when using no averaging. The thin solid line shows the bound when using the +4.8 dB constraint based on the FEXT calculations in Section 3.2.

Worth noticing in Figure 9 is that we get a minimum obtainable PAR that is almost independent of the clip probability considered. Also, since the curve represents the best solution given different clip probabilities, a fixed design cannot get arbitrarily close to this bound at all values. We can only expect a system with a certain parameter choice to get close to the bound at the very specific target PAR level the system is designed for.

3.1.4. Combination of allowed and required magnitudes

We are now ready to define this optimum point in terms of achievable PAR level for a system with a certain number of tones. First we combine the two $\mathrm{Power}_{\mathrm{red}}$ expressions in (14) and (15) to get the relationship between $A$ and $U$. To get a set of best design choices of $U$ and $\gamma_{\mathrm{target}}$ (although maybe not reachable), we let the maximum allowed magnitude $A$ be equal to the minimum required value from (16). After eliminating the dependence on $\sigma^2$, the resulting expression is

$$ 10^{-1} \frac{1}{U_0 - U} \ge \frac{1}{2U^2} \int_{\gamma_{\mathrm{target}}}^{\gamma_{\mathrm{unred}}} \left( \gamma - \gamma_{\mathrm{target}} \right)^2 f(\gamma)\, d\gamma + \frac{1}{2U^2} \left( \gamma_{\mathrm{unred}} - \gamma_{\mathrm{target}} \right)^2 \underbrace{\int_{\gamma_{\mathrm{unred}}}^{\infty} f(\gamma)\, d\gamma}_{=\, p_{\mathrm{symclip}} \text{ from (11)}}, \tag{18} $$

where we note that the second integral is the tail probability $p_{\mathrm{symclip}}$ at level $\gamma_{\mathrm{unred}}$. Let us interpret this equation. To the left, we have the allowed power on a tone, from the limit on the average PSD. To the right, we have a sum of two terms. The first is the power per tone needed to reduce from the peak level down to the target level. This corresponds to the first item in the optimisation formulation, "reduce down to the target PAR level using as little added power as possible." The second expression corresponds to the cases when the allowed instantaneous power is insufficient for reducing the peak down to the target PAR level. For these cases, we reduce with the maximum available instantaneous tone magnitude $A$, which from (16) is equal to $\sigma(\gamma_{\mathrm{unred}} - \gamma_{\mathrm{target}})/U$. This corresponds to the second item in the optimisation formulation, "minimise the PAR level if the target is unreachable." Compared to the first integral, the second term is very small, since it includes the small probability $p_{\mathrm{symclip}}$ of a signal larger than $\gamma_{\mathrm{unred}}$. Note that (18) can be solved easily with regard to $U$ and $\gamma_{\mathrm{target}}$. The relationship between $U$ and the target PAR level is shown in Figures 11 and 12 in the upcoming section.

3.2. Limitations imposed by a maximum peak PSD

The derivations so far concerned only a limit on average PSD level. In Section 2.1 it was discussed that a maximum instantaneous level may be needed as well, which will affect the reachable PAR level according to (17). This is the case considered in [12, 13], which we elaborate on here to give more detailed guidelines on where to put the constraint. We will consider the most restrictive case first, when the peak value is constrained at −10 dB and we do not use any averaging at all, and then extend to a scenario based on crosstalk calculations.

3.2.1. Constraint without averaging

For the case when no averaging is done between different symbols, all symbols need to comply with the −10 dB limitation and we cannot take advantage of the fact that most of the symbols need no or a very small amount of reduction. Thus, no power can be saved for the cases when it is needed to reduce a strong peak.

Two horizontal dashed lines are shown in Figure 6. The lower one is placed at −10 dB and corresponds to this limitation. In addition to being on the right-hand side of the bounds discussed before, we now also have to be below this horizontal line. As can be seen from the figure, this severely affects the reduction performance. For example, using only 6 PAR reduction tones, only about 0.3 dB reduction can be achieved. In this case, the PAR reduction will most likely not be worth the effort.
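The worked six-tone example of Section 3.1.3 and the 0.3 dB figure of Section 3.2.1 can be retraced numerically. The following is a minimal sketch, not the authors' implementation. It assumes σ = 1, N = 512, and a hypothetical U0 = 223 available data tones (tones 33–255; this value is an assumption, not stated in this part of the text). It also assumes that a tone of magnitude A carries power A²/2, so that a data tone has magnitude sqrt(2/(U0 − U)), and it uses Simpson's rule for the integral in (14).

```python
import math

N_FFT = 512         # symbol length N of the ADSL2 examples
U0 = 223            # ASSUMPTION: data tones before reservation (tones 33-255)
GAMMA_UNRED = 5.33  # unreduced crest factor from (10)
SQRT3 = math.sqrt(3.0)


def p_s(gamma):
    """Symbol clip probability P_S(gamma), (9)."""
    return 1.0 - math.exp(-(N_FFT / SQRT3) * math.exp(-gamma * gamma / 2.0))


def f(gamma):
    """Density of the normalised symbol peak, (13)."""
    h = (N_FFT / SQRT3) * math.exp(-gamma * gamma / 2.0)
    return gamma * h * math.exp(-h)


def simpson(func, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def average_tone_power(a_max, gamma_t, n_tones):
    """Lower bound (14) on the average reduction-tone power, sigma = 1."""
    upper = gamma_t + n_tones * a_max
    # f is negligible above a crest factor of 10, so cap the integration range.
    integral = simpson(lambda g: (g - gamma_t) ** 2 * f(g),
                       gamma_t, min(upper, 10.0))
    return integral / (2.0 * n_tones ** 2) + 0.5 * a_max ** 2 * p_s(upper)


def data_tone_magnitude(n_tones):
    """Magnitude of a data tone carrying power 1/(U0 - U), sigma = 1."""
    return math.sqrt(2.0 / (U0 - n_tones))


def allowed_magnitude_db(gamma_t_db, n_tones, psd_db=-10.0):
    """Largest A (dB relative to a data tone) keeping (14) below (15)."""
    gamma_t = 10.0 ** (gamma_t_db / 20.0)
    limit = 10.0 ** (psd_db / 10.0) / (U0 - n_tones)  # (15)
    lo, hi = 0.0, 2.0
    if average_tone_power(hi, gamma_t, n_tones) < limit:
        return math.inf  # the average PSD never becomes the limiting factor
    for _ in range(60):  # bisection; (14) grows monotonically with A
        mid = 0.5 * (lo + hi)
        if average_tone_power(mid, gamma_t, n_tones) < limit:
            lo = mid
        else:
            hi = mid
    return 20.0 * math.log10(lo / data_tone_magnitude(n_tones))


def resulting_par_db(a_db, n_tones):
    """Bound (17) on the resulting PAR for a magnitude a_db (dB re data tone)."""
    a_max = 10.0 ** (a_db / 20.0) * data_tone_magnitude(n_tones)
    return 20.0 * math.log10(GAMMA_UNRED - n_tones * a_max)


if __name__ == "__main__":
    print(f"allowed magnitude, 6 tones, 11.0 dB target: "
          f"{allowed_magnitude_db(11.0, 6):.1f} dB")
    print(f"resulting PAR at -2.8 dB: {resulting_par_db(-2.8, 6):.1f} dB")
    print(f"resulting PAR at -10 dB:  {resulting_par_db(-10.0, 6):.1f} dB")
```

Under these assumptions the allowed magnitude for a 11.0 dB target should come out within a few tenths of a dB of the −2.8 dB read from Figure 6, the resulting PAR at −2.8 dB close to 13.8 dB, and the −10 dB peak constraint should leave roughly 0.3 dB of reduction relative to the unreduced 14.5 dB.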

In Figure 8, the curves move right and down with in- Bound based on peak PSD creasing target PAR level. Not using any averaging corre- sponds to having so low target PAR that all symbols need reduction. What may be possible to achieve here is thus shown by the values to the left in the figure. Having this strict limitation on the reduction signal makes the PAR reduction only usable if there is a high number of tones available for PAR reduction, for instance when there are many tones un- Design choice based on bounds

available to carry data due to low SNR. PAR level (dB) Bound based on average PSD

3.2.2. Constraint based on FEXT calculations

Thetwopreviousextremecasesshowverydifferent reduc- tion results, due to the difference in maximum allowed mag- Number of reduction tones nitude. It can be expected that there should be a maximum instantaneous tone power somewhere in between, deter- Figure 10: Schematic figure of two bounds on achievable PAR re- mined to not cause harmful interference to other users. If we duction, shown as a function of number of reduction tones. The two bounds based on maximum average and maximum peak PSDs consider the far-end crosstalk (FEXT), we can come up with define an unreachable region. For system design, a target point with a reasonable point to put the PSD constraint at. a certain safety margin is desired. Consider a situation where our modem in question is re- sponsible for one quarter of the FEXT to a certain neighbour- ing modem, a fairly pessimistic case. As our reduction tones directly following from (18): are at −10 dB of our data tones, it means that the neighbour- 2 γunred U 2 ing modem in question on the reduction tones are expecting · dBavg/10 ≥ − − 2 10 γ γtarget f (γ)dγ a FEXT level of three quarters of the normal level. ADSL is U0 U γtarget (19) designed to use a SNR margin of 6 dB. If that user has a SNR 2 + PS γunred γunred − γtarget , margin of 6 dB, how much can we then increase our distur- bance, without more than half of this SNR margin being lost? where Adding 3 dB to the FEXT level means adding (during an in- Nγ N − 2 − 2 stant only) as much FEXT as the user already has from all f (γ) = √ exp − √ e γ /2 e γ /2, 3 3 the other users, that is, matching the three quarters of the (20) nominal FEXT level. This is three times the one quarter we √N −γ2/2 PS(γ) = 1 − exp − e . have on the data-carrying tones, which means a peak level 3 of +4 8 dB on our reduction tones as compared to our data- . This equation can be plotted for U (number of reduction carrying tones. 
tones) as a function of γtarget (target PAR level), showing the This level is shown as the upper horizontal dash-dotted minimum number of reduction tones needed to achieve a line in Figure 6. We also need to be below this line, but com- certain PAR level. An example of this bound is plotted as the pared to the previous, much more restrictive, case, the system solid line in Figure 10. now has more reduction capabilities. The 24-tone system is A lower bound on achievable PAR under a peak PSD con- ff not a ected at all by this peak constraint, and the 12-tone straint dB , typically +4.8dB,isfrom(17)givenby ff peak system has just enough magnitude to avoid being a ected. √ However, the systems with 3 and 6 tones are still limited 2 dBpeak/20  U γ ≥ γunred − 10 . (21) by the constraint, which is also shown with the asterisks in U0 − U Figure 8. Such a bound has been plotted with the dash-dotted line in Figure 10. Together with the previous bound, it defines an 4. NUMERICAL RECIPE unreachable region marked with grey. When designing a sys- tem using randomly chosen tones, choosing a point 1 dB out- To summarize the previous section, we will now give a de- side this area should be sufficient as a tentative design choice. scription of how to, based on a certain system environment, The 1 dB margin will be motivated in the following section. generate an estimate of PAR reduction performance under PSD constraints. A description and typical values of the sys- 5. EFFECTS ON SYSTEM PERFORMANCE tem parameters are given in Table 1, and the design parame- ters we want to obtain values on are given in Table 2. Using the method in Section 4, we now combine the bounds To get a lower bound on the achievable PAR under an av- from Section 3 to evaluate what effect these will have on sys- erage PSD constraint dBavg, typically −10 dB, use for example tem performance, first for an ADSL2 system, then also for an Matlab’s quad function to evaluate the following function, ADSL2+ system. 

Figure 11: Bounds on achievable PAR (PAR level in dB) as a function of the number of reduction tones, for an ADSL2 system with N = 512 and a symbol clip probability of 2·10^−4. The solid line shows the bound based on a maximum average PSD. The dash-dotted line shows the bound based on a peak PSD set by FEXT calculations (+4.8 dB peak limit), and the upper dashed line is the very restrictive case when no averaging is used at all (−10 dB peak limit). The marker symbols show simulated reduction performance, for random or block-placed tones, with or without PSD constraints.

5.1. ADSL2 system with combined peak-and-average constraints

Looking at the bounds for a different number of tones at a certain clip probability, we obtain a plot of the achievable PAR as a function of the number of reduction tones, see Figure 11. These curves show what performance could at best be achieved when having a certain number of reduction tones evaluated at a certain probability.

The solid line shows the bound for reduction under the −10 dB average PSD limit from (18) in Section 3.1. As seen, the bound decreases very fast in the beginning, then decreases slower at the higher number of reduction tones. With only a few tones, we can effectively reduce the highest peaks in the signal. These high peaks occur very seldom, so significant reduction can be achieved without increasing the average PSD much. Using more tones, we can start working with lower peaks in the signal as well, but cannot save the reduction power to a few, very high, peaks.

The two other lines show the peak PSD constraints from Section 3.2, using (17). The upper dashed line shows the restrictive −10 dB PSD constraint based on no averaging. Since this constraint admits only a very small reduction signal, we would have to use a very large number of tones to get any significant PAR reduction. More interesting is the +4.8 dB constraint imposed by the FEXT calculations, shown by the dash-dotted line. The larger allowed signal for this case will give a significantly better performance. We can see that this peak-PSD bound crosses the bound based on average PSD after 11 reduction tones. The good performance for only a few tones indicated by that bound demanded a very large reduction signal. Having a constraint on this as well, we see that the combined bound, with both the solid and dash-dotted lines, will show a better yield per tone in the beginning up to 11 tones. After this point, the average PSD will be the limiting factor, and the performance will increase slower.

The lines only represent bounds on achievable performance based on PSD limitations. The performance will also depend on what tones are chosen as reduction tones. Without PSD-constraints, tones spread out over the frequency band in an unstructured manner will give better results than regularly or block-placed tones [2, 15, 21]. However, spreading out tones over the whole frequency band is not too attractive in wireline systems as the SNR for the lower part of the spectrum generally is much better than for the highest tones. We will consider two extreme cases: tones randomly placed over the frequency band and tones allocated as a contiguous block of the highest tones. In practice, a combination of these extreme cases may be a good choice.

Simulation results for reduction with and without PSD constraints are also shown in Figure 11. The simulations are done with 8 times oversampling, and a 32-sided linear approximation [12] of the quadratic power constraint in (5). For the simulations without PSD constraints, we see that the performance for the block-placed tones increases very slowly with the number of reduction tones, compared to the random placement. Most of the reduction performance is achieved after only a few reduction tones. Nevertheless, when the PSD is constrained, the peak PSD bound sets the limit on performance up to about 8 tones. This suggests that the tone placement is of minor importance for this low number of tones.

Since the bound for the peak constraints decreases faster than the bound for the average constraint, we are interested in the point where the two bounds meet. From Figure 11 this can be seen to be between 11 and 12 tones. Simulations performed with randomly scattered tones are shown with the triangles. The bounds give us a good hint about what the resulting performance will be. However, with a high number of tones, we do not end up as close to the bounds as we do with a low number of tones. The difference is around one half dB. This could be seen already in Figure 6. There, the simulations did not allow as much reduction magnitude as the bound suggested, due to the bound only calculating with one peak.

The simulated PAR for the number of tones where the bounds cross was around 11.3 dB. Translating to linear scale, we have a crest factor of about 3.7. It is interesting to know what this means in volts on the line if this is to be used, for example, to design the supply voltage to the line driver. If we have a transmit power of P = 20 dBm (100 mW), and a load of R = 100 Ω, the RMS value of the transmit signal is $U = \sqrt{PR} = \sqrt{0.1\,\mathrm{W}\cdot 100\,\Omega} = 3.16$ V. This means that instead of

3.16 V · 5.33 ≈ 17 V we now have a peak of 3.16 V · 3.7 ≈ 12 V. (In practice, this calculation is more complicated, involving higher loads, step-up transformers, etc.)

The reduction in signal span and supply voltage gives a good indication on power consumption in the line driver. However, the exact relationship between the supply voltage and power consumption is not straightforward, since it depends on how capacitive the load is [22]. The rules of thumb for power consumption given by designers of line drivers vary between linear and quadratic functions of the supply voltage. Reduction from 17 V to 12 V would then mean a power reduction between 30% and 50%.

Figure 12: Bounds on achievable PAR (PAR level in dB) as a function of number of reduction tones. This figure is like Figure 11, but shows an ADSL2+ system with N = 1024, and a symbol clip probability of 4·10^−4. The legend is the same as in Figure 11: average limit; −10 dB peak limit; +4.8 dB peak limit; simulations with block or random tone placement, unconstrained or constrained.

Table 1: System parameters for calculation of reduction bounds.

Symbol     Description                                                    Typical value
N          FFT size                                                       ADSL2: 512; ADSL2+: 1024
γunred     Crest factor (PAR in linear scale) for the unreduced signal    5.33 (14.5 dB)
U0         Number of data tones available before PAR reduction            ADSL2: 223; ADSL2+: 479
dBavg      Average PSD constraint in dB                                   −10
dBpeak     Peak PSD constraint in dB                                      +4.8

Table 2: Design parameters for calculation of reduction bounds.

Symbol     Description
U          Number of reduction tones
γtarget    Target crest factor

5.2. Extension to ADSL2+

The derivations so far have described a system with an FFT length of 512 in the downstream direction, such as ADSL2. ADSL2+, the extension of ADSL2, uses more downstream bandwidth, which means that the downstream FFT length here instead is 1024, and all tones in the newly added top half of the spectrum are used for downstream transmission.

We have seen that the limits on achievable PAR reduction performance are strongly dependent on the number of PAR reduction tones. Increasing the system FFT size to 1024, where we use a flat transmit PSD for all data tones, moves the intersection point between the two bounds up to about 15 tones, as shown in Figure 12. The simulation results are here closer to the bound than in the ADSL2 case. We see that also in this case, we can expect a faster PAR decrease down to about 11 dB. Below that level, a higher number of tones is needed for the same gain.

If we compare Figures 11 and 12, we see that the bounds have moved to the right for the ADSL2+ case with the double number of subcarriers. The success in PAR reduction depends on how large reduction signal we can create, in comparison to the size of the data signal. Thus, what is important for good reduction under a PSD constraint is often the relative number of reduction tones. On the contrary, the reduction performance of the nonconstrained reduction is more dependent on the absolute number of reduction tones, since what affects the reduction performance here is how well we can create an impulse in counter-phase to the signal peak.

6. CONCLUSIONS

Applying tone reservation to DSL systems can reduce the system PAR, with algorithms possible to implement in current standards. For implementation, low-cost active-set-based algorithms may be a good choice [12]. However, designing the algorithms, care must be taken not to exceed the PSD levels set up by the standards. Introducing PSD constraints will significantly alter the achievable performance of the reduction systems.

Using the requirements specified in the ITU standards and extending this with engineering assumptions, we have derived bounds on achievable performance. These bounds are applicable to most DMT systems, such as ADSL2 and ADSL2+. Simulations searching for the optimal solution confirm that the bounds give a good indication of realistic system performance.

We have demonstrated how these bounds can be used to predict system performance for varying parameter choices, and we have exemplified how they can be used to tailor PAR reduction to different systems. Thus, the bounds can be used to quickly determine if tone reservation PAR reduction is a worthwhile technology to be included in a system. Using the bounds, this can be done in an afternoon. After a positive indication, the system design could then proceed with the large task to create a simulation chain in order to fine-tune the settings for the PAR reduction algorithm.

APPENDICES

A. PSD CONSTRAINTS FORMULATED IN THE STANDARDS

The ADSL1 standard [16] defines the PSD constraints as:

    For the subcarriers with ($b_i = 0$ and $g_i = 0$), the ATU-C transmitter should and is recommended to transmit no power on those subcarriers. The ATU-R receiver cannot assume any particular PSD levels on those subcarriers. The transmit PSD levels of those subcarriers with $g_i = 0$ will be at least 10 dB below the sync symbol reference transmit PSD level if the subcarrier is below the lowest used subcarrier (lowest $i$ with $b_i > 0$) and will be below the sync symbol reference transmit PSD level if the subcarrier is above the lowest used subcarrier.

In ITU standards, the word will defines a mandatory requirement, should a recommendation, and may an option. The ADSL1 standard thus recommends to not use these subcarriers for PAR reduction, but allows the same PSD as for the data tones, except for a −10 dB limitation on the tones below the lowest data subcarrier.

In the newer ADSL2 standard [17], the formulation is:

    For the subcarriers not in the MEDLEYset, the ATU will transmit no power on the subcarrier (i.e., $Z_i = 0$, see Section 8.8.2) if the subcarrier is below the first used subcarrier index or if the subcarrier is in the SUPPORTEDset and in the BLACKOUTset. Otherwise, the ATU may transmit at a discretionary transmit PSD level on the subcarrier (which may change from symbol to symbol), not to exceed the maximum transmit PSD level for these subcarriers. The maximum transmit PSD level for each of these subcarriers will be defined as 10 dB below the reference transmit PSD level, fine tuned by RMSGI dB (see Section 8.5) and limited to the transmit spectral mask.

Below the lowest data subcarrier, no PAR reduction tones are allowed. However, there is no recommendation to not use the other subcarriers as reduction tones, as long as they are at least 10 dB below the data tone PSD.

Summarizing the standards, we see that although the recommendation is to not put any power at all on the reduction tones, the ADSL1 standard allows much higher power, compared to the ADSL2 standard. Thus, the more well-defined ADSL2 formulation sets a stricter limit on reduction performance.

B. DERIVATION OF THE PAR FOR AN OVERSAMPLED SYSTEM

The derivation is based on the signal viewed as a Gaussian process [14]. This can also be intuitively motivated by the central limit theorem when adding many subcarriers with independent data. We model the signal as Gaussian noise with constant spectral density in the frequency interval up to $f_1$:

\[
R(f) =
\begin{cases}
\dfrac{\sigma^{2}}{2f_{1}} & \text{if } -f_{1} \le f \le f_{1},\\[1ex]
0 & \text{otherwise.}
\end{cases}
\tag{B.1}
\]

Rice's formula for a Gaussian stationary process [23–25] states that the intensity of upcrossings, that is, the expected number of upcrossings of the level $\gamma\sigma$ in an interval of length 1, is

\[
\mu^{+}(\gamma\sigma) = E\bigl[N^{+}_{[0,1]}(x,\gamma\sigma)\bigr] = \frac{1}{2\pi}\sqrt{\frac{\lambda_{2}}{\lambda_{0}}}\; e^{-\gamma^{2}\sigma^{2}/(2\lambda_{0})},
\tag{B.2}
\]

where $\lambda_0$ and $\lambda_2$ are functions of the covariance function $r(\tau)$:

\[
\lambda_{0} = r(0), \qquad \lambda_{2} = -r''(0).
\tag{B.3}
\]

If we consider the case when the level $\gamma\sigma$ is high enough for the crossing times to be spread out and independent, we can model the time between each crossing as exponentially distributed. Then the number of upcrossings during the interval of length 1 follows a Poisson process with intensity $\mu^{+}(\gamma\sigma)$. The corresponding process describing the number of downcrossings of the level $-\gamma\sigma$ will have identical intensity, due to the symmetry and zero-mean of the signal. We are interested in the intensity of crossing either $\gamma\sigma$ or $-\gamma\sigma$, and this intensity could then be described by the sum of the two single-sided intensities.

The probability that we will have no crossings of the level $\gamma\sigma$ in a time interval of length $T$ is

\[
\operatorname{Prob}\bigl(\max |x| < \gamma\sigma\bigr) = e^{-T\,2\mu^{+}(\gamma\sigma)},
\tag{B.4}
\]

and the clip probability is thereby given by

\[
P_{S}(\gamma) = \operatorname{Prob}(\text{clip at level } \gamma\sigma) = 1 - e^{-T\,2\mu^{+}(\gamma\sigma)}.
\tag{B.5}
\]

To solve this for our signal, we start with calculating the covariance function:

\[
r(\tau) = \int_{-\infty}^{\infty} e^{j2\pi f\tau} R(f)\,df = \sigma^{2}\,\frac{\sin 2\pi\tau f_{1}}{2\pi\tau f_{1}},
\tag{B.6}
\]

which shows the variance of the signal, $r(0) = \sigma^{2}$. Differentiating $r(\tau)$ gives

\[
\begin{aligned}
r'(\tau) &= \sigma^{2}\left(\frac{\cos 2\pi\tau f_{1}}{\tau} - \frac{1}{2}\,\frac{\sin 2\pi\tau f_{1}}{\pi\tau^{2} f_{1}}\right),\\
r''(\tau) &= \sigma^{2}\left(-2\,\frac{\sin 2\pi\tau f_{1}}{\tau}\,\pi f_{1} - 2\,\frac{\cos 2\pi\tau f_{1}}{\tau^{2}} + \frac{\sin 2\pi\tau f_{1}}{\pi\tau^{3} f_{1}}\right).
\end{aligned}
\tag{B.7}
\]
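As a quick sanity check on the covariance function (B.6) and on $\lambda_2 = -r''(0)$ from (B.3), the second derivative at zero can be estimated with a central difference and fed into Rice's formula (B.2) and the clip probability (B.5). The sketch below is illustrative only: the function names, the step size, and the normalisation $F_s = 1$ (so that $f_1 = F_s/2$ and $T = N/F_s$) are assumptions made for the example. The comparison value $\tfrac{4}{3}\pi^{2}f_1^{2}\sigma^{2}$ is what a Taylor expansion of (B.6) around $\tau = 0$ gives.

```python
import math


def covariance(tau, sigma=1.0, f1=0.5):
    """Covariance function (B.6): sigma^2 * sin(2 pi tau f1) / (2 pi tau f1)."""
    x = 2.0 * math.pi * tau * f1
    if abs(x) < 1e-12:
        return sigma ** 2  # limit value r(0) = sigma^2
    return sigma ** 2 * math.sin(x) / x


def lambda2_numeric(sigma=1.0, f1=0.5, rel_step=1e-3):
    """Central-difference estimate of lambda_2 = -r''(0), see (B.3)."""
    h = rel_step / f1  # step scaled to the signal bandwidth
    second_diff = (
        covariance(h, sigma, f1)
        - 2.0 * covariance(0.0, sigma, f1)
        + covariance(-h, sigma, f1)
    )
    return -second_diff / h ** 2


def clip_prob_rice(gamma, n, sigma=1.0, fs=1.0):
    """Clip probability (B.5) with the upcrossing intensity from (B.2)."""
    f1 = fs / 2.0        # whole band up to the Nyquist frequency is used
    t_symbol = n / fs    # N samples, each 1/fs long
    lam0 = covariance(0.0, sigma, f1)
    lam2 = lambda2_numeric(sigma, f1)
    mu_plus = (
        math.sqrt(lam2 / lam0) / (2.0 * math.pi)
        * math.exp(-(gamma * sigma) ** 2 / (2.0 * lam0))
    )
    return 1.0 - math.exp(-t_symbol * 2.0 * mu_plus)


if __name__ == "__main__":
    print(lambda2_numeric(), 4.0 / 3.0 * math.pi ** 2 * 0.5 ** 2)
    print(clip_prob_rice(5.33, 512))
```

For N = 512 and a crest factor of 5.33, the printed clip probability comes out close to 2·10^−4, the symbol clip probability quoted for the ADSL2 case in Figure 11.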

Then the quantities $\lambda_0$ and $\lambda_2$ are given by

\[
\begin{aligned}
\lambda_{0} &= r(0) = \sigma^{2},\\
\lambda_{2} &= -r''(0) = -\lim_{\tau\to 0} r''(\tau)\\
&= 4\pi^{2} f_{1}^{2}\sigma^{2} - \lim_{\tau\to 0}\sigma^{2}\left(-\frac{2}{\tau^{2}}\Bigl(1-\frac{1}{2}\,4\pi^{2}\tau^{2}f_{1}^{2} + O(\tau^{4})\Bigr) + \frac{1}{\pi\tau^{3}f_{1}}\Bigl(2\pi\tau f_{1} - \frac{1}{6}\,8\pi^{3}\tau^{3}f_{1}^{3} + O(\tau^{5})\Bigr)\right)\\
&= 4\pi^{2} f_{1}^{2}\sigma^{2} - \lim_{\tau\to 0}\sigma^{2}\left(4\pi^{2}f_{1}^{2} - \frac{8}{6}\pi^{2}f_{1}^{2} + O(\tau^{2})\right)\\
&= \frac{4}{3}\pi^{2} f_{1}^{2}\sigma^{2}.
\end{aligned}
\tag{B.8}
\]

Inserting this into (B.2) and (B.5), we get

\[
P_{S}(\gamma) = 1-\exp\!\left(-T\,\frac{2}{2\pi}\,\frac{2\pi f_{1}}{\sqrt{3}}\; e^{-\gamma^{2}\sigma^{2}/(2\lambda_{0})}\right)
= 1-\exp\!\left(-2Tf_{1}\,\frac{1}{\sqrt{3}}\; e^{-\gamma^{2}/2}\right).
\tag{B.9}
\]

The factor $Tf_1$ is dependent on the symbol length. If the sample rate is $F_s$, and the whole band up to the Nyquist frequency is used, then $f_1 = F_s/2$. At the same time, $T$ corresponds to $N$ samples, each $1/F_s$ in time. Thereby,

\[
Tf_{1} = \frac{N}{F_{s}}\,\frac{F_{s}}{2} = \frac{N}{2}.
\tag{B.10}
\]

The resulting clip probability is then

\[
P_{S}(\gamma) = \operatorname{Prob}(\text{clip at level } \gamma\sigma) = 1-\exp\!\left(-\frac{N}{\sqrt{3}}\,e^{-\gamma^{2}/2}\right).
\tag{B.11}
\]

Notable with this approximation is that the clip probability for $\gamma = 0$ is not exactly one but instead $1-\exp(-N/\sqrt{3})$. As mentioned, when modelling the crossings with a Poisson process, the model is not applicable when we have a too low clip level, that is, close to 0. However, this low region is not of interest. The related problem of deriving clip probability for a complex-valued continuous-time OFDM signal was addressed in [26].

Using the Poisson process model, we can also calculate the probability for exactly a certain number of clips. In particular, the probability for one single clip during a symbol interval is

\[
T\,2\mu^{+}(\gamma\sigma)\,e^{-T\,2\mu^{+}(\gamma\sigma)} = \frac{N}{\sqrt{3}}\,e^{-\gamma^{2}/2}\exp\!\left(-\frac{N}{\sqrt{3}}\,e^{-\gamma^{2}/2}\right).
\tag{B.12}
\]

ACKNOWLEDGMENTS

This work was supported by Ericsson AB; the Eureka Projects MIDAS A110 and BANITS; and by the MUSE Project of the European Union's 6th Framework Programme.

REFERENCES

[1] A. Gatherer and M. Polley, “Controlling clipping probability in DMT transmission,” in Proceedings of 31st Asilomar Conference on Signals, Systems, and Computers, vol. 1, pp. 578–584, Pacific Grove, Calif, USA, November 1997.
[2] J. Tellado-Mourelo, Peak to average power reduction for multicarrier modulation, Ph.D. thesis, Stanford University, Stanford, Calif, USA, 1999.
[3] M. Friese, “Multitone signals with low crest factor,” IEEE Transactions on Communications, vol. 45, no. 10, pp. 1338–1344, 1997.
[4] D. J. G. Mestdagh and P. M. P. Spruyt, “A method to reduce the probability of clipping in DMT-based transceivers,” IEEE Transactions on Communications, vol. 44, no. 10, pp. 1234–1238, 1996.
[5] B. M. Popovic, “Synthesis of power efficient multitone signals with flat amplitude spectrum,” IEEE Transactions on Communications, vol. 39, no. 7, pp. 1031–1033, 1991.
[6] R. W. Bauml, R. F. H. Fischer, and J. B. Huber, “Reducing the peak-to-average power ratio of multicarrier modulation by selected mapping,” Electronics Letters, vol. 32, no. 22, pp. 2056–2057, 1996.
[7] D. L. Jones, “Peak power reduction in OFDM and DMT via active channel modification,” in Proceedings of 33rd Asilomar Conference on Signals, Systems, and Computers, vol. 2, pp. 1076–1079, Pacific Grove, Calif, USA, October 1999.
[8] P. O. Börjesson, H. G. Feichtinger, N. Grip, et al., “A low-complexity PAR-reduction method for DMT-VDSL,” in Proceedings of 5th International Symposium on Digital Signal Processing for Communication Systems (DSPCS ’99), pp. 164–169, Perth, Australia, February 1999.
[9] B. S. Krongold and D. L. Jones, “An active-set approach for OFDM PAR reduction via tone reservation,” IEEE Transactions on Signal Processing, vol. 52, no. 2, pp. 495–509, 2004.
[10] W. Henkel and V. Zrno, “PAR reduction revisited: an extension to Tellado’s method,” in Proceedings of 6th International OFDM-Workshop (InOWo ’01), pp. 31-1–31-6, Hamburg, Germany, September 2001.
[11] J. Tellado and J. M. Cioffi, Further results on peak-to-average ratio reduction, ANSI Document, T1E1.4 no. 98-252, August, 1998.
[12] N. Andgart, B. S. Krongold, P. Ödling, A. Johansson, and P. O. Börjesson, “PSD-constrained PAR reduction for DMT/OFDM,” EURASIP Journal on Applied Signal Processing, vol. 2004, no. 10, pp. 1498–1507, 2004.
[13] N. Petersson, A. Johansson, P. Ödling, and P. O. Börjesson, “A performance bound on PSD-constrained PAR reduction,” in Proceedings of IEEE International Conference on Communications (ICC ’03), vol. 5, pp. 3498–3502, Anchorage, Alaska, USA, May 2003.
[14] S. Wei, D. L. Goeckel, and P. E. Kelly, “A modern extreme value theory approach to calculating the distribution of the peak-to-average power ratio in OFDM systems,” in Proceedings of IEEE International Conference on Communications (ICC ’02), vol. 3, pp. 1686–1690, New York, NY, USA, April–May 2002.
[15] N. Petersson, “Peak and power reduction in multicarrier systems,” Licentiate thesis, Lund University, Lund, Sweden, 2002.
[16] ITU-T, Asymmetric digital subscriber line (ADSL) transceivers, Recommendation G.992.1, June 1999.
[17] ITU-T, Asymmetric digital subscriber line (ADSL) transceivers - 2 (ADSL2), Recommendation G.992.3, July 2002.

[18] ITU-T, Asymmetric digital subscriber line (ADSL) transceivers - Extended bandwidth ADSL2 (ADSL2+), Recommendation G.992.5, May 2003.
[19] P. Ödling, N. Petersson, A. Johansson, and P. O. Börjesson, “How much PAR to bring to the party?” in Proceedings of IEEE Nordic Signal Processing Symposium (NORSIG ’02), Tromsø-Trondheim, Norway, October 2002.
[20] D. G. Luenberger, Linear and Nonlinear Programming, Addison-Wesley, Reading, Mass, USA, 2nd edition, 1984.
[21] N. Petersson, A. Johansson, P. Ödling, and P. O. Börjesson, “Analysis of tone selection for PAR reduction,” in Proceedings of 3rd International Conference on Information, Communications and Signal Processing (ICICS ’01), Singapore, October 2001.
[22] N. Larsson and K. Werner, “Signal peak reduction for power amplifiers with active termination,” Master’s thesis, Lund Institute of Technology, Lund, Sweden, 2002.
[23] G. Lindgren, Lectures on Stationary Stochastic Processes, Mathematical Statistics, Lund University, Lund, Sweden, 2002.
[24] S. O. Rice, “Mathematical analysis of random noise,” Bell Systems Technical Journal, vol. 23, pp. 282–332, 1944.
[25] S. O. Rice, “Mathematical analysis of random noise,” Bell Systems Technical Journal, vol. 24, pp. 46–156, 1945.
[26] H. Ochiai and H. Imai, “On the distribution of the peak-to-average power ratio in OFDM signals,” IEEE Transactions on Communications, vol. 49, no. 2, pp. 282–289, 2001.

Niklas Andgart was born in Hässleholm, Sweden, in 1975. He received the M.S.E.E. degree in 2000, the Licentiate in Engineering degree in 2002, and the Ph.D. degree in signal processing in 2005, all from Lund University. During the fall of 1999, he was with the Vehicle and Dynamics Laboratory at the University of California at Berkeley, and in early 2004, he was visiting the Department of Electrical and Electronic Engineering at the University of Melbourne. Currently, he is with the Department of Information Technology at Lund University. His research is within signal processing for communication systems and he works with DSL research in cooperation with Ericsson AB in Stockholm.

Per Ödling was born 1966 in Örnsköldsvik, Sweden. He received an M.S.E.E. degree in 1989, a Licentiate in Engineering degree in 1993, and a Ph.D. degree in signal processing in 1995, all from Luleå University of Technology, Sweden. In 2000, he was awarded the Docent degree from Lund Institute of Technology, and in 2003, he was appointed Full Professor there. From 1995, he was an Assistant Professor at Luleå University of Technology, serving as Vice-Head of the Division of Signal Processing. In parallel, he consulted for Telia AB and ST Microelectronics, developing an OFDM-based proposal for the standardization of UMTS/IMT-2000 and VDSL for standardization in ITU, ETSI, and ANSI. Accepting a position as Key Researcher at the Telecommunications Research Center, Vienna, in 1999, he left the Arctic North for historic Vienna. There he spent three years advising graduate students and industry. He also consulted for the Austrian Telecommunications Regulatory Authority on the unbundling of the local loop. He is, since 2003, a Professor at Lund Institute of Technology, stationed at Ericsson AB, Stockholm. He also serves as an Associate Editor for the IEEE Transactions on Vehicular Technology. He has published more than forty journal and conference papers, thirty-five standardization contributions, and a dozen patents.

Albin Johansson was born 1968 in Stockholm, Sweden. He received an M.S.E.E. degree in 1993 from Royal Institute of Technology in Stockholm and is now pursuing a Ph.D. degree at Lund Institute of Technology. Since 2004, he has been holding a position at Ericsson AB as an Expert Broadband Access within Wireline, being responsible for system architecture in Ericsson’s wireline broadband access products. He has been actively involved in the development of the standardization of ADSL within ETSI, ANSI, ITU-T, and ADSL Forum. He has been Editor for ITU-T G.997.1 and Chair in one of ADSL Forum's subcommittees. In addition, from 1992 to 1995, he was teaching undergraduate students at Royal Institute of Technology. Since 2001, he has been a Member of the Signal Processing Group at Lund Institute of Technology. He has published 6 conference papers, numerous standardization contributions, and holds 7 patents.

Per Ola Börjesson was born in Karlshamn, Sweden in 1945. He received his M.S. degree in electrical engineering in 1970 and his Ph.D. degree in telecommunication theory in 1980, both from Lund Institute of Technology (LTH), Lund, Sweden. In 1983, he received the degree of Docent in telecommunication theory. Between 1988 and 1998, he was Professor of signal processing at Luleå University of Technology. Since 1998, he is Professor of signal processing at Lund University. His primary research interest lies in high-performance communication systems, in particular, high-data-rate wireless and twisted-pair systems. He is presently researching signal processing techniques in communication systems that use orthogonal frequency-division multiplexing (OFDM) or discrete multitone (DMT) modulation. He emphasizes the interaction between models and real systems, from the creation of application-oriented models based on system knowledge to the implementation and evaluation of algorithms.

Hindawi Publishing Corporation EURASIP Journal on Applied Signal Processing Volume 2006, Article ID 19329, Pages 1–16 DOI 10.1155/ASP/2006/19329

Cosine-Modulated Multitone for Very-High-Speed Digital Subscriber Lines

Lekun Lin and Behrouz Farhang-Boroujeny

Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT 84112-9206, USA

Received 17 November 2004; Revised 24 June 2005; Accepted 22 July 2005

In this paper, the use of cosine-modulated filter banks (CMFBs) for multicarrier modulation in the application of very-high-speed digital subscriber lines (VDSLs) is studied. We refer to this modulation technique as cosine-modulated multitone (CMT). CMT has the same transmitter structure as discrete wavelet multitone (DWMT). However, the receiver structure in CMT is different from its DWMT counterpart. DWMT uses linear combiner equalizers, which typically have more than 20 taps per subcarrier. CMT, on the other hand, adopts a receiver structure that uses only two taps per subcarrier for equalization. This paper has the following contributions. (i) A modification that reduces the computational complexity of the receiver structure of CMT is proposed. (ii) Although traditionally CMFBs are designed to satisfy perfect-reconstruction (PR) property, in transmultiplexing applications, the presence of channel destroys the PR property of the filter bank, and thus other criteria of filter design should be adopted. We propose one such method. (iii) Through extensive computer simulations, we compare CMT with zipper discrete multitone (z-DMT) and filtered multitone (FMT), the two modulation techniques that have been included in the VDSL draft standard. Comparisons are made in terms of computational complexity, transmission latency, achievable bit rate, and resistance to radio ingress noise.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

In recent years, multicarrier modulation (MCM) has attracted considerable attention as a practical and viable technology for high-speed data transmission over spectrally shaped noisy channels [1–6]. The most popular MCM technique uses the properties of the discrete Fourier transform (DFT) in an elegant way so as to achieve a computationally efficient realization. Cyclic prefix (CP) samples are added to each block of data to resolve and compensate for channel distortion. This modulation technique has been accepted by standardization bodies in both wired (digital subscriber lines, DSL) [7–10] and wireless [11, 12] channels. While the terminology discrete multitone (DMT) is used in the DSL literature to refer to this MCM technique, in wireless applications, the terminology orthogonal frequency-division multiplexing (OFDM) has been adopted. The difference is that in DSL applications, MCM signals are transmitted at baseband, while in wireless applications, MCM signals are upconverted to a radio frequency (RF) band for transmission.

Zipper DMT (z-DMT) is the latest version of DMT that has been proposed as an effective frequency-division duplexing (FDD) method for very-high-speed DSL (VDSL) applications. Two variations of z-DMT have been proposed: (i) synchronous zipper [13, 14] and (ii) asynchronous zipper [15]. The synchronous zipper requires synchronization of all modems sharing the same cable (a bundle of twisted pairs). As this is found too restrictive (many modems have to be synchronized), it has been identified as an infeasible solution. The asynchronous zipper, on the other hand, at the cost of some loss in performance, requires only synchronization of the pairs of modems that communicate with each other. The unsynchronized modems on the same cable then introduce some undesirable crosstalk noise. Since the asynchronous z-DMT is the one that has been adopted in the VDSL draft standard [16], in the rest of this paper all references to z-DMT are with respect to its asynchronous version.

To synchronize a pair of modems in z-DMT, cyclic suffix (CS) samples are used. Moreover, to suppress the sidelobes of DFT filters and thus allow more effective FDD, extensions are made to the CP and CS samples and pulse-shaping filters are applied [15]. All these add to the system overhead, and thus reduce the bandwidth efficiency of z-DMT.

Radio frequency interference (RFI) is a major challenge that any VDSL modem has to deal with. RF signals generated by amateur radios (HAM signals) coincide with the VDSL band [3, 4]. Thus, there is a potential of interference between VDSL and HAM signals. The first solution to separate

HAM and VDSL signals is to prohibit VDSL transmission over the HAM bands. This solution, along with the pulse-shaping method adopted in z-DMT, will solve the problem of egress interference from VDSL signals to HAM signals. However, the poor sidelobe behavior of DFT filters and also the very high level of RFI still result in interference which degrades the performance of z-DMT significantly. RFI cancellers are thus needed to improve the performance of z-DMT. There are a number of methods in the literature that cancel RFI by treating the ingress as a tone with no or very small variation in amplitude over each data block of DMT [17–19]. Such methods have been found to be limited in performance. Another method is to pick up a reference RFI signal from the common-mode component of the twisted-pair signals and use it as input to an adaptive filter for synthesizing and subtracting the RFI from the received signal [20]. This method, which may be implemented in analog or digital form, can suppress RFI by as much as 20 to 25 dB [19]. Our understanding from the limited literature available on RFI cancellation is that a combination of these two methods will result in the best performance in any DMT-based transceiver. Thus, the comparisons given in the later sections of this paper consider such an RFI canceller setup for z-DMT.

Since RFI cancellation is rather difficult to implement, there is a current trend in the industry to adopt filter-bank-based MCM techniques. These can deal with RFI more efficiently, thanks to the much superior stopband suppression behavior of filter banks compared to DFT filters. We note that z-DMT has made an attempt to improve on stopband suppression. However, as we show in Section 6, z-DMT is still much inferior to filter bank solutions.

Filtered multitone (FMT) is a filter bank solution that has been proposed by IBM [21–23] and has been widely studied recently. In order to avoid interference among various subcarriers, FMT adopts a filter bank with very sharp transition bands and allocates sufficient excess bandwidth, typically in the range from 0.05 to 0.125. This introduces significant intersymbol interference (ISI) that is dealt with by using a separate decision feedback equalizer (DFE) for each subcarrier [23]. Such DFEs are computationally very costly as they require a relatively large number of feedforward and feedback taps. Nevertheless, the advantages offered by this solution, especially with respect to suppression of ingress RFI, have justified its application, and thus FMT has been included as an annex to the VDSL draft standard [16].

Cosine-modulated filter banks (CMFBs) working at maximally decimated rate, on the other hand, are well understood and widely used for signal compression [24]. Moreover, the use of filter banks for realization of transmultiplexer systems [24] as well as their application to MCM [25] have been recognized by many researchers. In particular, the use of CMFB for multicarrier data transmission in DSL channels has been widely addressed in the literature, under the common terminology of discrete wavelet multitone (DWMT), for example, see [25–32]. In DWMT, it is proposed that channel equalization in each subcarrier be performed by combining the signals from the desired band and its adjacent bands. These equalizers, which have been referred to as postcombiner equalizers, impose significant load on the computational complexity of the receiver. This complexity and the lack of an in-depth theoretical understanding of DWMT have kept industry lukewarm about it in the past.

A revisit to CMFB-MCM/DWMT has been made recently [33–36]. In the first work, [33], an in-depth study of DWMT has been performed, assuming that the channel could be approximated by a complex constant gain over each subcarrier band. This study, which is also intuitively sound, revealed that the coefficients of each postcombiner equalizer are closely related to the underlying prototype filter of the filter bank. Furthermore, there are only two parameters per subcarrier that need to be adapted; namely, the real and imaginary parts of the inverse of channel gain. In a further study [34, 35], it was noted that by properly restructuring the receiver, each postcombiner equalizer could be replaced by a two-tap filter. It was also shown that there is no need for cross-filters (as used in the postcombiner equalizers in DWMT), thanks to the (near-) perfect reconstruction property of CMFB. Moreover, a constant-modulus blind equalization algorithm (CMA) was developed [34, 35]. In [36], a receiver structure that combines signals from a CMFB and a sine-modulated filter bank (SMFB) is also proposed to avoid cross-filters. This structure, which is fundamentally similar to the one in [34, 35], approaches the receiver design from a slightly different angle. The complexity of the CMFB/SMFB receiver is discussed in [37], where an efficient structure is proposed. In a further development [38], it is noted that CMFB/SMFB can be configured for transmission of complex modulated signals, such as quadrature amplitude modulated (QAM) signals. This is useful for data transmission over RF channels, but is not relevant to xDSL channels which are fundamentally baseband.

In this paper, we extend the application of CMFB-MCM to VDSL channels. The following contributions are made. The receiver structure proposed in [34, 35] is modified in order to minimize its computational complexity. Moreover, we discuss the problem of prototype filter design in transmultiplexer systems. We note that the traditional perfect-reconstruction (PR) designs are not appropriate in this application, and thus develop a near-PR (NPR) design strategy. There are some similarities between our design strategy and that of [39] where prototype filters are designed for FMT. We contrast the CMFB-MCM against z-DMT and FMT and make an attempt to highlight the relative advantages that each of these three methods offers. In order to distinguish between the proposed method and DWMT, we refer to it as cosine-modulated multitone (CMT). We believe the term “cosine-modulated filter bank” (and thus CMT) is more reflective of the nature of this modulation technique than the term “wavelet.” The term wavelet is commonly used in conjunction with filter banks in which the bandwidth of each subband varies in proportion to its center frequency. In CMFB, all subbands have the same bandwidth. Moreover, the modulator and demodulator blocks that we use are directly developed from a pair of synthesis and analysis CMFB, respectively. We should also acknowledge that there have been some attempts to develop communication systems that use

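[Figure 1 block diagram. Transmitter (synthesis filter bank): each symbol stream $s_k(n)$, $k = 0, 1, \ldots, M-1$, is expanded by $M$ and filtered by $F_k^s(z)$. The summed signal passes through the channel $H(z)$ with additive noise $\nu(n)$ and the delay $z^{-\delta}$. Receiver (analysis filter bank): each filter $F_k^a(z)$ is followed by an $M$-fold decimator to produce $\hat{s}_k(n)$.]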

Figure 1: Block schematic of a CMFB-based transmultiplexer.

wavelets with variable bandwidths, for example, see [40] and the references therein.

An important class of filter-bank-based transmultiplexer systems that avoid ISI and ICI completely has been studied recently, for example, [41, 42]. Similar to DMT, where cyclic prefix samples are used to avoid ISI and ICI, here also redundant samples are added (e.g., through precoding) for the same purpose. Such systems, thus, similar to DMT and FMT, suffer from bandwidth loss/inefficiency. Moreover, since the designed filter banks, in general, are not based on a prototype filter, they cannot be realized in any simple manner, for example, in a polyphase DFT structure. Hence, they do not seem attractive for applications such as DSL where filter banks with a large number of subbands have to be adopted.

The rest of this paper is organized as follows. We present an overview of CMFB-MCM/CMT in Section 2. In Section 3, we propose a novel structure of the CMT receiver which reduces its complexity significantly compared to the previous reports [34, 35]. In Section 4, we develop an NPR prototype filter design scheme. Computational complexities and latency issues are discussed and comparisons with z-DMT and FMT are made in Section 5. This is followed by a presentation of a wide range of computer simulations in Section 6, where we compare z-DMT, FMT, and CMT under different practical conditions. The concluding remarks are made in Section 7.

2. COSINE-MODULATED MULTITONE

Figure 1 presents the block diagram of a CMFB-based transmultiplexer system. At the transmitter, the data symbol streams $s_k(n)$ are first expanded to a higher rate by inserting $M-1$ zeros after each sample. Modulation and multiplexing of the data streams are then done using a synthesis filter bank. At the receiver, an analysis filter bank followed by a set of decimators is used to demodulate and extract the transmitted symbols. The delay $\delta$ at the receiver input is required to adjust the total delay introduced by the system to an integral multiple of $M$. When $\delta$ is selected correctly, the channel noise $\nu(n)$ is zero, and the channel is perfect, that is, $H(z) = 1$, a well-designed transmultiplexer delivers a delayed replica of the data symbols $s_k(n)$ at its outputs, that is, $\hat{s}_k(n) = s_k(n - \Delta)$, where $\Delta$ is an integer. However, due to the channel distortion, the recovered symbols suffer from intersymbol interference (ISI) and intercarrier interference (ICI). Equalizers are thus used to combat the channel distortion. As noted above, postcombiner equalizers that span across the adjacent subbands and along the time axis were originally proposed for this purpose [25]. Such equalizers are rather complex; typically, 20 or more taps per subcarrier are used. A recent development [34, 35] has shown that with a modified analysis filter bank, each subcarrier can be equalized by using only two taps. In the rest of this section, we present a review of this modified CMFB-based transmultiplexer and explain how such simple equalization can be established. As noted above, we call this new scheme CMT.

In CMT, the transmitter follows the conventional implementation of synthesis CMFB [24]. For the receiver, we resort to a nonsimplified structure of the analysis CMFB. Figure 2 presents a block diagram of this nonsimplified structure for an $M$-band analysis CMFB; see [24] for development of this structure. $G_k(z)$, $0 \le k \le 2M-1$, are the polyphase components of the filter bank prototype filter $P(z)$, namely,

$$P(z) = \sum_{k=0}^{2M-1} z^{-k} G_k\left(z^{2M}\right). \tag{1}$$

The coefficients $d_0, d_1, \ldots, d_{2M-1}$ are chosen in order to equalize the group delay of the filter bank subchannels. This gives $d_k = e^{j\theta_k} W_{2M}^{(k+0.5)N/2}$ for $k = 0, 1, \ldots, M-1$, and $d_k = d^*_{2M-1-k}$ for $k = M, M+1, \ldots, 2M-1$, where $\theta_k = (-1)^k(\pi/4)$, $W_{2M} = e^{-j2\pi/2M}$, $*$ denotes conjugate, and $N$ is the order of $P(z)$.

Let $Q_0^a(z), Q_1^a(z), \ldots, Q_{2M-1}^a(z)$ denote the transfer functions between the input $x(n)$ and the analyzed outputs $u_0^o(n), u_1^o(n), \ldots, u_{2M-1}^o(n)$, respectively. We recall from the theory of CMFB that $Q_k^a(z) = d_k P_0\left(zW_{2M}^{k+0.5}\right)$ for $k = 0, 1, \ldots, 2M-1$; see [24]. The CMFB analysis filters are generated by adding the pairs $Q_k^a(z)$ and $Q_{2M-1-k}^a(z)$, for $k = 0, 1, \ldots, M-1$. This leads to $M$ analysis filters [24]

$$F_k^a(z) = Q_k^a(z) + Q_{2M-1-k}^a(z), \quad k = 0, 1, \ldots, M-1. \tag{2}$$
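As a rough numerical illustration of (2), the following Python sketch builds the complex filters $Q_k^a(z)$ from a toy prototype and checks that each pair sum $Q_k^a(z) + Q_{2M-1-k}^a(z)$ has real coefficients of the cosine-modulated form $2p(n)\cos((\pi/M)(k+0.5)(n-N/2) + (-1)^k\pi/4)$. The prototype values, the sizes $M = 4$ and $m = 2$, and all function names are assumptions made only for this illustration; they are not the design used in this paper.

```python
import cmath
import math


def w2m_power(M, x):
    """Return W_{2M}^x, where W_{2M} = exp(-j*2*pi/(2M))."""
    return cmath.exp(-1j * math.pi * x / M)


def complex_analysis_filters(p, M):
    """Coefficients of Q_k^a(z) = d_k P(z W_{2M}^{k+0.5}) for k = 0..2M-1."""
    N = len(p) - 1  # order of the prototype filter
    d = [0j] * (2 * M)
    for k in range(M):
        theta_k = (-1) ** k * math.pi / 4
        d[k] = cmath.exp(1j * theta_k) * w2m_power(M, (k + 0.5) * N / 2)
    for k in range(M, 2 * M):
        d[k] = d[2 * M - 1 - k].conjugate()
    # Replacing z by z*W^{k+0.5} multiplies the nth coefficient by W^{-(k+0.5)n}.
    return [[d[k] * p[n] * w2m_power(M, -(k + 0.5) * n) for n in range(N + 1)]
            for k in range(2 * M)]


def cmfb_analysis_filters(p, M):
    """Coefficients of F_k^a(z) = Q_k^a(z) + Q_{2M-1-k}^a(z) for k = 0..M-1."""
    Q = complex_analysis_filters(p, M)
    return [[Q[k][n] + Q[2 * M - 1 - k][n] for n in range(len(p))]
            for k in range(M)]


# Toy symmetric prototype of length 2mM (hypothetical values, not an optimized design).
M, m = 4, 2
length = 2 * m * M
p = [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]
F = cmfb_analysis_filters(p, M)
```

The imaginary parts of the entries of `F` vanish up to rounding error, which is why the conventional analysis filters $F_k^a(z)$ produce real-valued outputs.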

[Figure 2 block diagram. Elements shown: input $x(n)$; a delay chain with elements $z^{-1}W_{2M}^{-1/2}$; polyphase filters $G_0(-z^{2M}), G_1(-z^{2M}), \ldots, G_{2M-1}(-z^{2M})$; a $2M$-point IDFT; coefficients $d_0, d_1, \ldots, d_{2M-1}$; outputs $u_k^o(n)$; $M$-fold decimators; decimated outputs $u_k(n)$.]

Figure 2: The analysis CMFB structure that is proposed for CMT.
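The polyphase components that appear in Figure 2 follow from (1): $G_k(z)$ collects every $2M$th coefficient of $P(z)$, starting at index $k$. The short Python sketch below, with made-up coefficients and hypothetical helper names, evaluates $P(z)$ directly and through its $2M$ polyphase components, so that the two results can be compared at a test point on the unit circle.

```python
import cmath
import math


def polyphase_components(p, M):
    """Split p(n) into 2M polyphase components g_k(n) = p(2M*n + k)."""
    return [p[k::2 * M] for k in range(2 * M)]


def evaluate(coeffs, z):
    """Evaluate sum_n coeffs[n] * z^(-n)."""
    return sum(c * z ** (-n) for n, c in enumerate(coeffs))


def evaluate_via_polyphase(p, M, z):
    """Evaluate P(z) as sum_k z^(-k) * G_k(z^(2M)), as in (1)."""
    G = polyphase_components(p, M)
    return sum(z ** (-k) * evaluate(G[k], z ** (2 * M)) for k in range(2 * M))


# Toy example: M = 2 subbands, prototype length 2mM = 12 (m = 3); values are arbitrary.
M = 2
p = [0.1 * (n + 1) for n in range(12)]
z = cmath.exp(1j * 0.3 * math.pi)
direct = evaluate(p, z)
via_polyphase = evaluate_via_polyphase(p, M, z)
```

Both evaluations agree up to rounding error, since the decomposition only regroups the terms of the sum.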

The synthesis filters $F_k^s(z)$ are given as [24]

$$F_k^s(z) = Q_k^s(z) + Q_{2M-1-k}^s(z), \quad k = 0, 1, \ldots, M-1, \tag{3}$$

where $Q_k^s(z) = z^{-N} Q_{k,*}^a(z^{-1})$ and the subscript $*$ means conjugating the coefficients.

In a CMT transceiver, the synthesis filters $F_k^s(z)$ are used at the transmitter. However, at the receiver, we resort to using the complex coefficient analysis filters $Q_k^a(z)$. In the absence of the channel, and assuming that a pair of synthesis and analysis CMFB with PR are used, we get [24]

$$u_k(n) = \frac{1}{2}\left[s_k(n - \Delta) + j r_k(n)\right], \tag{4}$$

where $r_k(n)$ arises because of ISI from the $k$th subchannel and ICI from other subchannels. The PR property of CMFB allows us to remove the ISI-plus-ICI term $r_k(n)$ and extract the desired symbol $s_k(n - \Delta)$ simply by taking twice the real part of $u_k(n)$. This, of course, is in the absence of the channel. The presence of the channel affects $u_k(n)$, and $s_k(n - \Delta)$ can no longer be extracted by the above procedure.

In order to include the effect of the channel, we make the simplifying, but reasonable, assumption that the number of subbands is sufficiently large such that the channel frequency response $H(z)$ over the $k$th subchannel can be approximated by a complex constant gain $h_k$. Moreover, we assume that the variation of the channel group delay over the band of transmission is negligible. Then, in the presence of the channel, we obtain

$$u_k(n) \approx \frac{1}{2}\left[s_k(n - \Delta) + j r_k(n)\right] \times h_k + \nu_k(n), \tag{5}$$

where $\nu_k(n)$ is the channel additive noise after filtering. The numerical results presented in Section 6 show that for a reasonably large value of $M$, the assumption of flat channel gain over each subcarrier is very reasonable. However, for channels with bridged taps, the group delay variation may not be insignificant. Nevertheless, the incurred performance loss, found through simulation, is tolerable. Clearly, the latter loss could be compensated by adjusting the delay in each subcarrier channel separately. But this would be at the cost of a significant increase in the receiver complexity, which may not be justifiable for such a minor improvement.

Considering (5), an estimate of $s_k(n)$ can be obtained as follows:

$$\hat{s}_k(n) = \Re\left\{w_k^* u_k(n)\right\} = w_{k,R}\, u_{k,R}(n) + w_{k,I}\, u_{k,I}(n), \tag{6}$$

where the subscripts $R$ and $I$ denote the real and imaginary parts of the respective variables. Equation (6) shows that the distorted received signal $u_k(n)$ can be equalized by using a complex tap weight $w_k^*$ or, equivalently, by using two real tap weights $w_{k,R}$ and $w_{k,I}$. If we define the optimum value of $w_k^*$, $w_{k,\mathrm{opt}}^*$, as the one that maximizes the signal-to-noise-plus-interference ratio at the equalizer output, we find that

$$w_{k,\mathrm{opt}}^* = \frac{2}{h_k}. \tag{7}$$

At this point, we make some comments about DWMT and clarify the difference between the proposed receiver and that of DWMT [25]. In DWMT, the analyzed subcarrier signals that are passed to the postcombiner equalizers are the outputs of the $F_k^a(z)$ filters, that is, $2\Re\{u_k(n)\}$. Since these outputs are real-valued, they lack the channel phase information and, hence, a transversal equalizer with input $2\Re\{u_k(n)\}$ will fall short in removing ISI and ICI. To compensate for the loss of phase information, in DWMT, it was proposed that samples of signals from the $k$th subcarrier channel and its adjacent subcarrier channels be combined together for equalization. A theoretical explanation of why this method works can be found in [33]. Hence, the main difference between DWMT and CMT is their respective receiver structures. DWMT uses $F_k^a(z)$ as analysis filters. CMT, on the other hand, uses the analysis filters $Q_k^a(z)$. This (minor) change in the receiver allows CMT to adopt simple equalizers with only two real-valued tap weights per subcarrier band, while DWMT needs equalizers that are an order of magnitude higher in complexity.

3. EFFICIENT REALIZATION OF ANALYSIS CMFB

Efficient implementation of synthesis CMFB using the discrete cosine transform (DCT) can be found in [24]. This will be used at the transmitter side of a CMT transceiver. At the

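[Figure 3 block diagram. Elements shown: a delay chain of $z^{-1}$ elements; $M$-fold decimators; polyphase component filters $G_l(-z^2) + jz^{-1}G_{l+M}(-z^2)$, $l = 0, 1, \ldots, M-1$; scaling factors $W_{2M}^{-l/2}$; an $M$-point IDFT; coefficients $d_0, d_1, \ldots, d_{M-1}$; the block C; outputs $u_0(n), u_1(n), \ldots, u_{M-1}(n)$.]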

Figure 3: Efficient implementation of the analysis CMFB.

receiver, as discussed above, we use a modified structure of analysis CMFB. Thus, efficient implementations that are available for the conventional analysis CMFB, for example, [24], are of no use here. We develop a computationally efficient realization of the analysis CMFB by modifying the structure of Figure 2.

At the receiver, we need to implement the filters $Q_0^a(z), Q_1^a(z), \ldots, Q_{M-1}^a(z)$. Recalling that $Q_{2M-1-k}^a(e^{-j\omega}) = [Q_k^a(e^{j\omega})]^*$ and $x(n)$ is real-valued, we argue that these filters can equivalently be implemented by realizing $Q_k^a(z)$ for $k = 0, 2, 4, \ldots, 2M-2$, that is, for even values of $k$ only; $Q_1^a(z)$, for instance, is realized by taking the conjugate of the output of $Q_{2M-2}^a(z)$. We thus note from Figure 2 that

$$\begin{aligned} Q_{2k}^a(z) &= d_{2k} \sum_{l=0}^{2M-1} \left(z^{-1} W_{2M}^{-1/2}\right)^l G_l\left(-z^{2M}\right) W_{2M}^{-2kl} \\ &= d_{2k} \sum_{l=0}^{M-1} z^{-l} \left[G_l\left(-z^{2M}\right) + j z^{-M} G_{l+M}\left(-z^{2M}\right)\right] W_{2M}^{-l/2}\, W_M^{-kl}. \end{aligned} \tag{8}$$

Using (8) to modify Figure 2 and using the noble identities [24] to move the decimators to the position before the polyphase component filters, we obtain the efficient implementation of Figure 3. This implementation has a computational complexity that is approximately one half of that of the original structure in Figure 2, assuming that the decimators in the latter are also moved to the position before the polyphase component filters; here, the $2M$-point IDFT in Figure 2 is replaced by an $M$-point IDFT. The block C reorders and conjugates the output samples, wherever needed.

The realization of Figure 3 involves implementation of $M$ polyphase component filters $G_l(-z^2) + jz^{-1}G_{l+M}(-z^2)$, $M$ complex scaling factors $W_{2M}^{-l/2}$, an $M$-point IDFT, and the group delay compensatory coefficients $d_l$. The latter coefficients may be deleted as they can be lumped together with the equalizer coefficients $w_k^*$.

The structure of Figure 3 should be compared with the analysis CMFB/SMFB structure of [37]. On the basis of the operation count (the number of multiplications and additions per unit of time), the two structures are similar. However, they are different in their structural details. While Figure 3 uses an $M$-point IDFT with complex-valued inputs, the CMFB/SMFB structure uses two separate transforms (a DCT and a DST) with real-valued inputs. Therefore, the preference of one over the other depends on the available hardware or software platform on which the system is to be implemented.

4. PROTOTYPE FILTER DESIGN

Prototype filter design is an important issue in CMT modulation. In CMFB, conventionally, the prototype filter is designed to satisfy the PR property. However, in the application of interest to this paper, the presence of the channel results in a loss of the PR property. In this section, we take note of this fact and propose a prototype filter design scheme which, instead of designing for PR, aims at minimizing the ISI plus ICI and maximizing the stopband attenuation. We thus adopt an NPR design. For this purpose, we develop a cost function in which a balance between the ISI plus ICI and the stopband attenuation is struck through a design parameter. A similar approach was adopted in [39] for designing the prototype filter in FMT.

4.1. ISI and ICI

Referring to Figures 1 and 2, and assuming that only adjacent subchannels overlap, in the absence of channel noise, we obtain

$$U_k^o(z) = z^{-\delta}\left[S_k\left(z^M\right) F_k^s(z) + S_{k-1}\left(z^M\right) F_{k-1}^s(z) + S_{k+1}\left(z^M\right) F_{k+1}^s(z)\right] H(z)\, Q_k^a(z), \tag{9}$$

where $S_k(z)$ is the z-transform of $s_k(n)$ and the z-transforms of other sequences are defined similarly. Substituting (3) in (9) and noting that for $k \neq 0$ and $M-1$, $Q_k^a(z)$ has no (significant) overlap with $Q_{2M-k}^s(z)$, $Q_{2M-1-k}^s(z)$, and

$Q_{2M-2-k}^s(z)$, we obtain, for $k \neq 0$ and $M-1$,$^1$

$$U_k^o(z) = z^{-\delta}\left[S_k\left(z^M\right) Q_k^s(z) + S_{k-1}\left(z^M\right) Q_{k-1}^s(z) + S_{k+1}\left(z^M\right) Q_{k+1}^s(z)\right] H(z)\, Q_k^a(z). \tag{10}$$

We use the notation $[\cdot]_{\downarrow M}$ to denote the $M$-fold decimation. Recalling that $[U_k^o(z)]_{\downarrow M} = U_k(z)$ and that, for arbitrary functions $X(z)$ and $Y(z)$, $[X(z^M)Y(z)]_{\downarrow M} = X(z)[Y(z)]_{\downarrow M}$, from (10), we obtain

$$\begin{aligned} U_k(z) ={}& S_k(z)\left[z^{-\delta} Q_k^s(z) H(z) Q_k^a(z)\right]_{\downarrow M} + S_{k-1}(z)\left[z^{-\delta} Q_{k-1}^s(z) H(z) Q_k^a(z)\right]_{\downarrow M} \\ &+ S_{k+1}(z)\left[z^{-\delta} Q_{k+1}^s(z) H(z) Q_k^a(z)\right]_{\downarrow M}. \end{aligned} \tag{11}$$

Using (7), we get the estimate of $S_k(z)$ (the equalized signal) as

$$\hat{S}_k(z) = \Re\left\{\frac{2}{h_k} U_k(z)\right\} = S_k(z)A_k(z) + S_{k-1}(z)B_k(z) + S_{k+1}(z)C_k(z), \tag{12}$$

where

$$\begin{aligned} A_k(z) &= \Re\left\{\frac{2}{h_k}\left[z^{-\delta} Q_k^s(z) H(z) Q_k^a(z)\right]_{\downarrow M}\right\}, \\ B_k(z) &= \Re\left\{\frac{2}{h_k}\left[z^{-\delta} Q_{k-1}^s(z) H(z) Q_k^a(z)\right]_{\downarrow M}\right\}, \\ C_k(z) &= \Re\left\{\frac{2}{h_k}\left[z^{-\delta} Q_{k+1}^s(z) H(z) Q_k^a(z)\right]_{\downarrow M}\right\}, \end{aligned} \tag{13}$$

and $\Re\{\cdot\}$, when applied to a transfer function, means forming a transfer function by taking the real parts of the coefficients of the argument. When applied to a complex number or vector, $\Re\{\cdot\}$ denotes “the real part of.”

If the prototype filter was designed to satisfy the PR condition, in the absence of the channel, we would have $A_k(z) = z^{-\Delta}$, $B_k(z) = 0$, and $C_k(z) = 0$. In the presence of the channel, these properties are lost and accordingly the ISI and ICI powers at the $k$th subchannel are expressed, respectively, as

$$\zeta_{k,\mathrm{ISI}} = (\mathbf{a}_k - \mathbf{u})^T(\mathbf{a}_k - \mathbf{u}), \tag{14}$$

$$\zeta_{k,\mathrm{ICI}} = \mathbf{b}_k^T \mathbf{b}_k + \mathbf{c}_k^T \mathbf{c}_k, \tag{15}$$

where $\mathbf{a}_k$, $\mathbf{b}_k$, and $\mathbf{c}_k$ are the column vectors of the coefficients of $A_k(z)$, $B_k(z)$, and $C_k(z)$, respectively, and $\mathbf{u}$ is a column vector whose $\Delta$th element is 1 and whose other elements are 0.

The above results were given for the case when only the adjacent bands overlap. When each subcarrier band overlaps with more than two of its neighbor subcarrier bands, the above results may be easily extended by defining more polynomials like $B_k(z)$ and $C_k(z)$, and accordingly adding more terms to (15).

4.2. The cost function

The cost function that we minimize for designing the prototype filter is defined as

$$\zeta = \zeta_s + \gamma\left(\zeta_{\mathrm{ISI}} + \zeta_{\mathrm{ICI}}\right), \tag{16}$$

where $\zeta_s$ is the stopband energy of the prototype filter, defined below, and $\gamma$ is a positive parameter which should be selected to strike a balance between the stopband energy and ISI plus ICI. A larger $\gamma$ leads to a smaller ISI plus ICI. Here and in the remaining discussions, for convenience, we drop the subcarrier band index $k$ of $\zeta_{k,\mathrm{ISI}}$ and $\zeta_{k,\mathrm{ICI}}$.

Selecting the frequency grid $\{\omega_0, \omega_1, \ldots, \omega_{L-1}\}$ in the interval $[\omega_s, \pi]$, where $\omega_s$ is the stopband edge of the prototype filter, we define

$$\zeta_s = \frac{1}{L} \sum_{l=0}^{L-1} \left|P\left(e^{j\omega_l}\right)\right|^2. \tag{17}$$

We also assume that the prototype filter $P(z)$ has a length of $2mM$. This choice of the length follows that of the PR CMFB [24], and is believed to be appropriate since here we design a filter bank with the NPR property. Moreover, we follow the PR CMFB convention and design a linear-phase prototype filter. This implies that

$$P\left(e^{j\omega_l}\right) = e^{-j\omega_l(mM-0.5)} \sum_{n=0}^{mM-1} 2p(mM+n)\cos\left(\omega_l(n+0.5)\right), \tag{18}$$

where $p(n)$ is the $n$th coefficient of $P(z)$. Rearranging (18), we obtain

$$\mathbf{C}\mathbf{p} = \begin{bmatrix} e^{j\omega_0(mM-0.5)} P\left(e^{j\omega_0}\right) \\ e^{j\omega_1(mM-0.5)} P\left(e^{j\omega_1}\right) \\ \vdots \\ e^{j\omega_{L-1}(mM-0.5)} P\left(e^{j\omega_{L-1}}\right) \end{bmatrix}, \tag{19}$$

where $\mathbf{C}$ is an $L \times mM$ matrix whose $ij$th element is $c_{i,j} = 2\cos(\omega_{i-1}(j-0.5))$ and $\mathbf{p} = [p(mM)\ p(mM+1)\ \cdots\ p(2mM-1)]^T$. Using (19), (17) may be rearranged as

$$\zeta_s = \frac{1}{L}\, \mathbf{p}^T \mathbf{C}^T \mathbf{C} \mathbf{p}. \tag{20}$$

To calculate $\zeta_{\mathrm{ISI}}$ and $\zeta_{\mathrm{ICI}}$, we note that since $Q_k^s(z)Q_k^a(z)$, $Q_{k-1}^s(z)Q_k^a(z)$, and $Q_{k+1}^s(z)Q_k^a(z)$ are narrowband filters centered around the $k$th subcarrier band, and over this band $H(z)$ may be approximated by the constant gain $h_k$, from (13), we obtain

$$\mathbf{a}_k = 2\Re\left\{\left[\mathbf{q}_k^s \star \mathbf{q}_k^a\right]_{\downarrow M}\right\}, \tag{21}$$

$$\mathbf{b}_k = 2\Re\left\{\left[\mathbf{q}_{k-1}^s \star \mathbf{q}_k^a\right]_{\downarrow M}\right\}, \tag{22}$$

$$\mathbf{c}_k = 2\Re\left\{\left[\mathbf{q}_{k+1}^s \star \mathbf{q}_k^a\right]_{\downarrow M}\right\}, \tag{23}$$

where $\star$ stands for convolution and $\mathbf{q}_k^s$ and $\mathbf{q}_k^a$ are the column vectors of coefficients of $z^{-\delta}Q_k^s(z)$ and $Q_k^a(z)$, respectively.

$^1$ In DSL applications, the subchannels near the origin ($k = 0$) and $\pi$ ($k = M-1$) do not carry any data [25].
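The stopband term of the cost function can be checked numerically. The Python sketch below evaluates $\zeta_s$ in two ways: directly from (17), using all $2mM$ coefficients of a symmetric prototype, and through the quadratic form (20), using only the second half $\mathbf{p}$ of the coefficients. The toy sizes, the uniform frequency grid, the coefficient values, and the function names are assumptions chosen for illustration only; the stopband edge $\omega_s = 1.2\pi/M$ is the value used for the simulated prototype.

```python
import cmath
import math


def stopband_energy_direct(p_full, omegas):
    """zeta_s from (17): average of |P(e^{j w_l})|^2 over the stopband grid."""
    total = 0.0
    for w in omegas:
        P = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(p_full))
        total += abs(P) ** 2
    return total / len(omegas)


def stopband_energy_quadratic(p_half, omegas):
    """zeta_s from (20): (1/L) p^T C^T C p with c_ij = 2 cos(w_{i-1} (j - 0.5))."""
    total = 0.0
    for w in omegas:
        # One row of C times p; enumerate is 0-based, so (j - 0.5) becomes (j + 0.5).
        row_dot_p = sum(2 * math.cos(w * (j + 0.5)) * c for j, c in enumerate(p_half))
        total += row_dot_p ** 2
    return total / len(omegas)


# Toy sizes and coefficients (hypothetical, not an optimized design).
M, m, L = 4, 2, 16
p_half = [math.cos(math.pi * (n + 0.5) / (2 * m * M)) for n in range(m * M)]
p_full = p_half[::-1] + p_half  # linear phase: p(n) = p(2mM - 1 - n)
omega_s = 1.2 * math.pi / M
omegas = [omega_s + (math.pi - omega_s) * l / (L - 1) for l in range(L)]
zeta_direct = stopband_energy_direct(p_full, omegas)
zeta_quadratic = stopband_energy_quadratic(p_half, omegas)
```

The two values coincide up to rounding error. The quadratic form is the convenient one for the design procedure because it is expressed directly in the $mM$ free coefficients.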

Equation (21) may be expressed in a matrix form as

$$\mathbf{a}_k = 2\Re\left\{\mathbf{Q}\mathbf{q}_k^a\right\}, \tag{24}$$

where the matrix $\mathbf{Q}$ is obtained by arranging $\mathbf{q}_k^s$ and its shifted copies in a matrix $\mathbf{Q}^o$ and decimating $\mathbf{Q}^o$ by $M$ in each of the columns. Noting that $q_k^a(n) = p(n)e^{j((\pi/M)(k+0.5)(n-N/2)+(-1)^k(\pi/4))}$, $p(n) = p(2mM-n-1)$, and defining $\mathbf{D}$ as a diagonal matrix with the $n$th diagonal element $d_{n,n} = e^{j((\pi/M)(k+0.5)(n-N/2)+(-1)^k(\pi/4))}$, (24) may be written as

$$\mathbf{a}_k = 2\Re\{\mathbf{Q}\mathbf{D}\} \begin{bmatrix} \mathbf{p}_r \\ \mathbf{p} \end{bmatrix}, \tag{25}$$

where $\mathbf{p}_r$ is obtained by reversing the order of the elements of $\mathbf{p}$. In matrix/vector notation, $\mathbf{p}_r = \mathbf{J}\mathbf{p}$, where $\mathbf{J}$ is the antidiagonal matrix with antidiagonal elements of 1. Using this in (25), we obtain

$$\mathbf{a}_k = \mathbf{E}\mathbf{p}, \tag{26}$$

where $\mathbf{E} = 2\Re\{\mathbf{Q}\mathbf{D}\}\begin{bmatrix} \mathbf{J} \\ \mathbf{I} \end{bmatrix}$ and $\mathbf{I}$ is the identity matrix. Substituting (26) in (14), we obtain

$$\zeta_{\mathrm{ISI}} = (\mathbf{E}\mathbf{p} - \mathbf{u})^T(\mathbf{E}\mathbf{p} - \mathbf{u}). \tag{27}$$

Following similar steps, we obtain

$$\zeta_{\mathrm{ICI}} = \mathbf{p}^T \mathbf{F}^T \mathbf{F} \mathbf{p}, \tag{28}$$

where the matrix $\mathbf{F}$ is constructed in the same way as $\mathbf{E}$, by replacing $\mathbf{q}_k^s$ with $\begin{bmatrix} \mathbf{q}_{k-1}^s \\ \mathbf{q}_{k+1}^s \end{bmatrix}$.

Now substituting (20), (27), and (28) in (16), we obtain

$$\zeta = (\mathbf{G}\mathbf{p} - \mathbf{v})^T(\mathbf{G}\mathbf{p} - \mathbf{v}), \tag{29}$$

where $\mathbf{G} = \begin{bmatrix} \mathbf{E} \\ \mathbf{F} \\ (1/\sqrt{\gamma})\,\mathbf{C} \end{bmatrix}$, $\mathbf{v} = \begin{bmatrix} \mathbf{u} \\ \mathbf{0} \end{bmatrix}$, and $\mathbf{0}$ is a zero column vector of proper length.

4.3. Minimization of the cost function

We note that $\mathbf{q}_k^s$, and thus $\mathbf{G}$, depends on $\mathbf{p}$. Hence, the cost function (29) is fourth order in the filter coefficients $p(n)$, and thus its minimization is nontrivial. Rossi et al. [43] proposed an iterative least-squares (ILS) minimization for a similar problem. They formulated the same filter design problem for the case of a PR CMFB. Adopting the method of Rossi et al. [43], we minimize $\zeta$ by using the following procedure.

Step 1. Let $\mathbf{p} = \mathbf{p}_0$; an initial choice.

Step 2. Construct the matrix $\mathbf{G}$ using the current value of $\mathbf{p}$.

Step 3. Form the normal equation $\boldsymbol{\Psi}\mathbf{p} = \boldsymbol{\theta}$, where $\boldsymbol{\Psi} = \mathbf{G}^T\mathbf{G}$ and $\boldsymbol{\theta} = \mathbf{G}^T\mathbf{v}$.

Step 4. Compute $\mathbf{p}_1 = \boldsymbol{\Psi}^{-1}\boldsymbol{\theta}$.

Step 5. $(\mathbf{p}_0 + \mathbf{p}_1)/2 \rightarrow \mathbf{p}_0$ and go back to Step 2.

Steps 2 to 5 are repeated until the design converges.

Numerical examples show that this algorithm can converge to a good design if the initial choice $\mathbf{p} = \mathbf{p}_0$ and the parameter $\gamma$ are selected properly. Compared to other CMFB prototype filter designs, this method is attractive because of its relatively low computational complexity. Other methods, such as those based on the paraunitary property of PR filter banks [24], are too complicated and hard to apply to filter banks with a large number of subbands, which is the case of interest in this paper. Besides, such design methods are not useful here because we are not interested in designing filter banks with the PR property. For these reasons, we found the approach of [43] the most appropriate in this paper, and thus elaborate on it further.

In CMT, we are interested in very long prototype filters whose length exceeds a few thousand. This means that in the normal equation $\boldsymbol{\Psi}\mathbf{p} = \boldsymbol{\theta}$, $\boldsymbol{\Psi}$ is a very large matrix. Hence, Step 4 in the above procedure may be computationally expensive and sensitive to numerical errors. In our experiments, where we designed filters with length of up to 3072 using the Matlab routine of [43], we did not encounter any numerical inaccuracy problem. However, the design times were excessively long. Since we wished to design many prototype filters, we had to find alternative methods that could run faster. Fortunately, we found the Gauss-Seidel method to be a good alternative.

The Gauss-Seidel method is a general mathematical optimization method that is applicable to a variety of optimization problems [44, 45]. It finds the optimum parameters of interest by adopting an iterative approach. A cost function is chosen and it is optimized by successively optimizing one of the cost function parameters at a time, while the other parameters are fixed. A particular version of Gauss-Seidel reported in [46] can be used to minimize the difference $\mathbf{G}\mathbf{p} - \mathbf{v}$ in the least-squares sense without resorting to the normal equation $\boldsymbol{\Psi}\mathbf{p} = \boldsymbol{\theta}$. Moreover, an accelerated step that improves the convergence rate of the Gauss-Seidel method has been proposed in [46]. Through numerical examples, we found that the accelerated Gauss-Seidel method could be used to replace Step 4 in the above procedure, with the advantage of reducing the design time by an order of magnitude or more.

Here, we refer interested readers to [46] for details of the accelerated Gauss-Seidel method. In an appendix at the end of this paper, we have given the script of a Matlab m-file that we have used for the design of the prototype filters. The prototype filter that we have used to generate the simulation results of Section 6 is based on the following parameters: $M = 512$, $m = 3$, $f_s = 1.2/2M$, $\gamma = 100$, and $K = 2$.

5. COMPUTATIONAL COMPLEXITY AND LATENCY

Computational complexity and latency are two issues of concern in any system implementation. In this section, we present a detailed evaluation of computational complexity

and latency of CMT and compare that against z-DMT and FMT.

5.1. Computational complexity

The computational blocks involved in z-DMT and their associated operation counts are summarized in Table 1. The number of operations given for each block is based on some of the best available algorithms. In particular, we have considered using the split-radix FFT algorithm [47] for implementation of the modulator and demodulator blocks. We have counted each complex multiplication as three real multiplications and three real additions [47]. The variable $M$, here, indicates the number of subcarriers in z-DMT. The FEQs are single-tap complex equalizers used to equalize the demodulated data symbols. We have not accounted for possible adaptation of the equalizers. The RFI cancellation is not accounted for, as it varies with the number of interferers. For instance, when there is no RFI, the computational load introduced by the canceller is limited to channel sounding for detection of RFI and this can be negligible. On the other hand, when an RFI is detected, the system may momentarily have to take a relatively large computational load to set up the canceller parameters. Thus, the issue here might be that of a peak computational power load. Since accounting for this can complicate our analysis, we simply ignore the complexity imposed by the RFI canceller and only comment that this can be a burden to a practical z-DMT system.

Table 1: Summary of computational complexity of z-DMT transceiver.

Function              Additions               Multiplications
Modulator (IFFT)      $M(3\log_2 M - 2)$      $M(\log_2 M - 2)$
Demodulator (FFT)     $M(3\log_2 M - 2)$      $M(\log_2 M - 2)$
FEQ                   $3M$                    $3M$

Table 2 lists the computational blocks of a CMT transceiver and the number of operations for each block. Here, the modulator and demodulator are the CMFB synthesis and analysis filter banks, respectively. The operation counts of modulation are based on the efficient implementation of synthesis CMFB with DCT in [24], and the operation counts of demodulation are based on Figure 3. Two-tap equalizers, discussed in Section 2, are used to mitigate ISI and ICI at the demodulator outputs. Here also, we have not accounted for possible adaptation of the equalizers. The $d_k$ coefficients at the output of the analysis CMFB of Figure 3 are not accounted for as they can be combined with the equalizers. The parameters that appear in Table 2 are the number of subcarriers $M$ and the overlapping factor $m$; the length of the prototype filter $P(z)$ is $2mM$.

Table 2: Summary of computational complexity of CMT transceiver.

Function              Additions                    Multiplications
Modulator             $M(1.5\log_2 M + 2m)$        $M(0.5\log_2 M + 2m + 1)$
Demodulator           $M(3\log_2 M + 2m - 2)$      $M(\log_2 M + 2m)$
Equalizer             $M$                          $2M$

Table 3 lists the computational blocks of an FMT transceiver and the number of operations for each block. The operation counts are based on the efficient realization in [23]. Similar to z-DMT and CMT, here also, the adaptation of the equalizer coefficients is not counted. $M$ is the number of subcarrier channels. The prototype filter length is $2mM$. $N_f$ and $N_b$ denote the number of taps in the feedforward and feedback sections of the DFE, respectively.

Table 3: Summary of computational complexity of FMT transceiver.

Function              Additions                    Multiplications
Modulator             $M(3\log_2 M + 2m - 4)$      $M(\log_2 M + 2m - 2)$
Demodulator           $M(3\log_2 M + 2m - 4)$      $M(\log_2 M + 2m - 2)$
Equalizer             $M(5N_f + 5N_b - 2)$         $3M(N_f + N_b)$

Adding up the number of operations given in each of Tables 1, 2, and 3, and normalizing the results by the block length ($2M$ for z-DMT and FMT, and $M$ for CMT), the per-sample complexities of z-DMT, CMT, and FMT are obtained as

$$\begin{aligned} C_{\mathrm{DMT}} &= 4\log_2 M - 1, \\ C_{\mathrm{CMT}} &= 6\log_2 M + 8m + 2, \\ C_{\mathrm{FMT}} &= 4\log_2 M + 4m + 4\left(N_f + N_b\right) - 7. \end{aligned} \tag{30}$$

For all comparisons in this paper, the following parameters are used. For z-DMT, we choose $M = 2048$. This is consistent with the VDSL draft standard [16] and the latest reports on z-DMT [15]. For FMT, we follow [23] and choose $M = 128$, $m = 10$, $N_f = 26$, and $N_b = 9$. For CMT, we experimentally found that $M = 512$ and $m = 3$ are sufficient to get very close to the best results that it can achieve. With these choices, we obtain $C_{\mathrm{DMT}} = 43$, $C_{\mathrm{CMT}} = 80$, and $C_{\mathrm{FMT}} = 201$ operations per sample. It is noted that FMT is significantly more complex than z-DMT and CMT, and the computational complexity of CMT is about 2 times that of z-DMT. However, we should note that the complexity of z-DMT given here does not include the RFI canceller which, as noted above, can momentarily exhibit a significant computational peak load whenever a new RFI is detected.

5.2. Latency

In the context of our discussion in this paper, the latency is defined as the time delay that each coded information symbol will undergo in passing through a transceiver. In z-DMT, the following operations have to be accounted for. A block of data symbols has to be collected in an input buffer before being passed to the modulator. This, which we refer to as buffering delay, introduces a delay equivalent to one block of DMT. While the next block of data symbols is being buffered, the modulator processes the previous block of data. This introduces another block of DMT delay. We refer to this as

[Figure 4 block diagram. Blocks shown: three symbol generators, each followed by a modulator; NEXT coupling; FEXT coupling; background noise; RFI; channel; demodulator; calculate SNR; bit allocation.]

Figure 4: Simulation setup.

processing delay. The buffering and processing delays together account for a delay equivalent to two DMT blocks at the transmitter. Following the same discussion, we find that the receiver also introduces two DMT blocks of delay. Thus, the total latency introduced by the transmitter and receiver in z-DMT (or DMT, in general) is given by

$$\Delta_{\mathrm{DMT}} = 4T_{\mathrm{DMT}}, \tag{31}$$

where $T_{\mathrm{DMT}}$ is the time duration of each z-DMT block. This includes a block of data and the associated cyclic extensions. We also note that the channel introduces some delay. Since this delay is small and common to the three schemes, we ignore it in all the latency calculations. We thus use the following approximation for the purpose of comparisons:

$$\Delta_{\mathrm{DMT}} = 4(2M + \mu_{cp} + \mu_{cs})T_s, \tag{32}$$

where $\mu_{cp}$ and $\mu_{cs}$ are the lengths of the cyclic prefix and cyclic suffix, respectively, and $T_s$ is the sampling interval, which in the case of VDSL is 0.0453 microseconds, corresponding to the sampling frequency of 22.08 MHz.

The latency calculation of CMT is straightforward. The delay introduced by the synthesis and analysis filter banks is determined by their total group delay, which is equal to the length of the prototype filter times the sampling interval $T_s$. This results in a delay of $2mMT_s$. To this we should add the buffering and processing delays. Since each processing step of CMT is performed after collecting a block of $M$ samples, the total buffering plus processing delay in a CMT transceiver is equal to $4MT_s$. The latency of CMT is thus obtained as

$$\Delta_{\mathrm{CMT}} = (2m + 4)MT_s. \tag{33}$$

The latency calculation of FMT is similar to that of CMT. Delays are introduced by the synthesis filter bank, the analysis filter bank, and the DFEs. The delay introduced by the synthesis and analysis filter banks is $2mMT_s$. A total buffering and processing delay of $4MT_s$ should be added to this. The delay introduced by the feedforward section of the DFE is $N_f/2$ samples. Since fractionally spaced DFEs work at the rate decimated by $M$, the introduced delay is $MN_fT_s/2$. The latency of FMT is thus

$$\Delta_{\mathrm{FMT}} = \left(2m + 8 + \frac{N_f}{2}\right)MT_s. \tag{34}$$

As noted in Section 5.1, we choose $M = 2048$ and $\mu_{cp} + \mu_{cs} = 320$ for z-DMT, $M = 512$ and $m = 3$ for CMT, and $M = 128$, $m = 10$, $N_f = 26$, and $N_b = 9$ for FMT. These result in the latency values $\Delta_{\mathrm{DMT}} = 800$ microseconds, $\Delta_{\mathrm{CMT}} = 232$ microseconds, and $\Delta_{\mathrm{FMT}} = 238$ microseconds. We note that the latencies of CMT and FMT are significantly lower than that of z-DMT. This is clearly because of the much smaller block size $M$ used in CMT and FMT.

6. SIMULATION RESULTS AND DISCUSSION

The system model used for simulations is presented in Figure 4. This setup accommodates NEXT (near-end crosstalk) and FEXT (far-end crosstalk) coupling, background noise, and RFI ingress. The setup assumes that the system is in training mode, and thus the transmitted symbols are available at the receiver. Hence, we can measure SNRs at the various subcarrier bands and accordingly find the corresponding bit allocations. The symbol generator output is 4-QAM in the cases of z-DMT and FMT, and antipodal binary for CMT.

To make comparisons with previous works possible, we follow the simulation parameters of [15] as closely as possible. We use a transmission bandwidth of 300 kHz to 11 MHz. The noise sources include a mix of ETSI ‘A’ [48], 25 NEXT, and 25 FEXT disturbers. Transmit band allocation is also performed according to [15].

6.1. System parameters

The number of subcarriers $M$ and the length of the prototype filter $2mM$ are the two most important parameters in CMT.
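Because the latency expressions (32) to (34) depend only on $M$, $m$, $N_f$, and the cyclic extension lengths, the quoted latency values can be reproduced with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are our own (hypothetical), and the sampling interval is taken as the reciprocal of the 22.08 MHz sampling frequency stated in the text.

```python
# Sampling interval in microseconds: 1 / 22.08 MHz, about 0.0453 microseconds.
TS_US = 1.0 / 22.08


def latency_dmt_us(M, mu_cp_plus_cs, ts_us=TS_US):
    """Equation (32): four blocks of 2M samples plus cyclic extensions."""
    return 4 * (2 * M + mu_cp_plus_cs) * ts_us


def latency_cmt_us(M, m, ts_us=TS_US):
    """Equation (33): filter-bank delay 2mM plus 4M of buffering and processing."""
    return (2 * m + 4) * M * ts_us


def latency_fmt_us(M, m, Nf, ts_us=TS_US):
    """Equation (34) as printed: (2m + 8 + Nf/2) * M * Ts."""
    return (2 * m + 8 + Nf / 2) * M * ts_us


if __name__ == "__main__":
    # Parameter values quoted in the text for z-DMT, CMT, and FMT.
    print("z-DMT:", round(latency_dmt_us(2048, 320)), "microseconds")
    print("CMT:  ", round(latency_cmt_us(512, 3)), "microseconds")
    print("FMT:  ", round(latency_fmt_us(128, 10, 26)), "microseconds")
```

Varying `M` and `m` in these functions shows directly how the latency of each scheme scales with the block size.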

Obviously, the system performance improves as one or both of these parameters increase. However, as we may recall from the results of Section 5, both system complexity and latency increase with $M$ and $m$. It is thus desirable to choose $M$ and $m$ to strike a balance between system performance and complexity. Moreover, for a given pair of $M$ and $m$, the system performance is affected by the choice of the CMFB prototype filter. An important parameter that affects the performance of CMT is the stopband edge of the prototype filter, $\omega_s$. The optimum value of $\omega_s$ is hard to find. On one hand, a small $\omega_s$ is desirable as it limits the bandwidth of each subcarrier and makes the assumption of constant channel gain over each subband more accurate. On the other hand, a larger $\omega_s$ improves the stopband attenuation of the prototype filter, and this in turn reduces the ICI and noise interference from the nonadjacent subbands. Moreover, a large value of $\omega_s$ increases RF ingress noise and the NEXT near the frequency band edges. Unfortunately, because of the complexity of the problem and the variety of parameters that affect the system performance, a good compromise choice of $M$, $m$, and $\omega_s$ could only be obtained through extensive numerical tests over a wide variety of channel setups. The details of such results will be reported in [49]. Here, we summarize our observations. The choice of $M = 512$ was generally found sufficient to satisfy the approximation “constant channel gain over each subband.” With $M = 512$, the choices $m = 3$ (thus, a prototype filter length of 3072) and $\omega_s = 1.2\pi/M$ result in a system which behaves very close to the optimum performance, where the optimum performance is that of an ideal system with nonoverlapping subcarrier bands; see Figure 5. In our study, we also explored the choices $m = 2$ and $m = 1$. The results were, as expected, not as good as those for $m = 3$; for most cases, however, they were still superior to z-DMT and FMT. Here, because of space limitations, we only present results that compare CMT with z-DMT and FMT for the CMT parameters $M = 512$, $m = 3$, and $\omega_s = 1.2\pi/M$. Details of other cases will be reported in [49].

For z-DMT, the number of subcarriers is set equal to 2048, following the VDSL draft standard [16]. As in [15], we have selected the length of the CP equal to 100 and determined the length of the CS according to the channel group delay; the lengths of the pulse-shaping and windowing samples are set equal to 140 and 70, respectively.

Following the parameters of [23], we use an FMT system with $M = 128$ subchannels and a prototype filter of length $2mM$, with $m = 10$. The excess bandwidth $\alpha$ is set equal to 0.125. Per-subcarrier equalization is performed by employing a Tomlinson-Harashima precoder with $N_b = 9$ taps and a $T/2$-spaced linear equalizer with $N_f = 26$ taps.

[Figure 5 plot: bit rate (Mbps) versus length of TP1 (m); curves: upper bound, z-DMT, CMT proposed design, FMT, CMT PR design.]

Figure 5: Comparison of bit rates of z-DMT, CMT, and FMT on TP1 lines of different lengths.

6.2. Crosstalk dominated channels

The DSL environment is crosstalk dominated due to the bundling of wire pairs in binder cables. Here, we consider the performance of z-DMT, CMT, and FMT when both NEXT and FEXT are present. Since the three modulation schemes are frequency-division duplexed (FDD) systems, NEXT is significant only near the frequency band edges where there is a change in transmit direction. FEXT, on the other hand, affects the entire transmit band.

In our simulations, NEXT and FEXT are generated according to the coupling equations provided in [16] for a 50-pair binder cable as

$$\mathrm{PSD}_{\mathrm{NEXT}} = K_{\mathrm{NEXT}} S_d(f) \left(\frac{N_d}{49}\right)^{0.6} f^{1.5}, \qquad \mathrm{PSD}_{\mathrm{FEXT}} = K_{\mathrm{FEXT}} S_d(f) \left(\frac{N_d}{49}\right)^{0.6} |H(f)|^2 \, d \, f^2, \tag{35}$$

where $K_{\mathrm{NEXT}}$ and $K_{\mathrm{FEXT}}$ are constants with values of $8.818 \times 10^{-14}$ and $7.999 \times 10^{-20}$, respectively, $S_d(f)$ is the PSD of a disturber, $N_d$ is the number of disturbers, $H(f)$ is the channel frequency response, and $d$ is the channel length in meters.

Figure 6 presents SNR curves demonstrating the impact of NEXT in degrading the performance of z-DMT, CMT, and FMT. The results correspond to an 810 m TP1 line. The arrows ↓ and ↑ indicate downstream and upstream bands, respectively. The SNR in each subcarrier channel is measured in the time domain by looking at the power of the residual error after subtracting the transmitted symbols. As one would expect, there is a significant performance loss in z-DMT at the points where the transmission direction changes. CMT and FMT, on the other hand, do not show any visible degradation due to NEXT. It is worth noting that the SNR results of z-DMT closely match those reported in [15].

Another observation in Figure 6 that requires comment is that although CMT has a lower SNR compared to z-DMT and FMT, it may achieve a higher transmission rate because of its higher bandwidth efficiency: it uses no cyclic extensions or excess bandwidth.
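The coupling model in (35) translates directly into code. The following Python sketch is a minimal illustration under stated assumptions, not the simulation code used for the reported results: it assumes that $f$ is expressed in Hz and that $S_d(f)$ is supplied in linear units (for example, mW/Hz), neither of which is spelled out in the text; the function names and the example disturber PSD and channel magnitude are hypothetical.

```python
K_NEXT = 8.818e-14  # NEXT coupling constant quoted with (35)
K_FEXT = 7.999e-20  # FEXT coupling constant quoted with (35)


def dbm_per_hz_to_linear(psd_dbm_hz):
    """Convert a PSD from dBm/Hz to mW/Hz."""
    return 10.0 ** (psd_dbm_hz / 10.0)


def next_psd(s_d, f_hz, n_d):
    """NEXT PSD of (35); f_hz in Hz is an assumption of this sketch."""
    return K_NEXT * s_d * (n_d / 49.0) ** 0.6 * f_hz ** 1.5


def fext_psd(s_d, f_hz, n_d, h_mag, d_m):
    """FEXT PSD of (35); h_mag is |H(f)| and d_m is the line length in meters."""
    return K_FEXT * s_d * (n_d / 49.0) ** 0.6 * h_mag ** 2 * d_m * f_hz ** 2


if __name__ == "__main__":
    s_d = dbm_per_hz_to_linear(-60.0)  # hypothetical flat disturber PSD
    h_mag = 0.1                        # hypothetical channel magnitude
    for f_mhz in (1.0, 5.0, 10.0):
        f_hz = f_mhz * 1e6
        print(f_mhz, next_psd(s_d, f_hz, 25), fext_psd(s_d, f_hz, 25, h_mag, 810.0))
```

The example loop uses 25 disturbers and an 810 m line, matching the values mentioned in the text; a realistic run would substitute the actual disturber PSD mask and the frequency-dependent $|H(f)|$ of the loop.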

[Figure 6 plot: SNR (dB) versus frequency (MHz); curves: z-DMT, CMT, FMT.]

Figure 6: SNR curves showing the impact of NEXT on z-DMT, CMT, and FMT. Arrows indicate the direction of data transmission.

Figure 5 presents plots that compare the bit rates of z-DMT, CMT, and FMT on TP1 lines of different lengths. Also shown in this figure are the results of an ideal system in which a bank of ideal filters with zero transition bands and a channel with flat gain over each subband are assumed. Moreover, for CMT, we have presented the results both when a prototype filter with the PR property (designed using the code given in [43]) is used and when the design procedure of Section 4 is adopted. As seen, CMT, even with the PR design, outperforms z-DMT and FMT for all line lengths, with a bit rate that is 5 to 10% higher. Moreover, CMT comes very close to the upper bound of the bit rate determined by the idealized system. A design based on the PR property is already within 5% of the upper bound. The filter design proposed in Section 4 reduces this gap to around 2 to 3%. An observation in Figure 5 that requires comment is that the performance of FMT is worse than that reported for FMT in [23], especially when the length of the line is larger than 1000 m. This is because we use a different noise model than [23]. We follow [15] and use ETSI ‘A’ as the background noise, while −140 dBm/Hz white Gaussian noise is used in [23].

Bit allocation for each subcarrier is done based on the following formula [4, 50]:

$$b_i = \log_2\left(1 + \frac{\mathrm{SNR}_i \cdot \gamma_{\mathrm{code}}}{\Gamma \cdot \gamma_{\mathrm{margin}}}\right), \tag{36}$$

where $\mathrm{SNR}_i$ is the signal-to-noise ratio at the $i$th subcarrier, $\gamma_{\mathrm{code}} = 3$ dB is the coding gain, $\Gamma = 9.8$ dB is the SNR gap between the Shannon capacity and QAM modulation to achieve a BER of approximately $10^{-7}$, and $\gamma_{\mathrm{margin}} = 6$ dB is the system margin. Since in CMT the data symbols are PAM, we treat each pair of adjacent PAM symbols as one QAM symbol and apply (36).

6.3. Channels with bridged taps

So far, the simulated subscriber loops were homogeneous lengths of TP1 cables. Previous reports [30], as well as our simulation studies, have shown that the group delay distortion of such lines is minimal and mostly limited to very low and very high frequencies in the VDSL band. Nonhomogeneous subscriber lines with bridged taps, on the other hand, exhibit significant group delay distortion. Hence, a study of CMT behavior in VDSL loops with bridged taps is essential to complete our study. We present simulation results for the five test loops that are shown in Figure 7. These are chosen from the test loops provided in [16]. Figure 8 presents the group delays of two of these loops and also that of a 300 m TP1 line with no bridged tap. We note that the line without a bridged tap exhibits almost no group delay distortion over most of the channel band, whereas the group delay distortion increases as the number of bridged taps increases. We also note that the fast variations of the group delay at certain frequencies coincide with the points where the magnitude gain of the channel is reduced due to signal reflection from the open-ended bridged-tap extensions. This phenomenon is clearly seen by referring to Figure 9, where the subcarrier SNRs of z-DMT, CMT, and FMT are shown for the loop 4 “short.” The following observations are also made by referring to Figure 9. Even though the group delay distortion may bring some degradation to the CMT performance, since it affects the flatness of each subchannel, this degradation is not significant. It is worth noting that the sharp variations of the group delay at frequencies of about 0.6 and 1.3 MHz in Figure 8 coincide with the sharp drops in the SNRs of all three systems in Figure 9. The fact that CMT and z-DMT behave similarly at these points, together with the fact that DMT has no sensitivity to group delay distortion, clearly indicates that the variation of group delay in VDSL channels has little effect in degrading the performance of CMT. On the other hand, the bit-rate evaluations presented in Table 4 reveal that even for such extreme lines, CMT is superior to z-DMT and FMT.

6.4. Effect of RFI ingress noise

The RFI noise can badly affect the performance of VDSL systems, as it may appear at a level much higher than the VDSL signal. The RFI has to be suppressed in two stages. The first stage uses an analog RFI suppressor at the receiver input [20]. It has been reported that this technique can result in an RFI suppression of 20 to 25 dB [19]. Unfortunately, this suppression is not sufficient for an acceptable performance of the z-DMT system. It is thus proposed that further suppression of RFI be made at the demodulator output [17, 18]. Here, we consider the RFI cancellation method proposed in [17]. In this method, the center frequency of the RFI is estimated by locating the peak of the signal within the set of tones in the HAM bands. The method then uses two listener tones, one on each side of the RFI, to estimate this ingress and interpolate the RFI through the transfer function of the receiver window (see [17] for details).

[Figure 7 diagram. Recoverable segment labels, in the order printed:
VDSL 3 ‘short’: 1500'/TP2, 250'/TP3.
VDSL 4 ‘short’: 1000'/TP1, 300'/TP2, 150'/TP2, aerial cable, 150'/TP2, 150'/TP2.
VDSL 5: 550'/TP2, 100'/TP2, 250'/TP2, 50'/TP3, 50'/TP2; underground cable, 20 pair; underground, 5 pair; overhead aerial.
VDSL 6: 1650'/TP1, 650'/TP2, 550'/TP2, 100'/TP2, 250'/TP2, 50'/TP3, 50'/TP2; underground cable, 100 pair; underground, 100 pair; underground, 20 pair; underground, 5 pair; overhead aerial.
VDSL 7: 1650'/TP1, 2300'/TP2, 550'/TP2, 100'/TP2, 250'/TP2, 50'/TP3, 50'/TP2; underground cable, 100 pair; underground, 100 pair; underground, 20 pair; underground, 5 pair; overhead aerial.]

Figure 7: Examples of test loops with bridged taps.

[Figure 8 plot: group delay (in samples) versus frequency (MHz); curves: 300 m TP1, VDSL test loop 5, VDSL test loop 4 “short.”]

Figure 8: Group delays of the test loops shown in Figure 7 and a TP1 line of length 300 m.

[Figure 9 plot: SNR (dB) versus frequency (MHz); curves: z-DMT, CMT, FMT.]

Figure 9: SNR plots of z-DMT, CMT, and FMT for the VDSL test loop 4 “short.” The plots confirm that group delay distortion in this loop has no significant impact on degrading CMT performance when compared with z-DMT. Arrows indicate the direction of data transmission.

In our simulations, we follow [17] and set the listener tones at an 8-tone spacing from the center frequency of the RFI.

In CMT and FMT, the sharp roll-off and the high stopband attenuation of the analysis filters allow cancellation of the RFI without resorting to any additional post-demodulator RFI canceller (i.e., the second stage of the RFI canceller). However, we note that to get an acceptable performance, the first stage of RFI suppression is needed for CMT and FMT systems as well.

Figures 10(a) and 10(b) present a set of results that compare the performance of z-DMT, CMT, and FMT in the presence of RFI.
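For the z-DMT canceller, the placement of the two listener tones can be illustrated numerically. The Python sketch below is a hedged illustration, not the method of [17]: it assumes that the z-DMT tone spacing equals the 22.08 MHz sampling frequency divided by the $2M$-sample block size of (32), with $M = 2048$, and that the RFI center tone is simply the tone nearest the RFI center frequency. The function names and the example frequency are our own.

```python
FS_HZ = 22.08e6   # VDSL sampling frequency stated in the text
M_DMT = 2048      # number of z-DMT subcarriers stated in the text

# Assumption of this sketch: a 2M-sample block gives a tone spacing of fs / (2M).
TONE_SPACING_HZ = FS_HZ / (2 * M_DMT)


def listener_tones(rfi_center_hz, spacing_tones=8):
    """Return (lower listener, center, upper listener) tone indices."""
    center = round(rfi_center_hz / TONE_SPACING_HZ)
    return center - spacing_tones, center, center + spacing_tones


def tone_frequency_hz(index):
    """Center frequency of the tone with the given index."""
    return index * TONE_SPACING_HZ


if __name__ == "__main__":
    lower, center, upper = listener_tones(1.9e6)  # example RFI center frequency
    print("tone spacing (Hz):", TONE_SPACING_HZ)
    print("listener tones:", lower, upper, "around center tone", center)
    print("listener frequencies (MHz):",
          tone_frequency_hz(lower) / 1e6, tone_frequency_hz(upper) / 1e6)
```

Evaluating `listener_tones` for an RFI close to a band edge shows how far below or above the RFI the listener tones fall, which is the quantity that matters when a listener tone must stay clear of the VDSL signal band.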

Table 4: Comparison of bit rates (Mbps) of z-DMT, CMT, and FMT over bridged loops.

Bridged loop      z-DMT    FMT      CMT
VDSL 3 “short”    20.39    20.08    21.99
VDSL 4 “short”    19.21    19.13    20.05
VDSL 5            24.12    23.67    25.52
VDSL 6             9.84    10.58    11.96
VDSL 7             2.92     3.24     3.60
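Bit rates such as those in Table 4 rest on the per-subcarrier bit allocation of (36). The Python sketch below evaluates (36) with the coding gain, SNR gap, and margin quoted in the text. It is a minimal illustration: the function names are hypothetical, and summing unrounded per-subcarrier values in `total_bits` is our simplifying assumption, since the text does not state how allocations are rounded or capped.

```python
import math

GAMMA_CODE_DB = 3.0    # coding gain quoted with (36)
GAP_DB = 9.8           # SNR gap quoted with (36)
GAMMA_MARGIN_DB = 6.0  # system margin quoted with (36)


def bits_per_subcarrier(snr_db):
    """Equation (36), with all quantities handled in dB before conversion."""
    effective_db = snr_db + GAMMA_CODE_DB - GAP_DB - GAMMA_MARGIN_DB
    return math.log2(1.0 + 10.0 ** (effective_db / 10.0))


def total_bits(snrs_db):
    """Sum of per-subcarrier allocations (no rounding; an assumption here)."""
    return sum(bits_per_subcarrier(snr_db) for snr_db in snrs_db)


if __name__ == "__main__":
    for snr_db in (10.0, 20.0, 30.0, 40.0):  # hypothetical measured SNRs
        print(snr_db, "dB ->", round(bits_per_subcarrier(snr_db), 2), "bits")
```

With these constants, a subcarrier needs an SNR of 12.8 dB to carry one bit, because the coding gain, gap, and margin together subtract 12.8 dB from the measured SNR.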

[Figure 10 plots, panels (a) and (b): SNR (dB) versus frequency (MHz); curves: z-DMT w/o RFC, z-DMT with RFC, CMT, FMT.]

Figure 10: RFI performance of z-DMT, CMT, and FMT when an RFI with a bandwidth of 4 kHz at a level of −35 dBm is present at the center frequency (a) 1.9 MHz and (b) 1.82 MHz. Arrows indicate the direction of data transmission.

In both cases, the RFI power has been set equal to −35 dBm at the demodulator input. This is assumed to be the residual from a −10 dBm RFI (stipulated in [16]) after the first stage of suppression. The RFI is chosen to be a 4 kHz narrowband signal. In Figure 10(a), the center frequency of the RFI is at 1.9 MHz. This is near the center of the first HAM band. We observe that in this case, the RFI canceller clears the RFI almost perfectly. There is only slight degradation in SNRs near the band edges. However, the RFI canceller fails when the RFI center frequency moves to a point near one of the VDSL signal band edges. This is shown in Figure 10(b), where the center frequency of the RFI is shifted to 1.82 MHz. The reason for the failure of the RFI canceller in this case is that one of the listener tones used to measure the RFI coincides with the VDSL signal. According to [17], as well as our simulations, any attempt to shift the listener tone nearer to the center frequency of the RFI will result in a significant degradation of the tone estimates, and thus equally results in failure of the RFI canceller.

7. CONCLUSIONS

A thorough study of a new multicarrier modulation in VDSL channels was presented. This modulation, which uses cosine-modulated filter banks, was called CMT, an acronym for cosine-modulated multitone. Compared to the earlier publications on the subject [34, 35], the receiver structure of CMT was modified to reduce its computational complexity. A criterion that balances ISI plus ICI against the stopband attenuation was proposed for designing NPR prototype filters for CMT. Numerical results showed that this criterion leads to designs that are superior to those based on the PR criterion. Moreover, CMT was compared with z-DMT and FMT, the two candidate modulation schemes for VDSL [16]. Comparisons were made with respect to computational complexity, latency, achievable bit rates, and resistance to crosstalk and RFI. Except for computational complexity, where CMT was found to be more complex than z-DMT, CMT showed superior performance in all other respects. Compared to FMT, CMT was found to be superior with respect to computational complexity and achievable bit rate. CMT and FMT showed similar resistance to crosstalk and RFI, and had similar latency.

We note that the CMT scheme proposed in this paper is nothing but an amended version of DWMT, a modulation scheme which has been known for a decade [25]. However, because of its relatively high computational complexity, which was a consequence of an inappropriate selection of the receiver structure, DWMT was never accepted by the industry.

function h = PFDesign(M, Lh, fs, gammaf, K)
% h: prototype filter, M: number of sub-channels, Lh: prototype filter length
% fs: stopband edge frequency, 1/(2M)

Algorithm 1: Near perfect reconstruction prototype filter design.

We hope that this revisit of the scheme, and in particular the simplification of the receiver structure proposed in this paper, can initiate new thoughts on reconsidering this powerful signal processing tool in xDSL applications.

APPENDIX

A. PROTOTYPE FILTER DESIGN

The Matlab function listed in Algorithm 1 can be used to design a prototype filter based on the design criterion discussed in Section 4. Note that to guarantee the stability of the design, the stopband edge frequency $f_s = \omega_s/2\pi$ should be limited to the range $1/(2M)$ to $3.8/(2M)$. Also, the parameter $\gamma$ is initialized to 1 and progressively increased to a specified maximum as the design proceeds. We have experimentally found that this procedure always leads to a good design within a small number of iterations (see Algorithm 1).

REFERENCES

[1] E. A. Lee and D. G. Messerschmitt, Digital Communication, Kluwer Academic, Boston, Mass, USA, 2nd edition, 1994.
[2] J. G. Proakis, Digital Communications, McGraw-Hill, New York, NY, USA, 3rd edition, 1995.
[3] W. Y. Chen, DSL: Simulation Techniques and Standards Development for Digital Subscriber Lines Systems, Macmillan, Indianapolis, Ind, USA, 1998.

[4] T. Starr, J. M. Cioffi, and P. J. Silverman, Understanding Digital Subscriber Line Technology, Prentice-Hall, Upper Saddle River, NJ, USA, 1999.
[5] R. Steele, Mobile Radio Communications, IEEE Press, New York, NY, USA, 1992.
[6] J. A. C. Bingham, “Multicarrier modulation for data transmission: an idea whose time has come,” IEEE Communications Magazine, vol. 28, no. 5, pp. 5–14, 1990.
[7] xDSL Forum, http://www.dslforum.org.
[8] ETSI, http://www.etsi.org.
[9] ANSI T1E1.4 Working Group, http://www.t1.org.
[10] “Network and Customer Installation Interfaces - Asymmetric Digital Subscriber Line (ADSL) Metallic Interface,” T1.413-1998, American National Standards Institute, New York, NY, USA, 1998.
[11] “Radio broadcasting systems; Digital Audio Broadcasting (DAB) to mobile, portable and fixed receivers,” ETS 300 401, European Telecommunications Standards Institute, 2nd ed., May 1997.
[12] “Digital Video Broadcasting (DVB); Framing structure, channel coding and modulation for digital terrestrial television,” ETS 300 744, European Telecommunications Standards Institute, March 1997.
[13] F. Sjoberg, M. Isaksson, R. Nilsson, P. Odling, S. K. Wilson, and P. O. Borjesson, “Zipper: a duplex method for VDSL based on DMT,” IEEE Transactions on Communications, vol. 47, no. 8, pp. 1245–1252, 1999.
[14] D. G. Mestdagh, M. Isaksson, and P. Odling, “Zipper VDSL: a solution for robust duplex communication over telephone lines,” IEEE Communications Magazine, vol. 38, no. 5, pp. 90–96, 2000.
[15] F. Sjoberg, R. Nilsson, M. Isaksson, P. Odling, and P. O. Borjesson, “Asynchronous zipper [subscriber line duplex method],” in Proc. IEEE International Conference on Communications (ICC ’99), vol. 1, pp. 231–235, Vancouver, British Columbia, Canada, June 1999.
[16] Committee T1 Working Group T1E1.4, “VDSL Metallic Interface: Part 1 - Functional requirements and common specifications,” Draft Standard, T1E1.4/2000-009R3, February 2001; and “VDSL Metallic Interface: Part 3 - Technical specification of multi-carrier modulation transceiver,” Draft Standard, T1E1.4/2000-013R1, May 2000.
[17] R. Nilsson, Digital communication in wireline and wireless environments, Ph.D. thesis, Luleå University of Technology, Luleå, Sweden, March 1999, available at: http://www.sm.luth.se/csee/sp/publications.html#Theses.
[18] B.-J. Jeong and K.-H. Yoo, “Digital RFI canceller for DMT based VDSL,” Electronics Letters, vol. 34, no. 17, pp. 1640–1641, 1998.
[19] J. Cioffi, M. Mallory, and J. Bingham, “Digital RF cancellation with SDMT,” ANSI Contribution T1E1.4/96-083, April 1996.
[20] J. Cioffi, M. Mallory, and J. Bingham, “Analog RF cancellation with SDMT,” ANSI Contribution T1E1.4/96-084, April 1996.
[21] G. Cherubini, E. Eleftheriou, and S. Olcer, “Filtered multitone modulation for VDSL,” in Proc. IEEE Global Telecommunications Conference (GLOBECOM ’99), vol. 2, pp. 1139–1144, Rio de Janeiro, Brazil, December 1999.
[22] G. Cherubini, E. Eleftheriou, S. Olcer, and J. M. Cioffi, “Filter bank modulation techniques for very high speed digital subscriber lines,” IEEE Communications Magazine, vol. 38, no. 5, pp. 98–104, 2000.
[23] G. Cherubini, E. Eleftheriou, and S. Olcer, “Filtered multitone modulation for very high-speed digital subscriber lines,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1016–1028, 2002.
[24] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
[25] S. D. Sandberg and M. A. Tzannes, “Overlapped discrete multitone modulation for high speed copper wire communications,” IEEE Journal on Selected Areas in Communications, vol. 13, no. 9, pp. 1571–1585, 1995.
[26] M. A. Tzannes, M. C. Tzannes, and H. Resnikoff, “The DWMT: A Multicarrier Transceiver for ADSL using M-band Wavelet Transforms,” ANSI Contribution T1E1.4/93-067, March 1993.
[27] M. A. Tzannes, M. C. Tzannes, J. Proakis, and P. N. Heller, “DMT systems, DWMT systems and digital filter banks,” in Proc. IEEE International Conference on Communications (SUPERCOMM/ICC ’94), vol. 1, pp. 311–315, New Orleans, La, USA, May 1994.
[28] M. Hawryluck, A. Yongacoglu, and M. Kavehrad, “Efficient equalization of discrete wavelet multi-tone over twisted pair,” in Proc. International Zurich Seminar on Broadband Communications, pp. 185–191, Zurich, Switzerland, February 1998.
[29] N. Neurohr and M. Schilpp, “Comparison of transmultiplexers for multicarrier modulation,” in Proc. 4th IEEE International Conference on Signal Processing (ICSP ’98), vol. 1, pp. 35–38, Beijing, China, October 1998.
[30] A. Viholainen, J. Alhava, J. Helenius, J. Rinne, and M. Renfors, “Equalization in filter bank based multicarrier systems,” in Proc. 6th IEEE International Conference on Electronics, Circuits and Systems (ICECS ’99), vol. 3, pp. 1467–1470, Pafos, Cyprus, September 1999.
[31] S. Govardhanagiri, T. Karp, P. Heller, and T. Nguyen, “Performance analysis of multicarrier modulation systems using cosine modulated filter banks,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’99), vol. 3, pp. 1405–1408, Phoenix, Ariz, USA, March 1999.
[32] B. Farhang-Boroujeny and W. H. Chin, “Time domain equaliser design for DWMT multicarrier transceivers,” Electronics Letters, vol. 36, no. 18, pp. 1590–1592, 2000.
[33] B. Farhang-Boroujeny and L. Lin, “Analysis of post-combiner equalizers in cosine-modulated filter bank-based transmultiplexer systems,” IEEE Transactions on Signal Processing, vol. 51, no. 12, pp. 3249–3262, 2003.
[34] B. Farhang-Boroujeny, “Multicarrier modulation with blind detection capability using cosine modulated filter banks,” IEEE Transactions on Communications, vol. 51, no. 12, pp. 2057–2070, 2003.
[35] B. Farhang-Boroujeny, “Discrete multitone modulation with blind detection capability,” in Proc. 56th IEEE Vehicular Technology Conference (VTC ’02), vol. 1, pp. 376–380, Vancouver, British Columbia, Canada, September 2002.
[36] J. Alhava and M. Renfors, “Adaptive sine-modulated/cosine-modulated filter bank equalizer for transmultiplexers,” in Proc. European Conference on Circuit Theory and Design (ECCTD ’01), Espoo, Finland, August 2001.
[37] A. Viholainen, J. Alhava, and M. Renfors, “Implementation of parallel cosine and sine modulated filter banks for equalized transmultiplexer systems,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’01), vol. 6, pp. 3625–3628, Salt Lake City, Utah, USA, May 2001.
[38] J. Alhava and M. Renfors, “Exponentially-modulated filter bank-based transmultiplexer,” in Proc. IEEE International Symposium on Circuits and Systems (ISCAS ’03), vol. 4, pp. IV-233–IV-236, Bangkok, Thailand, May 2003.

[39] B. Borna and T. N. Davidson, “Efficient filter bank design for filtered multitone modulation,” in Proc. IEEE International Conference on Communications (ICC ’04), vol. 1, pp. 38–42, Paris, France, June 2004.
[40] G. W. Wornell, “Emerging applications of multirate signal processing and wavelets in digital communications,” Proceedings of the IEEE, vol. 84, no. 4, pp. 586–603, 1996.
[41] A. Scaglione, G. B. Giannakis, and S. Barbarossa, “Redundant filterbank precoders and equalizers. I. Unification and optimal designs,” IEEE Transactions on Signal Processing, vol. 47, no. 7, pp. 1988–2006, 1999.
[42] Y.-P. Lin and S.-M. Phoong, “ISI-free FIR filterbank transceivers for frequency-selective channels,” IEEE Transactions on Signal Processing, vol. 49, no. 11, pp. 2648–2658, 2001.
[43] M. Rossi, J.-Y. Zhang, and W. Steenaart, “Iterative least squares design of perfect reconstruction QMF banks,” in Proc. Canadian Conference on Electrical and Computer Engineering (CCECE ’96), vol. 2, pp. 762–765, Calgary, Alberta, Canada, May 1996.
[44] Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, Pa, USA, 1996.
[45] C. Brezinski and L. Wuytack, Projection Methods for Systems of Equations, Elsevier, Amsterdam, The Netherlands, 1997.
[46] T. M. Ng, B. Farhang-Boroujeny, and H. K. Garg, “An accelerated Gauss-Seidel method for inverse modeling,” Signal Processing, vol. 83, no. 3, pp. 517–529, 2003.
[47] H. S. Malvar, Signal Processing with Lapped Transforms, Artech House, Norwood, Mass, USA, 1992.
[48] ETSI, “Transmission and Multiplexing (TM); Access transmission systems on metallic cables: Very high speed Digital Subscriber Line (VDSL); Part 1: Functional requirements,” Technical Specification TS 101 270-1 V1.1.1 (1998-04), 1998.
[49] L. Lin, Multicarrier communications based on cosine modulated filter banks, Ph.D. thesis, University of Utah, Salt Lake City, Utah, USA, submitted.
[50] J. M. Cioffi, “A Multicarrier Primer,” ANSI Contribution T1E1.4/91-157, November 1991.

Lekun Lin received the B.S. and M.S. degrees in electrical and telecommunications engineering from Nanjing University of Posts and Telecommunications, Nanjing, China, in 1995 and 1998, respectively. He received the Ph.D. degree from the Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, in 2005. His current research interests are multicarrier communications, digital signal processing, and MIMO systems.

Behrouz Farhang-Boroujeny received the B.S. degree in electrical engineering from Teheran University, Iran, in 1976, the M.Eng. degree from University of Wales, Institute of Science and Technology, UK, in 1977, and the Ph.D. degree from Imperial College, University of London, UK, in 1981. From 1981 to 1989, he was with the Isfahan University of Technology, Isfahan, Iran. From 1989 to 2000, he was with the National University of Singapore. Since August 2000, he has been with the University of Utah, where he is now a Professor and Associate Chair of the Department of Electrical and Computer Engineering. He is an expert in the general area of signal processing. His current scientific interests are adaptive filters, multicarrier communications, detection techniques for space-time coded systems, and signal processing applications to optical devices. In the past, he has worked on and made significant contributions to the areas of adaptive filter theory, acoustic echo cancellation, magnetic/optical recording, and digital subscriber line technologies. He is the author of the book Adaptive Filters: Theory and Applications, John Wiley & Sons, 1998. He received the UNESCO Regional Office of Science and Technology for South and Central Asia Young Scientists Award in 1987. He served as an Associate Editor of IEEE Transactions on Signal Processing from July 2002 to July 2005. He has also been involved in various IEEE activities. He is currently the Chairman of the Signal Processing/Communications Chapter of IEEE in Utah.