ACKNOWLEDGMENTS

My deep gratitude goes to Dr. Laurie Joiner, who has been supportive, knowledgeable, thoughtful, and considerate during this journey. I am thankful beyond measure, Dr. Joiner. I learned a lot from you, and I have had the pleasure of knowing such a smart and wise woman.

Thanks to every person in the ECE department at UAH. Special thanks go to Jackie Siniard and Linda Grubbs for their lovely smiles and the relieving chats.

I have to mention my wonderful friend, Aditi, without whom my PhD years would not have been as joyful as they were. Thank you, Aditi, for the encouragement you gave me and for the funny talks. I am really lucky to have gotten to know such a positive and warm-hearted person.

I would also like to thank Yarmouk University in Jordan for funding my PhD study. It will be my honor to get back and teach in such a reputable institution.

Last but not least, I would like to send my thanks and love to my mom and dad, Fatimah and Ahmad, and to all my family members in Jordan. They have always been there: loving, keeping up with me, tolerating the full spectrum of my personality, and supporting me until the final mile of every pursuit. I cannot wait to get back to you guys and hug you all…

Asma Ahmad Alqudah


Table of Contents

ABSTRACT……………………………………………………………………………. iv

ACKNOWLEDGMENTS……………………………………………………………… vi

List of Figures……………………………………………………………………………. x

List of Tables……………………………………………………………………………xiii

CHAPTER

1. Introduction…………………………………………………………………………… 1

1.1 Motivation for FTN signaling………………………………………………………….. 5

1.1.1 FTN signaling: background……………………………………………………… 6

1.2 FTN prior work………………………………………………………………………….. 7

1.3 Trellis-based detection algorithms………………………………………. 10

1.4 Dissertation contributions…………………………………………………………..….. 12

2. Basic principles of linear modulation systems……………………………………… 14

2.1 Single carrier linear modulation systems………………………………………….…….. 14

2.1.1 Bit and block error rate definitions…………………………………….………... 17

2.1.2 Bandwidth characteristics……………………………………………………..… 20

2.1.3 The squared Euclidean distance definition……………………………….…..…. 24

2.1.4 T-orthogonal modulation pulses………………………………………………… 26

2.2 Introduction to maximum-likelihood sequence estimation………………………….….. 30

2.2.1 The recursively-structured MLSE…………………………………………….…. 33

2.2.2 The error performance of the MLSE………………………………………….…. 36

2.3 The M-algorithm……………………………………………………………………….... 38

2.4 Maximum a posteriori decoding……………………………………………………..……40

2.4.1 The BCJR algorithm…………………………………………………………… 43


2.5 Faster-than-Nyquist signaling………………………………………………………… 46

2.5.1 The system model……………………………………………………………..…. 49

2.5.2 The capacity estimation of FTN signaling …………………………………..….. 52

2.6 Principles of turbo equalization………………………………………………………… 57

3. Employing higher order modulation in combination with nonbinary code alphabets in the context of faster-than-Nyquist signaling………………………………………………………....62

3.1 Problem statement……………………………………………………………………..….63

3.1.1 The selection of the FTN pulse shape………………………………………….….67

3.2 Equivalent discrete-time system models…………………………………………………68

3.2.1 An improved minimum phase model………………………………………….….69

3.3 The extension of the M-BCJR algorithm for nonbinary alphabets………………….….. 71

3.3.1 Performance of the M-BCJR in simple detection…………………………….…..74

3.3.2 Backup M-BCJR for nonbinary alphabets……………………………………..…76

3.4 Turbo equalization………………………………………………………………….……79

3.4.1 System model……………………………………………………………….….....80

3.4.2 Simulation results………………………………………………………….……. 95

3.5 Binary code-aided QPSK-based FTN………………………………………………….. 103

3.5.1 System model……………………………………………………………………103

3.5.2 Simulation results………………………………………………………………..108

4. Turbo equalization of the faster-than-Nyquist signaling using the reduced-complexity Z-MAP algorithm………………………………… 111

4.1 Introduction……………………………………………………………………………..111

4.2 Error moments………………………………………………………………………….113

4.3 Z-MAP applied to turbo equalization of FTN signals………………………………….117

4.3.1 Simulation results………………………………………………………………..118


5. Summary and future directions…………………………………………………… 128

REFERENCES…………………………………………………………………………132


List of Figures

Figure Page

2.1 A basic device for transmitting information via carrier modulation……………….. 15

2.2 A system model of a communications system when transmitted over an AWGN channel………………………………………………………….. 16

2.3 The frequency content of the bandpass signal……………………………………… 22

2.4 Root RC pulses at three different values of the excess bandwidth factor…………… 30

2.5 A straightforward way to produce the sequence from the received signal…………… 33

2.6 A 4-state binary trellis example…………………………………………………….. 34

2.7 A block diagram of a serial concatenation communication system with encoding and ISI. ∏ denotes an interleaver……………………………………… 49

2.8 A serial concatenation communications system employing iterative turbo equalization at the receiver………………………………………………………….. 58

3.1 Turbo equalization structure……………………………………………………..….64

3.2 Model for converting the continuous FTN into discrete time……………………….68

3.3 BER vs. $E_b/N_0$ for simple ISI detection BPSK-based FTN………………………... 74

3.4 BER vs. $E_b/N_0$ for simple ISI detection QPSK-based FTN……………………..…. 75

3.5 Backup M-BCJR procedure (two parameters, set to 3 and 2), illustrating the recursions, hard decision path, and backup recursion……………………………….. 78

3.6 Nonbinary turbo equalization receiver………………………………………..……. 80

3.7 Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at τ = 1/2……………………………………………………………….. 96

3.8 Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at τ = 1/2……………………………………………………………….. 96

3.9 Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at τ = 0.35……………………………………………………………….. 98

3.10 Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at τ = 0.35…………………………………………………………….... 98

3.11 Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at τ = 0.25……………………………………………………………… 99

3.12 Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at τ = 0.25…………………………………………………………..… 100

3.13 Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling for τ = 0.5 and for different numbers of iterations……………………. 100

3.14 Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at τ = 1/2……………………………………………………….. 108

3.15 Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided QPSK-based FTN signaling at τ = 1/2…………………………………………………………..….. 109

4.1 The most probable state in the correct instant…………………………………… 113

4.2 Presence of a concurrent at the error instant……………………………………..…114

4.3 A trellis with 4 states……………………………………………………………..…115

4.4 The Z-MAP principle [42]………………………………………………………….117

4.5 Average number of states of the Z-MAP simple detection of the FTN binary signals at τ = 1/2…………………………………………………………….….. 119

4.6 BER comparison between M-BCJR and Z-MAP turbo decoding for binary FTN signaling at τ = 1/2……………………………………………………………..… 120

4.7 Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at τ = 1/2 applying the Z-MAP………………………………………… 121

4.8 Average number of states of the Z-MAP turbo system of Figure 4.7……………… 122

4.9 BER comparison between M-BCJR and Z-MAP turbo decoding at the 4th iteration for binary code-aided BPSK-based FTN signaling at τ = 1/2…………. 124

4.10 Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at τ = 1/2 applying the Z-MAP…………………………………….… 125

4.11 Average number of states of the Z-MAP turbo system of Figure 4.10………….. 125

List of Tables

Table Page

I Algorithm for computing the posterior probabilities for a memory-2 ISI channel……………………………………………………… 84

II Algorithm for computing the posterior probabilities for the memory-2 quaternary convolutional code defined in this section…………. 89

III Algorithm for computing the posterior probabilities for the memory-2 binary convolutional code defined in this section……………. 106

Dedication

To Abdullah…

CHAPTER 1

Introduction

Tremendous progress has been witnessed in wireless communications over the last two decades. Mobile telephony, which was primarily meant for voice-based services, has evolved to the extent that non-voice-based services now predominate. In addition, the explosive growth of computer networking has laid the foundation for the largest global medium for information exchange, the Internet. Consequently, there is an ever-increasing demand to make better use of the available resources in order to sustain this growth.

One primary resource in wireless communications is the frequency band of operation. The frequency bands are controlled and allocated by regulatory bodies such as the Body of European Regulators for Electronic Communications (BEREC) [1], the Federal Communications Commission (FCC) [2], and the Telecom Regulatory Authority of India (TRAI) [3]. In this context, a wireless system mainly refers to a mobile phone or a handheld device communicating with other wireless devices or base stations. Since mobile phones have evolved from simple communicating devices to portable computers, more efficient transmission systems are needed to accommodate the increasing amount of wireless traffic. Even though the amount of available bandwidth has increased somewhat, the great demand for wireless access has triggered strong competition among wireless system operators, who are paying a high premium to own spectrum allocations. For example, the 4G wireless mobile deployment in Spain, also known as Long-Term Evolution (LTE) or Evolved UTRA (EUTRA), brought EUR 1.5 billion from the bidding of a total of 310 MHz in different frequency bands [4]. Even though LTE/LTE-Advanced is currently meeting the wireless user access demands, the need to increase the system capacity is growing dramatically. It is expected that wireless traffic will increase more than 500-fold by 2020 as compared to the traffic in 2010 [5]. Given the plethora of smart phones and tablets, developing new wireless radio access technologies is essential to enhance the system capacity as well as the user data rates for future operations beyond LTE-Advanced. Meanwhile, developments in semiconductor technology are enabling small-sized microprocessors to handle more complex operations.

Modern applications demand more bits be carried over the wireless channels. The straightforward ways to do so are either to send longer or faster data streams, i.e., to consume more time or more bandwidth. However, both time and frequency are scarce and costly resources. How can more information be carried per hertz and second?

Modulation theory, since the pioneering work of Nyquist [6], has mainly considered the memoryless transmission of data, which greatly simplifies receiver design and theoretical analysis. In the context of memoryless transmission, the symbols transmitted in different time epochs are considered independent. Thus, the different signals are received assuming there is no intersymbol interference (ISI) among them, and a simple symbol-by-symbol receiver can be used to detect the sequence reliably.


Shannon's work in information theory in 1948 and 1949 [7, 8] introduced concepts that are fundamental to communications. He proved that highly reliable communication can be achieved if the transmitted symbols are grouped. He also showed that this construction is possible if the time signals are generated using sinc pulses. Based on this work, most communication technologies have maintained the memoryless modulation assumption (see Figure 2.2). However, this assumption is optimal only analytically; in practice, it incurs some capacity loss due to non-ideal system components.

The Nyquist criterion states that in order to transmit data at a rate of $R$ symbols per second with no ISI, pulses with a minimum bandwidth of $R/2$ Hz are required. This data rate requires the peaks of the pulses to be spaced in time by $1/R$ seconds.

Most data transmission scenarios employ linear modulation, which is formed by adding up a sequence of data pulses shifted from each other by an integer multiple of the symbol time duration $T$, with the form

$$s(t) = \sum_{n} a_n\, h(t - nT). \tag{1.1}$$

Here, $a_n$ are independent data symbols drawn from an $M$-ary modulation alphabet, each with energy $E_s$, and $h(t)$ is the data pulse. The transmission rate in this configuration is $\log_2 M / T$ bits per second (b/s).

Pulses $h(t)$ that are used for transmission are usually orthogonal, or “Nyquist,” to each other. Mathematically, this means that the inner product of the pulse and any $nT$-shifted version of itself, $n \neq 0$, is equal to zero. In practice, the pulse $h(t)$ and its shifted versions are invisible to each other, i.e., there is no interference between pulses when sampled at $t = nT$. This makes the receiver design simple: it is basically a filter matched to the pulse $h(t)$ followed by a sampler.

Again, let us turn to the question: How is it possible to increase the bit density of the system? If more bits are to be carried in a given transmission system, more time or bandwidth can be spent. If these two resources are scarce, the method used until now, in both coded and uncoded transmissions, is to employ $M$-ary modulation with a high $M$, which in turn requires a larger $E_b/N_0$ to maintain performance and, thus, more energy to be consumed. With Nyquist pulses, modulation theory states that there are $2WT_0$ independent channel uses in $T_0$ seconds and $W$ Hz, that is, 2 bits per Hz-s with binary symbols. The bit density of a simple $M$-ary modulator is thus $2 \log_2 M$ b/Hz-s. Detection theory states that in order to increase this bit density by 1, the required $E_b/N_0$ should almost double at larger values of $M$ if the error probability is not to increase. Telephone signaling and coded 64-ary QAM digital television exploit this principle.

A potential solution for this challenge is the use of non-orthogonal transmission schemes, which improve the bandwidth-energy efficiency as compared to the conventional orthogonal systems. In this dissertation, a somewhat unconventional signaling scheme called faster-than-Nyquist (FTN) is considered. The non-orthogonal transmission scheme, FTN, is a promising future candidate for wireless transmissions because of its inherent ability to improve the spectrum efficiency by increasing the data rate. In FTN, ISI is intentionally introduced by using a signaling rate that is faster than that allowed by the Nyquist orthogonality criterion.
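The orthogonality property discussed above, and its loss when pulses are sent faster, can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes T = 1, uses a unit-energy sinc pulse, and truncates the integration window. The function name `pulse_inner_product` and the grid parameters are hypothetical choices and are not taken from the MATLAB implementation used in this work.

```python
import numpy as np

def pulse_inner_product(shift, half_width=200.0, dt=0.01):
    """Approximate the inner product of sinc(t) and sinc(t - shift), with T = 1 (assumed)."""
    t = np.arange(-half_width, half_width, dt)
    # np.sinc(x) is sin(pi x)/(pi x): a unit-energy, T-orthogonal pulse for T = 1.
    return float(np.sum(np.sinc(t) * np.sinc(t - shift)) * dt)

nyquist = pulse_inner_product(1.0)  # shift by T: close to zero (orthogonal)
ftn = pulse_inner_product(0.8)      # shift by 0.8 T: clearly nonzero (ISI)
print(f"shift T: {nyquist:.4f}, shift 0.8T: {ftn:.4f}")
```

For the sinc pulse the inner product at a shift of s symbol times is itself sinc(s), so the second value should be close to `np.sinc(0.8)`; the truncated window limits the accuracy to roughly three decimal places.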


Faster-than-Nyquist signaling has recently attracted attention as a transmission scheme, but its roots trace back to the 1970s. FTN is gaining interest in our bandwidth-starved world because it can send 30%-100% more data using the same bandwidth, bit energy, and error rate as conventional orthogonal transmission systems. FTN signals exhibit a higher Shannon capacity as well [28].

1.1 Motivation for FTN signaling

The history of wireless communications dates back to the late 1800s, stemming from the pioneering contributions of G. Marconi, R. Fessenden, J.C. Bose, N. Tesla, and many others. Since then, wireless communications have come a long way in the form of telegraphy, radio broadcasting, television, and, in the most recent decades, mobile telephony. The mobile phone was originally used for voice-based services as a portable, handheld version of the landline phone. Mobile telephony then evolved into the second generation, known widely as the Global System for Mobile Communications (GSM) [9]. Since then, mobile telephony has grown explosively, with the mobile phone's non-voice-based services overtaking the voice-based uses. Consequently, the mobile phone has evolved into more than a mere voice-communicating device, which requires communication systems that can accommodate the next-generation wireless technologies. The ever-increasing challenge is to improve the existing technologies to cope with this rapid increase, or to envision newer techniques that scale up with the demand. While scaling the bandwidth has been possible to a certain extent, it poses challenges for the receiver processing and is subject to the restrictions imposed by the regulatory bodies on frequency usage. A more favorable solution would be to adapt the available bandwidth in the existing systems to meet the demand. Faster-than-Nyquist signaling is one such approach: it saves bandwidth at the cost of processing complexity. Nowadays, bandwidth is the scarce resource compared with the processing power that state-of-the-art semiconductor technology puts at our disposal. Thus, approaches such as FTN that trade processing complexity for bandwidth savings should be investigated.

1.1.1 FTN signaling: background

The concept of FTN was first proposed by J. Mazo in 1975 [10]. The transceiver involved in such signaling is considerably more complex to process, which limited its early adoption. FTN adoption mimics the trend of LDPC codes [11], which were introduced in 1962 but not used until recently due to their inherent processing complexity. The FTN concept is realized by transmitting signals at a rate faster than that allowed by the Nyquist criterion for ISI-free transmission [6, 12, 13]. The Nyquist transmission scheme is often referred to as orthogonal signaling, as there is no ISI present among the symbols. Faster-than-Nyquist signaling, on the other hand, packs more data into a given bandwidth than conventional systems and thus introduces intentional ISI in the transmission. If the orthogonal pulses are sent faster than $1/T$, then Nyquist's orthogonality criterion is violated and ISI is introduced. A more complex maximum likelihood sequence estimation (MLSE) receiver is then essential to mitigate the effects of the ISI.

If the receiver is able to cope with the FTN interference, then the system's spectral efficiency is greatly improved. This spectral efficiency improvement holds for any $T$-orthogonal pulse. In each case, there is a closest packing (a smallest spacing) at which the minimum Euclidean distance of the signal first falls below its orthogonal pulse value. This spacing is the Mazo limit corresponding to this pulse $h(t)$ and alphabet. Mazo investigated the sinc pulse with binary modulation, which has a minimum squared Euclidean distance $d_{\min}^2 = 2$. Mazo sent the pulses faster and discovered an interesting fact: the minimum Euclidean distance does not change as the time acceleration factor, $\tau$, falls below 1 and the pulses become nonorthogonal. He noticed that $d_{\min}^2$ remains 2 for $\tau$ in the range [0.802, 1] despite the ISI. This means that $1/0.802 \approx 1.25$ times as many bits, i.e., about 25% more, are accommodated in the transmission using the same bandwidth and without affecting the error rate performance. Hence, two terms have evolved: 1) the Mazo limit, which is the bandwidth at which $d_{\min}^2$ first falls below its orthogonal value, and 2) the Nyquist limit, which is the bandwidth below which orthogonality can no longer exist.
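Mazo's observation can be explored with a small brute-force search. The sketch below is an illustration, not the procedure used in [10]: it assumes T = 1, a unit-energy sinc pulse, binary symbols ±1 (so error symbols take values in {−2, 0, +2}), and the normalization $d^2 = \tfrac{1}{2}\,\mathbf{e}^{T}\mathbf{G}\,\mathbf{e}$, where $\mathbf{G}$ holds the pulse autocorrelation sampled at spacing $\tau$. The function name `min_sq_distance` and the maximum error-event length are hypothetical choices, and a finite search can only give an upper bound on the true minimum distance.

```python
import itertools
import numpy as np

def min_sq_distance(tau, max_len=6):
    """Smallest normalized squared distance over binary error events up to max_len symbols."""
    k = np.arange(max_len)
    # Autocorrelation of a unit-energy sinc pulse sampled at multiples of tau (T = 1 assumed).
    g = np.sinc(tau * (k[:, None] - k[None, :]))
    best = np.inf
    for tail in itertools.product((-2.0, 0.0, 2.0), repeat=max_len - 1):
        e = np.array((2.0,) + tail)  # first error fixed to +2 by sign symmetry
        best = min(best, 0.5 * float(e @ g @ e))
    return best

for tau in (1.0, 0.9, 0.7):
    print(tau, round(min_sq_distance(tau), 3))
```

Under these assumptions the search returns 2 for the spacings above the Mazo limit that are tried here, while at $\tau = 0.7$ a short alternating error pattern already brings the distance below 2.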

1.2 FTN prior work

Although FTN was developed in the 1970s, there was no extensive work on developing algorithms for FTN until recently. FTN has started to gain popularity in research communities worldwide because spectrum allocations have become more constrained while the requirements on hardware complexity have become more relaxed. The works listed below highlight current research in the field.

In [14], the preservation of the minimum Euclidean distance for binary signaling, despite violating the Nyquist orthogonality criterion, was extended to a family of raised-cosine pulses. Practical ways of taking advantage of these gains by utilizing an iterative joint equalization and decoding scheme were presented. In 2009, Barbieri, Fertonani, and Colavolpe [15] investigated the spectral efficiency achievable by a simple symbol-by-symbol receiver. They showed that, under the assumption of finite-order constellations, giving up orthogonality greatly improves the performance. In 2010, McGuire and Sima [16] presented a description of a discrete-time form of FTN signaling that enables a low-complexity receiver, which performs comparably to the continuous-time FTN signaling previously explored in the literature. Han and Zhang [17] discussed aspects of fading, Doppler, and the behavior of FTN in dispersive channels.

Hamamura and Tachikawa [18] defined the improvement of the bandwidth efficiency as High Compaction Multicarrier Modulation (HC-MCM). They made the transmitter and receiver design discrete Fourier transform (DFT)-based. They also discussed the bandwidth efficiency achieved along with the bit error rate (BER) performance. In 2010, Yoo and Cho [19] discussed the asymptotically achievable information rate of binary FTN signaling, and Kim and Bajcsy [20] presented the precoding of FTN signals and ways of preventing spectrum broadening due to precoding.

The authors in [21] discuss the spectral efficiency improvement achieved, at the expense of increased complexity, when intentionally violating the carrier orthogonality in an orthogonal frequency division multiplexing (OFDM) system.

In [22], peak-to-average power ratio (PAPR) reduction is discussed. In addition to the standard PAPR reduction techniques, a novel technique termed Sliding Window PAPR reduction is proposed. In [23], a procedure was given for finding good convolutional codes to be used in the iterative decoding of coded additive white Gaussian noise (AWGN) ISI channels, based on the input-output error rate characteristics of the BCJR outer decoder in the iterative decoding along with the code's minimum distance.

In 2009, capacity computations of FTN signaling were developed [24]. The authors show that the capacity of FTN transmissions is always higher than that of orthogonal linear modulation systems for all pulses except the sinc pulse. They also derive lower and upper bounds on the capacity of FTN under the constraint of a finite input alphabet.

The authors in [5] survey the nonorthogonal transmission scheme FTN, and analyze the system in both time and frequency domains to show the reason behind its higher capacity as compared to the Nyquist-based transmission schemes.

Rusek and Anderson [25] investigated the performance of concatenated coding systems based on the FTN concept. They analyzed both serial and parallel concatenations over an AWGN channel. In the case of parallel concatenations, a precoding device turns out to be essential. The results, in terms of the needed signal-to-noise ratio (SNR) versus the spectral efficiency, are very appealing. In [26], some basic receiver issues were studied in the context of FTN such as: How to model the signaling efficiently in discrete time, how much the Viterbi algorithm can be truncated, and how to combine the system with an outer code. The authors develop a minimum phase model and minimum distance formulations along with some receiver tests.

The authors in [27] propose reduced-search BCJR algorithms (M-BCJR) for low-complexity turbo detection of signals transmitted over ISI channels. The algorithms are applied to the detection of the severe ISI introduced by FTN signaling. The M-BCJR algorithms are compared to other reduced-trellis Viterbi algorithm (VA) and BCJR benchmarks. The authors conclude that the combination of coded FTN and the reduced-search BCJR is an attractive narrowband signaling scheme. In 2013, Anderson, Rusek, and Öwall [28] surveyed FTN signaling and concluded that FTN can transmit up to twice the bits of ordinary modulation while using the same bandwidth, bit energy, and error rate. They also show that the method can be directly extended to handle OFDM signaling.

1.3 Trellis-based detection algorithms

Decoding signals using trellises and trees is very popular in digital communications and plays a significant role in that field. In fact, many signal detection problems in wireless communications use trellis algorithms for the decoding process. One example is the detection of FTN signals. In FTN signaling, ISI is introduced by transmitting the signals at a rate higher than that allowed by the Nyquist orthogonality criterion. Therefore, in the presence of ISI, each received sample contains a combination of the most recent transmitted symbol and a number of past symbols equal to the memory of the ISI channel. This signal with memory can be modeled as a finite state machine (FSM) process and, therefore, can be represented using a trellis [29]. Multiple-input multiple-output (MIMO) and frequency-selective channels are other examples of systems where trellis detection is used.
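To make the finite state machine view concrete, the following Python sketch enumerates the trellis of a small ISI channel. It is a minimal illustration under assumptions: the tap values, the binary alphabet, and the function name `isi_trellis` are hypothetical and are not the channel models derived later in this dissertation.

```python
from itertools import product

def isi_trellis(alphabet, memory, taps):
    """Enumerate the trellis branches of an ISI channel viewed as a finite state machine.

    A state is the tuple of the `memory` most recent past symbols (newest first).
    Returns {(state, input_symbol): (next_state, noise_free_output)}.
    """
    assert len(taps) == memory + 1
    transitions = {}
    for state in product(alphabet, repeat=memory):
        for a in alphabet:
            window = (a,) + state  # current symbol followed by the past symbols
            output = sum(t * s for t, s in zip(taps, window))
            next_state = window[:memory]  # the oldest symbol drops out of the state
            transitions[(state, a)] = (next_state, output)
    return transitions

# Hypothetical memory-2 binary channel: 2**2 = 4 states, 2 branches per state.
trellis = isi_trellis(alphabet=(-1, 1), memory=2, taps=(1.0, 0.5, 0.2))
states = {state for state, _ in trellis}
print(len(states), "states,", len(trellis), "branches")
```

The number of states grows as the alphabet size raised to the channel memory, which is why full-trellis detection becomes costly for long ISI responses and large constellations.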

A well-known algorithm that operates on a trellis is the Viterbi algorithm (VA), which was invented in 1967 [30]. Because its computational load grows rapidly with the length of the ISI response and the constellation size, the full VA is intractable for such schemes. Instead, the M-algorithm invented by Anderson in 1969 [31] can be used. The M-algorithm searches only a part of the full trellis, thus reducing the underlying complexity considerably. For more information on the M-algorithm, see Section 2.3.
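The breadth-first idea behind the M-algorithm can be sketched in a few lines of Python. This is a simplified tree-search illustration and not the detector of Section 2.3: it does not merge paths that reach the same trellis state, it assumes a known real-valued tap vector and a squared-error path metric, and the names `m_algorithm`, `taps`, and `M` are chosen here for illustration.

```python
import numpy as np

def m_algorithm(received, taps, alphabet=(-1.0, 1.0), M=4):
    """Breadth-first detection that keeps only the M best paths at each depth."""
    paths = [((), 0.0)]  # each entry: (symbol history, accumulated squared error)
    for r_k in received:
        extended = []
        for history, metric in paths:
            for a in alphabet:
                new_history = history + (a,)
                # Newest symbol first; zip() stops early near the start of the block.
                recent = new_history[::-1][:len(taps)]
                y = sum(t * s for t, s in zip(taps, recent))
                extended.append((new_history, metric + (r_k - y) ** 2))
        extended.sort(key=lambda p: p[1])
        paths = extended[:M]  # discard everything but the M best paths
    return list(paths[0][0])

symbols = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]
taps = [1.0, 0.5, 0.2]  # hypothetical ISI response
received = np.convolve(symbols, taps)[:len(symbols)]  # noise-free channel output
detected = m_algorithm(received, taps, M=4)
print(detected == symbols)
```

The work per received sample is proportional to M times the alphabet size, independent of the channel memory, in contrast with the full trellis search.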


The invention of turbo codes [32, 33] was a milestone in the world of communications. The turbo processing principle, which was developed by Hagenauer [34], has been applied to concatenated systems in order to improve their overall performance. Shannon specified theoretical upper limits on capacity in 1949 for a given bandwidth, data rate, and SNR. However, practical codes could not approach the capacity limit until the advent of turbo codes in 1993. Turbo coding schemes have been applied in a wide range of applications in the latest wireless communications standards, such as the Universal Mobile Telecommunications System (UMTS) and LTE. These coding schemes were introduced as concatenations of convolutional codes, and they exhibit a BER performance close to the Shannon limit [32]. A basic turbo code consists of two or more component encoders separated by an interleaver, and the receiver decodes the received signal iteratively in order to improve the reliability of the decision about the detected signal. The principle of exchanging soft information between soft-input soft-output component decoders will be used frequently in this dissertation.

Turbo equalization [35] is one such application of the turbo principle. In turbo equalization, instead of two component decoders, a soft-output ISI equalizer is followed by a soft-output decoder, and the two blocks exchange likelihood ratios in order to make a decision on a particular symbol. Using this iterative approach, the turbo equalizer can get close to the optimal performance, which is achieved by joint equalization and decoding, while keeping the complexity tractable. More information about the turbo equalization principle can be found in Section 2.6.
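The bookkeeping of one soft-information hand-off between the equalizer and the decoder can be sketched as follows. This Python fragment is only an illustration under assumptions: the interleaver is a random permutation chosen here, the equalizer output is replaced by stand-in random numbers, and the names `interleave`, `deinterleave`, and `extrinsic` are hypothetical. It shows that each block passes on only its extrinsic log-likelihood ratios (posterior minus prior) and that the values are reordered to match the other block's symbol order.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
perm = rng.permutation(n)    # hypothetical random interleaver pattern
inv_perm = np.argsort(perm)  # index map that undoes `perm`

def interleave(x):
    return x[perm]

def deinterleave(x):
    return x[inv_perm]

def extrinsic(posterior_llr, prior_llr):
    """Extrinsic LLR: the posterior with the block's own prior removed."""
    return posterior_llr - prior_llr

# One equalizer-to-decoder hand-off with stand-in numbers (not a real equalizer).
prior_eq = np.zeros(n)                        # no prior in the first iteration
posterior_eq = prior_eq + rng.normal(size=n)  # placeholder for equalizer output LLRs
prior_dec = deinterleave(extrinsic(posterior_eq, prior_eq))
print(prior_dec)
```

In the reverse direction, the decoder's extrinsic values would be passed through `interleave` before being used as the equalizer's prior in the next iteration.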


Some well-known soft-output algorithms are the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm [36], which is also known as the maximum a posteriori (MAP) decoding algorithm, and the soft-output Viterbi algorithm (SOVA). Both of these algorithms require processing the complete trellis, which incurs impractical complexity for long ISI responses and large alphabets.

In this dissertation, reduced-complexity trellis- and tree-based soft-input soft-output algorithms for FTN signaling and general linear channels are employed. The well-known reduced-complexity M-BCJR algorithm, introduced by Franz and Anderson [38], is used for the soft-output equalization. A similar reduced-complexity algorithm called the T-algorithm was also proposed in [38]. Both of these algorithms reduce the complexity of the full BCJR significantly with only a slight performance degradation. A reduced-complexity variant of the SOVA is the soft-output M-algorithm (SOMA) [39]. A soft-output sequential decoder known as the LIST-sequential (LISS) decoder was proposed by Kuhn and Hagenauer in [40]. A popular reduced-complexity technique for turbo detection of MIMO channels is the sphere decoder proposed by Hochwald and ten Brink [41]. Note that the large amount of literature on reduced-complexity techniques prevents a full treatment in this dissertation. The above-mentioned reduced-complexity methods may be considered for future research in the context of turbo equalization of FTN signaling.

1.4 Dissertation contributions

In Chapter 3, FTN signaling is extended to carry symbols drawn from a higher order modulation alphabet along with the application of a nonbinary convolutional code. The performance of the extended system is evaluated in both uncoded and coded signaling setups. The performance results are obtained using the reduced-complexity nonbinary-adapted M-BCJR and the backup M-BCJR algorithms. Chapter 4 evaluates the performance of the same nonbinary setups demonstrated in Chapter 3 when the M-BCJR equalizer is replaced with the Z-MAP [42] equalizer. The performance results based on the Z-MAP algorithm are compared with the results from Chapter 3. Furthermore, the complexity behavior of the Z-MAP, in the context of FTN equalization, is also studied in the same chapter.

The dissertation is organized into five chapters. Chapter 1 introduces the dissertation work, covering the FTN signaling concept and the prior work in this field. Chapter 2 gives an overview of the linear signal transmission model, FTN signaling, and the trellis-based detectors employed at the FTN receiver. Turbo codes are also discussed in this chapter, including the encoder and decoder architectures and the interleaver.

The system model and the setup environment employing QPSK modulation along with the quaternary convolutional code are explained in Chapter 3. Further, in Chapter 3, two FTN-based schemes using the same modulation and different coding approaches are compared in terms of their BER performance.

Chapter 4 compares the performance of two reduced-search BCJR-based algorithms over the severe ISI channels introduced by FTN.

Chapter 5 summarizes the results of this research and presents some future research topics. All simulation results in this work were obtained using a MATLAB implementation.


CHAPTER 2

Basic principles of linear modulation systems

This chapter presents time-continuous linear modulation systems. Mathematical models for representing the linear modulation systems and the communications channel, along with basic detection algorithms, are described, resulting in a formulation of equivalent discrete-time models of these systems. For more details and derivations of the formulas, the reader is referred to [12, 43, 44, 45].

2.1 Single carrier linear modulation systems

The signal transmission model considered in this dissertation is the linear modulation scheme represented in baseband form as

$$s(t) = s_{\mathbf{a}}(t) \triangleq \sum_{n} a_n\, h(t - nT), \tag{2.1}$$

where $\mathbf{a} = \{a_0, a_1, a_2, \ldots\}$ is the complex-valued information-carrying symbol sequence and $h(t)$ is a real-valued continuous modulation pulse. In order to satisfy frequency assignment requirements on the system, the spectrum of (2.1) is modulated to a carrier frequency before transmission, yielding the radio frequency (RF) representation

$$s^{\mathrm{RF}}(t) = \sqrt{2}\,\mathrm{Re}\{ s(t)\, e^{j 2\pi f_c t} \}. \tag{2.2}$$

Here $f_c$ is the carrier frequency in Hz, $\mathrm{Re}\{\cdot\}$ denotes the real part of a complex number, and the superscript “RF” denotes a modulated signal. Note that the baseband signal (2.1) has its frequency support concentrated around 0. A simple model of the device that generates the bandpass signal from the baseband signal is shown in Figure 2.1. It is also assumed that the signal $s(t)$ is bandlimited to $W$ Hz, where $W$ is referred to as the bandwidth of $s(t)$. This band limitation is achieved by bandlimiting the modulation pulse $h(t)$ to $W$ Hz. Additionally, in order to avoid frequency overlaps in the transmitted signal, it is assumed that $W \ll f_c$.

Figure 2.1: A basic device for transmitting information via carrier modulation.

Now consider the communication system in Figure 2.2. The quaternary information sequence $\mathbf{u}$, whose elements are drawn from the integer ring $\mathbb{Z}_4$, is encoded by a quaternary encoder with code rate $R$, producing the sequence $\mathbf{v}$. The length of $\mathbf{v}$ is given by the length of $\mathbf{u}$ divided by $R$. The encoding introduces a structured dependence among the encoded symbols which in general improves the communication performance.

The mapper in Figure 2.2 now maps the quaternary codeword $\mathbf{v}$ onto the sequence $\mathbf{a}$ consisting of symbols from the QPSK symbol alphabet $\Omega$. A modulator uses the sequence $\mathbf{a}$ as input in order to produce a sequence of analog signal waveforms to be transmitted. Finally, an AWGN channel with noise $n(t)$ follows, resulting in the received signal $r(t)$, i.e., $r(t) = s(t) + n(t)$. It is assumed that the encoder/mapper combinations are such that they generate uncorrelated output streams, which can be expressed as

$$ E\{a_k a_l^*\} = \sigma_a^2\, \delta_{k-l}, \tag{2.3} $$

where $E\{\cdot\}$ denotes the expectation operator, $(\cdot)^*$ denotes complex conjugation, and $\delta_k$ is the Kronecker delta. These notations will be used throughout the dissertation.

Figure 2.2: A system model of a communications system when transmitted over an AWGN channel.

Furthermore, the data symbols are assumed to be equiprobable, i.e.,

$$ \Pr(a_k = a) = \frac{1}{|\Omega|}, \quad a \in \Omega. \tag{2.4} $$

In (2.4), $\Pr(\cdot)$ denotes a probability mass function (PMF), while a probability density function (PDF) will be denoted $p(\cdot)$ throughout. In this dissertation, convolutional codes are used for encoding. The symbol alphabet is assumed to be time-invariant. It is also a balanced one, that is

$$ \sum_{a \in \Omega} a = 0. $$

2.1.1 Bit and block error rate definitions

Since the received signal is distorted and noisy, the receiver decision about the transmitted symbol will occasionally, and in a random manner, be erroneous. To quantify this error as a performance measure, the bit error rate (alternatively BER, bit error ratio or bit error probability), defined as the average number of information bit errors per detected information bit, is used.

The uncoded sequence $\mathbf{u}$ consists of $N$ information symbols. These symbols are communicated over a linear channel. As shown in Figure 2.2, these symbols are generally encoded, which produces a longer sequence of $N/R$ symbols, $\mathbf{v}$. After mapping onto the discrete alphabet $\Omega$, the symbol sequence $\mathbf{a}$ is sent over the channel as analog waveforms. After filtering and sampling, the receiver observes the sequence $\mathbf{y}$ and produces an estimate $\hat{\mathbf{u}}$ of the information sequence $\mathbf{u}$. This estimate is usually made in two steps, demodulation and decoding. Usually the demapping is included in the demodulation process. The demodulation step converts the received analog signal into a sequence of information-carrying numbers containing both distortion and noise. This sequence is then fed to the decoder’s input which, by following a certain decoding rule, produces the estimate $\hat{\mathbf{u}}$ of the quaternary uncoded sequence $\mathbf{u}$.


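Now, consider the sequence $\mathbf{u}$ and let $u_k$ denote its $k$th information symbol and $P_{s,k} \triangleq \Pr(\hat{u}_k \neq u_k)$ be its error probability. The symbol error rate can now be defined as

$$ P_s \triangleq \frac{1}{N} \sum_{k=0}^{N-1} P_{s,k}, \tag{2.5} $$

and the average bit error probability is found by

$$ P_b = \frac{P_s}{\log_2 |\Omega|}. \tag{2.6} $$

In the same way, let us define the block error rate (BLER), denoted $P_{\mathrm{BLER}}$, as the probability that the estimated sequence $\hat{\mathbf{u}}$ is erroneous, that is

$$ P_{\mathrm{BLER}} \triangleq \Pr(\hat{\mathbf{u}} \neq \mathbf{u}). \tag{2.7} $$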

The smaller these quantities are the better the reliability of communications is, but

sometimes and due to certain constraints, the desired value of the error probability should

be specified. Typical values of are in the range of based on the 10 − 10 application.

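From (2.7) we have that

$$ P_{\mathrm{BLER}} = \Pr\Big\{ \bigcup_{k=0}^{N-1} (\hat{u}_k \neq u_k) \Big\}, \tag{2.8} $$

and by using the union bound, we obtain the following upper limit

$$ P_{\mathrm{BLER}} = \Pr\Big\{ \bigcup_{k=0}^{N-1} (\hat{u}_k \neq u_k) \Big\} \leq \sum_{k=0}^{N-1} \Pr(\hat{u}_k \neq u_k) = N P_s. \tag{2.9} $$

Additionally, we have that

$$ P_{\mathrm{BLER}} \geq \Pr(\hat{u}_k \neq u_k), \quad k = 0, \ldots, N-1. $$

Summing this inequality over all $k$, we can now write

$$ N P_{\mathrm{BLER}} \geq \sum_{k=0}^{N-1} \Pr(\hat{u}_k \neq u_k) = N P_s, $$

which results in

$$ P_{\mathrm{BLER}} \geq P_s. \tag{2.10} $$

Finally, $P_{\mathrm{BLER}}$ can be bounded by relating (2.9) and (2.10):

$$ P_s \leq P_{\mathrm{BLER}} \leq N P_s. \tag{2.11} $$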
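The bounds in (2.9) to (2.11) can be checked numerically with a minimal Python sketch. The function name `bler_bounds` and its arguments are hypothetical, the upper bound is additionally capped at 1 because it is a probability, and the third returned value assumes independent symbol errors, an assumption that the derivation above does not need and that is used here for illustration only.

```python
def bler_bounds(symbol_error_rate, num_symbols):
    """Return (lower, upper, independent) for the block error rate.

    lower and upper follow (2.11); the upper bound is capped at 1.
    'independent' is the block error rate if symbol errors were
    independent (illustrative assumption, not from the text).
    """
    lower = symbol_error_rate
    upper = min(1.0, num_symbols * symbol_error_rate)
    independent = 1.0 - (1.0 - symbol_error_rate) ** num_symbols
    return lower, upper, independent


if __name__ == "__main__":
    lo, hi, indep = bler_bounds(1e-4, 1000)
    print(f"{lo:.2e} <= {indep:.4f} <= {hi:.2e}")
```

Under the independence assumption the block error rate always lies between the two bounds, which shows that the union bound is tight when the symbol error rate is small.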

2.1.2 Bandwidth characteristics

The power spectral density (PSD), denoted $\Phi_s(f)$, of a wide-sense cyclostationary process $s(t)$ is defined as the Fourier transform of the time-averaged autocorrelation of the signal $s(t)$. For the following analysis, we need to define the autocorrelation of the pulse $h(t)$, which is denoted $g(\tau)$,

$$ g(\tau) \triangleq \int_{-\infty}^{\infty} h^*(t)\, h(t + \tau)\, dt, \tag{2.12} $$

or equivalently

$$ g(\tau) = h(\tau) \star h^*(-\tau), \tag{2.13} $$

where $\star$ is the convolution operator. From (2.12) it follows that the modulation pulse energy, denoted $E_h$, is given by

$$ E_h \triangleq \int_{-\infty}^{\infty} |h(t)|^2\, dt = g(0). \tag{2.14} $$

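The autocorrelation of $s(t)$ is

$$ \phi_s(t + \tau, t) \triangleq E\{ s(t + \tau)\, s^*(t) \} $$

$$ = \sum_k \sum_l E\{a_k a_l^*\}\, h(t + \tau - kT)\, h^*(t - lT) $$

$$ = \sigma_a^2 \sum_k h(t + \tau - kT)\, h^*(t - kT). \tag{2.15} $$

Since $s(t)$ is a wide-sense cyclostationary process with period $T$, its time average autocorrelation function is

$$ \bar{\phi}_s(\tau) \triangleq \frac{1}{T} \int_0^T \phi_s(t + \tau, t)\, dt $$

$$ = \frac{\sigma_a^2}{T} \sum_k \int_0^T h(t + \tau - kT)\, h^*(t - kT)\, dt $$

$$ = \frac{\sigma_a^2}{T} \int_{-\infty}^{\infty} h(t + \tau)\, h^*(t)\, dt = \frac{\sigma_a^2}{T}\, g(\tau). \tag{2.16} $$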

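Now taking the Fourier transform of (2.16), we get the power spectral density of $s(t)$:

$$ \Phi_s(f) \triangleq \mathcal{F}\{ \bar{\phi}_s(\tau) \} = \frac{\sigma_a^2}{T}\, \Lambda(f), \quad |f| < W, \tag{2.17} $$

where, based on (2.13), $\Lambda(f) = \mathcal{F}\{g(\tau)\}$ is given by

$$ \Lambda(f) = H(f)\, H^*(f) = |H(f)|^2. \tag{2.18} $$

Combining (2.17) and (2.18) gives

$$ \Phi_s(f) = \frac{\sigma_a^2}{T}\, |H(f)|^2. \tag{2.19} $$

Note that $|H(f)|^2$ is symmetric around $f = 0$ due to the fact that the modulation pulse is real-valued. The bandwidth $W$ is defined as the smallest single scalar number such that

$$ |H(f)|^2 = 0, \quad |f| > W. \tag{2.20} $$

This process is represented graphically in Figure 2.3.

Figure 2.3: The frequency content of the bandpass signal $s^{\mathrm{RF}}(t)$.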


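From the average power of $s(t)$, given by

$$ P = \bar{\phi}_s(0) = \frac{\sigma_a^2 E_h}{T}, \tag{2.21} $$

we obtain the average symbol energy

$$ E_s \triangleq T\, \bar{\phi}_s(0) = \sigma_a^2\, g(0) = \sigma_a^2 E_h. \tag{2.22} $$

The average energy per information bit is given by

$$ E_b \triangleq \frac{E_s}{\log_2 |\Omega|} = \frac{\sigma_a^2 E_h}{\log_2 |\Omega|}. \tag{2.23} $$

In this dissertation it is assumed that $h(t)$ is unit energy, i.e.,

$$ E_h = \int_{-\infty}^{\infty} |h(t)|^2\, dt = 1, $$

so that (2.22) and (2.23) become

$$ E_s = \sigma_a^2, \qquad E_b = \frac{\sigma_a^2}{\log_2 |\Omega|}. $$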


Finally, we come to the definition of the normalized bandwidth, denoted by

, so to make it possible to compare different setups. It is defined as the ratio between the total consumed bandwidth and the total information bit rate. Also, if denotes the number of dimensions spanned by the signal in (2.1), then the normalized bandwidth is defined as

≜ , Hz−s⁄ bit . (2.24) log |Ω|

A real $s(t)$ spans one dimension, while a complex $s(t)$ spans two dimensions. The number of data bits actually carried by a communications system is given by the product of its bandwidth and its time duration, divided by the normalized bandwidth. For example, a 2 MHz wide system running for 2 seconds carries $4 \times 10^{6}$ Hz-s divided by its normalized bandwidth, in bits.

2.1.3 The squared Euclidean distance definition

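In order for the receiver to distinguish between two analog signals, corresponding to data sequences $\mathbf{a}_1$ and $\mathbf{a}_2$, with good reliability, it is important that the two signals be as different as possible. One important measure that evaluates the difference between two signals is the squared Euclidean distance, which relates directly to the achieved bit error probability of the system. It is given by

$$ D^2(\mathbf{a}_1, \mathbf{a}_2) \triangleq \int_{-\infty}^{\infty} | s_{\mathbf{a}_1}(t) - s_{\mathbf{a}_2}(t) |^2\, dt $$

$$ = \int_{-\infty}^{\infty} \Big| \sum_k (a_{1,k} - a_{2,k})\, h(t - kT) \Big|^2 dt $$

$$ = \int_{-\infty}^{\infty} \Big| \sum_k e_k\, h(t - kT) \Big|^2 dt $$

$$ = \sum_k \sum_l e_k e_l^* \int_{-\infty}^{\infty} h(t - kT)\, h^*(t - lT)\, dt, \tag{2.25} $$

where $e_k = a_{1,k} - a_{2,k}$ and the notation $a_{i,k}$ denotes the $k$th symbol in the sequence $\mathbf{a}_i$. Since (2.25) only depends on the difference $\mathbf{a}_1 - \mathbf{a}_2$, a new quantity can be defined

$$ D^2(\mathbf{e}) \triangleq D^2(\mathbf{a}_1, \mathbf{a}_2), \tag{2.26} $$

where the error symbols in an error event $\mathbf{e}$ belong to an error symbol alphabet $\mathcal{E}$. An equivalent expression of the squared Euclidean distance is attained if (2.12) is used in (2.25):

$$ D^2(\mathbf{e}) = \sum_k \sum_l e_k e_l^*\, g((l - k)T) $$

$$ = \sum_k \sum_l e_k e_l^*\, g_{l-k}, \tag{2.27} $$

where $g_k$ is the autocorrelation function of $h(t)$ sampled at the baud rate, i.e.,

$$ g_k \triangleq g(kT). \tag{2.28} $$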

Since $D^2(\mathbf{e})$ in (2.27) depends on the average symbol energy $E_s$, it is not possible to compare different systems with different pulses and different modulation alphabets if the value of $E_s$ varies. Thus, it is more favourable to use the normalized squared Euclidean distance, which is defined as

$$ d^2(\mathbf{e}) \triangleq \frac{D^2(\mathbf{e})}{2 E_b} = \frac{\log_2 |\Omega|}{2 E_s}\, D^2(\mathbf{e}). \tag{2.29} $$

The asymptotic error probability of any linear signaling system, neglecting multiplicities, depends mainly on the normalized minimum squared Euclidean distance [45], defined as

$$ d^2_{\min} \triangleq \min_{\mathbf{e} \neq \mathbf{0}} \{ d^2(\mathbf{e}) \}. \tag{2.30} $$

This significant measure is found by performing a minimization over all the error events possible by the outer code. In the case of uncoded transmission, the minimization is carried out over all the error events. From now on, $d^2_{\min}$ will be referred to as the minimum distance. Note that $d^2_{\min}$ in (2.30) depends on $h(t)$.
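As an illustration of (2.27) and (2.29), the following Python sketch evaluates the normalized squared Euclidean distance of a given error event. The function name, the dictionary representation of the sampled autocorrelation, and the symbol names (`e`, `g`, `es`) are assumptions made for this example and are not taken from the dissertation.

```python
import math


def normalized_sq_distance(e, g, alphabet_size, es=1.0):
    """Normalized squared Euclidean distance of an error event.

    e: error sequence (real or complex entries).
    g: dict {lag: g_lag} for lags >= 0; negative lags are taken as
       the complex conjugate of the positive ones.
    alphabet_size: |Omega|; es: average symbol energy E_s.
    """
    def g_at(lag):
        if lag >= 0:
            return g.get(lag, 0.0)
        return complex(g.get(-lag, 0.0)).conjugate()

    total = 0.0
    for k, ek in enumerate(e):
        for l, el in enumerate(e):
            # D^2(e) = sum_k sum_l e_k e_l^* g_{l-k}, which is real overall
            total += (ek * complex(el).conjugate() * g_at(l - k)).real
    return math.log2(alphabet_size) / (2.0 * es) * total


if __name__ == "__main__":
    # Binary antipodal symbols (+1/-1), orthogonal pulse, single error
    print(normalized_sq_distance([2], {0: 1.0}, 2, 1.0))
    # Same alphabet with one ISI tap g_1 = 0.8 and error event (2, -2)
    print(normalized_sq_distance([2, -2], {0: 1.0, 1: 0.8}, 2, 1.0))
```

For a single error with an orthogonal pulse the sketch returns the familiar value 2, both for binary symbols with $E_s = 1$ and for QPSK symbols $\pm 1 \pm j$ with $E_s = 2$, while strong ISI can push the distance of longer error events below that value.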

2.1.4 T-orthogonal modulation pulses

A pulse $h(t)$ is $T$-orthogonal (or orthogonal over $T$-shifts) if it satisfies

$$ \int_{-\infty}^{\infty} h(t)\, h^*(t - kT)\, dt = 0, \quad k = \pm 1, \pm 2, \ldots, \tag{2.31} $$

where $T$ is the symbol duration. Note that with this notation, the sampled autocorrelation function $g_k$ in (2.28) is $g_k = \delta_k$. Since the $T$-orthogonal pulse is uncorrelated with any shift of itself by a multiple of $T$, it is possible to extract the symbol $a_n$ from the noise-free signal $s(t)$ by performing the correlation integral,

$$ \int_{-\infty}^{\infty} s(t)\, h(t - nT)\, dt = \int_{-\infty}^{\infty} \sum_k a_k\, h(t - kT)\, h(t - nT)\, dt = a_n \int_{-\infty}^{\infty} |h(t - nT)|^2\, dt. \tag{2.32} $$

If $E_h = 1$ in (2.14), the right-hand side of (2.32) is equal to $a_n$. In fact, if the signal $s(t)$ is passed through a matched filter (MF), and then sampled at intervals of $T$ seconds, the whole sequence $\mathbf{a}$ is recovered. If $g_k \neq \delta_k$, ISI is present. The memory $L$ of the ISI response $\{g_k\}$, given by the smallest $L$ so that

$$ g_k = 0, \quad |k| > L, \tag{2.33} $$

governs the complexity of the trellis-based detection algorithm. An ISI channel with a memory $L$ and a constellation size $|\Omega|$ produces a trellis of size $|\Omega|^L$. In order to decrease the complexity of the receiver, the ISI memory should be reduced as much as possible, keeping in mind the smallest possible bandwidth too. However, there is a tradeoff between the ISI length and the bandwidth consumed, i.e., reducing the bandwidth will in general increase the length of the ISI response. The narrowest required bandwidth of any $T$-orthogonal pulse is $1/2T$, and the least bandwidth-consuming pulse among all $T$-orthogonal pulses is the sinc pulse,

$$ h(t) = \frac{1}{\sqrt{T}}\, \frac{\sin(\pi t / T)}{\pi t / T}. \tag{2.34} $$

The Fourier transform of the pulse in (2.34) is a rectangular pulse, given by

$$ H(f) = \begin{cases} \sqrt{T}, & |f| \leq 1/2T \\ 0, & |f| > 1/2T. \end{cases} \tag{2.35} $$

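In order to reduce the amplitude variations of the signal $s(t)$ and to minimize the temporal tails of the pulse $h(t)$, another class of $T$-orthogonal pulses is used, the root raised cosine (rRC). The rRC has a smoother spectrum and its Fourier transform is represented by:

$$ |H(f)|^2 = \begin{cases} T, & |f| \leq (1 - \alpha)/2T \\ T \cos^2\!\Big[ \dfrac{\pi T}{2\alpha} \Big( |f| - \dfrac{1 - \alpha}{2T} \Big) \Big], & (1 - \alpha)/2T < |f| \leq (1 + \alpha)/2T \\ 0, & |f| > (1 + \alpha)/2T. \end{cases} \tag{2.36} $$

The factor $\alpha$, $0 \leq \alpha \leq 1$, is known as the “excess bandwidth” or the “roll-off” factor. The bandwidth of the rRC pulse in (2.36) is equal to $(1 + \alpha)/2T$, i.e., it is a fraction $\alpha$ greater than that of the equivalent sinc pulse. When $\alpha = 0$, we obtain a sinc pulse. Nyquist showed [6] that a sufficient condition to ensure the $T$-orthogonality is that the Fourier transform of the pulse be antisymmetric around the frequency component $f = 1/2T$. The rRC pulse family satisfies this criterion. The time domain representation of the rRC pulse is given by

$$ h(t) = \begin{cases} \dfrac{1}{\sqrt{T}}\, \dfrac{\sin(\pi (1 - \alpha) t / T) + (4 \alpha t / T) \cos(\pi (1 + \alpha) t / T)}{(\pi t / T)\big(1 - (4 \alpha t / T)^2\big)}, & t \neq 0, \pm \dfrac{T}{4\alpha} \\ \dfrac{1}{\sqrt{T}} \Big( 1 - \alpha + \dfrac{4\alpha}{\pi} \Big), & t = 0 \\ \dfrac{\alpha}{\sqrt{2T}} \Big[ \Big(1 + \dfrac{2}{\pi}\Big) \sin\dfrac{\pi}{4\alpha} + \Big(1 - \dfrac{2}{\pi}\Big) \cos\dfrac{\pi}{4\alpha} \Big], & t = \pm \dfrac{T}{4\alpha}. \end{cases} \tag{2.37} $$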
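The following Python sketch evaluates the standard unit-energy root raised cosine pulse, including its two removable singularities, and approximates the correlation integral of (2.31) by a Riemann sum. The function names, the symbol `alpha` for the excess bandwidth factor, and the integration span and step are assumptions chosen for this illustration only.

```python
import math


def rrc_pulse(t, T=1.0, alpha=0.3):
    """Unit-energy root raised cosine pulse h(t) with roll-off alpha."""
    if alpha == 0.0:
        # alpha = 0 degenerates to the sinc pulse
        x = math.pi * t / T
        return 1.0 / math.sqrt(T) if t == 0 else math.sin(x) / (x * math.sqrt(T))
    if t == 0:
        return (1.0 - alpha + 4.0 * alpha / math.pi) / math.sqrt(T)
    if math.isclose(abs(t), T / (4.0 * alpha), rel_tol=1e-12):
        a = math.pi / (4.0 * alpha)
        return alpha / math.sqrt(2.0 * T) * (
            (1.0 + 2.0 / math.pi) * math.sin(a)
            + (1.0 - 2.0 / math.pi) * math.cos(a))
    x = t / T
    num = (math.sin(math.pi * (1.0 - alpha) * x)
           + 4.0 * alpha * x * math.cos(math.pi * (1.0 + alpha) * x))
    den = math.pi * x * (1.0 - (4.0 * alpha * x) ** 2)
    return num / (den * math.sqrt(T))


def pulse_correlation(shift, T=1.0, alpha=0.3, span=40.0, step=0.01):
    """Riemann-sum approximation of the integral of h(t) h(t - shift)."""
    n = int(2 * span / step)
    acc = 0.0
    for i in range(n + 1):
        t = -span + i * step
        acc += rrc_pulse(t, T, alpha) * rrc_pulse(t - shift, T, alpha)
    return acc * step


if __name__ == "__main__":
    print(pulse_correlation(0.0))   # close to 1 (unit energy)
    print(pulse_correlation(1.0))   # close to 0 (T-orthogonality)
    print(pulse_correlation(0.5))   # clearly nonzero for a shift below T
```

The nonzero correlation at a shift of $T/2$ is what gives rise to ISI when the pulses are packed closer than $T$.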

Figure 2.4 shows the time domain representation of the rRC pulse with three different values of the excess bandwidth factor, $\alpha = 0$, $0.3$, and $0.6$. The figure shows that the case with $\alpha = 0$ is equivalent to the sinc pulse. Furthermore, as $\alpha$ increases, the pulse gets narrower, with smaller amplitude variations in the oscillations. Since the rRC pulses have infinite time support, they have to be truncated in practice, but care should be taken not to truncate the pulse too early, which would in turn give misleading test results.


Figure 2.4: Root RC pulses at three different values of the excess bandwidth factor $\alpha$.

2.2 Introduction to maximum-likelihood sequence estimation

Whenever ISI is present in the received signal, sequence estimation should be performed. In this section the maximum likelihood sequence estimation (MLSE) is presented. Furthermore, it is assumed that the ISI-structured signal is transmitted over an AWGN channel resulting in

$$ r(t) = s_{\mathbf{a}}(t) + n(t). \tag{2.38} $$

It is assumed that $n(t)$ is a complex-valued white Gaussian process with mean $0$ and autocorrelation

$$ E\{ n(t)\, n^*(t + \tau) \} = N_0\, \delta(\tau). \tag{2.39} $$


Furthermore, uncoded data transmission is assumed, i.e., $\mathbf{v} = \mathbf{u}$. Even in the case that no multipath exists in the environment, that is, the channel impulse response is $c(t) = \delta(t)$, there can still be a need for sequence detection if the ISI is intentionally introduced in the transmitter, as is always the case in this dissertation.

For the purpose of detecting the data symbols $\mathbf{a}$ from the received signal $r(t)$, MLSE is applied at the receiver. The decoding rule of the MLSE is

$$ \hat{\mathbf{a}} \triangleq \arg\max_{\mathbf{a}}\, p(r(t) \mid \mathbf{a}), \tag{2.40} $$

where $p(r(t) \mid \mathbf{a})$ is the conditional PDF of $r(t)$ given that the sequence $\mathbf{a}$ is sent. It can be shown [12] that the optimization in (2.40) is optimal if and only if the sequences are equiprobable. It is also possible to show [12] that the optimization in (2.40) corresponds to minimizing the Euclidean distance between the received signal and the estimated sequence in the case of transmitting over an AWGN channel:

$$ \hat{\mathbf{a}} = \arg\min_{\mathbf{a}} \int_{-\infty}^{\infty} | r(t) - s_{\mathbf{a}}(t) |^2\, dt = \arg\min_{\mathbf{a}} \int_{-\infty}^{\infty} \Big( |r(t)|^2 - 2 \mathcal{R}\{ r(t)\, s_{\mathbf{a}}^*(t) \} + |s_{\mathbf{a}}(t)|^2 \Big) dt. \tag{2.41} $$

Note that the term $\int |r(t)|^2\, dt$ does not affect the minimization (it does not depend on $\mathbf{a}$) and can therefore be neglected. By combining (2.1) and (2.41), the minimization of the Euclidean distance corresponds to the following maximization

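$$ \hat{\mathbf{a}} = \arg\max_{\mathbf{a}} \Big\{ \mathcal{R}\Big\{ \int_{-\infty}^{\infty} r(t)\, s_{\mathbf{a}}^*(t)\, dt \Big\} - \frac{1}{2} \int_{-\infty}^{\infty} |s_{\mathbf{a}}(t)|^2\, dt \Big\} $$

$$ = \arg\max_{\mathbf{a}} \Big\{ \mathcal{R}\Big\{ \sum_k a_k^* \int_{-\infty}^{\infty} r(t)\, h^*(t - kT)\, dt \Big\} - \frac{1}{2} \int_{-\infty}^{\infty} |s_{\mathbf{a}}(t)|^2\, dt \Big\} $$

$$ = \arg\max_{\mathbf{a}} \Big\{ \mathcal{R}\Big\{ \sum_k a_k^* y_k \Big\} - \frac{1}{2} \int_{-\infty}^{\infty} |s_{\mathbf{a}}(t)|^2\, dt \Big\}, \tag{2.42} $$

where

$$ y_k \triangleq \int_{-\infty}^{\infty} r(t)\, h^*(t - kT)\, dt. \tag{2.43} $$

The sequence $\mathbf{y} = \{y_0, y_1, y_2, \ldots\}$ can be obtained by applying a matched filter $h^*(-t)$ followed by a signaling rate sampler at the receiver. This is schematically shown in Figure 2.5. In addition, the sequence $\mathbf{y}$ is a set of sufficient statistics for detecting $\mathbf{a}$. By using the expression for $r(t)$ in (2.38), the samples $y_k$ become

$$ y_k = \sum_l a_l \int_{-\infty}^{\infty} h(t - lT)\, h^*(t - kT)\, dt + \int_{-\infty}^{\infty} n(t)\, h^*(t - kT)\, dt $$

$$ = \sum_l a_l\, g_{k-l} + \eta_k, \tag{2.44} $$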

where the sequence is ISI if . The noise sequence is a white = , . . . , ≠ sequence. An equivalent discrete-time model of (2.38) is therefore


⋆ . 2.45

The sequence is a causal -tap long ISI response sequence with 1 autocorrelation while is a random Gaussian sequence with zero mean and autocorrelation

, . 2.46

Figure 2.5: A straightforward way to produce the sequence from the received signal .

2.2.1 The recursively-structured MLSE

This section presents the recursive structure of the MLSE principle realized by the

Viterbi algorithm (VA). As before, it is assumed that an uncoded sequence is transmitted over the AWGN channel in (2.45). The reader is assumed to be familiar with the trellis structure of the algorithm; nevertheless, a brief review is given next.

Assume a length- quaternary data sequence and a causal ISI response such that

0, 2.47

The quaternary setup above can be associated with a $4^L$-state trellis of depth $N$ where each state corresponds to a distinct combination of the $L$ most recent symbols, that is

$$ \sigma_k \triangleq (a_{k-L}, \ldots, a_{k-2}, a_{k-1}), \tag{2.48} $$

where $\sigma_k$ represents a state in the trellis at depth $k$. Figure 2.6 shows an example of a 4-state binary trellis (for simplicity) where $L = 2$.

Figure 2.6: A 4-state binary trellis example.

In a quaternary modulation setup, a state $\sigma_k$ is connected to four different states at depth $k + 1$. These four states can be uniquely identified by the symbol pattern corresponding to the origin state at depth $k$ and the transition symbol at time $k$. Suppose $\sigma_k = (\sigma_{k,1}, \sigma_{k,2}, \ldots, \sigma_{k,L})$ is a state in the trellis and $a_k \in \{+1+j, -1+j, +1-j, -1-j\}$ is the transition symbol at time $k$. The four states at depth $k + 1$ that are connected to $\sigma_k$ are now given by $(\sigma_{k,2}, \ldots, \sigma_{k,L}, +1+j)$, $(\sigma_{k,2}, \ldots, \sigma_{k,L}, -1+j)$, $(\sigma_{k,2}, \ldots, \sigma_{k,L}, +1-j)$, and $(\sigma_{k,2}, \ldots, \sigma_{k,L}, -1-j)$ for inputs $+1+j$, $-1+j$, $+1-j$, and $-1-j$, respectively. The succession of states is Markovian. In the figure above, a branch is a line that connects two states, while a trellis path is a sequence of connected states. An MLSE algorithm selects the most likely data symbol sequence, i.e., the one that maximizes (2.40). Assuming that all paths begin from the so-called all-zero state, $\sigma_0 = (+1+j, +1+j, \ldots, +1+j)$, there exist in total $4^N$ different paths through the trellis. Since every path represents a distinct symbol sequence, the decoding problem is equivalent to finding the most likely trellis path.

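Consider now the equivalent expression of the last term in (2.42) given by

$$ \frac{1}{2} \int_{-\infty}^{\infty} |s_{\mathbf{a}}(t)|^2\, dt = \frac{1}{2} \sum_k \sum_l a_k^*\, a_l\, g_{k-l}. \tag{2.49} $$

Equation (2.47) implies that $\{g_k\}$ is a finite ISI sequence. Introducing the definition of the path metric (compare with (2.42))

$$ \Theta(\mathbf{a}) = \mathcal{R}\Big\{ \sum_k a_k^* y_k \Big\} - \frac{1}{2} \sum_k \sum_l a_k^*\, a_l\, g_{k-l}, \tag{2.50} $$

we notice that $\Theta(\mathbf{a})$ can be recursively computed as

$$ \Theta(\ldots, a_{k-1}, a_k) = \Theta(\ldots, a_{k-2}, a_{k-1}) + \mathcal{R}\Big\{ a_k^* \Big( y_k - \frac{1}{2} g_0 a_k - \sum_{l=1}^{L} g_l\, a_{k-l} \Big) \Big\}. \tag{2.51} $$

By following the same notation as in (2.48), the survivor metric of the VA can now be defined as

$$ \Theta(\sigma_{k+1}) \triangleq \max_{a_0, \ldots, a_{k-L}} \Theta(\ldots, a_{k-1}, a_k). \tag{2.52} $$

Finally, relating (2.51) and (2.52) gives

$$ \Theta(\sigma_{k+1}) = \mathcal{R}\{ a_k^* y_k \} + \max_{\sigma_k \to \sigma_{k+1}} \Big\{ \Theta(\sigma_k) - \frac{1}{2} g_0\, a_k^* a_k - \mathcal{R}\Big\{ a_k^* \sum_{l=1}^{L} g_l\, a_{k-l} \Big\} \Big\}. \tag{2.53} $$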
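The recursion in (2.51) can be sanity-checked against the direct definition in (2.50) with a short Python sketch. It assumes that the sampled pulse autocorrelation is supplied as a list `g = [g_0, ..., g_L]` with negative lags given by complex conjugation; both function names and this data layout are illustrative assumptions.

```python
def direct_metric(a, y, g):
    """Path metric from its definition: Re{sum a_k^* y_k} - 1/2 sum_k sum_l a_k^* a_l g_{k-l}."""
    def g_at(lag):
        tap = g[abs(lag)] if abs(lag) < len(g) else 0.0
        return tap if lag >= 0 else complex(tap).conjugate()

    corr = sum((complex(ak).conjugate() * yk).real for ak, yk in zip(a, y))
    quad = sum((complex(ak).conjugate() * al * g_at(k - l)).real
               for k, ak in enumerate(a) for l, al in enumerate(a))
    return corr - 0.5 * quad


def recursive_metric(a, y, g):
    """Same metric accumulated one symbol at a time, as in the recursion."""
    L = len(g) - 1
    theta = 0.0
    for k, ak in enumerate(a):
        # interference from the L most recent symbols on the path
        isi = sum(g[l] * a[k - l] for l in range(1, L + 1) if k - l >= 0)
        theta += (complex(ak).conjugate() * (y[k] - 0.5 * g[0] * ak - isi)).real
    return theta


if __name__ == "__main__":
    a = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j]
    y = [0.3 - 0.2j, 1.1 + 0.4j, -0.7 + 0.9j, 0.2 - 1.3j, 0.8 + 0.1j]
    g = [1.0, 0.4, -0.1]
    print(direct_metric(a, y, g), recursive_metric(a, y, g))
```

Both functions return the same value for any sequence, which is the property the Viterbi algorithm exploits: the increment at depth $k$ depends only on $y_k$, the new symbol, and the $L$ symbols stored in the state.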

2.2.2 The error performance of the MLSE

The performance of the MLSE algorithm is evaluated in this section by investigating its error probability. We start by defining the probability of a symbol error as

$$ P_{s,k} \triangleq \Pr(\hat{a}_k \neq a_k), \tag{2.54} $$

where $\hat{a}_k$ is an MLSE estimated symbol at depth $k$. Since there exists no closed-form formula for $P_{s,k}$ in the case of ISI, upper bound estimates should be sought. A useful definition that will be used with the results is the complementary Gaussian distribution function, also known as the Gaussian tail function, given by

$$ Q(x) \triangleq \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt. \tag{2.55} $$
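The Gaussian tail function is conveniently evaluated through the complementary error function, since $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. The Python sketch below also includes a hypothetical helper that evaluates $Q(\sqrt{d^2 E_b / N_0})$, the form that the normalization in (2.29) leads to for a pairwise error; the helper name and its dB interface are assumptions made for illustration, and multiplicative constants are ignored.

```python
import math


def q_function(x):
    """Gaussian tail function Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pairwise_error_estimate(d2, ebn0_db):
    """Q(sqrt(d2 * Eb/N0)) for a normalized squared distance d2.

    Illustrative helper: constants such as multiplicities are ignored.
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return q_function(math.sqrt(d2 * ebn0))


if __name__ == "__main__":
    for snr_db in (0, 4, 8):
        print(snr_db, pairwise_error_estimate(2.0, snr_db))
```

Because $Q(x)$ falls off roughly as $e^{-x^2/2}$, even a small reduction in the distance argument raises the estimate by a large factor at high SNR.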

An upper bound to , according to [29, 46], is

≤ () (), (2.56) ∈

where is the set of all possible error events. The quantities and are the () multiplicity and Hamming weight of the error event , respectively. Given the fact that and , the summation in (2.56) can be () = (−), () = (−) = performed only over those events where with given by = , , . . . = 2

() = 2 . (2.57)

Additional other general cases can be found in [12, 44].

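Since $Q(x)$ declines steeply towards zero as $x$ grows, the dominant part affecting the behavior of the bound is the minimum distance in the function’s argument. As a matter of fact, Forney [29] showed that, given (2.56) converges, there are two constants $K_1$ and $K_2$ such that

$$ K_1\, Q\Big( \sqrt{d^2_{\min} E_b / N_0} \Big) \leq P_{s,k} \leq K_2\, Q\Big( \sqrt{d^2_{\min} E_b / N_0} \Big). \tag{2.58} $$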

The first error probability , which is the probability that an error event begins at depth , given that there are no errors up to this point, is defined as

≜ Pr ( ≠ | , . . . , = ). (2.59)

By omitting the Hamming weight factor in (2.56), an upper bound to the first error probability is attained.

2.3 The M-algorithm

Even though the MLSE decoder is the optimal sequence detector, its complexity grows exponentially as the length of the ISI response, $L$, increases. If the size of the underlying trellis, $|\Omega|^L$, is too large, then a realization of the optimal MLSE is intractable. Instead, suboptimal reduced-complexity techniques should be used. One well-established trellis-based decoder is the M-algorithm, invented by Anderson in 1969 [31]. It is based on traversing only a part of the full trellis, thus reducing the underlying complexity considerably. The M-algorithm is given next and it will be used throughout this dissertation.

The M-algorithm is a suboptimal trellis-search technique, which reduces the trellis decoding complexity by traversing only a portion of the trellis. At each depth $k$, only the $M$ most probable states (survivors),

$$ \{ \sigma_k^{(1)}, \sigma_k^{(2)}, \ldots, \sigma_k^{(M)} \}, \tag{2.60} $$

which are the states with the highest cumulative metrics $\Theta(\cdot)$, are retained and extended to the next trellis depth $k + 1$, while the other states are discarded. Furthermore, no paths are extended from a discarded state. The retained states are the ones which are the closest in Euclidean distance to the received signal sequence. When a new received sample reaches the M-algorithm detector, the $M$ best states are extended to the next trellis depth, making up to $M|\Omega|$ new states, which are then sorted, and only the $M$ most promising states along with their corresponding paths are kept. This process continues until the end of the trellis is reached. The path that reaches the end of the trellis with the highest metric value $\Theta(\cdot)$ is called the approximated maximum likelihood (ML) path.

The main advantage of the M-algorithm over other reduced-complexity decoding techniques is that it performs the same number of computations at each trellis depth $k$. This behavior makes it easy for a system designer to specify the parameter $M$ in advance to meet a certain complexity-performance tradeoff. The total number of calculations the M-algorithm performs for a depth-$N$ trellis is reduced from $N|\Omega|^{L+1}$ to $NM|\Omega|$. However, a sorting of $M|\Omega|$ states is needed at each trellis depth $k$. In fact, there is no need for a complete sorting of all of the $M|\Omega|$ states; only the $M$ best states must be found, which is a linear operation in $M$ [47].
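The search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the detector used in this dissertation: it uses the recursive metric of (2.51) with a hypothetical tap list `g = [g_0, ..., g_L]`, stores complete paths rather than states, does not merge paths that enter the same state, and selects the survivors with `heapq.nlargest` instead of a full sort.

```python
import heapq


def m_algorithm(y, g, alphabet, M):
    """Breadth-first search keeping the M best paths at every depth.

    y: matched-filter samples; g: [g_0, ..., g_L] sampled pulse
    autocorrelation; alphabet: candidate symbols; M: survivors per depth.
    Returns (best_path, best_metric).
    """
    L = len(g) - 1
    survivors = [(0.0, [])]              # (cumulative metric, symbol path)
    for k, yk in enumerate(y):
        candidates = []
        for theta, path in survivors:
            # interference from the L most recent symbols on this path
            isi = sum(g[l] * path[k - l] for l in range(1, L + 1) if k - l >= 0)
            for a in alphabet:
                inc = (complex(a).conjugate() * (yk - 0.5 * g[0] * a - isi)).real
                candidates.append((theta + inc, path + [a]))
        # keep only the M candidates with the highest cumulative metric
        survivors = heapq.nlargest(M, candidates, key=lambda c: c[0])
    best_metric, best_path = max(survivors, key=lambda c: c[0])
    return best_path, best_metric


if __name__ == "__main__":
    qpsk = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
    sent = [1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
    # Orthogonal pulse and no noise: the matched-filter output equals the data
    path, metric = m_algorithm(sent, [1.0], qpsk, M=1)
    print(path == sent, metric)
```

At each depth the sketch evaluates at most $M|\Omega|$ extensions, so the work per depth is fixed by $M$ regardless of the ISI length.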

In [48, 49], it was shown that the M-algorithm is optimum in terms of minimizing the probability of missing a correct path among the constant-complexity breadth-first decoders. More information on other variants of the M-algorithm can be found in [50, 51, 52].

There are two types of reduced-complexity decoders: ones that traverse only a small part of the full trellis, and ones that execute all the possible paths of a reduced-size trellis. Algorithms which move in the forward direction only, such as the M-algorithm, are called breadth-first algorithms. Algorithms that move in both directions are called

backtracking decoders. The breadth-first class of decoders can further be classified into

two groups: the one-way decoders and the two-way decoders. Algorithms from the first

group, such as the VA, perform only one forward recursion, while algorithms from the

second group, such as the BCJR algorithm presented in Section 2.4.1, apply one forward

and one backward recursion. Other examples of the one-way decoders from the first class

are the well-known soft-output VA (SOVA), introduced by Hagenauer and Hoeher in

[37], and its reduced-complexity version, which is the soft-output M-algorithm (SOMA)

proposed in [39]. More examples on the breadth-first decoders can be found in [53, 54].

Two examples of the backtracking decoders are the Fano and Stack algorithms. For more

details on algorithms from both classes, the reader is referred to [55, 56].

2.4 Maximum a posteriori decoding

Optimal methods for detecting trellis-structured sequences are based on the

maximum likelihood sequence estimation algorithms, which are optimal in the sense of

minimizing the bit error rate and the block error rate. In the presence of a priori

information about the transmitted sequences the MLSE turns into a maximum a


posteriori (MAP) decoder. This section considers the MAP symbol-by-symbol trellis decoder proposed by Bahl, Cocke, Jelinek and Raviv in 1974 [36]. It is well-known as the

BCJR algorithm.

Let $\Pr(\hat{\mathbf{a}} = \mathbf{a}) = 1 - \Pr(\hat{\mathbf{a}} \neq \mathbf{a})$ be the probability of receiving a correct sequence estimate at the receiver. Moreover, let $p(\mathbf{y})$ be the PDF of the received sequence $\mathbf{y} = \{y_0, y_1, y_2, \ldots\}$ from (2.45). Then, the probability of a correct decision is given by

$$ \Pr(\hat{\mathbf{a}} = \mathbf{a}) = \int \Pr(\mathbf{a} \text{ sent} \mid \mathbf{y})\, p(\mathbf{y})\, d\mathbf{y}. \tag{2.61} $$

The goal of an optimal decoder is to minimize the probability of error, or to maximize the probability of making the correct decision. The right-hand side of (2.61) is maximized when the term $\Pr(\mathbf{a} \text{ sent} \mid \mathbf{y})$ is maximized for each $\mathbf{y}$. Thus, once the received signal is observed, the optimal decision rule is

$$ \hat{\mathbf{a}} = \arg\max_{\mathbf{a}} \Pr(\mathbf{a} \text{ sent} \mid \mathbf{y}). \tag{2.62} $$

The decision rule in (2.62) is known as the MAP rule. This receiver minimizes the probability of detecting a message in error. We can equivalently write

$$ \Pr(\mathbf{a} \text{ sent} \mid \mathbf{y}) = \frac{p(\mathbf{y} \mid \mathbf{a} \text{ sent})\, \Pr(\mathbf{a} \text{ sent})}{p(\mathbf{y})}. \tag{2.63} $$

Since $p(\mathbf{y})$ is independent of $\mathbf{a}$, it can be taken out of the maximization. Furthermore, if all symbol sequences are equiprobable, the maximization in (2.62) turns into

$$ \hat{\mathbf{a}} = \arg\max_{\mathbf{a}}\, p(\mathbf{y} \mid \mathbf{a} \text{ sent}), \tag{2.64} $$

where $p(\mathbf{y} \mid \mathbf{a})$ is the likelihood of $\mathbf{a}$. A decoder of the form (2.64) is, based on Section 2.2, known as an ML decoder. This decoder is optimal if equiprobable symbol sequences are assumed. In this dissertation, the sequences are assumed to be drawn from a uniform distribution.

Consider now a MAP sequence equalizer for detecting the most probable data sequence according to (2.62). As (2.45) is a sufficient statistic for optimal detection, it is possible to write

$$ \hat{\mathbf{a}} = \arg\max_{\mathbf{a}} \Pr(\mathbf{a} \mid \mathbf{y}), $$

where $\mathbf{y}$ is the received sequence from (2.45). Furthermore, $\Pr(\mathbf{a})$ in (2.63) is known as the a priori sequence probability, and it can be provided by the convolutional decoder in the receiver. If the data symbols are assumed independent, $\Pr(\mathbf{a})$ factorizes into

$$ \Pr(\mathbf{a}) = \prod_{k=0}^{N-1} \Pr(a_k), $$

where $N$ is the sequence length. The above assumption allows the following factorization

| |. (2.65)

As each term in (2.65) is

1 (| exp − , (2.66)

the VA branch metric at the $k$th trellis stage is proportional to $\Pr(a_k)\, p(y_k \mid \mathbf{a})$.

2.4.1 The BCJR algorithm

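A MAP symbol decoder decides in favor of the symbol $\hat{a}_k$ by applying the following decision rule

$$ \hat{a}_k = \arg\max_{a \in \Omega} \Pr(a_k = a \mid \mathbf{y}). \tag{2.67} $$

This decoder also produces soft information corresponding to the data symbols, in the form of logarithmic a posteriori probability (APP) ratios, sometimes referred to as L-values, given by

$$ L(a_k = a \mid \mathbf{y}) \triangleq \log \frac{\Pr(a_k = +1+j \mid \mathbf{y})}{\Pr(a_k = a \mid \mathbf{y})} = \log \frac{\sum_{\mathbf{a}:\, a_k = +1+j} \Pr(\mathbf{a} \mid \mathbf{y})}{\sum_{\mathbf{a}:\, a_k = a} \Pr(\mathbf{a} \mid \mathbf{y})}. \tag{2.68} $$

In (2.68) we have a QPSK alphabet, i.e., $\Omega = \{+1+j, -1+j, +1-j, -1-j\}$. The APP ratio can further be expressed as

$$ L(a_k = a \mid \mathbf{y}) = \log \frac{\sum_{(\sigma', \sigma) \in \mathcal{S}^{+1+j}} p(\sigma_k = \sigma', \sigma_{k+1} = \sigma, \mathbf{y})}{\sum_{(\sigma', \sigma) \in \mathcal{S}^{a}} p(\sigma_k = \sigma', \sigma_{k+1} = \sigma, \mathbf{y})}, \tag{2.69} $$

where $\mathcal{S}^{+1+j}$ and $\mathcal{S}^{a}$ are the sets of trellis state pairs $(\sigma', \sigma)$ at depth $k$ that correspond to $a_k = +1+j$ and $a_k = a$, respectively. Note that in (2.67) time-invariant trellises are assumed.

The BCJR algorithm calculates probabilities of states and paths in a trellis, given the channel outputs $\mathbf{y} = \{y_0, y_1, \ldots, y_{N-1}\}$ and the a priori data probabilities. It effectively calculates logarithmic APP ratios by taking advantage of the factorization

$$ p(\sigma_k = \sigma', \sigma_{k+1} = \sigma, \mathbf{y}) = \alpha_k(\sigma')\, \gamma_k(\sigma', \sigma)\, \beta_{k+1}(\sigma). \tag{2.70} $$

In (2.70), the recursively calculated forward and backward metrics that correspond to the state $\sigma$ at trellis depth $k$ are denoted by $\alpha_k(\sigma)$ and $\beta_k(\sigma)$, respectively. The metric that represents the probability of a branch connecting two successive states $(\sigma', \sigma)$ is denoted by $\gamma_k(\sigma', \sigma)$, and it is expressed as

$$ \gamma_k(\sigma', \sigma) = \Pr(\sigma \mid \sigma')\, p(y_k \mid \sigma', \sigma), \tag{2.71} $$

where the likelihoods $p(y_k \mid \sigma', \sigma)$ are given by (2.66).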


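Starting from the initial all-zero state at the origin of the trellis, the forward metric is calculated recursively in a forward trellis pass according to

$$ \alpha_{k+1}(\sigma) = \sum_{\sigma' \in \mathcal{S}_p(\sigma)} \alpha_k(\sigma')\, \gamma_k(\sigma', \sigma), \tag{2.72} $$

where the forward metric is initialized as $\boldsymbol{\alpha}_0 = [1, 0, \ldots, 0]^{\mathrm{T}}$ at the root of the trellis, and $\mathcal{S}_p(\sigma)$ is the set of states reaching state $\sigma$ at depth $k + 1$ (in quaternary transmission there are 4 such states). In the same manner, the backward recursion, initialized with $\boldsymbol{\beta}_N = [1, 0, \ldots, 0]^{\mathrm{T}}$, starts at the end of the trellis and continues towards the root, calculating at each trellis depth

$$ \beta_k(\sigma') = \sum_{\sigma \in \mathcal{S}_n(\sigma')} \beta_{k+1}(\sigma)\, \gamma_k(\sigma', \sigma). \tag{2.73} $$

The superscript symbol “T” means the transpose operation throughout this dissertation. The set $\mathcal{S}_n(\sigma')$ is the set of states reached from the state $\sigma'$ at depth $k$. Both the forward and backward state metrics can be interpreted in the probabilistic domain as

$$ \alpha_k(\sigma) = p(\sigma_k = \sigma, y_0, y_1, \ldots, y_{k-1}), $$

$$ \beta_k(\sigma) = p(y_k, y_{k+1}, \ldots, y_{N-1} \mid \sigma_k = \sigma). $$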
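The forward and backward recursions can be illustrated with a compact Python sketch. To keep it short, several simplifying assumptions are made that differ from the setup of this chapter: the symbols are binary ($\pm 1$) instead of QPSK, the discrete-time model is taken to be $y_k = \sum_{l=0}^{L} v_l a_{k-l} + w_k$ with real taps `v` and white real Gaussian noise of variance $N_0/2$, the trellis starts in the all $+1$ state, and it is not terminated, so the backward metric at the last depth is uniform rather than $[1, 0, \ldots, 0]^{\mathrm{T}}$. All function and variable names are hypothetical.

```python
import itertools
import math


def branch_likelihood(yk, symbols, v, n0):
    """p(y_k | branch) up to a constant; symbols = (a_k, a_{k-1}, ..., a_{k-L})."""
    mean = sum(vl * s for vl, s in zip(v, symbols))
    return math.exp(-(yk - mean) ** 2 / n0)


def bcjr_binary(y, v, n0, prior_plus=0.5):
    """Posterior Pr(a_k = +1 | y) for binary symbols over a real ISI response v."""
    L, N = len(v) - 1, len(y)
    states = list(itertools.product((+1, -1), repeat=L))   # (a_{k-1}, ..., a_{k-L})
    start = tuple([+1] * L)
    prior = {+1: prior_plus, -1: 1.0 - prior_plus}

    def gamma(k, s, a):
        # branch metric: a priori probability times channel likelihood
        return prior[a] * branch_likelihood(y[k], (a,) + s, v, n0)

    def next_state(s, a):
        return ((a,) + s)[:L]

    # forward recursion
    alpha = [{s: 0.0 for s in states} for _ in range(N + 1)]
    alpha[0][start] = 1.0
    for k in range(N):
        for s in states:
            for a in (+1, -1):
                alpha[k + 1][next_state(s, a)] += alpha[k][s] * gamma(k, s, a)

    # backward recursion (unterminated trellis: uniform final metric)
    beta = [{s: 0.0 for s in states} for _ in range(N + 1)]
    for s in states:
        beta[N][s] = 1.0
    for k in range(N - 1, -1, -1):
        for s in states:
            beta[k][s] = sum(beta[k + 1][next_state(s, a)] * gamma(k, s, a)
                             for a in (+1, -1))

    # combine alpha, gamma and beta over the branches of each symbol value
    posteriors = []
    for k in range(N):
        num = {+1: 0.0, -1: 0.0}
        for s in states:
            for a in (+1, -1):
                num[a] += alpha[k][s] * gamma(k, s, a) * beta[k + 1][next_state(s, a)]
        posteriors.append(num[+1] / (num[+1] + num[-1]))
    return posteriors


if __name__ == "__main__":
    print(bcjr_binary([0.9, -0.2, -1.4, 0.3, 1.1], [1.0, 0.5, 0.2], n0=0.8))
```

The sketch works in the probability domain without normalization, which is adequate for short blocks; a practical implementation would normalize the metrics at each depth or work in the logarithmic domain to avoid underflow.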


2.5 Faster-than-Nyquist signaling

This section presents some of the basics of the nonorthogonal faster-than-Nyquist (FTN) signaling. This form of signaling was first introduced in 1975. It is based on the linear transmission of pulse amplitude modulation (PAM) signals according to

$$ s(t) = \sum_k a_k\, h(t - k \tau T), \quad \tau < 1, \tag{2.74} $$

where the pulse $h(t)$ is a $T$-orthogonal pulse being sent at a rate faster than the Nyquist rate, without incurring any loss in the minimum Euclidean distance and, thus, not affecting the asymptotic error probability when applying the optimum detection. FTN works by reducing the time spacing between adjacent transmitted pulses below the spacing $T$ required for orthogonality, while keeping a fixed PSD shape. Noting (2.19), the PSD shape, in the case of independent and identically distributed (IID) symbols, depends only on the modulation pulse $h(t)$. FTN improves the spectral efficiency of the communication channel beyond that achieved by Nyquist signaling.

Faster-than-Nyquist signaling was originally introduced by Mazo in 1975 [10]. Mazo showed that binary $T$-orthogonal sinc pulses can be sent faster than the rate $1/T$ without any loss in the minimum Euclidean distance. In fact, the symbol time can be reduced to $0.802T$ without any loss in the Euclidean distance. Equivalently, $1/0.802 \approx 1.25$ times as many bits, i.e., about 25% more, can be carried in the same bandwidth without affecting the asymptotic error rate. He called this scheme faster-than-Nyquist signaling, and the value 0.802 is called the Mazo limit. Even though the asymptotic error probability above the Mazo limit is not degraded with optimal detection, FTN introduces a controlled amount of ISI in the received sequence, since the symbols are transmitted at a rate that violates the Nyquist criterion. The Nyquist criterion states that in order to transmit the bits at a rate of $1/T$ bits/s, a minimum of $1/2T$ Hz of baseband bandwidth is required for a received sequence with zero ISI.
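The behavior of the minimum distance around the Mazo limit can be explored with a small brute-force search in Python. The sketch assumes binary symbols $\pm 1$, a unit-energy sinc pulse with $T = 1$ (so that the sampled autocorrelation at packing factor `tau` is $g_k = \mathrm{sinc}(k\tau)$ and $E_b = 1$), and error events limited to `max_len` symbols; the function names and the length limit are illustrative assumptions, and a finite search can only give an upper estimate of the true minimum distance.

```python
import itertools
import math


def sinc(x):
    """Normalized sinc function sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def min_distance_binary_sinc(tau, max_len=8):
    """Smallest normalized squared distance over short binary error events."""
    g = [sinc(k * tau) for k in range(max_len)]
    best = float("inf")
    for n in range(1, max_len + 1):
        for tail in itertools.product((-2, 0, 2), repeat=n - 1):
            e = (2,) + tail            # sign symmetry: first error symbol is +2
            if e[-1] == 0:
                continue               # an event of length n ends with a nonzero error
            # d^2(e) = D^2(e) / (2 E_b) with E_b = 1
            d2 = 0.5 * sum(e[k] * e[l] * g[abs(k - l)]
                           for k in range(n) for l in range(n))
            best = min(best, d2)
    return best


if __name__ == "__main__":
    for tau in (1.0, 0.9, 0.802, 0.7, 0.5):
        print(tau, round(min_distance_binary_sinc(tau), 4))
```

For packing factors above the Mazo limit the search returns the orthogonal-signaling value of 2, which is attained by a single symbol error, whereas for strong packing such as `tau = 0.5` a longer error event yields a smaller distance.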

Because of the significant spectral sidelobes, Foschini in [57] suggested that FTN cannot be competitive. In fact, in his work, the ISI support was truncated at a certain point, making the ISI response of a small duration. Since then, the concept of FTN has been extended in many directions. The scheme can be coded, the modulation can be nonbinary or nonlinear [58], and the modulation pulse does not need to be sinc or even $T$-orthogonal. The concept can also be applied in the frequency domain, by implementing a system that is OFDM-like, and placing the subcarriers closer than orthogonality requires [59]. Furthermore, FTN can be extended into the multicarrier transmission schemes [60, 61, 62]. Results show that for the same BER performance, multicarrier signaling is superior to single carrier signaling in the sense of consuming less bandwidth. In [14, 63], FTN receivers and other related issues are explored.

Additionally, in [64] a complete chapter is dedicated to FTN signaling.

Since FTN introduces intentional ISI in the received sequence, because transmission is carried out at a rate that is faster than the Nyquist rate $1/T$, a more complex maximum likelihood sequence estimation receiver is required to eliminate the effects of the ISI. If the receiver is able to handle the existing ISI, then a considerable improvement in the spectral efficiency is attained. This improvement also occurs for any $T$-orthogonal pulse. In each case, there will be a smallest packing (or equivalently a smallest or closest frequency spacing) at which the Euclidean distance first goes below its independent-pulse value. This value is called the Mazo limit for the pulse and the modulation alphabet. Mazo limits for the root RC pulses with various values of the excess bandwidth factor are derived in [14]. An early study on deriving receivers for FTN signaling is carried out in [14] as well. Mazo limits for other general pulse shapes, including those which are not orthogonal for any shift $T$, are worked out in [65]. A motivating problem is to explore the minimum Euclidean distance for FTN signals. Some

work on this topic has already been done in [66, 67]. In [13, 68, 24], analytical results

show that the capacity of FTN signaling is always higher than in memoryless systems.

Further, in [19], it is shown that the binary FTN signaling reaches, asymptotically with the signaling rate, the PSD capacity of

$$ \int_0^{\infty} \log_2\!\Big( 1 + \frac{2P}{N_0}\, |H(f)|^2 \Big)\, df \ \ \text{bits/s}, \tag{2.75} $$

where $P$ is the average signal power and $|H(f)|^2$ is the signal PSD. In (2.75), $|H(f)|^2$ is normalized to unit integral. In [69], some results on precoding for FTN can be found.

Concatenated systems for coded FTN that run on limits close to the theoretical capacity bounds were introduced in [25]. An illustration of a serially concatenated communication system with encoding and an inner ISI mechanism is shown in Figure 2.7.


Figure 2.7: A block diagram of a serial concatenation communication system with encoding and ISI. $\Pi$ denotes an interleaver.

In [70], FTN is explored for the first time in MIMO setups. In [71], it is proven that the best pulse shape $h(t)$, in the sense that it provides the smallest Mazo limit for FTN, is nearly Gaussian. The Gaussian pulse is not orthogonal for any shift of integer multiples of $T$. In [72, 15], an approach for improving the spectral efficiency of a linear modulation system by reducing the spacing between symbols both in time and frequency was proposed, along with some reduced-complexity detectors. The extension of time and frequency packing to optical links has recently been studied in [73]. A modification of [15] to a more complex receiver structure is considered in [74]. Some spectrally-efficient FTN-like systems along with some hardware implementation issues are explored in [21, 75, 76, 77, 78].

2.5.1 The system model

The baseband form of ordinary linearly modulated signals is modeled by

$$s(t) = \sum_n a_n h(t - n\tau T), \quad \tau \le 1 \qquad (2.76)$$

where $a_n$ are complex equiprobable independent and identically distributed data symbols drawn from an alphabet $\Omega$, and $h(t)$ is a real unit-energy $T$-orthogonal baseband pulse. This signaling form with $\tau = 1$ is found in many practical systems, such as trellis coded modulation (TCM) and the subcarriers in OFDM. The signaling rate in (2.76) is $1/\tau T$. Note that $\tau = 1$ yields an orthogonal system. This ISI-free signaling is called Nyquist signaling, while signaling with $\tau < 1$ is referred to as FTN signaling. Note that in FTN signaling, the transmission time is $Ƭ = \tau T < T$, which means that there is an integer $n$ where

$$\int_{-\infty}^{\infty} h(t)\,h(t - nƬ)\,\mathrm{d}t \neq 0. \qquad (2.77)$$

In this dissertation the bandwidth of the modulation pulse $h(t)$ in (2.76) is much narrower than $1/2\tau T$ Hz, which introduces severe ISI in the system. Decoding of signals at a signaling rate close to the Mazo limit is reasonably simple. However, for smaller $\tau$, resulting in attractive combinations of bandwidth-energy efficiency and, particularly, higher bit densities, more complex structures in the receiver are required for detection. Moreover, if the transmitted symbols are encoded, one needs to rely on iterative receiver structures realized by the turbo decoding, which is presented in Section 2.7.
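The nonzero overlap in (2.77) can be illustrated numerically. The following is a minimal sketch, assuming a unit-energy rRC pulse with $T = 1$ and using the standard fact that the autocorrelation of an rRC pulse is the raised-cosine pulse. The function names `rc_autocorr` and `isi_taps` and the default roll-off of 0.3 are illustrative choices, not part of the dissertation.

```python
import math

def rc_autocorr(t, alpha=0.3, T=1.0):
    """Raised-cosine pulse, i.e. the autocorrelation of a unit-energy
    T-orthogonal rRC pulse (assumed closed form, roll-off alpha)."""
    x = t / T
    if abs(x) < 1e-12:
        return 1.0
    sinc = math.sin(math.pi * x) / (math.pi * x)
    denom = 1.0 - (2.0 * alpha * x) ** 2
    if abs(denom) < 1e-9:
        # Limit of cos(pi*alpha*x) / (1 - (2*alpha*x)^2) at |x| = 1/(2*alpha)
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * alpha * x) / denom

def isi_taps(tau, n_max=5, alpha=0.3):
    """Overlap integrals of h(t) and h(t - n*tau*T) for n = 0..n_max."""
    return [rc_autocorr(n * tau, alpha) for n in range(n_max + 1)]

if __name__ == "__main__":
    print("tau = 1.0:", [round(g, 4) for g in isi_taps(1.0)])
    print("tau = 0.5:", [round(g, 4) for g in isi_taps(0.5)])
```

With $\tau = 1$ every tap except the first vanishes (orthogonal signaling), whereas with $\tau = 0.5$ the first neighbouring tap is clearly nonzero, which is the ISI that the receiver has to handle.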

As an alternative to transmitting faster, consider now transmitting wider pulses in time, i.e., pulses given by

$$h_\tau(t) = \sqrt{\tau}\,h(\tau t), \qquad (2.78)$$

where the transmission rate is kept at $1/T$. As the widening factor is equal to $1/\tau$, the new pulse is $(T/\tau)$-orthogonal, where $T/\tau \ge T$. So, basically, the same discrete-time transmission model is obtained as before, and this makes both systems equivalent in terms of needed SNR versus bandwidth efficiency, measured by the normalized bandwidth.

Assuming IID symbols, the PSD in (2.19) for signals of the form (2.76) equals

() = || , (2.79)

where

= || . (2.80)

By substituting in (2.79), the average power of an FTN signaling becomes = 0

= . (2.81)

We conclude this section by mentioning the Mazo limit:

Definition 1. The Mazo limit is the smallest value that fulfills when ℳ = . = ℳ


2.5.2 The capacity estimation of FTN signaling

Shannon in [8] proved that any signal of bandwidth $W$ Hz spans $\approx 2WT$ dimensions if transmitted in $T$ seconds duration. Thus, if a signal is bandlimited to $W$ Hz and has a duration of $T$ seconds, then it can be completely specified by $2WT$ numbers. These numbers can be imagined as coordinates in a $2WT$-dimensional space. Moreover, Shannon showed that these numbers can be transmitted using shifted sinc pulses. This means that $2WT$ data symbols $\mathbf{a} = [a_1, \ldots, a_{2WT}]$ can be transmitted in a duration of $T$ seconds.

Consider now the linearly modulated signal in (2.1) assuming equiprobable and IID data symbols $\{a_n\}$. All possible sequences are valid unless the sequence is encoded, in which case the design of the underlying code controls which subset of the data sequences is allowed. Assume the signals have an average power $P$ and a rectangular PSD shape that is bounded by the interval $[-W, W]$, where $W$ is the width of the single-sided PSD of the signal. The maximum signaling rate over the AWGN channel in (2.38), where the noise power spectral density equals $N_0/2$, is given by

$$C = W \log_2\!\left(1 + \frac{P}{W N_0}\right) \ \text{bits/s}. \qquad (2.82)$$

This capacity formula is Shannon's result from [8]. If the PSD shape $|H(f)|^2$ is smooth, it can be divided into infinitesimal rectangular pieces, each treated as a small channel, which, by applying integral calculus, extends (2.82) to (2.75). The term capacity in this dissertation is reserved for signals with PSD $|H(f)|^2$ over the AWGN channel. The term constrained capacity expresses the maximum information rate achieved when signaling with a certain modulation alphabet or with FTN transmission. If we use $I(\mathbf{y}; \mathbf{a}) = h(\mathbf{y}) - h(\mathbf{y}|\mathbf{a})$ to indicate the mutual information between the sequences $\mathbf{y}$ and $\mathbf{a}$, where $h(\cdot)$ denotes the differential entropy operator, the information rate is given by

$$I \triangleq \lim_{N \to \infty} I(\mathbf{y}; \mathbf{a})/N \ \text{bits/ch. use} \qquad (2.83)$$

where $N$ is the sequence length. The capacity in (2.82) can be attained by transmitting signals of the form in (2.1) with $T = 1/2W$ [79]. That is,

$$s(t) = \sum_n a_n\,\mathrm{sinc}(t - n/2W). \qquad (2.84)$$

Here $\{a_n\}$ is a sequence of Gaussian data symbols and $\mathrm{sinc}(\cdot)$ is the sinc pulse in (2.34). Since the sinc pulse is impractical, other smoother pulses such as the rRC are used, and even with the extra bandwidth, the optimum detection characteristics are still valid along with the capacity in (2.82). Now, let us make a comparison between the capacities in (2.82) and (2.75).
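As a numerical illustration of this comparison, the sketch below evaluates both capacities for an rRC pulse. It assumes that (2.75) has the form $\int_0^\infty \log_2[1 + (2P/N_0)|H(f)|^2]\,\mathrm{d}f$ with $|H(f)|^2$ normalized to unit integral, that $|H(f)|^2$ is the raised-cosine spectrum, and that $T = 1$. The function names and the example value $P/N_0 = 10$ are illustrative only.

```python
import math

def rc_psd(f, alpha, T=1.0):
    """Raised-cosine spectrum |H(f)|^2 of a unit-energy rRC pulse."""
    f = abs(f)
    f1 = (1.0 - alpha) / (2.0 * T)
    f2 = (1.0 + alpha) / (2.0 * T)
    if f <= f1:
        return T
    if f <= f2:
        return 0.5 * T * (1.0 + math.cos(math.pi * T / alpha * (f - f1)))
    return 0.0

def psd_capacity(p_over_n0, alpha, T=1.0, steps=20000):
    """Midpoint-rule evaluation of the assumed form of (2.75), in bits/s."""
    f_max = (1.0 + alpha) / (2.0 * T)
    df = f_max / steps
    total = 0.0
    for k in range(steps):
        f = (k + 0.5) * df
        total += math.log2(1.0 + 2.0 * p_over_n0 * rc_psd(f, alpha, T))
    return total * df

def sinc_capacity(p_over_n0, W):
    """Rectangular-PSD capacity (2.82): W log2(1 + P / (W N0)), in bits/s."""
    return W * math.log2(1.0 + p_over_n0 / W)

if __name__ == "__main__":
    print("sinc pulse, W = 1/2T :", sinc_capacity(10.0, 0.5))
    print("rRC pulse, alpha = .3:", psd_capacity(10.0, 0.3))
```

For zero excess bandwidth the two values coincide, and for a 30% rRC pulse the PSD capacity comes out larger, in line with the comparison discussed in the text.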

In [13], it is found that signaling over $T$-orthogonal non-sinc pulses $h(t)$, whose PSD is antisymmetric around the point $(1/2T, |H(0)|^2/2)$, can only increase the capacity in (2.75) as compared with signaling with a sinc pulse. So it is clear that the capacity in (2.75) is always higher than that in (2.82). However, this capacity increase cannot be reached when signaling at the rate $1/T$ with $T$-orthogonal, non-sinc pulses $h(t)$. Signaling with FTN, on the other hand, makes use of the whole potential of the pulse's PSD shape [13]. The antisymmetric feature of the $T$-orthogonal pulses was elaborated on by Gibby and Smith [80] as

, ∀. (2.85)

Regarding the derivation of the capacity of the FTN signaling; the assumptions

are that the data symbols are IID. Assume the discrete-time model in (2.45) where {} is the probability mass function of the signal sequence . Moreover, Pr() = ∏ Pr () assume that data symbols, , are to be signaled. Then, the constrained = , . . . , capacity of a general ISI channel is given by

1 ≜ sup lim ( ; ) (2.86) () →

1 = sup lim ( ) − ( | bits⁄ ch.use. () →

The subscript “DT” denotes discrete time. Assuming Gaussian inputs, [79, 81],

the capacity in (2.86) turns into

1 = log 1 + () d, (2.87) 2

where


|| (2.88)

is the power spectral density of the ISI sequence , here given in angular frequency. For the FTN constrained capacity computations, the quantity of the corresponding ISI () needs be evaluated. It can be shown that [13]

1 1 () = + = . (2.89) 2 2

Therefore, it is clear that is proportional to the folded spectrum of about () || the frequency . Based on (2.89) the folded spectrum satisfies = /2

| | 2 ). (2.90)

When normalizing (2.87) by the signaling rate , the constrained capacity becomes 1/

1 ≜ bits⁄ s. (2.91)

Finally, by substituting (2.89) in (2.87), and then performing a variable change, we get

1 = log 1 + () d (2.92) 2


1 = log 1 + d 2 2

⁄ 2 = log 1 + | | d bits⁄ s,

where and . Assuming in (2.92), the resulting constrained = = ⁄2 = 1 capacity of orthogonal or Nyquist signaling is obtained. This capacity is denoted and it is given by

1 2 = log 1 + , bits⁄ s (2.93) 2

where in this case , since in (2.90) is equal to 1 (ISI-free). | | = (2) Employing the same assumptions, the following theorem was verified in [13, 24].

Theorem 1. Unless is a sinc pulse, there exists such that ℎ()

> .

When it follows that . Hence, by increasing the ℎ() = ℎ () = signaling rate beyond for the -orthogonal nonsinc pulses, an increased capacity is 1/ achieved as compared to the orthogonal transmission of the same pulses.

Note that having in (2.92), the capacity is the maximum = 1/2 possible, in other words, it equals the capacity in (2.75). This means that when signaling at the rate , no folding of the spectrum takes place in (2.92). For non-sinc 1/2 ℎ() and a signaling rate of , the capacity in (2.92) is always less than that in (2.75). 1/


Moreover, using a value of that is smaller than has no advantage with 1/2 Gaussian inputs. However, this does not apply for discrete symbol alphabets. More on

FTN capacity results can be found in [13, 82].

2.6 Principles of turbo equalization

Figure 2.8 shows a serially concatenated system along with the iterative structure of the turbo equalization receiver. Turbo equalization was first introduced in [35] for serially concatenated schemes where the ISI channel, together with the mapper, acts as the inner encoder. This approach, which was originally developed for turbo codes (concatenated convolutional codes), has been widely applied to various communication setups. Only the serial concatenation is investigated in this dissertation; however, the turbo principle [34] can be extended to parallel concatenations as well.

The iterative structure consists of two constituent soft-input soft-output component decoders. The inner ISI decoder is usually referred to as the equalizer. An interleaver and a deinterleaver are added to the iterative scheme in order to decorrelate the errors between adjacent symbols in a data block. Since the two turbo decoders share the data sequence (input to the inner decoder and a shuffled version at the output of the outer decoder), the idea behind the iterative scheme is to get the two decoders to agree on

a final decision on and not . By exchanging soft information between the constituent decoders instead of the hard-decision symbols, the BER performance of the receiver gets

significantly better. This improvement in the performance comes at the expense of increased receiver complexity. The situation is made worse by the need to perform the iterative detection many times for each block of data. In this dissertation, convolutional codes are used for the outer code, while the intentional ISI introduced by the FTN signaling acts as the inner encoder.

Figure 2.8: A serial concatenation communications system employing iterative turbo equalization at the receiver.

There are other detection techniques for serially concatenated systems. One possible way is the optimal joint equalization-decoding scheme based on the MAP/MLSE, which directly outputs the data estimates from the received sequence. This approach suffers from a high computational load since the state space of the underlying trellis grows exponentially with the block size. This complexity limits the optimal MAP/MLSE detection to rather small block sizes.

Another possibility is running a non-iterative detection where the inner decoder first equalizes the ISI channel and outputs hard symbol estimates. Following the equalizer is the outer decoder, which takes the deinterleaved sequence of these estimates at its input and produces the data estimates at its output. The main downside of this method is that the inner decoder produces hard outputs, which incurs losses. A better option is to replace the inner decoder with a soft-output version. There are many candidates in the literature, and the one used in this dissertation is the MAP-based BCJR algorithm. Note that with this setup, the outer decoder decodes a probabilistic channel in order to produce the final output.

In general, optimal and suboptimal MAP-based algorithms are used for the inner equalization of the ISI channel. All approaches keep the same iterative loop structure and differ only in the nature of the equalizer. In Figure 2.8, the equalizer calculates, at each depth $n$, the APPs $\Pr(a_n = a \,|\, \mathbf{y})$, where $a \in \Omega$ and $\mathbf{y}$ is the received sequence. In this dissertation quaternary PSK (QPSK) is used, that is, the signal constellation alphabet is $\Omega = \{+1+j, -1+j, +1-j, -1-j\}$. The extrinsic log likelihood ratios (LLRs), $L_e(a_n)$, which are sent to the outer decoder as a priori information, are found by subtracting the a priori LLRs, $L_a(a_n)$, from the a posteriori LLRs generated by the equalizer, that is,

$$L_e(a_n) \triangleq \log\frac{\Pr(a_n = +1+j \,|\, \mathbf{y})}{\Pr(a_n = a \,|\, \mathbf{y})} - \log\frac{\Pr(a_n = +1+j)}{\Pr(a_n = a)}, \quad a \in \Omega. \qquad (2.94)$$
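The subtraction in (2.94) can be sketched for a single trellis depth as follows. This is a minimal illustration, assuming that the APPs and the a priori probabilities of the four QPSK symbols are available as dictionaries and that $+1+j$ is the reference symbol in every ratio. The names `llrs` and `extrinsic_llrs` are hypothetical.

```python
import math

QPSK = (1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j)

def llrs(probs, ref=1 + 1j):
    """log Pr(ref) / Pr(a) for every symbol a in the QPSK alphabet."""
    return {a: math.log(probs[ref] / probs[a]) for a in QPSK}

def extrinsic_llrs(app, prior):
    """A posteriori LLRs minus a priori LLRs, one value per symbol."""
    l_post = llrs(app)
    l_prior = llrs(prior)
    return {a: l_post[a] - l_prior[a] for a in QPSK}

if __name__ == "__main__":
    # First iteration: no a priori information, so all priors are equal.
    uniform = {a: 0.25 for a in QPSK}
    app = {1 + 1j: 0.7, -1 + 1j: 0.1, 1 - 1j: 0.1, -1 - 1j: 0.1}
    print(extrinsic_llrs(app, uniform))
```

With a uniform prior the a priori LLRs are all zero, so the extrinsic LLRs equal the a posteriori LLRs, which matches the first-iteration behaviour of the turbo loop.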

Note that the a priori information is provided by the outer decoder. Since no a priori information is available at the first iteration, the a priori LLRs are then set to zero for all $n$. Independence among data symbols is assumed (supported by the use of interleavers and large block sizes). This assumption, along with the treatment of the extrinsic information as a priori information, constitutes the two main characteristics of any turbo receiver. Extrinsic information is, in the probabilistic domain, information generated


about a certain symbol when only considering information from other symbols . , ≠ ℓ ℓ Now considering the outer decoder: at each depth , the decoder calculates the APPs given only the a priori LLRs . The a priori Pr( = ()) () = ∏( ( | )) LLRs are subtracted resulting in the extrinsic LLRs

Pr = 0 () Pr ( = 0) f ( ) ≜ log − log , or = 0,1,2,3 (2.95) Pr = () Pr ( = )

which are then fed back to the inner decoder in order to be used as a priori information on the data symbols. After a first detection of a received block, the iterative process is repeated for a predefined number of iterations (or until a suitably chosen termination criterion is met). In the last iteration, the outer decoder only calculates the data symbol estimates

f ≜ arg max Pr = () , or = 0,1,2,3. (2.96)

In this dissertation, only the optimal (in the sense of minimizing the BER) MAP symbol-by-symbol decoder, implemented using the BCJR algorithm, is considered for decoding. Since BCJR equalization for large constellations Ω and/or long ISI responses is intractable, reduced-complexity methods based on the same algorithm will be implemented for the equalizer. Finally, it is important to realize that even when the turbo equalizer is implemented with full complexity, it is still less complex than the optimal MAP/MLSE detector. More information on turbo equalization can be found in [83, 84, 85, 86, 87, 88].


CHAPTER 3

Employing higher order modulation in combination with nonbinary code alphabets in the context of faster-than-Nyquist signaling

In this chapter, we study the extension to higher order modulation for transmission over the severe intersymbol interference (ISI) introduced by faster-than-Nyquist (FTN) signaling. This scheme is evaluated over the uncoded ISI channel and in iterative decoding of coded FTN transmissions. Since FTN is an inherently bandwidth efficient scheme, and typical FTN signals can carry 4-8 bits/Hz-s in a fixed spectrum, most research in this field has so far focused on binary modulation schemes. The contribution of this chapter is the use of a nonbinary modulation scheme along with a nonbinary convolutional code in an iterative receiver over ISI models as long as 32 taps. This combination results in an FTN system that carries 8-16 bits/Hz-s in a fixed spectrum.

This attractive narrowband method is achievable at a relatively practical complexity, using a modified version of the reduced-search M-BCJR algorithm. The general conclusion of the chapter is that the combination of coded higher order modulation-based FTN and the reduced-complexity BCJR is an attractive narrowband signaling method.


3.1 Problem statement

This chapter investigates the performance when a quaternary convolutionally coded QPSK-based transmission is strongly band limited, and the receiver is of the soft-input soft-output type. The modulation method is faster-than-Nyquist (FTN) signaling; that is, linearly modulated data symbols with a baseband pulse $h(t)$ according to

$$s(t) = \sqrt{E_s}\sum_n a_n h(t - n\tau T), \quad \tau \le 1 \qquad (3.1)$$

where in (3.1), $\{a_n\}$ are quaternary independent and identically distributed (IID) complex symbols with zero mean and unit variance, $E_s$ is the average modulation symbol energy, and $h(t)$ is an arbitrary unit energy $T$-orthogonal pulse. The signal $s(t)$ is transmitted over an additive white Gaussian noise (AWGN) channel with noise power spectral density $N_0/2$. The model (3.1), with $\tau = 1$, describes many practical modulations. The objective of this chapter is to explore the performance of the iterative equalization of nonbinary-coded nonbinary-modulated narrowband FTN signaling. In an iterative scheme, log likelihood ratios (LLRs) are passed around the turbo loop. The performance is directly affected by the quality of the LLRs.

In fact, the complexity reduction involved in the M-algorithm degrades the quality of the LLRs. However, preprocessing can be added to the system so as to make better use of the retained states in the reduced-search BCJR. The continuous FTN signals arrive at the receiver, which includes a matched filter, a sampler, and possibly a post processor, which together make the system equivalent to the discrete-time convolution of the signal sequence $a_1, a_2, \ldots$ (quaternary in this chapter) with the discrete-time FTN ISI model $v_0, v_1, \ldots, v_m$. This provides a $4^m$-state trellis of the channel. A suitable receiver model is used in Section 3.2, which represents the transmission process in the sense that zero-mean IID Gaussians with variance $N_0/2$ are added to the discrete convolution values. The FTN signaling in this dissertation is applied in two ways: by itself as an uncoded narrowband signaling scheme, and as the inner ISI mechanism in a coded concatenated system. These two approaches are shown schematically in Figure 3.1. The first embraces the parts in the dashed box and is called the “simple detection” of the ISI approach, while the latter is termed the turbo equalization system [35].

Figure 3.1: Turbo equalization structure.


As the intentional ISI introduced by FTN is severe, especially for smaller values of $\tau$, which incur long ISI tap sets, the complexity of the ISI-BCJR block in Figure 3.1 is high. Thus, the BCJR equalizer’s state space has to be reduced. Coded FTN provides the unique advantage of both bandwidth reduction and coding gain, but, again, a receiver of reasonable complexity has to be employed. There are two methods of trellis complexity reduction: either reducing the search size in a full complexity trellis (a reduced-search approach), or running a full search in a reduced size trellis (a reduced-trellis approach). In this dissertation the former technique is used. Early work on reduced-search decoders mainly focused on non-iterative applications where the magnitude of the log likelihood ratios (LLRs) is not needed. In iterative equalization, accurate values of the LLRs need to be passed around the turbo loop as soft information about the data symbols. One algorithm that provides this soft information is the BCJR [36], which is impractical for long ISI responses. More material on the M- and other reduced-search BCJRs can be found in [38, 89, 90, 91, 92, 93].

It is known that the receiver front-end processor should provide the reduced-complexity detector with a minimum phase input (see [89, 94, 93, 95]). This can be implemented in a straightforward way by following the matched filter-sampler stage with an all-pass maximum phase filter, and then applying a frame reversal at its output. The minimum phase concept focuses the energy at the front of the ISI model. A minimum phase sequence directs the search of a reduced-complexity detector more efficiently (a minimum phase does not improve the performance of a full-state Viterbi algorithm (VA) or BCJR). Section 3.2 presents the minimum phase concept in more detail, and adopts the super minimum phase model in [27] that focuses energy in a more promising way than the simple mathematically calculated correct minimum phase model.

The probability of error of the maximum likelihood simple detection of the

symbols is proportional in logarithm to , asymptotically in the {} ( ⁄) term , where is a multiplicity factor that depends on the most probable error ⁄ events. A union bound estimate is made by investigating the error events at distances near

the minimum Euclidean distance of the signal set. More results on error probability calculations for the simple detection of the ISI are found in coding texts, such as [45]. As the state size of a reduced-complexity detection algorithm diminishes, the error probability at some point departs from the ML bound and, therefore, distance-based calculations should be performed in order to determine the amount of complexity reduction to be applied to the full complexity algorithm.

In this dissertation, a convolutional code is used for error correction coding. Above a certain threshold SNR, the performance of the system converges to the BER of the underlying code over an ISI-free channel. Below this threshold SNR, convergence is to a much higher BER.

There is a difference between an algorithm that calculates accurate LLRs and an algorithm that makes decisions about symbols, that is, calculates only the LLR sign. In iterative equalization of FTN signals, accurate LLRs should be produced for the symbols, and the failure to produce them is a main drawback of the M-algorithm. Some work on this issue appears in [93]. We extend the approach introduced in [27] to handle the quaternary modulation of the FTN signals. The approach adds a third low complexity recursion to the M-BCJR in order to produce high quality LLRs.

3.1.1 The selection of the FTN pulse shape

Even though any pulse $h(t)$ could be used as the transmitter pulse for FTN, in this dissertation a unit-energy $T$-orthogonal root raised-cosine (rRC) pulse with a roll off factor $\alpha = .3$ is used. Its frequency response is zero outside $\pm 1.3/2T$ Hz. When $\tau = 1$, this is the widely used $T$-orthogonal rRC pulse. As $\tau$ drops below 1, the pulses are sent faster than the Nyquist rate, but the PSD shape remains unchanged. For uncoded QPSK-based FTN transmission, the bit density is equal to $4/\tau$ data bits/Hz-s (when considering the 3 dB bandwidth as a scale unit). The asymptotic error probability remains $Q(\sqrt{2E_b/N_0})$ as long as $\tau \ge .703$, the Mazo limit. More generally, the error probability is approximately $Q(\sqrt{d_{\min}^2 E_b/N_0})$, where $d_{\min}^2$ decays with a decreasing $\tau$. The Mazo limit changes with different pulse excess bandwidth factors, $\alpha$, for the same pulse, that is, $d_{\min}^2$ will go below 2 at different $\tau$’s.
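The bit density figure can be checked with a short calculation. The sketch below assumes a symbol rate of $1/\tau T$ and takes the 3 dB bandwidth of the rRC pulse as $1/2T$ Hz, so that an alphabet carrying $b$ bits per symbol gives $2b/\tau$ bits/Hz-s. The function name `bit_density` is an illustrative choice.

```python
def bit_density(tau, bits_per_symbol=2, T=1.0):
    """Uncoded bit density in bits/Hz-s, using the 3 dB bandwidth 1/(2T)."""
    symbol_rate = 1.0 / (tau * T)   # symbols per second
    bandwidth = 1.0 / (2.0 * T)     # assumed 3 dB bandwidth in Hz
    return bits_per_symbol * symbol_rate / bandwidth

if __name__ == "__main__":
    for tau in (0.703, 0.5, 0.35, 0.25):
        print(f"tau = {tau}: BPSK {bit_density(tau, 1):.2f}, "
              f"QPSK {bit_density(tau, 2):.2f} bits/Hz-s")
```

For $\tau$ between 0.5 and 0.25 this gives 4 to 8 bits/Hz-s for a binary alphabet and 8 to 16 bits/Hz-s for QPSK, before any code rate is taken into account.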

The remainder of this chapter is organized as follows. Section 3.2 presents a suitable receiver front end and its equivalent discrete-time model that results in white noise and minimum phase. Section 3.3 presents and evaluates the M-BCJR algorithm for simple detection FTN systems. Sections 3.4 and 3.5 present the QPSK-based FTN turbo receiver, along with the attained results, when employing two different outer code alphabets which are quaternary and binary codes, respectively.


3.2 Equivalent discrete-time system models

This section considers the conversion of the continuous FTN signals into an equivalent discrete time model. There are many methods available, and by adopting one track, the discrete-time model seen by the detector is created. In this work, the following model of the conversion to discrete time (hereafter called “conversion/model”) introduced in [96] is adopted. It assumes a linear modulation of the signal sequence by the pulse ( at rate 1/, which is then transmitted over an AWGN channel, as shown ) in Figure 3.2.

In Figure 3.2, the all-pass filter ( is used to produce a maximum phase output, ) which is then reversed frame-wise in order to get a minimum phase output sequence. The matched filter is matched to a pulse , and the output passes through a sampler () running at the faster rate, 1/ . The shifted versions of ( constitute an orthonormal ) basis set for , as ℎ()

, an integer (3.2 ℎ() = ( − ) )

where

. (3.3 = ℎ()( − )d )

Figure 3.2: Model for converting the continuous FTN into discrete time


The basis pulse is chosen such that its coefficients are the energy- () {} normalized samples of the pulse . The pulse has infinite time support ℎ() ℎ() ℎ() and it is time-symmetric. The pulse is truncated such that the set = {}, = − ,..., captures all but of the pulse energy, where . Since the pulse is - > 0 () orthonormal, the matched filter output samples achieve the following two advantages:

• The filtered noise samples are still white Gaussian random variables.

• The minimum Euclidean distance of the signal set of the form (3.1) can be

calculated from the matched filter output samples.

3.2.1 An improved minimum phase model

The allpass filter $B(z)$ in Figure 3.2 is a filter that makes the model maximum phase. An allpass filter changes neither the noise statistics nor the minimum distance characteristics of a signal set ([45], Chapter 6). This is true for all allpass filters. A maximum phase filter reflects the zeros $\{z_i\}$ of $V(z) = \sum_k v_k z^{-k}$ that lie inside the unit circle to outside the unit circle. That means that the poles of $B(z)$ lie at $\{z_i\}$ and the zeros at $\{1/z_i^*\}$. The zeros of $V(z)$ on the unit circle are not affected.

In fact, with a reduced-complexity detector, there are allpass filters $B(z)$ that improve the detector’s performance even more than the mathematically correct minimum phase filters [27]. According to [27], the reduced-complexity detectors favor a steep growth in the energy of the ISI model. In this dissertation, we adopt the super minimum phase model attained using a specific $B(z)$ found through extensive searching in [27]. This super minimum phase model exhibits a steeper energy growth but, at the same time, it produces a precursor to the model. As the precursor’s energy is low, it can be neglected, i.e. the precursor taps of the model can be set to 0, with almost no effect. An M-BCJR decoder working with a super minimum phase model can achieve the same error rate while employing an $M$ that is 2–4 times smaller than when working with the regular mathematically correct minimum phase models [27].
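The energy-focusing idea can be illustrated on a toy example. The sketch below uses a hypothetical three-tap real model, not one of the models from [27]. It shows that a minimum phase tap set and its time-reversed (maximum phase) version share the same autocorrelation, and hence the same distance properties, while the minimum phase version accumulates its energy earlier.

```python
def autocorrelation(taps):
    """Aperiodic autocorrelation r[k] of a real tap sequence, k >= 0."""
    n = len(taps)
    return [sum(taps[i] * taps[i + k] for i in range(n - k)) for k in range(n)]

def partial_energy(taps):
    """Fraction of the total energy contained in the first k+1 taps."""
    total = sum(v * v for v in taps)
    acc, out = 0.0, []
    for v in taps:
        acc += v * v
        out.append(acc / total)
    return out

# Toy model (1 - 0.5 z^-1)(1 + 0.4 z^-1): both zeros lie inside the unit circle.
min_phase = [1.0, -0.1, -0.2]
# Time reversal reflects the zeros to outside the unit circle.
max_phase = min_phase[::-1]

if __name__ == "__main__":
    print("autocorrelation (min):", autocorrelation(min_phase))
    print("autocorrelation (max):", autocorrelation(max_phase))
    print("partial energy  (min):", partial_energy(min_phase))
    print("partial energy  (max):", partial_energy(max_phase))
```

A reduced-search detector that keeps only a few paths benefits from the early energy growth, since the first taps already carry most of the information about each symbol.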

Listed below are the improved FTN models presented to the receiver processor for the main tests in this dissertation. The unit-energy models [27] for

are, respectively, as follows = 0.703, 0.5, 0.35, 0.25

= . ,. , −. , −. ,. , −. ,. ,. , −. ,. , −. ,

. , −. ] (3.4)

= −.005, −.003, .007, −.011, −.001, .034, −.019, .003, . ,. ,

. , −. , −. ,. , . , −. , −. ,. ] (3.5)

= .025, .012, −.024, .008, . ,. ,. ,. ,. , −. ,

−. , −. ,. ,. ,. , −. , −. ] (3.6)

= −.010, −.013, −.007, .005, .011, .004, −.008, .001, . ,. ,

. ,. ,. ,. ,. ,. , −. , −. , −. , −. ,

. ,. ,. ,. , −. , −. , −. ,. ,. ,. ,


. , −. ] (3.7)

The precursor values are written in lightface in (3.4)–(3.7); all detectors set these

to zeros and work at a delay of . The first represents the Mazo limit for the 30% rRC pulse . The models in (3.5)-(3.7) are super minimum phase. ℎ()

3.3 The extension of the M-BCJR algorithm for nonbinary alphabets

This section extends the reduced-search M-BCJR algorithm to accommodate the quaternary modulation constellation, and tests its performance in simple detection of ISI. The basic binary-based M-algorithm for reduced search of trees and trellises is well known.

The general procedure of the M-algorithm is as follows: the algorithm proceeds breadth-first through a trellis of metric values, retaining only the best $M$ paths at each trellis depth $n$. In the M-BCJR, the algorithm works in both the forward and backward directions, keeping the $M$ dominant $\alpha$ and $\beta$ values, which, supposedly, are close to the metric values that a full BCJR would find.
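The breadth-first retention rule can be sketched with a plain hard-decision M-algorithm. This is a simplified illustration and not the M-BCJR itself: it searches a tree with a squared-error path metric, omits merge detection, and produces no $\alpha$ or $\beta$ values. The tap values and function names are hypothetical.

```python
QPSK = (1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j)

def channel_output(symbols, taps):
    """Noise-free discrete-time convolution y[n] = sum_k v[k] a[n-k]."""
    out = []
    for n in range(len(symbols)):
        acc = 0j
        for k, v in enumerate(taps):
            if n - k >= 0:
                acc += v * symbols[n - k]
        out.append(acc)
    return out

def m_algorithm(received, taps, M):
    """Breadth-first search that keeps the M paths with the smallest metric."""
    paths = [((), 0.0)]  # (symbol history, accumulated squared error)
    for n, y in enumerate(received):
        extended = []
        for hist, metric in paths:
            for a in QPSK:  # four extensions per retained path
                new_hist = hist + (a,)
                pred = sum(v * new_hist[n - k]
                           for k, v in enumerate(taps) if n - k >= 0)
                extended.append((new_hist, metric + abs(y - pred) ** 2))
        extended.sort(key=lambda p: p[1])
        paths = extended[:M]  # retain only the best M paths
    return list(paths[0][0])

if __name__ == "__main__":
    taps = [0.8, 0.5, -0.3]
    sent = [1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j, 1 + 1j, 1 + 1j]
    print(m_algorithm(channel_output(sent, taps), taps, M=2) == sent)
```

Each retained path spawns four extensions, so only $4M$ metrics are evaluated per depth regardless of the length of the tap set.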

When and are multiplied by each other, the set is produced ] ] { } through where is a state at trellis depth . Log likelihood () = , ratios are found from these by

∈ Pr ( = +1 + |, R ) ∑ ℒ ] R () ≜ log LL = log . (3.8) LL Pr ( = |, R ) ∑∈ ] LL ℒ


Here are the sets of states reached by , for which nonzero and have both = been found. An issue takes place when some of are empty, which often occurs in a heavily reduced calculation. If only one set is empty, the numerator or denominator of

(3.8) must be replaced by a backup parameter.

The procedure of the M-BCJR applied to the QPSK-modulated symbols now follows. It performs quite well in simple detection of ISI, and it is called the simple detection M-BCJR. Recursions start and end in the all-zero state. The noisy channel outputs and the a priori probabilities of the symbols are the inputs to the algorithm. The signed LLR values in (3.8) constitute the outputs of the algorithm. The set of $M$ dominant paths consists of two subsets, one having the $\alpha$ or $\beta$ values at depth $n$ and one having the corresponding trellis states.

Forward Recursion of $\alpha$. Starting at $n = 0$, perform at stage $n = 1, 2, \ldots, N-1$:

1) The $\alpha$ recursion in (2.70) is computed from the $M$ retained nonzero values of $\alpha$. There are $4M$ outcomes, $M$ corresponding to symbol $(a_n = +1+j)$, $M$ to $(-1+j)$, $M$ to $(+1-j)$, and $M$ to $(-1-j)$; only the corresponding $4M$ elements are computed.

2) In case of merging trellis paths at depth $n+1$, merges are detected and removed, leaving one survivor whose $\alpha$ value is the sum of the four incoming values.

3) The best $M$ of all of the paths are found. These are stored for the next trellis depth and for the $\beta$ recursion.

Backward Recursion of $\beta$. Starting at $n = N$, the end of the channel block, perform at stage $n = N, N-1, \ldots, 2$:

4) The $\beta$ recursion in (2.71) is computed from the $M$ retained nonzero values of $\beta$. There are $4M$ outcomes, $M$ corresponding to symbol $(+1+j)$, $M$ to $(-1+j)$, $M$ to $(+1-j)$, and $M$ to $(-1-j)$; only the corresponding $4M$ elements are computed.

5) In case of merging trellis paths at depth $n$, merges are detected and removed, leaving one survivor whose $\beta$ value is the sum of the four incoming values.

6) The best $M$ of all of the paths are found, subject to the following condition: paths must be retained if their state and stage overlap with that of a stored $\alpha$. The $\beta$-list is then filled with non-overlapping paths.

Completion. Starting at $n = 0$, perform at stage $n = 0, 1, \ldots, N-1$:

7) Find the LLRs from (3.8). If any of the four symbol sets $\mathcal{L}$ is empty, the corresponding λ-sum in (3.8) is set to $\epsilon$, where $\epsilon$ is a backup value set in advance.

The idea of a reserve value $\epsilon$, and of keeping the $\beta$s that overlap with the $\alpha$s in the backward recursion, was proposed in [93]. However, in this dissertation, the $\beta$s that overlap with the $\alpha$s are only given first priority, while the list is completed from the other winning $\beta$s to make a total of $M$. The efficiency of this method can be seen by observing the search dynamics. During most of the detection period, few paths in the search overlap, and one of these is usually the correct path. The extra paths are needed in the event they merge to paths a few trellis depths later.
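The completion step can be sketched as follows. This is a minimal illustration, assuming that the products of the retained $\alpha$ and $\beta$ values have already been grouped by the QPSK symbol that leads to each state, and that an empty group has its λ-sum replaced by the backup value. The function name `llr_with_backup` and the value used for the backup are hypothetical.

```python
import math

QPSK = (1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j)

def llr_with_backup(lambdas_by_symbol, eps=1e-6, ref=1 + 1j):
    """LLRs log(sum over ref set / sum over set of a) at one trellis depth.

    lambdas_by_symbol maps a QPSK symbol to the list of lambda values of the
    retained states reached by that symbol; an empty or missing list means
    that the reduced search kept no state for that symbol.
    """
    sums = {}
    for a in QPSK:
        values = lambdas_by_symbol.get(a, [])
        sums[a] = sum(values) if values else eps  # backup value for empty sets
    return {a: math.log(sums[ref] / sums[a]) for a in QPSK}

if __name__ == "__main__":
    retained = {1 + 1j: [0.2, 0.1], -1 + 1j: [0.3]}  # two symbol sets are empty
    print(llr_with_backup(retained, eps=0.01))
```

The sign of each LLR is still meaningful when a set is empty, but its magnitude is then fixed by the backup value alone, which is acceptable for simple detection where only the sign is used.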


3.3.1 Performance of the M-BCJR in simple detection

Figures 3.3 and 3.4 plot the BER of the detection of the uncoded FTN signals when the M-BCJR algorithm is employed as the reduced-search equalizer at the ISI intensities $\tau = 0.5, 0.35, 0.25$. The algorithm decides symbols from the LLR sign at its output.

The figures show the general behavior of the bit error performance as a function of the number of survivor states of the M-BCJR algorithm for two uncoded FTN-based configurations over the long narrowband ISI models (3.5)-(3.7). The system in Figure 3.3 represents the simple detection of an uncoded BPSK-based FTN, while Figure 3.4 is an uncoded QPSK-based FTN system. The heavy lines in Figure 3.3 are the Q-function estimates based on the full model in [96].

Figure 3.3: BER vs. $E_b/N_0$ for simple ISI detection of BPSK-based FTN. Heavy lines are Q-function estimates.

Figure 3.4: BER vs. $E_b/N_0$ for simple ISI detection of QPSK-based FTN.

From the two figures, we notice that the QPSK-based FTN achieves approximately the same BER as the binary PSK-based FTN system while, at the same symbol rate, it carries twice the bit rate of the latter scheme at the same spectral usage and energy per bit.

It is also observed that the M-BCJR performs well. The complexity of the full trellis search has been considerably reduced, with only a slight degradation of performance as compared to the Q-function estimates, which are based on the distance study of the full-state BCJR. Note that as the state size of a reduced-complexity BCJR algorithm is reduced, its error rate will at some point depart from the ML estimate. Distance based estimates are thus essential for deciding the minimum required size of an algorithm. Since the whole idea behind FTN is to trade bandwidth for processing complexity, higher order modulation combined with FTN supports this trend and provides a potential solution in our bandwidth-starved world.

3.3.2 Backup M-BCJR for nonbinary alphabets

When accurate magnitudes of the LLRs are needed, as is the case with iterative decoding in the next section, the simple detection M-BCJR introduced in the last section is not adequate. A serious issue occurs when the set $\mathcal{L}$ that the algorithm computes for one of the candidate symbols is empty at practical values of $E_b/N_0$. In that case, the M-algorithm is quite sure of the correct symbol, and it ends up having no estimate of the LLR at all at that trellis stage except for the value $\epsilon$ set a priori. There are several suggestions in the literature to resolve this problem. In this dissertation, the backup M-BCJR, introduced in [27], is extended to work with quaternary-based FTN transmission. The backup M-BCJR adds a third, low-complexity recursion, which aims at providing a backup LLR magnitude when the two original M-BCJR recursions do not. In what follows, the quaternary backup M-BCJR replaces step 7 of the M-BCJR procedure by a new step:

New step 7. Decide the QPSK symbols from the sign of (3.8), noting when $\mathcal{L}$ is empty. In a third recursion, compute a symbol probability using the $\alpha$s only, as follows: From each decided node of the path, trace forward in the ISI trellis for a certain length of stages; $\alpha$s that correspond to the decided path form the probability of one symbol outcome, while $\alpha$s that correspond to the other undecided path form the probability of another symbol outcome. The traces are performed with a small M-search value equal to $M_B$ (typically $M_B = 2$ works well). The search gives a backup estimate of $\Pr[x_k = +1 + j]/\Pr[x_k = x]$, to be used when some of the $\mathcal{L}$ are empty; otherwise (3.8) is used.

A sketch of the entire quaternary backup M-BCJR procedure with $M = 3$ is illustrated in Figure 3.5. First, the $\alpha$ recursion is performed, shown in (a), then the $\beta$ recursion. As shown in (b), the $\beta$ paths (dotted) must follow the $\alpha$ paths as a first priority, and in this case they overlap. Shown in (c) is the decided path that results from the first two recursions. Finally, in (d), a backup search with $M_B = 2$ is shown for a branch that was decided to be $+1 - j$.

Figure 3.5: Backup M-BCJR procedure for $M = 3$ and $M_B = 2$, illustrating the $\alpha$ and $\beta$ recursions, the hard decision path, and the backup recursion.

3.4 Turbo equalization

This section investigates the BER performance of the quaternary code-aided QPSK-based FTN turbo equalization, when the backup M-BCJR is employed for performing the ISI detection. While only the LLR signs were needed for the simple detection of the FTN signals in the previous section, turbo decoding requires reasonably accurate absolute values of the LLRs, especially in the initial iterations. These quantities are provided by the backup M-BCJR.

In this turbo system, a quaternary code is applied at the transmitter side along with a QPSK modulation. At the receiver side the channel effects are equalized using the quaternary backup M-BCJR, discussed in the previous section, and the equalized sequence passes through a full BCJR quaternary decoder.

The transceiver of this nonbinary turbo configuration is shown in Figure 3.6 along with the iterative loop.


Figure 3.6: Nonbinary turbo equalization receiver

3.4.1 System model

As shown in the figure above, a data source produces independent and uniformly distributed (IUD) data symbols $a_k$. The sequence $\mathbf{a} = (a_1 \ldots a_K)$ of $K$ data symbols is drawn from the integer ring $\mathbb{Z}_4$ and protected by a memory-2 (5,7) quaternary convolutional forward error correcting (FEC) code, defined by the generator polynomials $g_1(D) = 1 + D^2$ and $g_2(D) = 1 + D + D^2$. The encoding operation yields the code symbols $\boldsymbol{\nu} = (\nu_1 \ldots \nu_N)$. Correspondingly, the FEC code rate is equal to

2 2 1 2⁄ (3.9 2( = 2 ) )


The code symbols $\nu_k$ are permuted by a random interleaver into the symbols $c_k$, which are modulated to the symbols $x_k$ using QPSK modulation. With Gray mapping, the transmitted symbols $x_k$ are

$$x_k = \begin{cases} +1 + j, & c_k = 0 \\ -1 + j, & c_k = 1 \\ +1 - j, & c_k = 2 \\ -1 - j, & c_k = 3 \end{cases} \qquad (3.10)$$
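The mapping in (3.10) can be illustrated with a minimal Python sketch. The names `QPSK_MAP`, `modulate`, and `hard_demodulate` are hypothetical helpers introduced here for illustration; only the four index-to-point assignments come from (3.10).

```python
# Gray mapping of (3.10): quaternary symbol index -> QPSK point.
QPSK_MAP = {
    0: complex(+1, +1),
    1: complex(-1, +1),
    2: complex(+1, -1),
    3: complex(-1, -1),
}
# Inverse lookup, valid only for noiseless constellation points.
QPSK_DEMAP = {point: index for index, point in QPSK_MAP.items()}


def modulate(symbols):
    """Map a sequence of indices from {0, 1, 2, 3} to QPSK points."""
    return [QPSK_MAP[s] for s in symbols]


def hard_demodulate(points):
    """Invert the mapping for exact constellation points (illustration only)."""
    return [QPSK_DEMAP[p] for p in points]
```

In a real receiver the inverse step is replaced by the soft equalizer described next; the hard inverse is shown only to make the mapping concrete.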

The system code rate is given by

log |Ω| 1 − 2⁄ . (3.11)

The QPSK symbols are transmitted over the AWGN ISI channel model of (3.1).

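The equalizer in Figure 3.6, at each depth $k$, computes the APPs $\Pr(x_k = x \mid \mathbf{y})$, where $x \in \Omega$ and $\mathbf{y}$ is the received sequence. The posterior probabilities $\Pr(x_k = x \mid \mathbf{y})$ follow from marginalizing over the sequence-based posterior probability $\Pr(\mathbf{x} \mid \mathbf{y})$,

$$\Pr(x_k = x \mid \mathbf{y}) = \sum_{\forall \mathbf{x}:\, x_k = x} \Pr(\mathbf{x} \mid \mathbf{y}) = \frac{1}{p(\mathbf{y})} \sum_{\forall \mathbf{x}:\, x_k = x} p(\mathbf{y} \mid \mathbf{x}) \Pr(\mathbf{x}). \qquad (3.12)$$

From the IUD assumption on the symbols $x_k$, $\Pr(\mathbf{x})$ can be factored into $\prod_k \Pr(x_k)$. In turbo equalization, it is more convenient to work with log-likelihood ratios (LLRs) rather than probabilities [98]. The conditional LLR $L(x_k \mid \mathbf{y})$ of $x_k$ given $\mathbf{y}$ is

$$L(x_k = x \mid \mathbf{y}) = \log \frac{\Pr(x_k = +1 + j \mid \mathbf{y})}{\Pr(x_k = x \mid \mathbf{y})}, \quad x \in \Omega. \qquad (3.13)$$

Considering (3.12), $L(x_k \mid \mathbf{y})$ can be decomposed into an extrinsic and an a priori LLR:

$$L(x_k = x \mid \mathbf{y}) = \log \frac{\sum_{\forall \mathbf{x}:\, x_k = +1+j} p(\mathbf{y} \mid \mathbf{x}) \prod_{i} \Pr(x_i)}{\sum_{\forall \mathbf{x}:\, x_k = x} p(\mathbf{y} \mid \mathbf{x}) \prod_{i} \Pr(x_i)}$$

$$= \log \frac{\sum_{\forall \mathbf{x}:\, x_k = +1+j} p(\mathbf{y} \mid \mathbf{x}) \prod_{i:\, i \neq k} \Pr(x_i)}{\sum_{\forall \mathbf{x}:\, x_k = x} p(\mathbf{y} \mid \mathbf{x}) \prod_{i:\, i \neq k} \Pr(x_i)} + \log \frac{\Pr(x_k = +1 + j)}{\Pr(x_k = x)}$$

$$= L_e(x_k = x \mid \mathbf{y}) + L(x_k = x). \qquad (3.14)$$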

The extrinsic LLRs, $L_e(x_k)$, which are sent to the outer decoder as a priori information, are found by subtracting the a priori LLRs, $L(x_k)$, from the a posteriori LLRs generated by the equalizer, that is,

$$L_e(x_k = x) \triangleq \log \frac{\Pr(x_k = +1 + j \mid \mathbf{y})}{\Pr(x_k = x \mid \mathbf{y})} - \log \frac{\Pr(x_k = +1 + j)}{\Pr(x_k = x)}, \quad \text{for } x \in \Omega. \qquad (3.15)$$

Note that the a priori information is provided by the outer decoder. Since at the first iteration there is no a priori information available, it is assumed that $L(x_k = x) = 0$, $\forall k$.
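The subtraction in (3.15) can be sketched in a few lines of Python. This is a minimal illustration, assuming symbol probabilities are stored in a dictionary keyed by the symbol index of (3.10), with index 0 standing for the reference point $+1 + j$; the function names are hypothetical.

```python
import math


def symbol_llrs(probs, ref=0):
    """Return LLRs log(P[ref] / P[m]) for every symbol index m."""
    return {m: math.log(probs[ref] / p) for m, p in probs.items()}


def extrinsic_llrs(posterior, prior):
    """Extrinsic LLR = a posteriori LLR minus a priori LLR, per symbol."""
    post = symbol_llrs(posterior)
    pri = symbol_llrs(prior)
    return {m: post[m] - pri[m] for m in post}
```

With a uniform prior, as in the first iteration, every a priori LLR is zero and the extrinsic LLRs equal the a posteriori LLRs.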

Table I describes the algorithm for computing the posterior probabilities $\Pr(x_k \mid \mathbf{y})$, where $\mathbf{1}_{16}$ denotes a length-16 column vector containing all ones and ʘ denotes element-wise multiplication. The entries $\{\Gamma_k\}_{i,j}$ of $\Gamma_k$ follow from (3.1), where $v_{i,j}$ is the label on the branch from state $i$ to state $j$, and $\Pr(x_k)$ is the a priori probability of the symbol that causes the transition. The constant scaling of the channel likelihood has been removed because of the normalization applied in the forward and backward recursions. Note that the algorithm in Table I assumes a memory-2 ISI channel for simplicity and to visualize the process. The ISI models applied in this dissertation are as long as 32 taps, which would produce a matrix far too large to display.

TABLE I
Algorithm for computing the posterior probabilities $\Pr(x_k \mid \mathbf{y})$ for a memory-2 ISI channel

INPUT

Transition matrices $\Gamma_k$ (16 × 16) for $k = 0, 1, \ldots, K-1$, where row $i$ and column $j$ index the states at the two ends of a branch. The nonzero entries are

$$\{\Gamma_k\}_{i,j} = \exp\left(-\frac{|y_k - v_{i,j}|^2}{2\sigma^2}\right),$$

located in columns 1, 5, 9, 13 for rows 1 to 4; columns 2, 6, 10, 14 for rows 5 to 8; columns 3, 7, 11, 15 for rows 9 to 12; and columns 4, 8, 12, 16 for rows 13 to 16. All other entries are 0.

Indicator matrices $A(m)$ (16 × 16) for $m = 0, 1, 2, 3$, where $\{A(m)\}_{i,j} = 1$ if $\{\Gamma_k\}_{i,j}$ is a nonzero entry and $(i - 1) \bmod 4 = m$, and $\{A(m)\}_{i,j} = 0$ otherwise.

INITIALIZATION

$f_0 = b_K = \mathbf{1}_{16}$

BCJR ALGORITHM

FOR $k = 0$ TO $K - 1$ DO
(forward recursion) $f_{k+1} = \Gamma_k^T f_k$
(normalization) $f_{k+1} = f_{k+1} / (\mathbf{1}_{16}^T f_{k+1})$
END

FOR $k = K - 1$ TO $1$ DO
(backward recursion) $b_k = \Gamma_k b_{k+1}$
(normalization) $b_k = b_k / (\mathbf{1}_{16}^T b_k)$
END

OUTPUT

$$\Pr(x_k = x \mid \mathbf{y}) = \frac{f_k^T (A(x) ʘ \Gamma_k)\, b_{k+1}}{f_k^T \Gamma_k\, b_{k+1}}$$
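The matrix form of the forward and backward recursions in Table I can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the names `bcjr_posteriors`, `gammas`, `indicators`, `f0`, and `b_last` are hypothetical, matrices are nested lists with rows as the departing state, and the output is normalized over the symbol hypotheses, which is equivalent to dividing by the total branch weight whenever the indicator matrices partition the valid branches.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def normalize(vector):
    total = sum(vector)
    return [v / total for v in vector]


def bcjr_posteriors(gammas, indicators, f0, b_last):
    """Per-stage symbol posteriors from transition and indicator matrices.

    gammas[k][i][j]: weight of the branch from state i to state j at stage k.
    indicators[s][i][j]: 1 if branch (i, j) corresponds to symbol s, else 0.
    """
    K = len(gammas)
    # Forward recursion with normalization: f_{k+1} = Gamma_k^T f_k.
    f = [normalize(f0)]
    for k in range(K):
        f.append(normalize(mat_vec(transpose(gammas[k]), f[k])))
    # Backward recursion with normalization: b_k = Gamma_k b_{k+1}.
    b = [None] * K + [normalize(b_last)]
    for k in range(K - 1, 0, -1):
        b[k] = normalize(mat_vec(gammas[k], b[k + 1]))
    posteriors = []
    for k in range(K):
        scores = {}
        for symbol, mask in indicators.items():
            # Element-wise product keeps only the branches of this symbol.
            masked = [[a * g for a, g in zip(mask_row, gamma_row)]
                      for mask_row, gamma_row in zip(mask, gammas[k])]
            weighted = mat_vec(masked, b[k + 1])
            scores[symbol] = sum(fi * w for fi, w in zip(f[k], weighted))
        total = sum(scores.values())
        posteriors.append({s: v / total for s, v in scores.items()})
    return posteriors
```

The normalization after every step only rescales the vectors, so it leaves the output ratios unchanged while avoiding numerical underflow on long blocks.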

The equalizer provides soft information to the decoder in the form of extrinsic LLRs $L_e(x_k)$, which express how likely it is that $x_k$ takes on a particular symbol from $\Omega$. Since the decoder in Figure 3.6 operates on the code symbols $\nu_k$, the next step for the receiver algorithm is to compute the soft information $L(c_k)$. The mapping from the $|\Omega| = 2^2$ LLRs $L_e(x_k)$ to $L(c_k)$ for $c_k \in \mathbb{Z}_4$ is referred to as soft demapping, and is performed as

$$\begin{aligned} L(c_k = 0) &= L_e(x_k = 1 + j) \\ L(c_k = 1) &= L_e(x_k = -1 + j) \\ L(c_k = 2) &= L_e(x_k = 1 - j) \\ L(c_k = 3) &= L_e(x_k = -1 - j). \end{aligned} \qquad (3.16)$$

The symbol probabilities $\Pr(c_k = c)$, which are used for the next decoder, are computed from

$$L(c_k = c) = \log \frac{\Pr(c_k = 0)}{\Pr(c_k = c)}, \quad \text{for } c = 0, 1, 2, 3. \qquad (3.17)$$

Since

$$\begin{aligned} L(c_k = 0) &= \log \frac{\Pr(c_k = 0)}{\Pr(c_k = 0)} = \log 1 = 0 \\ L(c_k = 1) &= \log \frac{\Pr(c_k = 0)}{\Pr(c_k = 1)} \\ L(c_k = 2) &= \log \frac{\Pr(c_k = 0)}{\Pr(c_k = 2)} \\ L(c_k = 3) &= \log \frac{\Pr(c_k = 0)}{\Pr(c_k = 3)} \end{aligned} \qquad (3.18)$$

then

$$\begin{aligned} \Pr(c_k = 1) &= e^{-L(c_k = 1)} \Pr(c_k = 0) \\ \Pr(c_k = 2) &= e^{-L(c_k = 2)} \Pr(c_k = 0) \\ \Pr(c_k = 3) &= e^{-L(c_k = 3)} \Pr(c_k = 0), \end{aligned} \qquad (3.19)$$

and

$$\Pr(c_k = 0) + \Pr(c_k = 1) + \Pr(c_k = 2) + \Pr(c_k = 3) = 1. \qquad (3.20)$$

Hence

$$\Pr(c_k = c) = \frac{e^{-L(c_k = c)}}{1 + e^{-L(c_k = 1)} + e^{-L(c_k = 2)} + e^{-L(c_k = 3)}}, \quad c = 0, 1, 2, 3, \qquad (3.21)$$

where $e^{-L(c_k = 0)} = 1$, $L(c_k) = L_e(x_k)$, and $x_k = \mathrm{map}(c_k)$.

After deinterleaving $L(c_k)$ to $L(\nu_k)$, the outer decoder can compute, at each depth $k$, the APPs $\Pr(a_k = a \mid L(\boldsymbol{\nu}))$ given only the a priori LLRs

$$L(\boldsymbol{\nu}) = (L(\nu_1) \ldots L(\nu_N)). \qquad (3.22)$$
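The conversion from LLRs back to symbol probabilities in (3.17) through (3.21) amounts to a normalized exponential. A minimal Python sketch follows; the function name `llrs_to_probs` is hypothetical, and the list index plays the role of the symbol value 0 to 3.

```python
import math


def llrs_to_probs(llrs):
    """Convert LLRs L[m] = log(P[0] / P[m]) into probabilities P[m].

    llrs[0] must be 0 because symbol 0 is the reference.
    """
    weights = [math.exp(-value) for value in llrs]
    total = sum(weights)  # 1 + e^{-L(1)} + e^{-L(2)} + e^{-L(3)}
    return [w / total for w in weights]
```

For large LLR magnitudes a practical implementation would typically subtract the minimum LLR before exponentiating to avoid overflow; that refinement is omitted here for clarity.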

The posterior probabilities $\Pr(a_k = a \mid L(\boldsymbol{\nu}))$ can be computed efficiently using the BCJR algorithm as long as the FEC code has a trellis with a reasonably small number of states. Table II applies the BCJR algorithm to the memory-2 quaternary (5,7) convolutional code used in this dissertation. The entries $\{\Gamma_k\}_{i,j}$ of $\Gamma_k$ and the initial vectors $f_0$ and $b_K$ follow from the code trellis.

TABLE II
Algorithm for computing the posterior probabilities $\Pr(a_k = a \mid L(\boldsymbol{\nu}))$ for the memory-2 quaternary convolutional code defined in this section

INPUT

Transition matrices $\Gamma_k$ (16 × 16) for $k = 0, 1, \ldots, K-1$. Each nonzero entry is a product $\Pr(\nu_{2k} = p)\Pr(\nu_{2k+1} = q)$; the pairs $(p, q)$ are listed below in column order, and all other entries are 0.

Row   Nonzero columns   (p, q)
1     1, 5, 9, 13       (0,0) (1,1) (2,2) (3,3)
2     1, 5, 9, 13       (1,1) (2,2) (3,3) (0,0)
3     1, 5, 9, 13       (2,2) (3,3) (0,0) (1,1)
4     1, 5, 9, 13       (3,3) (0,0) (1,1) (2,2)
5     2, 6, 10, 14      (0,1) (1,2) (2,3) (3,0)
6     2, 6, 10, 14      (1,2) (2,3) (3,0) (0,1)
7     2, 6, 10, 14      (2,3) (3,0) (0,1) (1,2)
8     2, 6, 10, 14      (3,0) (0,1) (1,2) (2,3)
9     3, 7, 11, 15      (0,2) (1,3) (2,0) (3,1)
10    3, 7, 11, 15      (1,3) (2,0) (3,1) (0,2)
11    3, 7, 11, 15      (2,0) (3,1) (0,2) (1,3)
12    3, 7, 11, 15      (3,1) (0,2) (1,3) (2,0)
13    4, 8, 12, 16      (0,3) (1,0) (2,1) (3,2)
14    4, 8, 12, 16      (1,0) (2,1) (3,2) (0,3)
15    4, 8, 12, 16      (2,1) (3,2) (0,3) (1,0)
16    4, 8, 12, 16      (3,2) (0,3) (1,0) (2,1)

Indicator matrices $A(0)$, $A(1)$, $A(2)$, $A(3)$ (16 × 16, binary), each with a single 1 in every row, placed on one of the nonzero positions of $\Gamma_k$.

INITIALIZATION

$f_0 = b_K = (1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0)^T$

BCJR ALGORITHM

FOR $k = 0$ TO $K - 1$ DO
(forward recursion) $f_{k+1} = \Gamma_k^T f_k$
(normalization) $f_{k+1} = f_{k+1} / (\mathbf{1}_{16}^T f_{k+1})$
END

FOR $k = K - 1$ TO $1$ DO
(backward recursion) $b_k = \Gamma_k b_{k+1}$
(normalization) $b_k = b_k / (\mathbf{1}_{16}^T b_k)$
END

OUTPUT

$$\Pr(a_k = a \mid L(\boldsymbol{\nu})) = \frac{f_k^T (A(a) ʘ \Gamma_k)\, b_{k+1}}{f_k^T \Gamma_k\, b_{k+1}}$$

From Table II, the decoder computes the soft information $\Pr(a_k \mid L(\boldsymbol{\nu}))$. The reliability of $\Pr(a_k \mid L(\boldsymbol{\nu}))$ compared to $L(\boldsymbol{\nu})$ generally improves because of the redundancy introduced during the FEC encoding.

The posterior probabilities $\Pr(\nu_k \mid L(\boldsymbol{\nu}))$ can be computed with the symbol-based MAP decoder. In fact, to compute $\Pr(\nu_k \mid L(\boldsymbol{\nu}))$, only the last line of the BCJR algorithm described in Table II needs to be updated to

$$\Pr(\nu_{2k} = \nu \mid L(\boldsymbol{\nu})) = \frac{f_k^T (A_1(\nu) ʘ \Gamma_k)\, b_{k+1}}{f_k^T \Gamma_k\, b_{k+1}}, \qquad (3.23)$$

$$\Pr(\nu_{2k+1} = \nu \mid L(\boldsymbol{\nu})) = \frac{f_k^T (A_2(\nu) ʘ \Gamma_k)\, b_{k+1}}{f_k^T \Gamma_k\, b_{k+1}}, \qquad (3.24)$$

where $A_1(m)$ and $A_2(m)$, $m = 0, 1, 2, 3$, are 16 × 16 binary indicator matrices with a single 1 in every row. In row $i$, the 1 of $A_1(m)$ sits on the branch leaving state $i$ whose first code symbol equals $m$, and the 1 of $A_2(m)$ sits on the branch whose second code symbol equals $m$, with the code symbols of each branch as given by the entries of $\Gamma_k$ in Table II.

The a priori information is subtracted from the a posteriori information in order to obtain the extrinsic LLRs

$$L_e(\nu_k = \nu) \triangleq \log \frac{\Pr(\nu_k = 0 \mid L(\boldsymbol{\nu}))}{\Pr(\nu_k = \nu \mid L(\boldsymbol{\nu}))} - \log \frac{\Pr(\nu_k = 0)}{\Pr(\nu_k = \nu)}, \quad \text{for } \nu = 0, 1, 2, 3, \qquad (3.25)$$

which are then fed back to the inner decoder to be used as updated a priori information.

The mapping from the decoder's extrinsic information $L_e(\nu_k)$ to the a priori information $L(c_k)$ is referred to as the soft mapping. Incorporating the updated a priori information into the trellis-based equalization algorithm requires only that the transition matrices in the initialization step of the BCJR algorithm described in Table I be updated as

$$\{\Gamma_k\}_{i,j} = \begin{cases} \exp\left(-\dfrac{|y_k - v_{i,j}|^2}{2\sigma^2}\right) \Pr(x_k = x_{i,j}), & (i, j) \text{ corresponds to a valid trellis branch} \\ 0, & \text{otherwise} \end{cases} \qquad (3.26)$$

where

$$\Pr(x_k = x_{i,j}) = \mathrm{map}(\Pr(c_k)) \qquad (3.27)$$

and

$$\Pr(c_k = c) = \frac{\exp(-L(c_k = c))}{1 + \exp(-L(c_k = 1)) + \exp(-L(c_k = 2)) + \exp(-L(c_k = 3))}. \qquad (3.28)$$

After an initial detection of a received block, the iterative process is repeated a predefined number of iterations (or until a suitably chosen termination criterion is met). In the last iteration, the outer decoder only computes the data symbol estimates

$$\hat{a}_k \triangleq \arg\max_{a} \Pr(a_k = a \mid L(\boldsymbol{\nu})), \quad \text{for } a = 0, 1, 2, 3, \qquad (3.29)$$

or, equivalently,

$$\hat{a}_k = \begin{cases} 1 & \text{if } L_{\min}(a_k) = L(a_k = 1) \text{ and } L(a_k = 1) < 0 \\ 2 & \text{if } L_{\min}(a_k) = L(a_k = 2) \text{ and } L(a_k = 2) < 0 \\ 3 & \text{if } L_{\min}(a_k) = L(a_k = 3) \text{ and } L(a_k = 3) < 0 \\ 0 & \text{otherwise} \end{cases} \qquad (3.30)$$

where

$$L_{\min}(a_k) = \min(L(a_k = 1), L(a_k = 2), L(a_k = 3)).$$
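The equivalence between the probability-domain decision (3.29) and the LLR-domain decision (3.30) can be checked with a small Python sketch. The function names are hypothetical, and the LLRs are assumed to follow the convention $L(a_k = a) = \log(\Pr(a_k = 0)/\Pr(a_k = a))$ used above, so a negative LLR means the symbol is more likely than symbol 0.

```python
import math


def decide_argmax(probs):
    """Decision (3.29): index of the largest posterior probability."""
    return max(range(len(probs)), key=lambda m: probs[m])


def decide_from_llrs(llrs):
    """Decision (3.30): pick the symbol with the smallest negative LLR.

    llrs[m] = log(P[0] / P[m]) for m = 0..3, so llrs[0] == 0.
    """
    l_min = min(llrs[1:])
    if l_min < 0:
        return 1 + llrs[1:].index(l_min)
    return 0  # no symbol is more likely than the reference symbol 0


def probs_to_llrs(probs):
    return [math.log(probs[0] / p) for p in probs]
```

Both rules agree whenever the maximum is unique; on exact ties each rule resolves to the lowest index among the candidates it considers.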

In this dissertation, only the optimal (in terms of BER) MAP symbol detector, realized using the BCJR algorithm, is considered for decoding. Since BCJR equalization for large constellations and/or long ISI responses is too complex to be carried out, the reduced complexity backup M-BCJR method will be used for the equalizer.

3.4.2 Simulation results

Now we evaluate the BER performance of turbo equalization of the coded FTN signaling when the backup M-BCJR performs the ISI detection. While only the signs of the LLRs were needed for simple ISI detection, turbo decoding requires reasonably accurate absolute values, especially in the initial iterations. These values are provided by the backup M-BCJR.

We have two setups to be compared in the context of turbo equalization of the FTN signals. The first is a binary code-aided BPSK-based FTN and the other is the quaternary code-aided QPSK-based FTN. The two configurations are run at ISI intensities $\tau = 0.5, 0.35, 0.25$, and they are shown in Figures 3.7 to 3.12. The figures show the BER performance when transmitting the data symbols over the different ISI models, which are as long as 32 taps.

Figure 3.7: Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at $\tau = 1/2$.

Figure 3.8: Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at $\tau = 1/2$.

Figures 3.7 and 3.8 compare the two FTN-based schemes for $\tau = 0.5$ and the taps from (3.5). The relatively mild ISI is not difficult to overcome, and even for $M = 2$ the backup M-BCJR achieves nearly the CC-line BER at high SNR, where the CC-line is the BER performance of the rate-1/2 (5,7) convolutional code over an AWGN channel. In both setups, the results were obtained after performing 4 complete loops in the iterative scheme. The values of $M$ were chosen with good performance at reasonable SNRs in mind.

Figures 3.9 and 3.10 show the results when the FTN signaling rate increases. The figures plot the $\tau = 0.35$ case, which suffers from more severe ISI than the previous case. The backup M-BCJR efficiently removes the ISI, even when SNR ≤ 7 dB. The results have been attained by running 4 iterations. Again, the values of $M$ were chosen to achieve reliable performance at reasonable SNRs. The backup $M_B$ was chosen to be equal to 2. Employing the system in Figure 3.10 would achieve a bit density of ≈ 6 bits/Hz-s, while the corresponding BPSK-based system would enable a bit density of ≈ 3 bits/Hz-s.

Turbo equalization for the extreme ISI case with $\tau = 0.25$ and the 32-tap channel model given in (3.7) is shown in Figures 3.11 and 3.12, for the BPSK-based and QPSK-based configurations, respectively. In the simulations it was found that $M$ should be greater than 25 and that a high SNR is required in order to get reliable performance. A sharp convergence threshold lies in the SNR range 11-12 dB. In both setups 7 iterations have been performed at the turbo receiver. Signaling at $\tau = 0.25$ represents a highly bandwidth-efficient scheme.

Figure 3.9: Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at $\tau = 0.35$.

Figure 3.10: Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at $\tau = 0.35$.

Applying the system in Figure 3.11 achieves a bit density of 8 bits/Hz-s, while the system in Figure 3.12 would allow a bit density of 16 bits/Hz-s to be carried at a slight increase in the processing complexity.

Figure 3.11: Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at $\tau = 0.25$.

In Figure 3.13 the BER performance versus the SNR is shown for different numbers of decoder iterations performed at the turbo receiver, for $\tau = 0.5$ and the quaternary code-aided QPSK-based FTN system. The backup M-BCJR with $M = 4$ and $M_B = 2$ is employed.

Figure 3.12: Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at $\tau = 0.25$.

Figure 3.13: Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling for $\tau = 0.5$ and for different numbers of iterations.

From the figure, it is clearly observed how increasing the number of iterations improves the performance significantly. The biggest improvement jump is attained when going from 0 iterations, i.e. separate equalization and decoding, to the first iteration. As the number of iterations goes higher, the improvement decreases and we get to the point that a limit is reached. Around 5 dB of improvement is gained when comparing the 0 iterations performance to the performance when performing 4 iterations over the same setup. This behavior is what makes the turbo decoding very popular among the research communities since its advent in 1995.

From Figures 3.7-3.13, we notice that in all setups, the QPSK-based FTN BER curves have the same trend as the BPSK curves, and the former has proven twice as efficient as the latter. Excellent performance has been achieved at low SNRs that is not very far from the no-ISI performance of the underlying convolutional code; thanks to the nearly-optimal turbo decoding that has been applied at the receiver.

We see that above a certain $E_b/N_0$, called the threshold, the iterations converge to the BER of the convolutional code alone over an ISI-free channel with the same $E_b/N_0$. Below this threshold, convergence is to a much higher BER.

In all of the figures, we see how the redundancy added by coding with turbo decoding has been used to improve system performance. If we compare the results in

Figure 3.4 and Figure 3.13, at a BER of , we see that using the turbo decoding with 10 even no iterations at all, i.e. separate equalization and decoding, achieves an SNR improvement of 1.5 dB over the simple detection receiver. Further, comparing the same two figures, around 6.2 dB is gained when just running 4 iterarions in the turbo receiver.


In Figure 3.13, we see how increasing the number of iterations in a turbo loop considerably improves the performance, because the reliability of the passed likelihoods around the loop generally improves.

In all of the figures, we notice that as $M$ increases, the performance improves at the expense of increased complexity, due to more channel states being considered. This clearly shows the advantage of the suboptimal M-algorithm BCJR in that a system designer can easily specify the parameter $M$ in order to meet a certain performance or complexity level.

It is important to improve the quality of the LLRs passed around the turbo loop, since they affect the stability and convergence of the turbo loop. Low-quality LLRs fail to produce the exact soft information about some of the detected symbols. A useful strategy is to scale the extrinsic LLRs by a scaling gain (gains ≤ 1 were suggested in [93]). The extrinsic LLRs are scaled by this gain before each component BCJR decoder. The tests carried out in this work use loop gains that lie near 0.25 for $\tau = 0.5$ and $\tau = 0.35$ and near 0.15 for $\tau = 0.25$.

We notice that the difference between the BER curves of the turbo QPSK-based FTN and the BPSK-based configuration is due to the scaling factor, which is found by an extensive optimization search and which also differs between setups.

From the above results and observations, it turns out that the combination of coded FTN and higher order modulation is an attractive narrowband communications scheme.


3.5 Binary code-aided QPSK-based FTN

In this section, a binary code is applied at the transmitter side along with QPSK modulation, while at the receiver side the channel effects are equalized using the backup M-BCJR, and the equalized sequence passes through a full BCJR binary decoder.

3.5.1 System model

In this configuration, a data source produces IUD data bits $a_k$. The sequence $\mathbf{a} = (a_1 \ldots a_K)$ of data bits is protected by a memory-2 (5,7) binary convolutional code. The encoding operation yields the code bits $\boldsymbol{\nu} = (\nu_1 \ldots \nu_N)$. The code rate is equal to

2 − 2 1 − 2⁄ = = . (3.31) 2() 2

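The code bits $\nu_k$ are permuted by a random interleaver into the bits $\mathbf{c} = (c_1 \ldots c_N)$, which are modulated to the symbols $x_k$ using QPSK modulation. With Gray mapping, the transmitted symbols $x_k$ are

$$x_k = \begin{cases} +1 + j, & c_{2k} = 0 \text{ and } c_{2k+1} = 0 \\ -1 + j, & c_{2k} = 1 \text{ and } c_{2k+1} = 0 \\ +1 - j, & c_{2k} = 0 \text{ and } c_{2k+1} = 1 \\ -1 - j, & c_{2k} = 1 \text{ and } c_{2k+1} = 1. \end{cases} \qquad (3.32)$$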

The system code rate is


log |Ω| 1 − 2⁄ . (3.33)

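The equalization process in this system is similar to the equalization in Table I. The only difference is in the soft demapping of the symbol probabilities $\Pr(x_k)$ to the bit probabilities $\Pr(c_{k,i})$. Since the outer decoder operates on the code bits $\nu_k$, the next step for the receiver algorithm is demapping from the $|\Omega| = 2^2$ probabilities $\Pr(x_k)$ to 2 probabilities $\Pr(c_{k,i})$. The soft demapping from the posterior probabilities $\Pr(x_k \mid \mathbf{y})$ to the extrinsic LLRs $L_e(c_{k,i})$ is performed by marginalizing in the corresponding joint probability $\Pr(\mathbf{c}_k \mid \mathbf{y})$, as shown in the following equation, with $\mathbf{c}_k = (c_{k,1}, c_{k,2}, \ldots, c_{k,Q})$:

$$L_e(c_{k,i}) = \log \frac{\Pr(c_{k,i} = 0)}{\Pr(c_{k,i} = 1)}$$

$$= \log \frac{\sum_{\forall \mathbf{c}:\, c_i = 0} \Pr(x_k = \mathrm{map}(\mathbf{c}) \mid \mathbf{y})}{\sum_{\forall \mathbf{c}:\, c_i = 1} \Pr(x_k = \mathrm{map}(\mathbf{c}) \mid \mathbf{y})}$$

$$= \log \frac{\sum_{\forall \mathbf{c}:\, c_i = 0} f_k^T (A(\mathrm{map}(\mathbf{c})) ʘ \Gamma_k)\, b_{k+1}}{\sum_{\forall \mathbf{c}:\, c_i = 1} f_k^T (A(\mathrm{map}(\mathbf{c})) ʘ \Gamma_k)\, b_{k+1}}$$

$$= \log \frac{\sum_{\forall \mathbf{c}:\, c_i = 0} f_k^T (A(\mathrm{map}(\mathbf{c})) ʘ \Gamma_{e,k})\, b_{k+1} \prod_{j:\, j \neq i} \exp(-c_j \cdot L(c_{k,j}))}{\sum_{\forall \mathbf{c}:\, c_i = 1} f_k^T (A(\mathrm{map}(\mathbf{c})) ʘ \Gamma_{e,k})\, b_{k+1} \prod_{j:\, j \neq i} \exp(-c_j \cdot L(c_{k,j}))} \qquad (3.34)$$

where

$$\{\Gamma_{e,k}\}_{i,j} = \begin{cases} \exp\left(-\dfrac{|y_k - v_{i,j}|^2}{2\sigma^2}\right), & (i, j) \text{ corresponds to a valid trellis branch} \\ 0, & \text{otherwise,} \end{cases} \qquad (3.35)$$

and $A(x)$ follows from the trellis describing the ISI channel, and $f_k$ and $b_k$ are part of the BCJR algorithm. To make sure that $L_e(c_{k,i})$ does not depend on $L(c_{k,i})$, it is computed using the extrinsic transition matrix $\Gamma_{e,k}$, as shown in (3.35) above.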
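The marginalization step of the soft demapping can be sketched in Python as follows. This is a simplified illustration, not the matrix form of (3.34): it starts from per-symbol posterior probabilities and removes the a priori contribution by subtraction, which gives the same extrinsic value because the prior of bit $i$ factors out of both sums. The names `BIT_LABELS`, `bit_llrs`, and `extrinsic_bit_llrs` are hypothetical; the bit labels follow the Gray mapping of (3.32), with symbol indices ordered as in (3.10).

```python
import math

# (first bit, second bit) carried by each QPSK symbol index, per (3.32):
# 0: +1+j -> (0,0), 1: -1+j -> (1,0), 2: +1-j -> (0,1), 3: -1-j -> (1,1)
BIT_LABELS = {0: (0, 0), 1: (1, 0), 2: (0, 1), 3: (1, 1)}


def bit_llrs(symbol_post):
    """Posterior bit LLRs log(P(bit=0)/P(bit=1)) from 4 symbol posteriors."""
    llrs = []
    for pos in range(2):
        p0 = sum(p for m, p in enumerate(symbol_post) if BIT_LABELS[m][pos] == 0)
        p1 = sum(p for m, p in enumerate(symbol_post) if BIT_LABELS[m][pos] == 1)
        llrs.append(math.log(p0 / p1))
    return llrs


def extrinsic_bit_llrs(symbol_post, prior_llrs):
    """Subtract each bit's a priori LLR from its posterior LLR."""
    return [post - prior for post, prior in zip(bit_llrs(symbol_post), prior_llrs)]
```

The sketch assumes no symbol posterior group sums to zero; a practical receiver would clip probabilities away from zero before taking the logarithm.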

After $L_e(\mathbf{c})$ is deinterleaved to $L(\boldsymbol{\nu})$, at each depth $k$, the outer decoder computes the APPs $\Pr(a_k = a \mid L(\boldsymbol{\nu}))$, $a \in \{0, 1\}$, given only the a priori LLRs

$$L(\boldsymbol{\nu}) = (L(\nu_1) \ldots L(\nu_N)). \qquad (3.36)$$

The posterior probabilities $\Pr(a_k = a \mid L(\boldsymbol{\nu}))$ can be computed efficiently using the BCJR algorithm. Table III applies the BCJR algorithm to the memory-2 (5,7) binary convolutional code. The entries $\{\Gamma_k\}_{i,j}$ of $\Gamma_k$ and the initial vectors $f_0$ and $b_K$ follow from the code trellis.

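TABLE III
Algorithm for computing the posterior probabilities $\Pr(a_k = a \mid L(\boldsymbol{\nu}))$ for the memory-2 binary convolutional code defined in this section

INPUT

For $k = 0, 1, \ldots, K-1$, writing $P_1(p) = \Pr(\nu_{2k} = p)$ and $P_2(q) = \Pr(\nu_{2k+1} = q)$:

$$\Gamma_k = \begin{pmatrix} P_1(0)P_2(0) & 0 & P_1(1)P_2(1) & 0 \\ P_1(1)P_2(1) & 0 & P_1(0)P_2(0) & 0 \\ 0 & P_1(0)P_2(1) & 0 & P_1(1)P_2(0) \\ 0 & P_1(1)P_2(0) & 0 & P_1(0)P_2(1) \end{pmatrix}$$

$$A(0) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad A(1) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$

INITIALIZATION

$f_0 = b_K = (1\ 0\ 0\ 0)^T$

BCJR ALGORITHM

FOR $k = 0$ TO $K - 1$ DO
(forward recursion) $f_{k+1} = \Gamma_k^T f_k$
(normalization) $f_{k+1} = f_{k+1} / (\mathbf{1}_4^T f_{k+1})$
END

FOR $k = K - 1$ TO $1$ DO
(backward recursion) $b_k = \Gamma_k b_{k+1}$
(normalization) $b_k = b_k / (\mathbf{1}_4^T b_k)$
END

OUTPUT

$$\Pr(a_k = a \mid L(\boldsymbol{\nu})) = \frac{f_k^T (A(a) ʘ \Gamma_k)\, b_{k+1}}{f_k^T \Gamma_k\, b_{k+1}}$$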

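From Table III, the decoder computes the soft information $\Pr(a_k \mid L(\boldsymbol{\nu}))$. The posterior probabilities $\Pr(\nu_k \mid L(\boldsymbol{\nu}))$ can be computed with the symbol-based MAP decoder. In fact, to compute $\Pr(\nu_k \mid L(\boldsymbol{\nu}))$, only the last line of the BCJR algorithm described in Table III needs to be updated according to (3.23) and (3.24), where

$$A_1(0) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad A_1(1) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix},$$

$$A_2(0) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \qquad A_2(1) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$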

Each symbol $x_k$ depends on exactly 2 code bits $\mathbf{c}_k = (c_{k,1}, c_{k,2})$, such that a symbol probability $\Pr(x_k)$ is the product of 2 extrinsic code bit probabilities, i.e., $\Pr(x_k = x) = \prod_{i=1}^{2} \Pr(c_{k,i} = c_i)$ with $x = \mathrm{map}(c_1, c_2)$. The soft mapping can be rewritten in terms of extrinsic LLRs $L(c_{k,i})$:

$$\Pr(c_{k,i} = c_i) = \frac{\exp(-c_i \cdot L(c_{k,i}))}{1 + \exp(-L(c_{k,i}))}, \qquad (3.37)$$

where $x = \mathrm{map}(c_1, \ldots, c_Q)$. These probabilities are then fed back to the inner decoder to be used as updated a priori information. The mapping from the decoder's extrinsic information $L_e(\nu_k)$ to the a priori information $L(c_{k,i})$ is referred to as the soft mapping. Incorporating the updated a priori information into the trellis-based equalization algorithm requires only that the transition matrices in the initialization step of the BCJR algorithm described in Table I be updated as in (3.26).
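The soft mapping from two bit LLRs to the four QPSK symbol priors can be sketched in Python. This is a minimal illustration assuming the convention $L = \log(\Pr(c = 0)/\Pr(c = 1))$ and the bit labels of (3.32); the function names `bit_prob` and `symbol_priors` are hypothetical.

```python
import math

# (first bit, second bit) for symbol indices 0..3, following (3.32).
BIT_PAIRS = [(0, 0), (1, 0), (0, 1), (1, 1)]


def bit_prob(llr, bit):
    """Pr(c = bit) from the LLR log(Pr(c=0)/Pr(c=1)), as in (3.37)."""
    return math.exp(-bit * llr) / (1.0 + math.exp(-llr))


def symbol_priors(llr_first, llr_second):
    """A priori probabilities of the 4 QPSK symbols as products of bit priors."""
    return [bit_prob(llr_first, c1) * bit_prob(llr_second, c2)
            for c1, c2 in BIT_PAIRS]
```

The resulting list can be used as $\Pr(x_k = x_{i,j})$ when rebuilding the transition matrices for the next equalizer pass.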

After an initial detection of a received frame, the iterative process is repeated a preset number of iterations (or until a suitably chosen termination criterion is reached). In the last iteration, the outer decoder only computes the data bit estimates

$$\hat{b}_k \triangleq \arg\max_{b \in \{0,1\}} \Pr(b_k = b \mid \cdot). \qquad (3.38)$$

3.5.2 Simulation results

In the following two figures, the BER performance of the binary code-aided QPSK-based FTN system is compared to the corresponding quaternary code-aided one.

Figure 3.14: Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at $\tau = 1/2$.


Figure 3.15: Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided QPSK-based FTN signaling at $\tau = 1/2$.

Comparing the two systems, we notice that their performance is very close. The differences are due to three facts: 1) different codes drawn from different alphabets are employed in each scheme, 2) each system is run with scaling gains s that are not guaranteed to be optimal, and 3) the conversion of the posterior probabilities at the output of each block decoder in the system of Figure 3.15 incurs losses in a practical implementation because of numerical problems. The system in Figure 3.14 employs the coding scheme that uses the QPSK symbols in a natural way. No conversion to the detector's posterior probabilities takes place, and the probabilities are passed around the turbo loop directly. For the system in Figure 3.15, a conversion of the posterior probabilities needs to be applied at the output of each block decoder, according to (3.34) and (3.37). This conversion increases the complexity considerably, especially when a high number of iterations is used. The conversion also incurs losses, and this problem should become more pronounced as the order of the applied modulation scheme increases.


CHAPTER 4

Turbo equalization of the faster-than-Nyquist signaling using the reduced-complexity Z-MAP algorithm

In this chapter, the performance of the Z-MAP algorithm is explored in the context of the turbo equalization of the faster-than-Nyquist (FTN) signals. The complexity of the Z-MAP, in terms of the number of the surviving states at each time epoch, is studied. Finally, the performance of the Z-MAP is compared to that of the backup M-BCJR for turbo equalizing the FTN signals.

4.1 Introduction

The M-BCJR [38] is a reduced-complexity variant of the optimal trellis-based MAP decoder, the BCJR [36]. At each trellis depth $k$, the M-BCJR keeps the best $M$ states in terms of the metric values. Other states are declared null, and no paths are extended from a canceled state. This process reduces the complexity of the full BCJR considerably, at the expense of performance degradation. The M-BCJR is widely applied in the context of turbo equalization [99]. Other reduced-search variants of the M-BCJR are found in [27,100,101].
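The pruning step of the M-BCJR can be sketched as follows. This is an illustrative sketch, not the implementation used in the simulations: the function name `keep_best_m` is hypothetical, and renormalizing the surviving metrics after pruning is an assumption.

```python
def keep_best_m(metrics, m):
    """Keep the m largest state metrics, null the rest, and renormalize.

    metrics holds one nonnegative metric per trellis state at a given depth.
    """
    # Indices of the m best states (ties are broken by state index).
    best = set(sorted(range(len(metrics)),
                      key=lambda s: metrics[s], reverse=True)[:m])
    kept = [x if s in best else 0.0 for s, x in enumerate(metrics)]
    total = sum(kept)
    return [x / total for x in kept]
```

Applied to the metrics (0.5, 0.1, 0.3, 0.1) with M = 2, only the first and third states survive, and no path is extended from the two nulled states at the next depth.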


One reduced-search algorithm that performs better than the M-BCJR is the T-BCJR [38]. In the T-algorithm, states with metrics below a threshold are discarded. The T-algorithm improves upon the M-BCJR performance, but the number of surviving states varies with the trellis depth $k$.
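The threshold rule of the T-algorithm can be sketched in the same style. This sketch assumes that the threshold is applied to normalized metrics and that the best state is always retained; both choices, and the name `keep_above_threshold`, are assumptions made for illustration.

```python
def keep_above_threshold(metrics, threshold):
    """Discard states whose normalized metric falls below the threshold."""
    total = sum(metrics)
    normalized = [x / total for x in metrics]
    kept = [x if x >= threshold else 0.0 for x in normalized]
    if not any(kept):
        # Assumed safeguard: never discard every state.
        best = max(range(len(normalized)), key=lambda s: normalized[s])
        kept[best] = normalized[best]
    kept_total = sum(kept)
    return [x / kept_total for x in kept]
```

Unlike the M-BCJR, the number of survivors here depends on the metrics themselves: a flat metric vector keeps many states, while a peaked one may keep a single state.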

To improve upon the performance of the backup M-BCJR, the Z-MAP algorithm is applied for the turbo equalization of the FTN signaling. The results show the superiority of the Z-MAP over the backup M-BCJR in terms of performance improvement at nearly no cost.

The Z-MAP is a variant of the M-BCJR that adapts the algorithm to the quality of the trellis metrics. The M-BCJR algorithm reduces the complexity of the BCJR by maintaining only the best $M$ values of $\alpha_k(s)$ and $\beta_k(s)$. The other states are neglected and set to zero, thus canceling a great number of states and their corresponding outgoing branches. This is based on (2.72) and (2.73) in Chapter 2.

The weakness of the M-BCJR algorithm lies in its tendency towards error amplification [42]. Because a large number of the trellis states are forced to zero, the M-BCJR produces poor estimates of the alphas, $\alpha_k(s)$. This behavior causes bursts of errors throughout the data block. At erroneous moments, there are usually one or more competitors to the most probable state. One way of dealing with these error moments is to identify them first with a certain criterion, and then to increase the number of surviving states around these moments. This is the principle upon which the Z-MAP algorithm is based.


The Z-MAP algorithm is named so because the shape of the letter “Z” mimics the algorithm’s way of operation: GO, STOP in error instance, BACK and increase the number of states.

4.2 Error moments

In M-BCJR equalization, it is observed that a correct instant is well approximated by an impulse: the most probable state has a high alpha value, as shown in Figure 4.1. Other alphas have very small levels (probabilities) due to more zeros in the previous instants. The presence of an error is generally identified by the presence of one or more competitors in the alpha values, as shown in Figure 4.2. This phenomenon can be used as an easy criterion to locate a possible error.

Figure 4.1: The most probable state in the correct instant.


Figure 4.2: Presence of a competitor at the error instant.

Figure 4.3 shows a trellis with 4 states of an encoder which has a memory of 2. A relation between instant $k$ and the past is established in (4.1), assuming this past is $k-3$. In this case, the alpha of state 0 is found. The reason why a competitor appears in the M-BCJR is illustrated both in Figure 4.3 and in (4.1).

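$$
\begin{aligned}
\alpha_k(0) ={} & \left[\gamma_{k-3}(0,0)\,\gamma_{k-2}(0,0)\,\gamma_{k-1}(0,0) + \gamma_{k-3}(0,2)\,\gamma_{k-2}(2,1)\,\gamma_{k-1}(1,0)\right] \alpha_{k-3}(0) \\
& + \left[\gamma_{k-3}(1,0)\,\gamma_{k-2}(0,0)\,\gamma_{k-1}(0,0) + \gamma_{k-3}(1,2)\,\gamma_{k-2}(2,1)\,\gamma_{k-1}(1,0)\right] \alpha_{k-3}(1) \\
& + \left[\gamma_{k-3}(2,1)\,\gamma_{k-2}(1,0)\,\gamma_{k-1}(0,0) + \gamma_{k-3}(2,3)\,\gamma_{k-2}(3,1)\,\gamma_{k-1}(1,0)\right] \alpha_{k-3}(2) \\
& + \left[\gamma_{k-3}(3,1)\,\gamma_{k-2}(1,0)\,\gamma_{k-1}(0,0) + \gamma_{k-3}(3,3)\,\gamma_{k-2}(3,1)\,\gamma_{k-1}(1,0)\right] \alpha_{k-3}(3)
\end{aligned}
\qquad (4.1)
$$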


Figure 4.3: A trellis with 4 states.

If we assume that at instant $k-3$, $\alpha_{k-3}(3)$ is forced to zero by the M-BCJR technique, this action changes the four alphas $\alpha_k(0)$, $\alpha_k(1)$, $\alpha_k(2)$, and $\alpha_k(3)$ in different proportions. This phenomenon creates a competitor in the M-BCJR decoding.

According to [42], the accumulation of zero forcing in the M-BCJR algorithm is significant and favors the simultaneous appearance of competitors. These observations give a way of predicting the error moments during decoding. The existence of a competitor indicates a potentially erroneous decision. The prediction principle can be stated as follows: if the competitor exceeds a threshold ξ, then the Z-MAP algorithm should be applied.

The Z-MAP principle is as follows: when an error instant is located, the cause of the degradation should be removed. The Z-MAP algorithm works by increasing the number of states around this instant. A threshold, ξ, for detecting the presence of an error should be fixed, and a fixed backward length should be specified.
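The detection of error moments can be sketched as a scan over the normalized alpha vectors. The exact form of the criterion is an assumption here: the sketch flags an instant when the ratio of the runner-up metric to the winning metric exceeds ξ, which is one way to make the threshold depend only on the relative proportions of the metrics. The name `error_moments` is hypothetical.

```python
def error_moments(alphas, xi):
    """Return the trellis instants at which a competitor is present.

    alphas is a list of state-metric vectors, one per trellis instant.
    Assumed criterion: runner-up / winner > xi.
    """
    moments = []
    for k, alpha in enumerate(alphas):
        ranked = sorted(alpha, reverse=True)
        winner, competitor = ranked[0], ranked[1]
        if winner > 0 and competitor / winner > xi:
            moments.append(k)
    return moments
```

For example, with ξ = 0.7, the vector (0.5, 0.45, 0.05, 0) is flagged because the runner-up is at 90% of the winner, while (0.9, 0.1, 0, 0) is not.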


The Z-MAP procedure [42] is as follows:

1) Locate the error moment during the progress of the M-BCJR in one direction. Assume, for example, the forward direction during the calculation of the alphas, and let $k$ denote this moment.

2) Make a return of a fixed number of time units (the backward length) in the trellis.

3) Increase $M$ to $M'$, with $M' > M$, where this increase should not exceed $2 \times M$.

An example is shown in Figure 4.4. Figure 4.4(a) shows the surviving states during the execution of the 2-BCJR on a trellis with a total of 4 states. Suppose an error is located at moment 5, and $M = 2$. A return is made to moment 2. At moment 2, all outgoing branches that allow up to $2 \times 2$ surviving states are kept. This process is repeated until we get to moment 6. Right after that, the Z-MAP stops and the process proceeds as a normal backup M-BCJR algorithm.

The process of increasing the number of surviving states at moment 2 enhances the metric value for the winning state afterwards and degrades the metric values of the other competitors. This results in a better-quality winning state at moment 6 that is more distinguishable from the other surviving states at that moment. This behavior improves performance, as reflected in the figures on the following pages. Figure 4.4(b) shows the procedure of the Z-MAP.
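The three steps of the procedure can be sketched for the forward recursion as follows. This is a simplified sketch under several assumptions, not the algorithm of [42] as implemented in the simulations: the competitor test is the ratio criterion (runner-up over winner exceeding ξ), the widened search runs through the moment after the detected error, no detection is performed while the search is widened, and all names (`zmap_forward`, `m_boost`, `back`) are hypothetical.

```python
def keep_best(metrics, m):
    """Zero all but the m largest state metrics, then renormalize."""
    best = set(sorted(range(len(metrics)),
                      key=lambda s: metrics[s], reverse=True)[:m])
    kept = [x if s in best else 0.0 for s, x in enumerate(metrics)]
    total = sum(kept)
    return [x / total for x in kept]


def has_competitor(alpha, xi):
    """Assumed criterion: the runner-up exceeds xi times the winner."""
    ranked = sorted(alpha, reverse=True)
    return ranked[1] > xi * ranked[0]


def zmap_forward(P, alpha0, m, m_boost, xi, back):
    """Forward pass with m survivors, widened to m_boost around error moments.

    P[k][i][j] is the branch metric from state i to state j in section k.
    Returns the list of pruned, normalized alpha vectors for moments 0..K.
    """
    n = len(alpha0)
    alphas = [keep_best(alpha0, m)]
    boosted_until = -1  # sections with index < boosted_until use m_boost
    k = 0
    while k < len(P):
        boosted = k < boosted_until
        raw = [sum(alphas[k][i] * P[k][i][j] for i in range(n))
               for j in range(n)]
        nxt = keep_best(raw, m_boost if boosted else m)
        if not boosted and has_competitor(nxt, xi):
            # Step 1: an error moment is located at k + 1.
            # Step 2: return `back` time units and drop the later alphas.
            start = max(0, k + 1 - back)
            del alphas[start + 1:]
            # Step 3: widen the search through the moment after the error.
            boosted_until = k + 2
            k = start
            continue
        alphas.append(nxt)
        k += 1
    return alphas
```

Because every return is followed by a widened pass that extends beyond the error moment, each later detection occurs at a strictly later moment, so the loop terminates. Counting the nonzero entries of each returned vector gives the number of live states per moment, from which an average complexity can be computed.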


Figure 4.4: The Z-MAP principle [42].

4.3 Z-MAP applied to turbo equalization of FTN signals

The simulation setup applies FTN turbo equalization and a rate-1/2 (5,7) convolutional code. The simulations run for 4 iterations. The Z-MAP starts with 8 states, which reduce to 4 states.


The Z-MAP is used in the equalizer with a threshold value ξ = 0.7 and a traceback length of 4. Both the threshold value ξ and the traceback length are chosen such that the system complexity converges to an average of four states at practical SNR values while improving upon the 4-BCJR performance.

4.3.1 Simulation results

In what follows, a discussion of some of the attained results using the Z-MAP is introduced.

In the context of FTN equalization, three FTN-based system configurations are tested with the Z-MAP applied as the reduced-complexity equalizer. They are as follows:

1. Z-MAP in the simple detection of an uncoded binary-based FTN system.

2. Z-MAP in binary code-aided BPSK-based FTN system.

3. Z-MAP in quaternary code-aided QPSK-based FTN system.

Figure 4.5 shows the behavior of the Z-MAP complexity curve versus the SNR values for the uncoded simple detection FTN scheme. From the figure, it is seen that the algorithm's complexity drops to an average of 4.2 states at an SNR value of 10 dB.


Figure 4.5: Average number of states of the Z-MAP simple detection of the FTN binary signals at $\tau = 1/2$.

Figure 4.6 compares the BER performance of two algorithms working on approximately the same average number of live states, which is equal to 4, in the context of simple detection of the FTN signals. The algorithms are the 4-BCJR and the 8-state Z-MAP.


Figure 4.6: BER comparison between M-BCJR and Z-MAP turbo decoding for binary FTN signaling at $\tau = 1/2$.

From the figure, we notice that an improvement of around 1 dB is achieved when employing the 8-state Z-MAP as compared to the 4-BCJR while operating at the same complexity of 4 states.

Figure 4.7 shows the BER performance of the binary code-aided BPSK-based FTN with a turbo decoder applying the Z-MAP with 8 states that can be reduced to 4 states. The figure shows the BER curves for various numbers of iterations.


Figure 4.7: Turbo equalizer BER vs. $E_b/N_0$ for binary code-aided BPSK-based FTN signaling at $\tau = 1/2$ applying the Z-MAP.

From the figure, it is shown that the performance improves as we go higher in the number of iterations. To relate the quality of the performance to the complexity of the applied Z-MAP, Figure 4.8 portrays the complexity of the equalizer (represented as the average number of the algorithm’s live states) versus the SNR, and as a function of the number of iterations as well.


Figure 4.8: Average number of states of the Z-MAP turbo system of Figure 4.7.

From the figure, it is shown that as the SNR goes higher, the complexity of the equalizer decreases, since the quality of the winning states compared to other competitors generally gets better.

Furthermore, the same trend of complexity behavior is observed with respect to the number of iterations. From the figure, as the number of iterations increases, the complexity goes down. With more iterations, the reliability of the soft information passed around the turbo loop is improved, and thus, the quality of the winning state metric is improved compared to the other competitors.

Considering the performance shown in Figures 4.7 and 4.8, as the SNR and the number of iterations are increased, the Z-MAP converges to the performance of an 8-state BCJR while incurring the complexity of a 4-state BCJR. This provides considerable flexibility in the performance-complexity tradeoff of the M-BCJR algorithm and shows the improvement of the Z-MAP compared to the M-BCJR.

We see from the figures that the Z-MAP has the performance of the 8-state BCJR, while working with the reduced complexity of only 4.28 states at a BER of . ≈ 10

Simulations show that the Z-MAP improves upon the performance of the backup M-BCJR at the price of a small increase in complexity (an average of 0.28 more states at practical values of SNR and with 4 iterations).

The results of the Z-MAP algorithm, obtained in the context of the turbo equalization of the FTN signals, show a clear efficiency improvement over the already efficient reduced-complexity M-BCJR.

Figure 4.9 compares the BER performance of the two algorithms working on almost the same average number of live states, which is equal to 4 at the 4th iteration, for turbo equalizing the BPSK-based FTN signals. These are the 4-BCJR and the 8-state Z-MAP. We notice that the latter shows a considerable improvement over the former. Around 0.5 dB of improvement is achieved with the 8-state Z-MAP as compared to the 4-BCJR at an SNR of 5 dB while having the same complexity.


Figure 4.9: BER comparison between M-BCJR and Z-MAP turbo decoding at the 4th iteration for binary code-aided BPSK-based FTN signaling at $\tau = 1/2$.

Figures 4.10 and 4.11 show the behavior of the Z-MAP error performance and complexity, respectively, versus the SNR values and the number of turbo loop iterations when the algorithm is applied to the equalization of the quaternary code-aided QPSK-based FTN system. In both figures, the threshold value preset for the algorithm is 0.49.


Figure 4.10: Turbo equalizer BER vs. $E_b/N_0$ for quaternary code-aided QPSK-based FTN signaling at $\tau = 1/2$ applying the Z-MAP.

Figure 4.11: Average number of states of the Z-MAP turbo system of Figure 4.10.


From the two figures, we notice that in the context of quaternary processing the Z-MAP exhibits the same trend as in the binary case. This indicates that the Z-MAP is well suited to turbo equalization and that it is flexible enough to handle nonbinary symbol alphabets.

One advantage of the Z-MAP algorithm over the T-MAP [38], in the context of turbo equalization of the severe ISI introduced by FTN, is its ease of implementation: a system designer does not need to know the actual state metric values in order to specify a threshold that meets a desired performance-complexity tradeoff. In the Z-MAP algorithm, the threshold value depends only on the relative proportions of the state metrics, and not on their absolute values, as with the T-MAP. The Z-MAP threshold is therefore more meaningful to the system designer as a value by itself, which makes the process more intuitive and makes it convenient to work with different setups and modulation alphabets. With the T-MAP, more work is needed to find the threshold value each time the system parameters change.

The Z-MAP is of particular value in the context of FTN equalization. Since the ISI introduced by FTN is severe, ISI models as long as 32 taps are used, which causes exponential growth of the system's trellis. A significant reduction in the trellis state space is necessary to make the system practical, and this reduction in complexity is accompanied by performance degradation. Any increase in the system's overall performance at the same level of complexity is therefore valuable, which is exactly what the Z-MAP provides for FTN equalization.
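The exponential growth of the trellis can be made concrete with a small calculation. The sketch assumes the usual full-state ISI trellis, in which the number of states equals the alphabet size raised to the channel memory (the number of taps minus one); the function name `trellis_states` is hypothetical.

```python
def trellis_states(alphabet_size, taps):
    """Number of states in a full ISI trellis, assuming memory = taps - 1."""
    return alphabet_size ** (taps - 1)


binary_states = trellis_states(2, 32)      # 2**31 states for a binary alphabet
quaternary_states = trellis_states(4, 32)  # 4**31 states for a quaternary alphabet
```

Against a full trellis of this size, a search that keeps only a handful of states per depth is an extreme reduction, which is why small performance gains at fixed complexity matter.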


In conclusion, the idea behind the Z-MAP algorithm is important, and it opens a new line of research in digital communications. The Z-MAP has been shown to be an improved version of the backup M-BCJR algorithm: it improves upon its performance in the context of turbo equalization of FTN signals at a slight increase in complexity (an average of 0.28 more states). The Z-MAP makes use of the full potential of the backup M-BCJR by providing adaptivity to the channel conditions and the SNR values. This is significant in the context of FTN equalization, since the unavoidable heavy reduction of the system complexity makes even a slight improvement of the FTN system performance valuable.

It is noticed that the improvement attained, in terms of the SNR gain, is greater when the Z-MAP is applied to the simple detection of the FTN signals. One possible explanation is that the simple detection scheme is simpler than the coded system's structure. In addition, in uncoded FTN systems only the sign of the LLRs is needed to detect the symbols, which makes any advantage added by the Z-MAP more pronounced.

It is apparent that the binary Z-MAP can be extended to handle the higher order alphabets in a straightforward manner.


CHAPTER 5

Summary and future directions

In this dissertation, a somewhat unconventional signaling scheme is considered. In faster-than-Nyquist signaling, transmission is carried out at a rate faster than that allowed by Nyquist's orthogonality criterion, which intentionally introduces intersymbol interference. Shannon showed in 1949 that the capacity limit can be reached using the memoryless transmission of ISI-free orthogonal sinc pulses and long symbol sequences. In practice, this memoryless assumption in the modulator has proven lossy for the capacity.

Faster-than-Nyquist signaling exploits the excess bandwidth of the $T$-orthogonal pulses and achieves capacity with discrete symbol alphabets. The interesting thing about FTN is that it maintains a fixed power spectral density while achieving higher bit densities, in bits/Hz-s, than orthogonal transmission schemes. FTN is one of the most promising techniques for future satellite and wireless communications because of its ability to pack more data into a given bandwidth without the need to increase the transmission energy.

There is a tradeoff when employing FTN signaling between the bandwidth efficiency and the receiver complexity. The ISI introduced by FTN is trellis-structured since it can be well approximated by a finite state machine. In this dissertation, we have extended the BPSK-based FTN signaling to QPSK-based signaling along with quaternary convolutional coding, which uses this modulation alphabet in a natural way. The signaling scheme has been decoded using quaternary-extended reduced-complexity receivers, which can, with practical complexity levels, achieve near-optimal performance under the severe ISI generated by the higher transmission rate in FTN. These contributions make this signaling method more useful and even more bandwidth-efficient.

In this dissertation, a system setup is presented including the modeling of the linear transmitted signals along with the trellis-based detection algorithms employed at the receiver for the nonbinary code-aided QPSK-based FTN signaling. The turbo principle is introduced as well.

Furthermore, the reduced-complexity M-BCJRs are extended to handle nonbinary modulation alphabets and are applied to simple ISI detection as well as to turbo equalization of the quaternary code-aided and the QPSK-based FTN signaling. The M-BCJR is extended to the simple detection of the QPSK-based FTN to achieve twice the bandwidth efficiency of the BPSK-based FTN.

In a heavily reduced search, there is usually no overlap between the forward and the backward recursions for one of the symbols. Therefore, the decoder cannot get reliable soft information to be passed around the turbo loop. In addition, the magnitude of the probabilities in the numerator and the denominator of the log likelihood ratios may differ significantly from their respective values in a full-complexity detector, which results in overestimating the LLR values. In simple detection of the FTN signals, only the sign of the LLRs is needed, while for turbo equalization accurate absolute values of the LLRs are required to provide reliable performance. The backup M-BCJR solves this problem by adding a third low-complexity recursion to provide backup values in case of empty LLRs.

The backup M-BCJR is also extended to accommodate the quaternary code-aided QPSK-based FTN. The result is a system that is twice as bandwidth-efficient as the coded BPSK-based FTN at the same performance, with a slight increase of the underlying complexity.

A comparison is carried out between two system configurations, the quaternary code-aided QPSK-based FTN system and the binary code-aided QPSK-based FTN system. The former proved superior to the latter in terms of both performance and the underlying complexity. The former system's code uses the modulation alphabet in a natural way, and therefore, no conversion of the posterior probabilities takes place before each component decoder. In the binary code-aided system, however, a conversion of the posterior probabilities has to be done for the probabilities to be passed around the turbo loop.

It is important to improve the quality of the LLRs passed around the turbo loop since they affect the stability and convergence of the turbo loop. A useful strategy is to scale the extrinsic LLRs by a scaling gain $s \le 1$. An appropriately chosen $s$ would greatly improve the turbo detector's error performance without any increase in the underlying complexity. An interesting future subject is to find the best scaling factor analytically and to adapt it to the quality of the LLRs, in other words, to the SNR value and the number of surviving states $M$. All results in this dissertation could be extended to work with other outer codes and larger symbol alphabets.
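The scaling strategy itself is a one-line operation, sketched below for illustration. The function name `scale_extrinsic` is hypothetical, and restricting the gain to the interval (0, 1] is an assumption that follows the constraint stated in the text.

```python
def scale_extrinsic(llrs, s):
    """Scale extrinsic LLRs by a gain s, with 0 < s <= 1 assumed."""
    if not 0.0 < s <= 1.0:
        raise ValueError("scaling gain must satisfy 0 < s <= 1")
    return [s * llr for llr in llrs]
```

Scaling leaves the sign of every LLR, and hence every hard decision, unchanged; it only tempers the confidence that is passed to the next component decoder, which is why it adds no search complexity.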

Chapter 4 presents the Z-MAP algorithm in the context of the turbo equalization of the coded FTN signals. The Z-MAP has been shown to be an improved version of the M-BCJR algorithm. The advantage of the Z-MAP is that it offers adaptivity to the quality of the trellis metrics, and hence it improves the BER performance by increasing the number of live states around an error burst. This considerably improves the performance at almost no increase in the complexity at practical SNR values and a sufficient number of iterations.

The idea behind the Z-MAP algorithm is important, and it opens a new line of research in digital communications. Even though the M-BCJR is a highly efficient reduced-complexity algorithm, it requires a fixed number of live states over all SNR values and all iterations, regardless of the quality of the state metrics.

The Z-MAP makes use of the full potential of the backup M-BCJR by providing adaptivity to the channel conditions and the SNR values. This is significant in the context of FTN equalization, since the unavoidable heavy reduction of the system complexity makes even a slight improvement of the FTN system performance valuable.


REFERENCES

[1] BEREC. http://erg.eu.int.

[2] FCC. www.fcc.gov.

[3] TRAI. www.trai.gov.in.

[4] F. F. Lanas and P. P. Gomez, “Global communications newsletter,” IEEE Communications Magazine, vol. 49, no. 7, Jul. 2011.

[5] M. El Hefnawy and H. Taoka, “Overview of faster-than-Nyquist for future mobile communication systems,” DOCOMO Communications Laboratories Europe, Munich, Germany.

[6] H. Nyquist, “Certain topics in telegraph transmission theory,” AIEE Transactions, pp. 617–644, 1928.

[7] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379–423 and 623–656, Jul. and Oct. 1948.

[8] C. E. Shannon, “Communication in the presence of noise,” in Proc. IRE, vol. 37, pp. 10–21, 1949.

[9] GSM. www.etsi.org/website/technologies/gsm.aspx.

[10] J. E. Mazo, “Faster-than-Nyquist signaling,” Bell System Technical Journal, vol. 54, no. 8, pp. 1451–1462, Oct. 1975.

[11] R. G. Gallager, “Low density parity check codes,” Monograph, MIT Press, 1963.

[12] J. G. Proakis and M. Salehi, Digital Communications, McGraw Hill, 5th edition, 2008.

[13] F. Rusek, Partial response and faster-than-Nyquist signaling, PhD thesis, Dept. of Electrical and Information Technology, Lund University, 2007.

[14] A. D. Liveris and C. N. Georghiades, “Exploiting faster-than-Nyquist signaling,” IEEE Trans. on Communications, vol. 51, no. 9, pp. 1502–1511, Sep. 2003.


[15] A. Barbieri, D. Fertonani, and G. Colavolpe, “Time-frequency packing for linear modulations: Spectral efficiency and practical detection schemes,” IEEE Trans. on Communications, vol. 57, no. 10, pp. 2951–2959, Oct. 2009.

[16] M. McGuire and M. Sima, “Discrete time faster-than-Nyquist signaling,” in Proc. of IEEE Global Conference (GLOBECOM), Dec. 2010.

[17] F. M. Han and X. D. Zhang, “Wireless multicarrier digital transmission via Weyl-Heisenberg frames over time-frequency dispersive channels,” IEEE Trans. on Communications, vol. 57, pp. 1721–1733, Jun. 2009.

[18] M. Hamamura and S. Tachikawa, “Bandwidth efficiency improvement for multi-carrier systems,” in IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, vol. 1, pp. 48–52, Sep. 2004.

[19] Y. G. Yoo and J. H. Cho, “Asymptotic optimality of binary faster-than-Nyquist signaling,” IEEE Communications Letters, vol. 14, no. 9, pp. 788–790, Sep. 2010.

[20] Y. J. D. Kim and J. Bajcsy, “On spectrum broadening of pre-coded faster-than-Nyquist signaling,” in Proc. IEEE Vehicular Technology Conference (VTC Fall), Sep. 2010.

[21] I. Kanaras, A. Chorti, M. R. D. Rodrigues, and I. Darwazeh, “Spectrally efficient FDM signals: Bandwidth gain at the expense of receiver complexity,” in IEEE International Conference on Communications, pp. 1–6, Jun. 2009.

[22] S. Isam and I. Darwazeh, “Peak to average power ratio reduction in spectrally efficient FDM systems,” in 18th International Conference on Telecommunications (ICT), pp. 363–368, May 2011.

[23] J. B. Anderson and M. Zeinali, “Best rate ½ convolutional codes for turbo equalization with severe ISI,” in IEEE International Symposium on Information Theory Proceedings, pp. 2366–2370, 2010.

[24] F. Rusek and J. B. Anderson, “Constrained capacities for faster-than-Nyquist signaling,” IEEE Trans. on Information Theory, vol. 55, no. 2, Feb. 2009.

[25] F. Rusek and J. B. Anderson, “Serial and parallel concatenations based on faster than Nyquist signaling,” in Proc. IEEE Int. Symp. Information Theory, pp. 970–974, Seattle, WA, July 2006.

[26] A. Prlja, J. B. Anderson, and F. Rusek, “Receivers for faster-than-Nyquist signaling with and without turbo equalization,” in Proc. IEEE International Symposium on Information Theory (ISIT), Toronto, Canada, July 2008.

[27] A. Prlja and J. B. Anderson, “Reduced-complexity receivers for strongly narrowband intersymbol interference introduced by faster-than-Nyquist signaling,” IEEE Transactions on Communications, vol. 60, no. 9, pp. 2591–2601, Sep. 2012.


[28] J. B. Anderson, F. Rusek, and V. Owall, “Faster than Nyquist signaling,” Proceedings of the IEEE, vol. 101, no. 8, pp. 1817–1830, Aug. 2013.

[29] G. D. Forney, Jr., “Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference,” IEEE Trans. Inform. Theory, vol. 18, no. 3, pp. 363–378, May 1972.

[30] A. J. Viterbi, “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm,” IEEE Trans. Inf. Theory, vol. 13, no. 2, pp. 260–269, Apr. 1967.

[31] J. B. Anderson, Instrumentable tree encoding of information sources, M.Sc. thesis, School of Electrical Engineering, Cornell University, Ithaca, NY, Sep. 1969.

[32] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error correcting coding and decoding: Turbo-codes (1),” in Proc. IEEE Int. Conf. Commun. (ICC), vol. 2, pp. 1064–1070, May 1993.

[33] C. Berrou and A. Glavieux, “Near optimum error correcting coding and decoding: Turbo-codes,” IEEE Trans. Commun., vol. 44, no. 10, pp. 1261–1271, Oct. 1996.

[34] J. Hagenauer, “The turbo principle: Tutorial introduction and state of the art,” in Proc. Int. Symp. Turbo Codes, pp. 1–11, ENST de Bretagne, France, Sep. 1997.

[35] C. Douillard, A. Picart, P. Didier, M. Jezequel, C. Berrou, and A. Glavieux, “Iterative correction of intersymbol interference: Turbo equalization,” Eur. Trans. Telecomm., vol. 6, no. 5, pp. 507–512, Sept./Oct. 1995.

[36] R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inform. Theory, vol. 20, no. 2, pp. 284–287, Mar. 1974.

[37] J. Hagenauer and P. Hoeher, “A Viterbi algorithm with soft-decision outputs and its applications,” in Proc. IEEE Global Telecomm. Conf. (GLOBECOM), vol. 3, pp. 1680–1686, Dallas, Nov. 1989.

[38] V. Franz and J. B. Anderson, “Concatenated decoding with a reduced search BCJR algorithm,” IEEE J. Sel. Areas Commun., vol. 16, pp. 186–195, Feb. 1998.

[39] K. K. V. Wong, The soft-output M-algorithm and its applications, Ph.D. thesis, Dept. Electrical and Computer Eng., Queens University, Canada, 2006.

[40] J. Hagenauer and C. Kuhn, “Turbo equalization for channels with high memory using a list-sequential equalizer,” in Proc. Int. Symp. Turbo Codes, ENST de Bretagne, France, Sept. 2003.

[41] B. M. Hochwald and S. ten Brink, “Achieving near-capacity on a multiple antenna channel,” IEEE Trans. Commun., vol. 53, pp. 389–399, Mar. 2003.


[42] A. Ouardi, A. Djebbari, and B. Bouazza, “Optimal M-BCJR turbo decoding: The Z-MAP algorithm,” Wireless Engineering and Technology, vol. 2, no. 4, pp. 230–234, 2011.

[43] A. J. Viterbi and J. K. Omura, Principles of digital communication and coding, McGraw-Hill, NY, 1979.

[44] J. B. Anderson, Digital transmission engineering, IEEE Press, Piscataway, NJ, 2nd ed., 2005.

[45] J. B. Anderson and A. Svensson, Coded modulation systems, Kluwer-Plenum, New York, 2003.

[46] G. J. Foschini, “Performance bound for maximum-likelihood reception of digital data,” IEEE Trans. Inf. Theory, vol. 21, no. 1, pp. 47–50, Jan. 1975.

[47] D. E. Knuth, The art of computer programming, vol. 3: Sorting and searching, Addison-Wesley, Reading, Mass., 1973.

[48] J. B. Anderson and E. Offer, “Reduced-state sequence detection with convolutional codes,” IEEE Trans. Inform. Theory, vol. 40, no. 5, pp. 965–972, May 1994.

[49] T. Aulin, “Breadth-first maximum likelihood sequence detection: Basics,” IEEE Trans. Commun., vol. 47, no. 2, pp. 208–216, Feb. 1999.

[50] F. L. Vermeulen and M. E. Hellman, “Reduced state Viterbi decoding for channels with intersymbol interference,” in Proc. IEEE Int. Conf. Commun. (ICC), pp. 37B.1–37B.9, Minneapolis, June 1974.

[51] G. J. Foschini, “A reduced state variant of maximum likelihood sequence detection attaining optimum performance for high signal-to-noise ratios,” IEEE Trans. Information Theory, vol. 23, no. 5, pp. 605–609, Sept. 1977.

[52] S. J. Simmons, “Breadth-first trellis decoding with adaptive effort,” IEEE Trans. Communs., vol. 38, no. 1, pp. 3–12, Jan. 1990.

[53] A. Duel-Hallen and C. Heegard, “Delayed decision-feedback sequence estimation,” IEEE Trans. Communs., vol. 37, pp. 428–436, May 1989.

[54] M. V. Eyuboglu and S. U. Qureshi, “Reduced-state sequence estimation with set partitioning and decision feedback,” IEEE Trans. Communs., vol. 36, pp. 13–20, Jan. 1988.

[55] R. Johannesson and K. Sh. Zigangirov, Fundamentals of convolutional coding, IEEE Press, Piscataway, NJ, 1999.


[56] J. B. Anderson and S. Mohan, Source and channel coding , Kluwer, Boston, MA., 1991.

[57] G. J. Foschini, “Contrasting performance of faster binary signaling with QAM,” Bell Laboratories Technical Journal , vol. 63, pp. 1419–1445, Oct. 1984.

[58] N. Seshadri, Error performance of trellis modulation codes on channels with severe intersymbol interference , Ph.D. thesis, Dept. Elec., Comp. and System Eng., Rensselaer Poly. Inst., Troy, NY, Sept. 1986.

[59] F. Rusek and J.B. Anderson, “Multi-stream faster-than-Nyquist signaling,” IEEE Trans. Communs. , vol. 57, pp. 1329–1340, May 2009.

[60] F. Rusek and J. B. Anderson, “The two dimensional Mazo limit,” in Proc. IEEE Int. Symp. Information Theory, pp. 970–974, Adelaide, 2005.

[61] F. Rusek and J. B. Anderson, “Successive interference cancellation in multistream faster-than-Nyquist signaling,” in Proc. Intl. Wireless Comm. and Mobile Computing Conf. (IWCMC’06), Vancouver, Canada, July 2006.

[62] F. Rusek and J. B. Anderson, “Improving OFDM: Multistream faster-than-Nyquist signaling,” in Proc. 4th Int. Symp. Turbo Codes & Related Topics, Munich, Germany, 2006.

[63] J. H. Lee and Y. H. Lee, “Design of multiple MMSE subequalizers for faster-than-Nyquist-rate transmission,” IEEE Trans. Communs., vol. 52, pp. 1257–1264, Aug. 2004.

[64] A. D. Liveris, On distributed coding, quantization of channel measurements and faster-than-Nyquist signaling, Ph.D. thesis, Dept. Elec. Eng., Texas A&M Univ., April 2006.

[65] F. Rusek and J. B. Anderson, “M-ary coded modulation by Butterworth filtering,” in Proc. IEEE Int. Symp. Information Theory, p. 184, Yokohama, Japan, 2003.

[66] J. E. Mazo and H. J. Landau, “On the minimum distance problem for faster-than-Nyquist signaling,” IEEE Trans. Information Theory, vol. 34, pp. 1420–1427, 1988.

[67] D. Hajela, “On computing the minimum distance for faster-than-Nyquist signaling,” IEEE Trans. Information Theory, vol. 36, pp. 289–295, 1990.

[68] F. Rusek and J. B. Anderson, “On information rates of faster-than-Nyquist signaling,” in Proc. IEEE Global Telecomm. Conf. (GLOBECOM), San Francisco, CA, 2006.

[69] K. T. Wu and K. Feher, “Multilevel PRS/QPRS above the Nyquist rate,” IEEE Trans. Communs., vol. 33, pp. 735–739, July 1985.

[70] F. Rusek, “A first encounter with faster-than-Nyquist signaling over the MIMO channel,” in Proc. IEEE Wireless Comm. Networking Conf. (WCNC), Hong Kong, March 2007.

[71] F. Rusek and J. B. Anderson, “Optimal sidelobes under linear and faster-than-Nyquist modulation,” in Proc. IEEE Int. Symp. Information Theory, pp. 2301–2304, Nice, June 2007.

[72] A. Barbieri, D. Fertonani, and G. Colavolpe, “Improving the spectral efficiency of linear modulations through time-frequency packing,” in Proc. IEEE Int. Symp. Information Theory, pp. 2742–2746, Toronto, Canada, July 2008.

[73] G. Colavolpe, T. Foggi, A. Modenini, and A. Piemontese, “Faster-than-Nyquist and beyond: How to improve spectral efficiency by accepting interference,” Optics Express, vol. 19, no. 27, Dec. 2011.

[74] A. Modenini, G. Colavolpe, and N. Alagha, “How to significantly improve the spectral efficiency of linear modulations through time-frequency packing and advanced processing,” in Proc. IEEE Int. Conf. Commun. (ICC), pp. 3260–3264, June 2012.

[75] M. R. D. Rodrigues and I. Darwazeh, “A spectrally efficient frequency division multiplexing based communications system,” in Proc. 8th Intl. OFDM Workshop, Hamburg, 2003.

[76] P. N. Whatmough, M. R. Perrett, S. Isam, and I. Darwazeh, “VLSI architecture for a reconfigurable spectrally efficient FDM baseband transmitter,” in Proc. IEEE Int. Symp. Circuits and Syst. (ISCAS), pp. 1688–1691, May 2011.

[77] R. G. Clegg, S. Isam, I. Kanaras, and I. Darwazeh, “A practical system for improved efficiency in frequency division multiplexed wireless networks,” IET Commun., vol. 6, no. 4, pp. 449–457, March 2012.

[78] D. Dasalukunte, Multicarrier faster-than-Nyquist signaling transceivers, Ph.D. thesis, Elec. and Information Tech. Dept., Lund Univ., Lund, Sweden.

[79] W. Hirt, Capacity and information rates of discrete-time channels with memory, Ph.D. thesis, no. ETH 8671, Inst. Signal and Information Processing, Swiss Federal Inst. Technol., Zurich, 1988.

[80] R. A. Gibby and J. W. Smith, “Some extensions of Nyquist’s telegraph transmission theory,” Bell System Technical Journal, vol. 44, no. 2, pp. 1487–1510, Sep. 1965.

[81] S. Shamai, L. H. Ozarow, and A. D. Wyner, “Information rates for a discrete-time Gaussian channel with intersymbol interference and stationary inputs,” IEEE Trans. Inf. Theory, vol. 37, no. 6, pp. 1527–1539, Nov. 1991.

[82] D. Kapetanović, On linear transmission systems, Ph.D. thesis, Elec. and Information Tech. Dept., Lund Univ., Lund, Sweden, June 2012.

[83] M. Tüchler, R. Koetter, and A. C. Singer, “Turbo equalization: Principles and new results,” IEEE Trans. Commun., vol. 50, no. 5, pp. 754–767, May 2002.

[84] M. Tüchler, A. C. Singer, and R. Koetter, “Minimum mean squared error equalization using a priori information,” IEEE Trans. Signal Process., vol. 50, no. 3, pp. 673–683, March 2002.

[85] R. Koetter, A. C. Singer, and M. Tüchler, “Turbo equalization,” IEEE Signal Processing Magazine, vol. 21, no. 1, pp. 67–80, 2004.

[86] K. R. Narayanan, “Effect of precoding on the convergence of turbo equalization for partial response channels,” IEEE J. Select. Areas Commun., vol. 19, pp. 686–698, April 2001.

[87] M. Tüchler, C. Weis, E. Eleftheriou, A. Dholakia, and J. Hagenauer, “Application of high-rate tail-biting codes to generalized partial response channels,” in Proc. IEEE Global Telecomm. Conf. (GLOBECOM), vol. 5, pp. 2965–2971, San Antonio, Nov. 2001.

[88] D. Doan and K. R. Narayanan, “Some new results on the design of codes for inter-symbol interference channels based on convergence of turbo equalization,” in Proc. IEEE Int. Conf. Commun. (ICC), pp. 1873–1877, New York, April 2002.

[97] G. Ungerboeck, “Adaptive maximum-likelihood receiver for carrier modulated data-transmission systems,” IEEE Trans. Commun., vol. 22, no. 5, pp. 624–636, May 1974.

[98] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Trans. Inf. Theory, vol. 42, no. 3, pp. 429–445, Mar. 1996.

[99] G. Bauch, H. Khorram, and J. Hagenauer, “Iterative equalization and decoding in mobile communications systems,” in Second European Personal Mobile Communications Conference (2. EPMCC ’97), 1997, pp. 307–312.

[100] C. Fragouli, N. Seshadri, and W. Turin, “On the reduced trellis equalization using the M-BCJR algorithm,” in IEEE Annual Conference on Information Sciences and Systems (CISS 2000), 2000, pp. 28–33.

[101] P. Kumar, R. M. Banakar, and B. Shankaranand, “M-BCJR based turbo equalizer,” in Proceedings of the 7th International Conference on Intelligent Information Technology, ser. CIT’04. Berlin, Heidelberg: Springer-Verlag, 2004, pp. 376–386. [Online]. Available: http://dx.doi.org/10.1007/978-3-540-30561-3_39

[102] A. Alqudah and L. Joiner, “Turbo equalization of the faster-than-Nyquist signaling using the reduced-complexity Z-MAP algorithm,” International Journal of Computer Applications & Information Technology, vol. 9, pp. 201–207, July 2016.
