<<

Fine-grained Channel Access in LAN ∗ ∗ Kun Tan Ji Fang Yuanyang Zhang Microsoft Research Asia Microsoft Research Asia Microsoft Research Asia Beijing, China and Beijing Jiaotong University and Beihang University [email protected] Beijing, China Beijing, China [email protected] ∗ ∗ Shouyuan Chen Lixin Shi Jiansong Zhang Yongguang Zhang Microsoft Research Asia Microsoft Research Asia and Tsinghua University Beijing, China Beijing, China {jiazhang, ygz}@microsoft.com [email protected]

ABSTRACT Keywords Modern communication technologies are steadily advancing the Fine-grained channel access, OFDMA, Cross-layer, MAC (PHY) data rate in wireless LANs, from hundreds of Mbps in current 802.11n to over Gbps in the near future. As PHY data rates increase, however, the overhead of media access 1. INTRODUCTION control (MAC) progressively degrades data efficiency. Modern communication technologies are steadily advancing the This trend reflects a fundamental aspect of the current MAC proto- physical layer (PHY) data rates in wireless local area networks col, which allocates the channel as a single resource at a time. (WLANs). For example, the latest ratified 802.11n standard [1] has This paper argues that, in a high data rate WLAN, the channel boosted data rates to 600Mbps. This capacity growth is achieved should be divided into separate subchannels whose width is com- primarily through wider channel bandwidths and advanced PHY mensurate with PHY data rate and typical frame size. Multiple techniques like MIMO (Multiple-Input Multiple-Output). Future stations can then contend for and use subchannels simultaneously standards like IEEE 802.11ac and 802.11ad are already poised to according to their traffic demands, thereby increasing overall effi- provide even faster PHY rates (>1Gbps) in the next few years. ciency. We introduce FICA, a fine-grained However, the data throughput efficiency — the ratio between the that embodies this approach to media access using two novel tech- network throughput and the PHY data rate — degrades rapidly as niques. First, it proposes a new PHY architecture based on OFDM the PHY data rate increases due to the design of the current 802.11 that retains orthogonality among subchannels while relying solely (MAC) protocol. For example, given that on the coordination mechanisms in existing WLAN, carrier-sensing most IP packets have a maximal transmit unit (MTU) size around and . Second, FICA employs a frequency-domain con- 1500 bytes, the efficiency ratio in an 802.11n network at 300Mbps tention method that uses physical layer RTS/CTS signaling and fre- is only 20%. That is, the 300Mbps data rate can sustain an actual quency domain backoff to efficiently coordinate subchannel access. throughput of only 60Mbps [23]. We have implemented FICA, both MAC and PHY layers, using a The fundamental reason for this inefficiency is that the current software platform, and our experiments demonstrate the fea- MAC allocates the entire channel to one station as a single resource. sibility of the FICA design. Further, our simulation results suggest This allocation strategy can become too coarse-grained when the FICA can improve the efficiency ratio of WLANs by up to 400% channel width increases or PHY data rate increases. Even if a compared to existing 802.11. has a small amount of data to send, it still needs to contend for the entire channel. Such contention resolution time is there- Categories and Subject Descriptors fore an overhead to the channel time used for data. Unfortunately, C.2.1 [COMPUTER-COMMUNICATION NETWORKS]: Net- this overhead cannot easily be reduced due to constraints of current work Architecture and Design—Wireless communication electronics and physical laws. As a result, the higher the PHY data rate, the lower the throughput efficiency will become. General Terms One way to improve the MAC efficiency is to extend the use- ful channel time for data transmissions by sending larger frames. Algorithms, Design, Experimentation, Performance Indeed, IEEE 802.11n allows frame aggregation, i.e., sending mul- ∗ tiple frames together in one contention period. However, when the This work was performed while Ji Fang, Yuanyang Zhang, PHY data rate increases, the aggregated frame size needs to in- Shouyuan Chen and Lixin Shi were research interns at Microsoft Research Asia. crease as well: achieving an efficiency of 80% in a 300Mbps net- work would require frames to be as large as 23KB. This larger ag- gregated frame means longer delays as the sender must wait to col- lect enough frames before actual transmission, resulting in adverse Permission to make digital or hard copies of all or part of this work for effects to TCP, real-time applications like VoIP and video confer- personal or classroom use is granted without fee provided that copies are encing, and even Web browsing that involves chatty protocols or not made or distributed for profit or commercial advantage and that copies short-lived sessions. bear this notice and the full citation on the first page. To copy otherwise, to We argue that a better way to improve WLAN efficiency is to ef- republish, to post on servers or to redistribute to lists, requires prior specific fectively reduce the channel width and create more channels, where permission and/or a fee. SIGCOMM’10, August 30–September 3, 2010, New Delhi, India. the channel width is commensurate with PHY data rate and typical Copyright 2010 ACM 978-1-4503-0201-2/10/08 ...$10.00. frame size. Multiple stations can then contend for and use these

147 Contention smaller channels simultaneously according to their traffic demands, slot thereby amortizing MAC coordination and increasing overall effi- ciency. We call this method fine-grained channel access for high ACK data rate WLANs. DIFS Contention Window SIFS

It is, however, non-trivial to divide a wide channel band into mul- Contention (a) Basic access tiple subchannels without losing useful channel . One slot common practice is to allocate both edges of two adjacent subchan- RTS CTS ACK nels as a “guard band” so that the useful transmissions are properly spaced to avoid interfering with each other. These guard bands can DIFS Contention Window SIFS SIFS SIFS add up to significant overhead, though, especially if the number of (b) With RTS/CTS handshake subchannels is large. For example, 802.11a uses a 1.875MHz guard band at both edges of every channel. If a 20MHz channel is divided Figure 1: Illustration of CSMA/CA access method. into four 5MHz subchannels, the overhead will amount to 75% of The rest of paper is organized as follows. Section 2 provides a the total bandwidth. Further, the guard band width cannot be easily detailed analysis of the source of inefficiency in current MAC pro- reduced due to power mask requirements and the difficulty of filter tocols. We then describe the design of FICA in Section 3 and eval- designs, independent of the width of a subchannel. uate its performance using simulation in Section 4. After describ- Orthogonal frequency division (OFDM) is a well- ing the implementation of a FICA prototype using a software radio understood PHY-layer technology that can eliminate the need to platform in Section 5, we evaluate its performance in Section 6. have guard bands, if the frequency and width of subchannels are Finally, Section 7 discusses related work and Section 8 concludes. strategically picked and transmission on each subchannel is syn- chronized in a way to become “orthogonal”, and hence non-interfering, to one another. Although some cellular networks (e.g., WiMAX [2] 2. BACKGROUND AND CHALLENGES and 3GPP LTE [3]) have proposed using OFDM in channel multi- access (OFDMA), doing so requires tight synchronization among 2.1 Inefficiency of Current WLANs user handsets and they cannot support random access. It thus re- State-of-the-art MAC protocols in wireless LANs manage the mains a new technical challenge for how to use OFDM-type chan- whole channel (e.g., 20/40MHz width) as a single resource. The nelization for fine-grained channel access among distributed and MAC protocol arbitrates access among multiple potential asynchronous stations in a random access WLAN, where it is im- and selects one as the winner, which then consumes the whole practical and unnecessary to achieve similar tight synchronization. channel resource to transmit. If multiple senders transmit at the In this paper, we present the design and implementation of FICA, same time, collisions may happen and receivers will likely fail to a novel cross-layer architecture based on OFDM that enables fine- decode the transmissions. grained subchannel random access in a high data rate WLAN. FICA Current 802.11 WLANs use carrier sensing multiple access with introduces two key techniques to address the aforementioned chal- collision avoidance (CSMA/CA) for their MAC protocol. When lenges: the channel is busy, all contending nodes wait until the channel be- • FICA proposes a new PHY architecture based on OFDM. comes free. The MAC employs a random backoff scheme to avoid Solely relying on the coordination mechanisms provided by having multiple nodes transmitting simultaneously. Each will b [0,CW ) existing WLANs, carrier-sensing and broadcasting, FICA re- randomly choose a number within a contention window , b tains orthogonality among subchannels with low overhead. and wait for time slots before it starts transmitting. If a node de- tects a transmission detected its backoff period, it will freeze the • FICA employs a novel frequency-domain contention method backoff counter until the channel is free again. If two nodes ran- that uses physical layer RTS/CTS signaling and frequency- domly choose the same backoff time, their transmissions will even- domain backoff for contending subchannels. We show that tually collide. A collision is usually detected by a missing acknowl- frequency-domain contention is much more efficient than the edgement (ACK) from the receiver. When a collision is detected, conventional time-domain contention mechanism in a fine- a sender will double its contention window CW according to the grained channel access environment. binary exponential backoff (BEB) algorithm to further reduce the collision probability for the next transmission. We have implemented a FICA prototype on the Sora software ra- Figure 1 illustrates the channel access timing diagram of the dio platform [24]. Our implementation demonstrates the feasibility 802.11 MAC. Figure 1(a) is the basic access method, and Fig- of our key techniques for both PHY and MAC design. We further ure 1(b) shows channel access with the optional RTS/CTS hand- use detailed simulation to evaluate FICA in large-scale wireless en- shake to handle hidden terminals. The Short Inter-frame Space vironments under different traffic patterns. Our results show that (SIFS) is the shortest time interval required for a receiver to re- FICA has up to a 4-fold gain in efficiency compared to existing turn a message to a sender. It is determined by Equation 1, where 802.11n with all its optimizations. trf_delay is the delay incurred to transfer digital signals from the RF In summary, this paper makes the following contributions. (1) We to the processing unit, tproc is the time needed for the pro- describe and examine the efficiency issue of current MAC protocols cessing unit to operate on the incoming signals, and tTxRx is the time in the context of high-speed WLANs, and argue that this issue can needed for the RF front-end to switch from receiving mode to trans- be resolved by fine-grained channel access. (2) We design and im- mitting. Normally, SIFS is about 10–16µs. The Distributed Inter- plement FICA, a protocol that enables fine-grained subchannel ran- frame Space (DIFS) is determined based on SIFS and the backoff dom access in WLANs; (3) We demonstrate the feasibility of FICA slot time, as shown in Equation 2. DIFS is defined to support prior- with a prototype implementation on a software radio platform, and ities in CSMA/CA and should be larger than SIFS. The backoff slot evaluate its performance using detailed simulation. To the best of time is critical. It is the minimal time needed for a node to sense our knowledge, FICA is the first system that enables fine-grained the channel condition and acquire the channel. Slot time is deter- channel access in WLANs. mined by Equation 3, where tcca is the time for a node to measure

148 Channel Guard band Parameter Value bandwidth

tslot 9µs tsifs 10–16µs tcca 4µs tTxRx ≤ 5µs tprop ≤ 1µs tpreamble 20–56µs f1 f2 f3 Frequency Table 1: Timing parameters of 802.11. (a) Normal frequency division multiplexing

Channel 90 bandwidth 80 802.11b 70 60 802.11a/g Spectrum saved 50 40 30 802.11n Efficiency(%) 20 802.11ac/ad 10 f1 f2 f3 Frequency 0 (b) OFDM 0 200 400 600 800 1000 Figure 3: OFDM achieves higher spectrum efficiency. PHY Data Rate (Mbps) Figure 2: Inefficiency of 802.11 MAC at high data rates with a versely proportionally. Figure 2 illustrates such a phenomenon: the typical MTU (1500B). efficiency quickly decreases from 60% at 54Mbps (802.11a/g) to less than 10% at 1Gbps (future 802.11ac/ad). the channel energy to decide the channel status, and tprop is the time for the radio signal to reach the maximal distance of the network. As mentioned in Section 1, transmitting larger frames will im- prove the efficiency ratio, but such a frame-aggregation approach tsifs = trf_delay + tproc + tTxRx, (1) has practical limitations. Fine-grained channel access will be a better approach, if we can divide the whole channel into smaller tdifs = tsifs + 2 · tslot, (2) sub-channels efficiently and allow different nodes to access dif- tslot = tcca + tTxRx + tprop + tproc, (3) ferent sub-channels simultaneously. Enabling concurrent transmis- Using these values, we can build a simple analytical model to sion across sub-channels is in effect an aggregation and opportunity compute the efficiency ratio for CSMA/CA. Since a node chooses to amortize the MAC overhead across different nodes. a random number uniformly from the contention window [0,CW ), CW 2.2 An OFDM Primer the expected number of backoff slots is W = 2 . Equation 4 gives the efficiency ratio for the basic access of CSMA/CA: Orthogonal Frequency Division Multiplexing (OFDM) has be- come increasingly popular in modern wireless communications [17]. t data It has been embraced by many existing wireless standards like IEEE η = · , (4) tslot W + tdifs + tpreamble + tsifs + tack + tdata 802.11a/g/n, WiMax [2], and by future standards like 3GPP LTE [3]. Cognitive radio technologies also mainly rely on OFDM to use where tdata is the time used for data transmission, tpreamble is the time non-contiguous spectrum bands for communication [19]. used to transmit per-frame training symbols and tack is the time used for the ACK frame. OFDM divides a spectrum band into many small and partially overlapping signal-carrying frequency bands called subcarriers. The Only tdata is used for transmitting application data, while all other times are overheads. Some overheads are constrained by physi- subcarrier frequencies are chosen so that they are “orthogonal” to cal laws and current contraints in state-of-the-art radio electronics. one another, meaning that cross-talk between subcarriers sums up to zero even though they are overlapping (Figure 3). OFDM can For example, you cannot reduce tprop less than 1µs to cover a net- work with a radius of a few hundreds of meters. It is also diffi- therefore pack subcarriers tightly together without inter-carrier in- terference, eliminating the need to have guard bands. cult to reduce tTxRx since the RF circuit requires a few microsec- ond to settle down for sending or receiving. Others are needed OFDM can be efficiently implemented using (inverse) Fast Fourier for the correct operation of the protocol. For example, we need Transform (iFFT/FFT). In an OFDM system with FFT size N, B training symbols for reliable estimation of the wireless channel for each subcarrier has exactly the same width of N and the subcar- 2πnB each frame, thus tpreamble is essential. The average backoff slots, rier central points are located at frequencies of fc + N , n = − N N − denoted by W , reflects the ability of CSMA/CA to avoid colli- 2 ..( 2 1), where fc is the central frequency of the channel sions. Thus, to work well in normal network settings, we need and B is the channel width. Different (e.g., BPSK, a reasonably large W . ACKs are also needed to detect collisions QPSK, etc.) can be applied to each subcarrier independently. After and other losses, thus in general we do not want to remove tack. modulating information onto each subcarrier, the sender performs Table 1 outlines some timing parameters defined in 802.11. They an iFFT to convert the frequency domain presentation to N time- remain similar across the different standards of 802.11a/g/n except domain samples which can be sent over the air. The time needed to for the preamble; since 802.11n uses MIMO, it requires more train- transmit these N samples is usually called the FFT period, which N ing symbols in its preamble. is equal to B seconds. Thus, given a fixed channel width, a larger Therefore, when the PHY data rate increases, only tdata will be N means a longer FFT period. Then, at the receiver side, the sig- reduced proportionally, while the other parameters remain largely nal can be converted back to the frequency domain using the FFT, unchanged. As a consequence, the efficiency ratio η decreases in- where each subcarrier can be demodulated independently.

149 FFT Window OFDM Symbol subchannel 1 Data 1 ACK

Sym i Sym i+1 Sym i+2 Sym i+3 Node A subchannel 2 Data 2 ACK

Sym i Sym i+1 Sym i+2 Sym i+3 Node B Busy subchannel 3 Data 3 ACK Data 4 (a) Loss of orthogonality due to misaligned symbols subchannel 4 ACK FFT Window OFDM Symbol DIFS M-RTS SIFS M-CTS SIFS SIFS

Sym i-1 CP i Sym i CP i+1 Sym i+1 Node A Figure 5: FICA uplink media access with four subchannels per Sym i-1 CP i Sym i CP i+1 Sym i+1 Node B channel.

Cyclic Prefix (b) Retaining orthogonality with a proper size CP bol alignment under current 802.11 coordination schemes, we will Figure 4: Misaligned OFDM transmissions. need a 40µs symbol length to keep the same guard-time overhead ratio. The reserved time slot for backoff, which has to be at least one OFDM symbol in length, will now increase proportionally. When OFDM is used as a multi-access technology where multi- This raises another technical challenge: we need a new efficient ple stations share the same channel, symbol timing alignment will MAC contention mechanism and a new backoff scheme. All these be a critical issue. As shown in Figure 4(a), if OFDM symbols from are necessary conditions for fine-grained channel access in high- two nodes misalign, the receiver may not be able to pick up an FFT data-rate WLANs. window containing the same samples across all senders. Orthog- onality will be lost and signals from both nodes will cause mutual interference. To ensure perfect symbol alignment, a multi-access 3. FICA DESIGN technology called OFDMA has been proposed for OFDM cellu- FICA is a cross-layer design that enables fine-grained channel lar networks like WiMAX and LTE. OFDMA requires all mobile access in high-rate wide-band WLANs. It is based on OFDM and stations to maintain tight timing synchronization with the cellular divides a wide-band channel into a set of orthogonal subchannels, (usually hundreds of nanoseconds). It requires a com- which different nodes can contend for individually. For the sake of plex ranging scheme to measure the propagation delay and fine tune clarity, we first assume a scenario consisting of a single access point each mobile station’s timing offset at the sample level granularity. and multiple mobile stations. We show later it is straightforward to OFDM further has a built-in robustness mechanism called the extend FICA to the multi-AP case. Also, while in the following cyclic-prefix (CP) [26] to guard against symbol misalignment due discussion we primarily use a 20-MHz channel as an example to to multipath echoes. Each OFDM symbol is prefixed with a copy explain the operation of FICA, we note that FICA is a scalable of the symbol’s tailing samples so that the receiver can still find a design that works for arbitrarily wide channel bands. proper FFT window as long as the misalignment is within a CP Figure 5 illustrates the basic uplink media access scheme for length (Figure 4(b)). CP is intrinsic to any OFDM system; in FICA. In this example, the channel is divided into just four sub- 802.11, the CP-to-symbol length ratio is 1:4 (0.8µs to 3.2µs). channels and each subchannel contains a number of subcarriers. FICA follows the basic scheme of CSMA. A new transmission op- 2.3 Fine-grained Channel Access in WLAN portunity appears only when the whole channel is idle. Then, all We propose to use fine-grained channel access to improve through- stations try to contend for different subchannels after the channel put efficiency in a high-data-rate WLAN. We divide the channel is idle for a certain amount of time (DIFS). At this time, all nodes width into appropriately sized sub-channels commensurate with will transmit a special RTS signal simultaneously. This RTS signal the PHY data rate and typical frame size, and further use OFDM is a specially-designed OFDM symbol, called Multi-tone RTS (see on the whole channel to avoid wasting bandwidth on guard bands. Section 3.3), in which each node embeds its contention information The fundamental challenge with this approach is coordinating ran- in a set of subcarriers for each subchannel it intends to access. All dom access among multiple distributed and asynchronous nodes M-RTS signals are resolved at the AP, and the AP will broadcast in a WLAN (potentially with multiple APs), without resorting to the contention results in a corresponding M-CTS OFDM signaling cellular-style tight timing synchronization. symbol. Then, only the nodes assigned subchannels will use them Because coordination in a WLAN is distributed and decentral- for data transmissions; note that a node may contend for multiple ized in nature, it is impractical to have OFDMA-style global time subchannels based on its instantaneous traffic demands. The AP synchronization. Not only would it introduce a great deal of sys- will then generate an acknowledgement on each subchannel where tem complexity, it would also likely require new hardware func- a data frame has been successfully decoded. tionality beyond the current or emerging 802.11 standards. Further, Downlink transmissions follow similar steps, but the AP will ini- OFDMA does not support random access and hence cannot be used tiate an M-RTS signal and receiving stations may return an M-CTS. directly in a WLAN. However, since FICA does not use random time backoff, it needs to Instead, we should use existing 802.11 coordination mechanisms, separate uplink and downlink transmissions; otherwise, collisions such as carrier-sensing and broadcast, to establish a rough symbol would happen under bi-directional traffic. FICA does so by assign- alignment among concurrent senders. We can leverage OFDM’s ing different DIFS times to uplink and downlink transmissions, de- intrinsic CP mechanism and lengthen it to suit the alignment scale, scribed further in Section 3.4. and further use a longer symbol length to maintain the same CP- We now present an analysis of the symbol timing misalignment to-symbol ratio. This approach calls for a new OFDM architecture problem in a CSMA-based WLAN. Based on this analysis, in Sec- specially designed for distributed coordination. tion 3.2 we describe the FICA PHY structure. We then describe Having a longer symbol length, however, does have a negative how FICA achieves frequency domain contention and backoff in impact that makes a conventional time-domain backoff scheme very Section 3.3. We finish by discussing several related design issues inefficient. For example, if we can only guarantee a 10µs sym- in Section 3.5.

150 M-RTS A 3.1 Symbol Timing Misalignment in WLANs AP M-RTS B In a WLAN, transmissions from distributed nodes are coordi- nated based on carrier-sensing and overhearing broadcast frames A M-RTS A (e.g., RTS/CTS). Unlike conventional MACs that use these mech- anisms to avoid simultaneous transmissions, FICA exploits simul- B M-RTS B taneous transmissions to enable concurrent access from different nodes but in orthogonal subchannels. tprop tcca tRxTx tprop In FICA, carrier-sensing coordinates the transmissions of M- (a) Using carrier sensing RTS. After they sense an idle channel for a fixed DIFS time, two DATA A nodes A and B may transmit their M-RTS symbols simultaneously. AP M-CTS DATA B However, since there is always a delay to sense the channel and for M-CTS a signal to propagate from one node to another, these two M-RTS A DATA A symbols cannot be transmitted and received at exactly the same tsifs time. Figure 6(a) shows the worst case analysis of the symbol tim- M-CTS DATA B ing difference of two such M-RTS symbols received by the AP. B

Assume A senses the channel idle for tdifs seconds, and A trans- tprop tsifs tprop mits M-RTS first. It will take at most t for the signal to arrive at prop (b) Using a reference broadcast node B. However, it may take B at least tcca time to reliably sense the busy channel. If the signal arrives at node B after B starts a Figure 6: The worst-case symbol timing misalignment of sensing slot, B may not be able to assess a busy channel.1 Thus, two concurrent transmissions received at the AP in a CSMA after the sensing slot, B still declares an idle channel and sends its WLAN: (a) symbol timing misalignment when coordination M-RTS as well. The radio signal of B’s M-RTS is actually emitted uses carrier sensing; (b) a reference broadcast synchronizes the after tRxTx. And it will take at most another tprop for the M-RTS senders better and timing misalignment is much tighter. to arrive at the AP. The total misaligned time is characterized by spaced very closely. Therefore, it is more sensitive to the frequency Eq. 5. The bound is tight in the worst case. offsets of different nodes. Since multiple nodes may always have some small frequency offset (a few hundred Hz as we show later), t ≤ t + t + 2 × t . (5) err_rts cca RxTx prop the adjacent subcarriers should have enough separation to accom- If we use the values of the parameters listed in Table 1, we find that modate such an offset. In practice, subcarriers are usually separated terr_rts can be as large as 11µs. by at least 15KHz [2, 3]. If coordination is performed by overhearing a broadcast frame, In FICA, we choose the FFT size of the DATA OFDM symbol to e.g., M-CTS or DATA, the timing misalignment can be shorter be- be 256 points in a 20MHz channel (subcarrier width is 78.12KHz). cause the two senders are synchronized better by a reference broad- Its FFT period is 12.8µs, yielding a cyclic-prefix overhead of 18% cast [6]. Figure 6(b) shows the worst case of symbol timing mis- which is comparable to the CP overhead of 802.11 (20%). ACK alignment of data frames after receiving a M-CTS broadcast. As- symbols have the same structure as DATA symbols. Since the M- suming the jitter for a local timer can be neglected since the waiting RTS uses the long CP, it should contain as few symbols as possi- time is usually very small (a few microseconds), the timing mis- ble. Thus, we choose a larger FFT size (512 points) for M-RTS to alignment is bounded by twice the propagation time, as shown in embed all necessary control information, which otherwise would Eq. 6. require multiple OFDM symbols for a smaller FFT size. FICA ap- plies the same FFT size to the M-CTS for format consistency. Ta- ≤ × terr 2 tprop. (6) ble 2 summarizes the detailed time parameters of the OFDM sym- bol structure in FICA. Using the parameter values in Table 1, terr is about 2µs in a WLAN. 3.2 PHY Architecture Parameter Value

Based on the analysis in last subsection, FICA needs to provide a Nfft_data 256 points guard time sufficiently long to handle the symbol timing misalign- tfft_data 12.8 µs ment in a WLAN based on carrier-sensing (11µs) and broadcast- Nfft_mrts, Nfft_mcts 512 points ing (2µs). We further need to include an additional 800ns guard tfft_mrts, tfft_mcts 25.6 µs time to account for the typical spreading delay in indoor environ- tlong_cp 11.8 µs ments [1]. We design two guard time sizes tailored to each coor- tshort_cp 2.8 µs dination situation: a long cyclic prefix (CP) of 11.8µs and a short tdata_sym 15.6 µs CP of 2.8µs. The long CP is attached to M-RTS only, while a short tmrts_sym 37.4 µs CP is attached to every M-CTS, DATA, and ACK OFDM symbols, tmcts_sym 28.4 µs and therefore is the major overhead of concern. To amortize the short CP overhead, we need a longer OFDM Table 2: OFDM symbol timings in FICA. symbol for data. Recall from Section 2.2 that a longer OFDM symbol is achieved by applying a larger FFT size N. Although In principle, FICA can allocate each subcarrier independently in theory one can choose any large FFT size N, there are a few to provide maximal flexibility. In practice, however, it is difficult practical considerations. First, a large N requires more computa- for a node to use only a single subcarrier to transmit data for two tional power to calculate the transform since the complexity of FFT reasons. First, using only one subcarrier suffers from frequency is O(Nlog(N)). Second, with large N the adjacent subcarriers are selective : if that subcarrier encounters deep fading, all data will be lost. Thus, it is essential for a wireless PHY to code across 1This case depends on the way the node implements carrier sensing multiple subcarriers to achieve a spectrum diversity gain [17]. Sec- and threshold setting. Here we consider only the worst case. ond, although a preamble may be used for channel estimation and

151 Configuration FICA (Mbps) 802.11n (Mbps) the node will transmit data symbols on that corresponding subchan- nel after a SIFS delay. 20MHz channel 71.8 72.2 There are a few issues that need to be considered. 40MHz channel 145 150 40MHz channel, 2xMIMO 290 300 What if two nodes transmit on the same contention subcarrier? 40MHz channel, 4xMIMO 580 600 It is quite possible for two nodes to choose the same random num- ber and transmit on the same contention subcarrier. Thus, their Table 3: Example PHY data rates in FICA vs. 802.11n. transmitted energy is additive at the AP side. If the AP picks this subcarrier as the winner, a collision will happen since both nodes compensation, the wireless channel may change during the period will consider themselves the winner and send data symbols on the of data transmission. It is essential for the receiver to track the same subchannel simultaneously. It is also possible (although the changes in the wireless channel. This tracking is typically done possibility is small) that two nodes’ signals are destructive and can- by adding an additional training subcarrier (pilot) along with other cel each other, so the AP may not be able to detect the transmis- data subcarriers. sions. In this case, though, the nodes involved will conclude that Therefore, following typical practice FICA groups a set of sub- they were not assigned the subchannels they requested. carriers into a subchannel and uses it as the basis for channel ac- cess.2 Each subchannel contains sixteen data subcarriers and one How large should K be? Clearly, a large K will have fewer colli- pilot subcarrier. Thus, a single subchannel in FICA is 1.33MHz sions in a subchannel. A larger K, though, means more subcarriers wide with a 6% pilot overhead, which is comparable to 802.11 (7%). are used and result in a larger signaling overhead: more subcarriers With different modes, the PHY data rate of a single need a larger FFT size and therefore a longer FFT period. In FICA subchannel can range from 512Kbps (BPSK, 1/2 coding on each we set K=16, the initial contention window size in 802.11. subcarrier) to 20.25Mbps (64QAM, 5/6 coding and four spatial Who is responsible for returning the M-CTS? Any node that multiplexing streams on four antennas). overhears M-RTS transmissions can arbitrate subchannel access. With a 1.33MHz subchannel, a 20MHz 802.11 channel contains Simultaneous transmissions of M-CTS from different nodes are 14 orthogonal subchannels. FICA uses the remaining spectrum as also allowed since, in a single broadcast domain, these M-CTS guard bands separating adjacent wide-band channels. Note that it is symbols are likely to have the same contention results and their also straightforward for FICA to support wider band channels, e.g., energy is additive on each subcarrier. But it is still reasonable that 40–100MHz or wider. To support a 40MHz channel, for instance, only the potential receivers should return an M-CTS for an M-RTS we simply double the FFT size for all OFDM symbols; since our based on the following considerations: (1) the irrelevant nodes can sampling rate is also doubled, the symbol period does not change. use power-save mode, since it is unfair for them to spend power re- When using a 40MHz channel, we can save the guard bands that sponding to M-RTS signals not for them; (2) non-receiving nodes would be necessary to separate two 20MHz channels. Thus, we may unnecessarily create more exposed terminals if they reply with can have 29 orthogonal FICA subchannels with a 40MHz channel. an M-CTS and disrupt transmissions that could otherwise happen Table 3 shows some example PHY data rates of FICA and com- without interference. pares them to 802.11n. As we will show later, although FICA has How we can specify receivers in an M-RTS? It is non-trivial slightly lower data rates than 802.11n, the effective are since multiple nodes may transmit M-RTSs simultaneously to dif- actually dramatically higher due to diminished overheads. ferent receivers and the receiver information may be mixed. FICA resolves nnthis issue by using a membership vector of m subcarri- 3.3 Frequency Domain Contention ers to represent receiver information in the M-RTS. Before trans- FICA uses the M-RTS/M-CTS signal exchange to avoid sub- mitting a M-RTS, a node will hash the receiver’s ID into a value channel collisions. M-RTS/M-CTS use simple binary amplitude between 0–(m − 1). Then, the corresponding subcarrier will carry modulation (BAM) to modulate a single bit on each subchannel. a “1” bit. Any node receiving an M-RTS will check if the subcar- Specifically, BAM uses On-Off Keying that maps a binary “0” to rier corresponding to its ID has been set. If true, it should return zero amplitude in a subcarrier, and it uses a random complex num- an M-CTS. It is possible that multiple nodes have hash value col- ber on the unit circle (ejθ) for a binary “1”. Receivers can eas- lisions. But with a reasonably-sized m — we use 40 bits in the ily detect BAM symbols by comparing the energy on a subcarrier current design — the number of station collisions should be small: against a threshold, without the need to recover the symbol phase. 2–3 in a typical WLAN setting where one AP may be associated The basic idea of frequency domain contention works as follows. with dozens of stations [14]. In M-RTS, a group of K subcarriers are allocated for each subchan- nel, called a contention band. When a node contends for a subchan- 3.3.1 M-RTS/M-CTS format nel, it will first randomly pick a subcarrier from the contention band Table 4 summarizes the subcarrier allocation for the M-RTS and and send a signal “1” using BAM. At the AP side, there may be M-CTS symbols. We explain the M-RTS/M-CTS format using a multiple M-RTS signals superposed, and the AP may detect mul- 20MHz channel as a concrete example. M-RTS/M-CTS use a 512- tiple ones on different subcarriers of the contention band. The AP point FFT in 20MHz channel (for wider channels, a proportionally then can arbitrate a winning node by selecting a subcarrier based on longer FFT point size is used). some predefined rules, e.g., the one with highest frequency. Then the AP sends the identification of the winning subcarrier for each Tag. The tag band contains a predefined binary sequence to identify subchannel in an M-CTS signal. After receiving the M-CTS, each the type of this OFDM symbol. The tag band is interleaved on node compares the subchannel allocation broadcast by the AP to selected subcarriers whose indices are multiples of four. Thus, the the subcarrier it has transmitted for the subchannel. If they match, tag band creates a repeating pattern in time-domain samples. The receiver can use this pattern to detect the symbol using correlation. 2The grouping of subcarriers may be arbitrary, and not necessar- In a 20MHz channel, the tag band contains 32 subcarriers. For M- ily contiguous. In our current implementation, however, we only RTS, the tag sequence starts with zero, {0, 1, 0, 1, ..., 0, 1}, while group adjacent subcarriers into a subchannel. for M-CTS the sequence becomes {1, 0, 1, ..., 0, 1, 0}.

152 C M-RTS M-CTS 1: Initialize max: 2: Cmax = Ctotal; Tag 32 Tag 32 3: Contention band 16 × 14 Resolution band 24 × 14 4: Update1: Reset to max NAV 64 NAV 64 5: if collision detected in any subchannel then Receiver band 160 Reserved 48 6: Cmax = max(Cmax/2, 1); Guard band 32 Guard band 32 7: else 8: C = C ; Table 4: Number of subcarriers allocated for the M-RTS and max total 9: end if M-CTS symbols in a 20MHz channel. 10: 11: Update2: AIMD Contention band and resolution band. For M-RTS FICA allo- 12: if p% subchannels have collisions and (p > 0) then cates sixteen subcarriers to each subchannel for contention. So, a 13: C = max(C × (1 − p/100), 1); total of 224 subcarriers are allocated for a contention band for all max max 14: else 14 subchannels arbitrated using an M-RTS symbol. 15: C = min(C + 1,C ); Contention results are embedded in the resolution band in M- max max total 16: end if CTS. The resolution band occupies more subcarriers compared to the contention band in the M-RTS. There are 24 subcarriers for each Figure 7: Pseudo-code of the frequency domain backoff algo- subchannel. However, since BAM is less robust compared to phase- rithms. Update1 emulates the behavior of 802.11 binary expo- shift keying modulation (e.g., BPSK), we improve BAM’s reliabil- nential backoff. Update2 uses an AIMD strategy. ity by replicating on multiple subcarriers and using soft-decoding. channels the node can access in the next transmission opportunity. It can be shown that, by replicating twice, BAM can achieve the Thus, when the channel is idle for DIFS, a node may pick up to n same performance of BPSK [25]. In FICA, we replicate a bit on subchannels to contend for, where n = min(Cmax, lqueue) and lqueue four subcarriers when we require high reliability. Thus, with four- is the number of fragments in the node’s local sending queue. time replication, we can encode six bits (24/4) to represent the res- The maximum subchannel count Cmax is updated based on the olution results for each channel. Currently, six bits can represent contention situation on the channel. There can be multiple update the winner number (1–16) and two other special states. If these 6 strategies. For example, we can emulate the behavior of binary bits are all zero, it means the subchannel is not used (e.g., no node exponential backoff (BEB) used in 802.11. When a collision is has contended for this subchannel). Alternately, if the bits are all detected on a subchannel that a node has transmitted on, it will ones, the subchannel is reserved by the receiver; thus no transmis- reduce Cmax by half. Once all transmissions are successful, the sion is allowed on this subchannel in this contention period. node resets Cmax to the total number of subchannels. Receiver band. The receiver band contains the hash vector for Alternatively, we can use an additive increase/multiplicative de- intended receivers across all subchannels. It contains 40 bits and crease (AIMD) strategy. Assume after transmission, a node detects each bit again is replicated on four subcarriers for reliability, for a that p% of channels have collisions. The node then decreases Cmax total of 160 subcarriers. by p% (multiplicative decrease). This reaction is reasonable since p reflects an estimation of the contention level in all subchannels. Network Allocation Vector (NAV). The NAV band encodes the And when all accesses to subchannels succeed, the node increases expected number of OFDM symbols in the following data frame. It Cmax by one (additive increase). contains 64 subcarriers and, also due to four times replication, can It is straightforward to prove that both strategies converge if all modulate 16 bits. There could be several ways to encode the data contending nodes are within a single broadcasting domain. Fig- symbol number in the NAV band. One simple method is to use each ure 7 shows the pseudo-code of the adaptive Cmax adjustment algo- single bit to present a certain number of OFDM symbols. Thus, rithms. Later, we evaluate these two strategies and find that AIMD sixteen bits can present numbers between 3–48, with a step of 3. As is slightly better than the reset-to-max strategy. we discuss later, such a range is sufficient for FICA. In an M-CTS, only the highest bit is echoed back and all nodes that overhear the 3.4 Multiple access points and two-way traffic M-CTS should defer their contention for a corresponding period. Until now, we have only concentrated on uplink access. Down- link access follows the same process with roles reversed. In this 3.3.2 Frequency domain backoff case, the AP will send out an M-RTS. In FICA, it is possible for In FICA, nodes will choose how many subchannels to request the AP to transmit simultaneously to multiple clients with a single via contention based on their traffic requirements. If its traffic re- transmission burst (but different frames to different subchannels). quirements are light, a node may request fewer subchannels, while The receiver IDs are encoded in the receiver band of M-RTS. All a heavily loaded node may contend for as many as all subchan- receiving clients should return M-CTS to the AP. It is necessary for nels for most efficient communication. However, if there are many the downlink transmissions to go through such a contention pro- nodes in a network contending for many subchannels, the collision cess since in practice there can be multiple APs located nearby on avoidance mechanism provided by M-RTS/C-RTS may not be suf- the same wide-band channel. Thus, these APs should use M-RTS ficient to represent all contenders. Using multiple M-RTS symbols and frequency-domain backoff to contend for each sub-channel for can further scale collision avoidance, but at the cost of more sig- transmission. The contention result is resolved by the receiving nalling overhead as described above. stations and fed back to each AP by M-CTS broadcasts. Instead, we use a novel frequency domain backoff scheme to However, there is another issue in presence of two-way traffic. scale FICA in a heavily-contended dense network. The basic idea Assume that an AP and one client have frames to exchange. Since is to control the maximum number of subchannels that one node FICA does not use time backoff by default, both nodes may send can access based on the observed collision level. It is similar to an M-RTS simultaneously. This may cause a failure if there are no existing widely-used congestion control mechanisms. Each node other nodes that can send back an M-CTS. Further, even if there is maintains a state variable, Cmax, which controls the maximum sub- a third node that receives these M-RTSs and sends an M-CTS back

153 with the contention resolution, it is still possible that both the AP nodes will defer if they sense a transmission of 802.11 nodes, and and the client have been granted some winning subchannels. So, vice versa. It is also possible to retain fairness among these two they will both transmit data simultaneously but neither of them can types of nodes. For example, since FICA has a fixed access pattern receive its frame due to the simplex radio used. with the M-RTS/M-CTS handshake, it is possible to choose an ap- To address this issue, FICA separates uplink and downlink traffic propriate DIFS time (e.g., equal to half of the CWmin of 802.11) for by assigning different DIFS times to the AP and stations. The one FICA to be fair to 802.11 nodes with random time-domain backoff. with a short DIFS has priority to access the channel by sending an Cellular vs. WLAN environments. Finally, we note that the mech- M-RTS earlier. To ensure fairness between uplink and downlink anisms designed in FICA may unlikely be applicable in cellular traffic, we use a simple dynamic DIFS assignment strategy. We networks because coverage of a cellular base station is large (e.g., assign a fixed DIFS time to all mobile stations, and the AP has two different DIFS time settings. One is shorter than the DIFS of a few kilometers). Thus, the propagation time is large as well, at mobile stations, and the other one is longer. The AP chooses a least an order of magnitude larger than that in a WLAN. Conse- DIFS time based on the following rules: (1) once an AP accesses quently, even using broadcasting the synchronization accuracy is e.g. the channel with a short DIFS, it will use the long DIFS for its next too coarse ( , several tens of microseconds). Current OFDM- 4.69µs access; and (2) if an AP receives an M-RTS from stations, it will based WWANs already emply a relative long cyclic prefix ( ) 66.67µs use a short DIFS for its next access. Note that this simple strategy and FFT period ( ) to handle a large delay spread due to ensures the fair interleaving of uplink and downlink traffic, but not multi-path fading in the wide area. A low-precision synchroniza- the fairness among all nodes as 802.11 currently does. In effect, tion method in OFDMA will further enlarge the symbol FFT size, it gives the AP many more chances to transmit — which may be adding substantial engineering complexity to control frequency off- sets and undermining the ability to handle Doppler effects in a mo- an appropriate strategy given the asymmetrical nature of WLAN bile environment. application workloads. It is an interesting open question of what sort of fairness is best provided in FICA and remains future work. 4. SIMULATION 3.5 Discussion We implemented an event-based simulator to study the perfor- We end the description of FICA with a few additional points of mance of FICA in large-scale wireless networks and to compare consideration. For the issues raised, we broadly describe potential its performance with 802.11n. The simulator can model both the approaches for addressing them, but in general leave an exhaustive CSMA MAC and an OFDM PHY that supports multiple subchan- discussion of them as future work. nels. We study the performance primarily under a single AP net- work with varying number of stations. We assume only collisions Hidden terminals. Until now, we have only described FICA within will cause frame reception failures, and thus we focus on the per- a single broadcast domain. How does FICA operate in a network formance of the MAC design. We also focus on only the uplink with hidden terminals, and therefore when there are multiple over- transmissions (the downlink behavior in this setting is analogous), lapped contention domains? Hidden terminals cannot be coordi- and we apply various traffic patterns in a wide 40MHz channel with nated by carrier sensing. As a result, there is a chance for M-RTSs high data rates. For 802.11n, we also simulate MAC Service Data to collide if the misalignment of symbols from hidden terminals Units (MSDU) aggregation, the most efficient aggregation method exceeds the long cyclic prefix. To prevent persistent M-RTS colli- defined in 802.11n [1]. sions, once a node misses an M-CTS or it receives an invalid M- For FICA, we use the same values of tsifs and tdifs as in 802.11. CTS after it sends an M-RTS, the node should wait for a random The preamble in FICA requires three OFDM symbols for single time before sending an M-RTS again. This random waiting time is and 2x MIMO and four for 4x MIMO. Using the three-symbol necessary to de-synchronize the hidden terminals. However, since preamble as an example, tpreamble = 46.8µs, and counting another M-RTS/M-CTSs are short, we believe the chance of M-RTS/M- OFDM symbol for the ACK, i.e., tack = 15.6µs, the subtotal per- CTSs collisions will be small. access MAC overhead of FICA is 157.8µs. Note that although Another issue is that when a node contends for a subchannel FICA uses the M-RTS/M-CTS handshake, the overhead is compa- in multiple domains, it may receive inconsistent resolution results rable to that of 802.11 (160µs with minimal contention window) from M-CTSs from different APs. This will effectively prevent the due to the use of the PHY signaling mechanism. Eq. 7 gives a node from sending on that subchannel since the received informa- simple model for FICA’s access efficiency, tion is a mixture from two M-CTSs. Only when the node is the winner in all contention domains it participates in should it be al- tdata ηfica = , (7) lowed to access the subchannels (implying all returned M-CTSs tdifs + trts/cts/ack + tpreamble + 3 × tsifs + tdata contain the same resolution results for intended subchannels). We where t = t + t + t . believe such behavior is reasonable and follows the general princi- rts/cts/ack mrts mcts mack Thus, to achieve an efficiency ratio of 80%, we need 40 DATA ple of wireless system designs. OFDM symbols. For different PHY data rates, frame sizes for Multi-user diversity. FICA also enables an opportunity to exploit the same efficiency correspond to a size of 400/800/1600 bytes at multi-user diversity in WLANs [26]. When a node chooses sub- 145/290/580Mbps, respectively. We use these sizes as a rule of channels to access, it may also consider the quality of each sub- thumb for FICA nodes to fragment upper layer frames and send channel. Moreover, heterogeneous modulation methods can be ap- each fragment on one subchannel. plied to different subchannels to match the conditions on that spe- No aggregation. cific channel band. There is already much research on resource In this scenario, we first disable the frame aggre- allocation for multi-user diversity in both single and multi-channel gation of 802.11n as a lower bound. Figure 8 shows the through- cases [7,12,15,21], and applying these ideas in a distributed system put efficiency of 802.11n and FICA with two different frequency- like FICA remains interesting open challenges. backoff schemes: AIMD and Reset-to-Max (RMAX) (Section 3.3.2). The scenario simulates ten concurrent nodes where each node trans- Backwards compatibility. It is also possible for FICA to coexist mits UDP traffic corresponding to 1/10 of the PHY data rate with with current 802.11. Since FICA is still based on CSMA, FICA a frame size of 1500 bytes. As expected, with a 1500-byte frame

154 90 90 802.11 FICA AIMD FICA RMAX FICA RMAX 80 FICA AIMD 80 FICA RMAX 90 FICA AIMD 70 70 80 60 60 70 50 50 802.11 60 40 40 802.11 50 30 30 Efficiency (%) 40 20 Efficiency (%) 20 30 10 10 20 0 0 Efficiency (%) 10 0 10 20 30 40 50 0 10 20 30 40 50 0 # of nodes # of nodes 0 200 400 600 (a) (b) PHY Data Rate (Mbps) Figure 9: Full aggregation case. For 802.11, the maximal aggre- Figure 8: Efficiency ratio of 802.11 and FICA with different gated frame size is 28KB. All nodes are saturated. (a) 802.11 PHY data rates. No frame aggregation is enabled. PHY 150Mbps; FICA 145Mbps. (b) 802.11 PHY 600Mbps; FICA 580Mbps. current 802.11a/g rates only provide around a 50% efficiency ra- tio, and this ratio decreases rapidly with the increase of the PHY 90 90 data rate. However, by enabling fine-grained channel access, FICA 80 FICA RMAX FICA AIMD 80 FICA AIMD 70 70 can achieve a much higher efficiency ratio in the same situation. 60 60 This benefit is because different stations can access different sub- 50 50 FICA RMAX channels simultaneously. Thus the per-access MAC overhead is 40 40 30 30 802.11

amortized among all concurrent nodes. Also, we find that FICA Efficiency (%)

20 Efficeiency (%) 20 AIMD has slightly better performance than FICA RMAX. As we 10 10 802.11 will see in subsequent experiments, FICA AIMD consistently per- 0 0 5 10 15 20 25 30 35 40 45 5 10 15 20 25 30 35 40 45 forms better. We hypothesize that this is because FICA AIMD ad- # of delay sensitive nodes # of delay-sensitive nodes justs Cmax much smoother compared to FICA RMAX. However, (a) (b) a deep analysis on the optimal frequency-domain backoff strategy remains future work. Figure 10: Mixed traffic. Five nodes are fully saturated. All This scenario is the worst case for 802.11n. We show this case other nodes have delay-sensitive traffic with a uniform distri- to demonstrate how significant the MAC overhead can be at high bution between 800Kbps to 5Mbps. (a) 802.11 PHY 150Mbps; PHY data rates, and that techniques like FICA or frame aggregation FICA 145Mbps. (b) 802.11 PHY 600Mbps; FICA 580Mbps. are indeed necessary for efficiency. Full aggregation. Here, we show the best case of 802.11n with to the channel is much less efficient. Thus, the overall channel uti- frame aggregation. In this experiment, all nodes are saturated so lization is reduced. However, with FICA such nodes can request that the frame aggregation can work most efficiently. Figure 9 access to fewer subchannels, leaving the other subchannels for use shows throughput efficiency with different numbers of contending by other nodes. Consequently, the overall network efficiency re- nodes at two PHY data rates, 150Mbps and 600Mbps, respectively. mains at a high level, improving upon 802.11 from 16% up to 4 In both cases, the efficiency of 802.11n has been significantly im- times better at the high PHY data rates. proved due to frame aggregation. Since all nodes are saturated, the aggregation level is very high: 12 frames (or 18KB) on average. 5. IMPLEMENTATION FICA still has slightly better performance than 802.11n even in We have also implemented the basic mechanisms of FICA us- this case, though, because FICA has slightly fewer collisions com- ing Sora, a fully programmable software radio platform based on pared to 802.11n. To understand why, consider the operation of commodity general-purpose PC architectures [24]. Our FICA im- frequency domain contention. When there are many stations con- tending for a subchannel, if two stations happen to pick up the same plementation is based on SoftWiFi, a software implementation of subcarrier to send their signals, it does not necessarily result in a the 802.11a/b/g PHY/MAC [24]. We make the following modifi- collision. A collision occurs only when the collided subcarrier is cations: (1) we change the FFT size from 64-point to 256-point also chosen as the winner as nodes contend for subchannels. In for DATA/ACK symbols and 512-point for M-RTS/M-CTS sym- the next contention period, all stations will pick a different random bols; (2) we employ convolutional coding in each subchannel and number again. This situation is unlike time-domain backoff used decode data in each subchannel individually using the Viterbi al- in 802.11: when two stations pick the same backoff slots they will gorithm; (3) we remove the random time-domain backoff in the eventually collide with each other. CSMA MAC, and implement the M-RTS/M-CTS handshake after the channel is sensed idle. Mixed traffic. Finally we evaluate a situation in between the two FICA uses a PHY frame structure and synchronization algorithm extremes. We have five saturated stations that always have full- similar to 802.11. A preamble precedes data symbols. The first sized frames to transmit. In addition, there are a variable number symbol is used for symbol time synchronization (i.e., finding the of nodes that have small but delay-sensitive traffic representing, for boundary of symbols). It employs a self-repeating pattern in the instance, video conferencing or Web browsing. We choose the load time domain so that the receiver can detect it using auto-correlation. of this delay sensitive traffic uniformly from 800Kbps to 5Mbps, The second symbol is used for channel estimation. To support 4x and the packet size from 800–1300 bytes. Figure 10 shows the effi- MIMO, another training symbol is needed. The last symbol en- ciency results of this scenario as a function of the number of delay- codes the Physical Layer Convergence Protocol (PLCP) header us- sensitive nodes. With a few delay-sensitive nodes, the through- ing BPSK and 1/2 convolutional coding. The PLCP header contains put efficiency of the network is significantly reduced for 802.11n. the modulation mode used in the following DATA symbols for the Since the delay sensitive flows cannot be aggregated, their access receiver to set the proper demodulating parameters.

155 600 1 500 400 0.8 300 200 0.6 100 Frequency Frequency Offset (Hz) Offset

0 CDF 0.4 12/24/2009 12/29/2009 1/3/2010 1/8/2010 0.2 Figure 11: Measured frequency offsets of two nodes over two 0 weeks starting from a single calibration. 0 2 4 6 8 10 Timing misalignment (ȝs) There are two practical issues that our implementation addresses: Figure 12: CDF of symbol timing mismatch with broadcasting. Threshold setting. With physical layer signal symbols (M-RTS/M- CTS), we need a proper energy threshold to reliably detect the bit We perform our experiments in the 2.4GHz band with 802.11b/g- modulated on each subcarrier. Since we target WLANs, where the compatible RF front-ends. Due to timing constraints, we have pre- signal-to-noise-ratio (SNR) is usually high, setting such a threshold stored all needed PHY frame samples on the RCB first. We also is not a difficult task [5]. In FICA, we further use a self-calibration conduct the experiments late at night to minimize interference from method to facilitate this threshold setting. We use blank subcarriers other traffic in the same frequency band. (carrying a bit “0”) in the M-RTS/M-CTS tag band to calibrate the noise floor. Then, we choose a threshold that is a constant multiple 6.1 Symbol timing misalignment in a WLAN of the noise floor. In our experiments, we find that a threshold 2– As discussed in Section 3.1, there are two ways to coordinate 5dB higher than the noise floor works reliably in WLAN settings. concurrent transmissions in a WLAN: using a reference broadcast (e.g., M-CTS) or carrier sensing on an idle channel. These two Frequency offset calibration. In OFDM-based multi-access net- methods result in different performance requirements in aligning works, the frequency offset among simultaneous senders should be symbol times. controlled within a certain range. Otherwise, it will undermine the To measure the symbol timing misalignment with broadcasting, orthogonality among subcarriers since the concurrent senders are the AP broadcasts an M-CTS symbol and two stations, when they not actually transmitting on orthogonal frequency due to the offset. detect the M-CTS, wait for a fixed 200µs and return two differ- Thus, in FICA, all nodes should have their frequency calibrated. ent Pseudo-Noise (PN) signals. We use PN signals so that we can Fortunately, FICA can use the AP’s frequency as a reference. easily separate the two signals by cross-correlation, and precisely Every station can hear the from the AP and adjust its local measure the time difference of the arrivals of the two signals at the numerically-controlled oscillator (NCO) to match the frequency of AP by counting the number of samples between the two correlation the AP. We note that frequency calibration is much simpler than peaks. Figure 12 shows the cumulative distribution function (CDF) time synchronization, since the frequency offset does not accumu- of 1,000 tests. In over 95% of the cases, the symbol timing differ- late over time. Figure 11 shows the measured absolute value of ence of these two transmissions is less than 1µs, and in 99% cases the frequency offset of two Sora nodes over two weeks after a sin- the timing difference is less than 2µs. gle calibration. The frequency offset is within a limited range of To characterize the maximal symbol time difference when coor- 500Hz. This range is quite small compared to the data subcarrier dinating with carrier sensing, we use the following method. After width (0.63% of 80KHz), and its impact to orthogonality can be the AP sends out a broadcast symbol, one station waits for a fixed neglected in practice. Thus, we conclude that an infrequent fre- 200µs and transmit its PN signal. We adjust the waiting time of quency calibration process can support FICA very well. the second station incrementally from 200µs to 220µs, performing Although Sora is the fastest programmable software radio plat- around 100 tests for each value. Upon timeout, the second station form that allows us to prototype FICA quickly, it still has a limita- performs carrier sensing first. If it senses a busy channel, it will tion that prevents our implementation from running in real-time. It cancel its transmission attempt and the AP will only receive one takes too long to transfer the just-in-time modulated PHY frames PN signal from the first station. Otherwise, if the second station from the CPU to the Radio Control Board (RCB) after receiving senses an idle channel, it will send a different PN signal. The AP contention results in the M-CTS. As a result, it will miss the SIFS will then detect two PN signals and measure the time difference deadline by a small margin. To improve the situation, we will either between their arrivals. need a faster version of Sora or take the step of moving part of the Figure 13 shows the results of this experiment. The x-axis shows FICA implementation to hardware. the relative delay of the second station to the first station. The dark line shows the probability of the second station sensing the trans- mission of the first node. We see that if the relative delay is larger 6. EVALUATION than 9µs, the second node will always sense the first node’s trans- In this section, we evaluate the feasibility of FICA using our mission and cancel its own sending attempt. The 9µs threshold prototype implementation on four Sora nodes: one serves as the reflects the turnaround time of Sora, which is about 5µs, and the AP and the others are stations. We first demonstrate how well con- carrier sensing time of 4µs we used in this experiment. When the current transmissions can be coordinated in a WLAN. We evaluate relative delay is less than 8µs, the second node always senses an the maximal symbol timing misalignment at the receiver for two idle channel and concurrent transmissions occur. concurrent transmissions coordinated by means of carrier sensing The rectangular points mark the difference in the arrival time of and broadcasting. We show that even with our existing software the two signals, and the error bars indicate the minimal and maxi- radio implementation, we can bound the maximal symbol timing mal values measured. The timing misalignment linearly increases misalignment within the range discussed in Section 3.1. Then, we with the relative delay, as expected. Note that, when the relative evaluate the efficiency and reliability of detecting BAM-modulated delay is between 8–9µs, carrier sensing may not always work and PHY signals in M-RTS/M-CTS. Finally, we show the decoding per- results in larger variance. Overall, the symbol timing misalignment formance of our FICA decoder for two concurrent FICA senders. is within 10µs, confirming our analysis in Section 3.1.

156 CS Probability Timing misalignment 1.E-01 1 12 0.9 1.E-02 BPSK FICA 0.8 10 QPSK FICA 1.E-03 16QAM FICA 0.7 8 0.6 64QAM FICA BER 0.5 6 s) 1.E-04 BPSK Single ȝ 0.4 ( QPSK Single 0.3 4 1.E-05 16QAM Single 64QAM Single 0.2 2 0.1 1.E-06 Carrier Sense Probability Sense Carrier 0 0 Mismatch Timing Symbol 0 20 40 0 5 10 15 SNR (dB) Relative Delay (ȝs) Figure 13: CDF of timing misalignment with carrier sensing. Figure 15: performance of FICA compared to The dark line shows the carrier-sensing probability and the conventional WLAN where only a single node can access chan- rectangular points show the measured symbol timing differ- nel at a time. ence. more sensitive to mutual inter-subchannel interference if there is 3.0% False Positive (Rep 1) any. We fix the position of the AP and two stations and adjust the 2.5% False Negative (Rep 1) transmission power to get different signal-to-noise-ratios (SNR). False Positive (Rep 2) For each SNR setting, we evaluate four different modulation schemes. 2.0% False Negative (Rep 2) False Positive (Rep 4) We schedule the transmissions in the following way. For each trans- 1.5% False Negative (Rep 4) mission power setting, we let two stations access the channel si- Error 1.0% multaneously using FICA first, and then immediately one station transmits alone. Since these two transmissions are back-to-back, 0.5% we assume their channel conditions should be similar. For each 0.0% power setting, we send 1400 frames. Each frame is 400 bytes and 1 2 3 4 5 6 7 8 9 10 11 12 uses one subchannel. Threshold factor We use the classic bit-error-rate (BER) to SNR plot to illustrate Figure 14: Reliability of PHY M-RTS/M-CTS signaling. the demodulation performance. Figure 15 shows the results for FICA as well as the case where only one node accesses the chan- 6.2 Reliability of PHY signaling nel. The BER value shown is measured before the Viterbi decoder; In this experiment, we evaluate the reliability of the PHY signal- after Viterbi, most of the errors are corrected. Clearly, all curves ing implementation. Two nodes transmit the same M-RTS symbol are very close to each other, including the high rate modulations simultaneously so that every subcarrier contains a superposition of like 64QAM which are very sensitive to interference. Thus, we the signals transmitted by the two senders. The AP detects the in- conclude that with FICA different nodes can transmit on different formation on each subcarrier. Since we know what is exactly trans- subchannels simultaneously without interfering with each other. mitted, we can detect false positives (i.e., the AP detects one on a subcarrier that should be zero) and false negatives (i.e., the AP 7. RELATED WORK detects zero where it should be one). We use the self-calibrated The application of OFDM for multiple access in WLANs is lim- method to set detection threshold as described in Section 5: we ited. In [18], Rahua, et al., developed FARA that implements measure the maximal energy Nf on the blank subcarriers and set downlink OFDMA in a WLAN and per-subcarrier rate adaptation. the threshold to be αNf , varying α. We also experiment with dif- But, since there is only one (the AP), symbol align- ferent degrees of bit replication on subcarriers. ment is not an issue. In [5] and similarly in [11], OFDM has been Figure 14 shows the error rates for 1,000 tests at various degrees used as a simple form of concurrent channel access. Nodes may of replication as we vary the threshold α. With a large threshold, modulate one bit of ACK information on different subcarriers after false positives decrease while false negatives increase slightly (al- receiving a broadcast frame. However, FICA is a new PHY/MAC though still less than 0.5%). Overall, though, there is quite a large framework for WLANs that enables over fine- space for threshold setting to provide good performance. In par- grained subchannels to improve overall network efficiency. ticular, when a bit is replicated on a few subcarriers (e.g., 4), both Physical layer signaling, usually with Binary Amplitude Modu- false positive and negative rates are close to zero when α is in the lation, has been used previously to assist MAC protocols. In [8,13], range 3–5. We hypothesize that, with more simultaneous transmis- busy tones are used to indicate channel occupancy to mitigate the sions, the detection should be more reliable as BAM is essentially hidden terminal problem. Recently, SMACK [5] uses a physical energy detection. With more transmissions superposed, the energy layer ACK, and MCBC [20] uses a PHY RTS/CTS handshake to fa- is additive at the receiver side. However, due to limited hardware cilitate a multi-round leader election protocol in a heavy-contention availability we leave this for future study. environment; after the protocol completes, the winner obtains ac- cess to the medium. FICA similarly shares the idea with SMACK 6.3 Demodulation performance and MCBC to apply PHY signaling based on simple BAM modula- In this experiment, we compare the demodulation performance tion, but FICA has the broader goal of enabling fine-grained chan- of FICA, where multiple nodes are allowed to simultaneously ac- nel access in high data rate WLANs. cess different subchannels, to the conventional WLAN, where only Coordination using broadcast in local area networks has been a single node can access the whole channel. We have one FICA sta- previously exploited for time synchronization in reference-broadcasting tion access only the odd-numbered subchannels and another FICA synchronization, which provides microsecond-level synchroniza- station access only the even-numbered subchannels. Thus, these tion precision [6]. Our results further confirm that microsecond- two stations maximally interleave their subchannels and should be level coordination accuracy is practical in WLANs.

157 There is extensive work in the literature to improve 802.11 MAC [4] R. Chandra, R. Mahajan, T. Moscibroda, R. Raghavendra, and performance by fine-tuning the backoff scheme [9, 10]. But these P. Bahl. A case for adapting channel width in wireless networks. approaches still consider the channel as one resource unit where SIGCOMM Comput. Commun. Rev., 2008. only one radio can work on one channel at a time. Multi-channel [5] A. Dutta, D. Saha, D. Grunwald, and D. Sicker. SMACK: a smart acknowledgment scheme for broadcast messages in wireless MAC protocols [16] have been studied to improve networks. In SIGCOMM ’09., , NY, USA, 2009. ACM. performance by using more orthogonal channels that are separated [6] J. Elson, L. Girod, and D. Estrin. Fine-grained netwrok time by guard bands. In contrast, FICA improves the performance of synchronization using reference broadcasts. In 5th Symposium on WLANs by increasing channel access efficiency. Using OFDM, Operating Systems Design and Implementation (OSDI), 2002. FICA creates a fine-grained structure (i.e., subchannels) for multi- [7] A. Goldsmith. Wireless Communication. Cambridge University access inside a wide-band channel without guard bands. Thus, a Press, 2005. FICA node can adjust the portion of the spectrum it accesses based [8] Z. J. Haas and J. Deng. Dual busy tone multiple access (dbtma)-a on its traffic demands, while other nodes can use the remaining multiple access control scheme for ad hoc networks. Communications, IEEE Transactions on, 50(6):975–985, 2002. spectrum simultaneously. This property shares some similarity to [9] Y. He, R. Yuan, J. Sun, and W. Gong. Semi-random backoff: the adaptive channel width demonstrated in [4]. FICA is comple- Towards resource reservation for channel access in wireless lans. In mentary to that work by providing a concrete means for adaptive ICNP, 2009. fine-grained subchannel access in WLANs. [10] M. Heusse, F. Rousseau, R. Guillier, and A. Duda. Idle sense: an The inefficiency of the 802.11 MAC has also been discussed be- optimal access method for high throughput and fairness in rate fore for supporting VoIP traffic [22,27]. In [27], a TDMA approach diverse wireless lans. SITCOM Compute. Common. Rev., is used to reduce the contention overhead for CSMA in 802.11. In 35(4):121–132, 2005. this paper, we argue that the inefficiency of 802.11 MAC is a fun- [11] B.-S. Kim, S. W. Kum, and R. L. Ekl. Ofdma-based reliable multicasting mac protocol for wlans. IEEE Transactions on Vehicular damental bottleneck as the PHY data rate increases for all traffic, Technology, 57(5), 2008. not just VoIP traffic. We further argue that this inefficiency issue [12] D. Kivanc, G. Li, and H. Liu. Computational efficient bandwidth should be resolved by enabling fine-grained channel access. allocation and power control for ofdma. IEEE Transactions on Wireless Communications, 2003. 8. CONCLUSION [13] L. Kleinrock and F. Tobagi. in radio channels: Part i–carrier sense multiple-access modes and their throughput-delay This paper addresses the inefficiency issue of MAC protocols in characteristics. In IEEE Transactions on Communications, 1975. current WLANs as the PHY data rate increases. The fundamental [14] D. Kotz and K. Essien. Analysis of a campus-wide wireless network. reason of this inefficiency lies in the fact that the current MAC pro- Wirel. Netw., 11(1-2):115–133, 2005. tocol allocates the entire wide-band channel as a single resource. [15] X. Liu, E. K. P. Chong, and N. B. Shroff. Opportunistic transmission Such allocation become too coarse-grained for general traffic de- with resource-sharing constraints in wireless networks. mands as the channel width or the PHY data rate increases. IEEE Journal on Selected Areas in Communications, 19, october We argue that this inefficiency issue should be resolved using 2001. fine-grained channel access in high data rate WLANs. We present [16] J. Mo, H.-S. So, and J. Walrand. Comparison of multichannel MAC protocols. Mobile Computing, IEEE Transactions on, 7(1):50–65, the design of FICA, a new cross-layer design that enables fine- Jan. 2008. grained subchannel random access based on OFDM. FICA addresses [17] J. G. Proakis and M. Salehi. Digital Communications. McGraw Hill, challenges in both PHY and MAC design due to the asynchronous 2008. and distributed nature of WLANs. First, FICA proposes a new [18] H. Rahul, F. Edalat, D. Katabi, and C. Sodini. Frequency-aware rate PHY architecture based on OFDM that retains orthogonality among adaptation and mac protocols. In Mobicom, 2009. subchannels solely relying on the coordination mechanisms in ex- [19] H. Rahul, N. Kushman, D. Katabi, C. Sodini, and F. Edalat. Learning isting WLANs. Second, FICA employs frequency-domain con- to share: narrowband-friendly wideband networks. SIGCOMM tention that uses physical layer RTS/CTS signaling, and frequency Comput. Commun. Rev., 38(4):147–158, 2008. domain backoff to efficiently coordinate subchannel access. [20] B. Roman, F. Stajano, I. Wassell, and D. Cottingham. Multi-carrier burst contention (MCBC): Scalable medium access control for We have implemented FICA on the Sora software radio platform. wireless networks. In WCNC 08, 2008. With our prototype, we validate the feasibility of the FICA cross- [21] Z. Shen, J. G. Andrews, and B. L. Evans. Adaptive resource layer design. Our detailed simulation results further indicate that allocation for multiuser ofdm with constrained fairness. IEEE FICA can outperform 802.11 with frame aggregation with up to Transactions on Wireless Communications, 2005. 400% under different traffic settings. [22] S. Shin and H. Schulzrinne. Experimental measurement of the capacity for voip traffic in IEEE 802.11 wlan. In Infocom 2007, 2007. [23] V. Shrivastava, S. Rayanchu, J. Yoon, and S. Banerjee. 802.11n under 9. ACKNOWLEDGEMENTS the microscope. In IMC ’08., 2008. The authors would like to thank Geoff Voelker (UCSD), Hui Liu [24] K. Tan, J. Zhang, J. Fang, H. Liu, Y. Ye, S. Wang, Y. Zhang, H. Wu, (University of Washington), our shepherd Kyle Jamieson (UCL), W. Wang, and G. M. Voelker. Sora: High performance software radio and the anonymous reviewers for their valuable comments and sug- using general purpose multi-core processors. In NSDI 2009. gestions. [25] Q. Tang, S. Gupta, and L. Schwiebert. BER performance analysis of an on-off keying based minimum energy coding for energy 10. REFERENCES constrained wireless applications. In ICC 2005, 2005. [1] IEEE standard for local and metropolitan area networks part 11; [26] D. Tse and P. Vishwanath. Fundamentals of Wireless amendment 5: Enhancements for higher throughput. IEEE Std Communications. Plenum Press New York and London, 2005. 802.11n-2009. [27] P. Verkaik, Y. Agarwal, R. Gupta, and A. C. Snoeren. Softspeak: [2] IEEE standard for local and metropolitan area networks part 16: Air Making VoIP play well in existing 802.11 deployments. In NSDI, interface for fixed broadband wireless access systems. ANSI/IEEE 2009. Std 802.16-2004. [3] 3GPP TS 36.201-820:. Evolved universal terrestrial radio access (E-UTRA); long term evolution (LTE) physical layer; general description.

158