Advances in Electrical and Computer Engineering, Volume 9, Number 3, 2009

Considerations on VoIP Throughput in 802.11 Networks

Alin D. POTORAC Stefan cel Mare University of Suceava str.Universitatii nr.13, RO-720229 Suceava [email protected]

Abstract—Voice data packets have to arrive at the destination in time, with a defined cadence and with a low and constant delay, in order to allow real-time voice reconstruction. From this point of view, transmitting voice over IP networks is the most sensitive category of applications, especially when a wireless medium is involved. The paper discusses the maximum number of simultaneous voice streams that can be transmitted over 802.11 wireless networks, considering the main factors which affect VoIP throughput in a basic scenario. Starting from a proposed communication model, the number of possible simultaneous VoIP sessions is calculated, taking into consideration the contribution of the protocol overheads, the security overheads, the PHY level timings and the codec properties. Numerical results are generated and compared.

Index Terms—throughput, quality of service (QoS), IEEE 802.11 wireless LAN (WLAN), voice over IP (VoIP), voice codec

I. INTRODUCTION

Sending voice over real-world IP networks is not a trivial problem, and it is even more complicated on wireless networks. Voice applications are the most sensitive ones because they are time sensitive. In 802.11 communications, supplementary overheads and timing intervals are necessary for every carried data packet. Additionally, the radio transmission technology has some limitations due to channel overlapping, radio bandwidth sharing, legacy support, overheads and inter-packet times.

Overlap between channels causes unacceptable degradation of signal quality and throughput, although radio channel overlapping is accepted by the 802.11 standards [2].

In infrastructure wireless LANs with one access point, the data frames do not travel directly between clients. A wireless client sends the data frame to the access point, and the access point then resends the payload of the original data frame, packed in a new data frame, to the receiving client. The AP bandwidth (and the radio space) is shared between the AP radio clients, so the bandwidth available to users is split among those clients [6].

The RTS/CTS (Request to Send, Clear to Send) mechanism is the basic solution for managing mixed 802.11b/g wireless networks. A client asks permission to transmit by sending an RTS message to the access point. In turn, the access point answers with a CTS message. The clients that receive the CTS stop their own send attempts and so avoid collisions. The throughput is thus reduced by the RTS/CTS exchange times.

Considering one radio client in a clean radio environment, the above limitations can be neglected as a first approach. In such conditions there is no overlapping, no bandwidth sharing and no RTS/CTS mechanism. However, the protocol overheads and the radio communication timing intervals cannot be avoided. They are the elements that consume most of the throughput resources and are discussed extensively in the rest of this paper.

[Figure 1: encapsulation of a voice packet. Application layer (RTP): 12 B RTP header plus the codec-dependent voice packet. Transport layer (UDP): 8 B header. Internet layer (IP): 20 B header. LLC and MAC sub-layers: 30 B MAC header, 0-20 B security overheads, 4 B LLC and 4 B CRC around the IP data unit, forming the MAC Protocol Data Unit (MPDU). PHY: 18 B PLCP preamble and 6 B PLCP header in front of the MPDU, forming the PLCP Protocol Data Unit (PPDU).]

Figure 1. VoIP packet encapsulation.
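As a minimal sketch, the byte counts annotated in Figure 1 can be tallied in a few lines of Python. The dictionary and function names below are illustrative only and are not part of the paper; the byte values are those shown in the figure, with the security field left as a parameter between 0 and 20 bytes.

```python
# Byte counts per layer as annotated in Figure 1 (no security enabled).
# All names in this block are illustrative, not taken from the paper.
MPDU_OVERHEAD_BYTES = {
    "RTP header": 12,
    "UDP header": 8,
    "IP header": 20,
    "MAC header": 30,
    "LLC": 4,
    "MAC CRC": 4,
}
PLCP_OVERHEAD_BYTES = {"PLCP preamble": 18, "PLCP header": 6}
MAX_SECURITY_BYTES = 20  # Figure 1 marks the security field as 0-20 B


def mpdu_overhead(security_bytes: int = 0) -> int:
    """Overhead bytes sent at the channel data rate (everything above the PHY)."""
    if not 0 <= security_bytes <= MAX_SECURITY_BYTES:
        raise ValueError("security_bytes must be between 0 and 20")
    return sum(MPDU_OVERHEAD_BYTES.values()) + security_bytes


def ppdu_overhead(security_bytes: int = 0) -> int:
    """Total overhead bytes on the air, PLCP preamble and header included."""
    return mpdu_overhead(security_bytes) + sum(PLCP_OVERHEAD_BYTES.values())


def payload_share(voice_bytes: int, security_bytes: int = 0) -> float:
    """Fraction of the transmitted bytes that is voice payload."""
    return voice_bytes / (voice_bytes + ppdu_overhead(security_bytes))


if __name__ == "__main__":
    print(mpdu_overhead())              # 78
    print(ppdu_overhead())              # 102
    print(ppdu_overhead(20))            # 122
    print(f"{payload_share(160):.1%}")  # 61.1%
```

Note that `payload_share` is a share of bytes, not of air time: the PLCP bytes travel at the basic rate, so their weight in time is larger than their weight in bytes.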

Digital Object Identifier 10.4316/AECE.2009.03009

II. PROTOCOL OVERHEADS

For each voice packet that has to be sent, an overhead is added as the data unit passes through each protocol layer, so the packet length increases every time the data unit is handed to the next lower layer. Figure 1 shows how this happens. At the application level, the voice data stream is compressed by a specific codec, resulting in voice packets of a certain length, which depends on the codec. The packets are carried using RTP (Real-time Transport Protocol), which adds a 12-byte header. Next, at the transport layer, an 8-byte UDP (User Datagram Protocol) header is added. Building the IP datagram needs another 20-byte header. With no security involved, the MAC layer adds a total overhead of 38 bytes (LLC contribution included) and the PHY adds another 24 bytes as PLCP (Physical Layer Convergence Protocol) preamble and header.

The PLCP preamble and header (PLCP overheads) are transmitted at the basic channel data rate. The basic rate is 1 Mbps for 802.11b, 24 Mbps for pure 802.11g and 6 Mbps for 802.11a. The rest of the frame is transmitted at the channel data rate. Under these conditions, it is realistic to count as byte overhead only what is accumulated down to the 802.11 MAC layer, 78 bytes in total, and to include the PLCP transmission time together with the other timing intervals of 802.11 communications. These timing intervals are DIFS (Distributed Interframe Space), SIFS (Short Interframe Space) and CW (Contention Window, backoff time) [5], [9]; they are explained later as 802.11 radio environment characteristics.

We now consider the overheads for the Application Layer (RTP), the Transport Layer (UDP), the Internet Layer (IP) and the MAC Sub-Layer (MAC), as suggested in Figure 1. The MAC sub-layer contributes the MAC header (30 bytes), the MAC CRC trailer (4 bytes), the LLC overheads (3 or 4 bytes) and optional security overheads if WEP or WPA is used.

Accordingly, at the PHY level and excluding the PHY overheads, the overhead already accumulated is H, with a length of 78 bytes, as calculated in equation (1).

$H = H_{RTP} + H_{UDP} + H_{IP} + H_{MAC} = 12 + 8 + 20 + 38 = 78$ bytes    (1)

The PHY level contribution is not included in the general overheads because it is always transmitted at the basic channel data rate and not at the channel rate [1], [9]. When a certain data rate is established on a radio channel, the bits of both data and overheads are transmitted at this speed. The PLCP preamble and header (24 bytes in total) are, however, transmitted at the basic rate. Based on that, we can convert this PHY overhead into a transmission time interval with a constant value, independent of the channel data rate, which can be added to the other time intervals in the same way as the inter-frame intervals (explained later as 802.11 radio channel properties).

Finally, when no security method is involved, the PLCP Protocol Data Unit (PPDU) launched into the physical transmission medium contains 102 overhead bytes in total (78 + 24), transmitted at two different rates.

III. SECURITY OVERHEADS

When security is enabled, 8, 16 or 20 additional bytes are carried with each voice frame, as shown in Table 1. The security overheads therefore extend the data frame and have an impact on the voice channel. The total overhead reaches 110 bytes with WEP (Wired Equivalent Privacy), 122 bytes with WPA (Wi-Fi Protected Access) using the TKIP (Temporal Key Integrity Protocol) algorithm, and 118 bytes with WPA CCMP (Wi-Fi Protected Access based on the Counter Mode with Cipher Block Chaining Message Authentication Code Protocol) [10].

It is important to note that the voice packet length usually ranges from 10 to 160 bytes (Table 2). For example, when a G.729 codec is used (10-byte packets) together with the WPA-TKIP security method, 20 bytes are added to each 10-byte payload. This is the worst case scenario, with the minimum packet length and the maximum security overhead: every voice packet is accompanied by a security sequence twice its own length.

However, the impact of the security overheads on the number of possible simultaneous VoIP sessions is small, because VoIP goodput is not very sensitive to variations of the overheads [1], while the timing intervals are dominant in the transmission budget, as we will conclude in this paper.

TABLE 1. SECURITY OVERHEADS IN BYTES/PACKET

                      WEP      WPA      WPA2
Security protocol     WEP      TKIP     CCMP
Data integrity        CRC-32   MIC      CCM
Security level        Poor     Medium   High
Overheads [bytes]     8        20       16

IV. VOICE CODECS

Since audio information is continuous, some background noise has to be transmitted even in quiet time intervals. If uncompressed voice is carried, the necessary speed for the basic data flow is 8 kHz x 8 bits = 64 kbps (PCM). If we add 102 bytes of overhead to each 1-byte sample, we get an unacceptable channel usage efficiency of 1/(1+102) = 0.97%. Therefore, packing the elementary samples into larger packets is a necessity.

The packets have to be long enough to assure good channel efficiency, but also short enough to allow time multiplexing with other packets, and they have to be tailored to the specific carrying frames. A short packet also means a shorter retransmission time when errors occur. Each codec defines a packet length and a packet inter-arrival time, as shown in Table 2, which also summarizes the protocol overheads for the usual codecs. The codec parameter values are included in the ITU Recommendations [3], [4] and are based on coding/decoding delays that are acceptable for human sound perception.

TABLE 2. THE PARAMETERS OF THE MAIN CODECS

TCP/IP layer         Parameter                         G.726-32   G.711   G.729   G.723.1
Application Layer    Packet inter-arrival time [ms]    20         20      10      30
                     Voice packet length [bytes]       80         160     10      24
                     RTP layer overhead [bytes]        12         12      12      12
Transport Layer      UDP layer overhead [bytes]        8          8       8       8
Internet Layer       IP layer overhead [bytes]         20         20      20      20
Data link Sublayer   MAC layer overhead [bytes]        36         36      36      36
Physical Sublayer    PHY layer overhead [bytes]        24         24      24      24

V. RADIO ENVIRONMENT

At the PHY interface, specific 802.11 rules and parameters have to be considered. Since the protocol overheads were already counted in building the MPDU (MAC Protocol Data Unit), some dead time intervals now need to be evaluated in order to calculate the channel availability for VoIP services. In a standard communication scenario, the DIFS, backoff and SIFS times are waiting intervals, while the ACK and the protocol overheads are additional data intervals which are not directly involved in voice transportation (Figure 2).

The PHY overheads are always transmitted at the basic rate (1 Mbps for 802.11b), while the frame payload, which is the MPDU, travels at the channel data rate. Based on that, as was shown, it is useful to consider the protocol overheads only up to the MPDU level, in order to count how many MPDU sequences are necessary for one VoIP stream. Since the ACK contribution is constant (always transmitted at the basic rate, with a fixed length), it can be treated as a supplementary timing interval together with the other timing elements.

The IEEE 802.11 standard specifies two access mechanisms: the Distributed Coordination Function (DCF) and the Point Coordination Function (PCF). PCF is based on a central coordinator which polls the other stations and allows them contention-free access to the channel. PCF is not generally supported in commercial products. The main access method is DCF, which is based on the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) protocol.

In CSMA/CA networks such as 802.11, a radio station which finds that the radio environment is available (no carrier) will start to transmit only after a random backoff procedure. This is controlled by the DCF access method, which starts by waiting for a period of time referred to as the DCF interframe space, DIFS (Fig. 2).

If the medium is sensed to be available for longer than the DIFS, a backoff waiting interval is introduced before a new packet can be transmitted. The backoff interval is an integer number of time slots drawn from the Contention Window, CW. The number of slots is a discrete random value between zero and CW, where CW lies between a minimum value (CWmin = 15 or 31, depending on the standard) and a maximum value (CWmax = 1023); the waiting time is this number multiplied by the slot time.

In the IEEE 802.11 standard, a receiver must transmit a positive acknowledgement, ACK, to the transmitter when a packet is received with no errors. The ACK is transmitted only after the short interframe space interval, SIFS (Fig. 2).

The ACK frame is 14 bytes long and is transmitted at a basic rate of 2 Mbps (in 802.11b/g), regardless of the channel data rate [1].

[Figure 2: timing diagram with sender and receiver time lines: busy medium, DIFS, backoff (time slots), PPDU frame (PHY overhead followed by the MPDU), SIFS, ACK from the receiver, then DIFS and backoff before the next frame.]

Figure 2. Access Mechanism for Unicast Packets in 802.11 Networks.

We can now calculate the total time interval between two successive MPDUs in a unicast communication as follows:

$T_d = T_{DIFS} + T_{Backoff} + T_{SIFS} + T_{PHY}$    (2)

The DIFS and SIFS values are defined by the communication standards and are listed in Table 3. DIFS is related to SIFS and is calculated as in equation (3).

$T_{DIFS} = 2 \cdot T_{Slot} + T_{SIFS}$    (3)

Before starting a transmission, a station randomly chooses a backoff time, with the number of time slots ranging from 0 to the Contention Window (CW) [1]. The station decreases this backoff value progressively while the channel is idle after a DIFS interval, and stops the timer if it senses the channel to be busy. When the backoff value reaches zero, the station transmits its packet. If it does not receive the ACK confirmation after transmitting the PPDU frame, the station assumes that the packet has been lost due to transmission errors. In the next steps, the CW value is increased to $CW_n = 2(CW_{n-1} + 1) - 1 = 2^n (CW_0 + 1) - 1$ for the nth attempt, and a new backoff time is randomly chosen from the interval [0, CW-1]. The procedure is repeated until a successful confirmation or until the CW maximum limit is reached. The random backoff time is considered in this evaluation as an average value, calculated as in (4).

$T_{Backoff} = T_{Slot} \cdot (CW_{min} - 1) / 2$    (4)

$T_{PHY}$ is the time necessary to transmit the PLCP preamble and header (24 bytes in length) at the basic data rate. This is the data rate which is "understandable" by every station in the network. For 802.11b, transmitting 24 bytes at 1 Mbps takes 24 x 8 b / 1 Mbps = 192 μs. The other values are given in Table 3.

TABLE 3. DCF PARAMETERS FOR 802.11 COMMUNICATIONS

                      802.11b    802.11g                      802.11a
DIFS                  50 μs      28 μs (50 μs with RTS-CTS)   34 μs
SIFS                  10 μs      10 μs                        16 μs
CWmin                 31         15                           15
CWmax                 1023       1023                         1023
Slot Time             20 μs      9 μs (20 μs)                 9 μs
ACK Frame             203 μs     30 μs                        24 μs
PHY Header (PLCP)     192 μs     20 μs                        16 μs
Data Rate             11 Mbps    54 Mbps                      54 Mbps
Basic Rate            1 Mbps     24 Mbps (no legacy)          6 Mbps

The ACK frame is 14 bytes long at the MAC layer. For transmission, the PLCP preamble and header ($T_{PHY}$) are also added. $T_{ACK}$ is obtained as in (5).

$T_{ACK} = T_{PHY} + \mathrm{ACK\_frame\_length} \cdot 8 / \mathrm{Basic\_Rate}$    (5)

For 802.11b, the value used in this model is $T_{ACK}$ = 203 μs. The $T_{ACK}$ values for the other standards are also listed in Table 3.

VI. VOIP SESSION BANDWIDTH NEEDS

For all codecs except G.729, the necessary number of packets per second ($N_p$) is the inverse of the framing interval (FI):

$N_p = 1 / FI$    (6)

For G.729, two frames are combined into one packet.

Based on the packet length (PL) and knowing the timing interval $T_d$ for each frame from equation (2), it is now easy to calculate the total time $T_f$ necessary for the transmission of one frame and its ACK confirmation, using equation (7):

$T_f = T_d + T_{ACK} + [(PL + H) \cdot 8] / R$    (7)

where H, $T_d$ and $T_{ACK}$ were calculated with (1), (2) and (5), PL is the voice packet length from Table 2 and R is the channel data rate.

The value $1/T_f$ is therefore the number of possible frames per second.

A VoIP session has two streams, one in each direction, both over the same communication medium. So every VoIP session between a wireless and a wired client needs at least $2 \cdot N_p$ frames per second.

The number of possible simultaneous VoIP sessions, N, is the ratio between the number of possible frames per second and the number of frames per second required by one VoIP session:

$N = \frac{1/T_f}{2/FI} = \frac{FI}{2 \cdot T_f} = \frac{FI}{2 \left[ T_d + T_{ACK} + \frac{(PL + H) \cdot 8}{R} \right]} = \frac{R \cdot FI}{2 \left[ R \cdot T_d + R \cdot T_{ACK} + (PL + H) \cdot 8 \right]}$    (8)

As an example, we can consider the G.711 codec and an 802.11g network. Based on the above equation, the result is 53.7 voice streams.

The results, as the possible number of simultaneous VoIP streams, for other usual codecs and wireless standards are shown in Table 4. The values were computed using the modelling equation (8).

TABLE 4. THE NUMBER OF SIMULTANEOUS VOIP SESSIONS

            G.726-32   G.711   G.729   G.723.1
802.11b     11.4       10.7    6.0     17.9
802.11g     57.3       53.7    30.5    90.3
802.11a     56.7       53.1    30.1    89.2

VII. MODEL LIMITATIONS

Some limitations of this model are related to the situations described below.

The backoff time has a random value and can only be estimated as an average. The station selects a starting backoff time as a value inside the interval between 0 and CWmin, which defines the number of time slots to wait. If collisions appear, the CW value is increased, and when empty slots are detected CW is decreased; otherwise the value remains constant. The contention window ranges between 0 and the minimum value if only one station is involved, and reaches higher values for multiple clients [2]. If no collisions appear, the average backoff time can be calculated as (CWmin-1)/2 slots; otherwise higher values have to be considered.

For IEEE 802.11g networks, no RTS/CTS mechanism was considered, so no legacy support is involved at this step. If there are no 802.11b stations, the network can use the 802.11g basic data rate of 24 Mbps, higher than the 1 Mbps which is necessary to communicate with 802.11b stations. When 802.11b legacy support is necessary, the 802.11g stations fall back to communication parameters that are understandable by 802.11b clients, at least while the medium access is negotiated. The RTS/CTS mechanism instructs the other stations not to initiate a transmission, which increases the communication dead times but reduces the collision rate. When collisions appear, there are no ACK confirmations and the frame is retransmitted. A frame can be retransmitted many times, so in such cases the additional dead times have to be included in the time domain throughput evaluation.
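The timing relations of Section V, equations (3) and (4), the PLCP transmission time and the contention window growth rule, can be checked numerically with a short Python sketch. The slot time, SIFS and CW limits are the Table 3 values; the function and dictionary names are illustrative, and capping the window at CWmax is an assumption based on the statement that the procedure continues until the CW maximum limit is reached.

```python
# Slot time, SIFS (microseconds) and contention window limits from Table 3.
# Names are illustrative, not part of the paper.
DCF = {
    "802.11b": {"slot_us": 20, "sifs_us": 10, "cw_min": 31, "cw_max": 1023},
    "802.11g": {"slot_us": 9, "sifs_us": 10, "cw_min": 15, "cw_max": 1023},
    "802.11a": {"slot_us": 9, "sifs_us": 16, "cw_min": 15, "cw_max": 1023},
}


def difs_us(p: dict) -> int:
    """Equation (3): DIFS = 2 * slot time + SIFS."""
    return 2 * p["slot_us"] + p["sifs_us"]


def mean_backoff_us(p: dict) -> float:
    """Equation (4): average backoff when no collisions occur."""
    return p["slot_us"] * (p["cw_min"] - 1) / 2


def plcp_time_us(plcp_bytes: int, basic_rate_mbps: float) -> float:
    """Time to send the PLCP preamble and header at the basic rate."""
    return plcp_bytes * 8 / basic_rate_mbps


def cw_after_retries(p: dict, n: int) -> int:
    """CW_n = 2^n * (CW_0 + 1) - 1, capped at CWmax (the cap is an assumption)."""
    return min(2 ** n * (p["cw_min"] + 1) - 1, p["cw_max"])


if __name__ == "__main__":
    for name, p in DCF.items():
        print(name, difs_us(p), mean_backoff_us(p))
    # 802.11b 50 300.0
    # 802.11g 28 63.0
    # 802.11a 34 63.0
    print(plcp_time_us(24, 1))  # 192.0
    print([cw_after_retries(DCF["802.11b"], n) for n in range(7)])
    # [31, 63, 127, 255, 511, 1023, 1023]
```

The DIFS values reproduce the first row of Table 3, and the 192 μs PLCP time matches the 802.11b entry computed in the text.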

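The capacity model of equations (2), (4), (7) and (8) can be sketched in Python as follows. This is only an illustration under stated assumptions: the timing inputs (DIFS, SIFS, slot time, CWmin, PHY header time, ACK time, data rate) are taken directly from Table 3 rather than recomputed, the codec inputs (PL, FI) from Table 2, and all class and function names are hypothetical.

```python
from dataclasses import dataclass

H_BYTES = 78  # protocol overheads from equation (1)


@dataclass(frozen=True)
class Phy:
    """Per-standard inputs, copied from Table 3 (times in microseconds)."""
    difs_us: float
    sifs_us: float
    slot_us: float
    cw_min: int
    t_phy_us: float
    t_ack_us: float
    rate_mbps: float


TABLE3 = {
    "802.11b": Phy(50, 10, 20, 31, 192, 203, 11),
    "802.11g": Phy(28, 10, 9, 15, 20, 30, 54),
    "802.11a": Phy(34, 16, 9, 15, 16, 24, 54),
}

# Codec -> (voice packet length PL in bytes, framing interval FI in ms), Table 2.
CODECS = {"G.726-32": (80, 20), "G.711": (160, 20), "G.729": (10, 10), "G.723.1": (24, 30)}


def frame_time_us(phy: Phy, pl_bytes: int, h_bytes: int = H_BYTES) -> float:
    """Equation (7): time for one frame plus its ACK, in microseconds."""
    t_backoff = phy.slot_us * (phy.cw_min - 1) / 2                  # equation (4)
    t_d = phy.difs_us + t_backoff + phy.sifs_us + phy.t_phy_us      # equation (2)
    return t_d + phy.t_ack_us + (pl_bytes + h_bytes) * 8 / phy.rate_mbps


def voip_sessions(standard: str, codec: str, h_bytes: int = H_BYTES) -> float:
    """Equation (8): N = FI / (2 * Tf), with FI converted to microseconds."""
    pl_bytes, fi_ms = CODECS[codec]
    return fi_ms * 1000 / (2 * frame_time_us(TABLE3[standard], pl_bytes, h_bytes))


if __name__ == "__main__":
    for codec in CODECS:
        print(codec, round(voip_sessions("802.11g", codec), 1))
```

With these inputs the 802.11g and 802.11a results agree with Table 4 to within rounding (for example about 53.7 sessions for G.711 over 802.11g), while the 802.11b results come out slightly higher than the tabulated ones, by up to about 0.2 sessions. Passing a larger `h_bytes`, such as 98 for the 20-byte WPA-TKIP overhead of Table 1, shows how little the security overheads change the result.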

In particular, when 802.11g needs to be compatible with 802.11b, the protection mechanism based on RTS/CTS handshaking is employed, because both standards use the same radio frequency. When a station wants to initiate a transmission, it sends an RTS (Request to Send) message. Any other node receiving the RTS or the CTS should refrain from sending data for a time interval. The RTS/CTS mechanism acts as a virtual carrier-sense method. It induces a supplementary delay due to the RTS frame transmission (20 bytes at the basic rate of 1 Mbps plus the PHY header), the CTS frame transmission (14 bytes at the basic rate of 1 Mbps plus the PHY header) and one DIFS/SIFS pair for inter-framing. Based on this principle, the resulting RTS/CTS time is 430 μs, and it has to be added only in the particular case of 802.11g communication with 802.11b backward compatibility enabled. When 802.11g adopts RTS/CTS protection, the throughput and the VoIP capacity are not much higher than in 802.11b [9].

In the above calculations, no security measures were considered. When WEP or WPA is employed, supplementary overheads of 8 bytes (WEP), 20 bytes (WPA TKIP) or 16 bytes (WPA CCMP) are added. The security overheads can be treated as general protocol overheads by considering an extended H value, $H_{sec} = H + H_{SOH}$, where $H_{SOH}$ are the security overheads [10].

For the most overhead-consuming security method, WPA (TKIP), adding the WPA overheads (20 bytes, see Table 1) to the existing overheads H gives $H_{sec} = H + H_{WPA} = 78 + 20 = 98$ bytes. The resulting values are shown in Table 5. As we can see, the degradation is not significant compared with the unsecured streams of Table 4.

TABLE 5. THE NUMBER OF SIMULTANEOUS VOIP SESSIONS WITH WPA

            G.726-32   G.711   G.729   G.723.1
802.11b     11.2       10.5    5.9     17.6
802.11g     56.4       52.8    29.9    88.7
802.11a     55.8       52.3    29.6    87.7

VIII. CONCLUSIONS

The number of possible simultaneous VoIP streams in a wireless environment is usually calculated starting from a simplified communication model. Different approaches, usually covering particular situations, are available in articles and papers [5], [7], [8], [9]. The model presented in this paper is based on a rigorous analysis of the specific factors involved in wireless communication. For a simple scenario, with no radio interference and no transmission errors, the results are quite accurate and comparable with most of the models. The obtained results are slightly smaller than the results generated with other models [9], basically due to the more accurate overheads calculation in our case. Starting from this point, more complex future studies can bring into discussion the channel characteristics, the radio overlapping and the RTS/CTS legacy support.

We can conclude that the channel usage in time domain terms can be calculated considering the main actors, which are:
- Transmission time for the protocol overheads, H = 78 bytes, transmitted at the channel data rate R;
- Transmission time for the security overheads, $H_{WEP/WPA}$ = 8, 20 or 16 bytes, transmitted at the channel data rate R;
- Interframe spaces: DIFS (Distributed Coordination Function Interframe Space), SIFS (Short Interframe Space), backoff time;
- PLCP overheads (PHY), 24 bytes transmitted at the channel basic data rate $R_B$;
- Voice payload.

The generic contributions to the transmission time budget are suggested in Figure 3. Considering as an example a common scenario, with the 802.11g standard and the G.729 codec, the results are: 9.24% protocol overheads, 2.37% security overheads (WPA), 80.8% interframe space, 6.4% PHY overheads and only 1.19% for the voice payload. As we can see, the voice payload has an extremely small contribution compared with the other factors. Any transmission optimization has to act first of all on the element with the biggest weight in the transmission budget.

Figure 3. The time budget usage for a typical VoIP scenario.

Based on the described model, some solutions to improve the VoIP throughput can be identified. They are as follows:
1. Using the largest possible frames can reduce the weight of the interframe intervals. However, in a noisy environment, this means spending more time on potential retransmissions, so the overall effect on throughput could be negative;
2. Security overheads can be reduced by choosing the encryption method carefully: for example, WPA2 (CCMP) adds 16 bytes per packet against 20 bytes for WPA (TKIP) (Table 1), even if more computing power is necessary [10];
3. Interframe intervals can be reduced if broadcast traffic is used, when possible. That is because a continuous blind transmission, with no ACK confirmation, removes some of the dead transmission intervals usually associated with ACK procedures;
4. In crowded communication channels, limiting the backoff time length for voice packets will increase their priority versus other types of data;


5. Using an appropriate codec with a higher compression rate allows increasing the number of simultaneous VoIP streams, because each stream occupies less bandwidth.

As a final conclusion, we can note that the paper proposes a theoretical communication model to be applied to VoIP transport services on wireless networks.

In spite of some limitations, it is simple and accurate compared to the other solutions analyzed in this research [1], [9].

Further improvements can be considered as future research, in terms of connecting the mathematical model with the bit error probability (BER) of the communication channel.

REFERENCES

[1] Daji Qiao, Sunghyun Choi, "Goodput Analysis and Link Adaptation for IEEE 802.11a Wireless LANs", IEEE Transactions on Mobile Computing, vol. 1, no. 4, 2002.
[2] "IEEE 802.11, Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications", Standard, IEEE, Aug. 1999.
[3] "ITU-T Recommendation G.114. 2003", One way transmission time.
[4] "ITU-T Recommendation G.107. 2003", The E-model, a computational model for use in transmission planning.
[5] Mohd Alias, Ong Lee Loon, "Performance of Voice over IP over Wireless LAN for different Audio/Voice Codecs", Jurnal Teknologi, 47(D), Universiti Teknologi Malaysia, 2007.
[6] Potorac Alin Dan, Eugen Coca, "QoS Considerations for 802.11 Networks", European Conference on the Use of Modern Information and Communication Technologies, ECUMICT '06, Ghent, Belgium, 2006.
[7] Raghuraman Rangarajan, Sridhar Iyer, Atanu Guchhait, "Automated design of VoIP-enabled 802.11g WLANs", OPNET Annual Networking Conference (OPNETWORK), Washington D.C., USA, Aug. 2005.
[8] Taiwen Tang, Ketan Mandke, Chan-Byoung Chae, Robert W. Heath, Jr. and Scott M. Nettles, "Multichannel Feedback in OFDM Ad Hoc Networks", The 3rd Annual IEEE Communications Society on Sensor and Ad Hoc Communications and Networks, Volume 2, 2006.
[9] Wei Wang, Soung C. Liew, and Victor O. K. Li, "Solutions to Performance Problems in VoIP over a 802.11 Wireless LAN", IEEE Transactions on Vehicular Technology, 2005.
[10] Potorac Alin Dan and Balan Doru, "The Impact of Security Overheads on 802.11 WLAN Throughput", Journal of Computer Science and Control Systems, University of Oradea, vol. 2, no. 1, 2009.
[11] Tiliute, D. E., "Security of Mobile ad-hoc Wireless Networks. A Brief Survey", Advances in Electrical and Computer Engineering, ISSN 1582-7445, e-ISSN 1844-7600, vol. 7, no. 2, pp. 37-40, 2007, doi: 10.4316/AECE.2007.02009
[12] Marian, N., Top, S., "Integration of Simulink Models with Component-based Software Models", Advances in Electrical and Computer Engineering, ISSN 1582-7445, e-ISSN 1844-7600, vol. 8, no. 2, pp. 3-10, 2008, doi: 10.4316/AECE.2008.02001
