SILK Speech Codec
TDP 10/11
Xavier Anguera and Ciro Gracia


SILK Codec

Audio codec developed by Skype (February 2009).

Skype previously used the SVOPC codec (Sinusoidal Voice Over Packet Coder):
• LPC analysis.
• Quasi-harmonic modelling of the linear prediction (LPC) residual.
• Both the sinusoidal amplitudes and phases are explicitly encoded using new methods based on Gaussian mixture models.

Requirements (Internet Wideband Audio Codec)

Optimized for real-time operation.
Flexibility and real-time adaptation of parameters according to conditions:
• Network
• Hardware
• Audio signal

Parameters (Internet Wideband Audio Codec)

• Bitrate: quality vs. bitrate. Low: <10 kbps (speech in any language). High: excellent for any music signal.
• Sampling rate: narrowband (8 kHz) to wideband (24 kHz or more).
• Complexity: 50 MHz x86 core, wideband mode (16 kHz sampling rate).
• Packet loss resilience: minimize error propagation.
• Delay: < 30 ms.
• Discontinuous Transmission (DTX): low bitrate when only background noise is present.

Encoder

• Sampling rate: 8, 12, 16, 24 kHz.
• Bitrate: 6-40 kbps (1 bit/sample good, 1.5 bits/sample transparent).
• Packet rate: 20 ms frames, 1-5 frames/packet. Bitrate vs. latency/sensitivity.
• Packet loss resilience: use of inter-frame dependencies to detect errors.
• Complexity: optimizations.
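The bits-per-sample figures above translate into bitrates and packet sizes by simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the helper names are hypothetical and all packet header overhead is ignored. It assumes that bitrate is bits per sample times sampling rate, and that the payload of a packet is bitrate times frame duration times frames per packet.

```python
FRAME_MS = 20  # frame duration given in the slides


def bitrate_bps(bits_per_sample, sampling_rate_hz):
    """Bitrate implied by an average number of bits per sample."""
    return bits_per_sample * sampling_rate_hz


def payload_bytes(rate_bps, frames_per_packet):
    """Codec payload of one packet, ignoring all header overhead."""
    bits = rate_bps * FRAME_MS * frames_per_packet / 1000
    return bits / 8


for rate_hz in (8000, 12000, 16000, 24000):
    good = bitrate_bps(1.0, rate_hz)
    transparent = bitrate_bps(1.5, rate_hz)
    print(f"{rate_hz // 1000:2d} kHz: good {good / 1000:.0f} kbps, "
          f"transparent {transparent / 1000:.0f} kbps, "
          f"1 frame/packet at 'good' = {payload_bytes(good, 1):.0f} bytes")
```

At 24 kHz, 1.5 bits/sample gives 36 kbps, which is consistent with the 6-40 kbps range quoted for the encoder.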
Scalability

Encoder scalability.

Subjective evaluation

Quality assessment using MOS (Mean Opinion Score).

Encoder

[Block diagram. Blocks: High-Pass Filter, Voice Activity Detector, Pitch Analysis, Noise Shaping Analysis, Prediction Analysis, Gains Processor, LSF Quantizer, LTP Scaling Control, PreFilter, Noise Shaping Quantization, Range Encoder.]

Decoder

[Block diagram. Blocks: Range Decoder, Decode Parameters, Generate Excitation, LTP Synthesis, LPC Synthesis. Numbered signals:]
1) Range encoded bitstream
2) Coded parameters
3) Pulses and gains
4) Pitch lags and LTP coefficients
5) LPC coefficients
6) Decoded signal

Pitch analysis

• Returns a pitch value every 5 ms, together with the voiced/unvoiced decision.
• LPC analysis is done with order 16, 12 or 8.
• Three levels of correlation are used to reduce complexity.

Noise shaping analysis

Optimizes some parameters to reduce the effect of noise:
• Balances quantization noise and bitrate.
• Spectral shaping of the quantization noise: makes it follow the signal spectrum.
• De-emphasizes spectral valleys (where noise would be more noticeable).
• Matches the levels of the decoded speech formants to the original ones.
• The resulting parameters are applied to the signal in the PREFILTER module.

Prediction analysis

It is done differently depending on whether the signal is voiced or unvoiced:
• Voiced:
  • First, a 5-coefficient long-term prediction (LTP) analysis is performed over 20 ms.
  • The residual is input to an LPC analysis.
  • The LPC coefficients are converted to Line Spectral Frequencies (LSF), which are less sensitive to quantization noise, and quantized.
• Unvoiced:
  • No LTP analysis is needed.
  • LPC analysis is performed, and the result is transformed to an LSF vector and quantized.

LSF quantization

A codebook method with a non-uniform quantization rate is used:
• Rarely occurring values are quantized with low distortion but a high number of bits.
• Commonly occurring values are modeled with low error and a low number of bits.
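The two bullets above describe a variable-rate codebook: frequent entries get short codes and rare entries get long ones. One common way to search such a codebook is to minimize distortion plus a weighted bit cost. The sketch below is an illustration under assumptions, not SILK's actual quantizer: the toy codebook, the probabilities, the weight `lam` and the function name `rd_quantize` are all hypothetical, and the bit cost of an entry is taken to be its ideal code length, -log2(p).

```python
import math


def rd_quantize(x, codebook, probs, lam):
    """Return the index minimizing squared error + lam * bits.

    codebook: list of vectors (lists of floats), hypothetical values
    probs:    probability of each entry (assumed known from training)
    lam:      assumed weight trading distortion against rate
    """
    best_index, best_cost = None, math.inf
    for i, (c, p) in enumerate(zip(codebook, probs)):
        distortion = sum((a - b) ** 2 for a, b in zip(x, c))
        bits = -math.log2(p)  # ideal code length for this entry
        cost = distortion + lam * bits
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Toy 2-D codebook: entry 0 is common (cheap), entry 1 is rare (costly).
codebook = [[0.0, 0.0], [1.0, 1.0]]
probs = [0.9, 0.1]
x = [0.6, 0.6]

print(rd_quantize(x, codebook, probs, lam=0.0))  # distortion only
print(rd_quantize(x, codebook, probs, lam=0.5))  # rate now matters
```

With `lam=0.0` the search returns the nearest entry (index 1); with `lam=0.5` the more probable, cheaper entry (index 0) wins despite its larger error.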
The LSF codebook is trained a priori on a large training set.

LTP quantization

It also uses a vector codebook, chosen from three candidates (containing 10, 20 and 40 vectors respectively). For each frame, the best codebook is chosen according to a rate-distortion minimization function.

Noise shaping quantization

This module combines the outputs of all the other modules to generate the overall residual, which is quantized and sent.

Range encoder

A data compression method proposed in 1979 (now patent-free) and based on arithmetic coding. It uses the probability of occurrence of each pattern to encode those that occur more often with fewer bits. It encodes the following: the voiced/unvoiced decision, the LTP and LPC quantization indices, the residual signal, and several intermediate gains.
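The principle behind the range encoder, that probable symbols cost fewer bits, can be seen by narrowing an interval once per symbol, as arithmetic coding does. The sketch below is a toy illustration using exact fractions and a hypothetical two-symbol model (`V` for voiced, `U` for unvoiced, with made-up probabilities). A real range coder works with finite-precision integers and renormalization, both omitted here.

```python
from fractions import Fraction
import math

# Hypothetical model: symbol -> (cumulative start, probability).
MODEL = {
    "V": (Fraction(0), Fraction(3, 4)),
    "U": (Fraction(3, 4), Fraction(1, 4)),
}


def narrow(message):
    """Return (low, width) of the interval that identifies `message`."""
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        start, prob = MODEL[symbol]
        low += width * start   # move to the symbol's sub-interval
        width *= prob          # shrink by the symbol's probability
    return low, width


def ideal_bits(message):
    """About -log2(width) bits are needed to point inside the interval."""
    _, width = narrow(message)
    return -math.log2(width)


print(ideal_bits("VVVV"))  # four likely symbols
print(ideal_bits("UUUU"))  # four unlikely symbols
```

Under this assumed model, four likely symbols need about 1.66 bits in total, while four unlikely ones need 8 bits.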