Audiocodes Voip Processors (Chips) Solutions Guide Datasheets

Total Page:16

File Type:pdf, Size:1020Kb

Audiocodes Voip Processors (Chips) Solutions Guide Datasheets AudioCodes VoIP Processor Solution Guide The Clear Sound of Quality Featuring Rich VoIP Processor Solutions from the world’s leading enabler of the new voice infrastructure AudioCodes Enabling Technology Products Solution Guide Application Solution VoIP Processor Number of Number of Product Description Remarks Page Product Number VoIP Channels Data Ports AC494 AC494 Hardware & Software Reference IP Phone solution IP Phone AC495 2 2 Design, Linux support, IP Phone 4 - 5 available VoIP Toolkit AC495L SIP demo application AC494E Hardware & Software Reference (Orchid) AC494E Design Gigabit Eth support, IP Phone solution IP Phone 2 2 10 - 11 IP Phone AC495E Linux support, IP Phone SIP available Toolkit demo application Hardware & Software Reference ATA/CPE AC494 Design, Linux support, ATA Gateway AC495 Up to 8 2 6 - 7 SIP demo application, FXS/FXO VoIP Toolkit AC496 support Production ready module, AC494 Optional production files available, Tulip ATA AC495 Up to 8 2 customization 7 Complete ATA SIP application, AC496 of the software FXS/FX0 support Hardware & Software Reference CPE Design, Linux support, ATA SIP AC483 Up to 4 - 14 VoIP Toolkit demo application, FXS/FXO support and CPE VoIP Gateway and CPE VoIP AC48x API Drivers AC48x Up to 4 - 13 VoIP DSP OS independent Analog Telephone Adapter (ATA) (ATA) Adapter Analog Telephone SMB VoIP Hardware & Software Reference Toolkit - AC494E + Design, Linux support, 16 port 4 - 16 2 Scalable solution 8 - 9 Gateway AC498 Gateway SIP demo application, FXS support Risk Free Asterisk AC494E + based IP- 4 - 16 2 IP-PBX application 9 AC498 PBX Toolkit Hardware & Software Reference Gladiolus AC501x 8 - 24 - Design, API Drivers OS 9 SDK independent Voice Over Hardware Reference Design, API Packet AC490 8 - 12 - 15 Drivers OS independent Processor SMB VoIP Gateway and IP-PBX Gateway SMB VoIP Voice Over Hardware Reference Design, API Packet AC501x 8 - 24 - 16 Drivers OS independent Processor Voice Over Hardware Reference Design, API Packet AC5042 Up to 32 Drivers OS independent, Linux 12 SoC support, SIP application Voice Over AC491/ Hardware Reference Design, API Packet 32, 64, 96 17 AC491L Drivers OS independent Processor Gateway Voice Over *168, 204, 234 Hardware Reference Design, API Enterprise VoIP VoIP Enterprise Packet AC503x - 18 *Dynamically allocated Drivers OS independent Processor VoIP channels 3 | CMBU Chips & Modules Business Unit AudioCodes Enabling Technology Products AC494 Family System On a Chip (SoC) Channel Density • 2 VoIP line support Hardware • CPU – MIPS32 4KEC, 165 Mhz • DSP – AC49x, 125 Mhz • 2 Integrated 10/100 Base-T MAC/PHY ports • Integrated 3-port layer-2 Ethernet Switch • 8x8 Keypad interface • LCD Interface – Alphanumeric, Graphic, Rasterized • Integrated SSP (SPI/I2C) Controller • USB 1.1 Host/Device • External Memory I/F – Two SDRAM banks, up to 128 MB, ● Single chip complete IP Phone solution Flash up to 32 MB • Telephony Interface - Integrated dual channel 16 bit CODEC, ● Superior voice quality at a competitive price sampling rate 8/16 KHz • Five inputs to ADC, four outputs from DAC ● Field-proven VoIP processor and software • PCM Interface - 2.048 MHz A/μ-Law serial port • Power Supply - +3.3V (+1.5V core via integrated voltage ● VoIPerfectHD™ (High Definition VoIP) support regulator) • Power Consumption - 1.88 W (max) ● Combined hardware & software reference design • Operational Case Temperature Range - 0ºC - 70ºC (commercial) • Package - 324 pin BGA, 23x23 mm, 1.00 mm ball spacing ● Ready to use IP Phone open platforms Telephony Signaling The AC494 IP Phone Toolkit is a combined hardware and software • DTMF – TIA464B IP Phone reference design based on the AC494 System on Chip (SoC) • Programmable Call-Progress Tones family. AudioCodes’ AC494 SoC (System on Chip) family provides IP VoIP Signaling Protocols Phone developers and manufacturers with a single chip field-proven • SIP - RFC 3261, 3262, 3263, 3264, 2327 complete IP Phone solution, integrated with superior voice quality at a Packetization competitive price. • RTP/RTCP Packetization – RFC 3550, 3551, 2198 • Caller ID - Telcordia/ETSI/NTT Type I/II (on hook / off hook) • DTMF Relay - RFC 2833, RFC 4733 AC494 SoC (System on Chip) Security In a single chip, the AC494 combines all IP Phone components; including • SRTP (Secured RTP) per RFC 3711, 128 bit AES, HMAC SHA1 a MIPS 4KEC CPU, AC49x DSP and a 3-port Ethernet Switch (MACs and Media Processing PHYs) together with a rich set of peripherals such as internal CODECS • Voice Coders - G.711, G.723.1, G.729A/B, G.726, GSM FR and serial ports. The AC494 is based on the VoIPerfectHD™ architecture • Wideband Coders – G.722, G.722.2 (WB-AMR), Opus • Echo Cancelation – G.168-2004 compliant, 64 msec tail length (High Definition VoIP) support, AudioCodes' underlying, best-of-breed, • Acoustic Echo Canceler core media gateway technology for its product line. Utilizing its highly • Silence Suppression – VAD, CNG integrated structure and requiring only a minimal number of peripheral • Adaptive Jitter Buffer 300 msec devices, time to market is rapid with little risk. Telephony Features • 3-Way conferencing (on each channel) Several silicon derivatives of this family allow cost optimization per • Call Forward • Call Hold application, which includes the following devices: AC494, AC495, AC496 • Call Transfer and AC495L. • Call Waiting Configuration/Management • Embedded Web Device CPU Clock DSP Clock USB Voice CODECs Eth. Ports • TFTP AC494 165 Mhz 125 Mhz V 2 2 • FTP • Telnet AC495 125 Mhz 100 Mhz - 2 2 • Filesystem Support AC495L 87.5 Mhz 87.5 Mhz - 2 2 Operating System - AC496 165 Mhz 125 Mhz 2 2 • Linux Kernel • Board Support Package 4 | CMBU Chips & Modules Business Unit IP Phone Toolkit IP Phone Toolkit Hardware Design The AC494 IP Phone Toolkit hardware reference design includes all the peripherals for a complete high-end IP Phone. The AC494 SOC includes the internal interfaces for LCD Display, a keypad, USB, speaker and handset. Additionally, it has a 3-port Eswitch and internal MACs/PHYs. IP Phone Toolkit Hardware Design IP Phone Software Development Kit (sdk) The AC494 IP Phone Toolkit includes a complete Linux Software Development Kit (SDK) supporting voice and network protocols. The software is available in open source code, comprised of separate modules allowing for maximum customization. IP Phone Software Architecture Ready to use IP Phones for developers The AC494 IP Phone toolkit is also available on ready-to-use IP Phones. This allows OEMs and ODMs to develop their software on an actual IP Phone platform. Toolkit Contents • Reference & Development IP Phone Platform • Linux Operating System • Free Development Environment (Linux Tool-chain) • Ready-to-use IP Phone Application (open source code) AC494 IP Phone 5 | VoIP Processor Solutions Guide AudioCodes Enabling Technology Products ATA/CPE Gateway VoIP Toolkit Channel Density • Up to 4 FXS line interfaces Hardware • CPU – MIPS32 4KEC, 165 Mhz • DSP – AC49x, 125 Mhz / 100 Mhz • 2 Integrated 10/100 Base-T MAC/PHY ports • Integrated 3-port layer-2 Ethernet Switch • Integrated SSP (SPI/I2C) Controller Telephony Signaling • DTMF – Detection and Generation, TIA464B • Caller ID – Telcordia/ETSI/NTT Type I/II (on hook/off hook) • Programmable Call-Progress Tones Packetization • RTP/RTCP Packetization – RFC 3550, 3551, 2198 • DTMF Relay - RFC 2833, RFC 4733 Security ● Complete ATA/CPE Gateway VoIP solution • SRTP (Secured RTP) per RFC 3711, 128 bit AES, HMAC SHA1 ● Up to 4 FXS line interfaces with assembly options Telephony Support ● 3-Way conferencing per line support • FXS and FXO supported ● VoIPerfectHD™ (High Definition VoIP) support, enabling Media Processing • Voice Coders - G.711, G.723.1, G.729A/B, G.726, GSM FR superior voice quality at a competitive price • Wideband Coders – G.722, G.722.2 (WB-AMR) • Fax – T.38 Relay, Bypass , T.38 over RTP ● Field-proven VoIP processor and software • Echo Cancelation – G.168-2004 compliant, 64 msec tail length • Silence Suppression – VAD, CNG ● Perfect platform for stand-alone ATA developers • Adaptive Jitter Buffer 300 msec ● Production ready hardware and software reference Telephony Features design • 3-Way conferencing (on each line) • Call Forward • Call Hold • Call Transfer • Call Waiting The ATA Toolkit is a combined hardware and software ATA reference design based on the AC494 System on Chip (SoC). The AC494 SoC Data Protocols (System on Chip) provides ATA developers and manufacturers with single • IPv4, TCP, UDP, ICMP, ARP • PPPoE chip field-proven complete ATA solution, integrated with superior voice • Layer 2 switching quality at a competitive price. • DHCP Client/Server – RFC 2132 • WAN to LAN Layer-3 routing • DHCP Client/Server – RFC 2132 • NAT – RFC3022, Application Layer Gateway (ALG) AC494 SoC (System on Chip) Family • IEEE 802.1p/q QoS (VLAN tagging) The AC494 family combines in a single chip the main ATA components; VoIP Signaling Protocols it includes a MIPS 4KEC CPU, AC49x DSP and a 3-port Ethernet Switch • SIP - RFC 3261, 3262, 3263, 3264, 2327 (MACs and PHYs) together with a rich set of peripherals. The AC494 is based on the VoIPerfectHD™ architecture (High Definition VoIP) support, Configuration/Management • Embedded Web AudioCodes' underlying, best-of-breed, core media gateway technology • TFTP for all of its products. • FTP • Telnet • Filesystem support Operating System • Linux Kernel • Board Support Package 6 | CMBU Chips & Modules Business Unit ATA and CPE VoIP Gateway Development Platform and Reference Design The ATA VoIP Toolkit is a combined hardware and softwareATA reference design. The ATA VoIP Toolkit is based on the AC494 System on Chip (SoC) family. The SoC combines in a single chip, all the ATA components; it inlcudes a MIPS 4KEC CPU, AC49x DSP and 2-ports MACs/PHYs together with a rich set of peripherals. The hardware reference design includes also the analog FXS module. The ATA Toolkit Hardware Design ATA Software Development Kit (SDK) The ATA VoIP Toolkit couples the hardware with a complete Linux software suite supporting voice and network protocols, call features and management applications.
Recommended publications
  • Media Gateway Mediant 8000™ Product Description Version
    ™ Media Gateway Mediant 8000™ Product Description Version 6.6 Document # LTRT-92323 Product Description Contents Contents 1 Introduction ....................................................................................................... 15 1.1 General Features ..................................................................................... 15 1.2 PSTN Signaling Features ........................................................................ 16 1.3 3GPP Functionality .................................................................................. 16 1.4 Cable Functionality (PacketCable 1.0) ................................................... 17 1.5 System Management Functionality ........................................................ 17 1.6 Hardware Platform Functionality ........................................................... 18 1.7 Performance Highlights .......................................................................... 19 2 Mediant 8000 Basic Feature Highlights ........................................................... 21 2.1 Voice Packet Processing ........................................................................ 21 2.1.1 Echo Cancelation ................................................................................... 21 2.1.2 Voice and Tone Signaling Discrimination ............................................... 22 2.1.3 Capacity and Voice Compression ........................................................... 22 2.1.4 Voice Activity Detection and Comfort Noise Generation ........................
    [Show full text]
  • Comparison of Voice Activity Detection Algorithms for Wireless
    Proc. IEEE Canadian Conf. Electrical, Computer Engineering (St. John’s, NF), pp. 470-473, May 1997 COMPARISON OF VOICE ACTIVITY DETECTION ALGORITHMS FOR WIRELESS PERSONAL COMMUNICATIONS SYSTEMSy Khaled El-Maleh and Peter Kabal TSP Lab oratory, Dept. of Electrical Eng. McGill University Montreal, Queb ec, Canada H3A 2A7 email: fkhaled,[email protected] ABSTRACT (VBR) co ding has b een recently adopted for CDMA- based cellular and PCS systems to enhance capacity V oice activity detection (VAD) algorithms havebe- b y reducing interference. A V AD device is an indis- come an in tegral part of many of the recently stan- p ensable part of any VBR co dec as it controls b oth the dardized wireless cellular and P ersonal Communica- a verage bit rate, and the o v erall qualit y of the co der tions Systems (PCS). In this pap er, we presentacom- [3]. parative study of the p erformance of three recently pro- This pap er is organized in ve parts. Section 2 posed VAD algorithms under various acoustical back- starts with a general review of the basics of a V AD ground noise conditions. We also prop ose new ideas to design. It then describ es, in some detail, three selected enhance the p erformance of a VAD algorithm in wire- V AD designs for the this study. Section 3 rep orts the less PCS sp eech applications. results of a simulation study that has b een p erformed to study the p erformance of the three VADs under var- 1. INTRODUCTION ious background noise conditions.
    [Show full text]
  • The Sound of Silence: How Traditional and Deep Learning Based Voice Activity Detection Influences Speech Quality Monitoring
    The sound of silence: how traditional and deep learning based voice activity detection influences speech quality monitoring Rahul Jaiswal, Andrew Hines School of Computer Science, University College Dublin, Ireland [email protected], [email protected] Abstract. Real-time speech quality assessment is important for VoIP applications such as Google Hangouts, Microsoft Skype, and Apple Face- Time. Conventionally, subjective listening tests are used to quantify speech quality but are impractical for real-time monitoring scenarios. Objective speech quality assessment metrics can predict human judge- ment of perceived speech quality. Originally designed for narrow-band telephony applications, ITU-T P.563 is a single-ended or non-intrusive speech quality assessment that predicts speech quality without access to a reference signal. This paper investigates the suitability of P.563 in Voice over Internet Protocol (VoIP) scenarios and specifically the influ- ence of silences on the predicted speech quality. The performance of P.563 was evaluated using TCD-VoIP dataset, containing speech with degra- dations commonly experienced with VoIP. The predictive capability of P.563 was established by comparing with subjective listening test results. The effect of pre-processing the signal to remove silences using Voice Activity Detection (VAD) was evaluated for five acoustic feature-based VAD algorithms: energy, energy and spectral centroid, Mahalanobis dis- tance, weighted energy, weighted spectral centroid and four Deep learning model-based VAD algorithms: Deep Neural Network, Boosted Deep Neu- ral Network, Long Short-Term Memory and Adaptive context attention model. Analysis shows P.563 prediction accuracy improves for different speech conditions of VoIP when the silences were removed by a VAD.
    [Show full text]
  • Design and Implementation on DSP of the ETSI GSM Adaptive Multi-Rate Vocoder
    Design and Implementation on DSP of the ETSI GSM Adaptive Multi-Rate Vocoder by Javier Martínez Morales Supervisor: Joan Serrat Fernández Contents Chapter 1 - Introduction...................................................................................................................................5 1.1 Foreword..................................................................................................................................................5 1.2 Project objectives and methodology........................................................................................................6 1.3 Memory structure.....................................................................................................................................8 Chapter 2 - GSM Adaptive Multi-Rate Vocoder..........................................................................................10 2.1 Origins and present situation..................................................................................................................10 2.2 Adaptive Multi-Rate (AMR) coding: general description.....................................................................11 2.3 GSM AMR speech encoder...................................................................................................................13 2.3.1 Principles.......................................................................................................................................13 2.3.2 Pre-processing...............................................................................................................................18
    [Show full text]
  • Efficient Voice Activity Detection Algorithms Using Long-Term Speech
    Speech Communication 42 (2004) 271–287 www.elsevier.com/locate/specom Efficient voice activity detection algorithms using long-term speech information Javier Ramırez *, Jose C. Segura 1, Carmen Benıtez, Angel de la Torre, Antonio Rubio 2 Dpto. Electronica y Tecnologıa de Computadores, Universidad de Granada, Campus Universitario Fuentenueva, 18071 Granada, Spain Received 5 May 2003; received in revised form 8 October 2003; accepted 8 October 2003 Abstract Currently, there are technology barriers inhibiting speech processing systems working under extreme noisy condi- tions. The emerging applications of speech technology, especially in the fields of wireless communications, digital hearing aids or speech recognition, are examples of such systems and often require a noise reduction technique operating in combination with a precise voice activity detector (VAD). This paper presents a new VAD algorithm for improving speech detection robustness in noisy environments and the performance of speech recognition systems. The algorithm measures the long-term spectral divergence (LTSD) between speech and noise and formulates the speech/ non-speech decision rule by comparing the long-term spectral envelope to the average noise spectrum, thus yielding a high discriminating decision rule and minimizing the average number of decision errors. The decision threshold is adapted to the measured noise energy while a controlled hang-over is activated only when the observed signal-to-noise ratio is low. It is shown by conducting an analysis of the speech/non-speech LTSD distributions that using long-term information about speech signals is beneficial for VAD. The proposed algorithm is compared to the most commonly used VADs in the field, in terms of speech/non-speech discrimination and in terms of recognition performance when the VAD is used for an automatic speech recognition system.
    [Show full text]
  • AMR Speech Codec, Wideband; General Description (3GPP TS 26.171 Version 5.0.0 Release 5)
    ETSI TS 126 171 V5.0.0 (2001-03) Technical Specification Universal Mobile Telecommunications System (UMTS); AMR speech codec, wideband; General description (3GPP TS 26.171 version 5.0.0 Release 5) 3 GPP TS 26.171 version 5.0.0 Release 5 1 ETSI TS 126 171 V5.0.0 (2001-03) Reference DTS/TSGS-0426171v500 Keywords UMTS ETSI 650 Route des Lucioles F-06921 Sophia Antipolis Cedex - FRANCE Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16 Siret N° 348 623 562 00017 - NAF 742 C Association à but non lucratif enregistrée à la Sous-Préfecture de Grasse (06) N° 7803/88 Important notice Individual copies of the present document can be downloaded from: http://www.etsi.org The present document may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat. Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://portal.etsi.org/tb/status/status.asp If you find errors in the present document, send your comment to: [email protected] Copyright Notification No part may be reproduced except as authorized by written permission. The copyright and the foregoing restriction extend to reproduction in all media.
    [Show full text]
  • Voice Activity Detection in Non-Stationary Noise
    Voice Activity Detection in Non-stationary Noise. Sunil Srinivasa Department of Electrical Engineering, IIT Madras A Summer Project under the guidance of Dr. N. Rama Murthy Scientist ‘E’, CAIR Objectives of the project. Learning various methods for VAD proposed in literature. Implementation of several VAD algorithms by using MATLAB™ – successful detection of start and end points in the noisy speech signal. Comparison of VAD algorithms – search for the best VAD algorithm. Study of some Speech Enhancing methods. Software coding of the techniques learnt to enhance speech. Introduction to VAD. What is VAD ? VAD stands for VOICE ACTIVITY DETECTOR. It carries out the process of distinguishing between conversational speech and silence(noise) i.e. It detects the beginning and ending of talk spurts. A VAD makes use of a set of rules incorporated in the Voice Activity Detection Algorithms. Advantages of VAD. Saves the processing time in canceling noise from speech. Helps to reduce Internet Traffic. Achieves Band Width reduction, as high- frequency noisy frames are discarded. Allows for multi-data transmission. Causes silence compression – very important for both fixed and mobile telecommunication systems. Few Applications of VAD. Required in some communication applications like : Speech Recognition Speech Coding Hands-Free Telephony Echo Cancellation. It is also required in Voice over IP systems for providing Toll Grade Voice Quality. It is an integral part of Transmission systems. Desirable aspects of VAD Algorithms. A good decision rule even at low SNR. Adaptability to non-stationary background noise. Low computational complexity. Toll Quality Voice Reproduction – very important for Voice over Packet Networks. Maximum saving in Band Width.
    [Show full text]
  • AN4229 How to Implement a Vocoder Solution Using
    AN4229 Application note How to implement a vocoder solution using STM32 microcontrollers Introduction This application note describes the vocal codecs (vocoder) available for STM32 32-bit ARM® Cortex® microcontrollers. The supported codecs are G.711, G.726, IMA-ADPCM and Speex. It comes with examples of implementation on STM32F4DISCOVERY (referred as DISCOVERY board hereafter), STM322xG-EVAL and STM324xG-EVAL (referred as EVAL boards hereafter). These examples are developed in such a way that they can be easily ported to any other STM32 microcontroller and board. The vocoder firmware package is provided upon request, for more details please contact your local ST sales representative. Table 1. Applicable products Type Applicable products Microcontrollers STM32 32-bit ARM Cortex MCU May 2014 DocID024112 Rev 1 1/26 www.st.com Contents AN4229 Contents 1 Vocoder overview and comparison . 5 1.1 G.711 . 5 1.2 G.726 . 5 1.3 IMA-ADPCM . 6 1.4 Speex . 6 1.5 Vocoder comparison . 7 1.6 Vocoder performance . 8 2 Package content and description . 10 2.1 Firmware description . 10 2.2 Audio API description . 13 2.3 Player and recorder processes . 15 2.3.1 Audio player . 15 2.3.2 Audio recorder . 16 2.4 Software . 16 3 How to use the demonstration . 17 3.1 Hardware configuration . 17 3.2 Firmware configuration . 17 3.3 Main application steps . 18 3.3.1 Player menu . 19 3.3.2 Recording format menu . 20 3.3.3 Recorder menu . 21 3.4 Simple examples . 21 4 Memory footprint . 22 5 FAQ . 23 6 References .
    [Show full text]
  • Voip Gateway Solutions
    VoIP gateway solutions Product bulletin Key features Texas Instruments (TI) offers a broad family of scalable voice over IP solutions (VoIP) • Advanced DSP processing enables based on TMS320C55x™ and TMS320C64x+™ DSP cores and Telogy Software™ products. superior voice quality, functionality and scalability These single-device silicon and software solutions offer many levels of channel densities, • Highly integrated processors enable from 4 to 128 G.711 channels or from 4 to 64 channels of compressed voice with SRTP. lower system costs With this wide variety of offerings, TI has a VoIP solution for every level of product, from • Field-proven software with emphasis small business and enterprise VoIP gateways to service provider access gateway on managing voice and call quality • Most comprehensive range of features applications. Advanced DSP processing enables superior voice quality, functionality and • Largest installed base of solutions with scalability to deliver cutting-edge products with lower system costs. more than 700 million ports shipped • Industry leader in DSP TI’s full range of VoIP solutions offers TI also offers several C64x+™ DSP-based • Committed roadmap support • Code compatibility optimized solution density and faster time to solutions – all based on single-core DSP • Process technology market for manufacturers. With the largest architectures: • Production facilities installed base, TI’s VoIP offerings reduce risk • TNETV2664 • World-class technical support and offer a field-proven technology. With • TNETV2666 • Industry leader in indemnification with broad patent portfolio evaluation modules (EVMs) available for all • TNETV2686 solutions, developers are able to begin • TNETV2689 prototyping and developing new products now, saving time and money on new product Channel TMS320C55x DSP TMS320C64x+™ DSP development.
    [Show full text]
  • A Simple but Efficient Real-Time Voice Activity Detection Algorithm
    17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 A SIMPLE BUT EFFICIENT REAL-TIME VOICE ACTIVITY DETECTION ALGORITHM M. H. Moattar and M. M. Homayounpour Laboratory for Intelligent Sound and Speech Processing (LISSP), Computer Engineering and Information Technology Dept., Amirkabir University of Technology, Tehran, Iran phone: +(0098) 2164542722, email: {moattar, homayoun}@aut.ac.ir ABSTRACT Among all features, the short-term energy and zero-crossing Voice Activity Detection (VAD) is a very important front end rate have been widely used because of their simplicity. processing in all Speech and Audio processing applications. However, they easily degrade by environmental noise. To The performance of most if not all speech/audio processing cope with this problem, various kinds of robust acoustic methods is crucially dependent on the performance of Voice features, such as autocorrelation function based features [3, Activity Detection. An ideal voice activity detector needs to 4], spectrum based features [5], the power in the band- be independent from application area and noise condition limited region [1, 6, 7], Mel-frequency cepstral coefficients and have the least parameter tuning in real applications. In [4], delta line spectral frequencies [6], and features based on this paper a nearly ideal VAD algorithm is proposed which higher order statistics [8] have been proposed for VAD. Ex- is both easy-to-implement and noise robust, comparing to periments show that using multiple features leads to more some previous methods. The proposed method uses short- robustness against different environments, while maintain- term features such as Spectral Flatness (SF) and Short-term ing complexity and effectiveness.
    [Show full text]
  • ETSI TR 101 329-6 V1.1.1 (2000-07) Technical Report
    ETSI TR 101 329-6 V1.1.1 (2000-07) Technical Report Telecommunications and Internet Protocol Harmonization Over Networks (TIPHON); End to End Quality of Service in TIPHON Systems; Part 6: Actual measurements of network and terminal characteristics and performance parameters in TIPHON networks and their influence on voice quality 2 ETSI TR 101 329-6 V1.1.1 (2000-07) Reference DTR/TIPHON-05004 Keywords IP, network, performance, quality, terminal, voice ETSI 650 Route des Lucioles F-06921 Sophia Antipolis Cedex - FRANCE Tel.:+33492944200 Fax:+33493654716 Siret N° 348 623 562 00017 - NAF 742 C Association à but non lucratif enregistrée à la Sous-Préfecture de Grasse (06) N° 7803/88 Important notice Individual copies of the present document can be downloaded from: http://www.etsi.org The present document may be made available in more than one electronic version or in print. In any case of existing or perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive within ETSI Secretariat. Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at http://www.etsi.org/tb/status/ If you find errors in the present document, send your comment to: [email protected] Copyright Notification No part may be reproduced except as authorized by written permission.
    [Show full text]
  • A Novel Robust Mel-Energy Based Voice Activity Detector for Nonstationary Noise and Its Application for Speech Waveform Compression Syed, Q
    Louisiana State University LSU Digital Commons LSU Master's Theses Graduate School 2006 A Novel Robust Mel-Energy Based Voice Activity Detector for Nonstationary Noise and Its Application for Speech Waveform Compression Syed, Q. Waheeduddin Louisiana State University and Agricultural and Mechanical College, [email protected] Follow this and additional works at: https://digitalcommons.lsu.edu/gradschool_theses Part of the Electrical and Computer Engineering Commons Recommended Citation Waheeduddin, Syed, Q., "A Novel Robust Mel-Energy Based Voice Activity Detector for Nonstationary Noise and Its Application for Speech Waveform Compression" (2006). LSU Master's Theses. 1855. https://digitalcommons.lsu.edu/gradschool_theses/1855 This Thesis is brought to you for free and open access by the Graduate School at LSU Digital Commons. It has been accepted for inclusion in LSU Master's Theses by an authorized graduate school editor of LSU Digital Commons. For more information, please contact [email protected]. A NOVEL ROBUST MEL-ENERGY BASED VOICE ACTIVITY DETECTOR FOR NONSTATIONARY NOISE AND ITS APPLICATION FOR SPEECH WAVEFORM COMPRESSION A Thesis Submitted to the Graduate Faculty of the Louisiana State University and Agriculture and Mechanical College in partial fulfillment of the requirements of the degree of Master of Science in Electrical Engineering in The Department of Electrical Engineering by Waheeduddin Q. Syed B.S., Jawaharlal Technological University, 2004 December 2006 ACKNOWLEDGEMENTS I would like to begin with thanking the almighty god for making everything possible. Next, I would like to sincerely acknowledge my major professor Dr. Hsiao- Chun Wu for his guidance. This work would not have been possible without his able guidance and unwavering support for me.
    [Show full text]