DECT); New Generation DECT; Overview and Requirements
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Speech Coding and Compression
Introduction to voice coding/compression PCM coding Speech vocoders based on speech production LPC based vocoders Speech over packet networks Speech coding and compression Corso di Networked Multimedia Systems Master Universitario di Primo Livello in Progettazione e Gestione di Sistemi di Rete Carlo Drioli Università degli Studi di Verona Facoltà di Scienze Matematiche, Dipartimento di Informatica Fisiche e Naturali Speech coding and compression Introduction to voice coding/compression PCM coding Speech vocoders based on speech production LPC based vocoders Speech over packet networks Speech coding and compression: OUTLINE Introduction to voice coding/compression PCM coding Speech vocoders based on speech production LPC based vocoders Speech over packet networks Speech coding and compression Introduction to voice coding/compression PCM coding Speech vocoders based on speech production LPC based vocoders Speech over packet networks Introduction Approaches to voice coding/compression I Waveform coders (PCM) I Voice coders (vocoders) Quality assessment I Intelligibility I Naturalness (involves speaker identity preservation, emotion) I Subjective assessment: Listening test, Mean Opinion Score (MOS), Diagnostic acceptability measure (DAM), Diagnostic Rhyme Test (DRT) I Objective assessment: Signal to Noise Ratio (SNR), spectral distance measures, acoustic cues comparison Speech coding and compression Introduction to voice coding/compression PCM coding Speech vocoders based on speech production LPC based vocoders Speech over packet networks -
MIGRATING RADIO CALL-IN TALK SHOWS to WIDEBAND AUDIO Radio Is the Original Social Network
Established 1961 2013 WBA Broadcasters Clinic Madison, WI MIGRATING RADIO CALL-IN TALK SHOWS TO WIDEBAND AUDIO Radio is the original Social Network • Serves local or national audience • Allows real-time commentary from the masses • The telephone becomes the medium • Telephone technical factors have limited the appeal of the radio “Social Network” Telephones have changed over the years But Telephone Sound has not changed (and has gotten worse) This is very bad for Radio Why do phones sound bad? • System designed for efficiency not comfort • Sampling rate of 8kHz chosen for all calls • 4 kHz max response • Enough for intelligibility • Loses depth, nuance, personality • Listener fatigue Why do phones sound so bad ? (cont) • Low end of telephone calls have intentional high- pass filtering • Meant to avoid AC power hum pickup in phone lines • Lose 2-1/2 Octaves of speech audio on low end • Not relevant for digital Why Phones Sound bad (cont) Los Angeles Times -- January 10, 2009 Verizon Communications Inc., the second-biggest U.S. telephone company, plans to do away with traditional phone lines within seven years as it moves to carry all calls over the Internet. An Internet-based service can be maintained at a fraction of the cost of a phone network and helps Verizon offer a greater range of services, Stratton said. "We've built our business over the years with circuit-switched voice being our bread and butter...but increasingly, we are in the business of selling, basically, data connectivity," Chief Marketing Officer John Stratton said. VoIP -
Digital Speech Processing— Lecture 17
Digital Speech Processing— Lecture 17 Speech Coding Methods Based on Speech Models 1 Waveform Coding versus Block Processing • Waveform coding – sample-by-sample matching of waveforms – coding quality measured using SNR • Source modeling (block processing) – block processing of signal => vector of outputs every block – overlapped blocks Block 1 Block 2 Block 3 2 Model-Based Speech Coding • we’ve carried waveform coding based on optimizing and maximizing SNR about as far as possible – achieved bit rate reductions on the order of 4:1 (i.e., from 128 Kbps PCM to 32 Kbps ADPCM) at the same time achieving toll quality SNR for telephone-bandwidth speech • to lower bit rate further without reducing speech quality, we need to exploit features of the speech production model, including: – source modeling – spectrum modeling – use of codebook methods for coding efficiency • we also need a new way of comparing performance of different waveform and model-based coding methods – an objective measure, like SNR, isn’t an appropriate measure for model- based coders since they operate on blocks of speech and don’t follow the waveform on a sample-by-sample basis – new subjective measures need to be used that measure user-perceived quality, intelligibility, and robustness to multiple factors 3 Topics Covered in this Lecture • Enhancements for ADPCM Coders – pitch prediction – noise shaping • Analysis-by-Synthesis Speech Coders – multipulse linear prediction coder (MPLPC) – code-excited linear prediction (CELP) • Open-Loop Speech Coders – two-state excitation -
Speech Compression Using Discrete Wavelet Transform and Discrete Cosine Transform
International Journal of Engineering Research & Technology (IJERT) ISSN: 2278-0181 Vol. 1 Issue 5, July - 2012 Speech Compression Using Discrete Wavelet Transform and Discrete Cosine Transform Smita Vatsa, Dr. O. P. Sahu M. Tech (ECE Student ) Professor Department of Electronicsand Department of Electronics and Communication Engineering Communication Engineering NIT Kurukshetra, India NIT Kurukshetra, India Abstract reproduced with desired level of quality. Main approaches of speech compression used today are Aim of this paper is to explain and implement waveform coding, transform coding and parametric transform based speech compression techniques. coding. Waveform coding attempts to reproduce Transform coding is based on compressing signal input signal waveform at the output. In transform by removing redundancies present in it. Speech coding at the beginning of procedure signal is compression (coding) is a technique to transform transformed into frequency domain, afterwards speech signal into compact format such that speech only dominant spectral features of signal are signal can be transmitted and stored with reduced maintained. In parametric coding signals are bandwidth and storage space respectively represented through a small set of parameters that .Objective of speech compression is to enhance can describe it accurately. Parametric coders transmission and storage capacity. In this paper attempts to produce a signal that sounds like Discrete wavelet transform and Discrete cosine original speechwhether or not time waveform transform -
Polycom Voice
Level 1 Technical – Polycom Voice Level 1 Technical Polycom Voice Contents 1 - Glossary .......................................................................................................................... 2 2 - Polycom Voice Networks ................................................................................................. 3 Polycom UC Software ...................................................................................................... 3 Provisioning ..................................................................................................................... 3 3 - Key Features (Desktop and Conference Phones) ............................................................ 5 OpenSIP Integration ........................................................................................................ 5 Microsoft Integration ........................................................................................................ 5 Lync Qualification ............................................................................................................ 5 Better Together over Ethernet ......................................................................................... 5 BroadSoft UC-One Integration ......................................................................................... 5 Conference Link / Conference Link2 ................................................................................ 6 Polycom Desktop Connector .......................................................................................... -
Speech Coding Using Code Excited Linear Prediction
ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print) IJCST VOL . 5, Iss UE SPL - 2, JAN - MAR C H 2014 Speech Coding Using Code Excited Linear Prediction 1Hemanta Kumar Palo, 2Kailash Rout 1ITER, Siksha ‘O’ Anusandhan University, Bhubaneswar, Odisha, India 2Gandhi Institute For Technology, Bhubaneswar, Odisha, India Abstract The main problem with the speech coding system is the optimum utilization of channel bandwidth. Due to this the speech signal is coded by using as few bits as possible to get low bit-rate speech coders. As the bit rate of the coder goes low, the intelligibility, SNR and overall quality of the speech signal decreases. Hence a comparative analysis is done of two different types of speech coders in this paper for understanding the utility of these coders in various applications so as to reduce the bandwidth and by different speech coding techniques and by reducing the number of bits without any appreciable compromise on the quality of speech. Hindi language has different number of stops than English , hence the performance of the coders must be checked on different languages. The main objective of this paper is to develop speech coders capable of producing high quality speech at low data rates. The focus of this paper is the development and testing of voice coding systems which cater for the above needs. Keywords PCM, DPCM, ADPCM, LPC, CELP Fig. 1: Speech Production Process I. Introduction Speech coding or speech compression is the compact digital The two types of speech sounds are voiced and unvoiced [1]. representations of voice signals [1-3] for the purpose of efficient They produce different sounds and spectra due to their differences storage and transmission. -
NTT DOCOMO Technical Journal
3GPP EVS Codec for Unrivaled Speech Quality and Future Audio Communication over VoLTE EVS Speech Coding Standardization NTT DOCOMO has been engaged in the standardization of Research Laboratories Kimitaka Tsutsumi the 3GPP EVS codec, which is designed specifically for Kei Kikuiri VoLTE to further enhance speech quality, and has contributed to establishing a far-sighted strategy for making the EVS codec cover a variety of future communication services. Journal NTT DOCOMO has also proposed technical solutions that provide speech quality as high as FM radio broadcasts and that achieve both high coding efficiency and high audio quality not possible with any of the state-of-the-art speech codecs. The EVS codec will drive the emergence of a new style of speech communication entertainment that will combine BGM, sound effects, and voice in novel ways for mobile users. 2 Technical Band (AMR-WB)* [2] that is used in also encode music at high levels of quality 1. Introduction NTT DOCOMO’s VoLTE and that sup- and efficiency for non-real-time services, The launch of Voice over LTE (VoL- port wideband speech with a sampling 3GPP experts agreed to adopt high re- TE) services and flat-rate voice service frequency*3 of 16 kHz. In contrast, EVS quirements in the EVS codec for music has demonstrated the importance of high- has been designed to support super-wide- despite its main target of real-time com- quality telephony service to mobile users. band*4 speech with a sampling frequen- munication. Furthermore, considering In line with this trend, the 3rd Genera- cy of 32 kHz thereby achieving speech that telephony services using AMR-WB tion Partnership Project (3GPP) complet- of FM-radio quality*5. -
The Growing Importance of HD Voice in Applications the Growing Importance of HD Voice in Applications White Paper
White Paper The Growing Importance of HD Voice in Applications The Growing Importance of HD Voice in Applications White Paper Executive Summary A new excitement has entered the voice communications industry with the advent of wideband audio, commonly known as High Definition Voice (HD Voice). Although enterprises have gradually been moving to HD VoIP within their own networks, these networks have traditionally remained “islands” of HD because interoperability with other networks that also supported HD Voice has been difficult. With the introduction of HD Voice on mobile networks, which has been launched on numerous commercial mobile networks and many wireline VoIP networks worldwide, consumers can finally experience this new technology firsthand. Verizon, AT&T, T-Mobile, Deutsche Telekom, Orange and other mobile operators now offer HD Voice as a standard feature. Because mobile users tend to adopt new technology rapidly, replacing their mobile devices seemingly as fast as the newest models are released, and because landline VoIP speech is typically done via a headset, the growth of HD Voice continues to be high and in turn, the need for HD- capable applications will further accelerate. This white paper provides an introduction to HD Voice and discusses its current adoption rate and future potential, including use case examples which paint a picture that HD Voice upgrades to network infrastructure and applications will be seen as important, and perhaps as a necessity to many. 2 The Growing Importance of HD Voice in Applications White Paper Table of Contents What Is HD Voice? . 4 Where is HD Voice Being Deployed? . 4 Use Case Examples . -
An Ultra Low-Power Miniature Speech CODEC at 8 Kb/S and 16 Kb/S Robert Brennan, David Coode, Dustin Griesdorf, Todd Schneider Dspfactory Ltd
To appear in ICSPAT 2000 Proceedings, October 16-19, 2000, Dallas, TX. An Ultra Low-power Miniature Speech CODEC at 8 kb/s and 16 kb/s Robert Brennan, David Coode, Dustin Griesdorf, Todd Schneider Dspfactory Ltd. 611 Kumpf Drive, Unit 200 Waterloo, Ontario, N2V 1K8 Canada Abstract SmartCODEC platform consists of an effi- cient, block-floating point, oversampled This paper describes a CODEC implementa- Weighted OverLap-Add (WOLA) filterbank, a tion on an ultra low-power miniature applica- software-programmable dual-Harvard 16-bit tion specific signal processor (ASSP) designed DSP core, two high fidelity 14-bit A/D con- for mobile audio signal processing applica- verters, a 14-bit D/A converter and a flexible tions. The CODEC records speech to and set of peripherals [1]. The system hardware plays back from a serial flash memory at data architecture (Figure 1) was designed to enable rates of 16 and 8 kb/s, with a bandwidth of 4 memory upgrades. Removable memory cards kHz. This CODEC consumes only 1 mW in a or more power-efficient memory could be package small enough for use in a range of substituted for the serial flash memory. demanding portable applications. Results, improvements and applications are also dis- The CODEC communicates with the flash cussed. memory over an integrated SPI port that can transfer data at rates up to 80 kb/s. For this application, the port is configured to block 1. Introduction transfer frame packets every 14 ms. e Speech coding is ubiquitous. Increasing de- c i WOLA Filterbank v mands for portability and fidelity coupled with A/D e and D the desire for reduced storage and bandwidth o i l Programmable d o r u utilization have increased the demand for and t D/A DSP Core A deployment of speech CODECs. -
HD Voice – a Revolution in Voice Communication
HD Voice – a revolution in voice communication Besides data capacity and coverage, which are one of the most important factors related to customers’ satisfaction in mobile telephony nowadays, we must not forget about the intrinsic characteristic of the mobile communication – the Voice. Ever since the nineties and the introduction of GSM there have not been much improvements in the area of voice communication and quality of sound has not seen any major changes. Smart Network going forward! Mobile phones made such a progress in recent years that they have almost replaced PCs, but their basic function, voice calls, is still irreplaceable and vital in mobile communication and it has to be seamless. In order to grow our customer satisfaction and expand our service portfolio, Smart Network engineers of Telenor Serbia have enabled HD Voice by introducing new network features and transitioning voice communication to all IP network. This transition delivers crystal-clear communication between the two parties greatly enhancing customer experience during voice communication over smartphones. Enough with the yelling into smartphones! HD Voice (or High-Definition Voice) represents a significant upgrade to sound quality in mobile communications. Thanks to this feature users experience clarity, smoothly reduced background noise and a feeling that the person they are talking to is standing right next to them or of "being in the same room" with the person on the other end of the phone line. On the more technical side, “HD Voice is essentially wideband audio technology, something that long has been used for conference calling and VoIP apps. Instead of limiting a call frequency to between 300 Hz and 3.4 kHz, a wideband audio call transmits at a range of 50 Hz to 7 kHz, or higher. -
Low Bit-Rate Speech Coding with Vq-Vae and a Wavenet Decoder
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 735-739. IEEE, 2019. DOI: 10.1109/ICASSP.2019.8683277. c 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. LOW BIT-RATE SPEECH CODING WITH VQ-VAE AND A WAVENET DECODER Cristina Garbaceaˆ 1,Aaron¨ van den Oord2, Yazhe Li2, Felicia S C Lim3, Alejandro Luebs3, Oriol Vinyals2, Thomas C Walters2 1University of Michigan, Ann Arbor, USA 2DeepMind, London, UK 3Google, San Francisco, USA ABSTRACT compute the true information rate of speech to be less than In order to efficiently transmit and store speech signals, 100 bps, yet current systems typically require a rate roughly speech codecs create a minimally redundant representation two orders of magnitude higher than this to produce good of the input signal which is then decoded at the receiver quality speech, suggesting that there is significant room for with the best possible perceptual quality. In this work we improvement in speech coding. demonstrate that a neural network architecture based on VQ- The WaveNet [8] text-to-speech model shows the power VAE with a WaveNet decoder can be used to perform very of learning from raw data to generate speech. Kleijn et al. [9] low bit-rate speech coding with high reconstruction qual- use a learned WaveNet decoder to produce audio comparable ity. -
Cognitive Speech Coding Milos Cernak, Senior Member, IEEE, Afsaneh Asaei, Senior Member, IEEE, Alexandre Hyafil
1 Cognitive Speech Coding Milos Cernak, Senior Member, IEEE, Afsaneh Asaei, Senior Member, IEEE, Alexandre Hyafil Abstract—Speech coding is a field where compression ear and undergoes a highly complex transformation paradigms have not changed in the last 30 years. The before it is encoded efficiently by spikes at the auditory speech signals are most commonly encoded with com- nerve. This great efficiency in information representation pression methods that have roots in Linear Predictive has inspired speech engineers to incorporate aspects of theory dating back to the early 1940s. This paper tries to cognitive processing in when developing efficient speech bridge this influential theory with recent cognitive studies applicable in speech communication engineering. technologies. This tutorial article reviews the mechanisms of speech Speech coding is a field where research has slowed perception that lead to perceptual speech coding. Then considerably in recent years. This has occurred not it focuses on human speech communication and machine because it has achieved the ultimate in minimizing bit learning, and application of cognitive speech processing in rate for transparent speech quality, but because recent speech compression that presents a paradigm shift from improvements have been small and commercial applica- perceptual (auditory) speech processing towards cognitive tions (e.g., cell phones) have been mostly satisfactory for (auditory plus cortical) speech processing. The objective the general public, and the growth of available bandwidth of this tutorial is to provide an overview of the impact has reduced requirements to compress speech even fur- of cognitive speech processing on speech compression and discuss challenges faced in this interdisciplinary speech ther.