MicArrayEchoCancellation Walkthrough: C++

Capturing Audio Streams with Acoustic Echo Cancellation and Beamforming

About This Walkthrough

In the Kinect™ for Windows® Software Development Kit (SDK), the MicArrayEchoCancellation sample shows how to capture an audio stream from the microphone array of the Kinect for Xbox 360® sensor by using the MSRKinectAudio Microsoft DirectX® media object (DMO) in a Microsoft DirectShow® graph. This document provides a walkthrough review of the MicArrayEchoCancellation sample.

Resources

For a complete list of documentation for the Kinect for Windows SDK Beta, plus related reference and links to the online forums, see the beta SDK website at: http://kinectforwindows.org

Contents

Introduction
Program Description
Create and Configure the MSRKinectAudio DMO
Select the Kinect Sensor's Microphone Array
Enumerate the Device Index
Determine the Device Index
Record the Captured Stream and Determine the Source Direction
Set Up the Data Buffer
Set the Output Format
Allocate Resources and the Output Buffer
Capture the Audio Stream and Determine Source Direction

License: The Kinect for Windows SDK Beta is licensed for non-commercial use only. By installing, copying, or otherwise using the beta SDK, you agree to be bound by the terms of its license. Read the license.

Disclaimer: This document is provided "as-is". Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it. This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2011 Microsoft Corporation. All rights reserved. Microsoft, DirectShow, DirectX, Kinect, MSDN, Windows, and Windows Media are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

Introduction

The audio component of the Kinect™ for Xbox 360® sensor is a four-element linear microphone array. An array provides some significant advantages over a single microphone, including more sophisticated acoustic echo cancellation and noise suppression, and the ability to determine the direction of a sound source.
The primary way for C++ applications to access the Kinect sensor's microphone array is through the MSRKinectAudio Microsoft® DirectX® media object (DMO). A DMO is a standard COM object that can be incorporated into a Microsoft DirectShow® graph or a Microsoft Media Foundation topology. The Kinect for Windows® Software Development Kit (SDK) Beta includes an extended version of the Windows microphone array DMO, referred to here as the MSRKinectAudio DMO, to support the Kinect microphone array.

The MSRKinectAudio DMO supports all the standard microphone array functionality, which includes:

  • Acoustic echo cancellation (AEC)
  • Microphone array processing (MicArray)
  • Noise suppression (NS)
  • Automatic gain control (AGC)
  • Voice activity detection (VAD)
  • Sound source localization, which identifies the direction of the source in the horizontal plane
  • Beamforming, which allows the array to function as a steerable directional microphone. The DMO supports 11 beams, with fixed directions that range from -50 to +50 degrees in 10-degree increments.

For more information on the standard microphone array, see "Microphone Array Support in Windows Vista" and "How to Build and Use Microphone Arrays for Windows Vista" on the Microsoft Developer Network (MSDN®) website.

Although the internal details for MSRKinectAudio DMO are different, you use it in much the same way as the standard microphone array DMO, with the following exceptions. The MSRKinectAudio DMO:

  • Has its own class identifier (CLSID): CLSID_CMSRKinectAudio.
  • Exposes sound source localization functionality through a new interface: ISoundSourceLocalizer.
  • Supports an additional microphone array mode, adaptive beamforming, which uses an internal source localizer to automatically determine the beam direction.

The MicArrayEchoCancellation sample shows how to capture an audio stream from the Kinect sensor's microphone array by polling the MSRKinectAudio DMO in source mode.
The application uses AEC to record a high-quality audio stream and beamforming to determine the direction to the sound source. The DMO can also be used with a Microsoft Media Foundation topology. For an example, see "MFAudioFilter Walkthrough: C++ Sample" on the beta SDK website.

Note: DirectShow is COM-based, and this document assumes that you are familiar with how to use COM objects and interfaces. You do not need to know how to implement COM objects. For the basics of how to use COM objects, see "Programming DirectX with COM" on the MSDN website. That MSDN topic is written for DirectX programmers, but the basic principles apply to all COM-based applications.

Program Description

MicArrayEchoCancellation is installed with the Kinect for Windows Software Development Kit (SDK) Beta samples in %KINECTSDK_DIR%\Samples\KinectSDKSamples.zip. MicArrayEchoCancellation is a C++ console application that is implemented in MicArrayEchoCancellation.cpp. The basic program flow is as follows:

1. Create and configure the MSRKinectAudio DMO.
2. Enumerate the available capture devices and select the Kinect sensor's microphone array.
3. Record 10 seconds of audio stream and determine the source direction as the capture process progresses.

To run MicArrayEchoCancellation, start MicArrayEchoCancellation.exe and follow the instructions in the console window.

Tip: Before attempting to capture audio from the microphone array, you must be actively streaming to the audio render device that is specified for the DMO, typically the system's speakers. Otherwise, the MSRKinectAudio DMO fails. AEC is designed to cancel interfering sounds, so there must be something to cancel. The simplest solution is to start playing a tune on Windows Media® Player before you run the application. The Libraries\Music\Sample Music folder on your Windows PC contains some sample music files.
The following is a lightly edited version of the output from a MicArrayEchoCancellation session, where the sound source moved from side to side as capture progressed:

    Start a song in Windows Media Player and then press any key to start recording
    (echo cancellation processing expects speakers to be producing sound).
    Recording using DMO
    AEC-MicArray is running ... Press "s" to stop
    Position: -0.051290 Confidence: 1.000000 Beam Angle = 0.0000000
    Sound output was written to file:
    C:\KDK\Samples\Audio\MicArrayEchoCancellation\CPP\AECout.wav

The recording process uses beamforming, which creates a single directional channel from the four microphones in 16-kHz, 16-bit mono pulse code modulation (PCM) format. The channel is oriented to one of the 11 beam directions. MicArrayEchoCancellation uses adaptive beamforming, which automatically selects the beam that is closest to the source direction.

You can use the captured stream for many purposes. MicArrayEchoCancellation simply writes the captured audio stream to AECout.wav, a .wav file that can be played with Windows Media Player.

The rest of this document is a walkthrough of the MicArrayEchoCancellation sample. It describes all the sample's functionality except for writing the capture stream to a .wav file. For details on that process, see the sample code.

Note: This document includes code excerpts, most of which have been edited for brevity and readability. In particular, most routine error-handling code has been removed. For the complete code, see the MicArrayEchoCancellation sample. Hyperlinks in this walkthrough refer to content on the MSDN website.

Create and Configure the MSRKinectAudio DMO

The application's entry point, _tmain, manages the overall program execution, with private methods handling most details. The first step is to create and configure an instance of the MSRKinectAudio DMO, as