A Dynamic Programming Approach to Segmenting DJ Mixed Music Shows

Total Page:16

File Type:pdf, Size:1020Kb

A Dynamic Programming Approach to Segmenting DJ Mixed Music Shows A Dynamic Programming Approach to Segmenting DJ Mixed Music Shows Tim Scarfe, Wouter M. Koolen and Yuri Kalnishkan Computer Learning Research Centre and Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, United Kingdom ftim,wouter,[email protected] Abstract. In this paper we describe a method to infer time indexes of track/song transitions in recorded audio. We apply our method to segment DJ-mixed electronic dance music podcasts/radio shows. These shows are seamlessly mixed by a DJ so the track boundaries are in- tentionally obfuscated. The driving factor is that audio streams often contain little or no metadata of events to describe their contents much like a table of contents would provide on a purchased compact disc. Our technique involves transforming the podcast into a sequence of feature vectors each representing a small interval of time and defining a matrix showing the cost of a potential track starting at a given time with a given length. We consider both Fourier and time domain autocorrelation features. We use a dynamic programming approach to find the segmen- tation with lowest total. We tested the approach on a dataset of 15 shows of 2 hours each (called Global DJ Broadcast by Markus Schulz). 1 Introduction DJ mixes are an essential part of electronic dance music. Whether you listen to your favourite DJ on the radio, live or downloaded from the internet { the individual tracks will be mixed together. More recently, podcasts (through Ap- ple iTunes) have become the de facto standard empowering people to download radio shows for offline listening automatically. Users can play the mixes on their computers and/or mobile devices. Many shows are also broadcast on internet ra- dio stations such as Digitally Imported and syndicated world wide on terrestrial radio. Podcasts on iTunes already have an \event metadata" concept called chapters built in. Very few radio show producers (two that we know of) use the chapters feature in their podcasts. Adding this metadata relies on a proprietary Apple extension to their media files and the ones we looked at had highly inaccurate indexes. This is not surprising given that it is a laborious manual process to add this metadata to the files. One of the interesting features of audio is that you can not scrub through it and get a contextual picture in the same way you could with video. Audio must be assimilated and experienced in real time to understand the context. Many DJs intentionally do not provide too much information about their mixes because they want to maintain a certain mystique and have their set seen as one distinct work of art rather than a set of tracks simply mixed together. We will not speculate about the reasons for this but it is clearly beneficial to have the metadata because it allows us to skip the tracks we do not like, and become intimately familiar with the shows content and the respective authors of the tracks. To mix tracks DJs always match the speed/BPM (beats per minute) of each track during a transition and align their percussion in the time domain. To make things sound even more seamless they often also apply EQ or harmonically mix tracks to stop any dissonance between instruments in adjacent tracks. DJs will know which \key" each track is in, and have access to a chart showing which other keys are harmonically compatible. This ensures that when both tracks are in, the tracks are harmonious and each critical band of hearing in the human ear is only stimulated by one peak in the frequency domain avoiding a characteristic muddy sound. Many real instruments have a harmonic series of peaks in the frequency domain, usually with each harmonic at integer multiple of their fundamental frequency. Synthesized instruments tend to mimic this spectral pattern (which is called timbre in music). This is the reason why instruments are pleasurable for humans to listen to. The technique we used in this paper is to segment the audio file into discrete time intervals (called tiles), and perform a transformation on those tiles into a domain where two tiles from the same tracks would be distinguishable by their cosine. We extracted two core sets of features; Fourier based and autocorrelation based. Intuitively speaking the Fourier features can be thought of as useful for distinguishing instruments (after filtering to remove noise and accentuate rele- vant features) and the autocorrelation features useful for distinguishing rhythmic patterns. Our features are quite generic in the sense that we look at the spectrum directly. Most papers in the literature extract features on top of the spectrum such as centroid, and bandwidth. Most of these papers are concerned with su- pervised classification or structural analysis. We work directly on the spectrum and match like for like instruments with increased levels of clarity/specificity. We are doing direct comparisons rather than classification on new data. Because the features were noisy, we filtered them using a one dimensional convolution first-derivative Gaussian filter. In the following sections we will de- scribe; the data set (Section2), the evaluation criteria (Section3), the test set (Section4), the feature extraction (Section5), the cost matrix we generate from the features (Section6), computing the best cost segmentation from the cost matrix (Section7) and finally our methodology and results (Section8). 1.1 Related Work A similar problem is addressed using dynamic programming in [10] and [9]. The novelty of our approach is that it can be used in the unsupervised setting to find an entirely distinct sequence of clusters, it assumes one cluster will not appear again later on. All related work in this area is concerned with summarisation by which we mean inferring some kind of structure concerned with repeating elements from the audio. J. Foote et al ([11][12]) has done a significant amount of work in this area. He has made use of similarity matrices for the purpose of structural analysis. His approach is based on statistical segment clustering. A distinguishing feature of our approach is that we evaluate how well we are doing in respect of humans for the same task. We compare our error to the relative error of cue sheets created by human domain experts. Another important feature of our work is that we focus directly on DJ mixed electronic dance music. We do not know of any previous work in this area. 2 Data Set We have decided to use the radio show Global DJ Broadcast by Markus Schulz. This show goes out weekly at 5PM EST on Digitally Imported. Its contents are mainly Progressive Trance and Tech Trance in equal measures. Progressive trance is characterised as having a house riff (bass line) with melodic instrumen- tal elements over the top. Tech trance has less or no riff and a more rhythmic focus achieved with percussion and instruments and less or no melody. Both sub genres have a very distinctive sound sharing many instruments and effects, usu- ally with gratuitous Gaussian noise and reverb layered on top (or similar). The first track usually starts at around 36 seconds after an introduction. Markus usu- ally talks over the beginning of the first or second track and many more after. The music remains constant throughout after the introduction. Occasionally Markus has guest mixes during his show. The show is usually played at around 130 BPM although this not constant throughout the show and can change abruptly on a transition (more likely) or slowly over an interval of time. The shows come in 44100 samples per second, 16 bit stereo MP3 files @ 192Kbs or more recently 256Kbs. We will resample these to 4000Hz 16 bit mono (left channel) WAV files to allow us to process them faster. A cue sheet is a metadata file which describes track names and indices of a media file. Cue sheets are stored as plain text files and commonly have a \.cue" file name extension. Most popular media players support cue sheets, allowing the user to listen to the separate components of one large audio file. These shows are 2 hours long, and we have used 15 of them in our dataset. We discarded any shows that have cue sheets which completely disagree with each other or are entirely erroneous (for example the cue sheet we inspected for 17th March 2011). The average discrepancy is the total absolute difference between pairs of tracks, summed up and averaged by the number of tracks. 3 Evaluation Criteria We do not feel that it is important to get these indexes exactly correct. The cue sheet authors are using their own recordings of the radio show which may differ in alignment to others and the indexes are subject to some debate due to the relatively large length of the overlap between any pair of adjacent tracks. We feel that blatant mistakes (see Definition1) should be avoided as the primary concern although we can't easily measure them objectively. We will use visual inspection later to demonstrate the efficacy of our approach as this measure is a more practical concern. In Section4 we include a statement from one of the cue sheet authors regarding his process for finding the track indexes. Definition 1. A Blatant Mistake is qualitative term { interpreting (in effect) a transition as a distinct track, misconceptualising one distinct element of a track as being a track in its own right or making otherwise significantly inaccurate bound recommendations. We allow more tolerance for an index to start late into track B, vs early during a transition, forcing the listener to hear the end of track A before track B is in properly.
Recommended publications
  • Devicelock® DLP 8.3 User Manual
    DeviceLock® DLP 8.3 User Manual © 1996-2020 DeviceLock, Inc. All Rights Reserved. Information in this document is subject to change without notice. No part of this document may be reproduced or transmitted in any form or by any means for any purpose other than the purchaser’s personal use without the prior written permission of DeviceLock, Inc. Trademarks DeviceLock and the DeviceLock logo are registered trademarks of DeviceLock, Inc. All other product names, service marks, and trademarks mentioned herein are trademarks of their respective owners. DeviceLock DLP - User Manual Software version: 8.3 Updated: March 2020 Contents About This Manual . .8 Conventions . 8 DeviceLock Overview . .9 General Information . 9 Managed Access Control . 13 DeviceLock Service for Mac . 17 DeviceLock Content Security Server . 18 How Search Server Works . 18 ContentLock and NetworkLock . 20 ContentLock and NetworkLock Licensing . 24 Basic Security Rules . 25 Installing DeviceLock . .26 System Requirements . 26 Deploying DeviceLock Service for Windows . 30 Interactive Installation . 30 Unattended Installation . 35 Installation via Microsoft Systems Management Server . 36 Installation via DeviceLock Management Console . 36 Installation via DeviceLock Enterprise Manager . 37 Installation via Group Policy . 38 Installation via DeviceLock Enterprise Server . 44 Deploying DeviceLock Service for Mac . 45 Interactive Installation . 45 Command Line Utility . 47 Unattended Installation . 48 Installing Management Consoles . 49 Installing DeviceLock Enterprise Server . 52 Installation Steps . 52 Installing and Accessing DeviceLock WebConsole . 65 Prepare for Installation . 65 Install the DeviceLock WebConsole . 66 Access the DeviceLock WebConsole . 67 Installing DeviceLock Content Security Server . 68 Prepare to Install . 68 Start Installation . 70 Perform Configuration and Complete Installation . 71 DeviceLock Consoles and Tools .
    [Show full text]
  • CD Audio, DDP Master Or WAV Files)
    GZ Digital Media, a.s., Tovární 340, Loděnice, ZIP: 267 12, Czech Republic, Registration no.: 25685511, VAT no.: CZ25685511 Web: www.gzdm.cz, E-mail: [email protected], Tel.: +420 311 673 111 Vinyl - Technical Conditions These technical conditions describe the acceptable source data and materials, including documentation required for the vinyl record production in the company GZ Digital Media, a. s. The customer has the duty to get acquainted with them prior to placing the order. The source data not mentioned in these technical conditions, or source data, which are inconsistent with these conditions, should be consulted in advance with a pre-mastering engineer. 1 Technical Specifications Vinyl record is a carrier of mechanical analogue recording of natural sounds. It is predetermined for listening of this recording by means of amplitude linear reproduction chain at uniform revolving of vinyl record at nominal speed in the clockwise direction. Therefore the problems caused by any other use of record cannot be the subject of claim. Each side of vinyl record carries one physically continuous spiral groove, which begins at the perimeter of record and is ended on the specified diameter by ring closure into itself. Any possible requested different geometrical layout must be defined precisely as intended deviation from the standard IEC 98. Technical parameters of vinyl record must conform to the specification of the standard IEC 98:1987. If the supplied source materials do not allow to carry out the recording in compliance with the abovementioned standard or the character of supplied sound recording at standard setting-up of the recording device exceeds its limit values, the supplied source materials will be adjusted in the pre-mastering, or will be rejected as nonconforming, should it be impossible to adjust them.
    [Show full text]
  • Name Synopsis Description Options
    DDPINFO(1) ddpinfo Manual DDPINFO(1) NAME ddpinfo − display and export the content of a DDP fileset SYNOPSIS ddpinfo [options] ddpdirectory ddpinfo [--help|--version] DESCRIPTION The ddpinfo command reads DDP filesets and shows its content in human readable form, it also offers to export the DDP as cue/wav image. While mainly focussed at DDP masters describing Red Book audio CDs, other types of DDP filesets should yield some useful output as well. To properly display non-ASCII characters in CD text fields on UNIX-likesystems set your terminal encod- ing to either "UTF-8" or "IS0 8859-1" (i.e. "Latin1"). On Microsoft Windows set the font of the Command Promt windowto"Lucida Console" or "Consolas". If a UPC/EAN code is present, it’schecksum digit will be validated. OPTIONS −y,--verify Search for MD5 and CRC32 checksum files in the DDP directory and use the checksums found to verify the integrity of the DDP fileset. Multiple checksum files will be evaluated one after another. File formats known to be recognized are: md5sum, Pyramix, Sequoia, SADiE (all flavors), Sonoris, DSP Quattro, Wav e Editor.Feel free to contact the author,ifyou encounter an unsup- ported file format. −x, --add-checksum Write MD5 and CRC32 checksum files with checksums for each file which is part of the DDP file- set. If already present the checksum files will be overwritten (CHECKSUM.MD5 and CHECK- SUM.TXT respectively). −e, --expert Showthe content of a DDP fileset in a rather rawformat. Note that this is only useful, if you are familiar with the DDP specification and want to examine broken or flawed masters.
    [Show full text]
  • Starburn CD/DVD/Blu-Ray/HD-DVD Toolkit: Getting Started
    StarBurn Software Technical Reference Series StarBurn CD/DVD/Blu-Ray/HD-DVD Toolkit: Getting Started April 18, 2016 StarBurn Software www.starburnsoftware.com Copyright © Rocket Division Software 2001-2016. All rights reserved. Copyright © StarBurn Software 2009-2016. All rights reserved. StarBurn CD/DVD/Blu-Ray/HD-DVD Toolkit: Getting Started Page 1 of 13 StarBurn Software Technical Reference Series INTRODUCTION .................................................................................................. 4 KEY BENEFITS ..................................................................................................... 5 KEY FEATURES .................................................................................................... 7 SUPPORTED PLATFORMS .................................................................................. 11 SYSTEM REQUIREMENTS................................................................................... 12 CONTACTS........................................................................................................ 13 StarBurn CD/DVD/Blu-Ray/HD-DVD Toolkit: Getting Started Page 2 of 13 StarBurn Software Technical Reference Series COPYRIGHT Copyright © Rocket Division Software 2001-2016. All rights reserved. Copyright © StarBurn Software 2009-2016. All rights reserved. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written
    [Show full text]
  • Sonoris DDP Creator 4
    Sonoris DDP Creator 4 User’s Manual Copyright © Sonoris Audio Engineering © 2018 [email protected] Table of Contents Introduction ............................................................................................................................... 4 What Is It? .......................................................................................................................................................................................... 4 CD Authoring .................................................................................................................................................................................... 4 DDP Export........................................................................................................................................................................................ 4 DDP Import ....................................................................................................................................................................................... 4 DDP Creator “Pro” Version .......................................................................................................................................................... 4 DDP Creator Features ................................................................................................................. 5 DDP Creator Pro Features ........................................................................................................... 6 Basics and Terminology .............................................................................................................
    [Show full text]
  • Ted's SACD Ripping Guide
    Ted’s SACD Ripping Guide - Version 4.0 Feb 25, 2014 As we approach two and a half years of this ripping project life cycle, and one year from the last update on this guide, I realized it needed major updating. There are now over 100 DSD-capable DACs, almost a dozen DSD download sites, and over 8000 users of this guide (that I know of) So here is Version 4. First off, this primer is intended to help folks who, like myself, who want to rip their personal SACD collection for archiving and personal hi-end DAC (PCM or DSD) use. I am NOT intending this for use in pirating, selling or trading copyrighted content, or ripping content that you don’t already own. Here is hopefully everything you need for PS3 ripping, Windows or Mac extraction, or the server-mode capability of the SACD ripper (rip directly to LAN), and any necessary PCM conversion (native DSD playback is MUCH preferred but, of course, requires a DSD-capable DAC). And thanks to Jesus Rodriguez, a partner and friend (and owner of Simple Designs, which manufacturers Sonore gear and is the North American distributor for SOtM gear) I can now attest to 100 DSD-capable DACS that Jesus and I keep track of (I need to update at least 10 more DACs) over on the Google Docs DSD Database. https://docs.google.com/spreadsheet/ccc?key=0AgVhKcl_3lHfdFVyenBBNjNpQ2lieG81WGpqQTNfVUE# gid=0 However, the most important and valuable assistance that Jesus has given us is his new free ISO2DSD GUI app (in all three flavors; Windows, MAC OSX and Linux) that, among other things, deletes about a full page of this Guide .
    [Show full text]
  • The Imgburn Functions
    The ImgBurn Functions ImgBurn Support ForumImgBurn Support Forum: The ImgBurn Functions ImgBurn Support ForumJump to content Sign In » New user? Register Now! Help Search This topicSearch section: This topic This forumForumsMembersHelp FilesCalendar Advanced ImgBurn WebsiteForumsMembersCalendarImgBurn Support Forum> General> GuidesView New Content Community Calendar Page 1 of 1 You cannot start a new topicYou cannot reply to this topicThe ImgBurn Function s Everything you ever wanted to know about the program and its settings #1 Cynthia ISF Newbie Group: Beta Team Members Posts: 5,954 Joined: 10September 05 Gender:Female Location:Sweden Posted 22 March 2008 03:48 PM The ImgBurn Functions This guide explains the various functions that can be found in ImgBurn and it's structured in the same way as the ImgBurn menus. Guide Contents 1. File 1.1 Browse for a source file 1.2 Browse for a source folder 1.3 Remove all items 1.4 Browse for a destination file 1.5 Calculate 1.6 Build 1.7 Read 1.8 Write 1.9 Verify 1.10 New Project 1.11 Load Most Recent Project 1.12 Load Project 1.13 Save Project 1.14 Load Most Recent Queue 1.15 Load Queue 1.16 Save Queue 1.17 Export Graph Data 1.18 Display Graph Data 1.19 Recent Files 1.20 Exit 2. View 2.1 Disc Layout Editor 2.2 Drop Zone 2.3 Log 2.4 Queue 2.5 Refresh 3. Mode 3.1 EzMode Picker 3.2 Read 3.2.1 Options And Settings 3.2.1.1 Source 3.2.1.2 Destination 3.2.1.3 Graph Data 3.2.1.4 Settings 3.2.2 Read 3.2.2.1 Options 3.2.3 Available Guides for the 'Read' mode 3.3 Build 3.3.1 Input Modes 3.3.1.1 Standard
    [Show full text]
  • Name Synopsis Description Options Disc Description
    CUE2DDP(1) cue2ddp Manual CUE2DDP(1) NAME cue2ddp − create a DDP 2.00 master from an audio CD cue sheet SYNOPSIS cue2ddp [−ct][−m master−id] cuesheet [ddpdirectory] cue2ddp [ −−help | −−version ] DESCRIPTION The cue2ddp command acts as an audio CD image converter,parsing a cuesheet and writing a DDP 2.0 file- set to ddpdirectory.This directory must exist and files therin will be overwritten whithout further notice, if necessary.With no ddpdirectory specified, the program exits after parsing the cue sheet and printing a sum- mary of the infromation found, this can be used as syntax check. When cuesheet is set to "−" cue2ddp reads from standard input. Parsing is done with the intention to rather exit with an error than end up with a DDP that´s based on guesses about the exact meaning of the input file. Parser warnings will be isseud in non−fatal situations to give asdetailed a feedback as possible. CD text is supported with the standard TITLE, PERFORMER, and SONGWRITER cue sheet commands. In that case all CD Textfields are expected to be encoded as IS0 8859−1 (Latin1), and the language will be set to English. When using the CDTEXTFILE cue sheet command to link a binary CD Textfile cue2ddp will uses the CD Textdata exactly as found. Although not part of the DDP format cue2ddp will also add MD5 and CRC32 checksum files, which can be used to verify data integrity,e.g. after the DDP fileset has been transferred to a different location or media. OPTIONS −c, −−add−cuesheet Embedd a cue sheet within the DDP fileset.
    [Show full text]
  • Oracle® VM Virtualbox Release Notes for Release 6.1
    Oracle® VM VirtualBox Release Notes for Release 6.1 August 2021 F22856-13 Oracle Legal Notices Copyright © 2004, 2021 Oracle and/or its affiliates. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end users are "commercial computer software" or "commercial computer software documentation" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or adaptation of i) Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the license contained in the applicable contract.
    [Show full text]
  • Proposed Optical Media Imaging Workflows for Fales Library & Special Collections
    Proposed Optical Media Imaging Workflows for Fales Library & Special Collections Annie Schweikert Graduate Intern, Fall 2018 NYU Moving Image Archiving and Preservation Dependencies FTK Imager, Exact Audio Copy, IsoBuster, VLC (to play DVD-Video). Sections: Evaluating materials and selecting a workflow Imaging a DVD-Video disc with IsoBuster IsoBuster settings Extracting audio with Exact Audio Copy Imaging a data disc with FTK Imager Imaging a data disc with IsoBuster (if FTK Imager fails) Further resources Questions not answered in the workflows may be answered with the IsoBuster documentation, located online at ​https://www.isobuster.com/help/​, or Exact Audio Copy’s FAQ at http://www.exactaudiocopy.de/en/index.php/support/faq/​. Evaluating materials and selecting a workflow 1. Open IsoBuster. In the drop-down menu, select the disc drive you are using (outlined by the blue box below). Once the disc drive is selected, the disc’s directory structure will appear in the left-hand menu. a. You can navigate through the disc structure by clicking on the directories in the left menu. The contents of the active directory (the one you most recently clicked on) will appear on the right. In the below view, the directory “RINEBOLD” (in orange box on left) is active, and the right side displays the contents of that directory. 2. Use this view to inform your cataloging. a. Identify whether the disc is a CD or DVD. (​Letter A: ​This disc is a CD.) b. Identify the contents of the disc. (​Letter B: ​The disc contents are JPG files, making this CD a data disc with “computer files,” as opposed to audio or video.) c.
    [Show full text]
  • Electronic Upload of Data to FERMATA, A.S. FTP Server Data for Production
    Electronic upload of data to FERMATA, a.s. FTP server Data for production of mastering/pressing stampers must be in image. Accepted formats of those files are: CD: DDP, CMF, ISO, IMG, NRG and CUE SHEET. DVD 5, DVD 9 ROM: DDP, CMF, ISO, IMG, NRG and CUE SHEET We accept only format cue / bin for the images created with the assistance of Toast (Apple/Mac) program. DVD 9 Video: DDP, CMF Such a data file has to be completed with check sum type MD5, with which could be checked the continuity and consistency of data through transfer. It is possible to create Check sum MD5 through MD5 summer program. This program is possible to download as a freeware for example on this web page: http://www.md5summer.org/ Both those files (image and MD5) must be uploaded into one independent folder on the FTP, whose name must correspond with the title number presented on the order form of the title. For higher security we recommend that both files would be zipped in archive type RAR or ZIP. Data which will not contain the check sum, cannot be accepted for production. Their unlocking for production will be possible only in case that customer deliver this check sum additionally or if the customer confirms in written that he is aware of all risks connected with unsecured transfer of file through FTP server. Always after the finalization of data transfer on FTP server the customer must sent the message about finalization of the data transfer to the email address [email protected] .
    [Show full text]
  • Sonoris DDP Player 4
    Sonoris DDP Player 4 User’s Manual Copyright © Sonoris Audio Engineering © 2019 [email protected] 1/17 Table of Contents Introduction ............................................................................................................................... 3 What is it? .......................................................................................................................................................................................... 3 DDP Import ....................................................................................................................................................................................... 3 DDP Player Features ................................................................................................................... 4 Basics ......................................................................................................................................... 5 Audio format ..................................................................................................................................................................................... 5 Tracks and indexes ......................................................................................................................................................................... 5 ISRC (International Standard Recording Code) ................................................................................................................... 5 Enhanced CD ...................................................................................................................................................................................
    [Show full text]