Proceedings of the 21st International Conference on Digital Audio Effects (DAFx-18), Aveiro, Portugal, September 4–8, 2018

FAST MUSIC – AN EFFICIENT IMPLEMENTATION OF THE MUSIC ALGORITHM FOR FREQUENCY ESTIMATION OF APPROXIMATELY PERIODIC SIGNALS

Orchisama Das, Jonathan S. Abel, Julius O. Smith III
Center for Computer Research in Music and Acoustics, Stanford University, Stanford, USA
[orchi|abel|jos]@ccrma.stanford.edu

ABSTRACT

Noise subspace methods are popular for estimating the parameters of complex sinusoids in the presence of uncorrelated noise and have applications in musical instrument modeling and microphone array processing. One such algorithm, MUSIC (Multiple Signal Classification), has been popular for its ability to resolve closely spaced sinusoids. However, the computational efficiency of MUSIC is relatively low, since it requires an explicit eigenvalue decomposition of an autocorrelation matrix, followed by a linear search over a large space. In this paper, we discuss methods for, and the benefits of, converting the Toeplitz structure of the autocorrelation matrix to circulant form, so that the eigenvalue decomposition can be replaced by a Fast Fourier Transform (FFT) of one row of the matrix. This transformation requires modeling the signal as at least approximately periodic over some duration. Moreover, we derive a closed-form expression for finding spectral peaks, yielding large savings in computation time. We test our algorithm to resolve closely spaced piano partials and observe that it can work with a much shorter data window than a regular FFT.

1. INTRODUCTION

Sinusoidal parameter estimation is a classical problem with applications in radar, sonar, music, and speech, among others. When the frequencies of the sinusoids are well resolved, looking for spectral peaks is adequate. It is shown in [1] that the maximum likelihood (ML) frequency estimate for a single sinusoid in Gaussian white noise is given by the frequency of the magnitude peak in the periodogram. The ML approach is extended and Cramer-Rao bounds are derived for multiple sinusoids in noise in a follow-on paper by the same authors [2]. Some other estimators are covered in [3].

For closely spaced sinusoidal frequencies, however, other approaches have been developed. Noise subspace methods are a class of sinusoidal parameter estimators that utilize the fact that the noise subspace of the measured signal is orthogonal to the signal subspace. Pisarenko Harmonic Decomposition [4] makes use of the eigenvector associated with the minimum eigenvalue of the estimated autocorrelation matrix to find frequencies. However, it has been found to exhibit relatively poor accuracy [3]. Schmidt [5] improved over Pisarenko with the MUSIC (MUltiple SIgnal Classification) algorithm, which could estimate the frequencies of multiple closely spaced signals more accurately in the presence of noise. In MUSIC, a pseudospectrum is generated by projecting a complex sinusoid onto all of the noise subspace eigenvectors, defining peaks where this projection magnitude is minimum. This method is shown to be asymptotically unbiased. An enhancement to MUSIC, root-MUSIC, was proposed in [6]. It uses the properties of the signal-space eigenvectors to define a rational spectrum with poles and zeros. It is said to have better resolution than MUSIC at low SNRs. Similarly, another popular algorithm, ESPRIT [7], was invented by Roy et al.; it makes use of the underlying rotational invariance of the signal subspace. The generalized eigenvalues of the matrix pencil formed by an auto-covariance matrix and a cross-covariance matrix give the unknown frequencies. ESPRIT performs better than MUSIC, especially when the signal is sampled nonuniformly. More recently, another enhancement to MUSIC, gold-MUSIC [8], has been proposed, which uses two stages for coarse and fine search, respectively.

One of the disadvantages of the MUSIC algorithm is its computational complexity. Typical eigenvalue decomposition algorithms are of the order O(N^3) [9]. In this paper, we use the fact that, for periodic signals, the autocorrelation matrix is circulant when it spans an integer multiple of the signal's period. In this case, looking for the eigenvalues with largest magnitude is equivalent to looking for magnitude peaks in the FFT. We know in the circulant case that our noise eigenvectors are DFT sinusoids, and hence we can derive a closed-form solution when we project our search space onto the noise subspace, thereby further reducing the calculations required to find the pseudospectrum. Replacing the eigenvalue decomposition in MUSIC with efficient Fourier-transform-based methods has been previously studied in [10], where the eigenvectors are derived as linear combinations of the data vectors while maintaining the orthonormality constraint. In this paper, we take a different approach by utilizing the circulancy of autocorrelation matrices.

We test our algorithm to resolve closely spaced partials of the A3 note played on a piano. It is a well known fact that the strings corresponding to a particular piano key are slightly mistuned. Coupled motion of piano strings has been studied in detail by Weinreich in [11, 12]. There is a slight difference in frequency between the individual strings, giving rise to closely spaced peaks in the spectra. A very long window of data is needed to resolve these peaks with the FFT. We show that FAST MUSIC can resolve two closely spaced peaks with a much shorter window of data than a regular FFT, and much faster than MUSIC.

The rest of this paper is organized as follows: Section 2 gives an outline of the MUSIC algorithm, Section 3 derives the FAST MUSIC algorithm, and Section 4 describes experimental results on a) an artificially synthesized signal containing two sinusoids in additive white noise and b) a partial of the A3 note played on the piano, which contains beating frequencies. We conclude the paper in Section 5 and delineate the scope for future work. Estimation of real-valued sine wave frequencies with MUSIC has been studied in [13]. For all derivations in this paper, we work with real signals, because we are interested in audio applications. This saves some time in computing the pseudospectrum, since we know it will be symmetric. Our derivations can be easily extended to complex signals.

2. MUSIC

2.1. Model

We wish to estimate the parameters of a signal composed of additive sinusoids from noisy observations. Let y(n) be the noisy signal, composed of a deterministic part x(n), made of r real sinusoids, and random noise w(n). We assume that w(n) ∼ N(0, σ²), and that w(n) and x(n) are uncorrelated. The sinusoidal phases φ_i are assumed to be i.i.d. and uniformly distributed, φ_i ∼ U(−π, π).

$$y(n) = \sum_{i=1}^{r} A_i \cos(\omega_i n + \phi_i) + w(n), \qquad y(n) = x(n) + w(n)$$

In vector notation, the signal y ∈ R^M can be characterized by the M × M autocorrelation matrix K_y = E(yy^T). For a zero-mean signal, the autocorrelation matrix coincides with the covariance matrix. Since this matrix is Toeplitz and symmetric positive-definite, its eigenvalues are real and nonnegative (and positive when σ > 0). We can perform an eigenvalue decomposition on this matrix to get a diagonal matrix Λ consisting of the eigenvalues, and an eigenvector matrix Q. The eigenvectors corresponding to the 2r largest eigenvalues, Q_x, contain signal-plus-noise information, whereas the remaining M − 2r eigenvectors, Q_w, only represent the noise subspace. Thus, we have the following relationships:

$$K_y = K_x + K_w = K_x + \sigma^2 I, \qquad K_y = Q \Lambda Q^H, \qquad
K_y = \begin{bmatrix} Q_x & Q_w \end{bmatrix}
\begin{bmatrix} \Lambda & 0 \\ 0 & \sigma^2 I_{M-2r} \end{bmatrix}
\begin{bmatrix} Q_x^H \\ Q_w^H \end{bmatrix} \qquad (1)$$

2.2. Pseudospectrum Estimation

Let a vector of M harmonic frequencies be denoted as b(ω) = [1, e^{jω}, e^{2jω}, ..., e^{(M−1)jω}]^T. We project this vector onto Q_w, i.e., the subspace occupied by the noise (where there is no signal component). We find the following spectrum as a function of a set of ω's:

$$P(\omega) = \frac{1}{b(\omega)^H Q_w Q_w^H b(\omega)} = \frac{1}{\left\| Q_w^H b(\omega) \right\|^2} \qquad (2)$$

For a particular value of ω that is actually present in the signal, the sum of projections of b onto the eigenvectors spanning the noise subspace will be zero. This is because the subspace occupied by the signal is orthogonal to that occupied by the noise, since they are uncorrelated. Thus, we see that P(ω) will take on a very high value in such cases. In conclusion, we can find peaks in the function P(ω), and those will correspond to our estimated frequencies. Since the search space can consist of any number of densely packed frequencies, very closely spaced peaks can show up in the pseudospectrum, which a simple FFT might not be able to resolve in the presence of noise. However, as the search space grows, so does the computational complexity.
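As a concrete illustration of Eqs. (1)-(2), the following is a minimal NumPy sketch of the MUSIC pseudospectrum. This is not the authors' MATLAB implementation; the function name, the biased sample-autocorrelation estimate, and the uniform search grid are illustrative choices.

```python
import numpy as np

def music_pseudospectrum(y, M, r, n_grid=4096):
    """MUSIC pseudospectrum P(w) of Eq. (2) for a real signal y.

    M      : order of the autocorrelation matrix
    r      : number of real sinusoids (signal subspace dimension is 2r)
    n_grid : number of search frequencies covering [-pi, pi)
    """
    y = np.asarray(y, dtype=float)
    N = len(y)
    # biased sample autocorrelation r(k), k = 0..M-1, and its Toeplitz matrix K_y
    rk = np.array([np.dot(y[:N - k], y[k:]) / N for k in range(M)])
    Ky = rk[np.abs(np.subtract.outer(np.arange(M), np.arange(M)))]
    # eigendecomposition; eigh returns eigenvalues in ascending order
    _, Q = np.linalg.eigh(Ky)
    Qw = Q[:, :M - 2 * r]            # noise subspace: the M-2r smallest eigenvalues
    # P(w) = 1 / ||Qw^H b(w)||^2 evaluated on a uniform grid
    omega = 2 * np.pi * (np.arange(n_grid) - n_grid // 2) / n_grid
    B = np.exp(1j * np.outer(np.arange(M), omega))   # columns are b(w)
    P = 1.0 / np.sum(np.abs(Qw.conj().T @ B) ** 2, axis=0)
    return omega, P
```

Peaks of P over the grid then give the frequency estimates; the dense grid search, together with the eigenvalue decomposition, is what makes plain MUSIC expensive.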
3. FAST MUSIC

Under conditions to be specified, it is possible to replace the eigenvalue decomposition required in MUSIC by a Fast Fourier Transform (FFT). In this section, we derive the structure and order of the autocorrelation matrix for which it is circulant instead of only Toeplitz. We also derive a closed-form expression for finding the pseudospectrum.

3.1. Deriving the autocorrelation matrix

In vector form, y ∈ R^M can be written as:

$$y = Sa + w, \qquad w \sim N(0, \sigma^2 I) \qquad (3)$$

$$\begin{bmatrix} y(n) \\ y(n-1) \\ \vdots \\ y(n-M+1) \end{bmatrix}
= \begin{bmatrix}
1 & 0 & \cdots \\
\cos(\omega_1) & \sin(\omega_1) & \cdots \\
\vdots & \vdots & \ddots \\
\cos[(M-1)\omega_1] & \sin[(M-1)\omega_1] & \cdots
\end{bmatrix}
\begin{bmatrix} A_1 \cos(\omega_1 n + \phi_1) \\ A_1 \sin(\omega_1 n + \phi_1) \\ \vdots \\ A_r \sin(\omega_r n + \phi_r) \end{bmatrix}
+ \begin{bmatrix} w(n) \\ w(n-1) \\ \vdots \\ w(n-M+1) \end{bmatrix} \qquad (4)$$

Since y is zero-mean, its covariance matrix is

$$K_y = E(yy^T) = S K_a S^T + \sigma^2 I. \qquad (5)$$

We now want to get K_y in terms of K_a. We have assumed φ_i ∼ U(−π, π) (uniformly and identically distributed random phases). We observe that every term of K_a is of the form K_a(i, j) = E[A_i cos(ω_i n + φ_i) A_j cos(ω_j n + φ_j)], or E[A_i sin(ω_i n + φ_i) A_j sin(ω_j n + φ_j)], or E[A_i sin(ω_i n + φ_i) A_j cos(ω_j n + φ_j)], or E[A_i cos(ω_i n + φ_i) A_j sin(ω_j n + φ_j)]. All of these terms are zero except the first two when i = j, i.e., E[A_i² cos²(ω_i n + φ_i)] = E[A_i² sin²(ω_i n + φ_i)] = A_i²/2, which makes K_a a diagonal matrix:

$$K_a = \begin{bmatrix}
\frac{A_1^2}{2} & 0 & \cdots & 0 \\
0 & \frac{A_1^2}{2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \frac{A_r^2}{2}
\end{bmatrix} \qquad (6)$$

The autocorrelation matrix of the observed signal is given in (7).


 2 2 2 2  P Ai 2 P Ai P Ai P Ai i 2 + σ i 2 cos ωi i 2 cos 2ωi ··· i 2 cos (M − 1)ωi 2 2 2 2  P Ai P Ai 2 P Ai P Ai   cos ωi + σ cos ωi ··· cos (M − 2)ωi T 2  i 2 i 2 i 2 i 2  SKaS + σ I =  . . . .  (7)  ......   2 2 2  P Ai P Ai P Ai 2 i 2 cos (M − 1)ωi i 2 cos (M − 2)ωi ······ i 2 + σ

If we choose M carefully, then the autocorrelation matrix may be written as:

$$S K_a S^T + \sigma^2 I = \begin{bmatrix}
\sum_i \frac{A_i^2}{2} + \sigma^2 & \sum_i \frac{A_i^2}{2}\cos\omega_i & \cdots & \sum_i \frac{A_i^2}{2}\cos\omega_i \\
\sum_i \frac{A_i^2}{2}\cos\omega_i & \sum_i \frac{A_i^2}{2} + \sigma^2 & \cdots & \sum_i \frac{A_i^2}{2}\cos 2\omega_i \\
\vdots & \vdots & \ddots & \vdots \\
\sum_i \frac{A_i^2}{2}\cos\omega_i & \sum_i \frac{A_i^2}{2}\cos 2\omega_i & \cdots & \sum_i \frac{A_i^2}{2} + \sigma^2
\end{bmatrix} \qquad (9)$$

This matrix is circulant! Hence, its eigenvectors are given by the DFT sinusoids and its eigenvalues are the DFT coefficients of the first row [14]. The eigenvalues can be computed using an FFT algorithm when M is a power of 2 or highly composite. The relationship between the eigenvalues of Toeplitz matrices and those of asymptotically equivalent circulant matrices has been studied in [15].

For example, if the signal consists of 3 sinusoids with frequencies π/2, π/4 and π/5, then the minimum order M which will make the autocorrelation matrix circulant is given by 2 × LCM(2, 4, 5) = 40. However, if any of the denominators is irrational, then the LCM does not exist, and hence no value of M will make the autocorrelation matrix circulant. Of course, in reality we do not know the frequencies and cannot determine M this way. However, we can instead detect when the autocorrelation corresponds to a signal that is periodic. If we can find the periodicity of the estimated autocorrelation function and set M to be that period, then the resulting autocorrelation matrix will be circulant. Since no signal is truly precisely periodic, this procedure can be viewed as introducing an approximation based on assuming the signal is periodic. Such a periodic/harmonic approximation is common when the underlying signal source is known to be a quasi-periodic oscillator, as in voiced speech, bowed strings, woodwinds, flutes, brasses, organs, and so on. In this paper, we use the Average Magnitude Difference Function (AMDF) [16] to detect the period. We find all local minima in the AMDF and pick the period as the index of the lowest minimum, i.e., the local minimum that is smaller than its adjacent neighbors.
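A minimal sketch of such an AMDF-based period detector is given below. The function name and the rule for selecting among the local minima are our reading of the description above, not the authors' code:

```python
import numpy as np

def amdf_period(x, max_lag):
    """Estimate the period of x (in samples) from its Average Magnitude
    Difference Function [16]: compute the AMDF over lags 1..max_lag and
    return the lag of the deepest local minimum."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    amdf = np.array([np.mean(np.abs(x[:N - tau] - x[tau:]))
                     for tau in range(1, max_lag + 1)])
    # local minima: strictly smaller than both neighbours
    idx = np.arange(1, len(amdf) - 1)
    minima = idx[(amdf[idx] < amdf[idx - 1]) & (amdf[idx] < amdf[idx + 1])]
    if len(minima) == 0:
        return None
    return int(minima[np.argmin(amdf[minima])]) + 1   # +1 maps array index back to lag

# example: two sinusoids with periods 10 and 8 samples -> common period 40
n = np.arange(400)
x = np.cos(2 * np.pi * n / 10) + 0.5 * np.cos(2 * np.pi * n / 8)
print(amdf_period(x, max_lag=100))                     # 40 (or an integer multiple of it)
```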
3.3. Searching over a large range of frequencies

Suppose we want to calculate the pseudospectrum for N distinct frequencies ω_k = 2πk/N for k = −N/2, ..., N/2 − 1, covering the range [−π, π). In that case, b(k) = [1, e^{2πjk/N}, e^{4πjk/N}, ..., e^{2π(M−1)jk/N}]^T. Since the eigenvectors spanning the noise subspace are also complex exponentials, the projection function can be simplified. The matrix Q_w is composed of columns of noise eigenvectors, such that each column is denoted by q_{w,n_i} = (1/√M) [1, e^{2πjn_i/M}, e^{4πjn_i/M}, ..., e^{2π(M−1)jn_i/M}]^T, where the n_i are the indices of the noise eigenvectors for i = 1, ..., L and L = M − 2r.

$$Q_w^H b(k) = \frac{1}{\sqrt{M}}
\begin{bmatrix}
\sum_{m=0}^{M-1} \exp\left[2\pi j m \left(\frac{k}{N} - \frac{n_1}{M}\right)\right] \\
\vdots \\
\sum_{m=0}^{M-1} \exp\left[2\pi j m \left(\frac{k}{N} - \frac{n_L}{M}\right)\right]
\end{bmatrix}
= \frac{1}{\sqrt{M}}
\begin{bmatrix}
e^{-\pi j \left(\frac{k}{N} - \frac{n_1}{M}\right)(M-1)} \, \dfrac{\sin\left[\pi\left(\frac{k}{N} - \frac{n_1}{M}\right)M\right]}{\sin\left[\pi\left(\frac{k}{N} - \frac{n_1}{M}\right)\right]} \\
\vdots \\
e^{-\pi j \left(\frac{k}{N} - \frac{n_L}{M}\right)(M-1)} \, \dfrac{\sin\left[\pi\left(\frac{k}{N} - \frac{n_L}{M}\right)M\right]}{\sin\left[\pi\left(\frac{k}{N} - \frac{n_L}{M}\right)\right]}
\end{bmatrix} \qquad (10)$$

From the pseudospectrum estimate, it can be seen that, whenever k/N = n_i/M, one term in the denominator sum dominates, and we can evaluate P(k) using L'Hospital's rule:

$$P(k) = \frac{M}{\displaystyle\sum_{i=1}^{M-2r} \left( \frac{\sin\left[\pi\left(\frac{k}{N} - \frac{n_i}{M}\right)M\right]}{\sin\left[\pi\left(\frac{k}{N} - \frac{n_i}{M}\right)\right]} \right)^2}
\;\approx\; \frac{M}{\displaystyle\sum_{i=1}^{M-2r} M^2}
\;=\; \frac{1}{M(M-2r)} \qquad \text{if } \frac{k}{N} = \frac{n_i}{M} \qquad (11)$$
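Putting Sections 3.2 and 3.3 together, a compact sketch of the resulting estimator could look as follows. The function name, the biased autocorrelation estimate, and the guard at the removable singularity of Eq. (11) are our own choices; the authors' MATLAB implementation may differ in these details:

```python
import numpy as np

def fast_music(y, M, r, N_search=4096):
    """FAST MUSIC pseudospectrum sketch.

    M        : autocorrelation order, set to (a multiple of) the detected period
               so that the autocorrelation matrix is approximately circulant
    r        : number of real sinusoids (2r signal eigenvalues)
    N_search : number of search frequencies k/N covering [0, 1) cycles/sample
    """
    y = np.asarray(y, dtype=float)
    Ny = len(y)
    # first row of the (circulant) autocorrelation matrix, biased estimate
    row = np.array([np.dot(y[:Ny - k], y[k:]) / Ny for k in range(M)])
    # eigenvalues of a circulant matrix are the DFT of its first row;
    # its eigenvectors are the DFT sinusoids, so no eigendecomposition is needed
    lam = np.real(np.fft.fft(row))
    noise_idx = np.argsort(lam)[:M - 2 * r]          # bins of the M-2r smallest eigenvalues
    # closed-form pseudospectrum of Eq. (11) on the grid k/N
    k = np.arange(N_search)
    x = k[:, None] / N_search - noise_idx[None, :] / M       # k/N - n_i/M
    num, den = np.sin(np.pi * x * M), np.sin(np.pi * x)
    small = np.abs(den) < 1e-12
    ratio = np.where(small, float(M), num / np.where(small, 1.0, den))  # L'Hospital limit
    P = M / np.sum(ratio ** 2, axis=1)
    freqs = 2 * np.pi * k / N_search                          # rad/sample
    return freqs, P
```

For the real signals considered here the pseudospectrum is symmetric, so only the half-range [0, pi) needs to be scanned for peaks.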


4. EXPERIMENTS AND RESULTS

4.1. Synthesized signal

To compare FAST MUSIC with MUSIC, we tested a cosine signal composed of frequencies 0.04 rad and 0.05 rad at f_s = 1 Hz:

$$y(n) = \cos(0.05n) + 0.5\cos(0.04n + \phi) + w(n) \qquad (12)$$

To detect the periodicity of the autocorrelation function, we need at least two periods of the signal. For closely spaced sinusoids, the period is likely a large number, and hence more samples of the signal are needed to estimate the autocorrelation function. In this example, the length of the signal is 2000. The pseudospectra of MUSIC and FAST MUSIC are given in Figure 1.

Figure 1: Pseudospectrum of a signal with frequencies 0.04 rad and 0.05 rad. (a) MUSIC. (b) FAST MUSIC.

To measure computation time, we compared various eigenvalue decomposition algorithms with Fast Fourier Transform algorithms for increasing orders of the autocorrelation matrix. The results can be seen in Figure 2. QR factorization is used to find the eigenvalues and eigenvectors. QR factorization with Gram-Schmidt orthogonalization is slow, symmetric tridiagonal QR with an implicit Wilkinson shift is slightly faster, whereas reduction to Hessenberg form is fastest. More details about these algorithms can be found in [9]. The Fourier transform algorithms are orders of magnitude faster, with the DFT dominating at lower orders and the self-sorting mixed-radix FFT [17] and resampled split-radix FFT [18] giving faster speeds at higher orders. It is to be noted that the order of the autocorrelation matrix for which circulancy is achieved is not likely to be a power of 2, and hence we cannot use the well-known radix-2 FFT algorithm. However, we can first resample the data to a power of 2 [19] and then apply a radix-2/split-radix FFT algorithm, or use a mixed-radix FFT algorithm on any composite order. All of these functions have been implemented by the authors in MATLAB. More efficient implementations can be done in C, where the FFT functions should overtake the DFT at much lower orders.

Figure 2: Order of matrix vs. time in seconds (log). Legend: MUSIC basic QR, MUSIC Hessenberg QR, MUSIC implicit QR; FAST MUSIC mixed-radix FFT, FAST MUSIC resampled split-radix FFT, FAST MUSIC DFT.
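A rough benchmark in the spirit of Figure 2 can be run with standard library routines. The sketch below compares NumPy's symmetric eigensolver with its FFT rather than the authors' MATLAB implementations of QR and mixed/split-radix FFTs, so the absolute numbers differ, but the O(M^3) vs. O(M log M) trend is the same:

```python
import numpy as np
from time import perf_counter

rng = np.random.default_rng(0)
for M in (256, 512, 1024, 2048):
    row = rng.standard_normal(M)                                     # test first row
    K = row[np.abs(np.subtract.outer(np.arange(M), np.arange(M)))]   # symmetric Toeplitz matrix

    t0 = perf_counter()
    np.linalg.eigh(K)                                # full eigendecomposition, O(M^3)
    t_eig = perf_counter() - t0

    t0 = perf_counter()
    np.fft.fft(row)                                  # FFT of the first row, O(M log M)
    t_fft = perf_counter() - t0

    print(f"M = {M:5d}   eig: {t_eig:.4f} s   fft: {t_fft:.6f} s")
```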

Figure 3: MSE vs. SNR plots for MUSIC, FAST MUSIC, QIFFT, and the Cramer-Rao bound (CRB). (a) Frequency = 0.04 rad. (b) Frequency = 0.05 rad.

Figure 4: Spectrogram of the piano note.

We also conducted 1000 Monte-Carlo simulations on the above example, with uniformly distributed random phase, and plotted the mean squared errors vs. SNR for MUSIC, FAST MUSIC and QIFFT [20], along with the Cramer-Rao bounds (CRB) as given in [21]. The plots can be seen in Figure 3. The performances are nearly equivalent in Figure 3a, with QIFFT performing the best and FAST MUSIC performing the worst, but FAST MUSIC performs significantly better than the others in Figure 3b for high SNRs. FAST MUSIC performs poorly for SNRs below 20 dB.

4.2. Piano data

We tested our algorithm on the A3 note played on the piano. The spectrogram of the steady-state portion of the note is given in Figure 4. We can observe beating in some of the partials. We decided to work with the 11th partial, located close to 2600 Hz, where a beating of roughly 1 Hz is observed. We bandpass-filtered the signal using a 4th-order Butterworth filter with cut-off frequencies at 2400 Hz and 2900 Hz. We ran FFTs with the Blackman window and FAST MUSIC on different data lengths, as shown in Figure 5, where the vertical lines indicate the frequencies detected by FAST MUSIC. We see that for window sizes 2^14 and 2^15, the FFT magnitude does not exhibit two separately discernible peaks at all, whereas FAST MUSIC provides two peak frequencies with some error. For larger window sizes, both the FFT and FAST MUSIC are able to resolve the two peaks with greater accuracy. FAST MUSIC may be preferred over the FFT for higher resolution when the available data size is limited. One potential application is in piano tuning, where FAST MUSIC could be used to quickly resolve closely spaced peaks caused by the coupled motion of the piano strings.
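The preprocessing described above could be reproduced along the following lines. This is a sketch: the file name piano_a3.wav, the assumption of a mono recording, and the zero-phase filtering via filtfilt are ours, while the filter order, band edges, and window follow the text:

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, filtfilt

fs, y = wavfile.read("piano_a3.wav")        # hypothetical mono recording of the A3 note
y = y.astype(float)

# 4th-order Butterworth bandpass, 2400-2900 Hz, applied forward and backward
b, a = butter(4, [2400, 2900], btype="bandpass", fs=fs)
y_bp = filtfilt(b, a, y)

# Blackman-windowed FFT magnitude of a 2^16-sample segment around the 11th partial
L = 2 ** 16
seg = y_bp[:L] * np.blackman(L)
mag_db = 20 * np.log10(np.abs(np.fft.rfft(seg)) + 1e-12)
freqs = np.fft.rfftfreq(L, d=1 / fs)
```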


Figure 5: FFT magnitude plots and FAST MUSIC frequency estimates (vertical lines). Panels (a)-(e): data lengths 2^14, 2^15, 2^16, 2^17, 2^18.

5. DISCUSSION AND FUTURE WORK

In this paper, we have proposed a computationally efficient implementation of the MUSIC algorithm. The autocorrelation matrix was derived and approximated by a circulant matrix. This approximation allowed us to replace computationally intensive eigenvalue decomposition algorithms with an FFT. We subsequently derived a closed-form expression for searching over a range of frequencies. These modifications yielded a significant improvement in computational speed.

A key factor in the accuracy of FAST MUSIC is the precision of the periodicity detection. If the detected period is off by a significant number of samples, the autocorrelation matrix is no longer circulant and FAST MUSIC falls apart. The AMDF-based periodicity detector is time-consuming, not foolproof, and often yields wrong results if the number of lags in the autocorrelation function is very high. Ideally, a better method for periodicity detection should be used. However, the aim of this paper is not to come up with a novel solution for detecting the period of a signal, so we leave this problem open for future work.

Another issue is finding the number of sinusoids present in a given signal, if it is not known a priori. To do so, one can look at the relative magnitudes of the eigenvalues (FFT peak values in our case). This works well if the signal-to-noise ratio is high. More robust partitioning schemes have been used in [8]. Once the signal frequencies are known, the estimation of amplitudes is trivial and can be done using linear least squares.
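A sketch of that least-squares amplitude (and phase) estimation step, assuming the frequency estimates are already available (the function name and interface are illustrative):

```python
import numpy as np

def estimate_amplitudes(y, omegas):
    """Given a real signal y and estimated frequencies omegas (rad/sample), fit
    y(n) ~ sum_i [a_i cos(w_i n) + b_i sin(w_i n)] by linear least squares and
    return the amplitudes A_i and phases phi_i of A_i cos(w_i n + phi_i)."""
    y = np.asarray(y, dtype=float)
    n = np.arange(len(y))
    # design matrix with one cosine and one sine column per frequency
    G = np.column_stack([f(w * n) for w in omegas for f in (np.cos, np.sin)])
    coeffs, *_ = np.linalg.lstsq(G, y, rcond=None)
    a, b = coeffs[0::2], coeffs[1::2]
    return np.hypot(a, b), np.arctan2(-b, a)
```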
We found that the estimator variances for both FAST MUSIC and the QIFFT method were dominated by bias at high SNRs. Bias can be reduced arbitrarily in the QIFFT method by increasing the amount of zero-padding, as well as by other methods [22], and we expect it to reduce similarly in FAST MUSIC when the upsampling factor is increased to higher powers of 2 and when the number of points in the search space is increased. Future work should reduce or eliminate the bias so that the relative performance can be observed at high SNRs.

Note that both FAST MUSIC and QIFFT are effectively FFT-magnitude-based methods. They differ only in how they extract the peak frequency from the FFT magnitude, and there are slight differences in the FFT magnitude computation itself (equivalently, slightly different definitions of the sample autocorrelation). Both methods can be regarded as approximate maximum-likelihood methods, so the question becomes: which method gives the best average accuracy for a given computational complexity? This remains a topic for future work.

We would also like to add a third method for comparison, a true maximum-likelihood estimate, obtainable by finding the optimal window-transform overlap that explains the observed spectrum under the hypothesis of two sinusoidal peaks being present. Such a method is of course much more expensive than either of the FFT-based methods considered here, but it should represent the highest accuracy obtainable under our signal modeling assumptions. In the end, we would like a well filled out family of approximate maximum-likelihood estimators for closely spaced sinusoidal peaks, providing a range of ways of trading off estimation accuracy versus computational expense.

6. REFERENCES

[1] David C. Rife and Robert R. Boorstyn, "Single-tone parameter estimation from discrete-time observations," IEEE Transactions on Information Theory, vol. 20, no. 5, pp. 591-598, 1974.
[2] David C. Rife and Robert R. Boorstyn, "Multiple tone parameter estimation from discrete-time observations," Bell Labs Technical Journal, vol. 55, no. 9, pp. 1389-1410, 1976.
[3] Steven M. Kay, Modern Spectral Estimation, chapter 13, Prentice Hall, 1988.
[4] Vladilen F. Pisarenko, "The retrieval of harmonics from a covariance function," Geophysical Journal International, vol. 33, no. 3, pp. 347-366, 1973.
[5] Ralph Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Transactions on Antennas and Propagation, vol. 34, no. 3, pp. 276-280, 1986.
[6] Arthur Barabell, "Improving the resolution performance of eigenstructure-based direction-finding algorithms," in Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '83, IEEE, 1983, vol. 8, pp. 336-339.
[7] Richard Roy and Thomas Kailath, "ESPRIT - estimation of signal parameters via rotational invariance techniques," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 984-995, 1989.
[8] Kaluri V. Rangarao and Shridhar Venkatanarasimhan, "gold-MUSIC: A variation on MUSIC to accurately determine peaks of the spectrum," IEEE Transactions on Antennas and Propagation, vol. 61, no. 4, pp. 2263-2268, 2013.
[9] Peter Arbenz, "Lecture notes on solving large scale eigenvalue problems," chapter 4, ETH Zurich.
[10] Juha T. Karhunen and Jyrki Joutsensalo, "Sinusoidal frequency estimation by signal subspace approximation," IEEE Transactions on Signal Processing, vol. 40, no. 12, pp. 2961-2972, 1992.
[11] Gabriel Weinreich, "Coupled piano strings," The Journal of the Acoustical Society of America, vol. 62, no. 6, pp. 1474-1484, 1977.
[12] Gabriel Weinreich, "The coupled motions of piano strings," Scientific American, vol. 240, no. 1, pp. 118-127, 1979.
[13] Petre Stoica and Anders Eriksson, "MUSIC estimation of real-valued sine-wave frequencies," Signal Processing, vol. 42, no. 2, pp. 139-146, 1995.
[14] Robert M. Gray, "Toeplitz and circulant matrices: A review," Foundations and Trends in Communications and Information Theory, vol. 2, no. 3, pp. 155-239, 2006.
[15] Robert Gray, "On the asymptotic eigenvalue distribution of Toeplitz matrices," IEEE Transactions on Information Theory, vol. 18, no. 6, pp. 725-730, 1972.
[16] Myron Ross, Harry Shaffer, Andrew Cohen, Richard Freudberg, and Harold Manley, "Average magnitude difference function pitch extractor," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 22, no. 5, pp. 353-362, 1974.
[17] Clive Temperton, "Self-sorting mixed-radix fast Fourier transforms," Journal of Computational Physics, vol. 52, no. 1, pp. 1-23, 1983.
[18] Pierre Duhamel, "Implementation of split-radix FFT algorithms for complex, real, and real-symmetric data," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, pp. 285-295, 1986.
[19] Julius O. Smith, "Digital Audio Resampling Home Page," Tech. Rep., Center for Computer Research in Music and Acoustics (CCRMA), Stanford University, 2002.


[20] Mototsugu Abe and Julius O. Smith, "Design criteria for simple sinusoidal parameter estimation based on quadratic interpolation of FFT magnitude peaks," in Audio Engineering Society Conventions and Conferences (AES '04), vol. 117, 2004.
[21] Petre Stoica and Arye Nehorai, "MUSIC, maximum likelihood, and Cramer-Rao bound," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 5, pp. 720-741, 1989.
[22] Kurt James Werner and François Georges Germain, "Sinusoidal parameter estimation using quadratic interpolation around power-scaled magnitude spectrum peaks," Applied Sciences, vol. 6, no. 10, p. 306, 2016.
