|Title: ||Tempo extraction using the discrete wavelet transform|
|Authors: ||Tsang, Kei Man|
|Issue Date: ||2006 |
|Abstract: ||This thesis presents a method for extracting the tempo from an audio file. We first study the beats in the audio; the interval between two successive beats is called the inter-onset interval (IOI). To investigate the IOI, two musicians were invited to conduct experiments on the inter-onset intervals of a data set of 50 musical recordings extracted from audio CDs.
In our tempo extraction system, an audio file is read into memory and a discrete wavelet transform (DWT) is applied, decomposing the input signal into four levels of DWT coefficients. A peak detection algorithm then extracts all peaks from these coefficients, and the peaks are used to calculate the IOIs. Some peaks matter more for the IOI than others, so each IOI is assigned a weight to increase the accuracy of the system. The weight is defined according to how many of the IOI's neighbors have similar values. The weighted IOIs form a histogram, which is then smoothed with a Gaussian function to better estimate the tempo.
A stereo input is treated as three separate inputs: the left channel, the right channel, and a mono channel formed by averaging the left and right channels. All three are passed through the system, and the best of the three results is selected as the final output.
The entire system was implemented in Matlab. We tested it on our own data set of 50 musical recordings and on data from the tempo extraction contest held at the International Conference on Music Information Retrieval (ISMIR 2004). The system obtained the correct tempo for 47 of the 50 songs in our data set (94%). The contest provides two data sets for testing; our system ranks 2nd out of 12 on one and 3rd out of 12 on the other. These results show that our system is competitive with the other algorithms used in the contest.|
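The weighting, histogram, and smoothing steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the thesis used Matlab, while the sketch is in Python, and it assumes the peak times have already been produced by the DWT and peak detection stages. The function names, the neighbor rule (`radius`, `tolerance`), the bin width, and the Gaussian width are all hypothetical choices, since the abstract does not specify them.

```python
import math


def inter_onset_intervals(peak_times):
    """Differences between successive peak times, in seconds."""
    return [b - a for a, b in zip(peak_times, peak_times[1:])]


def neighbor_weights(iois, radius=2, tolerance=0.05):
    """Weight each IOI by how many IOIs within `radius` positions differ
    from it by at most `tolerance` seconds (assumed rule, not the thesis's)."""
    weights = []
    for i, ioi in enumerate(iois):
        lo, hi = max(0, i - radius), min(len(iois), i + radius + 1)
        similar = sum(
            1 for j in range(lo, hi)
            if j != i and abs(iois[j] - ioi) <= tolerance
        )
        weights.append(1 + similar)  # every IOI counts at least once
    return weights


def weighted_histogram(iois, weights, bin_width=0.01, max_ioi=2.0):
    """Accumulate the weights into fixed-width IOI bins."""
    n_bins = int(round(max_ioi / bin_width)) + 1
    hist = [0.0] * n_bins
    for ioi, w in zip(iois, weights):
        idx = int(round(ioi / bin_width))
        if 0 < idx < n_bins:  # ignore zero-length and overly long IOIs
            hist[idx] += w
    return hist


def gaussian_smooth(hist, sigma_bins=2.0):
    """Convolve the histogram with a normalized Gaussian kernel."""
    half = int(3 * sigma_bins)
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2)
              for k in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [k / total for k in kernel]
    smoothed = []
    for i in range(len(hist)):
        acc = 0.0
        for k, kv in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(hist):
                acc += kv * hist[j]
        smoothed.append(acc)
    return smoothed


def estimate_tempo(peak_times, bin_width=0.01):
    """Return a tempo in beats per minute, or None if no IOI is usable."""
    iois = inter_onset_intervals(peak_times)
    weights = neighbor_weights(iois)
    hist = gaussian_smooth(weighted_histogram(iois, weights, bin_width))
    best_bin = max(range(len(hist)), key=hist.__getitem__)
    if hist[best_bin] == 0.0:
        return None
    return 60.0 / (best_bin * bin_width)
```

For example, `estimate_tempo([0.0, 0.5, 1.0, 1.5, 1.7, 2.0, 2.5, 3.0, 3.5])` gives about 120 BPM: the spurious peak at 1.7 s produces two IOIs with no similar neighbors, so they receive the minimum weight and the 0.5 s bin dominates the smoothed histogram. For a stereo file, the same function would be run on the left, right, and averaged channels; how the thesis picks the best of the three results is not stated in the abstract.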
|Description: ||Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2006|
xx, 102 leaves : ill. ; 30 cm
HKUST Call Number: Thesis COMP 2006 Tsang
|Appears in Collections:||CSE Master Theses |
All items in this Repository are protected by copyright, with all rights reserved.