Query by Humming - in Action with its Technology Revealed


Yunyue Zhu, Dennis Shasha, and Xiaojian Zhao



Abstract

The user's humming is transcribed to a sequence of discrete notes, and contour information is extracted from the notes. This contour information is represented by a few letters. For example, the letters "U", "D", and "S" indicate that a note is above, below, or the same as the previous one. The tunes in the databases are also represented by contour information. The edit distance can be used to measure the similarity between two melodies. Unfortunately, it is very hard to segment a user's humming into discrete notes. Some recent work proposes to match the query directly from audio, using dynamic time warping (DTW) to match the hum-query with the melodies in the music databases. But this quality improvement comes at a price, because a brute-force search using DTW is very slow. The database community has been researching similarity queries over time series databases for many years. The techniques developed in that area might shed light on the query by humming problem. In this demo, we treat both the melodies in the music databases and the user's humming input as time series. Such an approach allows us to integrate many database indexing techniques into a query by humming system, improving the quality of such a system over the traditional (contour) string database approach. We design special searching techniques that are invariant to shifting, time scaling, and local time warping. This makes the system robust and allows more flexible user humming input. A SIGMOD participant will hum the query melody using a PC microphone. The user will see the humming displayed as a time series of pitches within 1 second. The user can listen to a playback of the humming when the database query is executed. The top K matches of the user's hum-query will be returned. The user can listen to the results and see whether they include the target tune. The time series of the user query and the result will be displayed together, as in Figure 1.
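The contour approach described at the start of the abstract can be illustrated with a minimal sketch. The pitch values (MIDI note numbers), the function names `contour` and `edit_distance`, and the unit-cost Levenshtein distance are assumptions chosen for illustration; the abstract does not specify which edit-distance variant or note encoding the contour-based systems use.

```python
def contour(notes):
    """Map a sequence of note pitches to a U/D/S contour string.

    Each letter compares a note with the one before it:
    "U" = above, "D" = below, "S" = same.
    """
    letters = []
    for prev, cur in zip(notes, notes[1:]):
        if cur > prev:
            letters.append("U")
        elif cur < prev:
            letters.append("D")
        else:
            letters.append("S")
    return "".join(letters)


def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two strings (assumed variant)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = cur
    return prev[-1]


# Hypothetical data: a stored tune and a hummed query that is transposed
# up two semitones and has one wrong note.
tune = [60, 60, 67, 67, 69, 69, 67]
hum = [62, 62, 69, 69, 71, 70, 69]

print(contour(tune))  # SUSUSD
print(contour(hum))   # SUSUDD
print(edit_distance(contour(tune), contour(hum)))  # 1
```

Because the contour keeps only the direction of each pitch change, the transposition does not affect the distance; only the wrong note does. The sketch takes the note sequence as given, which is exactly the segmentation step the abstract calls very hard.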
The concept of Local Dynamic Time Warping distance becomes apparent from a glance at Figure 1. The user might also find that some other tunes in the results sound similar to the target tune. If the user doesn't get the desired query result, the query can be tried again. As an option, the user can also change the warping width for the DTW distance and repeat the query to improve recall.
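The following is a minimal sketch of a DTW distance with an adjustable warping width, together with simple shift and time-scaling normalizations. It is not the authors' implementation: the function names, the squared-difference local cost, the band constraint |i - j| <= width, mean subtraction for shift invariance, and linear-interpolation resampling for uniform time scaling are all assumptions made for illustration, and the brute-force computation shown here omits the indexing techniques the demo relies on.

```python
import math


def normalize_shift(series):
    """Subtract the mean pitch so that transposed melodies coincide (assumed normalization)."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def resample(series, n):
    """Uniformly rescale a series to n points by linear interpolation (assumed normalization)."""
    if n == 1:
        return [series[0]]
    out = []
    for k in range(n):
        pos = k * (len(series) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(series) - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


def local_dtw(x, y, width):
    """DTW distance between equal-length series, with the warping path
    restricted to cells where |i - j| <= width. width = 0 reduces to the
    Euclidean distance."""
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have equal length; resample first")
    inf = math.inf
    cost = [[inf] * (n + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - width), min(n, i + width) + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][n])


# Hypothetical pitch series: the second is the first, hummed one step early.
x = [0, 0, 1, 2, 1, 0]
y = [0, 1, 2, 1, 0, 0]
print(local_dtw(x, y, 0))  # 2.0 (no warping allowed)
print(local_dtw(x, y, 1))  # 0.0 (one step of local warping absorbs the timing error)
```

In this sketch, a larger `width` tolerates larger local timing errors in the hum at the cost of filling more cells of the cost matrix, which mirrors the recall trade-off mentioned above.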

BIBTEX


@inproceedings{DBLP:conf/sigmod/ZhuSZ03,
  author    = {Yunyue Zhu and
               Dennis Shasha and
               Xiaojian Zhao},
  booktitle = {SIGMOD Conference},
  title     = {Query by Humming - in Action with its Technology Revealed.},
  pages     = {675},
  year      = {2003},
  url       = {db/conf/sigmod/sigmod2003.html#ZhuSZ03},
  ee        = {http://www.acm.org/sigmod/sigmod03/eproceedings/papers/dem19.pdf},
  crossref  = {conf/sigmod/2003},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}



©2004 Association for Computing Machinery