Tim Pohle, Gerhard Widmer, Peter Knees, Markus Schedl,
"Automatically Adapting the Structure of Audio Similarity Spaces."
in: Proceedings of the 1st Workshop on Learning the Semantics of Audio Signals (LSAS 2006), 1st International Conference on Semantics and Digital Media Technology (SAMT 2006), Athens, Greece, 2006
Original title:
Automatically Adapting the Structure of Audio Similarity Spaces.
Language of the title:
English
Original book title:
Proceedings of the 1st Workshop on Learning the Semantics of Audio Signals (LSAS 2006), 1st International Conference on Semantics and Digital Media Technology (SAMT 2006), Athens, Greece
Original abstract:
Today, among the best-performing audio-based music similarity
measures are algorithms based on Mel Frequency Cepstrum Coefficients
(MFCCs). In these algorithms, each music track is modelled as a
Gaussian Mixture Model (GMM) of MFCCs. The similarity between two
tracks is computed by comparing their GMMs. One drawback of this approach
is that the distance space obtained this way has some undesirable
properties.
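
As a rough illustration of the modelling step described above, the following Python sketch summarises each track by a model of its MFCC frames and compares two tracks through their models. It is a sketch under stated assumptions, not the paper's implementation: a single full-covariance Gaussian stands in for the GMM so that the comparison has a closed form, the symmetrised Kullback-Leibler divergence is one common choice of comparison that the abstract does not name, and the function names (`fit_gaussian`, `kl_gauss`, `track_distance`) are hypothetical. MFCC extraction is assumed to have happened elsewhere.

```python
import numpy as np

def fit_gaussian(frames):
    """Fit one full-covariance Gaussian to MFCC frames.

    frames: array of shape (n_frames, n_coeffs).
    Assumption: a single Gaussian replaces the GMM for simplicity.
    """
    mean = frames.mean(axis=0)
    cov = np.cov(frames, rowvar=False)
    return mean, cov

def kl_gauss(m0, c0, m1, c1):
    """Closed-form KL divergence KL(N(m0, c0) || N(m1, c1))."""
    d = m0.shape[0]
    c1_inv = np.linalg.inv(c1)
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(c0)
    _, logdet1 = np.linalg.slogdet(c1)
    return 0.5 * (np.trace(c1_inv @ c0) + diff @ c1_inv @ diff
                  - d + logdet1 - logdet0)

def track_distance(frames_a, frames_b):
    """Symmetrised KL divergence between the two track models."""
    ga = fit_gaussian(frames_a)
    gb = fit_gaussian(frames_b)
    return kl_gauss(*ga, *gb) + kl_gauss(*gb, *ga)
```

Calling `track_distance` on every pair of tracks in a collection yields the kind of distance matrix whose structure the paper sets out to correct.
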
In this paper, a number of approaches to correct these undesirable properties
are investigated. These approaches draw on knowledge about the properties
of music by using other music tracks as a reference. The reference tracks
can be either the music collection itself or an external set of
reference tracks.
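
The abstract does not say which corrections are investigated. Purely to illustrate how reference tracks could be used to reshape a distance space, the sketch below rescales each pairwise distance by the mean and standard deviation of each track's distances to a reference set. This normalisation scheme is an assumption chosen for illustration, not necessarily one of the paper's techniques, and `normalize_distances` is a hypothetical name.

```python
import numpy as np

def normalize_distances(dist, ref_dist):
    """Rescale pairwise distances using distances to reference tracks.

    dist:     (n, n) symmetric matrix of track-to-track distances.
    ref_dist: (n, r) distances from each track to r reference tracks.
              Pass dist itself to use the collection as its own reference.
    Assumption: z-score normalisation per track, averaged over both
    tracks of a pair so that the result stays symmetric.
    """
    mu = ref_dist.mean(axis=1)
    sigma = ref_dist.std(axis=1)
    sigma = np.where(sigma > 0, sigma, 1.0)  # guard against zero spread
    row_view = (dist - mu[:, None]) / sigma[:, None]
    col_view = (dist - mu[None, :]) / sigma[None, :]
    return 0.5 * (row_view + col_view)
```

Passing the collection's own distance matrix as `ref_dist` corresponds to the first option named above; passing distances to a separate set of tracks corresponds to the external reference set.
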
Our results show that the proposed techniques clearly improve the quality
of this audio similarity measure. Furthermore, preliminary experiments
indicate that the techniques also help to improve other similarity
measures. They may even be useful in completely different domains, most
notably text information retrieval.