Publication details

Citation:	Filip Korzeniowski, "Harmonic Analysis of Musical Audio Using Deep Neural Networks," 10-2018
Original title:	Harmonic Analysis of Musical Audio Using Deep Neural Networks
Language of title:	English
Original abstract:	In this thesis, I consider the automatic extraction of harmonic information from musical audio. Obtaining such information automatically is relevant not only for theoretical analyses, but also for commercial applications such as music tutoring programs or lead sheet generators. I focus on two aspects of harmony (chords and the global key) and tackle the problem of extracting them using deep neural networks. My work on chord recognition constitutes the main part of this thesis. To recognise chords in the audio, I first develop data-driven feature extraction methods (or acoustic models) that outperform hand-engineered ones. I then focus on modelling chord sequences, and show that doing so on a frame-by-frame basis (as is common in existing chord recognition systems) prevents learning musical relationships between chords, regardless of the complexity or power of the sequence model. I also show that such models instead need to operate on higher-level chord symbol sequences in order to be useful. I continue by systematically exploring such chord sequence models based on recurrent neural networks and show their superiority to finite-context models. Finally, I devise a probabilistic model that integrates these chord sequence models with acoustic models using various models of chord duration, and evaluate how the performance of each model influences the final chord recognition results. The second part of this thesis concerns key classification. Here, I develop a convolutional neural network based on traditional key classification pipelines to create a key classifier that performs better than existing, hand-designed methods. I then evaluate how well the model generalises over datasets of different musical genres (a problem existing systems have not solved), and propose adaptations in training and network structure that enable learning a genre-agnostic model that outperforms genre-specific models on many available datasets.
Language of abstract:	English
Month of publication:	10
Year of publication:	2018
URL for further information:	http://www.cp.jku.at/research/papers/Korzeniowski_dissertation_2018.pdf
Scope:	international
Publication type:	Dissertation
Authors:	Filip Korzeniowski
Research units:	Institut für Computational Perception

Fields of science:	Computer Sciences (ÖSTAT:102); Artificial Intelligence (ÖSTAT:102001); Image Processing (ÖSTAT:102003); Information Systems (ÖSTAT:102015); Audiovisual Media (ÖSTAT:202002)
