Publication details

Citation:	Hamid Eghbal-Zadeh, B. Lehner, Matthias Dorfer, "A hybrid approach with multi-channel i-vectors and convolutional neural networks for acoustic scene classification" : Proceedings of the 25th European Signal Processing Conference (EUSIPCO), Kos, 8-2017
Original title:	A hybrid approach with multi-channel i-vectors and convolutional neural networks for acoustic scene classification
Language of the title:	English
Original book title:	Proceedings of the 25th European Signal Processing Conference (EUSIPCO), Kos
Original abstract:	In Acoustic Scene Classification (ASC) two major approaches have been followed. While one utilizes engineered features such as mel-frequency-cepstral-coefficients (MFCCs), the other uses learned features that are the outcome of an optimization algorithm. I-vectors are the result of a modeling technique that usually takes engineered features as input. It has been shown that standard MFCCs extracted from monaural audio signals lead to i-vectors that exhibit poor performance, especially on indoor acoustic scenes. At the same time, Convolutional Neural Networks (CNNs) are well known for their ability to learn features by optimizing their filters. They have been applied on ASC and have shown promising results. In this paper, we first propose a novel multi-channel i-vector extraction and scoring scheme for ASC, improving their performance on indoor and outdoor scenes. Second, we propose a CNN architecture that achieves promising ASC results. Further, we show that i-vectors and CNNs capture complementary information from acoustic scenes. Finally, we propose a hybrid system for ASC using multi-channel i-vectors and CNNs by utilizing a score fusion technique. Using our method, we participated in the ASC task of the DCASE-2016 challenge. Our hybrid approach achieved 1st rank among 49 submissions, substantially improving the previous state of the art.
Language of the abstract:	English
Month of publication:	8
Year of publication:	2017
Number of pages:	5
URL for further information:	http://www.eurasip.org/Proceedings/Eusipco/Eusipco2017/papers/1570347275.pdf
Reach:	international
Publication type:	Article / paper in conference proceedings (refereed)
Authors:	Hamid Eghbal-Zadeh, B. Lehner, Matthias Dorfer
Research units:	Institut für Computational Perception

Fields of science:	Computer Sciences (ÖSTAT:102); Artificial Intelligence (ÖSTAT:102001); Image Processing (ÖSTAT:102003); Information Systems (ÖSTAT:102015); Audiovisual Media (ÖSTAT:202002)

Research projects:	Strategic FExFE Project on Deep Learning (start year: 2015); Informatik, Künstliche Intelligenz, Musik (Wittgenstein-Preis) (start year: 2009)
