Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
Title language:
English
Original abstract:
This technical report describes the CP-JKU team's submission for Task 4, Sound Event Detection with Heterogeneous Training Datasets and Potentially Missing Labels, of the DCASE 2024 Challenge. We fine-tune three large Audio Spectrogram Transformers, PaSST, BEATs, and ATST, on the joint DESED and MAESTRO datasets in a two-stage training procedure. The first stage closely matches the baseline system setup and trains a CRNN model while keeping the large pre-trained transformer frozen. In the second stage, both the CRNN and the transformer are fine-tuned using heavily weighted self-supervised losses. After the second stage, we compute strong pseudo-labels for all audio clips in the training set using an ensemble of all three fine-tuned transformers. In a second iteration, we then repeat the two-stage training process and add a distillation loss based on these pseudo-labels, which boosts single-model performance substantially. Additionally, we pre-train PaSST and ATST on the subset of AudioSet that comes with strong temporal labels before fine-tuning them on the Task 4 datasets.
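To make the second-iteration objective concrete, the following is a minimal PyTorch sketch of the pseudo-label distillation described above: frame-level predictions of an ensemble of fine-tuned transformers are averaged into strong pseudo-labels, and the student is then trained against both the (potentially partially missing) human annotations and these pseudo-labels. All function names, the masking scheme, and the loss weighting here are illustrative assumptions, not the authors' exact implementation.

    # Hypothetical sketch of the pseudo-label distillation objective.
    # Model interfaces, masking, and loss weights are assumptions for
    # illustration; the report defines the actual losses and weights.
    import torch
    import torch.nn.functional as F

    def ensemble_pseudo_labels(models, clips):
        """Average frame-level sigmoid outputs of the fine-tuned
        transformers (e.g., PaSST, BEATs, ATST) to obtain strong
        pseudo-labels for every clip in the training set."""
        with torch.no_grad():
            preds = [torch.sigmoid(m(clips)) for m in models]  # (B, T, C) each
        return torch.stack(preds).mean(dim=0)

    def distillation_training_loss(student, clips, labels, pseudo, label_mask,
                                   w_sup=1.0, w_distill=1.0):
        """Loss for one step of the second training iteration:
        supervised BCE on the annotated (frame, class) entries plus a
        BCE-style distillation term on the ensemble pseudo-labels.
        label_mask zeroes out missing labels so they do not contribute
        to the supervised term."""
        probs = torch.sigmoid(student(clips))    # (B, T, C) frame probabilities

        sup = F.binary_cross_entropy(probs, labels, reduction="none")
        sup = (sup * label_mask).sum() / label_mask.sum().clamp(min=1)

        distill = F.binary_cross_entropy(probs, pseudo)

        return w_sup * sup + w_distill * distill

Computing the pseudo-labels once, offline, after the first two-stage iteration (rather than on the fly) matches the procedure in the abstract and keeps the second iteration as cheap as ordinary supervised training.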