Lukas Martak, Rainer Kelz, Gerhard Widmer,
"Differentiable dictionary search: Integrating linear mixing with deep non-linear modelling for audio source separation"
In: Proceedings of the 24th International Congress on Acoustics (ICA 2022), 10-2022
This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name of Differentiable Dictionary Search (DDS). The fundamental idea of DDS is to exploit a class of powerful deep invertible density estimators called normalizing flows, to model the dictionary in a linear decomposition method such as NMF, effectively creating a bijection between the space of dictionary elements and the associated probability space, allowing a differentiable search through the dictionary space, guided by the estimated densities. As the initial formulation was a proof of concept with some practical limitations, we will present several steps towards making it scalable, hoping to improve both the computational complexity of the method and its signal decomposition capabilities. As a testbed for experimental evaluation, we choose the task of frame-level piano transcription, where the signal is to be decomposed into sources whose activity is attributed to individual piano notes. To highlight the impact of improved non-linear modelling of sources, we compare variants of our method to a linear overcomplete NMF baseline. Experimental results will show that even in the absence of additional constraints, our models produce increasingly sparse and precise decompositions, according to two pertinent evaluation measures.
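For readers unfamiliar with the linear decomposition that the abstract refers to, the following is a minimal sketch of generic NMF with the standard multiplicative updates for the Frobenius objective. It is not the authors' baseline and not the DDS method: the function names (`nmf`, `matmul`, `frobenius_error`), the toy matrix `v`, the rank, and the iteration count are all assumptions chosen for illustration. It only shows how a non-negative matrix V (for example a magnitude spectrogram) can be approximated as a product of a dictionary W and activations H.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rr in zip(v, wh) for x, y in zip(rv, rr)) ** 0.5


def nmf(v, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate V (n_rows x n_cols) as W (n_rows x rank) times H (rank x n_cols).

    Illustrative only: random initialisation, fixed iteration count,
    no sparsity or other constraints.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # Activation update: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        # Dictionary update: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


# Toy "spectrogram": 3 frequency bins x 4 frames, built from two hypothetical note templates.
v = [
    [1.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 2.0, 2.0],
]
w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

In this sketch the dictionary W is a free non-negative matrix. As the abstract describes it, DDS instead models the dictionary elements with normalizing flows, so that the search over the dictionary is differentiable and guided by estimated densities; that part is not reproduced here.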