Symbolic Music Analysis with Graph Neural Networks
Language of the title:
English
Original abstract:
Music analysis, as a scholarly field and as a set of techniques, is crucial for comprehending and appreciating music. It systematically examines elements such as harmony, melody, rhythm, form, and instrumentation, revealing the interplay of compositional techniques and structural patterns. This thesis examines the effectiveness of Graph Neural Networks (GNNs) on music analysis tasks such as cadence detection, Roman numeral analysis, composer classification, and voice separation, focusing on symbolic representations (i.e., musical scores given in some machine-readable encoding). We argue that a graph structure is a more natural representation for modeling a musical score than the feature- or token-based representations that have been used so far.
As a first step, we experimentally compare graph representations to other modeling approaches, such as piano rolls, note arrays, and custom descriptors, on several music classification tasks. We then develop a series of GNN-based machine learning models tailored for music analysis that can handle the unique characteristics of both the music format itself and the targeted application. As a result, we obtain graph-based models that demonstrate improved performance on benchmark datasets for diverse analysis tasks. Furthermore, we develop a new, generic graph convolution block, based on perception-inspired principles, that improves performance on music understanding tasks. We also present a framework for deriving and visualizing explanations of the decisions made by our music-related GNNs. Finally, we develop and publish a dedicated library for symbolic music graph processing to reinforce the impact of this work in the research community.