Thesis Open Access
Joe Cheri Ross
Preeti Rao; Pushpak Bhattacharyya
Developing a music recommender system, one of the key applications of Music Information Retrieval (MIR), requires research into methods that represent and retrieve music information efficiently. Because each music culture has its own characteristics and requirements, these methods must be culture-aware, and many of the associated tasks are culture-specific. This research is motivated by the importance of a music recommendation system for Hindustani music. The investigated tasks focus primarily on information extraction from melodic audio and text content, recognizing the significance of information from multi-modal sources. For audio signals, we present our investigations on melodic motif detection, covering detection of the mukhda (main title phrase of a composition) and the pakad (raga-characteristic phrase). Our investigation on extracting meta-information from natural language text focuses on coreference resolution, with the aim of improving relation extraction from textual content. The task of raga similarity detection is investigated with both text and music content (available as music notation).
Melodic motifs form essential building blocks in Indian classical music. The mukhda and the pakad phrases provide strong cues to the identity of the composition and the underlying raga, respectively. Automatic detection of such recurring basic melodic phrases is highly relevant to music information retrieval in Hindustani music. This thesis discusses approaches to detect the mukhda and the pakad in a concert audio recording by exploring musicological cues and similarity computations.
Considering the large number of ragas in Hindustani music, detecting similarities between them is beneficial to music recommendation. The problem of raga similarity detection is investigated with two diverse data sources, viz. discussions on Hindustani ragas and composition notations. Each of these sources helps in extracting different aspects of raga similarity. While text discussions aid the extraction of similarities generally perceived by musicians, similarities based on melodic attributes are extracted from composition notations. Both approaches learn representations for ragas, and the similarities between the representations indicate the similarities between the ragas.
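As a minimal sketch of how similarity between learned representations can indicate similarity between ragas, the snippet below ranks ragas by cosine similarity of their vectors. Cosine similarity is an assumption made here for illustration; the abstract does not name the measure used. The raga labels and the three-dimensional vectors are hypothetical placeholders, not learned representations or results from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(name, vectors):
    """Return (other_raga, similarity) pairs for `name`, most similar first."""
    query = vectors[name]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != name
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for learned raga representations.
raga_vectors = {
    "raga_a": [0.9, 0.1, 0.2],
    "raga_b": [0.85, 0.15, 0.25],
    "raga_c": [0.1, 0.9, 0.3],
}

ranking = most_similar("raga_a", raga_vectors)
```

The same ranking function applies whether the vectors come from text discussions or from composition notations, which is what allows the two sources to be compared on a common footing.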
Recognizing the importance of coreference resolution for improving relation extraction from text, our investigations on extracting meta-information focus on coreference resolution in music discussion forums. The attempt to design a specific approach is motivated by the nature of the text and the domain specificity. While the feature design considers domain specificity and the nature of the text, we also observe the need for a hybrid approach. The proposed modification to best-first clustering, used for the clustering step in the mention-pair model, considers the relations between candidate antecedents while resolving an anaphoric mention. We also discuss a method to identify the semantic class, a crucial feature for coreference resolution, with the help of web resources. This approach to semantic class identification generalizes to any domain-specific dataset with similar challenges. The investigations with eye-tracking and memory networks initiate research toward bridging the gap between the cognitive process involved in coreference resolution and its machine understanding.
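For readers unfamiliar with the clustering step mentioned above, the following is a minimal sketch of standard best-first clustering in a mention-pair model: each mention is linked to the preceding mention with the highest pairwise coreference score, provided that score exceeds a threshold. It shows only the unmodified baseline. The modification proposed in the thesis, which takes relations between candidate antecedents into account, is not reproduced here. The function name, the threshold value, and the toy scores are hypothetical and chosen only for illustration.

```python
def best_first_clustering(num_mentions, pair_score, threshold=0.5):
    """Assign each mention a cluster id using baseline best-first linking.

    pair_score(i, j) is assumed to return the classifier's coreference
    score for antecedent i and anaphor j, with i < j in document order.
    """
    cluster_of = list(range(num_mentions))  # every mention starts as a singleton
    for j in range(num_mentions):
        best_i, best_score = None, threshold
        for i in range(j):  # candidate antecedents precede the anaphor
            score = pair_score(i, j)
            if score > best_score:
                best_i, best_score = i, score
        if best_i is not None:
            # Join the cluster of the single best-scoring antecedent.
            cluster_of[j] = cluster_of[best_i]
    return cluster_of


# Hypothetical pairwise scores for four mentions (not data from the thesis).
toy_scores = {
    (0, 1): 0.2,
    (0, 2): 0.9, (1, 2): 0.6,
    (0, 3): 0.1, (1, 3): 0.7, (2, 3): 0.3,
}

clusters = best_first_clustering(4, lambda i, j: toy_scores[(i, j)])
```

In this toy run, mention 2 joins mention 0 and mention 3 joins mention 1. Because the baseline scores each candidate antecedent independently, it ignores how candidates relate to one another, which is the gap the modification described above addresses.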