Published February 23, 2024 | Version v2.0.1
Software Open

polifonia-project/folk_ngram_analysis: FoNN v2.0.1

  • 1. @data-science-institute-nuig
  • 2. University of Galway
  • 3. The Open University

Description

FoNN (Folk N-gram aNalysis) v2.0 contains a set of tools to ingest, preprocess and query music corpora for inter-opus similarity using three newly-developed music similarity metrics, all of which are based on local musical feature patterns extracted from digital scores. It also provides sample music corpora and ground-truth annotations to allow measurement and analysis of the performance of the bundled similarity tools, and for similar re-use in other studies. These tools have been developed with a particular focus on European musical heritage but can be applied to any machine-readable music score inputs in compatible formats (MIDI, ABC, **kern, MusicXML, and any other formats compatible with the music21 Python library).

(1) FoNN: A toolkit that uses n-gram patterns, pattern frequency and TF-IDF values, and a variety of standard and customised edit distance metrics to detect inter-opus melodic similarity. Some of FoNN's functionality has been tailored to the study of monophonic Irish & European folk music inputs, but it can be applied to any music corpus in a compatible symbolic notation format. FoNN includes ingest pipeline tools for creation of Knowledge Graphs from music corpus data via the Polifonia Patterns Knowledge Graph repo. Also included is a small sample corpus for demonstration purposes: the Meertens Tune Collection Annotated Corpus (MTC-ANN) of Dutch folk songs.

(2) Ceol Rince na hÉireann (CRÉ) corpus: A cleaned and annotated version of the Ceol Rince na hÉireann corpus of Irish traditional dance music, containing 1,224 monophonic MIDI files and a CSV table of root note values for each file.

(3) Root Note Detection: Experimental work on development of a Machine Learning-based root note detection algorithm, exploring the use of decision tree and random forest techniques to aggregate and improve the accuracy of an ensemble of root detection metrics.

(4) Ground truth annotations: New in v2.0, this component comprises tune family ground truth annotations for a subset of 314 tunes within The Session corpus, grouped into 10 tune families. This resource can facilitate quantitative testing and measurement of the performance of music similarity tools.
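
As an illustration of how tune family annotations such as those in component (4) might be used, the sketch below scores one ranked similarity result with precision and recall at k. It is a minimal example under stated assumptions, not FoNN's own evaluation code: the function name `precision_recall_at_k`, the tune identifiers and the family labels are all hypothetical, and the choice of precision/recall at k as the measure is an assumption rather than something specified by this release.

```python
def precision_recall_at_k(query, ranked_results, family_of, k):
    """Score the top-k results for one query tune against tune-family labels.

    Hypothetical helper for illustration only; not part of FoNN.
    """
    family = family_of[query]
    # Every other annotated tune in the query's family counts as relevant.
    relevant = {t for t, f in family_of.items() if f == family and t != query}
    # Drop the query itself from the ranking before truncating to k results.
    top_k = [t for t in ranked_results if t != query][:k]
    hits = sum(1 for t in top_k if t in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy annotations: invented tune identifiers mapped to invented family labels.
family_of = {
    "tune_a1": "family_a",
    "tune_a2": "family_a",
    "tune_a3": "family_a",
    "tune_b1": "family_b",
    "tune_b2": "family_b",
}

# A made-up ranking returned by some similarity tool for the query "tune_a1".
ranked = ["tune_a2", "tune_b1", "tune_a3", "tune_b2"]

precision, recall = precision_recall_at_k("tune_a1", ranked, family_of, k=2)
print(f"precision@2={precision:.2f} recall@2={recall:.2f}")
```

Averaging these scores over every annotated tune used as a query would give a single figure for comparing similarity metrics against each other.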

In addition to the inclusion of the new component (4), this release includes a refactored and expanded version of component (1), which has been re-engineered, significantly boosting both the accuracy of results and computational performance. Flow control, docs, and demos for component (1) have also been refactored and revised.
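
For readers unfamiliar with the techniques named under component (1), the following sketch shows the general idea of n-gram extraction, TF-IDF weighting and edit distance on toy pitch sequences. It is not FoNN's API or implementation: all function names and the toy corpus are hypothetical, the TF-IDF formula (relative frequency multiplied by the log of inverse document frequency) is one common formulation chosen as an assumption, and plain Levenshtein distance stands in for the "standard and customised" edit distance metrics the toolkit provides.

```python
import math
from collections import Counter


def ngrams(sequence, n):
    """Return all contiguous n-item patterns in a feature sequence."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def tfidf(corpus, n):
    """Weight each tune's n-gram patterns by TF-IDF (assumed formulation)."""
    counts = {title: Counter(ngrams(seq, n)) for title, seq in corpus.items()}
    # Document frequency: number of tunes in which each pattern occurs.
    doc_freq = Counter()
    for pattern_counts in counts.values():
        doc_freq.update(pattern_counts.keys())
    n_docs = len(corpus)
    weights = {}
    for title, pattern_counts in counts.items():
        total = sum(pattern_counts.values())
        weights[title] = {
            pattern: (freq / total) * math.log(n_docs / doc_freq[pattern])
            for pattern, freq in pattern_counts.items()
        }
    return weights


def levenshtein(a, b):
    """Plain Levenshtein edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Toy corpus: invented tunes represented as MIDI pitch sequences.
corpus = {
    "tune_x": [62, 64, 66, 67, 69, 67, 66, 64],
    "tune_y": [62, 64, 66, 67, 69, 71, 69, 67],
    "tune_z": [60, 60, 67, 67, 69, 69, 67, 65],
}

weights = tfidf(corpus, n=3)
# The highest-weighted pattern is the one most distinctive of the query tune.
query_pattern = max(weights["tune_x"], key=weights["tune_x"].get)

# Rank every pattern in another tune by edit distance to the query pattern.
candidates = sorted(
    set(ngrams(corpus["tune_y"], 3)),
    key=lambda pattern: levenshtein(query_pattern, pattern),
)
print(query_pattern, candidates[0])
```

Patterns that occur in every tune receive a weight of zero under this formulation, so the search concentrates on patterns that distinguish one tune from the rest of the corpus.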

Full Changelog: https://github.com/polifonia-project/folk_ngram_analysis/compare/v2.0...v2.0.1

Files

polifonia-project/folk_ngram_analysis-v2.0.1.zip (16.0 MB)
md5:527e503bdeb26ff819c2dbaca6eac8a3

Additional details

Related works