Methodologies for Creating Symbolic Corpora of Western Music Before 1600

Julie Cumming; Cory McKay; Jonathan Stuchbery; Ichiro Fujinaga

doi:10.5281/zenodo.1492459

Published September 23, 2018 | Version v1

Conference paper Open

The creation of a corpus of compositions in symbolic formats is an essential step for any project in systematic research. There are, however, many potential pitfalls, especially in early music, where scores are edited in different ways: variables include clefs, note values, types of barline, and editorial accidentals. Score editors and optical music recognition software each have their own ways of storing and exporting musical data. The choice of software and file formats, and their various parameters, can thus unintentionally bias data, as can decisions on how to interpret potentially ambiguous markings in original sources. This becomes especially problematic when data from different corpora are combined for computational processing, since observed regularities and irregularities may in fact stem from inconsistent corpus collection methodologies, both within and across corpora, rather than from the underlying music. This paper proposes guidelines, templates, and workflows for the creation of consistent early music corpora, and for detecting encoding biases in existing corpora. We have assembled a corpus of Renaissance duos as a sample implementation, and present machine learning experiments demonstrating how inconsistent or naïve encoding methodologies for corpus collection can distort results.

Resource type

Conference paper

Publisher

ISMIR

Imprint

Proceedings of the 19th International Society for Music Information Retrieval Conference, 491-498. Paris, France.

Conference

International Society for Music Information Retrieval Conference (ISMIR 2018), Paris, France, September 23-27, 2018

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 20, 2018
Modified: August 2, 2024
