Published December 29, 2025 | Version v1.0
Dataset Open

Music Informatics for Radio Across the GlobE (MIRAGE) MetaCorpus

  • 1. University of Michigan

Contributors

Contact person:

  • 1. University of Michigan

Description

Overview

Welcome to the Music Informatics for Radio Across the GlobE (MIRAGE) MetaCorpus. The current (v1.0) release consists of metadata (e.g., artist name, track title) and musicological features (e.g., instrument list, voice type) for 1 million events streaming on 10,000 internet radio stations across the globe, with 100 events from each station. 

Users who wish to access, explore, or export metadata from the MIRAGE-MetaCorpus may also visit the MIRAGE online dashboard at the following URL:

Attribution

The current MIRAGE-MetaCorpus is available under a CC4 license. Users may cite the dataset here:

Sears, David R.W. “Music Informatics for Radio Across the Globe (MIRAGE) Metacorpus v1.0”. Zenodo, December 29, 2025. https://doi.org/10.5281/zenodo.18112107.

Users accessing the MIRAGE-MetaCorpus using the online dashboard should also cite the following ISMIR paper:

Ngan V.T. Nguyen, Elizabeth A.M. Acosta, Tommy Dang, and David R.W. Sears. "Exploring Internet Radio Across the Globe with the MIRAGE Online Dashboard," in Proceedings of the 25th International Society for Music Information Retrieval Conference (San Francisco, CA, 2024). 

Data Sources

This repository of the MIRAGE-MetaCorpus contains 131 metadata fields from the following open-access sources:

Each event also includes attribution metadata from the following commercial sources:

Metadata

The metadata describe each event's location (e.g., city, country), station (e.g., name, format, URL), event (e.g., ID, local time at station), artist (e.g., name, voice type), and track (e.g., title, year of release). Each field name consists of three parts joined by underscores: the entity (Location, Station, Event, Artist, or Track), the data source code (RG, NE, etc.), and the metadata name (e.g., Latitude, Frequency).
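The naming syntax can be illustrated with a short Python sketch that splits a field name into its three parts. The helper name parse_field and the FieldName type are hypothetical and are not part of the dataset; the sketch assumes only that the parts are joined by underscores, as in the field lists that follow.

```python
from typing import NamedTuple

# Entities named in the description above.
ENTITIES = {"Location", "Station", "Event", "Artist", "Track"}


class FieldName(NamedTuple):
    entity: str   # e.g., "Location"
    source: str   # data source code, e.g., "RG"
    name: str     # metadata name, e.g., "Latitude"


def parse_field(field: str) -> FieldName:
    """Split a field such as 'Location_RG_Latitude' into its three parts."""
    # maxsplit=2 keeps any further underscores inside the metadata name.
    parts = field.split("_", 2)
    if len(parts) != 3 or parts[0] not in ENTITIES:
        raise ValueError(f"unexpected field name: {field!r}")
    return FieldName(*parts)
```

For example, parse_field("Artist_WD_VoiceTypes") yields the entity "Artist", the source code "WD", and the metadata name "VoiceTypes", which makes it straightforward to group columns by entity or by source.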

Location

Location_NE_Continent
Location_NE_Country
Location_NE_CountryA3
Location_NE_CountryEconomy
Location_NE_CountryGDP
Location_NE_CountryGDPYear
Location_NE_CountryIncome
Location_NE_CountryPopulation
Location_NE_CountryPopulationRank
Location_NE_CountryPopulationYear
Location_NE_CountrySovereignState
Location_NE_CountryType
Location_NE_Region
Location_NE_StateProvince
Location_RG_City
Location_RG_Country
Location_RG_ID
Location_RG_Latitude
Location_RG_Longitude
Location_RG_utcOffset
Location_WD_City
Location_WD_CityCensusYear
Location_WD_CityDescription
Location_WD_CityLanguagesOfficial
Location_WD_CityLanguagesUsed
Location_WD_CityPopulation
Location_WD_CityPopulationMethod
Location_WD_CityQID
Location_WD_CityType
Location_WD_CountryLanguagesOfficial
Location_WD_CountryLanguagesUsed

Station

Station_AR_Annotator
Station_AR_Form
Station_AR_Format
Station_AR_Frequency
Station_AR_Genre
Station_AR_Languages
Station_RG_ID
Station_RG_Name
Station_RG_URL
Station_SE_Description
Station_SE_Name
Station_SE_WebsiteURL

Event

Event_GE_MatchReliability
Event_MA_ID
Event_MA_StreamLanguagePredictions
Event_MA_StreamLanguages
Event_MA_TimeStation
Event_MB_MatchReliability
Event_RG_Version
Event_SE_Bitrate
Event_SE_Channels
Event_SE_Codec
Event_SE_Description
Event_SE_DescriptionClean
Event_SE_Framerate
Event_SP_MatchReliability
Event_WD_MatchReliability

Artist

Artist_GE_Name
Artist_MA_GroupID
Artist_MA_MemberID
Artist_MB_Country
Artist_MB_Genre
Artist_MB_MBID
Artist_MB_Name
Artist_MB_Type
Artist_SP_ID
Artist_SP_Name
Artist_WD_AZlyricsID
Artist_WD_Coordinates
Artist_WD_Country
Artist_WD_Description
Artist_WD_Ethnicities
Artist_WD_Genders
Artist_WD_Genre
Artist_WD_Instruments
Artist_WD_Members
Artist_WD_MusixmatchID
Artist_WD_Name
Artist_WD_QID
Artist_WD_SexualOrientations
Artist_WD_StartYear
Artist_WD_Type
Artist_WD_VoiceTypes
Artist_WD_WebsiteURL
Artist_WD_YouTubeID

Track

Track_GE_ID
Track_GE_Lyrics
Track_GE_LyricsLanguagePredictions
Track_GE_LyricsLanguages
Track_GE_Title
Track_MB_Arrangers
Track_MB_Composers
Track_MB_CoverArt
Track_MB_CoverArtSmall
Track_MB_Duration
Track_MB_Engineers
Track_MB_Genre
Track_MB_ISRC
Track_MB_ISWCS
Track_MB_Instruments
Track_MB_Languages
Track_MB_Lyricists
Track_MB_LyricsURL
Track_MB_MBID
Track_MB_Name
Track_MB_Performers
Track_MB_Producers
Track_MB_Programmers
Track_MB_Release
Track_MB_ReleaseCountry
Track_MB_ReleaseLabels
Track_MB_ReleaseLanguage
Track_MB_ReleaseMBID
Track_MB_ReleaseScript
Track_MB_StreamingURL
Track_MB_Type
Track_MB_WorkMBID
Track_MB_Year
Track_SP_ID
Track_SP_Name
Track_WD_Composers
Track_WD_Description
Track_WD_Format
Track_WD_Language
Track_WD_Lyricists
Track_WD_Name
Track_WD_QID
Track_WD_Tonality
Track_WD_Year
Track_WD_YouTubeID

Data Sets

The MIRAGE-MetaCorpus includes the following datasets:

  • MIRAGE.csv -- the complete metacorpus (1 million)
  • events.csv -- all event-level metadata (1 million)
  • tracks.csv -- all track-level metadata (414,886)
  • artists.csv -- all artist-level metadata (259,783)
  • stations.csv -- all station-level metadata (10,000)
  • locations.csv -- all location-level metadata (4,324)

Subsets of the MIRAGE-MetaCorpus are also available for events with metadata from online music libraries that reliably matched the event's description in the radio station's stream encoder.

Reliable -- 'All' Sources Match

The first subset ('all' sources) consists of events where the stream description reliably matched metadata from all four of the WikiData, MusicBrainz, Spotify, and Genius sources (match reliability >= 0.9, based on normalized edit distance):

  • MIRAGE_reliable_all.csv (139,761)
  • events_reliable_all.csv (139,761)
  • tracks_reliable_all.csv (24,697)
  • artists_reliable_all.csv (6,321)
  • stations_reliable_all.csv (6,604)
  • locations_reliable_all.csv (3,238)
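For readers unfamiliar with the measure, the following Python sketch shows one common way to turn edit distance into a similarity score between 0 and 1. This record does not state the exact normalization or text preprocessing used to compute the MatchReliability fields, so the formula here (one minus the Levenshtein distance divided by the longer string's length) is an assumption for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Return 1.0 for identical strings and 0.0 for strings with nothing aligned."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

Under this definition, a threshold of 0.9 means that at most one character in ten may differ between the stream description and the matched metadata.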

Reliable -- 'Core' Sources Match

The second subset ('core' sources) consists of events where the stream description reliably matched metadata from either the WikiData or MusicBrainz sources (match reliability >= 0.9, based on normalized edit distance):

  • MIRAGE_reliable_core.csv (447,238)
  • events_reliable_core.csv (447,238)
  • tracks_reliable_core.csv (174,136)
  • artists_reliable_core.csv (62,324)
  • stations_reliable_core.csv (9,063)
  • locations_reliable_core.csv (4,031)
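The two subset definitions can be sketched as row-level predicates over the four Event_*_MatchReliability fields. This is a minimal illustration, not the procedure used to build the released subsets. It assumes that the source codes WD, MB, SP, and GE stand for WikiData, MusicBrainz, Spotify, and Genius, that each reliability value is stored as a number between 0 and 1, and that a missing or non-numeric value means no match. The function names are hypothetical.

```python
from typing import Mapping

THRESHOLD = 0.9
ALL_SOURCES = ("WD", "MB", "SP", "GE")   # assumed: WikiData, MusicBrainz, Spotify, Genius
CORE_SOURCES = ("WD", "MB")              # assumed: WikiData, MusicBrainz


def reliability(row: Mapping[str, str], source: str) -> float:
    """Read Event_<source>_MatchReliability, treating missing or non-numeric values as 0.0."""
    raw = row.get(f"Event_{source}_MatchReliability", "")
    try:
        return float(raw)
    except (TypeError, ValueError):
        return 0.0


def is_reliable_all(row: Mapping[str, str]) -> bool:
    """True when every one of the four sources meets the threshold."""
    return all(reliability(row, source) >= THRESHOLD for source in ALL_SOURCES)


def is_reliable_core(row: Mapping[str, str]) -> bool:
    """True when at least one of the two core sources meets the threshold."""
    return any(reliability(row, source) >= THRESHOLD for source in CORE_SOURCES)
```

Every row that passes is_reliable_all also passes is_reliable_core, which is consistent with the 'all' subset being smaller than the 'core' subset.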

Contact

If you are a copyright owner for any of the metadata that appears in the MIRAGE-MetaCorpus and would like us to remove your metadata, please contact the developer team at the following email address: miragedashboard@gmail.com 

Files

Files (4.5 GB)

Size        MD5 checksum
48.3 MB     f5a122511d3eff66d20cad44a3eb1051
3.0 MB      9e9e1f0c1d9ab457277c3d7088d33c0e
18.6 MB     029de7a0956b9397a3e79b69c1af1150
241.5 MB    27cabd203cb38afdacbdc233c9604fb8
32.5 MB     7f4950d704cf75952ff973e16b64f501
107.5 MB    097a651290eda6a0da08818c1bda74cd
4.4 MB      26ef9b147875b9b746af32075eda1739
3.0 MB      40d8969a011e1f119853c01e8d466d50
3.9 MB      1db588c9b978285449773afaf8f19eff
2.2 GB      e35a445c2989850c48d3d4bba675c505
381.2 MB    11bf09ae2caa278fe0d14371400cd519
1.1 GB      74deb6a27fbe5b54528ac7a258708db0
2.5 MB      2199dbfca0b74dcd68ab7667baa823b4
1.6 MB      db938c8db6f70fa71978bb0a39cf4189
2.2 MB      d6f1f8563f69d9d92d9196d4748e3c5f
172.6 MB    a2a1171305009e5d540c0a6258319226
19.5 MB     d4f2d63720fbb02508e3cc5ee7191696
104.1 MB    198558d942d3e9740e4c893243f3b6d4
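A downloaded file can be checked against its published MD5 checksum with a few lines of Python. The sketch below uses only the standard library and reads the file in chunks so that the larger files do not need to fit in memory. The helper names md5_of and verify are hypothetical, and the pairing of each checksum with its file name should be taken from the record page.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Compare a file's digest with a published value, with or without the 'md5:' prefix."""
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of(path) == expected_hex
```

A result of False from verify suggests an incomplete or corrupted download, in which case the file should be fetched again.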

Additional details

Funding

National Endowment for the Humanities
Music Informatics for Radio Across the GlobE (MIRAGE) HAA-301062-24

Dates

Created
2024-07-19 (MIRAGE MetaCorpus v0.2)
Updated
2025-12-29 (MIRAGE MetaCorpus v1.0)

Software

Development Status
Active