Music Informatics for Radio Across the GlobE (MIRAGE) MetaCorpus
Description
Overview
Welcome to the Music Informatics for Radio Across the GlobE (MIRAGE) MetaCorpus. The current (v1.0) release consists of metadata (e.g., artist name, track title) and musicological features (e.g., instrument list, voice type) for 1 million events streaming on 10,000 internet radio stations across the globe, with 100 events from each station.
Users who wish to access, interact with, or export metadata from the MIRAGE-MetaCorpus may also visit the MIRAGE online dashboard at the following URL:
Attribution
The current MIRAGE-MetaCorpus is available under a CC4 license. Users may cite the dataset as follows:
Sears, David R.W. “Music Informatics for Radio Across the Globe (MIRAGE) Metacorpus v1.0”. Zenodo, December 29, 2025. https://doi.org/10.5281/zenodo.18112107.
Users accessing the MIRAGE-MetaCorpus using the online dashboard should also cite the following ISMIR paper:
Ngan V.T. Nguyen, Elizabeth A.M. Acosta, Tommy Dang, and David R.W. Sears. "Exploring Internet Radio Across the Globe with the MIRAGE Online Dashboard," in Proceedings of the 25th International Society for Music Information Retrieval Conference (San Francisco, CA, 2024).
Data Sources
This repository of the MIRAGE-MetaCorpus contains 131 metadata fields from the following open-access sources:
- Radio Garden (RG) -- https://radio.garden
- Natural Earth map data set (NE) -- https://www.naturalearthdata.com/
- Internet Radio Station Stream Encoder (SE)
- Annotator Review (AR)
- Monitoring/Matching Algorithm (MA)
- WikiData (WD) -- https://www.wikidata.org
- MusicBrainz (MB) -- https://musicbrainz.org/
Each event also includes attribution metadata from the following commercial sources:
- Spotify (SP) -- https://open.spotify.com/
- Genius (GE) -- https://genius.com/
- Musixmatch (MX) -- https://www.musixmatch.com/
- YouTube (YT) -- https://www.youtube.com/
- AZlyrics (AZ) -- https://www.azlyrics.com/
Metadata
The metadata describe each event's location (e.g., city, country), station (e.g., name, format, URL), event (e.g., ID, local time at station), artist (e.g., name, voice type), and track (e.g., title, year of release). Each metadata field name consists of three parts joined by underscores: the entity (Location, Station, Event, Artist, Track), the data source (RG, NE, etc.), and the metadata name (e.g., Latitude, Frequency), as in Location_RG_Latitude.
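The naming syntax can be unpacked programmatically. The following is a minimal sketch in Python; the names `FieldName`, `parse_field`, and `select_fields` are illustrative helpers, not part of the MIRAGE release.

```python
from typing import Iterable, List, NamedTuple, Optional

# Entities as they appear in the field names listed in this section.
ENTITIES = {"Location", "Station", "Event", "Artist", "Track"}


class FieldName(NamedTuple):
    entity: str  # e.g. "Location"
    source: str  # e.g. "RG" (Radio Garden)
    name: str    # e.g. "Latitude"


def parse_field(field: str) -> FieldName:
    """Split an 'Entity_Source_Name' field name into its three parts."""
    parts = field.split("_", 2)  # at most three parts
    if len(parts) != 3 or parts[0] not in ENTITIES:
        raise ValueError(f"not a MIRAGE-style field name: {field!r}")
    return FieldName(*parts)


def select_fields(
    columns: Iterable[str],
    entity: Optional[str] = None,
    source: Optional[str] = None,
) -> List[str]:
    """Return the columns that match the given entity and/or source."""
    selected = []
    for column in columns:
        parsed = parse_field(column)
        if entity is not None and parsed.entity != entity:
            continue
        if source is not None and parsed.source != source:
            continue
        selected.append(column)
    return selected
```

For example, `select_fields(columns, entity="Artist", source="WD")` would keep only the artist fields drawn from WikiData.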
Location
| Location_NE_Continent |
| Location_NE_Country |
| Location_NE_CountryA3 |
| Location_NE_CountryEconomy |
| Location_NE_CountryGDP |
| Location_NE_CountryGDPYear |
| Location_NE_CountryIncome |
| Location_NE_CountryPopulation |
| Location_NE_CountryPopulationRank |
| Location_NE_CountryPopulationYear |
| Location_NE_CountrySovereignState |
| Location_NE_CountryType |
| Location_NE_Region |
| Location_NE_StateProvince |
| Location_RG_City |
| Location_RG_Country |
| Location_RG_ID |
| Location_RG_Latitude |
| Location_RG_Longitude |
| Location_RG_utcOffset |
| Location_WD_City |
| Location_WD_CityCensusYear |
| Location_WD_CityDescription |
| Location_WD_CityLanguagesOfficial |
| Location_WD_CityLanguagesUsed |
| Location_WD_CityPopulation |
| Location_WD_CityPopulationMethod |
| Location_WD_CityQID |
| Location_WD_CityType |
| Location_WD_CountryLanguagesOfficial |
| Location_WD_CountryLanguagesUsed |
Station
| Station_AR_Annotator |
| Station_AR_Form |
| Station_AR_Format |
| Station_AR_Frequency |
| Station_AR_Genre |
| Station_AR_Languages |
| Station_RG_ID |
| Station_RG_Name |
| Station_RG_URL |
| Station_SE_Description |
| Station_SE_Name |
| Station_SE_WebsiteURL |
Event
| Event_GE_MatchReliability |
| Event_MA_ID |
| Event_MA_StreamLanguagePredictions |
| Event_MA_StreamLanguages |
| Event_MA_TimeStation |
| Event_MB_MatchReliability |
| Event_RG_Version |
| Event_SE_Bitrate |
| Event_SE_Channels |
| Event_SE_Codec |
| Event_SE_Description |
| Event_SE_DescriptionClean |
| Event_SE_Framerate |
| Event_SP_MatchReliability |
| Event_WD_MatchReliability |
Artist
| Artist_GE_Name |
| Artist_MA_GroupID |
| Artist_MA_MemberID |
| Artist_MB_Country |
| Artist_MB_Genre |
| Artist_MB_MBID |
| Artist_MB_Name |
| Artist_MB_Type |
| Artist_SP_ID |
| Artist_SP_Name |
| Artist_WD_AZlyricsID |
| Artist_WD_Coordinates |
| Artist_WD_Country |
| Artist_WD_Description |
| Artist_WD_Ethnicities |
| Artist_WD_Genders |
| Artist_WD_Genre |
| Artist_WD_Instruments |
| Artist_WD_Members |
| Artist_WD_MusixmatchID |
| Artist_WD_Name |
| Artist_WD_QID |
| Artist_WD_SexualOrientations |
| Artist_WD_StartYear |
| Artist_WD_Type |
| Artist_WD_VoiceTypes |
| Artist_WD_WebsiteURL |
| Artist_WD_YouTubeID |
Track
| Track_GE_ID |
| Track_GE_Lyrics |
| Track_GE_LyricsLanguagePredictions |
| Track_GE_LyricsLanguages |
| Track_GE_Title |
| Track_MB_Arrangers |
| Track_MB_Composers |
| Track_MB_CoverArt |
| Track_MB_CoverArtSmall |
| Track_MB_Duration |
| Track_MB_Engineers |
| Track_MB_Genre |
| Track_MB_ISRC |
| Track_MB_ISWCS |
| Track_MB_Instruments |
| Track_MB_Languages |
| Track_MB_Lyricists |
| Track_MB_LyricsURL |
| Track_MB_MBID |
| Track_MB_Name |
| Track_MB_Performers |
| Track_MB_Producers |
| Track_MB_Programmers |
| Track_MB_Release |
| Track_MB_ReleaseCountry |
| Track_MB_ReleaseLabels |
| Track_MB_ReleaseLanguage |
| Track_MB_ReleaseMBID |
| Track_MB_ReleaseScript |
| Track_MB_StreamingURL |
| Track_MB_Type |
| Track_MB_WorkMBID |
| Track_MB_Year |
| Track_SP_ID |
| Track_SP_Name |
| Track_WD_Composers |
| Track_WD_Description |
| Track_WD_Format |
| Track_WD_Language |
| Track_WD_Lyricists |
| Track_WD_Name |
| Track_WD_QID |
| Track_WD_Tonality |
| Track_WD_Year |
| Track_WD_YouTubeID |
Data Sets
The MIRAGE-MetaCorpus includes the following datasets (row counts in parentheses):
- MIRAGE.csv -- the complete metacorpus (1 million)
- events.csv -- all event-level metadata (1 million)
- tracks.csv -- all track-level metadata (414,886)
- artists.csv -- all artist-level metadata (259,783)
- stations.csv -- all station-level metadata (10,000)
- locations.csv -- all location-level metadata (4,324)
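Because the complete metacorpus is large, it may be preferable to stream it row by row rather than load it into memory. The sketch below counts events per station, which can serve as a sanity check on the 100-events-per-station design. It assumes the CSV files are comma-delimited with a header row containing the field names listed above, and that `Station_RG_ID` identifies a station; neither assumption is confirmed by this description. The function names are illustrative.

```python
import csv
from collections import Counter
from typing import Dict, Iterable


def events_per_station(lines: Iterable[str], key: str = "Station_RG_ID") -> Counter:
    """Count rows per station while reading the CSV one row at a time.

    `lines` is any iterable of text lines, such as an open file handle.
    The default key column is an assumption about the file layout.
    """
    counts: Counter = Counter()
    for row in csv.DictReader(lines):
        counts[row[key]] += 1
    return counts


def stations_off_target(counts: Counter, expected: int = 100) -> Dict[str, int]:
    """Return the stations whose event count differs from `expected`."""
    return {station: n for station, n in counts.items() if n != expected}
```

To run this against the release, pass an open file handle, for instance one obtained with `open("MIRAGE.csv", newline="", encoding="utf-8")` (the encoding is also an assumption).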
Subsets of the MIRAGE-MetaCorpus are also available for events with metadata from online music libraries that reliably matched the event's description in the radio station's stream encoder.
Reliable -- 'All' Sources Match
The first subset ('all' sources) consists of events where the stream description reliably matched with metadata from the WikiData, MusicBrainz, Spotify, and Genius sources (>= .9, normalized edit distance):
- MIRAGE_reliable_all.csv (139,761)
- events_reliable_all.csv (139,761)
- tracks_reliable_all.csv (24,697)
- artists_reliable_all.csv (6,321)
- stations_reliable_all.csv (6,604)
- locations_reliable_all.csv (3,238)
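The description does not spell out how the normalized edit distance is computed. One common reading, sketched below, is a similarity score of 1 minus the Levenshtein distance divided by the length of the longer string, so that identical strings score 1.0 and a threshold of .9 tolerates roughly one edit per ten characters. This is an assumption for illustration, not necessarily the matching algorithm used to build the corpus.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Assumed formula: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```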
Reliable -- 'Core' Sources Match
The second subset ('core' sources) consists of events where the stream description reliably matched with metadata from either the WikiData or MusicBrainz sources (>= .9, normalized edit distance):
- MIRAGE_reliable_core.csv (447,238)
- events_reliable_core.csv (447,238)
- tracks_reliable_core.csv (174,136)
- artists_reliable_core.csv (62,324)
- stations_reliable_core.csv (9,063)
- locations_reliable_core.csv (4,031)
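The two subset definitions can be expressed as row-level predicates. The sketch below assumes that the four `Event_*_MatchReliability` fields hold the match scores on a 0 to 1 scale and that a missing match appears as an empty or non-numeric value; the description does not confirm either point, and the function names are illustrative.

```python
from typing import Mapping

THRESHOLD = 0.9

# 'Core' sources: WikiData and MusicBrainz.
CORE_FIELDS = ("Event_WD_MatchReliability", "Event_MB_MatchReliability")
# 'All' sources: the core sources plus Spotify and Genius.
ALL_FIELDS = CORE_FIELDS + ("Event_SP_MatchReliability", "Event_GE_MatchReliability")


def _score(row: Mapping[str, str], field: str) -> float:
    """Read a score from a CSV row; missing or non-numeric values become NaN."""
    try:
        return float(row.get(field) or "nan")
    except ValueError:
        return float("nan")


def is_reliable_all(row: Mapping[str, str]) -> bool:
    """True when every one of the four sources meets the threshold."""
    return all(_score(row, field) >= THRESHOLD for field in ALL_FIELDS)


def is_reliable_core(row: Mapping[str, str]) -> bool:
    """True when WikiData or MusicBrainz (or both) meets the threshold."""
    return any(_score(row, field) >= THRESHOLD for field in CORE_FIELDS)
```

Comparisons against NaN are always false, so rows with missing scores are excluded from a subset rather than raising an error.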
Contact
If you are a copyright owner for any of the metadata that appears in the MIRAGE-MetaCorpus and would like us to remove your metadata, please contact the developer team at the following email address: miragedashboard@gmail.com
Files
Total size: 4.5 GB
| Checksum | Size |
|---|---|
| md5:f5a122511d3eff66d20cad44a3eb1051 | 48.3 MB |
| md5:9e9e1f0c1d9ab457277c3d7088d33c0e | 3.0 MB |
| md5:029de7a0956b9397a3e79b69c1af1150 | 18.6 MB |
| md5:27cabd203cb38afdacbdc233c9604fb8 | 241.5 MB |
| md5:7f4950d704cf75952ff973e16b64f501 | 32.5 MB |
| md5:097a651290eda6a0da08818c1bda74cd | 107.5 MB |
| md5:26ef9b147875b9b746af32075eda1739 | 4.4 MB |
| md5:40d8969a011e1f119853c01e8d466d50 | 3.0 MB |
| md5:1db588c9b978285449773afaf8f19eff | 3.9 MB |
| md5:e35a445c2989850c48d3d4bba675c505 | 2.2 GB |
| md5:11bf09ae2caa278fe0d14371400cd519 | 381.2 MB |
| md5:74deb6a27fbe5b54528ac7a258708db0 | 1.1 GB |
| md5:2199dbfca0b74dcd68ab7667baa823b4 | 2.5 MB |
| md5:db938c8db6f70fa71978bb0a39cf4189 | 1.6 MB |
| md5:d6f1f8563f69d9d92d9196d4748e3c5f | 2.2 MB |
| md5:a2a1171305009e5d540c0a6258319226 | 172.6 MB |
| md5:d4f2d63720fbb02508e3cc5ee7191696 | 19.5 MB |
| md5:198558d942d3e9740e4c893243f3b6d4 | 104.1 MB |
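The md5 checksums listed above can be used to confirm that a download completed without corruption. The following sketch uses Python's standard `hashlib` module and reads the file in chunks so that multi-gigabyte files do not need to fit in memory; the function names and the chunk size are illustrative choices.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal md5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A call such as `matches_checksum("artists.csv", "md5:...")` returns `True` only when the local file's digest equals the listed value; which checksum belongs to which file should be taken from the record page itself.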
Additional details
Dates
- Created: 2024-07-19 (MIRAGE MetaCorpus v0.2)
- Updated: 2025-12-29 (MIRAGE MetaCorpus v1.0)
Software
- Development Status: Active