Published September 21, 2025 | Version v2
Conference paper Open

AI-Generated Song Detection via Lyrics Transcripts

Description

The recent rise in the capabilities of AI-based music generation tools has caused an upheaval across the music industry and created a need for accurate methods to detect AI-generated content. Audio-based detectors can serve this purpose; however, they have been shown to struggle to generalize to unseen generators and to perturbed audio. Recent work has instead detected AI-generated music from accurate, cleanly formatted lyrics sourced from a lyrics provider's database. In practice, however, such perfect lyrics are not available (only the audio is), which leaves a substantial gap in applicability to real-life use cases. In this work, we propose closing this gap by transcribing songs with general automatic speech recognition (ASR) models. Once a song is transcribed, its lyrics are again available as text, and established AI-generated text detection methods can be applied; we evaluate several such detectors. Results on diverse, multi-genre, and multilingual lyrics show generally strong detection performance across languages and genres, particularly for our best-performing model, which uses Whisper large-v2 and LLM2Vec embeddings. In addition, we show that our method is more robust than state-of-the-art audio-based detectors when the audio is perturbed in different ways and when it is evaluated on different music generators.
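
The transcribe-then-detect idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: `toy_transcribe` is a hypothetical stand-in for an ASR model such as Whisper large-v2, `toy_embed` is a hypothetical bag-of-words stand-in for LLM2Vec embeddings, the nearest-centroid classifier is a swapped-in simple detector, and the transcripts and labels are made up.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Dict, List

Vector = Dict[str, float]

# Made-up transcripts standing in for ASR output; not data from the paper.
FAKE_TRANSCRIPTS = {
    "ai_1.wav": "neon dreams in the neon night",
    "ai_2.wav": "neon skies and neon dreams forever",
    "human_1.wav": "dusty road my old truck and the rain",
    "human_2.wav": "rain on the old tin roof again",
    "query.wav": "neon dreams tonight",
}


def toy_transcribe(audio_path: str) -> str:
    """Hypothetical ASR stand-in: looks up a canned transcript."""
    return FAKE_TRANSCRIPTS[audio_path]


def toy_embed(text: str) -> Vector:
    """Hypothetical embedding stand-in: sparse bag-of-words counts."""
    return dict(Counter(text.lower().split()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of sparse vectors."""
    total: Dict[str, float] = {}
    for vec in vectors:
        for key, value in vec.items():
            total[key] = total.get(key, 0.0) + value
    return {key: value / len(vectors) for key, value in total.items()}


class TranscriptDetector:
    """Audio -> transcript -> embedding -> nearest class centroid."""

    def __init__(self, transcribe: Callable[[str], str],
                 embed: Callable[[str], Vector]) -> None:
        self.transcribe = transcribe
        self.embed = embed
        self.centroids: Dict[str, Vector] = {}

    def _features(self, audio_path: str) -> Vector:
        # Only the audio is assumed available; lyrics come from transcription.
        return self.embed(self.transcribe(audio_path))

    def fit(self, paths: List[str], labels: List[str]) -> "TranscriptDetector":
        grouped: Dict[str, List[Vector]] = {}
        for path, label in zip(paths, labels):
            grouped.setdefault(label, []).append(self._features(path))
        self.centroids = {lab: mean_vector(vecs) for lab, vecs in grouped.items()}
        return self

    def predict(self, audio_path: str) -> str:
        features = self._features(audio_path)
        return max(self.centroids,
                   key=lambda lab: cosine(features, self.centroids[lab]))


detector = TranscriptDetector(toy_transcribe, toy_embed).fit(
    ["ai_1.wav", "ai_2.wav", "human_1.wav", "human_2.wav"],
    ["ai", "ai", "human", "human"],
)
prediction = detector.predict("query.wav")
```

Because the transcriber and embedder are passed in as plain callables, a real ASR model and a real text embedding model could be substituted without changing the detector class; the toy components only show how the stages connect.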

Files

000013.pdf (200.3 kB)
md5:a01b2c5294886f61c4cf74c331c43d02