AI: The Voices Behind the Music

Edward J. Russavage

Jul 25, 2024

By: Ed Russavage and Summer Associate Benjamin Nickerson

(as published in IPWatchdog)

AI has progressed within the last several years to do remarkable things. In February of this year, OpenAI unveiled Sora, an impressive text-to-video AI generator that produces high-quality videos. This new generative AI innovation uses basic text prompts to generate videos up to 60 seconds long. As with all generative AI tools, there are concerns about how the model is trained and the composition of the training data. The potential ethical and societal ramifications of a tool such as this may be cause for alarm, as there are certainly IP concerns that must be addressed before the tool is distributed for public use. OpenAI has started a dialogue with lawmakers and artists as a result.

Hey AI, Make Me A Song by My Favorite Artist…

Text-to-video is not the only innovation to come out of the recent generative AI wave. AI can impersonate musical artists’ voices with surprising accuracy. In April of 2023, an AI-generated song mimicking the vocals of Drake and the Weeknd was uploaded to streaming and social media platforms. The song quickly went viral, amassing over 11 million views on YouTube and over 600,000 streams on Spotify. Thousands of users used the song in TikTok videos, and it is estimated that the song earned approximately $9,400 from global streams. However, neither Drake nor the Weeknd authorized the use of their vocals on the song.

In fact, TikTok is a hotbed of AI-generated songs that impersonate artists’ voices. Simply searching “AI” and your favorite artist’s name can produce remarkable results. For example, the popular TikTok creator SWRVY has amassed nearly 23,000 followers and over 1.5 million likes on videos of AI covers. Many of the creator’s videos use AI to mimic a popular artist’s voice while singing a different artist’s song. Searching “Ariana Grande AI Cover” produces a video by SWRVY of an AI impersonation of Ariana Grande’s voice singing a Billie Eilish song. Again, it is evident that neither artist authorized the use of their voice or works. Due to the viral nature of AI-generated songs, record companies are put in the tough position of trying to defend the incredible volume of copyrighted works to which they hold the licensing rights. Artists are left in an equally difficult spot, as few protections are available to them when their voices are used to train generative models.

Record Companies Allege Infringement

The challenges that result from AI mimicking established music are nowhere better exemplified than in the recent conflict between the leading record companies and the prominent AI music generator services Udio and Suno AI. Both services give users the opportunity to generate radio-length songs from simple text inputs. In June of this year, Universal Music Group (UMG), in collaboration with other leading record companies, sued Udio and Suno AI separately in federal district court for direct copyright infringement. Both complaints focus heavily on the data used to train the services’ AI models. According to the complaint, providing the prompt “james brown 1965 soul rhythm & blues funk i feel good” to Suno AI’s service generated a song that was remarkably similar to the copyrighted work “I Feel Good.” The record companies provided numerous other examples that they allege prove that the services’ models were trained on their artists’ discographies.

To test this theory, a custom prompt containing a snippet of lyrics was used in an attempt to generate a song resembling Cher’s “Believe.” The result was a song with a nearly verbatim copy of the original chorus. UMG argues that the ability to produce sounds or lyrics similar to copyrighted works indicates that the model was trained on those very same copyrighted works. Although Udio argues that its training data is a trade secret, it is hard to deny the similarities between the AI-generated content and the copyrighted songs owned by the record companies.

In fact, neither Udio nor Suno AI disputes the use of copyrighted works in its training data. According to the complaint, the CEO of Suno AI claimed that the service is trained on a mixture of proprietary and public data. Udio made similar statements, indicating that its model was trained on publicly available data scraped from the internet. Both companies claim that their use of the copyrighted materials in their training data falls under fair use.

The record companies disagree. They allege that fair use applies only to human expression and that AI-generated songs cannot be considered human expression. Udio and Suno AI will likely counter that their services cannot generate a song without some form of human input. The dichotomy created by these two arguments raises important legal questions that extend beyond the parties in this suit. For example, at what point, if at all, does generative AI move from human expression (input) to artificially generated content (output)? It is unclear whether the court will even choose to address this question.

Legal Challenges

The legal system has always struggled with protecting artists’ voices. There have been attempts to protect name, image, and likeness through rights of publicity, but those rights are not federally recognized. One of the first landmark court cases dealing with the right of publicity was Midler v. Ford Motor Co. Bette Midler declined an offer for her song to be used in a Ford commercial. Ford then hired an impersonator to re-record Midler’s song, mimicking her voice. Midler sued Ford for violating her right of publicity. The court found that an artist’s voice is part of their identity, and that their likeness cannot be used without their permission. Although a right of publicity is now recognized in 26 states, it has never been recognized federally. As a result, artists may need to look for other avenues to protect their voices from being used in AI-generated content.

One possible avenue, and one that record companies are likely interested in pursuing, is the licensing of copyrighted works for use in the training data for generative AI models. Licensing of their songs is a major revenue stream for record companies. Currently, AI music impersonators are providing a blueprint for bypassing the need to license copyrighted works to train generative AI models. There is a constant tension between the types of data used and how transformative the AI models’ output is. As we wait for the court to address these issues, the industry may be pushed towards a new licensing structure.

The Future of Licensing

When AI companies and musicians work in tandem, the result can be a harmonious relationship that leads to an innovative product. In 2023, YouTube announced a new tool called Dream Track that allows creators to make short songs with AI-generated clones of popular artists’ voices. When the project was announced, nine artists, including Demi Lovato, John Legend, and Sia, agreed to participate by allowing their voices to be cloned. Copyright protects a song’s lyrics and music; however, the law has not evolved to protect an artist’s voice. Regardless, infringement is not at issue in this case because YouTube licensed the content needed to train its models on the participating artists’ voices directly from UMG. Depending on the agreement between the parties, the participating artists will likely be compensated for songs that are generated using a clone of their voice. This new licensing scheme protects AI companies as they develop new models while also providing proper compensation to human creators. As the courts continue to grapple with the generative-AI wave, the future may lie in licensing.