ChatGPT can't make music, but Google's new AI model can

Google is sharing information about a new, experimental AI model, MusicLM, which can create a song from a simple text input.
Written by Sabrina Ortiz, Editor

A major barrier to entry in the music industry is production costs. Even once an artist collects the funds, finding a music producer and studio to meet their needs can be extremely challenging. So, what if you could just tell your computer to make the beat you envision? With Google's MusicLM model, generating music from text could be a reality.

Last week, Google released an academic paper describing MusicLM, a generative AI model that makes music from users' text prompts. The model can produce anything from a 10-second audio clip to a full song, following as many specific details as you give it. It can also take an existing song and render it with a different sound.

According to the paper, prompts for the AI model can include detailed descriptions such as "enchanting jazz song with a memorable saxophone solo and a solo singer" or "Berlin 90s techno with a low bass and strong kick." Google has also published audio samples that show off the model's different prompts and abilities.

To create the music, the system is trained on a dataset of 280,000 hours of unlabeled music, which teaches MusicLM to generate long, coherent music at 24 kHz, according to the paper.

This isn't Google's, or the industry's, first attempt at an AI song system. OpenAI, the AI research company behind ChatGPT and DALL-E, has its own version, Jukebox, which has yet to be offered as a public-facing tool. Riffusion, a neural network that produces music using images of sound, is already available to the public.

But according to Google, its new system is better than anything done before: "Our experiments show through quantitative metrics and human evaluations that MusicLM outperforms previous systems such as Mubert and Riffusion, both in terms of quality and adherence to the caption."  

So, when will we be able to use this "better than anything out there" AI model? The answer is, unfortunately, not any time soon. 

In the paper, Google acknowledges the risks these kinds of models pose: the potential misappropriation of creative content, biases inherent in the training data that could affect cultures underrepresented in it, and fears over cultural appropriation. For all of these reasons, Google says it has no plans to release models at this point.

Recently, we have seen AI models raise exactly the risks Google describes. With the release of AI-generated art tools such as Lensa's Magic Avatars, artists have been speaking out about their art being stolen by AI art models without credit or compensation.

At the same time, the sudden interest in AI tools such as ChatGPT has reportedly prompted Google to consider rolling out AI-based products more quickly.
