Meta releases music generation AI model as open source, so anyone can create high-quality music from text and audio input



Meta's research team has released 'MusicGen', an AI model that generates music. Because it is open source, anyone can use the model for free, and the published examples and demo make it possible to judge the quality for yourself.

[2306.05284] Simple and Controllable Music Generation

https://doi.org/10.48550/arXiv.2306.05284

MusicGen: Simple and Controllable Music Generation
https://ai.honu.io/papers/musicgen/



MusicGen is a Transformer-based model, like large language models such as ChatGPT. Where a language model predicts the next word in a sentence, MusicGen predicts the next section of music. Training used 20,000 hours of licensed music: an internal dataset of 10,000 high-quality tracks, plus music data from Shutterstock and Pond5. Running MusicGen yourself requires a GPU with at least 16GB of VRAM.
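The "predict the next piece, append it, repeat" idea can be sketched in a few lines of Python. This is only a toy illustration, not MusicGen's actual code: the function names, the vocabulary size of 1024, and the random-walk predictor are all assumptions made for the example. In the real model the predictor is a Transformer and the tokens represent compressed audio.

```python
import random


def toy_next_token(history, vocab_size, rng):
    """Hypothetical stand-in for the Transformer.

    A real model would score every possible next token given the history;
    this toy version just steps to a neighbor of the last token.
    """
    step = rng.choice([-1, 0, 1])
    return (history[-1] + step) % vocab_size


def generate_tokens(start_token, num_steps, vocab_size=1024, seed=0):
    """Autoregressive loop: each new token is predicted from all earlier ones."""
    rng = random.Random(seed)  # fixed seed keeps the toy output reproducible
    tokens = [start_token]
    for _ in range(num_steps):
        tokens.append(toy_next_token(tokens, vocab_size, rng))
    return tokens


if __name__ == "__main__":
    # Prints 11 integers: the start token followed by 10 predicted tokens.
    print(generate_tokens(start_token=5, num_steps=10))
```

The point of the sketch is the loop itself: the output grows one token at a time, and every prediction can see everything generated so far, which is the same pattern a language model uses for words.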

Examples and a demo are provided so that you can hear the quality of the music it generates. For example, one sample was generated from the prompt 'Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach'. The result evokes a beach in a tropical country, and the quality seems quite good.


Another sample was generated from the prompt 'A grand orchestral arrangement with thunderous percussion, epic brass fanfares, and soaring strings, creating a cinematic atmosphere fit for a heroic battle.' The result is music that would not feel out of place in the final boss battle of a game.


Many other examples are posted on the MusicGen introduction page, so check it out if you are interested. The page also lets you compare MusicGen's output with that of other models.



A demo that actually runs the model is published on Hugging Face. Enter text in the input field on the left and click 'Generate', and 12 seconds of music will be generated. The generated music can be played back and downloaded.
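For readers who would rather run the model on their own machine than use the hosted demo, a minimal sketch using Meta's audiocraft Python package might look like the following. It is an illustration under assumptions, not an official recipe: the checkpoint name 'facebook/musicgen-small' depends on the installed audiocraft version, the 12-second duration simply mirrors the demo, and prompt_to_stem is a hypothetical helper invented here to name the output file.

```python
import re


def prompt_to_stem(prompt, max_len=40):
    """Hypothetical helper: turn a prompt into a short, filesystem-friendly name."""
    stem = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return stem[:max_len].rstrip("_") or "clip"


def generate_clip(prompt, duration_s=12.0, checkpoint="facebook/musicgen-small"):
    """Generate one clip from a text prompt and save it as a WAV file.

    Assumes audiocraft and PyTorch are installed; the checkpoint name is an
    assumption and may differ between audiocraft versions.
    """
    # Imported lazily so prompt_to_stem stays usable without audiocraft.
    from audiocraft.data.audio import audio_write
    from audiocraft.models import MusicGen

    model = MusicGen.get_pretrained(checkpoint)
    model.set_generation_params(duration=duration_s)
    wav = model.generate([prompt])  # tensor shaped [batch, channels, samples]

    stem = prompt_to_stem(prompt)
    # audio_write appends the ".wav" suffix to the stem itself.
    audio_write(stem, wav[0].cpu(), model.sample_rate, strategy="loudness")
    return stem + ".wav"
```

Calling generate_clip('Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach') would download the checkpoint on first use and write a WAV file to the current directory.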



When I tried it out, the prompts I came up with myself produced only music that made me feel queasy, no matter how hard I tried. However, the prompt 'A light and cheerly EDM track, with syncopated drums, aery pads, and strong emotions', copied from the sample page, generated decent music on the first try.


Even in an era when AI generates music, it seems that writing good prompts still takes a musical sense.

in Review, Software, Web Application, Posted by log1d_ts