'VoiceCraft' can synthesize voice from just a few seconds of voice data
A research team led by the University of Texas at Austin has announced 'VoiceCraft,' an AI capable of
VoiceCraft
https://jasonppy.github.io/VoiceCraft_web/
The newly announced 'VoiceCraft' is a neural codec language model inspired by multimodal models of text and images, enabling zero-shot text-to-speech output, speech synthesis, and speech editing.
VoiceCraft allows you to edit voices in a very natural way. First, here is the original voice saying, 'but the renaissance broke their monopoly on knowledge, one of the most important bastions of the church.'
Next, here is the audio edited with VoiceCraft. It says, 'But the renaissance broke their monopoly on knowledge, with its free movement of research and endless scientific inquiry, one of the most important bastions of the church.' The phrase 'with its free movement of research and endless scientific inquiry' is the part added by VoiceCraft.
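To make the edit concrete, the short Python sketch below compares the two transcripts quoted above and prints the words that appear only in the edited version. It is just an illustration of what 'speech editing' changes at the text level, not part of VoiceCraft itself. The helper name `inserted_spans` is made up for this example, and the transcripts are typed in by hand with spelling and capitalization normalized.

```python
from difflib import SequenceMatcher

# Transcripts copied from the article (capitalization normalized by hand).
original = ("but the renaissance broke their monopoly on knowledge, "
            "one of the most important bastions of the church.")
edited = ("but the renaissance broke their monopoly on knowledge, "
          "with its free movement of research and endless scientific inquiry, "
          "one of the most important bastions of the church.")


def inserted_spans(before: str, after: str) -> list:
    """Hypothetical helper: return word runs present in `after` but not in `before`."""
    a, b = before.split(), after.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    spans = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both mean new words on the edited side.
        if tag in ("insert", "replace"):
            spans.append(" ".join(b[j1:j2]))
    return spans


print(inserted_spans(original, edited))
```

The comparison is word by word, so punctuation stays attached to the neighboring word. In the actual tool, the model has to generate new audio for the added span in the speaker's voice, which a text diff like this does not capture.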
VoiceCraft is available on GitHub and Hugging Face, so you can actually try it out for yourself.
GitHub - jasonppy/VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
https://github.com/jasonppy/VoiceCraft
VoiceCraft - a Hugging Face Space by pyp1
https://huggingface.co/spaces/pyp1/VoiceCraft_gradio
So I decided to try out the VoiceCraft demo available on Hugging Face. When you click on the URL above to access it, you will see the following screen.
Although the demo audio has already been input, I decided to use another
Click 'Transcribe' to transcribe the audio.
Next, enter the text you want to read in the 'Text' field and click 'Run.' For the prompt, I used a passage from the famous
The resulting audio can be played or downloaded using the buttons in the red frame below.
Let's compare the two. First, here is the original audio that was uploaded:
Next, here is the audio created by VoiceCraft: