Researchers develop technology that fixes mistakes in a video just by editing the text



If a speaker makes a mistake in a video, forcibly editing the audio alone leaves the mouth movements and the audio splices looking unnatural, so the scene has to be reshot. However, with technology jointly developed by researchers from Stanford University, Max Planck Institute for Informatics, Princeton University, and Adobe, simply editing the text of what is said in the video makes it possible to correct the video and the audio at the same time.

[1906.01524] Text-based Editing of Talking-head Video
https://arxiv.org/abs/1906.01524


Text-based Editing of Talking-head Video
https://www.ohadf.com/projects/text-based-editing/


The following video shows what kinds of corrections are actually possible.

Text-based Editing of Talking-head Video (SIGGRAPH 2019)-YouTube


For example, the following is a video of a woman saying 'Apple's stock closed at $191.45.'



If you want to change what the woman says in the video to 'Apple's stock closed at $182.25,' you only need to edit that part of the text. The woman's face and voice are then edited together, producing a video in which she speaks the corrected line with almost nothing looking out of place.



The system works as follows. When a video is input, its speech is broken down into phonemes, and the phonemes are linked to the text. When the text is edited, the corrected speech is assembled from other parts of the recorded speech and substituted in. At the same time, a model of the speaker's face is generated, tracked, and synthesized to match the corrected speech.
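The step of assembling corrected speech from other parts of the recording can be illustrated with a small sketch. This is not the researchers' implementation; it only assumes that the recording has already been aligned into phonemes with frame positions, and it covers the edited phoneme sequence greedily with the longest runs found in the source. The names `Phoneme`, `find_segments`, and `to_frame_ranges` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phoneme:
    """One aligned phoneme (hypothetical structure for this sketch)."""
    label: str   # phoneme symbol, e.g. "AH"
    start: int   # first frame of the phoneme in the source video
    end: int     # frame just past the end of the phoneme


def find_segments(source, target_labels):
    """Greedily cover target_labels with the longest runs found in source.

    Returns (start_index, length) pairs that index into source.
    Raises ValueError if a target phoneme never occurs in the source.
    """
    labels = [p.label for p in source]
    segments = []
    pos = 0
    while pos < len(target_labels):
        best_start, best_len = -1, 0
        for i in range(len(labels)):
            # Count how many consecutive phonemes match from position i.
            n = 0
            while (i + n < len(labels)
                   and pos + n < len(target_labels)
                   and labels[i + n] == target_labels[pos + n]):
                n += 1
            if n > best_len:
                best_start, best_len = i, n
        if best_len == 0:
            raise ValueError(f"phoneme {target_labels[pos]!r} not in source")
        segments.append((best_start, best_len))
        pos += best_len
    return segments


def to_frame_ranges(source, segments):
    """Convert (start_index, length) pairs into (start, end) frame ranges."""
    return [(source[s].start, source[s + n - 1].end) for s, n in segments]


# Toy source: the phonemes of "one two three", one frame per phoneme.
source = [Phoneme(label, i, i + 1)
          for i, label in enumerate(["W", "AH", "N", "T", "UW", "TH", "R", "IY"])]

# Edited text "two one" needs the phonemes T UW W AH N.
segments = find_segments(source, ["T", "UW", "W", "AH", "N"])
print(segments)                           # [(3, 2), (0, 3)]
print(to_frame_ranges(source, segments))  # [(3, 5), (0, 3)]
```

In this toy case the edited line is built from two snippets of the original recording. A real system would also have to blend the snippets and re-render the face so the joins are not visible, which this sketch does not attempt.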



The image below shows the original video before correction on the left, the video during the correction process in the middle, and the corrected video on the right. The modified audio sounds natural, and the video is processed to match it. The accuracy is not high enough to say that the mouth movements and the audio are perfectly synchronized, but the motion is natural enough that it is hard to tell at first glance that the video has been edited.



With this technology, you can add new words to what is said in the video ...



... or remove spoken words entirely, as if they had never been said.



The following shows the correction of a video in which a man says an extra 'and' while speaking. For example, even if the presenter in a recorded presentation makes heavy use of so-called 'filler' words, they can be removed naturally from the video to make it easier to watch. In addition, if an actor flubs a line while shooting a film, the flubbed take can be corrected easily, so the technology is expected to help keep film budgets down.

in Software,   Video, Posted by log1i_yk