'Neural Voice Camouflage', an AI that prevents eavesdropping by AI, has appeared



By hiding 'spyware' on computing devices such as PCs and smartphones, attackers can eavesdrop on users' conversations, and even Amazon's smart speaker Echo can be turned into an eavesdropping device, so the threat of eavesdropping is becoming increasingly familiar. 'Neural Voice Camouflage' is an AI technology that protects users from such eavesdropping by generating custom audio noise in the background, rendering any recorded speech unintelligible to speech-recognition systems.

Real-Time Neural Voice Camouflage | OpenReview
https://openreview.net/forum?id=qj1IZ-6TInc

Is technology spying on you? New AI could prevent eavesdropping | Science | AAAS
https://www.science.org/content/article/technology-spying-you-new-ai-could-prevent-eavesdropping

'Neural Voice Camouflage' is an AI technology built on 'adversarial attacks', attacks that deceive AI, using machine learning to fine-tune audio so that an AI mistakes it for something else. The journal Science describes adversarial attacks as 'using AI to fool another AI.'

'Using AI to fool another AI' may sound simple, but the process is not as easy as it seems. Audio processing based on adversarial attacks normally requires operating on the entire audio clip at once, which makes real-time processing difficult.
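To make the idea of an adversarial attack on audio concrete, here is a minimal, hypothetical sketch (not the authors' method): a toy linear 'recognizer' scores an audio clip, and an FGSM-style step perturbs every sample against the gradient of that score. The names `recognizer_score` and `fgsm_perturb` are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def recognizer_score(audio, weights):
    """Toy stand-in for an ASR model: a single linear score w . x."""
    return float(audio @ weights)

def fgsm_perturb(audio, weights, epsilon=0.01):
    """For a linear score w . x, the gradient w.r.t. x is just w, so the
    FGSM step x - epsilon * sign(w) is guaranteed to lower the score."""
    return audio - epsilon * np.sign(weights)

weights = rng.normal(size=16000)        # pretend 1 s of audio at 16 kHz
audio = rng.normal(size=16000) * 0.1    # quiet random "speech"

clean = recognizer_score(audio, weights)
adv = recognizer_score(fgsm_perturb(audio, weights), weights)
print(adv < clean)                      # the tiny perturbation lowers the score
```

A real attack would backpropagate through a full ASR network rather than a linear score, which is exactly why it normally needs the whole clip in advance.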

Neural Voice Camouflage, however, uses neural networks, brain-inspired machine learning systems, to effectively predict the future. By training the network on hours of recorded speech, the research team enabled it to continuously process two seconds of audio at a time, predict the speech that is likely to follow, and keep generating noise tuned to mask it.

For example, if someone says 'enjoy the banquet', you cannot predict exactly which word will come next. However, by taking into account the characteristics of the speaker's voice and what has just been said, the system can generate audio that makes whatever phrase follows unintelligible. Moreover, because the sound generated by Neural Voice Camouflage resembles ordinary background noise to humans, it can interfere only with the AI used for eavesdropping, without disturbing the conversation itself.
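The streaming idea described above, process the most recent two seconds and have the masking noise for the next moment ready in advance, can be sketched as follows. `predict_camouflage` is a hypothetical stand-in for the trained predictive network (here it merely scales white noise to the speaker's recent loudness); only the two-second chunk length comes from the article.

```python
import numpy as np

SAMPLE_RATE = 16000
CHUNK_SECONDS = 2                        # the system streams ~2 s of audio at a time
CHUNK = SAMPLE_RATE * CHUNK_SECONDS

def predict_camouflage(past_chunk):
    """Hypothetical stand-in for the trained network: given the last two
    seconds of speech, emit noise intended to mask what comes NEXT.
    Here we just scale white noise to the speaker's recent RMS energy."""
    rms = np.sqrt(np.mean(past_chunk ** 2))
    return np.random.default_rng(0).normal(scale=rms, size=CHUNK)

def stream_camouflage(mic_chunks):
    """Core real-time idea: while chunk t is being spoken, the noise for
    chunk t+1 has already been computed from chunk t, so playback never
    has to wait for the whole recording."""
    pending_noise = np.zeros(CHUNK)
    for chunk in mic_chunks:
        yield pending_noise              # play noise predicted one step ago
        pending_noise = predict_camouflage(chunk)

speech = [np.random.default_rng(i).normal(scale=0.1, size=CHUNK) for i in range(3)]
noises = list(stream_camouflage(speech))
print(len(noises), np.allclose(noises[0], 0))
```

The one-chunk lookahead is the whole trick: prediction replaces the need to see the full utterance before computing the adversarial noise.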



The development team verified the effectiveness of Neural Voice Camouflage against an automatic speech recognition (ASR) system. When the audio was processed with Neural Voice Camouflage, the ASR system's word recognition accuracy dropped from 88.7% to 19.8%. By contrast, accuracy remained at 87.2% when only white noise was added to the audio, and at 79.5% with adversarial noise lacking Neural Voice Camouflage's predictive component, making it clear that those approaches are of almost no use for preventing eavesdropping.
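The 'word recognition accuracy' figures above are conventionally computed by aligning the ASR transcript against a reference transcript. A minimal sketch of that metric, one minus the word error rate, clamped at zero, might look like this; it is the standard Levenshtein-based formulation, not necessarily the paper's exact evaluation code.

```python
def word_accuracy(reference, hypothesis):
    """Word-level accuracy = max(0, 1 - WER), with WER computed from the
    Levenshtein edit distance between the word sequences."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return max(0.0, 1 - dp[-1][-1] / len(ref))

# The garbled transcription from the speaker test (below) scores zero:
print(word_accuracy("i also just got a new monitor",
                    "with reasons with they also toscat and neumanitor"))
```

On this metric, a transcript sharing almost no words with the reference bottoms out at 0% accuracy, which is what a successful camouflage aims for.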

Furthermore, even when the ASR system was retrained to recognize speech processed by the anti-eavesdropping technology, Neural Voice Camouflage still held its word recognition accuracy down to 47.5%. The researchers note that the words the camouflage failed to obscure were mostly short words such as 'the', which are the least revealing parts of a conversation.

In addition, the Neural Voice Camouflage development team tested placing a speaker in the same room as the recording microphone and playing noise from it to thwart eavesdropping. This test was also successful: for example, the spoken sentence 'I also just got a new monitor' was transcribed by the ASR system as 'with reasons with they also toscat and neumanitor'.



'This is just the first step in using AI to protect privacy,' said Mia Chiquier, a computer scientist at Columbia University who led the study. 'AI collects data about our voices, our faces, and our actions. We need a new generation of technology that respects our privacy.'

Chiquier argues that the predictive component of Neural Voice Camouflage has great potential for self-driving cars and other applications that require real-time processing: self-driving technology must anticipate where the car will go next and where pedestrians will be. The human brain makes such predictions naturally, and Chiquier said that '(our system) emulates the way humans do things.'

'Combining the classic machine learning problem of predicting the future with the separate problem of adversarial attacks is great,' said Andrew Owens, a computer scientist at the University of Michigan, Ann Arbor, who studies audio processing. Bo Li, a computer scientist at the University of Illinois Urbana-Champaign who studies audio-based adversarial attacks, was likewise impressed by the new approach's performance against ASR. And Jay Stanley, a senior policy analyst at the American Civil Liberties Union, called voice camouflage an important technology.

in Software, Posted by logu_ii