Microsoft researcher develops a system that allows 'voice input without speaking aloud'

Speech recognition assistants such as Siri, Google Assistant, and Alexa are now widespread, but using them while out and about with other people nearby has a drawback: everyone around you can hear what you are trying to do. A researcher at Microsoft Research has developed a system that allows "voice input without making a sound," overturning the assumption that voice input requires speaking aloud.

SilentVoice: Unnoticeable Voice Input by Ingressive Speech - Microsoft Research

SilentVoice_cameraready_1d_tmp.pdf - UIST2018_SilentVoice_cameraready.pdf
(PDF file)

Normally, whether speaking at ordinary volume or whispering, a person produces voice while exhaling. Masaaki Fukumoto, a researcher at Microsoft Research, devised a voice input system called "Silent Voice" that instead relies on "speaking while inhaling" (ingressive speech).

You can see what kind of voice input system Silent Voice is in the following video.

Silent Voice: Unnoticeable Voice Input by Ingressive Speech (Full Version)

A man facing the camera says in a normal voice, "Hi everyone, this is a demonstration of Silent Voice."

The "Sound Level" meter in the lower left reads about 60 dB.

Next, the man holds a small microphone up against his mouth...

He begins moving his mouth. Nothing audible as speech can be heard in the video, but the "Recognition Result" in the lower right shows "hello everyone this", indicating that the words are being recognized.

The Sound Level meter in the lower left now reads in the upper 30 dB range.

In the video he appears to be merely mouthing the words, with nothing audible, yet the sentence "hello everyone this is silent voice we can use voice input without annoying surrounding" is recognized.

Silent Voice recognizes sound with a plate-type microphone pressed close to the lips.

The key point is that you speak while inhaling, not exhaling. Since people do not normally speak while breathing in, some practice is needed, but according to the paper, anyone can learn to speak while inhaling with about 15 minutes of practice. The tips given are: "inhale only through the mouth, not the nose," "speak while drawing in a small amount of air rather than a large amount," and "exhale before speaking when you want to say a long sentence in one go."

Silent Voice works by picking up, across the slight air gap between the microphone and the lips, the sound produced when the user speaks while inhaling.

Because it recognizes speech produced while inhaling, its defining feature is that the voice input cannot be heard by people nearby. Ordinary speech produces a volume in the upper 60 dB range.

Even when whispering as quietly as possible, the volume still seems to reach about 40 dB.

With speech produced while inhaling, however, the volume is only in the mid-to-upper 30 dB range. This is almost the same as 33.8 dB, the average background noise level in ordinary locations, so with Silent Voice's speaking method the voice input is said to blend into the ambient noise and go unheard.
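To put the levels quoted above in perspective, here is a minimal Python sketch of the decibel arithmetic. The 33.8 dB background and the roughly 40 dB whisper come from the article; the 37 dB figure for speech while inhaling is only an assumed stand-in for "mid-to-upper 30 dB", and the function names are hypothetical.

```python
def margin_db(level_db: float, background_db: float) -> float:
    """How many dB a sound sits above (or below) the background level."""
    return level_db - background_db


def pressure_ratio(delta_db: float) -> float:
    """Convert a dB difference into a sound pressure ratio (20 dB = 10x)."""
    return 10.0 ** (delta_db / 20.0)


BACKGROUND_DB = 33.8  # average background noise quoted in the article

levels = {
    "whisper (about 40 dB)": 40.0,
    "speech while inhaling (assumed 37 dB)": 37.0,  # illustrative value only
}

for name, level in levels.items():
    delta = margin_db(level, BACKGROUND_DB)
    print(f"{name}: {delta:+.1f} dB vs background, "
          f"pressure ratio {pressure_ratio(delta):.2f}x")
```

Under these assumptions, the smaller the margin over the background level, the harder the speech is to pick out from ambient noise, which is the point the article makes.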

In addition, because the sound picked up by the microphone differs greatly between ordinary speech and Silent Voice speech, the system can easily detect that Silent Voice input is in use and recognize the speech. There is therefore no need to manually switch the microphone into a Silent Voice mode before speaking.

The microphone used for Silent Voice can be made extremely small, so it can be built into a ring or a smartphone.

Because speech produced while inhaling is less prone to noise, the microphone can be held right up against the mouth. As a result, the voice input system is said to work even in noisy surroundings.

It also appears that the microphone could be placed inside an implant worn on a tooth, and its use as a kind of human augmentation device is envisioned for the future.

in Hardware, Science, Posted by log1h_ik