An engineer's report on a week of recording himself 24 hours a day and transcribing it with Whisper to keep a 'fully automatic dictation diary'




Whisper, developed by the AI research organization OpenAI, is a speech recognition model that transcribes recordings with very high accuracy; it is open source and free to use. On his blog, engineer Rober Dam reports the results of testing a system that keeps a 'fully automatic dictation diary' by recording his daily activities on a smartphone and transcribing the recordings with Whisper.

I record myself on audio 24x7 and use an AI to process the information.
https://roberdam.com/en/wisper.html

Mr. Dam says the idea for the fully automatic dictation diary started with the thought that 'if a smartphone with more than 1 TB of built-in storage appeared, it would be possible to keep recording 24 hours a day, 365 days a year.' He added that OpenAI's public release of Whisper in September 2022 made the fully automatic dictation diary a realistic idea.

OpenAI announces high-performance transcription AI 'Whisper', supports Japanese and can transcribe tongue twisters and lyrics with high accuracy - GIGAZINE



Mr. Dam bought a Chinese-made microphone and a miniature recorder and kept them recording at all times while speaking his actions aloud. By saying 'Robert' before whatever he wants to log and 'End Robert' after it, he marks the spans to be kept. At the end of each day, a system he developed runs the recordings through Whisper, converts everything marked by the 'Robert' command into a text file, and automatically summarizes the content.
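The 'Robert' ... 'End Robert' marking can be illustrated with a short sketch. This is not Mr. Dam's code: it assumes the day's transcript is already available as one plain string (for example, the text returned by Whisper), and the function name `extract_commands` and the lenient handling of punctuation and capitalization are choices made here for illustration.

```python
import re

# Assumed wake phrases, taken from the article's examples. A transcriber may
# add punctuation or change capitalization, so the pattern is lenient.
COMMAND_RE = re.compile(
    r"\brobert\b[\s,.:]*(.*?)[\s,.:]*\bend\s+robert\b",
    re.IGNORECASE | re.DOTALL,
)


def extract_commands(transcript: str) -> list[str]:
    """Return the text spoken between each 'Robert' ... 'end Robert' pair."""
    commands = []
    for match in COMMAND_RE.finditer(transcript):
        # Collapse runs of whitespace so each command is a single clean line.
        commands.append(" ".join(match.group(1).split()))
    return commands


if __name__ == "__main__":
    text = (
        "so anyway Robert WEIGHT 62.8 end Robert and then we went home. "
        "Robert, NOTE buy milk. End Robert."
    )
    print(extract_commands(text))
```

Everything outside a marked span is simply ignored, which is what lets the recorder run all day while only the deliberate entries reach the diary.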

As for why he did not use Google Assistant and its 'OK Google' wake word, Mr. Dam gives three reasons: 'with "OK Google" I often cannot tell whether it will respond interactively or just return Google search results,' 'commands that start with "OK Google" are saved at Google as audio files,' and 'there is a delay when sending commands to Google.'

For example, say 'Robert WEIGHT 62.8 end Robert' to record your weight for the day.



'Robert SLEEP 7 hours 14 minutes end Robert' records a sleep time of 7 hours 14 minutes.



Saying 'Robert LUNCH two toasts with a fried egg end Robert' not only records what was eaten for lunch but also automatically calculates the calories of the meal using an external API.



By saying 'Robert NOTE the podcast talks about Morgan Housel's book the psychology of money end Robert', you can note down thoughts and ideas even while driving.
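The keyword examples above (WEIGHT, SLEEP, LUNCH, NOTE) suggest a simple structure: the first word selects the type of entry and the rest is the value. The following is a minimal sketch under that assumption, not the author's implementation. The `Entry` class and `parse_command` function are hypothetical names, the input is the text already stripped of the 'Robert' markers, and the calorie lookup through an external API is left out because the article does not say which API is used.

```python
import re
from dataclasses import dataclass


@dataclass
class Entry:
    kind: str      # keyword such as "WEIGHT", "SLEEP", "LUNCH", "NOTE"
    value: object  # float for WEIGHT, minutes (int) for SLEEP, text otherwise


# Matches durations written with digits, e.g. "7 hours 14 minutes".
# Durations spelled out in words ("seven hours") are not handled here.
SLEEP_RE = re.compile(
    r"(?:(\d+)\s*hours?)?\s*(?:(\d+)\s*minutes?)?", re.IGNORECASE
)


def parse_command(command: str) -> Entry:
    """Split a command into its keyword and a typed value."""
    keyword, _, rest = command.strip().partition(" ")
    kind = keyword.upper()
    rest = rest.strip()
    if kind == "WEIGHT":
        # Drop trailing punctuation a transcriber may have appended.
        return Entry(kind, float(rest.rstrip(".,")))
    if kind == "SLEEP":
        match = SLEEP_RE.match(rest)
        hours = int(match.group(1) or 0)
        minutes = int(match.group(2) or 0)
        return Entry(kind, hours * 60 + minutes)
    # LUNCH, NOTE, and any other keyword keep their free text as-is.
    return Entry(kind, rest)


if __name__ == "__main__":
    for spoken in (
        "WEIGHT 62.8",
        "SLEEP 7 hours 14 minutes",
        "LUNCH two toasts with a fried egg",
    ):
        print(parse_command(spoken))
```

Storing sleep as total minutes is a choice made for this sketch so that the values can be charted directly.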



The recordings are then transcribed with Whisper, converted into data, summarized, and displayed on a control panel. The panel presents the week's changes in body weight and sleep time, the day's calorie intake, the total spent on gasoline and shopping, and 'Today's Notes' in an easy-to-understand way.
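How the parsed entries might feed such a panel can be sketched as a small aggregation step. This is an illustration only: the article does not describe the data model, so the tuple layout `(day, kind, value)`, the function name `summarize`, and the keyword names `GAS` and `SHOPPING` are all assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical grouping of keywords: some are charted day by day,
# others are added up over the period.
SERIES_KINDS = {"WEIGHT", "SLEEP"}
TOTAL_KINDS = {"GAS", "SHOPPING"}


def summarize(entries):
    """Aggregate (day, kind, numeric value) tuples for a dashboard."""
    series = defaultdict(dict)   # kind -> {day: last value recorded that day}
    totals = defaultdict(float)  # kind -> sum over all entries
    for day, kind, value in entries:
        if kind in SERIES_KINDS:
            series[kind][day] = value
        elif kind in TOTAL_KINDS:
            totals[kind] += value
    return {
        "series": {k: dict(sorted(v.items())) for k, v in series.items()},
        "totals": dict(totals),
    }


if __name__ == "__main__":
    week = [
        (date(2022, 10, 4), "WEIGHT", 62.5),
        (date(2022, 10, 3), "WEIGHT", 62.8),
        (date(2022, 10, 3), "GAS", 40.0),
        (date(2022, 10, 4), "GAS", 12.5),
    ]
    print(summarize(week))
```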



In addition, a 'My Journal' function automatically creates a diary summarizing the day by listing what he said during each hour of the day in chronological order.
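A chronological, hour-by-hour journal of this kind could be assembled roughly as follows. The sketch assumes each note already carries a timestamp (the article does not say how times are obtained), and `build_journal` and the sample notes are hypothetical.

```python
from datetime import datetime
from itertools import groupby


def build_journal(notes):
    """Group (timestamp, text) notes by hour, in chronological order."""
    ordered = sorted(notes, key=lambda note: note[0])

    def hour_of(note):
        # Truncate the timestamp to the start of its hour.
        return note[0].replace(minute=0, second=0, microsecond=0)

    lines = []
    for hour, group in groupby(ordered, key=hour_of):
        texts = "; ".join(text for _, text in group)
        lines.append(f"{hour:%H:%M} {texts}")
    return lines


if __name__ == "__main__":
    notes = [
        (datetime(2022, 10, 3, 13, 40), "lunch"),
        (datetime(2022, 10, 3, 8, 5), "left for work"),
        (datetime(2022, 10, 3, 8, 50), "podcast about money"),
    ]
    for line in build_journal(notes):
        print(line)
```

Sorting before grouping matters here because `itertools.groupby` only merges adjacent items with the same key.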



'If everything you say is recorded, it's a little less paranoid,' Dam said, adding, 'It's about 100%.' He also noted that a useful record needs not only what was said but also 5W1H context such as 'when, where, with whom, and how,' so there is a limit to how much information recording and transcription alone can preserve.

After trying the fully automatic dictation system for a week, Mr. Dam said, 'It's magical to be able to fully relive everything you did that day from a small conversation,' and 'By recording and analyzing everything that comes out of your mouth and every exchange, you can see things in a way that was impossible until now.' He also pointed out that 'the difference between utopia and dystopia is who has access to that information.'

in Software, Posted by log1i_yk