I tried 'audapolis', a free tool for automatic transcription and text-based audio editing



'Audapolis' is an open source media editor with a transcription function. It not only transcribes audio into text automatically, but also lets you edit the audio itself by editing the transcribed text. It is free to use.

GitHub - bugbakery/audapolis: an editor for spoken-word audio with automatic transcription

https://github.com/bugbakery/audapolis

Audapolis is a free, open source application distributed under the AGPL-3.0 license. Visit the GitHub page and click the latest version under 'Releases'. The latest version at the time of writing is v0.3.0.



Audapolis is available for macOS, Linux, and Windows. Since I'm using the Windows version this time, I clicked 'Windows' and downloaded the EXE installer (file size: 181 MB).



Run the installer that you saved locally.



The installation will then start automatically.



Once the installation is complete, the app window will launch and there will be a short loading time.



Once loading is complete, the top screen is displayed. Installation requires no complicated settings or environment setup, so even people without technical knowledge can use it easily. The first time you use it, a pop-up explaining the function will appear, so click 'Next'. After this, a pop-up will appear each time you open a function for the first time, and you can simply click 'Next' to proceed.



Click “Import & Transcribe”.



A message saying 'You need to install the transcription model first' will appear, so select the transcription model for your language. This time, click 'Japanese.'



There are two Japanese-compatible models, 'small' and 'big', which differ in size. 'Big' is more accurate, but its file size is larger. This time, click the download icon for 'big', which has a file size of 1 GB.



Once the model has finished downloading, click “back to home screen” in the upper left to return to the top screen.



When you return to the top screen, click 'Import & Transcribe' again. This will allow you to import media. The file formats that can be imported are MP3, WAV, OGG, WMA, AAC, MP4, MKV, MOV, and WEBM, and you can transcribe not only audio but also video files.
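As a small illustration of the format list above, the following Python sketch checks whether a file name has one of the listed extensions before you try to import it. The function name `is_importable` and the mapping from format names to lowercase file extensions are assumptions made for this example; this is not part of audapolis itself.

```python
from pathlib import Path

# Formats listed in the article as importable. Mapping each format name
# to a lowercase file extension is an assumption made for this sketch.
SUPPORTED_EXTENSIONS = {
    ".mp3", ".wav", ".ogg", ".wma", ".aac",  # audio
    ".mp4", ".mkv", ".mov", ".webm",         # video
}


def is_importable(filename: str) -> bool:
    """Return True if the file extension is in the list above (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

This only looks at the file name, so it cannot tell whether the file contents are actually valid media.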



Once you have loaded the audio file, click 'Transcribe.'



Then the transcription will begin.



The video below shows audapolis transcribing an audio recording of the opening of Natsume Soseki's 'Botchan'. The transcribed text is automatically grouped by speaker. Because this is a reading, there is only one speaker, but the text was split across two or more speakers based on intonation and breathing timing. Transcription accuracy is high: there are some mistakes, but the result seems good enough for practical use as long as the recording quality is reasonable.

I tried transcribing with 'audapolis', which allows free automatic transcription from audio - YouTube


If you find a mistake in the transcribed text, you can edit it by selecting the incorrect part and pressing the 'o' key.



Also, if you select and delete part of the transcribed text, the corresponding audio is cut at the same time. A major feature of Audapolis is this ability to edit audio through text. The following video gives a good idea of how editing works.

'Audapolis' can automatically transcribe Japanese text and automatically edit the audio at the same time - YouTube
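The idea of editing audio through text can be pictured with a short Python sketch. It assumes that each transcribed word carries a start and end time, and that deleting a word removes that time range from the audio. The `Word` class and `kept_segments` function are hypothetical names for this illustration, not audapolis's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # start time in seconds (assumed to come from the transcription)
    end: float    # end time in seconds


def kept_segments(words, deleted):
    """Return the (start, end) time ranges that remain after deleting words.

    `deleted` is a set of indices into `words`. Runs of consecutive kept
    words are merged into one range, so pauses between them are preserved.
    """
    segments = []
    prev_kept = None
    for i, word in enumerate(words):
        if i in deleted:
            continue
        if prev_kept is not None and prev_kept == i - 1:
            # The previous word was also kept: extend the current range.
            start, _ = segments[-1]
            segments[-1] = (start, word.end)
        else:
            # First kept word, or first kept word after a deletion.
            segments.append((word.start, word.end))
        prev_kept = i
    return segments


if __name__ == "__main__":
    words = [
        Word("I", 0.0, 0.2),
        Word("um", 0.2, 0.6),
        Word("like", 0.6, 0.9),
        Word("cats", 0.9, 1.4),
    ]
    # Deleting "um" leaves two ranges of audio to join together.
    print(kept_segments(words, {1}))
```

An editor working this way would then concatenate the remaining ranges to produce the edited audio.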


You can export the edited text and audio by clicking the icon in the upper right and clicking 'Start Export.'



Export formats include audio, video, subtitles, text and timeline.



You can also save the project you're working on by clicking the hamburger icon in the upper left, selecting “File”, and then selecting “Save” (overwrite the current file) or “Save As” (save under a new name).



in Review, Software, Video, Posted by log1i_yk