The record of an AI developer running Gemma 4 on a MacBook to generate 'descriptions for a large number of video files' is interesting.



Videos shot with cameras and smartphones are given sequential filenames such as 'P1013593.MP4' or 'IMG_0034.MOV'. Because of this, when you shoot a large number of videos, you often end up in a situation where you 'don't know what's in which video'. In such cases, the official blog of the AI company SimbaStack has posted a record of the process of 'running an AI model such as

Gemma 4 on a MacBook to generate a large number of index files that include video descriptions', which may be helpful.

While I slept, my 5-year-old MacBook ran Gemma 4 locally and indexed a year of video — simbastack
https://blog.simbastack.com/indexed-a-year-of-video-locally/

The blog author spends half the year in Maasai Mara, Kenya, and shoots a large amount of video using devices such as the Nikon Z8, DJI Pocket, and Ray-Ban Meta. As the time he spends on video editing has decreased, he tried to experiment with a system that 'uses AI to stitch together video clips,' but he ran into the problem that 'an index file explaining the videos is needed.'

The blog author created the index using a MacBook with an M1 Max processor, which was released in 2021. The index is a Markdown file that records the video's metadata and 'what is shown' in it, and it serves as a starting point for AI to find videos.



The software used for index creation and its purpose are as follows:

ffprobe: Reads video metadata
ffmpeg: Extracts 5 frames from the video.
- exiftool: Reads GPS information
Nominatim: Converts GPS information into address information.
• WhisperX: Translates audio into text
・insightface: Face recognition
• Visual language model: Creates descriptive text for what is shown in the video.

The visual language model used was Gemma 4 31B Q4, and it was run in LMStudio.



MacBooks equipped with the M1 Max had 64GB of memory, but that wasn't enough, and apparently, at its peak, a swap file of 50.89GB was created.



After spending a full day creating an index on a MacBook with an M1 Max processor, I successfully assigned the same index name to all video files.



The blog author has made the environment used for indexing available on GitHub so that it can be cloned.

GitHub - Simbastack-hq/framedex: Framedex — a queryable knowledge base for your video archive · GitHub
https://github.com/Simbastack-hq/framedex

in AI,   Software, Posted by log1o_hf