What is the difference between the 'Hybrid MP4' added to OBS and regular MP4?



OBS, the open-source streaming and recording software, added support for 'Hybrid MP4' in version 30.2, which was officially released in July 2024. On its official blog, the OBS development team explains what makes Hybrid MP4 different from regular MP4.

Writing an MP4 Muxer for Fun and Profit | OBS

https://obsproject.com/ja/blog/obs-studio-hybrid-mp4

The predecessor of MP4 is the 'QuickTime File Format (QTFF)' developed by Apple. QTFF, commonly known by the file extension 'MOV', was developed in the 1990s to store and play multimedia content efficiently. In 2001, the International Organization for Standardization (ISO) adopted QTFF as the basis for the MP4 file format. The underlying container structure is specified today as the ISO Base Media File Format (BMFF), ISO/IEC 14496-12 (MPEG-4 Part 12).

The extensibility of the MP4 format comes from its basic structure, a system of 'boxes' or 'atoms.' Each box consists of a size, a type, and the actual data. The size is normally a 4-byte field that gives the overall size of the box including its header (an 8-byte extended size field is available for very large boxes), and the type is a 4-character code that indicates the contents or purpose of the box. In the diagram below, 'ftyp' is the file type box, 'moov' is the movie box, and 'mdat' is the media data box.
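
As a rough illustration of the size/type/data layout described above, the following Python sketch builds three tiny boxes in memory and lists them again. The helper names `make_box` and `list_boxes` are made up for this example, the payloads are placeholders rather than valid MP4 data, and the special size values 0 and 1 are deliberately not handled.

```python
import struct

def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box: 4-byte big-endian size, 4-byte type code, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def list_boxes(data: bytes):
    """Return (type, size) for each top-level box found in data."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8:
            # Sizes 0 and 1 have special meanings that this sketch ignores.
            break
        boxes.append((box_type.decode("ascii"), size))
        offset += size  # the size covers the header too, so this lands on the next box
    return boxes

# Placeholder payloads only; a real file would carry meaningful data here.
sample = (
    make_box(b"ftyp", b"isom" + bytes(4))
    + make_box(b"moov")
    + make_box(b"mdat", bytes(16))
)
print(list_boxes(sample))  # [('ftyp', 16), ('moov', 8), ('mdat', 24)]
```

Because every box announces its own size, a reader can skip over box types it does not understand, which is what makes the format easy to extend.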



Furthermore, boxes can form a hierarchical structure in which some boxes contain other boxes, making it possible to efficiently represent complex data structures. For example, a 'moov' box may contain multiple subboxes containing information and metadata for each track. This structure makes it easy to extend the MP4 format by adding new types of boxes, so that new codec support or DRM features can be added without breaking existing compatibility.
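
The nesting can be sketched in a few lines of Python. This is a simplified illustration, not a full parser: the `CONTAINERS` set lists only a handful of box types that are known to hold child boxes, the helpers `box` and `walk` are hypothetical names, and the payloads of 'mvhd' and 'tkhd' are empty placeholders.

```python
import struct

# A few box types that contain other boxes (not an exhaustive list).
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box: 4-byte size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def walk(data: bytes, depth: int = 0):
    """Return (depth, type) pairs for every box, descending into containers."""
    found = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8 or offset + size > len(data):
            break  # malformed or unsupported size; stop rather than guess
        found.append((depth, box_type.decode("ascii")))
        if box_type in CONTAINERS:
            # The payload of a container is itself a sequence of boxes.
            found += walk(data[offset + 8 : offset + size], depth + 1)
        offset += size
    return found

# A 'moov' with a movie header and two tracks, each with a track header.
moov = box(
    b"moov",
    box(b"mvhd", bytes(4))
    + box(b"trak", box(b"tkhd", bytes(4)))
    + box(b"trak", box(b"tkhd", bytes(4))),
)
tree = walk(moov)
for depth, name in tree:
    print("  " * depth + name)
```

Running it prints 'moov' with 'mvhd' and two 'trak' entries indented beneath it, each 'trak' containing its own 'tkhd'.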

However, the biggest problem with the traditional MP4 format was that it relied on a 'moov' box written at the end of the file, after recording had finished. If recording was interrupted unexpectedly, for example by a power loss, insufficient disk space, or a blue screen, the 'moov' box could not be written correctly and the file became completely unplayable. This was a particularly serious problem for long recordings and for recordings of important events.

This is where 'Fragmented MP4', standardized as part of ISO/IEC 14496-12, came in. It splits the media data into fragments so that a file can still be opened normally even if the PC shuts down or the software crashes during recording, and it is the basis for adaptive bitrate streaming technologies such as HTTP Live Streaming (HLS) and MPEG-DASH.

However, the 'moov' box of a Fragmented MP4 file is incomplete: information about individual samples, such as video frames or audio segments, is stored in a Movie Fragment Box ('moof') at the beginning of each fragment. Because of this design, only a limited number of applications support Fragmented MP4, and some video editing software and players cannot handle such files properly.



Fragmented MP4 files are also slow to open on HDDs and network drives: because the metadata is spread across many fragments, a player has to read metadata from all over the file before playback can start. For the same reason, file browsers such as Windows Explorer cannot easily obtain the total duration of the file and do not display it correctly.

The OBS development team therefore set out to develop a method that behaves as a Fragmented MP4 during recording and can quickly be made to look like a conventional MP4 when recording ends. This method is 'Hybrid MP4'.

Hybrid MP4 essentially acts as a Fragmented MP4 during recording. Each fragment contains a 'moof' box, which describes the media data in that fragment. When recording finishes successfully, OBS performs a process called 'soft remuxing': it writes a complete 'moov' box at the end of the file, just like a normal MP4 file, and this 'moov' box indexes the location and timing of the media samples. It then overwrites the 'free' box at the beginning of the file with the header of a giant 'mdat' box. This giant 'mdat' box essentially contains all the fragments in the file.
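
The finalization step can be mimicked on an in-memory buffer. The Python sketch below is only a toy model of the idea described above and not OBS's actual implementation: the 16-byte size of the 'free' placeholder, the choice to always write a 64-bit 'mdat' header, the helper names, and all payloads are assumptions made for illustration, and the initial 'moov' that a real fragmented file carries is omitted for brevity.

```python
import struct

def box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box: 4-byte size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def top_level_types(data: bytes):
    """List top-level box types, honoring the 64-bit size form (size == 1)."""
    types, offset = [], 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # the real size is stored in the 8 bytes after the type
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        if size < header:
            break
        types.append(box_type.decode("ascii"))
        offset += size
    return types

# While recording: 'ftyp', a 16-byte 'free' placeholder, then fragments.
ftyp = box(b"ftyp", b"isom" + bytes(4))
recording = bytearray(ftyp + box(b"free", bytes(8)))
for _ in range(3):
    recording += box(b"moof", bytes(4)) + box(b"mdat", bytes(32))
before = top_level_types(bytes(recording))

# Finalizing: append a 'moov' (placeholder payload), then overwrite the
# 'free' box with an 'mdat' header that spans everything up to that 'moov'.
moov_offset = len(recording)
recording += box(b"moov", bytes(4))
mdat_start = len(ftyp)
recording[mdat_start : mdat_start + 16] = struct.pack(
    ">I4sQ", 1, b"mdat", moov_offset - mdat_start
)
after = top_level_types(bytes(recording))

print(before)  # ['ftyp', 'free', 'moof', 'mdat', 'moof', 'mdat', 'moof', 'mdat']
print(after)   # ['ftyp', 'mdat', 'moov']
```

After the overwrite, a reader that only understands regular MP4 sees a single 'mdat' followed by a 'moov'; the fragments are still physically present but hidden inside the 'mdat' payload.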



Hybrid MP4 keeps both the per-fragment 'moof' metadata and the regular MP4 'moov' metadata. If recording is interrupted, the fragment metadata can be used; if recording completes normally, the regular MP4 metadata is used. This gives greater file compatibility while remaining tolerant of data loss during recording.

Also, the MP4 box size field is normally 4 bytes, which limits a single box, including the 'mdat' box that holds the media data, to about 4GB. Hybrid MP4 uses the 8-byte extended size field when necessary, so files larger than 4GB can be processed correctly.
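
In the ISO Base Media File Format, the extended form is signaled by setting the 4-byte size field to 1 and storing the real size in an 8-byte field after the type. The Python sketch below shows one way a writer might choose between the two header forms; `box_header` and `read_box_size` are hypothetical helper names, not functions from OBS.

```python
import struct

def box_header(box_type: bytes, payload_size: int) -> bytes:
    """Return an 8-byte header if the box fits in 32 bits, else a 16-byte one."""
    total = 8 + payload_size
    if total <= 0xFFFFFFFF:
        return struct.pack(">I4s", total, box_type)
    # Size field of 1 means "the real size is in the next 8 bytes".
    return struct.pack(">I4sQ", 1, box_type, 16 + payload_size)

def read_box_size(header: bytes) -> int:
    """Read the total box size from either header form."""
    size, _ = struct.unpack_from(">I4s", header, 0)
    if size == 1:
        (size,) = struct.unpack_from(">Q", header, 8)
    return size

small = box_header(b"mdat", 1024)           # 1 KiB of media data
large = box_header(b"mdat", 5 * 1024**3)    # 5 GiB of media data
print(len(small), read_box_size(small))     # 8 1032
print(len(large), read_box_size(large))     # 16 5368709136
```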

Additionally, with the Hybrid MP4 implementation, OBS has also added the ability to integrate additional metadata into the file, such as chapter markers and encoder settings, which are then properly stored within the final moov box. The metadata now also includes the correct creation and encoding dates, so you can keep track of when a file was originally recorded, even if it is renamed.

According to the OBS development team, a few days after integrating Hybrid MP4 into OBS, a patch was submitted to FFmpeg to add functionality similar to Hybrid MP4.

in Software, Posted by log1i_yk