Runway's video generation AI was reportedly trained by scraping videos from photography YouTubers



An investigation by 404 Media has revealed that Gen-3 Alpha, a video generation AI model announced by AI company Runway in June 2024, may have been trained on videos scraped from photography YouTubers.

A leaked internal document shows Runway's celebrated Gen-3 AI video generator collected thousands of YouTube videos and pirated movies for training data.
https://www.404media.co/email/e3836b26-6914-4c1c-a102-bf9735adc3de/



In latest AI training drama, Runway accused of using publicly available YouTube videos - SiliconANGLE
https://siliconangle.com/2024/07/25/latest-ai-training-drama-runway-accused-using-publicly-available-youtube-videos/

Runway Trained Its Video AI By Scraping Popular Photography YouTubers | PetaPixel
https://petapixel.com/2024/07/25/runway-trained-its-video-ai-by-scraping-popular-photography-youtubers/

404 Media has independently obtained internal documents showing that Runway employees were collecting video data from YouTube and pirate video sites to train their latest video generation AI model, Gen-3 Alpha. According to the documents, 'searching for high-quality videos to build AI models' was a company-wide initiative at Runway.

The internal documents obtained exclusively by 404 Media include a list of recommended channels, keywords, and hashtags. The channels include those of popular photography YouTubers such as Kai W, who has over 950,000 subscribers; Peter McKinnon, who has over 5.94 million subscribers; and Michael Shainblum, who has over 120,000 subscribers.

Video sourcing - Jupiter



Additionally, 404 Media found that when it prompted Gen-3 Alpha with the names of some of the YouTubers on the list, the generated videos closely resembled the work of those creators.

For example, a prompt containing the name of Benjamin Hardman, who posts YouTube videos about photographing Iceland, produces output that looks very similar to his videos. The prompt 404 Media used was 'YouTuber Benjamin Hardman appears in the style of his travel videos,' and the image below shows a scene from the generated video.



Other scraped videos include those shot with specific Sony cameras, such as the Sony A7 IV and Sony FX3, likely so that the model can handle prompts like 'generate a video that looks like it was shot with a Sony camera.'

However, what 404 Media obtained was a 'spreadsheet created internally by Runway,' and there is no evidence that these videos were actually used for training. 404 Media asked Runway for comment but had received no response at the time of writing. Runway appears to have begun blocking prompts that contain the names of YouTubers whose videos seem to have been used for training.

Google has stated that using YouTube videos to train AI violates its terms of use.

YouTube CEO says 'Using videos for AI training is against the rules' and 'What's important is that creators succeed on YouTube' - GIGAZINE



in Software, Posted by logu_ii