Apple denies reports that YouTube subtitles were used to train its AI products, saying the data was not used for any of its commercial AI, including 'Apple Intelligence'



Apple and NVIDIA, among others, were criticized over reports that they used 'The Pile,' an AI training dataset created by a non-profit research organization that includes YouTube subtitles. In response, Apple said that it used the dataset for research language models, but not for Apple Intelligence, Apple's AI feature.

Apple says its OpenELM model doesn't power Apple Intelligence amid YouTube controversy - 9to5Mac
https://9to5mac.com/2024/07/17/apple-intelligence-openelm-training-youtube/



According to a July 16, 2024 report by the technology news outlet Proof News, 'The Pile,' a dataset created by the non-profit AI research organization EleutherAI, contained European Parliament documents, English Wikipedia text, and subtitles from more than 170,000 YouTube videos. After examining research papers and public information, Proof News pointed out that Apple, NVIDIA, Bloomberg, the AI development company Anthropic, and others had used 'The Pile.'

It turns out that Apple, NVIDIA, Anthropic and others used YouTube video subtitles to train AI without permission - GIGAZINE



Because Apple had just announced an AI feature called 'Apple Intelligence,' there were suspicions that its training data may have included inappropriate material. Apple later denied this, stating that 'the dataset in question was only used for language models for research purposes and did not affect any of our AI, including Apple Intelligence.'

Apple has built a large language model for research called 'OpenELM,' and the accompanying research paper states that it 'used The Pile.' Apple explained, 'We created the OpenELM model to contribute to the research community and promote the development of open source large language models, not for Apple Intelligence,' emphasizing that the dataset was used solely for research purposes. Apple also said that it has no plans to build a new version of the OpenELM model.



in Software, Posted by log1p_kr