Creators! Your YouTube Videos Are Being Used to Train AI Giants' Models Without Your Permission

2024-08-15
2 mins read

It was inevitable. Almost all data on the web is being used by AI giants to train their models, with the help of third-party dataset generators and without any permission. Proof News reveals that even your YouTube videos are being used for that purpose, without any consent.

Will OpenAI destroy Hollywood?

Everybody’s work is being exposed to AI dataset generators

As stated by Proof News: “Apple, Nvidia, Anthropic, and other big tech companies used thousands of swiped YouTube videos to train AI. Creators claim their videos were used without their knowledge.” The research site claims that tech companies are turning to controversial tactics to feed their data-hungry artificial intelligence models, vacuuming up books, websites, photos, and social media posts, often without the creators' knowledge.

OpenAI founder, Sam Altman, and Hollywood.

Remaining in secrecy

Proof News adds that AI companies are generally secretive about their sources of training data, but its investigation found that some of the wealthiest AI companies in the world have used material from thousands of YouTube videos to train AI. They did so despite YouTube’s rules against harvesting materials from the platform without permission. “Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce,” the site adds. Proof News also found material from YouTube megastars, including MrBeast (289 million subscribers, two videos taken for training), Marques Brownlee (19 million subscribers, seven videos taken), Jacksepticeye (nearly 31 million subscribers, 377 videos taken), and PewDiePie (111 million subscribers, 337 videos taken). Some of the material used to train AI also promoted conspiracies such as the “flat-Earth theory.”

Video by Sora

Layers on layers of data: Irreversible process

It’s important to emphasize that training on a dataset is effectively irreversible, since every layer of data builds on other layers. In practice, AI companies cannot undo it (that is, remove one specific video after training) without interfering with what the model has already learned, and the datasets themselves are stitched together from one another.

Post production - post prompt. Filmmaking on Sora. Image: shy kids

YouTube and Sora

Moreover, OpenAI executives have repeatedly declined to publicly answer questions about whether the company used YouTube videos to train its AI product Sora, which creates videos from text prompts. Earlier this year, a reporter with The Wall Street Journal put the question to Mira Murati, OpenAI’s chief technology officer. “I’m actually not sure about that,” Murati replied. To me, that means the answer is ‘Yes’. So next time you upload a video of your Porsche to YouTube, be aware that Sora may be trained on it without your consent. According to those dataset generators, using YouTube content for AI training purposes can be defined as ‘Fair Use’. Yeah, you heard that right. AI image generators think that taking your videos to train on them is Fair Use. Oh, and without any compensation, meaning you get nothing for it. I have a question: WHERE ARE THE LAWYERS?

Sora: Democratization of Filmmaking

Possible solution: Marking and money!

First, YouTube needs to address this ASAP by telling creators clearly when their videos are being used for AI training without their permission. Second, videos used for training should be marked, as well as AI-generated imagery. Every AI video must be marked ‘Made by AI’. Third, creators should get compensated twice: by the dataset generators, and by the AI giants who trained on those datasets. That way, those ‘voluntary’ dataset generators will understand the consequences of harming those who feed them (creators), by paying them money. It’s time to stop this circus. Take an example from Blackmagic.

Yossy is a filmmaker who specializes mainly in action sports cinematography. Yossy also lectures about the art of independent filmmaking in leading educational institutes, academic programs, and festivals, and his independent films have garnered international awards and recognition.
Yossy is the founder of Y.M.Cinema Magazine.
