Creators! AI Giants Are Training on Your YouTube Videos Without Your Permission

2024-08-15

It was inevitable. Almost all data on the web is being used to train AI by the AI giants, with the help of third-party dataset generators and without any permission. ProofNews.org reveals that even your YouTube videos are being used for that purpose, without any consent.

Will OpenAI destroy Hollywood?

Everybody’s work is being exposed to AI dataset generators

As stated by Proof News: “Apple, Nvidia, Anthropic, and other big tech companies used thousands of swiped YouTube videos to train AI. Creators claim their videos were used without their knowledge”. The research site claims that tech companies are turning to controversial tactics to feed their data-hungry artificial intelligence models, vacuuming up books, websites, photos, and social media posts, often unbeknownst to the creators.

OpenAI founder, Sam Altman, and Hollywood.

Remaining in secrecy

Proof News adds that AI companies are generally secretive about their sources of training data, but its investigation found that some of the wealthiest AI companies in the world have used material from thousands of YouTube videos to train AI. The companies did so despite YouTube’s rules against harvesting materials from the platform without permission. “Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce,” the site adds. Proof News also found material from YouTube megastars, including MrBeast (289 million subscribers, two videos taken for training), Marques Brownlee (19 million subscribers, seven videos taken), Jacksepticeye (nearly 31 million subscribers, 377 videos taken), and PewDiePie (111 million subscribers, 337 videos taken). Some of the material used to train AI also promoted conspiracies such as the “flat-Earth theory.”

Video by Sora

Layers on layers of data: Irreversible process

It’s important to emphasize that training on a dataset is practically irreversible, since every layer of what a model learns is built on other layers. In practice, AI companies cannot simply undo it (that is, remove a specific video the model was trained on), since doing so would interfere with everything the model has already computed, and the datasets themselves are stitched together from many sources.

Post production - post prompt. Filmmaking on Sora. Image: shy kids

YouTube and Sora

Moreover, OpenAI executives have repeatedly declined to publicly answer questions about whether the company used YouTube videos to train Sora, its AI product that creates videos from text prompts. Earlier this year, a reporter with The Wall Street Journal put the question to Mira Murati, OpenAI’s chief technology officer. “I’m actually not sure about that,” Murati replied. To me, that means the answer is ‘Yes’. So next time you upload a video of your Porsche to YouTube, be aware that Sora may be trained on it without your consent. According to those dataset generators, using YouTube content for AI training purposes can be defined as ‘Fair Use’. Yeah, you heard right. AI companies think that taking your videos to train on them is Fair Use. Oh, and without any compensation, meaning you get nothing for it. I have a question: WHERE ARE THE LAWYERS?

Sora: Democratization of Filmmaking

Possible solution: Marking and money!

First, YouTube needs to address this ASAP by telling creators whether their videos are being used for AI training without their permission. Second, videos used for training should be marked, as well as AI-generated imagery. Every AI video must be marked ‘Made by AI’. Third, creators should be compensated twice: by the dataset generators, and by the AI giants who trained on those datasets. That way, those ‘voluntary’ dataset generators will understand the consequences of harming those who feed them (the creators), by paying them money. It’s time to stop this circus. Take an example from Blackmagic.

Yossy is a filmmaker who specializes mainly in action sports cinematography. Yossy also lectures about the art of independent filmmaking in leading educational institutes, academic programs, and festivals, and his independent films have garnered international awards and recognition.
Yossy is the founder of Y.M.Cinema Magazine.
