This past week, apps such as DALL-E, Midjourney and Stable Diffusion have come under fire for allegedly using real images without permission to train their AI bots.
Artificial Intelligence is the new craze to have hit the internet. The recent explosion in AI tools has left many fascinated by the wonders that AI can offer. Among these wonders is AI’s ability to generate art pieces, especially images that depict human faces.
However, there have been concerns about AI eroding the niche of human creativity. These concerns have been echoed by writers, programmers and artists alike. Most recently, the outcry from artists has been the loudest. The rise of AI image generators like DALL-E has battered the pride of visual culture, leading artists to question the origins of the generated images.
AI’s Dirty Secret
As awe-inspiring as Artificial Intelligence might be, it does not possess the power to conjure images out of its own imagination. Virtual artists have to start somewhere before they can produce an image. And this starting point is a technique known as scraping.
Image scraping is a process used to extract images from websites. It is a form of web scraping, which is a method used to collect data from websites.
The image scraping process begins with the selection of an image source. Typically, this source will be a search engine, an image-sharing platform, or another specific website. Once the source is identified, the scraper determines which images to take from it. This can be done manually or with automated tools.
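To make the selection step concrete, here is a minimal sketch of how an automated tool might list the images on a page. It uses only Python's standard library; the names ImageLinkParser and find_image_urls, the sample page content and the example.com addresses are all made up for illustration and do not describe how any company named in this article works.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collects the src attribute of every <img> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the address of the page.
            self.image_urls.append(urljoin(self.base_url, src))


def find_image_urls(html, base_url):
    """Return the absolute URL of each image referenced in the HTML."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.image_urls


# Hypothetical page content, used only for illustration.
sample_html = (
    '<p>Gallery</p>'
    '<img src="/art/one.png">'
    '<img src="https://cdn.example.com/two.jpg">'
)
print(find_image_urls(sample_html, "https://example.com/gallery"))
```

A real scraper would usually add filtering rules on top of this, such as skipping icons or very small files.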
The scraper will then download the images from the source. This is done by sending a request to the source to retrieve each image. The request identifies the image by its URL, and the source's response carries the image data along with details such as its file type and size.
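The download step can be sketched in a few lines with Python's standard urllib module. This is an illustration under stated assumptions, not the method of any particular tool: the function names, the User-Agent value and the example.com address are placeholders. Only the request is built here; nothing is actually fetched unless download_image is called.

```python
from urllib.request import Request, urlopen


def build_image_request(image_url):
    """Build a GET request that names the image by its URL."""
    # The User-Agent string is a made-up placeholder.
    return Request(
        image_url,
        headers={"User-Agent": "example-scraper/0.1"},
        method="GET",
    )


def download_image(image_url, timeout=10):
    """Fetch one image and return its bytes and the reported content type."""
    with urlopen(build_image_request(image_url), timeout=timeout) as response:
        return response.read(), response.headers.get("Content-Type")


# Hypothetical address, used only to show what the request contains.
request = build_image_request("https://example.com/art/one.png")
print(request.get_method(), request.full_url)
```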
Once the images have been scraped, they are saved to the target destination. The scraper will then modify them. This can include resizing, cropping, or any other modifications, and is done to ensure that the images are suitable for their intended purpose.
Image scraping is increasingly being used to train image generator bots. These are Artificial Intelligence (AI) systems, built on deep learning algorithms, that generate new images from existing ones. They are trained by feeding them a large number of images, after which they can produce new images based on what they were fed.
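As a rough illustration of the modification step, the sketch below works out how a scraped image could be cropped to a centered square and then resized to one common size. It only computes the numbers and leaves the actual pixel work to an imaging library. The function names and the 512-pixel target are assumptions chosen for the example, not figures taken from any tool mentioned in this article.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def preprocess_plan(width, height, target=512):
    """Describe the crop and resize needed for one image."""
    # target=512 is an assumed output size, chosen only for illustration.
    return {
        "crop_box": center_crop_box(width, height),
        "resize_to": (target, target),
    }


# A landscape image loses equal strips from its left and right edges.
print(preprocess_plan(1920, 1080))
```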
When it comes to art, the artist should have control over how their work is presented and distributed. However, image scraping removes that control from them, as anyone can take their artwork and use it for their own purposes through AI. This can lead to the artwork being used without credit being given to the artist, or it can be used for commercial purposes without any payment being made to the artist.
Image scraping can lead to the artwork being sold without the artist’s permission, meaning they may not receive their fair share of the profits. The artwork can be reproduced and sold illegally, which can have a devastating effect on the artist’s income. This is why artists have launched such a loud outcry against what they consider a crime against their creativity through the increasing use of AI.
The Ongoing War On Image Scraping
In January, Meta filed a suit against the surveillance startup Voyager Labs for allegedly obtaining its user data illegally. Following suit, Getty Images sued Stability AI for scraping its content. It stated that,
“Stability AI didn’t seek license to scrape the Getty collection for its own commercial use.”
For its part, Meta claimed that Voyager Labs scraped data from its Facebook and Instagram platforms, as well as from other sites such as Twitter and Telegram. Voyager Labs supposedly accomplished this by creating 38,000 fake profiles and using them to extract public information from over 600,000 users. Meta demanded an end to such activities, as well as compensation for damages incurred. The lawsuit is still ongoing.
Apart from the clash of the big boys, a coalition of artists has taken to the courtroom in protest of image scraping. They sued Stability AI, Midjourney and DeviantArt for copyright infringement, claiming their images and artworks are being used to train the companies’ image generators. The tussle between emerging AI technologies and the rights of the people whose work they use as data is ongoing.