A number of companies have taken major steps to stop scrapers from harvesting their text.
It is the latest front in an ongoing and apparently escalating battle between websites that allow people to read text and the AI companies that wish to use it to build their new tools.
The rise of artificial intelligence has led a number of companies to try to train new and smarter AI technologies. But the large language models that underpin many of them – such as ChatGPT – require vast amounts of text for their training.
That has led some companies to scrape text from the web so that it can be fed into those systems for that training. That in turn has led to frustration from the owners of text-based websites, who argue not only that the companies do not have permission to use their data, but also that the scraping is slowing down the performance of the internet.
Elon Musk, for instance, has repeatedly suggested that X, formerly Twitter, gets a huge amount of traffic from such scraping systems. X is one of many sites that have introduced strict “rate limiting” rules, which try to stop bots from loading their pages too often, though some have suggested that X's limits have also been used to disguise problems with its seemingly troubled website.