Sites scramble to block ChatGPT web crawler after instructions emerge

Without announcement, OpenAI recently added details about its web crawler, GPTBot, to its online documentation site. GPTBot is the name of the user agent that the company uses to retrieve webpages to train the AI models behind ChatGPT, such as GPT-4. Earlier this week, some sites quickly announced their intention to block GPTBot’s access to their content.

In the new documentation, OpenAI says that webpages crawled with GPTBot “may potentially be used to improve future models,” and that allowing GPTBot to access your site “can help AI models become more accurate and improve their general capabilities and safety.”

OpenAI claims it has implemented filters ensuring that sources behind paywalls, those collecting personally identifiable information, or any content violating OpenAI’s policies will not be accessed by GPTBot.
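For sites that want to opt out, the usual mechanism is a robots.txt rule keyed on the crawler's user agent name. The sketch below is illustrative rather than anything taken from OpenAI's documentation: it uses Python's standard urllib.robotparser to check how a rule set that disallows GPTBot site-wide would be interpreted. The helper name is_allowed, the example.com URL, and the comparison agent "Googlebot" are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks GPTBot from the whole site
# and says nothing about other crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file contents as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", page))
```

A robots.txt rule is only a request: it works to the extent that the crawler chooses to honor it, so it is not an access control.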
