But it was actually always there… We already have tools to fight it, to a degree
I don’t think traffic hogging and data scraping will ever go away. We were okay with Google doing it because people at least used to click through to your site. Now you get beautiful summary boxes in your AI tool of choice…
Contextually, it’s great. But let’s be honest: how often do you really end up on the source page?
I think AI tools need to cite all the sources they use, but then again, none of us really do…
So what’s the answer to this ongoing phenomenon? Here are some options (not an exhaustive list):
- https://developers.netlify.com/guides/blocking-ai-bots-and-controlling-crawlers/
- https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click/
There are probably more resources… and a bunch of strategies you can use, like:
- robots.txt and AI-specific directives
- CDN/WAF bot management
- Rate limiting & request throttling
- IP blocking & geofencing
- Behavioral analysis
- Honeypots & trap links
- Dynamic DOM obfuscation
- NoAI meta tags & HTTP headers
- Terms of Service enforcement
- Content poisoning
- Anomaly monitoring & alerting
- CAPTCHA
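To make two of these concrete, here is a minimal Python sketch of the first and third items: a robots.txt with AI-specific directives, and a simple per-client rate limiter. Treat it as an illustration under assumptions, not a recommended setup. The user-agent tokens (`GPTBot`, `CCBot`) are just examples and nowhere near a complete list, `example.com` is a placeholder, and `is_allowed` and `SlidingWindowLimiter` are hypothetical names I made up for this sketch. Also keep in mind that robots.txt is advisory: well-behaved crawlers respect it, others may not.

```python
import time
from collections import defaultdict, deque
from typing import Optional
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: blocks two example AI crawlers, allows everyone else.
# The tokens below are examples only; check each vendor's docs for current names.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt above permits this user agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most max_requests per client per window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the caller would typically answer HTTP 429
        hits.append(now)
        return True


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.com/post"))       # crawler named in the rules
    print(is_allowed("Mozilla/5.0", "https://example.com/post"))  # falls through to the * rule

    limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0):
        print(t, limiter.allow("203.0.113.7", now=t))
```

In a real setup, the limiter would be keyed on something like the client IP or an API token and would sit in front of your request handler; CDN/WAF products do roughly this for you at the edge, which is usually the easier route.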
This is a continuation of my “Traffic hogging… Has arrived…” post on LinkedIn:
https://www.linkedin.com/feed/update/urn:li:activity:7313284435347443714/