this post was submitted on 03 Feb 2024
192 points (99.0% liked)

Technology


Google will no longer back up the Internet: Cached webpages are dead. Google Search will no longer make site backups while crawling the web.

top 16 comments
[–] wosat@lemmy.world 59 points 9 months ago (1 children)

To be clear, Google will still be storing copies of the pages they crawl. They just won't be making those copies available to end users.

[–] BananaTrifleViolin@lemmy.world 8 points 9 months ago

Absolutely. This just locks off users' access and is an example of enshittification: zero benefit to users, but benefits to Google (more click-through to sponsored links, and probably blocking AI content crawlers).

[–] JimmyBigSausage@lemm.ee 46 points 9 months ago (1 children)

Internet Archive busy at it.

[–] bloopernova@programming.dev 16 points 9 months ago

And Kagi search results have additional links to archived versions. It's very useful.

[–] Grimm665@lemmy.world 26 points 9 months ago* (last edited 9 months ago) (2 children)

Does anyone else remember growing up being told, "Watch what you put on the internet! It'll be there forever!"? Now it seems more and more like things on the internet won't be there forever unless someone specifically wants them to be. I seem to be having a harder and harder time digging up parts of the internet I remember from my childhood; the old parts are slowly being erased by entropy and a lack of desire to keep them there.

[–] AtariDump@lemmy.world 10 points 9 months ago

The internet has selective amnesia

[–] emeralddawn45@discuss.tchncs.de 5 points 9 months ago* (last edited 9 months ago) (1 children)

I mean sure, searches are basically useless now and the internet is filled with AI and SEO garbage, but most everything is also still out there /somewhere/, even if it may only be in, like, the archive of the NSA. Plus, when AI gets a bit better, certain people will probably be able to link everything you've ever said to your "advertising profile" (Google basically already does this). Plus, I've been saying for years that soon enough there'll be a facial recognition crawler app where you upload a photo of someone and it shows you every picture they've ever appeared in. Although with how good deepfakes are now, this is arguably less concerning.

[–] Firenz@lemmy.world 2 points 9 months ago

I’ve been wondering for a while now if we’re even getting real results or if the SEO results have made crawlers (as we know them) largely redundant. I doubt Google has any real incentive to move away from the model that is now in place, unless they intend to launch a paid service (free from the above-mentioned garbage) alongside the existing one, should competitors like Kagi present a real threat to their business.

[–] kowcop@aussie.zone 18 points 9 months ago (1 children)

Lemme guess... they are worried companies are using it to train AI, so better to close it off so they control access to it.

[–] Cqrd@lemmy.dbzer0.com 2 points 9 months ago

Maybe, but it also kills one of the better paywall avoidance methods

[–] autotldr@lemmings.world 6 points 9 months ago

This is the best summary I could come up with:


Google Search's "cached" links have long been an alternative way to load a website that was down or had changed, but now the company is killing them off.

The feature has been appearing and disappearing for some people since December, and currently, we don't see any cache links in Google Search.

Cached links used to live under the drop-down menu next to every search result on Google's page.

As the Google web crawler scoured the Internet for new and updated webpages, it would also save a copy of whatever it was seeing.

That quickly led to Google having a backup of basically the entire Internet, using what was probably an uncountable number of petabytes of data.

In 2020, Google switched to mobile-by-default, so for instance, if you visit that cached Ars link from earlier, you get the mobile site.


The original article contains 438 words, the summary contains 139 words. Saved 68%. I'm a bot and I'm open source!

[–] Smokeydope@lemmy.world 5 points 9 months ago* (last edited 9 months ago)

Cached webpages are dead

Internet Archive: 👀

[–] ShellMonkey@lemmy.socdojo.com 2 points 9 months ago (1 children)

I haven't used Google directly, other than at work, for a while, but this is actually a welcome thing for those charged with filtering the web. Those cache links are so often used as a way around web content filters, and with how closely the entire Google ecosystem is integrated, it's a pain to slice them apart from the live web.

[–] fruitycoder@sh.itjust.works 1 points 9 months ago

Yeah, they used to be my go-to for getting around the censors myself

[–] Mango@lemmy.world -2 points 9 months ago

It's been years since I remembered cached pages were a thing.

[–] stufkes@lemmy.world -5 points 9 months ago

I don’t think this is a bad thing per se. I don’t think that everything on the internet should be saved.