this post was submitted on 03 Aug 2025
352 points (97.1% liked)

Technology

[–] Pamasich@kbin.earth 6 points 5 hours ago

Update 7/31/25 4:10pm PT: Hours after this article was published, OpenAI said it removed the feature from ChatGPT that allowed users to make their public conversations discoverable by search engines. The company says this was a short-lived experiment that ultimately “introduced too many opportunities for folks to accidentally share things they didn’t intend to.”

Interesting, because the checkbox is still there for me. I don't see that anything has changed at all; maybe they made the fine print a bit whiter, but nothing else.

In general, this reminds me of the incognito drama. Iirc people were unhappy that incognito mode didn't prevent Google websites from fingerprinting you. Which... the mode never claimed to do, it explicitly told you it didn't do that.

For chats to be discoverable through search engines, you not only have to explicitly and manually share them, you then also have to opt in to having them appear there via a checkbox.

The main criticism I've seen is that the checkbox's main label only says it makes the chat "discoverable", while the search engines clarification is in the fine print. But I don't really understand how that is unclear. Like, even if they made them discoverable through ChatGPT's website only (so no third party data sharing), Google would still get their hands on them via their crawler. This is just them skipping the middleman, the end result is the same. We'd still hear news about them appearing on Google.
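The crawler point is mechanical: unless a site opts a page out via robots.txt (or a noindex tag), any publicly linked page is fair game for indexing. A minimal sketch with Python's standard `urllib.robotparser` (the rules, paths, and URLs here are invented for illustration; they are not OpenAI's actual robots.txt):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everything is crawlable except /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A publicly linked share page is fair game for any crawler...
print(parser.can_fetch("Googlebot", "https://example.com/share/abc123"))   # True
# ...unless the site explicitly opts it out.
print(parser.can_fetch("Googlebot", "https://example.com/private/abc123")) # False
```

So as long as a shared chat is publicly reachable and not opted out, a crawler will pick it up whether or not the site hands it over directly.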

This just seems to me like people clicking a checkbox based on vibes rather than critical thought of what consequences it could have and whether they want them. I don't see what can really be done against people like that.

I don't think OpenAI can be blamed for doing the data sharing, as it's opt-in, nor for the chats ending up on Google at all. If the latter was a valid complaint, it would also be valid to complain to the Lemmy devs about Lemmy posts appearing on Google. And again, I don't think the label complaint has much weight to it either, because if it's discoverable, it gets to Google one way or another.

[–] TheMonk@lemmings.world 2 points 5 hours ago

Should we be surprised? Thinking AI, the most data-hungry undertaking in existence, wasn't storing the data from what you write? Especially when the companies behind it are the most invasive in history? Lol, what else.

[–] mrodri89@lemmy.zip 1 points 5 hours ago

I use DuckDuckGo. :)

[–] RampantParanoia2365@lemmy.world 2 points 7 hours ago

Isn't that a good thing? If more people are using it without it being indexed, search engines would end up even more useless.

[–] MonkderVierte@lemmy.zip 11 points 13 hours ago (1 children)

Mine are not public, I use ~~a tinfoil~~ duck.ai.

[–] jim3692@discuss.online 1 points 4 hours ago

I use local Ollama. I don't trust anyone with my AI conversations.

[–] corroded@lemmy.world 79 points 1 day ago (2 children)

If you don't want your conversations to be public, how about you don't tick the checkbox that says "make this public." This isn't OpenAI's problem, it's an idiot user problem.

[–] zerozaku@lemmy.world 20 points 17 hours ago (1 children)

This is a case of a corporation taking advantage of a technically idiotic userbase, which is most of the general public. OpenAI is using a dark pattern: users can't easily uncheck that box, and the text that says "this can be indexed by search engines" isn't made brightly visible.

[–] panda_abyss@lemmy.ca 9 points 8 hours ago (1 children)

I don't think OpenAI gets anything from this, I think they just failed to realize how stupid the average person is.

[–] AnarchistArtificer@lemmy.world -1 points 7 hours ago (1 children)

They get more human written text, which is one of the most powerful things in their doomed attempt to forestall model collapse

[–] panda_abyss@lemmy.ca 5 points 7 hours ago

They already have the text

[–] FauxLiving@lemmy.world 37 points 1 day ago (2 children)

If you don't want corporations to use your chats as data, don't use corporate-hosted language models.

Even non-public chats are archived by OpenAI, and the terms of service of ChatGPT essentially give OpenAI the right to use your conversations in any way that they choose.

You can bet they'll eventually find ways to monetize your data at some point in the future. If you think GoogleAds is powerful, wait until people's assistants are trained with every manipulative technique we've ever invented and are trying to sell you breakfast cereals or boner pills...

You can't uncheck that box except by not using it in the first place. But people will sell their soul to a company rather than learn a little bit about self-hosting.

[–] Electricd@lemmybefree.net 10 points 17 hours ago (1 children)

This is basically a "if you don’t want your data to be used, run your own internet" comment

It’s just not doable for pretty much everyone

[–] Allero@lemmy.today 8 points 14 hours ago* (last edited 14 hours ago) (1 children)

Modern LLMs can serve you for most tasks while running locally on your machine.

Something like GPT4All will do the trick on any platform of your choosing if you have at least 8 GB of RAM (which most people have nowadays).

It has a simple, idiot-proof GUI and doesn't collect data unless you allow it to. It's also open source, and, being local, it doesn't need an Internet connection once you've downloaded the model you need (which normally weighs in at a single-digit number of gigabytes).
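The "single-digit gigabytes" figure checks out with back-of-the-envelope math: a quantized model's footprint is roughly parameter count times bits per weight, plus some runtime overhead. A rough sketch (the 20% overhead factor is a loose assumption for illustration, not a GPT4All spec):

```python
def approx_model_gb(params_billion: float, bits_per_weight: int,
                    overhead: float = 1.2) -> float:
    """Rough file/RAM footprint of a quantized model in GB."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8 * overhead
    return bytes_total / 1e9

# A 7B model at 4-bit quantization: ~4.2 GB, so it fits in 8 GB of RAM.
print(round(approx_model_gb(7, 4), 1))
# The same model at full 16-bit precision: ~16.8 GB, so it does not.
print(round(approx_model_gb(7, 16), 1))
```

This is why aggressive quantization is what makes the "runs on an ordinary laptop" claim work at all.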

[–] Electricd@lemmybefree.net 1 points 13 hours ago (1 children)

If you want actual good features like deep research or chain of thought, eh, not sure it’s a good choice

The models will also not be very powerful

[–] null@lemmy.nullspace.lol 2 points 10 hours ago (1 children)

And you don't need any of that. You don't even need a local LLM.

So if you decide you want it, then that's on you, and you have made the choice to give up your data.

[–] Electricd@lemmybefree.net 1 points 7 hours ago* (last edited 7 hours ago) (1 children)

and you don't need a computer, and you don't need to eat good food

It's just that you lose so much productivity, comfort and so on

When such a tool is a difference between 30 mins and 5 hours of work, then you simply use it. You either move with the masses to compete, or you don't, but you'll pay the price anyways.

[–] null@lemmy.nullspace.lol 4 points 7 hours ago (1 children)

If you think LLMs are as fundamental as having a computer or internet access, then I really just don't know what to say.

[–] Electricd@lemmybefree.net 0 points 7 hours ago* (last edited 7 hours ago) (2 children)

You have clearly never been in that situation then. It is obviously not like this for many people, but for students for example, it often means a lot more

[–] Allero@lemmy.today 3 points 6 hours ago* (last edited 6 hours ago) (1 children)

While I don't fully share the notion and tone of other commenter, I gotta say LLMs have absolutely tanked education and science, as noted by many and as I witnessed firsthand.

I'm a young scientist on my way to PhD, and I get to assist in a microbiology course for undergraduates.

The amount of AI slop coming from student assignments is astounding, and worst of all, they don't see it themselves. When it comes to me checking their actual knowledge, it's devastating.

And it's not just undergrads: many scientific articles now show signs of AI slop too, which messes with research to a concerning degree.

Personally, I tried using more specialized tools like Perplexity in Research mode to look for sources, but it royally messed up the source listing: it took actual info from scientific articles, but then cited entirely different articles that bear no relation to it.

So, in my experience, LLMs can be useful for generating simple text or helping you tie known facts together. But as a learning tool... be careful, or rather just don't use them for that. Classical education exists for a good reason: you learn to find factually correct and relevant information, analyze it, and keep it in your head for future reference. It takes more time, but it's ultimately well worth it.

[–] Electricd@lemmybefree.net 1 points 6 hours ago* (last edited 6 hours ago) (1 children)

Sure, many don't care, and I have also experienced this, but it's a fabulous way to quickly get a glimpse of a subject, or to get started, or to learn more. It's not always correct, but for well-known subjects it's pretty good.

Anything related to law or really specific subjects will be horrible though

Classical education exists for a good reason

Sure, but not everyone teaches well enough, and LLMs are one of the ways to balance this, kinda

And if you don't understand, then... yeah, it's still useful as a way to avoid failing a year, which is morally questionable, but hey, another topic.

[–] Allero@lemmy.today 1 points 2 hours ago

Alright, we generally seem to be on the same page :)

(Except that numerous great books and helpful short materials exist for virtually any popular major, and, while they take longer to study, they provide an order of magnitude better knowledge.)

[–] null@lemmy.nullspace.lol 0 points 7 hours ago (1 children)

Yeah, this thing that's notorious for hallucinating and has only recently become even somewhat reliable is essential.

How did all those students from 2021 even survive??

Jesus, we're absolutely fucked.

[–] Electricd@lemmybefree.net 0 points 7 hours ago* (last edited 7 hours ago) (1 children)

Surviving is different from living in good conditions.

As with every tool, it has downsides. Learn to use it or continue to whine

[–] null@lemmy.nullspace.lol 2 points 7 hours ago (1 children)

That is... literally my point.

You're comparing very disparate things. And I already pointed out the downside. No one here is whining about anything.

[–] Electricd@lemmybefree.net 0 points 7 hours ago (1 children)

And I already pointed out the upside. You don't have to use the default UI though: custom UIs exist, APIs exist, and you don't have to enter personal info in prompts or use your residential IP... that pretty much makes it unlinkable to you.

[–] null@lemmy.nullspace.lol 1 points 7 hours ago* (last edited 7 hours ago) (1 children)

To use your analogy, it's like someone said, "Well, if you want a Michelin 5-star steak au jus, then your wallet is gonna take a hit". And you replied, "That's like saying in order to eat dinner you need to raise your own livestock and train for years as a professional chef".

I'm not saying corporate LLMs are bad, or that they have no upside. I'm saying your scale for what's essential and what's a luxury is alarming.

[–] Electricd@lemmybefree.net 0 points 6 hours ago (1 children)

I wish they were always a luxury, but in some situations they’re just too important to me

[–] puck@lemmy.world 3 points 1 day ago (1 children)

Hi there, I’m thinking about getting into self-hosting. I already have a Jellyfin server set up at home but nothing beyond that really. If you have a few minutes, how can self-hosting help in the context of OPs post? Do you mean hosting LLMs on Ollama?

[–] BreadstickNinja@lemmy.world 7 points 1 day ago (2 children)

Yes, Ollama or a range of other backends (Ooba, Kobold, etc.) can run LLMs locally. Huggingface has a huge number of models suited to different tasks like coding, storywriting, general purpose, and so on. If you run both the backend and frontend locally, then no one monetizes your data.
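For a concrete sense of what "both backend and frontend local" means: with Ollama running, a frontend just POSTs to its HTTP API on localhost, so nothing leaves the machine. A minimal sketch against Ollama's documented `/api/generate` endpoint (the model name is whatever you've pulled locally; no request goes out until you actually call `ask`):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

def build_request(model: str, prompt: str) -> dict:
    # stream=False asks for one JSON object instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# ask("llama3.2", "Explain VRAM in one sentence")  # runs entirely on localhost
```

Frontends like Ooba or Kobold are doing essentially this for you behind a chat UI.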

The part I'd argue the previous poster is glossing over a bit is performance. Unless you have an enterprise-grade GPU cluster sitting in your basement, you're going to make compromises on speed and/or quality relative to the giant models that run on commercial services.

[–] tal@lemmy.today 1 points 1 day ago (1 children)

It's also going to cost more, because you almost certainly are only going to be using your hardware a tiny fraction of the time.

[–] BreadstickNinja@lemmy.world 2 points 23 hours ago* (last edited 23 hours ago)

Possibly, yes. There are models that will run on consumer-grade GPUs that you might already have or might have purchased anyway, where you might say there's no incremental cost. But the issue is that the performance will be limited. The models are forgetful and prone to getting stuck in loops of repeated phrases.

So if instead you custom-build a workstation with two 5090s or a Pro 6000 or something that pushes you up to the 100 GB VRAM tier, then absolutely, just as you said, you'll be spending thousands of dollars that probably won't pay back relative to renting cloud GPU time.

[–] puck@lemmy.world 1 points 1 day ago

Thanks for the info. Yeah, I was wondering what kind of hardware you’d need to host LLMs locally with decent performance and your post clarifies that. I doubt many people would have the kind of hardware required.

[–] DeceasedPassenger@lemmy.world 38 points 1 day ago (1 children)

I assumed this was a given. Anything offered to tech overlords will be monetized and packaged for profit at every possible angle. Nice to know it's official now, I guess.

[–] Pamasich@kbin.earth 2 points 5 hours ago

Plus, you explicitly have to opt into this, for each chat you share individually.

I get that it says "discoverable" at first and the search engines are in the fine print, but search engine crawlers get it anyway if it's discoverable on ChatGPT's website instead. That term is plenty clear imo.

[–] SkaveRat@discuss.tchncs.de 14 points 1 day ago

I mean... they are public. duh

[–] atticus88th@lemmy.world 9 points 1 day ago

I'll probably have a target on my back because I kept asking it how to replace CEOs and other executives who do literally nothing but collect a paycheck and break shit.

[–] SGforce@lemmy.ca 13 points 1 day ago* (last edited 1 day ago) (1 children)

Oh, boy. More.

Have you heard of timecube? Well here's mirrorcube

Ten times ten thousand pairs of opposite reflected extensions of you are doing the same thing - throwing the ball away from themselves toward their opposites and away from themselves, each one of each pair being the reverse of its opposite, and acting in reverse. YOU NOW KNOW WHAT THE ELECTRIC CURRENT IS, and that should tell you what RADAR is. Likewise it explains RADIO and TELEVISION. [See Principle of Regeneration, 3.13 - Reciprocals and Proportions of Motions and Substance, 7.3 - Law of Love - Reciprocal Interchange of State on Multiple Subdivisions]

It's so fucking insane

[–] Sxan@piefed.zip 6 points 1 day ago (1 children)

Have you heard of timecube?

No, but I've heard of þe time knife.

[–] can@sh.itjust.works 2 points 1 day ago

Here you go

Warning, it gets racist the deeper you read.

[–] SGforce@lemmy.ca 9 points 1 day ago* (last edited 1 day ago) (1 children)

Here's a good one

cape cod conspiracy

As for georges bank, it was made a national monument and are large ancient volanos that would be perfect for a alien base. There's also a line on google maps going from woods hole to there, and obama has a mantion off if king point road looking out into the ocean

Omfg

Yum, a nicely mixed word salad! Lmfao

[–] Twig@sopuli.xyz 3 points 1 day ago (1 children)
[–] Pamasich@kbin.earth 1 points 5 hours ago

ChatGPT chats are only public when turned into a shareable chat (a manually created snapshot of the chat with a link). And they only show up in search engines if you, after sharing, tick the opt-in checkbox for having them appear there.

I don't know how duck.ai works, but I assume it doesn't do this.