this post was submitted on 01 Jun 2024

396 points (98.3% liked)

NGO noyb has filed a complaint against ChatGPT for violation of the GDPR (noyb.eu)

submitted 1 year ago by Vittelius@feddit.de to c/technology@lemmy.world

66 comments

[+] Reverendender@sh.itjust.works -104 points 1 year ago (4 children)

I am ALL for reining in these above-the-law megacorps. That said, please do not take GPT away from me. It is such a boon to so many aspects of my life, and I don't want to go back to the before times.

[–] Karyoplasma@discuss.tchncs.de 105 points 1 year ago (1 children)

So you're not for reining in megacorps, just the ones you don't see as a personal benefit.

[+] Reverendender@sh.itjust.works -38 points 1 year ago (1 children)

You're right. I had an idea to regulate without completely eliminating, but that's obviously crazy talk.

[–] GoodEye8@lemm.ee 26 points 1 year ago (1 children)

You do know the R in GDPR literally stands for Regulation? There's already a regulation that chatGPT should follow but deliberately doesn't. Your idea isn't to regulate, it's to get rid of regulation so that you could keep using your tool.

[–] laurelraven@lemmy.blahaj.zone 2 points 1 year ago (1 children)

Sounded more like enforcing the regulations without destroying the company or product to me, which I would have assumed was the preferred avenue with most regulations

[–] GoodEye8@lemm.ee 5 points 1 year ago (1 children)

Agree to disagree. Regulations exist for a purpose and companies need to follow regulations. If a company/product can't exist without breaking regulations, it shouldn't exist in the first place. When you take the stance that a company/product needs to exist, a regulation prevents it, and you go changing the regulation, you're effectively getting rid of the regulation. Now, there may be exceptions, but this here is not one of those exceptions.

[–] laurelraven@lemmy.blahaj.zone 1 points 1 year ago (1 children)

I mean, sure, if that's what someone is saying, but I didn't see anyone suggest that here.

Companies violating regulations can be made to follow them without tearing down the company or product, and I'm absolutely not convinced LLMs have to violate the GDPR to exist.

[–] GoodEye8@lemm.ee 1 points 1 year ago

That's a matter of perspective. I took the other person's comments as "Don't take away my chatGPT, change the regulations if you must but don't take it away", which is essentially the same as "get rid of regulation".

Realistically I also don't see this killing LLMs, since the infringement is about giving accurate information about people. I'm assuming they have enough control over their model to make it say "I can't give information about people" and everything is fine. But if they can't (or most likely won't because it would cost too much money) then the product should get torn down. I don't think we should give a free pass to companies for playing stupid games, even if they make a useful product.

[–] gravitas_deficiency@sh.itjust.works 37 points 1 year ago (2 children)

So we should only ban things that aren’t helpful to you in particular? That’s a very… conservative way of thinking.

[–] rovingnothing29@lemmy.world 17 points 1 year ago (1 children)

Don't you realize everyone exists to serve me?

[–] tsonfeir@lemmy.world 0 points 1 year ago

No, they think everyone exists to serve them.

[+] eltrain123@lemmy.world -17 points 1 year ago (1 children)

People can’t seem to understand that it’s a tool in the early stages of development. If you are treating it as a source of truth, you are missing the point of it entirely. If it tells you something about a person, that is not to be trusted as fact.

Every bit of information you get from it should be researched and verified. It just gives you a good jumping off point and direction to look based on your prompting. You can drastically improve your results on any subject with good direction, especially something you don’t know a lot about and are starting out in your research. If you are asking it about specific facts you want it to regurgitate, you are going to get bad information.

If you are claiming damages from something you know gives false information, maybe you should learn how to use the tool before you get your feelings invested, so you can start using it more effectively in your own applications. If you want it to specifically say something that can grab a headline, you can make it do that, it’s just disingenuous and not actually benefiting the conversation, the technology, or the future.

They have a long way to go to solve AGI, but the benefits to society along the way outpace current tools. At maturity, it has the potential to change major socio-economic structures, but it never gets there if people want to treat it like it has intuition and is trying to hurt them as the technology starts getting stood up.

[–] buddascrayon@lemmy.world 6 points 1 year ago* (last edited 1 year ago)

If you're wondering why you're getting so many downvotes, it's because you're ignoring the fact that the companies that have created these LLMs are passing them off as truth machines by plugging them directly into search engines and then asking everybody to use them as such. It's not the fault of the people who are trusting these things, it's the fault of the companies that are creating them and then passing them off as something they're not. And those companies need to face a reckoning.

[–] passepartout@feddit.de 26 points 1 year ago (4 children)

Have a look at self hosted alternatives like Ollama in combination with Open-webui. It can be a hassle to set up, or even excruciatingly painful if you never touched a computer before, but it could be worth a try. I use it daily and like it much more than chatgpt to be honest.
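For anyone wondering what talking to it looks like once it's installed, here is a minimal sketch. It assumes Ollama is running locally on its default port (11434) and that a model named `llama3` has already been pulled; the helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, something like `ask("Summarize the GDPR in one sentence.")` should return the model's answer as a string. Nothing leaves your machine, which is kind of the point.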

[–] kamenlady@lemmy.world 7 points 1 year ago

excruciatingly painful

is the perfect description

[–] Turun@feddit.de 7 points 1 year ago

You can literally run large language models with a single exe download: https://github.com/Mozilla-Ocho/llamafile

It doesn't get much simpler than that.

[–] Reverendender@sh.itjust.works 4 points 1 year ago

Thanks!

[–] capital@lemmy.world 1 points 1 year ago* (last edited 1 year ago) (1 children)

I use it daily and like it much more than chatgpt to be honest.

I wish I did. What local model and version of ChatGPT did you compare?

For my purposes, ChatGPT 4 was leagues ahead of the largest model I could run on a 1060.

[–] passepartout@feddit.de 1 points 1 year ago (1 children)

I like the gemma models bc of the phrasing they use and that they give sources sometimes. The best results though come from llama3 I think. Also openhermes and openchat, which perform well enough for my purposes.

In the beginning I had used Microsoft Phi; that wasn't that good though.

[–] capital@lemmy.world 2 points 1 year ago

I will have to give it another shot because I don't recognize any of those models meaning I probably didn't try them.

[–] SpaceNoodle@lemmy.world 9 points 1 year ago (4 children)

In what ways are you benefiting from a bevy of factually dubious query responses?

[–] brbposting@sh.itjust.works 10 points 1 year ago (1 children)

Can absolutely never blindly trust the hallucinating plagiarism machine.

It's great where either facts don't matter or you're personally in a position to vet all of its “factual” output 100%. Text revision, prompting for additional perspectives, prompting to challenge beliefs and identify gaps. Reformatting, quick and easy data extraction, outlining, brainstorming.

[–] SpaceNoodle@lemmy.world 2 points 1 year ago (1 children)

Reformatting and outlining as long as you go over and revise it again anyway, seemingly making that moot.

Data extraction as long as you don't care if the data is mangled.

Brainstorming is a good one, since off-the-wall ideas can be useful in that context.

[–] sugar_in_your_tea@sh.itjust.works 4 points 1 year ago (2 children)

In most cases where I've seen AI used, the person spends as much time correcting it as they would if they just did the work without AI. So maybe it makes you feel more productive because a bunch of stuff happens all at once, but at least for text generation, I think it's more of a placebo.

[–] sudoreboot@slrpnk.net 3 points 1 year ago

It can at least get one unstuck, past an indecision paralysis, or give an outline of an idea. It can also be useful for searching through data.

[–] SpaceNoodle@lemmy.world -1 points 1 year ago (1 children)

If all I want is something blatantly false or legible yet nonsensical, like a modern lorem ipsum, it's a real time-saver.

[–] sugar_in_your_tea@sh.itjust.works 0 points 1 year ago (1 children)

Why not just use lorem ipsum? It's just a copy/paste, and without the liability of having false information if you forget to proofread it.

[–] SpaceNoodle@lemmy.world -2 points 1 year ago

I guess ChatGPT is just completely useless, then.

[–] capital@lemmy.world 3 points 1 year ago* (last edited 1 year ago) (1 children)

This question betrays either your non-use or misuse of the products available. You're either just reading the headlines of the screw-ups or you're just bad at using the tool.

To directly answer your question:

Quick scripts in a variety of languages. Tested before being used on real data/systems.
Creating visual graphs of data in python and Jupyter notebooks with no prior knowledge of python itself or the tools it's running. In this case, I was able to update the way I wanted it to look in natural language, have it suggest code changes, and immediately try them in the notebook with great results.
Improving the sentiment of correspondence. Proofread before sending. It has better grammar and flow than a surprising number of correspondences I've come across at work. Sure, English may be their second language but it doesn't change the fact.
Quickly finding documentation pertaining to the query which, yes, you need to go read to verify any answers any LLM provides. Anyone using it regularly should know this by now.
Quick "do this in command line. What options are required" which is then immediately tested.
In one case, a news story was referenced in passing in a podcast I listen to. It stuck with me days later and I wanted to find actual articles written about it. I was able to describe what I was looking for in natural language and included as many details as I could remember and asked it to find articles for me. I found exactly what I was after.

But were you actually looking for a real response to your question?

[–] SpaceNoodle@lemmy.world -1 points 1 year ago (1 children)

It's worse at all programming tasks except boilerplate, especially with its tendency to inject booby traps. Not knowing how to use the programming language it emits becomes a significant problem.

Comparing a language model to an idiot is unfair to the idiot.

A normal search engine works for everything else.

Any well-defined query I've ever made of an LLM has resulted in hilariously bad results, but I suppose I was expecting it to do something that I couldn't already do better myself.

[–] capital@lemmy.world 4 points 1 year ago

I'm a systems administrator, not a programmer. Like I said, quick scripts. An LLM could probably parse my comment better than you, evidently.

Comparing a language model to an idiot is unfair to the idiot.

Oof.. Was this in reply to my bit about better grammar and ESL individuals?

A normal search engine works for everything else.

Fuck no. Especially the python visualization point.

Any well-defined query I’ve ever made of an LLM has resulted in hilariously bad results, but I suppose I was expecting it to do something that I couldn’t already do better myself.

I suppose you're just a god among men then. For the rest of us, it's useful and you've been given plenty of good answers to your disingenuous question.

[–] kogasa@programming.dev 3 points 1 year ago (1 children)

I don't really query, but it's good enough at code generation to be occasionally useful. If it can spit out 100 lines of code that is generally reasonable, it's faster to adjust the generated code than to write it all from scratch. More generally, it's good for generating responses whose content and structure are easy to verify (like a question you already know the answer to), with the value being in the time saved rather than the content itself.

[–] SpaceNoodle@lemmy.world -2 points 1 year ago

It's good at regurgitating boilerplate, from what I've gathered.

[+] tsonfeir@lemmy.world -10 points 1 year ago (2 children)

Someone doesn’t know how to use ChatGPT

[–] SpaceNoodle@lemmy.world 6 points 1 year ago (1 children)

Oh, is there an arcane invocation that magically imbues it with reason?

[+] tsonfeir@lemmy.world -6 points 1 year ago (1 children)

Nope, just gotta know what it IS, what it ISN’T, and how to correctly write prompts for it to return data that you can use to formulate your own conclusion.

When using AI, it’s only as smart as the operator.

[–] SpaceNoodle@lemmy.world -2 points 1 year ago (2 children)

Well, it's not AI, for starters.

[–] msage@programming.dev 3 points 1 year ago (1 children)

As much as I hate to do this, it is AI, as ML is a part of Artificial Intelligence.

It isn't AGI, some might say it may be, but they are wrong. But the model is learning.

[–] SpaceNoodle@lemmy.world -1 points 1 year ago* (last edited 1 year ago) (1 children)

An LLM is not capable of learning. It won't hallucinate less with additional training input.

[–] msage@programming.dev 0 points 1 year ago (1 children)

Just the notion of a computer having hallucinations should suggest that it's doing more than just basic code.

It's not 'intelligent', but it has 'learned' enough beyond standard CPU instructions.

That's why it's not a General AI, but it's still an AI.

[–] SpaceNoodle@lemmy.world 0 points 1 year ago* (last edited 1 year ago) (1 children)

I also talk about gremlins inside CPUs, but that doesn't mean I think there are magical critters turning a crank inside them.

It's called a metaphor, brother.

Regardless, it's all code that's eventually run on a CPU, so there isn't any step where magic is injected.

[–] msage@programming.dev 0 points 1 year ago (1 children)

Sigh.

There is no code for language processing, it's just math approximating results from weights. The whole weight set-up is what's called 'artificial intelligence', because nobody wrote

if prompt like 'python' return ['large snake', 'programming language', 'australian car company']

the model 'learned' how to mimic human speech using training, not by 1000s of software engineers adding more branches to the code.

That technique is part of 'artificial intelligence', when computers solve problems they were not programmed to do. The neural network learns its knowledge by the code, but the code has no idea what is going on.

[–] SpaceNoodle@lemmy.world 0 points 1 year ago (1 children)

How do you think math is implemented on a computer?

[–] msage@programming.dev 0 points 1 year ago (1 children)

I am now properly confused as to what you are arguing for.

So let me go to the basics.

Computers follow instructions to the letter. Take input, process it, produce output.

There are specific instructions that a computer can carry out, and we can build on top of them to make more complex ones. We write code to do that.

True/false gates can become numbers, which can become text, audio, video.
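A tiny sketch of that chain, with eight made-up true/false values standing in for the gates:

```python
# Eight true/false values, as they might come out of logic gates.
bits = [False, True, False, False, True, False, False, False]

# Read the gates as a binary number, most significant bit first.
number = 0
for bit in bits:
    number = number * 2 + int(bit)

# Read that number as a character code.
letter = chr(number)

print(number, letter)  # prints: 72 H
```

Audio and video are the same trick with far more numbers.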

But everything 'programmed' or 'digitally created' is using the same instructions and only ever does what we tell the computer to do.

Cutting video will require video input, and then user has to do specific actions to produce a specific result.

Almost everything in existence is built like that - someone wrote specific code for technology to behave.

Now, this is a very primitive way of solving tasks, specifically for real-world parameters. Computers have gigabytes (10^9 bytes) of memory, but the Earth alone has about 10^50 atoms, so we can't put everything into a computer (which is why we can't 100% predict the weather), and checking for every input parameter is not only futile, but also meaningless.

Enter 'artificial intelligence', an approximate way of solving problems. Suddenly we don't code the tasks themselves; we only specify the neural network (the weights and the connections between them) and code the 'learning' algorithm that adjusts the weights based on inputs during 'training'. Training is the expensive part, where we put huge amounts of input into the network, and if the answer we get is incorrect, we adjust the weights and try again with another sample.
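To make "adjust the weights and try again" concrete, here is a toy sketch: a single artificial neuron (a perceptron) learning logical AND from four examples. This is nowhere near how an LLM is trained, and every name and number in it is picked for illustration, but the loop has the same shape.

```python
# Toy training set: learn logical AND from examples, not from an if-statement.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

# Start knowing nothing. Whole-number steps keep the arithmetic exact.
weights = [0, 0]
bias = 0
learning_rate = 1


def predict(inputs):
    """The neuron fires (1) if the weighted sum of its inputs is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


for epoch in range(20):
    for inputs, target in samples:
        error = target - predict(inputs)
        if error != 0:
            # Wrong answer: nudge each weight towards the correct one.
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error

print([predict(inputs) for inputs, _ in samples])  # prints: [0, 0, 0, 1]
```

Note that the word "AND" never appears in the logic. The loop only knows "wrong answer, adjust".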

It's very expensive in every way, but the code involved doesn't care about anything other than adjusting those weights. The network can be fed images and determine whether each one shows a dog or a cat. It can be fed audio samples and be expected to write down the lyrics. The code doesn't know or care, apart from distinguishing between correct and incorrect answers and adjusting those weights.

After those weights are set to our satisfaction, we can release them for others to use. We expect the network to have 'reliable' outputs for our inputs, so we just calculate the neuron activations based on those weights for every input, nothing else is necessary.
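As a sketch of that "just calculate the activations" step: the weights below are hand-picked for this example and only stand in for values a real training run would have produced. The code that runs them is a few lines of arithmetic and contains no rule about what the network computes (here it happens to be XOR).

```python
# Hypothetical "released" weights, made up for this example.
HIDDEN = [  # (weights, bias) for each hidden neuron
    ([1, 1], -0.5),  # fires if at least one input is on
    ([1, 1], -1.5),  # fires only if both inputs are on
]
OUTPUT = ([1, -1], -0.5)  # first hidden neuron on, second off


def activate(inputs, weights, bias):
    """One neuron: weighted sum plus bias, then a hard threshold."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def run(inputs):
    """Forward pass: inputs -> hidden activations -> output activation."""
    hidden = [activate(inputs, w, b) for w, b in HIDDEN]
    return activate(hidden, *OUTPUT)


print([run(pair) for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # prints: [0, 1, 1, 0]
```

Swap in different numbers and the exact same code computes something else entirely.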

Therefore you do have code in the machine that learns, but only during training, and you have code that actually 'runs' the algorithm for calculating output. But the actual solution to the problem is not inside the code; it can't even be coded by humans in any way. The neural network is a statistical model generated from the training set according to our learning algo. The bigger the network and the bigger the training set, the better those outputs should be (in theory).

To take the cutting video example further, you can train network to cut trailers from movies.

Or you can let editors do that.

They both will use computers, but one is using deterministically coded software that just follows specific orders one by one, and the other just computes the neuron activations based on the inputs and produces an output based on what it had available in the training data with some probability.

So yes, machines can learn, and it's a subset of the 'Artificial Intelligence' field.

[–] SpaceNoodle@lemmy.world 0 points 1 year ago (1 children)

It won't hallucinate less with additional training input.

An LLM is good at making sentences that seem convincing, but has no ability to reason.

[–] msage@programming.dev 0 points 1 year ago (1 children)

Thanks for ignoring the same argument over and over again, it makes you look very stuck-up.

Intelligence does not require perfection (you are an example). You also hallucinate random output, but you can learn to stop specific hallucinations - like reading a Wiki page.

LLMs aren't different in that regard - they were trained on inputs, and if you extend their training sets, they will be more exact in those areas.

Ability to reason is a very hard concept to specify, and we don't have any foolproof test (that I know of) that would definitely say if LLMs can reach that stage.

I will fight you if you try to tell me that humans are smarter than any current AI - because there are some real dumb people walking this earth and mindlessly reproducing, unable to process basic concepts that their lives depend on.

Nothing of this changes the fact that there is an intelligence - natural language is an incredibly hard thing to code deterministically - and as such deserves the 'AI' label without a doubt.

[–] SpaceNoodle@lemmy.world 1 points 1 year ago

There is a complete lack of intelligence, just a passable facade that crumbles under scrutiny.

[–] tsonfeir@lemmy.world -1 points 1 year ago

Keep going…

[–] capital@lemmy.world 4 points 1 year ago* (last edited 1 year ago)

New version of people who know how to search the web vs those who don't. Currently shit search results broken by search companies notwithstanding.