this post was submitted on 29 Jan 2025
931 points (97.7% liked)

Lemmy Shitpost

27827 readers
3709 users here now

Welcome to Lemmy Shitpost. Here you can shitpost to your heart's content.

Anything and everything goes. Memes, Jokes, Vents and Banter. Though we still have to comply with lemmy.world instance rules. So behave!


Rules:

1. Be Respectful


Refrain from using harmful language pertaining to a protected characteristic: e.g. race, gender, sexuality, disability or religion.

Refrain from being argumentative when responding or commenting to posts/replies. Personal attacks are not welcome here.

...


2. No Illegal Content


Content that violates the law. Any post/comment found to be in breach of common law will be removed and given to the authorities if required.

That means:

-No promoting violence/threats against any individuals

-No CSA content or Revenge Porn

-No sharing private/personal information (Doxxing)

...


3. No Spam


Posting the same post, no matter the intent, is against the rules.

-If you have posted content, please refrain from re-posting said content within this community.

-Do not spam posts with intent to harass, annoy, bully, advertise, scam or harm this community.

-No posting Scams/Advertisements/Phishing Links/IP Grabbers

-No Bots, Bots will be banned from the community.

...


4. No Porn/Explicit Content


-Do not post explicit content. Lemmy.World is not the instance for NSFW content.

-Do not post Gore or Shock Content.

...


5. No Inciting Harassment, Brigading, Doxxing or Witch Hunts


-Do not Brigade other Communities

-No calls to action against other communities/users within Lemmy or outside of Lemmy.

-No Witch Hunts against users/communities.

-No content that harasses members within or outside of the community.

...


6. NSFW should be behind NSFW tags.


-Content that is NSFW should be behind NSFW tags.

-Content that might be distressing should be kept behind NSFW tags.

...

If you see content that is a breach of the rules, please flag and report the comment and a moderator will take action where they can.


Also check out:

Partnered Communities:

1. Memes

2. Lemmy Review

3. Mildly Infuriating

4. Lemmy Be Wholesome

5. No Stupid Questions

6. You Should Know

7. Comedy Heaven

8. Credible Defense

9. Ten Forward

10. LinuxMemes (Linux-themed memes)


Reach out to Striker. All communities included on the sidebar are to be made in compliance with the instance rules.

founded 2 years ago
AI Training (lemmy.world)
submitted 1 week ago* (last edited 1 week ago) by ekZepp@lemmy.world to c/lemmyshitpost@lemmy.world
 
[–] Hackworth@lemmy.world 30 points 1 week ago
[–] 96VXb9ktTjFnRi@feddit.nl 6 points 6 days ago (1 children)

unpopular opinion: humans are a deeply mimetic species; copying is our very essence, and every limitation on it is entirely unnatural and limits human potential.

[–] ekZepp@lemmy.world 6 points 6 days ago* (last edited 5 days ago) (1 children)

Art lives by imitation/inspiration AND reinterpretation of previous works. All the great artists study their predecessors first, THEN they create their own style.

Picasso self-portrait, 1896 vs. 1971 [images]

AI is just a fucking copy-paste blender

[–] 96VXb9ktTjFnRi@feddit.nl 2 points 6 days ago* (last edited 6 days ago)

Yes, even when people copy each other they don't have the same output. And some individuals are mighty eccentric, Picasso for instance. But most people stick almost entirely to what they see and only differentiate by means of the mistakes they make, not by intended originality. From the moment people are born they start copying everything they see. With a head full of mirror neurons we tend to live our lives exactly the same, and the differences only stand out because they're relative. From a distance we would all look, behave and be more or less the same.

Copyright should be abolished. I'm all in favor of supporting artists and creators, support whoever you will out of free will, but don't limit others' freedom to copy you. If we can't copy what others have done before us, then our culture is not free. It should be an honor to be copied; that means others like your idea and want to use it too. That's how humans have always lived, that's how we progress; it's what has brought us this far. Let's continue without bizarre copying limitations.

If we can copy freely, that means culture is free. It means we can learn from each other, take each other's ideas and creations, put them to use and expand upon them, sometimes inadvertently while trying to make an exact copy. This freedom will be to the benefit of us all, and the opposite is true as well: intellectual property is to the detriment of us all.

If you don't want your work to be used by others, keep it private. Don't show it to anyone. Keep your invention in your cellar and let nobody enter. If you want to share your ideas and creations, please do so. But you can't have your cake and eat it too. You can't show what you've made and expect others to not use it as input and put it to use.

[–] Fisch@discuss.tchncs.de 121 points 1 week ago (2 children)

OpenAI when they steal data to train their AI: 😊🥰

OpenAI when their data gets stolen to train AI: 🤯😡🤬

[–] TxzK@lemmy.zip 53 points 1 week ago (1 children)

They stole it first fair and square

[–] BlueLineBae@midwest.social 39 points 1 week ago (2 children)

It's just the British Museum all over again.

[–] Agent641@lemmy.world 2 points 6 days ago

"We're not done looking at it!"

[–] quixote84@midwest.social 12 points 1 week ago

At the point where it becomes possible to copy the entire British museum and hand it out to anyone who wants one, maybe it starts to be a good idea to do exactly that...

[–] barnaclebutt@lemmy.world 12 points 1 week ago

I made this.

[–] Monstrosity@lemm.ee 74 points 1 week ago* (last edited 1 week ago) (4 children)

The big difference is that DeepSeek is open sourced, which ALL of these models should be, because they used our collective knowledge and culture to create them.

I like AI but the single biggest issue is how it is being gated off and abused by Capitalists for profit (It's kind of their thing).

[–] Sorgan71@lemmy.world 1 points 6 days ago* (last edited 6 days ago) (1 children)

Artists use our collective knowledge and culture in the same way. It's just that some of them are whiny and complain when AI does their job faster and cheaper.

[–] Monstrosity@lemm.ee 2 points 6 days ago* (last edited 6 days ago) (1 children)

I am an artist & I agree, actually.

I do think it's problematic that corpos are using AI to replace working artists, although that's a systemic issue affecting a lot of disciplines.

That said, and I will get hate for this, there is a case to be made that if artists were more creative and interesting in general, they wouldn't be so easily displaced by AI slop.

[–] Sorgan71@lemmy.world 3 points 6 days ago

Yeah I mean faster and cheaper does not mean more creative.

[–] Dadifer@lemmy.world 82 points 1 week ago (2 children)
[–] dependencyinjection@discuss.tchncs.de 8 points 6 days ago (2 children)

Can you elaborate on what you mean, for a layman?

[–] Dadifer@lemmy.world 26 points 6 days ago (2 children)

The neural network is hundreds of billions of nodes connected to each other by connections of different strengths, or "weights", just like our neurons. Open-source weights means that they released the weights of the connections between the nodes, the blueprint of the neural network, if you will. It is not open source because they didn't release the material it was trained on.
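To make the "weights without training data" distinction concrete, here is a minimal sketch (a made-up toy network, nothing to do with DeepSeek's actual files or formats): the released artifact is essentially arrays of numbers you can load and run, and nothing in those arrays tells you what data produced them.

```python
import numpy as np

# Toy two-layer network. "Open weights" means shipping arrays like these,
# not the training data or training pipeline that produced them.
rng = np.random.default_rng(0)
weights = {
    "layer1": rng.normal(size=(4, 8)),  # input -> hidden connection strengths
    "layer2": rng.normal(size=(8, 2)),  # hidden -> output connection strengths
}

def forward(x, w):
    """Run the network: matrix multiplies plus a nonlinearity."""
    hidden = np.maximum(0.0, x @ w["layer1"])  # ReLU
    return hidden @ w["layer2"]

# Anyone with the weight files can do inference locally...
print(forward(np.ones(4), weights))
# ...but the arrays alone reveal nothing about what they were trained on.
```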

[–] dependencyinjection@discuss.tchncs.de 4 points 6 days ago (2 children)

Thanks.

Are there any models that are truly open source, where they have shown the datasets they were trained on?

[–] HappyFrog@lemmy.blahaj.zone 2 points 3 days ago

There are probably several MNIST or other smaller networks that are fully open sourced. But that's not the kind of neural network most people are talking about.

[–] Dadifer@lemmy.world 5 points 6 days ago

Not that I know of

[–] Lemminary@lemmy.world 2 points 6 days ago (3 children)

It is not open source because they didn't release the material that it was trained on.

I'm not sure if I'm missing a definition here but open source usually means that anyone can use the source code under some or no conditions.

[–] Dadifer@lemmy.world 9 points 6 days ago (1 children)

You can't use the source code, just the neural network the source code generated.

[–] Johanno@feddit.org 2 points 6 days ago

Open source means by definition that the code is open, the usage is open, and anybody can use it.

This includes in theory the training material for the model.

But in common language, open source means: I can download it and it runs on my machine, ignoring legal shit.

I'm pretty sure open source means that the source code is open to see. I'm pretty sure there are open source things that you need to pay to use.

[–] Schadrach@lemmy.sdf.org 2 points 6 days ago

In parallel to what Hawk wrote, AI image generation is similar. The idea is that through training you essentially produce an equation (really a bunch of weighted nodes, but functionally they boil down to a complicated equation) that can recognize a thing (say dogs), and can measure the likelihood any given image contains dogs.

If you run this equation backwards, it can take any image and show you how to make it look more like dogs. Do this for other categories of things. Now when you ask for a dog lying in front of a doghouse chewing on a bone, it generates some white noise (think "snow" on an old TV) and asks the math to make it look maximally like a dog, doghouse, bone and chewing at the same time, possibly repeating a few times until the results don't get much more dog, doghouse, bone or chewing on another pass, and that's your generated image.

The reason they have trouble with things like hands is that we have pictures of all kinds of hands at all kinds of scales in all kinds of positions, and the model doesn't have actual hands to compare to, just thousands upon thousands of pictures that say they contain hands, from which it has to figure out what a hand even is by statistical analysis of examples.
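A deliberately tiny sketch of that "run the recognizer backwards" idea (the "dog detector" here is just a made-up linear template; real generators use deep networks and diffusion-style sampling, but the gradient-ascent intuition is the same):

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in "dog detector": a fixed 8x8 template. Real models learn this
# from huge numbers of labelled images; here it's arbitrary.
dog_template = rng.normal(size=(8, 8))

def dog_score(img):
    """How 'dog-like' the detector thinks the image is."""
    return float(np.sum(img * dog_template))

def dog_score_gradient(img):
    """d(score)/d(pixel): for a linear detector this is just the template."""
    return dog_template

# Start from white noise ("snow") and nudge pixels to raise the score.
img = rng.normal(size=(8, 8))
before = dog_score(img)
for _ in range(100):
    img = img + 0.1 * dog_score_gradient(img)  # gradient ascent on dog-ness

print(f"dog score before: {before:.1f}, after: {dog_score(img):.1f}")
```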

LLMs do something similar, but with words. They have a huge number of examples of writing, many of them tagged with descriptors, and are essentially piecing together an equation for what language looks like from statistical analysis of examples. The technique used for LLMs will never be anything more than a sufficiently advanced Chinese Room, not without serious alterations. That however doesn't mean it can't be useful.
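And a correspondingly crude sketch of "statistical analysis of examples" for words: a bigram counter that predicts the next word purely from how often word pairs occurred in its (made-up) example text. Real LLMs replace the counting with a neural network over vastly more text, but the flavour is similar:

```python
from collections import Counter, defaultdict

corpus = "the dog chews the bone the dog digs in the yard".split()

# Count which word follows which: pure statistics, no understanding.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent next word seen in the examples."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # 'dog' (seen twice, vs 'bone'/'yard' once each)
print(predict_next("dog"))  # 'chews' or 'digs' (tied in this tiny corpus)
```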

For example, one could hypothetically amass a bunch of anonymized medical imaging including confirmed diagnoses and a bunch of healthy imaging and train a machine learning model to identify signs of disease and put priority flags and notes about detected potential diseases on the images to help expedite treatment when needed. After it's seen a few thousand times as many images as a real medical professional will see in their entire career it would even likely be more accurate than humans.
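For what it's worth, a toy version of that flagging idea, using synthetic stand-in numbers instead of real medical images (the shift between the two groups is invented purely so there is a signal to learn):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins: 100 "healthy" and 100 "diseased" scans, flattened to 64 values.
healthy = rng.normal(0.0, 1.0, size=(100, 64))
diseased = rng.normal(0.5, 1.0, size=(100, 64))  # shifted so a real pattern exists
X = np.vstack([healthy, diseased])
y = np.array([0] * 100 + [1] * 100)

# Training = fitting a function that scores how disease-like a scan looks.
model = LogisticRegression(max_iter=1000).fit(X, y)

new_scan = rng.normal(0.5, 1.0, size=(1, 64))
print("flag for priority review?", bool(model.predict(new_scan)[0]))
```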

[–] Monstrosity@lemm.ee 5 points 1 week ago

Yeah. That is true.

[–] kibiz0r@midwest.social 9 points 1 week ago (11 children)

I wouldn’t say it’s the biggest issue. Even if access was free, we’d still have to contend with the extreme energy use, and the epistemic chaos of being able to generate convincing bullshit much quicker than it can be detected and flagged.

I think it’s a harmful product in general. We’re polluting our infosphere the same way we polluted our ecosphere, and in both cases there’s still folks who think “unequal access to polluting industries” is the biggest problem.

[–] Monstrosity@lemm.ee 3 points 6 days ago

You're right about this. I was commenting in the context of "intellectual property".

[–] uis@lemm.ee 1 points 6 days ago

we’d still have to contend with the extreme energy use,

Meanwhile people running it on a Raspberry Pi: "I made it consume 1W less, which is a 30% improvement!"

and the epistemic chaos of being able to generate convincing bullshit much quicker than it can be detected and flagged.

It's been this way long before modern AI.

[–] InFerNo@lemmy.ml 1 points 6 days ago

The infosphere already turned to shit over 10 years ago when the internet started consolidating towards a few super large companies.

[–] Hackworth@lemmy.world 11 points 1 week ago

All the data centers in the US combined use 4% of the electric load, and one of the main upsides of DeepSeek is that it requires much less energy to train (the main cost).

[–] ekZepp@lemmy.world 10 points 1 week ago