this post was submitted on 28 Jan 2025
30 points (94.1% liked)

Technology

top 19 comments
[–] Lugh 16 points 1 day ago

At least this should finally put the 'Chinese can't innovate, they can only copy' meme into retirement.

[–] yogthos@lemmy.ml 15 points 1 day ago (1 children)

It’s interesting how the media focuses on the panic at Meta. While they’ve been pursuing open-source models like LLaMA, OpenAI appears far more impacted, as their business relies on selling access to a proprietary model-as-a-service.

[–] avidamoeba@lemmy.ca 2 points 1 day ago (2 children)

Probably a coincidence, and/or they got some scoop from Meta specifically.

[–] 1984@lemmy.today 3 points 1 day ago* (last edited 1 day ago)

Or it's important to the media companies to not alienate Microsoft because of reasons.

I mean, it's very strange. OpenAI is the obvious loser here, not Facebook. Obviously Microsoft doesn't want the press reminding people of alternatives to the big tech models.

[–] yogthos@lemmy.ml 3 points 1 day ago

I mean there's been a lot of news about DeepSeek in the past few days, but very little has been said regarding how this impacts the company that's most affected by this development.

[–] anachronist@midwest.social 6 points 1 day ago (1 children)

This whole DeepSeek freakout seems like an Op by the AI grifters to get more money. "We have to defeat China at the new AI space race!"

[–] yogthos@lemmy.ml 6 points 1 day ago

The freakout is over SV grift being exposed for what it is. Turns out you don't need to pour billions of dollars into this industry to get results.

[–] MyOpinion@lemm.ee 3 points 1 day ago

I am scrambling a war room to try and get this scumbag out of my life.

[–] avidamoeba@lemmy.ca 3 points 1 day ago (2 children)

I think you got the wrong link. It goes to an article about chip tariffs.

[–] yogthos@lemmy.ml 6 points 1 day ago
[–] melroy@kbin.melroy.org 3 points 1 day ago

Think about the tariffs as well! ;P

[–] melroy@kbin.melroy.org 1 points 1 day ago (1 children)

DeepSeek is not that great. I run it locally, but the answers are often still wrong, and I get Chinese characters in my English output.

[–] yogthos@lemmy.ml 10 points 1 day ago (1 children)

What makes DeepSeek important is that it shows that you can train and run a large scale model at a fraction of the cost of what existing models require. Meanwhile, in terms of quality it outperforms the top Llama model in benchmarks https://docsbot.ai/models/compare/deepseek-r1/llama-3-1-405b-instruct

[–] melroy@kbin.melroy.org 1 points 1 day ago (2 children)

Yes, that is true. Now the question I have back is: how is this price calculated? The price can be low simply because they charge less, or because inference costs less time and energy. You might answer that the latter is true, but where is the source for that?

Again, since I can run it locally my price is $0 per million tokens, I only pay electricity for my home.

EDIT: The link you gave me also says "API costs" at the top of the article. So that might just mean they charge less money. The model itself might still use the same amount of energy as other existing models, or even more.

[–] yogthos@lemmy.ml 5 points 1 day ago (1 children)

The reason they ask for less money is that it's a more efficient architecture, which means it uses less power. They leveraged a mixture-of-experts architecture to get far better performance than traditional dense models. While it has 671 billion parameters overall, it only activates 37 billion at a time, making it very efficient. For comparison, Meta's Llama 3.1 uses all 405 billion of its parameters at once. You can read all about it here: https://arxiv.org/abs/2405.04434
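To give a rough idea of why only a fraction of the parameters run per token, here's a toy sketch of top-k expert routing. This is not DeepSeek's actual implementation; the gating scheme, dimensions, and names are all made up for illustration:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy mixture-of-experts layer: route input x through only the top-k experts.

    x:       (d,) input vector
    gate_w:  (n_experts, d) gating weights that score each expert for this input
    experts: list of (d, d) weight matrices, one per expert
    k:       number of experts activated per token
    """
    scores = gate_w @ x                      # one gating score per expert
    topk = np.argsort(scores)[-k:]           # indices of the k highest-scoring experts
    weights = np.exp(scores[topk] - scores[topk].max())
    weights /= weights.sum()                 # softmax over just the selected experts
    # Only the k chosen experts do any work; the other experts' parameters stay idle,
    # which is why active compute is a fraction of total parameter count.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.standard_normal(d)
gate_w = rng.standard_normal((n_experts, d))
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)  # touches 2 of 16 experts' weights
```

Here 2 of 16 experts run per input, mirroring (in miniature) how a 671B-parameter MoE model can activate only 37B parameters per token.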

[–] melroy@kbin.melroy.org 0 points 1 day ago (1 children)

I see, ok. I only want to add that DeepSeek is not the first or the only model to use a mixture-of-experts (MoE) architecture.

[–] yogthos@lemmy.ml 5 points 1 day ago

Ok, but it is clearly the first one to use this approach to such effect.

[–] Xavienth@lemmygrad.ml 2 points 1 day ago

The claim going around is that it uses 50x less energy