this post was submitted on 04 Apr 2024
293 points (94.3% liked)


It's a bit old, but I just learned it via the retro-dodo article here: https://retrododo.com/google-is-killing-retro-dodo/

[–] Rayspekt@lemmy.world 60 points 7 months ago (7 children)

Is it just me, or is 60 million a ridiculously small price for that whole dataset?

[–] bobburger@fedia.io 69 points 7 months ago (6 children)

To be fair it's a pretty terrible dataset. The AI is just going to say "this" to every question you ask

[–] altima_neo@lemmy.zip 31 points 7 months ago (1 children)
[–] Shimon@slrpnk.net 11 points 7 months ago (2 children)
[–] Anticorp@lemmy.world 3 points 7 months ago

"Reject humanity. Return to monke."

[–] InFerNo@lemmy.ml 2 points 7 months ago

& Knuckles, featuring Dante from the Devil May Cry series

[–] Anticorp@lemmy.world 5 points 7 months ago (2 children)
[–] GBU_28@lemm.ee 3 points 7 months ago

Ai:

😭 I'm trying

[–] captainlezbian@lemmy.world 3 points 7 months ago (1 children)

My heel turn as a mod back in the day was having automod remove lmgtfy links

[–] brygphilomena@lemmy.world 1 points 7 months ago

It was a weird day when I recently went to teach someone about lmgtfy and found the website dead. There are clones, but the original was so simple and great.

[–] jkrtn@lemmy.ml 3 points 7 months ago

Hey, now, be fair. There are some Top 40s song lyrics in there too.

[–] OfCourseNot@fedia.io 2 points 7 months ago

Yeah, and Google already has everything scraped and indexed

[–] Ebby@lemmy.ssba.com 16 points 7 months ago (2 children)

Perhaps, but not worth buying if you can't make profit or keep it from your competition.

60M is for almost 20 years of data, but once it's ingested, Google will only want new content. Next year, it'll be more like 3M if the dataset isn't poisoned by bots or the AI fad hasn't collapsed. Reddit will struggle with finances again and users will suffer. At least that's my prediction.

[–] empireOfLove2@lemmy.dbzer0.com 7 points 7 months ago (1 children)

Spez has already grifted his money out of the initial stock pump so it literally doesn't matter. Reddit could shut down tomorrow and he'd be happy as a clam.

[–] Ebby@lemmy.ssba.com 2 points 7 months ago* (last edited 7 months ago) (1 children)

Yeah, what a load. Though now they can boot his arse and save.

Edited to remove number.

[–] Anticorp@lemmy.world 1 points 7 months ago (1 children)

I doubt he's getting 120M per year. I think that big compensation package was a one-time deal. That's more than Satya Nadella makes.

[–] Ebby@lemmy.ssba.com 1 points 7 months ago* (last edited 7 months ago)

You're right. Total compensation was $193M for 2023 but that was a lot of stock too. It may have been one time like you said now that they went public. Hopefully enough to retire haha.

[–] Anticorp@lemmy.world -2 points 7 months ago (2 children)

the AI fad

LOL. Do you realize that makes you sound like Boomers talking about the internet in the late 90's and early 00's?

[–] Barbarian@sh.itjust.works 7 points 7 months ago

It currently looks very much like a bubble. After the dot com bubble, the internet didn't go away, but most companies died off and all the stupid monetisation went bankrupt.

We may be seeing something similar

[–] Ebby@lemmy.ssba.com 7 points 7 months ago

Haha! Wow, I guess so. I'll keep some shelf space available in the geezer museum next to 3D TVs, deep fakes, fidget spinners, and my pogs. :D

[–] qjkxbmwvz@startrek.website 12 points 7 months ago

I wonder if Google's unlimited legal budget plays a role. Not a lawyer, so probably way off here...

But, for example, reddit's success in part depends on Google ingesting their data: reddit shows up in Google searches all the time, which can only happen if Google uses reddit's content. So reddit telling Google "you can't use our content" doesn't work, and they need to say something like, "you can use our content for search results but you can't consume it as training data."

This is a pretty straightforward statement/request/demand, but one could imagine Google lawyers maliciously complying and throwing their hands up dramatically, claiming "well we use some amount of AI in our search results, so if we can't use your content for AI training then we can't risk using it for search results." Which would, I imagine, really, really hurt reddit (no Google results would be catastrophic I suspect).

So, perhaps the "low" 60M figure is just Google using their leverage.

Or not. As a random person on the Internet, I can say I'm probably not contributing anything meaningful here...

[–] GBU_28@lemm.ee 6 points 7 months ago

How quickly you forget that half of it is just "I also choose this guy's wife" and "the narwhal bacons at midnight"

[–] Zaktor@sopuli.xyz 4 points 7 months ago

I'm personally curious whether Reddit actually has any ability to protect that database. I don't remember Reddit's TOS, but usually those things give them a license to use and copy the data, maybe even to sell it, but not the actual copyright on it. So if someone made a Reddit scraper and copied the comments, wouldn't only the actual commenter be able to sue?

$60M may be reflecting that, in that it's more a convenience fee to shield Google against individual Redditors going after them than something that Reddit itself could actually sue over.

[–] trolololol@lemmy.world 2 points 7 months ago

Considering it's all full of Nazis and bots, and if you get to filter all of them out you're left with reposts and low quality memes followed by comments that represent the hostile side of each of us.... I'd say anything over $5 is a good deal for spez.

Now, I hope Google uses this data exclusively for detecting inappropriate answers. Can you imagine it giving answers based on the endless threads of "I'm not your mate, bro; I'm not your bro, dude..."?

[–] PoliticalAgitator@lemmy.world 1 points 7 months ago

It's more than they were making from third-party apps, hence the ridiculous API fees.