schizo

joined 5 months ago
[–] schizo@forum.uncomfortable.business 2 points 9 hours ago (2 children)

AI model of that type is safe to deploy anywhere

Yeah, I think you've made a mistake in thinking that this is going to be usable as generative AI.

I'd bet $5 this is just a fancy machine learning algorithm that takes a submitted image, does machine learning nonsense with it, and returns a 'there is a high probability this is an illicit image of a child', and not something you could use to actually generate CSAM with.

You want something that's capable of assessing the similarities between a submitted image and a group of known bad images, but that doesn't mean the dataset is in any way usable for anything other than that one specific task - AI/ML in use cases like this is super broad and has been a thing for decades before the whole 'AI == generative AI' thing became what everyone is thinking.

But, in any case: the PhotoDNA database is in one place and access to it is gated by, uh, lots of money?

And of course, any 'unscrupulous engineer' that may have any plans for doing anything with this is probably not a complete idiot, even if a pedo: they're going to have shockingly good access controls and logging and well, if you're in the US, if the dude takes this database and generates a couple of CSAM images using it, the penalty is, for most people, spending the rest of their life in prison.

Feds don't fuck around with creation or distribution charges.

[–] schizo@forum.uncomfortable.business 2 points 10 hours ago (4 children)

comparative scale of the content involved

PhotoDNA is based on image hashes, as well as some magic that works on partial hashes: resizing the image, or changing the focus point, or fiddling with the color depth or whatever won't break a PhotoDNA identification.
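To make the "resizing won't break it" part concrete, here's a minimal sketch of a generic perceptual hash. To be clear, this is NOT PhotoDNA's algorithm; it's a plain "average hash" compared by Hamming distance, swapped in purely for illustration. The function names, the 8x8 grid, and the `max_distance` threshold are all assumptions I made up for the example.

```python
def average_hash(pixels, size=8):
    """Hash a grayscale image given as a 2D list of ints (rows of pixels).

    Illustrative only: a generic average hash, not PhotoDNA.
    Returns an int holding size*size bits.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a size x size grid by averaging each block.
    for gy in range(size):
        y0 = gy * height // size
        y1 = max((gy + 1) * height // size, y0 + 1)
        for gx in range(size):
            x0 = gx * width // size
            x1 = max((gx + 1) * width // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: is it brighter than the overall mean?
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def is_match(candidate, known_hashes, max_distance=5):
    """Yes/no answer: is the candidate close to any known hash?

    max_distance is an assumed threshold, not a real-world setting.
    """
    return any(hamming(candidate, known) <= max_distance for known in known_hashes)
```

Because the hash is built from coarse block averages, a resized copy of an image lands on the same (or nearly the same) bits, and the caller only ever sees the yes/no from `is_match`, never the known set itself.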

But, of course, that means for PhotoDNA to be useful, the training set is literally 'every CSAM image in existence', so it's not really like you're training on a lot less data than an AI model would want or need.

The big safeguard, such as it is, is that you basically only query an API with an image and it tells you if PhotoDNA has it in the database, so there's no chance of the training data being shared.

Of course, there's also no reason you can't do that with an AI model, either, and I'd be shocked if that's not exactly how they've configured it.

[–] schizo@forum.uncomfortable.business 4 points 12 hours ago (1 children)

The problem I ran into is that every single platform that primarily interacted with Mastodon (The keys, etc.) had the exact same set of problems.

While yes, my Firefish instance had search, what was it searching? Local data only, and once I figured out that Mastodon-style replies didn't federate to all of someone's followers, it became pretty clear that it was uh, not very useful.

You can search, but any given server may or may not have access to data you actually want and thus, well, you just plain cannot meaningfully search for shit unless you go to one of the mega instances, or join giant piles of relays and store gigabyte upon gigabyte upon gigabyte of garbage data you do not care about.

The whole implementation is kinda garbage for search-based discovery, from its very basic design all the way through to everyone's implementations.

[–] schizo@forum.uncomfortable.business 4 points 12 hours ago (6 children)

first time law enforcement are sharing actual csam with a technology company

It's very much not: PhotoDNA, which is/was the gold standard for content identification, is a collaboration between a whole bunch of LEOs and Microsoft. The end user is only going to get a 'yes/no idea' result on a matched hash, but that database was built on real content working with Microsoft.

Disclaimer: below is my experience dealing with this shit from ~2015-2020, so ymmv, take it with some salt, etc.

Law enforcement is also rarely the first-responder to these issues, either: in the US, at least, reports will come to the hosting/service provider first for validation and THEN to NCMEC and LEOs, if the hosting provider confirms what the content is. Even reports that are sent from NCMEC to the provider aren't being handled by law enforcement as the first step, usually.

And as for validating reports, that's done by looking at it without all the 'access controls and safeguards' you think there are, other than a very thin layer of CYA on the part of the company involved. You get a report, and once PhotoDNA says 'no fucking clue, you figure it out' (which, IME, was basically 90% of the time) a human is going to look at it and make a determination, and then file a report with NCMEC or whatever, if it turns out to be CSAM.

Frankly, after having done that for far too fucking long, if this AI tool can reduce the amount of horrible shit someone doing the reviews has to look at, I'm 100% for it.

CSAM is (grossly) a big business, and the 'new content' funnel is fucking enormous. The reason an extremely delayed and reactive thing like PhotoDNA isn't all that effective is that, well, there's a fuckload of children being abused and a fuckload of abusers escaping being caught simply because there's too much shit to look at and handle effectively, and thus any response to anything is super super slow.

This looks like a solution to make it so fewer people have to be involved in validation, and it could be damn near instant in responding to suspected material that does need validation, which will do a good job of at least pushing the shit out of easy(ier?) availability and out of more public spaces, which, honestly, is probably the best that's going to be managed unless the countries producing this shit start caring and going after the producers, which I'm not holding my breath on.

[–] schizo@forum.uncomfortable.business 13 points 14 hours ago (4 children)

For me, it's full text search.

I tend to want to find an opinion on something very specific, so if I can just toss a phrase or model number or name of something into a search field and get actual non-AI, non-advertisement, non-stupid-shit results, that'd be absolutely ideal.

Like, say, how Google worked 15 years ago.

[–] schizo@forum.uncomfortable.business 7 points 14 hours ago (8 children)

First: you'd probably be shocked how many pedos have zero opsec and just post shit/upload shit in the plain.

By which I mean most of them, because those pieces of crap don't know shit about shit and don't encrypt anything and just assume crap is private.

And second, yeah, it'll catch kids generating CSAM, but it'll catch everyone else too, so that's probably a fair trade.

Install it and use it?

Their PDS is self hosted, but it does still rely on the central relays (though you COULD host that yourself if you wanted to pay for it, I suppose?).

It's very centralized, but it's not that different from what you'd have to do to make Mastodon useful: a small/single user instance will get zero content, even if you follow a lot of people, without also adding several relays to work around some of the design decisions made by the Mastodon team regarding replies and how federation works for those kinds of things, as well as to populate hashtags and searches and such.

Though really you shouldn't do any of that, and just use a good platform for discussion, like a forum or a threadiverse platform. (No seriously, absolutely hate "microblog" shit because it's designed to just be zingers and hot takes and not actual meaningful conversations.)

15 million Series A financing

Maybe shitty corporate search engines are failing me, but has there been a stated valuation for Bluesky? Googling 'Bluesky valuation' or any combination thereof is a problem since that's a business term so lol, lmao, search engine worthless.

$8m seed + $15m Series A may be a shockingly small amount of equity, or it could be the whole damn company, but I'm just not seeing it actually posted anywhere.

[–] schizo@forum.uncomfortable.business 48 points 1 day ago (13 children)

I gather that's a meme that's older than you are?

By linux ISOs I meant any content you're torrenting: movies, software, audio, my little pony porn, whatever.

[–] schizo@forum.uncomfortable.business 28 points 1 day ago (16 children)

Frankly, it probably means absolutely nothing.

Even when captain coffee cup was the FCC chairman, did you lose the ability to torrent linux isos? Did usenet stop working?

I wouldn't expect anything different this time, either.

[–] schizo@forum.uncomfortable.business 687 points 2 days ago (24 children)

Sounds good to me?

 

Made this mostly because I've found putting RSS feeds into Lemmy useful since my doom-scrolling has reduced to just Lemmy and figured I'm probably not the only person that'd find this useful.

It's pulling 6 RSS feeds that provide free games for Steam, Gog, Epic, and Humble.
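For anyone curious what "pulling a handful of RSS feeds" boils down to, here's a rough sketch using only the Python standard library. This is not the actual code behind the community; the function names are hypothetical, it assumes plain RSS 2.0 with `<item>`, `<title>`, and `<link>` elements, and you'd supply your own feed URLs.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch(url, timeout=10):
    """Download one feed and return its raw bytes (url is supplied by you)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def parse_rss(xml_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:  # skip entries with nothing to link to
            items.append((title, link))
    return items


def merge_feeds(feeds):
    """Combine several feeds, dropping entries whose link was already seen."""
    seen = set()
    merged = []
    for xml_text in feeds:
        for title, link in parse_rss(xml_text):
            if link not in seen:
                seen.add(link)
                merged.append((title, link))
    return merged
```

Typical use would be `merge_feeds(fetch(u) for u in urls)`. Deduplicating on the link matters because the same giveaway tends to show up in more than one feed.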

Nothing shockingly world-changing, but hey, free games.

!freegames@forum.uncomfortable.business

 

So I'm looking for a laptop, but before you downvote and move on, I've got a twist: I'm looking for a laptop with Linux support that's going to intentionally be console-only and rely on TUIs to make a lower-distraction device.

I was looking at older Thinkpads with 4:3 screens and the good keyboard before Lenovo went all chiclet with them, but I'm kinda concluding they're both way too expensive AND way too old to be a reasonable choice at this point.

An X220 or T40-whatever would be great and be the perfect aesthetic, but they're expensive, hard to find parts for, and using enough crusty old shit that this becomes yet another delve into retro computing and not one into practical, useful computing, which is the goal here.

So, anyone have any recommendations of any devices in the last decade that have a reasonable keyboard, screen, use modern enough components that you can source new drives and RAM and batteries and such, and preferably aren't coated in a coating that's going to turn to sticky goo?

Thin(ner) and light(er) would be nice, but probably not a dealbreaker if the rest of the pieces align. This will be almost entirely used at a table for writing and such.

 

Basically, the court said that algorithmically selected content doesn't qualify for Section 230 protections, which could have a massive impact on every social media platform out there that has any sort of algorithm selecting content, which, well, is all of them.

Definitely something that's going to be interesting watching play out.

 

I have a question for the hive mind: what is the point of this, exactly?

I mean, I understand the attempt to gain access, and I understand why 2FA codes can be valuable to attempt to phish, but that's like, not the thing here.

They just spam dozens to hundreds of these (I'm showing over 400 in my inbox right now) but like, even if I WANTED to give these codes to the attacker, I have no damn clue who the dude in China that's doing this is.

I'm confused as to what they hope to gain by trying over and over and over every couple of hours because it feels like there's no upside to whoever is running this bot, but I probably have missed a memo on some TTP around this, heh.

 

I'm wanting to add a bunch of energy monitoring stuff so I can both track costs, and maybe implement automation to turn stuff on and off based on power costs and timing.
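To be concrete about what I mean by "based on power costs and timing", here's a rough sketch of the decision logic. It doesn't talk to any real plug or relay API; the function names, the price threshold, and the time window are all made-up assumptions for illustration.

```python
from datetime import time


def energy_cost(watts, hours, price_per_kwh):
    """Cost of running a load: kW * hours * price per kWh."""
    return watts / 1000 * hours * price_per_kwh


def in_window(now, start, end):
    """True if 'now' falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_be_on(now, price_per_kwh, max_price, start, end):
    """Turn the device on only inside the allowed window AND when power is cheap enough."""
    return in_window(now, start, end) and price_per_kwh <= max_price
```

For example, `should_be_on(time(23, 30), 0.10, 0.15, time(22, 0), time(6, 0))` says yes for an overnight window with cheap power, and the same call with a price of 0.30 says no.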

I'm using some TP-Link based plugs right now which are like, fine, but I'm wanting to add something like 6 to 10 more monitoring devices/relays.

Anyone have experience with a bunch of Shelly devices and if there's any weird behavior I should be aware of?

Assume I have good enough wifi to handle adding another 10 devices to it, but beyond that any gotchas?
