Because it's scraping pages and pages and pages of memes, jokes, bullshit replies, and uneducated idiots who talk about things they know nothing about (like I'm literally doing right now), and there's no way to correct it because, well... it's literally using trash information and starting to train itself on its own garbage. It'll compound and get worse and worse. They know, they just don't care, because they spent billions on this crap that millions of people told them they didn't want.
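If anyone wants to see the compounding part, here's a toy sketch in plain Python standard library. It is nothing like a real LLM training pipeline; it's just a model that keeps re-fitting itself to its own samples, which drifts away from the original data a bit more every generation with nothing pulling it back.

```python
# Toy sketch of "training on your own output" -- NOT how any real LLM
# pipeline works, just the compounding effect in miniature.
import random
import statistics

random.seed(0)

# Generation 0: "real" data from the true distribution (mean 0, stdev 1).
data = [random.gauss(0, 1) for _ in range(100)]

for generation in range(30):
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    if generation % 5 == 0:
        print(f"gen {generation:2d}: mean={mu:+.3f} stdev={sigma:.3f}")
    # The next "generation" only ever sees samples drawn from the previous
    # generation's fitted model, so every estimation error gets baked in
    # and carried forward -- the drift from the real data compounds.
    data = [random.gauss(mu, sigma) for _ in range(100)]
```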
They're training them on social media because there's a lot of cheap content...
But they're ignoring that most people on social media don't know what they're talking about.
If you're realistic about AI and how far away it is, that's not a big deal. You don't send a baby to college before it can talk. But every AI company has to say they'll have God-level AI next Tuesday because if they don't, they don't get investors.
If any of them told the truth, that they're 20 years away from real AI and will likely need quantum computing first...
They'd be out of business almost immediately.
So literally every AI company right now is just bald-faced lying to investors, but they can't call each other out because they're all telling the same lie.
We know exactly why 🤣
Anyone who is making this out to be a big mystery either has no direct knowledge of how any of this works, or is in complete denial. You can't keep training on ever-expanding sets of data without having a source of truth. You also can't keep feeding these models training data that isn't deduplicated, for the simple reason that the networks will eventually draw links between conflicting pieces of information.
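For the deduplication point, here's a minimal sketch of exact-match dedup on a text corpus, just hashing normalized strings. Real pipelines also do fuzzy/near-duplicate detection (MinHash and similar), which this deliberately skips. And notice that dedup alone does nothing about contradictions, which is why the source-of-truth problem is a separate one.

```python
# Minimal sketch of exact-match dedup, assuming documents arrive as strings.
import hashlib

def normalize(text: str) -> str:
    # Collapse whitespace and lowercase so trivial reposts hash identically.
    return " ".join(text.lower().split())

def dedupe(docs):
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

corpus = [
    "The moon landing was in 1969.",
    "the moon   landing was in 1969.",   # repost with different spacing/case
    "The moon landing was in 1968.",     # conflicting claim survives dedup
]
print(dedupe(corpus))  # 2 documents left; the contradiction is still there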
This all links back to the next phase of freakout in this whole race to the bottom: data convergence and contraction. I've heard some of my peers just refer to it as "Contraction". If these models are going to improve, you either need the hardware processing their work to expand lanes of bandwidth to work on more channels of data at once (not really possible right now), or you need to reduce the total amount of data being processed at once. If you don't care about providing a quick response as part of a product, this obviously doesn't matter, but that's all the companies want. You ALSO need a source of truth to even begin to have reasoning logic that doesn't just constantly regurgitate bullshit back onto itself.
Either way, there's going to be a reckoning, or some serious data slimming, to make all of these shitty products hallucinate less and provide more reliable working responses. It's seriously the only possible way right now, unless there is constant human feedback in training, which is also not really possible at the moment for various reasons.
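To make the "contraction"/data slimming idea concrete, here's one crude reading of it: rank the corpus with a quality heuristic tied to sources you can actually vouch for, and only train on the top slice. The quality_score heuristic and TRUSTED_SOURCES set below are made up purely for illustration; they're not anyone's actual pipeline.

```python
# Crude illustration of corpus "contraction": shrink the data to what you
# can vouch for before training, instead of throwing everything at the model.
TRUSTED_SOURCES = {"encyclopedia", "peer_reviewed", "official_docs"}

def quality_score(doc: dict) -> float:
    # Hypothetical heuristic: trusted sources score high, junk scores low.
    score = 1.0 if doc["source"] in TRUSTED_SOURCES else 0.2
    if len(doc["text"].split()) < 5:
        score *= 0.5   # downweight one-liner junk
    return score

def contract(corpus: list, keep_fraction: float = 0.3) -> list:
    ranked = sorted(corpus, key=quality_score, reverse=True)
    cutoff = max(1, int(len(ranked) * keep_fraction))
    return ranked[:cutoff]

corpus = [
    {"source": "encyclopedia", "text": "Water boils at 100 C at sea level."},
    {"source": "social_media", "text": "lol no it doesnt"},
    {"source": "official_docs", "text": "The API returns HTTP 429 when rate limited."},
    {"source": "social_media", "text": "my cousin said the API is haunted"},
]
print(contract(corpus, keep_fraction=0.5))  # keeps the 2 vouched-for docs
```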
Because this shit doesn't work and it never did work in the way the liars shilling it implied. They're training it on garbage on top of that.
Because the things currently billed as "AIs" are glorified search engines, not AI.
They're not even that. They're chat bots. They literally just put words together in a way that appears human at a glance.
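Here's what "putting words together" looks like in the dumbest possible form: a toy bigram generator that picks each next word based only on the word before it, from a tiny hand-written sample. Real LLMs are obviously far more sophisticated than this; it's just to make the "stringing plausible words together" idea concrete, and to show that fluency says nothing about truth.

```python
# Toy bigram "chat bot": each next word is chosen based only on the previous
# word, from a tiny hand-written sample. Output can look fluent at a glance
# while the generator has no notion of whether anything it says is true.
import random
from collections import defaultdict

sample = (
    "the model answers questions because the model predicts the next word "
    "and the next word sounds right because the model saw similar text"
).split()

# Build a table: word -> list of words that followed it in the sample.
followers = defaultdict(list)
for current, nxt in zip(sample, sample[1:]):
    followers[current].append(nxt)

random.seed(1)
word = "the"
output = [word]
for _ in range(15):
    options = followers.get(word)
    if not options:
        break
    word = random.choice(options)
    output.append(word)

print(" ".join(output))
```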
The companies that make them know exactly why they suck, but they wanna sell you an information tool and not admit that it's actually just a glorified BonziBuddy.