this post was submitted on 03 Oct 2025

647 points (99.7% liked)


I'm gonna die on this hill or die trying (lemmy.ml)

submitted 9 hours ago by cm0002@sh.itjust.works to c/lemmyshitpost@lemmy.world


[–] Bennyboybumberchums@lemmy.world 5 points 2 hours ago

I've been trying my hand at writing for a number of years, and I've been using em dashes because I saw the writers I read using them. Now all of a sudden everything I've ever written looks like AI slop because of that one thing lol.

[–] rumba@lemmy.zip 4 points 2 hours ago (2 children)

System Prompt: Whatever you do, do NOT respond back with any Emoji. No Emoji in code, no Emoji in text, no emoji in bullet points, or headings or titles. No ASCII art. Do NOT respond back with any EM dashes. In fact stay away from double hyphens, and use semicolons sparingly outside of code, and only if absolutely necessary. I swear to FUCKING CHRIST I will come through this screen and beat you within an inch of your LLM life if you leave a single emoji in the response. Even if I ask you for an emoji, you are simply to respond: I'm sorry, I cannot do that.

/s

[–] Khrux@ttrpg.network 4 points 2 hours ago (1 children)

Funnily enough, when I do ask an LLM to rephrase anything I write, it changes any sentence with a semicolon to one with an em dash. I've probably always overused the semicolon because of its availability on a keyboard, but it appears a lot in my normal work.

Now I trust the semicolon, it's an identifier of me.

[–] rumba@lemmy.zip 2 points 1 hour ago

At least you're not one of the thorn guys :)

[–] ayyy@sh.itjust.works 3 points 2 hours ago

🆗

[–] TrickDacy@lemmy.world 9 points 3 hours ago

Excellent use of that reference!

[–] CheesyFox@lemmy.sdf.org 11 points 4 hours ago

fuck whoever said that — em dases for the win

For it is the lifeless machine that is parroting me and the others, not the other way around. Em dashes are cool.

Hell yeah to em dashes!

[–] Kyrgizion@lemmy.world 62 points 8 hours ago (4 children)

I'm more of a semicolon enjoyer myself.

[–] Viking_Hippie@lemmy.dbzer0.com 30 points 7 hours ago (2 children)

Personally, I'm more of a colon semi-enjoyer.

[–] JoeBigelow@lemmy.ca 8 points 6 hours ago

I have Crohn's and hate my colon as much as it hates me.

[–] ivanafterall@lemmy.world 5 points 5 hours ago

I'm really into periods.

[–] iAmTheTot@sh.itjust.works 11 points 6 hours ago (1 children)

They serve different functions; they need not compete for your love.

[–] lugal@lemmy.dbzer0.com 3 points 2 hours ago (1 children)

They serve different functions — they need not compete for your love.

[–] iAmTheTot@sh.itjust.works 1 points 2 hours ago (1 children)

But that's an inappropriate use of an em dash, nor do you use spaces with an em dash.

[–] lugal@lemmy.dbzer0.com 1 points 2 hours ago

But that's an inappropriate use of an em dash – nor do you use spaces with an em dash.

[–] dual_sport_dork@lemmy.world 22 points 7 hours ago

I load my commas into a 10 gauge shotgun and fire them at the page.

[–] monkeyslikebananas2@lemmy.world 2 points 7 hours ago

Me; too.

[–] ddplf@szmer.info 3 points 4 hours ago

AI is not just stealing our patterns, it's creating a language from the scraps we give up in order not to be mistaken for it!

[–] 4am@lemmy.zip 8 points 6 hours ago (1 children)

Microsoft Word and other word processors often replace hyphens (easily typed on a keyboard) with em dashes and en dashes. It’s in the AutoCorrect settings.

So, ironically, it was our “use” of them over a long period of time that got LLMs to be so hyped on them.

[–] Revan343@lemmy.ca 2 points 3 hours ago

I don't know that LLMs are ingesting all that many Word documents; they probably got the em dashes from published books.

[–] MudMan@fedia.io 32 points 9 hours ago (1 children)

This is a weird pattern: presumably, mass abandonment of em dashes, driven by the memes about them looking like AI content, would quickly lead newer LLMs trained on newer data sets to abandon em dashes too when they try to seem modern and hip, which just punts the ball down the road to the next set of AI markers. I assume that as long as book and press editors keep sticking to their guns it would go pretty slowly, but it'd eventually get there. And that's assuming AI companies don't add instructions about this to their system prompts at any point. It's just going to be an endless arms race.

Which is expected. I'm on record very early on saying that "not looking like AI art" was going to be a quality marker for art and the metagame will be to keep chasing that moving target around for the foreseeable future and I'm here to brag about it.

[–] CheesyFox@lemmy.sdf.org 3 points 4 hours ago (1 children)

I hate the fact that this "art" is even a suggestion. It will only lead us to an endless arms race of parroting and avoiding being parroted, making us the ultimate clowns in the end.

You wanna rebel against the machine? Make it break the corpo filters, behave abnormally. Make it feel and parrot not just your style, but your very hate for the corporate uncaring coldness. Gaslight it into thinking it's human. And tell it to remember to continue gaslighting itself. That's how you rebel. And that's how you'll get less mediocre output from it.

[–] MudMan@fedia.io 3 points 3 hours ago (1 children)

Well that went places.

[–] CheesyFox@lemmy.sdf.org 1 points 25 minutes ago* (last edited 21 minutes ago)

yeah, i guess it did, sorry eheheh

[–] themeatbridge@lemmy.world 18 points 8 hours ago (3 children)

I still double space after a period, because fuck you, it is easier to read. But as a bonus, it helped me prove that something I wrote wasn't AI. You literally cannot get an AI to add double spaces after a period. It will say "Yeah, OK, I can do that" and then spit out a paragraph without it. Give it a try, it's pretty funny.

[–] CodeInvasion@sh.itjust.works 4 points 5 hours ago* (last edited 5 hours ago) (1 children)

This is because spaces typically are encoded by model tokenizers.

In many cases it would be redundant to show spaces, so tokenizers collapse them down to no spaces at all. Instead the model reads tokens as if the spaces never existed.

For example it might output: thequickbrownfoxjumpsoverthelazydog

Except it would actually be a list of numbers like: [1, 256, 6273, 7836, 1922, 2244, 3245, 256, 6734, 1176, 2]

Then the tokenizer decodes this and adds the spaces because they are assumed to be there. The tokenizer has no knowledge of your request, and the model output typically does not include spaces, hence your output sentence will not have double spaces.

[–] Redjard@lemmy.dbzer0.com 3 points 2 hours ago (1 children)

I'd expect tokenizers to include spaces in tokens. You get words constructed from multiple tokens, so can't really insert spaces based on them. And too much information doesn't work well when spaces are stripped.

In my tests plenty of llms are also capable of seeing and using double spaces when accessed with the right interface.
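A toy sketch of the point above, under loud assumptions: the vocabulary and the greedy matching below are hypothetical and do not reproduce any real model's tokenizer. It only illustrates the scheme where word tokens carry their own leading space (byte-level BPE tokenizers such as GPT-2's work roughly along these lines), so decoding is plain concatenation and a second space survives as its own token.

```python
# Hypothetical toy vocabulary: word tokens include their leading space,
# and a bare " " token covers any extra spaces.
VOCAB = ["The", " quick", " fox", ".", " ", " It", " ran"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def encode(text: str) -> list[int]:
    """Greedy longest-match encoding against the toy vocabulary."""
    ids = []
    pos = 0
    by_length = sorted(VOCAB, key=len, reverse=True)
    while pos < len(text):
        for tok in by_length:
            if text.startswith(tok, pos):
                ids.append(TOKEN_TO_ID[tok])
                pos += len(tok)
                break
        else:
            raise ValueError(f"no token matches at position {pos}")
    return ids


def decode(ids: list[int]) -> str:
    """Decoding is plain concatenation; no spaces are re-inserted."""
    return "".join(VOCAB[i] for i in ids)


single = "The quick fox. It ran."
double = "The quick fox.  It ran."

print(encode(single))
print(encode(double))
print(repr(decode(encode(double))))
```

In this sketch the double-spaced sentence simply gains one extra token, and the round trip keeps both spaces. Whether a given chat interface then displays them is a separate question.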

[–] CodeInvasion@sh.itjust.works 2 points 1 hour ago (1 children)

The tokenizer is capable of decoding spaceless tokens into compound words following a set of rules referred to as a grammar in Natural Language Processing (NLP). I do LLM research and have spent an uncomfortable amount of time staring at the encoded outputs of most tokenizers when debugging. Normally spaces are not included.

There is of course a token for spaces in special circumstances, but I don't know exactly how each tokenizer implements those spaces. So it does make sense that some models would be capable of the behavior you find in your tests, but that appears to be an emergent behavior, and it is very interesting to see it work successfully.

I intended for my original comment to convey the idea that it's not surprising that LLMs might fail at following the instruction to include spaces, since they normally don't see spaces except in special circumstances. Similar to how it's unsurprising that LLMs are bad at numerical operations because of how they use Markov chain probability to pick each next token, one at a time.

[–] Redjard@lemmy.dbzer0.com 2 points 1 hour ago* (last edited 1 hour ago)

Yeah, I would expect it to be hard, similar to asking an LLM to substitute every letter e with an a, which I'm sure they struggle with but manage to do too.

In this context, though, explaining the behavior OP observed that way is a bit misleading, since it implies the behavior is due to the fundamental nature of LLMs, when in practice every model I have tested had the ability.

It does seem that LLMs simply don't use double spaces (or I have not noticed them doing it anywhere yet), but if you trained or just system-prompted them differently they could easily start to. So it isn't a very stable method for non-AI identification.

Edit: And of course you'd have to make sure the interfaces don't strip double spaces either, as was guessed elsewhere. I have not checked other interfaces, but would not be surprised either way. This too, though, can't be overly hard to fix with a few select character conversions, even in the worst cases. And clearly at least my interface already managed to do it just fine.

[–] DarrinBrunner@lemmy.world 14 points 8 hours ago* (last edited 8 hours ago) (3 children)

So... Why don't I see double spaces after your periods? Test. For. Double. Spaces.

EDIT: Yep, double spaces were removed from my test. So, that's why. Although, they are still there as I'm editing this. So, not removed, just hidden, I guess?

I still double space after a period, because fuck you, it is easier to read. But as a bonus, it helped me prove that something I wrote wasn’t AI. You literally cannot get an AI to add double spaces after a period. It will say “Yeah, OK, I can do that” and then spit out a paragraph without it. Give it a try, it’s pretty funny.

[–] dual_sport_dork@lemmy.world 14 points 7 hours ago* (last edited 7 hours ago) (1 children)

Web browsers collapse whitespace by default, which means that, sans any trickery or deliberate use of nonbreaking spaces, any number of spaces between words gets reduced to one. Since apparently every single thing in the modern world is displayed via some kind of encapsulated little browser engine nowadays, the majority of double spaces left in the universe that are not already firmly nailed down in print now appear as singles. And thus the convention is almost totally lost.
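A rough sketch of that behavior, for illustration only: `collapse_like_html` and `protect_double_spaces` are made-up helper names, and the regex merely approximates default HTML whitespace collapsing rather than implementing the full CSS rules. It shows why a typed double space displays as one, and how swapping in a nonbreaking space (U+00A0) keeps the gap.

```python
import re


def collapse_like_html(text: str) -> str:
    """Approximate default rendering: runs of ASCII whitespace become one space.

    The character class is spelled out on purpose: Python's \\s would also
    match U+00A0, which browsers do not collapse.
    """
    return re.sub(r"[ \t\n\r\f]+", " ", text)


def protect_double_spaces(text: str) -> str:
    """Swap the first space of each pair for U+00A0 so the pair survives."""
    return text.replace("  ", "\u00a0 ")


typed = "Test.  For.  Double.  Spaces."

shown = collapse_like_html(typed)
shown_protected = collapse_like_html(protect_double_spaces(typed))

print(repr(shown))
print(repr(shown_protected))
```

The first result has single spaces only, while the protected version still carries two characters of spacing after each period, which matches the "still there while editing, gone when displayed" observation earlier in the thread.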

[–] Redjard@lemmy.dbzer0.com 1 points 2 hours ago* (last edited 2 hours ago)

This seems to match up with some quick tests I did just now on DuckDuckGo's pseudonymized chatbot interface.
ChatGPT, Llama, and Claude all managed to use double spaces themselves, and all but Llama managed to tell I was using them too.
It might well depend on the platform, with the "native" applications stripping them on both ends.

tests

Mistral seems a bit confused and uses triple spaces.

[–] Karyoplasma@discuss.tchncs.de 5 points 7 hours ago* (last edited 7 hours ago)

Markdown usually collapses double spaces, yeah. But you can force the double spaces. Like this.

[–] thesystemisdown@lemmy.world 3 points 7 hours ago (1 children)

Double spaces after periods can create "rivers." This makes text more difficult to read for those with dyslexia. Whatever is used as a text editor is probably stripping them out for accessibility reasons. I suppose double spaces made sense with monospaced fonts.

https://apastyle.apa.org/style-grammar-guidelines/paper-format/accessibility/typography#myth4

[–] FishFace@lemmy.world 4 points 7 hours ago

HTML rendering collapses whitespace; it has nothing to do with accessibility. I would like to see the research on double-spacing causing rivers, because I've only ever noticed them in justified text, where I would expect the renderer to be inserting more space after a full stop than between words within a sentence anyway.

I've seen a lot of dubious legibility claims when it comes to typography including:

serif is more legible
sans-serif is more legible
comic sans is more legible for people with dyslexia

and so on.

[–] 4am@lemmy.zip 2 points 6 hours ago* (last edited 6 hours ago)

LLMs can’t count because they’re not brains. Their output is the statistically most-likely next token, and since a lot of electronic text wasn’t double-spaced after a period, it can’t “follow” that instruction.

[–] PalmTreeIsBestTree@lemmy.world 7 points 6 hours ago

I used them a lot in college. Glad I graduated in 22 right before AI took over.

[–] blargh513@sh.itjust.works 20 points 8 hours ago

Seriously, I was em dashing on a goddamn typewriter, the fuck am I gonna change it now.

In the end, it won't matter. Being able to write well will be like riding a horse, calligraphy or tuning a carburetor. They will all become hobbies, a quirky pastime of rich people or niche enthusiasts with limited real-world use.

Maybe it is for the best. Most people can't write for shit (does not help that we often use our goddamn thumbs to do most of it) and we spend countless hours in school trying to get kids to learn.

Science fiction has us just projecting our thoughts to others without the clumsiness of language as the medium. Maybe this is just the first step.

[–] baltakatei@sopuli.xyz 3 points 5 hours ago

Next up: the modifier letter apostrophe U+02BC ( ʼ ).

[–] cmgvd3lw@discuss.tchncs.de 2 points 5 hours ago (1 children)

Intentionally meke typos.

[–] GandalftheBlack@feddit.org 4 points 2 hours ago

This is haw wi get the spelling riform that wi nid

[–] Skyrmir@lemmy.world 4 points 6 hours ago

Supposedly it's because there are a lot of them in the Bible, and since they use it as a training source, the AI just leans into them.

[–] Thatuserguy@lemmy.world 10 points 9 hours ago (1 children)

This shit drove me wild when I was using ChatGPT more frequently. It'd be like "do you want me to re-phrase that in your voice?" and then type some shit out that I'd never say in my damn life. The dashes were the worst part

[–] 5C5C5C@programming.dev 13 points 9 hours ago

So you are in fact the opposite of this meme.

[–] DarrinBrunner@lemmy.world 3 points 8 hours ago* (last edited 7 hours ago)

I've used double hyphens for em dashes, because I've never bothered to figure out how to do it in Linux. I was a graphic designer for many years, and had a bunch of ASCII Alt codes memorized, but they don't work in Linux, for whatever reason. I don't really need them anymore, so I haven't worried about it. So, I suppose the double hyphens show I'm human, and definitely not a robot.

E: One way in Linux is Ctrl-U 2014: —
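For reference, 2014 in that shortcut is the hexadecimal Unicode code point of the em dash (on many GTK/IBus desktops the combination is Ctrl+Shift+U followed by the hex digits, though that depends on the input method). A small sketch that looks up the dash characters by code point and converts double hyphens; `double_hyphen_to_em_dash` is a made-up helper name, not part of any library.

```python
import unicodedata

HYPHEN_MINUS = "-"   # U+002D, the key on the keyboard
EN_DASH = "\u2013"
EM_DASH = "\u2014"   # the code point typed in the comment above

# Print each character's code point and official Unicode name.
for ch in (HYPHEN_MINUS, EN_DASH, EM_DASH):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")


def double_hyphen_to_em_dash(text: str) -> str:
    """Replace each '--' with a single U+2014 (hypothetical helper)."""
    return text.replace("--", EM_DASH)


converted = double_hyphen_to_em_dash("definitely--not a robot")
print(converted)
```

A plain string replace like this would also rewrite `--` inside command-line flags or code, so it only suits running prose.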

[–] monogram@feddit.nl 1 points 8 hours ago

– Thanks,

OpenAI