this post was submitted on 18 Sep 2025

151 points (100.0% liked)

Data is Beautiful


Mentions of the word "fascism" and its derivatives in Pravda, the main Soviet newspaper, from 1938 to 1942 (media.piefed.social)

submitted 1 week ago by misk@piefed.social to c/dataisbeautiful@mander.xyz


via https://www.reddit.com/r/europe/comments/13chr5f/mentions_of_the_word_fascism_and_its_derivatives/

Percentage of pages mentioning "fascism" and its derivative words, from January 1938 until December 1942. Darkest blue is front page, the lightest blue is 6+ pages. Letters at the bottom are months.
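A minimal sketch of how a chart like this could be computed, not the original author's code. It assumes each archive page is available as a `(month, page_number, text)` record and that "6+" means page six and beyond; the record layout, the `bucket` and `share_of_pages` names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def bucket(page_number: int) -> str:
    """Map a page number to an assumed chart bucket: "1".."5", then "6+"."""
    return str(page_number) if page_number < 6 else "6+"


def share_of_pages(pages, keyword="фаши"):
    """Return {(month, bucket): percent of pages whose text contains keyword}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for month, page_number, text in pages:
        key = (month, bucket(page_number))
        totals[key] += 1
        # Crude substring test on the lowercased page text.
        if keyword in text.lower():
            hits[key] += 1
    return {key: 100.0 * hits[key] / totals[key] for key in totals}


# Made-up sample records, not real Pravda data.
sample = [
    ("1941-06", 1, "Фашистская Германия напала на СССР"),
    ("1941-06", 1, "Сводка погоды"),
    ("1941-06", 7, "Спортивные новости"),
]
result = share_of_pages(sample)
```

The substring test is the weakest part here: it misses OCR errors and would need a more careful matcher for anything beyond a rough count.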

Source, based on data from the Pravda Digital Archive.


[–] IcedRaktajino@startrek.website 18 points 1 week ago* (last edited 1 week ago) (3 children)

I would have to do some deep searches to plug in actual data, but based on how it feels:

Mentions of the word "fascism" and its derivatives on Lemmy

[–] MelastSB@sh.itjust.works 21 points 1 week ago (1 children)

"There's no rise of fascism, we're maxed out!"

[–] IcedRaktajino@startrek.website 8 points 1 week ago

It's also for 2025. We're so maxed out, it flowed over into October through December and filled them up lol.

[–] grue@lemmy.world 7 points 1 week ago

It's eminently topical and (unlike corporate social media) not suppressed, so what did you expect?

[–] podbrushkin@mander.xyz 3 points 1 week ago (1 children)

I was going to remind everyone how good the “filtered keywords” function is, but it suddenly struck me that I shouldn’t have seen this post in the first place. Apparently, I saw it because my block list had a flaw.

[–] IcedRaktajino@startrek.website 1 points 1 week ago (1 children)

What app is that?

[–] podbrushkin@mander.xyz 2 points 1 week ago (1 children)

It’s Voyager for iOS https://mander.xyz/post/37815572

[–] IcedRaktajino@startrek.website 1 points 1 week ago

I'm on Android but will still check it out. Thanks.

[–] Skullgrid@lemmy.world 11 points 1 week ago (1 children)

Comrade Gitler is helping us maintain peace in Europe, until he very suddenly is not.

[–] DragonTypeWyvern@midwest.social 4 points 1 week ago

For someone who was a canny enough backstabbing tyrant to murder his way into leadership, it's always struck me as strange that Stalin was taken by surprise when Hitler betrayed their treaty.

Everyone but him knew it was coming. There's no way he couldn't have known; even if he thought it would happen later, he shouldn't have been surprised. Hating socialism in general and Bolshevism in particular was like half of their thing.

[–] podbrushkin@mander.xyz 7 points 1 week ago (1 children)

This is extremely interesting. How many magazines and newspapers are digitized in a way that lets you analyze them like that? This is a simple word-based analysis, but those texts can also be enriched with metadata, e.g. mentions of people can be tagged with their Wikidata identifiers.

[–] misk@piefed.social 12 points 1 week ago (1 children)

With Slavic languages it’s never a simple word-based analysis: they’re highly inflected, so you need to lemmatise the text first, which is a bit hit or miss when done automatically. I assume the author of the graph did that manually for fascism because it’s a loanword and there’s less ambiguity to account for, but it gets tedious quite fast if you really get into it. I got into it once in my mother tongue (Polish) and that rabbit hole goes really deep. Here’s a brief overview of what that process looks like.
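To make the inflection problem concrete, here is a rough sketch of matching one loanword stem by hand instead of running a lemmatiser. The pattern and the `find_forms` name are my own assumptions, not what the graph's author actually used, and the sample sentence is made up.

```python
import re

# Assumed stem pattern: optional prefix, then "фаши" + "зм" or "ст",
# then any inflectional ending. This is stem matching, not lemmatisation.
FASCISM_FORMS = re.compile(r"\w*фаши(?:зм|ст)\w*", re.IGNORECASE)


def find_forms(text: str) -> list[str]:
    """Return every word in text built on the assumed stem."""
    return FASCISM_FORMS.findall(text)


# Made-up sample sentence; "Машина" is there as a near miss.
sample = "Фашизм и фашистские агрессоры; антифашистский фронт. Машина."
forms = find_forms(sample)
```

This works for a loanword with a distinctive stem, but it does nothing for native words whose stems change or collide with other words, which is where proper lemmatisation becomes necessary.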

[–] podbrushkin@mander.xyz 2 points 1 week ago (1 children)

Some software solutions exist, e.g. War and Peace by Tolstoy can be downloaded with metadata: IDs are assigned to all characters, and when one character says something to another, it is marked up as “x speaks to y”, so you can run community detection algorithms on this data. I think the paper mentioned some proprietary software. I suspect detecting who speaks to whom is even harder.
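A small sketch of the step before community detection, assuming the markup can be exported as `(speaker, listener)` pairs. The character IDs, the event list, and the `build_graph` name are hypothetical and not taken from the dataset mentioned above.

```python
from collections import Counter


def build_graph(events):
    """Collapse (speaker, listener) events into undirected weighted edges."""
    weights = Counter()
    for speaker, listener in events:
        if speaker != listener:
            # Sort the pair so A->B and B->A count toward the same edge.
            weights[tuple(sorted((speaker, listener)))] += 1
    return weights


# Hypothetical "x speaks to y" events with made-up character IDs.
events = [
    ("pierre", "andrei"),
    ("andrei", "pierre"),
    ("natasha", "pierre"),
    ("natasha", "sonya"),
]
edges = build_graph(events)
```

The resulting weighted edge list is the kind of input a graph library expects; for example, it could be loaded into NetworkX and passed to `greedy_modularity_communities`, though that step is left out here.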

Also, some form of crowdsourcing should probably be possible. At the very least, scans can be collected on Wikisource and Wikimedia Commons.

AI language models should probably be pretty good at resolving linguistic ambiguities.

I dream of a time when reports like the one in OP's post will take an hour or two of work, because the data will already be collected and clean.

[–] misk@sopuli.xyz 2 points 1 week ago* (last edited 1 week ago)

LLMs can’t deal with highly inflected languages at the moment. In English or Chinese you can assign tokens to entire words without having to account for word morphology (working on tokens rather than letters is also why models fail at counting letters in words), but that falls apart quickly in Polish or Russian. The way models like ChatGPT work now is that they do their “reasoning” in English first and translate back to the query language at the end.

[–] Hackworth@sh.itjust.works 6 points 1 week ago* (last edited 1 week ago)

US Google searches for "fascist". "Fascism" has a similar peak now, but had bigger peaks in 2017 and around COVID.