RoundSparrow

joined 1 year ago
[–] RoundSparrow@lemmy.ml 1 points 1 year ago* (last edited 1 year ago)

who would have predicted that Elon Musk would do all the wild things he did with Twitter. Reddit pissing everyone off in June... pretty odd how audiences are behaving in 2023 towards all this. Oh yha, Threads, that coming on the scene too. 2023 has really been odd for audiences.

The SQL speaks for itself, but I don't know what's going on in terms of why people are treating social media platforms like Lemmy, Twitter, Threads, Reddit this year so unusually. This SQL statement kind of thing has been covered in so many books, conferences, etc. It's like forgotten history now in the era of Elon Musk X and Reddit Apollo times.

I don't know what to say other than I can try to hire a translator or teacher to explain how this SQL problem is obvious and well understood 13 years ago. I mean, there was a whole "NoSQL movement" because of this kind of thing. But I clearly can't get people to hear past all the Elon Musk, Threads, Lemmy from Reddit ... and I'm left describing it as 'social hazing' or whatever is gong on with social media.

Lemmy has like 5 different Rust programming communities, but nobody fixing Lemmy. It's surreal in 2023 the Elon Musk X days. I think it's making all of us uncomfortable. The social movement underway.

[–] RoundSparrow@lemmy.ml -1 points 1 year ago (1 children)

Ok, so let's look at recent changes that they have deployed.... https://github.com/LemmyNet/lemmy/issues/3886

One of which makes entire tree of comments disappear. Do you see developers fretting over this and fixing it? Or do you see them ignoring the May 27 PostgreSQL JOIN problem.

How did such a bug go out? Do you see Lemmy developers actually using Lemmy to test things and notice these crashes and problems? Do you look at their posting and comment history? Do they actually go login over at Beehaw and Lemmy.world and see just how terrible the code performance is?

If it isn't hazing, what is it?

It's as if they build a product only for other people to use... and they don't notice any of the constant crashes, incredibly slow performance etc - and they act like nobody in the computer industry ever heard of Memcache or Redis to solve performance problems. If it isn't extreme hazing going on, then what is it?

[–] RoundSparrow@lemmy.ml 2 points 1 year ago

Here, you can dig into what posted days before the pull request you read:

https://github.com/LemmyNet/lemmy/issues/2877#issuecomment-1685314733

 

June 4:

joins are better than in queries with potentially thousands of inserted IDs.

Given that more than 8 JOIN statements is something PostgreSQL specifically concerns itself with (join_collapse_limit). I hand-edit the query with a single IN clause and the performance problem disappears. 8 full seconds becomes less than 200ms against 5,431,043 posts. And that 200ms is still high, as I was extremely over-reaching with "LIMIT 1000" in case the end-user went wild with blocking lists or some other filtering before reaching the final "LIMIT 10". When I change it to "LIMIT 20" in the subquery, it drops almost in half to 115ms... still meeting the needs of the outer "LIMIT 10" by double. More of the core query filtering can be put into the IN subquery, as we aren't dealing with more than 500 length pages (currently limited to 50).

SELECT 
   "post"."id" AS post_id, "post"."name" AS post_title,
   -- "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."thumbnail_url",
   -- "post"."ap_id", "post"."local", "post"."embed_video_url", "post"."language_id", "post"."featured_community", "post"."featured_local",
     "person"."id" AS p_id, "person"."name",
     -- "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated",
     -- "person"."actor_id", "person"."bio", "person"."local", "person"."private_key", "person"."public_key", "person"."last_refreshed_at", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin",
     -- "person"."bot_account", "person"."ban_expires",
     "person"."instance_id" AS p_inst,
   "community"."id" AS c_id, "community"."name" AS community_name,
   -- "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted",
   -- "community"."nsfw", "community"."actor_id", "community"."local", "community"."private_key", "community"."public_key", "community"."last_refreshed_at", "community"."icon", "community"."banner",
   -- "community"."followers_url", "community"."inbox_url", "community"."shared_inbox_url", "community"."hidden", "community"."posting_restricted_to_mods",
   "community"."instance_id" AS c_inst,
   -- "community"."moderators_url", "community"."featured_url",
     ("community_person_ban"."id" IS NOT NULL) AS ban,
   -- "post_aggregates"."id", "post_aggregates"."post_id", "post_aggregates"."comments", "post_aggregates"."score", "post_aggregates"."upvotes", "post_aggregates"."downvotes", "post_aggregates"."published",
   -- "post_aggregates"."newest_comment_time_necro", "post_aggregates"."newest_comment_time", "post_aggregates"."featured_community", "post_aggregates"."featured_local",
   --"post_aggregates"."hot_rank", "post_aggregates"."hot_rank_active", "post_aggregates"."community_id", "post_aggregates"."creator_id", "post_aggregates"."controversy_rank",
   --  "community_follower"."pending",
   ("post_saved"."id" IS NOT NULL) AS save,
   ("post_read"."id" IS NOT NULL) AS read,
   ("person_block"."id" IS NOT NULL) as block,
   "post_like"."score",
   coalesce(("post_aggregates"."comments" - "person_post_aggregates"."read_comments"), "post_aggregates"."comments") AS unread

FROM (
   ((((((((((
   (
	   (
	   "post_aggregates" 
	   INNER JOIN "person" ON ("post_aggregates"."creator_id" = "person"."id")
	   )
   INNER JOIN "community" ON ("post_aggregates"."community_id" = "community"."id")
   )
   LEFT OUTER JOIN "community_person_ban"
       ON (("post_aggregates"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post_aggregates"."creator_id"))
   )
   INNER JOIN "post" ON ("post_aggregates"."post_id" = "post"."id")
   )
   LEFT OUTER JOIN "community_follower" ON (("post_aggregates"."community_id" = "community_follower"."community_id") AND ("community_follower"."person_id" = 3))
   )
   LEFT OUTER JOIN "community_moderator" ON (("post"."community_id" = "community_moderator"."community_id") AND ("community_moderator"."person_id" = 3))
   )
   LEFT OUTER JOIN "post_saved" ON (("post_aggregates"."post_id" = "post_saved"."post_id") AND ("post_saved"."person_id" = 3))
   )
   LEFT OUTER JOIN "post_read" ON (("post_aggregates"."post_id" = "post_read"."post_id") AND ("post_read"."person_id" = 3))
   )
   LEFT OUTER JOIN "person_block" ON (("post_aggregates"."creator_id" = "person_block"."target_id") AND ("person_block"."person_id" = 3))
   )
   LEFT OUTER JOIN "post_like" ON (("post_aggregates"."post_id" = "post_like"."post_id") AND ("post_like"."person_id" = 3))
   )
   LEFT OUTER JOIN "person_post_aggregates" ON (("post_aggregates"."post_id" = "person_post_aggregates"."post_id") AND ("person_post_aggregates"."person_id" = 3))
   )
   LEFT OUTER JOIN "community_block" ON (("post_aggregates"."community_id" = "community_block"."community_id") AND ("community_block"."person_id" = 3)))
   LEFT OUTER JOIN "local_user_language" ON (("post"."language_id" = "local_user_language"."language_id") AND ("local_user_language"."local_user_id" = 3))
   )
WHERE 
  post_aggregates.id IN (
     SELECT id FROM post_aggregates
     WHERE "post_aggregates"."creator_id" = 3
     ORDER BY "post_aggregates"."featured_local" DESC , "post_aggregates"."published" DESC
     LIMIT 1000
  )
  AND
  (((((((
  (
  (("community"."deleted" = false) AND ("post"."deleted" = false))
  AND ("community"."removed" = false))
  AND ("post"."removed" = false)
  )
  AND ("post_aggregates"."creator_id" = 3)
  )
  AND ("post"."nsfw" = false))
  AND ("community"."nsfw" = false)
  )
  AND ("local_user_language"."language_id" IS NOT NULL)
  )
  AND ("community_block"."person_id" IS NULL)
  )
  AND ("person_block"."person_id" IS NULL)
  )
ORDER BY "post_aggregates"."featured_local" DESC , "post_aggregates"."published" DESC
LIMIT 10
OFFSET 0
;

 

If it isn't social hazing, then what is going on here? Why has this issue gone on since May and servers are crashing every day?

[–] RoundSparrow@lemmy.ml 1 points 1 year ago (4 children)

. However, I’m far from an expert,

Funny, because I'm a published author and expert on messaging systems... like Lemmy. Iv'e been building them since 1986 professionally.

There was a massive thread I posted dozens of comments on that came before today's pull request... I suggest you read that too.

Did you notice them even acknowledge server crashes are happening? Do you think developers ever suggest Memcache or Redis? Or discuss how Reddit solved their scaling in 2010 with PostgreSQL?

but perhaps they themselves felt attacked. I know that wasn’t your intention, but misunderstanding happen, especially over text.

I don't have any trouble understanding a bad SQL statement that has 14 JOINs and being told "JOIN is a distraction" after posting tons of examples.

Do we really need to spoon fed the stuff I did post?

Have you never seen social hazing in action? is it possible that I might be on to something going on psychologically besides my autism?

I can't believe anyone thinks a server should be crashing with 1 user on it.

[–] RoundSparrow@lemmy.ml -1 points 1 year ago* (last edited 1 year ago) (7 children)

may I voice my opinion on the exchange? This is coming from a place of trying to help, since I really do appreciate all the work you’ve put in and are putting in, and the fediverse can really use your talents, so I hope I don’t offend you.

Can you explain to me why it isn't social hazing?

it didn’t appear that you were being ignored/hazed

Do you know how to read a SQL statement? I just can't grasp how it isn't social hazing. I've been reading SQL statements for decades, this is obviously a problematic one.

Can you offer alternate explanations of how 3 people could think that SQL statement isn't ... poor performing and gong to cause problems? And how an SQL statement without a WHERE clause took them months to discover and fix?

Extreme hazing is my best answer. I just can't accept that the SQL statements don't speak for themselves along with the server crashes. 57K users for 1300 servers is very... taking several seconds to load 10 posts....

Look at the date... May... this has been going on since May. If it isn't social hazing ... what is it? I keep asking myself that.

[–] RoundSparrow@lemmy.ml 18 points 1 year ago (1 children)

Why is there a lack of gifs/videos on Lemmy?

Lemmy's internal data performance is so horribly slow and crash-causing that I think the last thing they want is even more popular data.

Video is simply the most superior type of media there is, and I think that not having easy access to it on Lemmy is hurting it.

Video is more data, popularity is more data. For whatever reason, at every turn, I've seen developers turn away from scaling options like Memcache, Redis, or just abandoning ORM data management and rewriting the data interfaces by hand....

since the sites on which the videos are hosted can track you.

That's already true for images that are hot linked routinely, so I don't think video really changes it.

I've been baffled since June why data and fixing lemmy's data coding hasn't been front and center. It's pretty wild to witness so many come to Lemmy and then turn away... Elon Musk has been flocking people, Reddit, etc. It's as if the project wants to make code that won't work on any data. It's baffeling.

[–] RoundSparrow@lemmy.ml 1 points 1 year ago* (last edited 1 year ago)

You’re putting too much importance into this matter. If this is distressing you should let it go and think about something else.

The apologists come out of the woodwork around here who can't see an SQL statement for what it is, a charade. Anyone who has worked with SQL knows that this is bloated SQL statement and poorly engineered.

I notice the scientific facts of server crashing and SQL statements you won't discuss, but you sure dish out the social advice for me to "move along" like a Jedi mind trick. Let's talk about the human attraction to truth and honesty since you are so great at handing out life advice to people. What do you know about the works of Marshall McLuhan on media?

Repeating: Its’ as if the mere concept of Redis or Memcache never existed… and that nobody ever heard of JOIN performance problems. If it isn’t extreme social hazing, what is it?

[–] RoundSparrow@lemmy.ml 2 points 1 year ago* (last edited 1 year ago)

ild rants. i’ve seen you do this many times. you need to step back, relax, and not take technical feedback so personally

I've stepped back and watched them ignore the issue since May when all the servers were crashing. Every single Lemmy server was falling over while they ignored the PostgreSQL problems.

The mistakes are obvious and huge. These are not minor topics.

our comments there are exceptionally aggressive. you accuse the developers of “hazing” you

I think they are hazing the entire World Wide Web, Reddit users, etc. How else can you explain such basic SQL problems that they have allowed to go on for so long?

Its' as if the mere concept of Redis or Memcache never existed... and that nobody ever heard of JOIN performance problems. If it isn't extreme social hazing, what is it?

[–] RoundSparrow@lemmy.ml 1 points 1 year ago* (last edited 1 year ago) (2 children)

having a meltdown on github doesn’t help anybody.

I'm glad for you that mental control is so trival and you aren't near death in your life from your brain damage.

Go outside and take a breath

I just got back from dinner ant the months of hazing I've witnessed hasn't gone away. The level of social games being played with PostgreSQL in this project are levels beyond anything I've encountered in my 50+ years alive. And I've first hand seen Bill Gates and his team do all kinds of odd things to groups.

I am at a total loss to explain why such fundamentals of basic relational database are avoided in this project. If it isn't social hazing, what is it?

[–] RoundSparrow@lemmy.ml 4 points 1 year ago* (last edited 1 year ago) (11 children)

Of course, that would be an insane amount of work, especially if it would get ignored, but something to consider!

I already did an insane amount of work to populate a Lemmy database with over 10 million posts. It is so incredibly slow out of the box that the normal API would take days to accomplish this. i had to rewrite the SQL TRIGGER logic to allow bulk inserts.

Here is my work on that:

DROP TRIGGER site_aggregates_post_insert ON public.post;


/*
TRIGGER will be replaced with per-statement INSERT only
*/
CREATE TRIGGER site_aggregates_post_insert
   AFTER INSERT ON public.post
   REFERENCING NEW TABLE AS new_rows
   FOR EACH STATEMENT
   EXECUTE FUNCTION site_aggregates_post_insert();


DROP TRIGGER community_aggregates_post_count ON public.post;


/*
TRIGGER will be replaced with per-statement INSERT only
*/
CREATE TRIGGER community_aggregates_post_count
   AFTER INSERT ON public.post
   REFERENCING NEW TABLE AS new_rows
   FOR EACH STATEMENT
   EXECUTE FUNCTION community_aggregates_post_count();


DROP TRIGGER person_aggregates_post_count ON public.post;


/*
TRIGGER will be replaced with per-statement INSERT only
*/
CREATE TRIGGER person_aggregates_post_count
   AFTER INSERT ON public.post
   REFERENCING NEW TABLE AS new_rows
   FOR EACH STATEMENT
   EXECUTE FUNCTION person_aggregates_post_count();



/*
TRIGGER will be replaced with per-statement INSERT only
no Lemmy-delete or SQL DELETE to be performed during this period.
*/
CREATE OR REPLACE FUNCTION public.site_aggregates_post_insert() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
   UPDATE site_aggregates SET posts = posts +
      (SELECT count(*) FROM new_rows WHERE local = true)
      WHERE site_id = 1
      ;

   RETURN NULL;
END
$$;


CREATE OR REPLACE FUNCTION public.community_aggregates_post_count() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
        UPDATE
            community_aggregates ca
        SET
            posts = posts + p.new_post_count
        FROM (
            SELECT count(*) AS new_post_count, community_id
            FROM new_rows
            GROUP BY community_id
             ) AS p
        WHERE
            ca.community_id = p.community_id;

    RETURN NULL;
END
$$;


/*
TRIGGER will be replaced with per-statement INSERT only
no Lemmy-delete or SQL DELETE to be performed during this period.
*/
CREATE OR REPLACE FUNCTION public.person_aggregates_post_count() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
        UPDATE
            person_aggregates personagg
        SET
            post_count = post_count + p.new_post_count
        FROM (
            SELECT count(*) AS new_post_count, creator_id
            FROM new_rows
            GROUP BY creator_id
             ) AS p
        WHERE
            personagg.person_id = p.creator_id;

    RETURN NULL;
END
$$;


/*
***********************************************************************************************
** comment table
*/


DROP TRIGGER post_aggregates_comment_count ON public.comment;


/*
TRIGGER will be replaced with per-statement INSERT only
*/
CREATE TRIGGER post_aggregates_comment_count
   AFTER INSERT ON public.comment
   REFERENCING NEW TABLE AS new_rows
   FOR EACH STATEMENT
   EXECUTE FUNCTION post_aggregates_comment_count();


-- IMPORTANT NOTE: this logic for INSERT TRIGGER always assumes that the published datestamp is now(), which was a logical assumption with general use of Lemmy prior to federation being added.
CREATE OR REPLACE FUNCTION public.post_aggregates_comment_count() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN

        UPDATE
            -- per statement update 1
            post_aggregates postagg
        SET
            comments = comments + c.new_comment_count
        FROM (
            SELECT count(*) AS new_comment_count, post_id
            FROM new_rows
            GROUP BY post_id
             ) AS c
        WHERE
            postagg.post_id = c.post_id;


        UPDATE
            -- per statement update 2
            post_aggregates postagg
        SET
            newest_comment_time = max_published
        FROM (
            SELECT MAX(published) AS max_published, post_id
            FROM new_rows
            GROUP BY post_id
             ) AS c
        WHERE
            postagg.post_id = c.post_id;

        UPDATE
            -- per statement update 3
            post_aggregates postagg
        SET
            newest_comment_time_necro = max_published
        FROM (
            SELECT MAX(published) AS max_published, post_id, creator_id
            FROM new_rows
            WHERE published > ('now'::timestamp - '2 days'::interval)
            GROUP BY post_id, creator_id
             ) AS c
        WHERE
            postagg.post_id = c.post_id
            AND c.creator_id != postagg.creator_id
            ;

    RETURN NULL;
END
$$;


DROP TRIGGER community_aggregates_comment_count ON public.comment;

CREATE TRIGGER community_aggregates_comment_count
   AFTER INSERT ON public.comment
   REFERENCING NEW TABLE AS new_rows
   FOR EACH STATEMENT
   EXECUTE FUNCTION public.community_aggregates_comment_count();


CREATE OR REPLACE FUNCTION public.community_aggregates_comment_count() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN

        UPDATE
            community_aggregates ca
        SET
            comments = comments + p.new_comment_count
        FROM (
            SELECT count(*) AS new_comment_count, community_id
            FROM new_rows AS nr
            JOIN post AS pp ON nr.post_id = pp.id
            GROUP BY pp.community_id
             ) AS p
        WHERE
            ca.community_id = p.community_id
            ;

    RETURN NULL;

END
$$;


DROP TRIGGER person_aggregates_comment_count ON public.comment;

CREATE TRIGGER person_aggregates_comment_count
   AFTER INSERT ON public.comment
   REFERENCING NEW TABLE AS new_rows
   FOR EACH STATEMENT
   EXECUTE FUNCTION public.person_aggregates_comment_count();


CREATE OR REPLACE FUNCTION public.person_aggregates_comment_count() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN

        UPDATE
            person_aggregates personagg
        SET
            comment_count = comment_count + p.new_comment_count
        FROM (
            SELECT count(*) AS new_comment_count, creator_id
            FROM new_rows
            GROUP BY creator_id
             ) AS p
        WHERE
            personagg.person_id = p.creator_id;

    RETURN NULL;
END
$$;


DROP TRIGGER site_aggregates_comment_insert ON public.comment;

CREATE TRIGGER site_aggregates_comment_insert
   AFTER INSERT ON public.comment
   REFERENCING NEW TABLE AS new_rows
   FOR EACH STATEMENT
   EXECUTE FUNCTION public.site_aggregates_comment_insert();


CREATE OR REPLACE FUNCTION public.site_aggregates_comment_insert() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN

   UPDATE site_aggregates
      SET comments = comments +
         (
            SELECT count(*) FROM new_rows WHERE local = true
         )
      WHERE site_id = 1
      ;

    RETURN NULL;
END
$$;

With this in place, 300,000 posts a minute can be generated and reaching levels of 5 million or 10 million don't take too long.

[–] RoundSparrow@lemmy.ml 3 points 1 year ago* (last edited 1 year ago)

I heard that they gave out a bunch of free .ml domains and those are the ones they aren't allowing for free any more...

[–] RoundSparrow@lemmy.ml 9 points 1 year ago* (last edited 1 year ago) (10 children)

I already feel like I have to keep sticking my neck out to get them to question if using the ORM and a dozen JOIN statements isn't a problem.... but I guess I'll link it: https://github.com/LemmyNet/lemmy/pull/3900

As stated on my Lemmy user profile, I'm "RocketDerp" on GitHiub.

Honestly, the reason I keep making noise is because I'm sick of Lemmy crashing all the time when I come to use it... and I am on many servers that this happens. I really am not trying to piss off the developers, I even said I felt like I am being hazed, and I feel like hazing in general might explain what is going on with how much they are avoiding the elephant in the ROOM that ORM and a dozen JOIN might be the cause! Let alone the lack of Redis or Memcached addition being avoided, that's a second elephant on the second floor tap-dancing.... GitHub Issue 2910 was the straw that broke my back weeks ago, it took months for them to address it when it could be fixed in a couple hours (and it was weeks before the Reddti API deadline at the end of June.... and issue 2910 was neglected). The whole thing was a nightmare for me to watch...

 

cross-posted from: https://campfyre.nickwebster.dev/post/89316

I was getting close to hitting the end of my free object storage so there was time pressure involved haha.

Seems to work but I haven't tested it too much. Currently running on my instance.

 

Lemmy.ml front page has been full of nginx errors, 500, 502, etc. And 404 errors coming from Lemmy.

Every new Lemmy install begins with no votes, comments, postings, users to test against. So the problems related to performance, scaling, error handling, stability under user load can not easily be matched given that we can not download the established content of communities.

Either the developers have an attitude that the logs are of low quality and not useful for identifying problems in the code and design, or the importance of getting these logs in front of the technical community and trying to identify the underlying patterns of faults is being given too low of a priority.

It's also important to make each log of failures identifiable to where in the code this specific timeout, crash, exception, resource limit is encountered. Users and operations personnel reporting generic messages that are non-unique only slow down server operators, programmers, database experts, etc.

There are also a number of problems testing federation given the nature of multiple servers involved and trying not to bring down servers in front of end-users. It's absolutely critical that failures for servers to federate data be taken seriously and attempts to enhance logging activities and triangulate causes of why peer instances have missing data be track down to protocol design issues, code failures, network failures, etc. Major Lemmy sites doing large amounts of data replication are an extremely valuable source of data about errors and performance. Please, for the love of god, share these logs and let us look for the underlying causes in hard to reproduce crashes and failures!

I really hope internal logging and details of the inner workings of the biggest Lemmy instances is shared more openly with more eyes on how to keep scaling the applications as the number of posts, messages, likes and votes continue to grow each and every day. Thank you.

Three recently created communities: !lemmyperformance@lemmy.ml -- !lemmyfederation@lemmy.ml -- !lemmycode@lemmy.ml

 

( I didn't cross-post, as I encourage comments to go all on one posting )

 

cross-posted from: https://popplesburger.hilciferous.nl/post/9969

After setting up my own Lemmy server, I've been intrigued by the server logs. I was surprised to see some search engines already start to crawl my instances despite it having very little content.

I've noticed that most requests seem to come in from IPv4 addresses, despite my server having both an IPv4 and an IPv6 address. This made me wonder.

IPv4 addresses are getting more scarce by the day and large parts of the world have to share an IPv4 address to get access to older websites. This often leads to unintended fallout, such as thousands of people getting blocked by an IP ban from a site admin that doesn't know any better, as well as anti-DDoS providers throwing up annoying CAPTCHA pages because of bad traffic coming from the shared IP address. Furthermore, hosting a Lemmy server of your own is impossible behind a shared IP address, so IPv6 is the only option.

IPv6 is the clear way forward. However, many people haven't configured IPv6 for their hosts. People running their own Lemmy instances behind an IPv6 address won't be able to federate with those servers, and that's a real shame.

Looking into it

So, I whipped up this quick Python script:

import requests
import sys
import socket
from progress.bar import Bar

lemmy_host = sys.argv[1]

site_request = requests.get(f"https://{lemmy_host}/api/v3/site").json()

hosts = site_request['federated_instances']['linked']

ipv4_only = []
ipv6_only = []
both = []
error = []

with Bar('Looking up hosts', max=len(hosts)) as bar:
    for host in hosts:
        host = host.strip()

        try:
            dns = socket.getaddrinfo(host, 443)
        except socket.gaierror:
            error.append(host)

        has_ipv4 = False
        has_ipv6 = False
        for entry in dns:
            (family, _, _, _, _) = entry

            if family == socket.AddressFamily.AF_INET:
                has_ipv4 = True
            elif family == socket.AddressFamily.AF_INET6:
                has_ipv6 = True

        if has_ipv4 and has_ipv6:
            both.append(host)
        elif has_ipv4:
            ipv4_only.append(host)
        elif has_ipv6:
            ipv6_only.append(host)
        else:
            error.append(host)
        
        bar.message = f"Looking up hosts (B:{len(both)} 4:{len(ipv4_only)} 6:{len(ipv6_only)} E:{len(error)})"
        bar.next()

print(f"Found {len(both)} hosts with both protocols, {len(ipv6_only)} hosts with IPv6 only, and {len(ipv4_only)} outdated hosts, failed to look up {len(error)} hosts")

This script fetches the instances a particular Lemmy server federates with (ignoring the blocked hosts) and then looks all of them up through DNS. It shows you the IPv4/IPv6 capabilities of the servers federating with your server.

I've run the script against a few popular servers and the results are in:

Results

Server IPv6 + IPv4 IPv6 only IPv4 Error Total
Lemmy.ml 1340 3 1903 215 3461
Beehaw.org 807 0 1105 74 1986
My server 202 0 312 4 518

A bar chart of the table above

A pie chart of the results for Lemmy.nl

A pie chart for the results for Beehaw.org

A pie chart for the results for my server

It seems that over half (55%+) the servers on the Fediverse aren't reachable over IPv6!

I'm running my own server, what can I do?

Chances are you've already got an IPv6 address on your server. All you need to do is find out what it is (ip address show in Linux), add an AAAA record in your DNS entries, and enable IPv6 in your web server of choice (i.e. listen [::]:443 in Nginx). Those running a firewall may need to allow traffic through IPv6 as well, but many modern firewalls treat whitelist entries the same these days.

Some of you may be running servers on networks that haven't bothered implementing IPv6 yet. There are still ways to get IPv6 working!

Getting IPv6 through Tunnelbroker

If you've got a publicly reachable IPv4 address that can be pinged from outside, you can use Hurricane Electric's Tunnelbroker to get an IPv6 range, free of charge! You get up to five tunnels per account (each tunnel with a full /64 network) and a routed /48 network for larger installations, giving you up to 65k subnets to play with!

There are lots of guides out there, some for PfSense, some for Linux, some for Windows; there's probably one for your OS of choice.

Getting IPv6 behind CGNAT

Getting an IPv6 network through a tunnelbroker service behind CGNAT is (almost) impossible. Many ISPs that employ CGNAT already provide their customers with IPv6 networks, but some of them are particularly cheap, especially consumer ISPs.

It's still possible to get IPv6 into your network through a VPN, but for serving content you'll need a server with IPv6 access. You can get a free cloud server from various cloud providers to get started. An easy way forward may be to host your server in the cloud, but if you've got a powerful server at home, you can just use the free server for its networking capabilities.

Free servers are available from all kinds of providers, such as Amazon(free for a year), Azure(free for a year), Oracle(free without time limit). Alternatively, a dedicated VPS with IPv6 capabilities can be as cheap as $4-5 per month if you shop around.

You can install a VPN server on your cloud instance, like Wireguard, and that will allow you to use the cloud IPv6 address at home. Configure the VPN to assign an IPv6 address and to forward traffic, and you've got yourself an IPv6 capable server already!

There are guides online about how to set up such a system. This gist will give you the short version.

Final notes

It should be noted that this is a simple analysis based on server counts alone. Most people flock to only a few servers, so most Lemmy users should be able to access IPv6 servers. However, in terms of self hosting, these things can matter!

 

I encourage all instance owner/operators to run the query mentioned in the issue and see how many of these 'pending' they have on their server. (FYI, I am RocketDerp on GitHub)

 

At the time of this posting, both sites for over a minute were throwing 500 or 502 errors. I've seen this 'naked nginx 50x error page' many times throughout the last week from almost every Lemmy site I visit.

First off, how do we configure Nginx with a better error message when the site is down that says try again in a few minutes, etc? Someone have some config steps to share?

 

cross-posted from: https://lemmy.ml/post/1216911

Some of you may have noticed that federated actions are slow to synchronize between Lemmy instances. This is most likely because of the setting "Federation worker count" under /admin. It determines how many federation activities can be sent out at once. The default value is 64 which is enough for small or medium sized instances. But for large instances it needs to be increased.

Grep the server logs for "Maximum number of activitypub workers reached" and "Activity queue stats" to confirm that this affects you. For lemmy.ml I just changed the value to 512, you have to experiment what is sufficient. The new value is only applied after restarting Lemmy. In my case changing the value through the website didnt work (maybe because its overloaded). Instead I had to update local_site.federation_worker_count directly in the database.

Edit: I had to increase the value to 160k for lemmy.ml. Now the stats arent getting logged anymore, so Im not sure if the pending queue is still building up or not.

 

I will post comments on individual topics

 

Right now my focus is on testing the app and trying to get some understanding of how the code works.

Enabling federation makes sense because I want to be able to test and study how it interacts with other instances.

If I purge the install and reinstall, how does that impact the identity of the server to other instances?

Obviously the users, communities, comments would all get removed...

Is there some kind of encryption key that identifies an instance? Or is it the domain name alone that does this?

 

I'm still getting familiar with the code. Are these tunable parameters?

As of this moment, the !chatgpt@lemmy.ml Community isn't yet usable over on beehaw https://beehaw.org/c/chatgpt@lemmy.ml - and it's been local here on lemmy.ml for about 1 day.

I also see it takes about 1 day for a update to a user profile to be picked up on beehaw.

I do see if I edit a comment the change gets replicated to from lemmy.ml to behaw in a matter of seconds.

Is that generally what people have experienced?

 

Experience and tips based off these Ubuntu 20.04 instructions:
https://join-lemmy.org/docs/en/administration/from_scratch.html

I thought trying to detail problems and extra steps in public might help some people.

view more: next ›