I like the idea behind Lemmy/kbin and the fediverse, but perhaps there's something I don't understand.

If in the future Lemmy is very popular and someone wants to add their own server and federate with everyone, then from that moment the new instance will get all new comments, posts, etc. from all other instances it's federated with and must save them in its DB. This means that if Lemmy gets popular, forget about the little guys helping spread the “load”, because every instance must still take and save all new data. That's a lot of processing power and storage. How can this work? I expect that in the future only a few instances will survive.

If somehow each instance were a node that only took care of its own posts and comments and forwarded them to others upon request, I could understand how it scales, but this is not how it works AFAIK. Another way would be with consensus algorithms, where a node saves more than its own data but still not all of it.

  • maegul@lemmy.ml
    1 year ago

    Except we’re not CDNs. Instances aren’t run by companies, most of the time, but by volunteers. Not that I have anything against companies running instances, but it’s not what the fediverse is about.

    So the question of resources is a sensible one, whether or not the current protocol and architecture have worked in the past.

    And the threadiverse’s difference in format is precisely the difference I’m highlighting. Sync for microblogs is over the posts of individual users that are followed. Sync for the threadiverse is over a community, which comprises many users’ posts. Communities with threads versus single-user microblogs: this is the format difference, and it’s the difference in what gets synced. Right? Please correct me if I’m wrong.

    And so, if I’m right, the question of how much gets duplicated also differs.

    Whether the threadiverse has more duplication depends on the details, of course. My reasoning was that it would be easier for more duplication to occur on the threadiverse, as whole collections of conversations among many users will be duplicated simply from one user’s single subscription. This is compared to the microblog platforms, where users often only follow hundreds of people (my impression only).
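
    To make the difference in sync unit concrete, here is a toy sketch. It is not how Lemmy or Mastodon actually implement federation; the function names and the sample posts are made up purely for illustration. It only shows that filtering by community pulls in posts from authors nobody on the instance follows directly.

```python
# Toy model of "what gets copied to my instance".
# All names and data here are hypothetical, for illustration only.

def microblog_sync(posts, followed_users):
    """Microblog-style: store only posts written by followed users."""
    return [p for p in posts if p["author"] in followed_users]

def threadiverse_sync(posts, subscribed_communities):
    """Threadiverse-style: store every post in a subscribed community,
    no matter who wrote it."""
    return [p for p in posts if p["community"] in subscribed_communities]

posts = [
    {"author": "alice", "community": "linux"},
    {"author": "bob", "community": "linux"},
    {"author": "carol", "community": "linux"},
    {"author": "alice", "community": "cooking"},
    {"author": "dave", "community": "cooking"},
]

followed = microblog_sync(posts, {"alice"})      # one follow
subscribed = threadiverse_sync(posts, {"linux"})  # one subscription
print(len(followed), len(subscribed))
```

    With this made-up data, one follow brings in 2 posts while one community subscription brings in 3, from three different authors. Whether that gap holds at real scale is exactly the open question.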

    Of course, it may be that any user’s output is distributed over many communities, so that communities turn out not to be larger overall (maybe this was your point). And also, as you say, caching and duplicating is how the internet works, so we should have ways to handle it.

    All in all though, it would be nice to have some basic numerical analysis done, especially if we want people to start instances without worrying about getting burnt by ballooning costs.
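
    As a starting point for that kind of analysis, here is a back-of-the-envelope sketch. Every number in it is an assumption I picked for illustration (items per community per day, average item size), not a measurement from any real instance, and it ignores media, database indexes and other overhead.

```python
# Rough estimate of text-only storage growth for one instance.
# Every input value below is an assumed placeholder, not measured data.

def monthly_storage_gb(subscribed_communities, items_per_community_per_day,
                       avg_item_kb, days=30):
    """Return estimated storage growth in GB per month (decimal units)."""
    total_kb = (subscribed_communities
                * items_per_community_per_day
                * avg_item_kb
                * days)
    return total_kb / 1_000_000  # KB -> GB

# Hypothetical small instance: users subscribe to 50 remote communities.
small = monthly_storage_gb(50, 200, 2)

# Hypothetical large instance: users subscribe to 5000 remote communities.
large = monthly_storage_gb(5000, 200, 2)

print(f"small: {small:.1f} GB/month, large: {large:.1f} GB/month")
```

    The point of the sketch is only that growth scales with the number of subscribed communities, not with the number of local users. Plugging in real activity figures would tell us whether the costs are actually worrying.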