Does Lemmy really benefit from Rust? Is code execution speed the bottleneck?

Buttons@programming.dev · edit-2 2 years ago

Does Lemmy really benefit from Rust? Is code execution speed the bottleneck?

AggressivelyPassive@feddit.de · 2 years ago

I’m pretty sure the fediverse needs a new kind of node at some point. If we assume, that almost every larger instance is connected to almost every other larger instance directly, then there’s a ton of duplicated and very small messages.

There needs to be some kind of hub in-between to aggregate and route this avalanche. Especially if, like you wrote, every upvote is a message, the overhead (I/O, unmarshalling, etc) is huge.

topbroken@programming.dev · 2 years ago

This is kinda how Usenet worked (well, still does). Rather than n*n federated connections, smaller providers tend to federate with central hubs that form backbones.

I think it makes sense for the fediverse as well.

chris@l.roofo.cc · edit-2 2 years ago

You mean like centralizing the fediverse? Who hosts the hub? Who maintains it? In which country? Who pays for it?

topbroken@programming.dev · edit-2 2 years ago

O(n*n) isn’t really scalable, so you either

a - have a small number of nodes total

b - have a small number of hubs with a larger number of leaf nodes.

Either way, there’s going to be some nodes that become more influential than others.

AggressivelyPassive@feddit.de · 2 years ago

Not a single hub, multiple ones.

Anyone can host a hub, federated instances can negotiate the intersection of hubs they both trust and then send traffic that way. That could mean, a single comment might be sent to, say, five hubs and each hub then forwards to 50 instances or so.

Since the hubs are rather simple, they can scale very easily and via cryptographic ratchets, all instances can make sure, they received the correct messages.

sznio@lemmy.world · 2 years ago

Hmm. Does the federation protocol only send information directly between servers, by that I mean that when something happens on A, does it send it to all other federated servers by itself?

If you could just proxy messages through other servers it would be an improvement. Essentially every instance would also be a hub. If you’re an instance A, connected to B and C, when B send you something you pass it onto C, instead of having C communicate with B directly.

In order to prevent spam you’d need whitelisting for the instances which you will act as a proxy for, and messages will have to be signed. Also, some protocol to discover the topology surrounding your server would be neat for optimizing delivery.

AggressivelyPassive@feddit.de · 2 years ago

Hmm. Does the federation protocol only send information directly between servers, by that I mean that when something happens on A, does it send it to all other federated servers by itself?

As far as I know, yes. There’s probably a filter in the sense that an instance only gets update for relevant events, i.e. you don’t get messages for communities you’re not subscribed to.

If you could just proxy messages through other servers it would be an improvement. Essentially every instance would also be a hub. If you’re an instance A, connected to B and C, when B send you something you pass it onto C, instead of having C communicate with B directly.

That would essentially be the same concept, just wrapped into each instance. But it would a) put massive loads on these instances and b) need some entity/authority to find the optimal spanning tree in the network - and someone would need to define, what “optimal” means in this context.

setsubyou@lemmy.world · 2 years ago

I don’t think you need an optimal spanning tree. Proxying messages is basically just how Usenet works. You peer with a small number of other servers each party forwards messages in groups the other party is interested in.

As someone who used to run a Usenet server (20 years ago), I don’t think it’s a better system. The extra hops add a lot of questions related to moderation, filtering, censorship, trust, responsibility for forwarded content, and so on.

AggressivelyPassive@feddit.de · 2 years ago

That’s why you’d need either a very closely to optimal spanning tree - or just direct intermediates (like a hub). Having messages bounce forever in the network would be far from ideal.

In any case, for everything above the actual message-handling layer, the aggregation should be transparent. That is, for moderation/filtering, etc. it shouldn’t matter, via which route the messages came to your instance.

Trust isn’t that hard either, if you sign messages (I have no idea if that’s already the case). Hubs would be no different from an ISP then.

Maybe I wasn’t clear enough above, but I would propose a very simple hub design. A hub receives messages that contain an envelope and a payload within the envelope, and then simply copy/repackage a bunch of payloads in new envelopes and send these to the connect message consumers. The actual payloads are not touch at all.