- cross-posted to:
- lemmy@lemmy.ml
https://github.com/LemmyNet/lemmy/issues/3245
I posted far more details on the issue than I am putting here-
But, just to bring some math in- with the current full-mesh federation model, assuming 10,000 instances-
That will require nearly 50 million connections.
Each comment. Each vote. Each post, will have to be sent 50 million separate times.
In the proposed hub-spoke model, we can reduce that by over 99%, so that each post/vote/comment/etc. only has to be sent 10,000 times (plus n*(n-1)/2 times, where n = number of hub servers).
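As a rough sanity check on the numbers above, here is a small Python sketch. It only counts links between servers under each topology. The hub count of 10 is an assumption picked for illustration, not a figure from the proposal.

```python
def full_mesh_links(n: int) -> int:
    """Number of pairwise links when every server talks to every other one."""
    return n * (n - 1) // 2


def hub_spoke_links(instances: int, hubs: int) -> int:
    """One link per instance to a hub, plus a full mesh between the hubs."""
    return instances + full_mesh_links(hubs)


instances = 10_000
hubs = 10  # assumed value, for illustration only

mesh = full_mesh_links(instances)
spoke = hub_spoke_links(instances, hubs)
reduction = 1 - spoke / mesh

print(f"full mesh: {mesh:,} links")
print(f"hub-spoke: {spoke:,} links")
print(f"reduction: {reduction:.2%}")
```

With these inputs the full mesh comes to 49,995,000 links and the hub-spoke layout to 10,045, which is where the "over 99%" figure comes from.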
The current full-mesh architecture will not scale. I predict exponential growth will continue.
Let’s work on a solution to this problem together.
Apologies if I came off as hostile.
I mean I get what you’re saying - I just don’t see the practical use. The centralized hub replication servers would have to basically foot a huge bill for the fediverse, and do so silently and invisibly to the end user. As it is, most instances run on goodwill or donations. A silent, invisible server is hard to gather donations for. Who would run them?
Furthermore, the topology you propose is essentially what we already have. A few large instances hold most of the largest communities. I don’t see that changing. This brings a fairly good balance - smaller instances pretty much only have to listen for updates from a few other instances, and only the big instances are doing the hard work of notifying hundreds of others. They are already our “hubs”. Small instances hardly do any hard work; the one I run, for example, just listens to maybe a dozen instances sending updates, and occasionally sends out an update when one of my users interacts.
I suppose I just don’t understand how this could be implemented in practice- or rather how it could be useful to do so. It would strictly enforce a sort of centralization that right now is only a natural consequence of user behavior, while seemingly only bringing theoretical benefits unlikely to be realized.
One consideration: since the hubs basically only have to handle pub/sub, the load might actually be drastically lower than expected.
I suppose that is a valid point. The issue, though, is that those large instances are unable to keep up with demand and load, causing lots of federation issues.
Perhaps, my idea actually wouldn’t help that at all, but, using lemmy.ml as an example-
Instead of it having to send all of its updates out to every server subscribed- it can delegate that to a hub server to do it. The hub server can run a very minimal set of instructions, with enough intelligence to handle sub/pub.
Perhaps- one idea is, instead of thinking of it as a hub server, think of it as a proxy server. You would be able to delegate your instance’s actions to the proxy server to take that load off the main server.
And, instead of the hubs/proxies being more centralized, perhaps it’s just an optional thing which you CAN do.
My line of thinking is about ways to reduce load on the main servers. This might be an idea that only benefits the handful of big servers.
To also further clarify- I DON’T have a solution to the problem. I am only intending to establish a forum to discuss if this is even a viable option, or perhaps, think of other ways to spread around the load.
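To make the delegation idea a bit more concrete, here is a minimal in-memory sketch in Python. Everything in it is hypothetical: the `Hub` class, its method names, and the example hostnames are made up for illustration, and the `sent` list stands in for real network delivery. It is not how Lemmy or ActivityPub actually work.

```python
from collections import defaultdict


class Hub:
    """Hypothetical pub/sub relay: the origin sends once, the hub fans out."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(set)  # community -> instance hostnames
        self.sent = []  # (target, activity) pairs; stand-in for HTTP delivery

    def subscribe(self, community: str, instance: str) -> None:
        self._subscribers[community].add(instance)

    def publish(self, community: str, activity: str, origin: str) -> int:
        """Forward one activity to every subscriber except the origin."""
        targets = sorted(self._subscribers[community] - {origin})
        for target in targets:
            self.sent.append((target, activity))
        return len(targets)


hub = Hub()
for host in ["a.example", "b.example", "lemmy.ml"]:
    hub.subscribe("lemmy", host)

# The origin instance makes a single call; the hub does the fan-out.
fanned_out = hub.publish("lemmy", "new comment", origin="lemmy.ml")
print(f"origin sent 1 message, hub forwarded {fanned_out}")
```

The point of the sketch is only that the origin's outbound work drops to one send per activity, while the per-subscriber work moves to the hub rather than disappearing.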
I am not certain about the scenarios you mentioned above, but I do agree that separating the software into an instance plus a hub/proxy/message queue could help with handling load.
How can we scale our big instances? I don’t know, maybe it is easy to put an instance on multiple servers, but it sounds to me like they are just buying a bigger one, and that will fill up fast if growth continues to happen.
I would like to hear from developers what they think, but thank you for starting the conversation about scaling.
I am probably missing something / being really oblivious (it’s been a long day…) but wouldn’t this same problem occur on the hub server in your model?
Although thinking about it a bit more, I thought I recalled seeing one of the Lemmy devs mention that the biggest issue is the SQL queries that are run for various actions (such as loading the front page) - if that is the case, I don’t know if this idea would help with that.
The idea of a centralized hub server(s) also sounds like we’d be moving closer to the model of a centralized Reddit… But I guess in a way, the fact that larger instances exist in and of itself poses the same issue?
… I’m probably just rambling to myself at this point, however, I do think a message queue type of system for federating events would be a good idea, for the sake of recovering from send failures.
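For what it’s worth, the message-queue idea for recovering from send failures could look roughly like the Python sketch below. This is an assumption-heavy illustration: `RetryQueue`, `Delivery`, and the `send` callback are hypothetical names, the queue is in memory only, and a real system would also need persistence and backoff between attempts.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Delivery:
    target: str
    activity: str
    attempts: int = 0


class RetryQueue:
    """Hypothetical outbound queue that re-queues failed federation sends."""

    def __init__(self, send, max_attempts: int = 3) -> None:
        self._send = send  # callable(target, activity); raises ConnectionError on failure
        self._max_attempts = max_attempts
        self._pending = deque()
        self.dead = []  # deliveries that ran out of attempts

    def enqueue(self, target: str, activity: str) -> None:
        self._pending.append(Delivery(target, activity))

    def pending(self) -> int:
        return len(self._pending)

    def drain_once(self) -> int:
        """Try each queued delivery once; return how many succeeded."""
        delivered = 0
        for _ in range(len(self._pending)):
            item = self._pending.popleft()
            item.attempts += 1
            try:
                self._send(item.target, item.activity)
                delivered += 1
            except ConnectionError:
                if item.attempts < self._max_attempts:
                    self._pending.append(item)  # keep it for the next pass
                else:
                    self.dead.append(item)
        return delivered
```

A failed send stays in the queue and is retried on the next pass instead of being lost, and anything that keeps failing ends up in `dead` where an admin could inspect it.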