Hello everyone, we're long overdue for an update on how things have been going!
Finances
Since we started accepting donations back in July we've received a total of $1350, as well as $1707 in older donations from smorks. We haven't had any expenses other than OVH (approx $155/mo) since then, leaving us $2152 in the bank.
We still owe TruckBC $1980 for the period he was covering hosting, and I've contributed $525 as well (mostly non-profit registration related stuff, plus domain renewals). We haven't yet discussed reimbursing either of us, we're both happy to build up a contingency fund for a while.
New Server
A few weeks ago, we experienced a ~26-hour outage due to a failed power supply and extremely slow response times from OVH support. This was followed by an unexplained outage the next morning at the same time. To ensure Lemmy’s growth remains sustainable for the long term and to support other federated applications, I’ve donated a new physical server. This will give us a significant boost in resources while keeping the monthly cost increase minimal.
Our system specs today:

- Undoubtedly the cheapest hardware OVH could buy
- Intel Xeon E-2386G (6 cores @ 3.5 GHz)
- 32 GB of RAM
- 2× 512 GB Samsung NVMe in RAID 1
- 1 Gbps network
- $155/month
The new system:

- Dell R7525
- AMD EPYC 7763 (64 cores @ 2.45 GHz)
- 1 TB of RAM
- 3× 120 GB SATA SSDs (HW RAID 1 with a hot spare, for Proxmox)
This means that instead of renting an entire server and having OVH be responsible for the hardware, we'll be renting co-location space at a Vancouver datacenter via a 3rd-party service provider I know.
These servers are extremely reliable, but if there is a failure, either Otter or I will be able to get access reasonably quickly. We also have full out-of-band (OOB) access via iDRAC, so it's pretty unlikely we'll ever need to go on site.
Server Migration
Phase 1 is currently planned for Jan 29th or 30th and will completely move us out of OVH and onto our own hardware. I'm expecting a 2-3 hour outage, followed by a 6-8 hour window where some images may be missing as the object store resyncs. I'll make another follow-up post in a week with specifics.
Phases 2+ I'm not 100% decided on yet and haven't planned a timeline for. They would get us into a fully redundant (excluding hardware) setup that's easier to scale and manage down the road, but they do add a little bit of complexity.
I'm seeing a net debt of about $400 hanging over you guys after all is said and done. I don't think anyone is clamoring for that money to be paid back, but I'd be happier if you guys didn't have to worry about it. I'll send some money for that reason, as well as all the good work you all do in supporting and maintaining this. Thank you.
Probably out of context, but do you have any plans of adding other networks under Fedecan? Like mstdn.ca?
We're open to it, and it has a number of benefits, but we haven't formally discussed with their team what that might look like.
And are there any plans for other services like Pixelfed, Friendica, or Peertube?
Yes, we definitely want to spin up more things once we are settled. Pixelfed is near the front of that list, as well as Friendica.
We haven't said no to any of them, but for example there isn't as much of a need for us to spin up Mastodon since mstdn.ca already exists. A lot of us have accounts there too.
I've actually been investigating Postgres cluster configurations the past 2 weeks at work (though we're considering CloudNativePG+Kubernetes on 3 nodes spanning two physical locations).
One thing I might recommend is investigating a proxy layer like PgBouncer in front of the databases. Note that PgBouncer itself only pools connections; to route write queries to the primary while spreading read queries across the replicas, you'd pair it with (or swap in) a routing-aware proxy such as pgpool-II.
A pooling layer should also better handle the recycling of short-lived and orphaned connections, which may become more of a concern in your stage 3, and especially in stage 4.
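To make the pooling side concrete, here's a minimal `pgbouncer.ini` sketch. The hostnames, ports, and pool sizes are assumptions for illustration, not Fedecan's actual config; the two logical databases show one common pattern where the application itself chooses the read-only entry:

```ini
; Hypothetical sketch -- hosts and sizes are placeholders.
; PgBouncer pools and recycles connections; it does not inspect
; queries, so read/write split here is the application's choice.
[databases]
lemmy_rw = host=primary.internal port=5432 dbname=lemmy
lemmy_ro = host=replica.internal port=5432 dbname=lemmy

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction        ; server conn returned after each transaction
max_client_conn = 500          ; cap on client-side connections
default_pool_size = 20         ; server connections per database/user pair
server_idle_timeout = 60       ; drop idle server connections after 60s
```

Transaction pooling is what mops up short-lived and orphaned connections: a client that disconnects mid-session doesn't pin a Postgres backend.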
When purchased brand new by a now-dying tech company, it was about $20-25k. I put dibs on it as part of my commission for managing the shutdown and sale of their datacenters; this was just one of many such servers they owned.
I'm always so glad that this is the instance I chose to join in the Great Migration, and equally glad that I've been welcome here. Keep up the awesome work and thank you for keeping the communication so open.
Compute, no, but memory, yes. Lemmy is actually pretty lean and efficient, but 32 GB is a bit tight for a few instances of it plus Postgres. We run multiple instances to reduce the impact when one stutters (not uncommon).
Upgrading to 64 GB probably would have let us scale on the existing box for the next year, but I had this totally overkill hardware, so might as well use it!
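For anyone curious what "multiple instances" looks like in practice, here's a minimal sketch of several lemmy_server processes behind one nginx upstream. The ports, names, and failover thresholds are assumptions for illustration, not our actual config:

```nginx
# Hypothetical sketch: three Lemmy backends; if one stutters or dies,
# nginx marks it failed and retries the request on another.
upstream lemmy_backend {
    server 127.0.0.1:8536 max_fails=2 fail_timeout=10s;
    server 127.0.0.1:8537 max_fails=2 fail_timeout=10s;
    server 127.0.0.1:8538 max_fails=2 fail_timeout=10s;
}

server {
    listen 443 ssl;
    server_name lemmy.example;

    location / {
        proxy_pass http://lemmy_backend;
        # Retry the next backend on connection errors or a bad gateway.
        proxy_next_upstream error timeout http_502;
    }
}
```

With this shape, one misbehaving process costs a few retried requests rather than a full outage.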
Yes, we can effectively run active/active. That's basically stage 3, but it's all on one box to keep costs down: we get software-failure redundancy but not hardware redundancy.
Metadata is in the db, files are in object storage. This is pictrs's design and the correct way to build it.
We're at 1 TB of images today; there's no way I'd want to deal with scaling Postgres to multiple TB. Object storage is cheap, scalable, and easy to distribute and manage across multiple providers.
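A toy sketch of that split, for anyone unfamiliar with the pattern: bulk bytes go to an object store while only a small metadata row goes in the database. The dict stands in for an S3-style store and sqlite3 stands in for Postgres; the schema and names are illustrative, not pictrs's actual design.

```python
import hashlib
import sqlite3

# Stand-ins: a dict for the object store, in-memory sqlite for the DB.
object_store = {}
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE images (key TEXT PRIMARY KEY, filename TEXT, size INTEGER)")

def upload(filename: str, data: bytes) -> str:
    # Content-address the blob so identical uploads collapse to one object.
    key = hashlib.sha256(data).hexdigest()
    object_store[key] = data                      # bulk bytes -> object storage
    db.execute("INSERT OR IGNORE INTO images VALUES (?, ?, ?)",
               (key, filename, len(data)))        # small row -> database
    return key

def fetch(key: str) -> bytes:
    # Look up metadata first, then pull the bytes from the object store.
    row = db.execute("SELECT key FROM images WHERE key = ?", (key,)).fetchone()
    if row is None:
        raise KeyError(key)
    return object_store[key]

k = upload("cat.jpg", b"\xff\xd8fake-jpeg-bytes")
```

The database stays tiny no matter how many terabytes of images pile up, which is exactly why the storage layers scale independently.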