The big thing for #2 would be to separate out what you actually need vs what people keep recommending.
General guidance is useful, but there's a lot of 'You need ZFS!' and 'You should use K8s!' and 'Use X software!'
My life got immensely easier when I figured out I did not need any features ZFS brought to the table, and I did not need any of the features K8s brought to the table, and that less is absolutely more. I ended up doing MergerFS with a proper offsite backup method because, well, it's shockingly low-complexity.
And I ended up doing Docker with a bunch of compose files and bind mounts, because it's shockingly low-complexity. And it's just running on Debian, instead of some OS that has a couple of layers of additional software to make things "easier" because, again, it's low-complexity.
I can re-deploy the entire stack on new hardware in about ~10 minutes (I've tested this a few times just to make sure my backup scripts work), and there's basically zero vendor tie-in or dependencies that you'd have to get working first since it's just a pile of tarballs and packages from the distro's package manager on, well, ANY distro.
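For anyone wondering what a "pile of tarballs" backup can look like, here is a minimal sketch in Python. It is not the script described above, just one way to do it under some assumptions: the function name `archive_dirs` and the example paths are made up, and it writes one gzip tarball per bind-mount directory.

```python
import tarfile
import time
from pathlib import Path


def archive_dirs(sources, dest_dir, stamp=None):
    """Write one gzip-compressed tarball per source directory; return the paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    written = []
    for src in map(Path, sources):
        if not src.is_dir():
            raise FileNotFoundError(f"not a directory: {src}")
        out = dest / f"{src.name}-{stamp}.tar.gz"
        with tarfile.open(out, "w:gz") as tar:
            # arcname keeps paths relative, so the archive unpacks anywhere.
            tar.add(src, arcname=src.name)
        written.append(out)
    return written
```

Something like `archive_dirs(["/srv/appdata/jellyfin"], "/mnt/backup")` (hypothetical paths) would leave a `jellyfin-<timestamp>.tar.gz` in the destination. Stop the containers first if the app keeps a database in that directory, otherwise the copy may be inconsistent.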
you do not need anything to be "high availability"; that just adds a ton of complexity for no benefit. Nobody will die or go broke if your homelab is down for a few days.
tailscale is awesome
docker-compose is awesome
irreplaceable data gets one offsite backup, one local backup, and ideally one normally offline backup (in case you get ransomwared)
yubikeys are cool and surprisingly easy to use
don’t offer your services to other people until you are sure you can support it, your backups are squared away, and you are happy with how things are set up.
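The backup rule in the list above (one offsite, one local, ideally one normally offline) is easy to write down as a check. This is only a sketch: the kind labels and the `check_plan` name are my own, and it trusts whatever labels you give it.

```python
REQUIRED = ("offsite", "local")   # the two copies the rule insists on
RECOMMENDED = ("offline",)        # the "ideally" copy, for ransomware


def check_plan(copies):
    """copies maps a backup's name to its kind: 'offsite', 'local' or 'offline'.

    Returns the kinds that no backup in the plan covers.
    """
    kinds = set(copies.values())
    return {
        "missing": [k for k in REQUIRED if k not in kinds],
        "recommended": [k for k in RECOMMENDED if k not in kinds],
    }


# Hypothetical plan: a NAS at home plus a cloud bucket, no offline disk yet.
print(check_plan({"nas": "local", "cloud-bucket": "offsite"}))
```

An empty "missing" list means the must-haves are covered; anything under "recommended" is the nice-to-have you have not set up yet.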
It is much easier to buy one "hefty" physical machine and run Proxmox with virtual machines for servers than it is to run multiple Raspberry Pis. After living that life for years, I'm a Proxmox shill now. Backups are important (read the other comments), and Proxmox makes backup/restore easy. Because eventually you will fuck a server up beyond repair, you will lose data, and you will feel terrible about it. Learn from my mistakes.
My reason for self hosting is being in control of my shit, and not the cloud provider.
I run jellyfin, soulseek, freshRSS, audiobookshelf and nextcloud. All of that on a Pi 4 with an SSD attached, accessible via WireGuard. That SSD is also accessible as an NFS share.
As I already knew Linux very well before I started my own cloud, I didn't really have to learn much.
The biggest resource I could recommend is that GitHub repository where a huge amount of awesomely selfhosted solutions are linked.
I'll parrot the top reply from Reddit on that one: to me, self hosting starts as a learning journey. There's no right or wrong way, if anything I intentionally do whacky weird things to test the limits of my knowledge. The mistakes and troubles are when you learn. You don't really understand the significance of good backups until you had to restore from them.
Even in production, it differs wildly. I have customers whom I set up a bare metal Ubuntu in some datacenter for cheap, they've been running on that setup for 10 years. Small mom and pop shop, they will never need a whole cluster of machines. Then at my day job we're looking at things like Kubernetes and very heavyweight stacks because we handle a lot of traffic.
Some people self-host a PiHole on a Raspberry Pi and that's all they need. Some people have entire NAS setups with smart TVs accessing their Plex/Jellyfin servers for the whole extended family. I host my own email, which is a pain in the ass to get working reliably while keeping your IP reputation clean.
I guess the only thing you should know is, you need some time to commit to maintaining your stuff if you don't want it to break or get breached (if exposed to the Internet), and a willingness to learn because self hosting isn't a turnkey experience. It can be a turnkey installation, but when your SD card or drives fail you're still on your own to troubleshoot and fix it. You don't set up a Nextcloud server to replace Google Drive with the expectation that you shove the server in a closet forever. Owning your infrastructure and data comes at a small but very important upkeep time investment.
Cheap storage that I can use both locally and as a private cloud. Very convenient for piracy, er, storing all my legally obtained files.
Network wide adblocking. Massive for mobile games/apps.
Private VPN. Really useful for using public networks and bypassing network restrictions.
Gives me an excuse to buy really cool, old server and networking hardware.
As for things I wish I knew... Don't use Windows for servers. Just don't.
SMB sucks, try NFS.
Use docker, managing 5 or 10 different apps without containers is a nightmare.
Bold of you to assume I'm a computer scientist or engineer or that I have a degree lmao. I just hate ads, subscriptions and network restrictions, so I learned how to avoid those things. As for resources to get started... Look up TrueNAS SCALE. It basically does all of the work for you.
Learning. If you ever found yourself tired of learning new things, your life is basically done.
Cost. You already have an internet connection at home. It's practically a necessity these days. The connection is likely fast enough for most things. Renting even the most piddly of VPS is wildly expensive. Just throw a spare machine at it and go wild.
Freedom. Your own data is constantly being collected, regurgitated, and sold back to you. More people need to care about this incessant invasion of our lives.
Backups. 3 copies, on different forms of storage, in multiple PHYSICALLY distinct locations. Just when you have that teeny little imp in the back of your mind say "hmm, I should probably back up soon" -- stop everything you're doing and run a backup.
Test your recovery! Backups are only good if you can recover from them. Many have lost data because they failed to ever fail-test their backups.
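A restore test does not have to be fancy. Below is a rough Python sketch of one: unpack the archive into a scratch directory and compare SHA-256 hashes against the live data. The names `file_hashes` and `verify_restore` are invented for this example, and it assumes the tarball holds a single top-level directory named after the original.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_hashes(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(archive, original_dir):
    """Unpack archive into a scratch directory and diff it against original_dir.

    Returns a sorted list of relative paths that are missing or differ.
    """
    original_dir = Path(original_dir)
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            # Only do this with archives you created yourself.
            tar.extractall(scratch)
        restored = file_hashes(Path(scratch) / original_dir.name)
    expected = file_hashes(original_dir)
    paths = set(expected) | set(restored)
    return sorted(p for p in paths if expected.get(p) != restored.get(p))
```

An empty list means the restore matched. Files that changed since the backup ran will show up too, so this works best right after a backup or against a directory that rarely changes.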
Google. Legitimately the best skill you can ever attain is simply being able to search effectively and be able to learn jargon quickly. Once you have the lingo down, searches become clearer, quicker, more precise.
I've learned a number of tools I'd never used before, and refreshed my skills from when I used to be a sysadmin back in college. I can also do things other people don't loudly recommend, but fit my style (Proxmox + Puppet for VMs), which is nice. If you have the right skills, it's arbitrarily flexible.
What electricity costs in my area. $0.32/kWh at the wrong time of day. Pricier hardware could have saved me money in the long run. Bigger drives could also mean fewer drives, and thus less power consumption.
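To put a number on that, here is the arithmetic as a small Python sketch. The wattages are made-up examples, not measurements, and billing the whole day at the $0.32/kWh peak rate gives an upper bound rather than a real bill.

```python
def annual_cost(watts, price_per_kwh, hours_per_day=24.0):
    """Yearly electricity cost of a constant load, in the currency of price_per_kwh."""
    kwh_per_year = watts / 1000 * hours_per_day * 365
    return kwh_per_year * price_per_kwh


# Hypothetical draws: an older 120 W server versus a 35 W mini PC,
# both billed at the $0.32/kWh peak rate quoted above.
old, new = annual_cost(120, 0.32), annual_cost(35, 0.32)
print(f"120 W: ${old:.2f}/yr, 35 W: ${new:.2f}/yr, difference ${old - new:.2f}/yr")
```

With those assumed numbers the gap is roughly $238 a year, which is the kind of figure that can pay for more efficient hardware over time.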
Google, selfhosting communities like this one, and tutorial-oriented YouTubers like NetworkChuck. Get ideas from people, learn enough to make it happen, then tweak it so you understand it. Repeat, and you'll eventually know a lot.
data stays local for the most part. Every file you send to the cloud effectively becomes property of the cloud. Yeah, you get access, but so do the hosting provider, their third parties, and any government with a compliance request. Hard drives are cheap and fast enough.
not quite answering this right, but I very much enjoy learning and evolving. But technology changes and sometimes implementing new software like caddy/traefik on existing setups is a PITA! I suppose if I went back in time, I would tell myself to do it the hard way and save a headache later. I wouldn't have listened to me though.
Portainer is so nice, but has quirks. It's no replacement for the command line, but wow, does it save time. The console is nerdy, but when time is on the line, find a good GUI.
less is more, it's fine to sunset stuff you don't use enough to afford them using cpu cycles, memory and power
search warrants are a real thing and you should not trust others to use your infrastructure responsibly because you will be the one paying for it if they don't.
Podman quadlets have been a blessing. They basically let you manage containers as if they were simple services. You just plop a container unit file in /etc/containers/systemd/, daemon-reload and presto, you've got a service that other containers or services can depend on.
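For the curious, a container unit file is just an INI-style text file with a `[Container]` section. Here is a minimal Python sketch that renders one. The helper name `render_quadlet`, the image, the port and the volume path are all example values I picked, and it only handles one `PublishPort` and one `Volume` line.

```python
import configparser
import io


def render_quadlet(image, description, publish_port=None, volume=None):
    """Render the text of a Quadlet .container unit file."""
    unit = configparser.ConfigParser(interpolation=None)
    unit.optionxform = str  # systemd keys are case-sensitive; keep them as written
    unit["Unit"] = {"Description": description}
    container = {"Image": image}
    if publish_port:
        container["PublishPort"] = publish_port
    if volume:
        container["Volume"] = volume
    unit["Container"] = container
    unit["Install"] = {"WantedBy": "multi-user.target"}
    buf = io.StringIO()
    unit.write(buf, space_around_delimiters=False)
    return buf.getvalue()


print(render_quadlet(
    image="docker.io/jellyfin/jellyfin:latest",
    description="Jellyfin media server (example)",
    publish_port="8096:8096",
    volume="/srv/jellyfin/config:/config",
))
```

Saved as something like `jellyfin.container` in /etc/containers/systemd/, the daemon-reload step mentioned above should then produce a `jellyfin.service` you can start and depend on like any other unit.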
Our internet goes out periodically, so having everything local is really nice. I set up DNS on my router, so my TLS certs work fine without hitting the internet.
I wish someone would've taught me how to rip Blu-rays. It wasn't a big deal, but everything online made flashing firmware onto a Blu-ray drive sound super sketchy.
I'm honestly not sure. I'm in CS and am really into Linux, so I honestly don't know what would be helpful. I guess start small and get one thing working at a time. There's a ton of resources online for all kinds of skill levels, and as long as you do one thing at a time, you should eventually see success.
don't offer a service to a friend without really knowing how to keep it up, and having the experience to do so, when needed.
don't make it your life. The services are there to help you, not to be your life.
use docker. Podman is not yet ready for mainstream, in my experience. When the services move to podman officially it's time to move. Just because jellyfin offers official documentation for it, doesn't mean it'll work with podman (my experience)
just test all services with the base docker install. If something isn't working, there may be a bug or two. Report if it is a bug. Hunt a bug down if you can. maybe it's just something that isn't documented (well enough) for a beginner.
start on your own machine before getting a server. A pi is enough for lightweight stuff but probably not for a fast and smooth experience with e.g. nextcloud.
backup.
search for help. If it isn't available in a forum, ask for help. Don't waste many, many hours if something isn't working. But research it first and read the documentation.
For #2 and #3, it’s probably exceedingly obvious, but I wish I had truly understood ssh, remote VS Code, and enough git to put my configs on a git server.
So much easier to manage things now that I’m not trying to edit docker compose files with nano and hoping and praying I find the issue when I mess something up.
Things like changes to TOS or services can be seriously mitigated by hosting it yourself. What happens if Spotify changes the music they host or inserts ads into everything? Well, for me, nothing. On the flip side, if some of my stuff goes down, the kids and wife will bark. But honestly it's mostly set it and forget it.
KISS applies to many things in life. Anything "smart" in your home should ideally keep functioning when the "smart" features stop working, i.e. light switches should still be dumb light switches if something breaks. Also, don't get caught up in using rack or enterprise gear. You can learn just as much using smaller, fatter desktops with bigger fans and air cooling as with power-hungry rack servers with 80mm fans that blow your eardrums out. My entire lab runs on old Dell workstations and Raspberry Pis.
Regarding your third point, you might find it helpful to search for beginners' guides whenever starting a new project. One thing that people don't seem to tell new users about is the struggles they faced when getting started themselves. Countless thousands of hours could be spent on this before someone decides to get started, while others pick it up in a much shorter timeframe. It just depends on you and what you are looking to get out of it.
It's much more difficult than many people realize. If you need a space to test things out, I'd recommend installing VirtualBox with a couple of VMs to host whatever services you decide on. You can take a snapshot of the VM at any point in time, so when things go bad, you can simply restore whichever snapshot you like.
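If clicking through the GUI gets old, the snapshot step can be scripted with the `VBoxManage snapshot` command. This is a rough Python sketch: the VM and snapshot names are placeholders, and it only prints the command unless you turn the dry run off.

```python
import subprocess


def snapshot_cmd(vm, action, name):
    """Build the VBoxManage argv for taking or restoring a named snapshot."""
    if action not in ("take", "restore"):
        raise ValueError(f"unsupported action: {action}")
    return ["VBoxManage", "snapshot", vm, action, name]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# Hypothetical VM and snapshot names.
run(snapshot_cmd("lab-vm", "take", "before-upgrade"))
```

As far as I know, restoring a snapshot needs the VM to be powered off first, so shut it down before running the restore variant.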