My home lab has a mild amount of complexity and I'd like to practice some good habits around documenting it: what each system does, the OS, any notable software installed and, most importantly, any documentation around configuration or troubleshooting.
For example, I have an internal SMTP relay that uses a Let's Encrypt SSL cert that I need to renew using the DNS challenge. I've got the steps for that sitting in a Google Doc, and I've got a couple more Google Docs like it.
I don't want to get super complicated but I'd like something a bit more structured than a folder full of google docs. I'd also like to pull it in-house.
Thanks
Edit: I appreciate all the feedback I've gotten on this post so far. There have been a lot of tools suggested and some great discussion about methods. This will probably be my weekend now.
Ansible, self-documenting. My playbook.yml has a list of roles attached to each host, and each host's host_vars file has details on service configuration (domains, etc.). It looks like this: https://pastebin.com/6b2Lb0Mg
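To give a rough idea of what a host_vars file can hold, here is a minimal sketch. The host name and every variable name in it are made up for illustration; they are not taken from the pastebin above.

```yaml
# host_vars/web1.example.org.yml  (hypothetical host name)
# Every variable below is an invented example of per-host service
# configuration; real roles define their own variable names.
nextcloud_fqdn: "cloud.example.org"          # domain the service answers on
nextcloud_admin_email: "admin@example.org"
mail_relay_host: "smtp.example.org"          # internal SMTP relay to send through
mail_relay_port: 587
```

Since the file lives next to the playbook in git, reading it tells you how the host is configured without logging into it.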
Additionally this role generates a markdown summary of the whole setup and inserts it into my infra's README.md.
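For anyone wondering how a generated summary can end up inside a README, one possible approach is sketched below. This is an assumption, not necessarily how the role above does it: the role name and the `summary.md.j2` template are hypothetical, and the template itself (which would loop over hosts and their variables) is not shown.

```yaml
# roles/readme-summary/tasks/main.yml  (hypothetical role name)
- name: insert generated summary into README.md
  ansible.builtin.blockinfile:
    path: "{{ playbook_dir }}/README.md"
    # HTML comments keep the markers invisible in rendered markdown
    marker: "<!-- {mark} ANSIBLE MANAGED BLOCK -->"
    # summary.md.j2 is a hypothetical Jinja2 template in the role's templates/ dir
    block: "{{ lookup('ansible.builtin.template', 'summary.md.j2') }}"
  delegate_to: localhost   # edit the README in the local checkout, not on the server
  run_once: true           # one README, so run once rather than once per host
```

Everything outside the two markers is left alone, so hand-written sections of the README survive each run.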
Manually generated diagrams, odd manual maintenance procedures and other semi-related stuff get their own sections in the README (you can check the template here) or linked markdown files. Ongoing problems/research goes into the infra gitea project's issues.
I'm only just starting to dip my toes in Docker. Most of my stuff is KVM VMs. I have a decent set of Ansible roles to set up a new VM when I spin it up, but I'm not to the point where the specifics of every system are in Ansible yet.
You can perfectly well deploy Docker stacks using Ansible. This is what I used to do for Rocket.Chat: [1][2] (I ditched it for Matrix/Element without Docker, but the concept stays valid).
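As a rough illustration of the idea (not necessarily what the linked roles do), a role can copy a compose file to the host and bring it up. This sketch assumes the community.docker collection is installed on the control machine and Docker with the Compose v2 plugin is present on the target; the role name and the /opt/chat-stack path are hypothetical.

```yaml
# roles/chat-stack/tasks/main.yml  (hypothetical role name and paths)
- name: create the project directory
  ansible.builtin.file:
    path: /opt/chat-stack
    state: directory
    mode: "0755"

- name: copy the compose file from the role's files/ directory
  ansible.builtin.copy:
    src: docker-compose.yml
    dest: /opt/chat-stack/docker-compose.yml
    mode: "0644"

- name: bring the stack up
  community.docker.docker_compose_v2:
    project_src: /opt/chat-stack   # directory containing docker-compose.yml
    state: present                 # create and start the services if needed
```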
> I'm not to the point where the specifics of every system is in Ansible yet.
What I suggest is writing a playbook that lists the roles attached to your servers, even if the roles actually do nothing:
# roles/application-x/tasks/main.yml
- name: setup application-x
  debug:
    msg: "TODO This will one day deploy application-x. For now the setup is entirely manual and documented in roles/application-x/README.md"

# roles/service-y/tasks/main.yml
- name: setup service-y
  debug:
    msg: "TODO This will one day deploy service-y. For now the setup is entirely manual and documented in roles/service-y/README.md"
#...
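The playbook that ties those placeholder roles to hosts could then look something like this. The host names are hypothetical and would need to match entries in your inventory.

```yaml
# playbook.yml
# Host names are hypothetical; application-x and service-y are the
# placeholder roles defined above.
- hosts: vm1.example.org
  roles:
    - application-x

- hosts: vm2.example.org
  roles:
    - service-y
```

Running `ansible-playbook playbook.yml --list-tasks` prints the tasks per host without executing anything, which already works as a quick "what runs where" overview.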
This is a good start for a config management/automated deployment system. At least you will have an inventory of hosts and what's running on them. Work your way up from there, progressively converting your manual install/configuration steps into automated procedures over time. There are a few steps that even I didn't automate (like configuring LDAP authentication for Nextcloud), but they are documented in the relevant role's README.md [3].