Anna's Archive is looking for volunteers to run mirrors
Could anyone broad-stroke the security requirements for something like this? Looks like they'll pay for hosting up to a certain amount, and between that and a pipeline to keep the mirror updated I'd think it wouldn't be tough to get one up and running.
Just looking for theory - what are the logistics behind keeping a mirror like this secure?
Could be worth asking on selfhosted (how do I link a sub on Lemmy?). They probably have more relevant experience with this sort of thing.
Edit: Does this work?
!selfhosted@lemmy.world might work for more people.
Is probably more suitable. I'd be interested in the total size, though.
It does. 😉
They outline it pretty well here:
This is a fascinating read
Also link any ways to donate if they're accepting that.
I had no idea about this project. Is it like a better search engine for libgen etc?
It searches through the Libgens and Z-Library, and has its own mirrors of the files they serve on top of that. I think it was created as a response to Z-Library's domain getting seized, but I could be wrong.
It has way more content than Libgen
For anyone wanting to contribute but on a smaller and more feasible scale, you can help distribute their database using torrents.
https://annas-archive.org/torrents
I know the last time this came up there was a lot of user resistance to the torrent scheme. I'd be willing to seed 200-500 GB, but minimum torrent archive sizes of 1.5 TB and larger really limit the number of people willing to give up that storage. They also defeat a lot of the resiliency of torrents, given how bloody long it takes to get a complete copy. I know that 1.5 TB takes a massive chunk out of my already pretty full NAS, and I passed on seeding the first time for that reason.
It feels like they didn't really subdivide the database as much as they should have...
There are plenty of small torrents. Use the torrent generator and tell the script how much space you have and it will give you the “best” (least seeded) torrents whose sum is the size you give it. It doesn’t have to be big, even a few GB is suitable for some smaller torrents.
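For anyone curious how that kind of selection could work, here is a rough sketch. To be clear, this is not the actual generator script from the site; it is only an illustration of the idea described above (prefer the least-seeded torrents, stay within the space you offer). The `Torrent` class, the `pick_torrents` function, and the sample names, sizes, and seeder counts are all made up for the example, and the greedy approach is an assumption on my part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Torrent:
    # Hypothetical record; the real torrent list may expose different fields.
    name: str
    size_gb: float
    seeders: int


def pick_torrents(torrents, budget_gb):
    """Greedily pick the least-seeded torrents that fit in budget_gb.

    Torrents are visited from fewest seeders to most (ties broken by
    smaller size first). Any torrent that does not fit in the remaining
    space is skipped, so the total never exceeds the budget.
    """
    chosen = []
    remaining = budget_gb
    for t in sorted(torrents, key=lambda t: (t.seeders, t.size_gb)):
        if t.size_gb <= remaining:
            chosen.append(t)
            remaining -= t.size_gb
    return chosen


# Made-up sample data, purely for illustration.
sample = [
    Torrent("a", 1500.0, 2),
    Torrent("b", 40.0, 3),
    Torrent("c", 120.0, 1),
    Torrent("d", 8.0, 12),
    Torrent("e", 300.0, 4),
]

picked = pick_torrents(sample, 200.0)
print([t.name for t in picked], sum(t.size_gb for t in picked))
```

With a 200 GB budget, the 1.5 TB torrent is simply skipped and the space goes to smaller, poorly seeded ones instead. A greedy pass like this will not always fill the budget perfectly, but it is simple and keeps the focus on the torrents that need seeders most.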
Thx.
Do you know how useful it is to host such a torrent? Who is accessing the content via that torrent?
Anyone who wants to. I think a lot of LLM trainers access them.