
Would you use a self-hosted, AI-powered search engine for your favorite sites?

For context, I created a video search engine last year, then shut it down and put the data online. You can read about it here: https://www.bendangelo.me/2024/07/16/failed-attempt-at-creating-a-video-search-engine/

I put that project on hold because of scaling issues. Anyway, I'm back with another idea. I've been frustrated with how AI slop is ruining the internet, and recently it's been hitting YouTube pretty hard with AI videos. I’m brainstorming a tool for people to self-host:

  • Self-hosted crawler: Pick which sites/videos to index (blogs, forums, YT channels, etc.).
  • AI chat interface: Ask questions like, “Show me Rust tutorials from 2023” or “Summarize recent posts about homelab backups.”
  • Optional sharing: Pool indexes with trusted friends/communities.
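To make the crawler-plus-index idea above concrete, here is a minimal sketch of an allowlist-only index with plain keyword search and a year filter. Everything in it is an assumption for illustration, not the actual prototype: the host names, the `Page` fields, and the `LocalIndex` class are all hypothetical, and fetching pages is left out entirely.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allowlist: only pages from hosts you picked get indexed.
ALLOWED_HOSTS = {"example-blog.test", "forum.example.test"}


@dataclass
class Page:
    url: str
    title: str
    text: str
    year: int


def is_allowed(url: str) -> bool:
    """Return True only if the URL's host is on the allowlist."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_HOSTS


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class LocalIndex:
    """Tiny in-memory inverted index: token -> set of page URLs."""

    def __init__(self) -> None:
        self.pages = {}
        self.postings = defaultdict(set)

    def add(self, page: Page) -> bool:
        """Index a page if its host is allowed; report whether it was kept."""
        if not is_allowed(page.url):
            return False
        self.pages[page.url] = page
        for token in tokenize(page.title + " " + page.text):
            self.postings[token].add(page.url)
        return True

    def search(self, query: str, year: Optional[int] = None) -> list:
        """Return sorted URLs containing every query token, optionally by year."""
        tokens = tokenize(query)
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        if year is not None:
            hits = {u for u in hits if self.pages[u].year == year}
        return sorted(hits)


index = LocalIndex()
index.add(Page("https://example-blog.test/rust", "Rust tutorial", "Learn ownership", 2023))
index.add(Page("https://spam.test/rust", "Rust tutorial", "Generated filler", 2023))
print(index.search("rust tutorial", year=2023))
```

In this sketch a chat layer would only need to turn a question like “Show me Rust tutorials from 2023” into a call such as `search("rust tutorial", year=2023)`. That translation step is not shown here.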

Why?

  • No Google/YouTube spam, only content you choose.
  • Works offline (archive forums, videos, docs).
  • Local AI (Mistral) or cloud (paid) for smarter searches.

Would this be useful to you? What sites would you crawl? Any killer features I’m missing?

Prototype in progress—just testing interest!

41 comments
  • No.

    AI search offers me nothing that "normal" search doesn't also offer.

    But it uses a thousand times more resources.

    10 years ago people were shocked by the size of Google's server halls. Now imagine the increase in size/numbers through AI.

    Fuck this shit. The internet isn't what's driving the climate catastrophe, it's how people use it.

  • Not really. I could use some good selfhosted search engine. I mean all the existing projects (which is just YaCy, to my knowledge) are a bit dated. Nowadays we only got metasearch engines and we're relying on Google, Bing etc.

    But I don't need any chatbot enhancements. That's usually something I skip when using Google or Bing because it doesn't work well. The AI summaries tend to be wrong, and it's bad at looking up niche information, which is something I need a search engine to be able to find. The AI just cites the most common slop, or at best the Wikipedia article. But I don't really need any fancy software to get there... So for me, we don't need any AI augmentation.

    And I think the old way of googling was fine. Just teach people to put in the words that are likely to be in the article they want to find. That'd be something like "Rust new features 2023" or "homelab backup blog". Sure, you can strap on a chatbot and put in entire natural language questions. But I think that's completely unnecessary. We have brains and we're perfectly able to translate our questions into search queries with little effort... if somebody teaches us what to type into the search bar, and why.

  • If I wanted to self host a search engine, I'd just use a proper one that actually searches content rather than regurgitates bullshit.

    Search engines worked just fine until Google and Microsoft decided that they wanted to sell their AI products.

  • While almost everyone here seems to hate AI (maybe for the wrong reason, but who am I to judge), I like to have AI, as it is able to provide answers a simple search engine cannot.

    What I don't see is hosting something like this myself. Managing the sources and indexing them would take too much energy: mine, my server's, and that of the web servers being indexed (maybe I am wrong).

    There are already good solutions (OpenWebUI with Ollama) that can be tweaked to almost do what you're describing and the AI models get better every month, so I don't think a custom AI search engine could keep up with it.

  • Web scrapers are all that's needed.

    AI is worthless except for the few uses it has combing through medical data.

    AI should never be used to try to influence people.
