Building a Local Perplexity Alternative with Perplexica, Ollama, and SearXNG (jointerminus.medium.com)
134 points by flybird on Aug 1, 2024 | hide | past | favorite | 49 comments


Am I understanding perplexity right? You put your query into a system that googles it, then pipes the results through an AI to give you an answer? Doesn't google already do that now with their AI summary at the top of the search results?


You ask Perplexity a question and it transforms it into a set of web queries likely to return relevant results, then tries to synthesize an answer based on those results. It can also do multi-step searches, where it breaks the question down into queries for its different parts, plus searches that relate those parts to each other. By the end of the process it may have done a dozen different web searches. You can also ask follow-up questions that retain the previous dialog as context; I am not sure that you can do that with Google.
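To make the loop above concrete, here is a rough sketch of that decompose/search/synthesize pattern. Everything in it is hypothetical, not Perplexity's actual code: the `llm` and `web_search` callables are stand-ins you'd wire up to a real model and search backend.

```python
# Hypothetical sketch of a Perplexity-style search loop. The `llm` and
# `web_search` callables are stand-ins, not real APIs.

def answer(question, llm, web_search, history=()):
    """Decompose a question into web queries, search, and synthesize."""
    # Step 1: ask the model for a handful of search queries.
    queries = llm(f"Break this into web search queries: {question}").splitlines()
    # Step 2: run every query and pool the results.
    results = [hit for q in queries if q.strip() for hit in web_search(q)]
    # Step 3: synthesize an answer grounded in the results, keeping the
    # previous dialog as context so follow-up questions work.
    context = "\n".join(history)
    return llm(f"{context}\nUsing these sources: {results}\nAnswer: {question}")
```

The key point is just that one user question fans out into several searches before a single synthesized answer comes back.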


My understanding is that Perplexity AI doesn't "just Google it". They have their own indexing/crawling engine called PerplexityBot. They also have their own ranking engine, but they use ranking signals from both Google and Bing and combine with their own ranker.


Hopefully their ranking ignores SEO, then it might actually give useful results.


SEO is so deep into the internet it's basically impossible. It's a cancer. With AI you can generate way more garbage.



What do you mean by ranking signals from Google?


You're right! This is actually one of the concerns people have about Perplexity's future.

Additionally, the way I see it, running a local Perplexity alternative is not necessarily for better results, but for better privacy protection. Plus, it's cool.


No, he is not right. They don't Google stuff; they maintain their own crawler and index, which they query.


Who's "he"? Also you might want to contact the author of the post then, because it talks a lot about how Perplexity searches for things on Google


The article is not correct. Perplexity has its own crawler and crawls sites directly.

The nice thing about Perplexity is its source annotation. It will show you in real time how it's arriving at its data capture(s), and then annotate the results accordingly.

It really is a lovely service. It has replaced the "normal" search engine for me, for most of my queries. And has completely replaced ChatGPT/Claude as it can fulfill those needs directly (and even let me choose to use one of those models).

They also have an OpenAI-compatible API. I use it in conjunction with https://github.com/thmsmlr/instructor_ex
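For anyone unfamiliar, "OpenAI-compatible" just means you can reuse any OpenAI-style client by swapping the base URL. A rough Python sketch, where the endpoint path follows the OpenAI convention and the `"sonar"` model name is a placeholder (check Perplexity's docs for current model names):

```python
import json

# Sketch of talking to an OpenAI-compatible API. The base URL below is
# Perplexity's advertised endpoint; the model name is an assumption.
API_BASE = "https://api.perplexity.ai"

def build_chat_request(model, question):
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }

payload = build_chat_request("sonar", "Who maintains SearXNG?")
body = json.dumps(payload)
# POST `body` to f"{API_BASE}/chat/completions" with an
# "Authorization: Bearer <key>" header, same as with OpenAI.
```

That shape is exactly why wrappers like instructor_ex can target it without special-casing the provider.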


Maybe it's region-specific, but I don't see any AI summary when I search on Google.


I get it on maybe half of my searches in the UK


I've never seen it, but I don't think they support Safari, for whatever reason.


Correct. In practice Perplexity is quite a nice experience though. Currently (!!) no ads, no clicking through, just the answer in text.

There are rumours of ads coming, though... unclear if just for the free version or not.


I'm a happy customer. They provide a good service that saves me hundreds of tiny chunks of time during research. The instant ads hit, I'm out. Any distraction from the results is a time and mind suck which users like me will have no tolerance for. Thankfully, what they're doing is "only" as complex as driving searches with LLMs. And it probably doesn't even involve fine tuning, given that they're able to run new LLMs the same day they're released. I hope to soon be a happy customer of the natural open source replacement for this.


I use ChatGPT to generate my Google queries now and I'm finally getting better results.


That sounds cumbersome. And if that's your workflow, wouldn't a service like perplexity be a better fit for you?


Are you serious? Can you give some examples?


Yes, but with the observation that it also interprets your query beforehand and searches based on the interpreted string. It has increased my Google-fu ten-fold.


Yes


What is this Terminus thing? Half the links on the github pages are 404s, the screenshots of applications just link to larger versions of the screenshots, and it seems like it's trying to alternate between being a highly distributed control plane and a homelab service that runs on a raspberry pi.


Thanks for your feedback. We'd appreciate it if you could list some of the 404 links so we can fix them ASAP, and we will look into your other page suggestions, like the linked screenshots, as well.

To answer your first question, Terminus is a free, self-hosted operating system built on Kubernetes, designed to turn your device/hardware into a home cloud, or a homelab service provider at home, as you mentioned.

Terminus is supported on Linux, Raspberry Pi, Windows, and Mac.



Got it. Yeah, this section lists 60+ modules/subprojects for Terminus, but as we are iterating fast, some of the project links haven't been updated in time. We will submit a PR and fix them ASAP. Thanks again.


I just did that with docker-compose: https://taoofmac.com/space/notes/2024/06/23/1300

Much less fuss and zero weird stuff.


Very cool. The pieces you put together make sense, but I have to ask -- why use Node-RED over Jupyter notebook for prototyping? It seems like your reason is that it gives you access to a suitably flexible low code environment that lets you visually program and experiment without needing to reach into code until you have a working prototype.

Is the improvement in productivity that drastic that you'd rather take that approach when possible? I haven't given something like Node-RED a try before because I've been under the assumption that about 60% of the way through any low-code prototyping exercise, you begin to hit limits of inflexibility that make it more painful than dropping into pure code in the first place, but I wonder if I've been overly hasty in this presumption.


Because:

- I can write JavaScript inside nodes

- When each event is a JSON blob, it’s trivial to handle data

- You can break down arrays/lists into series of discrete events, and switch them along different paths with simple filtering logic

- You can build sub-flows and re-use them trivially.

- You don’t have to worry about making HTTP requests or handling database connections (you just receive the request results as a JSON object, or create an object you send to a database node)

So for me a basic flow is [template node with prompt]->[LLM API call]->[parse and branch out depending on outcome]->[store output], and I only write two snippets of JS inside a couple of nodes.

I do have Jupyter on my stack (another stack, too, with GPU access), but it tends to be linear, messy, and impossible to leave running as a service. My Node-RED stuff is effectively “testing in production” and exposes HTTP endpoints as well, and I switch between prod and test by linking the right nodes…

But yeah, when I’m happy with the prompt or the flow I look at it and write some actual code. Just not in Jupyter.
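The flow described above ([template node with prompt] -> [LLM API call] -> [parse and branch] -> [store]) maps to a short linear pipeline in plain code too. A hypothetical Python equivalent, with `call_llm` and `store` as stand-ins for the real integrations, not Node-RED APIs:

```python
# Hypothetical plain-code equivalent of the Node-RED flow described above:
# [template node with prompt] -> [LLM API call] -> [parse/branch] -> [store].
# `call_llm` and `store` are stand-ins for real integrations.

def run_flow(event, call_llm, store):
    """Run one event (a JSON-like dict) through the flow."""
    # Template node: fill the prompt from the incoming event.
    prompt = "Summarize: {text}".format(**event)
    # LLM node: the call itself is injected, like wiring in an API node.
    reply = call_llm(prompt)
    # Branch node: route on the outcome with simple filtering logic.
    outcome = "ok" if reply.strip() else "empty"
    # Store node: persist the result.
    store({"outcome": outcome, "reply": reply})
    return outcome
```

The visual-flow version of this trades the function boundaries for wires between nodes, which is the appeal for fast prototyping.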


Cool stuff! Thanks for sharing. Docker Compose is indeed a nice method for fast deployment.

But it's like the difference between FTP and Dropbox.

Our guide is designed to enable less technical users to enjoy the benefits of self-hosting a home cloud.

A key advantage is fast deployment and flexible assembly of apps like Ollama and Perplexica.

Also, compared to Docker Compose, Terminus provides a dedicated domain name for each application and service out of the box, so users can access them from anywhere via a browser. It automatically handles all the complicated network configuration, DNS resolution, and HTTPS certificates.


Actually, this 'out-of-the-box' thing makes it complicated for me. Don't get me wrong; I like it when things just work.

But when I already have a setup (e.g. traefik + multiple docker-compose environments), I do want to understand what your out-of-the-box setup is doing. Otherwise, I risk that it kills my existing setup.

So far I stopped after I saw that https://terminus.sh is just a script to download another installer.tar.gz


Thank you for your valuable feedback.

During the installation process, Terminus requires over 110 images. Considering Docker Hub's rate limiting, the script first prompts users to download all images locally.

The script executes the actual installation command at the end. The installation generally follows these steps:

1. Install K8s/K3s

2. Import images sequentially

3. Install system applications using Helm

4. Wait for system startup, then user activation

We understand your concerns. As a new system without an established reputation, we recognize it's challenging for users to try it out.

We've added installation methods for Windows WSL and Raspberry Pi, allowing users to test in virtual machines or temporary environments without concerns about their existing setups.

Over the next two months, we plan to develop a graphical installation tool to further simplify the process.

Thanks again for your input.


Looking forward to your Sandstorm.io successor, k8s is an interesting choice.


Thank you for your acknowledgment! We will certainly do our best.


I can do all of that with docker-compose and Cloudflare tunnels…


Wasn’t aware of LiteLLM, thanks! Are you using it instead of OWUI’s pipeline feature or alongside?


Does OWUI stand for Open Web UI?

My understanding is that Open Web UI and Perplexica each have their own focus.

However, through this tutorial, they can jointly use the same local SearXNG and Ollama service instance.


Yeah OWUI = Open Web UI. I was referring specifically to rcarmo’s setup though.


Slight correction: I believe Perplexity runs their own search index.


You are totally right. We kind of simplified the Perplexity workflow in the post for easier understanding. We will definitely be more mindful of accuracy in our future posts.


Google and Bing have been crawling the web for decades. Perplexity's crawler is just for marketing purposes; there's no way they have even 1% of Google's index. So yeah, in reality they just query the Bing API.


Except that this can't do autocomplete in the URL bar on major browsers, including Firefox.

Autocomplete works if you visit localhost:port in the browser.

I've asked around. It seems like an anticompetitive behavior on the part of browser vendors.


Perplexity doesn't actually use Google Search directly. It uses its own indexer and crawler called PerplexityBot [1]. It then uses a mixture of Google and Bing ranking signals to "help" with some of its own search result ranking. I presume they have their own ranker that has some relevancy score based on site recency, cosine similarity of site data with search query, etc, and they combine this with Google and Bing ranking signals.

[1] https://nypost.com/2024/06/28/business/amazon-probing-ai-sta...
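A toy version of the kind of blended ranker the parent describes might look like the following. This is entirely illustrative: the weights, the recency decay, and the `external_signal` input are made up, not Perplexity's actual ranking.

```python
import math

# Illustrative toy ranker: blend a cosine-similarity relevance score,
# a recency score, and an external (e.g. Google/Bing-derived) ranking
# signal. Weights and decay constants are arbitrary assumptions.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def score(query_vec, doc_vec, age_days, external_signal, w=(0.6, 0.2, 0.2)):
    """Blend relevance, recency, and an external ranking signal."""
    relevance = cosine(query_vec, doc_vec)
    recency = 1.0 / (1.0 + age_days / 30.0)  # decays over roughly a month
    return w[0] * relevance + w[1] * recency + w[2] * external_signal
```

With this shape, a fresh page that matches the query well outranks an equally relevant but stale one, and the external signal nudges ties.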


You are absolutely right.

Thanks for correcting and elaborating on this. We used a simplified version (maybe a bit oversimplified) to make it easy for the average audience to understand how AI search works. We will be more careful with accuracy in the future.


Congratulations on front page. AMD GPU support would be nice.


Thanks for your comment. AMD GPU support will be available shortly.


I tried Perplexity a few times, but I cannot see the point. Are you using it? How is it better than other tools?


I do use it, because it's replaced the search engine and ChatGPT/Claude for me. It's an all-in-one service that delivers a generally great experience. Both its web and mobile apps are excellent, and they provide an OpenAI-compatible API that lets me do work against their supported models.


Step 1) Install an App marketplace.

Uhhh, no.


> You’ve probably heard of Perplexity,

nope…


lol



