Show HN: Open source alternative to Perplexity Comet

(browseros.com)

221 points | by felarof 16 hours ago

24 comments

azinman2 8 hours ago
The demo buying toothpaste shows the difficulty of these tasks. Toothpaste itself was very underspecified, and it essentially randomly chose from a huge list. Some tasks may have past actions that could help guide, others won't have any to inform. Failure cases abound -- maybe the toothpaste you previously bought is no longer available. Then what? Ultimately how much time did this particular example save since you need to double check the result anyway? This is what doomed Alexa for the purchasing experience that Amazon assumed it would enable in the first place.
I think it'd be better to show more non-trivial examples where the time savings is clear, and the failure cases are minimized... or even better how it's going to recover from those failure cases. Do I get a bespoke UI for the specific problem? Talk to it via chat?
This whole world is non-trivial. Good luck!
[-]
- felarof 8 hours ago
  Great points! For sure, the whole agentic browsers space is still super early.
  We are also just getting started and trying to narrow down on a high-value niche use-case.
  There are few repetitive, boring use-cases where time saving could be meaningful -- one example: Walmart 3rd-party sellers routinely (multiple times a day) keep checking prices of the competitor products to price their products appropriately. This could be easily automated with current agentic browsers.
  [-]
  - byteknight 8 hours ago
    But in reality, would much more consistently be automated by a single playwright script.
    [-]
    - felarof 7 hours ago
      True, there are plenty of libs already available to do such an automation if you are (or can hire a) dev.
      But for non-technical folks, agentic browsers seems like a good UX to build such and many more automations.
askl 50 minutes ago
The demo looks like a mediocre solution looking for a problem. I can order toothpaste without an LLM faster and more reliably.
matheist 2 hours ago
> We are also building an LLM-based ad-blocker after Chrome blocked uBlock Origin.
Since it's a Chromium fork, why not re-enable uBlock Origin instead?
mdaniel 9 hours ago
> --- How we built? We patch Chromium's C++ source code with our changes, so we have the same security as Google Chrome. We also have an auto-updater for security patches and regular updates.
So you rebuild your browser on every Chromium release? Because that's the risk: often changes go into Chromium with very innocent looking commit messages than are released from embargo 90 days later in their CVE reference
[-]
- felarof 9 hours ago
  Good question, so far we have been building on top of chromium release that Google Chrome is based on.
  [-]
  - mdaniel 5 hours ago
    I feel as though you overlooked the "every" word in my question. I appreciate you built once, that's a solid accomplishment. If I'm going to be riding your custom build, with your custom C++ changes that introduce their own RCE risk, I want to at least know I'm only vulnerable to your RCE and not your RCE plus the 'just disclosed' RCE for Chromium itself that was actually patched 3 weeks ago but that you didn't bother to track because you don't track Chromium release tags
    Yes, I'm acutely aware of exactly how much compute pulling off such a stunt requires; what I'm wondering is whether you are aware of exactly how much RCE risk you're running by squatting on someone else's C++ codebase that ships what feels like a vuln-a-week from one of the best funded security research teams in the world
  - paulirish 4 hours ago
    You build from the release branches yeah? So, right now that'd be the 7204 branch. https://chromiumdash.appspot.com/releases?platform=Mac
layer8 14 hours ago
I would prefer this as a browser extension, not as its own browser application.
[-]
- Imustaskforhelp 6 minutes ago
  Exactly my thoughts when I saw nanobrowser being mentioned here.
- felarof 13 hours ago
  We would've preferred to build this as browser extension too.
  But we strongly believe that for building a good agent co-pilot we need bunch of changes at Chromium C++ code level. For example, chromium has a accessibility tree for every website, but doesn't expose it as an API to chrome extension. Having access to accessibility tree would greatly improve agent execution.
  We are also building bunch of changes in C++ for agents to interact with websites -- functions like click, elements with indexes. You can inject JS for doing this but it is 20-40X slower.
  [-]
  - esafak 12 hours ago
    Could you upstream that change in order to make it an extension in the future? I think people would not value it any less.
    [-]
    - felarof 12 hours ago
      We don't mind upstreaming. But I don't think Google Chrome/Chromium wants to expose it as an API chrome extensions, if not they would've done this long time ago.
      From Google's perspective, extension are meant to be lightweight applications, with restricted access.
      [-]
      - jazzyjackson 10 hours ago
        I'm not really interested in AI agents for my webbrowser, but it would be pretty cool to see a fork of chromium available that, aside from being de-googled, relaxes all the "restricted access" to make it more fun to modify and customize the way you guys are. Just a thought, may be more of a market for the framework more than the product :)
        See Sciter. A very cool, super lightweight alternative to Electron, but unfortunately it seems like a single developer project and I could never get any of the examples to run.
        https://sciter.com/
        [-]
        felarof 9 hours ago
        Yes, we want to do this too! We'll expose much more richer APIs.
        What use-cases do you have in mind? like scraping?
  - layer8 11 hours ago
    Would this be possible for Firefox?
    [-]
    - felarof 9 hours ago
      IIRC, Firefox's web extension API does not provide access to accessibility tree as well.
      [-]
      - dotancohen 8 hours ago
        I'm not GP, but I agree that if your goal is to empower the end user and protect him from corporate overlords, then Firefox is a more logical choice to fork from.
        [-]
        esperent 2 hours ago
        Isn't the Firefox code notoriously hard to fork and work with? I'm sure that nearly all of these Chrome forks would prefer to fork Firefox, but there's a reason they don't.
        axus 7 hours ago
        I wonder if someone on the Chromium team will upstream all these BrowserOS changes, or "Not Invented Here" and re-implement it all for Gemini / Google Assistant.
        Can't imagine Firefox acquiring a Firefox fork!
        [-]
        dotancohen 5 hours ago
        The hooks needed for integration will probably be implemented in Chrome, not Chromium. Nobody else should have them.
  - arjunchint 9 hours ago
    I mean you could build the agent with a first principles understanding of the DOM instead of just hacking together with the accessibility tree
- arjunchint 9 hours ago
  We had this exact thought as well, you don't need a whole browser to implement the agentic capabilities, you can implement the whole thing with the limited permissions of a browser extension.
  There are plenty of zero day exploit patches that Google immediately rolls out and not to mention all the other features that Google doesn't push to Chromium. I wouldn't trust a random open source project for my day-to-day browser.
  Check out rtrvr.ai for a working implementation, we are an AI Web Agent browser extension that meets you where your workflows already are.
  [-]
  - esperent 2 hours ago
    > I wouldn't trust a random open source project for my day-to-day browser.
    Given that you're working on a direct competitor, this comment reads as fearmongering, designed to drive people over to your product.
    [-]
    - arjunchint 1 hour ago
      I personally talked to another agentic browser player, fellou.ai, in the space asking them how they are keeping up with all the Chromium pushes as you need a dedicated team to handle the merges, they flat out told me they are targeting tech enthusiasts that are not interested in the security of their browser as much.
      As an ex-Google engineer I know the immense engineering efforts and infrastructure setup to develop Chrome. It is very implausible that two people can handle all the effort to serve a secure browser with 15+ million lines of constantly changing C++ code.
      A sandboxxed browser extension is the natural form factor for these agentic capabilities.
  - felarof 7 hours ago
    Brave Browser (70M+ users) has validated that a chromium fork can be viable path. And it can in fact provide better privacy and security.
    Chrome extensions is not a bad idea too. Just saying that owning the underlying source code has some strong advantages in the long term (being able to use C++ for a11y tree, DOM handling, etc -- which will be 20-40X faster than injecting JS using chrome extension).
    [-]
    - arjunchint 1 hour ago
      Honestly excited to see the benchmark result and comparison!
      Our benchmark results [https://www.rtrvr.ai/blog/web-bench-results] show that we are 7x faster than browser-use so curious to see if your claims live up to the hype
- casslin 1 hour ago
  try https://github.com/nanobrowser/nanobrowser
darkmuck 9 hours ago
This is similar to the chrome extension nanobrowser. https://github.com/nanobrowser/nanobrowser
[-]
- Imustaskforhelp 6 minutes ago
  Hm just saw this.
  I think if something can be done as an extension then there is no need to do it as a fork of the existing software.
  Are there any differences b/w nanobrowser and brwoserOS? like some functionalities that only browserOS could do and not nanobrowser which are worth mentioning?
- casslin 1 hour ago
  thanks for the mention!
rpastuszak 15 hours ago
Congrats!
How are you planning to make the project sustainable (from a financial, and dev work/maintenance pov)?
[-]
- felarof 13 hours ago
  Thank you!
  plan is to sell licenses for Enterprise-version of browser, same as other open-source projects.
- patrickaljord 14 hours ago
  my guess is it's just an electron app or chromium wrapper with an ollama wrapper to talk to it (there are plenty of free open source libs to control browsers).
  [-]
  - felarof 13 hours ago
    We are a chromium "wrapper"
    But we are much more performant than other libs (like playwright) which are written in JS, as we implement bunch of changes at chromium source code level -- for example, we are currently implementing a way to build enriched DOMtree required for agent interactions (click, input text, find element) directly at C++ level.
    We also plan to expose those APIs to devs.
  - janalsncm 14 hours ago
    “Just” is a four-letter word :)
    When someone in their infinite wisdom decides to refactor an api and deprecate the old one, it creates work for everyone downstream.
    Maybe as an industry we can agree to do this every so often to keep the LLMs at bay for as long as possible. We can take a page out of the book of the maintainers of moviepy for shuffling their apis around, it definitely keeps everyone on their toes.
  - moscoe 13 hours ago
    You don’t have to guess, it’s open source
  - lotyrin 8 hours ago
    No wireless. Less space than a Nomad.
eGQjxkKF6fif 14 hours ago
Whats the roadmap looking like for Linux?
I don't have Mac or Windows.
[-]
- felarof 13 hours ago
  this is on our radar, we plan to have it ready by early next week!
  still a team of 2 people, so bunch things on our plate.
tzury 4 hours ago
This is not Open source alternative to Perplexity Comet
This is the real thing, the original if you will.
As for Perplexity, to me this company and line of product are seemed as the alternative to anything great in AI.
zebomon 12 hours ago
This is very exciting given the rumor that OpenAI will be launching a (presumably not open source) browser of their own this summer. I've joined your Discord, so will try it soon and report back there. Congrats on launching!
[-]
- felarof 12 hours ago
  Thank you!
  Browser wars have begun.
  > that OpenAI will be launching a (presumably not open source) browser of their own this summer.
  For sure, won't be open-source. I bet in some parallel world, openAI would be non-profit and actually open-source AI :)
  [-]
  - mrweasel 1 hour ago
    > Browser wars have begun.
    Kind of a sad browser war. The majority of these Chromium forks should exist, nor are they particularly viable. Sniffing out which one is going to be successful is obviously the hard problem.
    Ideally we'd get an extensions API for AI agents and various companies could just release their own plugins. Sadly I don't think the majority is interested, they want to control the "experience".
o_____________o 9 hours ago
Would love to see this show up on homebrew!
[-]
- felarof 9 hours ago
  Oooh, that's a nice idea! We'll look into doing that!
  [-]
  - mh- 8 hours ago
    Making a homebrew recipe is super easy, and you can definitely find an example to draw from that's "shaped" like your app. Highly recommend.
5xpB7n8tdbtoP 6 hours ago
The windows .zip shared on your github release page gets flagged as a Trojan by windows defender. Why is it a zip in the first place?
[-]
- felarof 6 hours ago
  It is just a warning. We working on getting Microsft Windows signing set up.
dxroshan 2 hours ago
What a joke project! Chromium has a over 35 million lines of code. These people applied a few patches on it, and then advertising it as if they have developed a new browser. The same goes for Comet also.
From their GitHub readme:
> but Chrome hasn't evolved much in 10 years
Really?? It is not true. You guys please go and check release notes and commits log for Chrome/Chromium project for the past 10 years.
[-]
- Ey7NFZ3P0nzAe 2 hours ago
  Maybe they're only refering to the frontend/UX. It's true that in 10 years it has not changed much compared to the diff with browserOS.
  [-]
  - dxroshan 1 hour ago
    First of all, they didn't say they are referring to the UI.
    The main element in a web browser's UI is the web view where web pages get rendered. It may look the same 'rectangle' as it used to look 10 years ago. But the way chrome renders web pages and execute JavaScript have undergone a lot of changes over the years. Also they have added a lot of new standard and nonstandard HTML, CSS and JavaScript features. Then, there is WebGL 2.0, WebAssembly, WebGPU, etc.
- nsonha 29 minutes ago
  > joke project
  Is Chrome an agentic browser at 35 million lines of code? That's what this project does. Whether they're successful at that or not is another story.
DANmode 9 hours ago
Is BrowserOS-OS on the roadmap?
(Will you ever make a better FydeOS, or if you're laser-focused, perhaps be open to sharing some with them, so they could?)
[-]
- felarof 7 hours ago
  Yes! We are excited to build BrowserOS-OS! I think with agents the whole UX can be re-imagined.
  I'll check out FydeOS!
daureg 12 hours ago
> our agent runs locally in your browser (not on their server)
That's definitely a nice feature. Did you measure the impact on laptop battery life in a typical scenario (assuming there is such a scenario at this early stage)
[-]
- felarof 12 hours ago
  The agent running by itself shouldn't impact battery life, it is similar to a lightweight chrome extension and if you think about it, it's an agent browsing the web like human would :)
  If you run LLMs locally (using Ollama) and use that in our browser, that would impact battery life for sure.
msephton 5 hours ago
How does the competition do with the same demo tasks?
bkyan 7 hours ago
Is there a way to hook BrowserOS up as a sub-agent for an orchestration agent/system?
[-]
- felarof 7 hours ago
  Yes! We want to do this.
  We were thinking of implement MCP protocol into the browser, so the browser can be an MCP server (that exposes bunch of tools -- navigation, click, extract) and you can connect that to your agent, would that work?
  What is your use-case? Happy to chat on discord! https://discord.gg/YKwjt5vuKr
  [-]
  - bkyan 7 hours ago
    Yes, that works for me! My primary use case is automated testing of updates to websites that I maintain.
  - bkyan 7 hours ago
    Thanks for the discord invite! My username is Mindcast on discord.
    [-]
    - felarof 7 hours ago
      thanks for joining! DMed you.
thegeek108 6 hours ago
The name is confused. I think it is a web-based OS in the first place.
[-]
- felarof 6 hours ago
  Hmm, what do you mean? Could you elaborate? Thanks!
  [-]
  - thegeek108 3 hours ago
    I mean the name "BrowserOS" made me think it is an Operating System. Congrats to your achieved.
arjunchint 9 hours ago
Do you have any benchmarks to share like Halluminate's Web Bench?
[-]
- felarof 9 hours ago
  We working on it! Should have pretty soon!
closetkantian 12 hours ago
This looks like a great project.
What are the system requirements? And shouldn't they be listed on your website?
[-]
- felarof 12 hours ago
  we support Mac (apple silicon and intel) and Windows.
  hardware requirements are minimal, same as Google Chrome, if you BYOK API keys for agents and are not running LLMs locally.
rvz 5 hours ago
> But we strongly believe that a privacy-first browser with local LLM support is more important than ever – since agents will have access to so much sensitive data.
How does this make money? Surely this will have a cloud offering?
But if it doesn't make money, I can only assume that the team will be acqui-hired to answer that question.
ivape 9 hours ago
What is the default BrowserOS model? Is it local, and if so, what inferencing server are you using?
[-]
- felarof 9 hours ago
  The default model is Gemini.
  You can bring your own API keys and change the default to any model you local.
  Or better run model locally using Ollama and use that!
  [-]
  - ivape 9 hours ago
    The default is a remote Gemini call?
    [-]
    - felarof 8 hours ago
      Yes, right now using Gemini API on our liteLLM server (we can't expose API key on client side).
      We are working on smaller, fine-tuned model too, which will be the default soon! It should be much faster and precise at navigation tasks.
paul7986 10 hours ago
So would this or any AI browser go out and fetch a list of the best deals for my trip to Iceland? After Show me all the options it has found for flights, hotels, car rentals and show cheapest/best prices with all details (fly out of and into with times) to even allow me to pay for each item on same page I asked it to do so? As well it could group the overall best deal with details and then i can just click to pay instantly and or make some edits.
[-]
- felarof 10 hours ago
  We just started cooking, very soon you should be able to do this!
- dotancohen 8 hours ago
  It seems that the next evolution of SEO will be too skip the SE and simply O for the LLMs.
- flymasterv 6 hours ago
  Orbitz.com has done this for 20 years.