12 comments

  • jwr 4 hours ago
    That is very, very interesting. I've been hoping to have an assistant in the workshop (hands-free!) that I could talk to and have it help me with simple tasks: timers, calculating, digging up notes, etc. — basically, what the phone assistants were supposed to be, but aren't.

    "You will have to unlock your iPhone first" is kind of a deal-breaker when you are in the middle of mixing polyurethane resin and have gloves and a mask on.

    More and more I find that we have the technology, but the supposedly "tech" companies are the gatekeepers, preventing us from using the technological advances and holding us back years behind the state of the art.

    I'll be trying this out on my MacBook, looks very promising!

    • gtowey 57 minutes ago
      The computing power we all have in our pockets is staggering. It could be a tool that truly makes our lives easier, but instead it's mostly a device that is frustrating to use. Companies have decided to make it simply another conduit for advertising. It's a tool for them to sell us more stuff. Basic usability be damned.
    • huijzer 1 hour ago
      > More and more I find that we have the technology, but the supposedly "tech" companies are the gatekeepers

      Yes, same with RSS readers being dropped by large companies. Worked too well, I guess!

    • mentalgear 1 hour ago
      You might be interested in the open-source https://www.home-assistant.io/voice-pe/ .
  • magzter 1 hour ago
    This is so cool. I'm always telling people that the advances in SOTA hosted AI models are also happening in the local model space, i.e. the SOTA hosted models of 6-12 months ago are what we're now seeing run locally on average hardware. This is such an amazing way to actually demo it.
  • dvt 8 hours ago
    Solid work and great showcase, I've done a bunch of stuff with Kokoro and the latency is incredible. So crazy how badly Apple dropped the ball... feels like your demo should be a Siri demo (I mean that in the most complimentary way possible).
  • zerop 4 hours ago
    I have been looking forward to building something like this using open models: a voice assistant I can talk to while I am driving, as I have a long commute. I do use ChatGPT voice mode, and it works great for querying information or for discussions. But I want it to do tasks like browsing the web, acting as a social media manager for my business, etc.
  • est 3 hours ago
    I am making something similar. Also been using Kokoro for TTS. Very cool project!

    Gemma 4 is kinda too heavyweight even with E2B. I am sticking with Qwen 0.8B at the moment.

  • logicallee 2 hours ago
    It might interest people to know you can also easily fine-tune the text portion of this specific model (E2B) to behave however you want! I fine-tuned it to talk like a pirate, but you can get it to do anything you have (or can generate) training data for. (This wouldn't carry over to the text-to-speech portion, though.) So you can easily train it to act a certain way or give certain types of responses.

    Video: https://www.youtube.com/live/WuCxWJhrkIM

    Generated writeup: https://taonexus.com/publicfiles/apr2026/pirate-gemma-journa...

  • divan 3 hours ago
    Can someone quickly vibe-code a native macOS app for this, so it doesn't require running terminal commands and searching for that browser tab? (: (Also for iOS, pls.)
    • duartefdias 3 hours ago
      Would you pay $2 for that native macOS desktop app?
  • k-almuraee 5 hours ago
    Amazing, love your work.