Your Voice Assistant Doesn’t Have To Be Cloudy

Voice assistants are neat — they let us interface with computers without having to bother with touching them at all. Still, many decry the perceived privacy intrusion these devices present, as they’re always trucking data off to corporate servers for all kinds of opaque reasons. Building your own standalone assistant is a way to get around that, and that’s precisely what [Tristram] did.

The build is based on an ESP32 Lyrat development board. Unlike most devboards, this one has two 3 watt audio outputs and mics on board, making it perfect for a build like this one. The Lyrat was paired with some NeoPixel LEDs and a pair of Dayton Audio 1.5″ speakers to enable it to interact with the user both audibly and visually.

[Tristram] steps through not only how to set up the voice assistant, but also how to build it into a simple and attractive enclosure that won’t unduly stand out in the average house. The Lyrat simply has to be flashed with firmware that enables it to work as a voice aid with the Home Assistant platform.

If you’re unfamiliar, Home Assistant is a smart home architecture that you can run yourself on your own hardware, without having everything live in the cloud of some murky corporation.

Home Assistant has grown in popularity in recent years as a less intrusive smart home solution. You can even use it to monitor your hot tub! Video after the break.

 

24 thoughts on “Your Voice Assistant Doesn’t Have To Be Cloudy”

      1. It’s an actual privacy intrusion, not just one’s perception. Unless the equivalent, someone standing near you with a recording device, always on, wouldn’t intrude on your privacy. Even if you employ them for this purpose!

        Whether a recorder is doing something nefarious is arguable, as I’m guessing a statement about anyone with access doing horrible things isn’t generally included in the EULA and arbitration clause.

    1. I’m not familiar with “Home Assistant” but surely there is a balance that can be struck somewhere in between entirely hosting your assistant in the cloud and just sending your voice recordings and its responses back and forth vs an entirely air-gapped machine.

      I don’t see why a self-hosted open source home assistant AI can’t call out to get the weather! If I were making something like this myself I would give it an API even so I could open a port-forward and communicate with it via app when away from home. Or at the very least with “writing the app” ever on a to-do list I’d have the ability to ssh in and talk to it on the command line.

      1. The company behind Home Assistant (Nabu Casa) does provide cloud-based access to your own home instance. You control it, but it allows you to connect in when you are away. Home Assistant also provides integrations with several other internet services. The advantage is that you get to decide who you want to connect to. It’s not just “it won’t work unless you give us all the information about you”.

    2. Home Assistant has integrations into multiple weather forecast services. That is distinct from the voice assistant part. So your voice assistant isn’t sending all of your conversations off to some unknown data miner while your weather integration only gives them the minimum needed to get the right forecast.

    3. Home Assistant (HA) uses the Wyoming Protocol for configuring the voice assistant. So you’re able to specify your own wake-word-detection (WWD) method, Speech-to-text (STT) engine, and Text-to-speech (TTS) engine. You also can configure the Agent to handle the spoken commands.

      There are cloud and local options for all 4 components and you can roll your own. It ships with a local agent that’s able to interact with entities configured in your HA. There’s also an option for an OpenAI chatbot, but it cannot interact with HA entities. You can, however, feed any state values from your HA into the OpenAI chatbot as a prompt.

      So using either the OpenAI agent or the HA local agent, you can read the value of a configured Weather Add-On in HA and get the weather forecast. The agents do not have free rein of your HA or direct access to the internet.

      Everything runs ok on a Raspberry Pi 4, and with the latest release of ESPHome you can put the WWD directly on the ESP micro-controller which further reduces the load on your HA server.
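The four swappable components described above (wake word detection, STT, agent, TTS) can be pictured as a simple chain. The following is only a rough Python sketch of that idea; `VoicePipeline` and the stand-in stages are hypothetical names made up for illustration, and this is not the Wyoming Protocol or any Home Assistant API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoicePipeline:
    """Hypothetical illustration: four independently replaceable stages."""
    detect_wake_word: Callable[[bytes], bool]  # WWD: did the user address us?
    speech_to_text: Callable[[bytes], str]     # STT: audio -> transcript
    agent: Callable[[str], str]                # Agent: transcript -> reply text
    text_to_speech: Callable[[str], bytes]     # TTS: reply text -> audio

    def handle(self, audio: bytes) -> Optional[bytes]:
        # Nothing leaves the first stage unless the wake word was heard.
        if not self.detect_wake_word(audio):
            return None
        transcript = self.speech_to_text(audio)
        reply = self.agent(transcript)
        return self.text_to_speech(reply)


# Stand-in stages that treat bytes as "audio" so the sketch runs anywhere.
# Any stage could be swapped for a local or cloud implementation.
WAKE = b"hey "
pipeline = VoicePipeline(
    detect_wake_word=lambda audio: audio.startswith(WAKE),
    speech_to_text=lambda audio: audio[len(WAKE):].decode("utf-8"),
    agent=lambda text: f"You said: {text}",
    text_to_speech=lambda reply: reply.encode("utf-8"),
)

print(pipeline.handle(b"hey turn on the light"))
print(pipeline.handle(b"turn on the light"))
```

Because each stage is just a callable, replacing one of them (say, moving wake word detection onto the microcontroller) leaves the other three untouched, which is the point the comment makes.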

      I have an M5Stack Atom Echo running an OpenAI agent with a prompt like this that works exceptionally well:

      {% set _now = as_local(now()) %}
      The current date is {{_now.strftime("%A %B %d, %Y")}}.
      The current time is {{_now.strftime("%I:%M:%S %p")}}.
      The current time zone is {{_now.strftime("%Z")}}.
      The user is located in —-, north of —-.
      Provide answers to questions an elementary school aged child could understand.
      If you don’t know the answer to or are unable to answer a question, just say “I don’t know”.
      Answer the user’s questions about the world truthfully.
      Be brief with responses of 1 to 4 sentences.
      Refuse to talk about topics inappropriate for children such as sex, drugs, or violence.
      Your responses are being fed into a TTS engine so please spell phonetically.
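To see what the three `strftime` lines at the top of that prompt expand to, here is a small Python sketch using the same format codes. The `build_prompt_header` helper and the fixed example timestamp are assumptions for illustration only; in Home Assistant the template engine supplies `now()` and `as_local()` itself.

```python
from datetime import datetime, timedelta, timezone


def build_prompt_header(now: datetime) -> str:
    """Render the date, time, and zone lines with the prompt's format codes."""
    return "\n".join([
        f"The current date is {now.strftime('%A %B %d, %Y')}.",
        f"The current time is {now.strftime('%I:%M:%S %p')}.",
        f"The current time zone is {now.strftime('%Z')}.",
    ])


# Fixed, timezone-aware example moment (hypothetical) so the output is repeatable.
example = datetime(2023, 1, 5, 14, 30, 0,
                   tzinfo=timezone(timedelta(hours=-5), "EST"))
print(build_prompt_header(example))
```

With Python's default locale settings, `%A` and `%B` give the full English weekday and month names, `%I` with `%p` gives a 12-hour clock, and `%Z` gives the zone name, so the chatbot receives plain-language context it could not otherwise know.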

  1. Hmm.. “Home Assistant” sounded like something I want to know more about so I Googled it. And it came up as the first result! Troubling. “Home Assistant” is a terrible name in the age of the internet because it should show up in all sorts of sentences as a common noun. That should make it hard to search for. And yet.. Google knew just what I wanted right away.

    Probably knew I had just read this article. Must be time to de-Google again!

    1. It became so prevalent in recent years, it actually got to the point of hurting Google’s own ecosystem of the same category. The search phrase “trigger home automation from assistant” shows results relating to Home Assistant way before Google Home related results.

  2. Home Assistant can run on a variety of hardware. Raspberry Pis are common but you might consider something more powerful if you’re going to use a voice assistant on it. This project is just building a remote interface. Personally I only ever used it on a Raspberry Pi and never tried the voice assistant.

  3. It would be quicker to stand up and press the light switch, or have a button on a remote control to activate a relay, rather than relying on voice decoding. Nonetheless… very well done to Tristram for getting big-tech’s listening devices out of his house. Bad things happen if you let remotely-operated corporate smart-tech monitor your every word and control your appliances:
    https://dailysceptic.org/2023/09/27/the-coming-tyranny-of-smart-technology-is-worse-than-you-think-but-there-is-hope/

  4. @Lewin Day said: “The build is based on an ESP32 Lyrat development board.”

    That’s the Espressif ESP32-LyraT Audio Development Board. See it here:

    https://www.espressif.com/en/products/devkits/esp32-lyrat

    You can buy the ESP32-LyraT board for $20 here:

    https://www.mouser.com/ProductDetail/Espressif-Systems/ESP32-LyraT?qs=MLItCLRbWsxPzPCja546ZA%3D%3D

    In fact there are four different Espressif LyraT audio dev boards:

    https://www.mouser.com/c/?q=ESP32-LyraT
