Building A Smart Speaker Outside The Corporate Cloud

If you’re not worried about corporate surveillance bots scraping your shopping list and manipulating you through marketing, you can buy any number of off-the-shelf smart speakers for your home. Alternatively, you can roll your own like [arpy8] did, and keep your life a little more private.

The build is based around an ESP32 microcontroller. It connects to the ‘net via its inbuilt Wi-Fi connection, and listens out for your voice with an INMP441 omnidirectional microphone module. The audio data is trucked off to a backend server running a Whisper speech-to-text model. The text is then passed to Google’s Gemini 2.5 Flash large language model. The response generated is passed to the Piper Neural Voice text-to-speech engine, sent back to the ESP32, and spat out via the device’s DAC output and a speaker attached to an LM386 amplifier. Basically, anything you could ask Gemini, you can do with this device.
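The round trip described above can be sketched as a simple server-side pipeline. This is an illustrative sketch, not the actual code from [arpy8]'s repo: the function names and stub stages are assumptions standing in for Whisper, Gemini, and Piper.

```python
# Hypothetical sketch of the server-side round trip: names and stages are
# stand-ins, not the actual code from the ESP32_Voice_Assistant repo.

def assistant_pipeline(audio_bytes, transcribe, generate, synthesize):
    """Chain STT -> LLM -> TTS; each stage is injected as a callable."""
    text = transcribe(audio_bytes)   # e.g. a Whisper speech-to-text model
    reply = generate(text)           # e.g. Gemini 2.5 Flash
    return synthesize(reply)         # e.g. Piper, returning audio for the DAC

# Wire in stub stages just to show the data flow:
pcm = assistant_pipeline(
    b"\x00\x01",
    transcribe=lambda audio: "what time is it",
    generate=lambda prompt: f"You asked: {prompt}",
    synthesize=lambda reply: reply.encode("utf-8"),
)
print(pcm)  # b'You asked: what time is it'
```

Keeping each stage behind a plain callable like this is what makes the backend swappable: any of the three services could be replaced without touching the ESP32 firmware.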

By virtue of using a commercial large language model, it’s not perfectly private by any means. Still, it’s at least a little further removed than using a smart speaker that’s directly logged in to your Amazon/Google/Hulu/Beanstikk account. Files are on GitHub for those eager to dive into the code. We’ve seen some other fun builds along these lines before, too. Video after the break.

10 thoughts on “Building A Smart Speaker Outside The Corporate Cloud”

    1. I thought ‘well, the Gemini 2.5 call could be swapped for a locally run model; it’s not too hard to do that with Ollama and a model from Hugging Face’, so I looked at the code… and sadly, in https://github.com/arpy8/ESP32_Voice_Assistant/blob/main/server/main.py:

      from google import genai

      llm_client = genai.Client()

      So, yep, they may as well have just bought a Google device, as the spyvertising company is getting absolutely everything and training on it.

      1. Right, the full-size LLMs require hundreds of GB of VRAM to run at any appreciable performance level… and they are largely closed source.

        IIRC Meta’s Llama (or whatever they call it) has small enough models to run on a local machine, and they’re open enough too.
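        For what the swap would look like: a minimal sketch of calling a locally served model through Ollama’s REST API instead of the hosted Gemini client. The `/api/chat` endpoint and payload shape follow Ollama’s documented API; the model name `llama3.2` is an assumption, substitute whatever you’ve pulled locally.

```python
import json

# Hypothetical sketch: replacing the hosted Gemini call with a local model
# served by Ollama. Endpoint and payload shape follow Ollama's REST API;
# the model name is an assumption - use whichever model you've pulled.

OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(prompt: str, model: str = "llama3.2") -> dict:
    """Build the JSON body for a single-turn chat with Ollama."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete reply, not a token stream
    }

body = build_chat_request("turn off the lights")
print(json.dumps(body, indent=2))

# Actually sending it needs a running Ollama instance, e.g.:
#   import urllib.request
#   req = urllib.request.Request(OLLAMA_URL, json.dumps(body).encode(),
#                                {"Content-Type": "application/json"})
#   reply = json.loads(urllib.request.urlopen(req).read())["message"]["content"]
```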

    1. You always need to be aware of security, yep. Dead right and you make an important point.

      The safest ‘commercial grade’ solution that avoids relying on a computer’s firewall is to put IoT and other untrusted devices on a separate VLAN in your router, so that those untrusted devices are safely separated from the rest of the network at OSI layer 2 and any harm is limited to the other IoT devices on that partitioned network.

      I have quite a number set up: some VLANs have no internet access in or out, some only have outbound access, and some are bidirectional. Then there is another VLAN on top for dangerous devices with heaters, like my 3D printer. Finally, there’s a VLAN dedicated to secure devices, plus a guest VLAN. Some have their own access points and some must be hard-wired to access.
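      As a rough illustration of what such an isolated IoT segment might look like, here is a hypothetical OpenWrt-style `/etc/config/network` fragment. The VLAN ID, interface names, and addressing are all illustrative assumptions, not a drop-in config for any particular router:

```
# Hypothetical OpenWrt /etc/config/network fragment: a tagged VLAN (vid 20)
# carrying an isolated IoT subnet. Names and addresses are illustrative.

config device
        option name 'br-lan.20'
        option type '8021q'
        option ifname 'br-lan'
        option vid '20'

config interface 'iot'
        option proto 'static'
        option device 'br-lan.20'
        option ipaddr '192.168.20.1'
        option netmask '255.255.255.0'
```

      On its own this only creates the segment; whether the IoT VLAN gets internet access (out-only, bidirectional, or none) is then decided by firewall zone rules between this interface and the WAN.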
