AI In A Box Envisions AI As A Private, Offline, Hackable Module

With AI in a Box, [Useful Sensors] aims to embed a variety of complementary AI tools into a small, private, self-contained module with no internet connection. It can do live voice recognition and captioning, live translation, and natural language conversational interaction with a local large language model (LLM). Intriguingly, it’s specifically designed to be hack-friendly, with features such as the ability to act as a voice keyboard by sending live transcribed audio as keystrokes over USB.
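That voice-keyboard feature works because the box can present itself to the host as a standard USB HID keyboard, so the computer it’s typing into needs no drivers or companion software. A minimal sketch of the device-side idea, assuming a Linux USB keyboard gadget exposing /dev/hidg0 (the path and character mapping here are illustrative assumptions, not details from the project):

```python
# Sketch: type transcribed text as USB HID keystrokes from a device
# configured as a Linux USB keyboard gadget (exposes /dev/hidg0).
# Illustrative only -- not the project's actual implementation.

# A HID keyboard report is 8 bytes: [modifiers, reserved, 6 keycodes].
# Usage IDs per the USB HID usage tables: 'a'-'z' are 0x04-0x1D.
LEFT_SHIFT = 0x02
KEYCODES = {chr(ord('a') + i): (0, 0x04 + i) for i in range(26)}
KEYCODES.update({chr(ord('A') + i): (LEFT_SHIFT, 0x04 + i) for i in range(26)})
KEYCODES[' '] = (0, 0x2C)   # space
KEYCODES['\n'] = (0, 0x28)  # Enter

def type_text(text, device="/dev/hidg0"):
    """Send each character as a key press followed by a release."""
    with open(device, "wb", buffering=0) as hid:
        for ch in text:
            if ch not in KEYCODES:
                continue  # this toy mapping skips digits and punctuation
            mods, code = KEYCODES[ch]
            hid.write(bytes([mods, 0, code, 0, 0, 0, 0, 0]))  # key down
            hid.write(bytes([0] * 8))                          # key up

# e.g. type_text(speech_to_text(audio_chunk)) as transcription arrives
```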

Based on the Rockchip RK3588S SoC, the unit is designed with an integrated speaker, display, and microphone.

Right now it’s wrapping up a pre-order phase, with units slated to ship around the end of January 2024. The project is open source (GitHub repository), but since it’s still in development, there’s not a whole lot visible in the repository yet. However, a key part of getting good performance is [Useful Sensors]’s own transformers library for the Rockchip NPU (neural processing unit).

Capabilities like high-quality local voice recognition and locally-hosted LLMs such as LLaMA have gotten a massive boost thanks to recent advances in machine learning, and this project aims to tie them together in a self-contained package.

Perhaps private digital assistants can become more useful when users have the freedom to modify and integrate them as they see fit. Digital assistants hosted by the big tech companies are often frustrating, and others have observed that this is ultimately because they exist primarily to serve their makers rather than their users.

10 thoughts on “AI In A Box Envisions AI As A Private, Offline, Hackable Module”

    1. Who knows. It has 8GB of RAM, no mention of permanent storage, no real details on software, and the GitHub repo doesn’t have any details. Mainly it seems to be a marketing project at this point.

    2. Depends on what you need to do with them. If you’re okay with waiting for the responses from the LLM, your only limitation is the amount of RAM on your machine. You could run absolutely massive models on a Celeron CPU with enough RAM to hold them.

      Model sizes start at 7 billion parameters and go all the way up to 130 billion, at roughly 800MB per billion parameters: 7B models are ~5GB-6GB and 13B are ~9GB-10GB. 7B models feel kinda “dumb” while 13B are quite a bit smarter. I have no experience with larger models, but I hear it’s diminishing returns above 34B parameters.
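      As a quick back-of-the-envelope check of that rule of thumb (a sketch only; ~800MB per billion parameters is roughly what 4-to-6-bit quantized weights plus overhead come out to):

```python
# Back-of-the-envelope RAM estimate using the ~800MB-per-billion-
# parameters rule of thumb from the comment above.

MB_PER_BILLION_PARAMS = 800

def estimated_ram_gb(billions_of_params):
    return billions_of_params * MB_PER_BILLION_PARAMS / 1024

for size in (7, 13, 34, 70, 130):
    print(f"{size:>4}B model: ~{estimated_ram_gb(size):.1f} GB")

#    7B model: ~5.5 GB
#   13B model: ~10.2 GB
#   34B model: ~26.6 GB
#   70B model: ~54.7 GB
#  130B model: ~101.6 GB
```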

      If you want any kind of speed, you will need a GPU with enough VRAM. 4GB will be the absolute minimum, 6GB should be fine. It all depends on the size of the model you are running.

      I run a 13B parameter model on my PC for…reasons…and I have an RTX 3060 with 12GB of VRAM, 32GB of RAM, and a Ryzen 5 5600X. It’s plenty fast for what I need.

      1. Someone is going to become very rich when they develop a framework for a 5-8B parameter model that can gab effectively and use a database to store tokens in place of short- and long-term memory. RPG NPCs don’t have to be _that_ with it if they can remember what happens to them and react to their senses.
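      The gist doesn’t need exotic infrastructure: persist every exchange in a small database and replay the relevant ones into the NPC’s prompt. A toy sketch along those lines (all names here are hypothetical, and a real system would retrieve memories by semantic similarity rather than recency):

```python
import sqlite3, time

# Toy sketch of database-backed NPC memory: persist every exchange,
# then replay the most recent ones into the next prompt.

db = sqlite3.connect("npc_memory.db")
db.execute("""CREATE TABLE IF NOT EXISTS memory (
    npc TEXT, ts REAL, speaker TEXT, text TEXT)""")

def remember(npc, speaker, text):
    db.execute("INSERT INTO memory VALUES (?, ?, ?, ?)",
               (npc, time.time(), speaker, text))
    db.commit()

def build_prompt(npc, player_line, max_memories=8):
    rows = db.execute(
        "SELECT speaker, text FROM memory WHERE npc=? "
        "ORDER BY ts DESC LIMIT ?", (npc, max_memories)).fetchall()
    history = "\n".join(f"{s}: {t}" for s, t in reversed(rows))
    return (f"You are {npc}, a village blacksmith. "
            f"Past events you remember:\n{history}\n"
            f"Player: {player_line}\n{npc}:")

# remember("Brom", "Player", "I broke your anvil yesterday, sorry.")
# reply = llm(build_prompt("Brom", "Do you still trust me?"))
# remember("Brom", "Brom", reply)  # llm() is whatever local model you run
```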

  1. Probably relevant to note that one of the founders of Useful Sensors is Pete Warden. Pete has done a lot of work on making AI run on low-end hardware; e.g., he’s one of the creators of TinyML. The other AI-powered edge devices they’ve shown, like their Person Sensor, have performed as expected. I’m sure this will too.

    Full disclosure: I have known Pete for many years.
