Tabletop Handybot Is Handy, And Powered By AI

Decently useful AI has been around for a little while now, and robotic arms have been around much longer. Yet somehow, we don’t have little robot helpers on our desks yet! Thankfully, [Yifei] is working towards that reality with Tabletop Handybot.

What [Yifei] has developed is a robotic arm that accepts voice commands. The robot relies on a Realsense D435 RGB-D camera, which provides color vision with depth information as well. Grounding DINO is used for object detection on the RGB images. Segment Anything and Open3D are used for further processing of the visual and depth data to help the robot understand what it’s looking at. Meanwhile, voice commands are interpreted via OpenAI Whisper, which can feed prompts to ChatGPT for further processing.
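
To make the depth step a little more concrete, here is a minimal sketch of how a segmentation mask and a depth image could be turned into a single 3D target point for the arm. It is not [Yifei]'s code: the project uses Open3D for this kind of processing, while this stand-in uses plain Python and the standard pinhole camera model. The function names, the median-based target, and the intrinsics (`fx`, `fy`, `cx`, `cy`) are all assumptions made for illustration.

```python
from statistics import median


def deproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres into the camera frame.

    Uses the pinhole model; fx, fy, cx, cy are hypothetical intrinsics.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def grasp_point(mask, depth, fx, fy, cx, cy):
    """Return the per-axis median 3D point of all masked pixels.

    mask  : 2D list of 0/1 values (e.g. from a segmentation model)
    depth : 2D list of depths in metres, 0 meaning "no reading"
    Returns None if no masked pixel has a valid depth.
    """
    points = [
        deproject_pixel(u, v, depth[v][u], fx, fy, cx, cy)
        for v, row in enumerate(mask)
        for u, inside in enumerate(row)
        if inside and depth[v][u] > 0
    ]
    if not points:
        return None
    xs, ys, zs = zip(*points)
    # The median is less sensitive than the mean to stray depth readings.
    return (median(xs), median(ys), median(zs))
```

A real pipeline would take the intrinsics from the camera itself and would still need to transform the resulting point from the camera frame into the arm's base frame before planning a grasp.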

[Yifei] demonstrates his robot picking up markers on command, which is a pretty cool demo. With so many modern AI tools available, we’re getting closer to the ideal of robots that can understand and execute on general spoken instructions. This is a great example. We may not be all the way there yet, but perhaps soon. Video after the break.

14 thoughts on “Tabletop Handybot Is Handy, And Powered By AI”

  1. Oh my. Building robotics with dependencies on some cloud LLM service is a big, big NO-GO for me, as it should be for anybody.

    From a simple network outage that leaves the robot brain-dead, to malicious data injection into the LLM that could trigger unwanted reactions from the robot, to results that are hard or impossible to reproduce in industrial applications, there is just too much that can go wrong.

    1. To your point, I’d love to see some efforts to create an “AI Firewall” that can filter malicious behavior and prevent damage. I don’t know what that looks like, but it’s a cool idea nonetheless. I do, however, think this is an instance where “we” decide that the reward is worth the risk until someone proves that the negative hypothetical is reality, or until an actual local alternative is viable. That being said, the actual resource cost to run an LLM might be the best firewall at the moment.
