This Open Source Robot Brain Thinks in 3D

European roboticists today released a powerful open-source artificial intelligence model that acts as a brain for industrial robots—helping them grasp and manipulate things with new dexterity.

The new model, SPEAR-1, was developed by researchers at the Institute for Computer Science, Artificial Intelligence and Technology (INSAIT) in Bulgaria. It may help other researchers and startups build and experiment with smarter hardware for factories and warehouses.

Just as open source language models have made it possible for researchers and companies to experiment with generative AI, Martin Vechev, a computer scientist at INSAIT and ETH Zurich, says SPEAR-1 should help roboticists experiment and iterate rapidly. “Open-weight models are crucial for advancing embodied AI,” Vechev told WIRED ahead of the release.

SPEAR-1 differs from existing robot foundation models in that it incorporates 3D data into its training mix. This gives the model a richer grasp of the physical world, making it easier to reason about how objects move through three-dimensional space.

Robot foundation models are generally built on top of vision language models (VLMs), which have a broad but limited grasp of the physical world because their training data consists largely of labeled 2D images. “Our approach tackles the mismatch between the 3D space the robot operates in and the knowledge of the VLM that forms the core of the robotic foundation model,” Vechev says.
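The article does not describe SPEAR-1's implementation, but one rough way to picture the idea is a policy network that fuses features from a 2D vision-language backbone with features from a 3D point cloud before predicting a robot action. The sketch below is a toy illustration in PyTorch, not SPEAR-1's actual architecture; every module, dimension, and name here is an assumption.

```python
# Toy sketch (NOT SPEAR-1's real code): fuse 2D vision-language features
# with 3D point-cloud features before predicting a robot action.
import torch
import torch.nn as nn

class Toy3DAwarePolicy(nn.Module):
    def __init__(self, vlm_dim=512, pc_dim=256, action_dim=7):
        super().__init__()
        # Stand-in for a pretrained VLM backbone's token features.
        self.vlm_encoder = nn.Linear(768, vlm_dim)
        # Simple per-point encoder, standing in for a 3D point-cloud network.
        self.point_encoder = nn.Sequential(
            nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, pc_dim))
        # Fusion head maps the combined 2D+3D representation to an action
        # (e.g., an end-effector pose delta plus a gripper command).
        self.action_head = nn.Sequential(
            nn.Linear(vlm_dim + pc_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim))

    def forward(self, vlm_tokens, point_cloud):
        # vlm_tokens: (batch, tokens, 768) features for an image + instruction.
        # point_cloud: (batch, points, 3) xyz points from a depth sensor.
        vlm_feat = self.vlm_encoder(vlm_tokens).mean(dim=1)          # pool tokens
        pc_feat = self.point_encoder(point_cloud).max(dim=1).values  # pool points
        return self.action_head(torch.cat([vlm_feat, pc_feat], dim=-1))

policy = Toy3DAwarePolicy()
action = policy(torch.randn(1, 32, 768), torch.randn(1, 1024, 3))
print(action.shape)  # torch.Size([1, 7])
```

The point of the sketch is only that the 3D input gives the action head geometric information that a 2D image-text backbone alone would lack.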

SPEAR-1 is roughly as capable as commercial foundation models designed to operate robots when measured on RoboArena, a benchmark that tests a model’s ability to get a robot to do things like squeeze a ketchup bottle, close a drawer, and staple pieces of paper together.

The race to make robots smarter already has billions of dollars riding on it. The commercial potential of generally capable robots has spawned well-funded startups including Skild, Generalist, and Physical Intelligence. SPEAR-1 performs almost as well as Pi-0.5, a model from Physical Intelligence, a billion-dollar startup founded by an all-star team of robotics researchers.

SPEAR-1 suggests that the quest to build more intelligent robots may, as with language models, involve both closed models like those from OpenAI, Google, and Anthropic and open source ones like Llama, DeepSeek, and Qwen.

Robot intelligence is still in its infancy, though. It is possible to train an AI model to operate a robot arm so that it can reliably pick certain objects from a table. In practice, however, the model will need to be retrained from scratch if a different kind of robot arm is used or if the objects or the environment change.

Robotics researchers hope that the same recipe that produced large language models—huge amounts of training data and compute—will eventually yield robot models with similarly general capabilities. This would mean robots capable of adapting very quickly to new situations or new tasks. Eventually such models might enable humanoids to operate in messy and unfamiliar environments, thanks to a general understanding of how the world works.

Karl Pertsch, a researcher at the company Physical Intelligence, says it is too soon to know how important 3D training data will be for robotic foundation models. He adds, however, that SPEAR-1 shows how rapidly more general robotic models are advancing. “It's really cool to see academic groups building quite general policies that can actually be evaluated across a diverse set of environments out-of-the-box, and [can] achieve non-trivial performance,” Pertsch says. “This was not possible even a year ago.”