Google DeepMind Launches On-Device Gemini Robotics AI Model That Works Without Internet


Hyderabad: Google has rolled out an on-device version of its Gemini Robotics AI model that can operate without an internet connection. The new Vision-Language-Action (VLA) model offers general-purpose dexterity and task generalisation similar to the Gemini Robotics model released in March this year.

A VLA model is a type of AI model that combines vision, language, and action capabilities so that robots can understand their environment, process instructions, and perform tasks.

The update marks a significant shift from Google's earlier models, which relied on cloud connectivity. By enabling robots to process information and make decisions on the device itself, the Mountain View-based company hopes to make robotics more practical in remote areas, secure facilities, and latency-sensitive settings.

Dexterity Evaluation (Image Credit: Google DeepMind Blog)

The main highlight of Google’s on-device robotics model is that it keeps working in places with weak or no connectivity. It also processes information locally, which is expected to help in privacy-sensitive applications such as healthcare and industrial automation, where data security is a major concern.

Gemini Robotics On-Device AI Model

The new robotics model is designed to enable robots to complete a wide range of physical tasks, including tasks it has not been specifically trained on. Several videos shared in a DeepMind blog post show the on-device model carrying out general tasks with dexterity.

In one clip, the on-device model was instructed to perform general tasks, such as opening a middle drawer and closing a pear-shaped container, which it executed successfully. To test its dexterity, it was then asked to unzip a bag and uncap a marker, both of which it also completed.

On-device Robotic AI model performing various tasks. (Image Credit: Google DeepMind Blog)

Another video demonstrated how the AI model executed general task instructions in entirely new environments with previously unseen objects. Although the on-device Gemini Robotics AI model was originally trained solely for Google’s ALOHA robot, it successfully adapted to other robotic systems, including Apptronik’s Apollo humanoid and the dual-armed Franka FR3.

It performed tasks such as industrial belt assembly and folding a dress, all while operating offline with low-latency inference. For comparison, Optimus, Tesla’s humanoid robot, can also perform tasks like folding clothes, boiling eggs, and dancing, but it requires an internet connection to process data and deliver results.

On-device Robotic AI model performing tasks after being integrated into the Apollo humanoid robot. (Image Credit: Google DeepMind Blog)

A third video shows the same on-device model integrated into the Apollo humanoid robot, which followed natural-language instructions and performed general tasks with previously unseen objects.

On-Device AI Model Performance

The team evaluated the model and reported results on three measures: generalisation, instruction-following, and fast adaptation.

Generalisation Benchmark (Image Credit: Google DeepMind Blog)

The first graph shows the generalisation performance of the two models. On visual generalisation they were close, with the cloud-connected variant slightly ahead of the on-device model. On semantic and action generalisation, the on-device model also trailed the cloud-connected model by a small margin.

Instruction-Following Benchmark (Image Credit: Google DeepMind Blog)

The second graph shows how well both models followed instructions given in natural language. Both performed well on easy instructions, while the cloud-connected flagship Gemini Robotics took a significant lead on hard instructions.

Fast Adaptation Benchmark (Image Credit: Google DeepMind Blog)

The third graph shows how quickly the models adapt to new scenarios. Here, the on-device model came surprisingly close to the cloud-connected model: it could carry out tasks right away and also learn new ones from 50 to 100 demonstrations.