Agent Economy

RynnBrain: Alibaba's Open Embodied Foundation Model

Alibaba DAMO Academy has recently released RynnBrain, an embodied foundation model grounded in physical reality. The model demonstrates strong capabilities in physical world understanding, spatial reasoning, and robot task planning.

Model Specifications

RynnBrain is available in three variants:

  • RynnBrain-2B: Lightweight dense model
  • RynnBrain-8B: Standard dense model
  • RynnBrain-30B-A3B: MoE (Mixture-of-Experts) model with 3B active parameters
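
To make the dense versus MoE distinction concrete, the sketch below computes what share of each variant's weights is used per token. The parameter counts are nominal figures read off the model names (the 30B total for the MoE variant is inferred from "30B-A3B"), and the `Variant` class and its fields are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one model variant (not an official API)."""
    name: str
    total_params_b: float   # nominal total parameters, in billions (from the name)
    active_params_b: float  # nominal parameters used per token, in billions

    @property
    def active_fraction(self) -> float:
        # Share of the weights that take part in processing a single token.
        return self.active_params_b / self.total_params_b


# Dense models use all of their weights for every token;
# the MoE variant routes each token through only a subset of experts.
VARIANTS = [
    Variant("RynnBrain-2B", 2, 2),
    Variant("RynnBrain-8B", 8, 8),
    Variant("RynnBrain-30B-A3B", 30, 3),
]

for v in VARIANTS:
    print(f"{v.name}: {v.active_fraction:.0%} of weights active per token")
```

Under these nominal figures, the MoE variant stores roughly as many weights as a 30B dense model but computes with about a tenth of them per token, which is the usual motivation for the design.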

Core Capabilities

1. Comprehensive Egocentric Understanding

Excels at fine-grained video understanding and egocentric cognition, covering tasks such as embodied QA, counting, and OCR.

2. Diverse Spatio-temporal Localization

Localizes across episodic memory, precisely identifying objects, target areas, and motion trajectories.

3. Physical-space Reasoning

Employs an interleaved reasoning strategy that alternates between textual and spatial grounding, ensuring that reasoning processes are firmly rooted in the physical environment.
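
One way to picture an interleaved trace is as a sequence in which text steps and spatial grounding steps alternate. The sketch below is purely illustrative: the `TextStep` and `PointStep` types, the normalized image coordinates, and the example trace are all assumptions, not RynnBrain's actual output format.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextStep:
    """A textual reasoning step (hypothetical type)."""
    text: str


@dataclass
class PointStep:
    """A spatial grounding step: a point in a video frame (hypothetical type)."""
    frame: int
    x: float  # assumed normalized image coordinate in [0, 1]
    y: float  # assumed normalized image coordinate in [0, 1]


Step = Union[TextStep, PointStep]


def is_interleaved(trace: list[Step]) -> bool:
    """True if the trace contains both kinds of step and no two adjacent steps share a kind."""
    kinds = [type(step) for step in trace]
    return len(set(kinds)) == 2 and all(a is not b for a, b in zip(kinds, kinds[1:]))


# Made-up example: each textual claim is immediately tied to a location.
trace = [
    TextStep("The mug is on the left edge of the table."),
    PointStep(frame=12, x=0.21, y=0.64),
    TextStep("The gripper should approach from above."),
    PointStep(frame=12, x=0.21, y=0.55),
]

print(is_interleaved(trace))
```

The strict alternation checked here is a simplification; the source says only that reasoning alternates between the two modes, not that every text step must be followed by exactly one grounding.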

4. Physics-aware Precise Planning

Integrates located affordances and object information into planning, enabling downstream VLA (Vision-Language-Action) models to execute intricate tasks with fine-grained instructions.

Specialized Models

In addition to the base models, DAMO Academy has released three post-trained specialized models:

  • RynnBrain-Plan: Robot task planning
  • RynnBrain-Nav: Vision-language navigation
  • RynnBrain-CoP: Chain-of-Point reasoning

Technical Report and Resources

DAMO Academy has also published a detailed technical report and open-sourced model weights and code on Hugging Face and ModelScope.

Tags: AI, Embodied AI, Foundation Model, Robotics, Alibaba