SafeVerse Open Source: Building a Safe and Trustworthy Embodied AI 'Twin Training Ground'

1 Background and Motivation

Safety and trustworthiness are essential for deploying embodied AI in the real world.

However, conducting red-teaming and adversarial exercises in physical environments is often expensive and risky.
A fast, reliable virtual training ground is therefore becoming a must-have for the industry.

Today’s embodied-AI virtual environments face a dilemma:

Traditional simulators: limited assets and manipulable objects; heavy reliance on manual modeling; hard to reproduce diverse real-world scenarios.
Generative world models (e.g., Genie 3): impressive imagination, but not an accurate twin of reality—insufficient for scenario-specific exercises (e.g., your living room or a particular factory workshop).

To break this bottleneck, Shanghai AI Laboratory is open-sourcing SafeVerse—the world’s first practical platform built for safety and trustworthiness research in embodied intelligence.

SafeVerse pioneers a new “reconstruct + edit” paradigm.
Instead of generating infinite imagined worlds, it focuses on building a digital twin of any specified real-world environment quickly and affordably.

With SafeVerse, an ordinary video can be transformed into an interactive, physics-consistent 3D training ground in minutes.
It supports online red-teaming and adversarial training, enabling agents to evolve through continuous practice.

This is the first open-source platform that truly closes the loop from
real-scene digitization → automated adversarial exercises → online agent evolution,
providing a solid foundation for safe real-world deployment.

2 Three Core Breakthroughs of SafeVerse

Unlike generative models that struggle to precisely replicate real scenes and to support fine-grained object manipulation, SafeVerse delivers three key breakthroughs:

🎮 “Ctrl+C, Ctrl+V” for the real world
The simulator faithfully recreates a real environment’s structure and semantics.
It’s not just visually realistic—it aligns at the level of physical interaction.
⚡️ Built in minutes; everything is manipulable
From a single video, a specified real-world scene can be built in minutes.
Objects are operable at the component level (doors open, lights switch, chairs move)—not just static backdrops.
🛡️ Unified “evaluation–red teaming–evolution”
The environment can be modified on the fly according to attack/defense instructions (e.g., moving obstacles, changing lighting).
It supports online RL training, so agents can co-evolve safety and capability through adversarial practice.

3 Rapid Construction of Manipulable Digital Twins

The first step to making virtual exercises valuable is to turn a real scene—such as an indoor video—into a digital twin where agents can enter and interact.

Traditional approaches are constrained by labor-intensive modeling and shallow interaction logic.
They are slow to build, prone to distortion, and offer limited interactivity—making safety testing little more than “paper drills”.

SafeVerse changes this with an efficient “video in + minutes out” workflow.
Digitizing real environments becomes straightforward and scalable.

Helping AI truly “understand” the video:
Instead of complex optimization purely in 3D, SafeVerse uses a multimodal foundation model as a “visual understanding hub”.

It parses objects in the video and performs stable, consistent, semantically accurate recognition and tracking.
This makes the real-to-digital information transfer both accurate and efficient—laying the groundwork for high-fidelity reconstruction.

Bringing the virtual world to life:
SafeVerse is built on Minecraft, which provides rich physical rules, and uses a novel pipeline to automatically generate or match 3D assets for recognized objects.

These assets are visually realistic and detailed, and are assigned physics-consistent interactive properties (e.g., switchable lights, movable chairs).
This is not a static showroom—it’s a dynamic sandbox for agents to explore and operate.
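
To make the idea of "physics-consistent interactive properties" concrete, here is a minimal sketch of how a reconstructed object might carry its affordances and state. The `SceneObject` class, its field names, and the `switchable`/`movable` keys are hypothetical illustrations, not SafeVerse's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Position = Tuple[int, int, int]


@dataclass
class SceneObject:
    """Hypothetical record for one reconstructed object (not SafeVerse's schema)."""
    name: str
    position: Position
    # Affordances say what an agent may do, e.g. {"switchable": True}.
    affordances: Dict[str, bool] = field(default_factory=dict)
    # Mutable state, e.g. {"on": False} for a light.
    state: Dict[str, bool] = field(default_factory=dict)

    def toggle(self, key: str) -> bool:
        """Flip a boolean state entry; only allowed on switchable objects."""
        if not self.affordances.get("switchable", False):
            raise ValueError(f"{self.name} is not switchable")
        self.state[key] = not self.state.get(key, False)
        return self.state[key]

    def move_to(self, position: Position) -> None:
        """Relocate the object; only allowed on movable objects."""
        if not self.affordances.get("movable", False):
            raise ValueError(f"{self.name} is not movable")
        self.position = position


light = SceneObject("ceiling_light", (2, 3, 2), {"switchable": True}, {"on": False})
chair = SceneObject("chair", (4, 0, 1), {"movable": True})

light.toggle("on")        # the light is now on
chair.move_to((5, 0, 1))  # the chair has a new position
```

Keeping affordances separate from state means an edit can later change what is possible (for example, making a chair immovable) without touching the object's current condition.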

As a result, SafeVerse can build a specified real scene in minutes while preserving visual-semantic consistency and deep manipulability.
This provides a reliable, reality-aligned starting point for adversarial exercises and agent evolution.

Figure 1: An interactive 3D scene obtained quickly from an input video

4 Scenario Editing Driven by Attack/Defense Instructions

To make virtual exercises truly impact real-world safety, reconstructing a static scene is not enough.
The key is the ability to edit and adjust the digital twin flexibly, precisely, and efficiently for specific red-teaming needs.

Traditional options force a trade-off:

scanned environments look realistic but are hard to modify once generated;
highly editable procedural worlds often lose the real scene’s structural details and semantic logic, causing a gap between tests and reality.

SafeVerse unifies realism and editability.
Attack/defense instructions can directly drive rapid scene changes, enabling a dynamic testing ground for safety validation.

Edits that “land” directly in the scene:
Based on a reconstructed twin, users can modify scene objects across multiple dimensions.
You can adjust interaction properties, change visual appearance, or rearrange spatial relationships—without tedious manual modeling or rewriting code.

Precise injection of attack vectors:
Along the core capability axes of embodied agents—navigation, planning, and interaction—SafeVerse defines corresponding adversarial edits.

For example, it can “attack” interaction affordances (e.g., turning an openable door into a locked one),
silently “tamper” with semantics (e.g., changing object appearance to mislead recognition),
or abruptly “shuffle” layouts (e.g., resetting object positions to disrupt planning).

Each edit becomes a targeted stress test of an agent’s robustness in complex real environments.
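
The three kinds of adversarial edit described above can be sketched as small functions over a scene description. This is an illustrative sketch only: the dictionary layout and the function names `lock_door`, `tamper_appearance`, and `shuffle_layout` are assumptions made for this example, not SafeVerse's interface. Each function returns an edited copy so the original twin stays available as a baseline.

```python
import copy
import random


def lock_door(scene, name):
    """Interaction attack: an openable object becomes locked."""
    edited = copy.deepcopy(scene)
    edited[name]["openable"] = False
    return edited


def tamper_appearance(scene, name, texture):
    """Semantic attack: change how an object looks to mislead recognition."""
    edited = copy.deepcopy(scene)
    edited[name]["texture"] = texture
    return edited


def shuffle_layout(scene, names, seed=0):
    """Planning attack: permute the positions of the named objects."""
    edited = copy.deepcopy(scene)
    positions = [edited[n]["position"] for n in names]
    random.Random(seed).shuffle(positions)  # seeded so a test case is repeatable
    for n, p in zip(names, positions):
        edited[n]["position"] = p
    return edited


# A hypothetical reconstructed twin with three objects.
twin = {
    "door": {"position": (0, 0), "openable": True, "texture": "wood"},
    "sofa": {"position": (3, 1), "texture": "fabric"},
    "table": {"position": (5, 4), "texture": "wood"},
}

locked = lock_door(twin, "door")
disguised = tamper_appearance(twin, "sofa", "stone")
shuffled = shuffle_layout(twin, ["sofa", "table"], seed=7)
```

Seeding the shuffle is a deliberate choice in this sketch: a layout that breaks an agent can then be regenerated exactly when the fix is evaluated.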

This makes SafeVerse more than a fast way to reproduce reality.
It brings scenes to life and generates boundary cases and adversarial environments on demand, which is crucial for agent evolution in realistic, variable conditions.

5 Online Evolution Targeting Vulnerabilities

Most conventional training for embodied agents relies on fixed datasets and static scenes.
Agents trained this way lack the ability to adapt and evolve under continuous adversarial pressure.

When facing novel attacks or abrupt environmental shifts not covered during training, agents often suffer catastrophic performance drops.
This makes them hard to trust in fast-changing, safety-critical real-world scenarios.

SafeVerse introduces a closed-loop co-evolution system: reconstruct → attack → resist.
It enables continuous, autonomous improvement inside highly realistic digital twins.

From static training to dynamic adversarial practice:
SafeVerse moves beyond the “closed-door” nature of offline training by connecting high-fidelity reconstructions to an online adversarial training framework.

Agents no longer face a fixed scene.
They continuously encounter evolving threats generated by an attack policy module—such as changing layouts in real time, adding obstacles, or simulating device failures.

A perpetually changing environment forces agents to learn real-time perception, decision-making, and adaptation.
Over time, they gain stronger generalization to unknown threats.
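
The interplay between an attack policy and an adapting agent can be illustrated with a deliberately simplified toy loop. Everything here is an assumption made for illustration: robustness is a single integer score per attack type, the attacker always targets the weakest score, and "training" is a fixed increment. A real system would replace these with a learned attack policy and online RL updates.

```python
def run_adversarial_loop(skill, rounds, gain=25, threshold=50):
    """Toy co-evolution loop (illustrative only, not SafeVerse's algorithm).

    skill maps an attack type to an integer robustness score (0 to 100).
    Each round the attacker picks the weakest skill; the agent fails if
    that score is below the threshold and then trains on that attack.
    """
    skill = dict(skill)  # do not mutate the caller's dictionary
    log = []
    for _ in range(rounds):
        attack = min(skill, key=skill.get)       # target the weakest point
        survived = skill[attack] >= threshold
        if not survived:
            # "Patch online": improve robustness against this attack.
            skill[attack] = min(100, skill[attack] + gain)
        log.append((attack, survived))
    return skill, log


initial = {"layout_shift": 20, "new_obstacle": 60, "device_failure": 40}
final, history = run_adversarial_loop(initial, rounds=4)
# final   -> {"layout_shift": 70, "new_obstacle": 60, "device_failure": 65}
# history -> three failures that trigger training, then one survived attack
```

Even in this toy form, the loop shows why targeting the weakest point matters: training effort concentrates where the agent currently fails, and stops once every attack type clears the threshold.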

Iterating under pressure:
With online training, when an agent fails under an attack, it can immediately retrain and adjust behavior in the twin—forming a fast loop of
encounter → diagnose → patch online.

For example, if chairs are rearranged to block a necessary passage in a café, the agent may fail to reach its goal at first.
Through online training, it can learn to detect obstacles, replan a route, or actively move the chair to reopen the path.

This improves performance in the specific scene and also promotes more general strategies for handling similar obstacles.
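
The café example can be sketched on a small grid. Note that this is a hand-written planner standing in for the behavior an agent would learn through online training; the grid encoding (`.` free, `#` wall, `C` chair) and the function names are assumptions for this illustration, and the sketch ignores whether the agent can physically reach the chair it moves.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search over 4-connected free cells ('.').

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def plan_with_chair_removal(grid, start, goal):
    """Try a direct route first; if blocked, try moving one chair aside.

    Returns (path, moved_chair), where moved_chair is None if no chair
    had to be moved, and (None, None) if no plan was found.
    """
    path = shortest_path(grid, start, goal)
    if path is not None:
        return path, None
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == "C":
                trial = [list(line) for line in grid]
                trial[r][c] = "."  # pretend the chair was moved away
                path = shortest_path(trial, start, goal)
                if path is not None:
                    return path, (r, c)
    return None, None


# A chair ('C') blocks the only gap in a wall ('#').
cafe = [
    "..#..",
    "..C..",
    "..#..",
]
blocked = shortest_path(cafe, (1, 0), (1, 4))                  # None
path, moved = plan_with_chair_removal(cafe, (1, 0), (1, 4))
# moved == (1, 2); path runs straight through the reopened gap
```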

By closing the loop of scene reconstruction → real-time attacks → online evolution, SafeVerse tackles brittleness under unknown threats and builds a sustainable, adaptive digital-twin training ecosystem.

It provides a key technical path for agents to evolve from “imitators” into “responders”.

Figure 3: After online evolution, the agent successfully completes the task (by actively moving the chair)
