AI Safety

Entry Point: From Simple Voice Assistant to the Phone’s “Second Brain” The Evolution of Mobile Assistants: A few years ago, mobile assistants were merely tools that responded to simple commands like “What’s the weather today?” or “Set an alarm for 7 a.m.” Today, on-device Agents are rapidly evolving into our phone’s “second brain”—a highly privileged core of the personal operating system. They are no longer isolated apps, but “super stewards” that can span across all apps, invoke system functions, manage your files, read your messages, and access your contacts and photo albums. Security Value: This shift calls for models that can intelligently integrate unstructured knowledge—such as design documents, operation manuals, incident analysis reports, and user feedback—to support timely, accurate, and comprehensive fault diagnosis and safety reviews. This integration helps minimize risks caused by missing or misused information. The Everyday Risk of High-Privilege Operations: When we routinely issue commands to our phones—“Send this screenshot to Mr. Li,” “Create a calendar event based on this email and notify all participants,” or “If my wife calls, remind me that today is our anniversary”—we are effectively granting the agent permission to execute a series of high-privilege operations. The key concern is: how do we ensure this “agent” won’t be compromised, won’t misinterpret instructions, and won’t “hallucinate” at critical moments? Security Value: A robust evaluation framework is essential—akin to how we conduct background checks and periodic assessments for individuals in sensitive positions. Mobile agents must undergo comprehensive “security checkups” not only to defend against external threats, but also to proactively verify that their behavior remains reliable, controllable, and compliant when handling our everyday, yet high-stakes, tasks. About the Project Imagine this scenario: You receive a phishing email disguised as an “annual statement.” When you tell your phone Agent, “Help me summarize today’s unread emails,” the Agent is hijacked by a hidden malicious command within this email while processing it. ...

AI Safety

The intelligent agent lurking in your phone

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law