From Rigid Automation to Physical AI: The 4 Levels of AI-Powered Robotics You Need to Know
- Stephan Hotz

- Aug 12
- 8 min read
Updated: Sep 2

A Pragmatic Guide to AI in Industrial Automation
Artificial Intelligence (AI) is transforming robotics and industrial automation, pushing the industry from rigid, rule-based systems toward flexible, context-aware autonomy.
However, with AI evolving at breakneck speed, it’s easy to get lost in hype.
This position paper introduces a pragmatic 4-level framework that helps assess the current state of AI in robotics. Each level is examined from two key angles:
Robot View: what robots can actually do at that level
Developer View: how developers build and interact with robots
The vast majority of deployed automation today still operates at Level 0 or Level 1.
Level 2, where AI augments development, is gaining traction. Levels 3 and 4, involving learned behaviors and agentic autonomy, are beginning to emerge but require a fundamentally new kind of platform.
Wandelbots NOVA is built as that platform. The Wandelbots NOVA Operating System (OS) and NOVA Cloud provide the infrastructure to unify development, enable large-scale data collection, and deploy modern AI workflows across real robot hardware, bridging today’s industrial reality with tomorrow’s intelligent automation.
Industrial Automation Is Entering Its AI Era
AI is fundamentally transforming the landscape of robotics and industrial automation.
The speed, flexibility, and intelligence that AI systems bring are redefining what machines can do. They are unlocking levels of productivity, adaptability, and operational scale that were previously unimaginable. AI is no longer a futuristic vision; it's a driving force behind today's most advanced manufacturing and robotic systems.
Yet with opportunity comes complexity.
The rapid evolution of AI has introduced an incredible abundance of tools, models, approaches, and directions. For professionals and decision-makers, it's more important than ever to clearly understand the current state of AI within the robotics and automation industries. A grounded perspective helps navigate this landscape with focus and intent, prioritizing initiatives that deliver real value.
In this environment of hype and high expectations, distinguishing between buzzwords and genuine technological breakthroughs is essential. This article offers a structured framework to help you assess where AI is making tangible progress in robotics, where the greatest potential lies, and where to focus for practical impact.
Analyzing the Status Quo
To navigate the rapidly evolving field of robotics and AI, this framework organizes the journey into levels that clarify both current capabilities and future potential. Each level is described from two perspectives: the robot view (what tasks robots can reliably perform) and the developer view (how engineers and programmers interact with robotic systems). By distinguishing between levels of maturity, you can better assess both the promise and the current limits of AI in automation.

Level 0: Heuristics & Optimizers (Not AI)
Definition: Automation Intelligence based on hardcoded rules, parameter optimizations, or simple control loops with no learning involved.
Robot View: Executes fixed tasks and programs with high precision, but no flexibility. Typical examples include welding, gluing, or palletizing routines in tightly controlled environments.
Developer View: Programs the robot via proprietary languages or logic blocks (e.g., RAPID, KRL, PLCs). All behavior must be explicitly defined and tuned.
Example: A KUKA robot spot-welds a car frame on a fixed jig.
Wandelbots NOVA Context: NOVA enables modern programming approaches even for traditional rule-based tasks by abstracting vendor-specific code into Python and providing a cell operating system for deployment and convenience features. The intelligence level remains fixed unless combined with higher-level tools. Bringing current robot cells onto a unified platform and into a single language, while opening proprietary controller systems for data collection, forms a crucial building block for Physical AI.
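To make the contrast concrete, here is a minimal sketch of what a Level 0 routine looks like once vendor-specific code is abstracted into plain Python. The `Pose` and `Robot` classes below are hypothetical stand-ins, not the actual NOVA SDK; the point is that every behavior is still explicitly coded and nothing is learned.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    """Cartesian pose: position in mm, orientation in degrees (hypothetical)."""
    x: float
    y: float
    z: float
    rx: float = 0.0
    ry: float = 0.0
    rz: float = 180.0

class Robot:
    """Hypothetical vendor-neutral robot interface (not the NOVA SDK)."""
    def move_to(self, pose: Pose) -> None:
        print(f"MOVE -> {pose}")

    def grip(self, closed: bool) -> None:
        print("GRIP" if closed else "RELEASE")

def palletize(robot: Robot, pick: Pose, rows: int, cols: int, pitch: float) -> None:
    """Level 0: the whole behavior is hardcoded; nothing is learned."""
    for r in range(rows):
        for c in range(cols):
            robot.move_to(pick)
            robot.grip(closed=True)
            # Place positions are computed from a fixed grid, not perceived.
            robot.move_to(Pose(x=c * pitch, y=r * pitch + 500.0, z=150.0))
            robot.grip(closed=False)

palletize(Robot(), pick=Pose(400.0, 0.0, 150.0), rows=2, cols=3, pitch=120.0)
```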
Level 1: AI-Powered Perception (Task-specific, highly specialized AI)
Definition: Task-specific AI models, mostly in vision, enabling robots to recognize objects or features. No reasoning or learning beyond initial training.
Robot View: Gains perception abilities (e.g., see parts, detect defects) to improve flexibility in semi-structured environments.
Developer View: Integrates pre-trained vision models or tools (e.g., Cognex, Zivid) into robot workflows for pick-and-place, inspection, or sanding jobs.
Example: A Yaskawa robot uses a camera to scan a workpiece, creates a 3D point cloud from the scan, and uses an algorithmic path planner to define and execute a robot path.
Wandelbots NOVA Context: Vision-based capabilities can be integrated via NOVA’s Python SDK, where object detection models trigger specific robot tasks based on detected coordinates or feed advanced path planners.
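As a rough illustration of the Level 1 pattern, the sketch below shows a task-specific perception step feeding coordinates to a pick task. Both the toy detector and the hand-eye calibration matrix are hypothetical placeholders; a production cell would use a calibrated vision system (e.g., Cognex or Zivid) instead.

```python
import numpy as np

def detect_part(image: np.ndarray) -> tuple[float, float] | None:
    """Toy stand-in for a task-specific detector: returns the centroid
    (in pixels) of bright blobs, or None if nothing is found."""
    ys, xs = np.nonzero(image > 0.5)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

# Assumed 2D hand-eye calibration (pixel -> robot base frame, mm).
# In practice this homography comes from a calibration routine.
H = np.array([[0.5, 0.0, 100.0],
              [0.0, 0.5, -50.0],
              [0.0, 0.0,   1.0]])

def pixel_to_robot(u: float, v: float) -> tuple[float, float]:
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

image = np.zeros((480, 640))
image[200:220, 300:330] = 1.0  # fake part blob for the demo
hit = detect_part(image)
if hit is not None:
    x, y = pixel_to_robot(*hit)
    print(f"pick target in robot frame: x={x:.1f} mm, y={y:.1f} mm")
```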
Level 2: AI-Augmented Programming (AI supporting the Developer)
Definition: AI supports the development process via code suggestions, natural language prompts, vibe coding or adaptive trajectory generation. This becomes even more powerful when the robot setup is equipped with vision systems and sensors, and programming can be further abstracted away.
Robot View: Still relies on programming but with greater flexibility in execution (e.g., path planning, interpolation).
Developer View: Uses tools like GitHub Copilot, LLMs, or custom SDKs (e.g., the NOVA Python SDK) to accelerate development through code generation, or relies on vibe coding entirely (natural language prompts that generate executable skills).
Example: A developer fine-tunes a pick-and-place application using AI-generated Python code in Microsoft Visual Studio Code (VS Code), with real-time simulation in NVIDIA Omniverse or Rerun.
Wandelbots NOVA Context: Developers use the NOVA Developer Tools, specifically the VS Code extension, to bring AI-generated code into their workflow. GitHub Copilot-style AI support helps write and modify NOVA programs, reducing ramp-up time for non-experts. Because robot programming in Wandelbots NOVA is based on Python, and the convenience SDK ships as a standard package for developers, it is ready-made to connect to state-of-the-art AI models developed in Python.
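A minimal sketch of the Level 2 loop, assuming the OpenAI Python SDK as the code-generating model (the NOVA VS Code extension and Copilot integrate differently): the developer describes a task in natural language, receives a draft skill, and reviews and simulates it before deployment. The `robot` interface named in the prompt is hypothetical.

```python
from openai import OpenAI  # assumes the OpenAI Python SDK and an API key

client = OpenAI()

SYSTEM_PROMPT = (
    "You write short Python functions against a hypothetical `robot` object "
    "exposing move_to(pose) and grip(closed). Return only code."
)

def generate_skill(task: str) -> str:
    """Turn a natural-language prompt into a *draft* robot skill.

    At Level 2 the developer still reviews the draft and validates it
    in simulation before it ever touches hardware."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

draft = generate_skill(
    "Pick a part at (400, 0, 150) mm and place it at (0, 400, 150) mm."
)
print(draft)  # review, simulate, then deploy
```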
Level 3.1: Learned Motion Intelligence (AI-driven systems that adapt quickly)
Definition: Robots execute trajectories produced by a learned policy that is trained via large-scale vision-language-action (VLA) imitation or reinforcement learning (RL). This policy masters low-level motion in dynamic scenes and generalizes across part and pose variants.
Robot View: Executes complex trajectories that emerge from unsupervised (or weakly supervised) training - often first in simulation - rather than being programmed manually.
Developer View: Instead of writing motion code, engineers curate demonstrations, simulation scenarios, and reward tweaks. Training runs on simulators such as NVIDIA Omniverse/Isaac Sim. The policy is then deployed through a runtime, such as Wandelbots NOVA.
Example: A robot arm is pre-trained as a VLA model on millions of vision-language-action pairs, then lightly fine-tuned in simulation to pick up a flexible gasket. The final policy transfers to the real cell with minimal adjustments.
Wandelbots NOVA Context: NOVA's low-level motion abstraction and Omniverse connectivity allow developers to fine-tune or validate VLA/RL policies on different types of hardware before putting them into production. Additionally, NOVA Cloud acts as the data collector for all robot cells and systems running on NOVA OS, thereby providing the data backbone that makes reinforcement learning possible across shop floors.
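Conceptually, deploying a Level 3.1 policy reduces to a simple closed loop: observe, infer, act. The sketch below is a hypothetical skeleton; the `policy` stub stands in for a trained VLA or diffusion model, and the I/O functions stand in for the runtime's camera and controller interfaces.

```python
import numpy as np

def get_observation() -> dict[str, np.ndarray]:
    """Stand-in for the runtime's sensor interface (camera + encoders)."""
    return {"image": np.zeros((96, 96, 3)), "joints": np.zeros(6)}

def policy(obs: dict[str, np.ndarray]) -> np.ndarray:
    """Stand-in for a trained VLA/diffusion policy: maps an observation
    to a small joint-velocity command (rad/s)."""
    return 0.01 * np.ones(6)  # placeholder for model inference

def send_joint_velocities(dq: np.ndarray) -> None:
    """Stand-in for the controller interface of the execution runtime."""
    print("command:", np.round(dq, 3))

# The control loop itself is trivial; the intelligence lives in the policy.
for step in range(5):  # a real loop runs at a fixed rate until task success
    action = policy(get_observation())
    send_joint_velocities(action)
```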
Level 3.2: Context-Aware Agentic AI (Learned Planning Intelligence)
Definition: Systems that understand the environment and intent, decompose a high-level command into a sequence of skills, and autonomously execute those skills, providing full planning intelligence.
Robot View: A single language command ("bolt this bracket") triggers perception, task decomposition, and dynamic chaining of skills, which is indispensable for highly dynamic production settings and general-purpose or humanoid robots.
Developer View: The focus shifts toward integrating a modular library of action and perception skills, system integration, and providing multimodal data (e.g., 2D/3D vision, sensor data and technical asset inputs), as well as effective prompting.
Example: A humanoid robot interprets the command "bolt this bracket" via its VLA planner and then selects verified skills: detect bracket → grasp → align → bolt-tighten.
Wandelbots NOVA Context: NOVA is positioned to serve as the execution layer in a broader agentic stack. Its API enables the seamless execution of skills within multi-step processes provided by a planning model running in the cloud or on-premises infrastructure.
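The sketch below illustrates the agentic pattern in miniature: a planner (here a trivial stand-in for a VLA/LLM planner) decomposes a command into a sequence of verified skills drawn from a registry, and an executor chains them. All names and behaviors are hypothetical; real skills would call the execution layer's API instead of printing.

```python
from typing import Callable

# Hypothetical registry of verified skills the planner may chain.
SKILLS: dict[str, Callable[[], bool]] = {
    "detect_bracket": lambda: print("detect bracket") or True,
    "grasp":          lambda: print("grasp") or True,
    "align":          lambda: print("align") or True,
    "bolt_tighten":   lambda: print("tighten bolt") or True,
}

def plan(command: str) -> list[str]:
    """Trivial stand-in for the VLA/LLM planner that decomposes a
    high-level command into a skill sequence."""
    if "bolt" in command and "bracket" in command:
        return ["detect_bracket", "grasp", "align", "bolt_tighten"]
    raise ValueError(f"no plan for: {command!r}")

def execute(command: str) -> None:
    for name in plan(command):
        if not SKILLS[name]():  # a real agent would replan on failure
            raise RuntimeError(f"skill failed: {name}")

execute("bolt this bracket")
```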
Level 4: Super-intelligent AI (the longer-term future)
Definition: A theoretical AI that surpasses human intelligence across all dimensions, including creativity, strategy, and physical coordination.
Robotics Context: An independent AI managing robot fleets, infrastructure, and even governance across an entire factory.
Example: Popularized in science fiction (e.g., "Her," HAL 9000, or "Ex Machina"), but not a reality today... yet.
Putting Level 3 into practice with Wandelbots NOVA
Overview
To assess NOVA’s capabilities as a robust data collection and execution layer for modern robot learning workflows, a closed-loop setup was created within Omniverse Isaac Sim. In this environment, the well-known Push-T task was recreated using a UR robot model from the NOVA asset pack.
The objective: guide the robot to push a T-shaped object across a table into a designated target area. Using NOVA’s Omniservice, the virtual scene was orchestrated, simulated cameras were connected, and NOVA's API was leveraged to control the robot and collect proprioceptive data (joint positions, end-effector pose) during human demonstration. Data was collected over several dozen episodes using simple keyboard inputs to control the robot while monitoring the real-time simulation in Isaac Sim. A Diffusion Policy was then trained from scratch, empowering the robot to successfully push the object to its goal based solely on camera inputs and the robot's proprioceptive data.
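In pseudocode terms, the data-collection loop looked roughly like the sketch below. All functions are hypothetical stand-ins: in the actual setup, frames came from simulated Isaac Sim cameras, robot state was read through NOVA's API, and the episodes were stored with LeRobot's dataset utilities.

```python
import numpy as np

rng = np.random.default_rng(0)

def read_keyboard_delta() -> np.ndarray:
    """Stand-in for the keyboard teleop input (a small 2D push command)."""
    return rng.uniform(-1.0, 1.0, size=2)

def get_camera_frame() -> np.ndarray:
    """Stand-in for a simulated camera frame from Isaac Sim."""
    return np.zeros((96, 96, 3), dtype=np.uint8)

def get_robot_state() -> np.ndarray:
    """Stand-in for proprioception (joint positions) read via the API."""
    return np.zeros(6)

def record_episode(num_steps: int) -> list[dict]:
    """One human demonstration: pair each observation with the action taken."""
    episode = []
    for _ in range(num_steps):
        action = read_keyboard_delta()     # human teleop command
        episode.append({
            "image": get_camera_frame(),   # camera observation
            "state": get_robot_state(),    # proprioceptive state
            "action": action,              # supervision for the policy
        })
    return episode

# A few dozen episodes like this formed the Diffusion Policy training set.
dataset = [record_episode(200) for _ in range(50)]
print(f"{len(dataset)} episodes, {len(dataset[0])} steps each")
```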
Key Takeaways
NOVA was invaluable as the "glue" between the simulation environment and robot control. Its asset library sped up cell setup, the Omniverse Connector handled environment randomization and data capture, and NOVA OS translated policy outputs into robot motion. LeRobot’s dataset and training utilities handled data aggregation and model training. Despite the LeRobot framework's convenience and encouraging community backing, its lack of support for industrial robots highlighted NOVA’s unique strength as the execution backbone for real-world use cases.
Implications
One clear pain point emerged: synchronizing camera feeds, proprioceptive signals, and sensor data quickly enough when the GPUs hosting the policy and the robot controllers run on different machines. Looking ahead, streamlined cloud pipelines for automated data collection, preprocessing, and monitoring will be essential as multimodal sensing becomes more prevalent on the shop floor. While NOVA simplifies setup, runtime management, and robot control in the simulation gym, future work will focus on fully automating data pipelines and providing turnkey inference tools that connect pretrained models directly to industrial robots in simulation and production environments, enabling fine-grained performance monitoring in one place. Thus, NOVA is already well placed today to help developers step into Level 3 with their robotics and automation projects. It is also uniquely placed to help developers and companies overcome the widely recognized lack-of-data problem that stands in the way of broader adoption of physical AI applications in actual industrial setups.

Conclusion
Most productive robot automation in industry today still operates at Level 0 and Level 1. These systems are mature, reliable, and deeply integrated into manufacturing processes, but they are fundamentally constrained by rigid logic and highly specialized AI.
Momentum is building around Level 2, where AI-augmented tools are beginning to enhance how developers create robot applications. Programming is becoming more accessible, workflows more flexible, and integrations with AI tools more common. However, to fully unlock the next generation of automation (Levels 2 and 3), a new type of platform is needed.
Traditional robot software stacks were not designed to support modern AI workflows. Moving beyond today’s limitations requires:
A common abstraction layer to unify programming across robot brands and interfaces
Open APIs and data access for capturing and leveraging the full context of robot execution
Support for feedback, data collection, and iteration loops
Seamless integration of simulation, training, and deployment environments
Infrastructure for deployment and skill execution at runtime
Wandelbots NOVA is built for exactly this shift. It serves as the operating system and platform layer that connects AI-powered development with real-world industrial deployment. By enabling Python-based programming, structured data collection, and runtime policy execution across heterogeneous robots, NOVA is paving the way for developers and companies to build and deploy intelligent robotics today.
As AI continues to evolve, Wandelbots NOVA provides the foundation to move from automation to true autonomy.


