A New Era of AI Agents
On March 31, 2025, AWS introduced Amazon Nova Act, an AI model designed to perform actions within a web browser. Alongside it, AWS released the Nova Act SDK as a research preview at nova.amazon.com. The SDK lets developers experiment with an early version of Nova Act and build AI agents that complete tasks in a web browser, such as submitting an out-of-office request, setting calendar holds, and composing ‘away from office’ emails.
Beyond Traditional AI Agents
The concept of AI agents has evolved significantly. Initially, agents were designed primarily for natural language interactions and retrieval-augmented generation (RAG). However, AWS envisions agents as systems that actively perform complex digital and physical tasks on behalf of users.
While existing agents rely heavily on API integrations, Nova Act extends their reach to scenarios that no API covers. The goal is to create agents that can handle complex, multi-step workflows, such as organising events, managing IT tasks, and other productivity work, without constant human supervision.
Addressing Today’s Challenges in AI Agents
One of the major limitations of current AI agents is their inability to reliably complete multi-step tasks without human intervention. Nova Act aims to solve this problem by letting developers build a workflow out of small, dependable steps. The Nova Act SDK provides reliable atomic commands, such as searching, checking out, and answering questions about what is on screen, and lets developers attach detailed, customisable instructions to each one.
Key Features of the Nova Act SDK:
- Reliable Atomic Commands — Execute actions like selecting dates, interacting with drop-down menus, and handling pop-ups with high accuracy.
- Detailed Command Customisation — Developers can add specific instructions (e.g., “do not accept insurance upsell”) to ensure precise execution.
- API and Playwright Integration — Offers a combination of API calls and direct browser manipulation via Playwright for enhanced reliability.
- Python Code Integration — Allows interleaving Python scripts for testing, breakpoints, assertions, and parallelisation to optimise performance.
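The features above can be illustrated with a minimal sketch of a workflow split into atomic steps, with ordinary Python checks between them. The names `BrowserAgent`, `RecordingAgent`, and `book_rental_car` are hypothetical and are not the SDK's API; the fake agent only records prompts so the example runs without a browser, and a real agent session would take its place.

```python
from typing import List, Protocol


class BrowserAgent(Protocol):
    """Hypothetical stand-in for an agent session; not the SDK's real class."""

    def act(self, prompt: str) -> str:
        ...


class RecordingAgent:
    """Fake agent that records prompts, so the sketch runs without a browser."""

    def __init__(self) -> None:
        self.prompts: List[str] = []

    def act(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return "ok"


def book_rental_car(agent: BrowserAgent, pickup_date: str) -> List[str]:
    """Run a workflow as small atomic steps, checking each result."""
    steps = [
        "search for rental cars at the airport",
        f"select {pickup_date} as the pickup date",
        "close any pop-up that appears",
        # A detailed instruction attached to a single step.
        "proceed to checkout; do not accept insurance upsell",
    ]
    results = []
    for step in steps:
        outcome = agent.act(step)
        # Plain Python between steps: assert, log, or set a breakpoint here.
        assert outcome == "ok", f"step failed: {step!r}"
        results.append(outcome)
    return results


if __name__ == "__main__":
    agent = RecordingAgent()
    book_rental_car(agent, "2025-04-15")
    print(agent.prompts)
```

Keeping each prompt small makes a failure easy to localise: the assertion names the exact step that went wrong instead of the whole workflow failing opaquely.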
Best-in-Class Performance
AWS reports that the Nova Act model achieves best-in-class performance on industry benchmarks when compared with other leading models such as Claude 3.7 Sonnet and OpenAI CUA. Below are the benchmark results:
According to AWS, these results show that Nova Act interacts with web elements accurately and leads the compared models in key areas.
Headless Execution and Automation
Once configured, Nova Act agents can run without human supervision. Developers can:
- Enable headless mode so the agent runs without a visible browser window.
- Expose their agent as an API for integration into a product.
- Schedule automated tasks, such as ordering dinner every Tuesday night.
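As a rough illustration of the scheduling idea, the sketch below computes when a weekly job such as "every Tuesday night" should next run, using only the Python standard library. The function name `next_weekly_run` and the 19:00 default are assumptions made for this example; how the agent itself is launched headlessly at that time is left out.

```python
from datetime import datetime, timedelta


def next_weekly_run(now: datetime, weekday: int = 1, hour: int = 19) -> datetime:
    """Return the next occurrence of `weekday` at `hour`:00 strictly after `now`.

    `weekday` follows datetime.weekday(): Monday is 0, Tuesday is 1.
    """
    # Start from today's date at the target hour.
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Move forward to the target weekday (0 days if today already matches).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment has already passed, wait for next week's slot.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    print(next_weekly_run(datetime.now()))
```

A scheduler such as cron or a long-running worker could sleep until the returned time and then start the agent run.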
Expanding AI Capabilities Beyond APIs
Nova Act also shows signs of generalising to digital environments it was not explicitly trained on. Initial results are promising in scenarios such as navigating web-based games, which suggests the model can adapt beyond typical website interactions.
Real-World Application: Alexa+
Nova Act has already been integrated into Alexa+, where it autonomously navigates the web to complete user tasks when direct API connections are unavailable. This represents a major step toward self-directed AI agents that can function independently within complex digital ecosystems.
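The "use an API when one exists, otherwise browse" pattern described above can be sketched in a few lines. This is an illustrative assumption about the general shape of such routing, not a description of how Alexa+ is implemented; `complete_task`, `api_handlers`, and `browse` are hypothetical names.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def complete_task(
    kind: str,
    request: str,
    api_handlers: Dict[str, Handler],
    browse: Handler,
) -> str:
    """Prefer a direct API integration; fall back to the browser agent."""
    handler = api_handlers.get(kind)
    if handler is not None:
        # A direct API connection exists for this kind of task.
        return handler(request)
    # No API covers this task, so hand it to the browsing agent.
    return browse(request)


if __name__ == "__main__":
    handlers = {"weather": lambda req: f"api:{req}"}
    print(complete_task("weather", "forecast", handlers, lambda req: f"browser:{req}"))
    print(complete_task("repair", "book oven repair", handlers, lambda req: f"browser:{req}"))
```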
The Future of AI Agents
Nova Act marks the first milestone in AWS’s broader mission to develop scalable, intelligent, and reliable AI agents. AWS’s long-term vision extends beyond simple LLM fine-tuning; AWS aims to enhance AI through reinforcement learning across diverse environments, creating agents that can reliably execute complex, multi-step workflows.
Join the Innovation
AWS believes that the most valuable AI use cases have yet to be discovered. With the Nova Act SDK, AWS invites developers and innovators to explore new possibilities through rapid prototyping and iterative feedback.
The Amazon Nova Act SDK is available as a research preview at nova.amazon.com. This is just the beginning of AWS’s work on AI-driven web automation. Stay tuned for more updates as AWS continues to push the boundaries of intelligent automation!