OpenAI launched Operator, an AI agent that can go to the web and perform tasks such as booking vacations, shopping for groceries and making restaurant reservations.
Operator is in research preview. Its ability to interact with a webpage using a mouse and keyboard follows a similar effort by Anthropic.
In a blog post and YouTube video, OpenAI walked through a few demonstrations. The general idea is that Operator can handle repetitive browser tasks using the same interface humans use. OpenAI is aiming to popularize agentic AI with consumers, but the real money will be in business use cases. "Our plan is to expand to Plus, Team, and Enterprise users and integrate these capabilities into ChatGPT in the future," the company said.
For now, Operator is available to ChatGPT Pro users.
Operator leverages GPT-4o's vision and reasoning features and a new model called Computer-Using Agent, or CUA, which interacts with graphical interfaces.
The key item here is that Operator takes action on the web without APIs or custom integrations. If Operator gets stuck, it can self-correct or hand control back to the user.
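To make the "self-correct, then hand back control" behavior concrete, here is a minimal Python sketch of an observe-act loop. It is an illustration only and not OpenAI's implementation: the names `Action`, `run_agent`, `propose`, `execute` and `observe`, along with the retry and step limits, are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """A hypothetical GUI action, e.g. a click or a keystroke sequence."""
    kind: str          # "click", "type" or "done" (illustrative values)
    target: str = ""   # what the action applies to


def run_agent(
    propose: Callable[[str, List[str]], Optional[Action]],
    execute: Callable[[Action], bool],
    observe: Callable[[], str],
    max_retries: int = 2,
    max_steps: int = 20,
) -> str:
    """Loop: look at the screen, pick an action, act, retry on failure.

    Returns "completed" when the model signals it is done, and
    "handed_back" when control should return to the user.
    """
    history: List[str] = []
    consecutive_failures = 0

    for _ in range(max_steps):
        screen = observe()                 # e.g. a description of a screenshot
        action = propose(screen, history)  # stand-in for the model's decision

        if action is None:                 # the model cannot decide: give up
            return "handed_back"
        if action.kind == "done":
            return "completed"

        if execute(action):
            history.append(f"ok:{action.kind}:{action.target}")
            consecutive_failures = 0
        else:
            # Record the failure so the next proposal can take it into account.
            history.append(f"failed:{action.kind}:{action.target}")
            consecutive_failures += 1
            if consecutive_failures > max_retries:
                return "handed_back"

    return "handed_back"                   # step budget exhausted
```

In this sketch the failure history is passed back to `propose`, which is one simple way a model could try a different approach before the loop gives up and returns control to the user.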
Initially, Operator works with day-to-day services such as DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack and Uber. Over time, rest assured that OpenAI will expand to enterprise use cases.
OpenAI said it will provide an API for CUA for developers, enable Operator to handle more complex tasks and expand to enterprises.
On the safety front, Operator asks the user to take over when it needs more information or credentials, asks the user to confirm actions, declines sensitive tasks and requires supervision to avoid mistakes.
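Those safeguards can be pictured as a gate that runs before each step. The Python sketch below is a rough illustration under stated assumptions, not OpenAI's actual policy: the `Decision` values, the `gate` function and the example task and action names are hypothetical placeholders.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"       # the agent may act on its own
    CONFIRM = "confirm"       # ask the user to approve the action first
    TAKE_OVER = "take_over"   # hand control to the user
    DECLINE = "decline"       # refuse the task outright


# Hypothetical placeholder categories, chosen only for illustration.
SENSITIVE_TASKS = {"bank_transfer"}
CONFIRM_ACTIONS = {"place_order", "send_email"}


def gate(task: str, action: str, needs_credentials: bool) -> Decision:
    """Decide how to handle the next step, checking the strictest rule first."""
    if task in SENSITIVE_TASKS:
        return Decision.DECLINE
    if needs_credentials:
        return Decision.TAKE_OVER
    if action in CONFIRM_ACTIONS:
        return Decision.CONFIRM
    return Decision.PROCEED
```

For example, `gate("grocery_order", "place_order", needs_credentials=False)` returns `Decision.CONFIRM`, while the same call with `needs_credentials=True` returns `Decision.TAKE_OVER`.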