As AI develops, its true potential will be realized when it can independently handle most of our digital tasks. Satyen K. Bordoloi explores the latest advances in AI to shed light on the rise of personal AI assistants.
In the Marvel film series, Iron Man’s digital assistant, Jarvis, has complete control over Tony Stark’s Iron Man suit and digital life, completing tasks on voice command. Although fictional, it is a benchmark for AI capabilities: Google has named its latest AI technology, designed to control browsers and perform tasks such as research and shopping, Project Jarvis.
The advances in AI have been astounding, but we are only at the beginning of its full potential. AI will truly fulfill its promise when, like Jarvis, it autonomously handles most of our digital tasks, leaving us simply to monitor them. In my previous articles I have called these ‘PAI’, Personal AI Assistants, but AI developers call them Autonomous Agents or simply ‘Agents’. Although Jarvis first appeared in a 2008 movie, Project Jarvis and other efforts by major AI companies and some startups to create PAI started gaining momentum in 2024.
What is an autonomous agent: An autonomous agent, or simply agent, is an AI program designed to pursue goals, even complex ones, in digital environments with minimal user supervision. These agents make decisions based on natural language instructions from users, such as voice commands, and carry them out using tools that are often controlled by large language models. Unlike passive gadgets you have to act on, AI agents can act on their own.
Whether it is planning your day, making your presentation, arranging meetings, managing your finances or researching vacation spots, agents can connect to other applications, including AI apps, and perform multi-step decision-making just as a human personal assistant would.
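For readers who like to see the idea in code, the loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any product mentioned in this article works: the tools `check_calendar` and `book_meeting`, the `stub_planner` function and the `run_agent` loop are all hypothetical names, and the planner is a hard-coded stand-in for the large language model that a real agent would consult at each step.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tools. A real agent would wrap calendar, email or browser APIs here.
def check_calendar(day: str) -> str:
    return f"{day}: free after 15:00"

def book_meeting(slot: str) -> str:
    return f"meeting booked at {slot}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "check_calendar": check_calendar,
    "book_meeting": book_meeting,
}

Action = Tuple[str, str]   # (tool name, argument)
Step = Tuple[str, str]     # (tool name, result)

def stub_planner(goal: str, history: List[Step]) -> Optional[Action]:
    """Stand-in for a language model (an assumption of this sketch).

    It picks the next tool from what has already been done and
    returns None once it considers the goal complete.
    """
    if not history:
        return ("check_calendar", "Friday")
    if len(history) == 1:
        return ("book_meeting", "Friday 15:00")
    return None

def run_agent(goal: str,
              planner: Callable[[str, List[Step]], Optional[Action]] = stub_planner,
              max_steps: int = 5) -> List[Step]:
    """Decide, act, observe, repeat, with a step limit as a basic safeguard."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = planner(goal, history)
        if action is None:          # planner says the goal is done
            break
        tool_name, argument = action
        result = TOOLS[tool_name](argument)
        history.append((tool_name, result))
    return history

if __name__ == "__main__":
    for tool_name, result in run_agent("Arrange a meeting on Friday"):
        print(tool_name, "->", result)
```

The user states the goal once; the loop then chooses and runs each step on its own, which is the multi-step decision-making that separates an agent from a passive tool. The `max_steps` cap is one simple way to keep such a loop under supervision.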
Google’s AI agents: Google has made an extra effort to lean into AI agent development. In addition to Project Jarvis, their major advances have been via Vertex AI, a unified platform that provides a comprehensive set of tools and Application Programming Interfaces (APIs) for building, training, and deploying machine learning models. The aim is to enable developers to create sophisticated AI agents capable of performing tasks such as natural language processing, computer vision and predictive analytics.
Google is working hard to figure out how to train agents to make complex decisions in dynamic environments. We can expect these to be integrated into Google Assistant and Gemini later.
Anthropic’s innovative AI: Founded by former OpenAI employees, Anthropic has introduced an AI model designed to push the boundaries of traditional automation. This system can control a PC and offers a personalized user experience by intuitively understanding and executing commands in various applications. It automates tasks such as managing emails, scheduling and generating reports. Its adaptability, learned from user interactions, makes it a useful tool for personal and professional use.
Microsoft’s Windows Agent Arena: Microsoft is aiming to revive its 1990s glory days and keep its finger on the pulse of the tech world with its Windows Agent Arena, an AI assistant that integrates deeply with Windows PCs to perform users’ tasks for them. From scheduling meetings to drafting documents, it is designed to work seamlessly with Microsoft’s software suite so that users don’t get bogged down in mundane tasks.
Microsoft hopes to let this agent assistant handle everything from simple reminders to complex project management.
OpenAI’s AI agents: At the company’s DevDay 2024 event, OpenAI CEO Sam Altman said AI agents would be integrated into daily life next year. The company hopes to be one of the main winners from this transition, and its recent API updates, such as the Realtime API for speech-to-speech interaction and vision fine-tuning for image recognition, are expected to drive this integration.
Model distillation and prompt caching are also among the innovations meant to increase efficiency. These advances, along with substantial funding, position OpenAI to significantly improve the development and deployment of AI agents.
Agents from startups: Other AI agents from startups, such as Relay and Induced AI, automate repetitive tasks and integrate with various software. Automate can control browsers for tasks such as web scraping and data extraction. Salesforce’s AI agents are designed to integrate with business tools, automate customer support, forecast sales, and help create marketing campaigns. They aim to be valuable tools for managing customer relationships, providing businesses with actionable insights and automating processes to drive efficiency and growth.
Advantages and challenges: The benefits of AI agents are enormous. They can increase the efficiency and productivity of businesses and give you the convenience of a digital assistant that anticipates your needs and completes tasks, from the mundane to the complex. However, there are significant concerns. The biggest one is data privacy. With AI agents gaining access to sensitive information about individuals and companies, the risk of data compromise is high.
Ensuring the safety of these agents is critical. Another issue is that over-reliance on AI agents can erode our critical thinking and problem-solving skills.
Future Prospects and Monetization: The growth of PAIs or agents is driven by a simple fact: this is the easiest way to monetize the billions invested in AI research and development. Even before the advent of generative AI, I had predicted that the PAI or agent industry, which did not exist when I said it, would become a trillion-dollar market, driven by its ability to enhance human capabilities and free up our time for more creative and important tasks.
It seems that 2025 will be the beginning of the journey when each of us, as in the movie Her, will carry a digital assistant in our pockets. And maybe by 2030 it could be a trillion-dollar industry overall.
The greatest strength of these AI agents lies in their ability to enhance human abilities and provide more free time. However, I recommend that the AI community consider moving beyond the term “Agents”. It perpetuates the negative myths surrounding AI, reminiscent of the evil AI programs in the Matrix series. With Agent Anthropic, Agent OpenAI, and Agent Microsoft already in existence, it’s only a matter of time before we’re inundated with a meme-fest pouncing on the potential malevolence of an “Agent Smith.”
Hopefully, the benefits of AI agents will continue to outweigh the risks so much that, like PG Wodehouse’s butler Jeeves becoming Jarvis, Agent Smith will become a goofy Agent Psmith.