AI agents have taken the world by storm this year, with many presenting them as a major milestone in Artificial Intelligence. But what exactly is new? In this quick article, I aim to summarize my observations and readings at a very high level.

Taking a step back: Large Language Models (LLMs) were accessible to users either as polished chatbots or through API access (for developers and students). These LLMs often made good conversation, provided excellent copywriting most of the time, could help you decipher or draft legal documents, and could help you study a topic and quiz you on it. Some LLMs could write genuinely good code from plain English prompts, and we also had image, video, and music generation.

All of these were impressive developments, which drove concerns and debate about the future of human jobs and creativity, copyright law, and cheating in education. They also drew some annoyance about AI taking over the more creative human outlets, leaving us the boring administrative tasks. While generative AI could “create” or synthesize all this content, it could not really do anything with it: a human still had to post it, send it in an email, or work it into a document.

The next step was the appearance of Large Reasoning Models (LRMs), LLMs that appear to reason. This appearance of reasoning combines special prompting with newly fine-tuned models that can solve problems and make a plan. Although not perfect, a reasoning model can serve as an engine of “doing” rather than only “creating”.

AI agents are an interesting addition to a long tradition of software-based automation. A few years back, Robotic Process Automation (RPA) was all the rage in the enterprise and delivered interesting efficiencies. The challenge with traditional automation programming is that it needs to cater for every eventual outcome: every exception must be coded, error handling must be managed, and application flows end up rigidly designed and implemented.

AI agents, on the other hand, combine the following capabilities:

Thinking and planning (Brain): Agentic AI uses an LLM or LRM as the brain of the agent, giving it the capacity to observe, reason, and plan actions. What’s different with AI agents is that the AI can now actually take actions, and you can define and automate workflows around it.

Tools (Arms and legs): An agentic framework lets the agent work with defined “tools”, the arms and legs that enable action. Think of tools as functions or libraries in the programming sense that the agent can use to achieve the necessary output. These functions are defined by the designer; it is then up to the agent to decide, from the general context, when a tool call is appropriate during its workflow.
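The designer-defined tools above can be sketched in plain Python without any particular framework; the registry, `get_weather`, and `add` below are illustrative assumptions, not a real framework’s API:

```python
# Minimal sketch of designer-defined tools: the designer registers functions,
# and a dispatcher runs whichever tool the model (the "brain") asks for.
TOOLS = {}

def tool(fn):
    """Register a function so the agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(city: str) -> str:
    # In a real agent this would call an external weather API.
    return f"Sunny in {city}"

@tool
def add(a: float, b: float) -> float:
    return a + b

def call_tool(name: str, **kwargs):
    """The LLM emits a tool name and arguments; the framework dispatches it."""
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**kwargs)

print(call_tool("add", a=2, b=3))  # prints 5
```

Real frameworks add schemas and validation around this idea, but the core loop is the same: the model chooses the tool, the designer’s code executes it.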

Context (Memory): Adding memory to the agent lets it maintain the context of the task and the evolving data generated and collected, so it can produce a consistent action at the end of the workflow.
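Putting brain and memory together, an agent is essentially a loop that accumulates context between steps. The sketch below uses a stubbed `fake_llm` in place of a real model call; both the stub and the `Agent` class are illustrative assumptions:

```python
# Illustrative agent loop with simple list-based memory. Each turn is appended
# to `memory`, so later steps see everything observed and generated so far.
def fake_llm(messages):
    # A real implementation would send the full history to a model API.
    last = messages[-1]["content"]
    return f"Acknowledged: {last}"

class Agent:
    def __init__(self):
        self.memory = []  # evolving context: every observation and reply

    def step(self, user_input: str) -> str:
        self.memory.append({"role": "user", "content": user_input})
        reply = fake_llm(self.memory)
        self.memory.append({"role": "assistant", "content": reply})
        return reply

agent = Agent()
agent.step("Find flights to Cairo")
agent.step("Book the cheapest one")
print(len(agent.memory))  # prints 4: two turns, each with input and reply
```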

Why is this interesting?

AI agents will enable process automation at large scale. While process automation is not new, software automation requires proper development skills, API integration, and custom library development.

AI agents, on the other hand, require much less coding skill. Some platforms, like n8n, allow no-code agent building; other frameworks, like LangChain, LlamaIndex, and Smolagents, let you define a workflow in simple Python with all its extensive libraries.

AI agents, with the proper design, can manage error handling and alternative solutions on their own. This makes them more adaptable to changing conditions that are not fully predictable.
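A rough sketch of this kind of built-in resilience, assuming hypothetical `primary_search` and `backup_search` tools (a real agent could also re-plan instead of following a fixed fallback):

```python
# Sketch of agent-style error handling: retry the primary tool, then fall
# back to an alternative instead of crashing the whole workflow.
def primary_search(query: str) -> str:
    raise TimeoutError("service unavailable")  # simulate a failing dependency

def backup_search(query: str) -> str:
    return f"cached results for '{query}'"

def resilient_search(query: str, retries: int = 2) -> str:
    for attempt in range(retries):
        try:
            return primary_search(query)
        except TimeoutError:
            continue  # an agent could also re-plan here instead of retrying
    return backup_search(query)

print(resilient_search("agent frameworks"))  # prints: cached results for 'agent frameworks'
```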

LangChain graph depiction of a simple agent framework (research analyst agent)

Use Cases

Realizing the potential of AI agents requires context on how they can be made useful. Some popular use cases are:

Customer Service Automation: most of us have interacted with online chatbots that attempted, with varying degrees of success, to solve our issues by explaining procedures and linking relevant web pages, but they were often limited to taking our information and summarizing a ticket for a human agent who could take more advanced actions. AI agents can now access multiple internal systems, review documents and retrieve information from them to support their answers, and take actions like issuing refunds or scheduling service appointments without involving a real person.

Supply Chain Management: agents can monitor inventory levels, predict demand, automatically reorder products, negotiate, and optimize logistics routes.

Cybersecurity Response: Autonomous security agents detect threats, isolate compromised systems, patch vulnerabilities, and respond to incidents faster than human teams could manage.

Smart Home Orchestration: AI agents coordinate multiple IoT devices, learning user preferences and automatically adjusting lighting, temperature, security, and entertainment systems throughout the day.

Healthcare Coordination: AI agents schedule appointments, manage prescriptions, coordinate between different healthcare providers, monitor health metrics, and provide personalized health recommendations.

Scientific Research Assistance: Agents can design experiments, analyze data, search literature, generate hypotheses, and even write portions of research papers while maintaining scientific rigor.

Limitations

While AI agents have a lot of potential, like any technology it’s important to understand their limitations in order to use them properly.

Reasoning constraints:

  • Hallucinations: AI agents should hallucinate less than plain LLMs if they are designed and used correctly. This is due to tool use and Retrieval-Augmented Generation (RAG), which let the agent access and ingest external sources of information beyond what the LRM/LLM was trained and tuned on. However, there is always the possibility that an agent that does not know the answer will simply fall back on the LLM predicting the most likely tokens and generate complete garbage.
  • Context Window Limitations: most AI agents have limited context spans, which means they can lose track of important context in long-running tasks or complex multi-step processes. This effect can be reduced by token-management techniques and internal message summaries or truncation, but that requires a more involved design.
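The truncation technique mentioned above can be sketched as keeping only the most recent messages that fit a token budget; the 4-characters-per-token estimate is a common rough approximation, not an exact tokenizer:

```python
# Rough sketch of keeping a message history under a token budget by dropping
# the oldest turns first. A production design would summarize instead of drop.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude approximation of a tokenizer

def truncate_history(messages, budget=1000):
    """Keep the most recent messages whose combined size fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = [{"role": "user", "content": "x" * 3000},
           {"role": "assistant", "content": "short reply"}]
print(len(truncate_history(history, budget=100)))  # prints 1: only the short message fits
```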

Safety and control issues:

  • Alignment Problems: Ensuring agents pursue intended goals rather than optimizing for unintended consequences is challenging. An agent tasked with “maximize user engagement” might employ manipulative tactics rather than providing genuine value.
  • Lack of True Understanding: Agents operate through pattern matching rather than genuine comprehension, which can lead to inappropriate actions when they misinterpret context or nuance.
  • Unpredictable Emergent Behaviors: As agents become more sophisticated, they may exhibit unexpected behaviors that weren’t anticipated during development, making it difficult to predict all possible outcomes.
  • Bias and Ethics: LLMs are trained on human-generated data, which carries some form of bias and discrimination. Data that is not cleaned of these biases leads the LLM to reproduce the same behaviour. These issues can be mitigated by setting up guardrail structures that validate the LLM’s output against clear guidance before it is used further.
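A guardrail structure of the kind described above can be as simple as a validation step between the model and any downstream action; the banned-patterns policy and `safe_respond` below are illustrative assumptions, not a complete policy:

```python
# Illustrative guardrail: check the model's output against simple rules
# before acting on it or passing it along.
import re

BANNED_PATTERNS = [r"\bpassword\b", r"\bssn\b"]  # example policy, not exhaustive

def violates_policy(text: str) -> bool:
    return any(re.search(p, text, re.IGNORECASE) for p in BANNED_PATTERNS)

def safe_respond(model_output: str) -> str:
    """Only pass the output onward if it clears the guardrail."""
    if violates_policy(model_output):
        return "[blocked: output failed policy check]"
    return model_output

print(safe_respond("Your SSN is 123-45-6789"))   # prints the blocked message
print(safe_respond("The meeting is at 3pm"))     # passes through unchanged
```

Real guardrail systems layer on classifiers and structured-output validation, but the pattern is the same: validate, then act.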

Total Cost of Operation

  • Resource Intensity: Running sophisticated AI agents requires significant computational resources, making them expensive to deploy at scale and potentially limiting their responsiveness.
  • Integration Complexity: Real-world deployment requires agents to interact with numerous existing systems, APIs, and databases, each with their own quirks and failure modes.
  • Error Propagation: When agents make mistakes early in a multi-step process, these errors can compound, leading to increasingly poor outcomes without human intervention.

Conclusion

I think there is a lot of value in agentic AI, which can be witnessed in the market today. There is also a lot of hype, as usual, but that should not take away from the huge potential enabled by agentic frameworks. I am still of the opinion that we will continue to need humans who know how to code, design AI tools, and utilize them properly and safely, and it’s essential to bridge any learning gaps on these aspects as soon as possible.

Learning Resources

Building Effective Agents – Anthropic White Paper

A Practical Guide To Building Agents – OpenAI White Paper

Hugging Face Agents Course

Introduction to LangGraph Course

