
AI 'Agents' Are Trying to Make Life a Little Easier and a Lot Weirder

Tech bros are always chasing the next big thing, so much so that some developers are already trying to imply that chatbots like ChatGPT are old hat. The real big AI innovation, they say, is language model-powered AI “agents” able to carry out multiple tasks in a row.

Compared to the “prompt, response” model of current chatbots, these agents like Auto-GPT are potentially capable of writing whole reams of code, building websites—or in one surprising case—making a call to a physical pizza place and placing an order.

These agents are essentially self-contained systems that use modern generative AI models to automate tasks. Most agents use OpenAI’s ChatGPT and GPT-4 as a base, but several other homespun agents also take in generative AI image and voice models to create some surprising, if sometimes creepy, results. These systems feed the AI’s outputs back into themselves, creating a program that can run semi-autonomously toward an overarching goal.
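To make that feedback loop concrete, here is a minimal sketch in Python. It is not how any particular agent is built: `fake_model` is a hypothetical stand-in for a real language model call, and the "DONE" stop signal and step cap are assumptions added for illustration.

```python
from typing import Callable, List


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a language model; no real API is called."""
    # Count the earlier steps already present in the prompt, then emit the next one.
    step = prompt.count("Step") + 1
    return f"Step {step}: work toward the goal" if step <= 3 else "DONE"


def run_agent(goal: str, model: Callable[[str], str], max_steps: int = 10) -> List[str]:
    """Feed every model output back into the next prompt until the model says DONE."""
    history: List[str] = []
    for _ in range(max_steps):  # hard cap so the loop cannot run forever
        prompt = f"Goal: {goal}\n" + "\n".join(history)
        output = model(prompt)
        if output == "DONE":
            break
        history.append(output)
    return history


steps = run_agent("upgrade my PC for under $1,000", fake_model)
for line in steps:
    print(line)
```

The step cap matters: without it, a model that never signals it is finished would keep prompting itself indefinitely.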

Say I wanted an AI agent to create a plan to upgrade my PC on a limited budget. In several agent models, I can set it on tasks like “find and rank the most-current different graphics cards based on price for under $500” and then do the same with a CPU, RAM, and more. Then I can list a task like “Use those lists and determine the best PC one can build for under $1,000.” Depending on the model, it could give me a good idea of where to find my next upgrade. It could also lock up and tell me it doesn’t know how to complete the task.

Compared to your regular old AI chatbot like ChatGPT, these AI agents can connect to the internet and search for information that isn’t present in their own training data. The other big selling point is these agents have more memory than a regular ChatGPT session. The thing is, while these agents work surprisingly well on very basic, specialized tasks, you really can’t leave them alone for too long. Large language models are already prone to spitting out false information, and running multiple instances of a large language model can dramatically increase the likelihood of failure. AI is fully capable of coding, but even one mistake can make the entire thing fail. Sure, you could automate routine code checks, but what if those fail as well?
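A quick back-of-the-envelope calculation shows why chained steps are fragile. The sketch below assumes, purely for illustration, that each step succeeds 95 percent of the time and that steps fail independently of one another; neither figure comes from any real agent.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Chance that every step in a chain succeeds, assuming independent steps."""
    return per_step_success ** steps


# 0.95 is an illustrative, made-up per-step success rate.
for n in (1, 5, 10, 20):
    print(n, "steps:", round(chain_success_probability(0.95, n), 3))
```

Under those assumptions, a 10-step chain finishes cleanly only about 60 percent of the time, and a 20-step chain only about 36 percent of the time.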

So are agents actually the evolution of AI, or just a chain of Google searches? Well, the answer lies somewhere in the middle. Despite the moniker, AI simply isn’t intelligent by any real standard. These agents need quite a lot of guidance, and along the way there’s plenty of opportunity for the system to produce wrong information, spoiling the entire process. Before they become truly autonomous, these agents are little more than clever toys.

That doesn’t mean they aren’t interesting or don’t have the capacity to radically change how we currently think about AI. We’ve gone through some of the more interesting AI agent models currently out there, plus a few of the more dramatic agents built for specific tasks that you can check out by clicking through.



Created by dedicated AI evangelist and developer Toran Bruce Richards, Auto-GPT is more of a parent program used to generate AI agents. Essentially, the program uses a script to link outputs of the GPT-4 large language model, feeding itself based on its responses so it can iterate and correct itself. It requires a bit of setup, though you can find a good tutorial for creating your own Auto-GPT instance in this Twitter thread by developer Sully Omar.

What’s most impressive about Auto-GPT is how it all runs off natural language prompts. A user can give the AI up to five goals to accomplish based on the original description. By default, users have to give it permission to complete each task, though there is the option of letting it go freestyle.
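The permission step can be pictured as a simple gate in front of each task. The following Python sketch is an assumption-laden illustration, not Auto-GPT’s actual code: the function name, the `approve` callback, and the `autonomous` flag are all hypothetical.

```python
from typing import Callable, List

MAX_GOALS = 5  # the limit described above


def run_with_approval(
    goals: List[str],
    approve: Callable[[str], bool],
    autonomous: bool = False,
) -> List[str]:
    """Run each goal only if the user approves it, unless autonomous mode is on."""
    if len(goals) > MAX_GOALS:
        raise ValueError(f"at most {MAX_GOALS} goals are allowed")
    executed: List[str] = []
    for goal in goals:
        if autonomous or approve(goal):
            executed.append(goal)  # a real agent would do the work here
    return executed


goals = ["research graphics cards", "email the results to my boss"]
# Approve research tasks only; the second goal is skipped.
print(run_with_approval(goals, approve=lambda g: g.startswith("research")))
# "Freestyle" mode runs everything without asking.
print(run_with_approval(goals, approve=lambda g: False, autonomous=True))
```

In an interactive session, `approve` could instead wrap Python’s built-in `input()` and ask for a yes or no before each task.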

Some users said they were able to get the AI to order food or book flights online for them through a platform built on Auto-GPT. Omar showed how he managed to get Auto-GPT to complete some simple market research. So far, the main application for the agent has been creating lists and performing simple research tasks.

And some of those designs can be malicious. As first reported by VentureBeat, security researcher Simon Willison wrote about his concerns that simple prompt injection techniques could create avenues for bad actors to attack people through external tools like Auto-GPT.


Alongside Auto-GPT, BabyAGI is the other major code repository making waves in the AI scene. Yohei Nakajima, BabyAGI’s creator, said the whole project grew out of a side project he was working on with ChatGPT. It’s similar to Auto-GPT, but instead of planning each step individually, it plans a whole sequence of tasks at once and then acts on them.
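The plan-first approach described here can be sketched as a task queue that is filled once and then worked through in order. This is a simplified illustration in Python, not BabyAGI’s real code: `plan_all_at_once` is a hypothetical planner that returns a canned list where a real agent would ask a language model.

```python
from collections import deque
from typing import Callable, Deque, List


def plan_all_at_once(objective: str) -> List[str]:
    """Hypothetical planner; a real agent would get this list from a language model."""
    return [
        f"{objective}: research",
        f"{objective}: compare options",
        f"{objective}: summarize",
    ]


def run_planned_tasks(
    objective: str,
    planner: Callable[[str], List[str]],
    act: Callable[[str], str],
) -> List[str]:
    """Build the whole task list up front, then act on each task in order."""
    queue: Deque[str] = deque(planner(objective))
    results: List[str] = []
    while queue:
        task = queue.popleft()  # oldest task first
        results.append(act(task))
    return results


results = run_planned_tasks(
    "pick a graphics card", plan_all_at_once, lambda task: f"done -> {task}"
)
for line in results:
    print(line)
```

A step-by-step agent would instead call the planner inside the loop, choosing the next task only after seeing the previous result.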

Since the code was open-sourced, users have been able to connect it with other online tools. And of course, this has allowed more developers to make their own UI for easier access to the BabyAGI agent.

As with all the general-purpose agents currently around, BabyAGI has only limited capacity for specific tasks beyond making lists. Some users say they have managed to get it to simplify and test accurate code through a separate application, though it took quite a lot of handholding and trial and error.


The Camel AI agent is essentially two agents that work side by side with each other. Since humans often need to be there to hold the AI agent’s hand, the developer’s idea is to add another AI agent to “role-play” as the human and guide its counterpart.

Essentially, the program works by creating individual tasks and feeding them into the agent. If I ask it to make a peanut butter and jelly sandwich, the AI will first tell its counterpart to gather all the ingredients and tools, then place the slices of bread on a plate, and on and on until it completes the task.
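Here is a rough Python sketch of that two-agent back-and-forth. Both agents are hypothetical stubs rather than Camel’s actual implementation, and the last two sandwich steps are made up to round out the example.

```python
from typing import List, Optional, Tuple

# The first two steps come from the example above; the last two are invented.
SANDWICH_STEPS = [
    "gather the ingredients and tools",
    "place the slices of bread on a plate",
    "spread the peanut butter and jelly",
    "press the slices together",
]


def instructor_agent(turn: int) -> Optional[str]:
    """Plays the human: hands out one instruction per turn, then stops."""
    return SANDWICH_STEPS[turn] if turn < len(SANDWICH_STEPS) else None


def worker_agent(instruction: str) -> str:
    """Plays the assistant: reports back on the instruction it was given."""
    return f"Completed: {instruction}"


def role_play(max_turns: int = 10) -> List[Tuple[str, str]]:
    """Alternate between the two agents until the instructor runs out of steps."""
    transcript: List[Tuple[str, str]] = []
    for turn in range(max_turns):
        instruction = instructor_agent(turn)
        if instruction is None:
            break
        transcript.append((instruction, worker_agent(instruction)))
    return transcript


transcript = role_play()
for instruction, reply in transcript:
    print(f"Instructor: {instruction}\nWorker: {reply}")
```

In a real system, both functions would be calls to a language model, each given a different role in its prompt.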

At this point, Camel is more of an experimental model than a user-side platform, but more developers have talked about exploring multiple agents working in tandem, and we can expect to see more of this in the future.


There’s a web-based version of Auto-GPT called AgentGPT, though it offers very limited controls on what tasks it will perform, and the demo will eventually shut itself off after a certain period of time. Still, with a few simple prompts I created an agent called “Crunchatize me Captain” trying to create a breakfast cereal combining the worst, most-processed ingredients into one horrid box. The new cereal combined sugary Frosted Flakes with stale Rice Krispies and “chemical-laden” Lucky Charms marshmallows. Yummy.

This program asks for an OpenAI API key. That may be a real sticking point, as OpenAI explicitly tells users not to share their keys with outside clients. It’s also limited in how users can force the system to approach each task.


“God Mode” is essentially like Auto-GPT and AgentGPT, though it also runs in-browser and requires a connection to a Google or Twitter account, plus an OpenAI API key. A reminder: OpenAI says not to give out your key, so take that into consideration.

The system first asks for a prompt, then creates a suggested multi-point action plan and asks for user input on each part of the process. For instance, I asked God Mode to “Identify the best way to peel a banana.” The system then created action items like “research different methods of peeling bananas” and “conduct experiments to compare the efficiency and ease of each method.” Users can add their own tasks or accept the suggested tasks before running the program.


After trying several different Auto-GPT and BabyAGI implementations for ease of use, I’ve found that Cognosys had the most complete UI for automating tasks. It’s essentially the same as God Mode and AgentGPT, but it requires the least amount of user data up front, and I personally found it did not need an OpenAI API key to work. Still, that could change in time.

Cognosys’ creator Sully Omar said this beta is still a very early version, and that he plans to add more search capabilities, custom agents, and a connection to GPT-4. Unfortunately, the UI does not include a search function like Auto-GPT has when run on its own, but Omar said he is working on it.

I asked it my banana peeling question, and it established that both the “backwards peel” and “freeze” methods were the best options for human hands, noting the pros and cons of both. Unfortunately, the system was much less thorough when I asked it to create a 5-point action plan for dealing with New York City rats. It didn’t even manage to finish a fifth point, and the best it could offer was using rat-proof containers on garbage and food waste cans.


While agents like Auto-GPT have the ability to search the internet for information, the so-called Do Anything Machine also advertises that it can access users’ data and apps, as long as they’re willing to give it access to that information on whichever platform they’re using. Developer Garrett Scott based the system on BabyAGI and showed how it acts almost like an AI-based to-do list. It automatically spawns tasks based on an initial prompt and then runs those in the background.

The platform is free for now, though there is a waitlist to access the AI agent on the Do Anything Machine website. The agent platform’s privacy policy mentions it collects basic user information.

Source: Gizmodo
