Can Multiple AI Agents Work as a “Company”?

Introduction

What if we asked multiple Artificial Intelligences to form a company?

ChatDev is a virtual, chat-powered “AI company” in which several AI agents each take on a role such as CEO, CTO, programmer, or tester. Like humans, AI models tend to perform better in collaboration, so this framework tries to benefit from having individual models communicate with and verify one another.

image

This “company” follows the waterfall method of software development (designing, coding, testing, and documenting) and breaks each phase up into atomic subtasks.

  • Each node is a specific subtask, in which two roles propose and validate solutions
  • The deliverables are the final software, source code, dependency environment specifications, and user manuals
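To make the propose-and-validate idea concrete, here is a toy sketch of that loop. It is not ChatDev's implementation: the `Subtask` class, the `run_subtask` function, the role pairings, and the stand-in `fake_propose` / `fake_validate` functions are all hypothetical, and a real run would call a language model where the stand-ins return canned strings.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    proposer: str   # role that drafts a solution (assumed pairing)
    validator: str  # role that reviews the draft (assumed pairing)


def run_subtask(subtask, propose, validate, max_turns=3):
    """Alternate propose/validate until the validator accepts or turns run out."""
    feedback = ""
    draft = ""
    for _ in range(max_turns):
        draft = propose(subtask, feedback)
        accepted, feedback = validate(subtask, draft)
        if accepted:
            break
    return draft


# Illustrative waterfall; which roles pair up in each phase is an assumption.
WATERFALL = [
    Subtask("designing", proposer="CEO", validator="CTO"),
    Subtask("coding", proposer="Programmer", validator="CTO"),
    Subtask("testing", proposer="Programmer", validator="Tester"),
    Subtask("documenting", proposer="Programmer", validator="CEO"),
]


def fake_propose(subtask, feedback):
    # Stand-in for a model call: mention the feedback in the next draft.
    suffix = f" (revised after: {feedback})" if feedback else ""
    return f"{subtask.proposer} draft for {subtask.name}{suffix}"


def fake_validate(subtask, draft):
    # Stand-in reviewer: accept only drafts that were revised at least once.
    if "revised" in draft:
        return True, ""
    return False, f"{subtask.validator} asks for more detail"


results = [run_subtask(s, fake_propose, fake_validate) for s in WATERFALL]
for line in results:
    print(line)
```

The point of the sketch is only the shape of the exchange: each subtask is a short two-party conversation, and the output of one phase feeds the next.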

People have created complete applications with versioning and documentation, including video players, Gomoku games, BMI calculators, and more. This is my experience with the framework.

Running ChatDev

The first step is to clone the repository and install all the required dependencies in a new miniconda environment. Following the repository’s README, I simply ran the following commands:

git clone https://github.com/OpenBMB/ChatDev.git
conda create -n chatdev python=3.9 -y
conda activate chatdev
cd ChatDev
pip3 install -r requirements.txt

I then created a new environment variable OPENAI_API_KEY with my OpenAI API Key. To get one yourself, you can make an OpenAI account and navigate to Personal > View API Keys.

image

From here, create a new secret key and keep it somewhere safe.

image

Before we can continue, we’ll need to add funds to our OpenAI credit balance. I added about $10 worth to play around with, but feel free to add more or less for your needs. If this is your first time using pay-as-you-go APIs, I recommend keeping “Auto recharge” off for now to prevent any accidental overspending for large requests.

image

Once that’s complete, make sure to set your API Key environment variable. I’m running everything on Ubuntu through WSL (Windows Subsystem for Linux) on my Windows 10 machine, so I ran:

export OPENAI_API_KEY="my_OpenAI_API_key"
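Before spending any credits, it can help to confirm that the variable is actually visible to Python. The snippet below is a small sketch for that purpose; `describe_api_key` is a hypothetical helper, not part of ChatDev, and it deliberately shows only the last four characters of the key.

```python
import os


def describe_api_key(env=None):
    """Return a short status string for OPENAI_API_KEY without exposing the key."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "OPENAI_API_KEY is not set"
    # Only reveal the last four characters so the key never lands in logs.
    return f"OPENAI_API_KEY is set (ends with ...{key[-4:]})"


if __name__ == "__main__":
    print(describe_api_key())
```

Note that `export` only affects the current shell session, so a new terminal will need the variable set again unless it is added to a shell profile.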

Back to ChatDev, let’s try having ChatDev develop a Blackjack game.

python3 run.py --task "Develop a blackjack game" --name "Blackjack V1.0" --config "Human"

The AI finished its first iteration, great! Let’s run the new file it generated.

image

It looks great so far, but there are a couple of suggestions I wanted to make. I wanted the game to display with a GUI. Also, the Player shouldn’t be able to see both of the Dealer’s cards when the game starts.

Since we enabled the “Human” configuration, we have the option to send our own feedback up to 5 times. This is what I requested:

Please implement the blackjack game with a GUI.
One of the dealer's cards should not be visible to the Player until the dealer is done serving the Player.
Please add a counter for both the Player's hand value count and the Dealer's visible hand count value. 
The Dealer's visibile hand count should be updated when the hidden card is revealed.
The GUI should include two buttoms "hit" and "stand".
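The counting rules in this request can be sketched in a few lines of Python. This is an illustration of the behavior being asked for, not the code ChatDev produced; the function names are hypothetical, cards are represented by rank only, and the sketch assumes the dealer's first card is the hidden one.

```python
FACE_VALUES = {"J": 10, "Q": 10, "K": 10, "A": 11}


def card_value(rank):
    """Points for a single rank such as '7', 'K', or 'A' (ace starts at 11)."""
    return FACE_VALUES[rank] if rank in FACE_VALUES else int(rank)


def hand_value(ranks):
    """Blackjack total for a list of ranks, demoting aces from 11 to 1 as needed."""
    total = sum(card_value(r) for r in ranks)
    aces = ranks.count("A")
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total


def dealer_visible_value(dealer_ranks, hole_card_revealed):
    """Count only what the Player can see; index 0 is assumed to be the hidden card."""
    visible = dealer_ranks if hole_card_revealed else dealer_ranks[1:]
    return hand_value(visible)


if __name__ == "__main__":
    dealer = ["K", "7"]
    print(dealer_visible_value(dealer, hole_card_revealed=False))
    print(dealer_visible_value(dealer, hole_card_revealed=True))
```

Keeping the visible count as a function of the reveal flag means the GUI label only has to be refreshed when the flag flips, which is the update the request describes.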

Round 2

After the second iteration (and after setting up X11 on WSL2), this is what we ended up with. We have a nice GUI now, just like I suggested.

image

But it looks like neither the player nor the dealer started the game with cards. Anyway, let’s see what happens when we press “Hit”.

image
image

We got lucky in this game and landed on 21, so I clicked “Stand”. In my terminal, I noticed an AttributeError pop up. In another test run, it looked like I could just keep clicking “Hit” forever.
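For reference, the endless “Hit” problem is the kind of bug a small piece of round state would prevent. The sketch below is hypothetical and is not taken from the generated code: `BlackjackRound` is an invented name, and it takes card points as plain integers, so it ignores ace handling entirely.

```python
class BlackjackRound:
    """Tracks whether the Player may still act in the current round."""

    def __init__(self):
        self.player_total = 0
        self.finished = False

    def hit(self, card_points):
        """Add a card if the round is still open; return False if the hit is refused."""
        if self.finished:
            return False
        self.player_total += card_points
        # Reaching 21 or going over ends the Player's turn.
        if self.player_total >= 21:
            self.finished = True
        return True

    def stand(self):
        """End the Player's turn; later hits are refused."""
        self.finished = True


if __name__ == "__main__":
    game = BlackjackRound()
    print(game.hit(10), game.hit(11), game.hit(5))
    print(game.player_total)
```

In a GUI, the same `finished` flag could also be used to disable the “Hit” button rather than silently ignoring the click.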

Rounds 3-7

If you’re interested in reading how these extra rounds went, you can follow the rest of this story on my Medium profile.

Review

ChatDev also lets us view the logs to see how the AI agents worked together. To review a run visually, open ChatDev’s local demo web application and upload a log file. Let’s take a closer look at where things may have gone awry.

image

My comments were not very descriptive and did not include any priorities or specific fix methods, which likely caused confusion and inadequate changes. The Code Reviewer did not add any “translation” to my requests and essentially passed them as-is to the developer agent.

Conclusion

I’m a little disappointed I didn’t get a result as shiny and perfect as some of ChatDev’s community contributions, but this was still a very interesting experiment in working with multiple AI agents at once. Moving forward, I may try some different prompts after studying a bit about the wider world of prompt engineering. If all goes well, I may write a follow-up piece.

From a broader perspective, it’s very interesting to watch the digital landscape evolve at such an astonishing pace. It may not be much longer until teams of AI agents pave the way for extremely quick software development.

I showed a family member ChatDev while it was running, and they described the experience as “scary”. Despite the curiosity that quite literally had me on the edge of my seat, I would be lying if I said I didn’t agree. What an interesting space.
