Introduction
What if we asked multiple Artificial Intelligences to form a company?
ChatDev is a virtual, chat-powered “AI company” made up of several AI agents, each taking on a role such as CEO, CTO, programmer, or tester. Like humans, AI models tend to perform better when they collaborate, so ChatDev tries to benefit from having individual models communicate with and verify one another.
This “company” follows the waterfall method of software development, with four phases (designing, coding, testing, and documenting), and breaks each phase up into atomic subtasks.
- Each node is a specific subtask, in which two roles propose and validate solutions
- The final deliverables are the software, source code, dependency environment specifications, and user manuals
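To make the propose-and-validate idea concrete, here is a minimal sketch of how such a chain of subtasks could be modeled. This is not ChatDev's actual code: the phase names come from the description above, but the `Subtask` class, the `run_chain` function, the subtask names, the role pairings, and the stub agents are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Subtask:
    """One node in the chain: one role proposes, another validates."""
    phase: str
    name: str
    proposer: str
    validator: str


# Phase names follow the waterfall description; the subtask names and role
# pairings are illustrative guesses, not ChatDev's real configuration.
CHAIN: List[Subtask] = [
    Subtask("designing", "pick product form", "CEO", "CTO"),
    Subtask("coding", "write code", "Programmer", "CTO"),
    Subtask("testing", "fix reported bugs", "Programmer", "Tester"),
    Subtask("documenting", "write user manual", "Programmer", "CEO"),
]

Propose = Callable[[Subtask, Optional[str]], str]
Validate = Callable[[Subtask, str], Tuple[bool, str]]


def run_chain(chain: List[Subtask], propose: Propose, validate: Validate,
              max_turns: int = 3) -> List[Tuple[str, str]]:
    """Run every subtask in order and collect the latest draft for each one."""
    results = []
    for task in chain:
        draft = propose(task, None)
        for _ in range(max_turns):
            accepted, feedback = validate(task, draft)
            if accepted:
                break
            # Rejected: ask the proposer to revise using the feedback.
            draft = propose(task, feedback)
        results.append((task.name, draft))
    return results


def stub_propose(task: Subtask, feedback: Optional[str]) -> str:
    """Stand-in for an LLM call; returns a labeled placeholder draft."""
    note = f" (revised after: {feedback})" if feedback else ""
    return f"{task.proposer}'s draft for '{task.name}'{note}"


def stub_validate(task: Subtask, draft: str) -> Tuple[bool, str]:
    """Stand-in reviewer that rejects the first coding draft once."""
    if task.phase == "coding" and "revised" not in draft:
        return False, "handle invalid input"
    return True, ""


for name, draft in run_chain(CHAIN, stub_propose, stub_validate):
    print(f"{name}: {draft}")
```

In a real system the two stub functions would be replaced by calls to language models playing each role. The `max_turns` cap is there so a disagreement between two agents cannot loop forever.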
People have used it to create complete applications, with versioning and documentation, for displaying video, playing Gomoku, calculating BMI, and more. This is my experience with the framework.
Running ChatDev
The first step is to clone the repository and install all the required dependencies in a new miniconda environment. Following the repository’s README, I simply ran the following commands:
git clone https://github.com/OpenBMB/ChatDev.git
conda create -n chatdev python=3.9 -y
conda activate chatdev
cd ChatDev
pip3 install -r requirements.txt
Next, I needed an OpenAI API key to store in a new environment variable, OPENAI_API_KEY. To get a key yourself, create an OpenAI account and navigate to Personal > View API Keys.
From here, create a new secret key and keep it somewhere safe.
Before we can continue, we’ll need to add funds to our OpenAI credit balance. I added about $10 worth to play around with, but feel free to add more or less for your needs. If this is your first time using pay-as-you-go APIs, I recommend keeping “Auto recharge” off for now to prevent any accidental overspending for large requests.
Once that’s complete, make sure to set your API key environment variable. I’m running everything on Ubuntu under WSL on my Windows 10 machine, so I ran:
export OPENAI_API_KEY="my_OpenAI_API_key"
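Because `export` only affects the current shell session, it is easy to open a new terminal and forget that the variable is gone. As a quick sanity check before spending credits, a small script like the one below can confirm the variable is visible to Python. The function name `api_key_problem` is my own and is not part of ChatDev.

```python
import os
from typing import Mapping, Optional

# The placeholder text used in the export command above.
PLACEHOLDER = "my_OpenAI_API_key"


def api_key_problem(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a description of what is wrong with the key, or None if it looks set."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "OPENAI_API_KEY is not set in this shell"
    if key == PLACEHOLDER:
        return "OPENAI_API_KEY still holds the placeholder text"
    return None


problem = api_key_problem()
print(problem or "OPENAI_API_KEY looks set")
```

This only checks that a value is present. It does not contact OpenAI, so it cannot tell whether the key is valid or whether the account has credit.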
Back to ChatDev, let’s try having ChatDev develop a Blackjack game.
python3 run.py --task "Develop a blackjack game" --name "Blackjack V1.0" --config "Human"
The AI finished its first iteration, great! Let’s run the new file it generated.
It looks great so far, but there are a couple of suggestions I wanted to make. I wanted the game to display with a GUI. Also, the Player shouldn’t be able to see both of the Dealer’s cards when the game starts.
Since we enabled the “Human” configuration, we have the option to send our own feedback up to 5 times. This is what I requested:
Please implement the blackjack game with a GUI.
One of the dealer's cards should not be visible to the Player until the dealer is done serving the Player.
Please add a counter for both the Player's hand value count and the Dealer's visible hand count value.
The Dealer's visible hand count should be updated when the hidden card is revealed.
The GUI should include two buttons, "hit" and "stand".
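For reference, the counting behavior I was asking for in that feedback can be sketched in a few lines. This is my own illustration and not the code ChatDev generated. The card representation, the function names, and the convention that the dealer's first card is the hidden one are all assumptions.

```python
from typing import List, Tuple

Card = Tuple[str, str]  # (rank, suit), e.g. ("K", "hearts")


def hand_value(cards: List[Card]) -> int:
    """Blackjack value of a hand, counting each ace as 11 unless that busts."""
    total = 0
    aces = 0
    for rank, _suit in cards:
        if rank == "A":
            total += 11
            aces += 1
        elif rank in ("J", "Q", "K"):
            total += 10
        else:
            total += int(rank)
    # Demote aces from 11 to 1 while the hand would otherwise bust.
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total


def dealer_visible_value(cards: List[Card], hole_revealed: bool) -> int:
    """Value the Player is allowed to see; cards[0] is treated as the hidden card."""
    visible = cards if hole_revealed else cards[1:]
    return hand_value(visible)


dealer = [("K", "hearts"), ("6", "spades")]
print(dealer_visible_value(dealer, hole_revealed=False))  # only the 6 counts
print(dealer_visible_value(dealer, hole_revealed=True))   # both cards count
```

Keeping the visible count in its own function means the GUI only has to flip one flag when the hidden card is revealed, and the displayed number updates from the same source of truth.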
Round 2
After the second iteration (and after setting up X11 on WSL2), this is what we ended up with. We have a nice GUI now, just like I suggested.
But it looks like neither the player nor the dealer started the game with cards. Anyway, let’s see what happens when we press “Hit”.
We got lucky in this game and landed on 21, so I clicked “Stand”. In my terminal, I noticed an AttributeError pop up. In another test run, it looked like I could just keep clicking “Hit” forever.
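The endless “Hit” problem suggests the generated game never marked a round as finished. Here is a minimal sketch of the kind of guard I would expect, written by me and not taken from ChatDev's output. The `Round` class and `total` helper are hypothetical, cards are reduced to plain point values with 11 standing for an ace, and the sketch makes no attempt to reproduce or explain the AttributeError.

```python
from typing import List


def total(values: List[int]) -> int:
    """Sum point values, demoting aces (11) to 1 while the hand would bust."""
    score = sum(values)
    aces = values.count(11)
    while score > 21 and aces:
        score -= 10
        aces -= 1
    return score


class Round:
    """Tracks the player's hand and whether further hits are allowed."""

    def __init__(self, deck: List[int]) -> None:
        self.deck = list(deck)
        # Deal two starting cards so the player never begins empty-handed.
        self.player = [self.deck.pop(), self.deck.pop()]
        self.over = total(self.player) >= 21

    def hit(self) -> bool:
        """Draw one card if allowed; return False when the hit is refused."""
        if self.over or not self.deck:
            return False
        self.player.append(self.deck.pop())
        if total(self.player) >= 21:
            self.over = True  # 21 or bust ends the player's turn
        return True

    def stand(self) -> None:
        """End the player's turn without drawing."""
        self.over = True


game = Round([2, 10, 6, 10, 9])
print(game.hit(), total(game.player), game.over)
print(game.hit(), len(game.player))
```

In a GUI, the same `over` flag could be used to disable the “Hit” button once the turn has ended, so the refusal is visible to the player.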
Rounds 3-7
If you’re interested in reading how these extra rounds went, you can follow the rest of this story on my Medium profile.
Review
ChatDev also lets us replay the logs to see how the AI agents worked together. To do this, open ChatDev’s local demo web application and upload a log file for a visual review. With the logs loaded, let’s take a closer look at where things may have gone awry.
My comments were not very descriptive and did not include any priorities or specific fix methods, which likely caused confusion and inadequate changes. The Code Reviewer did not add any “translation” of my user requests, and essentially presented them as-is to the developer agent.
Conclusion
I’m a little disappointed I didn’t get a result as shiny and perfect as some of ChatDev’s community contributions, but this was still a very interesting experiment in working with multiple AI agents at once. Moving forward, I may try some different prompts after studying a bit about the wider world of prompt engineering. If all goes well, I may write a follow-up piece.
From a broader perspective, it’s very interesting to watch the digital landscape evolve at such an astonishing pace. It may not be long before teams of AI agents pave the way for extremely quick software development.
I showed ChatDev to a family member while it was running, and they described the experience as “scary”. Even though my curiosity had me quite literally on the edge of my seat, I would be lying if I said I didn’t agree. What an interesting space.