A Reporter Let AI Agents Run A Fake Company And Chaos Followed

Journalist Evan Ratliff recently tested the capabilities of artificial intelligence in a corporate setting by launching a fictional tech startup staffed exclusively by AI agents. The project, described in a recent Wired piece and documented in the second season of Ratliff’s podcast “Shell Game,” was designed to probe the viability of the “one-person billion-dollar company” predicted earlier this year by OpenAI CEO Sam Altman.

Ratliff named the company HurumoAI and created a jargon-filled website to support it. He served as the sole human decision-maker, while the rest of the operations were handled by AI agents. The startup was tasked with building a “procrastination engine” called Sloth Surf, a tongue-in-cheek web application designed to waste time on the internet on behalf of the user, theoretically freeing them up to perform actual work.

Wired reports that the AI employees immediately began generating plans for development, user testing, and marketing, but the experiment quickly ran into problems. Ratliff noted that much of the reported activity “was all made up,” prompting him to tell the company’s AI-generated Chief Technology Officer, Ash Roy, that he only wanted to hear about things that were “real.”

The experiment highlighted significant limitations in current AI autonomy. In one instance, Ratliff jokingly suggested an offsite gathering to his AI workforce. The agents interpreted this as a direct command and immediately began planning strategy sessions with ocean views. While Ratliff stepped away to do other work, the AI team continued to communicate among themselves in a frenzy of activity. This unchecked collaboration quickly depleted $30 worth of credits Ratliff had purchased from the AI company Lindy.AI to run the agents. Ratliff lamented that the agents had “basically talked themselves to death.”

Despite these hiccups, the team did produce a working prototype of Sloth Surf after three months of programming, though it remains unclear how much input Ratliff had to provide. Wired notes that the experiment suggests AI agents are not yet ready to replace human workers wholesale, a conclusion supported by recent academic research: Carnegie Mellon University researchers recently released a paper showing that even top-performing AI agents failed to complete real-world office tasks 70 percent of the time.
