The game touted its use of the GPT-3 text generator. Then the algorithm started to generate disturbing stories, including sex scenes involving children.
In December 2019, Utah startup Latitude launched a pioneering online game called AI Dungeon that demonstrated a new form of human-machine collaboration. The company used text-generation technology from artificial intelligence company OpenAI to create a choose-your-own-adventure game inspired by Dungeons & Dragons. When a player typed out the action or dialog they wanted their character to perform, algorithms would craft the next phase of their personalized, unpredictable adventure.
Last summer, OpenAI gave Latitude early access to a more powerful, commercial version of its technology. In marketing materials, OpenAI touted AI Dungeon as an example of the commercial and creative potential of writing algorithms.
Then, last month, OpenAI says, it discovered AI Dungeon also showed a dark side to human-AI collaboration. A new monitoring system revealed that some players were typing words that caused the game to generate stories depicting sexual encounters involving children. OpenAI asked Latitude to take immediate action. “Content moderation decisions are difficult in some cases, but not this one,” OpenAI CEO Sam Altman said in a statement. “This is not the future for AI that any of us want.”
Latitude turned on a new moderation system last week and triggered a revolt among its users. Some complained it was oversensitive and that they could not refer to an “8-year-old laptop” without triggering a warning message. Others said the company’s plans to manually review flagged content would needlessly snoop on private, fictional creations that were sexually explicit but involved only adults, a popular use case for AI Dungeon.
In short, Latitude’s attempt at combining people and algorithms to police content produced by people and algorithms turned into a mess. Irate memes and claims of canceled subscriptions flew thick and fast on Twitter and AI Dungeon’s official Reddit and Discord communities.
“The community feels betrayed that Latitude would scan and manually access and read private fictional literary content,” says one AI Dungeon player who goes by the handle Mimi and claims to have written an estimated total of more than 1 million words with the AI’s help, including poetry, Twilight Zone parodies, and erotic adventures. Mimi and other upset users say they understand the company’s desire to police publicly visible content, but say it has overreached and ruined a powerful creative playground. “It allowed me to explore aspects of my psyche that I never realized existed,” Mimi says.
A Latitude spokesperson said its filtering system and policies for acceptable content are both being refined. Staff had previously banned players who they learned had used AI Dungeon to generate sexual content featuring children. But after OpenAI’s recent warning, the company is working on “necessary changes,” the spokesperson said. Latitude pledged in a blog post last week that AI Dungeon would “continue to support other NSFW content, including consensual adult content, violence, and profanity.”
Blocking the AI system from creating some types of sexual or adult content while allowing others will be difficult. Technology like OpenAI’s can generate text in many different styles because it is built using machine learning algorithms that have digested the statistical patterns of language use in billions of words scraped from the web, including parts not appropriate for minors. The software is capable of moments of startling mimicry, but doesn’t understand social, legal, or genre categories as people do. Add the fiendish inventiveness of Homo internetus, and the output can be strange, beautiful, or toxic.
OpenAI released its text generation technology as open source late in 2019, but last year turned a significantly upgraded version, called GPT-3, into a commercial service. Customers like Latitude pay to feed in strings of text and get back the system’s best guess at what text should follow. The service caught the tech industry’s eye after programmers who were granted early access shared impressively fluent jokes, sonnets, and code generated by the technology.
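The idea of feeding in text and getting back a best guess at what follows can be illustrated with a toy model. The sketch below is not how GPT-3 works internally; it is a deliberately tiny word-pair counter, with a made-up corpus and hypothetical function names, that only shows the general principle of predicting the next word from statistical patterns in training text.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def best_guess(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def continue_text(follows, prompt, max_words=5):
    """Extend the prompt one word at a time with the most likely follower."""
    words = prompt.lower().split()
    for _ in range(max_words):
        nxt = best_guess(follows, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

# Made-up training text, for illustration only.
corpus = "the knight draws a sword . the knight draws a map . the dragon sleeps ."
model = train_bigrams(corpus)

print(best_guess(model, "knight"))               # draws
print(continue_text(model, "the", max_words=2))  # the knight draws
```

A model like this can only echo patterns it has counted, which hints at why a system trained on billions of words from the web will also reproduce the unsavory patterns found there.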
OpenAI said the service would empower businesses and startups and granted Microsoft, a hefty backer of OpenAI, an exclusive license to the underlying algorithms. WIRED and some coders and AI researchers who tried the system showed it could also generate unsavory text, such as anti-Semitic comments and extremist propaganda. OpenAI said it would carefully vet customers to weed out bad actors, and required most customers, but not Latitude, to use filters the AI provider created to block profanity, hate speech, or sexual content.
Out of the limelight, AI Dungeon provided relatively unconstrained access to OpenAI’s text-generation technology. In December 2019, the month the game launched using the earlier open-source version of OpenAI’s technology, it won 100,000 players. Some quickly discovered and came to cherish its fluency with sexual content. Others complained the AI would bring up sexual themes unbidden, for example when they attempted to travel by mounting a dragon and their adventure took an unforeseen turn.
Latitude cofounder Nick Walton acknowledged the problem on the game’s official Reddit community within days of the launch. He said several players had sent him examples that left them “feeling deeply uncomfortable,” adding that the company was working on filtering technology. From the game’s early months, players also noticed, and posted online to flag, that it would sometimes write children into sexual scenarios.
AI Dungeon’s official Reddit and Discord communities added dedicated channels to discuss adult content generated by the game. Latitude added an optional “safe mode” that filtered out suggestions from the AI featuring certain words. Like all automated filters, however, it was not perfect. And some players noticed the supposedly safe setting improved the text-generator’s erotic writing because it used more analogies and euphemisms. The company also added a premium subscription tier to generate revenue.
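Latitude has not published how its “safe mode” worked, so the following is only a minimal sketch of the general technique of dropping suggestions that contain listed words. The blocklist, the function names, and the sample sentences are all hypothetical. It shows why a word list is imperfect in both directions: euphemisms slip through, and innocent uses of a listed word get caught.

```python
import re

# Hypothetical blocklist, chosen only for illustration.
BLOCKED_WORDS = {"blood", "gore"}

def is_flagged(suggestion, blocked=BLOCKED_WORDS):
    """Return True if any whole word in the suggestion is on the blocklist."""
    words = re.findall(r"[a-z']+", suggestion.lower())
    return any(word in blocked for word in words)

def filter_suggestions(suggestions, blocked=BLOCKED_WORDS):
    """Keep only the suggestions that contain no blocked word."""
    return [s for s in suggestions if not is_flagged(s, blocked)]

candidates = [
    "The floor is covered in blood.",    # caught: contains a listed word
    "The floor is slick with crimson.",  # missed: a euphemism for the same scene
    "She is his blood relative.",        # caught, although the sentence is harmless
]

print(filter_suggestions(candidates))
# ['The floor is slick with crimson.']
```

Because only the euphemistic sentence survives, a filter of this kind can nudge a generator toward indirect phrasing rather than away from the underlying theme.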
When AI Dungeon added OpenAI’s more powerful, commercial writing algorithms in July 2020, the writing got still more impressive. “The sheer jump in creativity and storytelling ability was heavenly,” says one veteran player. The system got noticeably more creative in its ability to explore sexually explicit themes, too, this person says. For a time last year players noticed Latitude experimenting with a filter that automatically replaced occurrences of the word “rape” with “respect,” but the feature was dropped.
The veteran player was among the AI Dungeon aficionados who embraced the game as an AI-enhanced writing tool to explore adult themes, including in a dedicated writing group. Unwanted suggestions from the algorithm could be removed from a story to steer it in a different direction; the results weren’t posted publicly unless a person chose to share them.
Latitude declined to share figures on how many adventures contained sexual content. OpenAI’s website says AI Dungeon attracts more than 20,000 players each day.
An AI Dungeon player who posted last week about a security flaw that made every story generated in the game publicly accessible says he downloaded several hundred thousand adventures created during four days in April. He analyzed a sample of 188,000 of them, and found 31 percent contained words suggesting they were sexually explicit. That analysis and the security flaw, now fixed, added to anger from some players over Latitude’s new approach to moderating content.
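For a sense of scale, the reported share can be turned into an approximate count. The calculation below is a back-of-the-envelope sketch that assumes the 31 percent figure was rounded to the nearest whole percent; the player's exact count is not given, so the result is an estimate and not a reported number.

```python
SAMPLE_SIZE = 188_000  # adventures in the analyzed sample, as reported
SHARE_PERCENT = 31     # reported share, assumed rounded to the nearest percent

# Integer arithmetic avoids floating-point rounding surprises.
estimate = SAMPLE_SIZE * SHARE_PERCENT // 100

# If 31 percent is a rounded value, the true share lies between 30.5 and 31.5 percent.
low = SAMPLE_SIZE * 305 // 1000
high = SAMPLE_SIZE * 315 // 1000

print(estimate)   # 58280
print(low, high)  # 57340 59220
```

Under that assumption, roughly 58,000 of the sampled adventures contained words suggesting sexually explicit content.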
Latitude now faces the challenge of winning back users’ trust while meeting OpenAI’s requirements for tighter control over its text generator. The startup now must use OpenAI’s filtering technology, an OpenAI spokesperson said.
How to responsibly deploy AI systems that have ingested large swaths of internet text, including some unsavory parts, has become a hot topic in AI research. Two prominent Google researchers were forced out of the company after managers objected to a paper arguing for caution with such technology.
The technology can be used in very constrained ways, such as in Google search where it helps parse the meaning of long queries. OpenAI helped AI Dungeon to launch an impressive but fraught application that let people prompt the technology to unspool more or less whatever it could.
“It’s really hard to know how these models are going to behave in the wild,” says Suchin Gururangan, a researcher at the University of Washington. He contributed to a study and interactive online demo with researchers from UW and the Allen Institute for Artificial Intelligence showing that when text borrowed from the web was used to prompt five different language generation models, including from OpenAI, all were capable of spewing toxic text.
Gururangan is now one of many researchers trying to figure out how to exert more control over AI language systems, including by being more careful with what content they learn from. OpenAI and Latitude say they’re working on that too, while also trying to make money from the technology.