This August, some of the world’s best professional gamers will travel to Vancouver to fight for millions of dollars in the world’s most valuable esports competition. They’ll be joined by a team of five artificial intelligence bots backed by Elon Musk, trying to set a new marker for the power of machine learning.
Vancouver is hosting the annual world championship of Dota 2, one of the internet’s most-watched videogames. The prize purse is more than $15 million and growing, exceeding the $11 million at stake at golf’s Masters. In each game, two teams of five people attempt to destroy each other’s bases, playing characters that can include demons, spiders, and icy ghosts.
Earlier this month, OpenAI’s team, OpenAI Five, played and beat a team of semipros among the top 1 percent on the Dota 2 global rankings. That matchup simplified the game’s features somewhat—for example, by restricting both teams to the same characters. But OpenAI CTO and cofounder Greg Brockman believes the bots can be ready for a fuller match against pros on the sidelines of the Vancouver contest in two months. “We’ve seen professional-level plays emerge from this system,” he says.
That’s a bold statement. Battling with orcs and warlocks may seem less cerebral than chess or Go, games at which computers beat top humans in 1997 and 2016, respectively. But complex videogames like Dota 2 are in fact far more difficult for AI systems, says Dave Churchill, a professor at Memorial University, in St. John’s, Canada. It’s why Alphabet’s DeepMind, which created the AlphaGo software that made history by defeating a Go champ in 2016, is now working on StarCraft 2, a similarly tough videogame.
Dota and StarCraft are very different, but both are difficult for AI because the action takes place on a much larger board, where not all your opponent’s moves are visible, as they are in chess or Go. Complex videogames also require players to make more decisions, more quickly. A chess player has, on average, about 35 possible moves at any time, and a Go player 250. OpenAI says each of its team’s bots must choose between an average of 1,000 valid actions every eighth of a second. Dota 2 matches typically last around 45 minutes. “These games have much more similar properties to real world scenarios than chess and Go,” says Churchill. OpenAI says its Dota 2 algorithms could be adapted to help robots learn how to perform complex tasks, for example.
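For a rough sense of scale, the figures above can be combined into a back-of-envelope calculation. The Python sketch below is illustrative only: it assumes a bot makes exactly eight decisions per second for a full 45-minute match, and the function name is invented for this example.

```python
def decisions_per_match(match_minutes=45, decisions_per_second=8):
    """Approximate number of decisions one bot makes in one match.

    Assumes a steady decision rate for the whole match, which is a
    simplification of the figures quoted in the article.
    """
    return match_minutes * 60 * decisions_per_second


total = decisions_per_match()
print(total)  # 21600
```

Under those assumptions, each bot makes about 21,600 decisions per match, each one a choice among roughly 1,000 valid actions.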
OpenAI Five learned how to play Dota 2 by playing against clones of itself millions of times. The software is built around a technique called reinforcement learning, in which software uses trial and error to discover what actions will maximize a virtual reward. In the case of OpenAI Five, the reward is a combination of game stats chosen by OpenAI researchers to produce steadily improving skills.
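The trial-and-error idea can be illustrated with a toy example. The Python sketch below is not OpenAI's method: it is a simple epsilon-greedy learner choosing among three made-up actions, and the stat names, weights, and action outcomes are all assumptions chosen for illustration. It shows only the general pattern of combining game stats into one reward and nudging behavior toward whatever earns more of it.

```python
import random

# Hypothetical reward weights (assumptions, not OpenAI's values).
REWARD_WEIGHTS = {"kills": 1.0, "deaths": -1.0, "gold": 0.01}

# Toy environment: each action always produces the same made-up stats.
ACTION_STATS = {
    "farm": {"kills": 0, "deaths": 0, "gold": 120},
    "fight": {"kills": 2, "deaths": 1, "gold": 50},
    "idle": {"kills": 0, "deaths": 0, "gold": 0},
}


def reward(stats):
    """Combine several game stats into one number via a weighted sum."""
    return sum(REWARD_WEIGHTS[name] * value for name, value in stats.items())


def train(steps=500, epsilon=0.1, seed=0):
    """Learn a reward estimate per action by trial and error."""
    rng = random.Random(seed)
    estimates = {action: 0.0 for action in ACTION_STATS}
    counts = {action: 0 for action in ACTION_STATS}
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.choice(list(ACTION_STATS))
        else:
            # Exploit: otherwise pick the action that currently looks best.
            action = max(estimates, key=estimates.get)
        r = reward(ACTION_STATS[action])
        counts[action] += 1
        # Running average of the rewards seen for this action.
        estimates[action] += (r - estimates[action]) / counts[action]
    return estimates


estimates = train()
best = max(estimates, key=estimates.get)
print(best, estimates)
```

Changing the weights changes which behavior the learner settles on, which is why the choice of reward matters so much in practice.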
Although reinforcement learning is inspired by research on how animals and humans learn, the artificial version is much less efficient. OpenAI Five’s training made use of Google’s cloud computing service, occupying 128,000 conventional computer processors and 256 graphics processors, chips vital to big machine learning experiments, for weeks at a time. The conventional processors do the work of running the game, generating training data for the learning algorithms, which are powered by the graphics processors. Each day, OpenAI Five played the equivalent of 180 years of Dota 2.
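The 180-years-a-day figure can be turned into more familiar units with some simple arithmetic. The Python sketch below assumes a 365-day year and the typical 45-minute match length; the function names are invented for this example, and the results are rough estimates rather than numbers reported by OpenAI.

```python
DAYS_PER_YEAR = 365  # assumption: ignore leap years
MINUTES_PER_DAY = 24 * 60


def speedup_over_real_time(game_years_per_day=180):
    """How many days of gameplay are simulated per real day."""
    return game_years_per_day * DAYS_PER_YEAR


def matches_per_day(game_years_per_day=180, match_minutes=45):
    """Approximate number of full matches played per real day."""
    game_minutes = game_years_per_day * DAYS_PER_YEAR * MINUTES_PER_DAY
    return game_minutes // match_minutes


print(speedup_over_real_time())  # 65700
print(matches_per_day())         # 2102400
```

On those assumptions, the system accumulates experience about 65,700 times faster than a single game running in real time, or roughly 2 million matches' worth of play a day.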
No human has 180 years to learn a videogame. Indeed, some AI researchers say reinforcement learning is too inefficient to be useful outside of toy scenarios like games. But the OpenAI project does show that if you can put more computing power behind today’s algorithms, they can do a lot more than people expect, Brockman insists.
OpenAI’s bots don’t play like humans, either. Rather than decoding an image of the display, for example, they perceive the game as a stream of numbers detailing its different aspects. They can also react faster than human players.
If OpenAI Five wins in Vancouver, those differences, and any other tweaks made to adapt the game to a bot, may lead some AI researchers to argue it wasn’t a fair fight. Churchill says that any victory on such a complex task would be significant, but the magnitude of the breakthrough will depend on the methodological details. The only way to avoid all quibbles, he jokes, would be a match in which a robot sat at a computer and operated a keyboard and mouse. Brockman says he will judge the bots’ success based on whether pro gamers accept them as worthy opponents.
Should the bots win, the achievement will inevitably be compared with DeepMind and its work on Go. Brockman says he’s not racing DeepMind to set the next big marker in the contest between computers and humans. “We’re exploring machine learning and AI together, trying to see what are these technologies capable of,” Brockman says.