U.S. Army game-theory research using artificial intelligence may help treat cancer and other diseases, improve cybersecurity, deploy Soldiers and assets more efficiently and even win a poker game.
New research published in Science and conducted by scientists at Carnegie Mellon University produced an artificial intelligence program called Pluribus that defeated leading professionals in six-player no-limit Texas hold’em poker.
The Army and National Science Foundation funded the mathematical modeling portion of the research, while funding from Facebook was specific to the poker work.
“It’s all about strategy,” said Dr. Purush Iyer, division chief, network sciences at the Army Research Office, an element of the U.S. Army Combat Capabilities Development Command’s Army Research Laboratory. “A limiting factor in game theory has always been scalability (i.e., ability to deal with exponentially increasing state space). Poker is an accessible example to show how these mathematical models can be used to devise strategies for situations where a person doesn’t have complete information – they don’t know what the adversaries will do, and what their capabilities are.”
This research is highly relevant to many real-world and military challenges that involve multiple parties, such as cybersecurity and defense posturing, he said.
Poker has been an AI challenge because it is an incomplete information game, where players cannot be certain which cards are in play and opponents can, and will, bluff, much like military strategy.
“Thus far, superhuman AI milestones in strategic reasoning have been limited to two-party competition,” said Dr. Tuomas Sandholm, Angel Jordan Professor of Computer Science, who developed Pluribus with Noam Brown. Brown is finishing his doctorate in Carnegie Mellon’s Computer Science Department while working as a research scientist at Facebook AI. “The ability to beat five other players in such a complicated game opens up new opportunities to use AI to solve a wide variety of real-world problems.”
“Playing a six-player game rather than head-to-head requires fundamental changes in how the AI develops its playing strategy,” said Brown, who joined Facebook AI last year.
Pluribus dispenses with theoretical guarantees of success and nevertheless develops strategies that enable it to consistently outplay opponents. Pluribus first computes a blueprint strategy by playing six copies of itself, which is sufficient for the first round of betting. From that point on, Pluribus does a more detailed search of possible moves in a finer-grained abstraction of the game. It looks ahead several moves as it does so, but it does not need to look ahead all the way to the end of the game, which would be computationally prohibitive. Limited-lookahead search is a standard approach in perfect-information games, but is extremely challenging in imperfect-information games. A new limited-lookahead search algorithm is the main breakthrough that enabled Pluribus to achieve superhuman performance in multi-player poker.
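To make the idea of limited-lookahead search concrete, here is a minimal sketch in the easier perfect-information setting mentioned above. It is not the Pluribus algorithm. It assumes a toy game chosen purely for illustration: players alternately remove one, two or three stones from a pile, and whoever takes the last stone wins. The function names and the two position estimates are hypothetical.

```python
# Toy illustration of limited-lookahead search in a perfect-information game.
# Assumed game (not from the article): remove 1-3 stones per turn from a pile;
# the player who takes the last stone wins.

MOVES = (1, 2, 3)


def search(stones, depth, estimate):
    """Value of the position for the player about to move (+1 win, -1 loss)."""
    if stones == 0:
        return -1.0  # the opponent just took the last stone
    if depth <= 0:
        return estimate(stones)  # stop looking ahead and use an estimate
    # A position is as good as the best move, seen from the opponent's side.
    return max(-search(stones - take, depth - 1, estimate)
               for take in MOVES if take <= stones)


def best_move(stones, depth, estimate):
    """Pick the move with the best value after looking `depth` moves ahead."""
    legal = [take for take in MOVES if take <= stones]
    return max(legal, key=lambda take: -search(stones - take, depth - 1, estimate))


def neutral_estimate(stones):
    """Knows nothing about unfinished positions."""
    return 0.0


def mod4_estimate(stones):
    """Hand-written estimate: a pile that is a multiple of 4 is bad for the mover."""
    return -1.0 if stones % 4 == 0 else 1.0


print(best_move(7, 2, neutral_estimate))  # shallow search, uninformed estimate
print(best_move(7, 4, neutral_estimate))  # deeper search, same estimate
print(best_move(7, 2, mod4_estimate))     # shallow search, informed estimate
```

In this toy, a shallow search with a good estimate of unfinished positions chooses the same move as a deeper search with no estimate at all, which is the trade-off a limited lookahead relies on. In an imperfect-information game such as poker, the value of an unfinished position also depends on hidden cards and on how each player intends to play, which is part of why the approach is so much harder there.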
The software also seeks to be unpredictable. For instance, betting would make sense if the AI held the best possible hand, but if the AI bets only when it has the best hand, opponents will quickly catch on. So Pluribus calculates how it would act with every possible hand it could hold and then computes a strategy that is balanced across all of those possibilities.
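The arithmetic behind a balanced betting strategy can be sketched with a textbook toy model. This is an assumption for illustration and not how Pluribus computes its strategy: a bettor holds either the best hand or a worthless one, bets a fixed amount into a pot, and a single opponent must decide whether to call. The function names and chip amounts are hypothetical.

```python
# Toy model of balanced betting (illustrative assumption, not Pluribus itself).
# The bettor holds either the best hand or a bluff; the opponent can call or fold.
from fractions import Fraction


def caller_ev(pot, bet, bluff_fraction):
    """Opponent's expected gain from calling, given how often a bet is a bluff."""
    win = bluff_fraction * (pot + bet)   # caller wins the pot plus the bet
    lose = (1 - bluff_fraction) * bet    # caller loses the amount called
    return win - lose


def balanced_bluff_fraction(pot, bet):
    """Share of bets that must be bluffs so that calling gains nothing on average."""
    # Solve caller_ev == 0 for bluff_fraction: f * (pot + 2 * bet) == bet.
    return Fraction(bet, pot + 2 * bet)


pot, bet = 1, 1  # integer chip counts
never_bluff = Fraction(0)
balanced = balanced_bluff_fraction(pot, bet)

print(caller_ev(pot, bet, never_bluff))  # negative: the opponent should always fold
print(balanced)                          # bluff share that makes calling break even
print(caller_ev(pot, bet, balanced))     # zero: the opponent cannot exploit the bet
```

If the bettor never bluffs, calling always loses money, so an observant opponent simply folds and the strong hand wins nothing extra. Mixing in bluffs at the balanced rate removes that easy read, which is the intuition behind computing one strategy across every hand the player could hold.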
With Army funding, Sandholm and some of his other students are developing related techniques for bio-steering, where the researchers are computing optimal treatment plans that steer a patient’s immune system to better fight cancers, autoimmune diseases, infections, etc.
Previous Army-funded game theory research is now being used by the Transportation Security Administration, the U.S. Coast Guard and the Los Angeles Metro Rail to schedule resources in a manner that decreases costs for those organizations while ensuring safety and increasing the costs for an adversary, thus reducing the chances of attacks.
Furthermore, Army-funded foundational research in algorithmic game theory has been used in civil society to reduce poaching of elephants in Queen Elizabeth Forest, Uganda, and tigers in Southeast Asia, as well as in addressing homelessness and implementing HIV-prevention campaigns in Los Angeles.
“The research work of Dr. Sandholm and others will be used in a variety of ways in the not-too-distant future to address societal problems in a cost-effective manner,” Iyer said. “Dr. Sandholm’s work is an exciting advance in game-theory; the applications are enormous.”