Facebook’s New Poker-Playing AI ReBel Performs Better Than Humans
A team of researchers from Facebook has recently developed a poker-playing AI that is capable of beating human players in heads-up, no-limit Texas hold’em poker.
Called Recursive Belief-based Learning (ReBel), the general AI framework reportedly learns poker faster than previous poker-specific AIs while using less domain knowledge, a claim the researchers support with an experiment.
The AI was pitted against Dong Kim, considered one of the best heads-up players in the world, alongside three other top human players as part of a series of trials, and the outcomes are impressive!
Not only did ReBel play at a faster pace than its human opponents (averaging under two seconds per hand and never taking more than five seconds on a decision across 7,500 hands), it also achieved an aggregate score of 165 thousandths of a big blind per game against Kim, with a standard deviation of 69. That beats the earlier poker AI Libratus, which recorded an aggregate score of 147.
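To put the reported score in perspective, the short sketch below converts a win rate in thousandths of a big blind per game into total big blinds won. It assumes, purely for illustration, that the 165 rate held evenly across all 7,500 hands; the function name and the even-rate assumption are not from the researchers.

```python
def total_big_blinds(milli_bb_per_game: int, games: int) -> float:
    """Convert a win rate in thousandths of a big blind per game
    into total big blinds won over a number of games."""
    # Multiply before dividing so the integer product stays exact.
    return milli_bb_per_game * games / 1000

# Assumption: the reported rate applies uniformly to every hand played.
rebel_total = total_big_blinds(165, 7500)
libratus_equivalent = total_big_blinds(147, 7500)
print(rebel_total, libratus_equivalent)
```

Under that assumption, a rate of 165 corresponds to winning 0.165 big blinds per hand on average.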
ReBel’s Development & Applications
ReBel addresses common problems encountered in previous AIs by operating two AI models, one representing value and one representing policy. Unlike past AIs such as DeepMind’s AlphaZero, which combined reinforcement learning and search to train models for board games like shogi, Go, and chess, ReBel is built mainly around game-state concepts.
This method results in the creation of a public belief state which enables the AI to come up with probabilities according to the sequence of actions and game states. During the decision-making process, all relevant aspects are considered, including the overall pot and chips, as well as the possible result of a given hand. Based on that information, ReBel creates a “subgame” and then incorporates reinforcement learning until it reaches the designated accuracy level.
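The idea of deriving probabilities from a sequence of observed actions can be illustrated with a simple Bayesian update. This is only a minimal sketch and not ReBel’s actual implementation: the hand categories, the prior, and the raise probabilities are all made-up values chosen for illustration.

```python
from typing import Dict

def update_belief(prior: Dict[str, float],
                  action_prob: Dict[str, float]) -> Dict[str, float]:
    """Bayes update: P(hand | action) is proportional to P(action | hand) * P(hand)."""
    unnormalized = {hand: prior[hand] * action_prob.get(hand, 0.0) for hand in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed action has zero probability under the assumed policy")
    return {hand: p / total for hand, p in unnormalized.items()}

# Hypothetical belief over an opponent's hand strength before they act.
prior = {"strong": 0.25, "medium": 0.5, "weak": 0.25}

# Hypothetical policy: how often each hand type would raise.
raise_prob = {"strong": 0.8, "medium": 0.4, "weak": 0.1}

# After observing a raise, belief shifts toward hands that raise more often.
posterior = update_belief(prior, raise_prob)
print(posterior)
```

Because every player can see the public actions, all of them could compute the same updated distribution, which is what makes such a belief “public” in this toy setting.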
Because ReBel does not rely heavily on specific domain knowledge, its application is more general, especially in settings that involve uncertainty and information that is not always available, such as the game of poker.
The researchers believe the ReBel framework can be applied in developing techniques that involve interactions between multiple agents, such as self-driving cars, negotiations, auctions, and cybersecurity – areas that are usually associated with imperfect-information multi-agent interactions.
To prevent possible cheating in real-life high-stakes games, Facebook has opted not to release the ReBel codebase for poker.