AI Poker – Who Wins?
It is now twenty years since the notorious match between world Chess champion Garry Kasparov and Deep Blue, IBM’s supercomputer. Kasparov lost, and went on to claim that the computer had been controlled by an actual human chess master. In essence, Kasparov called shenanigans. He refused to believe that a cold, calculating machine could beat a reasoning, thinking human being. In reality, it was most probably precisely because of Deep Blue’s rigid, calculating nature that the machine managed to beat its human counterpart.
Deep Blue did not have the capacity to be creative or apply any form of “enlightened” reasoning, the very things that often end up interfering with sound judgement. Deep Blue was all about applying the rules of Chess as effectively as possible in order to achieve a certain outcome. Its performance and subsequent victory announced the dawn of a new age: the age of artificial intelligence and its capacity to outsmart, outwit and out-think humanity.
Deep Blue ushered in the age of Big Data.
Defining The Unbeatable
Despite Deep Blue’s astounding success, there are games that have caused the proverbial scratching of the head, even of machine-like heads. One of these is StarCraft, a real-time strategy game in which a player develops a military base and then attacks the bases belonging to other players. StarCraft was one of the earliest major eSports titles, and researchers working for giants like Facebook and Microsoft have published papers on StarCraft and the difficulty AI has in cracking the game, mainly due to the seemingly endless number of variables it poses.
Do Machines Know When To Hold’em?
Apparently the answer is yes. The developers of DeepStack have built an artificial intelligence that is able to compete with the best in the (human) Poker business. The secret to DeepStack’s success? Deep learning, a technique that loosely mimics the human brain in its basic thought processes and in essence enables the machine to teach itself new tricks.
The very nature of Texas Hold’em Poker relies on the human attribute of intuition. The machine version of intuition apparently comes from abandoning the strategy of earlier AI systems, which tried to calculate every step for the remainder of the game, and instead looking ahead only a few steps at a time.
The new AI way of doing things (as employed by DeepStack) involves constantly recalculating its strategy as new information becomes available. How was this particular skill taught? By throwing more than 10 000 random Poker game situations at the system.
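The idea of looking only a few steps ahead and recalculating at every turn can be sketched on a far simpler game than Poker. The snippet below is only an illustration, not DeepStack’s actual method: it assumes a toy stone-taking game with no hidden cards, and the names `estimate`, `lookahead`, `choose_move` and `play` are hypothetical. The hand-written `estimate` function stands in for the value guess that a trained system would supply once the search stops.

```python
MAX_TAKE = 3  # toy rule (assumption): remove 1 to 3 stones; taking the last stone wins


def estimate(stones):
    """Hypothetical stand-in for a learned value estimate.

    Returns a guess for the player about to move:
    -1.0 if the position looks lost, +1.0 otherwise.
    """
    return -1.0 if stones % (MAX_TAKE + 1) == 0 else 1.0


def lookahead(stones, depth):
    """Value for the player to move, searching only `depth` moves ahead."""
    if stones == 0:
        return -1.0  # the opponent took the last stone, so this player lost
    if depth == 0:
        return estimate(stones)  # stop searching and trust the estimate
    # A position is worth the negative of what it is worth to the opponent.
    return max(-lookahead(stones - take, depth - 1)
               for take in range(1, min(MAX_TAKE, stones) + 1))


def choose_move(stones, depth=2):
    """Pick a move from the current position using a shallow search."""
    moves = range(1, min(MAX_TAKE, stones) + 1)
    return max(moves, key=lambda take: -lookahead(stones - take, depth - 1))


def play(stones, depth=2):
    """Let the searcher play both sides, re-planning before every move."""
    history = []
    while stones > 0:
        take = choose_move(stones, depth)  # recalculated from scratch each turn
        history.append(take)
        stones -= take
    return history


if __name__ == "__main__":
    print(play(10))
```

The search never maps out the whole game. Each decision looks two moves ahead, falls back on the estimate beyond that, and is thrown away and recomputed once the position changes.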
Baptism By Fire
In 2016 the International Federation of Poker handpicked thirty-three professional Poker players and pitted them against DeepStack’s strategies. After separating the wins that came down to luck from those that came down to strategy, a conclusion was reached: DeepStack’s win rate came in at more than 10 times what professional players deem to be a decent margin.
These findings agree with the recent success enjoyed by Libratus, a Poker-playing AI and brainchild of researchers at Carnegie Mellon University in Pittsburgh. Libratus went up against four of the world’s best Texas Hold’em Poker professionals over a staggering 120,000 hands of Poker, and out-bluffed all four of them.
Dong Kim was one of the players outsmarted by the machine, reporting afterwards that by the halfway mark, he had started to suspect that Libratus could in fact see his cards. He went on to say that he wasn’t accusing Libratus of cheating per se, but that the AI was simply that good.
We Are Many
Carnegie Mellon’s merry men did not seem too eager to divulge much about Libratus or the inner workings of its decision-making during the stand-off, but it was revealed later that Libratus wasn’t a single AI. Instead, it relied on a three-pronged system whose parts worked together towards a common goal.
Relying on reinforcement learning, essentially a method of trial and error, Libratus succeeded by playing game after game against itself. Starting out, it knew nothing about special Poker strategies or the like; it was simply made aware of what the rules of the game were. By playing repeatedly against itself, within the frame of its three-part system, it explored avenue after avenue and combination after combination, thereby equipping itself successfully for the task before it.
In all fairness, it must be mentioned that Libratus did have the benefit of being able to take stock of the situation, as it were, every evening after the day’s rounds.
Still, all things considered, Libratus outperformed even the expectations of its human creators.
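Learning by trial and error from nothing but the rules can be illustrated with a very small self-play loop. The sketch below is an assumption-laden toy, not a description of how Libratus works: it uses a simple stone-taking game instead of Poker, a plain lookup table of move values, and hypothetical names and parameters (`train_by_self_play`, `best_move`, `epsilon`, `step`).

```python
import random

MAX_TAKE = 3  # toy rule (assumption): remove 1 to 3 stones; taking the last stone wins


def legal_moves(stones):
    """The only knowledge given to the learner: which moves are allowed."""
    return range(1, min(MAX_TAKE, stones) + 1)


def train_by_self_play(start=10, episodes=5000, epsilon=0.2, step=0.1, seed=0):
    """Learn move values by playing against itself, knowing only the rules."""
    rng = random.Random(seed)
    # q[(stones, take)] = estimated outcome for the player making that move
    q = {(s, t): 0.0 for s in range(1, start + 1) for t in legal_moves(s)}
    for _ in range(episodes):
        stones, trajectory = start, []
        while stones > 0:
            moves = list(legal_moves(stones))
            if rng.random() < epsilon:
                take = rng.choice(moves)  # explore: try something random
            else:
                take = max(moves, key=lambda t: q[(stones, t)])  # use what it knows
            trajectory.append((stones, take))
            stones -= take
        # Whoever moved last won. Walk backwards through the game,
        # flipping the sign at each move because the players alternate.
        outcome = 1.0
        for state_action in reversed(trajectory):
            q[state_action] += step * (outcome - q[state_action])
            outcome = -outcome
    return q


def best_move(q, stones):
    """The move the trained table currently rates highest."""
    return max(legal_moves(stones), key=lambda t: q[(stones, t)])


if __name__ == "__main__":
    table = train_by_self_play()
    print({stones: best_move(table, stones) for stones in range(1, 11)})
```

The table starts at zero everywhere, so early games are close to random. Wins and losses gradually push the values apart, and the endgame is learned first: with two or three stones left, the table comes to prefer taking them all.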