
Why Do We Keep Using Games as a Benchmark for A.I.?

Growing up, many of us were told that playing computer or video games is a waste of time. When it comes to training artificial intelligence, however, gameplay is not only not a waste of time; it may prove important for developing the intelligent agents of tomorrow.

Several years ago, researchers at Google DeepMind impressively demonstrated that state-of-the-art AI could master many classic Atari games without being explicitly taught how to play them. Since then, many researchers have tried their hand at reinforcement learning systems, which use trial and error to learn a game.

Now, researchers at Uber AI Labs and OpenAI have come up with a way to further refine these tools, enabling them to reach higher levels of performance in complex games that game-playing AI agents have previously struggled with.

“The new algorithm we developed, Go-Explore, outperforms previous machine learning algorithms on many Atari games, including the infamously hard exploration games Montezuma’s Revenge and Pitfall,” said Joost Huizinga, a research scientist at OpenAI.

Not only does it outperform previous systems, but Go-Explore is the first algorithm to beat every level of the notoriously difficult Montezuma’s Revenge and to rack up a near-perfect score on Pitfall.

Montezuma’s Revenge
Uber AI Labs / OpenAI

It does so by remembering previously successful approaches and returning to those high-scoring moments, rather than starting again from the beginning of the game. Since Atari games typically do not allow this (players who shun older titles in favor of modern games that routinely offer save points don’t realize how lucky they are!), the researchers used an emulator that allows them to save game states and reload them at any time.
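The “remember and return” idea can be sketched in a few lines. What follows is a minimal, illustrative Python version of the loop, not the authors’ implementation: the `ChainEnv` toy environment and its method names (`get_state`, `restore_state`, `cell`) are hypothetical stand-ins for the emulator save/load facility the researchers describe.

```python
import random

class ChainEnv:
    """Toy stand-in for an emulator: a corridor where only repeated
    'right' moves (action 1) reach the goal at position n."""
    def __init__(self, n=20):
        self.n, self.pos = n, 0
    def reset(self):
        self.pos = 0
    def get_state(self):
        return self.pos              # emulator "save state"
    def restore_state(self, state):
        self.pos = state             # emulator "load state"
    def cell(self):
        return self.pos              # coarse cell representation
    def sample_action(self):
        return random.choice((0, 1))
    def step(self, action):
        self.pos = max(0, min(self.n, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.n
        return self.pos, (1.0 if done else 0.0), done

def go_explore(env, steps_per_rollout=10, iterations=2000):
    """Sketch of the 'first return, then explore' loop."""
    env.reset()
    # The archive maps each discovered cell to its best (score, saved state).
    archive = {env.cell(): (0.0, env.get_state())}
    for _ in range(iterations):
        cell = random.choice(list(archive))      # 1. select a visited cell
        score, state = archive[cell]
        env.restore_state(state)                 # 2. return to it directly
        for _ in range(steps_per_rollout):       # 3. explore from there
            _, reward, done = env.step(env.sample_action())
            score += reward
            c = env.cell()
            # Record new cells, or better routes to known ones.
            if c not in archive or score > archive[c][0]:
                archive[c] = (score, env.get_state())
            if done:
                break
    return archive
```

Because the loop reloads saved states instead of replaying from the start, even pure random exploration steadily pushes the frontier of discovered cells outward.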

Playing through a collection of 55 Atari games – an increasingly standard benchmark for reinforcement learning algorithms – Go-Explore was able to beat other state-of-the-art systems on these titles 85.5% of the time.

This is an impressive demonstration of artificial intelligence in action. And, despite involving just games, it could have exciting real-world applications.

Checkmates and Robot Overlords

From the very beginning – before “artificial intelligence” had even been coined as the discipline’s official name – researchers in the field were interested in games. In 1949, Claude Shannon, one of the founding figures of AI, gave his explanation for why making a computer play chess would be a worthy endeavor.

Games such as chess, Shannon wrote, present a sharply defined problem, with both clearly allowable operations and a clear end goal. They are neither trivial to solve nor impossibly hard, yet they require intelligence, and their discrete (non-continuous) structure suits the step-by-step manner in which computers solve problems.

While the technology driving these systems has changed immensely over the intervening 70-plus years, many of those same qualities still motivate the use of games for developing and testing artificial intelligence. Games provide a pithy, simplified version of the real world, in which the complexity of a problem is distilled into actions, states, rewards, and clear-cut winners and losers.

Although Shannon, Alan Turing, and many other early AI luminaries worked on the challenge of computer chess – with some notable successes along the way, such as Massachusetts Institute of Technology programmer Richard Greenblatt’s Mac Hack in 1967 – it wasn’t until May 1997 that computer chess truly caught the world’s attention.

This was the month and year when IBM’s Deep Blue system defeated world chess champion Garry Kasparov in a six-game match. Deep Blue was a striking example of brute-force computing in action. It used massively parallel hardware, comprising 30 top-end processors, to evaluate some 200 million board positions every second. IBM’s chess-playing supercomputer was also equipped with a memory bank containing thousands of previous master-level games on which it could draw. (Ironically, one of Deep Blue’s most unnerving moves was actually the result of a failure on the part of the system, which, unable to choose a move, picked one at random – a move Kasparov mistook for creativity.)
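Brute-force game-tree search of the kind Deep Blue scaled up can be illustrated with exhaustive minimax on tic-tac-toe, a game small enough to search completely. This is an illustrative sketch, not IBM’s code:

```python
# Exhaustive minimax on tic-tac-toe: examine every possible continuation,
# the same brute-force principle Deep Blue applied to chess at vastly
# greater scale and speed.
WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
        (0, 3, 6), (1, 4, 7), (2, 5, 8),
        (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player, stats):
    """Return the value of `board` for X (+1 X wins, 0 draw, -1 O wins)
    by searching the entire game tree."""
    stats[0] += 1                      # count every position examined
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    moves = [i for i, v in enumerate(board) if not v]
    if not moves:
        return 0                       # board full: draw
    values = []
    for m in moves:
        board[m] = player
        values.append(minimax(board, "O" if player == "X" else "X", stats))
        board[m] = ""                  # undo the move
    return max(values) if player == "X" else min(values)
```

Even this tiny game requires examining hundreds of thousands of positions from the empty board, which hints at why chess demanded Deep Blue’s specialized hardware.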

Nearly 15 years later, in February 2011, IBM scored its next headline-grabbing AI gaming triumph, when its Watson AI faced off against former champions Ken Jennings and Brad Rutter in a much-hyped television special of the game show Jeopardy! “I had taken AI classes and knew that the kind of technology that could beat humans at the game was still decades away,” Jennings said. “Or at least I thought it was.”

IBM / YouTube

In the event, Watson trounced the pair, en route to winning $1 million in prize money. “I lost really badly,” said Jennings, who holds the record for the longest winning streak in the show’s history. As the game came to a close, he wrote a phrase on his answer board and held it up for the cameras: “I for one welcome our new computer overlords.”

Gameplaying and Machine Learning

Many recent game-playing AI demonstrations have come from DeepMind, which has focused on games as part of its stated goal of “solving intelligence.” Perhaps its most notable achievement was AlphaGo, a Go-playing bot that beat world champion Lee Sedol four games to one in a 2016 series watched by some 60 million people. Its other efforts include Atari-playing AI and AlphaStar, which mastered the real-time strategy game StarCraft II.

Compared to Deep Blue’s brute-force computing efforts, these are demonstrations of machine learning techniques. This is partly out of necessity. Go, for example, features far more potential board positions than chess, making brute force much harder to employ. The opening move in chess allows for 20 possible replies; the first player in Go has 361 options. In total, Go has more legal board positions than there are atoms in the known universe. That is a tall order for brute-force computing, even allowing for the advances in hardware since 1997.
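A quick back-of-the-envelope calculation shows why. The figures below are the standard rough estimates (around 35 moves per chess position over an 80-ply game, around 250 per Go position over a 150-ply game), not exact counts:

```python
# Rough, commonly cited estimates -- orders of magnitude, not exact counts.
chess_branching, chess_plies = 35, 80    # ~35 legal moves per position, ~80-ply game
go_branching, go_plies = 250, 150        # ~250 legal moves per position, ~150-ply game
atoms_in_universe = 10 ** 80             # standard order-of-magnitude figure

chess_tree = chess_branching ** chess_plies   # ~10^123, the "Shannon number"
go_tree = go_branching ** go_plies            # ~10^359

# Go's game tree dwarfs both chess's tree and the atom count.
assert go_tree > chess_tree > atoms_in_universe
print(f"chess tree ~10^{len(str(chess_tree)) - 1}, go tree ~10^{len(str(go_tree)) - 1}")
```

Hardware has improved enormously since 1997, but no conceivable improvement closes a gap of hundreds of orders of magnitude, which is why learning-based approaches took over.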

AlphaGo
DeepMind / YouTube

As modern game-playing AI systems, like AI as a whole, have largely switched from following hand-coded rules to learning for themselves, advanced reinforcement learning approaches now feature heavily. This shift has also opened up a new advantage of using games as a testbed for AI systems: free data.

As symbolic AI gave way to today’s data-hungry machine learning tools, games offered researchers a plentiful source of the data needed to power their demonstrations. Demis Hassabis, CEO and co-founder of DeepMind, made this point in a November 2020 interview with Azeem Azhar of the Exponential View newsletter. “We were a small startup, we didn’t have access to a lot of data from applications … and so we had to synthesize our own data,” Hassabis said. “If you use games, whether it’s board games like Go or simulations, such as computer games, video games, you can play them as long as you want and generate as much synthetic data as you want.”

Case in point: AlphaGo played more than 10 million games to achieve its Go-playing prowess. In non-gameplay scenarios, that mass of data has to be gathered from elsewhere. In the case of game-playing AI, it can be generated by the system itself.
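As a toy illustration of the idea (nothing like DeepMind’s actual pipeline), even random self-play in a trivial game such as tic-tac-toe yields an endless stream of labeled (position, outcome) training pairs at no data-collection cost:

```python
import random

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
        (0, 3, 6), (1, 4, 7), (2, 5, 8),
        (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game():
    """Play one game of random-move tic-tac-toe, labeling every position
    it passed through with the final result (+1 X win, -1 O win, 0 draw)."""
    board, history, player = [""] * 9, [], "X"
    result = 0
    while True:
        moves = [i for i, v in enumerate(board) if not v]
        if not moves:
            break                          # full board: draw
        board[random.choice(moves)] = player
        history.append(tuple(board))
        w = winner(board)
        if w:
            result = 1 if w == "X" else -1
            break
        player = "O" if player == "X" else "X"
    return [(position, result) for position in history]

def generate_dataset(n_games):
    """Synthesize a labeled dataset purely from self-play."""
    data = []
    for _ in range(n_games):
        data.extend(self_play_game())
    return data
```

Stronger systems replace the random move selection with the network being trained, so the data improves as the player does; the principle of the game labeling its own data is the same.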

Real-world uses

There is, it must be said, an element of P.T. Barnum “roll up, roll up to see the incredible intelligent computer” showmanship to public game-playing AI demonstrations. They are machine learning research’s equivalent of the Olympic Games. Far more people saw IBM’s Watson play Jeopardy! than have read the research papers describing backpropagation, one of the most important algorithms in modern machine learning. IBM’s stock soared after the Kasparov chess match in 1997, and again after Watson’s Jeopardy! win in 2011.

But game-playing AI is not just a cynical grab for attention. “The ultimate goal, of course, is not to solve games per se,” said Adrien Ecoffet, a research scientist at OpenAI. “The problem of exploration itself is very general, so algorithms that solve games well can be useful in practical applications, too. In our work, we show that the algorithm we used to solve Atari games can also be used to solve a challenging robotics problem. Beyond robotics, Go-Explore has already seen some experimental research in language learning, where an agent learns the meaning of words by exploring a text-based game, and in uncovering potential failures in the behavior of a self-driving car in order to avoid those failures in the future.”

Jeff Clune, a research team leader at OpenAI, noted that DeepMind has successfully applied machine learning to practical, real-world problems such as controlling stratospheric balloons and cooling data centers.

Meanwhile, Huizinga pointed out that reinforcement learning tools are already widespread in the recommendation systems that determine which videos or advertisements to show users online. Similarly, the search algorithms that allow AI agents to find their way around video games also form a “backbone algorithm” for automated route planning in navigation systems.
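The classic example of such a backbone algorithm is A* search, which routes game characters around obstacles and, with grids swapped for road graphs, underlies turn-by-turn navigation. Below is a minimal grid version, an illustrative sketch rather than any particular product’s code:

```python
import heapq

def a_star(grid, start, goal):
    """A* shortest path on a 2-D grid (0 = free, 1 = wall),
    using the Manhattan distance as an admissible heuristic."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    # Priority queue of (f = g + h, g = cost so far, node, path taken).
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = node[0] + dr, node[1] + dc
            nxt = (r, c)
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0:
                if g + 1 < best_g.get(nxt, float("inf")):
                    best_g[nxt] = g + 1
                    heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None  # goal unreachable
```

The heuristic steers the search toward the goal, so A* expands far fewer nodes than blind search while still guaranteeing the shortest route – the same trade-off a satnav makes on a road network.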

“While, to the best of our knowledge, there are no commercially applied versions of Go-Explore yet, it might not be long before we start seeing practical applications.” And, with them, most likely those of plenty of other game-playing AI systems, too.

A paper describing Uber AI Labs and OpenAI’s reinforcement learning project was recently published in the journal Nature.
