Human Compatible: AI and the Problem of Control

Stuart Russell, one of the better-known researchers in Artificial Intelligence and co-author of the best-selling textbook Artificial Intelligence: A Modern Approach, addresses in his most recent book what is probably one of the most interesting open questions in science and technology: can we control the artificially intelligent systems that will be created in the decades to come?

In Human Compatible: AI and the Problem of Control, Russell formulates and answers the following very important question: what are the consequences if we succeed in creating a truly intelligent machine?

The question brings with it many other questions, of course. Will intelligent machines be dangerous to humanity? Will they take over the world? Could we control machines that are more intelligent than ourselves? Many writers and scientists, such as Nick Bostrom, Stephen Hawking, Elon Musk, Sam Harris, and Max Tegmark, have raised these questions, several of them claiming that superintelligent machines could be around the corner and could become extremely dangerous to humanity.

However, most AI researchers have dismissed these questions as irrelevant, concentrated as they are on the development of specific techniques and well aware that Artificial General Intelligence is far away, if it is achievable at all. Andrew Ng, another famous AI researcher, said that worrying about superintelligent machines is like worrying about the overpopulation of Mars:

“There could be a race of killer robots in the far future, but I don’t work on not turning AI evil today for the same reason I don’t worry about the problem of overpopulation on the planet Mars.”

Another famous Machine Learning researcher, Pedro Domingos, also ignores these issues in his bestselling book The Master Algorithm, which is about Machine Learning, the driving force behind modern AI, and concentrates instead on concrete technologies and applications. In fact, he often says that he is more worried about dumb machines than about superintelligent machines.

Stuart Russell’s book is different, making the point that we may, indeed, lose control of such systems, even though he does not believe they could harm us out of malice or with intention. In fact, Russell is quite dismissive of the possibility that machines could one day become truly intelligent and conscious, a position I personally find very brave, 70 years after Alan Turing said exactly the opposite.

Yet Russell believes we may be in trouble if sufficiently intelligent and powerful machines have objectives that are not well aligned with the real objectives of their designers. His point is that a poorly conceived AI system that aims at optimizing some badly specified function can lead to bad results, and even tragedy, if such a system controls critical facilities. One well-known example is Bostrom’s paperclip problem, where an AI system designed to maximize the production of paperclips turns the whole planet into a paperclip production factory, eliminating humanity in the process. As in the cases that Russell fears, the problem comes not from a machine that wants to kill all humans, but from a machine that was designed with the wrong objectives in mind and does not stop before achieving them.

To avoid that risk of misalignment between human and machine objectives, Russell proposes designing provably beneficial AI systems, based on three principles that can be summarized as:

  • Aim to maximize the realization of human preferences
  • Assume uncertainty about these preferences
  • Learn these preferences from human behavior
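To make the three principles a little more concrete, here is a toy sketch in Python. It is not Russell's formulation, only an illustration under simple assumptions: the two candidate preference models, the coffee/tea options, the `rationality` parameter, and all function names are hypothetical. The machine starts out uncertain about what the human wants, updates its belief with Bayes' rule each time it observes a choice, and then acts on its current estimate.

```python
import math

# Hypothetical candidate preference models: each maps an option to a utility.
HYPOTHESES = {
    "likes_coffee": {"coffee": 1.0, "tea": 0.0},
    "likes_tea": {"coffee": 0.0, "tea": 1.0},
}


def choice_likelihood(utilities, chosen, options, rationality=2.0):
    """Probability that a noisily rational human picks `chosen` (softmax model)."""
    weights = {o: math.exp(rationality * utilities[o]) for o in options}
    return weights[chosen] / sum(weights.values())


def update_belief(belief, chosen, options):
    """Bayes' rule: reweight each hypothesis by how well it explains the choice."""
    posterior = {
        h: p * choice_likelihood(HYPOTHESES[h], chosen, options)
        for h, p in belief.items()
    }
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}


def best_action(belief, options):
    """Pick the option with the highest expected utility under the belief."""
    def expected_utility(option):
        return sum(p * HYPOTHESES[h][option] for h, p in belief.items())
    return max(options, key=expected_utility)


options = ["coffee", "tea"]

# Principle 2: start out uncertain about the human's preferences.
belief = {"likes_coffee": 0.5, "likes_tea": 0.5}

# Principle 3: learn the preferences from observed (assumed) human behavior.
for observed in ["tea", "tea", "coffee", "tea"]:
    belief = update_belief(belief, observed, options)

# Principle 1: act to satisfy the preferences as currently estimated.
print(belief)
print(best_action(belief, options))
```

The belief never collapses to certainty after a finite number of observations, which is the point of the second principle: a machine that remains unsure about what we want has a reason to keep observing us rather than pursue a fixed objective at all costs.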

Although I am not fully aligned with Russell in all the positions he defends in this book, it makes for interesting reading, coming from someone who is a knowledgeable AI researcher and cares about the problems of alignment and control of AI systems.