AI safety is a weird area, and in trying to explain it to friends I realized that I don’t have a simple explanation of what the (existential) risks are and why I think they’re important.
Word limits force simplicity, so here are some attempts to explain AI risk in different numbers of words.
Do you value the same things as your great-grandfather? What we value changes from generation to generation. We don’t really know how to impart our values to our children.
Does America value the same things as its founders did? Do companies share their founders' visions? We don’t know how to build institutions that faithfully carry our values into the future.
The same is true of AI. We don’t know how to make it share our values. None of this is intrinsic to AI; imparting values is just a hard problem. Values are hard to clearly explain, and our explanations rely heavily on context.
We have to try though. AIs will be extraordinarily powerful, and in a century our world will strongly reflect whatever AIs end up valuing, for better or worse. American values shaped the 20th century, and AI values will shape the 22nd.
Unlike humans, AIs get smarter the more compute we give them. Take a weak chess-playing AI, give it enough compute, and it’ll beat a grandmaster as easily as a grandmaster beats a chimp.
Compute is cheap, and getting cheaper every day. In a few decades, compute will probably be cheap enough for AIs to match humans in every domain. A few decades later they could be far beyond us.
The worry is that we do not know how to build AI systems that share our values. We don’t know how to tell them what we want in unambiguous terms.
Do you value the same things as your great-grandfather? What we value changes from generation to generation. We don’t know how to perfectly impart our values to our children.
The same is true of AI. We don’t know how to make it share our values, or preserve them, or carry them forward. None of this is intrinsic to AI; imparting values is just a hard problem. But AI will be extraordinarily powerful, and in a century our world will strongly reflect whatever AI ends up valuing, for better or worse.
It’s hard to ask optimizers the right questions; you may not get what you want. Capitalism is a powerful optimizer. We asked it for cheap transportation and we got global warming. In the coming decades we will build AI with far more optimizing power than capitalism, and we don’t know how to ask it the right questions.
10 years ago AI couldn’t play world-class Go. 5 years ago it couldn’t write clear, meaningful prose. 4 years ago it couldn’t fold proteins. 3 years ago it couldn’t play StarCraft. 1 year ago it couldn’t draw a scene from a sentence. Progress in AI is fast and, now that AI is lucrative, it’s only going to get faster.
Humans are smarter than monkeys and transformed the world. Monkeys might not approve. AI could be smarter than humans, and we don’t know how to make it share our values.
Corporations are superhuman. No individual human can make an iPhone, but Apple can. We barely manage to align corporations with our values. AI could be much harder to control.
Regulating an industry requires understanding it. This is why complex financial instruments are so hard to regulate. Superhuman AI could have plans far beyond our ability to understand.