We’re Launching Research On AI Alignment And Trust

Contents

Acknowledge The AI Alignment Problem Exists (And Is Worth Caring About)
AI Is Changing Rapidly And Risks Are Multiplying
Start Tackling AI Alignment To Create Trust

I’m excited to share new research I’ve begun with my colleague Brandon Purcell, called Align by Design. We’re tackling what I believe to be the biggest technology challenge we face this decade – AI alignment. Come help us with this research.

Today, enterprises are conflicted about AI – executives are giddy, compliance and risk personnel are concerned, and workers are just trying to figure it all out. Amidst this confusion and excitement, most firms are in FOMO mode rather than approaching AI with a long-term, strategic perspective. We think it’s time to look further into the future to find the big opportunities for those that are ready, and to start evaluating subtle and potentially dangerous issues with AI (beyond known problems of hallucinations, data privacy, etc.).

Acknowledge The AI Alignment Problem Exists (And Is Worth Caring About)

Richard Ngo at OpenAI defines the alignment problem on a societal scale as “the challenge of ensuring that AI systems pursue goals that match human values or interests rather than unintended and undesirable goals.”

Interpreted for business, AI alignment is:

The challenge of ensuring that AI is achieving its intended goals, not acting in ways that are unintended and harmful, all while adhering to organizational values.

The largest AI firms in the world are taking this seriously and making alignment a priority. OpenAI, for example, acknowledges that “we don’t have a solution for steering or controlling a potentially superintelligent AI and preventing it from going rogue…We need new scientific and technical breakthroughs.”

AI Is Changing Rapidly And Risks Are Multiplying

The field of AI alignment and safety has moved from research theory to discussion of real-world AI implications. I believe it’s time for our clients to understand this movement and take it seriously. I’m seeing some important trends:

AIs are prone to bias and the scale of data for genAI makes bias problems worse. We’ve researched algorithmic bias extensively at Forrester. Bias is particularly problematic for large language models which train unsupervised on huge bodies of potentially biased data. There’s no easy way to solve the problem of biased data at scale. In the words of an expert we just spoke with, “it’s the wild, wild west out there.”
‘Super intelligent’ AIs that act autonomously may be a closer reality than you think. This paper from Google shows that its PaLM AI is improving in its ability to reason across multiple logical steps to reach a goal. This open-source framework for Autonomous Agents lays out how to build AIs that can take action based on the steps they determine. There’s tremendous uncertainty in predicting how AIs will behave as they’re given power to automate our lives.
Early signs of undesirable behavior are emerging. AI safety research is identifying behaviors such as context-aware reward hacking, as well as AIs that learn to manipulate in pursuit of internal, intermediate goals that may not even be known to their designers. We’re seeing signs of power-seeking behavior from advanced AIs as well, where AIs compete with humans for resources to achieve the AI’s goal. And those goals may, in the future, be learned by watching what we do rather than programmed by humans.

Start Tackling AI Alignment To Create Trust

Many of the world’s leading AI experts and AI companies are working on the alignment problem. As a result, the field of AI safety research has moved from the fringe to the mainstream. More scholarly articles have been published on AI safety since 2015 than in the previous 50 years (18.8K vs. 16.6K according to Google Scholar). In The Alignment Problem, Brian Christian puts it eloquently when he states:

“We find ourselves in a fragile moment in history – where the power and flexibility of these models have made them irresistibly useful…yet standards and norms around how to use them appropriately are still nascent. It is exactly this period that we should be the most cautious.”

It’s time for organizations to understand and prepare. We have to take alignment seriously. I think firms that do will have many opportunities; firms that don’t will spend time working their way out of problems. Align by Design is our research project to figure out exactly what clients should do today. Please join us if you have an opinion. Also, please read my colleague Brandon’s blog post, which introduces the details of our research hypothesis.