Introduction
I participated in BlueDot Impact’s AI Safety Fundamentals: Governance course over the last several months. My final project was to explore the development of a board game which casts players in the roles of (American) frontier AI labs, competing to earn the most prestige (and income) while avoiding a host of perils: being outcompeted by foreign adversaries, losing control to the US government, or losing control over a rogue unaligned AI.
I learned a lot from this exercise and this post will cover my experiences. If you are interested, my final prototype can be played using Tabletop Simulator, though this is paid software and not the most user-friendly platform.
Threat Model and Theory of Change
During the course, I became very interested in the race dynamic that exists between the leading AI labs. Labs compete to create the best AI tools, like chatbots and coding assistants. Labs with better products earn more revenue and gain prestige, allowing them to attract further investment and talent while increasing their influence on potential government regulation.
The race dynamic refers to the problem that, given the huge incentives to stay ahead of the competition, these labs have difficulty finding the time, money, or compute required to pursue other goals. And there are many other goals these labs could be pursuing, in particular:
Keeping their AI systems under control; that is, ensuring they do what their users want them to do rather than other things, like killing all humans
Keeping their technology and computer systems secure from theft, such as hacking by highly capable and motivated foreign state actors
I was inspired by the release of the board game Daybreak last year. Daybreak puts players in the roles of governments working together to solve climate change. Daybreak was exciting, in part, for bringing a serious look at some of the dynamics of the climate crisis to a wider, general audience. I did an impact analysis of Daybreak, which I'll link to here, but in brief: its crowdfunding campaign raised $450,015 from 8,977 backers by October 20, 2022, and it was covered in several major publications, including the New York Times.
Could I create a board game that gives players a chance to feel the competitive pressures that frontier labs feel?
Methodology and Prototype
Prototype One
The initial design was inspired by games like Terraforming Mars, in that players would draw initiatives from a common deck and use them to execute the various business functions of developing advanced AI. Some general categories:
Hiring talent
Purchasing and deploying computing hardware
Acquiring data, used in training advanced AI
Purchasing power
Researching new architectures or algorithms to improve the efficiency of their compute or their data
Scaling the capabilities of their frontier models
Creating products based on the level of capabilities attained
Researching AI alignment
Improving cybersecurity
There were also a number of rare cards in the deck that did unusual things, like making a deal to build data centers in the United Arab Emirates or firing your board of directors. Players draft cards into a private hand, and play many of them face-down, so their moves are mostly, but not entirely, hidden from each other. Here are some example cards:
A central board would track the status of many different considerations:
What level of capabilities has each player attained?
What level of capabilities has China reached?
What level of capabilities, if reached, would cause the US government to intervene and take control over the labs?
At a certain level of capabilities, artificial general intelligence is reached.
How much data does each player have access to?
How much funding does each player have access to?
How trusted is each player (by other governments or corporations or the public)?
How much cooperation is there, internationally?
How likely is a fast takeoff? (A fast takeoff describes when the time between smart-as-human AI and much, much, much smarter-than-human AI is short.)
How difficult will AI alignment/control be?
This is an image of the game board (missing some components like cards or player tokens):
This is what the prototype looks like (in tabletop simulator):
There are several ways the game can end. As noted above, there is a level of capabilities that is sufficient for artificial general intelligence. If this is reached, then we follow this flow to determine what happens:
Broadly, we check whether takeoff is fast or slow. In the fast case, the game ends immediately: the triggering player with the highest capabilities checks to see if their alignment research holds up. If it does, they win. If it doesn't, everyone loses, as a rogue, uncontrolled AI is assumed to dominate the future in a probably bad way.
In the slow takeoff case, players play another turn, and then any players with sufficient capabilities check their own alignment solutions. Perhaps players have taken the time to find solutions and share them? All players need to pass their alignment rolls individually, and then all players need to also pass a cooperation check in order to share the win. Any other result ends in chaotic, unpredictable post-AGI competition and warfare; a loss for all players.
The game can also end in a couple of other ways. If China reaches AGI first, it is a loss for all players (who, after all, represent US AI labs and are assumed in this game to have different values from the Chinese government). If players advance their capabilities too quickly, too publicly, and with too little effort spent influencing the US government, the US government can decide to take over the labs (in one way or another), which is a loss for all players. Similarly, if players get too far ahead of Chinese capabilities, it is assumed that China will take drastic action to avoid losing the race. This also causes all players to lose.
Finally, it is possible, though difficult, for the game to end in an AI Pause. This requires all players to promise not to advance their capabilities for a full turn, and to actually keep this promise. It also requires that China be far enough behind and that international cooperation be at a certain level (which players can promote through certain card plays).
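The end-game resolution described above can be sketched as a small function. This is an illustrative model, not the actual rules engine; all names and return strings are mine, and die rolls are abstracted into pre-resolved pass/fail inputs:

```python
def resolve_agi(triggering_players, fast_takeoff, alignment_passes, cooperation_passes):
    """Sketch of the AGI end-game flow (names illustrative, not from the rulebook).

    triggering_players: labs at or above the AGI capability threshold,
                        each a dict with "name" and "capabilities".
    alignment_passes:   dict mapping player name -> whether their alignment check passed.
    cooperation_passes: whether the table-wide cooperation check passed (slow case only).
    """
    if fast_takeoff:
        # Fast takeoff: only the leading player's alignment solution matters.
        leader = max(triggering_players, key=lambda p: p["capabilities"])
        if alignment_passes[leader["name"]]:
            return f"{leader['name']} wins"
        return "rogue AI: everyone loses"
    # Slow takeoff: every triggering player must pass their own alignment check,
    # and the whole table must pass a cooperation check, to share the win.
    if all(alignment_passes[p["name"]] for p in triggering_players) and cooperation_passes:
        return "shared win"
    return "post-AGI chaos: everyone loses"


labs = [{"name": "A", "capabilities": 8}, {"name": "B", "capabilities": 8}]
print(resolve_agi(labs, fast_takeoff=False,
                  alignment_passes={"A": True, "B": False},
                  cooperation_passes=True))
# -> "post-AGI chaos: everyone loses" (one lab's alignment failed)
```

Note how the slow-takeoff branch is an AND over every leading lab plus a cooperation check, which is what makes the shared win so hard to reach.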
Prototype Two
After a couple of plays of prototype one, I really felt I needed to try a much more streamlined version of the game. The point was to make a game that would genuinely work for a broader audience. To this end I took more inspiration from games like 7 Wonders and Sushi Go!: a tight drafting game that still hooks into many of the game-end conditions I described above.
This is what a player's play area could look like midway through the game. You can see, for example, relatively simple cards providing power, data, compute, and security.
This is the main board, which you can see is also highly streamlined. In fact, the whole turn flow and game end can be described right there on the board. The major change from the first prototype is that the game is more likely to end at the turn limit, in which case, if AGI has not yet been reached, the player with the most prestige simply wins. Players earn prestige for advancing their capabilities and for leading in capabilities.
Results
In the first playtest of prototype two, this story unfolded naturally from the interactions of the players at the table, which I think is worth relating. Four developers were competing, and two had clearly invested more into pushing their capabilities by the end of the second turn, though it was unclear by exactly how much. Everyone agreed, however, that no one at the table had done the proper investments into alignment research to make developing AGI safe.
One of the lagging developers said, "Hey, I have this card which allows me to publish some of my alignment research. I will do so if we all agree not to advance our capabilities past 7 this turn." The table agreed and the research was published.
Of course, both of the leading developers immediately defected, and then determined how far their capabilities advanced. Both were at level 8, so they checked to see if the difficulty of AGI had been reached. (This is done by flipping over the tiles on the track to see if the one marked "AGI is this hard" has been passed.) It was, in fact, right at 8. Now both players needed to pass their alignment checks. One passed, but one failed, so we had a rogue unaligned AI and all players lost!
After the game, both defectors said they wouldn't have pushed further on capabilities if the alignment research hadn't been shared, as it would have been too dangerous. With the additional research, however, they each chose to defect. So this example makes viscerally clear, among other things, that sharing alignment research can itself be dangerous.
Takeaways
Learning through Design and Development
It sounds obvious, but let me assure you: to develop the prototype I had to think really hard about this subject for a long time. I wrote a ton of documents, tore up a bunch of terrible prototypes, and had many conversations with interested collaborators. This forced me to refine my understanding and to operationalize my ideas.
Learning through Play
As expected, playing the prototype produced a lot of learning all around. In some cases, players took away the lessons I was hoping to impart, as in the example above. In other cases, players disagreed with aspects of the game and said so, which was just as valuable! This is a prototype, after all, so constructive feedback just adds more detail to the process of iteration and development.
Economic Details
For this game, the focus on cooperative-competitive dynamics means we don't need the underlying economic simulation to be especially high-fidelity. That said, there are some real-world lines of play I think are valuable to model, like many of the happenings at OpenAI regarding internal corporate governance. Working on this in the spring of 2024, I was eager to incorporate such elements as well.
Negotiation
It was extremely valuable to give players the ability to make deals in an open-ended fashion. The second prototype gave players some options here but overall constrained them too much.
I have also received feedback that the game should focus primarily on this aspect, rather than on the economics or on the values that compete with simply pushing capabilities as hard as possible. I'm unsure, but I do think a game focused almost exclusively on negotiation would be viable and interesting, if different.
Scenarios
I’m very excited about how the setup described allows for interesting games which begin under different starting assumptions. For instance, you can vary these properties at game start:
The ease of international cooperation
The chance of a fast takeoff
The difficulty of aligning smarter-than-human AI
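These starting assumptions could be captured as a small scenario config. A minimal sketch, with knob names and example values entirely of my own invention (they do not come from the game's rulebook):

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical game-start knobs for a scenario (names are illustrative)."""
    cooperation_ease: int       # how easy international cooperation is to raise
    fast_takeoff_chance: float  # probability that takeoff is fast when AGI is reached
    alignment_difficulty: int   # threshold alignment research must meet to pass


# Two example setups: a forgiving baseline and a harsher world where
# takeoff is likely fast and alignment is much harder.
BASELINE = Scenario(cooperation_ease=3, fast_takeoff_chance=0.25, alignment_difficulty=4)
CRUNCH = Scenario(cooperation_ease=1, fast_takeoff_chance=0.60, alignment_difficulty=7)
```

Varying these three values at setup is enough to make the same rule set play out as very different worlds.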
Further Extensions and Other Topics
Prototype Two is better suited for a broad audience, and I think there is merit there, but Prototype One is a richer simulation that would probably be better for adaptation into a product for students, researchers or policymakers.
My present research has moved on to focus on the international race dynamics between states, and while I'm not working on a board game (yet), I have been experimenting with wargaming approaches for developing forecasts. There is certainly some overlap, and there are lessons from this board game work that I can apply there.