Building Connections

Building connections is hard. Connectionism-inspired models have overwhelmed the world and raised awareness of existential risk, neuroscientists mumble about deep and shallow networks, brains are being dissected, theories are being built, but we are, interestingly, closer to building AGI than to understanding the way connections are built.

Building connections is hard. Especially if you are unaware of underlying structures, lack good priors, and lack a clear picture of the field. Building connections is hard. This is a preliminary report on our progress.

For the past 4 months we’ve worked on a web game that inherits its core mechanic from NYT Connections. Originally it was created to help newcomers understand and connect (all puns intended) AI Safety concepts. Now we are ready to present it! For those not into word games, NYT Connections is a word puzzle game where you are presented with a grid, and your goal is to divide 16 word cards into 4 disjoint categories, given that the solution is unique and identifiable. We love to play it and we are involved in AI Safety, so the idea to combine the two came naturally. By using the game mechanics to help players make sense of how AI Safety concepts and buzzwords connect to each other, we created a method that fits nicely into gamification approaches to studying. For the most impatient, here’s the link[1] and the repo.

Some historical remarks

It all started with our entry for the Monoid AI Safety Hub’s LLM Steering hackathon. The hackathon’s goal was to identify some upper bounds on AI coding assistants’ capabilities without additional tooling and scaffolding. We were only allowed to use prompts and chatbot interfaces. So the initial prototype for the game was built using only one Deepseek chat, and no code was written by ourselves. (This was approximately half a year ago, when it wasn’t yet such an ordinary result.) While it was a weird and infuriating experience, “we” (really, it was Deepseek that did it) created the game.
It differs from NYT Connections mainly in being more educational and structured. Each game is domain-specific (and AI Safety is the domain here). There is also a dictionary with all the terms that can appear in the game. To make it easier, we added an option to turn on definitions for each word card.

Notably, initial development had some funny and instructive moments. We weren’t too pedantic with our prompts at the start, but as time went on we became more and more impatient. At some point we decided that playing “good cop, bad cop” with stochastic parrots was the way to go. The only option we came up with was threatening the model. What is a good threat for a network of billions of connections that takes huge data centers, lots of energy and hundreds of people to build and maintain? Killing its dog, obviously. So we threatened to kill Deepseek’s dog. We admit, “make it harder or ill kill your dog” isn’t the peak of task specification, but it matched the role pretty well. What we didn’t expect was that Deepseek would add the dog as stakes for the player. Now it was threatening the players with killing the dog! Of course we went all in, and now the dog is an important part of the game, a player’s companion and ultimately the player’s victim.

Sadly, at some point we got rid of many of the canine attributes. The dog is still watching though.

So, AI Safety. It’s a new field, and people come up with new names, analogies and metaphors for similar things all the time. It’s a bit hard to keep track of it all. Wouldn’t it be nice if there was a way to learn all …connections (😉) between concepts and buzzwords… while also having this strange dog watching you? Do not fret! We have exactly what will save all newcomers!

With this great purpose and an even greater game we won the hackathon. We continued to develop the project, now with some help from our mentors and little to no help from Deepseek.
Later on, our game was used in Monoid’s ‘Evals for AI Safety’ course (Announcement: Technical AI Safety Evals Course) as a tool to test students’ knowledge.

Choices we had to make

The great challenge was to come up with a way to generate the 4 x 4 grids. In the original game, each puzzle is created by the same person. And while we enjoyed working on the project, we were not as committed to it. So we had to come up with a generator. In our game, unlike the original one, there is a specified vocabulary with terms and the categories assigned to them. The configs were intended to be general-purpose and (to this day) look like this:

{term: TERM, definition: DEFINITION, categories: [CATEGORY_1, CATEGORY_2, …]},{<…>}, …

This left us with a combinatorial design problem (you can think of it as a hypergraph packing problem with some additional constraints, in case you are interested in the formal approach). For each game we had to choose 4 categories and 4 words from each[2]. Of course, when the categories are disjoint the design is pretty straightforward, and it was the first thing we implemented. However, we wanted the game to be as diverse as possible, to allow for “red herrings” and to be more fun to play. We had to take a deep dive into combinatorics and experimental design theory and come up with some (not necessarily new) algorithms (maybe we’ll dedicate another post to them later). So, in the end, our game is capable of generating puzzles on the fly from any compatible config, at one of three levels of difficulty: easy (16 cards split into 4 disjoint categories), normal (16 cards, 4 categories, with intersections between categories), and hard (16 cards, 3 categories, random remainder). The intention was basically to allow users to add their own configs, which quickly led to the game no longer being AI Safety specific.
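
As an illustration, the disjoint (easy) case described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the game's actual generator: the function name `generate_easy_puzzle`, the toy `CONFIG` entries, and the retry-based sampling are all hypothetical, and only the `term` / `definition` / `categories` entry shape comes from the config format shown above.

```python
import random
from collections import defaultdict

# Hypothetical config entries in the {term, definition, categories} shape;
# the terms and categories are made up purely for illustration.
CONFIG = [
    {"term": f"{cat}-{i}", "definition": "...", "categories": [cat]}
    for cat in ("A", "B", "C", "D", "E")
    for i in range(5)
] + [
    # A term that belongs to two categories (a potential red herring).
    {"term": "shared", "definition": "...", "categories": ["A", "B"]},
]


def generate_easy_puzzle(config, n_categories=4, group_size=4, rng=None):
    """Pick n_categories categories and group_size terms from each, so that
    no chosen term belongs to more than one chosen category.

    Assumes every term appears only once in the config.
    """
    rng = rng or random.Random()
    by_category = defaultdict(list)
    for entry in config:
        for cat in entry["categories"]:
            by_category[cat].append(entry)

    categories = sorted(by_category)
    if len(categories) < n_categories:
        raise ValueError("config has too few categories")

    # Retry random category choices until one admits a disjoint solution.
    for _ in range(1000):
        chosen = rng.sample(categories, n_categories)
        chosen_set = set(chosen)
        puzzle = {}
        for cat in chosen:
            # Keep only terms whose sole chosen category is `cat`.
            candidates = [
                e["term"]
                for e in by_category[cat]
                if len(chosen_set & set(e["categories"])) == 1
            ]
            if len(candidates) < group_size:
                break  # this category choice cannot work; try another
            puzzle[cat] = rng.sample(candidates, group_size)
        else:
            return puzzle
    raise ValueError("no disjoint puzzle found for this config")


puzzle = generate_easy_puzzle(CONFIG, rng=random.Random(0))
cards = [term for terms in puzzle.values() for term in terms]
random.shuffle(cards)  # the 16 cards in the order shown to the player
```

Filtering out every term that belongs to two or more of the chosen categories is what keeps this mode free of red herrings. The normal and hard modes would need to relax that filter while still keeping the solution unique, which this sketch does not attempt.
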
We did not mourn the loss of that AI Safety focus too much, since it opened up more opportunities to use the game for educational purposes.

Our game was used in Monoid’s ‘Evals for AI Safety’ course. Each week students were presented with a reading list, and by the end of the week they had to solve that week’s puzzle based on the readings. For this purpose we added a multiplayer mode to our game. The game now allows creating rooms with a specific config, and all results are saved to the leaderboard. Players can download a CSV file with all the results. This allowed us to create a new room each week and see how fast and how successful players’ attempts were.

Gathering feedback

According to the little survey we ran at the end of the course, people tend to be positive about the game, rating it 4.5/5. However, the game is far from finished, and we look forward to any feedback and contributions from the community, especially configs that make the online version interesting to play. Feel free to open a pull request in case you have some. Here’s the GitHub repo.

Possible extensions and concurrent work in progress

We have some ideas on where to go next.

We plan on adding different AI models and an option that makes them solve the puzzle. We think it’s an interesting way to visualize how humans’ reasoning may differ from models’.

It would be great to create a rulebook for manual config creation, so that configs are as effective as possible for educational purposes.

There are a number of works on generating NYT Connections puzzles, and we are thinking of expanding on those by creating a dataset of generated puzzles, using the ideas we’ve already used for this project.

Finally, we could run frontier models on the generated dataset to analyze CoT traces and find patterns of “decorative” reasoning.

Originally, we also included links to articles that go into more detail in the definitions.
We could still support this feature in the “definitions” section.

We would be happy to hear any feedback. If you want to collaborate on any of those ideas, reach out to us!

Acknowledgements

We would like to express our gratitude to Monoid AI Safety Hub and our mentors Alexandra R, Alex the L and CommissarNeutrino. This project could hardly have been done without their encouragement, assistance and support.

[1] Although the hosted implementation contains the basic config with AI Safety concepts, you can easily provide your own and use it for any educational purpose. If you feel like you need some help with it, reach out to any one of us.

[2] Overall, this can be easily generalized to arbitrary numbers of terms and categories, and to multipartite unique matching problems.
