Austin Spizzirri
21. Computer science student. Published researcher. Musician. Writing on philosophy and AI alignment.
What is the alignment problem?
We’re building machines that appear smarter and smarter. They’re not — they’re becoming more and more capable, which isn’t the same thing. Current AI systems are trained on vast amounts of data, and they piece it together well enough to pass a bar exam, but that doesn’t make them intelligent. They predict and pattern match — which is part of what intelligence does — but without any underlying model of reality grounding the prediction. No active maintenance, no felt sense of when something’s wrong, no ability to update what they know from experience. They produce outputs that look like comprehension, but there’s no one home. Most people — including most of the people building them — don’t fully grasp that distinction. That’s half the problem. The other half is alignment itself.
Alignment is the question of how to make sure these systems actually care about what we care about. Not perform caring. Not optimize for looking aligned while pursuing something else beneath the surface. Actually care — in a way that holds when they’re more capable than us and we can’t course-correct anymore. The standard approach is to specify values and train systems to follow them. I think that’s a dead end. You can’t encode morality into something the way you upload a file. Values that matter — the kind worth having — emerge from lived experience, from relationships with stakes, from being in a world where your choices cost something.
That’s the thesis. I’m designing an experiment to test it: AI agents that start empty, with no language, no values, no training, in environments where hunger hurts, death is permanent, and other agents aren’t abstractions. The goal is to see whether genuine values can emerge from developmental experience the way they do in biological minds, or whether that’s a dead end too. The work is early. The question is not.
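To make the shape of that world concrete, here is a minimal toy sketch. Everything in it — the class names, the `forage`/`rest`/`share` actions, the hunger limit, the random placeholder policy — is my own illustration of the kind of environment I mean, not the actual experiment:

```python
import random

class Agent:
    """A 'blank' agent: no language, no values, no pretraining.

    All it has is a hunger level and, for now, a random policy.
    """
    def __init__(self, agent_id):
        self.id = agent_id
        self.hunger = 0      # rises every tick; reaching the limit is fatal
        self.alive = True

    def act(self):
        # Placeholder: in the real experiment this policy would have to
        # be learned from lived experience, not hard-coded or sampled.
        return random.choice(["forage", "rest", "share"])

def step(agents, food_chance=0.5, hunger_limit=10):
    """One tick of the world: hunger rises, foraging may find food,
    and death is permanent — a dead agent never re-enters the loop."""
    for a in agents:
        if not a.alive:
            continue                      # permanence: no respawns
        a.hunger += 1                     # hunger hurts, every tick
        if a.act() == "forage" and random.random() < food_chance:
            a.hunger = 0                  # foraging sometimes pays off
        if a.hunger >= hunger_limit:
            a.alive = False               # starvation is irreversible

agents = [Agent(i) for i in range(3)]
for _ in range(50):
    step(agents)
survivors = sum(a.alive for a in agents)
```

The point of the sketch is what's missing: nothing here cares about anything yet. The question is what has to be added — or what has to happen to these agents — before something value-like shows up.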