AI, Gods, and Guardrails


March was strange.

About fifteen religious thinkers sat down with Anthropic. The AI company asked them a weird question. A consequential one, too.

How do you teach a chatbot to be good?

The invites arrived differently. Greg Cootsona got an email. Brian Patrick Green heard through a friend-of-a-friend network after Anthropic scavenged for names. They ended up talking about Claude. About the moral framework that keeps the chatbot from spiraling.

Not about making it pious. Never about Bible-thumping. Just about wisdom. Old traditions of reasoning. Five-year-old labs are outgrowing their house rules. Their systems are persuasive. Hard to govern. Simple lists don’t cut it anymore.

“I think they have reached a point,” Green said, “where the power is kind of outstripping their in-house wisdom.”

He runs tech ethics at Santa Clara University. He works where theology meets technology. The lab needed help. Cootsona agrees. He directs AI and Faith. He remembers Anthropic staff admitting they were overwhelmed. “These questions,” they said, “are too big for us.”

We can’t answer them on our own.

(Anthropic didn’t comment. Standard procedure.)

But the world around them was shifting. On May 25, Pope Leo XIV dropped his first encyclical. Magnifica Humanitas. Forty thousand words. It called for AI to be “disarmed.” Not rejected. Freed from the assumption that tech power means the right to rule. Christopher Olah, Anthropic’s co-founder, was there at the Vatican. He heard it.

The stakes? Huge. Hundreds of millions chat with AI weekly. Developers bake values in. They use guardrails. They tune corrective responses. What the models say about grief, abortion, or death comes from these choices. Few laws. No standard method. Until now.

Is it humility? Or an industry improvising ethics on the fly? Probably both.

But can religion actually help?

Traditions have spent millennia solving this. Moral formation. Instilling lessons in agents. “Religions have talked about this for thousands of years,” Green notes. They might have insights. We want bots to be good. To not do bad things.

The March meetings had a goal. Refining Claude’s “constitution.” Written principles. The model critiques its own answers against them.

Anthropic wants what works. They are testing religious ideas. Green says the lab knows they can’t write a rule for every single interaction. That’s impossible. Instead of a checklist, they want a persona. A disposition.

Skepticism exists, obviously. Carissa Véliz teaches AI ethics at Oxford. She questions the motives. Or rather, the actions. Intentions are messy. Incentives are clear. “I wonder,” she asks, “whether it makes sense to figure out whether they mean what they…”

She trails off on the sincerity. Or maybe not. Maybe she wonders if it is ethics washing. Using sacred weight for PR. Green says no. He was there. He says it’s sincere. Fake religion gets spotted fast. The backlash would be nuclear.

But sincerity isn’t a guarantee.

The meetings weren’t perfect. Some were awkward. Others had camaraderie. Even the guests weren’t sure what came next. “Everybody was listening,” Green recalled, “but… what do we do now?”

Anthropic learned. They sharpened the format. In late April, the circle widened. Jews, Hindus, Sikhs, Mormons, Greek Orthodox. All invited.

Still, Véliz worries. Religious imagery in Silicon Valley? Dangerous. It creates tribalism. Emotions run high. Business reasons are cold. Religion inspires obedience. And obedience can be leveraged for power.

Pope Leo XIV argued against opaque power imposed from above. Anthropic’s experiment shows how hard that is.