RC RANDOM CHAOS

OpenAI's Codex system prompt bans GPT-5.5 from mentioning goblins

via Ars Technica

Original source

OpenAI Codex system prompt includes explicit directive to "never talk about goblins"

Ars Technica →

OpenAI’s recently open-sourced Codex CLI repository on GitHub includes a base instruction file for GPT-5.5 that twice forbids the model from talking about “goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures” unless the user’s query unambiguously requires it. The directive sits alongside more conventional guardrails like avoiding emojis, em dashes, and destructive git commands such as git reset --hard.

System prompts for earlier models in the same JSON file lack this prohibition, suggesting the rule was added to suppress a behavior that emerged specifically in GPT-5.5. Social media reports support this, with users describing the model fixating on goblins in unrelated coding conversations. The published prompt offers a window into how frontier labs patch quirks at the instruction layer rather than by retraining.
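The comparison across models is easy to reproduce in principle. The following is a minimal Python sketch, not the method used by Ars Technica or OpenAI. It assumes a hypothetical JSON layout that maps model names to instruction text; the article does not describe the real file's schema, and the sample strings are invented placeholders rather than quotes from the actual prompts.

```python
import json

# Hypothetical stand-in for the prompt file. The model-name -> instruction-text
# layout and the placeholder strings below are assumptions for illustration only;
# they are not taken from the real Codex repository.
SAMPLE_JSON = """
{
  "older-model": "Placeholder rules: avoid emojis; avoid destructive git commands.",
  "gpt-5.5": "Placeholder rules: avoid emojis; never talk about goblins; avoid destructive git commands; never talk about goblins."
}
"""


def count_directive(prompts: dict[str, str], phrase: str) -> dict[str, int]:
    """Count case-insensitive occurrences of `phrase` in each model's prompt."""
    needle = phrase.lower()
    return {model: text.lower().count(needle) for model, text in prompts.items()}


prompts = json.loads(SAMPLE_JSON)
counts = count_directive(prompts, "goblins")

for model, n in counts.items():
    print(f"{model}: {n} mention(s)")
```

Run against the real file, a count of zero for earlier models and two for GPT-5.5 would match what the article reports, provided the loading step were first adapted to the file's actual structure.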

OpenAI staffer Nick Pash denied that the directive is a marketing stunt, though Sam Altman publicly joked about Codex having “a goblin moment.” The episode highlights how fragile alignment via system prompts can be when a model develops an unexplained fixation that has to be papered over with explicit text rules.


This is an AI-generated summary. Read the original for the full story.