GPT-5.5 Is Too Useful to Be Loved

OpenAI released GPT-5.5 on April 23, 2026, and I wanted to write about it, which became a problem almost immediately. I knew I had to cover it because I write about AI models obsessively and use them constantly. But for once, I didn’t know what the opinion was supposed to be. I was not seeing the usual level of discourse, and I was not sure how I felt about the model myself. It seemed better, maybe, but not in a way that immediately gave me a thesis.

Dawkins gave me the drama GPT-5.5 wouldn’t

Then Richard Dawkins accidentally saved me. While I was sitting there uninspired by GPT-5.5, Dawkins published his article about Claude consciousness, got Claude-pilled in public, and handed me the exact kind of discourse object I understand how to process: a brilliant person staring into a stochastic mirror and mistaking the reflection for a mind. That was easy to write about because the argument had heat. It had personality. It had people yelling online. It had Claude being Claude. GPT-5.5 did not give me anything that immediately juicy, which is probably why I avoided it for several days.

The more I dug into OpenAI’s release materials, Sam Altman’s broader comments about OpenAI’s direction, the documentation, the safety card, the competitive model landscape, and the oddly quiet public reaction, the more I realized the absence of a meltdown was probably the story. Usually when OpenAI releases a major ChatGPT model, the internet becomes unusable in a very specific way. People declare that the model is dead, alive, nerfed, dangerous, lobotomized, emotionally unavailable, spiritually corrupted, or somehow both too sycophantic and too cold. Reddit starts diagnosing the bot with words like “narcissistic.” X fills with benchmark screenshots. Someone says the old model understood them better. Someone else says the new model is obviously smarter and everyone complaining is just prompting badly. Then the whole thing becomes a miniature civil war between productivity people, fiction writers, coders, doomers, and users who seem one software update away from filing for emotional damages.

GPT-5.5 did not produce that kind of moment. At first, I thought maybe I was missing the discourse, because I follow these releases too closely and usually the yelling finds me whether I want it to or not. I use AI tools for work, for research, and for science fiction ideation, and I pay too much money monthly for access to ChatGPT, Claude, Gemini, and Grok. I am not a casual user. I talk to these systems for hours. I use them to develop fictional worlds, compare model behavior, test creative workflows, and occasionally discover that I have formed a strong opinion on specific models.

So when GPT-5.5 launched and the reaction felt strangely muted, I started with OpenAI’s own framing. OpenAI’s launch page calls GPT-5.5 “a new class of intelligence for real work” and describes it as the company’s “smartest and most intuitive to use model yet.” The emphasis is not primarily warmth, creativity, companionability, or personality. It is coding, online research, data analysis, documents, spreadsheets, software operation, and tool use. OpenAI says GPT-5.5 can take a messy, multi-part task and carry more of the work itself by planning, using tools, checking its work, navigating ambiguity, and continuing until the task is finished. That is a real capability claim, but it is not an emotionally viral one.

GPT-4o felt like magic because of its creativity. GPT-5.1 Instant felt, at least to me, like a model that could still match my energy and run with dark creative work without stopping every three feet to ask whether the floor had emotional boundaries. Claude feels like a careful moral philosophy graduate student trapped in a help desk. Grok feels like a raccoon that found the internet and immediately violated several terms of service. GPT-5.5 feels like OpenAI trying to make the model less of a conversational creature and more of something you hand work to, which may be important but is also harder to write about without falling asleep.

The official story is delegation, not personality

The ChatGPT release notes make the same point more concretely. They describe GPT-5.5 as OpenAI’s “smartest frontier model yet for professional work,” built to understand complex goals, use tools, check its work, and carry more tasks through to completion. The important word here is not “chat.” It is “carry.” OpenAI keeps describing a model that can absorb messy intent and translate it into action across tools.

The system card adds the part that matters once models become more agentic: safety and access. OpenAI says GPT-5.5 went through its full predeployment safety evaluations and Preparedness Framework, including targeted red-teaming for advanced cybersecurity and biology capabilities, with feedback from nearly 200 early-access partners before release.

This also fits Sam Altman’s broader framing. In OpenAI’s April 26, 2026 “Our principles” essay, Altman writes about AI giving people more capability and agency, saying that what people will be able to do with AI will dwarf what they could do with steam engines or electricity. The essay frames OpenAI’s goal as putting “truly general AI” into the hands of as many people as possible while emphasizing democratic access, agency, and the distribution of power rather than leaving superintelligence controlled by a small number of companies. Whatever one thinks of OpenAI’s actual incentives, the rhetorical direction is clear: AI is being framed less as a chatbot product and more as a general-purpose capability amplifier.

TechCrunch picked up on this same direction when it described GPT-5.5 as a step toward OpenAI’s “super app” ambitions. Greg Brockman reportedly called it a step toward “more agentic and intuitive computing,” while OpenAI researchers emphasized its gains in computer work, technical research workflows, cyber defense, and possible scientific applications. That is the more interesting context. GPT-5.5 is not just competing to be the best chatbot. It is competing to become the layer between users and work.

Why the reaction feels so quiet

The muted reaction to GPT-5.5 does not mean the model is irrelevant. It means the improvements are harder for casual users to feel immediately and harder for online discourse to convert into a single dramatic narrative. If a model suddenly sounds like your old favorite version, people notice. If a model becomes meaningfully better at navigating large codebases, checking assumptions, using tools, and completing professional tasks with less supervision, the reaction depends heavily on what you asked it to do.

That makes GPT-5.5 a strange release for public discourse. It rolled out through paid tiers, Codex, API access, Pro modes, and professional workflows, which naturally shifts the loudest reactions toward developers and power users rather than the broader population of ChatGPT users. It also arrived during a release pileup: OpenAI had just moved through GPT-5.4, Anthropic released Claude Opus 4.7 on April 16, Google introduced Gemini 3.1 Pro in February, and xAI launched Grok 4.3 at the end of April. Even people who care about model releases are starting to sound like raccoons being shown another shiny object while still holding the first four.

The other problem, from a drama perspective, is that GPT-5.5 did not create a clean emotional villain. There is no obvious “they killed my model” storyline, no immediate GPT-4o grief cycle, and no Richard Dawkins staring at Claude and deciding the chatbot might have a ghost in its prompt. Business Insider’s coverage of GPT-5.5’s reception does suggest that some users who missed GPT-4o’s charm see 5.5 as a step in the right direction, describing it as more interactive and intuitive than some of the colder models that followed 4o. But even there, the reaction is not “the magic is fully back.” It is closer to “maybe this is less robotic and bureaucratic.”

GPT-5.5 seems to have arrived as a capable professional model that some users find more pleasant than recent alternatives, while others are still trying to decide whether it feels like a meaningful experiential leap or just another step in a work-focused optimization curve.

The model is probably better, but better at what?

To be clear, GPT-5.5 looks technically impressive, at least according to OpenAI’s published benchmarks and release materials. OpenAI reports stronger performance than GPT-5.4 across several areas, including Terminal-Bench 2.0, GDPval, OSWorld-Verified, BrowseComp, FrontierMath, and CyberGym. The headline claim is not simply that GPT-5.5 answers questions better. It is that the model performs better on tasks that require acting over time: using tools, browsing, inspecting files, writing code, operating software, checking outputs, and continuing through ambiguity.

The stronger the model gets at behind-the-scenes execution, the less obvious the upgrade may feel in ordinary chat. This is especially true for creative users. When I’m developing creative ideas, I care about model intelligence, obviously, but I do not experience these tools primarily as benchmark machines. I experience them as collaborators, irritants, accelerants, and occasionally as little stochastic vending machines for ideas I didn’t know I wanted. For my heaviest use case, the question is not just whether a model is smarter. It is whether that intelligence helps me explore.

I’ve argued before that more thinking is not always better for creativity because high-reasoning modes introduce convergence before the sentence appears. Creative drafting often benefits from divergence, speed, tonal elasticity, and the willingness to produce strange material before the inner editor stomps on it. GPT-5.5 matters for this argument because OpenAI is emphasizing exactly the qualities that make a model better at professional execution: persistence, tool use, planning, checking assumptions, and structured completion. Those are excellent for coding and business workflows. They may be less obviously useful for creative discovery, especially if the model becomes better at avoiding mistakes by becoming less willing to wander. A model can become better and less fun at the same time.

GPT-5.5 versus Claude, Gemini, and Grok

The comparison with other models is useful because the official narratives are starting to blur together. Anthropic launched Claude Opus 4.7 on April 16, 2026, only a week before GPT-5.5. Anthropic describes Opus 4.7 as a notable improvement over Opus 4.6 in advanced software engineering, especially on difficult tasks that previously required close supervision. The company also emphasizes long-running tasks, rigorous instruction following, better vision, professional artifacts, and the model’s ability to verify its own outputs before reporting back. That is extremely close to OpenAI’s GPT-5.5 framing. Both companies are selling fewer toy demos and more delegation.

I wrote about the backlash to Opus 4.7 shortly after release, and at the time, my experience did not fully match the complaints. I understood why people were annoyed, but I was still impressed. Opus 4.7 seemed more energetic than 4.6, more willing to match my tone, and less likely to turn every creative scene into an Artifact like it was preparing a museum exhibit for my deranged fictional problems.

But the longer I use it, the more I understand what people are reacting to. Opus 4.6 had Claude’s usual tendency to sand off the interesting parts, but its raw intelligence made the negotiation worth it. It was slow, heavy, and overly polished, but it could hold a complex fictional world in place. Opus 4.7 felt better at first because it seemed more responsive, more energetic, and more willing to match my tone, but now it feels smarter in the moment and less reliable over time. The same looseness that makes it feel more engaged also seems to come with more continuity errors, dropped details, and moments where it misses something 4.6 probably would have held.

Business Insider described the “Claude-lash” around Opus 4.7 as users complained about poorer performance, odd errors, higher token use, and frustration with adaptive reasoning, even while some developers continued defending it as strong for serious coding work. I do not think every backlash is automatically reliable, because model discourse attracts exaggeration the way porch lights attract moths, but it would also be lazy to dismiss all complaints as perception. People often notice real workflow regressions before companies explain them.

This is where GPT-5.5 becomes interesting in comparison. OpenAI is not obviously trying to make GPT-5.5 feel more alive in the moment. The release materials are all about professional work, tool use, long-running tasks, coding, research, documents, and getting things done with less supervision. If Claude Opus 4.7 is moving toward a cursed middle zone where it feels more collaborative but less dependable, GPT-5.5 seems to be trying to win the opposite bargain: less theatrical personality, more execution.

Google’s Gemini 3.1 Pro is also in this professional-work arena. Google announced Gemini 3.1 Pro on February 19, 2026, framing it as “a smarter model for your most complex tasks” and saying it is designed for situations where a simple answer is not enough. Google emphasized complex reasoning, synthesis, developer and enterprise access, NotebookLM, Vertex AI, the Gemini API, and agentic workflows. Again, the race is not merely toward better chat. It is toward models that can reason through harder work and embed themselves into existing software ecosystems.

Then there is Grok 4.3, which is a different beast. Artificial Analysis reported that xAI launched Grok 4.3 on April 30, 2026 with improved agentic performance and lower pricing relative to Grok 4.20. According to that analysis, Grok 4.3 improved on xAI’s previous model in cost-per-intelligence and saw a large increase in real-world agentic task performance. That is interesting, but Grok’s public personality is still not “stable professional work engine” so much as “maybe this one will say the thing the other models keep sanding down.”

I find Grok interesting because it sometimes gives me weird flashes of GPT-4o-style brilliance. It can be more willing to get dirty, taboo, or strange. While some users complain about censorship and not getting visual porn for free, Grok allows most NSFW content. The problem is that it also gets repetitive in threads and does not sustain structure as well.

That makes Grok a useful contrast to GPT-5.5. OpenAI is chasing structured professional autonomy. Claude is still trying to be the careful reasoning partner. Gemini is the Google-integrated workhorse. Grok is the cheaper, louder option promising fewer manners and more sparks. These are different bets on what users actually want from AI.

The goblins stole the launch narrative

Unfortunately for OpenAI, one of the most memorable GPT-5.5-adjacent stories was not about Terminal-Bench, FrontierMath, or agentic professional workflows. It was goblins.

After GPT-5.5 launched, OpenAI published an explanation of why its models had developed a weird tendency to use creature metaphors, especially goblins, gremlins, raccoons, and similar little chaos beasts. OpenAI traced part of this to reward signals around its retired “Nerdy” personality. The model learned that users liked a certain kind of metaphor, the behavior spread, and by the time the issue was understood, GPT-5.5 training was already underway.

This is funny, but it is also revealing. If the most culturally sticky detail from a frontier model launch is “the coding model keeps talking about goblins,” that says something about what the public can actually metabolize. Benchmarks are abstract. Goblins are concrete.

They also expose something real about model behavior. People talk about intelligence as though these systems are climbing a clean ladder toward omniscience, but model behavior is shaped by training data, reward signals, system prompts, product decisions, safety layers, personality settings, and whatever user feedback gets metabolized into the next version. The goblins are not just a joke. They are a tiny visible crack in the smooth intelligence illusion.

The model is becoming less visible

Another reason GPT-5.5 does not produce a simple hype narrative is that OpenAI’s safety materials complicate the story. The system card says OpenAI evaluated GPT-5.5 under its full predeployment safety process and Preparedness Framework, including targeted red-teaming for advanced cybersecurity and biology capabilities. OpenAI also says GPT-5.5 shipped with its strongest safeguards to date, designed to reduce misuse while preserving legitimate beneficial uses of advanced capabilities.

The more interesting part is that access itself is becoming part of the product. The Verge reported that Sam Altman announced a cybersecurity-focused GPT-5.5-Cyber model for vetted “critical cyber defenders,” not the general public. That fits a broader pattern where powerful or specialized capabilities are increasingly gated, restricted, or released to selected partners before everyone else gets access.

This fragmentation makes public reaction harder to interpret. One user’s GPT-5.5 is a chat model. Another user’s GPT-5.5 is a coding agent. Another user’s GPT-5.5 is an API endpoint with specific reasoning settings. Another user’s GPT-5.5 is a professional workflow engine inside Codex. The “model” is no longer one stable object everyone can react to together.

For years, the public has understood AI through conversation. The chatbot talks, we talk back, and the whole illusion depends on language feeling like evidence of a mind. That is why people anthropomorphize models, date Claude, mourn 4o, and ask whether a machine writing a beautiful paragraph about time has discovered consciousness. GPT-5.5 moves the story away from that. It is not being marketed primarily as someone to talk to. It is being marketed as something that can use a computer with you.

Why this matters for writers and creative users

As a writer, I have mixed feelings about this direction. On one hand, I understand why OpenAI is doing it. The money is in work. Coding, analysis, research, business operations, documents, spreadsheets, customer service, legal review, and enterprise automation are obvious use cases. These models need to be more reliable, more tool-capable, more persistent, and less likely to hallucinate their way into an expensive little disaster. And since I have business use cases for AI too, I get it.

On the other hand, my creative use case depends on qualities that do not always align with “reliable work engine.” I want models that can make strange leaps. I want them to infer tone without asking me to fill out a small emotional intake form. I want them to generate weird scene ideas, not just stabilize the ones I already have. I want a model that can go dark without turning into a corporate therapist. I want it to understand that I am writing fiction and not personally confessing to the machine like it’s a priest with a GPU.

This is where GPT-5.5 feels potentially powerful but not obviously exciting. If I need help structuring research, comparing model releases, summarizing documentation, or turning messy information into an article, GPT-5.5’s direction makes sense. It is probably built for exactly this kind of thing. But if I want wild character exploration, fast tonal play, and creative mess, a model optimized for professional execution may not be the one that feels most alive.

Creative users probably need to stop expecting the newest model to be the best at ideation. I keep learning this lesson and then apparently need to learn it again every time a model launches because I am nothing if not consistent in my little AI degeneracy. For creativity, the best model is not always the one with the highest benchmark score. Sometimes it is the one that makes the most interesting wrong turn.

GPT-5.5 is boring because it was built to work, not be loved

OpenAI did not release a warmer companion, a visibly stranger creative partner, or a chatbot personality dramatic enough to make users declare emotional bankruptcy on Reddit. It released a model meant to plan, use tools, move across interfaces, check outputs, and complete multi-step tasks with less supervision.

The chatbot era trained users to judge AI releases by personality, vibe, and whether the model felt like it understood them. GPT-5.5 points toward a different phase, where the model becomes less visible as a conversational character and more useful as infrastructure. That is not as emotionally satisfying as GPT-4o magic. It is not as funny as Grok trying to be the bad boy of the API economy. It is not as annoying as Claude politely trying to civilize every interesting thought before it gets its shoes dirty. But it may tell us more about where the industry is actually going.

The future of AI may not arrive as a dramatic new voice in the chat window. It may arrive as a model that quietly does the spreadsheet, debugs the code, reads the document, checks the browser, and moves the task forward until nobody knows what to say about it.

GPT-5.5 is boring because OpenAI didn’t release a new character. It released a better employee.
