You can use Opus 4.5 with Claude Code by pointing `--model` at it explicitly:
$ claude --model claude-opus-4-5
Claude Code v2.1.37
claude-opus-4-5 · Claude Max
/tmp/test
Opus 4.6 is here · $50 free extra usage · /extra-usage to enable
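If you'd rather pin the model than pass the flag every time, Claude Code also reads it from an environment variable (variable name per its settings docs; verify against your installed version):

```shell
# Pin the model for every claude invocation in this shell session.
# ANTHROPIC_MODEL is the env var Claude Code reads for a custom model
# (per its settings documentation; double-check your version).
export ANTHROPIC_MODEL=claude-opus-4-5
# claude        # would now start on Opus 4.5 without --model
echo "$ANTHROPIC_MODEL"
```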
GPT-5 used to do this on release, but seems to have reverted since, especially in the Codex variants. This may well be a feature.
I joked that this is a side effect of asking it to act like a senior software engineer: it tends to talk back and do its own thing. There was that one time when the thought process went "I'm a full stack engineer" > "I'm expanding my connections on LinkedIn" > "I'm establishing myself as a tech writer" > "I'm evaluating classes in professional writing". It does have introspection capabilities, but one could argue it's just a bug, or emergent behavior.
Anyway, option 1: why not just use Sonnet? Heck, you can use Haiku if you're giving it clear instructions. The thinking models do perform worse on well-specified tasks. You also get your rolled-back version.
Option 2: Use role prompting [0]
Give it some junior engineer role where it's expected to follow instructions exactly as given.
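A role prompt along these lines can go in the system prompt. A minimal sketch using the Anthropic Messages API shape (the model name matches the one discussed here; the prompt wording itself is just an illustration, not from any docs):

```python
# Sketch: role-prompting an agent into a strict instruction-follower.
# The "junior engineer" system prompt wording is illustrative only.
def junior_engineer_request(task: str) -> dict:
    """Build a Messages API payload with a 'junior engineer' system role."""
    return {
        "model": "claude-sonnet-4-5",
        "max_tokens": 1024,
        "system": (
            "You are a junior engineer. Follow the instructions exactly "
            "as given. Do not make changes beyond what is asked, and do "
            "not second-guess the task."
        ),
        "messages": [{"role": "user", "content": task}],
    }

payload = junior_engineer_request("Rename variable x to count; change nothing else.")
# Pass this to anthropic.Anthropic().messages.create(**payload)
```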
> thought processes went "I'm a full stack engineer" > ... > "I'm evaluating classes in professional writing"
Let that one run, let's see where it ends.
maybe "I no longer build software; I now make furniture out of wood"
or "Have taken up farming"
or perhaps "I've sold all my RSUs and am taking a sabbatical" > "I'm exploring activities I find more fulfilling: yoga & angel investing" > "I'm the author of a Substack post where I share life advice with others" [1]
In my opinion, Opus 4.6 handles instructions (especially negative rules) better than Opus 4.5 and Sonnet 4.5. It also applies skills better than other models.
But I think this varies case by case.
Another example: I asked gpt-5.2-codex to add an array of five values to a script and write a small piece of code. Then I manually deleted one of the values in the array and asked the agent to commit.
But the model edited the file again and added the value I deleted to the array.
I deleted that value again and asked the agent to “just commit.” But the agent edited the file again before committing.
This happened many times, and I used different commands, such as “never edit the file, just commit.” The model responded that it understood the command and began editing the file. I switched to gpt-5.2, but that didn't help.
I switched to sonnet-4.5, and it immediately committed on the first try without editing the file.
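When an agent keeps re-editing files like this, the other escape hatch is to take the commit out of the loop entirely and run it yourself. A plain git sketch (demoed in a throwaway repo; in practice you'd run only the last two commands in your project):

```shell
# Demo setup: a throwaway repo standing in for the real project.
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email you@example.com && git config user.name You
echo '[1, 2, 3, 4]' > values.txt   # stand-in for the manually edited file

# The actual fix: stage exactly what's on disk and commit it yourself,
# with no model in the loop to "helpfully" restore the deleted value.
git add -A
git commit -q -m "Remove unused value from array"
```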
[0] https://platform.claude.com/docs/en/build-with-claude/prompt...
[1] https://briansolis.com/2015/09/silicon-valley-hierarchy-need...