API Error: Claude's response exceeded the 32000 output token maximum
Sometimes, the Claude Code tool shows an error message like the following:
API Error: Claude's response exceeded the 32000 output token maximum. To configure this behavior, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable.
32000 is the default maximum number of output tokens for the tool, and it is also the output limit of the Opus model.
Solution 1: Review your prompt and retry
Example: In rare cases, the model tries to output an unnecessarily long string. For example, when generating a dummy PNG file for testing, it may use echo with a very long string. To avoid this, you can provide the path to an existing test image, instruct it to use dummy image URLs, or skip the test.
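One way to have a test image ready is to create a tiny fixture yourself and then pass its path in the prompt. The following is a minimal sketch, not part of Claude Code: the directory `test-fixtures`, the `FIXTURE_DIR` variable, and the file name `pixel.png` are hypothetical, and the base64 string is assumed to be a commonly used 1x1 PNG.

```shell
# Hypothetical location for test fixtures; change it to match your project.
FIXTURE_DIR="${FIXTURE_DIR:-./test-fixtures}"
mkdir -p "$FIXTURE_DIR"
png_path="$FIXTURE_DIR/pixel.png"

# Decode a short base64 string (assumed to be a 1x1 PNG) into the fixture file.
printf '%s' 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg==' \
  | base64 --decode > "$png_path"

# Print the path so it can be pasted into the prompt.
echo "Test image written to $png_path"
```

You can then tell the model to use the file at that path instead of generating image data itself.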
Solution 2: Adjust the environment variable as suggested
Set the max output tokens to 64000 and resume the Claude session. Note that you need to switch to a model that supports this limit, for example Sonnet instead of Opus.
$ export CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000
$ claude -r
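If you set the variable from a script or shell profile, a small check can catch a typo before you resume the session. This is only a sketch: the `limit_ok` variable is a hypothetical helper name, and the check validates only the format of the value, not whether your model accepts it. Model selection is left out, so switch to a supported model however you normally do.

```shell
# Use 64000 unless a value is already exported in this shell.
export CLAUDE_CODE_MAX_OUTPUT_TOKENS="${CLAUDE_CODE_MAX_OUTPUT_TOKENS:-64000}"

# Reject empty or non-numeric values before starting a session.
case "$CLAUDE_CODE_MAX_OUTPUT_TOKENS" in
  ''|*[!0-9]*)
    echo "invalid CLAUDE_CODE_MAX_OUTPUT_TOKENS: $CLAUDE_CODE_MAX_OUTPUT_TOKENS" >&2
    limit_ok=no
    ;;
  *)
    limit_ok=yes
    ;;
esac
echo "CLAUDE_CODE_MAX_OUTPUT_TOKENS=$CLAUDE_CODE_MAX_OUTPUT_TOKENS (valid: $limit_ok)"

# Then resume the session as in the commands above:
# claude -r
```

Because the variable is exported, it applies to every Claude session started from the same shell until you unset it.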