One of ChatGPT’s more annoying flaws is that it sometimes stops abruptly in the middle of a response.
When ChatGPT stops writing mid-sentence, it’s usually because it’s reached its token limit. Other possible causes are heavy demand on the servers, which can cut responses off before they finish, or a temporary glitch on the chat page.
Want to learn how to fix this annoying token issue and get ChatGPT to answer completely? Keep reading!
- ChatGPT may stop writing halfway due to reaching its token limit, causing responses to cut off before completion. You can think of a token as a piece of a word.
- The token limit for ChatGPT depends on the model being used, ranging from 4,096 tokens for gpt-3.5-turbo to 8,192 tokens for gpt-4 (available through ChatGPT Plus).
- Ways to fix the token limit issue include clicking “Continue generating,” breaking requests into smaller tasks, condensing input, asking for limited response length, or upgrading to a higher token allocation model.
- Connection issues can also lead to delays or non-responsiveness in ChatGPT, particularly during periods of heavy demand. You can check the server status at https://status.openai.com/
- A temporary glitch may also be at fault. If this happens, simply refresh the chat page.
Reason #1: The ChatGPT Token Limit
The main reason ChatGPT stops in the middle of a response is the token limit.
Each ChatGPT model can only work with a finite number of “tokens” at a time. You can think of a token as a piece of a word.
Here’s how the ratio of tokens to words generally adds up:
- 1 token = ~4 characters in English or ¾ of a single word
- 100 tokens = ~75 words
- Wayne Gretzky’s quote “You miss 100% of the shots you don’t take” = 11 tokens
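If you want a quick ballpark without opening a tool, the rules of thumb above can be turned into a couple of lines of Python. This is only a rough sketch: the function names are my own, and the numbers come from the ~4 characters and ¾ word approximations, not from ChatGPT’s real tokenizer, so actual counts can differ.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return math.ceil(len(text) / 4)


def estimate_words(tokens: int) -> int:
    """Rough word count for a token budget, using ~3/4 of a word per token."""
    return int(tokens * 0.75)


quote = "You miss 100% of the shots you don't take"
print(estimate_tokens(quote))  # 11 (41 characters / 4, rounded up)
print(estimate_words(100))     # 75
```

The estimate happens to match the 11 tokens quoted above for this sentence, but don’t expect it to line up that neatly for every text, especially code or non-English writing.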
Once ChatGPT has used up its available tokens, it will stop responding. (Even if it’s in the middle of a response!)
This limit is cumulative: the text you type in and the text ChatGPT outputs both count toward it.
Here are the limits for each model:
The token limit for gpt-3.5-turbo — the current version of ChatGPT that you can access for free — is 4,096 tokens, or ~3,072 words.
If you’re subscribed to ChatGPT Plus and are using the gpt-4 model, then you have a limit of 8,192 tokens, or ~6,144 words.
If ChatGPT stops writing code or text halfway, it’s likely that you’ve reached the limit for the model you’re using.
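To make the cumulative limit concrete, here’s a small sketch that works out how much room is left for a response once part of the limit has been used. The dictionary and function names are hypothetical, and the limits are simply the two figures quoted in this article.

```python
# Token limits quoted in this article; other models will have different values.
MODEL_TOKEN_LIMITS = {"gpt-3.5-turbo": 4096, "gpt-4": 8192}


def remaining_tokens(model: str, tokens_used: int) -> int:
    """Tokens left for the response after the input and earlier output are counted."""
    return max(MODEL_TOKEN_LIMITS[model] - tokens_used, 0)


def approx_words(tokens: int) -> int:
    """Convert tokens to words with the ~3/4 word per token rule of thumb."""
    return tokens * 3 // 4


left = remaining_tokens("gpt-3.5-turbo", 3000)
print(left, approx_words(left))  # 1096 822
```

In other words, if your conversation has already used about 3,000 tokens on gpt-3.5-turbo, ChatGPT has roughly 800 words left before it cuts off.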
Pro-tip: If you’re concerned that a long prompt might use too many tokens, you can check its token usage with OpenAI’s Tokenizer tool.
How to Fix the Token Limit Issue
1. Click “Continue generating”
If ChatGPT stopped in the middle of its response, the easiest way to make it continue is to click the “continue generating” button that appears below the unfinished response.
If it doesn’t work for some reason, you can also use a command like “keep going” or “continue your response.”
However, you need to know that this approach isn’t optimal in every situation.
To free up tokens to continue the response, ChatGPT has to discard its awareness of text used earlier in the conversation. If you ask it to reference something mentioned early in the conversation, it may not know what you’re talking about.
The easiest way to fix this forgetfulness is to start your next chat message with a summary of everything you’ve talked about thus far. This gives ChatGPT a way to reference key points from early in the conversation without the token limit causing issues.
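If it helps to picture why the early parts of a chat get forgotten, here’s a minimal sketch of a “sliding window” over the conversation. To be clear, this is my own illustration and an assumption about the general idea, not ChatGPT’s actual implementation; the function names and the 4-characters-per-token estimate are made up for the example.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token."""
    return math.ceil(len(text) / 4)


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep only the most recent messages whose estimated tokens fit in the budget."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop once the budget is exceeded.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def with_summary(summary: str, messages: list[str], budget: int) -> list[str]:
    """Reserve room for a summary of the older messages, then fill the rest with recent ones."""
    remaining = max(budget - estimate_tokens(summary), 0)
    return [summary] + trim_history(messages, remaining)
```

The second function mirrors the tip above: a short summary takes up a few tokens, but it keeps the key points in view even after the oldest messages have dropped out.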
2. Break Your Request Into Smaller Tasks
If you’re asking ChatGPT to perform a large, ambiguous task – such as coding a complex program or writing a long story – it may run out of tokens before it completes its response.
To solve this problem, take a little time to break the task down into more manageable chunks before handing things over to ChatGPT. Then you can prompt ChatGPT with each chunk rather than one long prompt it won’t be able to finish.
Pro-tip: Ask ChatGPT to help you break down your task into more discrete steps.
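If the thing you’re breaking up is a long piece of text rather than a task, you can split it mechanically. Here’s a rough sketch that groups paragraphs into chunks under a token budget. The function names are hypothetical, and the token counts rely on the ~4 characters per token approximation, so leave yourself some headroom.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token."""
    return math.ceil(len(text) / 4)


def chunk_paragraphs(text: str, max_tokens: int) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of roughly max_tokens each."""
    chunks = []
    current = []
    current_tokens = 0
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        cost = estimate_tokens(paragraph)
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and current_tokens + cost > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        # Note: a single paragraph larger than max_tokens still becomes its own chunk.
        current.append(paragraph)
        current_tokens += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

You can then paste each chunk into ChatGPT as its own prompt instead of sending everything at once.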
3. Condense Your Input
Remember, the number of words you use in your request impacts the total token allocation. The more words you use in your request, the fewer words ChatGPT can use in its response.
If your request is rather wordy, try to condense your request into fewer words without sacrificing meaning or clarity.
Here’s an example:
- Wordier prompt: Could you please tell me what ingredients are included in a pineapple pizza?
- Condensed prompt: pineapple pizza ingredients
Pro-tip: Save time by asking ChatGPT to condense your request. Just review the output to make sure the condensed request retains its full meaning.
4. Ask ChatGPT To Limit Its Response to a Certain Length
To prevent excessive token use, you can ask ChatGPT to keep its responses shorter.
This doesn’t always work – I’ve noticed ChatGPT loves to ignore specific requests to keep responses under a certain word count – but it’s worth a try.
5. Upgrade to gpt-4
As I mentioned, the default gpt-3.5-turbo model for ChatGPT only allows 4,096 tokens, while gpt-4 has a ceiling of 8,192 tokens.
To access gpt-4 and double your token allocation, subscribe to ChatGPT Plus for $20/month.
Access to gpt-4 is currently limited to 25 messages per 3 hours, but that’s expected to increase soon.
Pro-tip: There is a version of gpt-4, called gpt-4-32k, that has a maximum of 32,768 tokens (~24,576 words). Unfortunately, this version is currently only available to a limited number of API users. You cannot currently use it through the normal ChatGPT interface.
To sign up for the API, visit platform.openai.com and sign into your account.
Reason #2: Connection Issues
ChatGPT is an incredibly popular program. In fact, it set a record for the fastest-growing app in history, reaching 100 million users a mere two months after launch.
Given its meteoric rise, it’s no surprise that it has network issues from time to time.
If the ChatGPT servers are experiencing heavy demand, it may take you a while — potentially several minutes — to generate a full response.
This can show up as delays in the middle of a response, so it may seem like ChatGPT has stopped responding when it’s merely being slow. That said, I have experienced instances where ChatGPT stops responding entirely due to network issues.
How to Fix Connection Issues
If ChatGPT’s responses are slow – or not generating at all – try starting a fresh chat in a new ChatGPT tab.
If that doesn’t fix the issue, you’ll either have to deal with the delay or try again later when the network issues have been resolved.
To see if there are network issues, you can check the server status at https://status.openai.com/
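If you’re reaching the model from a script through the API rather than the chat page, “try again later” can be automated. Below is a minimal sketch of retrying with increasing delays. Everything here is an assumption for illustration: `action` stands in for whatever function sends your request, and I’m assuming it raises Python’s built-in `ConnectionError` when the network fails.

```python
import time


def retry_with_backoff(action, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call action(), waiting 1x, 2x, 4x... base_delay between failed attempts."""
    for attempt in range(attempts):
        try:
            return action()
        except ConnectionError:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Doubling the wait each time gives an overloaded server room to recover instead of hammering it with immediate retries.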
Reason #3: Temporary Glitch
If there’s a temporary glitch, you can usually fix it by simply refreshing the chat page or regenerating the response.
ChatGPT is rapidly evolving, so these small token limits will likely be a thing of the past relatively soon.
For now, you can work with the limit in the ways I described:
- Click “continue generating.”
- Break your request into smaller tasks.
- Condense your input.
- Ask ChatGPT to limit its response to a certain length.
- Upgrade to ChatGPT Plus and gain access to gpt-4.
And if the issue is network-related, your best bet is to wait until the server problems are resolved.