Tokens to Words Converter
Convert between tokens and words for GPT-4o, Claude, Gemini and every major model. Free, instant, works in your browser.
Based on averages published by OpenAI and Anthropic. The actual ratio varies with content type (code, non-English, and numeric text are more token-dense than prose).
Quick Answer
For English prose, 1 token ≈ 0.75 words, which means 1 word ≈ 1.33 tokens. So 1,000 words is roughly 1,333 tokens, and 128,000 tokens of context fits about 96,000 words (around 200 single-spaced pages). Ratios shift for code, non-English text and numeric content, which all use more tokens per word.
What is a token, actually
When you paste text into ChatGPT or Claude or Gemini, the model doesn't see words. It sees tokens. A token is a chunk of text the model's tokenizer splits your input into before the neural network gets anywhere near it. Sometimes a token is a whole word. Sometimes it's a subword ("token" might be one token, but "tokenization" splits into "token" + "ization"). Sometimes it's a single character (most punctuation) or a single byte (for rare characters).
OpenAI's tokenizer for GPT-4o is called o200k_base. Anthropic uses its own tokenizer. Google's Gemini uses SentencePiece. They all produce slightly different token counts for the same input — same text, different numbers. That's why our converter lets you pick a model.
The practical consequence: when you hear "Claude has a 200k context window," that's 200,000 tokens, not 200,000 words. Closer to 150,000 English words. Still a lot, but less than the marketing implies when you do the conversion.
Context windows in words (quick reference)
| Model | Context (tokens) | Approx. words | Equivalent |
|---|---|---|---|
| GPT-4o | 128,000 | ~96,000 | A short novel |
| GPT-4.1 | 1,000,000 | ~750,000 | Most of the Harry Potter series |
| o3 / o4-mini | 200,000 | ~150,000 | A PhD dissertation |
| Claude Opus 4.6 | 200,000 | ~150,000 | A PhD dissertation |
| Claude Sonnet 4.6 | 200,000 | ~150,000 | A PhD dissertation |
| Gemini 2.5 Pro | 2,000,000 | ~1,500,000 | Two full Bibles |
| Gemini 2.5 Flash | 1,000,000 | ~750,000 | A full Bible |
| Llama 3.1 / 3.3 | 128,000 | ~93,000 | A short novel |
| Grok 3 | 131,072 | ~97,000 | A short novel |
Why 1 token = 0.75 words
OpenAI's own documentation uses 0.75 as the rule of thumb. That number comes from tokenizing large corpora of English text and averaging. Short common words ("the," "and," "to," "is") tokenize as one token each. Longer words often split: "unbelievable" becomes three tokens in most tokenizers. Proper nouns and technical terms split the hardest.
So if you have 1,000 tokens of English prose, you can expect about 750 words. Going the other direction, 1,000 words translates to about 1,333 tokens.
The ratio breaks down in three situations worth knowing:
- Code is token-dense. Variable names split aggressively. Indentation becomes tokens. A 1,000-word Python file might be 1,600+ tokens.
- Non-English text uses more tokens. Chinese, Japanese and Arabic often use 2 to 3 tokens per word because those scripts were less prevalent in the tokenizer's training data.
- Numbers and special characters. A phone number or long decimal often splits digit-by-digit. A line of JSON can use 40% more tokens per character than prose.
If you're pricing an API call or checking if your prompt fits, always add 10 to 20% headroom over the 0.75 rule. The exact number only matters when you're right at the limit.
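The rules of thumb above are easy to wrap in a small helper. A minimal sketch in Python, using the 0.75 ratio and the 10 to 20% headroom suggested here (the default of 15% is an assumption; tune it for your content):

```python
def estimate_tokens(word_count: int, headroom: float = 0.15) -> int:
    """Estimate tokens from an English word count (~1.33 tokens per word),
    padded with headroom so you don't land exactly on a limit."""
    return round(word_count * (4 / 3) * (1 + headroom))

def estimate_words(token_count: int) -> int:
    """Estimate English words from a token count (1 token ~= 0.75 words)."""
    return round(token_count * 0.75)

print(estimate_words(128_000))  # GPT-4o's window in words -> 96000
print(estimate_tokens(1_000))   # 1,000 words + 15% headroom -> 1533
```

This is planning-grade arithmetic only; for anything near a hard limit, use a real tokenizer as described below.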
Word count for common token budgets
This is what people actually Google, so here's the table:
| Tokens | Words | Pages (single-spaced) | Real-world equivalent |
|---|---|---|---|
| 1,000 | ~750 | 3 | A short blog post |
| 4,096 | ~3,070 | 12 | A feature article |
| 8,192 | ~6,140 | 25 | A college term paper |
| 16,384 | ~12,280 | 50 | A short story collection |
| 32,768 | ~24,500 | 100 | A novella |
| 100,000 | ~75,000 | 300 | A YA novel |
| 128,000 | ~96,000 | 384 | An adult novel |
| 200,000 | ~150,000 | 600 | A PhD dissertation |
| 1,000,000 | ~750,000 | 3,000 | Harry Potter books 1-7 |
| 2,000,000 | ~1,500,000 | 6,000 | Two Bibles |
Counting tokens in practice
If you want the exact token count for a specific model, use the model's native tokenizer, not a rule of thumb. Three options that cost nothing:
- OpenAI's tokenizer playground at platform.openai.com/tokenizer. Paste your text, see the exact tokens with color-coded chunks.
- The tiktoken library (Python/JavaScript) if you're building anything programmatic. pip install tiktoken, then four lines of code gets you the count.
- Anthropic's count_tokens API endpoint for Claude. It's free and returns the exact count for Claude's tokenizer.
For rough planning, our converter above is accurate within 5 to 10% for English prose. When you're designing a prompt, drafting a long document for summarization, or pricing an API call at scale, use one of the exact tools.
Why this matters for your wallet
If you're paying for API access, tokens are the unit of billing. At October 2026 list prices (always check current pricing before you rely on these numbers):
- GPT-4o input runs roughly $2.50 per million tokens. That's $2.50 for about 750,000 words of input.
- Claude Opus 4.6 input is around $15 per million tokens. Pricier, but the context window and reasoning quality justify it for complex work.
- Gemini 2.5 Flash is one of the cheapest at around $0.15 per million input tokens for the lower-tier model.
If you're feeding a 50-page report (~12,000 words ≈ 16,000 tokens) into Claude Opus for summarization, that's $0.24 for input alone. Do that a thousand times a day and the math starts mattering. Knowing your token count up front prevents bill shock.
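The arithmetic generalizes to any model. A sketch of the per-call input cost, using the illustrative list prices quoted above (always substitute current pricing):

```python
def input_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Input cost for a single call at a given per-million-token rate."""
    return tokens / 1_000_000 * price_per_million_usd

# 50-page report (~16,000 tokens) into Claude Opus at $15/M input tokens
per_call = input_cost_usd(16_000, 15.00)
print(f"${per_call:.2f} per call")            # $0.24 per call
print(f"${per_call * 1_000:.2f} per day")     # at 1,000 calls/day
```

Output tokens bill separately, usually at a higher rate, so a full cost model needs a second term for the response.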
Common misconceptions
"A bigger context window means the model remembers everything." Not quite. Research from Anthropic and academic papers has shown that model performance on tasks like needle-in-a-haystack recall degrades in the middle of very long contexts, especially past 100k tokens. The window is how much the model can see, not how well it can use all of it.
"Tokens and characters are the same." No. Tokens are variable-length. A 5-character word like "hello" is one token. A 5-character word like "quote" might also be one token. But a 5-character string like "qxzvw" (nonsense) could be three or four tokens because the tokenizer has no common subword for that sequence.
"Output tokens count the same as input tokens." For the context window, they share the pool — output takes up space in the same window as input. For pricing, output tokens are usually 3 to 5 times more expensive than input tokens on most providers. Generating a 2,000-word essay costs more than inputting one.
Need to count the words first?
Paste your text into our free word counter, then convert to tokens here
Frequently Asked Questions
How many words is 1 token?
About 0.75 words for English prose. So 100 tokens is roughly 75 words, and 1,000 tokens is about 750 words.
How many tokens is 1 word?
About 1.33 tokens for English. 100 words is around 133 tokens, 1,000 words is around 1,333 tokens.
How many words fit in 128k tokens?
Roughly 96,000 words of English prose, about 384 single-spaced pages or a short novel.
Are tokens the same across ChatGPT, Claude, and Gemini?
No. Each provider uses a different tokenizer, so the same text produces slightly different token counts. The 0.75 rule is a useful average, but exact counts require that model's specific tokenizer.
Does code use more tokens than prose?
Yes. Code is token-dense because variable names split aggressively, indentation becomes tokens, and special characters each count. Expect 30 to 60% more tokens per word when tokenizing code vs. prose.
How do I count tokens exactly for a specific model?
Use the model's official tokenizer. For GPT, OpenAI's online tokenizer or the tiktoken library. For Claude, Anthropic's count_tokens API endpoint. For Gemini, the Google AI Studio tokenizer.