Tokens to Words Converter

Convert between tokens and words for GPT-4o, Claude, Gemini and every major model. Free, instant, works in your browser.

1,333 tokens
≈ 1.04% of GPT-4o / 4o-mini's 128,000-token context window
5,100 characters · 4.0 pages (single-spaced)

Based on averages published by OpenAI and Anthropic. The actual ratio varies with content type (code, non-English, and numeric text are all more token-dense than prose).

Quick Answer

For English prose, 1 token ≈ 0.75 words, which means 1 word ≈ 1.33 tokens. So 1,000 words is roughly 1,333 tokens, and 128,000 tokens of context fits about 96,000 words (around 384 single-spaced pages). Ratios shift for code, non-English text and numeric content, which all use more tokens per word.

What is a token, actually?

When you paste text into ChatGPT or Claude or Gemini, the model doesn't see words. It sees tokens. A token is a chunk of text the model's tokenizer splits your input into before the neural network gets anywhere near it. Sometimes a token is a whole word. Sometimes it's a subword ("token" might be one token, but "tokenization" splits into "token" + "ization"). Sometimes it's a single character (most punctuation) or a single byte (for rare characters).

OpenAI's tokenizer for GPT-4o is called o200k_base. Anthropic uses its own tokenizer. Google's Gemini uses SentencePiece. They all produce slightly different token counts for the same input — same text, different numbers. That's why our converter lets you pick a model.
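If you want to see this for yourself, here's a minimal sketch using OpenAI's open-source tiktoken library. It prints the subword pieces one encoding produces and shows that two OpenAI encodings already disagree on the count for the same text; Anthropic's and Google's tokenizers would disagree again, but they aren't bundled in tiktoken.

```python
# pip install tiktoken
import tiktoken

text = "Tokenization converts your prompt into subword chunks."

for name in ("o200k_base", "cl100k_base"):  # GPT-4o vs. GPT-4/3.5 encodings
    enc = tiktoken.get_encoding(name)
    ids = enc.encode(text)
    # Decode each token id individually to see the subword pieces.
    pieces = [enc.decode([i]) for i in ids]
    print(f"{name}: {len(ids)} tokens -> {pieces}")
```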

The practical consequence: when you hear "Claude has a 200k context window," that's 200,000 tokens, not 200,000 words. Closer to 150,000 English words. Still a lot, but less than the marketing implies when you do the conversion.

Context windows in words (quick reference)

| Model | Context (tokens) | Approx. words | Equivalent |
| --- | --- | --- | --- |
| GPT-4o | 128,000 | ~96,000 | A short novel |
| GPT-4.1 | 1,000,000 | ~750,000 | Harry Potter books 1-5, plus change |
| o3 / o4-mini | 200,000 | ~150,000 | A PhD dissertation |
| Claude Opus 4.6 | 200,000 | ~150,000 | A PhD dissertation |
| Claude Sonnet 4.6 | 200,000 | ~150,000 | A PhD dissertation |
| Gemini 2.5 Pro | 2,000,000 | ~1,500,000 | Two full Bibles |
| Gemini 2.5 Flash | 1,000,000 | ~750,000 | War and Peace, with room to spare |
| Llama 3.1 / 3.3 | 128,000 | ~96,000 | A short novel |
| Grok 3 | 131,072 | ~98,000 | A short novel |

Why 1 token ≈ 0.75 words

OpenAI's own documentation uses 0.75 as the rule of thumb. That number comes from tokenizing large corpora of English text and averaging. Short common words ("the," "and," "to," "is") tokenize as one token each. Longer words often split: "unbelievable" becomes three tokens in most tokenizers. Proper nouns and technical terms split the hardest.

So if you have 1,000 tokens of English prose, you can expect about 750 words. Going the other direction, 1,000 words translates to about 1,333 tokens.

The ratio breaks down in three situations worth knowing:

  • Code is token-dense. Variable names split aggressively. Indentation becomes tokens. A 1,000-word Python file might be 1,600+ tokens.
  • Non-English text uses more tokens. Chinese, Japanese and Arabic often run 2 to 3 tokens per word because those scripts were less prevalent in the tokenizers' training data.
  • Numbers and special characters split hard. A phone number or long decimal often splits digit-by-digit, and a line of JSON might use 40% more tokens per character than prose.

If you're pricing an API call or checking if your prompt fits, always add 10 to 20% headroom over the 0.75 rule. The exact number only matters when you're right at the limit.
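If you'd rather script the rule of thumb than eyeball it, here's a minimal sketch. The multipliers are the rough figures from this page, not published constants, and the 15% default headroom is just one point inside the 10 to 20% range:

```python
# Rough figures from this page; real ratios depend on the tokenizer.
TOKENS_PER_WORD = {
    "prose": 1.33,        # English prose: 1 word ~ 1.33 tokens
    "code": 1.6,          # identifiers and indentation split aggressively
    "non_english": 2.5,   # Chinese, Japanese, Arabic often land at 2-3
}

def estimate_tokens(word_count: int, content: str = "prose", headroom: float = 0.15) -> int:
    """Rough token estimate with safety headroom; use a real tokenizer at the limit."""
    return round(word_count * TOKENS_PER_WORD[content] * (1 + headroom))

print(estimate_tokens(1_000))          # ~1530: prose plus headroom
print(estimate_tokens(1_000, "code"))  # ~1840: why code blows budgets
```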

Word count for common token budgets

This is what people actually Google, so here's the table:

| Tokens | Words | Pages (single-spaced) | Real-world equivalent |
| --- | --- | --- | --- |
| 1,000 | ~750 | 3 | A short blog post |
| 4,096 | ~3,070 | 12 | A feature article |
| 8,192 | ~6,140 | 25 | A college term paper |
| 16,384 | ~12,280 | 50 | A short story collection |
| 32,768 | ~24,500 | 100 | A novella |
| 100,000 | ~75,000 | 300 | A YA novel |
| 128,000 | ~96,000 | 384 | An adult novel |
| 200,000 | ~150,000 | 600 | A PhD dissertation |
| 1,000,000 | ~750,000 | 3,000 | Harry Potter books 1-5, plus change |
| 2,000,000 | ~1,500,000 | 6,000 | Two Bibles |

Counting tokens in practice

If you want the exact token count for a specific model, use the model's native tokenizer, not a rule of thumb. Three options that cost nothing:

  • OpenAI's tokenizer playground at platform.openai.com/tokenizer. Paste your text, see the exact tokens with color-coded chunks.
  • tiktoken library (Python/JavaScript) if you're building anything programmatic. pip install tiktoken, then four lines of code get you the count (spelled out after this list).
  • Anthropic's count_tokens API endpoint for Claude. It's free and returns the exact count for Claude's tokenizer.
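For the tiktoken route, those four lines look like this. A sketch assuming GPT-4o: encoding_for_model maps a model name to its encoding (o200k_base here), and the count is exact only for OpenAI models.

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")  # resolves to o200k_base
prompt = "Summarize the attached report in ten bullet points."
print(len(enc.encode(prompt)))  # exact token count for GPT-4o
```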

For rough planning, our converter above is accurate within 5 to 10% for English prose. When you're designing a prompt, drafting a long document for summarization, or pricing an API call at scale, use one of the exact tools.

Why this matters for your wallet

If you're paying for API access, tokens are the unit of billing. At October 2026 list prices (always check current pricing before you rely on these numbers):

  • GPT-4o input runs roughly $2.50 per million tokens. That's $2.50 for about 750,000 words of input.
  • Claude Opus 4.6 input is around $15 per million tokens. Pricier, but the context window and reasoning quality justify it for complex work.
  • Gemini 2.5 Flash, the lower-priced tier, is among the cheapest at around $0.15 per million input tokens.

If you're feeding a 50-page report (~12,000 words ≈ 16,000 tokens) into Claude Opus for summarization, that's $0.24 for input alone. Do that a thousand times a day and the math starts mattering. Knowing your token count up front prevents bill shock.
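A back-of-envelope calculator makes that arithmetic repeatable. This is a sketch using the October 2026 list prices quoted above; the dictionary keys are illustrative labels, and the prices should be swapped for current ones before you trust the output:

```python
# USD per million input tokens, per the list prices quoted on this page.
# Verify against current provider pricing before relying on these numbers.
PRICE_PER_M_INPUT = {
    "gpt-4o": 2.50,
    "claude-opus-4.6": 15.00,
    "gemini-2.5-flash": 0.15,
}

def input_cost(tokens: int, model: str) -> float:
    return tokens / 1_000_000 * PRICE_PER_M_INPUT[model]

# The 50-page report example (~16,000 tokens), once and at 1,000 calls/day.
print(f"${input_cost(16_000, 'claude-opus-4.6'):.2f}")          # $0.24 per call
print(f"${input_cost(16_000, 'claude-opus-4.6') * 1_000:.2f}")  # $240.00 per day
```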

Common misconceptions

"A bigger context window means the model remembers everything." Not quite. Research from Anthropic and academic papers have shown that model performance on tasks like needle-in-a-haystack recall degrades in the middle of very long contexts, especially past 100k tokens. The window is how much the model can see, not how well it can use all of it.

"Tokens and characters are the same." No. Tokens are variable-length. A 5-character word like "hello" is one token. A 5-character word like "quote" might also be one token. But a 5-character string like "qxzvw" (nonsense) could be three or four tokens because the tokenizer has no common subword for that sequence.

"Output tokens count the same as input tokens." For the context window, they share the pool — output takes up space in the same window as input. For pricing, output tokens are usually 3 to 5 times more expensive than input tokens on most providers. Generating a 2,000-word essay costs more than inputting one.

Need to count the words first?

Paste your text into our free word counter, then convert to tokens here.

Open Word Counter

Frequently Asked Questions

How many words is 1 token?

About 0.75 words for English prose. So 100 tokens is roughly 75 words, and 1,000 tokens is about 750 words.

How many tokens is 1 word?

About 1.33 tokens for English. 100 words is around 133 tokens, 1,000 words is around 1,333 tokens.

How many words fit in 128k tokens?

Roughly 96,000 words of English prose, about 384 single-spaced pages or a short novel.

Are tokens the same across ChatGPT, Claude, and Gemini?

No. Each provider uses a different tokenizer, so the same text produces slightly different token counts. The 0.75 rule is a useful average, but exact counts require that model's specific tokenizer.

Does code use more tokens than prose?

Yes. Code is token-dense because variable names split aggressively, indentation becomes tokens, and special characters each count. Expect 30 to 60% more tokens per word when tokenizing code vs. prose.

How do I count tokens exactly for a specific model?

Use the model's official tokenizer. For GPT, OpenAI's online tokenizer or the tiktoken library. For Claude, Anthropic's count_tokens API endpoint. For Gemini, the Google AI Studio tokenizer.

Related Tools and Guides