Prompt Utilities

Token Counter

A token is not a character and not a word, and the gap between them is where API bills come from. Paste a prompt to see an honest token estimate (a range, never a fake-precise number), how it shifts with content type and model, and what it costs per call and per 1,000 calls. This tool answers "how many?" and "how much?"; for "will it fit?", use the Context Window Estimator.

Text to Count *

Paste a prompt, document, transcript, or code. The report shows counts only, never your text.

Model

Family sets the token factor and the price; the same text counts a little differently per tokenizer.

Assumed Response

Output tokens usually cost more than input tokens; this sets the assumed reply size for the cost line.

AI Resource Library

Resources for this tool

Resource

Calculate AI API Cost for a Prompt

Turn a prompt into a dollar figure: input cost, output cost, combined per call, and the number that actually matters — cost per 1,000 calls.

Prompt Engineering

Resource

Count Tokens Before Sending to the API

A quick pre-flight check: count a system prompt's tokens and cost before it ships, so the bill and the size hold no surprises.

Prompt Engineering

Resource

Count Tokens in Code

Code is not prose: symbols, indentation, and punctuation push it to more tokens per character. This counts a real snippet so the difference is visible.

Prompt Engineering

Resource

Estimate Cost per 1,000 Calls

A classification call costs a fraction of a cent — until you run a million of them. This prices a small repeated prompt at the scale that actually bills.

Prompt Engineering

Resource

Estimate Token Usage Before You Run It

Know how many tokens a job will consume before you send it — input plus an assumed response, costed per call and at scale.

Prompt Engineering

Resource

Reduce Token Usage to Cut Cost

Measure first, then trim. This counts a padded, over-polite prompt so you can see the tokens the filler is costing — before you cut it.

Prompt Engineering

Resource

Token Counter for AI Prompts

Paste a prompt, get an honest token estimate — a range, not a fake-precise number — plus the cost across GPT, Claude, and Gemini.

Prompt Engineering

Resource

Tokens vs Characters — Why They Differ

A token is not a character. This shows the gap on a real prompt — characters, tokens, and the ratio between them — so the difference stops being abstract.

Prompt Engineering

Resource

Tokens vs Words — How Many Tokens per Word

Word count feels intuitive, but models bill tokens. This shows tokens per word on a real article so you can convert between the two with eyes open.

Prompt Engineering

Resource

Why Token Counts Vary Between Models

The same text is a different number of tokens on GPT, Claude, and Gemini. This shows the spread on multilingual text — which is why an honest count is a range.

Prompt Engineering

Workflows

Workflows that use this tool

Workflow

AI Cost Optimization Workflow

Cut what an AI feature costs without dumbing it down — price the prompt as it runs today, see where the tokens go, trim the waste, and re-measure to prove the saving holds at scale.

4 steps, 25–45 minutes

Projects

Projects that use this tool

Project

Build a SaaS MVP with AI

The full path from idea to a shipped SaaS MVP — define and scope the requirements, design the architecture, API, and data model, then build it reviewed, tested, secured, cost-controlled, and deployed.

11 stages, Product Build

Project

Build an AI Support Agent with AI

The full path to a support agent you can put in front of customers — write its instructions, ground it in your docs, route and handle tickets, then evaluate and cost-control it before it goes live.

10 stages, AI Systems

Guides for this tool

Guide

How to count tokens in a prompt before you send it

Counting a prompt's tokens before you send it tells you whether it fits the model, what it will cost, and whether the end might get cut off. Here's how to check and trim.

Prompt Engineering

How it works

Paste a prompt, document, transcript, or code; pick a model (GPT-5, Claude Opus, Claude Sonnet, or Gemini Pro) and an assumed response length. The Token Estimation Engine detects the content type deterministically (Prose, Code, Mixed, or CJK-heavy), applies a content-aware characters-per-token ratio, then a mild per-family factor, and produces a token estimate as a range, never a single false-precise number.

Click Count Tokens for the report:

Token estimate: the range, plus tokens per word and per character
Cost Estimate: input cost, output cost for the assumed response, combined per call, and per 1,000 calls
Model Notes: the same text estimated across all four models (a count, not a fit check)
Usage Guidance: a small/medium/large scale reference
Estimation Notes: an explanation of why the number varies

Everything runs in your browser; the report shows counts only and never echoes your text. The figures are honest estimates, not tokenizer output, and pricing is labeled approximate and dated because providers change rates.
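
The estimation steps described above can be sketched roughly as follows. This is an illustration only, not the tool's actual engine: the characters-per-token ratios, the per-family factors, the 10% spread, the detection thresholds, and the function names are all placeholder assumptions chosen for the example.

```python
import math
import re

# Hypothetical characters-per-token ratios per content type (assumed values).
CHARS_PER_TOKEN = {"prose": 4.0, "code": 3.0, "mixed": 3.5, "cjk": 1.5}

# Hypothetical per-family adjustment factors (assumed values).
FAMILY_FACTOR = {"gpt": 1.00, "claude": 1.05, "gemini": 0.98}

SPREAD = 0.10  # assumed +/-10% band around the point estimate


def detect_content_type(text: str) -> str:
    """Classify text with fixed rules, so the same input always gives the same type."""
    if not text:
        return "prose"
    # Share of characters in common Japanese, Chinese, and Korean blocks.
    cjk = len(re.findall(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]", text))
    if cjk / len(text) > 0.3:
        return "cjk"
    # Share of characters that are typical code punctuation.
    symbols = len(re.findall(r"[{}()\[\];=<>]", text))
    density = symbols / len(text)
    if density > 0.05:
        return "code"
    if density > 0.02:
        return "mixed"
    return "prose"


def estimate_tokens(text: str, family: str) -> tuple[int, int]:
    """Return a (low, high) token range rather than a single number."""
    ratio = CHARS_PER_TOKEN[detect_content_type(text)]
    point = len(text) / ratio * FAMILY_FACTOR[family]
    return math.floor(point * (1 - SPREAD)), math.ceil(point * (1 + SPREAD))
```

Because every step is plain arithmetic on the character count, calling `estimate_tokens` twice with the same text and family returns the same range.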

Best for

Getting a fast, honest token count and cost for a prompt
Pricing an API call per request and per 1,000 calls
Comparing how the same text counts across GPT, Claude, and Gemini

Not for

Deciding whether it will fit a context window — that is the Context Window Estimator
Counting human units like characters and words — that is the Character Counter

Use cases

Getting a fast, honest token count for a prompt
Pricing an API call per request and per 1,000 calls
Seeing how the same text counts differently across GPT, Claude, and Gemini
Understanding why tokens are not characters or words

Pro tips

Trust the range, not a single number. Tokenizers genuinely disagree on the same text, so an honest count is a range — anyone showing one exact figure for every model is rounding away the truth.
Watch the per-1,000-calls line, not the per-call one. A single call costs a fraction of a cent, which is exactly why teams underestimate the bill; the scaled figure is the real budget.
Mind the content type. Code tokenizes more densely than prose, and CJK and other non-Latin scripts use more tokens per character; the detected type tells you which ratio is in play.
Output usually costs several times more than input. Set a realistic assumed response length so the cost line reflects the whole call, not just the prompt half.
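
The arithmetic behind the per-call and per-1,000-calls lines can be sketched as below. The function name and the rates used in the usage note are made up for illustration; they are not any provider's published prices.

```python
def call_cost(input_tokens: float, output_tokens: float,
              input_rate: float, output_rate: float) -> dict:
    """Cost breakdown for one call.

    Rates are in USD per 1,000,000 tokens (an assumed unit for this sketch).
    """
    input_cost = input_tokens / 1_000_000 * input_rate
    output_cost = output_tokens / 1_000_000 * output_rate
    per_call = input_cost + output_cost
    return {
        "input": input_cost,
        "output": output_cost,
        "per_call": per_call,
        "per_1000_calls": per_call * 1000,  # the figure to budget against
    }


# Placeholder rates: $2 per million input tokens, $10 per million output tokens.
breakdown = call_cost(input_tokens=500, output_tokens=300,
                      input_rate=2.0, output_rate=10.0)
```

With these placeholder numbers a call comes to about $0.004, which looks negligible, while 1,000 calls come to about $4, and the shorter response still accounts for most of the cost.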

FAQ

How is this different from the Context Window Estimator?

They answer different questions. The Token Counter answers "how many tokens, and how much does it cost?" and returns a number and a price. The Context Window Estimator answers "will it fit?": it subtracts a response budget from a model's context window and returns a fit verdict (Safe, Near Limit, Will Not Fit) plus routing to the Long Prompt Splitter. The Counter gives you a count; the Estimator makes a decision. They share the same underlying estimation math and cross-link to each other.
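
To make the contrast concrete, here is a minimal sketch of the kind of fit check described above. The function name and the 90% "Near Limit" threshold are assumptions for illustration; the Context Window Estimator's real cut-offs are not stated here.

```python
def fit_verdict(prompt_tokens_high: int, context_window: int,
                response_budget: int, near_limit: float = 0.9) -> str:
    """Turn a token count into a decision.

    prompt_tokens_high is the upper end of the token range, so the check
    stays conservative. near_limit is an assumed threshold (90%).
    """
    available = context_window - response_budget  # room left for the prompt
    if available <= 0 or prompt_tokens_high > available:
        return "Will Not Fit"
    if prompt_tokens_high > available * near_limit:
        return "Near Limit"
    return "Safe"
```

The count is only an input here; the verdict depends on the window and the response budget, which is why the two tools are separate.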

How is this different from a character counter?

A character counter measures text units a human reads — characters, words, lines. This measures model units — tokens, the sub-word chunks a model actually processes and bills. A token is neither a character nor a word: English averages roughly four characters and three-quarters of a word per token, and the ratio shifts with language and content. If you need a character or word count for a platform limit, that's the Character Counter; if you need to know what the model will see and charge, that's this tool.
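
The relationship between the three units can be sketched with the rule of thumb quoted above. The default of four characters per token is only the rough English average, and the function name is hypothetical.

```python
def text_ratios(text: str, chars_per_token: float = 4.0) -> dict:
    """Compare human units (characters, words) with an estimated token count.

    chars_per_token defaults to the rough English average; other languages
    and content types need a different ratio.
    """
    chars = len(text)
    words = len(text.split())
    est_tokens = chars / chars_per_token
    return {
        "chars": chars,
        "words": words,
        "est_tokens": est_tokens,
        "tokens_per_word": est_tokens / words if words else 0.0,
    }


ratios = text_ratios("Tokens are not words and not characters.")
```

For this 40-character, 7-word sentence the estimate is 10 tokens, so the three counts are all different numbers for the same text.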

Are these exact token counts?

No, and they don't claim to be. These are character-based estimates with content-type awareness, reported as a range and never presented as an exact token count. The only way to get an exact figure is to run the model's own tokenizer, and even then it's exact only for that one model, because every tokenizer is different. The range is the honest answer; a single confident number across all models would be a lie.

Why does the same text show different token counts per model?

Because each model has its own tokenizer. GPT, Claude, and Gemini split text into tokens differently, so the same string is a different count on each — the difference is largest for code and non-English text. The report shows all four model estimates side by side precisely so you can see the spread, and it's why the headline estimate is a range rather than a single point.

How accurate is the cost estimate?

The cost is the token estimate multiplied by the model's published rate, so it inherits the token range and adds pricing uncertainty on top. Prices are labeled approximate and dated (currently June 2026) because providers change them without notice — always confirm current rates before relying on a number for budgeting. For exact spend, your provider's billing dashboard is authoritative; this tool is for planning, not invoicing.
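
Since the cost inherits the token range, it is itself a range. A minimal sketch of that propagation, with a hypothetical function name and rates expressed in USD per million tokens as an assumed unit:

```python
def cost_range(token_range: tuple[int, int], output_tokens: int,
               input_rate: float, output_rate: float,
               calls: int = 1000) -> tuple[float, float]:
    """Carry a (low, high) input-token range through to a (low, high) cost.

    Rates are USD per 1,000,000 tokens; the values passed in are whatever
    approximate, dated prices you are planning with.
    """
    def total(input_tokens: int) -> float:
        per_call = (input_tokens * input_rate
                    + output_tokens * output_rate) / 1_000_000
        return per_call * calls

    low_tokens, high_tokens = token_range
    return total(low_tokens), total(high_tokens)
```

Any error in the rates shifts both ends of the result, which is the pricing uncertainty that sits on top of the token range.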

Can I use this to lower my token usage?

Yes, as the measurement half. Count a prompt to see what padding and verbosity cost in tokens, then trim it — but the trimming itself, removing redundancy and noise without changing meaning, is the Prompt Cleaner's job. The loop is: count here, clean there, count again, and watch the per-1,000-calls figure drop. This tool measures; the Cleaner cuts.

Will I get the same number every time?

Yes — the estimate is fully deterministic: the same text and model always produce the same range, because it's character-based math, not an AI guess or a sampled API call. The report shows counts only and never echoes your content back. Copy or download it to use wherever you work.
