Self-hosted. No data egress. By Nativerse.

Your AI bill keeps growing. TokenLedger re-counts every checkable token on your own machine.

It logs every billable API call across providers, independently re-counts the text that can be checked, and reconciles its count against the provider's numbers as OK, over-count, or out of band. Nothing you send or receive leaves the box.

Try it in 60 seconds

Request a walkthrough

Runs alongside your LiteLLM gateway. It audits the numbers and routes nothing. Self-hosted, SQLite on your machine, open source under Apache-2.0.

tokenledger report

bucket	billed	counted	confidence
output	89	64	EXACT
input	112	~96	BOUNDED
reasoning	1,024	n/a	UNVERIFIABLE

Illustrative panel. Re-counted locally with the model's own tokenizer.

What it does, step by step.

Re-derive the numbers yourself instead of trusting the provider's, then label every figure by how sure we are.

01 / See

See the call

Capture what you sent, what came back, and the token counts the provider says you used.

02 / Re-count

Re-count locally

Re-tokenise the actual text with the model's own tokenizer. We never ask the provider to count.
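
A minimal sketch of the re-count step, assuming the tokenizer is passed in as a plain function. The names `recount` and `whitespace_encode` are hypothetical and are not TokenLedger's API. The whitespace splitter is only a stand-in so the example runs offline; a real count needs the model's own tokenizer, for example an encoder obtained from tiktoken.

```python
from typing import Callable, Sequence

# An encoder maps text to a sequence of tokens (ids or strings).
Encoder = Callable[[str], Sequence[object]]


def whitespace_encode(text: str) -> list[str]:
    """Stand-in tokenizer for illustration only (assumption, not a real model tokenizer)."""
    return text.split()


def recount(text: str, encode: Encoder) -> int:
    """Count tokens locally by encoding the actual text that was sent or received."""
    return len(encode(text))


response_text = "The quick brown fox jumps over the lazy dog"
local_count = recount(response_text, whitespace_encode)
print(local_count)
```

Keeping the encoder as a parameter means the same function can be pointed at whichever tokenizer matches the model that produced the text.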

03 / Reconcile

Reconcile three ways

Compare our count with the provider's and return a verdict: OK, over-count, or out of band.
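
The comparison above could be sketched as follows. This is an illustration under stated assumptions, not TokenLedger's implementation: the function name `reconcile`, the 25% default tolerance, and the choice to treat a billed figure at or below an exact count as OK are all hypothetical.

```python
from typing import Optional


def reconcile(billed: int, counted: Optional[int], confidence: str,
              tolerance: float = 0.25) -> str:
    """Return a verdict for one bucket of one call.

    `tolerance` is a hypothetical relative band width used for BOUNDED figures.
    """
    if confidence == "UNVERIFIABLE" or counted is None:
        # Nothing to re-count, so no verdict is asserted.
        return "UNVERIFIABLE"
    if confidence == "EXACT":
        # Assumption: only billed above counted is flagged for exact figures.
        return "over-count" if billed > counted else "OK"
    if confidence == "BOUNDED":
        within_band = abs(billed - counted) <= tolerance * billed
        return "OK" if within_band else "out of band"
    raise ValueError(f"unknown confidence label: {confidence}")


print(reconcile(89, 64, "EXACT"))
print(reconcile(112, 20, "BOUNDED"))
print(reconcile(1024, None, "UNVERIFIABLE"))
```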

04 / Price

Price it honestly

Apply pay-per-token or rented-GPU cost models for an effective cost per token you can compare.
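
As a rough illustration of the two cost models, the arithmetic might look like this. The function names and every price and throughput figure below are hypothetical, chosen only to show how both models reduce to a comparable cost per token.

```python
def per_token_cost(price_per_million: float) -> float:
    """Pay-per-token: the provider quotes a price per million tokens."""
    return price_per_million / 1_000_000


def rented_gpu_cost(hourly_rate: float, tokens_per_hour: int) -> float:
    """Rented GPU: spread the hourly rental over the tokens actually processed."""
    if tokens_per_hour <= 0:
        raise ValueError("tokens_per_hour must be positive")
    return hourly_rate / tokens_per_hour


# Hypothetical figures: $10 per million tokens vs. a $2/hour GPU serving 400,000 tokens/hour.
api_cost = per_token_cost(10.0)
gpu_cost = rented_gpu_cost(2.0, 400_000)
print(f"API: ${api_cost:.6f} per token, GPU: ${gpu_cost:.6f} per token")
```

The rented-GPU figure depends heavily on utilisation: the same hourly rate spread over fewer tokens gives a higher effective cost per token.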

How sure we are

We tell you exactly what we can and cannot verify.

Exact

Output tokens on providers with a public tokenizer. We re-tokenise the text you received with tiktoken or the open-weight model's own tokenizer. A billed figure above the re-counted one is a hard discrepancy.

Bounded

Input tokens, and closed models like Claude and Gemini. We re-count what you sent plus documented overhead, and flag figures outside a tolerance band. We never put a dollar figure on an estimate as if it were exact.

Unverifiable

Reasoning tokens and per-call cache tokens. They are billed, but the underlying text is never returned, so there is nothing to re-count. We record the provider's figures and never assert them.

Every result carries its confidence label. The tool never claims proof it does not have.
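
The labelling rules above can be read as a small decision function. The sketch below is one possible encoding of them; the bucket names and the `public_tokenizer` flag are assumptions made for illustration, not TokenLedger's data model.

```python
from enum import Enum


class Confidence(Enum):
    EXACT = "EXACT"
    BOUNDED = "BOUNDED"
    UNVERIFIABLE = "UNVERIFIABLE"


def classify(bucket: str, public_tokenizer: bool) -> Confidence:
    """Pick a confidence label for a token bucket (hypothetical bucket names)."""
    if bucket in {"reasoning", "cache"}:
        # Billed but never returned: nothing to re-count.
        return Confidence.UNVERIFIABLE
    if bucket == "output":
        # Exact only when the returned text can be re-tokenised with a public tokenizer.
        return Confidence.EXACT if public_tokenizer else Confidence.BOUNDED
    if bucket == "input":
        # Input carries provider-side overhead, so it is bounded at best.
        return Confidence.BOUNDED
    raise ValueError(f"unknown bucket: {bucket}")


print(classify("output", public_tokenizer=True).value)
print(classify("output", public_tokenizer=False).value)
print(classify("reasoning", public_tokenizer=True).value)
```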

Sits beside LiteLLM and audits from the outside.

LiteLLM already writes spend logs. Point TokenLedger at them and it audits the numbers from the outside. It does not route or proxy your traffic.

audit your gateway logs

# read what the gateway already wrote, re-count it independently
tokenledger ingest litellm_spendlogs.jsonl --format litellm
→ output re-counted exactly where a tokenizer exists
→ gateways hand back the provider's own number; we re-tokenise and verify it
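
For readers curious what reading such a log involves, here is a minimal sketch of parsing a JSONL spend log into billed figures. The field names (`request_id`, `model`, `prompt_tokens`, `completion_tokens`) and the sample rows are assumptions for illustration; check them against the logs your gateway actually writes. The helper `read_spend_logs` is hypothetical and is not the `tokenledger ingest` implementation.

```python
import io
import json

# Hypothetical sample standing in for a spend-log file; real field names may differ.
SAMPLE = io.StringIO(
    '{"request_id": "r1", "model": "gpt-4o", "prompt_tokens": 112, "completion_tokens": 89}\n'
    '\n'
    '{"request_id": "r2", "model": "gpt-4o", "prompt_tokens": 40, "completion_tokens": 12}\n'
)


def read_spend_logs(lines):
    """Yield one billed-usage record per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        row = json.loads(line)
        yield {
            "request_id": row.get("request_id"),
            "model": row.get("model"),
            "billed_input": int(row.get("prompt_tokens", 0)),
            "billed_output": int(row.get("completion_tokens", 0)),
        }


records = list(read_spend_logs(SAMPLE))
print(len(records), records[0]["billed_output"])
```

With a real file, the same generator can be fed an open file handle instead of the in-memory sample.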

A planted over-count, caught and labelled.

The offline demo plants realistic discrepancies and catches them, with every figure carrying its EXACT, BOUNDED, or UNVERIFIABLE label.

tokenledger demo

bucket	billed	re-counted	verdict
output	89	64	over-count, EXACT
input	112	~20	out of band, BOUNDED
reasoning	1,024	n/a	UNVERIFIABLE

Offline demo, planted discrepancies. Figures are illustrative, not a measured customer result.

Run it yourself in 60 seconds.

terminal

pip install "tokenledger[exact]"
tokenledger demo
open tokenledger_demo.html

No signup, no API keys, nothing leaves your machine.

View on GitHub

Questions, answered plainly.

Does my prompt or response data leave my network?

No. All counting and reconciliation run locally. The only network call the system makes is in the optional proxy mode, which forwards your own request to the provider you chose. You can also choose to store text as hashes only.
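
As an illustration of hash-only storage, a record could keep a digest of the text rather than the text itself. This is a sketch under assumptions: the helper name `text_fingerprint` and the choice of SHA-256 are hypothetical, and TokenLedger's actual scheme may differ.

```python
import hashlib


def text_fingerprint(text: str) -> str:
    """Return a hex SHA-256 digest so the record can be matched later without storing the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical stored record: token count plus a digest, no raw prompt text.
stored = {"prompt_sha256": text_fingerprint("What is our Q3 revenue?"), "billed_input": 112}
print(len(stored["prompt_sha256"]))
```

Note that an unsalted digest of a short, guessable prompt can be matched by anyone who tries the same text, so hash-only storage limits exposure rather than removing it.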

Do I replace my gateway?

No. TokenLedger runs alongside LiteLLM and audits the numbers. It routes nothing.

Can you verify Claude or Gemini token counts exactly?

No, and we say so. Closed models are bounded, not exact. We flag figures outside a tolerance band and never present an estimate as an exact result.

What about reasoning tokens?

Recorded, never asserted. They are billed but not returned, so there is nothing to re-count.

Is this validated with real customers?

The reconciliation engine, store, dashboard, and report are working and tested offline. Demand and the width of the closed-model tolerance band are still being validated with design partners. We will not claim a result we have not measured.

See it on your own logs.

We run a small number of seven-day validations with teams whose AI spend is growing. Bring a sample of your gateway logs and we reconcile them with you, with no data leaving your environment.

Prefer to talk first? Book a 20-minute call.