Reading the Meter

Tokens go in. Tokens come out. Who knows what they cost.

Crispin Bailey

04 Jun 2026 • 11 min read

It's a weekday evening in 2027, and there’s a pile of bills on my desk that I’ve been meaning to get to. I pick up the latest bill from the company that supplies our AI. It’s itemized like all the other utility bills. It lists the tokens used on the main model, broken out by person, plus the tokens used by the AI assistant that's been monitoring our investments and tracking our household spending. There's a processing fee, a delivery fee, and a routing surcharge because the system kept switching to the better model whenever one of us asked a question that required extra thinking. Plus a regulatory fee, because someone has to regulate all of this, and a small charge for the report itself, written by the model, explaining what the model did.

The total for the month is much higher than usual, and I already know why. I'd spent the last two weeks working on a new software project, and I left the high intelligence model on auto during peak hours while I was out running an errand. I consider for the umpteenth time whether we need to upgrade to the premium tier, and how to explain this month’s bill to my wife. Somewhere in a building neither of us will ever see, a meter keeps running.

This isn't real yet, but the pieces for it are slowly being put in place. What turns a cheap AI subscription into a household bill, with surcharges and a routing fee and the mental arithmetic of whether you can afford to think at full speed this month, is already being built. What got me thinking about this scene wasn't just the idea of a big bill, though I do believe we'll all be shocked by our AI bills in the not-too-distant future. It was that the numbers on each line would be impossible to predict because there’s no way to estimate or track them in real-time.

The dashboard you don't have

I don't have to imagine that last part because it already works that way. Over the winter holidays I spun up two side projects, spent a week going deep on them, and didn’t get far into my weekly usage allowance before Claude informed me I had hit my limit and refused to keep going unless I upgraded or paid more. Or I could wait three hours for my usage limit to reset. There wasn't a running total displayed after each response, or any warning when I reached fifty or eighty percent of my allowance. Yes, there is a usage tracker, but it's buried in the settings, inconveniently located, intentionally out of sight. And even then, it's just a bar chart with a percentage indicator, and we're supposed to trust that it's accurate. Yet the provider is constantly measuring our usage right down to the last token.

That's something I don’t hear people mention when they say AI is becoming a utility. You are being metered, in a unit called the token (which represents a few characters of text), and the meter is hidden from you. Only developers who access models through an API ever see their actual usage in tokens because they have to pay for every single one. Most people using a chatbot never see a token count. Even your allowance is a moving target. Anthropic doesn't publish the daily limits for Claude's free consumer plans, usage resets every five hours, the paid tiers stack weekly caps on top of the daily caps, and how much you get shifts depending on the model you select, the length of your conversation, the files you attach, the tools you call, and what time of day it is.

A couple of weeks ago I wrote about not trusting the dashboards the industry shows you. This is the other half of the story. Underneath those dashboards a meter is running for everyone, but we aren’t given access to it. The flat subscription fee was always just a teaser, sold below cost while the providers keep adding new features to make them harder and harder to live without. But the bill is starting to come due. GitHub Copilot recently moved to usage-based pricing, and at one point Anthropic was weighing whether to pull advanced coding features out of the lowest paid subscription tier.

The people at the leading edge are already living with this reality. Last summer a developer watched his Cursor bill jump from around $20 to $130 after the company changed the fixed monthly allotment scheme to a usage-based credit system. What he did next is something we might all need to do at some point. He built a meter to watch the meter. The rest of us can install third-party browser extensions or token counters meant for coders, but otherwise we’re left in the dark, trying to ration a resource we’re not able to measure.

Without tools to estimate or measure token usage, costs can quickly get out of hand. A large company reportedly blew $500M in a single month after deploying AI without putting any caps on employee usage. I imagine they’ll be building internal tools to monitor usage and setting token limits going forward.

The meter you can't read

We have metered invisible things before, and it went badly enough that people eventually fixed it. When you think of how utilities are priced, the electric or water bill might spring to mind, but those move slowly and arrive with a number you can roughly guess, because you can feel the air conditioner running and count how many showers your household takes. The closer ancestors are long-distance calls charged by the minute and mobile data charged by the megabyte, both with exorbitant roaming rates that varied by the country you were visiting. Both are abstract units, with a meter you can’t see, a rate you can’t quite pin down, and at the end of the trip, the bill shock.

The term “bill shock” isn't mine. It's what the regulators called it. Before Europe acted, one in five travellers came home to a phone bill at least a hundred and fifty dollars higher than usual, and roaming surcharges cost British customers more than half a billion pounds a year. The fix arrived in stages that should sound familiar by the time AI inevitably reaches them. First the price caps. Then the transparency rules, an alert when you crossed eighty percent of a limit, and an automatic cut-off at fifty euros so the meter couldn't run away in the dark. Then, in 2017, the surcharges went away altogether, and Europeans had saved close to ten billion euros.

There were catches, however. Unlimited came with conditions and clauses that let an operator throttle you past a certain point, and exclusions that kicked in the moment you crossed a border. That is the form AI pricing is taking now, the nominal all-you-can-use plan with a higher fee when the system uses the smarter model and a quiet ceiling you can’t see until you hit it. The people who feel the exclusions first are in the middle, the small studios and the people working side-hustles who need more than the basic tier and can't justify the ultra plan, and anyone in Canada who pays an extra 30-40% on every American price before the surcharges begin. The encouraging part of the roaming story is that the asymmetry wasn't permanent. It was a choice made by the providers, and enough people complained about it that the rules eventually changed.

What it costs the grid

There's a second number the meter hides, and it doesn’t appear on any bill, but it costs us all. Every token is a small act of computation, and computation requires electricity and the water that cools the machines doing it. When he was asked what it costs OpenAI when people type please and thank you into ChatGPT, Sam Altman answered: "tens of millions of dollars, well spent." He meant that as a flex about their scale at the time, but it serves as a clear admission that with AI, even your manners carry a cost.

How much it costs is contestable, but a responsible tally keeps both numbers in view. Altman puts a single query at about a third of a watt-hour and a fifteenth of a teaspoon of water, small enough to wave away. Researchers at UC Riverside put the water behind a hundred-word email closer to three bottles' worth. Both can be true depending on what you count, and either way the aggregate is the thing to watch. Data centres already draw something like one and a half percent of the world's electricity, set to roughly double by the end of the decade.

Efficiency won't save us. Even though models keep getting cheaper to run per token, cheaper tokens will just result in more tokens, per the Jevons paradox, which is in part why there's now a market being built to sell them like a commodity. Treating the token as something that simply flows, the way the old ads taught our parents’ generation to picture electricity flowing into the house, is a habit worth breaking before it sets. The money saved is the small reason. The bigger one is that whatever arrives so cheaply is ultimately still being drawn from a data centre that has to be powered and cooled.

Commodity or currency

While you can't see your own meter, the infrastructure for pricing what it measures is going up at a scale that matches the frontier labs’ ambition. The big American exchanges, CME and Intercontinental Exchange, are preparing contracts on the cost of GPU time, a market BlackRock's Larry Fink has called a new asset class and its boosters compare to the six-trillion-dollar energy market. China, meanwhile, is taking a different approach. The Shanghai Futures Exchange, where copper and aluminium already trade, is drawing up futures tied to the tokens themselves. And Beijing has given the token a name, ciyuan, derived from the word for its own currency, which some people interpret as a bid to turn the unit of AI into a unit of money.

That last move is the strange one. A barrel of oil and a dollar are not the same kind of thing, and the exchanges are betting that the token is both a commodity to be hedged and a currency to be issued. One Semafor piece floats tokens as the first durable digital store of value where earlier digital currencies failed. But a token is only useful in the moment, as it’s consumed or generated, and it isn’t something you can store. So there is clearly still some figuring out to do.

Who gets to read the meter

The only people who can read the meter are the ones who buy their tokens the way the big companies do, by the unit, through the API, watching the cost tick up in real time. Bill Nguyen is one of them. He made a fortune selling companies to Apple, and this past spring he paid for what amounted to an unlimited supply of tokens and used them to build a working model of himself. To get there he handed the system his digital life, the call logs and the location history and the patterns of his memory, his voice included. In a recent interview he explained that he hadn't asked AI to help him. He'd asked it to be him.

It’s worth noting what the price of admission buys at each end of the market. Lower down, surveillance is the toll, and the cheaper the tier, the more of you gets tracked and sold to keep it cheap. ChatGPT now shows ads on its free and budget tiers. At the top end, a similar surrender of the self is the cost of moving faster than everyone else, but at least it comes without ads. I wrote about the two-tier world already, so I'll add only the part that belongs here. What currently divides the tiers is visibility, and whether or not you can see what you're spending. The ability to read your own meter has become a premium service.

Off the grid

There is a way to go off this grid, and it looks a lot like the panels people put on their roofs to stop paying the power company. You can run a model on your own machine, on your own electricity, using your own tokens. Some of the open source options are good enough for ordinary uses, and device makers like Apple could include small local models as a way to offer free AI features, the way a new house might come with solar panels already on the roof. It will never match the frontier because the gap between what fits on a laptop and what runs in a billion-dollar data centre is pretty big. The off-grid option will mostly serve hobbyists and the people priced out of everything else. And there's likely a grey market version coming, with cheap compute run on borrowed or lower-grade hardware, but that's a story for another day. On your own hardware you can finally read the meter, because the meter is your power bill. But you give up access to the best models, and the best intelligence.

Something more community-based could eventually come out of all this. A SETI-style network where people donate spare compute to a shared public pool, or a peer-to-peer market that pays you a few cents for every thousand tokens your idle machine processes for someone else. Whether any of it becomes more than a hobbyist's mutual aid against an industrial grid is the kind of thing nobody can call just yet. But I’m willing to bet that someone's already working on it.

Paid in tokens

Every utility we’ve built in the last century and a half redrew who got access and who didn’t, and the lines were drawn by hand. When the grid went up, the private power companies wrote off rural America as unprofitable, charged the farms they did serve up to four times the city rate, and in the cities they wired the wealthy white neighbourhoods first while communities of colour had to wait. Barely one farm in ten had power in 1934. Those families eventually got access but it came after a decade of public outcry, setting up cooperatives themselves, and a New Deal agency that finally bankrolled them. None of it was inevitable, and none of it happened without a fight.

It should concern all of us that the AI utility is being built out right now, and the argument is barely happening. There's no obligation to serve everyone, no regulator with the standing to write one, and the cheapest tier is being shaped around the people who can only ever afford the cheapest tier. The cellular roaming meter got fixed because enough people decided bill shock was not a law of nature, and they pushed for change. The same door is open here, and it's worth walking through before the meters arrive at home.

Underneath all of this sits a slightly unsettling question. If machine intelligence is ultimately being priced by the token, at what point does an hour of human work get valued against the token too? For every meeting a model could have run, every paragraph it could have drafted, or every idea it might have reached first, someone will inevitably ask if the human cost more or fewer tokens. And as tokens keep getting cheaper to make and more expensive to buy, the new rate for thinking is hiding somewhere in that gap.

Below are two postscripts. Describing these problems without providing solutions felt unresolved, so I created a couple of prototypes to demonstrate how things could look. The first one is something you can use now (with the prompt to build your own), while the second one is something the industry could easily build tomorrow.

I asked Claude to build me a token estimator, which it did in just a few minutes (which tells you how hard this isn't). Paste or type in any amount of text, and it gives you a number, a price, and the potential draw on the grid.

Launch Token Estimator
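
For anyone curious what sits under a tool like this, here is a minimal sketch in Python. It is not the code behind the estimator linked above. The four-characters-per-token ratio, the price per million tokens, and the 500 tokens I assume for a typical query are all placeholder assumptions of mine, and only the third-of-a-watt-hour figure comes from the Altman number quoted earlier. Real tokenizers and real price sheets will give different answers.

```python
import math

# Every constant here is an illustrative assumption, not a published figure,
# except WH_PER_QUERY, which echoes the "third of a watt-hour" quoted in the post.
CHARS_PER_TOKEN = 4             # rough rule of thumb for English text
USD_PER_MILLION_TOKENS = 3.00   # hypothetical price, not any provider's rate
WH_PER_QUERY = 1 / 3            # watt-hours per query
TOKENS_PER_QUERY = 500          # assumed size of a "typical" query


def estimate(text: str) -> dict:
    """Return a rough token count, dollar cost, and energy draw for some text."""
    tokens = math.ceil(len(text) / CHARS_PER_TOKEN)
    usd = tokens * USD_PER_MILLION_TOKENS / 1_000_000
    # Scale the per-query energy figure by how many "typical queries" this is.
    watt_hours = tokens / TOKENS_PER_QUERY * WH_PER_QUERY
    return {"tokens": tokens, "usd": usd, "watt_hours": watt_hours}


if __name__ == "__main__":
    sample = "Tokens go in. Tokens come out. Who knows what they cost."
    print(estimate(sample))
```

The arithmetic is trivial, which is rather the point. The hard part is that the real ratios and rates are the numbers nobody shows you.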

This is a simple UI concept that AI providers could easily ship tomorrow. It includes a meter that lives in the interface instead of in your imagination. Send it a message and watch it climb!

Launch Prototype
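
The logic behind a meter like this can be sketched in a few lines of Python. This is my own illustration, not the prototype's code. The class name, the allowance, and the status wording are hypothetical, and the eighty percent warning is borrowed from the European roaming rules described earlier, not from any AI provider's actual policy.

```python
from dataclasses import dataclass


@dataclass
class UsageMeter:
    """A running token meter with a warning threshold and a hard limit."""
    allowance: int         # tokens available in the current window (hypothetical)
    warn_at: float = 0.8   # warn at 80% of the allowance, as the roaming rules did
    used: int = 0

    def __post_init__(self) -> None:
        if self.allowance <= 0:
            raise ValueError("allowance must be positive")

    def record(self, tokens: int) -> str:
        """Add the tokens from one response and return a status line to display."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        self.used += tokens
        fraction = self.used / self.allowance
        if fraction >= 1.0:
            state = "limit reached"
        elif fraction >= self.warn_at:
            state = "warning"
        else:
            state = "ok"
        return f"{self.used:,} / {self.allowance:,} tokens ({fraction:.0%}) - {state}"


if __name__ == "__main__":
    meter = UsageMeter(allowance=1000)
    for spent in (500, 300, 250):
        print(meter.record(spent))
```

Showing the status line after every response is the design choice that matters here. The counting already happens on the provider's side, so the only missing piece is the display.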

Research and editing assistance for this post was provided by Claude Opus 4.8. Cover art was generated with Midjourney 8.1.