This is absurd. Training an AI is energy intensive but highly efficient. Running inference for a few hundred tokens, doing a search, stuff like that is a triviality.
Each generated token takes the equivalent energy of the heat from burning ~0.06 µL of gasoline, i.e. ~2 joules per token, including datacenter and hosting overhead. If you get up to massive million-token prompts, it can climb to 8-10 joules per token of output. Training runs around 17-20 J per token.
A liter of gasoline gets you about 16,800,000 tokens for normal use cases. Caching and the various efficiency hacks and improvements that come with scale get you into the thousands of tokens per joule for some use cases.
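To make the arithmetic explicit, here's a back-of-envelope sketch in Python; the ~33.6 MJ/L heat of combustion for gasoline is a standard ballpark figure, and the 2 J/token cost is the estimate above, not a measured value:

```python
# Back-of-envelope check of the gasoline-to-tokens numbers above.
GASOLINE_MJ_PER_L = 33.6   # approximate heat of combustion of gasoline
JOULES_PER_TOKEN = 2.0     # estimated inference cost incl. datacenter overhead

joules_per_liter = GASOLINE_MJ_PER_L * 1e6
print(f"{joules_per_liter / JOULES_PER_TOKEN:,.0f} tokens per liter")  # 16,800,000

# Equivalently, gasoline burned per token: 33.6 MJ/L is 33.6 J/µL.
print(f"{JOULES_PER_TOKEN / GASOLINE_MJ_PER_L:.3f} µL per token")      # ~0.060
```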
For contrast, your desktop PC running idle uses around 350k joules per day. Your fridge uses 3 million joules per day.
AI is such a relatively trivial use of resources that you caring about nearly any other problem, in the entire expanse of all available problems to care about, would be a better use of your time.
AI is making the resources allocated to computation and data processing much more efficient, and year over year, the intelligence per token generated keeps rising while the absolute energy cost per token keeps falling.
Find something meaningful to be upset at. AI is a dumb thing to be angry at.
I'm curious where you got any of those numbers. Many laptops use <20 W, but most local AI inferencing requires high-end, power-hungry Nvidia GPUs that draw multiple hundreds of watts. There's a reason those GPUs are in high demand, with prices sky high: those same (or similar) power-hungry chips are in data centers.
Compared to traditional computing, it seems to me like there's no way AI is power-efficient. Especially when so many of the generated tokens are just platitudes and hallucinations.
> The agreed-on best guess right now for the average chatbot prompt's energy cost is actually the same as a Google search in 2009: 0.3 Wh. This includes the cost of answering your prompt, idling AI chips between prompts, cooling in the data center, and other energy costs in the data center. This does not include the cost of training the model, the embodied carbon costs of the AI chips, or the fact that data centers typically draw from slightly more carbon-intense sources. If you include all of those, the full carbon emissions of an AI prompt rise to 0.28 g of CO2. This is the same emissions as we cause when we use ~0.8 Wh of energy.
>
> How concerned should you be about spending 0.8 Wh? 0.8 Wh is enough to:
>
> - Stream a video for 35 seconds
> - Watch an LED TV (no sound) for 50 seconds
> - Upload 9 photos to social media
> - Drive a sedan at a consistent speed for 4 feet
> - Leave your digital clock on for 50 minutes
> - Run a space heater for 0.7 seconds
> - Print a fifth of a page of a physical book
> - Spend 1 minute reading this blog post. If you're reading this on a laptop and spend 20 minutes reading the full post, you will have used as much energy as 20 ChatGPT prompts. ChatGPT could write this blog post using less energy than you use to read it!
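For what it's worth, that 0.3 Wh figure is consistent with the ~2 J/token estimate upthread; here's a rough sanity check, where the tokens-per-answer count is my own assumption rather than something from either source:

```python
WH_PER_PROMPT = 0.3        # average chatbot prompt, per the quote above
JOULES_PER_WH = 3600
TOKENS_PER_ANSWER = 500    # assumed typical answer length, not a sourced number

joules_per_prompt = WH_PER_PROMPT * JOULES_PER_WH              # 1080 J
print(f"{joules_per_prompt / TOKENS_PER_ANSWER:.1f} J/token")  # ~2.2
```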
W stands for Watts, which means Joules per second.
The energy usage of the human body is measured in kilocalories, aka Calories.
Combustion of gasoline can be approximated by conversion of its chemicals into water and carbon dioxide. You can look up energy costs and energy conversions online.
Some AI usage data is public. GPU TDPs are also usually public.
I made some assumptions based on H100s and models around the 4o size. Running them locally changes the equation, of course - any sort of compute that can be distributed is going to enjoy economies of scale and benefit from well-worn optimizations that won't apply to locally run, single-user hardware.
Also, for AI specifically, depending on MoE and other sparsity tactics, caching, hardware hacks, regenerative capture at the datacenter, and a bajillion other little things, the actual number is variable. Model routing like OpenAI does further obfuscates the cost per token - a high-capability 8B model is going to run more efficiently than a 600B model across the board, but even the enormous 2T models can generate many tokens for the equivalent energy of burning a microliter of gasoline.
If you pick a specific model and GPU, or Google's TPUs, or whatever software/hardware combo you like, you can get to the specifics. I chose µL of gasoline to drive the point across: tokens are incredibly cheap, energy is enormously abundant, and we use many orders of magnitude more energy on things we hardly ever think about; it just shows up in the monthly power bill.
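As a minimal sketch of that derivation, assuming an H100's published 700 W TDP; the overhead multiplier and serving throughput below are illustrative assumptions, since the real numbers depend on the model, batching, and the datacenter:

```python
GPU_TDP_WATTS = 700        # H100 SXM TDP, a public spec
OVERHEAD_MULTIPLIER = 1.5  # assumed datacenter/hosting overhead (PUE etc.)
TOKENS_PER_SECOND = 525    # assumed aggregate batched throughput per GPU

joules_per_token = GPU_TDP_WATTS * OVERHEAD_MULTIPLIER / TOKENS_PER_SECOND
print(f"{joules_per_token:.1f} J/token")  # ~2.0
```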
AC and heating, computers, household appliances, lights, all that stuff uses way more energy than AI. Even if you were talking with AI every waking moment, you're not going to be able to outpace other, far more casual expenditures of energy in your life.
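Concretely, using the figures above (the daily token count is my assumption for a very heavy user):

```python
TOKENS_PER_DAY = 100_000       # assumed very heavy chat usage
JOULES_PER_TOKEN = 2.0         # estimate from above
FRIDGE_JOULES_PER_DAY = 3e6    # fridge figure from above

ai_joules = TOKENS_PER_DAY * JOULES_PER_TOKEN  # 200,000 J
print(f"{ai_joules / FRIDGE_JOULES_PER_DAY:.0%} of one fridge's daily energy")  # 7%
```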
A wonderful metric would be average intelligence level per token generated: adjust tokens/joule by an intelligence rank normalized against a human average, then contrast that with the cost per token. That'd tell you the average value per token compared to the equivalent value of a human-generated token. You'd also want a ballpark for human cognitive efficiency, i.e. the tokens per joule of metabolism, for contrast.
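A sketch of what that metric could look like; every input below is a placeholder for illustration (the ~20 W brain power draw is a common estimate, and the quality scores and token rates are pure assumptions):

```python
def value_per_joule(quality: float, tokens_per_second: float, watts: float) -> float:
    """Quality-weighted tokens per joule: (quality * tokens/s) / W."""
    return quality * tokens_per_second / watts

# Hypothetical inputs, for illustration only.
model = value_per_joule(quality=0.8, tokens_per_second=500, watts=1050)  # served LLM
human = value_per_joule(quality=1.0, tokens_per_second=2, watts=20)      # brain at ~20 W

print(f"model: {model:.2f} vs human: {human:.2f} quality-tokens per joule")
```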
Doing something similar for image or music generation would give you a way of valuing the relative capabilities of different models, and a baseline for ranking human content against generations. A well constructed meme clip by a skilled creator, an AI song vs a professional musician, an essay or article vs a human journalist, and so on. You could track the value over context length, length of output, length of video/audio media, size of image, and so on.
Suno and nano banana and Veo and Sora all far exceed the average person's abilities to produce images and videos, and their value even exceeds that of skilled humans in certain cases, like the viral clips of a cat playing an instrument on a porch, or Ghiblification, or Bigfoot vlogs, or the AI country song that hit the charts. The value contrasted with the cost shows why people want it, and some scale of quality gives us an overall ranking, with slop at the bottom up through major Hollywood productions and art at the Louvre and Beethoven and Shakespeare up top.
Anyway, even without trying to nail down the relative value of any given token or generation, the costs are trivial. Don't get me wrong: you don't want a massive datacenter to usurp a small town's potable water and available power infrastructure and then tell the residents to pound sand. There are real issues with making sure massive corporations don't trample individuals and small communities. Local problems exist, but at the global scale, AI is providing a tremendous ROI.
AI doombait generally trots out the local issues and projects them up to a global scale, without checking the math or the claims in a rigorous way, and you end up with lots of outrage and no context or nuance. The reality is that while issues at scale do exist, they're not the issues that get clicks, and the issues with individual use are many orders of magnitude less important than almost anything else any individual can put their time and energy towards fixing.
You are clearly biased.
A complex ChatGPT 5 Thinking prompt runs at 40 Wh. This is more in line with the estimated load that AI needs to scale. These thinking models would be faster but use a similar amount of energy. Humans doing that thinking use far fewer joules than GPT-5 Thinking. It's not even close.
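For scale (taking the 40 Wh claim above at face value; the ~20 W brain draw is a common estimate, and the ten-minute task duration is an assumption):

```python
PROMPT_WH = 40            # claimed cost of a complex thinking prompt
BRAIN_WATTS = 20          # common estimate of human brain power draw
TASK_SECONDS = 10 * 60    # assume ten minutes of human thinking

prompt_joules = PROMPT_WH * 3600           # 144,000 J
human_joules = BRAIN_WATTS * TASK_SECONDS  # 12,000 J
print(f"{prompt_joules / human_joules:.0f}x the human's energy")  # 12x
```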
Your answer seems very specific on joules. Could you explain your calculations? I can't see how you map a liter of gasoline to 16.8M tokens. E.g., does that assume 100% conversion to energy, not taking into account heat loss, transfer loss, etc.?
(For example, simplistically there are 86,400 s/day, so you are saying that my desktop PC idles at 350,000 J / 86,400 s ≈ 4 W, which seems way off even for most laptops, which idle at 6-10 W.)