New estimates have ChatGPT using 10x less power than previously thought

Posted by Matt Birchler
— 2 min read

Josh You published a study on Epoch AI: How Much Energy Does ChatGPT Use?

A commonly-cited claim is that powering an individual ChatGPT query requires around 3 watt-hours of electricity, or 10 times as much as a Google search.

However…

We find that typical ChatGPT queries using GPT-4o likely consume roughly 0.3 watt-hours, which is ten times less than the older estimate. This difference comes from more efficient models and hardware compared to early 2023, and an overly pessimistic estimate of token counts in the original estimate.
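The arithmetic behind that revision is easy to sanity-check. The wattage figures below come from the study and the older claim; the LED-bulb comparison is my own, assuming a 10 W bulb:

```python
# Back-of-envelope check of the revised estimate. The 3 Wh and 0.3 Wh
# figures are from the quoted study; the 10 W LED bulb is an assumed
# reference point for scale, not something the study uses.
old_estimate_wh = 3.0   # commonly cited Wh per ChatGPT query
new_estimate_wh = 0.3   # Epoch AI's revised Wh per GPT-4o query

ratio = old_estimate_wh / new_estimate_wh
print(f"Old estimate is {ratio:.0f}x the new one")

# 0.3 Wh expressed as runtime of a 10 W LED bulb
led_watts = 10
seconds = new_estimate_wh / led_watts * 3600
print(f"0.3 Wh is a 10 W LED bulb running for about {seconds:.0f} seconds")
```

In other words, under the new estimate a query costs roughly two minutes of a single LED bulb.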

The study is worth reading in full, but as I’ve written several times in member posts, I’ve struggled to get too upset about the energy use of LLMs, even under the previous, potentially 10x-overstated power estimates. If the average ChatGPT request is only as power-hungry as the average Google search, and if much of last month’s DeepSeek drama was about how much less power that model and others like it use than 4o, then what are we even doing making this a key argument against LLMs? Even with the more pessimistic numbers, this stat stood out to me when I compared Midjourney’s power use to some household items:

A headless [M4] Mac mini that just sits there and does nothing all day is using just as much power as someone who generates 200 images on Midjourney that same day.
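That comparison reduces to simple arithmetic. Note the ~4 W idle draw for an M4 Mac mini and the implied per-image energy are illustrative assumptions on my part, not measured figures:

```python
# Sketch of the Mac mini vs. Midjourney comparison. The ~4 W idle draw
# and the resulting per-image energy are assumptions for illustration,
# not measurements.
idle_watts = 4                 # assumed M4 Mac mini idle power draw
hours_per_day = 24
mini_wh_per_day = idle_watts * hours_per_day   # daily idle energy in Wh

images_per_day = 200
wh_per_image = mini_wh_per_day / images_per_day
print(f"Mac mini idling all day: {mini_wh_per_day} Wh")
print(f"Implied energy per image: {wh_per_image:.2f} Wh")
```

If the idle draw is around 4 W, the day works out to roughly 96 Wh, or about half a watt-hour per image across 200 generations.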

As I’ve written in those “incomplete thought” posts, I’m happy to be shown why these comparisons are incorrect. But in our nerd community, someone with an always-on Mac mini is a quirky, good nerd, while someone who uses ChatGPT or Claude a bit is boiling the oceans.

Here’s another way I put it in that piece, comparing the power use of LLMs to your furnace:

Actually, since each image uses as much energy as your furnace does in each second it runs, it would be more energy efficient for you to have an LLM turn off your furnace than to walk across the house to manually turn the dial.
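The furnace comparison also reduces to a unit conversion. The 10 kW furnace draw and the ~3 Wh per-image figure below are my own illustrative assumptions, chosen to match the pessimistic older estimates:

```python
# One second of furnace runtime vs. one image generation. The 10 kW
# furnace draw and ~3 Wh per image are assumed figures for illustration.
furnace_watts = 10_000                        # assumed electric furnace draw
furnace_wh_per_second = furnace_watts / 3600  # Wh consumed in one second
image_wh = 3.0                                # pessimistic per-image estimate

print(f"Furnace, one second: {furnace_wh_per_second:.2f} Wh")
print(f"One image: {image_wh:.2f} Wh")
```

A 10 kW furnace burns about 2.8 Wh per second of runtime, which is in the same ballpark as the ~3 Wh pessimistic per-query figure.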

The point of that article was not to tell you to start using LLMs to do everything for you; it was that how you present data matters a ton, and I thought the reporting on LLM power use presented it in a context-free way that made it look as bad as possible. As I said in those member posts, I’m open to being shown good data on why this is so much worse than it seems.