Technology Intermediate Free Analysis

How to Tame AI’s Voracious Appetite for Energy

Katarina Zimmer · Knowable Magazine · 2026 · 8 min read · ~2,200 words

Summary

What This Article Is About

Katarina Zimmer surveys the rapidly growing energy cost of artificial intelligence and the scientific race to address it. The problem originates in the transformer architecture, the design underpinning most large language models, whose computation grows quadratically with text length and which demands enormous computation both during training and, more worryingly, during everyday inference (generating responses). US data centres consumed 224 terawatt-hours of electricity in 2025, over 5% of the country’s total and up from just 1.9% in 2018, with much of the new demand met by fossil-fuel-powered plants. Without intervention, data centres could emit the equivalent of 44 megatons of CO₂ annually, comparable to Norway’s total yearly emissions.

Researchers are pursuing solutions on multiple fronts. On the software side, smaller task-specific models can use over 90% less energy than full LLMs, while “mixture of experts” architectures cut computation by activating only part of a model for each query. Alternative model designs such as xLSTM avoid the “quadratic curse” of transformers. On the hardware side, wafer-scale chips, custom AI processors, neuromorphic chips inspired by the brain, and photonic computing that uses light rather than electrons all promise significant gains. Beyond the machines themselves, better siting of data centres near renewable energy sources and policy intervention from governments are also critical. Zimmer closes by raising a broader question: even a more efficient AI must still justify its energy use, and not every problem needs an AI solution.

Key Points

Main Takeaways

Inference Is the Bigger Energy Problem

Training an LLM once is energy-intensive, but the real concern is inference — running the model for billions of users every day. As one expert puts it: “You train once, then you inference for a billion people in the world.”

Smaller, Specialised Models Save Over 90% Energy

A 2025 UNESCO study found that task-specific small models like DistilBART consumed more than 90% less energy than Meta’s full-scale Llama 3.1 model when used for the same tasks.

Transformer Architecture Has a “Quadratic Curse”

Doubling the length of text in a transformer model quadruples the number of computations required. Alternative architectures like xLSTM avoid this by storing a running summary rather than processing the full text each time.
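The quadrupling can be checked with a few lines of arithmetic. The Python sketch below assumes a simplified cost model in which every token is compared with every token, so the work is the square of the text length. The function name `pairwise_comparisons` is hypothetical, and real models carry other costs that this ignores.

```python
def pairwise_comparisons(num_tokens: int) -> int:
    """Count token-to-token comparisons under a simplified all-pairs model."""
    # Assumption: every token is weighed against every token (itself included),
    # so the count is the square of the text length.
    return num_tokens * num_tokens


short_text = pairwise_comparisons(1_000)
long_text = pairwise_comparisons(2_000)

# Doubling the length multiplies the work by four, not two.
print(long_text / short_text)  # 4.0
```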

Renewable Offsets Are Not Enough

Buying renewable energy credits elsewhere while running fossil-fuelled data centres merely keeps CO₂ emissions “in stasis” — it does not reduce them. Each new fossil megawatt installed, Masanet says, “sets us back on our progress.”

Better Siting Could Cut Footprints by Over 70%

Moving data centres to locations with more abundant renewable energy and water — such as the US Midwest — combined with efficient hardware and software could reduce carbon footprints by 73% and water footprints by 86%.

Not Every Problem Needs an AI Solution

Beyond making AI more efficient, experts argue we should ask where AI is truly needed. Using AI for tasks like customer service chatbots may not justify the environmental cost, regardless of how efficient the underlying models become.

Article Analysis

Breaking Down the Elements

Main Idea

AI’s Energy Problem Is Structural — and Solvable

Zimmer’s core argument is that AI’s enormous energy consumption is not an unavoidable feature of the technology but a consequence of specific architectural and infrastructure choices that can be redesigned. From software algorithms to chip design to site selection, multiple paths exist to significantly reduce AI’s environmental footprint — but they require active investment, policy support, and a willingness to ask whether AI is actually needed for a given task in the first place.

Purpose

To Inform and Galvanise Action Before the Window Closes

Zimmer writes for Knowable Magazine — a publication that translates peer-reviewed science for general audiences. Her purpose is both explanatory (helping readers understand why AI is so energy-hungry) and prescriptive (surveying solutions with enough specificity to move the conversation beyond vague concerns). The article is addressed as much to policymakers and industry as to curious readers, using expert voices to communicate urgency without alarmism.

Structure

Personal Hook → Scale of Problem → Root Causes → Software Solutions → Hardware Solutions → Siting & Policy → Broader Reflection

Zimmer opens with an intimate, first-person moment (coffee in Berlin, a question to Gemini) that immediately grounds the abstract problem in lived experience. She then escalates through scale data, causal explanation, and a layered set of solutions — software, then hardware, then location and governance — before stepping back to question whether efficiency alone is the answer. The structure moves from personal to planetary, and from diagnosis to prescription.

Tone

Informative, Concerned & Constructively Hopeful

Zimmer writes with the measured concern of a science journalist who has reported closely on the energy transition. She does not catastrophise — she balances alarming figures with concrete progress — but she does not minimise the stakes either. Fengqi You’s closing assertion that “we could really reshape the trajectory” captures her overall tone: urgent but not despairing, focused on solutions rather than blame.

Key Terms

Vocabulary from the Article

Inference

noun

In AI, the process of using a trained model to generate responses to new queries — what happens every time a user interacts with a chatbot, as distinct from the initial training phase.

Transformer architecture

noun phrase

A neural network design introduced in 2017 that processes text by simultaneously weighing each word’s relationship to every other word; the foundation of most modern LLMs including ChatGPT and Gemini.

Terawatt-hour

noun

A unit of energy equal to one trillion watt-hours; used to measure electricity consumption at national or industrial scale. US data centres consumed 224 terawatt-hours in 2025.
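To make the unit concrete, the following Python sketch converts the 224 terawatt-hour figure into watt-hours and works out the national total it implies. It is only illustrative arithmetic: the helper names are hypothetical, and because the summary says "over 5%", using exactly 5% gives an upper bound on the total rather than the actual value.

```python
WH_PER_TWH = 10**12  # one terawatt-hour is one trillion watt-hours


def twh_to_wh(twh: float) -> float:
    """Convert terawatt-hours to watt-hours."""
    return twh * WH_PER_TWH


def implied_total_twh(part_twh: float, share: float) -> float:
    """Total consumption implied if part_twh were exactly `share` of it."""
    return part_twh / share


data_centre_wh = twh_to_wh(224)
# "Over 5%" means the true national total is below this bound.
upper_bound_total_twh = implied_total_twh(224, 0.05)

print(f"{data_centre_wh:.3e} Wh")
print(f"below {upper_bound_total_twh:.0f} TWh in total")
```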

Hyperscale

adjective

Describing data centres of extreme size and power demand — typically using a gigawatt or more of electricity — designed to support the largest AI and cloud computing workloads.

GPU

noun

Graphics Processing Unit — a chip originally designed for parallel rendering in video games, now the dominant processor for AI computations because it can perform many calculations simultaneously.

Neuromorphic computing

noun phrase

An approach to computer engineering inspired by the structure and energy efficiency of the human brain, using spike-based signals and components that rest when idle rather than running continuously.

Transistor

noun

A microscopic electronic switch inside a computer chip that processes data by switching between on and off states; the basic building block of all modern processors.

Photonic computing

noun phrase

A computing method that uses photons (particles of light) rather than electrons to process and transmit information, potentially enabling faster and more energy-efficient calculations.

Tough Words

Challenging Vocabulary

Quadratically (kwod-RAT-ik-lee)

Growing in proportion to the square of a quantity — so doubling one variable causes the dependent variable to quadruple, rather than merely double.

“The number of computations the model performs… increases quadratically relative to the length of text (i.e., doubling the length of text quadruples the number of computations).”

Stasis (STAY-sis)

A state of no change or progress; equilibrium — here used to describe a situation where carbon emissions are held constant rather than actually reduced.

“This strategy — at best — keeps CO₂ emissions of centers in stasis rather than reducing them to a net of nothing.”

Labyrinthine (lab-uh-RIN-thin)

Resembling a labyrinth — extremely complex, intricate, and difficult to navigate; here used to evoke the enormous, maze-like scale of data centre server halls.

“Somewhere inside the data center’s labyrinthine halls of stacked processors, my query gets converted into numbers…”

Pervasively (per-VAY-siv-lee)

Spreading widely through an area or group; present throughout in a thorough and widespread manner — used here to describe the hoped-for mainstream adoption of optical chips across data centres.

“Joshi hopes that, ‘in 10 years, we would have a practical solution that can be deployed pervasively across the data centers’.”

Frugal (FROO-gul)

Sparing in the use of resources; economical in a way that avoids waste — applied to AI here to describe the goal of building systems that accomplish tasks using as little energy as possible.

“AI’s energy cost will ultimately be a balancing act… though building a more frugal, energy-saving AI is important…”

Voracious (vuh-RAY-shus)

Having an extremely eager and seemingly insatiable appetite, whether for food or — metaphorically — for resources like energy or data.

“How to tame AI’s voracious appetite for energy” — the article’s title uses this word to convey the scale and urgency of AI’s resource consumption.

Reading Comprehension

Test Your Understanding

5 questions covering different RC question types

1. True or false: According to the article, GPUs were originally invented specifically for AI computations and were later adapted for video gaming.

2. According to the article, why does the xLSTM model use less energy than transformer-based models when generating long responses?

3. Which sentence best explains why tech companies’ strategy of buying renewable energy credits elsewhere does not adequately address the environmental cost of their data centres?

4. Evaluate whether each of the following statements is supported by the article.

According to IEA estimates, US data centres consumed approximately 224 terawatt-hours of electricity in 2025, representing more than 5% of the country’s total electricity use.

The article states that LSTM models are now considered superior to transformer models and are being widely adopted by major tech companies to replace them.

Wafer-scale chips are described in the article as consuming 143 times less electricity for communication than comparable GPUs, but carrying a greater risk of damage during manufacturing.

5. The article notes that some state and local governments are introducing policies that “mostly aim to incentivize and accelerate data center builds.” What concern does this detail implicitly raise about the overall policy landscape?

FAQ

Frequently Asked Questions

Training an AI model is a one-time event — expensive, but finite. Inference, by contrast, happens continuously, at enormous scale, every time any of the model’s hundreds of millions or billions of users asks it a question. As the article explains: “You train once, then you inference for a billion people in the world.” With ChatGPT alone receiving billions of queries every week, even individually small per-query energy costs accumulate into an enormous and growing total. Training GPT-4 may have consumed 50–60 gigawatt-hours, but the ongoing inference load dwarfs that figure many times over.
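A rough break-even calculation shows how quickly many small queries can add up to a training run. The Python sketch below uses the lower end of the 50–60 gigawatt-hour training estimate quoted above. The per-query energy of 0.3 watt-hours is a placeholder assumption for illustration only, not a figure from the article, and `queries_to_match_training` is a hypothetical helper.

```python
WH_PER_GWH = 10**9  # one gigawatt-hour is one billion watt-hours


def queries_to_match_training(training_gwh: float, wh_per_query: float) -> float:
    """Number of queries whose combined inference energy equals the training energy."""
    return training_gwh * WH_PER_GWH / wh_per_query


# 50 GWh: lower end of the training estimate quoted in the text.
# 0.3 Wh per query: placeholder assumption, NOT taken from the article.
break_even = queries_to_match_training(50, 0.3)
print(f"{break_even:.2e} queries")
```

Under these assumed numbers, inference matches the training energy after roughly 170 billion queries. The per-query value is the uncertain input, and the result scales inversely with it.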

Transformer models process language by weighing every word against every other word in the text. The approach is computationally powerful, but its cost scales quadratically with text length: doubling the amount of text quadruples the computation rather than doubling it. For short prompts this is manageable, but as responses grow longer the energy cost rises steeply. The xLSTM model sidesteps this by maintaining a compressed running summary rather than re-processing the full, growing text each time it generates a new word, which keeps the cost of each new word roughly constant regardless of length.
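The contrast between looking back over a growing text and updating a fixed-size summary can be sketched with a toy cost model in Python. This is not an implementation of xLSTM or of a transformer. It only counts abstract "work units" per generated word, and the function names and the `state_size` parameter are hypothetical.

```python
def attention_style_steps(num_new_tokens: int) -> list[int]:
    """Work per generated token when each token looks back at all earlier ones."""
    # The token at position p is compared with p tokens, so work grows per step.
    return [position for position in range(1, num_new_tokens + 1)]


def summary_style_steps(num_new_tokens: int, state_size: int = 1) -> list[int]:
    """Work per generated token when only a fixed-size summary is updated."""
    # Each step touches the same fixed-size state, however long the text is.
    return [state_size for _ in range(num_new_tokens)]


for length in (1_000, 2_000):
    growing = sum(attention_style_steps(length))
    fixed = sum(summary_style_steps(length))
    print(length, growing, fixed)
# 1000 500500 1000
# 2000 2001000 2000
```

In this model, doubling the length roughly quadruples the total for the look-back approach, while the running-summary total only doubles.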

A “mixture of experts” model is a large AI system that is internally divided into specialised sub-models, each suited to a different type of task or language pattern. Rather than activating the entire model for every query, the system routes each request to whichever sub-model is most relevant and leaves the rest dormant. Far fewer parameters are therefore active per query, which significantly reduces computation and energy use compared with running the full model each time. The article describes Google’s Gemini and OpenAI’s ChatGPT as increasingly using this approach.
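The routing idea can be illustrated with a minimal Python sketch. It assumes the relevance scores are already given, whereas a real system computes them with a learned router. The experts here are stand-in functions, and `route_top_k` and `mixture_forward` are hypothetical names, not the API of any production model.

```python
from typing import Callable

Expert = Callable[[float], float]


def route_top_k(scores: list[float], k: int) -> list[int]:
    """Return the indices of the k highest-scoring experts, in index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def mixture_forward(
    x: float, experts: list[Expert], scores: list[float], k: int = 1
) -> tuple[float, list[int]]:
    """Run only the selected experts and blend their outputs by score."""
    chosen = route_top_k(scores, k)
    total = sum(scores[i] for i in chosen)
    # Experts that were not chosen are never called, which is the energy saving.
    output = sum(scores[i] / total * experts[i](x) for i in chosen)
    return output, chosen


# Four stand-in experts and assumed relevance scores for one query.
experts: list[Expert] = [lambda x: x + 1, lambda x: x * 2, lambda x: x**2, lambda x: -x]
scores = [0.1, 0.6, 0.25, 0.05]

output, chosen = mixture_forward(3.0, experts, scores, k=1)
print(chosen, output)  # [1] 6.0
```

With `k=1` only one of the four experts runs for this query. Raising `k` trades some of the saving for a blend of several experts.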

This article is rated Intermediate. Katarina Zimmer writes for a scientifically literate but non-specialist audience, explaining technical concepts (transformers, GPUs, neuromorphic computing) clearly without requiring prior knowledge. However, the article introduces a large number of distinct technologies across multiple sections, deploys precise quantitative comparisons throughout, and requires readers to track and distinguish between software-level, hardware-level, and infrastructure-level solutions. Students preparing for CAT or GMAT will find it excellent practice for the kind of technology-and-environment passages that regularly appear in those exams.

Katarina Zimmer is a Berlin-based science and environment journalist whose work appears in National Geographic, Scientific American, BBC Future, and Knowable Magazine. She specialises in the energy transition and planetary health. Knowable Magazine is published by Annual Reviews, a non-profit scientific publisher, and is dedicated to making peer-reviewed research accessible to general audiences. It commissions long-form science journalism grounded in original academic sources — making it one of the most reliable outlets for technically rigorous popular science writing.
