Tokens Per Watt Is the Real Limit on AI Revenue

May 26, 2026

Most AI revenue will flow through tokens, and the two bottlenecks are tokens per watt (energy efficiency) and tokens per second (throughput).

Tokens per watt determines how much output you can generate from a fixed energy supply, and that supply is already constrained and getting tighter. Tokens per second sets the ceiling on how fast that revenue can flow. Kong's AI Gateway optimizes both at the connectivity layer: semantic caching and semantic routing increase token output without adding watts or latency.
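
As a rough illustration of the semantic caching idea, here is a minimal Python sketch. It is not Kong's implementation or API: `SemanticCache`, `embed`, `fake_model`, the 0.8 similarity threshold, and the 100-token answer size are all assumptions made up for this example, and the bag-of-words `embed` is a toy stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache that reuses answers for sufficiently similar prompts."""

    def __init__(self, generate, threshold=0.8):
        self.generate = generate    # callable: prompt -> (text, token_count)
        self.threshold = threshold  # assumed similarity cutoff for a cache hit
        self.entries = []           # list of (embedding, text, token_count)
        self.tokens_served = 0      # tokens delivered to callers
        self.tokens_generated = 0   # tokens the model actually had to produce

    def ask(self, prompt):
        query = embed(prompt)
        for vector, text, tokens in self.entries:
            if cosine(query, vector) >= self.threshold:
                # Cache hit: tokens are delivered without running the model.
                self.tokens_served += tokens
                return text
        # Cache miss: call the model and remember the answer.
        text, tokens = self.generate(prompt)
        self.entries.append((query, text, tokens))
        self.tokens_served += tokens
        self.tokens_generated += tokens
        return text


def fake_model(prompt):
    """Hypothetical model call; pretends every answer costs 100 tokens."""
    return f"answer to: {prompt}", 100


cache = SemanticCache(fake_model)
cache.ask("how do I reset my password")
cache.ask("how do I reset my password please")  # similar enough to reuse
cache.ask("what is the refund policy")          # unrelated, so generated
print(cache.tokens_served, cache.tokens_generated)
```

In this toy run, three requests deliver 300 tokens while the model generates only 200, because the second request is answered from the cache. Tokens generated is the part that draws power, so the same energy budget covers more delivered tokens.
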
#Shorts #AIGateway #LLM #AIInfrastructure