Energy-Efficient AI Infrastructure
Benchmarking and tuning the power consumption of LLM inference and GPU clusters.
Work on understanding and reducing the energy footprint of AI workloads on HPC systems. TokenPowerBench (AAAI’26) benchmarks the power consumption of LLM inference; related efforts model the impact of knowledge distillation on GPU clusters and tune CPU frequency and scale for energy-efficient execution.