Energy-Efficient AI Infrastructure

Benchmarking and tuning the power consumption of LLM inference and GPU clusters.

This work aims to understand and reduce the energy footprint of AI workloads on HPC systems. TokenPowerBench (AAAI’26) benchmarks the power consumption of LLM inference; related efforts model the energy impact of knowledge distillation on GPU clusters and tune CPU frequency and scale for energy-efficient execution.