Monitoring Power Usage of Jobs Running on the Quanah Cluster


Advanced power measurement capabilities are becoming available on large-scale high-performance computing (HPC) deployments. Measuring power and energy usage, and how it varies during real workloads, will enable us to evaluate the potential benefits of incorporating power data into job scheduling and resource management decisions. Several approaches to power measurement exist today, primarily in-band and out-of-band measurement. In this talk, we will discuss several power profiling techniques on modern HPC platforms and give a demo of our current implementation for monitoring the power usage of jobs running on the Quanah cluster. While this is still a work in progress, we present the current state of our research to show what we are trying to learn, what we are analyzing and how we are analyzing it, and what else we need to accomplish to make further progress.

Download slides here