Detecting and Identifying Applications by Job Signatures


As HPC systems enter the exaFLOP era, their scale and complexity have increased significantly over the past few years. Administrators need to understand not only how the hardware is performing, but also which applications typically use the system. In addition, resource contention and energy consumption grow with computing capability, so HPC administrators and researchers need to understand the characteristics of running applications in order to design better resource-aware scheduling policies and improve system efficiency. Moreover, unauthorized applications, such as Bitcoin mining programs, could take advantage of the high computing capability and consume computing hours that are supposed to be used for scientific discovery. Knowing which applications are running therefore helps administrators block such programs proactively. To address these challenges, it is necessary to detect applications and develop management strategies based on knowledge of them. However, this is a nontrivial task when users do not specify the name of the application in their job submission scripts. In this research, we propose approaches to detect and identify applications through job signatures built from monitoring metrics. Specifically, we use monitoring metrics collected by LDMS (Lightweight Distributed Metric Service) to build job signatures in two ways: extracting statistical features from the time-series data, and representing the multi-dimensional time-series data as images. We then explore several classification algorithms and evaluate their performance in classifying job signatures.
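
The two ways of building a job signature mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this research: the function names, the choice of statistics (mean, standard deviation, minimum, maximum, median), the image width, and the synthetic data are all assumptions made for illustration, and no real LDMS metrics or LDMS API calls are involved.

```python
import numpy as np


def statistical_signature(series: np.ndarray) -> np.ndarray:
    """Build a flat feature vector from a (n_metrics, n_timesteps) array.

    The five statistics here are an illustrative choice, not the
    feature set used in the original work.
    """
    features = [
        series.mean(axis=1),
        series.std(axis=1),
        series.min(axis=1),
        series.max(axis=1),
        np.percentile(series, 50, axis=1),  # median of each metric
    ]
    return np.concatenate(features)


def image_signature(series: np.ndarray, width: int = 32) -> np.ndarray:
    """Represent a (n_metrics, n_timesteps) array as a grayscale image.

    Each metric becomes one image row: it is linearly resampled to
    `width` points, then min-max scaled to the range 0..255.
    """
    n_metrics, n_steps = series.shape
    old_x = np.linspace(0.0, 1.0, n_steps)
    new_x = np.linspace(0.0, 1.0, width)
    resampled = np.vstack([np.interp(new_x, old_x, row) for row in series])

    lo = resampled.min(axis=1, keepdims=True)
    hi = resampled.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on flat rows
    return np.rint(255 * (resampled - lo) / span).astype(np.uint8)


# Synthetic stand-in for one job: 4 metrics sampled at 100 time steps.
rng = np.random.default_rng(0)
job = rng.random((4, 100))

stat_sig = statistical_signature(job)  # 5 statistics x 4 metrics
img_sig = image_signature(job)         # one image row per metric
```

Under these assumptions, the statistical signature is a fixed-length vector that could be passed to a conventional classifier, while the image signature has a fixed shape regardless of job duration, which is what would make it usable as input to an image classifier.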
