PIMS: A Lightweight Processing-in-Memory Accelerator for Stencil Computations

Abstract-Stencil computation is a classic computational kernel present in many high-performance scientific applications, such as image processing and partial differential equation (PDE) solvers. A stencil computation sweeps over a multi-dimensional grid and repeatedly updates the value at each point using the values of its neighboring points. Stencil computations often employ large datasets that exceed cache capacity, leading to
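
The sweep-and-update pattern described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: a 2D grid, a 5-point averaging (Jacobi-style) stencil, and fixed boundary values, none of which are specified in the abstract. The function names `jacobi_sweep` and `run_stencil` are hypothetical and do not come from the paper.

```python
def jacobi_sweep(grid):
    """Return a new grid after one 5-point stencil sweep.

    Assumption: a 2D list-of-lists grid whose boundary values stay fixed.
    """
    rows, cols = len(grid), len(grid[0])
    # Copy the grid so every update reads only values from the previous sweep.
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each interior point becomes the average of its four neighbors.
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def run_stencil(grid, steps):
    """Apply the sweep repeatedly, as in an iterative solver."""
    for _ in range(steps):
        grid = jacobi_sweep(grid)
    return grid
```

Each update reads several neighboring points but performs very little arithmetic on them, so once the grid no longer fits in cache, most of the time in each sweep goes to moving data rather than computing.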

Memory Power Tracing on Hybrid Memory Cube

Abstract—Modern microprocessor architectures rely primarily on multi-level data caches to reduce memory access latency and power consumption and to increase the bandwidth perceived by an application. This mechanism works well for workloads with significant memory reuse or linear memory access patterns. However, irregular or nondeterministic memory access patterns will