Investigation of DVFS Based Dynamic Reliability Management for Chip Multiprocessors
Institute of Electrical and Electronics Engineers (IEEE)
2015 International Conference on High Performance Computing & Simulation (HPCS)
We investigate dynamic voltage and frequency scaling (DVFS) as a mechanism for dynamic reliability management (DRM) of chip multiprocessors (CMPs). The proposed DRM scheme operates as a control technique whose objective is to drive the operation of the CMP so that its reliability moves towards a desired target. The chip multiprocessor is continuously monitored and its reliability is estimated in real time, while the voltage and frequency of the individual cores are dynamically adjusted so that reliability converges towards the target. When the temperature of the cores increases, and reliability therefore degrades, the proposed DRM scheme selectively throttles the frequency of the cores with the highest temperature. This, in turn, lowers the power dissipation in those cores, so their temperature decreases and reliability improves. We leverage existing simulation and estimation tools to develop the proposed DRM scheme. Simulation results show that the proposed DRM scheme provides an effective way to trade off reliability and performance.
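The control loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frequency levels, the `Core` type, the `drm_step` function, the `margin` parameter, and the rule for raising frequency again when reliability exceeds the target are all assumptions made for illustration. The reliability estimate is taken as an input, since the abstract does not specify how it is computed.

```python
from dataclasses import dataclass

# Hypothetical discrete frequency levels in GHz (assumed, not from the paper).
FREQ_LEVELS = [1.0, 1.5, 2.0, 2.5, 3.0]


@dataclass
class Core:
    temperature: float  # degrees Celsius, from an assumed sensor or thermal model
    level: int          # index into FREQ_LEVELS


def drm_step(cores, reliability, target, margin=0.01):
    """Run one control step; return the index of the adjusted core, or None.

    reliability: current estimate (assumed to be supplied by an external model).
    target:      desired reliability.
    margin:      assumed dead band that avoids oscillating around the target.
    """
    if reliability < target:
        # Reliability is below target: throttle the hottest core
        # that still has a lower frequency level available.
        candidates = [i for i, c in enumerate(cores) if c.level > 0]
        if not candidates:
            return None
        hottest = max(candidates, key=lambda i: cores[i].temperature)
        cores[hottest].level -= 1
        return hottest
    if reliability > target + margin:
        # Assumed policy: recover performance by speeding up the coolest
        # core that still has a higher frequency level available.
        candidates = [i for i, c in enumerate(cores)
                      if c.level < len(FREQ_LEVELS) - 1]
        if not candidates:
            return None
        coolest = min(candidates, key=lambda i: cores[i].temperature)
        cores[coolest].level += 1
        return coolest
    return None  # Within the dead band: leave all cores unchanged.
```

In a full system this step would be called periodically, with the reliability estimate refreshed from monitored temperatures between calls. Adjusting one core by one level per step is a deliberately conservative choice for the sketch.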