Title

Improving Performance and Energy Efficiency of Matrix Multiplication via Pipeline Broadcast

Document Type

Conference Proceeding

Language

eng

Format of Original

5 p.

Publication Date

9-2013

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Source Publication

42nd International Conference on Parallel Processing (ICPP) 2013

Source ISSN

0190-3918

Original Item ID

doi: 10.1109/CLUSTER.2013.6702672

Abstract

Boosting the performance and energy efficiency of scientific applications running on high performance computing systems has become crucial. Software- and hardware-based solutions for improving communication performance have been recognized as significant means of achieving performance gains, and thus energy savings, for such applications. Because distributed matrix multiplication is a fundamental component of most numerical linear algebra algorithms, improving its performance and energy efficiency is of major concern. To this end, we propose a high performance communication scheme that fully exploits network bandwidth via non-blocking pipeline broadcast with a tuned chunk size. Empirically, substantial performance gains of up to 8.4% and energy savings of up to 6.9% are achieved compared to blocking pipeline broadcast, and, against binomial tree broadcast, performance gains of up to 6.5% and energy savings of up to 6.1% are observed on a 64-core cluster.
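
The role of the tuned chunk size can be illustrated with a textbook latency-bandwidth cost model. The sketch below is not taken from the paper: the chain topology, the function names, and the parameters `alpha` (per-message latency in seconds) and `beta` (transfer time per byte) are assumptions chosen for illustration, and the model does not capture the extra overlap that non-blocking sends provide. It only shows why a chunk size that is neither too small nor too large lets a pipeline broadcast beat a binomial tree for large messages.

```python
import math


def pipeline_time(m, c, p, alpha, beta):
    """Modeled time of a chain pipeline broadcast (assumed model).

    The m-byte message is split into ceil(m / c) chunks that flow along a
    chain of p processes. The last process gets the first chunk after
    p - 1 steps and each remaining chunk one step later, so the total is
    (p - 2 + chunks) steps of cost alpha + c * beta each.
    """
    chunks = math.ceil(m / c)
    return (p - 2 + chunks) * (alpha + c * beta)


def binomial_time(m, p, alpha, beta):
    """Modeled time of a binomial tree broadcast: ceil(log2 p) rounds,
    each sending the whole m-byte message."""
    return math.ceil(math.log2(p)) * (alpha + m * beta)


def optimal_chunk(m, p, alpha, beta):
    """Chunk size minimizing the continuous form of pipeline_time
    (valid for p > 2): c* = sqrt(m * alpha / ((p - 2) * beta))."""
    return math.sqrt(m * alpha / ((p - 2) * beta))


if __name__ == "__main__":
    # Hypothetical parameters: 8 MiB message, 64 processes,
    # 50 microseconds latency, 1 ns per byte (about 1 GB/s).
    m, p, alpha, beta = 8 * 1024 * 1024, 64, 50e-6, 1e-9
    c = optimal_chunk(m, p, alpha, beta)
    print(f"tuned chunk size: {c:.0f} bytes")
    print(f"pipeline broadcast: {pipeline_time(m, c, p, alpha, beta):.4f} s")
    print(f"binomial broadcast: {binomial_time(m, p, alpha, beta):.4f} s")
```

Under this model, chunks that are too small pay the latency `alpha` too often, while chunks that are too large lengthen the time needed to fill the pipeline; the tuned value balances the two.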

Comments

Published as part of the proceedings of the 2013 IEEE International Conference on Cluster Computing (CLUSTER), 2013: 1-5.