Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

Amir Yazdanbakhsh

Sheng-Chun Kao

Shivani Agrawal

Suvinay Subramanian

Tushar Krishna

Utku Evci

(2022) (to appear)

Google Scholar

Abstract

Sparsity has become one of the promising methods to compress and accelerate Deep Neural Networks (DNNs). Among different categories of sparsity, structured sparsity has gained more attentions due to its efficient execution on modern accelerators. Particularly, N:M sparsity is attractive because there are already hardware accelerator architectures that can leverage few forms of N:M structured sparsity in the model to yield higher compute-efficiency. While there is a large body of work proposing various recipes for N:M structured sparsity training, compute-efficient training recipes for structured sparsity is rather a less explored territory. In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for N:M sparsity in terms of the trade-off between model accuracy and compute training cost (FLOPs). Building upon this study, we propose two new decay-based pruning methods, namely “pruning mask decay” and “sparse structure decay”. Our evaluations indicate that these proposed methods consistently deliver SOTA model accuracy, comparable to unstructured sparsity, on a transformer-based model for translate task. The increase in the accuracy of the sparse model using the new training recipes comes at the cost of marginal increase in the total training compute (FLOPs).

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations  & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

Abstract

Research Areas

Meet the teams driving innovation

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

Abstract

Research Areas

Meet the teams driving innovation

AI/ML Foundations  & Capabilities