Bayesian learning for neural network compression

Jen-Tzung Chien, Su-Ting Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Quantization of weight parameters during neural network training plays a key role in model compression for mobile devices. This paper presents a general M-ary adaptive quantization for the construction of Bayesian neural networks, in which the trade-off between model capacity and memory cost is adjustable and the stochastic weight parameters are faithfully reflected. A compact model is trained to achieve robustness to the model uncertainty caused by heterogeneous data collection. To minimize the performance loss, the representation levels of the quantized neural network are estimated by maximizing the variational lower bound of the log likelihood conditioned on M-ary quantization. Bayesian learning is formulated by using a multi-spike-and-slab prior over the quantization levels. An adaptive quantization is derived to implement a flexible parameter space for representation learning, which is applied to object recognition. Experiments on image recognition show the merit of this Bayesian model compression for M-ary quantized neural networks.
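Below is a minimal, hypothetical sketch of the M-ary quantization idea described in the abstract: each weight is mapped to the nearest of M learned representation levels, and a multi-spike-and-slab style Gaussian mixture prior scores the quantized weights. The function names (quantize_mary, mixture_log_prior), the distortion-minimizing level update (used here as a simple stand-in for maximizing the variational lower bound), and all parameter values are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of M-ary weight quantization with learned levels.
# Not the authors' code; a toy illustration under stated assumptions.
import numpy as np

def quantize_mary(w, levels):
    """Map each weight to the nearest of the M representation levels."""
    idx = np.argmin(np.abs(w[:, None] - levels[None, :]), axis=1)
    return levels[idx]

def mixture_log_prior(w, levels, sigma=0.05):
    """Multi-spike-and-slab style prior: an equal-weight Gaussian mixture
    with one narrow 'spike' centred on each quantization level."""
    d = w[:, None] - levels[None, :]
    comp = np.exp(-0.5 * (d / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return np.log(comp.mean(axis=1) + 1e-12).sum()

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.3, size=1000)                # full-precision weights
M = 4                                              # M-ary quantization
levels = np.quantile(w, np.linspace(0.1, 0.9, M))  # initial level estimates

# Illustrative coordinate update: each level moves to the mean of the
# weights assigned to it, reducing quantization distortion (a simple
# surrogate for the paper's variational objective).
for _ in range(10):
    assign = np.argmin(np.abs(w[:, None] - levels[None, :]), axis=1)
    for m in range(M):
        if np.any(assign == m):
            levels[m] = w[assign == m].mean()

w_q = quantize_mary(w, levels)
print("levels:", np.round(levels, 3))
print("distortion:", np.mean((w - w_q) ** 2))
print("log prior of quantized weights:", mixture_log_prior(w_q, levels))
```

With M adjustable, the sketch exposes the capacity/memory trade-off the abstract refers to: larger M lowers distortion but requires more bits per weight (log2(M)).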

Original language: English
Title of host publication: 2020 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728114859
DOIs
State: Published - Jul 2020
Event: 2020 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2020 - London, United Kingdom
Duration: 6 Jul 2020 → 10 Jul 2020

Publication series

Name: 2020 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2020

Conference

Conference: 2020 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2020
Country: United Kingdom
City: London
Period: 6/07/20 → 10/07/20

Keywords

  • Adaptive quantization
  • Bayesian neural network
  • Model compression
  • Quantized neural network
