Parameter quantization to lower bit-widths is a common approach to reducing the computation load of CNN inference. With the parameters replaced by fixed-width binary codes, multiplication operations can be replaced by look-up tables (LUTs), where the multiplier and multiplicand operands serve as the table index and the pre-calculated products serve as the table entries. Because the histogram profiles of the parameters differ significantly across CNN layers and channels, previous LUT-based computation methods have to use a different LUT for each layer/channel, and consequently demand larger memory space along with extra access time and power consumption. In this work, we first normalize the Gaussian profiles of the parameters in different layers/channels so that they share similar means and variances, and then quantize the normalized parameters to a fixed bit-width through non-linear quantization. Because the parameter profiles are normalized, a single compact LUT (16×16 entries) can replace all multiplication operations in the whole network. Furthermore, the normalization procedure also reduces the errors induced by quantization. Experiments demonstrate that with a compact 256-entry LUT, we achieve accuracy comparable to 32-bit floating-point computation, while significantly reducing the computation load, memory footprint, power consumption, and hardware resources.
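
The following is a minimal sketch of the scheme described above. It assumes a Gaussian-quantile codebook for the non-linear quantizer, a toy treatment of activations, and an affine fold-back of each layer's mean and standard deviation after the table lookup; these are illustrative assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed 16-level non-linear codebook: quantiles of a standard Gaussian, so the
# code levels are dense where normalized parameters are most likely to fall.
N_LEVELS = 16
probs = (np.arange(N_LEVELS) + 0.5) / N_LEVELS
CODEBOOK = np.quantile(rng.standard_normal(1_000_000), probs)

# One shared 16x16 = 256-entry table of pre-computed products of code levels.
LUT = np.outer(CODEBOOK, CODEBOOK)

def normalize_and_quantize(x):
    """Shift/scale x to zero mean and unit variance, then map each value to the
    nearest codebook level. Returns 4-bit codes plus (mu, sigma) for fold-back."""
    mu, sigma = x.mean(), x.std() + 1e-12
    normed = (x - mu) / sigma
    codes = np.abs(normed[:, None] - CODEBOOK[None, :]).argmin(axis=1)
    return codes.astype(np.uint8), mu, sigma

def lut_multiply(w_codes, mu_w, s_w, a_codes, mu_a, s_a):
    """Approximate elementwise products w * a using only table lookups and the
    per-layer/channel affine constants (hypothetical fold-back step)."""
    wn = CODEBOOK[w_codes]           # 16-entry lookup
    an = CODEBOOK[a_codes]           # 16-entry lookup
    prod = LUT[w_codes, a_codes]     # 256-entry lookup replaces the multiply
    return s_w * s_a * prod + s_w * mu_a * wn + mu_w * s_a * an + mu_w * mu_a

# Toy check: two layers with very different Gaussian profiles share the same LUT.
for mu, sigma in [(0.0, 0.02), (0.5, 0.30)]:
    w = rng.normal(mu, sigma, size=4096)
    a = rng.normal(0.0, 1.0, size=4096)
    wc, mu_w, s_w = normalize_and_quantize(w)
    ac, mu_a, s_a = normalize_and_quantize(a)
    approx = lut_multiply(wc, mu_w, s_w, ac, mu_a, s_a)
    print("relative error:", np.abs(approx - w * a).mean() / np.abs(w * a).mean())
```

Because every layer/channel is normalized to roughly the same standard-Gaussian profile, the 16-level codebook and the 16×16 product table are shared network-wide; only two scalars (mean and standard deviation) remain per layer/channel instead of a separate LUT for each.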