Ultra-high-definition (UHD) video standards demand processing speeds of 60 to 120 fps, and sustaining such rates requires substantial hardware resources. In this paper, area-efficient and high-speed two-dimensional (2D) $$2\times 2$$ and $$3\times 3$$ block parallel scalable recursive convolution (BPSRC) architectures are proposed. The $$2\times 2$$ and $$3\times 3$$ BPSRC architectures implement block parallel filters with small to large kernel sizes, limited to multiples of 2 and 3, respectively. In block parallel convolution, the spatial window is partitioned into fixed-size blocks so that the block outputs can be processed in parallel. The algorithm proves effective with respect to both area and computation time. Increasing the kernel size does not affect the processing time but does increase the hardware cost; however, this increase is considerably smaller than with conventional block parallel convolution (BPC). Overall, multiplier complexity is reduced by a factor of 4/9 and 9/16 for $$3\times 3$$ and $$2\times 2$$ BPSRC implementations of 2D finite impulse response (FIR) filters, respectively, over conventional BPC. A throughput of 1.55 giga-operations per second is achieved with $$2\times 2$$ BPSRC, and 1.86 giga-operations per second with $$3\times 3$$ BPSRC, on a Virtex 7 XC7VX485T FPGA.
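The block-parallel idea described above can be illustrated with a minimal software sketch. It models only conventional block-parallel processing, in which each step emits an $$L\times L$$ block of outputs rather than one pixel; it is not the BPSRC architecture itself and does not reproduce its multiplier savings. The function names, the valid-region boundary handling, and the correlation-style indexing are assumptions made for illustration.

```python
def direct_fir2d(image, kernel):
    """Reference 2D FIR filter (valid region, correlation form), one output per step."""
    H, W = len(image), len(image[0])
    K = len(kernel)
    out_h, out_w = H - K + 1, W - K + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(K) for j in range(K))
             for c in range(out_w)]
            for r in range(out_h)]


def block_parallel_fir2d(image, kernel, L=2):
    """Same filter, organised so that each step yields an L x L block of outputs.

    Hypothetical model: the outputs inside a block do not depend on each
    other, so hardware could compute all L*L of them concurrently. Here
    they are evaluated sequentially, purely to show the partitioning.
    """
    H, W = len(image), len(image[0])
    K = len(kernel)
    out_h, out_w = H - K + 1, W - K + 1
    out = [[0] * out_w for _ in range(out_h)]
    for br in range(0, out_h, L):          # block row origin
        for bc in range(0, out_w, L):      # block column origin
            # One "step": all outputs of this block (clipped at the borders).
            for dr in range(min(L, out_h - br)):
                for dc in range(min(L, out_w - bc)):
                    r, c = br + dr, bc + dc
                    out[r][c] = sum(kernel[i][j] * image[r + i][c + j]
                                    for i in range(K) for j in range(K))
    return out
```

Calling `block_parallel_fir2d(image, kernel, L=2)` or `L=3` on the same inputs as `direct_fir2d` should give identical results; only the order in which outputs are produced changes. In this naive form each block still costs $$L^2 K^2$$ multiplications for a $$K\times K$$ kernel, which is the conventional BPC cost that the abstract's reduction factors are stated against.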
               