In this paper, we introduce a hardware architecture that accelerates connected component labeling (CCL) for embedded systems. The proposed architecture scans the input binary image only once; during the raster scan, the image is compressed with run-length encoding to extract runs. Equivalences between runs are resolved efficiently by merging equivalent label lists, and all intermediate data generated during labeling are stored in on-chip memory to avoid frequent off-chip memory accesses. Finished connected components are detected and output immediately, so the on-chip memory they occupied is freed early and can be reused, which reduces the total memory requirement. The architecture is implemented in Verilog, and a quantitative comparison of memory cost shows that it requires significantly fewer memory resources than other methods. In addition, it can process more than 25 benchmark images of $2048\times 1536$ pixels per second when operating at 300 MHz.
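
The single-pass, run-based idea described in the abstract can be illustrated with a small software sketch. This is not the Verilog architecture itself: it is a hypothetical Python model in which a union-find structure stands in for the merging of equivalent label lists, 8-connectivity is assumed (the abstract does not state the connectivity), and the function names `extract_runs`, `find`, and `count_components` are illustrative only. The early output of finished components and the reuse of freed memory are not modeled.

```python
def extract_runs(row):
    """Run-length encode one row: return (start, end) column pairs,
    end inclusive, for each maximal run of foreground (nonzero) pixels."""
    runs = []
    start = None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                      # a run begins
        elif not pixel and start is not None:
            runs.append((start, x - 1))    # a run ends before this pixel
            start = None
    if start is not None:
        runs.append((start, len(row) - 1)) # run reaches the row's end
    return runs


def find(parent, label):
    """Return the representative label, compressing the path on the way."""
    while parent[label] != label:
        parent[label] = parent[parent[label]]
        label = parent[label]
    return label


def count_components(image):
    """Count connected components in one raster scan over the rows.

    Assumption: 8-connectivity, so a run touches a run in the previous
    row if their column ranges overlap or are diagonally adjacent.
    """
    parent = []   # union-find forest over provisional labels
    prev = []     # (start, end, label) runs of the previous row
    for row in image:
        current = []
        for start, end in extract_runs(row):
            label = None
            for p_start, p_end, p_label in prev:
                if p_start <= end + 1 and p_end >= start - 1:
                    root = find(parent, p_label)
                    if label is None:
                        label = root          # inherit the first neighbor's label
                    elif root != label:
                        parent[root] = label  # merge two equivalent labels
            if label is None:
                label = len(parent)           # no neighbor: new provisional label
                parent.append(label)
            current.append((start, end, label))
        prev = current                        # only one row of runs is kept
    return sum(1 for i, p in enumerate(parent) if i == p)


image = [
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 0],
]
print(count_components(image))
```

In this sketch only the runs of the previous row are retained between rows, which loosely mirrors the motivation for keeping intermediate data small enough for on-chip storage. The label table `parent`, however, grows with the number of provisional labels, which is the kind of cost the early release of finished components is meant to address.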
               