This paper presents a textline detection method for degraded historical documents. Our method follows a conventional two-step procedure that the binarization is first performed and then the textlines are extracted… Click to show full abstract
This paper presents a textline detection method for degraded historical documents. Our method follows a conventional two-step procedure that the binarization is first performed and then the textlines are extracted from the binary image. In order to address the challenges in historical documents such as document degradation, structure noise, and skews, we develop new methods for the binarization and textline extraction. First, we improve the performance of binarization by detecting the non-text regions and processing only text regions. We also improve the textline detection method by extracting main textblock and compensating the skew angle and writing style. Experimental results show that the proposed method yields the state-of-the-art performance for several datasets.
               
Click one of the above tabs to view related content.