"Synthesizing data for text recognition with style transfer"

Photo by campaign_creators from unsplash

Most of the existing datasets for scene text recognition merely consist of a few thousand training samples with a very limited vocabulary, which cannot meet the requirement of the state-of-the-art deep learning based text recognition methods. Meanwhile, although the synthetic datasets (e.g., SynthText90k) usually contain millions of samples, they cannot fit the data distribution of the small target datasets in natural scenes completely. To address these problems, we propose a word data generating method called SynthText-Transfer, which is capable of emulating the distribution of the target dataset. SynthText-Transfer uses a style transfer method to generate samples with arbitray text content, which preserve the texture of the reference sample in the target dataset. The generated images are not only visibly similar with real images, but also capable of improving the accuracy of the state-of-the-art text recognition methods, especially for the English and Chinese dataset with a large alphabet (in which many characters only appear in few samples, making it hard to learn for sequence models). Moreover, the proposed method is fast and flexible, with a competitive speed among common style transfer methods.

Keywords: text recognition; synthesizing data; style transfer; transfer

Journal Title: Multimedia Tools and Applications
Year Published: 2018

Link to full text (if available)

Share on Social Media: Sign Up to like & get
recommendations!
1

LAUSR

You are not signed in:

Sign Up!

Related content

More Information News Social Media Video Recommended