
FCENet: An Instance Segmentation Model for Extracting Figures and Captions From Material Documents


A core idea of the Material Genome Project is to apply data and artificial intelligence to accelerate material innovation, yet a lack of data hinders the development of novel materials. The figures and captions in the materials literature carry essential information about each document and offer ample image samples for research, so extracting figures and captions from the literature is key to addressing this data shortage. Although some PDF parsing tools can extract information from documents, they generally identify a document's figures by parsing the document into a fixed structure; because journal layouts are inconsistent, they often produce incorrect recognition results. This study therefore proposes an efficient figure and caption extraction network, FCENet. Unlike other extraction tools, it first applies an instance segmentation model to detect figures and their captions, and then extracts them. FCENet builds upon BlendMask and introduces a horizontal and vertical attention module. The BlendMask detection head is split into two branches, figure detection and caption detection, which improves both the final detection accuracy and speed. Nearly 3000 material documents were collected for model training and testing. The experiments show that FCENet significantly outperforms other existing instance segmentation models: its box and mask mAP (mean Average Precision) are 8.51% and 12.59% higher than those of BlendMask, respectively. It is hoped that substantial material image data can be acquired via FCENet, supplying ample image data for machine learning and data mining in the materials field.
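The abstract does not give the exact design of the horizontal and vertical attention module, but the general idea of attending along one spatial axis at a time can be sketched as follows. This is a minimal, illustrative NumPy version with no learned query/key/value projections; the function name `axial_attention` and all details are assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(feat):
    """Self-attention along the horizontal axis, then the vertical axis,
    of an (H, W, C) feature map. Features serve directly as queries,
    keys, and values (no learned projections, for brevity)."""
    h, w, c = feat.shape
    scale = np.sqrt(c)
    # Horizontal pass: each row position attends over its W columns.
    scores = np.einsum('hwc,hvc->hwv', feat, feat) / scale   # (H, W, W)
    feat = np.einsum('hwv,hvc->hwc', softmax(scores), feat)
    # Vertical pass: each column position attends over its H rows.
    scores = np.einsum('hwc,gwc->hwg', feat, feat) / scale   # (H, W, H)
    feat = np.einsum('hwg,gwc->hwc', softmax(scores), feat)
    return feat
```

Factoring full 2-D attention into two 1-D passes keeps the cost at O(HW(H+W)) rather than O((HW)^2), which is one common motivation for row/column attention on dense feature maps.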

Keywords: fcenet; instance segmentation; figures captions; captions material

Journal Title: IEEE Access
Year Published: 2021
