D3NER: biomedical named entity recognition using CRF‐biLSTM improved with fine‐tuned embeddings of various linguistic information

Motivation: Recognizing biomedical named entities in text is a highly challenging research topic of great interest. It is a prerequisite for extracting the large amount of high‐value biomedical knowledge held in unstructured text and transforming it into well‐structured formats. Long short‐term memory (LSTM) networks have recently been employed in various biomedical named entity recognition (NER) models with great success. However, these models often do not take advantage of all useful linguistic information, and many aspects can still be improved for better performance.

Results: We propose D3NER, a novel biomedical NER model using conditional random fields (CRF) and bidirectional long short‐term memory (biLSTM), improved with fine‐tuned embeddings of various linguistic information. D3NER is thoroughly compared with seven very recent state‐of‐the‐art NER models, two of which are joint models with named entity normalization (NEN), which has been shown to improve NER performance. Experimental results on benchmark datasets, i.e. the BioCreative V Chemical Disease Relation (BC5 CDR) corpus, the NCBI Disease corpus and the FSU‐PRGE gene/protein corpus, demonstrate that in almost all cases D3NER is stable and outperforms all compared models for chemical and gene/protein NER, and all compared models without joint NEN (like D3NER itself) for disease NER. On the BC5 CDR corpus, D3NER achieves [F1 values not rendered in this copy] for chemical and disease NER, respectively; on the NCBI Disease corpus, its F1 for disease NER is 84.41%. Its F1 for gene/protein NER on FSU‐PRGE is 87.62%.

Availability and implementation: Data and source code are available at: https://github.com/aidantee/D3NER.

Supplementary information: Supplementary data are available at Bioinformatics online.
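The F1 figures quoted in the abstract are the usual NER summary metric. As a rough illustration only, the sketch below computes an exact-match, entity-level F1 from BIO-tagged sequences. The abstract does not state the evaluation protocol, so the exact-match criterion, the BIO scheme, the tag names and the helper names `bio_to_spans` and `entity_f1` are all assumptions made for this example, not the authors' evaluation code.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, entity_type); hypothetical representation for this sketch
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one BIO tag sequence into a set of entity spans.

    Assumption: an I- tag that does not continue an entity of the same
    type is treated leniently as the start of a new entity.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((start, i, etype))
        if tag.startswith("B-") or tag.startswith("I-"):
            start, etype = i, tag[2:]
        else:
            start, etype = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match entity-level F1 over parallel lists of tag sequences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans from
        # different sentences never collide.
        gold_spans |= {(idx, *s) for s in bio_to_spans(g)}
        pred_spans |= {(idx, *s) for s in bio_to_spans(p)}
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up tags: the prediction finds the chemical
# mention but misses the disease mention.
gold = [["B-Chemical", "I-Chemical", "O", "B-Disease"]]
pred = [["B-Chemical", "I-Chemical", "O", "O"]]
score = entity_f1(gold, pred)
```

In this toy case precision is 1.0 and recall is 0.5, so the harmonic mean penalizes the missed entity more than a plain average would.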

Keywords: information; biomedical named; recognition; d3ner; named entity

Journal Title: Bioinformatics
Year Published: 2018
