LAUSR.org creates dashboard-style pages of related content for over 1.5 million academic articles. Sign Up to like articles & get recommendations!

Topic Modeling for Short Texts via Word Embedding and Document Correlation

Photo by maxchen2k from unsplash

Topic modeling is a widely studied foundational and interesting problem in the text mining domains. Conventional topic models based on word co-occurrences infer the hidden semantic structure from a corpus… Click to show full abstract

Topic modeling is a widely studied foundational and interesting problem in the text mining domains. Conventional topic models based on word co-occurrences infer the hidden semantic structure from a corpus of documents. However, due to the limited length of short text, data sparsity impedes the inference process of conventional topic models and causes unsatisfactory results on short texts. In fact, each short text usually contains a limited number of topics, and understanding semantic content of short text needs to the relevant background knowledge. Inspired by the observed information, we propose a regularized non-negative matrix factorization topic model for short texts, named TRNMF. The proposed model leverages pre-trained distributional vector representation of words to overcome the data sparsity problem of short texts. Meanwhile, the method employs the clustering mechanism under document-to-topic distributions during the topic inference by using Gibbs Sampling Dirichlet Multinomial Mixture model. TRNMF integrates successfully both word co-occurrence regularization and sentence similarity regularization into topic modeling for short texts. Through extensive experiments on constructed real-world short text corpus, experimental results show that TRNMF can achieve better results than the state-of-the-art methods in term of topic coherence measure and text classification task.

Keywords: short texts; text; topic modeling; topic; word

Journal Title: IEEE Access
Year Published: 2020

Link to full text (if available)


Share on Social Media:                               Sign Up to like & get
recommendations!

Related content

More Information              News              Social Media              Video              Recommended



                Click one of the above tabs to view related content.