Research on Topic Recognition of Network Sensitive Information Based on SW-LDA Model

Mining sensitive information from online networks is important for understanding social stability on the Internet. Tracking public opinion around sensitive information helps reveal Internet users' attitudes toward important social events. Related artificial intelligence techniques can extract topics from network texts. However, current topic recognition models have a low recognition rate for sensitive information and often generate inaccurate topic keywords. This paper proposes a topic recognition method for network sensitive information based on a sensitive-word-weighted latent Dirichlet allocation (LDA) model (SW-LDA). First, a basic sensitive word vocabulary is constructed by manual collection, and word embeddings are obtained by training Word2vec on a large network corpus. The semantic similarity between word embeddings is then calculated to extend the basic sensitive word vocabulary. Second, the extended sensitive word vocabulary is embedded in the LDA model. On the one hand, this improves LDA's semantic understanding and recognition of sensitive topic words and raises the quality of the generated topic words. On the other hand, it improves the relevance between topic keywords and their topics and uncovers more fine-grained keywords. The experimental results show that the sensitive word weighted-LDA model effectively improves both the quantity and the quality of the sensitive-information topics it recognizes. The work also contributes to the development of artificial intelligence, and the generated corpus is useful for research on text classification, clustering, and information retrieval.
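
The two steps described in the abstract (extending a seed vocabulary by embedding similarity, then giving sensitive words extra weight before topic modeling) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy three-dimensional vectors stand in for trained Word2vec embeddings, and the function names, the similarity `threshold`, and the count-multiplying `weight` scheme are all assumptions made for illustration, since the abstract does not specify how the weighting is applied inside LDA.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def extend_vocabulary(seed_words, embeddings, threshold=0.8):
    """Add every word whose embedding is close to some seed word.

    `threshold` is a hypothetical cutoff, not a value from the paper.
    """
    extended = set(seed_words)
    for seed in seed_words:
        if seed not in embeddings:
            continue
        for word, vec in embeddings.items():
            if word not in extended and cosine(embeddings[seed], vec) >= threshold:
                extended.add(word)
    return extended


def weighted_counts(tokens, sensitive_words, weight=2.0):
    """Term counts with sensitive words scaled by `weight`.

    Assumed weighting scheme: the scaled counts could be fed to an LDA
    implementation in place of raw counts.
    """
    counts = Counter(tokens)
    return {
        word: count * (weight if word in sensitive_words else 1.0)
        for word, count in counts.items()
    }


# Toy embeddings standing in for vectors learned by Word2vec.
toy_embeddings = {
    "riot": [0.9, 0.1, 0.0],
    "unrest": [0.85, 0.2, 0.05],
    "protest": [0.8, 0.3, 0.1],
    "weather": [0.0, 0.1, 0.95],
}

extended = extend_vocabulary(["riot"], toy_embeddings)
doc_counts = weighted_counts(["riot", "weather", "unrest", "riot"], extended)
```

In this toy setup, "unrest" and "protest" join the vocabulary because their vectors point in nearly the same direction as "riot", while "weather" stays out. With real data the embeddings would come from a trained Word2vec model and the weighted counts would go to an LDA implementation.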

Keywords: topic; model; sensitive information; recognition; network

Journal Title: IEEE Access
Year Published: 2019
