
Topic Detection and Tracking Based on Windowed DBSCAN and Parallel KNN


Topic detection and tracking (TDT) is widely used to identify hot topics in the large volume of Internet news and to follow them as they develop. However, traditional topic detection and tracking methods suffer from low accuracy and low efficiency. In this paper, a big-data-driven topic detection system is built on the Spark platform, with the aim of making news collection from the Internet more efficient and improving both the accuracy and the efficiency of topic detection and tracking. The system can be readily deployed in a distributed architecture, where it works as a parallelized news collection and topic detection system. For topic detection, an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm based on a time window is proposed; it achieves accurate detection and has the additional advantage of reducing time complexity. For topic tracking, a parallel KNN-based algorithm is proposed. Experiments on a pseudo-distributed Spark platform, including comparisons with baseline algorithms and quantitative and qualitative analyses, demonstrate the effectiveness and efficiency of the parallelized topic detection system.
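
The abstract does not spell out the algorithms, so the following is only a minimal single-process sketch of the two ideas it names, not the paper's method. It assumes that documents are already vectorized, that "time window" means a document's neighborhood is restricted to documents published within `window` time units of it, and that tracking assigns a new document to the majority topic among its `k` nearest labeled documents. The names `windowed_dbscan`, `knn_track`, `eps`, `min_pts`, and `window` are hypothetical, cosine distance is an assumed similarity measure, and no Spark parallelism is shown.

```python
from bisect import bisect_left, bisect_right
from collections import Counter
from math import sqrt


def cosine_distance(a, b):
    """Return 1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def windowed_dbscan(docs, eps, min_pts, window):
    """Cluster docs = [(timestamp, vector), ...]; returns one label per doc, -1 = noise.

    Assumption: only documents within `window` of each other in time can be
    neighbors, so each neighborhood query scans a time slice, not all documents.
    """
    order = sorted(range(len(docs)), key=lambda i: docs[i][0])
    times = [docs[i][0] for i in order]
    vecs = [docs[i][1] for i in order]

    def neighbors(p):
        # Binary search bounds the candidate set to the time window around p.
        lo = bisect_left(times, times[p] - window)
        hi = bisect_right(times, times[p] + window)
        return [q for q in range(lo, hi)
                if cosine_distance(vecs[p], vecs[q]) <= eps]

    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(docs)
    cluster = 0
    for p in range(len(docs)):
        if labels[p] is not UNVISITED:
            continue
        nbrs = neighbors(p)
        if len(nbrs) < min_pts:
            labels[p] = NOISE  # may later be claimed as a border point
            continue
        labels[p] = cluster
        queue = [q for q in nbrs if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == NOISE:
                labels[q] = cluster  # border point: joins, but does not expand
            if labels[q] is not UNVISITED:
                continue
            labels[q] = cluster
            q_nbrs = neighbors(q)
            if len(q_nbrs) >= min_pts:  # q is a core point, keep expanding
                queue.extend(q_nbrs)
        cluster += 1

    # Map labels from time-sorted positions back to the caller's order.
    result = [NOISE] * len(docs)
    for pos, i in enumerate(order):
        result[i] = labels[pos]
    return result


def knn_track(vector, labeled, k):
    """Assign `vector` to the majority topic among its k nearest labeled docs.

    labeled = [(vector, topic_id), ...]. Sequential here; a parallel version
    would distribute the distance computations.
    """
    nearest = sorted(labeled, key=lambda item: cosine_distance(vector, item[0]))[:k]
    return Counter(topic for _, topic in nearest).most_common(1)[0][0]
```

Under these assumptions, two bursts of near-identical stories published far apart in time end up in separate clusters, because the window keeps them from ever being compared. The same restriction is what limits the number of distance computations per neighborhood query.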

Keywords: detection tracking; topic; news; parallel knn; topic detection

Journal Title: IEEE Access
Year Published: 2021


