A Tree-Based Indexing Approach for Diverse Textual Similarity Search

Textual information is ubiquitous in our lives and is becoming an important component of our cognitive society. In the age of big data, we constantly have to sift through large amounts of data to find even a small piece of information. Retrieving useful information quickly requires a textual similarity search built on an appropriate index structure. In this article, we study top-k textual similarity search and develop a tree-based indexing approach that can construct indices supporting various similarity functions. Our indexing approach clusters similar records into the same branch offline to improve the performance of online search. Based on the index tree, we present a top-k search algorithm with efficient pruning techniques. The experimental results demonstrate that our algorithm achieves higher performance and better scalability than the baseline method.
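
The abstract does not spell out the index layout or the pruning rule, so the following is only a minimal sketch of the general idea in Python, not the authors' algorithm. It assumes Jaccard similarity over token sets, a crude sort-based grouping as a stand-in for the offline clustering step, and a simple upper bound computed from the union of tokens stored at each node. All names (`Node`, `build_index`, `upper_bound`, `top_k`) are hypothetical.

```python
import heapq
import itertools
from dataclasses import dataclass, field


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two token sets (assumed similarity function)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


@dataclass
class Node:
    tokens: frozenset                              # union of all tokens in the subtree
    children: list = field(default_factory=list)   # inner nodes only
    records: list = field(default_factory=list)    # leaves only


def build_index(records, fanout=2):
    """Offline step: group neighbouring records into leaves, then stack levels."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not records:
        return Node(tokens=frozenset())
    # Assumption: sorting by token list is a crude proxy for clustering.
    ordered = sorted(records, key=sorted)
    level = [
        Node(tokens=frozenset().union(*ordered[i:i + fanout]),
             records=ordered[i:i + fanout])
        for i in range(0, len(ordered), fanout)
    ]
    while len(level) > 1:
        level = [
            Node(tokens=frozenset().union(*(c.tokens for c in level[i:i + fanout])),
                 children=level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
    return level[0]


def upper_bound(query: frozenset, node: Node) -> float:
    """No record r below node can exceed this: |q & r| <= |q & U| and |q | r| >= |q|."""
    if not query:
        return 1.0
    return len(query & node.tokens) / len(query)


def top_k(root: Node, query: frozenset, k: int):
    """Online step: best-first traversal that prunes branches by upper bound."""
    if k <= 0:
        return []
    tie = itertools.count()                        # unique tie-breaker for heap entries
    results = []                                   # min-heap of (score, tie, record)
    frontier = [(-upper_bound(query, root), next(tie), root)]
    while frontier:
        neg_bound, _, node = heapq.heappop(frontier)
        if len(results) == k and -neg_bound <= results[0][0]:
            break                                  # no remaining branch can improve the top k
        for rec in node.records:
            score = jaccard(query, rec)
            if len(results) < k:
                heapq.heappush(results, (score, next(tie), rec))
            elif score > results[0][0]:
                heapq.heapreplace(results, (score, next(tie), rec))
        for child in node.children:
            heapq.heappush(frontier, (-upper_bound(query, child), next(tie), child))
    return sorted(((s, r) for s, _, r in results), key=lambda pair: -pair[0])


docs = ["tree based index search", "textual similarity search",
        "tree index pruning", "big data traversal", "top k search algorithm"]
records = [frozenset(d.split()) for d in docs]
root = build_index(records)
hits = top_k(root, frozenset("tree based index".split()), k=2)
```

Because the bound never underestimates the similarity of any record in a subtree, stopping once the best remaining bound cannot beat the current k-th score returns the same scores as a full scan. Supporting other similarity functions, as the abstract describes, would require a matching bound for each one.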

Keywords: similarity; indexing approach; textual similarity; similarity search; search

Journal Title: IEEE Access
Year Published: 2021
