An NVM SSD-Based High Performance Query Processing Framework for Search Engines


Commercial search engines generally maintain hundreds of thousands of machines equipped with large amounts of DRAM in order to process a huge volume of user queries with low latency, which incurs high hardware cost because DRAM is very expensive. Recently, the NVM Optane SSD has been considered a promising underlying storage device due to its price advantage over DRAM and its speed advantage over traditional slow block devices. However, achieving efficiency comparable to an in-memory index when applying NVM to applications that are both latency-critical and I/O-bandwidth-critical, such as search engines, still poses non-trivial challenges, because NVM has much lower I/O speed and bandwidth than DRAM. In this paper, we propose an NVM SSD-optimized query processing framework that aims to address both the latency and bandwidth issues of using NVM in search engines. First, we propose a pipelined query processing methodology that significantly reduces I/O waiting time through fine-grained overlapping of computation and I/O operations. Second, we propose a cache-aware query reordering algorithm that places queries sharing more data next to each other in the processing order, so that I/O traffic is minimized. Third, we propose a data prefetching mechanism that reduces the extra thread waiting time caused by data sharing and improves bandwidth utilization. Moreover, we propose intra-query parallel mechanisms for long-tail queries, including query subtask scheduling, a concurrent heap access strategy, query parallelism prediction, and adaptive pipelining. Extensive experimental studies show that our framework significantly outperforms the state-of-the-art baselines and obtains processing latency and throughput comparable to DRAM while using much less space, in both inter-query and intra-query parallel scenarios.
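
The abstract does not give the details of the cache-aware query reordering algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each query is a set of terms, that each term corresponds to one posting list read from the SSD, and that the cache holds only the lists of the previously processed query. The function names `reorder_queries` and `lists_loaded` are hypothetical, and the greedy nearest-neighbor ordering is a stand-in technique chosen for illustration.

```python
from typing import List, Set


def reorder_queries(queries: List[Set[str]]) -> List[int]:
    """Greedy ordering: start from query 0, then repeatedly pick the
    unprocessed query that shares the most terms with the last one chosen.
    (Illustrative heuristic; the paper's actual algorithm is not shown here.)"""
    if not queries:
        return []
    remaining = set(range(1, len(queries)))
    order = [0]
    while remaining:
        last = queries[order[-1]]
        # Maximize term overlap; break ties by the lowest original index.
        nxt = max(remaining, key=lambda i: (len(queries[i] & last), -i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def lists_loaded(queries: List[Set[str]], order: List[int]) -> int:
    """Count posting-list loads, assuming the cache holds only the
    posting lists of the previously processed query."""
    loads = 0
    cached: Set[str] = set()
    for i in order:
        loads += len(queries[i] - cached)  # terms not already cached
        cached = queries[i]
    return loads


queries = [
    {"nvm", "ssd"},
    {"search", "engine"},
    {"nvm", "optane"},
    {"search", "query"},
]
order = reorder_queries(queries)
baseline_loads = lists_loaded(queries, list(range(len(queries))))
reordered_loads = lists_loaded(queries, order)
print(order, baseline_loads, reordered_loads)
```

In this toy batch, processing the two "nvm" queries back to back and the two "search" queries back to back lets each pair reuse one cached list, so fewer lists are read than in arrival order. A real system would also have to bound how far a query may be delayed by reordering, which this sketch ignores.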

Keywords: query processing; NVM SSD; search engines; query

Journal Title: IEEE Transactions on Knowledge and Data Engineering
Year Published: 2023
