CongraPlus: Towards Efficient Processing of Concurrent Graph Queries on NUMA Machines

Graph analytics is routinely used to solve problems in a wide range of real-life applications. Efficiently processing concurrent graph analytics queries in a multiuser environment is highly desirable as we enter a world of edge-device-oriented services. Existing research, however, primarily focuses on analyzing a single, large graph dataset and leaves the efficient processing of multiple mid-sized graph analytics queries an intriguing yet challenging open problem. In this work, we investigate the scheduling of concurrent graph analytics queries on NUMA machines. We analyze the performance of several graph analytics algorithms and observe that they have diminishing performance returns as the number of processor cores increases. With concurrent graph analytics, such diminishing returns translate to no or even negative performance gains because of increasing contention on shared hardware resources. We also demonstrate that the memory bandwidth usage of numerous graph analytics algorithms is unpredictable, which can cause severe memory bandwidth contention and, in turn, sub-optimal performance. Motivated by these observations, we propose CongraPlus, a NUMA-aware scheduler that intelligently manages concurrent graph analytics queries for better system throughput and memory bandwidth efficiency. CongraPlus collects the memory bandwidth consumption characteristics of graph analytics queries via offline profiling and eliminates memory bandwidth contention by computing the optimal sequence in which to launch queries. It also avoids computation resource contention by assigning a certain number of processor cores to each individual query. We implement CongraPlus in C++ on top of the Ligra graph processing framework and test it with judiciously selected graph processing query combinations. Our results reveal that CongraPlus-based schemes improve query throughput by 30 percent compared to the conventional approach. CongraPlus also exhibits much better quality of service and scalability.
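
As a rough illustration of the scheduling idea described above (profiled bandwidth per query, a fixed core allotment per query, and a launch order chosen to avoid contention), the following Python sketch packs queries into concurrent batches. It is not the CongraPlus algorithm and not its C++/Ligra implementation: the abstract does not specify how the optimal launch sequence is computed, so a simple first-fit-decreasing heuristic is swapped in here. The names `Query` and `plan_batches`, the bandwidth cap, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    name: str
    bandwidth_gbps: float  # assumed: peak memory bandwidth from offline profiling
    cores: int             # assumed: number of cores assigned to this query


def plan_batches(queries, bandwidth_cap_gbps, total_cores):
    """Group queries into batches that run one after another.

    Queries inside a batch run concurrently. A query joins the first
    batch whose summed bandwidth and summed cores stay within the
    machine limits (first-fit decreasing by bandwidth). This is a
    heuristic stand-in, not an optimal ordering.
    """
    batches = []  # each entry: {"names": [...], "bw": float, "cores": int}
    ordered = sorted(queries, key=lambda q: q.bandwidth_gbps, reverse=True)
    for q in ordered:
        if q.bandwidth_gbps > bandwidth_cap_gbps or q.cores > total_cores:
            raise ValueError(f"query {q.name!r} exceeds machine limits")
        for batch in batches:
            fits_bw = batch["bw"] + q.bandwidth_gbps <= bandwidth_cap_gbps
            fits_cores = batch["cores"] + q.cores <= total_cores
            if fits_bw and fits_cores:
                batch["names"].append(q.name)
                batch["bw"] += q.bandwidth_gbps
                batch["cores"] += q.cores
                break
        else:
            # No existing batch has room, so open a new one.
            batches.append(
                {"names": [q.name], "bw": q.bandwidth_gbps, "cores": q.cores}
            )
    return [batch["names"] for batch in batches]


if __name__ == "__main__":
    # Hypothetical profile data, for illustration only.
    demo = [
        Query("pagerank", 40.0, 8),
        Query("bfs", 15.0, 4),
        Query("cc", 30.0, 8),
        Query("bc", 25.0, 4),
    ]
    for i, names in enumerate(plan_batches(demo, 60.0, 16)):
        print(f"batch {i}: {names}")
```

With the hypothetical profile above, a 60 GB/s cap, and 16 cores, the bandwidth-heavy query is paired with the lightest one, so no batch exceeds either limit. A real scheduler would also need to account for NUMA node placement and for bandwidth that varies over a query's lifetime, neither of which this sketch models.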

Keywords: memory bandwidth; processing concurrent; concurrent graph; graph; analytics queries; graph analytics

Journal Title: IEEE Transactions on Parallel and Distributed Systems
Year Published: 2019
