The rapid growth in the number of scholarly documents on the Web and on other digital platforms makes it challenging for researchers to find the publications most relevant to their information needs. Major scholarly retrieval systems, such as Google Scholar, Semantic Scholar, PubMed, and CiteSeerX, have mitigated this challenge to a great extent, and their success rests largely on advances in ranking approaches. However, existing studies suggest that current methods remain far from their effectiveness ceiling, leaving ample room for improvement in meeting users' scholarly needs. Existing methods adopt different approaches: some use classical Information Retrieval (IR), while others use semantics-aware methods, including Knowledge Graphs (KGs), to support scholarly search. We hypothesize that combining the best of both worlds can further improve search relevance. To this end, this work uses an inverted index from classical IR, with BM25 as the weighting scheme, combined with Citation Networks Analysis (CNA) to produce the baseline search results. These results are then re-ranked by passing selected entities from the top-k initial results as the search query to the KG. In this way, the retrieval process exploits not only the textual content but also the structural semantics of research publications. The goal is to exploit IR and KG-based retrieval techniques to gain insight into how textual and structured information each contribute to the ranking of scholarly articles. The proposed solution has been evaluated on the ACL Anthology Network (AAN) dataset. The results show that the proposed technique can improve retrieval performance in terms of Normalized Discounted Cumulative Gain (nDCG) and precision.
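To make the first-stage retrieval and the evaluation metric concrete, the following is a minimal sketch of BM25 scoring over an inverted index, followed by an nDCG@k computation. It is not the authors' implementation: the whitespace tokenizer, the parameter values `k1=1.2` and `b=0.75`, the Lucene-style IDF variant, the function names, and the three toy documents are all assumptions chosen for illustration. The CNA component and the KG-based re-ranking step are not covered.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index (term -> {doc_id: term frequency}) and document lengths."""
    index = defaultdict(dict)
    lengths = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()  # assumption: naive whitespace tokenization
        lengths[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def bm25_search(query, index, lengths, k1=1.2, b=0.75):
    """Score documents that share at least one term with the query using BM25."""
    n_docs = len(lengths)
    avg_len = sum(lengths.values()) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        df = len(postings)
        # Assumption: the non-negative IDF variant log(1 + (N - df + 0.5) / (df + 0.5)).
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def ndcg_at_k(ranked_ids, relevance, k):
    """nDCG@k with graded relevance and a log2 position discount."""
    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical toy corpus and relevance judgments, for illustration only.
docs = {
    "d1": "neural machine translation with attention",
    "d2": "statistical parsing of treebank sentences",
    "d3": "machine translation evaluation metrics for machine translation",
}
index, lengths = build_index(docs)
ranking = [doc_id for doc_id, _ in bm25_search("machine translation", index, lengths)]
relevance = {"d3": 2, "d1": 1}
score = ndcg_at_k(ranking, relevance, k=2)
```

In a pipeline like the one described above, `ranking` would play the role of the initial top-k list handed to the re-ranking stage, and `ndcg_at_k` could be applied to both the initial and the re-ranked lists to compare them.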
               