In recent years, the problem of cross-site user identification (CSUI) has drawn increasing attention with the rise of social media. Despite noticeable progress in this field, the enormous computational cost of full-scale pairwise comparison remains unsolved, which constrains the practical application of state-of-the-art approaches, especially when the number of users reaches tens of millions. To address this issue, we propose a novel trajectory-oriented method that incorporates locality-sensitive hashing (LSH) into approximate nearest neighbor search. Specifically, we design an LSH function based on binary search (BLSH) to process user trajectories easily, and by searching the BLSH buckets, we cluster similar users to avoid the onerous full-scale pairwise comparison. Moreover, we devise a hierarchy of discrete attention to further improve effectiveness, which yields the complete method, hierarchically attentioned binary-search-based LSH (HA-BLSH). Extensive experiments on ground-truth datasets demonstrate that HA-BLSH outperforms cutting-edge approaches in both efficiency and effectiveness. Further discussion indicates that our method is also flexible and can accommodate more scenarios.
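
The bucketing idea can be illustrated with a minimal Python sketch. It is not the BLSH function itself, whose construction the abstract does not give. As an assumption made purely for illustration, each trajectory point is located in a fixed grid by binary search over sorted cell edges, and a user's bucket key is the most visited cell. All names here (`cell_of`, `bucket_key`, `candidate_pairs`, the grid edges) are hypothetical. Only users from the two sites that share a bucket become candidate pairs, so the full pairwise comparison is skipped.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def cell_of(point, lat_edges, lon_edges):
    """Locate a (lat, lon) point in a grid by binary search over sorted edges."""
    lat, lon = point
    return bisect_right(lat_edges, lat), bisect_right(lon_edges, lon)


def bucket_key(trajectory, lat_edges, lon_edges, top_k=1):
    """Hypothetical signature: the top_k most visited grid cells (an assumption)."""
    counts = Counter(cell_of(p, lat_edges, lon_edges) for p in trajectory)
    # Rank by descending visit count, then by cell index, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tuple(sorted(cell for cell, _ in ranked[:top_k]))


def candidate_pairs(site_a, site_b, lat_edges, lon_edges):
    """Pair users across two sites only when their trajectories share a bucket."""
    buckets = defaultdict(list)
    for user, trajectory in site_a.items():
        buckets[bucket_key(trajectory, lat_edges, lon_edges)].append(user)

    pairs = []
    for user, trajectory in site_b.items():
        key = bucket_key(trajectory, lat_edges, lon_edges)
        for other in buckets.get(key, []):
            pairs.append((other, user))
    return pairs


if __name__ == "__main__":
    # Toy data: two users per site, with coordinates on an arbitrary grid.
    edges = [0.0, 10.0, 20.0]
    site_a = {"a1": [(1, 1), (2, 2), (15, 15)], "a2": [(25, 25), (26, 24)]}
    site_b = {"b1": [(3, 1), (4, 4)], "b2": [(22, 27)]}
    print(candidate_pairs(site_a, site_b, edges, edges))
```

On this toy input, two candidate pairs are produced instead of the four that a full pairwise comparison would examine. A single dominant cell is a crude signature: users whose activity straddles a cell boundary can fall into different buckets, which is the kind of miss that multiple hash tables or a more refined hash function are meant to reduce.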
               