Localization and tracking in multi-player sports are challenging, particularly in wide, crowded scenes where severe occlusions occur. Traditional single-camera solutions are limited in their ability to identify players accurately and can produce ambiguous detections. To overcome these limitations, we propose fusing information from multiple cameras positioned around the field to improve positioning accuracy and eliminate occlusion effects. Specifically, we focus on soccer, a popular and representative multi-player sport, and develop a multi-view recording system based on a 1+N strategy. This system enabled us to construct a new benchmark dataset and to continuously collect data from several sports fields. The dataset includes 17 sets of densely annotated multi-view videos, each lasting 2 min, as well as more than 1100 min of multi-view videos. It covers a wide range of game types and nearly all scenarios that could arise during real game tracking. Finally, we conduct a thorough assessment of four multi-view multi-object tracking (MVMOT) methods and gain valuable insights into the tracking process in actual games.
               