"Adaptive Relay Selection Strategy in Underwater Acoustic Cooperative Networks: A Hierarchical Adversarial Bandit Learning Approach"

Relay selection solutions for underwater acoustic cooperative networks suffer significant performance degradation because they fail to adapt to incomplete information, noisy interference and highly dynamic conditions. To address this challenge, a hierarchical adversarial multi-armed bandit learning framework with an online reward estimation layer is designed to improve adaptive relay decision control. In the online reward estimation layer, an adaptive Kalman filter estimator is developed to handle noisy observations and support accurate reward estimation. Meanwhile, an online prediction mechanism is devised for all relays to enrich the learning information. Furthermore, based on the estimate error variance, an adaptive exploration structure is developed to reach the balance between exploration and exploitation more quickly. All gathered information is exploited to learn relay quality for decision making. Accordingly, we present a Hierarchical Adversarial Bandit Learning (HABL) algorithm to fully exploit the heuristic interaction between the layers of the hierarchical framework. HABL integrates reward estimation, information prediction, adaptive exploration and decision making in a single holistic algorithm to maximize learning efficiency. As a result, the HABL-based relay selection algorithm achieves higher system throughput and lower communication cost. Further, we rigorously analyze the convergence of the HABL algorithm and give an upper bound on its cumulative regret. Finally, extensive simulations demonstrate the effectiveness of HABL.
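
To make the layered idea concrete, the following is a minimal illustrative sketch and not the HABL algorithm itself, whose details the abstract does not give. It assumes a scalar random-walk Kalman filter per relay for reward estimation, a prediction step applied to every relay in each round, and a standard EXP3-style adversarial bandit update whose exploration rate is tied to the average estimate error variance. All names (ScalarKalman, exp3_probabilities, run), the noise parameters and the clipping bounds on the exploration rate are hypothetical choices made for illustration.

```python
import math
import random


class ScalarKalman:
    """One-dimensional Kalman filter tracking a slowly drifting reward (assumed model)."""

    def __init__(self, process_var=1e-4, obs_var=0.04):
        self.mean = 0.5          # initial reward estimate (assumption)
        self.var = 1.0           # initial estimate error variance (assumption)
        self.process_var = process_var
        self.obs_var = obs_var

    def predict(self):
        # Random-walk model: the estimate is unchanged, its uncertainty grows.
        self.var += self.process_var

    def update(self, observation):
        # Standard scalar Kalman correction with a noisy observation.
        gain = self.var / (self.var + self.obs_var)
        self.mean += gain * (observation - self.mean)
        self.var *= 1.0 - gain


def exp3_probabilities(weights, gamma):
    """Mix the weight distribution with uniform exploration (EXP3)."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def run(true_rewards, rounds=2000, noise_std=0.2, seed=0):
    """Simulate relay selection; returns (selection counts, filtered reward estimates)."""
    rng = random.Random(seed)
    k = len(true_rewards)
    filters = [ScalarKalman(obs_var=noise_std ** 2) for _ in range(k)]
    weights = [1.0] * k
    counts = [0] * k
    for _ in range(rounds):
        # Prediction step for all relays, including those not selected.
        for f in filters:
            f.predict()
        # Hypothetical rule: explore more while estimates are uncertain.
        avg_var = sum(f.var for f in filters) / k
        gamma = min(1.0, max(0.05, avg_var))
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(k), weights=probs)[0]
        # Only the selected relay yields a (noisy) observation.
        noisy = true_rewards[arm] + rng.gauss(0.0, noise_std)
        filters[arm].update(noisy)
        # Use the filtered estimate, clipped to [0, 1], as the bandit reward.
        reward = min(1.0, max(0.0, filters[arm].mean))
        weights[arm] *= math.exp(gamma * (reward / probs[arm]) / k)
        # Rescale so the weights stay in a numerically safe range.
        top = max(weights)
        weights = [w / top for w in weights]
        counts[arm] += 1
    return counts, [f.mean for f in filters]


if __name__ == "__main__":
    counts, means = run([0.2, 0.5, 0.9])
    print("selections per relay:", counts)
    print("filtered reward estimates:", [round(m, 3) for m in means])
```

In this sketch the filter's error variance plays two roles: it smooths the noisy reward that feeds the bandit update, and it sets how much uniform exploration is mixed into the selection probabilities. The actual coupling used in HABL may differ.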

Keywords: hierarchical adversarial; bandit learning; relay; relay selection

Journal Title: IEEE Transactions on Mobile Computing
Year Published: 2023
