Published in 2022 in "IEEE Transactions on Neural Networks and Learning Systems"
DOI: 10.1109/tnnls.2022.3161806
Abstract: Upper confidence bound (UCB)-based contextual bandit algorithms require one to know the tail property of the reward distribution. Unfortunately, this tail property is usually unknown or difficult to specify in real-world applications. Using a tail…
Keywords:
tail property;
proposed estimator;
contextual bandit;
bandit ...