A Definition of Non

您所在的位置:网站首页 colt会议 A Definition of Non

A Definition of Non

2023-07-09 18:59| 来源: 网络整理| 查看: 265

Despite the subject of non-stationary bandit learning having attracted much recent attention, we have yet to identify a formal definition of non-stationarity that can consistently distinguish non-stationary bandits from stationary ones. Prior work has characterized non-stationary bandits as bandits for which the reward distribution changes over time. We demonstrate that this definition can ambiguously classify the same bandit as both stationary and non-stationary;  this ambiguity arises in the existing definition's dependence on the latent sequence of reward distributions. In addition, the definition has given rise to two widely used notions of regret: the dynamic regret and the weak regret. These notions are not indicative of qualitative agent performance in some bandits that are intuitively nearly-stationary, i.e., non-stationary bandits that closely resemble stationary ones. In addition, this definition of non-stationary bandits has led to the design of agents that explore excessively. We introduce a formal definition of non-stationary bandits that resolves these issues. In addition, for two bandits that provide agents with indistinguishable experiences, our definition classifies them as both stationary or both non-stationary. Our definition also applies seamlessly to both the Bayesian and the frequentist formulations of bandits, providing a unified approach. 



【本文地址】


今日新闻


推荐新闻


CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3