Stay with me: Lifetime maximization through heteroscedastic linear bandits with reneging

Ping-Chun Hsieh*, Xi Liu, Anirban Bhattacharya, P. R. Kumar

*Corresponding author for this work

Research output: Conference contribution › Peer-reviewed

Abstract

Sequential decision making for lifetime maximization is a critical problem in many real-world applications, such as medical treatment and portfolio selection. In these applications, a "reneging" phenomenon, where participants may disengage from future interactions after observing an unsatisfying outcome, is rather prevalent. To address this issue, this paper proposes a model of heteroscedastic linear bandits with reneging, which allows each participant to have a distinct "satisfaction level," with any interaction outcome falling short of that level resulting in that participant reneging. Moreover, it allows the variance of the outcome to be context-dependent. Based on this model, we develop a UCB-type policy, namely HR-UCB, and prove that it achieves O(√T(log(T))^3) regret. Finally, we validate the performance of HR-UCB via simulations.
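To illustrate the model described in the abstract, the following is a minimal simulation sketch, not the paper's HR-UCB algorithm: a generic linear UCB policy interacts with a participant whose outcomes have context-dependent (heteroscedastic) noise, and the episode ends early if an outcome falls below the participant's satisfaction level. The parameter names (`theta`, `phi`, `satisfaction`, `alpha`) and the softplus noise model are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_actions, horizon = 3, 5, 200

theta = rng.normal(size=d)   # unknown reward parameter (illustrative)
phi = rng.normal(size=d)     # noise parameter: std = softplus(x @ phi) (assumption)
satisfaction = -1.0          # participant's satisfaction level (assumption)

A = np.eye(d)                # regularized Gram matrix for ridge estimate
b = np.zeros(d)
alpha = 1.0                  # exploration weight (tuning constant)

pulls = 0
for t in range(horizon):
    contexts = rng.normal(size=(n_actions, d))
    theta_hat = np.linalg.solve(A, b)
    A_inv = np.linalg.inv(A)
    # UCB score: estimated mean reward plus a bonus that grows with
    # the uncertainty of each context under the current Gram matrix.
    ucb = contexts @ theta_hat + alpha * np.sqrt(
        np.einsum('ij,jk,ik->i', contexts, A_inv, contexts))
    x = contexts[np.argmax(ucb)]
    sigma = np.log1p(np.exp(x @ phi))           # context-dependent noise scale
    outcome = x @ theta + sigma * rng.normal()  # heteroscedastic outcome
    A += np.outer(x, x)
    b += outcome * x
    pulls += 1
    if outcome < satisfaction:                  # participant reneges; episode ends
        break

print("interactions before reneging or horizon:", pulls)
```

The policy's lifetime is the number of pulls before reneging, so a policy that accounts for outcome variance (as HR-UCB does) can trade expected reward against the risk of an outcome below the satisfaction level.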

Original language: English
Title of host publication: 36th International Conference on Machine Learning, ICML 2019
Publisher: International Machine Learning Society (IMLS)
Pages: 4957-4966
Number of pages: 10
ISBN (electronic): 9781510886988
Publication status: Published - 1 January 2019
Event: 36th International Conference on Machine Learning, ICML 2019 - Long Beach, United States
Duration: 9 June 2019 – 15 June 2019

Publication series

Name: 36th International Conference on Machine Learning, ICML 2019
2019-June

Conference

Conference: 36th International Conference on Machine Learning, ICML 2019
Country: United States
City: Long Beach
Period: 9/06/19 → 15/06/19
