太阳成集团tyc234cc(中国)有限公司

Seminar

Machine Learning and Data Science PhD Student Forum Series (Session 76): Follow-the-Perturbed-Leader Achieves Best-of-Both-Worlds for Bandit Problems

Speaker: Jingxin Zhan (詹景昕), tyc234cc 太阳成集团

Time: 2024-09-19 16:00-17:00

Venue: Tencent Meeting 627-5441-1672

Abstract:
Best-of-both-worlds (BOBW) bandit algorithms, which have regret guarantees in both stochastic and adversarial settings, have been studied for many years. Tsallis-INF, along with other Follow-The-Regularized-Leader (FTRL) policies, is one of the most promising frameworks for BOBW policies.

However, a limitation of FTRL policies is that the arm-selection probabilities must be computed explicitly. The Follow-The-Perturbed-Leader (FTPL) policy has been studied as a promising way to circumvent this limitation. In this talk, we will introduce an FTPL algorithm with Fréchet perturbation that also achieves the BOBW bound, based on recent work by Lee, Honda, Ito, and Oh (COLT 2024).

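About the forum: This online forum is organized by Professor Zhihua Zhang's machine learning lab and is held every two weeks (except on public holidays). Each session invites a PhD student to give a systematic, in-depth introduction to a frontier topic. Topics include, but are not limited to, machine learning, high-dimensional statistics, operations research and optimization, and theoretical computer science.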
