Academic Seminar

Title: Reinforcement learning in Markov decision processes with skew-symmetric bilinear utility functions

Speaker: Prof. Paul Weng (翁安林), Sun Yat-sen University

Time: 14:00, Tuesday, May 31, 2016

Venue: Conference Room 633, Science and Engineering Building, Tiancizhuang Campus

Abstract: Markov decision processes (MDPs) and reinforcement learning (RL) are general frameworks for tackling sequential decision-making problems under uncertainty. Both rely on the existence of numerical rewards to quantify the value of actions in states, and on expectation to quantify the value of policies. In practice, these assumptions may not hold, as has also been observed in other recent work (e.g., preference-based reinforcement learning, dueling bandits). In this talk, we investigate, in the context of MDPs/RL, the exploitation of a general family of decision models called skew-symmetric bilinear (SSB) utility functions, which contains expected utility (as used in standard MDPs/RL), probabilistic dominance (as used in dueling bandits), and many others. For both the MDP and the RL settings, we propose solution algorithms and present some initial experimental results.
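To make the family of models concrete, here is a minimal sketch of the SSB criterion, following Fishburn's classical formulation; the notation below is illustrative and not taken from the talk itself. A skew-symmetric function \(\varphi\) over pairs of outcomes (i.e., \(\varphi(x, y) = -\varphi(y, x)\)) is extended bilinearly to distributions, and a policy inducing an outcome distribution \(p\) is preferred to one inducing \(q\) whenever

\[
\varphi(p, q) \;=\; \sum_{x}\sum_{y} p(x)\, q(y)\, \varphi(x, y) \;\ge\; 0 .
\]

Choosing \(\varphi(x, y) = u(x) - u(y)\) recovers the expected-utility criterion of standard MDPs/RL, while \(\varphi(x, y) = \mathbb{1}[x \succ y] - \mathbb{1}[y \succ x]\) yields probabilistic dominance as used in dueling bandits.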

Speaker Bio: Paul Weng (翁安林) is currently a faculty member at the SYSU-CMU Joint Institute of Engineering (JIE), a partnership between Sun Yat-sen University (SYSU) and Carnegie Mellon University (CMU). During 2015, he was a visiting faculty member at CMU. Before that, he was an associate professor in computer science at Sorbonne Universités, UPMC (Pierre and Marie Curie University), Paris. He received his Master's degree in 2003 and his Ph.D. in 2006, both from UPMC. Before joining academia, he graduated from ENSAI (French National School in Statistics and Information Analysis) and worked as a financial quantitative analyst in London. His main recent research deals with (sequential) decision-making under uncertainty, multicriteria decision-making, qualitative/ordinal decision models, and preference learning/elicitation.