Eugene Lim

3 Posts

Stochastic Bandits for Egalitarian Assignment

We address a problem in which an agent assigns users to arms in a stochastic multi-armed bandit setting, with the goal of maximizing the minimum expected cumulative reward across all users. We present a UCB-based policy with upper bounds on its cumulative regret, along with a policy-independent impossibility result.

Observed Adversaries in Deep Reinforcement Learning

We examine the problem of observed adversaries for deep policies, where observations of other agents can hamper robot performance.

Juiced and Ready to Predict Private Information in Deep Cooperative Reinforcement Learning

Training robots that can interactively assist humans with private information.