Stochastic Bandits for Egalitarian Assignment

Stochastic Bandits for Egalitarian Assignment, Eugene Lim★, Vincent Tan, Harold Soh★, Transactions on Machine Learning Research (TMLR)
Links: Paper | Github

EgalUCB Algorithm

We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its assigned arm. The agent’s objective is to maximize the minimum expected cumulative reward among all users over a fixed horizon. This problem has applications in areas such as fairness in job and resource allocations, among others. We design and analyze a UCB-based policy EgalUCB and establish upper bounds on the cumulative regret. In complement, we establish an almost-matching policy-independent impossibility result.

Resources

You can find our paper here. Check out our repository here on github

Citation

Please consider citing our paper if you build upon our results and ideas.

Eugene Lim★, Vincent Tan, Harold Soh★, “Stochastic Bandits for Egalitarian Assignment”, Transactions on Machine Learning Research (TMLR)

@article{lim2025egalucb,
title={Stochastic Bandits for Egalitarian Assignment},
author={Lim, Eugene and Tan, Vincent and Soh, Harold},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
url={https://openreview.net/forum?id=kmKVJl2JWo},
year={2024} } 

Contact

If you have questions or comments, please contact Eugene or Harold.

Acknowledgements

This research/project is supported by the National Research Foundation, Singapore under its AI Singapore Programme (AISG Award No: AISG2-PhD/2021-08-011). This research/project is also supported by a Ministry of Education Tier 2 grant under grant number A-9000423-00-00.