DISCO: Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting, Ce Hao★, Kelvin Lin★, Zhiwei Xue★, Siyuan Luo, Harold Soh★, IEEE Robotics and Automation Letters (RA-L)
Diffusion policies have demonstrated strong performance in generative modeling, making them promising for robotic manipulation guided by natural language instructions. However, generalizing language-conditioned diffusion policies to open-vocabulary instructions in everyday scenarios remains challenging due to the scarcity and cost of robot demonstration datasets. To address this, we propose DISCO, a framework that leverages off-the-shelf vision-language models (VLMs) to bridge natural language understanding with high-performance diffusion policies. DISCO translates linguistic task descriptions into actionable 3D keyframes using VLMs, which then guide the diffusion process through constrained inpainting. However, enforcing strict adherence to these keyframes can degrade performance when the VLM-generated keyframes are inaccurate. To mitigate this, we introduce an inpainting optimization strategy that balances keyframe adherence with learned motion priors from training data. Experimental results in both simulated and real-world settings demonstrate that DISCO outperforms conventional fine-tuned language-conditioned policies, achieving superior generalization in zero-shot, open-vocabulary manipulation tasks.
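To make the idea of keyframe-constrained inpainting concrete, here is a minimal NumPy sketch. It is not the DISCO implementation: the function names, the `weight` parameter, and the toy smoothing step that stands in for a learned denoiser are all assumptions made for illustration. A real diffusion sampler would also noise the keyframes to the current noise level before writing them into the trajectory, which this sketch omits. The sketch only shows the general pattern of re-imposing keyframes after every denoising step, with `weight=1.0` corresponding to strict adherence and smaller values leaving more room for the motion prior.

```python
import numpy as np

def inpaint_keyframes(traj, keyframes, weight=1.0):
    """Blend keyframe targets into a trajectory at the given timesteps.

    traj: (T, 3) array of 3D waypoints.
    keyframes: dict mapping timestep index -> target xyz (hypothetical format).
    weight: 1.0 overwrites the waypoint (hard inpainting); values below 1.0
            keep part of the current prediction (soft constraint).
    """
    out = traj.copy()
    for t, target in keyframes.items():
        target = np.asarray(target, dtype=float)
        out[t] = weight * target + (1.0 - weight) * out[t]
    return out

def toy_denoise_step(traj, smooth=0.5):
    """Stand-in for a learned denoiser (assumption, not a trained model).

    Pulls each interior waypoint toward the mean of its two neighbours,
    which mimics a prior that prefers smooth motion.
    """
    out = traj.copy()
    out[1:-1] = (1.0 - smooth) * traj[1:-1] + smooth * 0.5 * (traj[:-2] + traj[2:])
    return out

def sample_with_inpainting(T=16, steps=50, keyframes=None, weight=1.0, seed=0):
    """Start from noise, then alternate denoising and keyframe inpainting."""
    rng = np.random.default_rng(seed)
    traj = rng.normal(size=(T, 3))
    for _ in range(steps):
        traj = toy_denoise_step(traj)
        traj = inpaint_keyframes(traj, keyframes or {}, weight)
    return traj

if __name__ == "__main__":
    # Hypothetical keyframes: start at the origin, end at (1, 1, 1).
    kf = {0: [0.0, 0.0, 0.0], 15: [1.0, 1.0, 1.0]}
    hard = sample_with_inpainting(keyframes=kf, weight=1.0)
    soft = sample_with_inpainting(keyframes=kf, weight=0.5)
    print("hard end point:", hard[15])
    print("soft end point:", soft[15])
```

With `weight=1.0` the sampled trajectory passes exactly through the keyframes; lowering `weight` lets the smoothing prior pull the waypoints away from keyframes that may be inaccurate. This is only a rough analogue of the trade-off the abstract describes between keyframe adherence and learned motion priors.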
Resources
You can find our paper here.
Citation
Please consider citing our paper if you build upon our results and ideas.
Ce Hao★, Kelvin Lin★, Zhiwei Xue★, Siyuan Luo, Harold Soh★, “DISCO: Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting”, IEEE Robotics and Automation Letters (RA-L)
@article{hao2025disco,
  title={DISCO: Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting},
  author={Hao, Ce and Lin, Kelvin and Xue, Zhiwei and Luo, Siyuan and Soh, Harold},
  journal={IEEE Robotics and Automation Letters},
  year={2025},
  publisher={IEEE}
}
Contact
If you have questions or comments, please contact Ce or Harold.