Heterogeneous Tactile Transformer, Jianxin Bi★, Qiang Wang, Jayaram Reddy★, Kelvin Lin★, Soibkhon Khajikhanov★, Ruihan Gao, and Harold Soh★, arXiv preprint
Tactile sensors are inherently heterogeneous: a model trained on one sensor usually cannot be used on another, which makes it hard to learn contact-rich manipulation policies from diverse tactile data at scale. The Heterogeneous Tactile Transformer (HTT) bridges this gap. It pairs sensor-specific encoders with a shared transformer trunk and is pretrained with two self-supervised objectives: per-modality masked reconstruction and cross-modal alignment between paired sensors. Together, these objectives preserve each sensor’s structure while aligning the sensors in a common latent space.
Pretraining uses our new Heterogeneous Paired Tactile (HPT) dataset: 1.6M synchronized paired frames across four vision- and array-based tactile sensors (GelSight Mini, 9DTact, Xela, TAC-02), collected with a Universal Manipulation Interface (UMI) setup across press, twist, and slide motions.
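To make the two pretraining objectives concrete, here is a minimal NumPy sketch rather than the released implementation. Everything specific in it is an assumption made for illustration: the patch sizes, the latent width, the linear encoders and decoders, the single `tanh` layer standing in for the transformer trunk, the symmetric InfoNCE form of the alignment term, and the helper names `embed`, `masked_reconstruction_loss`, `alignment_loss`, and `pretraining_loss`. The paper may define each of these differently.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared latent width (hypothetical)

# Hypothetical per-patch input sizes; real sensors differ in resolution and layout.
PATCH_DIMS = {"gelsight_mini": 48, "xela": 12}

# One encoder and one decoder per sensor, plus a single trunk shared by all sensors.
encoders = {s: rng.normal(0, 0.1, (d, D)) for s, d in PATCH_DIMS.items()}
decoders = {s: rng.normal(0, 0.1, (D, d)) for s, d in PATCH_DIMS.items()}
trunk = rng.normal(0, 0.1, (D, D))  # stand-in for the shared transformer trunk


def embed(sensor, patches, mask):
    """Encode (B, N, d) patches; patches where mask is True are zeroed out."""
    visible = np.where(mask[..., None], 0.0, patches)
    tokens = visible @ encoders[sensor]  # sensor-specific projection
    return np.tanh(tokens @ trunk)       # shared trunk, shape (B, N, D)


def masked_reconstruction_loss(sensor, patches, mask):
    """Mean squared error on the masked patches only."""
    recon = embed(sensor, patches, mask) @ decoders[sensor]
    err = ((recon - patches) ** 2).mean(axis=-1)  # (B, N)
    return (err * mask).sum() / max(mask.sum(), 1)


def log_softmax(logits, axis):
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def alignment_loss(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE (assumed form): row i of z_a is paired with row i of z_b."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    idx = np.arange(len(z_a))
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)


def pretraining_loss(batch, mask_ratio=0.5, align_weight=1.0):
    """batch maps two sensor names to (B, N, d) patches from synchronized frames."""
    total, pooled = 0.0, []
    for sensor, patches in batch.items():
        mask = rng.random(patches.shape[:2]) < mask_ratio
        total += masked_reconstruction_loss(sensor, patches, mask)
        unmasked = np.zeros(patches.shape[:2], dtype=bool)
        pooled.append(embed(sensor, patches, unmasked).mean(axis=1))  # (B, D)
    return total + align_weight * alignment_loss(pooled[0], pooled[1])


# Toy paired batch: 4 synchronized frames, 6 patches per frame for each sensor.
batch = {s: rng.normal(size=(4, 6, d)) for s, d in PATCH_DIMS.items()}
print(float(pretraining_loss(batch)))
```

In this sketch the reconstruction term passes through each sensor's own encoder and decoder, while the alignment term compares pooled trunk outputs across sensors. That split is one plausible way to keep sensor-specific structure while sharing a latent space.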
Highlights
- Transferable perception. Across object classification, force estimation, and slip detection on all four heterogeneous sensors, HTT learns the strongest representations: it achieves the best overall 20-class classification accuracy and the best force and slip results on three of the four sensors.
- Manipulation with unseen sensors. Deployed on a camera-free Sharpa hand (fingertip tactile sensing only), HTT embeddings drive contact-rich policies far better than baselines that use the raw 6-D force (wrench) or joint positions (qpos): 95% success on toy-screw tightening (vs. 50% for wrench and 5% for qpos) and 55% on grasping tofu without crushing it (vs. 35% and 5%, respectively). The fingertip sensors were not seen during HTT pretraining.
- A step toward a general tactile backbone. The results suggest that paired heterogeneous pretraining is a practical path toward sensor-agnostic tactile representations for both perception and robot manipulation.
Resources
The dataset, code, and model checkpoints will be released; see our paper for details.
Citation
Please consider citing our paper if you build upon our results and ideas.
Jianxin Bi★, Qiang Wang, Jayaram Reddy★, Kelvin Lin★, Soibkhon Khajikhanov★, Ruihan Gao, and Harold Soh★, “Heterogeneous Tactile Transformer”, arXiv preprint
@misc{bi2026htt,
  title={Heterogeneous Tactile Transformer},
  author={Bi, Jianxin and Wang, Qiang and Reddy, Jayaram and Lin, Kelvin and Khajikhanov, Soibkhon and Gao, Ruihan and Soh, Harold},
  year={2026},
  eprint={2606.29948},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2606.29948}
}
Contact
If you have questions or comments, please contact Jianxin Bi or Harold Soh.