Software & Data Downloads — RESTA_VLM

Experimental code for our recent paper on robust vision language models (VLMs), "Directional Embedding Smoothing for Robust Vision Language Models".

This repository provides the experimental code for our paper "Directional Embedding Smoothing for Robust Vision Language Models" by Ye Wang, Jing Liu, and Toshiaki Koike-Akino. The experiments investigate robust vision language models (VLMs) through an inference-time defense against multi-modal jailbreak attacks. The defense extends Randomized Embedding Smoothing and Token Aggregation (RESTA), which we developed in our earlier paper "Smoothed Embeddings for Robust Language Models" by Ryo Hase, Md Rafi Ur Rashid, Ashley Lewis, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, and Ye Wang. Note, however, that this repository covers only the experiments for the more recent VLM paper: RESTA is reimplemented here from scratch and generalized to support VLMs.
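As a rough illustration of the two ingredients named in RESTA (randomized embedding smoothing and token aggregation), the toy Python sketch below perturbs input embeddings several times and takes a majority vote over the predicted next token. It is not the repository's code. The function names, the majority-vote rule, the reading of "directional" as noise applied along each embedding's own direction, and the stand-in model `toy_next_token` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def perturb(embeddings, sigma, rng, directional=True):
    """Return a noisy copy of a list of embedding vectors (lists of floats)."""
    noisy = []
    for vec in embeddings:
        if directional:
            # Assumption: one scalar Gaussian draw per token, applied along
            # that token's own unit direction, so the vector is only rescaled.
            norm = math.sqrt(sum(x * x for x in vec)) or 1.0
            step = rng.gauss(0.0, sigma)
            noisy.append([x + step * x / norm for x in vec])
        else:
            # Isotropic alternative: independent noise in every coordinate.
            noisy.append([x + rng.gauss(0.0, sigma) for x in vec])
    return noisy


def majority_vote(tokens):
    """Return the most common token id; ties go to the first one seen."""
    return Counter(tokens).most_common(1)[0][0]


def smoothed_next_token(embeddings, next_token_fn, num_samples=8,
                        sigma=0.1, seed=0, directional=True):
    """Aggregate next-token predictions over several noisy embedding copies."""
    rng = random.Random(seed)
    votes = [next_token_fn(perturb(embeddings, sigma, rng, directional))
             for _ in range(num_samples)]
    return majority_vote(votes)


def toy_next_token(embeddings):
    """Hypothetical stand-in for a model forward pass: returns the index of
    the coordinate with the largest sum over all token embeddings."""
    dim = len(embeddings[0])
    sums = [sum(vec[d] for vec in embeddings) for d in range(dim)]
    return max(range(dim), key=lambda d: sums[d])
```

In a real setting, `next_token_fn` would wrap a model forward pass over both text and image embeddings, and the vote would be repeated for each generated token. The noise scale and number of samples used in the papers are not stated here, so the defaults above are placeholders.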

MERL Contacts

Related Publications
Wang, Y., Liu, J., Koike-Akino, T., "Directional Embedding Smoothing for Robust Vision Language Models", International Conference on Learning Representations (ICLR) Workshop on Agents in the Wild, April 2026.
TR2026-049
@inproceedings{Wang2026apr4,
  author = {Wang, Ye and Liu, Jing and Koike-Akino, Toshiaki},
  title = {{Directional Embedding Smoothing for Robust Vision Language Models}},
  booktitle = {International Conference on Learning Representations (ICLR) Workshop on Agents in the Wild},
  year = 2026,
  month = apr,
  url = {https://www.merl.com/publications/TR2026-049}
}
Hase, R., Rashid, M.R.U., Lewis, A., Liu, J., Koike-Akino, T., Parsons, K., Wang, Y., "Smoothed Embeddings for Robust Language Models", Safe Generative AI Workshop at Advances in Neural Information Processing Systems (NeurIPS), December 2024.
TR2024-170
@inproceedings{Ryo2024dec,
  author = {Hase, Ryo and Rashid, Md Rafi Ur and Lewis, Ashley and Liu, Jing and Koike-Akino, Toshiaki and Parsons, Kieran and Wang, Ye},
  title = {{Smoothed Embeddings for Robust Language Models}},
  booktitle = {Safe Generative AI Workshop at Advances in Neural Information Processing Systems (NeurIPS)},
  year = 2024,
  month = dec,
  publisher = {OpenReview},
  url = {https://www.merl.com/publications/TR2024-170}
}

Access software at https://github.com/merlresearch/RESTA_VLM.

Ye Wang
Jing Liu
Toshiaki Koike-Akino