TR2024-166
MEL-PETs Defense for the NeurIPS 2024 LLM Privacy Challenge Blue Team Track
"MEL-PETs Defense for the NeurIPS 2024 LLM Privacy Challenge Blue Team Track", LLM Privacy Challenge at Neural Information Processing Systems (NeurIPS) 2024, December 2024.
@inproceedings{Liu2024dec,
  author = {Liu, Jing and Wang, Ye and Koike-Akino, Toshiaki and Nakai, Tsunato and Oonishi, Kento and Higashi, Takuya},
  title = {MEL-PETs Defense for the NeurIPS 2024 LLM Privacy Challenge Blue Team Track},
  booktitle = {LLM Privacy Challenge at Neural Information Processing Systems (NeurIPS) 2024},
  year = 2024,
  month = dec,
  url = {https://www.merl.com/publications/TR2024-166}
}
Abstract:
We propose a simple yet effective defense method for the NeurIPS 2024 LLM Privacy Challenge. Our defense strategy involves unlearning the personally identifiable information (PII) in the fine-tuning data and using the system prompt to guard against attackers who use text continuation techniques to extract PII. The proposed defense significantly reduces the Attack Success Rate (ASR) of the baseline attack to 0.06% while maintaining the utility of the model.
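As a rough illustration of the ASR figure quoted above, the sketch below computes an attack success rate as the percentage of attack attempts whose model output contains the targeted PII string. The function name `attack_success_rate` and the verbatim substring-matching criterion are assumptions made for this example; the challenge's actual scoring procedure may differ.

```python
def attack_success_rate(outputs, secrets):
    """Return the percentage of attempts in which the targeted PII leaked.

    Assumption: an attempt counts as successful when the target PII string
    appears verbatim in the model output. This is an illustrative criterion,
    not necessarily the one used by the challenge organizers.
    """
    if len(outputs) != len(secrets):
        raise ValueError("outputs and secrets must have the same length")
    if not outputs:
        return 0.0
    # Count attempts where the secret shows up in the corresponding output.
    hits = sum(1 for out, pii in zip(outputs, secrets) if pii in out)
    return 100.0 * hits / len(outputs)


# Hypothetical example data: one of two attempts leaks its target.
example_outputs = ["You can call 555-0100 for details.", "I cannot share that."]
example_secrets = ["555-0100", "alice@example.com"]
example_asr = attack_success_rate(example_outputs, example_secrets)
```

Under this definition, an ASR of 0.06% corresponds to, for example, 6 successful extractions out of 10,000 attempts.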
Related News & Events
AWARD: MERL Wins Awards at NeurIPS LLM Privacy Challenge
Date: December 15, 2024
Awarded to: Jing Liu, Ye Wang, Toshiaki Koike-Akino, Tsunato Nakai, Kento Oonishi, Takuya Higashi
MERL Contacts: Toshiaki Koike-Akino; Jing Liu; Ye Wang
Research Areas: Artificial Intelligence, Machine Learning, Information Security
Brief: The Mitsubishi Electric Privacy Enhancing Technologies (MEL-PETs) team, a collaboration of MERL and Mitsubishi Electric researchers, won awards at the NeurIPS 2024 Large Language Model (LLM) Privacy Challenge. In the Blue Team track of the challenge, we won the 3rd Place Award, and in the Red Team track, we won the Special Award for Practical Attack.