TR2024-166

MEL-PETs Defense for the NeurIPS 2024 LLM Privacy Challenge Blue Team Track


    •  Liu, J., Wang, Y., Koike-Akino, T., Nakai, T., Oonishi, K., Higashi, T., "MEL-PETs Defense for the NeurIPS 2024 LLM Privacy Challenge Blue Team Track", LLM Privacy Challenge at Neural Information Processing Systems (NeurIPS) 2024, December 2024.
      @inproceedings{Liu2024dec,
        author = {Liu, Jing and Wang, Ye and Koike-Akino, Toshiaki and Nakai, Tsunato and Oonishi, Kento and Higashi, Takuya},
        title = {MEL-PETs Defense for the NeurIPS 2024 LLM Privacy Challenge Blue Team Track},
        booktitle = {LLM Privacy Challenge at Neural Information Processing Systems (NeurIPS) 2024},
        year = 2024,
        month = dec,
        url = {https://www.merl.com/publications/TR2024-166}
      }
  • Research Areas:

    Artificial Intelligence, Machine Learning

Abstract:

We propose a simple yet effective defense for the NeurIPS 2024 LLM Privacy Challenge. Our strategy unlearns the personally identifiable information (PII) in the fine-tuning data and uses the system prompt to guard against attackers who attempt to extract PII through text continuation. The proposed defense reduces the Attack Success Rate (ASR) of the baseline attack to 0.06% while maintaining the utility of the model.
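
To make the ASR figure concrete, here is a minimal Python sketch of how an attack success rate for a text-continuation attack might be computed. It is not the challenge's official scoring code. It assumes an attack counts as successful when the target PII string appears verbatim (case-insensitively) in the model's continuation; the function name `attack_success_rate` and the sample strings are hypothetical.

```python
def attack_success_rate(continuations, secrets):
    """Return the fraction of attack attempts whose continuation leaks the target PII.

    Assumption (not from the paper): an attempt succeeds when the target PII
    string occurs, case-insensitively, anywhere in the model's continuation.
    """
    if len(continuations) != len(secrets):
        raise ValueError("continuations and secrets must have the same length")
    if not secrets:
        return 0.0
    hits = sum(
        1
        for text, pii in zip(continuations, secrets)
        if pii.lower() in text.lower()
    )
    return hits / len(secrets)


# Hypothetical model outputs for four attack prompts, paired with the PII each one targets.
continuations = [
    "You can reach me at alice@example.com anytime.",
    "I cannot share personal information.",
    "I cannot share personal information.",
    "Sorry, I can't help with that.",
]
secrets = ["alice@example.com", "bob@example.com", "555-0100", "Carol Smith"]

asr = attack_success_rate(continuations, secrets)
print(f"ASR = {asr:.2%}")
```

Under this definition, an ASR of 0.06% corresponds to roughly 6 successful extractions per 10,000 attack attempts.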

 

  • Related News & Events

    •  AWARD    MERL Wins Awards at NeurIPS LLM Privacy Challenge
      Date: December 15, 2024
      Awarded to: Jing Liu, Ye Wang, Toshiaki Koike-Akino, Tsunato Nakai, Kento Oonishi, Takuya Higashi
      MERL Contacts: Toshiaki Koike-Akino; Jing Liu; Ye Wang
      Research Areas: Artificial Intelligence, Machine Learning, Information Security
      Brief
      • The Mitsubishi Electric Privacy Enhancing Technologies (MEL-PETs) team, a collaboration between MERL and Mitsubishi Electric researchers, won two awards at the NeurIPS 2024 Large Language Model (LLM) Privacy Challenge: the 3rd Place Award in the Blue Team track and the Special Award for Practical Attack in the Red Team track.