The most up-to-date list is available on Google Scholar.
2024
-
ChocoLlama: Lessons Learned From Teaching Llamas Dutch
Matthieu Meeus, Anthony Rathé, François Remy, Pieter Delobelle, Jens-Joris Decorte, and Thomas Demeester
arXiv preprint arXiv:2412.07633, 2024
TLDR; We continue pretraining Llama-2/3 on Dutch data and release a family of six open-source LLMs. In the paper, we elaborate on what we learned along the way (modifying the tokenizer, using LoRA at scale for language adaptation, pretraining versus post-training, benchmarking).
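For intuition only, here is a minimal sketch of LoRA-based continued pretraining on Dutch text with Hugging Face transformers and peft. The base model name, corpus file, target modules, and hyperparameters are illustrative assumptions, not the ChocoLlama training setup.

```python
# Hedged sketch: continued pretraining of a Llama-style model on Dutch text with LoRA.
# Base model, corpus file, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"                      # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token              # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach LoRA adapters to the attention projections (an assumption; the paper
# discusses applying LoRA at scale for language adaptation).
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Hypothetical Dutch corpus, tokenized for causal language modeling.
dutch = load_dataset("text", data_files={"train": "dutch_corpus.txt"})["train"]
dutch = dutch.map(lambda b: tokenizer(b["text"], truncation=True, max_length=1024),
                  batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-dutch-lora", num_train_epochs=1,
                           per_device_train_batch_size=1, gradient_accumulation_steps=16,
                           learning_rate=2e-4),
    train_dataset=dutch,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```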
-
SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It)
Matthieu Meeus, Igor Shilov, Shubham Jain, Manuel Faysse, Marek Rei, and Yves-Alexandre Montjoye
arXiv preprint arXiv:2406.17975, 2024
TLDR; We present an SoK of recent developments in MIAs against LLMs. We discuss how the field has evolved, show that popular evaluation setups are flawed, and examine solutions going forward.
-
Mosaic Memory: Fuzzy Duplication in Copyright Traps for Large Language Models
Igor Shilov, Matthieu Meeus, and Yves-Alexandre Montjoye
arXiv preprint arXiv:2405.15523, 2024
-
Lost in the Averages: A New Specific Setup to Evaluate Membership Inference Attacks Against Machine Learning Models
Florent Guépin, Nataša Krčo, Matthieu Meeus, and Yves-Alexandre Montjoye
arXiv preprint arXiv:2405.15423, 2024
-
Did the Neurons Read your Book? Document-level Membership Inference for Large Language Models
Matthieu Meeus, Shubham Jain, Marek Rei, and Yves-Alexandre Montjoye
In 33rd USENIX Security Symposium (USENIX Security 24), 2024
TLDR; Given a pretrained LLM and a document, can we infer whether the document was used to train the LLM? We collect documents known to have been used to train the LLM (members) and documents made available only after the model release date (non-members). We then query the LLM on both members and non-members for token-level probabilities and train a classifier to predict binary membership. Spoiler: it’s harder than you think!
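As a rough illustration of the general recipe (not the paper’s exact features or classifier), the sketch below queries a stand-in LLM for token-level log-probabilities over a document, aggregates them into a few summary statistics, and trains a binary membership classifier. The model name, feature choice, and classifier are assumptions.

```python
# Hedged sketch of the general recipe, not the paper's exact pipeline.
# Model, features, and classifier are illustrative assumptions.
import numpy as np
import torch
from sklearn.ensemble import RandomForestClassifier
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")            # stand-in for the target LLM
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def doc_features(text: str) -> np.ndarray:
    """Token-level log-probabilities aggregated into document-level statistics."""
    ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)["input_ids"]
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_lp = logprobs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze().numpy()
    return np.array([token_lp.mean(), token_lp.std(),
                     np.percentile(token_lp, 10), np.percentile(token_lp, 90)])

members = ["text of a document known to be in the training data"]        # hypothetical
non_members = ["text of a document published after the model release"]   # hypothetical

X = np.stack([doc_features(d) for d in members + non_members])
y = np.array([1] * len(members) + [0] * len(non_members))
clf = RandomForestClassifier().fit(X, y)   # predicts document-level membership
```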
-
Copyright Traps for Large Language Models
Matthieu Meeus, Igor Shilov, Manuel Faysse, and Yves-Alexandre Montjoye
In Forty-first International Conference on Machine Learning, 2024
TLDR; We add copyright traps to original content: highly unique sequences designed so that, if an LLM is trained on them, we can tell from how the model reacts to the injected trap. We inject a variety of traps into the pretraining dataset of the real-world 1.3B CroissantLLM, trained from scratch, and find that copyright traps indeed make the content detectable.
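To give a flavour of the detection step, here is a simplified sketch (not the paper’s methodology): compare the suspect model’s perplexity on an injected trap sequence against reference sequences that were never injected. The model, trap text, and references are placeholders.

```python
# Hedged sketch of the detection intuition. A markedly lower perplexity on the trap
# than on comparable, never-injected references hints the trap was trained on.
# Model, trap text, and references are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # stand-in for the suspect LLM
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        loss = model(ids, labels=ids).loss          # mean token-level cross-entropy
    return float(torch.exp(loss))

trap = "hypothetical highly unique synthetic trap sequence injected into the content"
references = ["a comparable but never-injected sequence",
              "another held-out sequence of similar length"]

ratio = perplexity(trap) / (sum(perplexity(r) for r in references) / len(references))
print("trap/reference perplexity ratio:", ratio)    # << 1 suggests memorization of the trap
```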
2023
-
Achilles’ Heels: Vulnerable Record Identification in Synthetic Data Publishing
Matthieu Meeus, Florent Guépin, Ana-Maria Creţu, and Yves-Alexandre Montjoye
In European Symposium on Research in Computer Security, 2023
TLDR; We audit the privacy risk of synthetic tabular data through Membership Inference Attacks (MIAs). Since we are most concerned about the worst-case risk, we propose a method to identify the most at-risk records in a dataset. We show that our vulnerable record identification method significantly outperforms the ad-hoc outlier detection mechanisms used previously.
-
Synthetic Is All You Need: Removing the Auxiliary Data Assumption for Membership Inference Attacks Against Synthetic Data
Florent Guépin, Matthieu Meeus, Ana-Maria Creţu, and Yves-Alexandre Montjoye
In European Symposium on Research in Computer Security, 2023
TLDR; Membership Inference Attacks (MIAs) against synthetic data typically assume the attacker has access to auxiliary data from the same distribution as the real training data. In practice, this is not very realistic, especially for the use cases typically suggested for synthetic data. We examine what happens to MIA performance when the released synthetic data itself replaces the auxiliary dataset in shadow-modeling-based MIAs. Spoiler: MIAs still work, but with a substantial drop in performance compared to using real auxiliary data.
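A toy sketch of the idea, under heavy assumptions (a Gaussian-mixture “generator”, simple distance features, randomly drawn stand-in data) rather than the paper’s actual attack: shadow generators are trained on samples of the released synthetic data with and without the target record, and a meta-classifier learns to tell the two apart.

```python
# Hedged sketch: shadow-modeling MIA against synthetic tabular data, using the
# released synthetic data itself as the auxiliary dataset. Generator, features,
# and sizes are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
released_synthetic = rng.normal(size=(2000, 5))   # stand-in for the published synthetic data
target = rng.normal(size=(1, 5))                  # record whose membership we want to infer

def generate(train: np.ndarray, n: int = 500) -> np.ndarray:
    """Shadow 'generator': fit a Gaussian mixture and sample a synthetic dataset."""
    gm = GaussianMixture(n_components=5, random_state=0).fit(train)
    return gm.sample(n)[0]

def features(synth: np.ndarray, record: np.ndarray) -> np.ndarray:
    """Simple distance-based features of the synthetic data around the target record."""
    d = np.linalg.norm(synth - record, axis=1)
    return np.array([d.min(), d.mean(), np.sort(d)[:10].mean()])

X, y = [], []
for _ in range(50):                               # shadow generators
    base = released_synthetic[rng.choice(len(released_synthetic), 500, replace=False)]
    for label, train in [(0, base), (1, np.vstack([base, target]))]:
        X.append(features(generate(train), target))
        y.append(label)

meta = RandomForestClassifier().fit(np.array(X), np.array(y))
# Apply the meta-classifier to the actually released synthetic data.
print("P(member):", meta.predict_proba(features(released_synthetic, target).reshape(1, -1))[0, 1])
```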
-
Concerns about using a digital mask to safeguard patient privacy
Matthieu Meeus, Shubham Jain, and Yves-Alexandre Montjoye
Nature Medicine, 2023
TLDR; A widely covered Nature paper introduces a Digital Mask (DM), an ‘anonymization’ algorithm to be applied to facial images of patients. Reportedly, the mask irreversibly erases all identifiable features while retaining the information necessary for medical diagnosis. We show that their setup for evaluating the anonymization provided by the DM is seriously flawed, and that in a proper setup the risk of identification increases by 100x.