Range Membership Inference Attacks (2024)

Jiashu Tao
Department of Computer Science
National University of Singapore
Singapore
jiashut@comp.nus.edu.sg
Reza Shokri
Department of Computer Science
National University of Singapore
Singapore
reza@comp.nus.edu.sg

Abstract

Machine learning models can leak private information about their training data, but the standard methods to measure this risk, based on membership inference attacks (MIAs), have a major limitation. They only check whether a given data point exactly matches a training point, neglecting the potential of similar or partially overlapping data revealing the same private information. To address this issue, we introduce the class of range membership inference attacks (RaMIAs), which test whether the model was trained on any data in a specified range (defined based on the semantics of privacy). We formulate the RaMIA game and design a principled statistical test for its composite hypotheses. We show that RaMIAs can capture privacy loss more accurately and comprehensively than MIAs on various types of data, including tabular, image, and text data. RaMIA paves the way for more comprehensive and meaningful privacy auditing of machine learning algorithms.

1 Introduction

Machine learning models are prone to training data memorization [14, 15, 27, 41, 24]. It is also known that the outstanding predictive performance of machine learning models on long-tailed data distributions often comes at the expense of blatant memorization of certain data points [15, 3, 30, 16]. In simple terms, memorization is the phenomenon that models behave differently on training points compared to other points. Memorization can lead to significant privacy risks, as adversaries can infer private information about the training data from black-box access to models alone.

To quantify the privacy risk of machine learning models, a privacy notion needs to be fixed first. The reigning privacy notion is defined by membership information. Membership information of a data point is binary, but this single bit carries huge privacy implications. Being able to infer membership information opens up the possibility of data reconstruction attacks [36, 17, 4, 29], in which the reconstruction attack inspects the membership of plausible data points to recover the training set. The de-facto way to audit privacy risk under this notion is to conduct membership inference attacks (MIAs) [39], where an adversary aims to predict whether a given query point is part of the training set of the target model. The more powerful the membership inference attack, the higher the privacy risk the target model bears.

Membership inference attacks provide a lower bound on the true privacy risk of a model, so improving attack performance also means tightening the bound of privacy risk estimation. So far, the community has put much effort into improving the power of membership inference attacks by crafting better membership signals and constructing better statistical tests [38, 39, 35, 44, 5, 47]. While these efforts have been useful for the betterment of privacy auditing, they have ignored a fundamental drawback of membership inference attacks as a practical privacy auditing tool: MIAs assume it is a privacy concern only if the adversary can identify the exact, full version of the training data. However, if the adversary can identify data points that are similar enough to training points, this should also be treated as a significant privacy risk, because those points can contain similar levels of private information. For example, two photos of Alice taken from slightly different angles, or with different backgrounds, would contain similar private information about Alice's face or location. Similarly, small perturbations or rephrasings of textual data barely affect the sensitivity of the information conveyed [13]. This oversight means private information leakage beyond exact matches of training data is ignored, and current privacy auditing tools might produce overly optimistic results.

Besides, focusing on exact membership inference renders MIAs incapable of handling queries with missing values. This is another major limitation, as data records with the same sensitive features and a few missing non-private features carry a similar level of private information as the full records. Imagine an adversary who can infer that an Asian person of age 25 with identification number 123456 is in a hospital's training set: there is no need to identify the remaining features, because the adversary can already pinpoint who has HIV from this subset of key features. Even if the key identifiers are missing and unknown, it is still a grave privacy threat if the attacker can infer the remaining features, which often contain quasi-identifiers; this has been studied extensively in the failures of k-anonymity [37], where the attacker is able to reconstruct and identify training data with certain features removed. However, if we make up the missing values or pass noisy data to a membership inference attack, the attack is expected to output "not a member", since the chance of guessing the right values is slim. If we use property inference [40, 1, 8, 9, 45, 20] to help fill in the missing features, the imputed data may have inflated membership scores even if they are non-members, leading to a higher false positive rate. Therefore, existing frameworks struggle to quantify privacy leakage under missing features.

We argue that privacy quantification should not be point-based, because a small neighborhood around training points also contains similar private information. Hence, in this paper, we propose a new attack framework, called range membership inference attacks (RaMIAs), to better capture the notion of privacy. Instead of using point queries and testing for exact matches, range membership inference attacks use range queries that cover a set of points. The goal of a range membership inference attack is to infer whether the given range query contains any training point.

Range membership inference attacks extend the formulation of membership inference attacks. We adapt the original inference game formulation to reflect the change to range queries in RaMIAs. This extended formulation produces composite hypotheses in the likelihood ratio tests, which are the standard and strongest attack techniques in MIAs [38, 44, 5, 47]. Our method is based on standard statistical methods for composite hypothesis testing, namely generalized likelihood ratio tests and Bayes factors. We show that RaMIAs provide a more comprehensive notion of privacy by detecting private information leakage from the vicinity of training data where MIAs underestimate such privacy risk. Specifically, we observe that a simple horizontal flip can cause the membership score to drop from very high to 0 (Figure 1), and the overall AUC can drop by 0.20 if we test image classifiers with horizontally flipped images (Figure 2(c)). RaMIA, implemented with our simple attack strategy (Sec 4), outperforms MIA by at least 5% on image datasets (Fig 3(b), 3(c)), providing better privacy auditing at the cost of as few as 15 samples, which is insignificant compared to the dimensionality of the data space.

In this paper, we emphasize the motivation and formulation of our newly proposed attack framework, RaMIA. As a proof of concept, we evaluate RaMIA with a simple attack strategy on tabular, image and text datasets, where it consistently outperforms MIA. Additionally, our attack can potentially be used in the pioneering membership inference and data extraction attacks on generative models [4, 43, 7], where the current evaluation requires finding the closest training image for every candidate and computing their distances. By setting the distance function as the range function and conducting RaMIAs, such attacks can be evaluated more systematically.


2 Preliminaries

2.1 Membership inference attacks

The membership inference attack (MIA) [39] is an inference attack against machine learning models that infers whether a given data sample is part of the model's training set. Mathematically, given a model $f$ and a query point $\mathbf{x}$, the MIA aims to output 1 if $\mathbf{x}$ is a training point, and 0 otherwise. There are various ways to construct and conduct the attack, and it remains an active research direction in which ever more powerful attacks are developed. Shokri et al. [39] use a shadow-model-based approach, where shadow models are trained on known training sets in a similar way to the target model. Confidence values of the training and test data on the shadow models are computed, and then used as benchmarks at test time. However, the high cost and the strong assumption of knowing the target model's training details often make the attack infeasible. Yeom et al. [45] use the model loss as a signal and threshold it, removing the need for shadow models. MIA has since been formulated as an inference game (see Sec 3.1.1), and researchers have turned to principled approaches that solve the game via likelihood ratio tests [38, 5, 44, 47]. Carlini et al. [5] and Ye et al. [44] propose reference-model-based approaches, where signals on the target model are compared to those obtained on reference models to form the likelihood ratio. To further boost the attack power, Zarifzadeh et al. [47] assume the attacker has access to a pool of population data, so that the likelihood ratio from reference-based attacks can be calibrated against ratios obtained on (non-member) population data.

Recent attacks [4, 47] further boost attack performance on image data by augmenting the test queries with train-time augmentations. This assumes that the attacker knows the exact train-time augmentations in advance and is able to sample from them. Augmenting training images with non-train-time augmentations is not considered, for a valid reason: those augmented images would be non-members under the current privacy notion.
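For concreteness, the simplest of the signals above, the loss-thresholding attack of Yeom et al. [45], can be sketched as follows; the toy model, squared loss, and threshold are illustrative stand-ins, not the original implementation:

```python
def loss_threshold_mia(loss_fn, model, x, y, tau):
    """Predict 'member' (1) iff the model's loss on (x, y) is below tau.

    A minimal sketch of loss-thresholding: memorized training points
    tend to have lower loss than unseen points.
    """
    return int(loss_fn(model(x), y) < tau)

# Toy illustration: a "model" that has memorized the point x = 1.0.
model = lambda x: 1.0 if x == 1.0 else 0.0
sq_loss = lambda pred, y: (pred - y) ** 2

print(loss_threshold_mia(sq_loss, model, 1.0, 1.0, 0.5))  # 1 (member)
print(loss_threshold_mia(sq_loss, model, 2.0, 1.0, 0.5))  # 0 (non-member)
```

Stronger attacks replace the raw loss with calibrated scores (e.g., reference-model likelihood ratios), but the decision structure stays the same.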

2.2 Range queries

If we draw a connection to the field of databases, a membership inference attack operates on point queries, or exact-match queries: each query to the membership inference attack contains a single data point, and the attack only concerns whether this very point is in the training set. A range query, on the other hand, which is also a common querying operation in database systems, retrieves all data points that fall into the "range". The fundamental difference from a point query is that the retrieved result often contains multiple data points instead of a single one. Our proposed attack, the range membership inference attack, operates with range queries.

3 From MIA to RaMIA

Membership inference attacks are often formulated as an inference game [45, 21, 44, 5, 47] between a challenger and an adversary. In this section, we walk through how RaMIA is derived from MIA.

3.1 Membership inference attacks

In membership inference attacks, the goal is to identify if a given point is part of the training set.

3.1.1 Membership inference game

Definition 1

(Membership Inference Game [44, 45]) Let $\pi$ be the data distribution, and let $\mathcal{T}$ be the training algorithm.

  1. The challenger samples a training dataset $D \overset{s_D}{\longleftarrow} \pi$, and trains a model $\theta \longleftarrow \mathcal{T}(D)$.

  2. The challenger samples a data record $z_0 \overset{s_{z_0}}{\longleftarrow} \pi$ from the data distribution, and a training data record $z_1 \overset{s_{z_1}}{\longleftarrow} D$.

  3. The challenger flips a fair coin to get the bit $b \in \{0, 1\}$, and sends the target model $\theta$ and the data record $z_b$ to the adversary.

  4. The adversary gets access to the data distribution $\pi$ and to the target model, and outputs a bit $\hat{b} \longleftarrow \mathcal{A}(\theta, z_b)$.

  5. If $\hat{b} = b$, output 1 (success). Otherwise, output 0.

3.1.2 Evaluation of MIA

Evaluation is done with a set of training and test points. The true positive rate (TPR) and false positive rate (FPR) are computed by sweeping over all possible threshold values. By plotting the receiver operating characteristic (ROC) curve, the power of an attack strategy can be represented by the area under the curve (AUC). A clueless adversary who can only randomly guess the membership labels gets an AUC of 0.5. Stronger adversaries predict membership more accurately at each error level; hence they achieve a higher TPR at each FPR, and a higher AUC.
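The AUC obtained from this threshold sweep equals the probability that a randomly chosen member outranks a randomly chosen non-member, which gives a compact way to compute it. A minimal sketch with hypothetical membership scores:

```python
import numpy as np

def attack_auc(member_scores, nonmember_scores):
    """AUC of a score-based attack: the probability that a random member
    receives a higher score than a random non-member (ties count 0.5).
    Equivalent to sweeping every threshold and integrating the ROC curve."""
    m = np.asarray(member_scores, float)[:, None]
    n = np.asarray(nonmember_scores, float)[None, :]
    return float((m > n).mean() + 0.5 * (m == n).mean())

print(attack_auc([1.0, 1.0], [1.0, 1.0]))  # 0.5: clueless attack
print(attack_auc([0.9, 0.8], [0.2, 0.1]))  # 1.0: perfect separation
```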

3.1.3 Intrinsic limitation of MIA as a Privacy Auditing Framework

MIAs are intrinsically incapable of identifying points close to training points, regardless of how similar they are, because these points are, by definition, non-members in the scope of MIAs. Hence, there is a huge space of points that contain private information but are deemed non-members in the current privacy auditing framework. As a result, MIAs become unreliable privacy auditing tools when the queries move away from the original data: Figure 2 shows that MIAs under-perform on non-original data. This inspires our formulation of RaMIA, where these points are classified as "members" for better and more comprehensive privacy auditing.


3.2 Range membership inference attack

In range membership inference attacks, the goal is to identify if a given range contains any training point.

3.2.1 Range membership inference game

Here we define our range membership inference game, modified from the above formulation.

Definition 2

(Range Membership Inference Game) Let $\pi$ be the data distribution, and let $\mathcal{T}$ be the training algorithm.

  1. The challenger samples a training dataset $D \overset{s_D}{\longleftarrow} \pi$, and trains a model $\theta \longleftarrow \mathcal{T}(D)$.

  2. The challenger samples a data record $z_0 \overset{s_{z_0}}{\longleftarrow} \pi$ from the data distribution, and a training data record $z_1 \overset{s_{z_1}}{\longleftarrow} D$.

  3. The challenger flips a fair coin to get the bit $b \in \{0, 1\}$. If $b = 1$, the challenger samples a range $\mathcal{R}_1$ containing at least one training point. Otherwise, the challenger samples a range $\mathcal{R}_0$ containing no training points.

  4. The challenger sends the target model $\theta$ and the range $\mathcal{R}_b$ to the adversary.

  5. The adversary gets access to the data distribution $\pi$ and to the target model, and outputs a bit $\hat{b} \longleftarrow \mathcal{A}(\theta, \mathcal{R}_b)$.

  6. If $\hat{b} = b$, output 1 (success). Otherwise, output 0.

The main difference between the two games is that the adversary now receives a range query (Step 4 in Def 2) instead of a point query (Step 3 in Def 1). We assume that the adversary is able to sample within the range. This is a reasonable assumption because the adversary is usually assumed to be able to sample from the original data distribution $\pi$ [39, 44, 47]. Given a range query and a sampler of $\pi$, it is not difficult to sample within the range.

What is a range

A range can be defined by a center (a point), a radius representing the size of the range, and a distance function with which the radius is defined. We refer to the center as the query center, the radius as the range size, and the distance function as the range function in this paper. One way to visualize a range is to imagine a unit $l_2$ ball around a point $x$, then replace the radius and the $l_2$ distance with any choice of range size and range function. Our framework can cater to any range function: spatial (e.g., $l_p$ distances), transformation-based (e.g., geometric transformations), or semantic (e.g., owner/main features of the data). In the experiment section, we present results with all of these types of range functions. Note that our attack reduces to user-level inference [31, 23, 27, 11, 10] when the range function is user-based.
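For instance, a spatial range under the $l_2$ range function can be checked as follows; the training set, query centers and range sizes below are toy values for illustration:

```python
import numpy as np

def in_range(x, center, radius, dist=lambda a, b: np.linalg.norm(a - b)):
    """True iff point x falls in the range defined by a query center,
    a range size (radius), and a range function (here the l2 distance;
    any other distance can be substituted via `dist`)."""
    return dist(np.asarray(x, float), np.asarray(center, float)) <= radius

def range_contains_training_point(D, center, radius, **kw):
    """Ground-truth range membership: does any training point lie in the range?"""
    return any(in_range(z, center, radius, **kw) for z in D)

D = [np.array([0.0, 0.0]), np.array([3.0, 4.0])]   # toy training set
print(range_contains_training_point(D, center=[3.0, 3.5], radius=1.0))   # True
print(range_contains_training_point(D, center=[10.0, 10.0], radius=1.0)) # False
```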

How to construct a range

In Step 3 of the range membership inference game, the details of how the challenger samples the ranges are intentionally omitted. This is because the ranges can be constructed around either in-distribution or out-of-distribution data points for both in- and out-ranges. The details of how we construct the in- and out-ranges for our experiments are elaborated in Appendix B.

3.3 Evaluation of RaMIA

Similarly, we evaluate RaMIA with AUCs. However, the notions of true positives and false positives differ from those in MIA, as both are defined at the range level. To avoid confusion, we call them Range TPR and Range FPR, meaning that a range is correctly/wrongly predicted to contain at least one training point.

4 Range membership inference attacks

4.1 Composite hypothesis testing

Similar to the likelihood ratio tests for the membership inference game (Appendix A.1), we can construct two hypotheses for the range membership inference game (Def 2):

$H_0$: None of the points in the given range are from the training set: $\forall z \in \mathcal{R}_b : z \notin D$.

$H_1$: There is at least one point in the given range from the training set: $\exists z \in \mathcal{R}_b \text{ s.t. } z \in D$.

The likelihood ratio in this case is $\frac{\mathbb{P}(\theta|H_1)}{\mathbb{P}(\theta|H_0)}$. Note that the alternative hypothesis $H_1$ is composite, because it is a union of multiple hypotheses $\bigcup_{z_i \leftarrow \mathcal{R}_b}(z_i \in D)$. Therefore, we need methods tailored to composite hypothesis testing. Two are commonly used: the Bayes factor [22] and the generalized likelihood ratio test (GLRT) [42]. The Bayes factor replaces the composite hypothesis with a simple one that is representative of its hypothesis class: it models the "parameter" of the hypothesis with a prior distribution and computes the expected likelihood under that prior. In this case, $\mathbb{P}(\theta|H_1)$ is computed as $\int_{x \in \mathcal{R}_1} \mathbb{P}(\theta|x \in D)\,\mathbb{P}(x)\,dx$. The GLRT instead takes the maximum over all values the composite hypothesis can achieve: in this case, $\mathbb{P}(\theta|H_1)$ is computed as $\max_{x \in \mathcal{R}_1} \mathbb{P}(\theta|x \in D)$.

To make full use of the Bayes Factor, we need to know the prior distribution, which is unrealistic. On the other hand, taking the max seems to be a more intuitive approach for range membership inference attacks, because it provides a two-step solution: search and test. Searching for the points with the highest membership score is conceptually equivalent to identifying the points that are most likely to be training points. However, this assumes that we can reliably find the max values in a given range. Since most ranges are large data subspaces, it is very challenging to find the optimal points within the large space. Even if the search space can be navigated, any search algorithm is likely to return local maxima. Hence, a robust way is to aggregate the top samples. However, membership inference attacks are known to be unreliable on out-of-distribution (OOD) data [47] and can assign them high scores. When the sampling space contains only these data as opposed to real and in-distribution (ID) data, the maximum might not be anything close to true training data, increasing the FPR and lowering the AUC as a result.

In this paper, we adopt a simple attack strategy, chosen based on the type of data in the sampling space. If the data are all naturally ID, we can combine the Bayes factor and the GLRT by taking the average likelihood of the top samples, to reduce the influence of randomness in the sampling process. On the other hand, if the adversary can only synthesize data within the range, the top samples are highly likely to be OOD data with high membership scores. Hence, in this case, we want to remove those points from the equation. Since the presence of training points intuitively raises the average membership score of nearby ID points, compared to having no training points at all, we average the membership scores of the remaining samples. Unifying these two strategies gives us the following:

$$\mathbb{P}(\theta|H_1) = \text{TrimmedAvg}(S, q_s, q_e; \mathbb{P}) = \text{Avg}_{x \notin [q_s, q_e]\text{-th quantiles}}\, \mathbb{P}(\theta|x \in D), \qquad (1)$$

where $S$ is the sampled set, and $q_s$ and $q_e$ mark the start and end of the quantile band we remove, making the statistic a one-sided trimmed mean. If the sampling space is filled with synthetic data, the chance that the top samples are false positives is high, so we set $q_e = 100$ to remove the largest scores; $q_s$ is a hyperparameter that decreases (trimming more) as the quality of the sampled points gets worse. On the other hand, if the sampling space consists of real points, we set $q_s = 0$ to remove the smallest scores in our aggregation; $q_e$ decreases (trimming less) as the number of real samples decreases, to offset the high variance due to the limited samples available. Note that the optimal hyperparameters may differ across membership signals (e.g., loss values, LiRA scores), as they exploit different vulnerabilities and expose different training points. However, for fixed model architectures, range functions, data distributions and sampling methods, these hyperparameters can be determined with reference models, similar to the offline version of RMIA [47]. Specifically, by randomly choosing a reference model as a temporary target model, we can run RaMIAs using all other reference models while sweeping these hyperparameters.
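A minimal sketch of this trimmed average, assuming percentile ranks are assigned over the sorted scores (an implementation detail not fixed by Eqn 1):

```python
import numpy as np

def trimmed_avg(scores, q_s, q_e):
    """One-sided trimmed mean of Eqn 1: drop every score whose percentile
    rank falls inside [q_s, q_e], and average the rest. With q_e = 100 the
    suspect top scores are dropped (synthetic samples); with q_s = 0 the
    uninformative bottom scores are dropped (real ID samples)."""
    s = np.sort(np.asarray(scores, dtype=float))
    # Percentile rank of each sorted score, from 0 to 100.
    ranks = 100.0 * np.arange(len(s)) / max(len(s) - 1, 1)
    kept = s[(ranks < q_s) | (ranks > q_e)]
    return float(kept.mean())

print(trimmed_avg([1.0, 2.0, 3.0, 4.0], 50, 100))  # 1.5: top half trimmed
print(trimmed_avg([1.0, 2.0, 3.0, 4.0], 0, 50))    # 3.5: bottom half trimmed
```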

4.2 Range membership inference attack as a framework

The range membership inference attack is a new inference attack framework, not a particular attack algorithm. There are two components in this framework: a sampler and a membership tester, both of which are necessary to compute the range membership score formulated in Eqn 1. The sampler $\text{Sampler}(\mathcal{R})$ returns samples within the given range $\mathcal{R}$. The membership tester $\text{MIA}(x)$ is a (point-query) membership inference algorithm that outputs a membership score, which can be used to approximate $\mathbb{P}(\theta|x)$. A number of existing attack algorithms can be plugged in. Similar to MIAs, the key to using RaMIA as a privacy auditing tool is to compute the range membership score. Our framework can adopt any existing membership scoring function $\text{MIA}(x)$ to compute $\text{RaMIA}(\mathcal{R})$. Below, we outline the attack with our attack strategy described above:

1: Input: range $\mathcal{R}$, sampler $\text{Sample}(\cdot)$, target model $\theta$, membership scoring function $\text{MIA}(\cdot)$.

2: Sample an attack set $S \overset{n}{\longleftarrow} \text{Sample}(\mathcal{R})$;

3: if samples are real and ID then

4:  Set $q_s = 0$, and set $q_e$ by sweeping on reference models;

5: else

6:  Set $q_e = 100$, and set $q_s$ by sweeping on reference models.

7: end if

8: $\text{RaMIA}(\mathcal{R}; \theta) = \text{TrimmedAvg}(S, q_s, q_e; \text{MIA})$
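The outline above can be sketched end to end; the sampler and point-level scoring function below are toy stand-ins for a real range sampler and an attack such as RMIA, and the hyperparameter values are arbitrary:

```python
import numpy as np

def ramia_score(range_query, sampler, mia_score, n, q_s, q_e):
    """Range membership score following the outline: sample n points inside
    the range, score each with a point-level MIA, and aggregate with the
    one-sided trimmed mean (drop scores whose percentile rank lies in
    [q_s, q_e]). `sampler` and `mia_score` are supplied by the auditor."""
    scores = np.sort([mia_score(sampler(range_query)) for _ in range(n)])
    ranks = 100.0 * np.arange(n) / max(n - 1, 1)
    return float(scores[(ranks < q_s) | (ranks > q_e)].mean())

# Toy usage: a "range" of integers 0..9 in which only x == 7 was memorized.
rng = np.random.default_rng(0)
sampler = lambda r: int(rng.integers(r[0], r[1]))
score_fn = lambda x: 1.0 if x == 7 else 0.1
in_range_score = ramia_score((0, 10), sampler, score_fn, n=50, q_s=0, q_e=25)
out_range_score = ramia_score((10, 20), sampler, score_fn, n=50, q_s=0, q_e=25)
# With high probability the in-range score exceeds the out-range score,
# flagging the first range as containing a training point.
```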

5 Experiments

Since the purpose of this paper is to introduce a new concept and framework, the goal of the experiments section is to provide a proof of concept. We experiment on the commonly used Purchase-100 [39], CelebA [28], CIFAR-10 [25] and AG News [48] datasets. Details of how we split the datasets, train models, construct ranges, and obtain samples are given in Appendix B. We compare RaMIA with MIA (Sec 5.1) in scenarios where MIA under-performs (depicted in Figure 2). Both attacks are built upon the state-of-the-art attack algorithm, the robust membership inference attack (RMIA) [47], with three reference models trained in the same way as in Carlini et al. [5] and Zarifzadeh et al. [47]. The respective queries are outlined in Table 1, and the definitions of members under each attack framework are given in Table 2. In both tables, $x$ represents original data in the datasets, while $x'$ is either data with missing values or data modified from $x$. The reason we do not test our attacks with ranges centered at original data $x$ is that the chance of the attack data being exactly the same as the training data is extremely low without sufficient prior knowledge; it is more realistic that similar data are queried.

Table 1: Range and point queries used in our experiments.

| Dataset | Range query | Point query |
| --- | --- | --- |
| Purchase-100 | possible data records given the incomplete data $x'$ | mode-imputed $x'$ |
| CelebA | photos featuring the same person as photo $x'$ | photo $x'$ |
| CIFAR-10 | transformed versions of image $x'$ | image $x'$ |
| AG News | sentences within Hamming distance 8 of sentence $x'$ | sentence $x'$ |

Table 2: Definitions of members under each attack framework.

| Dataset | Range member if there is at least... | (Point) member if |
| --- | --- | --- |
| Purchase-100 | one training point matching $x'$ on all unmasked columns | $x'_{\text{impu}}$ is a member |
| CelebA | one training image featuring the same person as $x'$ | $x'$ is a member |
| CIFAR-10 | one version of image $x'$ in the training set | $x'$ is a member |
| AG News | one training sentence within Hamming distance 8 of $x'$ | $x'$ is a member |

On Purchase-100, we take 20 samples in every range, and set $q_e = 100$, $q_s = 45$. On CIFAR-10, we apply up to 15 distinct transforms, and set $q_e = 100$, $q_s = 40$. On AG News, we construct 50 sentences within each range, and set $q_e = 100$, $q_s = 20$. On CelebA, each celebrity has a different number of images in the sampling space, ranging from 1 to 18. Since it is hard to standardize the sample size across ranges, we take all of them, and set $q_s = 0$ and $q_e = 25$, which means nothing is trimmed for ranges with very few available samples.

5.1 RaMIAs quantify privacy risks more comprehensively than MIAs

As explained above, data points that are merely close to the training data fall outside the scope of membership inference attacks. Figure 3 shows that range membership inference attacks are better at identifying such nearby points, and thus provide more comprehensive privacy auditing on all four datasets we tested. The gain is remarkable considering how few samples were taken relative to the range sizes. On Purchase-100, there are a total of 1024 candidates per range, and we sample fewer than 20% of them. On AG News, there are millions of sentences within a distance of 8; 50 sentences are far too few to meaningfully cover the space. Yet even these limited samples lead to noticeable gains, which further shows that the current privacy quantification approach is flawed and needs a better framework. Due to randomness in sampling, we report the average gain of RaMIA over MIA with standard deviation in Table 4. TPRs at small FPRs are given in Table 3.


5.2 Factors affecting RaMIA performance

Training data density in the range

Due to the nature of our sampling-based approach, the chance of the attack set containing a true training point scales linearly with the density of training points in the range. If we keep the sample size constant, enlarging the range without including more training points hurts attack performance, because the chance of the attack set including any training point gets diluted. Conversely, increasing the training-point density, which is equivalent to increasing the probability that the attacker samples a true training point, boosts attack performance. Figure 4(b) shows that RaMIA performance increases as the range grows in the CIFAR-10 experiment. Recall that the range function in CIFAR-10 is based on image augmentations: enlarging the range means the attacker applies more distinct augmentations to obtain transformed images, which increases the chance of obtaining one of the transformed versions of training images seen by the model during training, and thus leads to better attack performance. In Figure 3(b), we conducted the attack assuming the attacker cannot sample any true training images. As a sanity check, we relax this assumption: Figure 4(a) shows that RaMIA performs monotonically better as the density of training images increases from 0% to 50%, with the number of samples held constant.

Susceptibility to MIAs and RaMIAs is correlated

Ranges containing training points that are susceptible to MIAs are also more susceptible to RaMIAs. Researchers have previously discovered that machine learning models memorize duplicated data more strongly [26, 6]. In our CelebA dataset, each celebrity has a different number of photos in the training set, which can be viewed as each identity having a different level of duplication. Consistent with the insights from MIAs, we observe that identities with more training images, i.e. a higher duplication rate, are more susceptible to RaMIA. Figure 8 shows the relationship between the percentile of each range's RaMIA score among non-members' RaMIA scores and the duplication rate. Generally speaking, identities with more training photos are more prone to RaMIAs. A similar correlation can be observed on the other three datasets in our experiments, where the training points' RaMIA score percentiles among non-members are positively correlated with their MIA score percentiles (Figure 7).

5.3 Mismatched training and attack data hurts attack performance

Figure 2(c) shows that MIA underestimates the privacy risk when the augmentations used in training and attacking differ. This is concerning, as privacy audits of image classifiers are often conducted with original images even though the classifiers are commonly trained with a composition of augmentations. Many transformations, such as color jittering and affine transformations, always produce different final images, and other common augmentations, such as random cropping, add further randomness to the pipeline. Hence, it is almost certain that the original images are never seen by the model, and RaMIA should be used for a better auditing result (Figure 3(c)).

Difference to existing augmentation-based MIAs

Existing attacks [4, 47] also use augmented queries, but with a different rationale and different assumptions about the attacker's knowledge. Since they adopt the existing privacy notion based on point queries, only (augmented) images seen by the model during training are considered members. Hence, the attacker needs to know the exact train-time augmentations and augment images accordingly so as not to violate this notion of privacy. In RaMIA, the set of augmentations is given by the challenger (Def 2) and can contain augmentations not used in training but still considered privacy-leaking. Using the aggregation method of [4] hurts attack performance when non-training augmentations are used, whereas RaMIA is designed to be robust in this scenario (Fig 3(c)).

6 Conclusion

In this paper, we argue that membership inference attacks are only useful as a privacy auditing tool when querying exact copies of training and test data: moving the query to similar points causes a drastic drop in performance, rendering MIAs much less useful. We conclude that MIAs fail to comprehensively capture the notion of privacy, and propose a new class of inference attacks, RaMIAs, that extend MIAs and cover their failure cases by testing whether a given range contains a training point. RaMIA is an attack framework that can be implemented on top of any existing MIA algorithm, and we show that it provides better privacy auditing with very few randomly drawn samples. We hope our work makes more privacy researchers and practitioners aware of the shortcomings of MIAs and shifts their attention to RaMIAs. As this is the first paper to introduce the framework, there is room for improvement in specific attack algorithms; for example, a better sampling process would surely widen the gap between RaMIA and MIA. Nevertheless, we have shown that our framework is sensible and useful. In future work, we hope to design more powerful RaMIA strategies that are robust to changes in membership signals and datasets, especially on LLMs, where we believe our privacy notion is extremely relevant.

References

  • Ateniese et al. [2015] G. Ateniese, L. V. Mancini, A. Spognardi, A. Villani, D. Vitali, and G. Felici. Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers. International Journal of Security and Networks, 10(3):137–150, 2015.
  • Bradbury et al. [2018] J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/google/jax.
  • Brown et al. [2021] G. Brown, M. Bun, V. Feldman, A. Smith, and K. Talwar. When is memorization of irrelevant training data necessary for high-accuracy learning? In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 123–132, 2021.
  • Carlini et al. [2021] N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson, et al. Extracting training data from large language models. In 30th USENIX Security Symposium (USENIX Security 21), pages 2633–2650, 2021.
  • Carlini et al. [2022a] N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, and F. Tramer. Membership inference attacks from first principles. In 2022 IEEE Symposium on Security and Privacy (SP), pages 1897–1914. IEEE, 2022a.
  • Carlini et al. [2022b] N. Carlini, D. Ippolito, M. Jagielski, K. Lee, F. Tramer, and C. Zhang. Quantifying memorization across neural language models. arXiv preprint arXiv:2202.07646, 2022b.
  • Carlini et al. [2023] N. Carlini, J. Hayes, M. Nasr, M. Jagielski, V. Sehwag, F. Tramer, B. Balle, D. Ippolito, and E. Wallace. Extracting training data from diffusion models. In 32nd USENIX Security Symposium (USENIX Security 23), pages 5253–5270, 2023.
  • Chase et al. [2021] M. Chase, E. Ghosh, and S. Mahloujifar. Property inference from poisoning. arXiv preprint arXiv:2101.11073, 2021.
  • Chaudhari et al. [2023] H. Chaudhari, J. Abascal, A. Oprea, M. Jagielski, F. Tramer, and J. Ullman. Snap: Efficient extraction of private properties with poisoning. In 2023 IEEE Symposium on Security and Privacy (SP), pages 400–417. IEEE, 2023.
  • Chen et al. [2023a] G. Chen, Y. Zhang, and F. Song. SLMIA-SR: Speaker-level membership inference attacks against speaker recognition systems. arXiv preprint arXiv:2309.07983, 2023a.
  • Chen et al. [2023b] M. Chen, Z. Zhang, T. Wang, M. Backes, and Y. Zhang. FACE-AUDITOR: Data auditing in facial recognition systems. In 32nd USENIX Security Symposium (USENIX Security 23), pages 7195–7212, 2023b.
  • Devlin et al. [2018] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
  • Duan et al. [2024] M. Duan, A. Suri, N. Mireshghallah, S. Min, W. Shi, L. Zettlemoyer, Y. Tsvetkov, Y. Choi, D. Evans, and H. Hajishirzi. Do membership inference attacks work on large language models? arXiv preprint arXiv:2402.07841, 2024.
  • Feldman [2019] V. Feldman. Does learning require memorization? A short tale about a long tail. arXiv preprint arXiv:1906.05271, 2019.
  • Feldman and Zhang [2020] V. Feldman and C. Zhang. What neural networks memorize and why: Discovering the long tail via influence estimation. Advances in Neural Information Processing Systems, 33:2881–2891, 2020.
  • Garg and Roy [2023] I. Garg and K. Roy. Memorization through the lens of curvature of loss function around samples. arXiv preprint arXiv:2307.05831, 2023.
  • Hilprecht et al. [2019] B. Hilprecht, M. Härterich, and D. Bernau. Monte Carlo and reconstruction membership inference attacks against generative models. Proceedings on Privacy Enhancing Technologies, 2019.
  • Honnibal et al. [2020] M. Honnibal, I. Montani, S. Van Landeghem, and A. Boyd. spaCy: Industrial-strength Natural Language Processing in Python. 2020. doi: 10.5281/zenodo.1212303.
  • Hu et al. [2021] E. J. Hu, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, et al. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2021.
  • Jayaraman and Evans [2022] B. Jayaraman and D. Evans. Are attribute inference attacks just imputation? In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pages 1569–1582, 2022.
  • Jayaraman et al. [2021] B. Jayaraman, L. Wang, K. Knipmeyer, Q. Gu, and D. Evans. Revisiting membership inference under realistic assumptions. Proceedings on Privacy Enhancing Technologies, 2021(2), 2021.
  • Jeffreys [1939] H. Jeffreys. Theory of Probability. 1939.
  • Kandpal et al. [2023] N. Kandpal, K. Pillutla, A. Oprea, P. Kairouz, C. A. Choquette-Choo, and Z. Xu. User inference attacks on large language models. arXiv preprint arXiv:2310.09266, 2023.
  • Kim et al. [2023] Y. I. Kim, P. Agrawal, J. O. Royset, and R. Khanna. On memorization and privacy risks of sharpness aware minimization. arXiv preprint arXiv:2310.00488, 2023.
  • Krizhevsky et al. [2009] A. Krizhevsky et al. Learning multiple layers of features from tiny images. 2009.
  • Lee et al. [2021] K. Lee, D. Ippolito, A. Nystrom, C. Zhang, D. Eck, C. Callison-Burch, and N. Carlini. Deduplicating training data makes language models better. arXiv preprint arXiv:2107.06499, 2021.
  • Liu et al. [2021] F. Liu, T. Lin, and M. Jaggi. Understanding memorization from the perspective of optimization via efficient influence estimation. arXiv preprint arXiv:2112.08798, 2021.
  • Liu et al. [2018] Z. Liu, P. Luo, X. Wang, and X. Tang. Large-scale CelebFaces Attributes (CelebA) dataset. Retrieved August, 15(2018):11, 2018.
  • Long et al. [2023] Y. Long, Z. Ying, H. Yan, R. Fang, X. Li, Y. Wang, and Z. Pan. Membership reconstruction attack in deep neural networks. Information Sciences, 634:27–41, 2023.
  • Lukasik et al. [2023] M. Lukasik, V. Nagarajan, A. S. Rawat, A. K. Menon, and S. Kumar. What do larger image classifiers memorise? arXiv preprint arXiv:2310.05337, 2023.
  • Mahloujifar et al. [2021] S. Mahloujifar, H. A. Inan, M. Chase, E. Ghosh, and M. Hasegawa. Membership inference on word embedding and beyond. arXiv preprint arXiv:2106.11384, 2021.
  • Mangrulkar et al. [2022] S. Mangrulkar, S. Gugger, L. Debut, Y. Belkada, S. Paul, and B. Bossan. PEFT: State-of-the-art parameter-efficient fine-tuning methods. https://github.com/huggingface/peft, 2022.
  • Paszke et al. [2019] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
  • Radford et al. [2019] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
  • Sablayrolles et al. [2019] A. Sablayrolles, M. Douze, C. Schmid, Y. Ollivier, and H. Jégou. White-box vs black-box: Bayes optimal strategies for membership inference. In Proceedings of the 36th International Conference on Machine Learning (ICML'19), pages 5558–5567, 2019.
  • Salem et al. [2020] A. Salem, A. Bhattacharya, M. Backes, M. Fritz, and Y. Zhang. Updates-Leak: Data set inference and reconstruction attacks in online learning. In 29th USENIX Security Symposium (USENIX Security 20), pages 1291–1308, 2020.
  • Samarati and Sweeney [1998] P. Samarati and L. Sweeney. Generalizing data to provide anonymity when disclosing information. In PODS, volume 98, pages 10–1145, 1998.
  • Sankararaman et al. [2009] S. Sankararaman, G. Obozinski, M. I. Jordan, and E. Halperin. Genomic privacy and limits of individual detection in a pool. Nature Genetics, 41(9):965–967, 2009.
  • Shokri et al. [2017] R. Shokri, M. Stronati, C. Song, and V. Shmatikov. Membership inference attacks against machine learning models (S&P'17). 2017.
  • Suri and Evans [2022] A. Suri and D. Evans. Formalizing and estimating distribution inference risks. Proceedings on Privacy Enhancing Technologies, 2022.
  • Tirumala et al. [2022] K. Tirumala, A. Markosyan, L. Zettlemoyer, and A. Aghajanyan. Memorization without overfitting: Analyzing the training dynamics of large language models. Advances in Neural Information Processing Systems, 35:38274–38290, 2022.
  • Van Trees [1968] H. Van Trees. Detection, Estimation, and Modulation Theory. Part 1: Detection, Estimation, and Linear Modulation Theory. 1968.
  • Wu et al. [2022] Y. Wu, N. Yu, Z. Li, M. Backes, and Y. Zhang. Membership inference attacks against text-to-image generation models. 2022.
  • Ye et al. [2022] J. Ye, A. Maddi, S. K. Murakonda, V. Bindschaedler, and R. Shokri. Enhanced membership inference attacks against machine learning models. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pages 3093–3106, 2022.
  • Yeom et al. [2018] S. Yeom, I. Giacomelli, M. Fredrikson, and S. Jha. Privacy risk in machine learning: Analyzing the connection to overfitting. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF), pages 268–282. IEEE, 2018.
  • Zagoruyko and Komodakis [2016] S. Zagoruyko and N. Komodakis. Wide residual networks. In British Machine Vision Conference 2016. British Machine Vision Association, 2016.
  • Zarifzadeh et al. [2023] S. Zarifzadeh, P. C.-J. M. Liu, and R. Shokri. Low-cost high-power membership inference by boosting relativity. 2023.
  • Zhang et al. [2015] X. Zhang, J. Zhao, and Y. LeCun. Character-level convolutional networks for text classification. Advances in Neural Information Processing Systems, 28, 2015.

Appendix A Attack Details

A.1 (Simple) Hypothesis testing

The standard way to tackle the inference game (Def 1) is to apply statistical hypothesis tests [44, 5]:

$H_0$: The given $z$ is not a training point ($b=0$).
$H_1$: The given $z$ is a training point ($b=1$).

The likelihood ratio test (LRT) is then conducted

$\dfrac{\mathbb{P}(\theta \mid H_1)}{\mathbb{P}(\theta \mid H_0)}$   (2)

This is usually called "simple" hypothesis testing because each $H$ contains a single hypothesis.
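For intuition, a common parametric instantiation of this test (as in attacks such as [5]) models a membership signal like the loss as Gaussian under each hypothesis and compares the two densities. The sketch below is illustrative only; the Gaussian assumption and all parameter values are ours.

```python
import math

def gaussian_lrt(observed_loss, mu_in, sigma_in, mu_out, sigma_out):
    """Likelihood ratio P(loss | member) / P(loss | non-member), modelling
    the target point's loss as Gaussian under each hypothesis. Parameters
    are illustrative; in practice they are fit from reference models."""
    def normal_pdf(x, mu, sigma):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    return normal_pdf(observed_loss, mu_in, sigma_in) / normal_pdf(observed_loss, mu_out, sigma_out)

# A low loss is more consistent with membership, so the ratio exceeds 1.
ratio = gaussian_lrt(0.1, mu_in=0.2, sigma_in=0.1, mu_out=1.0, sigma_out=0.5)
```

The attacker predicts "member" when the ratio exceeds a threshold chosen for the desired false-positive rate.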

A.2 Attack algorithms

In this section, we explain the details of the membership inference attack algorithms used in our experiments.

LOSS

LOSS [45] computes loss values as a proxy of the membership score for given points: $\text{MIA}(x;\theta) = \ell(x;\theta)$. To obtain a likelihood, one can simply take the exponential of the negative loss: $P = e^{-\ell}$.
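A minimal sketch of this scoring rule (the function name is ours):

```python
import math

def loss_attack_score(loss):
    """LOSS attack: exp(-loss) as a likelihood proxy for membership;
    lower loss maps to a higher score."""
    return math.exp(-loss)

# Lower loss -> higher membership likelihood.
assert loss_attack_score(0.1) > loss_attack_score(2.0)
```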

RMIA

RMIA [47] computes the membership score by applying the chain rule: $\mathbb{P}(\theta \mid x) = \frac{\mathbb{P}(x \mid \theta)\,\mathbb{P}(\theta)}{\mathbb{P}(x)}$. The score is then compared with all available population data points to obtain the fraction of population points dominated by the given point: $\mathbb{P}_{z \in Z}\!\left[\frac{\mathbb{P}(\theta \mid x)}{\mathbb{P}(\theta \mid z)} \geq \gamma\right]$, where the $\mathbb{P}(\theta)$ terms cancel out. The normalizing constant $\mathbb{P}(x)$ is computed with reference models: $\mathbb{P}(x) = 0.5\,\mathbb{E}_{\theta_{\text{in}}}\mathbb{P}(x \mid \theta_{\text{in}}) + 0.5\,\mathbb{E}_{\theta_{\text{out}}}\mathbb{P}(x \mid \theta_{\text{out}})$. In its offline version, the in-models are unavailable; in this case, the former probability is approximated from the latter as $\mathbb{P}_{\text{in}} = \alpha\,\mathbb{P}_{\text{out}} + (1-\alpha)$. The hyperparameter $\alpha$ is chosen based on the reference models: one reference model is chosen as a temporary target model, the rest are used to attack it, and $\alpha$ is set to the best-performing value found by a simple sweep. In our experiments, we use the offline attack only. The $\alpha$ values for Purchase-100 and CIFAR-10 are taken from [47]; for CelebA we set it to 0.33, and for AG News to 1.0.
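A rough sketch of the offline score computation, following our reading of the formulas above (not the authors' implementation; function and variable names are ours, and `a` stands for the hyperparameter α):

```python
import numpy as np

def rmia_score(p_x_target, p_x_out, pz_target, pz_out, a=0.33, gamma=1.0):
    """Offline RMIA sketch. p_x_target = P(x | target model); p_x_out =
    P(x | out reference models); pz_* are the same quantities for a list of
    population points z. P(x) uses the offline approximation
    P_in = a * P_out + (1 - a)."""
    def likelihood_ratio(p_target, p_out_models):
        p_out = np.mean(p_out_models)
        p_in = a * p_out + (1 - a)        # offline stand-in for the in-models
        p_x = 0.5 * p_in + 0.5 * p_out    # normalizing constant P(x)
        return p_target / p_x
    ratio_x = likelihood_ratio(p_x_target, p_x_out)
    ratio_z = np.array([likelihood_ratio(pt, po) for pt, po in zip(pz_target, pz_out)])
    # Score: fraction of population points dominated by x.
    return float(np.mean(ratio_x / ratio_z >= gamma))
```

A point the target model assigns much higher probability than the reference models do will dominate most population points and receive a score near 1.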

Appendix B Setup Details

On each dataset, we train four models on half of the data in the same way described by Carlini et al. [5] and Zarifzadeh et al. [47]. We describe the details of the datasets below. We have checked their licenses to the best of our effort, and confirm their terms of use are respected.

B.1 Tabular data: Purchase-100

Dataset

Purchase-100 [39] is a tabular dataset derived from Kaggle's Acquire Valued Shoppers Challenge (https://www.kaggle.com/c/acquire-valued-shoppers-challenge/data). It was first curated by Shokri et al. [39] such that there are 600 binary features, each indicating whether the person represented by the row has purchased a given product. The data is divided into 100 classes, and the task is to predict a person's category given their purchase history.

Models

We train five-layer multi-layer perceptron (MLP) models in PyTorch [33] on half of the entire dataset. The hidden layers have sizes [512, 256, 128, 64]. All models achieve a test accuracy of 83%.

Construction of ranges

We simulate the scenario where the attacker has incomplete data (data with missing values). For each training and test record, we randomly mask $k$ columns. Each row with masked columns defines a range query containing $2^k$ possible points, as each feature is binary. We then check whether each range constructed from a test point includes any training point; if so, it is re-labelled as an "in-range".

Sampling within ranges

Since this dataset contains 600 independent binary features, we perform independent Bernoulli sampling for all missing columns. The sampler's parameter for each column is that column's average value. Because of the nature of this dataset, our sampled data can be regarded as in-distribution. We take 19 samples for each range, plus the point obtained by mode imputation (filling in the missing values with the column modes).
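The sampling procedure for one incomplete record can be sketched as follows (illustrative code; `sample_range` and its arguments are our own naming, not the paper's released code):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_range(record, masked_cols, col_means, n_samples=19):
    """Sample candidate points from the range defined by an incomplete
    binary record: fill each masked column with an independent Bernoulli
    draw parameterized by that column's population mean, then append the
    mode-imputed point, giving n_samples + 1 candidates in total."""
    samples = []
    for _ in range(n_samples):
        cand = record.copy()
        for c in masked_cols:
            cand[c] = rng.random() < col_means[c]   # Bernoulli(col mean)
        samples.append(cand)
    # Mode imputation: fill each masked column with its majority value.
    mode = record.copy()
    for c in masked_cols:
        mode[c] = int(col_means[c] >= 0.5)
    samples.append(mode)
    return samples
```

Unmasked columns are left untouched, so every candidate agrees with the query on all observed features, as the range definition requires.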

B.2 Image data I: CelebA

Dataset

CelebA [28], also known as the CelebFaces Attributes dataset, contains 202,599 face images from 10,177 celebrities, each annotated with 40 binary facial features. We construct the members set by only including photos of celebrities with identity number smaller than 5090. The rest are used to construct the non-members set. For each celebrity in the members set, half of the photos are put into the training set, while the other half goes into the holdout set.

Model

We train four-layer convolutional neural networks (CNNs) in PyTorch [33] on the training set to predict the facial attributes of a given photo. Our target model has a test accuracy of 87%.

Construction of ranges

The range function here is a semantic one that is based on the identity of the face image. For example, a range query can be "all Alice’s photos". Since the identities in the training and non-members set are disjoint, it is easy to construct in- and out-ranges based on the distribution of identities in the two sets.

Sampling within ranges

For each range query, we curate all images in the holdout set that share the same identity as the range center to construct our sample set.

B.3 Image data II: CIFAR-10

Dataset

CIFAR-10 [25] is a popular image classification dataset. There are 50,000 training images, each of size (32, 32, 3).

Models

We train WideResNet-28-2 models [46] with JAX [2] on half of the CIFAR-10 training set using the code from [5], with and without image augmentations. Our target model trained without augmentation achieves a test accuracy of 83% on the CIFAR-10 test set, and the target model trained with augmentation achieves 92%. The train-time augmentation is the composition of random flipping, cropping, and random hue.

Construction of ranges

The range function here is based on different types of image augmentations (geometric and color transformations). An example range query is "all transformed versions of image $X$". For each training and test image, a range is constructed by applying different transformations.

Sampling within ranges

For each range query, we independently apply 15 image augmentations to the query center. The augmentations include flipping, random rotation, random resizing and cropping, random contrast, brightness, and hue adjustments, and compositions of these.
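A self-contained sketch of how a range can be populated with transformed copies of one image. For brevity this NumPy version implements only horizontal flips and padded random crops; the actual pipeline uses the richer augmentations listed above (e.g. via torchvision), and the function name and probabilities are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_transforms(image, n_samples=15):
    """Populate a range with randomly transformed copies of a (32, 32, 3)
    query image: each sample applies a horizontal flip and/or a random
    28x28 crop padded back to 32x32, each with probability 0.5."""
    samples = []
    for _ in range(n_samples):
        out = image
        if rng.random() < 0.5:
            out = out[:, ::-1]                       # horizontal flip
        if rng.random() < 0.5:
            y, x = rng.integers(0, 5, size=2)        # random crop offset
            crop = out[y:y + 28, x:x + 28]
            out = np.pad(crop, ((2, 2), (2, 2), (0, 0)), mode="edge")
        samples.append(out.copy())
    return samples
```

Each sample stays the same shape as the query, so all candidates can be scored by the target model with a single batched forward pass.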

B.4 Textual data: AG News

Dataset

We use the AG News dataset [48], a news collection with four categories. Although it was introduced as a text classification dataset, we disregard the labels and treat it as a text generation dataset. Its training set contains 120,000 sentences.

Model

We take pretrained GPT-2 [34] models from Hugging Face's transformers library and finetune them on half of AG News' training set with LoRA [19], implemented in Hugging Face's PEFT [32] library. The finetuning is done for 4 epochs. Our target model achieves a perplexity of 1.39 on the AG News test set.

Construction of ranges

The range function here is word-level Hamming distance, which can be thought of as an edit distance measured at the word level that only allows substitutions. An example range query is "all sentences within Hamming distance $d$ of sentence $x$". To construct in- and out-ranges, we only need to specify the maximum Hamming distance and the starting sentence. We construct the starting sentences by randomly masking $\alpha$ words of the training and test sentences and filling in the masks with a pre-trained BERT [12] model, so they have a distance of $\alpha$ to the original training/test sentences. A Hamming distance is then specified with each starting sentence to form a range.
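Word-level Hamming distance itself is straightforward to compute (substitutions only, so it is defined only for equal-length sentences):

```python
def word_hamming(s1, s2):
    """Word-level Hamming distance: the number of positions at which two
    equal-length sentences differ, i.e. the number of word substitutions
    needed to turn one into the other."""
    w1, w2 = s1.split(), s2.split()
    if len(w1) != len(w2):
        raise ValueError("Hamming distance requires equal-length sentences")
    return sum(a != b for a, b in zip(w1, w2))

assert word_hamming("the cat sat on the mat", "the dog sat on a mat") == 2
```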

Sampling within ranges

We mask $k$ words of the range center, where $k$ is the Hamming distance specified by the range, and then use BERT [12] to replace each mask with one of its top predictions.

Appendix C Implementation details

For all PyTorch models, we use the Adam optimizer with a learning rate of 0.001. For WideResNets, we use the training code from [4]. On AG News, the models are trained for 4 epochs; on the other datasets, for 100 epochs.

All training is done on two Nvidia RTX 3090 GPUs. Training on AG News takes about 1 hour per epoch; training each of the other models takes less than one hour.

Appendix D RaMIA on redacted data

Many large language models (LLMs) are trained on sensitive textual data, and some of this data may be publicly available with the sensitive information redacted. Similar to our experiment with missing values, we can apply RaMIA to redacted data to identify which records were used to train a target LLM. Accurately identifying the redacted sentences paves the way for reconstructing them in a follow-up attack. Figure 5 shows the results. In this experiment, we use spaCy [18] to mask people's names, simulating the masking of personally identifiable information (PII). We then generate 10 possible sentences for each masked sentence using BERT and conduct RaMIA. The MIA performance is the average attack performance over all 10 candidate sentences.
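The masking step can be sketched as below. To keep the example dependency-free, the PERSON spans are passed in explicitly; in our pipeline spaCy's named-entity recognizer produces them (entities whose label is "PERSON"). The function name is ours.

```python
def mask_person_names(text, person_spans):
    """Replace each detected PERSON span, given as (start, end) character
    offsets (e.g. from spaCy NER), with a [MASK] token for BERT to fill."""
    out, last = [], 0
    for start, end in sorted(person_spans):
        out.append(text[last:start])
        out.append("[MASK]")
        last = end
    out.append(text[last:])
    return "".join(out)

masked = mask_person_names("Alice met Bob in Singapore.", [(0, 5), (10, 13)])
# Each name becomes a [MASK] token; BERT then proposes candidate fillings.
```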


Appendix E Extra results

This section contains additional experimental results.

Table 3: TPR (%) at low FPR for MIA and RaMIA. CelebA results carry no standard deviation because all available samples are used (no sampling randomness).

Attack        | Purchase-100 (1% / 0.1%) | CIFAR-10 (1% / 0.1%)   | CelebA (1% / 0.1%) | AG News (1% / 0.1%)
MIA (LOSS)    | 0 / 0                    | 0.15 / 0               | 1.86 / 0.31        | 0.08 / 0
MIA (RMIA)    | 2.18 / 0.37              | 2.40 / 0.21            | 1.69 / 0.19        | 0.30 / 0
RaMIA (LOSS)  | 0±0 / 0±0                | 1.17±0.06 / 0.05±0.03  | 1.40 / 0.28        | 1.10±0.11 / 0±0
RaMIA (RMIA)  | 2.57±1.58 / 0.57±0.47    | 3.59±0.11 / 0.80±0.02  | 1.44 / 0.22        | 0.54±0.12 / 0±0

Table 4: Average AUC gain (ΔAUC, %) of RaMIA over MIA.

      | Purchase-100 | CIFAR-10  | CelebA | AG News
ΔAUC  | 2.62±0.04    | 4.12±0.06 | 5.4    | 1.20±0.2