Data Mining: The Textbook


Date	07.01.2024



20.3. PRIVACY-PRESERVING DATA PUBLISHING

patients has a much higher expected chance of having HIV than the base population.
In this context, a notion of Bayes optimal privacy exists, which ensures that the additional posterior information gained after the release of information is as small as possible. Unfortunately, Bayes optimal privacy is difficult to implement, both practically and computationally. The t-closeness model may be viewed as a practical, heuristic approach that pursues goals similar to those of Bayes optimal privacy. It does so by using distance functions between distributions. Informally, the goal is to create an anonymization such that the distance between the sensitive attribute distribution of each anonymized group and that of the base data is bounded by a user-defined threshold.

Definition 20.3.5 (t-closeness Principle) Let $P = (p_1 \ldots p_r)$ be a vector representing the fraction of the data records belonging to the $r$ different values of the sensitive attribute in an equivalence class. Let $Q = (q_1 \ldots q_r)$ be the corresponding fractional distribution in the full data set. Then, the equivalence class is said to satisfy t-closeness if the following is true, for an appropriately chosen distance function $\mathrm{Dist}(\cdot, \cdot)$:

$$\mathrm{Dist}(P, Q) \leq t \qquad (20.12)$$

An anonymized table is said to satisfy t-closeness, if all equivalence classes in it satisfy t-closeness.
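
As a minimal sketch of how this definition might be checked in practice, the following Python code computes the sensitive-attribute distribution of each equivalence class and compares it with the distribution over the full data set. The function names (`distribution`, `satisfies_t_closeness`), the representation of equivalence classes as lists of sensitive values, the toy records, and the `max_abs_diff` distance used in the demonstration are all illustrative assumptions and are not taken from the text. Any distance function can be passed in.

```python
from collections import Counter
from typing import Callable, List, Sequence


def distribution(values: Sequence[str], domain: Sequence[str]) -> List[float]:
    """Fraction of records taking each sensitive value in `domain`.

    Assumes `values` is non-empty.
    """
    counts = Counter(values)
    n = len(values)
    return [counts[v] / n for v in domain]


def satisfies_t_closeness(
    classes: Sequence[Sequence[str]],
    t: float,
    dist: Callable[[Sequence[float], Sequence[float]], float],
) -> bool:
    """True if every equivalence class satisfies Dist(P, Q) <= t."""
    # Q: distribution of the sensitive attribute over the full data set.
    all_values = [v for group in classes for v in group]
    domain = sorted(set(all_values))
    q = distribution(all_values, domain)
    # P: distribution within each equivalence class, compared against Q.
    return all(dist(distribution(group, domain), q) <= t for group in classes)


def max_abs_diff(p: Sequence[float], q: Sequence[float]) -> float:
    """Placeholder distance (largest per-value gap); an assumption for the demo."""
    return max(abs(a - b) for a, b in zip(p, q))


# Hypothetical toy data: each inner list holds the sensitive values
# of the records in one equivalence class.
classes = [
    ["HIV", "flu", "flu", "cold"],
    ["flu", "cold", "cold", "flu"],
]

loose = satisfies_t_closeness(classes, 0.2, max_abs_diff)
strict = satisfies_t_closeness(classes, 0.1, max_abs_diff)
```

Passing the distance as a parameter mirrors the definition, which leaves Dist(·, ·) open. In this toy data both classes are at distance 0.125 from the full-data distribution, so the table passes for t = 0.2 and fails for t = 0.1.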

The previous definition does not specify any particular distance function. There are many different ways to instantiate the distance function, depending on application-specific goals. Two common instantiations of the distance function are as follows:

Variational distance: This is simply equal to half the Manhattan distance between the two distribution vectors:

$$\mathrm{Dist}(P, Q) = \frac{\sum_{i=1}^{r} |p_i - q_i|}{2} \qquad (20.13)$$
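
The variational distance can be computed directly from the two distribution vectors. The sketch below is one possible Python rendering. The function name `variational_distance` and the example vectors are assumptions made for illustration.

```python
from typing import Sequence


def variational_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Half the Manhattan (L1) distance between two distribution vectors."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of values")
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical distributions over r = 3 sensitive values.
p = [0.5, 0.25, 0.25]   # equivalence class
q = [0.25, 0.25, 0.5]   # full data set
d = variational_distance(p, q)  # 0.5 * (0.25 + 0 + 0.25) = 0.25
```

For vectors that each sum to 1, this distance lies between 0 (identical distributions) and 1 (distributions with no values in common), which makes the threshold t easy to interpret.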
