Data Mining: The Textbook


ψ_r(T, s_i, s_j): Probability that the rth position in sequence T corresponds to state s_i, and the (r + 1)th position corresponds to state s_j.

γ_r(T, s_i): Probability that the rth position in sequence T corresponds to state s_i.

The EM procedure starts with a random initialization of the model parameters and then alternates between estimating (α(·), β(·), ψ(·), γ(·)) from the model parameters and re-estimating the model parameters from those values. Specifically, the iteratively executed steps of the EM procedure are as follows:

(E-step) Estimate (α(·), β(·), ψ(·), γ(·)) from the currently estimated values of the model parameters (π(·), θ(·), p_··).
(M-step) Estimate the model parameters (π(·), θ(·), p_··) from the currently estimated values of (α(·), β(·), ψ(·), γ(·)).
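
The alternation above can be sketched in a few lines of Python. This is only an illustrative skeleton, not the book's implementation: the names `random_distribution`, `random_hmm`, `run_em`, `estimate_posteriors`, and `reestimate_parameters` are hypothetical, and the fixed iteration count is an assumption, since the text does not specify a stopping criterion.

```python
import random


def random_distribution(size, rng):
    """Return `size` positive random weights normalized to sum to 1."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def random_hmm(n_states, n_symbols, seed=0):
    """Randomly initialize (pi, P, theta) with every row a probability vector."""
    rng = random.Random(seed)
    pi = random_distribution(n_states, rng)                             # initial state probabilities
    P = [random_distribution(n_states, rng) for _ in range(n_states)]   # P[i][j] plays the role of p_ij
    theta = [random_distribution(n_symbols, rng) for _ in range(n_states)]  # theta[j][k]: state j emits symbol k
    return pi, P, theta


def run_em(params, estimate_posteriors, reestimate_parameters, n_iters=20):
    """Alternate the two steps for a fixed number of rounds (an assumed stopping rule)."""
    for _ in range(n_iters):
        posteriors = estimate_posteriors(params)    # E-step: (alpha, beta, psi, gamma) from parameters
        params = reestimate_parameters(posteriors)  # M-step: parameters from (alpha, beta, psi, gamma)
    return params
```

The two step functions are passed in as callables so that the skeleton stays independent of how each estimation is carried out; in practice they would close over the training sequence T.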

It now remains to explain how each of the above estimations is performed. The values of α(·) and β(·) can be estimated using the forward and backward procedures, respectively. The forward procedure is already described in the evaluation section, and the backward procedure is analogous to it, except that it works backward from the end of the sequence. The value of ψ_r(T, s_i, s_j) is equal to α_r(T, s_i) · p_ij · θ^j(a_{r+1}) · β_{r+1}(T, s_j) because the sequence-generation procedure can be divided into three portions: the portion up to position r, the generation of the (r + 1)th symbol, and the portion after the (r + 1)th symbol. The estimated values of ψ_r(T, s_i, s_j) are normalized to a probability vector by ensuring that the sum over different pairs [i, j] is 1. The value of γ_r(T, s_i) is estimated by summing the values of ψ_r(T, s_i, s_j) over fixed i and varying j. This completes the description of the E-step.
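
The E-step described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than a definitive implementation: symbols are encoded as integer indices, `P[i][j]` stands for p_ij, `theta[j][k]` stands for θ^j(σ_k), the function names are hypothetical, and the toy parameter values at the end are made up for the example. No rescaling is applied, so very long sequences would underflow. Following the text, γ is computed only for positions 1 through m − 1, where a ψ value exists.

```python
def forward(seq, pi, P, theta):
    """alpha[r][j]: probability of generating seq[0..r] and ending in state j."""
    n = len(pi)
    alpha = [[pi[j] * theta[j][seq[0]] for j in range(n)]]
    for r in range(1, len(seq)):
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * P[i][j] for i in range(n)) * theta[j][seq[r]]
            for j in range(n)
        ])
    return alpha


def backward(seq, P, theta):
    """beta[r][i]: probability of generating seq[r+1..] given state i at position r."""
    n, m = len(P), len(seq)
    beta = [[1.0] * n for _ in range(m)]  # nothing remains after the last position
    for r in range(m - 2, -1, -1):        # work backward from the end of the sequence
        for i in range(n):
            beta[r][i] = sum(
                P[i][j] * theta[j][seq[r + 1]] * beta[r + 1][j] for j in range(n)
            )
    return beta


def e_step(seq, pi, P, theta):
    """Return (alpha, beta, psi, gamma) for one sequence of symbol indices."""
    n = len(pi)
    alpha = forward(seq, pi, P, theta)
    beta = backward(seq, P, theta)
    psi, gamma = [], []
    for r in range(len(seq) - 1):
        # Three portions: up to position r, the (r+1)th symbol, and everything after it.
        raw = [
            [alpha[r][i] * P[i][j] * theta[j][seq[r + 1]] * beta[r + 1][j]
             for j in range(n)]
            for i in range(n)
        ]
        total = sum(map(sum, raw))
        normalized = [[v / total for v in row] for row in raw]  # sums to 1 over pairs [i, j]
        psi.append(normalized)
        gamma.append([sum(row) for row in normalized])          # fixed i, varying j
    return alpha, beta, psi, gamma


# Toy model (illustrative values only): two states, two symbols.
pi = [0.6, 0.4]
P = [[0.7, 0.3], [0.4, 0.6]]
theta = [[0.9, 0.1], [0.2, 0.8]]
seq = [0, 1, 1, 0]
alpha, beta, psi, gamma = e_step(seq, pi, P, theta)
```

Because each ψ_r is normalized over all pairs, each γ_r obtained by summing over j is itself a probability vector over the states.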

The re-estimation formulas for the model parameters in the M-step are relatively straightforward. Let I(a_r, σ_k) be a binary indicator function that takes on the value of 1 when the two symbols are the same, and 0 otherwise. Then the estimations can be performed as follows:

π(j) = γ_1(T, s_j), p_ij =
