6.9. Juft ravish (paired adverb) – RRCpl
6.10. Takror ravish (reduplicated adverb) – RRRep
6.11. Tub ravish (simple adverb) – RRD0
6.12. Yasama ravish (derived adverb) – RRD1
7. Bog‘lovchi (conjunction) – C
7.1. Sof bogʻlovchi (pure conjunction) – C0
7.2. Vazifadosh bogʻlovchi (functional conjunction) – C1
7.3. Teng bogʻlovchi (coordinating conjunction) – CEq (Equal)
7.4. Ergashtiruvchi bogʻlovchi (subordinating conjunction) – CS
8. Koʻmakchi (postposition) – II
8.1. Sof koʻmakchi (pure postposition) – II0
8.2. Vazifadosh koʻmakchi (functional postposition) – II1
9. Yuklama (particle) – Prt (Particle)
9.1. Soʻz yuklama (word particle) – PrtD
9.2. Qo‘shimchasimon yuklama (affix-like particle) – PrtComp
10. Modal so‘z (modal word) – MD
11. Undov soʻz (interjection) – UH
11.1. His-hayajon undovlari (emotional interjections) – UHEm (Emotion)
11.2. Buyruq-xitob undovlari (imperative-exclamatory interjections) – UHImp (Imperative)
12. Taqlid soʻzlar (imitative words) – IM (Imitative)
Using HMM to develop a POS Tagger
The words of the Uzbek language can be treated as observable events (given to us in the data), and the POS tags that correspond to those words as hidden states, so an HMM can be used to assign POS tags. Note that we refer to observed events as "observations" and to hidden states simply as "states".
The Hidden Markov Model has the following components:
Table 3. Components of HMM
Q = q_1, q_2, …, q_N – a set of N states (POS tags).
A = a_11, a_12, …, a_NN – a transition probability matrix, where each a_ij represents the probability of moving from state i to state j, such that ∑_j a_ij = 1 for every i.
O = o_1, o_2, …, o_T – a sequence of T observations, each drawn from a vocabulary V = v_1, v_2, …, v_V.
B = b_i(o_t) – a sequence of observation likelihoods (emission probabilities), each expressing the probability of observation o_t being generated from state i.
π = π_1, π_2, …, π_N – the initial probability distribution over the states; π_i is the probability that the Markov chain starts in state i, with ∑_i π_i = 1.
In that,
- Q is the set of possible tags;
- A is the matrix of tag transition probabilities P(t_i | t_i−1), i.e. the probability that the current tag follows the previous tag. Example: the value A[Verb][Noun] is calculated as P(Noun | Verb) = Count(Verb, Noun) / Count(Verb);
- O is the sequence of observations (the words of the sentence);
- B is the matrix of emission probabilities P(w_i | t_i), representing the probability that a given tag (say, a verb) is associated with a given word. For example, the emission probability B[Verb][pot] is calculated using the following formula: P(pot | Verb) = Count(pot, Verb) / Count(Verb).
It should be noted that the values of the Count() function mentioned above are taken from the tagged corpus (data) of the Uzbek language used to train the HMM model [16]; a count-based sketch of this estimation is given after Figure 7. The matrices "A" and "B" of the HMM model for the sentence “Bu yoqdagi odamlar menga juda yoqdi” ("I really liked the people here") look like this:
Figure 7. Transition and emission matrices in the HMM model
Here, the solid black lines in "A" represent the values of the transition matrix, and the dotted black lines in "B" represent the emission matrix for a system with Q = {MD, VB, NN}.
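A minimal sketch of how such transition and emission tables can be estimated from a tagged corpus with the Count() formulas above is given below; the toy corpus, the tag labels, and the function name estimate_hmm are illustrative assumptions, not the actual Uzbek corpus or implementation of [16]:

```python
from collections import defaultdict

def estimate_hmm(tagged_sentences):
    """Estimate transition (A) and emission (B) probabilities by counting."""
    tag_count = defaultdict(int)          # Count(tag)
    transition_count = defaultdict(int)   # Count(prev_tag, tag)
    emission_count = defaultdict(int)     # Count(word, tag)

    for sentence in tagged_sentences:
        prev_tag = "<s>"                  # sentence-start pseudo-tag
        tag_count[prev_tag] += 1
        for word, tag in sentence:
            tag_count[tag] += 1
            transition_count[(prev_tag, tag)] += 1
            emission_count[(word, tag)] += 1
            prev_tag = tag

    # A[prev, cur] = Count(prev, cur) / Count(prev)
    A = {(p, c): n / tag_count[p] for (p, c), n in transition_count.items()}
    # B[tag, word] = Count(word, tag) / Count(tag)
    B = {(t, w): n / tag_count[t] for (w, t), n in emission_count.items()}
    return A, B

# Illustrative toy corpus (tags are hypothetical, not gold annotations).
corpus = [
    [("Bu", "MD"), ("odamlar", "NN"), ("yoqdi", "VB")],
    [("odamlar", "NN"), ("keldi", "VB")],
]
A, B = estimate_hmm(corpus)
print(A[("NN", "VB")])     # P(VB | NN)
print(B[("VB", "yoqdi")])  # P(yoqdi | VB)
```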
Decoding using HMM
Suppose an HMM, consisting of a transition matrix and an emission matrix, and a sequence of observations O = o_1, o_2, …, o_T (the words of the corpus sentences) are given. From these values, we have to determine the most probable sequence of states Q = q_1, q_2, …, q_T (POS tags). In order to decode a sequence of tags using the HMM, two main assumptions are made:
- the probability of the current word appearing depends only on its own tag and does not depend on neighboring words or tags;
- the probability of a tag occurring does not depend on the entire sequence of previous tags, but only on the immediately preceding tag (bigram); the resulting factorization is shown below.
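Under these two assumptions (standard for a bigram HMM tagger), the probability of a tag sequence reduces to a product of the emission and transition probabilities defined above, and the decoder searches for the tag sequence that maximizes this product:

t̂_1 … t̂_n = argmax over t_1 … t_n of ∏_{i=1…n} P(w_i | t_i) · P(t_i | t_i−1)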
The pseudocode of the HMM algorithm is given below:
Hidden Markov Model Algorithm
Initialize:
    ς ← N × k array; X ← N × k array
    For s = 1 … k: ς[1, s] ← π(s) · Pr[O_1 | s]
Populate ς and X:
    For i = 2 … N:
        For s = 1 … k:
            xprev ← argmax_x { ς[i−1, x] · Pr[x → s] }
            X[i, s] ← xprev
            ς[i, s] ← ς[i−1, xprev] · Pr[xprev → s] · Pr[O_i | s]
Reconstruct OptPath:
    s ← argmax_x { ς[N, x] }
    OptPath ← EmptyList
    For j = N … 1:
        OptPath ← s :: OptPath
        if j > 1:
            s ← X[j, s]
    Return OptPath
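A runnable sketch of the pseudocode above in Python is given below; the probability tables are passed in as plain dictionaries, and the names (start_p, trans_p, emit_p) are illustrative rather than taken from the paper's implementation:

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state (tag) sequence for the observations.

    start_p[s]     - initial probability pi(s)
    trans_p[x][s]  - transition probability Pr[x -> s]
    emit_p[s][o]   - emission probability Pr[o | s]
    """
    # grid[i][s] holds the best path probability ending in state s at position i;
    # back[i][s] remembers which previous state produced that best probability.
    grid = [{s: start_p[s] * emit_p[s].get(observations[0], 0.0) for s in states}]
    back = [{}]

    for i in range(1, len(observations)):
        grid.append({})
        back.append({})
        for s in states:
            # Choose the previous state that maximizes the path probability.
            prev = max(states, key=lambda x: grid[i - 1][x] * trans_p[x].get(s, 0.0))
            back[i][s] = prev
            grid[i][s] = (grid[i - 1][prev] * trans_p[prev].get(s, 0.0)
                          * emit_p[s].get(observations[i], 0.0))

    # Reconstruct the optimal path by following the back-pointers.
    s = max(states, key=lambda x: grid[-1][x])
    path = [s]
    for i in range(len(observations) - 1, 0, -1):
        s = back[i][s]
        path.insert(0, s)
    return path
```

With transition and emission tables estimated as in the earlier counting sketch, calling this function on the words of a sentence returns the highest-probability tag sequence for that sentence.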
Viterbi algorithm
The decoding process used for the HMM is called the Viterbi algorithm [26]. First, it is necessary to form a probability matrix called a grid. The columns of this matrix correspond to the sequence of words in a sentence, and the rows represent the hidden states (all possible POS tags). The Viterbi grid corresponding to the sentence “Shirin nodir toshni o‘tga otdi” ("Shirin threw the rare stone into the fire") looks like this:
Figure 8. Viterbi grid corresponding to this sentence
In the Viterbi grid above, one can see the columns corresponding to the words of the given sentence (Shirin nodir toshni o‘tga otdi, "Shirin threw the rare stone into the fire") and the rows representing all known POS tags (RR, NUM, N, JJ, VB, C, P). The data in this grid can be interpreted as follows:
- each cell of the grid is denoted V(t, j), where t indexes a column (word position) and j indexes a row (POS tag); V(t, j) is called the Viterbi path probability;
- after the first t observations, the HMM computes the probability of being in state j (the current POS tag) and identifies the sequence of states with the highest probability.
The value V(t, j) is calculated using the following formula:
V(t, j) = max over i = 1 … N of V(t−1, i) · a_ij · b_j(o_t)
Here:
– V(t−1, i) is the probability of the Viterbi path corresponding to the previous step;
– a_ij is the probability of transition from the previous state q_i to the current state q_j;
– b_j(o_t) is the probability of the observation o_t given the current state j.
Using the Viterbi algorithm requires the values of the transition matrix “A” and the emission matrix “B” of the HMM discussed above. In our example, a bigram HMM model is used, so each POS tag depends only on the previous tag. For the sentence “Shirin nodir toshni o‘tga otdi” ("Shirin threw the rare stone into the fire"), the result must be obtained in the following format: