• T: the features extracted in the time domain.
• F: the features extracted in the frequency domain.
• N: the number of samples in a frame.
Reference [11] supplies the details of each feature. Due to space limitations, the calculation equations of the features are summarized as follows:
1) Magnitude

$TM_n = \sum_{i=1}^{N} |S_n(i)|$  (5)
2) Average

$TA_n = \frac{1}{N} \sum_{i=1}^{N} S_n(i)$  (6)
3) Root mean square

$TRMS_n = \sqrt{\frac{\sum_{i=1}^{N} S_n(i)^2}{N}}$  (7)
4) Spectral centroid

$FC_n = \frac{\sum_{i=1}^{N} \left(|S'_n(i)|^2 \times i\right)}{\sum_{i=1}^{N} |S'_n(i)|^2}$  (8)
5) Spectral bandwidth

$FB_n = \sqrt{\frac{\sum_{i=1}^{N} \left(|S'_n(i)|^2 \times (i - FC_n)^2\right)}{\sum_{i=1}^{N} |S'_n(i)|^2}}$  (9)
6) Spectral roll-off

$\sum_{i=1}^{FR_n} |S'_n(i)|^2 = 0.85 \times \sum_{i=1}^{N} |S'_n(i)|^2$  (10)
7) Valley

$FValley_{n,k} = \log\left\{\frac{1}{\alpha N} \sum_{i=1}^{\alpha N} S'_{n,k}(N - i)\right\}$  (11)
8) Peak

$FPeak_{n,k} = \log\left\{\frac{1}{\alpha N} \sum_{i=1}^{\alpha N} S'_{n,k}(i)\right\}$  (12)
where $k$ is the number of sub-bands and $\alpha$ is a constant; we set $k$ and $\alpha$ to 7 and 0.2, respectively.
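To make equations (5)–(10) concrete, the following is a minimal Python sketch for a single frame. It is an illustration under assumptions, not the paper's implementation: the function name, the use of NumPy's rfft for the spectrum, and the 1-based bin indexing are ours, and the valley and peak features (11)–(12) are omitted because they additionally require the per-sub-band spectra:

import numpy as np

def frame_features(s):
    # s: one frame of N time-domain samples S_n(i).
    N = len(s)
    TM = np.sum(np.abs(s))                # (5) magnitude
    TA = np.mean(s)                       # (6) average
    TRMS = np.sqrt(np.mean(s ** 2))       # (7) root mean square

    p = np.abs(np.fft.rfft(s)) ** 2       # power spectrum |S'_n(i)|^2
    i = np.arange(1, p.size + 1)          # 1-based bin indices
    FC = np.sum(p * i) / np.sum(p)        # (8) spectral centroid
    FB = np.sqrt(np.sum(p * (i - FC) ** 2) / np.sum(p))       # (9) spectral bandwidth
    FR = np.searchsorted(np.cumsum(p), 0.85 * np.sum(p)) + 1  # (10) roll-off bin
    return TM, TA, TRMS, FC, FB, FR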
9) MFCC

MFCC is an abbreviation for Mel-Frequency Cepstral Coefficients. First, we obtain $S'_n$ by framing, windowing, and the FFT. Next, we filter $S'_n$ with the Mel filter bank. Then we apply the Discrete Cosine Transform (DCT) and extract the dynamic difference parameters. We extract MFCC1–MFCC12 [10].
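For reference, this pipeline can be approximated with the librosa library. This is a minimal sketch, and the input file name and the default frame parameters are assumptions rather than details from the paper:

import librosa
import numpy as np

y, sr = librosa.load("cry.wav", sr=None)             # hypothetical input recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12)   # MFCC1-MFCC12 per frame
delta = librosa.feature.delta(mfcc)                  # dynamic difference parameters
features = np.vstack([mfcc, delta])                  # shape: (24, number of frames)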
D. Feature Selection
The sequential forward floating search (SFFS) algorithm is often used to select an optimal set of feature vectors. Starting from an empty set, each round first selects a subset x from the unselected features such that the evaluation function is optimized after x joins the selected set; it then selects a subset z from the selected features such that the evaluation function is optimal after z is eliminated [11]. We use a support vector machine for classification and k-fold cross-validation to calculate the classification accuracy, which serves as the evaluation function. Finally, we use the SFFS algorithm to obtain the feature set. The details of SFFS are shown as follows, and a Python sketch is given after the algorithm fragment.

Input: F is the set of all unselected features;
       E(·) is the evaluation function;
result := ∅;
done := false;
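As a concrete illustration, the following is a minimal Python sketch of SFFS using the SVM-plus-cross-validation evaluation function described above. It is a simplified sketch, not the paper's implementation: single-feature add/remove steps, the default SVC kernel, cv=5, and the target subset size k_target are all assumptions:

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def evaluate(X, y, feats):
    # E(): mean k-fold cross-validation accuracy of an SVM on the chosen columns.
    return cross_val_score(SVC(), X[:, sorted(feats)], y, cv=5).mean()

def sffs(X, y, k_target):
    selected, best = set(), 0.0
    while len(selected) < k_target:
        # Forward step: add the unselected feature x that most improves E().
        pool = set(range(X.shape[1])) - selected
        x = max(pool, key=lambda i: evaluate(X, y, selected | {i}))
        selected.add(x)
        best = evaluate(X, y, selected)
        # Floating step: drop a selected feature z while doing so improves E().
        improved = True
        while improved and len(selected) > 1:
            improved = False
            cands = selected - {x}   # never drop the feature just added
            z = max(cands, key=lambda i: evaluate(X, y, selected - {i}))
            if evaluate(X, y, selected - {z}) > best:
                selected.remove(z)
                best = evaluate(X, y, selected)
                improved = True
    return selected, best

A ready-made alternative, to the best of our knowledge, is mlxtend's SequentialFeatureSelector with floating=True, which implements the same forward-floating idea.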
E. Crying Classification
Now that we have a highly abstract feature vector for the crying signal, we use it to train the SVM classifier. The SVM is a supervised learning model in the field of machine learning [13]. Its principle can be described first for the linearly separable case, then extended to linearly inseparable cases, and further to non-linear decision functions, and it rests on a deep theoretical background.
We use the existing SVC (Support Vector Classifier) from Python's scikit-learn library for training. The training-set labels are hunger, sleepiness, pain, and non-crying. SVC uses the "one-versus-one" method to achieve multi-class classification: a binary classifier is trained for each pair of classes, and the final label is decided by voting among them.
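A minimal training sketch with scikit-learn's SVC is shown below; the feature arrays are random stand-ins for the selected feature vectors, not data from the paper:

import numpy as np
from sklearn.svm import SVC

# Hypothetical stand-in data: each row is a selected feature vector and each
# label is one of the four classes used in the paper.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 12))
y_train = rng.choice(["hunger", "sleepiness", "pain", "non-crying"], size=80)

clf = SVC()                      # multi-class input is handled one-versus-one
clf.fit(X_train, y_train)
print(clf.predict(X_train[:5]))  # predicted cry categories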