2.4 Segmentation

2.4.1 pyAudioAnalysis [1]: Silence removal

The library also provides a semi-supervised silence-removal functionality. The corresponding function takes an uninterrupted audio recording as input and returns segment endpoints that correspond to individual audio events, removing the “silent” areas of the recording. This is achieved through a semi-supervised approach that performs the following steps:
• The short-term features of the whole recording are extracted.
• An SVM model is trained to distinguish between high-energy and low-energy short-term frames; in particular, the 10% highest-energy frames together with the 10% lowest-energy frames are used to train the model.
• The SVM classifier is applied (with probabilistic output) to the whole recording, producing a sequence of probabilities, each expressing the confidence that the corresponding short-term frame belongs to an audio event (and not to a silent segment).
• Dynamic thresholding of this probability sequence is used to detect the active segments; a simplified sketch of the whole procedure is given after this list.
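The sketch below illustrates this self-training idea in compact form. It is a simplification and not the library's implementation: per-frame energy stands in for the full short-term feature set, the probability threshold is fixed rather than dynamic, the signal is assumed to be mono, and the function name detect_active_frames and its default window sizes are illustrative choices only.

import numpy as np
from sklearn.svm import SVC

def detect_active_frames(x, fs, win=0.025, step=0.025):
    # 1) short-term features: here just per-frame energy, for brevity
    w, s = int(win * fs), int(step * fs)
    frames = np.array([x[i:i + w] for i in range(0, len(x) - w, s)], dtype=float)
    energy = (frames ** 2).mean(axis=1).reshape(-1, 1)

    # 2) self-labelling: the 10% lowest-energy frames act as "silence" (0),
    #    the 10% highest-energy frames act as "event" (1)
    n10 = max(1, int(0.10 * len(energy)))
    order = np.argsort(energy[:, 0])
    X = np.vstack([energy[order[:n10]], energy[order[-n10:]]])
    y = np.array([0] * n10 + [1] * n10)

    # 3) probabilistic SVM applied to every frame of the recording
    clf = SVC(probability=True).fit(X, y)
    p_event = clf.predict_proba(energy)[:, 1]

    # 4) thresholding (fixed here; the library derives the threshold dynamically)
    return p_event > 0.5

In pyAudioAnalysis the same steps operate on the full short-term feature matrix, and the decision threshold is derived from the distribution of the smoothed SVM probabilities rather than fixed at 0.5.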
2.4.2 Pseudo code for segmentation

from pyAudioAnalysis import audioBasicIO as aIO
from pyAudioAnalysis import audioSegmentation as aS
import scipy.io.wavfile as wavfile

# Fs: the sampling rate of the WAV file, x: the signal samples
[Fs, x] = aIO.readAudioFile("1904060514208888592030.wav")

# segments: list of [start, end] pairs (in seconds) of the detected audio events
segments = aS.silenceRemoval(x, Fs, 0.025, 0.025, smoothWindow=0.85, plot=True)

# Prepend an index to every segment, to avoid csv files and manual editing;
# insert(0, ...) places the index at the beginning of each [start, end] pair
for n, i in enumerate(segments, start=1):
    i.insert(0, n)

print("The segmented wav files written are:")
for j in segments:
    T1 = float(j[1])  # start timestamp (seconds)
    T2 = float(j[2])  # end timestamp (seconds)
    # output file name built from the source name, segment index and limits
    label = "1904060514208888592030_%d_%.2f-%.2f.wav" % (j[0], T1, T2)
    # time * (samples / second) gives the sample indices of the segment
    xtemp = x[int(round(T1 * Fs)):int(round(T2 * Fs))]
    print(T1, T2, label, xtemp.shape)
    wavfile.write(label, Fs, xtemp)

print("The labeled limits are:\n", segments)