[Figure: processing pipeline: Voice Data -> Data Preprocessing (STFT, MFCC) -> Training (CNN) -> Command Detection]
THE 3rd INTERNATIONAL SCIENTIFIC CONFERENCES OF STUDENTS AND YOUNG RESEARCHERS dedicated to the 99th anniversary of the National Leader of Azerbaijan Heydar Aliyev
Result
[Figure: Visualization of MFCC for one sample.]
[Figure: Training result.]
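As a rough illustration of the preprocessing behind the MFCC figure, the sketch below walks through the usual STFT -> mel filterbank -> log -> DCT chain using only NumPy. It is not the authors' code: the paper uses Librosa, and every parameter here (16 kHz sample rate, 512-sample frames, 40 mel bands, 13 coefficients) and the synthetic test tone standing in for a recorded voice sample are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (HTK style).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def stft_power(y, n_fft=512, hop=256):
    """Power spectrogram from a Hann-windowed STFT, shape (n_fft//2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft//2 + 1)."""
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, center, hi = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the first axis, keeping the first n_out coefficients."""
    n = x.shape[0]
    k = np.arange(n_out)[:, None]
    i = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * i + 1) / (2 * n)) @ x

def mfcc(y, sr, n_mfcc=13, n_fft=512, hop=256, n_mels=40):
    power = stft_power(y, n_fft, hop)
    mel_energy = mel_filterbank(sr, n_fft, n_mels) @ power
    log_mel = np.log(mel_energy + 1e-10)  # small offset avoids log(0)
    return dct_ii(log_mel, n_mfcc)

# Assumed input: one second of a 440 Hz tone in place of a recorded command.
sr = 16000
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440 * t)

features = mfcc(y, sr)
print(features.shape)  # (13, 61): 13 coefficients for each of 61 frames
```

The resulting coefficient-by-frame matrix is the kind of 2-D array that can be plotted as an image and passed to a CNN as input. With Librosa, the same steps are typically handled by a single call to `librosa.feature.mfcc`.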
Hardware & Software
Microphone: A simple USB microphone is used to capture voice input.
Raspberry Pi: A Raspberry Pi 400 kit, which includes an adapter, microSD card, mouse, and
keyboard, processes the voice input and sends commands to the STM32. The Python
Librosa library and TensorFlow are used in the Anaconda Spyder environment to
process the sounds and build the model. The trained model is then transferred
to the Raspberry Pi, where command detection is performed.
STM32: An STM32F microcontroller is selected to control the components; it is
programmed with STM32CubeIDE.
SPI: The Serial Peripheral Interface is used for communication
between the Raspberry Pi (master) and the STM32 (slave).
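To make the master-to-slave link more concrete, here is a minimal sketch of how a detected command might be sent from the Raspberry Pi over SPI with the `spidev` Python package. The paper does not give the command set, byte codes, or frame layout, so the command table, the start byte, the XOR checksum, and the bus, device, and clock settings below are all hypothetical.

```python
# Hypothetical command table: the actual spoken commands and their byte
# codes are not given in the paper.
COMMAND_CODES = {"on": 0x01, "off": 0x02, "left": 0x03, "right": 0x04}
START_BYTE = 0xA5  # assumed frame marker so the slave can find frame starts

def build_frame(label):
    """Pack a command label into an assumed 3-byte frame: start, code, checksum."""
    code = COMMAND_CODES[label]
    checksum = (START_BYTE ^ code) & 0xFF
    return [START_BYTE, code, checksum]

def send_command(label, bus=0, device=0, speed_hz=500_000):
    """Send one frame as SPI master; runs only on a Pi with SPI enabled."""
    import spidev  # imported here so the rest of the sketch runs off-target
    spi = spidev.SpiDev()
    spi.open(bus, device)  # assumed wiring: /dev/spidev0.0
    try:
        spi.max_speed_hz = speed_hz
        spi.mode = 0  # assumed clock polarity/phase; must match the STM32 setup
        return spi.xfer2(build_frame(label))  # bytes clocked back by the slave
    finally:
        spi.close()

frame = build_frame("left")
print([hex(b) for b in frame])
```

On the Raspberry Pi, `send_command("left")` would transmit the frame. The STM32 firmware would need a matching SPI slave configuration and the same frame convention to decode it.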