A. Convolutional Neural Networks (CNN) A convolutional neural network is a special kind of artificial neural network, loosely inspired by how the brain processes visual information, that analyzes data through supervised learning. A CNN is a regularized variant of the multilayer perceptron, a network that is usually fully connected. It consists of an input layer, an output layer and many hidden layers in between. Many of these hidden layers are convolutional, hence the name convolutional neural network. CNNs perform very well at object detection. The convolutional layers apply mathematical operations (convolutions) to analyze the data, and the output of each layer is passed to the next. A fully connected network is prone to overfitting. To reduce this risk, a CNN exploits the hierarchical pattern in the data and assembles complex patterns from simpler ones across its layers. The input is given as a tensor of shape (number of inputs) x (height) x (width) x (channels). The layers convert the input image into an abstract representation called a feature map. This is repeated layer after layer, which loosely simulates the working of neurons in the brain. In the final fully connected layers, the filtered outputs are combined into a single output in the output layer. The size of a layer's output (its number of feature maps) is directly proportional to the number of filters.
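As a minimal sketch of how a convolutional layer turns an image into feature maps, the following pure-Python example slides each kernel over a small single-channel image. The image values, the two edge kernels and the function names `conv2d_valid` and `conv_layer` are illustrative assumptions, not part of the proposed system, and a real input would carry the batch and channel dimensions described above.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2-D image (stride 1, no padding).

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def conv_layer(image, kernels):
    """Apply every kernel to the image: one feature map per kernel."""
    return [conv2d_valid(image, k) for k in kernels]


# Hypothetical 5x5 single-channel image with a bright square in the middle.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0],
]

# Two hypothetical 3x3 kernels (assumed for illustration only).
kernels = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],    # responds to vertical edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],    # responds to horizontal edges
]

feature_maps = conv_layer(image, kernels)
print(len(feature_maps))    # 2 feature maps, one per kernel
print(feature_maps[0][1])   # [27, 0, -27]
```

The positive and negative values in the first map mark the left and right edges of the bright square, which is the kind of signed activation a feature map encodes.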
B. Architecture The architecture of a convolutional neural network is built from convolutional layers. A CNN differs from other object detection algorithms in its ability to generate regions of interest in the original image using image transform filters called convolutional kernels, whereas other algorithms build the model from weighted sums and connection weights alone. The number of feature maps generated equals the number of kernels. The pixel intensity in a feature map represents activation: white pixels mark points in the original image with strong positive activation, grey pixels mark weak activation, and black pixels mark strong negative activation. For example, the fire region in the original image is reddish orange, so a kernel that responds to that color turns the corresponding pixels white. Each neuron in a convolutional layer receives input from a restricted region of the previous layer. Each neuron computes its output by applying a function to the outputs of the previous layer, and this function is determined by a set of weights. A distinguishing feature of convolutional neural networks is that many neurons in a layer can share the same weights, that is, the same filter. The feature extractor used in this network is AlexNet, a deep CNN that enables straightforward object detection in an image. Fig. 1 depicts the basic architecture of a convolutional neural network.
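The white, grey and black convention for feature maps can be illustrated with a small sketch that rescales signed activations to 8-bit grey levels. The symmetric scaling rule, the function name `to_grayscale` and the sample activation values are assumptions chosen for illustration; the source does not specify how its feature maps were rendered.

```python
def to_grayscale(feature_map):
    """Map signed activations to grey levels in the range 0..255.

    Strong positive activation -> 255 (white),
    zero or weak activation    -> about 128 (grey),
    strong negative activation -> 0 (black).
    """
    # Largest magnitude in the map; fall back to 1 for an all-zero map.
    peak = max(abs(v) for row in feature_map for v in row) or 1
    return [
        [int(127.5 * (v / peak + 1) + 0.5) for v in row]
        for row in feature_map
    ]


# Hypothetical activations from one kernel (values assumed for illustration).
activations = [
    [10, 0, -10],
    [20, 0, -20],
    [10, 0, -10],
]

grey = to_grayscale(activations)
print(grey[1])  # [255, 128, 0]
```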
[Figure: block diagram with blocks labeled Input, CNN layers, Region proposals, Output fully connected layers]
Fig. 1. Architecture of CNN
Fig. 1 represents the basic architecture of a convolutional neural network. The data, images of fire in this case, is given as input. The layers of the network then build an abstract form of the image, suppressing background noise and highlighting the object to be detected. The layers produce region proposals, which are combined in the fully connected layers to build the machine learning model, and the decision-making algorithm analyzes the output of the layers to reach a conclusion.
METHODOLOGY
The proposed methodology consists of five stages: A. Acquisition of Dataset, B. Data preprocessing, C. Feature Extraction, D. Building the model, and E. Validation and testing.
A. Acquisition of Dataset The data is in the form of video frames, which would normally be obtained from CCTV footage. For convenience, custom-made videos are used for training and testing, since collecting real videos containing fire is a tedious task. The frames with fire and the frames without fire are stored separately. We then divide the dataset into a training set and a test set. This must be done with great care: if the data fed to the neural network is faulty, the results will be corrupted and the system will not be accurate.
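The division into training and test sets might look like the sketch below. The file names, the 0/1 labels, the 80/20 ratio and the fixed seed are all assumptions made for illustration; the source does not state how the split was performed.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle labeled samples reproducibly and split them into train/test."""
    shuffled = list(samples)                 # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)    # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical frame names: label 1 = fire, label 0 = no fire.
frames = (
    [(f"fire_{i:03d}.png", 1) for i in range(50)]
    + [(f"no_fire_{i:03d}.png", 0) for i in range(50)]
)

train_set, test_set = split_dataset(frames)
print(len(train_set), len(test_set))  # 80 20
```

Keeping the two sets disjoint matters here because frames from the same video are highly similar, so any overlap would inflate the measured accuracy.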
B. Data preprocessing Data preprocessing is the next stage in building a quality machine learning model. Here the data is cleaned and processed, that is, made fit for use. Preprocessing consists of removing noise and other unwanted objects from the frames. The algorithm requires relevant data; otherwise it may produce undesired results.
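One possible form of noise removal is sketched below: a 3x3 median filter followed by scaling pixel values to the range 0 to 1. The source does not name its cleaning method, so the median filter, the scaling step and the function names `median_filter3` and `normalize` are assumptions used only to make the idea concrete.

```python
from statistics import median


def median_filter3(img):
    """Replace each interior pixel with the median of its 3x3 neighborhood.

    Border pixels are copied unchanged to keep the sketch short.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [img[r + i][c + j] for i in (-1, 0, 1) for j in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


def normalize(img):
    """Scale 8-bit intensities (0..255) to floats in 0..1."""
    return [[v / 255 for v in row] for row in img]


# Hypothetical 5x5 greyscale frame with one noisy pixel in the center.
frame = [[100] * 5 for _ in range(5)]
frame[2][2] = 255

denoised = median_filter3(frame)
prepared = normalize(denoised)
print(denoised[2][2])  # 100
```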
C. Feature Extraction For the neural network to detect fire accurately, it needs to know the features of fire, that is, how fire appears in computer vision. Fire is easily identified by the human eye: it emits a reddish color, and its shape and motion vary with the circumstances and with the fuel being burned. In this paper, the shape, color and motion of fire and smoke are used for detection. We extract these features from the frames in the training set. The extraction is performed by the feature extraction network in the CNN, which is powered by a custom algorithm. After the features are extracted, the video frames are classified into fire and non-fire scenarios. The features are extracted within bounding boxes using image descriptors.
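To make the color cue concrete, the sketch below measures the fraction of reddish-orange pixels in an RGB frame. This is a hand-written heuristic for illustration only, not the feature extraction network described above; the rule `R > G > B` with a minimum red level, the threshold of 180 and the function names are assumptions.

```python
def is_fire_colored(pixel, min_red=180):
    """Heuristic: fire pixels are bright red with R > G > B (assumed rule)."""
    r, g, b = pixel
    return r >= min_red and r > g > b


def fire_color_ratio(frame):
    """Fraction of pixels in an RGB frame that pass the color heuristic."""
    pixels = [p for row in frame for p in row]
    return sum(is_fire_colored(p) for p in pixels) / len(pixels)


# Hypothetical 2x2 RGB frame: two orange pixels, one blue, one grey.
frame = [
    [(255, 120, 20), (60, 120, 220)],
    [(250, 140, 30), (128, 128, 128)],
]

print(fire_color_ratio(frame))  # 0.5
```

A color ratio alone would also fire on sunsets or orange clothing, which is one reason shape and motion are used alongside color.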
D. Building the model The extracted features are then passed to the network to build a model. The model is a set of thresholds that helps the network detect fire accurately. It learns from the extracted features and sets a standard for analyzing new input data.
E. Validation and testing Validating the machine learning model is essential in order to measure its accuracy and confirm that the system works. Validation is carried out on a separate set of video frames that does not overlap with the dataset used to build the model. According to the test results, the system achieved about 93% accuracy on the validation set.
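Accuracy on a validation set is the share of frames whose predicted label matches the true label. The sketch below shows the calculation on made-up labels; the values are illustrative only and are not the experimental data behind the reported figure.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for eight validation frames: 1 = fire, 0 = no fire.
labels      = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = [1, 1, 0, 0, 0, 1, 1, 0]

print(accuracy(labels, predictions))  # 0.75
```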
EXPERIMENTAL RESULTS
The findings of the project are satisfactory. The system detected fire with an accuracy of 93%. The results show promise for using convolutional neural networks, compared with other neural networks, to detect fire. The system combines the training data in its fully connected layers to reduce the false alarm rate. This output is then passed to the decision-making algorithm, which classifies whether there is a fire or not. Although there are minor detection errors in some images, the overall performance is good. The main drawback is that the system is somewhat slow, because it needs considerable computational power to produce results. The false alarm rate may be reduced by further cleaning of the data. In deployment, the false alarm rate should be kept to a minimum.
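The false alarm rate can be read as the fraction of non-fire frames that the system wrongly flags as fire, FP / (FP + TN). The sketch below computes it on made-up labels; both the definition chosen here and the sample values are assumptions, since the source does not report how the rate was measured.

```python
def false_alarm_rate(y_true, y_pred):
    """False positives divided by all true non-fire frames: FP / (FP + TN)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # With no non-fire frames the rate is undefined; report 0.0 in that case.
    return fp / (fp + tn) if (fp + tn) else 0.0


# Made-up labels for eight frames: 1 = fire, 0 = no fire.
labels      = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = [1, 1, 0, 0, 0, 1, 1, 0]

print(false_alarm_rate(labels, predictions))  # 0.25
```

Reporting this rate next to accuracy is useful because a detector can score high accuracy on a dataset dominated by fire frames while still raising frequent false alarms.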
CONCLUSION
Using video frames to detect fire with machine learning is both challenging and innovative. If this system can be implemented with a low error rate at large scale, for example in big factories, houses and forests, it becomes possible to prevent damage and loss from accidental fires by making use of existing surveillance systems. The proposed system can be developed into a more advanced one by integrating wireless sensors with CCTV for added protection and precision. The algorithm shows great promise in adapting to various environments.