Figure 1. The general concept of federated learning in the healthcare system.
Federated learning has the potential to be particularly useful in the healthcare industry, where data privacy and security are of paramount importance. With it, sensitive patient data can be kept on individual devices and hospital servers rather than centralized in a single location [13,14,15]. This helps to protect patient privacy and to comply with regulations such as HIPAA [16]. In addition, federated learning can train more accurate models by drawing on data from a larger number of patients. This can be especially beneficial in rare disease research [17], where a centralized dataset may not have enough examples to train a reliable model. Federated learning can also enable the training of models on a more diverse patient population, which can lead to more generalizable and, therefore, more valuable models [18].
This study performs clinical event classification on vital signs data using federated learning. The Flower FL framework was selected for the implementation in order to preserve the privacy of the dataset. Several machine learning techniques were evaluated: Random Forest classification, XGBoost, and multi-model ensembles (Random Forest with SGD, and Random Forest with XGBoost). A custom fine-tuned federated learning method was used to obtain the best results. Section 2 presents the related work, Section 3 the materials and methods, Section 4 the experimental results, and the final section the conclusion.
2. Related work
Clinical event classification using vital signs data [19] is critical in healthcare, as it allows for early detection and management of various medical conditions. Researchers worldwide have extensively explored computational techniques, including machine learning and predictive modeling, to develop accurate and reliable methods for such predictions. Effective classification can identify health risks or critical events earlier, allowing for timely intervention and potentially preventing severe outcomes. Automated classification systems based on machine learning models can also analyze a high volume of patient data quickly, assisting healthcare providers in making faster and more accurate diagnoses.
Machine learning is a popular approach in this field, as it allows for the analysis of vast amounts of historical and current data from various healthcare sources to make predictions [1,20]. Medical machine learning contributes significantly to reducing healthcare costs and to renewing the relationship between doctor and patient [21]. For example, one system collects vital signs data using wireless radar technology and categorizes healthy and infected people using five machine learning models [22]. In 2019, Juan-Jose Beunza et al. [23] compared several supervised classification machine learning algorithms for internal validity and accuracy in predicting clinical events. Using the Framingham open database with new methods in the data preparation process, they obtained an accuracy of 0.81 for women and 0.78 for men. However, this level of accuracy is not considered sufficient, and performance is often hindered by the lack of large, diverse, and labeled data. Yuanyuan et al. [24] introduced a system that uses a convolutional neural network (CNN) with enhanced deep learning techniques to predict heart disease on an Internet of Medical Things (IoMT) platform. The "enhanced deep learning" aspect likely refers to advanced techniques such as transfer learning or ensemble methods that improve the performance of the CNN. The IoMT platform uses medical devices connected to the Internet to collect and transmit data for analysis.
Jie Xu et al. [12] surveyed the use of federated learning in the biomedical field, providing an overview of solutions to the statistical, system, and privacy challenges of federated learning. Another example highlighting the potential applications and impact of these technologies in healthcare is the work of Thanveer Shaik et al. [25], who proposed FedStack, a decentralized privacy-protected system for monitoring in-patient activity in hospitals that uses sensors and AI models to classify twelve routine activities. FedStack applies stacked federated learning to personalized activity monitoring. Federated learning is a technique for training machine learning models on decentralized data, where the data remain distributed across multiple devices or locations. Stacked federated learning is a specific variant in which multiple federated models are trained and combined to form a final model. The authors apply this approach to activity monitoring, which involves collecting data from sensors or other devices worn by individuals to track their physical activity, and use the trained models to personalize the monitoring and analysis of such data. Similarly, Ittai Dayan et al. [26] used an FL model to predict the future oxygen requirements of symptomatic COVID-19 patients from vital signs, laboratory data, and chest X-rays. In that work, models are trained on data from different hospitals or clinics to improve the accuracy of clinical outcome predictions for patients with COVID-19. They also claim that this approach can help make predictions in real time and that sharing knowledge across institutions improves the models' performance.
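The idea of training on decentralized data can be illustrated with a minimal sketch of FedAvg-style aggregation, in which a server combines client model parameters weighted by each client's sample count while raw data stay local. This is an illustration under stated assumptions, not the aggregation used in the cited works or the Flower API: the function name `federated_average`, the two hypothetical hospitals, and their parameter values and sample counts are invented for the example.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += value * n_samples / total
    return averaged


# Hypothetical locally trained parameters; only these leave each site.
hospital_a = [0.2, 1.0]   # trained on 100 local records (assumed)
hospital_b = [0.6, 3.0]   # trained on 300 local records (assumed)
global_weights = federated_average([hospital_a, hospital_b], [100, 300])
print(global_weights)
```

Only parameter vectors and sample counts reach the server in this sketch; the patient records themselves are never transmitted.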
The proposed cross-device ensemble method offers advantages over existing methods by combining and building upon the related approaches described above. First, it provides privacy protection by training models on decentralized data: with FL, sensitive patient information is safeguarded because data never leave individual devices or institutions. Second, the method promotes robustness by drawing on data from various sources, leading to more accurate and robust models. These advantages make this approach a promising solution for healthcare applications that require large amounts of sensitive patient data while ensuring privacy and robustness.
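One common way to combine several models into an ensemble is soft voting, where the class probabilities predicted by each model are averaged and the class with the highest mean is chosen. The sketch below assumes soft voting purely for illustration; this section does not state which combination rule the proposed method uses. The function name `soft_vote`, the class labels, and the probability values are hypothetical.

```python
from typing import Dict, Sequence


def soft_vote(model_probs: Sequence[Dict[str, float]]) -> str:
    """Average per-class probabilities across models; return the top class."""
    if not model_probs:
        raise ValueError("at least one model prediction is required")
    classes = list(model_probs[0].keys())
    averaged = {
        label: sum(probs[label] for probs in model_probs) / len(model_probs)
        for label in classes
    }
    # Pick the class with the highest mean probability.
    return max(averaged, key=averaged.get)


# Hypothetical per-model probabilities for one patient's vital signs.
rf_probs = {"normal": 0.7, "event": 0.3}    # e.g. a Random Forest
xgb_probs = {"normal": 0.4, "event": 0.6}   # e.g. XGBoost
sgd_probs = {"normal": 0.3, "event": 0.7}   # e.g. an SGD classifier
prediction = soft_vote([rf_probs, xgb_probs, sgd_probs])
print(prediction)
```

Here one model favors "normal" but the averaged probabilities favor "event", showing how an ensemble can override a single model's vote.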