Type of the Paper Article


Clinical event classification with FL (3)

3. Materials and Methods
The overall architecture covers the dataset description, data pre-processing, the machine learning component, and, finally, federated learning.
3.1 Dataset description
This research uses the Medical Information Mart for Intensive Care (MIMIC-IV) [27,28] dataset. This dataset contains de-identified electronic health record (EHR) data from patients admitted to intensive care units (ICUs) at the Beth Israel Deaconess Medical Center (BIDMC) in Boston, Massachusetts, USA, between 2008 and 2019. With data on more than 300,000 hospital admissions, the MIMIC-IV dataset was chosen since it is one of the world's most extensive publicly available ICU datasets, making it an invaluable resource for researchers studying critical care medicine, health outcomes, and medical informatics. The MIMIC-IV dataset is used to gather information on patient demographics, diagnoses, medications, laboratory results, vital signs, and more, providing a highly detailed view of patients' medical histories. Despite being de-identified to protect patient privacy, the dataset retains a high degree of clinical detail, making it useful for various research applications.
Table 1. Description of Vital Signs and Typical Normal Ranges.

Term	Description	Normal Range
SpO2	The oxygen saturation level in the patient's blood.	95%–100%
BPM	The heart rate, measured in beats per minute.	60–100 beats per min
RR	The number of breaths the patient takes per minute, providing insights into respiratory function.	12–18 breaths per min
SBP	The highest pressure exerted on the arterial walls during the cardiac cycle.	90–120 mmHg
DBP	The lowest pressure exerted on the arterial walls when the heart is at rest between beats.	60–90 mmHg
MBP	The average pressure within the patient's arteries over a complete cardiac cycle.	60–110 mmHg

Its size, clinical detail, and public availability made the MIMIC-IV dataset well suited to this research. Access to the dataset is restricted and requires approval under the PhysioNet Data Use Agreement (DUA). This study selected six vital signs from the MIMIC-IV dataset for analysis: SpO2 (oxygen saturation level), BPM (heart rate), RR (respiratory rate), SBP (systolic blood pressure), DBP (diastolic blood pressure), and MBP (mean blood pressure). Table 1 gives a description and the typical normal range for each of them, and serves as a reference for assessing whether a patient's physiological parameters fall within the expected limits. Table 2 shows the first rows of the initial version of the dataset before pre-processing, containing the vital sign measurements extracted from the main dataset. Each row records a single measurement together with its chart time, so the table shows how the vital signs are distributed over a patient's stay in the intensive care unit (ICU) and how they vary during that stay.
Table 2. Head of the initial version of the MIMIC-IV dataset for the federated learning process.

Index	Subject_id	Charttime	Storetime	Valuenum	Valueuom
0	10003700	2165-04-24 05:28:00	2165-04-24 05:37:00	152.0	mmHg (SBP)
1	10003700	2165-04-24 05:28:00	2165-04-24 05:37:00	97.0	mmHg (DBP)
2	10003700	2165-04-24 05:28:00	2165-04-24 05:37:00	110.0	mmHg (MBP)
3	10003700	2165-04-24 05:30:00	2165-04-24 05:37:00	65.0	bpm
4	10003700	2165-04-24 05:30:00	2165-04-24 05:37:00	14.0	insp/min
5	10003700	2165-04-24 05:31:00	2165-04-24 05:37:00	100.0	%
6	10003700	2165-04-24 05:37:00	2165-04-24 05:37:00	120.0	bpm
7	10003700	2165-04-24 05:37:00	2165-04-24 05:37:00	50.0	bpm
8	10003700	2165-04-24 05:37:00	2165-04-24 05:37:00	160.0	mmHg (SBP)
9	10003700	2165-04-24 05:37:00	2165-04-24 05:37:00	90.0	mmHg (DBP)
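
Table 2 stores one measurement per row, so the vital signs recorded for a patient at the same time have to be regrouped before they can serve as model features. The following is a minimal sketch of that regrouping using only the Python standard library. The helper name `to_wide` and the mapping from unit strings to vital-sign names are illustrative assumptions, not the authors' pipeline; only the first six rows of Table 2 are used, because the unit string alone cannot distinguish the repeated bpm rows that follow.

```python
from collections import defaultdict

# Assumed mapping from the unit strings shown in Table 2 to vital-sign names.
UNIT_TO_SIGN = {
    "mmHg (SBP)": "SBP",
    "mmHg (DBP)": "DBP",
    "mmHg (MBP)": "MBP",
    "bpm": "BPM",
    "insp/min": "RR",
    "%": "SpO2",
}

# First six rows of Table 2 as (subject_id, charttime, valuenum, valueuom).
ROWS = [
    (10003700, "2165-04-24 05:28:00", 152.0, "mmHg (SBP)"),
    (10003700, "2165-04-24 05:28:00", 97.0, "mmHg (DBP)"),
    (10003700, "2165-04-24 05:28:00", 110.0, "mmHg (MBP)"),
    (10003700, "2165-04-24 05:30:00", 65.0, "bpm"),
    (10003700, "2165-04-24 05:30:00", 14.0, "insp/min"),
    (10003700, "2165-04-24 05:31:00", 100.0, "%"),
]


def to_wide(rows):
    """Group long-format rows into one dict of vital signs per (subject, charttime)."""
    wide = defaultdict(dict)
    for subject_id, charttime, value, unit in rows:
        sign = UNIT_TO_SIGN.get(unit)
        if sign is None:
            continue  # skip units that are not in the assumed mapping
        wide[(subject_id, charttime)][sign] = value
    return dict(wide)


records = to_wide(ROWS)
for key in sorted(records):
    print(key, records[key])
```

Grouping on the exact chart time leaves most records incomplete (here, blood pressure, heart rate, and SpO2 fall on different minutes), which is one source of the missing values addressed in the pre-processing step.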

3.2 Data pre-processing

Data pre-processing is a crucial step in machine learning because it prepares the data for analysis and modeling. Data cleaning, for example, identifies and removes errors, inconsistencies, and missing values so that the data are accurate and reliable. The first step of pre-processing is to remove or fill missing values and noise in the dataset. The MIMIC-IV dataset contains many missing values, which can be filled with summary statistics such as the mean, median, or mode, or with model-based imputation methods. The next step is feature selection, which identifies the features that are relevant to the prediction. For predicting a clinical event, relevant features might include SBP or BPM, whereas the patient ID is not predictive and can be dropped. Finally, the features are normalized or standardized to bring them to a similar scale, which prevents features with larger scales from dominating the model. The z-score is a common normalization method and is calculated using the following formula (1).
z = (x - μ) / σ (1)
In Equation (1), x represents a data point, μ denotes the mean of the dataset, and σ represents the standard deviation of the dataset. The dataset is then divided into a training set and a testing set. The training set is used to train the model, while the testing set evaluates its performance. A common split ratio is 70% for training and 30% for testing, which leaves enough data for learning while keeping a held-out portion for an unbiased estimate of how well the model generalizes.
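
The pre-processing steps described above (imputation, z-score normalization with Equation (1), and a 70/30 split) can be sketched as follows using only the Python standard library. The heart-rate values, the function names, and the fixed random seed are hypothetical and chosen for illustration. The sketch also assumes that the median and the normalization statistics are computed on the training portion only, a common precaution against information leaking from the test set; the source does not state whether the authors did this.

```python
import random
import statistics


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and split it into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def impute(values, fill):
    """Replace missing entries (None) with the given fill value."""
    return [fill if v is None else v for v in values]


def z_score(values, mu, sigma):
    """Apply Equation (1): z = (x - mu) / sigma."""
    return [(v - mu) / sigma for v in values]


# Hypothetical heart-rate readings (bpm) with two missing values.
heart_rate = [65.0, None, 72.0, 80.0, None, 58.0, 90.0, 77.0, 69.0, 84.0]

train_raw, test_raw = split_train_test(heart_rate)

# Median imputation, with the median taken from the training data only.
fill = statistics.median([v for v in train_raw if v is not None])
train = impute(train_raw, fill)
test = impute(test_raw, fill)

# Population mean and standard deviation of the training data.
mu = statistics.mean(train)
sigma = statistics.pstdev(train)

train_z = z_score(train, mu, sigma)
test_z = z_score(test, mu, sigma)
print(len(train_z), len(test_z))
```

After this transformation the training values have zero mean and unit standard deviation, while the test values are scaled with the same μ and σ so that both sets share one scale.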
The initial version of the dataset contained no clinical event targets. To obtain them, this work follows PEACE-Home [29], which proposed a system for monitoring patients in a home-based setting using vital signs such as heart rate, blood pressure, and respiratory rate. That system used probabilistic estimation to identify abnormal clinical events, such as deterioration in a patient's condition, by analyzing correlations among vital signs, and it separated clinical events as target data with the help of clustering and an expert system.
Data labeling is the process of assigning labels or tags to data so that they can be used to train or evaluate machine learning models. In the context of PEACE-Home, data labeling likely involved identifying and tagging instances of abnormal clinical events within the vital signs data collected from patients in a home-based setting. This can be done through manual annotation by healthcare professionals or automatically by algorithms that identify and label events of interest.
In patient care, a clinical event occurs when a vital sign remains outside its expected range for a prolonged period and the deviation persists without prompt treatment. The expected ranges are often tailored to each patient based on their specific health condition and history, although general medical guidelines outline the typical boundaries of the various vital signs. For instance, a patient with a history of hypertension may have a usual blood pressure that consistently registers above the typically accepted normal range (SBP 90–120 mmHg and DBP 60–90 mmHg); for this patient, a clinical event occurs if the blood pressure spikes to a dangerously high level, above their usual expected maximum. Bradycardia is another example: it refers to a slower-than-normal heart rate, defined as fewer than 60 beats per minute (bpm). If a patient's heart rate drops to 55 bpm and stays there for a significant period without intervention, it qualifies as a bradycardia clinical event.
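
The notion of a vital sign staying out of range for a prolonged period can be made concrete with a short sketch. The function below scans a time-ordered series and reports the intervals in which every reading lies outside the expected range for at least a minimum duration. The function name, the sample readings, and the 15-minute duration are assumptions made for illustration; the source does not specify how long a deviation must last.

```python
from datetime import datetime, timedelta


def find_sustained_events(readings, low, high, min_duration):
    """Return (start, end) pairs where values stay outside [low, high].

    readings is a time-sorted list of (datetime, value) pairs. A run of
    out-of-range readings counts as an event only if the time between its
    first and last reading is at least min_duration.
    """
    events = []
    start = None
    last = None
    for timestamp, value in readings:
        if value < low or value > high:
            if start is None:
                start = timestamp  # a new out-of-range run begins
            last = timestamp
        else:
            if start is not None and last - start >= min_duration:
                events.append((start, last))
            start = None  # the run ended with an in-range reading
    if start is not None and last - start >= min_duration:
        events.append((start, last))  # run still open at the end of the series
    return events


# Hypothetical heart-rate readings taken every five minutes.
t0 = datetime(2165, 4, 24, 5, 30)
values = [62, 55, 54, 56, 55, 61, 58, 63]
readings = [(t0 + timedelta(minutes=5 * i), v) for i, v in enumerate(values)]

# Normal heart-rate range of 60-100 bpm; 15 minutes is an assumed threshold.
events = find_sustained_events(readings, 60, 100, timedelta(minutes=15))
print(events)
```

In this series the four consecutive readings between 54 and 56 bpm form one bradycardia event, whereas the isolated reading of 58 bpm is too brief to count.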
The study specifically examined simultaneous changes in four vital signs from generalized normal values and developed techniques to predict these changes in advance.
The labeled data were generated from the initial version of the dataset by marking clinical events as normal or abnormal using threshold values. A model trained on these labels can then monitor patients in a home-based setting and identify potential health problems early. Table 3 shows the labeled clinical event data from the MIMIC-IV dataset obtained using the PEACE-Home method.
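
As a rough illustration of threshold-based labeling, the sketch below flags each vital sign in a record as normal (0) or abnormal (1) and derives an overall label. It uses the general normal ranges from Table 1 as stand-in thresholds; the thresholds actually applied in the study are those of Table 3, and the function name and record layout here are hypothetical.

```python
# Stand-in thresholds taken from the normal ranges in Table 1 (inclusive bounds).
NORMAL_RANGES = {
    "SpO2": (95.0, 100.0),
    "BPM": (60.0, 100.0),
    "RR": (12.0, 18.0),
    "SBP": (90.0, 120.0),
    "DBP": (60.0, 90.0),
    "MBP": (60.0, 110.0),
}


def label_record(record):
    """Return per-sign abnormality flags and an overall label for one record."""
    flags = {}
    for sign, value in record.items():
        low, high = NORMAL_RANGES[sign]
        flags[sign] = 0 if low <= value <= high else 1
    label = "abnormal" if any(flags.values()) else "normal"
    return flags, label


# Values taken from the first rows of Table 2, combined into one record.
record = {
    "SpO2": 100.0,
    "BPM": 65.0,
    "RR": 14.0,
    "SBP": 152.0,
    "DBP": 97.0,
    "MBP": 110.0,
}

flags, label = label_record(record)
print(flags, label)
```

Here the systolic and diastolic pressures exceed their ranges, so the record is labeled abnormal even though the remaining four signs are within range.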
Table 3. Characteristics and threshold values for each clinical event, indicating the presence or absence of specific abnormalities in vital signs.
