XGBoost provides several hyperparameters, such as the learning rate, maximum tree depth, and number of boosting rounds, that can be tuned to control the behavior of the algorithm and improve its performance.
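As a concrete illustration, the following sketch configures a few of these hyperparameters through the xgboost Python package's scikit-learn interface; the synthetic feature matrix and the chosen values are illustrative assumptions, not the settings used in this work.

```python
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Synthetic stand-in for a vital-signs feature matrix (illustrative only).
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

model = XGBClassifier(
    n_estimators=200,      # number of boosting rounds
    max_depth=4,           # maximum depth of each tree
    learning_rate=0.1,     # shrinkage applied to each tree's contribution
    subsample=0.8,         # fraction of rows sampled per tree
    colsample_bytree=0.8,  # fraction of features sampled per tree
    reg_lambda=1.0,        # L2 regularization on leaf weights
)
model.fit(X, y)
```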
3.3.5
STOCHASTIC GRADIENT DESCENT
Stochastic Gradient Descent (SGD) is an optimization
algorithm used for training machine learning models,
especially linear models such as linear regression, logistic
regression, and support vector machines. It is called
"stochastic" because, at each iteration, it updates the model
parameters using a randomly selected subset of the training
data, or even a single training example. The parameters are
moved in the direction of the negative gradient of the loss
function, where the gradient is estimated from the sampled
subset rather than the entire training set. This makes SGD
more computationally efficient than batch gradient descent,
which computes the gradient over the full training set at
every iteration. SGD is a popular optimization algorithm
because it is simple to implement and scales to very large
datasets, making it well suited to training machine learning
models on big data. However, the stochastic nature of the
algorithm makes it more sensitive to the choice of learning
rate and can produce more oscillation in the optimization
path than batch gradient descent. To mitigate these issues,
SGD is often combined with techniques such as mini-batch
learning, learning rate schedules, and regularization.
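The update rule can be made concrete with a short sketch. The following Python snippet runs SGD on a logistic regression model with synthetic data; the data, learning rate, and variable names are illustrative assumptions rather than the configuration used in our experiments.

```python
import numpy as np

# Minimal SGD for logistic regression on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                   # hypothetical feature matrix
y = (X @ rng.normal(size=8) > 0).astype(float)   # hypothetical binary labels

w = np.zeros(8)
lr = 0.1  # learning rate; SGD is sensitive to this choice

for epoch in range(5):
    for i in rng.permutation(len(X)):        # visit examples in random order
        p = 1.0 / (1.0 + np.exp(-X[i] @ w))  # predicted probability
        grad = (p - y[i]) * X[i]             # log-loss gradient for one example
        w -= lr * grad                       # step along the negative gradient
```

Each step uses the gradient of a single example, which is why the trajectory oscillates; averaging the gradient over a mini-batch of examples in the inner loop is the standard way to smooth it.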
3.4
FEDERATED LEARNING
Federated Learning is a distributed machine learning
framework that enables multiple parties to train a shared
model without sharing their raw data. Instead, the raw data
remains on the devices of the participants, and only the model
parameters are communicated and aggregated to form the final
model. In a federated learning structure, each participant
maintains a local model trained on its own data. The local
models are used to make predictions, and the gradients of the
loss function with respect to the model parameters are
computed. These gradients are communicated to a central
server, which aggregates them and updates the global model
parameters. The updated parameters are then sent back to the
participants for the next round of local training.
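The following sketch illustrates this exchange for a linear model, assuming a simple FedSGD-style scheme in which the server averages the participants' gradients; the participant data, model, and hyperparameters are hypothetical stand-ins for the actual federated setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_gradient(w, X, y):
    """Logistic-loss gradient computed on one participant's private data."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(X)

# Three participants, each holding its own data, which is never shared.
clients = [(rng.normal(size=(200, 8)),
            (rng.random(200) > 0.5).astype(float)) for _ in range(3)]

w_global = np.zeros(8)
lr = 0.5
for round_num in range(20):
    # Each participant computes a gradient locally and sends only that.
    grads = [local_gradient(w_global, X, y) for X, y in clients]
    # The server aggregates the gradients, updates the global model, and
    # sends the updated parameters back to the participants.
    w_global -= lr * np.mean(grads, axis=0)
```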