Information Fusion 88 (2022) 263–280
M.F. Criado et al.
preceding concepts. This kind of framework has been given different names over the years [110], like Lifelong Learning [111,112], Never Ending Learning [113,114] and Incremental Learning [115–117], but all of them rely on the same ideas: training a model gradually with data collected over different periods of time, adapting to the new instances and trying to preserve the previous knowledge.
We introduce the CL framework because we aim to discuss the time-evolving nature of FL problems. However, throughout this section we cite and briefly describe works focused on CL that do not necessarily consider the FL framework. This is because, as already mentioned, almost no works focus on both FL and CL simultaneously [9,10,118]. Nonetheless, the works we consider are, from our point of view, the ones that would be most easily adaptable to the FL framework, with multiple devices collaborating to achieve the same global model. We explain how each strategy could be modified when discussing it.
Training a model using CL techniques presents some specific problems, which have already been studied in recent literature. The most challenging ones are, as with FL, related to the data distribution. CL was conceived as a centralized ML paradigm, so non-IID data across devices has been neither discussed nor handled so far; the data can, however, evolve over time. This is a complication, as the model may be unable to converge to a solution if the training data shifts constantly. Another undesirable situation, named catastrophic forgetting, is that the model completely and abruptly forgets previously learned concepts if they are no longer present in the current data [119,120]. For these reasons we focus on how data behaves as time goes by, and on how to act if the data shifts drastically and unpredictably. This phenomenon is commonly known as concept drift [7,110].
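As a rough illustration of data shifting between collection periods, the following Python sketch compares how often each label occurs in two consecutive time windows. It is not a method taken from the cited works: the function names, the example label streams and the choice of total variation distance as the shift measure are all assumptions made for this example, and both windows are assumed to be non-empty.

```python
from collections import Counter


def label_frequencies(labels):
    """Return the relative frequency of each label in one time window."""
    counts = Counter(labels)
    total = len(labels)  # assumed to be non-zero
    return {label: count / total for label, count in counts.items()}


def frequency_shift(old_window, new_window):
    """Total variation distance between the label frequencies of two windows.

    0.0 means identical frequencies; 1.0 means the windows share no labels.
    """
    old_freq = label_frequencies(old_window)
    new_freq = label_frequencies(new_window)
    all_labels = set(old_freq) | set(new_freq)
    return 0.5 * sum(
        abs(old_freq.get(label, 0.0) - new_freq.get(label, 0.0))
        for label in all_labels
    )


# Hypothetical label streams collected in two consecutive periods.
period_1 = ["cat", "cat", "dog", "dog"]
period_2 = ["dog", "dog", "bird", "bird"]

shift = frequency_shift(period_1, period_2)

# Labels that no longer appear: a model trained only on the new window
# receives no further examples of them.
vanished = set(period_1) - set(period_2)

# Labels that appear for the first time in the new window.
emerged = set(period_2) - set(period_1)
```

A large value of `shift` only indicates that the label frequencies changed between the two windows. Deciding when such a change should trigger model adaptation is a separate design choice that this sketch does not address.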
4.1. Concept drift definition
The non-stationary data distribution is caused by changes in data over time. These changes can be seen as variations in the frequency with which certain kinds of data appear: a concept has frequency zero if it has not yet appeared in the dataset, and when it shows up its frequency becomes a positive number. This kind of variation, called concept drift, is one of the most important CL challenges [110,121]. We can formally define it as follows: