Non-iid data and Continual Learning processes in Federated Learning: a long road ahead



5. Addressing Federated and Continual non-IID data
As seen in Section 4, concept drift in CL scenarios can be interpreted as
the counterpart of non-IID data in FL ones, i.e., changes in the
distribution over time are the origin of statistical heterogeneity in
continual settings. Note that variations in the distribution of a single
dataset over time should always be regarded as identically distributed,
since only one dataset is affected. Nonetheless, the sets of data collected
by one client $i$ at different timestamps, $D_i^t$ and $D_i^{t+k}$, could be
studied as two different datasets, $D^t \subsetneq D^{t+k}$.
In that sense, it is logical to talk about non-identically distributed
data over time. In fact, considering again the factorization
$P(x, y) = P(x) \cdot P(y|x)$, the case analysis is identical to the one
explained in Section 3: if the client data distribution is stationary (IID
over time), then both factors remain equal over time:
$P_i^t(x) = P_i^{t+k}(x)$ and $P_i^t(y|x) = P_i^{t+k}(y|x)$. Otherwise, we
find the three possibilities illustrated in Fig. 4. From now on, we will
refer to the data heterogeneity that a client can undergo over time as
temporal non-IID data (see Section 4.1).

Information Fusion 88 (2022) 263–280
M.F. Criado et al.
Table 4
Spatial and Temporal heterogeneity learning scenarios, and the strategies that could
potentially solve each situation. Strategies that deal with changes in both the input
space and the behaviour are placed only in the last row/column, and not in the previous
ones.
Conversely, we will refer to the data heterogeneity across clients that
are training a shared model as spatial non-IID data (see Section 3).
In real-life problems, data distributions can vary in many different
ways. Clients in a federated setting are expected to collect their own
data samples under particular conditions, leading to statistically unequal
datasets. These differences can lie in the inputs each client perceives,
$P_i(x) \neq P_j(x)$, as well as in the labels associated with those
inputs, $P_i(y|x) \neq P_j(y|x)$ (see Table 1). If we want the model to be
adapted to the particularities of the training participants, standard FL
techniques will not be enough. Moreover, the process of collecting data
and solving a task takes a certain amount of time, so the desired model
should be able to evolve and adjust to future situations. Data will be
collected over a long period of time, leading to changes in the input
space, $P^t(x) \neq P^{t+k}(x)$, and also in the labels,
$P^t(y|x) \neq P^{t+k}(y|x)$ (see Fig. 4).
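The four cases per axis (no change, change in $P(x)$, change in $P(y|x)$, or both) can be illustrated with a small sketch. It is only an illustration under strong assumptions that are not part of the original text: inputs and labels are discrete, the two datasets may be either two clients or two timestamps of one client, the distance is total variation, and the threshold `tol` is arbitrary. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def empirical_factors(samples):
    """Estimate P(x) and P(y|x) from a list of discrete (x, y) pairs."""
    n = len(samples)
    x_counts = Counter(x for x, _ in samples)
    xy_counts = Counter(samples)
    p_x = {x: c / n for x, c in x_counts.items()}
    p_y_given_x = defaultdict(dict)
    for (x, y), c in xy_counts.items():
        p_y_given_x[x][y] = c / x_counts[x]
    return p_x, dict(p_y_given_x)


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def heterogeneity_type(a, b, tol=0.1):
    """Label which factor of P(x, y) = P(x) * P(y|x) differs between a and b.

    `tol` is an arbitrary illustrative threshold, not a statistical test.
    P(y|x) is only compared on inputs x observed in both datasets.
    """
    p_x_a, p_yx_a = empirical_factors(a)
    p_x_b, p_yx_b = empirical_factors(b)
    input_changed = total_variation(p_x_a, p_x_b) > tol
    shared_inputs = set(p_yx_a) & set(p_yx_b)
    behaviour_changed = any(
        total_variation(p_yx_a[x], p_yx_b[x]) > tol for x in shared_inputs
    )
    if input_changed and behaviour_changed:
        return "both"
    if input_changed:
        return "input space"
    if behaviour_changed:
        return "behaviour"
    return "IID"
```

Passing the datasets of two clients $i$ and $j$ gives a spatial label, while passing $D_i^t$ and $D_i^{t+k}$ of one client gives a temporal one. A realistic detector would replace the fixed threshold with a proper drift test.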
On the whole, there are 4 feasible scenarios along each of the spatial
and temporal axes, and they may appear combined with each other in
realistic tasks. The global data distribution $D_G^t(x, y)$, which
includes both spatial and temporal heterogeneity, can therefore evolve
following $4 \times 4 = 16$ different courses. We represent all of the
possibilities in Table 4, together with some of the strategies and
algorithms that address them. Note that we include IID data in order to
consider all of the possible combinations of heterogeneity.
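The count of 16 courses follows from pairing the 4 spatial cases with the 4 temporal ones. A minimal sketch of that enumeration is given below. The case labels and the `(spatial, temporal)` ordering are illustrative assumptions, not the layout of Table 4.

```python
from itertools import product

# Illustrative labels for the four cases on each axis (assumed naming).
CASES = ("IID", "input space", "behaviour", "both")


def scenario_grid(cases=CASES):
    """Return every (spatial, temporal) combination of heterogeneity cases."""
    return list(product(cases, repeat=2))


grid = scenario_grid()

# Scenarios that are IID on at least one axis: a change on one axis only,
# plus the fully IID case.
single_axis = [(s, t) for s, t in grid if s == "IID" or t == "IID"]

# Scenarios that are IID across clients, i.e. the temporal axis alone.
spatially_iid = [(s, t) for s, t in grid if s == "IID"]
```

Here `grid` holds the 16 combinations, `single_axis` the 7 of them in which at most one axis is heterogeneous, and `spatially_iid` the 4 in which only time can introduce heterogeneity.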
It is reasonable to think that each course must be faced with specific
methods. For instance, if the problem under consideration presents changes
in the input space over time for one or multiple participants, it could be
solved using a memory-based method that generalizes over the data from
previous distributions and avoids catastrophic forgetting. Despite fitting
this problem well, these kinds of solutions cannot deal with changes in
behaviour over time, or with changes in the input space across clients.
For this reason, in this Section we focus on determining which strategies
are more suitable for each circumstance. So far, we have only explained
some FL algorithms that were proposed as Personalization strategies,
without further discussion of the origin of the data heterogeneity, as
well as methods to detect drifts, but not to react to them.
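To make the idea of a memory-based method more concrete, the sketch below shows one common way to keep a bounded memory of past samples: a replay buffer filled by reservoir sampling. This is an assumption chosen for illustration and is not a method taken from the text. The class name, its parameters, and the choice of reservoir sampling are all hypothetical.

```python
import random


class ReservoirBuffer:
    """Fixed-size memory holding a uniform random subset of a data stream."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # number of samples observed so far
        self._rng = random.Random(seed)

    def add(self, sample):
        """Observe one sample; keep it with probability capacity / seen."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            # Draw a slot in [0, seen); replace only if it falls in the buffer.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def sample(self, k):
        """Draw up to k stored samples without replacement for rehearsal."""
        return self._rng.sample(self.items, min(k, len(self.items)))
```

In a rehearsal setup, each new training batch would be mixed with `buffer.sample(k)` so that the model keeps seeing inputs from earlier distributions. As noted above, this only helps with changes in the input space over time: stored labels become stale if $P(y|x)$ changes.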
Note that if we determine effective approaches to solve each of the
scenarios corresponding to the first row and column in Table 4, then we
will be able to solve the situation of any cell by combining the
algorithms from its corresponding row (already addressed in Section 3.3.1)
and column, as long as they are compatible. Thus, from now on we will
consider scenarios where data is IID along the spatial axis, which
corresponds to pure CL. In the following sections, we present the existing
solutions to deal with temporal non-IID data, classify those strategies
according to their shared characteristics, and compare their experimental
results when possible.
