Information Fusion 88 (2022) 263–280
Available online 3 August 2022
1566-2535/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Non-IID data and Continual Learning processes in Federated Learning: A long road ahead
Marcos F. Criado a,∗, Fernando E. Casado a, Roberto Iglesias a, Carlos V. Regueiro b, Senén Barro a
a CiTIUS (Centro Singular de Investigación en Tecnoloxías Intelixentes), Universidade de Santiago de Compostela, 15782 Santiago de Compostela, Spain
b CITIC, Computer Architecture Group, Universidade da Coruña, 15071 A Coruña, Spain
ARTICLE INFO
Keywords:
Federated Learning
Data heterogeneity
Non-IID data
Concept drift
Distributed learning
Continual learning
ABSTRACT
Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning
model collaboratively while keeping their data private. This decentralized approach is prone to suffer the
consequences of statistical data heterogeneity, both across the different entities and over time, which may
lead to a lack of convergence. To avoid such issues, different methods have been proposed in the past few
years. However, data may be heterogeneous in many different ways, and current proposals do not always
specify the kind of heterogeneity they are considering. In this work, we formally classify statistical data
heterogeneity and review the most remarkable Federated Learning strategies that are able to face it.
At the same time, we introduce approaches from other machine learning frameworks. In particular, Continual
Learning strategies are worthy of special attention, since they are able to handle common kinds of data
heterogeneity. Throughout this paper, we present many methods that could be easily adapted to Federated
Learning settings to improve performance. Apart from theoretically discussing the negative impact of data
heterogeneity, we examine it empirically and show results using different types of non-IID data.