Differential privacy in Federated Learning: balancing privacy and utility

In recent years, the use of Artificial Intelligence (AI) has surged in everyday applications. Much of this technology relies on Machine Learning algorithms that gather data to train analytical and AI models. However, this data is often stored in different repositories with restrictions on data sharing.

To address this, the concept of Federated Learning (FL) has emerged, enabling collaborative training of models among different devices or entities while keeping the training data private. FL is based on the exchange of model parameters, such as weights and gradients, rather than the training data itself. While FL improves the privacy of the training data, it also poses security challenges: attacks have been developed that compromise data privacy, allowing an adversary to determine an individual's membership in the training set, infer personal attributes, and even reconstruct the training data.

Federated Learning: the use of Differential Privacy

One solution to protect FL is the use of Differential Privacy (DP) techniques, originally designed for protecting statistical databases. These techniques have proven effective in enhancing privacy in FL schemes by introducing controlled statistical noise through mechanisms that ensure data privacy. The main parameter in these mechanisms is the privacy budget, which controls the amount of noise added to the algorithm: a smaller budget means stronger privacy but more noise. It is important to choose the privacy budget carefully, since noise accumulates with each iteration of the FL algorithm and affects both the privacy and the utility of the model.
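The relationship between the privacy budget and the amount of noise can be illustrated with the classic Gaussian mechanism. The sketch below is illustrative rather than any specific TRUMPET implementation; the function name and parameter values are our own, and sigma follows the standard analytic bound for (ε, δ)-DP:

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta):
    """Release `value` with calibrated Gaussian noise so the output
    satisfies (epsilon, delta)-differential privacy.

    The noise scale follows the classic bound
    sigma >= sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon,
    so a smaller privacy budget (epsilon) yields a larger sigma,
    i.e. more noise and stronger privacy.
    """
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + np.random.normal(0.0, sigma, size=np.shape(value))

# Example: perturb a (toy) weight vector before sharing it with the server.
weights = np.array([0.8, -1.2, 0.5])
noisy_weights = gaussian_mechanism(weights, sensitivity=1.0,
                                   epsilon=1.0, delta=1e-5)
```

Halving epsilon in this sketch doubles sigma, which makes the trade-off between privacy and utility concrete: the noisier the released weights, the less information they leak, but also the less useful they are for aggregation.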

When creating new mechanisms, various probability distributions, whether existing or newly derived, are considered. The main challenge in this line of work is proving that the mechanism satisfies the definition of DP, i.e., that its output is mathematically guaranteed to be almost indistinguishable whether or not any single individual's data is included. Another approach is to study how the privacy budget accumulates across iterations of the algorithm. This can be a significant challenge, since we are dealing with algorithms that require numerous iterations [1]. Currently, the main challenge lies in investigating methods and composition properties that minimize the privacy budget spent in each iteration of the federated algorithm.
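To see why budget accumulation matters over many iterations, consider the two standard composition bounds from the DP literature (the advanced bound is the one generalized by Rényi DP [1]). This is a minimal sketch with illustrative parameter values of our choosing:

```python
import math

def basic_composition(eps_step, delta_step, rounds):
    """Basic sequential composition: the budget grows linearly in the
    number of rounds."""
    return rounds * eps_step, rounds * delta_step

def advanced_composition(eps_step, delta_step, rounds, delta_prime):
    """Advanced composition (Dwork-Roth): for any slack delta_prime > 0,
    the total epsilon grows roughly with sqrt(rounds) instead of rounds."""
    eps_total = (eps_step * math.sqrt(2.0 * rounds * math.log(1.0 / delta_prime))
                 + rounds * eps_step * (math.exp(eps_step) - 1.0))
    return eps_total, rounds * delta_step + delta_prime

# After 100 federated rounds at epsilon = 0.1 per round, the basic bound
# gives a total of 10, while the advanced bound is substantially smaller.
eps_basic, _ = basic_composition(0.1, 1e-6, 100)
eps_adv, _ = advanced_composition(0.1, 1e-6, 100, delta_prime=1e-5)
```

Tighter accounting of exactly this kind is what allows a federated algorithm to run for many rounds without exhausting its privacy budget.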

An unresolved issue in Differential Privacy

Determining at which step of the FL process the DP mechanism should be introduced is also an unresolved issue. The most common method involves applying DP to the weights [2, 3, 4], although other alternatives propose applying DP to the entire gradient, or dividing the weight vector into magnitude and direction components and adding a different noise to each component [5].
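The two alternatives can be contrasted in a few lines. This is a simplified sketch of the idea, not the exact scheme of [5]: the function names and noise scales are hypothetical, and we use plain Gaussian noise for both options:

```python
import numpy as np

def perturb_whole(w, sigma):
    """Option 1: add noise directly to the full weight vector."""
    return w + np.random.normal(0.0, sigma, size=w.shape)

def perturb_split(w, sigma_mag, sigma_dir):
    """Option 2 (sketch of the split idea in [5]): separate the vector
    into its magnitude and its unit direction, perturb each with its own
    noise scale, then recombine."""
    magnitude = np.linalg.norm(w)
    direction = w / magnitude
    noisy_mag = magnitude + np.random.normal(0.0, sigma_mag)
    noisy_dir = direction + np.random.normal(0.0, sigma_dir, size=w.shape)
    noisy_dir /= np.linalg.norm(noisy_dir)  # restore unit length
    return noisy_mag * noisy_dir

w = np.array([3.0, 4.0])
out_whole = perturb_whole(w, sigma=0.1)
out_split = perturb_split(w, sigma_mag=0.1, sigma_dir=0.1)
```

The appeal of the split is that magnitude and direction can be protected with independently tuned noise levels, instead of a single scale that must cover both.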

The TRUMPET project focuses on achieving an appropriate balance between privacy and utility in FL. This involves exploring different approaches such as creating new DP mechanisms based on probability distributions, studying the accumulation of the privacy budget in each iteration of the algorithm, and identifying the optimal moment to apply DP mechanisms. In summary, FL brings improvements in the privacy of the training data but also introduces security challenges. The use of Differential Privacy techniques provides a solution to protect privacy in FL, but it is necessary to find a balance between privacy and utility of the algorithm to ensure optimal results.

Carlos García-Pagán, Inés Ortega, Eva Sotos from Gradiant


[1] Mironov, I. (2017, August). Rényi differential privacy. In 2017 IEEE 30th Computer Security Foundations Symposium (CSF) (pp. 263-275). IEEE.

[2] Sun, L., Qian, J., & Chen, X. (2020). LDP-FL: Practical private aggregation in federated learning with local differential privacy. arXiv preprint arXiv:2007.15789.

[3] Lyu, M., Su, D., & Li, N. (2016). Understanding the sparse vector technique for differential privacy. arXiv preprint arXiv:1603.01699.

[4] Zhao, Y., Zhao, J., Yang, M., Wang, T., Wang, N., Lyu, L., … & Lam, K. Y. (2020). Local differential privacy-based federated learning for internet of things. IEEE Internet of Things Journal, 8(11), 8836-8853.

[5] Bhowmick, A., Duchi, J., Freudiger, J., Kapoor, G., & Rogers, R. (2018). Protection against reconstruction and its applications in private federated learning. arXiv preprint arXiv:1812.00984.