On the Privacy of Federated Pipelines

Reza Nasirigerdeh, Reihaneh Torkzadehmahani, Jan Baumbach, David B. Blumenthal

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

11 Scopus citations

Abstract

Federated learning (FL) is becoming an increasingly popular machine learning paradigm in application scenarios where sensitive data available at various local sites cannot be shared due to privacy protection regulations. In FL, the sensitive data never leaves the local sites and only model parameters are shared with a global aggregator. Nonetheless, it has recently been shown that, under some circumstances, the private data can be reconstructed from the model parameters, which implies that data leakage can occur in FL. In this paper, we draw attention to another risk associated with FL: Even if federated algorithms are individually privacy-preserving, combining them into pipelines is not necessarily privacy-preserving. We provide a concrete example from genome-wide association studies, where the combination of federated principal component analysis and federated linear regression allows the aggregator to retrieve sensitive patient data by solving an instance of the multidimensional subset sum problem. This supports the increasing awareness in the field that, for FL to be truly privacy-preserving, measures have to be undertaken to protect against data leakage at the aggregator.

Original languageEnglish
Title of host publicationSIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
PublisherAssociation for Computing Machinery, Inc
Pages1975-1979
Number of pages5
ISBN (Electronic)9781450380379
DOIs
StatePublished - 11 Jul 2021
Externally publishedYes
Event44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021 - Virtual, Online, Canada
Duration: 11 Jul 202115 Jul 2021

Publication series

NameSIGIR 2021 - Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021
Country/TerritoryCanada
CityVirtual, Online
Period11/07/2115/07/21

Keywords

  • federated learning
  • genome-wide association studies
  • integer linear programming
  • multidimensional subset sum
  • privacy

Fingerprint

Dive into the research topics of 'On the Privacy of Federated Pipelines'. Together they form a unique fingerprint.

Cite this