Incentivising the federation: gradient-based metrics for data selection and valuation in private decentralised training

Dmitrii Usynin, Daniel Rueckert, Georgios Kaissis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Obtaining high-quality data for collaborative training of machine learning models can be a challenging task due to A) regulatory concerns and B) a lack of data owner incentives to participate. The first issue can be addressed through the combination of distributed machine learning techniques (e.g. federated learning) and privacy-enhancing technologies (PETs), such as differentially private (DP) model training. The second challenge can be addressed by rewarding the participants for giving access to data that is beneficial to the model being trained, which is of particular importance in federated settings, where the data is unevenly distributed. However, DP noise can adversely affect underrepresented and atypical (yet often informative) data samples, making it difficult to assess their usefulness. In this work, we investigate how to leverage gradient information to permit the participants of private training settings to select the data most beneficial for the jointly trained model. We assess two such methods, namely variance of gradients (VoG) and the privacy loss-input susceptibility score (PLIS). We show that these techniques can provide the federated clients with tools for principled data selection even in stricter privacy settings.
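To make the idea of a gradient-based selection score more concrete, the following is a minimal sketch of a variance-of-gradients style score. It is not the authors' implementation: the two-layer tanh model, the function names (`input_gradients`, `vog_scores`) and the random "checkpoints" are all assumptions made for illustration, and no DP noise is applied. For each sample it takes the gradient of the true-class logit with respect to the input at several checkpoints, computes the variance across checkpoints, and averages over input features.

```python
import numpy as np


def input_gradients(W1, W2, X, Y):
    """Gradient of the true-class logit w.r.t. each input (hypothetical toy model).

    Model assumed here: logits = W2 @ tanh(W1 @ x).
    W1: (hidden, features), W2: (classes, hidden),
    X: (n, features), Y: (n,) integer labels.
    """
    h = np.tanh(X @ W1.T)                 # hidden activations, shape (n, hidden)
    # Chain rule: d logit_y / d x = (W2[y] * (1 - h^2)) @ W1
    return (W2[Y] * (1.0 - h ** 2)) @ W1  # shape (n, features)


def vog_scores(checkpoints, X, Y):
    """One score per sample: variance of input gradients across checkpoints,
    averaged over input features."""
    grads = np.stack([input_gradients(W1, W2, X, Y) for W1, W2 in checkpoints])
    return grads.var(axis=0).mean(axis=1)  # shape (n,)


# Illustrative usage with random data and random "checkpoints" (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
Y = rng.integers(0, 3, size=5)
checkpoints = [(rng.normal(size=(6, 4)), rng.normal(size=(3, 6))) for _ in range(3)]

scores = vog_scores(checkpoints, X, Y)
top_two = np.argsort(scores)[::-1][:2]  # indices of the two highest-scoring samples
```

In a federated setting, a client could in principle rank its local samples by such a score and contribute the highest-ranked ones; how the score behaves under DP noise is the question the paper studies and is not captured by this sketch.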

Original language: English
Title of host publication: Proceedings of the 2024 European Interdisciplinary Cybersecurity Conference, EICC 2024
Editors: Kovila Coopamootoo, Michael Sirivianos
Publisher: Association for Computing Machinery
Pages: 179-185
Number of pages: 7
ISBN (Electronic): 9798400716515
State: Published - 5 Jun 2024
Event: 2024 European Interdisciplinary Cybersecurity Conference, EICC 2024 - Xanthi, Greece
Duration: 5 Jun 2024 to 6 Jun 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2024 European Interdisciplinary Cybersecurity Conference, EICC 2024
Country/Territory: Greece
City: Xanthi
Period: 5/06/24 to 6/06/24

Keywords

  • data valuation
  • differential privacy
  • federated learning
