Sfaira accelerates data and model reuse in single cell genomics

David S. Fischer, Leander Dony, Martin König, Abdul Moeed, Luke Zappia, Lukas Heumos, Sophie Tritschler, Olle Holmberg, Hananeh Aliee, Fabian J. Theis

Research output: Contribution to journal › Article › peer-review



Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public datasets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public datasets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of datasets using ontologies for metadata. We propose an adaptation of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.
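The abstract does not spell out the form of the adapted loss, so the following is only a minimal sketch of one plausible reading: when a cell carries a coarse label, the loss is the negative log of the total predicted probability over all leaf cell types compatible with that label. The toy ontology, the class names, and the function `coarse_label_cross_entropy` are hypothetical and are not taken from the sfaira code base.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_label_cross_entropy(logits, allowed_leaves):
    """Negative log of the probability mass on the leaf classes
    compatible with the observed (possibly coarse) label."""
    probs = softmax(logits)
    mass = sum(probs[i] for i in allowed_leaves)
    return -math.log(mass)


# Hypothetical toy ontology: each label maps to the indices of the
# leaf classes that descend from it (a leaf maps to itself).
DESCENDANT_LEAVES = {
    "CD4 T cell": [0],
    "CD8 T cell": [1],
    "B cell": [2],
    "T cell": [0, 1],          # coarse label covering two leaves
    "lymphocyte": [0, 1, 2],   # root of the toy ontology
}

logits = [2.0, 1.0, -1.0]  # illustrative classifier output for one cell

fine_loss = coarse_label_cross_entropy(logits, DESCENDANT_LEAVES["CD4 T cell"])
coarse_loss = coarse_label_cross_entropy(logits, DESCENDANT_LEAVES["T cell"])
root_loss = coarse_label_cross_entropy(logits, DESCENDANT_LEAVES["lymphocyte"])
```

Under this reading, a leaf-level label reduces to the ordinary cross-entropy, while a coarser label penalizes the model only for probability placed outside the labeled subtree, so datasets annotated at different resolutions can contribute to the same classifier.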

Original language: English
Article number: 248
Journal: Genome Biology
Issue number: 1
State: Published - Dec 2021


  • Data zoo
  • Model zoo
  • Single-cell genomics

