
Long-read sequence assembly: A technical evaluation in barley

Martin Mascher, Thomas Wicker, Jerry Jenkins, Christopher Plott, Thomas Lux, Chu Shin Koh, Jennifer Ens, Heidrun Gundlach, Lori B. Boston, Zuzana Tulpová, Samuel Holden, Inmaculada Hernández-Pinzón, Uwe Scholz, Klaus F.X. Mayer, Manuel Spannagl, Curtis J. Pozniak, Andrew G. Sharpe, Hana Simková, Matthew J. Moscou, Jane Grimwood, Jeremy Schmutz, Nils Stein

  • Leibniz Institute of Plant Genetics and Crop Plant Research
  • German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig
  • University of Zurich
  • HudsonAlpha Institute for Biotechnology
  • Helmholtz Zentrum München German Research Center for Environmental Health
  • Global Institute for Food Security
  • University of Saskatchewan
  • Institute of Experimental Botany of the Academy of Sciences of the Czech Republic
  • John Innes Centre
  • Georg August Universität Göttingen

Research output: Contribution to journal, Article, peer-reviewed

286 Scopus citations

Abstract

Sequence assembly of large and repeat-rich plant genomes has been challenging, requiring substantial computational resources and often several complementary sequence assembly and genome mapping approaches. The recent development of fast and accurate long-read sequencing by circular consensus sequencing (CCS) on the PacBio platform may greatly increase the scope of plant pan-genome projects. Here, we compare current long-read sequencing platforms regarding their ability to rapidly generate contiguous sequence assemblies in pan-genome studies of barley (Hordeum vulgare). Most long-read assemblies are clearly superior to the current barley reference sequence based on short reads. Assemblies derived from accurate long reads excel in most metrics, but the CCS approach was the most cost-effective strategy for assembling tens of barley genomes. A downsampling analysis indicated that 20-fold CCS coverage can yield very good sequence assemblies, while even five-fold CCS data may capture the complete sequence of most genes. We present an updated reference genome assembly for barley with near-complete representation of the repeat-rich intergenic space. Long-read assembly can underpin the construction of accurate and complete sequences of multiple genomes of a species to build pan-genome infrastructures in Triticeae crops and their wild relatives.

Original language: English
Pages (from-to): 1888-1906
Number of pages: 19
Journal: Plant Cell
Volume: 33
Issue number: 6
State: Published - Jun 2021
Externally published: Yes
