An open resource for accurately benchmarking small variant and reference calls

Justin M. Zook

Jennifer McDaniel

Nathan D. Olson

Justin M. Wagner

Hemang Parikh

Haynes Heaton

Sean A. Irvine

Len Trigg

Rebecca Truty

Cory Y. McLean

Francisco M. De La Vega

Chunlin Xiao

Stephen Sherry

Marc Salit

Nature Biotechnology, 37 (2019), 561–566

Abstract

Benchmark small variant calls are required for developing, optimizing and assessing the performance of sequencing and bioinformatics methods. Here, as part of the Genome in a Bottle (GIAB) Consortium, we apply a reproducible, cloud-based pipeline to integrate multiple short- and linked-read sequencing datasets and provide benchmark calls for human genomes. We generate benchmark calls for one previously analyzed GIAB sample, as well as six genomes from the Personal Genome Project. These new genomes have broad, open consent, making this a ‘first of its kind’ resource that is available to the community for multiple downstream applications. We produce 17% more benchmark single nucleotide variations, 176% more indels and 12% larger benchmark regions than previously published GIAB benchmarks. We demonstrate that this benchmark reliably identifies errors in existing callsets and highlight challenges in interpreting performance metrics when using benchmarks that are not perfect or comprehensive. Finally, we identify strengths and weaknesses of callsets by stratifying performance according to variant type and genome context.

Research Areas

Health & Bioscience
