OurDNA dataset

  • Christopher Richards (Creator)
  • Katrina de Lange (Creator)
  • Jennifer Piscionere (Creator)
  • Katalina S Bobowik (Creator)
  • Daniela Bodemer (Creator)
  • Stuart Cantsilieris (Creator)
  • Dan Coates (Creator)
  • Samantha Croy (Creator)
  • Zuong Dang (Creator)
  • Michael Harper (Creator)
  • Bindu Swapna Madala (Creator)
  • Amy Miniter (Creator)
  • Joshua M. Schmidt (Creator)
  • Luke Seesink (Creator)
  • Rafal Shouly (Creator)
  • Michael Silk (Creator)
  • Alex Stuckey (Creator)
  • Bronwyn Terrill (Creator)
  • Caitlin Uren (Creator)
  • Marijana Vanevski (Creator)
  • Maia Ambegaokar (Creator)
  • Vivian Bakiris (Creator)
  • Lisa Burton Crawford (Creator)
  • Daniel C Esposito (Creator)
  • Michael K Franklin (Creator)
  • Leonhard Gruenschloss (Creator)
  • Sally L Sansom (Creator)
  • Miloslav Hyben (Creator)
  • Chethana Krishnakumar (Creator)
  • Caitlin Morrison (Creator)
  • Hannah R. Nicholas (Creator)
  • Shenei Penaia (Creator)
  • Andreia Pinho (Creator)
  • Rocio Rius (Creator)
  • Vladislav Savelyev (Creator)
  • Cas Simons (Creator)
  • Natalie P Stewart (Creator)
  • Helen Tsimiklis (Creator)
  • Y.A. Wang (Creator)
  • Laura Wedd (Creator)
  • Laura M Yeates (Creator)
  • Basim Alansari (Creator)
  • Faustino Jerome G Babate (Creator)
  • Cesar Bartolome (Creator)
  • Joyce Cho (Creator)
  • Mila A Cichello (Creator)
  • Sylvia A Coombe (Creator)
  • Abdisa A Kalbesa (Creator)
  • Dina Kerr (Creator)
  • Maria Grace Liston (Creator)
  • Naseema Mustapha (Creator)
  • Saba Nabi (Creator)
  • Anh Linh Pham (Creator)
  • Nidia Raya Martinez (Creator)
  • Lee Maureen Santiago (Creator)
  • V.B.B Tran (Vietnamese Community in Australia - NSW Chapter) (Creator)
  • Gloria T Velasquez (Creator)
  • Tiffany F Boughtwood (Creator)
  • Mary Ann Geronimo (Creator)
  • Jodie Ingles (Creator)
  • Sanji Kanagalingam (Creator)
  • Jane Marning (Creator)
  • Melissa C. Southey (Creator)
  • Danya F Vears (Creator)
  • Mary-Anne Young (Creator)
  • Martin Delatycki (Creator)
  • Gemma Figtree (Creator)
  • Alex Hewitt (Creator)
  • Edwin P Kirk (Creator)
  • Nigel Laing (Creator)
  • Daniel G. MacArthur (Creator)
  • Katie Arkell (Contributor)
  • Amy Baker (Creator)
  • Stephanie Best (Contributor)
  • Alex D Brown (Australian National University) (Contributor)
  • Samantha J. Bryen (Contributor)
  • Ira W. Deveson (Contributor)
  • Zoe Fehlberg (Contributor)
  • Edward Formaini (Creator)
  • Azure Hermes (Contributor)
  • Vera Howlett (Contributor)
  • Amanda Hreszczuk (Contributor)
  • David Irving (Contributor)
  • Karin S Kassahn (Contributor)
  • Elizabeth Knight (Contributor)
  • Sarah Kummerfeld (Contributor)
  • Paul Lacaze (Contributor)
  • John Marshall (Contributor)
  • Ignatius J Menzies (Contributor)
  • Lucas A. Mitchell (Contributor)
  • Jonathan Nguyen (Contributor)
  • Yash Pankhania (Contributor)
  • Daniel Pavlic (Contributor)
  • Joseph E. Powell (Contributor)
  • Paris Reedy (Contributor)
  • Heidi L. Rehm (Contributor)
  • Elise Richards (Other)
  • Sonia Shah (Contributor)
  • Michael E. Talkowski (Contributor)
  • Natasha Tamasese (Contributor)
  • Loic Thibault (Contributor)
  • Rachel Thorpe (Contributor)
  • David Wallace (Creator)
  • Matthew Welland (Creator)
  • Sabrina Yan (Contributor)
  • Loic Yengo (Contributor)

Dataset

Description

The OurDNA dataset is composed of harmonised, aggregated genome and exome sequences from the OurDNA program and provides the foundational reference set used by the OurDNA browser. The OurDNA program is a flagship initiative of the Centre for Population Genomics to increase the genomic representation of Australian multicultural communities. The OurDNA program aims to aggregate and share genetic variation data from over 20,000 Australians, including 8,000 new high-quality whole genome sequences from participants from genomically underrepresented groups recruited following participatory community engagement. 

The OurDNA Browser is a resource intended for clinicians and researchers with formal training in genetics and genomics who understand the limitations of population genetic data. Use of the dataset is subject to conditions of use as outlined in OurDNA browser policies.

The OurDNA dataset v1 (GRCh38) includes 12,882 individuals:

  • 10,671 exomes
  • 2,211 genomes

Short variants

  • Total SNVs: 57,322,471
  • Total INDELs: 4,567,608

Variant type counts

  • Synonymous: 719,413
  • Missense: 1,321,931
  • Nonsense: 35,735
  • Frameshift: 36,991
  • Canonical splice site: 33,237



Versioned, aggregate data are available for download. Download instructions are provided on the OurDNA browser.

Methods

The OurDNA dataset contains individuals sequenced using a mix of exome and genome capture methods and sequencing chemistries, so coverage varies between individuals and across sites. This variation in coverage is incorporated into the frequency calculation for each variant. Data underwent quality control (QC) and analysis using Hail, an open-source framework for scalable genetic analysis.
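
The record does not spell out how coverage enters the frequency calculation. One common approach in aggregate resources of this kind, assumed here for illustration, is to divide the allele count (AC) by the allele number (AN), where AN counts only alleles from individuals with a called genotype at that site. The function name and the genotype encoding below are hypothetical, and the sketch covers a single biallelic, diploid site only.

```python
from typing import Optional, Sequence, Tuple

# A diploid genotype at a biallelic site: 0 = reference allele, 1 = alternate allele.
# None marks a no-call, e.g. a sample with insufficient coverage at this site.
Genotype = Optional[Tuple[int, int]]


def allele_frequency(genotypes: Sequence[Genotype]) -> Tuple[int, int, Optional[float]]:
    """Return (AC, AN, AF) for one site, ignoring samples without a call."""
    called = [g for g in genotypes if g is not None]
    # AC: number of alternate alleles observed among called genotypes.
    ac = sum(allele == 1 for g in called for allele in g)
    # AN: two alleles per called diploid sample; no-calls contribute nothing.
    an = 2 * len(called)
    # AF is undefined when no sample has a call at the site.
    af = ac / an if an else None
    return ac, an, af


# Five samples, two of which have no call at this site.
site = [(0, 1), (1, 1), None, (0, 0), None]
ac, an, af = allele_frequency(site)
print(ac, an, af)
```

Under this assumption, a site covered in only a subset of samples has a smaller AN, so its frequency is estimated from the samples that were actually assessed rather than from all 12,882 individuals.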

All raw data from contributing projects and the OurDNA project have been (re)processed through equivalent pipelines to increase consistency across projects. Short-read whole genome sequencing data were processed according to the DRAGEN-GATK Best Practices guidelines. This includes alignment to GRCh38 using the open-source DRAGEN mapper (DRAGMAP, v1.3.0), and variant calling with GATK v4.2.6.1 HaplotypeCaller to discover single-nucleotide variants (SNVs) and insertions and deletions (indels). All samples were aggregated using the Hail gVCF combiner, and sample and variant quality control was then performed on the joint call set in line with gnomAD best practices.

Funding



Garvan Institute of Medical Research (https://ror.org/01b3dvp57) and Murdoch Children’s Research Institute (https://ror.org/048fyec77) contribute to the development of this resource via their significant funding support for the Centre for Population Genomics, enabled through the generosity of donors.

Funding for this research has also been provided by the Australian Government’s Medical Research Future Fund (MRFF) grant 2015969 (CIA Daniel MacArthur; 2022-2027) from the Genomics Health Futures Mission and by the National Health and Medical Research Council (NHMRC, https://ror.org/011kf5r70) investigator grant 2009982 (CIA Daniel MacArthur; 2022-2026).

The contents of this published material are solely the responsibility of the authors and do not reflect the views of the Commonwealth of Australia or the NHMRC.

Christopher Richards, Katrina de Lange and Jennifer Piscionere are joint first authors.
Date made available: 14 May 2025
Publisher: Zenodo