An international consortium of scientists is proposing what is arguably the most ambitious project in the history of biology: sequencing the DNA of all known eukaryotic species on Earth.
The monumental initiative promises to transform the scientific understanding of life on Earth and to provide a vital new resource for global innovations in medicine, agriculture, conservation, technology and genomics.
The central goal of the Earth BioGenome Project is to understand the evolution and organization of life on our planet by sequencing and functionally annotating the genomes of 1.5 million known species of eukaryotes, a massive group that includes plants, animals, fungi and other organisms whose cells have a nucleus that houses their chromosomal DNA. To date, the genomes of less than 0.2 percent of eukaryotic species have been sequenced.
The project also seeks to reveal some of the estimated 10 to 15 million unknown species of eukaryotes, most of which are single-celled organisms, insects and small animals in the oceans. The genomic data will be a freely available resource for scientific discovery, and the resulting benefits will be shared with the countries and indigenous communities where the biodiversity is sourced. Researchers estimate the proposed initiative will take ten years and cost approximately $4.7 billion.
In a perspective paper published April 23 in the Proceedings of the National Academy of Sciences (PNAS), twenty-four interdisciplinary experts, who make up the working group of the Earth BioGenome Project, provide a compelling rationale for why the project should go forward and outline a roadmap for how it can be achieved.
Harris A. Lewin, a distinguished professor of evolution and ecology at UC Davis, chairs the working group and is the lead author of the paper. Gene Robinson, director of the Carl R. Woese Institute for Genomic Biology at the University of Illinois, and W. John Kress, research botanist and curator at the Smithsonian Institution, are co-chairs.
In the paper, the scientists point to the hugely successful precedent of the Human Genome Project. For that project, launched in 1990 and completed in 2003, the United States and funding agencies in other countries invested approximately $3 billion to sequence the entire human genome. The resulting “genomic revolution” has had an enormous impact not only on human medicine but also on veterinary medicine, agricultural bioscience, biotechnology, environmental science, renewable energy, forensics and industrial biotechnology. A 2013 report by the Battelle Memorial Institute estimated the financial benefit of the Human Genome Project to the US economy to be nearly $1 trillion.
Lewin sees the Earth BioGenome Project as providing even greater opportunities for generating scientific and societal benefits.
“The EBP will lay the scientific foundation for a new bio-economy that has the potential to bring innovative solutions to health, environmental, economic and social problems to people across the globe, especially in under-developed countries that have significant biodiversity assets,” he said.
The project first emerged in 2015 at a meeting organized by Lewin, Robinson and Kress, followed by another meeting organized as part of the Smithsonian Initiative on Biodiversity Genomics. After the completion of the Human Genome Project, many organisms of biomedical, agricultural and industrial importance had their genomes sequenced. The attendees at the 2015 meeting decided that an even more ambitious project was needed to advance biology, one that would sequence DNA from all complex life on Earth.
Advances in technology have made the project feasible. The cost of whole genome sequencing has declined to about $1,000 for a draft-quality sequence of human genome size and about $30,000 for a reference-quality assembly of the chromosomes of an average eukaryotic genome.
With advances in high-performance computing, data storage and bioinformatics, high-throughput assembly and characterization of genomes is now feasible, although innovations in algorithms for aligning, interpreting and visualizing the massive amounts of data will be necessary. The completed project is expected to require about one exabyte (one billion gigabytes) of digital storage capacity.
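To put the storage figure in perspective, the short Python sketch below converts one exabyte to gigabytes and divides it by the 1.5 million known species quoted in the article. It assumes decimal (SI) units and an even split across species; the even split is purely illustrative and is not how the project says it will allocate storage. The function and constant names are hypothetical.

```python
# Back-of-the-envelope storage arithmetic using the figures quoted in the article.
# Assumption: decimal SI units (1 EB = 10**18 bytes, 1 GB = 10**9 bytes).
GIGABYTES_PER_EXABYTE = 10**9
KNOWN_SPECIES = 1_500_000  # known eukaryotic species, as stated in the article


def average_gigabytes_per_species(total_exabytes: float = 1.0,
                                  species: int = KNOWN_SPECIES) -> float:
    """Average storage per species if the total were split evenly.

    The even split is an illustrative assumption, not a project plan.
    """
    return total_exabytes * GIGABYTES_PER_EXABYTE / species


if __name__ == "__main__":
    average = average_gigabytes_per_species()
    print(f"About {average:.0f} GB per species on average")
```

Under these assumptions the average comes to roughly 667 gigabytes per species, although real storage needs would vary widely with genome size and data type.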
Addressing critical needs
The project also addresses several critical needs. One is the need for better conservation tools for endangered species and ecosystems, particularly those impacted by climate change.
“The Earth BioGenome Project will give us insight into the history and diversity of life and help us better understand how to conserve it,” Robinson said.
The working group also sees the project as essential for developing new drugs for infectious and inherited diseases, as well as for creating new biological synthetic fuels, biomaterials and food sources for the anticipated human population of 9.6 billion by 2050.
“Scientists believe that by the end of the century more than half of all species will vanish from the face of the Earth, and with consequences to human life that are unknown, but are potentially catastrophic,” Lewin said.
To help achieve its vision, the Earth BioGenome Project is developing an array of global partnerships and strategies.
The organizational structure of the project will consist of a “global network of communities,” with each community contributing to the project and following the project’s protocols and standards. The project has partnered with the Global Genome Biodiversity Network, the world’s major resource of tissues and DNA from voucher specimens. It is also forging partnerships with communities of scientists working on different groups of organisms, including the Vertebrate Genomes Project, the Global Invertebrate Genome Alliance, the 10,000 Plant Genomes Project, the 5000 Insect Genomes Project, and others.
Assembling the species will be a massive undertaking, which is why partnerships with institutions that procure and preserve the Earth’s biodiversity, such as natural history museums, botanical gardens, zoos and aquaria, will be crucial for success. The Smithsonian herbarium, for example, contains around 300,000 species.
“Many scientists at the Smithsonian Institution with its 19 museums and nine research institutes are applying genomics technologies in their research to increase our understanding of the natural world. The strength of biodiversity genomics at the Smithsonian is a good indicator of the vital role the institution will play in furthering the goals of the Earth BioGenome Project,” Kress said.
The Earth BioGenome Project also plans to capitalize on the “citizen scientist” movement to collect specimens, modeled after the University of California Conservation Genomic Consortium’s CALeDNA program. The project will likely enable the development of new technologies, such as portable genetic sequencers and instrumented drones that can go out, identify samples in the field, and bring those samples back to the laboratory.
A pilot program has been initiated in conjunction with the Amazon Bank of Codes and the World Economic Forum to create a system that ensures equitable sharing of benefits arising from the utilization of genetic resources under the Nagoya Protocol to the Convention on Biological Diversity. Brazil contains approximately 10 percent of the world’s total biodiversity. The project will offer indigenous and traditional communities in the Amazon Basin an opportunity to reap a fair share of the economic value generated from the use of biological data and natural assets from their local biomes. If successful, the pilot program will serve as the foundation for other countries with rich biodiversity.
Harris A. Lewin et al., Earth BioGenome Project: Sequencing life for the future of life. PNAS (2018). www.pnas.org/cgi/doi/10.1073/pnas.1720115115
The Earth BioGenome Project aims to sequence all eukaryotic species. This superkingdom of life includes all organisms except bacteria and archaea.
Credit: Mirhee Lee