Mapping the ‘dark matter’ of human DNA
Detail of a representation of a DNA variants map. Credit: Nature Communications (DOI: 10.1038/ncomms12989)
Although our knowledge of human DNA is extensive, it is nowhere near complete. For instance, we often do not know exactly which changes in our DNA are responsible for a particular disease. This is related to the fact that no two people have exactly the same DNA. Even the DNA of identical twins differs, because changes arise during development and ageing. Some differences ensure that not everybody looks exactly alike, while others determine our susceptibility to particular diseases. Knowledge about DNA variants can therefore tell us a lot about potential health risks and is a first step towards personalized medicine. Many small variants in the human genome (the whole of the genetic information in a cell) have already been documented. Larger structural variants are known to play an important role in many hereditary diseases, but they are more difficult to detect and have therefore been studied far less.
By comparing the DNA of 250 healthy Dutch families with the reference DNA database, the researchers were able to identify 1.9 million variants affecting multiple DNA ‘letters’. These variants include large sections of DNA that have disappeared, moved or even appeared out of nowhere. When this happens in the middle of a gene that encodes a certain protein, it is likely that the functionality of the gene, and thus the production of the protein, is compromised. However, large structural variants often occur just before or after the coding part of a gene, and the effect of this type of variation is hard to predict.
The paper describes two cases in which an extra piece of DNA was found just outside the coding region of a gene. In both cases the variant had a demonstrable effect on the regulation of that gene. This shows that even structural variants that occur outside the coding regions need to be monitored closely in future DNA screenings. The catalogue of variants provided by this research enables other scientists to predict the occurrence of large structural variants from the known profile of the smaller ones. This technique opens new possibilities for studying the effects of large structural changes in our genomes.
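The idea of predicting a large structural variant from the profile of nearby small variants can be illustrated with a toy example. The sketch below is not the method used in the paper; it is a minimal nearest-neighbour illustration, and the panel data, the names `REFERENCE_PANEL`, `hamming` and `predict_sv`, and the tie-breaking rule are all assumptions made for this example. Each reference haplotype records its alleles at four neighbouring small variants and whether it carries the structural variant; a new haplotype is assigned the status seen most often among the reference haplotypes it resembles most closely.

```python
from collections import Counter

# Hypothetical toy reference panel: each entry pairs a haplotype's alleles at
# four nearby small variants (0 = reference, 1 = alternate) with whether that
# haplotype carries the large structural variant. All values are made up.
REFERENCE_PANEL = [
    ((0, 0, 1, 0), False),
    ((0, 0, 1, 1), False),
    ((1, 1, 0, 0), True),
    ((1, 1, 0, 1), True),
    ((1, 0, 0, 0), True),
]


def hamming(a, b):
    """Count the positions at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_sv(haplotype, panel=REFERENCE_PANEL):
    """Guess structural-variant status from the closest panel haplotypes."""
    best = min(hamming(haplotype, snps) for snps, _ in panel)
    votes = Counter(
        has_sv for snps, has_sv in panel if hamming(haplotype, snps) == best
    )
    # Majority vote among the nearest haplotypes; ties resolve to "no SV"
    # (an arbitrary choice made for this sketch).
    return votes[True] > votes[False]


if __name__ == "__main__":
    for query in [(1, 1, 0, 0), (0, 0, 1, 0), (1, 1, 1, 0)]:
        print(query, "carries SV:", predict_sv(query))
```

Real imputation tools work with statistical models over thousands of phased haplotypes rather than a simple vote, but the underlying principle is the same: small variants that are inherited together with a structural variant can act as a proxy for it.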
Additionally, the research resulted in the discovery of large parts of DNA that were not included in the genome reference. This “extra” DNA does contain parts that could be involved in the production of proteins. One of the extra pieces of DNA described in the paper is a new “ZNF” gene that had never previously been found in humans. Nevertheless, it appears to be present in roughly half of the Dutch population. This particular gene is a member of the ZNF gene family that was known from the reference genomes of several species of apes. The new variant will now be added to the human reference database. The authors subsequently showed that this gene is also present in the genomes of several other human populations; however, its function remains unknown. The fact that this and other pieces of “dark matter” have now been placed on the genetic map enables scientists worldwide to study them and use the results to better understand human genetic diseases.
This study is part of the Genome of the Netherlands (GoNL) project. One of the main goals of the project is to map the genome of the Dutch population and all its variants. Several teams of bioinformaticians from different countries work continuously on the development of new algorithms for data analysis, as well as on innovative ways to combine existing algorithms. The result: an accurate representation of the genomes of the Dutch population and thereby a solid base for the personalized medicine of the future.
Source: Saarland University
- Jayne Y. Hehir-Kwa, Tobias Marschall, Wigard P. Kloosterman, Laurent C. Francioli, Jasmijn A. Baaijens, Louis J. Dijkstra, Abdel Abdellaoui, Vyacheslav Koval, Djie Tjwan Thung, René Wardenaar, Ivo Renkens, Bradley P. Coe, Patrick Deelen, Joep de Ligt, Eric-Wubbo Lameijer, Freerk van Dijk, Fereydoun Hormozdiari, Jasper A. Bovenberg, Anton J. M. de Craen, Marian Beekman, Albert Hofman, Gonneke Willemsen, Bruce Wolffenbuttel, Mathieu Platteel, Yuanping Du, Ruoyan Chen, Hongzhi Cao, Rui Cao, Yushen Sun, Jeremy Sujie Cao, Pieter B. T. Neerincx, Martijn Dijkstra, George Byelas, Alexandros Kanterakis, Jan Bot, Martijn Vermaat, Jeroen F. J. Laros, Johan T. den Dunnen, Peter de Knijff, Lennart C. Karssen, Elisa M. van Leeuwen, Najaf Amin, Fernando Rivadeneira, Karol Estrada, Jouke-Jan Hottenga, V. Mathijs Kattenberg, David van Enckevort, Hailiang Mei, Mark Santcroos, Barbera D. C. van Schaik, Robert E. Handsaker, Steven A. McCarroll, Arthur Ko, Peter Sudmant, Isaac J. Nijman, André G. Uitterlinden, Cornelia M. van Duijn, Evan E. Eichler, Paul I. W. de Bakker, Morris A. Swertz, Cisca Wijmenga, Gert-Jan B. van Ommen, P. Eline Slagboom, Dorret I. Boomsma, Alexander Schönhuth, Kai Ye, Victor Guryev. A high-quality human reference panel reveals the complexity and distribution of genomic structural variants. Nature Communications, 2016; 7: 12989 DOI: 10.1038/ncomms12989