African DNA contains sequences missing from the human reference genome, equivalent to 10% of its length


By National Human Genome Research Institute

Research on the human genome, which carries human genetic information, continues to advance, and GRCh38, the latest version of the reference sequence treated as the "standard" human sequence, was released in December 2013. Even so, many parts of the genome remain unresolved. A research team at Johns Hopkins University has analyzed the genomes of 910 people of African descent and announced that it has identified previously unresolved sequence equivalent to about 10% of the reference.

Assembly of a pan-genome from deep sequencing of 910 humans of African descent | Nature Genetics
https://www.nature.com/articles/s41588-018-0273-y

DNA data from Africans reveals sequences that we'd missed | Ars Technica
https://arstechnica.com/science/2018/11/our-human-reference-genome-is-missing-a-lot-of-material/

Until now, the human reference sequence has been built from the genomes of only a few individuals, and most of GRCh38 derives from the DNA of a single person. Opinions differ on whether a sequence in this state deserves to be called a "reference," and many researchers are working on reference sequences built from a much broader range of genomic information.



Within that work, building a reference sequence from data that spans diverse populations is considered particularly important, and the African genome analysis in the newly published study appears to be part of that effort. As noted above, the subjects are people of African descent, but they are not limited to people living in Africa: people living in regions such as America, Asia, and Europe are also included.

The researchers compared the subjects' DNA with the reference sequence and looked for stretches longer than 1,000 base pairs that did not match it. They found unmatched sequence totaling about 300 million bases, which corresponds to approximately 10% of the entire reference sequence.
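The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration, not the researchers' pipeline: the function name, the toy segment lengths, and the reference length of roughly 3.1 billion bases are assumptions made for the example, while the 1,000 base pair threshold and the 300 million base total come from the article.

```python
# Illustrative sketch only; not the method used in the study.
# ASSUMPTION: GRCh38 is treated as roughly 3.1 billion bases long.
REFERENCE_LENGTH = 3_100_000_000
# From the article: only unmatched stretches longer than 1,000 bp count.
MIN_LENGTH = 1_000


def summarize_unmatched(segment_lengths, reference_length=REFERENCE_LENGTH,
                        min_length=MIN_LENGTH):
    """Total the unmatched segments longer than min_length and
    express that total as a fraction of the reference length."""
    kept = [n for n in segment_lengths if n > min_length]
    total = sum(kept)
    return total, total / reference_length


# Hypothetical toy data: lengths (in bases) of unmatched segments.
toy_segments = [500, 1_000, 1_200, 25_000, 80]
toy_total, toy_fraction = summarize_unmatched(toy_segments)

# Figure reported in the article: about 300 million unmatched bases.
reported_total = 300_000_000
reported_fraction = reported_total / REFERENCE_LENGTH
print(f"{reported_fraction:.1%}")  # prints 9.7%
```

With the assumed reference length, 300 million bases works out to a little under 10%, which is consistent with the "approximately 10%" figure quoted above.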



Furthermore, these unmatched sequences are not unique to people of African descent: about 40% of them also match the genomes of Korean and Chinese individuals. This suggests that important genetic information is shared across different populations, and that the current reference sequence, derived from a small number of people, does not capture the full picture.



The human reference sequence is itself still an object of study, and as the range of people studied widens, new questions are likely to emerge one after another, so more detailed and wide-ranging research will be needed. The field may contribute to the development of new kinds of medicines based on genomic information, and further progress is expected.

in Science, Posted by darkhorse_log