In the 19th century, the Austrian monk Gregor Mendel investigated how traits are inherited in peas that he grew in his monastery garden. Mendel established genetic principles and introduced concepts that we still use today. For heredity to work at all, the information contained in DNA must be copied. Every time a cell divides, all the DNA in the cell is copied, so that all cells, including the gametes (germ cells) that give rise to a new individual, have the same information. This copying is called replication. Knowledge of how DNA works at the molecular level did not come until about 100 years after Mendel's time.
To understand how traits are inherited through DNA, knowledge of chromosomes is needed. We humans have most of our genes packaged in 46 chromosomes, 23 inherited from each parent. Two by two, the 23 chromosomes inherited from each parent form homologous pairs of chromosomes that contain the same genes, the only exception to homologous chromosome pairs being the sex chromosomes (X and Y) in a male.
Chromosome pairs in a woman at the top and a man at the bottom. Note that the man's X and Y chromosomes are not homologous. Image source: https://eweitz.github.io/ideogram/multiple-trio
The smallest biological units that make up all life on Earth are cells. We humans and all other animals, all plants and most fungi are multicellular organisms that consist of so-called eukaryotic cells where DNA is protected inside a cell nucleus (2). Eukaryotic cells also have organelles that run the cell's machinery, such as mitochondria (9) which also contain a smaller amount of DNA.
A DNA molecule consists of two strands of nucleic acid that are twisted around each other. The structure resembles a twisted rope ladder and is called a double helix. The individual rungs of the ladder are base pairs, formed by nucleotides that each carry one of four bases: adenine (A), thymine (T), cytosine (C) or guanine (G). A always pairs with T, and C always pairs with G.
The DNA sequence is described by these letters and its size is measured in how many base pairs the sequence consists of.
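As a small illustration of how a sequence of letters describes both strands, the sketch below derives the partner strand from one strand using the standard pairing rule (A with T, C with G) and reports the size in base pairs. The example sequence and the function name `complement_strand` are made up for illustration, and the sketch ignores strand direction.

```python
# Standard pairing rule: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the bases that sit opposite each base of the given strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


strand = "ATGCCGTA"  # hypothetical example sequence
print(strand)                     # ATGCCGTA
print(complement_strand(strand))  # TACGGCAT
print(len(strand), "base pairs")  # 8 base pairs
```

Each position in the two printed lines forms one rung of the ladder, so the length of one strand equals the number of base pairs.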
The DNA of the 23 chromosomes and the mitochondria is divided into long sequences with different numbers of base pairs.
Autosomal DNA has a total of 2,867 million base pairs distributed in various lengths on chromosomes 1 to 22.
X-DNA has 155 million base pairs.
Y-DNA has 58 million base pairs.
mtDNA has only 16,569 base pairs and is barely visible when drawn on the same scale as the other types of DNA.
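To make the difference in scale concrete, the following sketch adds up the figures listed above and prints each type's share of the total. The sizes are taken from the text. Summing them as one copy of each sequence is a simplification made here for illustration, and the helper name `share_percent` is hypothetical.

```python
# Sizes in base pairs, as quoted in the text above.
SIZES_BP = {
    "Autosomal DNA (chr 1-22)": 2_867_000_000,
    "X-DNA": 155_000_000,
    "Y-DNA": 58_000_000,
    "mtDNA": 16_569,
}

total_bp = sum(SIZES_BP.values())


def share_percent(name: str) -> float:
    """Share of the summed total, in percent, for one type of DNA."""
    return 100 * SIZES_BP[name] / total_bp


for name, size in SIZES_BP.items():
    print(f"{name:<26} {size:>13,} bp  {share_percent(name):9.5f} %")
```

With these figures mtDNA comes to well under a thousandth of a percent of the total, which is why it all but disappears when plotted on the same axis as the chromosomes.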
DNA Inheritance patterns (Jari Kinnunen © Visual DNA 2026) Chromosomes image source: https://eweitz.github.io/ideogram/multiple-trio
In 2000, a major breakthrough was made when researchers succeeded in creating a first reference model of the human genome, i.e. mapping the sequence of the approximately 3 billion base pairs that make up human DNA.
By comparing the letters in a test person's DNA sequence against the reference model, one can identify differences at specific positions in the chromosomes. If the reference model Hg38 has an A⇔T base pair at a specific position on a chromosome, but the test person has a G⇔C base pair at that position, then a mutation has been discovered. The entire available DNA sequence is examined in the same way.
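The position-by-position comparison described above can be sketched as follows. This is a toy example under stated assumptions: both sequences are short made-up strings that are already aligned and of equal length, and `find_variants` is a hypothetical helper, not part of any real tool. Actual analyses work on millions of positions and use dedicated alignment and variant-calling software.

```python
def find_variants(reference: str, sample: str, start: int = 1) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Positions are counted from `start`, so the first base is position 1 by default.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start + offset, ref_base, sample_base)
        for offset, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


reference = "GATTACAGGC"  # made-up stretch of a reference sequence
sample = "GATTGCAGGT"     # made-up stretch from a test person

variants = find_variants(reference, sample)
for position, ref_base, sample_base in variants:
    print(f"position {position}: reference {ref_base}, sample {sample_base}")
# position 5: reference A, sample G
# position 10: reference C, sample T
```

Only one strand is compared here, because the base on the opposite strand follows from the pairing rule.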
Not all parts of the genome could be sequenced at first, because some parts are so compact that they could not be read with earlier technology. In 2022, researchers completed a mission that had started 32 years earlier by determining the final DNA sequences that make up a human genome. The Telomere-to-Telomere (T2T) Consortium, which focused on the remaining 8% of the genome, presented a complete sequence of 3.055 billion base pairs of a human genome that includes complete assemblies for all chromosomes except the Y.
Refer to: Science - The complete sequence of a human genome.
The reference model has been updated since 2000 and today it has the designation Hg38.
In 1994, Luca Cavalli-Sforza published “The History and Geography of Human Genes,” which brought together what was then known from archaeology, linguistics, history, and genetics to tell a grand story of the origins of the world’s peoples today. The book offered an overview of the deep past, but in the absence of genetic data, it was limited compared to the much more extensive information from archaeology and linguistics.
In 2018, geneticist David Reich published “Who We Are and How We Got Here,” a book about the vast research on ancient DNA and its contribution to modern human population genetics. He describes discoveries made by his group as well as others based on analyses and comparisons of ancient and modern DNA from human populations around the world. Characteristic of these discoveries is that almost all human populations are mixtures resulting from multiple migrations and flows of different populations.
One must be aware that any such publication is a snapshot and based on the knowledge available at that particular time. The advances of the last 5-10 years alone have revolutionized the knowledge of genetic genealogy a number of times and one should expect similar developments some time into the future.
Extensive technological innovations now allow researchers to extract and analyze ancient DNA like never before, i.e. to analyze genetic material from human remains that go back hundreds of thousands of years. At the same time, interest in genetic genealogy among modern humans has grown explosively, because very advanced tests are now widely available at relatively low cost.
DNA itself does not directly carry any chronological or geographical information. That information comes from the results and studies of ancient and modern DNA, which are linked in various ways to the reference model for the human genome (as described in the previous chapter).
Radiocarbon (C14) dating was developed in the late 1940s and led to a revolution in archaeology, making it possible to date ancient remains and fossils containing organic material, usually charcoal and bone, in a way that was not possible before. C14 dating, in conjunction with the possibility of extracting ancient DNA (mainly mtDNA and Y-DNA), makes it possible to place an ancient find on a timeline and a map and so produce a phylogeography.
Among ancient finds, cases where it was possible to extract mtDNA dominate. mtDNA is present in many copies in each cell and is therefore more readily available than the nuclear DNA in which Y-DNA is found (see section above).
Similarly, a large number of modern DNA results can provide chronological and geographical data. However, the number of people tested varies greatly between different parts of the world, which sometimes gives a distorted picture of the phylogeography, i.e. of which mutations have occurred where.
Total number of finds (approx. 15,000) and distribution of ancient mtDNA (red) and ancient Y-DNA (blue) from 48,000 years ago to the present. Source: https://sites.google.com/view/haplotree-info/home/ancient-dna
Analyzing DNA from ancient and modern results provides a complementary method for dating evolutionary events. Some mutations are interesting from genetic and genealogical aspects because they accumulate at a roughly regular rate, which provides an estimate of the time that has passed, a “molecular clock” in short. By comparing DNA sequences, one can not only reconstruct relationships between different populations but also infer an evolutionary history over long time scales.
“Molecular clocks” are becoming increasingly sophisticated thanks to improved DNA sequencing, improved analytical tools, and a better understanding of the processes behind genetic change. By applying these methods to an ever-growing database of DNA from diverse populations (both modern and ancient), a more refined timeline of human evolution can be constructed.
The biggest challenge arises from the fact that the molecular clock has not ticked constantly throughout evolution; it can differ between species and human populations, like trying to measure time with a clock that ticks at different rates under different conditions.
An example of this is that mutations in mtDNA are much less frequent than in Y-DNA: the molecular clock of Y-DNA ticks about 100 times faster than that of mtDNA. Knowing this is of great importance when comparing the kinship between two individuals, or when estimating the time between two mutations along a geographical migration or the spread of a haplogroup.
A new Y-DNA SNP mutation occurs on average about once every 80 years, i.e. about once every two to three generations*.
*) SNP (Single Nucleotide Polymorphism). Vance D. The Genealogist’s Guide to Y-DNA Testing for Genetic Genealogy. 2nd ed. 2024.
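The average rate quoted above can be turned into a rough age estimate by multiplying a count of Y-DNA SNP mutations along one paternal line by 80 years. The sketch below does only that. The function names and the 30-year generation length are assumptions made for illustration, not figures from the cited source, and a serious estimate would also need confidence intervals and would depend on how much of the Y chromosome a test covers.

```python
YEARS_PER_SNP = 80         # average interval quoted in the text
YEARS_PER_GENERATION = 30  # assumption for illustration, not from the text


def estimate_years(snp_count: int, years_per_snp: float = YEARS_PER_SNP) -> float:
    """Rough time span covered by `snp_count` SNP mutations on one paternal line."""
    if snp_count < 0:
        raise ValueError("snp_count must be non-negative")
    return snp_count * years_per_snp


def estimate_generations(
    snp_count: int,
    years_per_snp: float = YEARS_PER_SNP,
    years_per_generation: float = YEARS_PER_GENERATION,
) -> float:
    """The same estimate expressed in generations."""
    return estimate_years(snp_count, years_per_snp) / years_per_generation


# Example: 25 SNP mutations separate a man from a paternal ancestor.
print(estimate_years(25))               # 2000
print(round(estimate_generations(25)))  # 67
```

With a 30-year generation, one SNP per 80 years corresponds to one mutation roughly every 2.7 generations, which is consistent with the "two to three generations" given in the text.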