Identifying the molecular basis of the characteristics that make us human is one of the greatest challenges of biology, and for centuries humans have searched for an explanation of our exceptionally developed brains. Among the millions of mutations and chromosomal rearrangements that have occurred during our evolution lie the genetic changes responsible for human-specific traits such as cognitive behaviour and upright walking. It is thought that the human brain arose among the intellectually developed members of the Homo genus as a result of natural selection. Subsequently, as a consequence of its exceptionally developed neocortex, only Homo sapiens survived 1.
The human brain is indeed much more sophisticated than that of our closest living relative, the chimpanzee, from which we diverged nearly 8 million years ago and with which we share nearly 99% of our genome 2. As concluded by the studies of King and Wilson 3, functionally new protein-coding genes cannot account for all the obvious morphological differences between humans and chimpanzees, because human protein-coding genes are highly similar to chimpanzee genes and constitute only about 1.5% of the human genome. Although very little is known about the genetic alterations that underlie the dramatic differences in morphology and function between humans and other primates, it is clear that gene regulatory changes play an important role. This implies that the genetic basis of the morphological differences must come primarily from non-coding regulatory sequences 1,2,4.
What are non-coding regulatory sequences?
Unlike the coding portion, non-coding DNA comprises the parts of an organism's genome that do not code for amino acids. Some non-coding sequences are known to serve complementary roles, such as the regulation of gene expression, but the functions of other regions remain unknown 5.
Human accelerated regions (HARs) are a set of 49 segments of the human genome. These regions have been conserved throughout vertebrate evolution but are strikingly different in humans 6. They are named according to their degree of difference between humans and chimpanzees, with HAR1 showing the largest difference. These regions were found by computationally scanning genomic databases of multiple species, and some of these highly mutated segments may contribute to human-specific traits 5,7. Several HARs are associated with genes known to produce proteins crucial in neurodevelopment.
In early 2006, several studies applied genome-wide tests for human acceleration to various sets of mammal-conserved elements, most of which excluded protein-coding exons 4,5. Later work compared these studies and showed that HAR data sets produced without coding filters consisted of 96.6% non-coding sites. The same work also produced a combined list of non-coding HARs (ncHARs) 8. As expected, ncHARs have many more substitutions in humans than in other mammals, in which they are highly conserved (human mean: 1.7 per 100 base pairs; chimpanzee mean: 0.2 per 100 base pairs). Although a typical ncHAR has only a few human-specific substitutions, this rate is significantly faster than that of other conserved elements. It is important to keep in mind that the majority of bases that differ between human and chimpanzee lie in structural variations rather than substitutions 1.
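The per-100-base-pair rates quoted above can be made concrete with a small sketch. It assumes a simple parsimony rule that is not taken from the cited studies: a site is counted as a human-specific substitution when chimpanzee and an outgroup agree and human differs, and the reverse for chimpanzee. The function name and the toy sequences are hypothetical.

```python
def lineage_specific_rates(human, chimp, outgroup):
    """Return (human, chimp) substitutions per 100 aligned bases.

    Assumption: a change is assigned to the lineage that disagrees with
    the other two sequences; columns containing a gap are skipped.
    """
    if not (len(human) == len(chimp) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    human_subs = chimp_subs = compared = 0
    for h, c, o in zip(human.upper(), chimp.upper(), outgroup.upper()):
        if "-" in (h, c, o):
            continue  # skip alignment gaps
        compared += 1
        if c == o and h != c:
            human_subs += 1  # human differs, chimp matches outgroup
        elif h == o and c != h:
            chimp_subs += 1  # chimp differs, human matches outgroup
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * human_subs / compared, 100.0 * chimp_subs / compared


# Toy 20-base alignment (hypothetical data, not a real HAR).
outgroup = "ACGTACGTACGTACGTACGT"
human = "ACATACGTACTTACGTACGT"
chimp = "ACGTACGTACGTACGCACGT"
human_rate, chimp_rate = lineage_specific_rates(human, chimp, outgroup)
print(human_rate, chimp_rate)
```

With real data the outgroup would be one or more additional mammals, and sites where all three sequences differ would remain unassigned, as they do here.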
We will focus mainly on two HARs: HAR1 and HAR2 (HACNS1).
HAR1 is a novel RNA gene expressed during the development of the neocortex. HAR2 is a human-specific developmental enhancer consisting of conserved non-coding sequence. It corresponds to HACNS1, a gene enhancer whose activity in the mouse was consistent across two developmental stages, marking the presumptive anterior wrist and proximal thumb 9. The human enhancer has gained a strong limb expression domain relative to the orthologous elements from chimpanzee and rhesus macaque 3,7,9.
In in vivo analyses, human-specific substitutions were either introduced into the chimpanzee enhancer sequence or reverted in the human sequence to the ancestral state. These experiments pointed to 13 substitutions clustered in one 81 base pair module. Sequence changes of this kind, which alter molecular development, likely contributed to the unique evolution of human morphology 1,10.
Even though these genetic modifications remain largely unidentified, they are thought to include changes in gene expression due to positive selection for nucleotide substitutions. In vitro analyses have shown that such substitutions can affect enhancer function in cell lines, whereas their effect on the in vivo activity of developmental regulatory elements remains obscure. In vivo analyses of evolutionarily conserved non-coding sequences have revealed them to be enriched in cis-regulatory transcriptional enhancers that confer specific expression patterns during development 6,9,10. A large number of genomes have now been sequenced, and the methods are powerful enough to detect strong evolutionary constraint at single base pair resolution and to reliably identify moderately conserved elements as short as transcription factor binding sites 5.
Application of these methods to whole-genome multiple sequence alignments revealed that five to ten percent of the human genome is conserved across mammals, and that most of this conserved sequence lies outside protein-coding regions 1,5.
These studies reveal that the non-coding portion of the human genome contains more functionally constrained DNA than the coding portion, which makes it a larger potential target for evolutionary change. In addition, the catalogue of protein-coding changes that occurred during human evolution is not large enough to explain all of our unique traits. Accordingly, comparative genomics suggests that many human phenotypes likely resulted from changes to regulatory elements, as originally hypothesized by King and Wilson and consistent with evidence from other animals and from human population genetic studies 1,3–5,8.
Human-specific regulatory elements and their enhancing functions
In the vast non-coding portion of the genome, phylogenetic analysis of mammalian genomes also provides distinct evidence for the locations of human-specific regulatory elements 5.
Many such highly conserved sequences function as gene regulatory enhancers. Building on this idea, several research groups developed computational approaches that scan mammalian sequence alignments for evolutionarily conserved sequences that have changed significantly in humans since divergence from chimpanzees 5,7.
Detecting acceleration on a specific lineage requires a statistical test comparing the DNA substitution rate observed with the rate expected given the rest of the tree.
Statistical tests for accelerated regions
In tests for accelerated evolution, the goal is to determine whether the rate of DNA substitutions in a lineage of interest is faster than expected. This lineage can be a single branch (for example, humans since divergence from chimpanzee), a clade such as the great apes, or an extinct species such as the ancestor of all primates. Some of the proposed tests use the phylogenetic relationships between species to derive the expected number of substitutions in the lineage of interest, while others compare sister species directly. Apart from these distinctions, the main idea is to determine whether the data in a multiple sequence alignment are more consistent with lineage-specific acceleration than with the expected rate of substitutions 1,4,5,7.
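The comparison of observed with expected substitutions can be illustrated with a deliberately simplified likelihood ratio test. This is a sketch under stated assumptions, not the phylogenetic model used in the cited studies: substitutions on the lineage are treated as Poisson-distributed, the expected count is assumed to be supplied from the rest of the tree, and the function name `acceleration_lrt` is hypothetical.

```python
import math


def acceleration_lrt(observed_subs, expected_subs):
    """One-sided likelihood ratio test for lineage-specific acceleration.

    Null model:        count ~ Poisson(expected_subs)
    Alternative model: count ~ Poisson(rho * expected_subs), rho >= 1
    Returns (rho_hat, test_statistic, p_value).
    """
    if expected_subs <= 0:
        raise ValueError("expected_subs must be positive")
    if observed_subs < 0:
        raise ValueError("observed_subs must be non-negative")
    # Maximum-likelihood rate multiplier, constrained to acceleration only.
    rho_hat = max(1.0, observed_subs / expected_subs)
    if rho_hat == 1.0:
        return 1.0, 0.0, 1.0  # no excess substitutions, no evidence
    # Twice the log-likelihood gain of the alternative over the null.
    stat = 2.0 * (observed_subs * math.log(rho_hat)
                  - (rho_hat - 1.0) * expected_subs)
    # Chi-square (1 df) tail probability, halved because rho is
    # restricted to the boundary rho >= 1.
    p_value = 0.5 * math.erfc(math.sqrt(stat / 2.0))
    return rho_hat, stat, p_value


# Hypothetical element: 10 substitutions observed where 1 was expected.
rho, stat, p = acceleration_lrt(10, 1.0)
print(f"rho={rho:.1f} statistic={stat:.2f} p={p:.2e}")
```

In a genome-wide scan a test of this kind would be run once per conserved element, so the resulting p-values would need correction for multiple testing.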
Specialized comparative genomics methods are also used to identify slowly and rapidly evolving proteins and RNA genes. Although these are powerful approaches for studying small subsets of the genome, DNA-based methods are needed for unbiased, genome-wide scans. To focus on functional portions of the genome, analyses have commonly used evolutionarily conserved elements. Because acceleration on the lineage of interest can prevent a region from being classified as conserved, that lineage should be removed from the alignment before the conserved elements are generated 8. Such acceleration tests can also be applied to neutral regions to detect events that bring functionality into the region of interest.
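Why the lineage of interest is removed before conserved elements are called can be shown with a toy example. The sketch below uses a plain per-column identity rule as a stand-in for conservation scoring; this rule is an assumption for illustration, as are the function name `conserved_columns`, the species labels and the sequences.

```python
def conserved_columns(alignment, exclude=None, min_identity=1.0):
    """Return indices of alignment columns that count as conserved.

    alignment: dict mapping species name to aligned sequence.
    exclude:   species left out before conservation is assessed.
    A column is conserved when the most common non-gap base reaches
    min_identity among the remaining species (toy criterion).
    """
    rows = [seq.upper() for name, seq in alignment.items() if name != exclude]
    if not rows:
        raise ValueError("no sequences left after exclusion")
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("sequences must have equal aligned length")
    conserved = []
    for i in range(length):
        column = [row[i] for row in rows]
        bases = sorted(set(column) - {"-"})
        if not bases:
            continue  # column contains only gaps
        top = max(bases, key=column.count)
        if column.count(top) / len(column) >= min_identity:
            conserved.append(i)
    return conserved


# Hypothetical alignment: human carries a change at column 4.
alignment = {
    "human": "ACGTTA",
    "chimp": "ACGTAA",
    "mouse": "ACGTAA",
    "dog": "ACCTAA",
}
print(conserved_columns(alignment))                   # human included
print(conserved_columns(alignment, exclude="human"))  # human masked
```

With human included, the human-specific change at column 4 keeps that column out of the conserved set; with human masked, the column is retained and remains available for the acceleration test.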
For example, using a method based on likelihood ratio tests for accelerated sequence divergence on the human lineage, researchers previously identified 721 human accelerated regions (HARs). Although this approach does not require HARs to be non-coding, 92% of them are, which underscores the likely importance of regulatory sequences in recent human genome evolution. Non-coding HARs have the potential to affect the expression of numerous genes, both directly and indirectly. Studies have analysed the ncHARs in terms of a recent segmentation of the non-coding genome based on data generated by ENCODE. This analysis summarized data from six of the ENCODE cell lines. Although many ncHARs are found in states likely to have regulatory functions in some cell lines, the majority are in predicted low-activity regions in these cell types. This suggests that ncHARs may be involved in both promoting and repressing gene expression 1,6,8.
To focus the analysis on embryonic development and to integrate data about known biologically active enhancers, EnhancerFinder, a recently developed method for predicting developmental regulatory enhancers, was applied to the ncHARs. EnhancerFinder is an algorithm that has been trained and evaluated on a set of nearly 1500 human sequences. To predict developmental enhancers of gene expression accurately, the algorithm integrates DNA sequence, evolutionary patterns and functional genomics data from many cellular contexts. EnhancerFinder predicts that 29% of the ncHARs are human developmental enhancers. It can also predict the tissues in which enhancers are most likely to be active: where sufficient prior data exist, predicted enhancers can be accurately assigned to brain, limb and heart activity domains. Notably, the numbers of ncHARs predicted to be brain enhancers and limb enhancers are each significantly higher than expected from genome-wide analysis 6.
However, while the distribution of ncHARs across the genome is somewhat different from that of all conserved non-coding elements, significant differences in the biological process annotations of nearby genes were not found. More than half of the ncHARs showed evidence of enhancer activity in at least one cellular context. Studies revealed a significant enrichment of ncHARs among predicted developmental enhancers, particularly those predicted to be active in the embryonic brain and limb. It is estimated that 64% of ncHARs tested to date are developmental enhancers 6,11.
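An enrichment statement of this kind is typically backed by a tail probability. The sketch below uses an upper-tail hypergeometric test, which is an assumed choice for illustration and not necessarily the test used in the cited studies; the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, successes, draws, observed):
    """Upper-tail hypergeometric probability P(X >= observed).

    population: all conserved non-coding elements considered
    successes:  elements in the population predicted to be enhancers
    draws:      number of ncHARs (a subset of the population)
    observed:   ncHARs predicted to be enhancers
    """
    if not 0 <= successes <= population:
        raise ValueError("successes must lie between 0 and population")
    if not 0 <= draws <= population:
        raise ValueError("draws must lie between 0 and population")
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probabilities of seeing `observed` or more enhancers by chance.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(max(observed, 0), upper + 1)
    )
    return tail / total


# Hypothetical counts chosen only to exercise the function.
p = enrichment_p_value(population=1000, successes=100, draws=50, observed=15)
print(f"p = {p:.3g}")
```

A small p-value here would indicate that predicted enhancers are over-represented among the ncHARs relative to the background set.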
It should be noted that many of the cellular, histological and morphological differences between humans and chimpanzees are likely to be established later in development than the relatively evolutionarily conserved stage tested. Still, several ncHAR enhancers drive expression patterns that suggest differences between the human and chimpanzee sequences 3,5,12.
Recent results support the hypothesis that human-specific adaptive evolution in HACNS1 has contributed to the uniquely human aspects of digit and limb patterning. The gain of function in HACNS1 is believed to have influenced the evolution of these human limb features by altering the expression of nearby genes during limb development 4,5,8.
Other studies focused on conserved non-coding elements and developed methods to identify those with the most sequence changes in humans. Collectively, these approaches have identified 2649 unique non-coding HARs. To explore potential functions, researchers associated each ncHAR with nearby genes and analysed the annotations of those genes. They found significant enrichment for functions involved in development and regulation among the genes near the ncHARs. In contrast, no biological process terms were clearly enriched among the ncHARs when compared with a background of conserved non-coding elements. These findings suggest that developmental functions are common among the ncHARs, though no more so than among conserved non-coding elements in general 1,6.
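The step of associating each ncHAR with nearby genes can be sketched as a nearest-gene lookup on one chromosome. The distance rule, the function name `nearest_gene`, the gene labels and the coordinates are all hypothetical; the cited studies may define "nearby" differently, for example with a fixed window.

```python
def nearest_gene(har_start, har_end, genes):
    """Return (gene_name, distance) for the gene closest to a HAR.

    genes: iterable of (name, start, end) on the same chromosome as
    the HAR. Distance is 0 when the HAR overlaps the gene.
    """
    best = None
    for name, gene_start, gene_end in genes:
        if gene_end < har_start:
            distance = har_start - gene_end  # gene lies upstream
        elif gene_start > har_end:
            distance = gene_start - har_end  # gene lies downstream
        else:
            distance = 0                     # intervals overlap
        if best is None or distance < best[1]:
            best = (name, distance)
    if best is None:
        raise ValueError("no genes supplied")
    return best


# Hypothetical gene intervals and HAR coordinates.
genes = [("geneA", 100, 200), ("geneB", 1000, 1500)]
print(nearest_gene(300, 350, genes))
print(nearest_gene(800, 900, genes))
```

Once each ncHAR has an assigned gene, the annotations of those genes can be tallied and compared with the same tally for background conserved elements.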
As discussed above, many human accelerated regions are developmental enhancers and play vital roles in human evolution, mainly in the evolution of the brain and forelimb. Although research in this area has come a long way, our knowledge is still limited and many questions remain to be tested.
References:
- Capra, J. A., Erwin, G. D., McKinsey, G., Rubenstein, J. L. R. & Pollard, K. S. Many human accelerated regions are developmental enhancers. Philosophical Transactions of the Royal Society B: Biological Sciences 368, (2013). (Results, Section b,c,d and e)
- Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature (2005). (Introduction, Section 1,2)
- King, M. C. & Wilson, A. C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975).
- Pollard, K. S. et al. Forces shaping the fastest evolving regions in the human genome. PLoS Genet 2, 1599–1611 (2006). (Results section 1 and 2)
- Prabhakar, S., Noonan, J. P., Pääbo, S. & Rubin, E. M. Accelerated evolution of conserved noncoding sequences in humans. Science 314, 786 (2006).
- Hubisz, M. J. & Pollard, K. S. Exploring the genesis and functions of Human Accelerated Regions sheds light on their role in human evolution. Curr Opin Genet Dev 29, 15–21 (2014). (Section 1,2 & 4)
- Bush, E. C. & Lahn, B. T. A genome-wide screen for noncoding elements important in primate evolution. BMC Evol Biol 8, (2008). (Conclusion)
- Bird, C. P. et al. Fast-evolving noncoding sequences in the human genome. Genome Biol 8, (2007). (Results 1,2 & 3)
- Prabhakar, S. et al. Human-specific gain of function in a developmental enhancer. Science 321, 1346–1350 (2008).
- Pennacchio, L. A. et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature (2006). (Introduction & Results and Discussion, section 2,3)
- Burbano, H. A. et al. Analysis of Human Accelerated DNA Regions Using Archaic Hominin Genomes. PLoS One 7, e32877 (2012).
- Pollard, K. S. et al. An RNA gene expressed during cortical development evolved rapidly in humans. Nature (2006). doi:10.1038/nature05113.
Inspector: Elif BÖCÜ