Saturday, 10 March 2012

Insights into hominid evolution from the gorilla genome sequence



Nature 483, 169–175 (08 March 2012)
doi:10.1038/nature10842

Abstract


Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human–chimpanzee and human–chimpanzee–gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.

Introduction


Humans share many elements of their anatomy and physiology with both gorillas and chimpanzees, and our similarity to these species was emphasized by Darwin and Huxley in the first evolutionary accounts of human origins1. Molecular studies confirmed that we are closer to the African apes than to orang-utans, and on average closer to chimpanzees than gorillas2 (Fig. 1a). Subsequent analyses have explored functional differences between the great apes and their relevance to human evolution, assisted recently by reference genome sequences for chimpanzee3 and orang-utan4. Here we provide a reference assembly and initial analysis of the gorilla genome sequence, establishing a foundation for the further study of great ape evolution and genetics.

Figure 1: Speciation of the great apes.
a, Phylogeny of the great ape family, showing the speciation of human (H), chimpanzee (C), gorilla (G) and orang-utan (O). Horizontal lines indicate speciation times within the hominine subfamily and the sequence divergence time between human and orang-utan. Interior grey lines illustrate an example of incomplete lineage sorting at a particular genetic locus—in this case (((C, G), H), O) rather than (((H, C), G), O). Below are mean nucleotide divergences between human and the other great apes from the EPO alignment. b, Great ape speciation and divergence times. Upper panel, solid lines show how times for the HC and HCG speciation events estimated by CoalHMM vary with average mutation rate; dashed lines show the corresponding average sequence divergence times, as well as the HO sequence divergence. Blue blocks represent hominid fossil species (key at top right): each has a vertical extent spanning the range of dates estimated for it in the literature9, 50, and a horizontal position at the maximum mutation rate consistent both with its proposed phylogenetic position and the CoalHMM estimates (including some allowance for ancestral polymorphism in the case of Sivapithecus). The grey shaded region shows that an increase in mutation rate going back in time can accommodate present-day estimates, fossil hypotheses, and a middle Miocene speciation for orang-utan. Lower panel, estimates of the average mutation rate in present-day humans11, 12, 13; grey bars show 95% confidence intervals, with black lines at the means. Estimates were made by the 1000 Genomes Project for trios of European (CEU) and Yoruban African (YRI) ancestry.
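
The way the dates in panel b shift with mutation rate follows from a simple relation: two lineages separated for time t accumulate an expected per-site divergence d = 2μt, so t = d / (2μ). The sketch below only illustrates that arithmetic and is not the CoalHMM procedure used in the paper. It yields an average sequence divergence time, which is older than the corresponding speciation time because of ancestral polymorphism. The divergence value, per-generation mutation rates and generation time are hypothetical placeholders, not figures taken from the article.

```python
def per_year_rate(mu_per_generation: float, generation_years: float) -> float:
    """Convert a per-generation mutation rate into a per-year rate."""
    return mu_per_generation / generation_years


def divergence_time_years(divergence: float, mu_per_year: float) -> float:
    """Average sequence divergence time, t = d / (2 * mu).

    The factor 2 reflects mutations accumulating independently on both
    lineages since their common ancestor.
    """
    if mu_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return divergence / (2.0 * mu_per_year)


if __name__ == "__main__":
    d = 0.012                # hypothetical per-site divergence (placeholder)
    generation_years = 25.0  # hypothetical generation time (placeholder)
    for mu_gen in (1.0e-8, 1.5e-8, 2.5e-8):  # hypothetical per-generation rates
        mu_year = per_year_rate(mu_gen, generation_years)
        t = divergence_time_years(d, mu_year)
        print(f"mu = {mu_year:.2e} per bp per year -> t = {t / 1e6:.1f} Myr")
```

Halving the assumed mutation rate doubles the inferred time, which is why the curves in the upper panel fall as the rate increases.
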
Recent technological developments have substantially reduced the costs of sequencing, but the assembly of a whole vertebrate genome remains a challenging computational problem. We generated a reference assembly from a single female western lowland gorilla (Gorilla gorilla gorilla) named Kamilah, using 5.4 × 10^9 base pairs (5.4 Gbp) of capillary sequence combined with 166.8 Gbp of Illumina read pairs (Methods Summary). Genes, transcripts and predictions of gene orthologues and paralogues were annotated by Ensembl5, and additional analysis found evidence for 498 functional long (>200-bp) intergenic RNA transcripts. Table 1 summarizes the assembly and annotation properties. An assessment of assembly quality using finished fosmid sequences found that typical (N50; see Table 1 for definition) stretches of error-free sequence are 7.2 kbp in length, with errors tending to be clustered in repetitive regions. Outside repeat-masked regions and away from contig ends, the total rate of single-base and indel errors is 0.13 per kbp. See Supplementary Information for further details.
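
Table 1 and its definition of N50 are not reproduced in this post. As a rough guide, the sketch below assumes the standard definition: N50 is the length L such that stretches of length L or longer together contain at least half of all bases. The function name and the toy lengths are illustrative only and do not come from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of stretch (or contig) lengths.

    Assumes the standard definition: the length L such that stretches
    of length >= L together contain at least half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one length")
    half_total = sum(ordered) / 2.0
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: the running sum always reaches half")


if __name__ == "__main__":
    # Toy lengths in kbp (made up): two long stretches hold half the bases.
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_lengths))
```

Because N50 weights each stretch by the bases it contains, a few long error-free stretches can dominate the statistic even when short ones are more numerous.
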

Table 1: Assembly and annotation statistics
We also collected less extensive sequence data for three other gorillas, to enable a comparison of species within the Gorilla genus. Gorillas survive today only within several isolated and endangered populations whose evolutionary relationships are uncertain. In addition to Kamilah, our analysis included two western lowland gorillas, Kwanza (male) and EB(JC) (female), and one eastern lowland gorilla, Mukisi (male).

Discussion


Since the middle Miocene, an epoch of abundance and diversity for apes throughout Eurasia and Africa, the prevailing pattern of ape evolution has been one of fragmentation and extinction48. The present-day distribution of non-human great apes, existing only as endangered and subdivided populations in equatorial forest refugia43, is a legacy of that process. Even humans, now spread around the world and occupying habitats previously inaccessible to any primate, bear the genetic legacy of past population crises. All other branches of the genus Homo have passed into extinction. It may be that in the condition of Gorilla, Pan and Pongo we see some echo of our own ancestors before the last 100,000 years, and perhaps a condition experienced many times over several million years of evolution. It is notable that species within at least three of these genera continued to exchange genetic material long after separation4, 49, a disposition that may have aided their survival in the face of diminishing numbers. As well as teaching us about human evolution, the study of the great apes connects us to a time when our existence was more tenuous, and in doing so, highlights the importance of protecting and conserving these remarkable species.

Methods



Assembly

We constructed a hybrid de novo assembly combining 5.4 Gbp of capillary sequence with 166.8 Gbp of Illumina read pairs. Improvements in long-range structure were then guided by human homology, placing contigs into scaffolds wherever read pairs confirmed collinearity between gorilla and human. Base-pair contiguity was improved by local reassembly within each scaffold, merging or extending contigs using Illumina read pairs. Finally, we used additional Kamilah bacterial artificial chromosome (BAC) and fosmid end pair capillary sequences to provide longer range scaffolding. Base errors were corrected by mapping all Illumina reads back to the assembly and rectifying apparent homozygous variants, while recording the location of heterozygous sites. Further details and other methods are described in Supplementary Information.
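
The base-error correction step can be pictured with a toy example. The sketch below is not the authors' pipeline: it assumes reads have already been mapped and reduced to a string of observed bases per assembly position, and the function names and thresholds (`min_depth`, `hom_fraction`, `het_fraction`) are hypothetical choices made for illustration.

```python
from collections import Counter
from typing import List, Tuple


def classify_site(ref_base: str, read_bases: str,
                  min_depth: int = 8,
                  hom_fraction: float = 0.9,
                  het_fraction: float = 0.3) -> str:
    """Classify one assembly position from the read bases mapped to it.

    Returns 'correct' for an apparent homozygous variant (reads agree
    with each other but not with the assembly), 'het' when two alleles
    are both well supported, and 'keep' otherwise.
    All thresholds are hypothetical.
    """
    depth = len(read_bases)
    if depth < min_depth:
        return "keep"  # too little evidence to change anything
    ranked = Counter(read_bases).most_common(2)
    top_base, top_count = ranked[0]
    if top_base != ref_base and top_count / depth >= hom_fraction:
        return "correct"
    if len(ranked) == 2 and ranked[1][1] / depth >= het_fraction:
        return "het"
    return "keep"


def polish(assembly: str, pileups: List[str]) -> Tuple[str, List[int]]:
    """Correct apparent homozygous variants and record heterozygous sites.

    pileups[i] holds the read bases aligned to assembly[i]. Returns the
    corrected sequence and the 0-based positions called heterozygous.
    """
    corrected = []
    het_sites = []
    for pos, (ref_base, read_bases) in enumerate(zip(assembly, pileups)):
        call = classify_site(ref_base, read_bases)
        if call == "correct":
            corrected.append(Counter(read_bases).most_common(1)[0][0])
        else:
            corrected.append(ref_base)
            if call == "het":
                het_sites.append(pos)
    return "".join(corrected), het_sites


if __name__ == "__main__":
    # Made-up data: position 1 is a consensus error, position 2 is heterozygous.
    print(polish("ACGT", ["AAAAAAAAAA", "TTTTTTTTTT", "GGGGGAAAAA", "TT"]))
```

A site where the reads split between two bases is left unchanged and only recorded, since it most likely reflects real variation between Kamilah's two chromosome copies and not an assembly error.
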

Author information



Affiliations

  1. Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK

    • Aylwyn Scally,
    • Ian Goodhead,
    • Shane McCarthy,
    • Y. Amy Tang,
    • Yali Xue,
    • Bryndis Yngvadottir,
    • Qasim Ayub,
    • Yuan Chen,
    • Chris M. Clee,
    • Yong Gu,
    • Paul Heath,
    • Anja Kolb-Kokocinski,
    • Gavin K. Laird,
    • Anthony S. Rogers,
    • Jared T. Simpson,
    • Daniel J. Turner,
    • Weldon Whitener,
    • Zemin Ning,
    • Duncan T. Odom,
    • Michael A. Quail,
    • Stephen M. Searle,
    • Jane Rogers,
    • Chris Tyler-Smith &
    • Richard Durbin
  2. Bioinformatics Research Center, Aarhus University, C.F. Møllers Allé 8, 8000 Aarhus C, Denmark

    • Julien Y. Dutheil,
    • Asger Hobolth,
    • Thomas Mailund,
    • Lars N. Andersen,
    • Kasper Munch &
    • Mikkel H. Schierup
  3. Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA

    • LaDeana W. Hillier,
    • Tomas Marques-Bonet,
    • Can Alkan,
    • Emre Karakoc,
    • Saba Sajjadian &
    • Evan E. Eichler
  4. European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK

    • Gregory E. Jordan,
    • Javier Herrero,
    • Petra C. Schwalie,
    • Kathryn Beal,
    • Stephen Fitzgerald,
    • Albert J. Vilella,
    • Paul Flicek &
    • Nick Goldman
  5. Department of Genetic Medicine and Development, University of Geneva Medical School, Rue Michel-Servet 1, 1211 Geneva 4, Switzerland

    • Tuuli Lappalainen &
    • Emmanouil T. Dermitzakis
  6. Institut de Biologia Evolutiva (UPF-CSIC), 08003 Barcelona, Catalonia, Spain

    • Tomas Marques-Bonet &
    • Javier Prado-Martinez
  7. Institucio Catalana de Recerca i Estudis Avançats, ICREA, 08010 Barcelona, Spain

    • Tomas Marques-Bonet
  8. Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK

    • Stephen H. Montgomery,
    • Brenda J. Bradley,
    • Timothy D. O’Connor &
    • Nicholas I. Mundy
  9. University of Cambridge, Department of Oncology, Hutchison/MRC Research Centre, Hills Road, Cambridge CB2 0XZ, UK

    • Michelle C. Ward,
    • Dominic Schmidt &
    • Duncan T. Odom
  10. Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK

    • Michelle C. Ward,
    • Dominic Schmidt &
    • Duncan T. Odom
  11. Howard Hughes Medical Institute, University of Washington, Seattle, Washington 20815-6789, USA

    • Can Alkan &
    • Evan E. Eichler
  12. Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK

    • Edward V. Ball,
    • Matthew Mort,
    • Andrew D. Phillips,
    • Katy Shaw,
    • Peter D. Stenson &
    • David N. Cooper
  13. Department of Anthropology, Yale University, 10 Sachem Street, New Haven, Connecticut 06511, USA

    • Brenda J. Bradley
  14. The Genome Institute at Washington University, Washington University School of Medicine, Saint Louis, Missouri 63108, USA

    • Tina A. Graves,
    • Wesley C. Warren &
    • Richard K. Wilson
  15. MRC Functional Genomics Unit, University of Oxford, Department of Physiology, Anatomy and Genetics, South Parks Road, Oxford OX1 3QX, UK

    • Andreas Heger,
    • Stephen Meader &
    • Chris P. Ponting
  16. Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK

    • Gerton Lunter
  17. Comparative Genomics Unit, Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892-2152, USA

    • James C. Mullikin
  18. Max Planck Institute for Evolutionary Anthropology, Primatology Department, Deutscher Platz 6, Leipzig 04103, Germany

    • Linda Vigilant
  19. Children’s Hospital Oakland Research Institute, Oakland, California 94609, USA

    • Baoli Zhu &
    • Pieter de Jong
  20. San Diego Zoo’s Institute for Conservation Research, Escondido, California 92027, USA

    • Oliver A. Ryder
  21. Present addresses: Institut des Sciences de l'Évolution – Montpellier (I.S.E.-M.), Université de Montpellier II – CC 064, 34095 Montpellier Cedex 05, France (J.Y.D.); Centre for Genomic Research, Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK (I.G.); Division of Biological Anthropology, University of Cambridge, Fitzwilliam Street, Cambridge CB2 1QH, UK (B.Y.); EASIH, University of Cambridge, Addenbrooke’s Hospital, Cambridge CB2 0QQ, UK (A.S.R.); Oxford Nanopore Technologies, Edmund Cartwright House, 4 Robert Robinson Avenue, Oxford OX4 4GA, UK (D.J.T.); Institute of Microbiology, Chinese Academy of Sciences, Datun Road, Chaoyang District, Beijing 100101, China (B.Z.); The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK (J.R.).

    • Julien Y. Dutheil,
    • Ian Goodhead,
    • Bryndis Yngvadottir,
    • Anthony S. Rogers,
    • Daniel J. Turner,
    • Baoli Zhu &
    • Jane Rogers

Contributions

Manuscript main text: A.S., R.D., C.T.-S., N.I.M., G.E.J., P.C.S., A.K.-K. Project coordination: A.S., A.S.R., A.K.-K., R.D. Project initiation: J.R., R.D., R.K.W. Library preparation and sequencing: I.G., D.J.T., M.A.Q., C.M.C., B.Z., P.d.J., O.A.R., Q.A., B.Y., Y.X., T.A.G., W.C.W. Assembly: A.S., L.W.H., Y.G., J.T.S., J.C.M., W.W., Z.N. Fosmid finishing: P.H. Assembly quality: A.S., S. Meader, G.L., C.P.P. Annotation: Y.A.T., G.K.L., A.J.V., A. Heger, S.M.S. Primate multiple alignments: J.H., K.B., S.F. Great ape speciation and ILS: J.Y.D., A.S., T.M., M.H.S., K.M., G.E.J. Sequence loss and gain: A.S., S.M., C.T.-S., Y.A.T., A.J.V. Protein evolution: G.E.J., S.H.M., N.I.M., B.J.B., T.D.O’C., Y.X., Y.C., N.G. Human disease allele analysis: Y.X., Y.C., C.T.-S., P.D.S., E.V.B., A.D.P., M.M., K.S., D.N.C. Transcriptome analysis: T.L., E.T.D. ChIP-seq experiment and analysis: P.C.S., M.C.W., D.S., P.F., D.T.O. Additional gorilla samples: B.Y., Y.X., L.V., C.T.-S. Gorilla species diversity and divergence: A.S., A.H., T.M., L.N.A., B.Y., L.V. Gorilla species functional differences: Y.X., Y.C., C.T.-S. Segmental duplication analysis: T.M.-B., C.A., S.S., E.K., J.P.-M., E.E.E.

Competing financial interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to:
Accession numbers for all primary sequencing data are given in Supplementary Information. The assembly has been submitted to EMBL with accession numbers FR853080 to FR853106, and annotation is available at Ensembl (http://www.ensembl.org/Gorilla_gorilla/Info/Index).

Editor's summary

Hominid genomes: gorilla makes four

The genome of the gorilla has been sequenced, making it possible to compare the DNA of the four surviving hominid genera: human, chimpanzee, gorilla and orang-utan. The data — mainly from a female wes…

News & Views

by Gibbs and Rogers

The gorilla genome reveals that genetic similarities among humans and the apes are more complex than expected, and allows a fresh assessment of the evolutionary mechanisms that led to the primate spec…

Nature
ISSN 0028-0836
EISSN 1476-4687