Supplementary Information for
Warren WC, Jasinska AJ, Garcia-Perez R, Svardal H, Tomlinson C, Rocchi M, Archidiacono N, Capozzi O, Minx P, Montague MJ, Kyung K, Hillier LW, Kremitzki M, Graves T, Chiang C, Hughes JF, Tran N, Wang Y, Ramensky V, Choi OW, Jung YJ, Schmitt CA, Juretic N, Wasserscheid J, Turner TR, Wiseman RW, Tuscher JJ, Karl JA, Schmitz JE, Zahn R, O'Connor DH, Redmond E, Nisbett A, Jacquelin B, Müller-Trutwin MC, Brenchley JM, Dione M, Antonio M, Schroth GP, Kaplan JR, Jorgensen MJ, Thomas GW, Hahn MW, Raney B, Aken B, Schmitz J, Churakov G, Noll A, Stanyon R, Webb D, Thibaud-Nissen F, Nordborg M, Marques-Bonet T, Dewar K, Weinstock GM, Wilson RK, Freimer NB
The genome of the vervet (Chlorocebus aethiops sabaeus)
Genome Res 25, 1921 (2015)
- Supplementary Tables include:
- Table S1. Sequence representation on the X chromosome by species.
- Table S2. Assembly metrics for sequenced primate genomes.
- Table S3. Total interspersed repeats in vervet. Unique vervet sequences, defined by comparison of vervet and rhesus presence/absence patterns.
- Table S4. Chromosomal distribution of SINEs, LINEs, LTRs, and DNA transposons in vervet. The percentages in red indicate significant overrepresentations and those in blue underrepresentations of specific elements from the expected chromosomal distribution patterns (p<0.05, two-sided confidence intervals).
- Table S5. Chromosomal distribution of SINEs, LINEs, LTRs, and DNA transposons in human. The percentages in red indicate significant overrepresentations and those in blue underrepresentations of specific elements from the expected chromosomal distribution patterns (p<0.05, two-sided confidence intervals).
- Table S6. Chromosomal distribution of SINEs, LINEs, LTRs, and DNA transposons in rhesus macaque (rheMac7). The percentages in red indicate significant overrepresentations and those in blue underrepresentations of specific elements from the expected chromosomal distribution patterns (p<0.05, two-sided confidence intervals).
- Table S7. Mapping statistics of VRC monkeys used for structural variation discovery.
- Table S8. Deletion variants defined in the vervet research colony population.
- Table S9. The total estimated base loss events by length unique for each VRC individual.
- Table S10. The total estimated base loss events shared by any 3 sequenced VRC individuals.
- Table S11. Autosomes 1-29 segmental duplication base counts.
- Table S12. Genes residing in segmental duplication regions showing enrichment among canonical KEGG pathways.
- Table S13. Summary of gene gain and loss events inferred after correcting for annotation and assembly error across all 11 species. The number of rapidly evolving families is shown in parentheses for each type of change.
- Table S14. A summary of sequencing measures for vervet subspecies.
- Table S15. Autosomal pairwise difference matrix across all subspecies.
- Table S16. Estimated subspecies split times in years.
- Table S17. Inferred average coalescent time in years for all subspecies pairs.
- Table S18. Sources of vervet (C. a. sabaeus) evaluated for MHC diversity.
- Table S19. Summary of Chsa MHC class I sequences identified.
- Table S20. The iterative masking steps for VRC segmental duplication discovery.
- Table S21. Assembly and annotation error estimation and gene gain/loss rates in the 11 mammals included in this study compared to the same values for the 10 mammals used in Han et al., 2013.
- Table S22. SNV filters applied to each subspecies aligned sequences.
- Supplementary Methods include:
- Structural variant detection
- Supplementary references
- Figures include:
- Figure S1. Structural variant detection.
- Figure S2. Whole genome alignments of human (hg19), vervet (Chlorocebus_sabeus 1.1) and rhesus macaque (rheMac2).
- Figure S3. The fission that generated vervet chromosomes 24 and 29 is mapped to a single rhesus macaque BAC CH250-181A5.
- Figure S4. Human 14 region of breakpoint origin for CAE24 and CAE29.
- Figure S5. A summary of segmental duplication content in vervet chromosomes that have experienced fissions compared to those that have not.
- Figure S6. Gene copy number for LENG1 in the primate lineage among sequenced primates (source is Ensembl gene trees database).
- Figure S7. Total subspecies filtered single nucleotide polymorphisms per vervet chromosome.
- Figure S8. Tile path of individual assembled BACs interspersed with whole genome assembly contigs for the vervet MHC region. Blue blocks represent individual clones and green blocks are interspersed whole genome assembled contigs. The assembled BAC tile path is available upon request.
- Figure S9. Cumulative distribution of additional masking achieved by masking over-represented kmers.