Monday, 19 April 2021

Intergenerational effects of manipulating DNA methylation in the early life of an iconic invader

Another cane toad paper has hit the shelves! This is another paper from our ongoing collaboration with Lee Ann Rollins and her great team of invasion biologists and molecular ecologists. This paper once again uses our draft cane toad genome* and builds on the previous cane toad methylation analysis by PhD student Roshmi Sarma to look at some really interesting intergenerational effects. [*Genome update coming soon - watch this space!]

Could this be epigenetic inheritance? Maybe. But it could also be some kind of parental germline thing. Either way, it’s a fascinating result and further evidence to support the fact that whilst our genome may establish our genetic potential, it does not control our destiny.

And if you want to know what it takes to identify effects like this, check out the mind-boggling experiment design Figure! (PDF available on request!)

This article is part of the theme issue ‘How does epigenetics influence the course of evolution?’

Sarma RR, Crossland MR, Eyck HJF, Edwards RJ, DeVore JL, Cocomazzo M, Zhou J, Brown GP, Shine R & Rollins LA (2021): Intergenerational effects of manipulating DNA methylation in the early life of an iconic invader. Philosophical Transactions of the Royal Society B 376:20200125. [Phil Trans Roy Soc B] [PubMed]


In response to novel environments, invasive populations often evolve rapidly. Standing genetic variation is an important predictor of evolutionary response but epigenetic variation may also play a role. Here, we use an iconic invader, the cane toad (Rhinella marina), to investigate how manipulating epigenetic status affects phenotypic traits. We collected wild toads from across Australia, bred them, and experimentally manipulated DNA methylation of the subsequent two generations (G1, G2) through exposure to the DNA methylation inhibitor zebularine and/or conspecific tadpole alarm cues. Direct exposure to alarm cues (an indicator of predation risk) increased the potency of G2 tadpole chemical cues, but this was accompanied by reductions in survival. Exposure to alarm cues during G1 also increased the potency of G2 tadpole cues, indicating intergenerational plasticity in this inducible defence. In addition, the negative effects of alarm cues on tadpole viability (i.e. the costs of producing the inducible defence) were minimized in the second generation. Exposure to zebularine during G1 induced similar intergenerational effects, suggesting a role for alteration in DNA methylation. Accordingly, we identified intergenerational shifts in DNA methylation at some loci in response to alarm cue exposure. Substantial demethylation occurred within the sodium channel epithelial 1 subunit gamma gene (SCNN1G) in alarm cue exposed individuals and their offspring. This gene is a key to the regulation of sodium in epithelial cells and may help to maintain the protective epidermal barrier. These data suggest that early life experiences of tadpoles induce intergenerational effects through epigenetic mechanisms, which enhance larval fitness.

Thursday, 18 March 2021

Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

Our latest genome paper is now out at BMC Genomics. This was a second collaboration with the team behind the German Shepherd Dog genome last year, led by Bill Ballard. This time, we used a combination of BGI short reads, ONT long reads, and Hi-C scaffolding to generate a chromosome-length assembly. This is one of the most intact and complete dog genomes generated to date, and joins only a handful of published breed-specific chromosome-length assemblies.

The Basenji is particularly interesting as it sits at the base of the dog breed family tree, making it a good unbiased reference for future comparisons between breeds.

The paper also has a few nice nuggets for those interested in genome assembly. Of particular interest, our initial assembly had an artefact where the entire mitochondrial genome got assembled (in two copies) into the middle of one of the nuclear chromosomes. It is not entirely clear why this happened, but it was inserted into a NUMT (nuclear mitochondrial DNA insertion) fragment at that location. To make finding such things easier, we’ve released a new NUMT finding tool, NUMTFinder.

As with our previous dog genome, the German Shepherd Dog, we also observe that the tandem repeat of Amy2B (Amylase Alpha 2B) genes was assembled intact but with fewer copies than are present in the actual genome. (This gene is of interest for dog domestication and adaptations to a starch-rich diet.) Crucially, without looking at the raw sequencing data, it would not have been clear that the assembly under-represents Amy2B copy number. This kind of analysis can be repeated using the regcheck or regcnv run modes of Diploidocus, which estimate the copy number of a region based on its read depth versus the single-copy read depth determined from BUSCO single-copy complete genes.
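
The depth-ratio idea can be sketched in a few lines of Python. This is only an illustration of the principle and not the Diploidocus code: the function name and the toy depth values are made up, and using the median of the single-copy depths as the baseline is an assumption made here for simplicity.

```python
from statistics import median


def estimate_copy_number(region_depths, single_copy_depths):
    """Estimate region copy number as (mean region depth) / (single-copy depth).

    region_depths: per-base (or per-window) read depths across the region.
    single_copy_depths: read depths sampled from single-copy genes.
    The median is used as the single-copy baseline; this is an assumption
    for this sketch, not necessarily what any particular tool does.
    """
    if not region_depths or not single_copy_depths:
        raise ValueError("both depth lists must be non-empty")
    baseline = median(single_copy_depths)
    if baseline <= 0:
        raise ValueError("single-copy depth must be positive")
    mean_region_depth = sum(region_depths) / len(region_depths)
    return mean_region_depth / baseline


# Toy example: a region at roughly four times the single-copy depth.
copies = estimate_copy_number([120, 118, 122], [30, 31, 29, 30, 30])
print(f"Estimated copy number: {copies:.1f}")
```

If the estimate is well above the number of copies present in the assembly, the assembly is probably collapsing the repeat.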

Overall, this presents a nice case study of the need for a bit of TLC and manual curation, even when you have some very impressive completeness and contiguity statistics.

Edwards RJ, Field MA, Ferguson JM, Dudchenko O, Keilwagen K, Rosen BD, Johnson GS, Rice ES, Hillier L, Hammond JM, Towarnicki SG, Omer A, Khan R, Skvortsova K, Bogdanovic O, Zammit RA, Lieberman Aiden E, Warren WC & Ballard JWO (2021): Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome. BMC Genomics 22:188


Background: Basenjis are considered an ancient dog breed of central African origins that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact the choice of reference genome makes with regard to reference genome quality and breed relatedness.

Results: Here, we report two high quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection.

Conclusions: The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids is more suited to a pangenome approach. Collectively this work highlights the importance the choice of reference genome makes in all variation studies.

Friday, 4 December 2020

EdwardsLab at #AusEvo2020

If you missed his talk at ABACBS2020, Jack will be presenting today at the Australasian Evolution Society 2020 Conference about The role of gene duplication in the evolution of snake venoms. Two conference presentations in two weeks - not a bad way to prepare for your Honours viva post-submission. Well done, Jack!

Also, Kat Stuart will be presenting her work on invasive starlings in Zoom 2 at 13:00 AEDT. Kat’s talks are always great to listen to:

  • Katarina Stuart: What drives invasion success? Using historical museum samples to examine evolution in an invasive passerine.

Wednesday, 25 November 2020

#ABACBS2020: Unsupervised orthologous gene tree enrichment for cost-effective phylogenomic analysis and a test case on waratahs (Telopea spp.)

Stephanie Chen, Maurizio Rossetto, Marlien van der Merwe, Hervé Sauquet, Patricia Lu-Irving, Jia-Yee Yap, William Studley, Greg Bourke, Jason Bragg, Richard J. Edwards


Whole-genome shotgun sequencing is becoming increasingly common in phylogenetic research due to the falling cost of whole genome sequencing compared to traditional methods which target subsets of genomes. However, there are few existing packages for assembling putatively orthologous loci from evolutionarily diverged samples and making alignments for phylogenetic analysis from these data. Additionally, short-read Illumina sequencing data are highly accurate but at low coverages, it can be difficult to draw out meaningful phylogenomic inferences, especially for non-model organisms for which there is no reference genome available.

We have developed a scalable method of rapidly generating species trees from short-read data without the need for a reference genome. The workflow involves (1) de novo genome assembly with ABySS at a range of k values, (2) extracting the most complete BUSCO (Benchmarking Universal Single-Copy Orthologs) genes from each set of assemblies with the BUSCO Compiler and Comparison tool (BUSCOMP), (3) generating gene trees, and (4) constructing a species tree.
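
As a rough illustration of step 2, the Python sketch below picks, for each BUSCO gene, the best hit across a set of assemblies built at different k values. It is not the BUSCOMP implementation: the ranking used here (Complete over Duplicated over Fragmented over Missing, with ties broken by length), the input layout, and the assembly names are all assumptions made for the example.

```python
# Hypothetical ranking of BUSCO ratings, best first; an assumption for this sketch.
RATING_RANK = {"Complete": 3, "Duplicated": 2, "Fragmented": 1, "Missing": 0}


def best_hits(results):
    """Pick the best hit per BUSCO gene across assemblies.

    results: {assembly_name: {busco_id: (rating, length)}}
    returns: {busco_id: (assembly_name, rating, length)}, omitting genes
             that are Missing from every assembly.
    """
    best = {}
    for assembly, hits in results.items():
        for busco_id, (rating, length) in hits.items():
            candidate = (RATING_RANK[rating], length)
            current = best.get(busco_id)
            if current is None or candidate > (RATING_RANK[current[1]], current[2]):
                best[busco_id] = (assembly, rating, length)
    return {gene: hit for gene, hit in best.items() if hit[1] != "Missing"}


# Toy input: two assemblies of the same sample at different k values.
results = {
    "k41": {"g1": ("Fragmented", 300), "g2": ("Complete", 900)},
    "k61": {"g1": ("Complete", 850), "g2": ("Complete", 950), "g3": ("Missing", 0)},
}
for gene, (assembly, rating, length) in sorted(best_hits(results).items()):
    print(gene, assembly, rating, length)
```

The selected sequences per gene would then feed into the alignment and gene tree steps.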

The workflow has been applied to a whole genome shotgun sequencing waratah (Telopea spp.) dataset of five species, comprising two samples from each of the seven lineages; there are three lineages of T. speciosissima (New South Wales waratah) – coastal, upland, and southern. We have also generated a reference genome for T. speciosissima, and examine the robustness of the workflow by comparison to a reference-based approach. It is anticipated that the workflow will maximise the recovery of informative data from genomic datasets for reproducible phylogenomic studies and be especially useful for non-model organisms.

#ABACBS2020: Whole transcripts in genome assembly, annotation, and assessment: the draft genome assembly of the globally invasive common starling, Sturnus vulgaris

Katarina Stuart, Yuanyuan Cheng, Lee Rollins & Richard J. Edwards


Native to the Palearctic, the common starling (Sturnus vulgaris) is a near-globally invasive passerine that has now colonised every continent barring Antarctica. Ecological interest in the species is two-fold – they are considered a conservation risk and crop pest within the invasive ranges, while recent decades have brought with them a worrying decline in starling numbers within historical native ranges. Despite the global interest in this species, there are still fundamental knowledge gaps in our understanding of the genetics and population differences of this species across their native and invasive range. We present the Australian S. vulgaris draft genome and transcriptome to be used as a reference for further investigation into evolutionary characterisation of this ecologically significant species. An initial 10x Genomics linked-read assembly was scaffolded and gap-filled with low coverage nanopore sequencing, complemented by PacBio Isoseq full-length transcript data. Isoseq data was incorporated into assembly scaffolding, annotation, and assembly assessment to inform workflow decisions. We produced a draft assembly with a scaffold N50 size of 72.5 Mb, and assess this alongside a North American S. vulgaris draft genome, previously assembled from Illumina data. Lastly, we use these different reference genomes, alongside a non-scaffolded version of the Australian S. vulgaris genome to assess how choice of reference genome affects common population genetic downstream analysis using a global whole genome resequencing data set.
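
For readers unfamiliar with the N50 statistic quoted above, here is a minimal Python sketch of how it is computed from a list of scaffold lengths. The function name and the toy lengths are illustrative only and are not taken from the starling assembly.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example with five scaffolds totalling 30 units.
print(n50([10, 8, 6, 4, 2]))
```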

Tuesday, 24 November 2020

#ABACBS2020: The role of gene duplication in the evolution of snake venoms

Jack Clarke, Vicki Thomson & Richard Edwards


Snakes are among the most venomous animals on the planet, using their venom for defence and the capturing of prey. Snake venoms have evolved independently of other venoms in other vertebrates, and there is considerable variation between species in their proteomic composition. One of the primary mechanisms through which snake venoms are thought to evolve is the duplication, recruitment and specialisation of proteins from other tissues. In some cases, this evolution is known to involve the tandem duplication of genes resulting in chromosomal clusters of venom genes in some gene families. We have recently sequenced and assembled the genomes of two highly venomous Australian snakes: Notechis scutatus (mainland tiger snake) and Pseudonaja textilis (eastern brown snake). In conjunction with publicly available proteomes from 10 other venomous snakes and 2 non-venomous snakes, these genomes provide an excellent opportunity to examine the role that duplication and neofunctionalisation have played in snake venom evolution.

We have analysed 43 protein families known to play a role in snake venom and examined their pattern of duplication in snakes, compared to high quality reference genomes of other reptiles and non-venomous vertebrates. We find evidence for extensive duplications across some of these families, but no clear enrichment for duplication in the evolution of venom specifically. Instead, we identify a trend where numerous duplications specific to venomous snakes occur in proteins that seem predisposed to evolve by duplication and specialisation, even in non-venomous vertebrates. A subset of high-quality snake genomes was then used to further explore the nature of duplications. While tandem gene duplication is evident in some larger families, it remains absent in many.

The snake venom metalloproteinase (SVMP) family provides an excellent case study, with multiple duplication events throughout its evolutionary history in vertebrates. Part of the broader ADAM (“a disintegrin and metalloproteinase”) family of single-pass transmembrane and secreted zinc proteases, SVMP appears to have expanded by independent tandem duplications in different snake lineages. We also identify a second ADAM subfamily, ADAM20, with an abundance of venomous snake-specific duplications. Ongoing work is exploring the possible role of ADAM20 proteins in snake venoms and the role that genome assembly quality has played in our ability to robustly detect the presence or absence of gene duplication events.