One of the most important, interesting and challenging questions in biology is how new traits evolve at the molecular level. My lab employs sequence analysis techniques to interrogate protein and DNA sequences for the signals left behind by evolution. We are a bioinformatics lab but like to incorporate bench data through collaboration wherever possible.
The core research in the lab is broadly divided into two main themes:
1. Evolutionary Genomics.
Since moving to UNSW, a major focus of the lab has been the exploitation of genomic and post-genomic data to understand biological function and adaptation to novel environments. We work closely with the Ramaciotti Centre for Genomics and are involved in numerous de novo whole genome sequencing and assembly projects, using short read (Illumina), long read (PacBio & Nanopore) and linked read (10x Chromium) sequencing. The biggest of these are the cane toad genome, for which we lead the bioinformatics and assembly effort within a sequencing consortium, and the BABS Genome project, which we lead, to sequence two iconic Australian snakes. We are a member of the Oz Mammals Genomics initiative, assisting with the sequencing and assembly of Australia’s unique marsupial fauna. In 2018, we were selected as part of a team to sequence the Waratah genome in the pilot phase of the new Genomics of Australian Plants initiative.
We enjoy bringing our bioinformatics to bear on a variety of collaborative research projects. Most notably, we have an ARC Linkage Grant with Microbiogen Pty Ltd to understand how a strain of Saccharomyces cerevisiae has evolved to efficiently use xylose as a sole carbon source: something vital for second-generation biofuel production that wild yeast cannot do. We are combining comparative genomics, evolutionary genetics, RNA-Seq transcriptomics, and competition assays to understand how the novel metabolism evolved. Through deep Illumina resequencing of evolving populations, and assembling reliable complete genomes of the founding ancestors, the ultimate goal is to trace how mutations have interacted with existing genetic variation during adaptive evolution. More recently, we have received an ARC Linkage Grant with the Royal Botanic Gardens and Domain Trust, to apply genomics approaches to the challenges of rainforest tree conservation in the face of climate change and invasive pathogens. We are also collaborating with industrial and academic partners to de novo sequence, assemble, annotate and interrogate the genomes of a selection of microbes with interesting metabolic abilities.
2. Short Linear Motifs (SLiMs).
Many protein-protein interactions are mediated by Short Linear Motifs (SLiMs): short stretches of proteins (5-15 amino acids long), of which only a few positions are critical to function. These motifs are vital for biological processes of fundamental importance, acting as ligands for molecular signalling, post-translational modifications and subcellular targeting. SLiMs have extremely compact protein interaction interfaces, generally encoded by fewer than four major affinity-/specificity-determining residues. Their small size enables high functional density and evolutionary plasticity, making them frequent products of convergent “ex nihilo” evolution. It also makes them challenging to identify, both experimentally and computationally.
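To make the idea of "a few critical positions" concrete, a SLiM is often written as a regular expression in which fixed residues are spelled out and wildcard positions are left open. The short Python sketch below is an illustration only: the pattern `P..P` (two fixed prolines separated by any two residues) and the sequence are toy examples chosen for this page, not curated motif definitions or real proteins, and the `find_motif` helper is hypothetical rather than part of any lab software.

```python
import re


def find_motif(sequence, pattern):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # Wrapping the pattern in a lookahead lets matches that overlap each
    # other be reported, which matters for short, degenerate motifs.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Toy pattern (assumption): two defined positions, two wildcard positions.
TOY_PATTERN = "P..P"
# Toy sequence (assumption): made up for illustration, not a real protein.
TOY_SEQUENCE = "MSAPPLPPRGSTPKAPEE"

hits = find_motif(TOY_SEQUENCE, TOY_PATTERN)
for start, text in hits:
    print(f"match at position {start + 1}: {text}")
```

Even in this 18-residue toy sequence the pattern matches three times, which illustrates why such short, degenerate motifs are hard to distinguish from chance occurrences.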
A major focus of the lab is the computational prediction of SLiMs from protein sequences. This research originated with Rich’s postdoctoral research, during which he developed sequence analysis methods for the rational design of biologically active short peptides. He subsequently developed SLiMDisc, one of the first algorithms for successfully predicting novel SLiMs from sequence data, and coined the term “SLiM” into the bargain. This in turn led to the development of SLiMFinder, the first SLiM prediction algorithm able to estimate the statistical significance of motif predictions, which greatly increased their reliability. SLiMFinder has since spawned a number of motif discovery tools and webservers and is still arguably the most successful SLiM prediction tool on benchmarking data. Methods are made available through the SLiMSuite bioinformatics package and webservers.
Current research is looking to develop these SLiM prediction tools further and apply them to important biological questions. Of particular interest is the molecular mimicry employed by viruses to interact with host proteins and the role of SLiMs in other diseases, such as cancer. Other work is concerned with the evolutionary dynamics of SLiMs within protein interaction networks.
OTHER RESEARCH PROJECTS
In addition to its main research themes, the lab has a number of interdisciplinary collaborative projects applying bioinformatics tools and molecular evolution theory to experimental biology, often using large genomic, transcriptomic and/or proteomic datasets. These projects often involve the development of bespoke bioinformatics pipelines, and a number of open source bioinformatics tools have been generated as a result. Please see the Publications and Lab software pages for more detail, or get in touch if something catches your eye and you want to find out more. We frequently have small collaborations and/or undergraduate student research projects. Many of these are “on hold” waiting for the right person, or sometimes data, to come along. If you think that you have what it takes, get in touch!