Sunday, 14 February 2016

Lorne #Genome2016 poster 132: PacBio sequencing and comparative genomics of three Saccharomyces cerevisiae strains

Richard J. Edwards, Åsa Pérez-Bercoff, Tonia L. Russell, Zhiliang Chen, Marc R. Wilkins, Paul V. Attfield & Philip J.L. Bell. F1000Research 2016, 5:172 (poster) (doi: 10.7490/f1000research.1111305.1).

Abstract

PacBio Single Molecule Real Time (SMRT™) sequencing is rapidly becoming the technology of choice for de novo whole genome sequencing. The long read lengths and random error of PacBio data make genome assembly considerably easier and more accurate than with short read data. Here, we report on de novo genome sequencing and assembly of three Saccharomyces cerevisiae genomes using the PacBio RSII at the UNSW Ramaciotti Centre for Genomics. A haploid reference yeast genome strain, S288C, and two novel diploid strains were sequenced as part of a larger functional genomics project. For each strain, 20kb SMRT Bell library preps were performed and sequenced on two SMRT Cells using the P6-C4 chemistry with read lengths of up to 53.3 kb. Whole genome de novo assemblies were then generated through the PacBio SMRT Portal.

We are using the S288C data to explore performance in comparison to the published genome as a reference. An initial assembly of S288C yielded over 99.97% genome coverage at 99.99% accuracy on only 26 contigs, with 16/17 reference chromosomes (16 nuclear chromosomes plus mitochondrion) essentially returned as a single, complete contig. The long reads enable accurate reconstruction of tandemly repeated genes (except >900kb of rRNA repeats), transposition and chromosomal translocations. We are now using the S288C data to optimise the assembly process and derive assembly settings for the two novel diploid strains. To this end, we have developed a new pipeline for the comparative assessment of high quality whole genomes against a reference, which we are now adapting for the additional challenge of appropriately handling diploid data.
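As a rough illustration of what a "genome coverage" figure like the one above measures, the sketch below computes the percentage of a reference sequence covered by aligned contig regions, counting overlapping alignments only once. It is a minimal sketch and not the pipeline described in the poster: the function names, the half-open `(start, end)` interval representation and the example coordinates are all assumptions made for illustration.

```python
def covered_bases(intervals):
    """Total reference bases covered by half-open (start, end) intervals.

    Overlapping or adjacent intervals are merged so that each base is
    counted once. (Hypothetical helper, not part of any published tool.)
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block (if any) and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Interval overlaps or touches the current block: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def percent_coverage(intervals, ref_length):
    """Percentage of a reference of length ref_length covered by intervals."""
    return 100.0 * covered_bases(intervals) / ref_length


# Made-up alignment coordinates on a 1,000 bp reference.
hits = [(0, 100), (50, 150), (200, 250)]
print(percent_coverage(hits, 1000))
```

In practice the intervals would come from aligning assembled contigs to the reference chromosomes, and accuracy would be assessed separately from the aligned bases.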

Friday, 12 February 2016

Quantitative Proteomics of the Infectious and Replicative Forms of Chlamydia trachomatis

Skipp PJS, Hughes C, McKenna T, Edwards R, Langridge J, Thomson NR & Clarke IN (2016): Quantitative Proteomics of the Infectious and Replicative Forms of Chlamydia trachomatis. PLoS ONE 11(2): e0149011. doi:10.1371/journal.pone.0149011

Abstract

The obligate intracellular developmental cycle of Chlamydia trachomatis presents significant challenges in defining its proteome. In this study we have applied quantitative proteomics to both the intracellular reticulate body (RB) and the extracellular elementary body (EB) from C. trachomatis. We used C. trachomatis L2 as a model chlamydial isolate for our study since it has a high infectivity:particle ratio and there is an excellent quality genome sequence. EBs and RBs (>99% pure) were quantified by chromosomal and plasmid copy number using PCR, from which the concentrations of chlamydial proteins per bacterial cell/genome were determined. RBs harvested at 15h post infection (PI) were purified by three successive rounds of gradient centrifugation. This is the earliest possible time to obtain purified RBs, free from host cell components in quantity, within the constraints of the technology. EBs were purified at 48h PI. We then used two-dimensional reverse phase UPLC to fractionate RB or EB peptides before mass spectroscopic analysis, providing absolute amount estimates of chlamydial proteins. The ability to express the data as molecules per cell gave ranking in both abundance and energy requirements for synthesis, allowing meaningful identification of rate-limiting components. The study assigned 562 proteins with high confidence and provided absolute estimates of protein concentration for 489 proteins. Interestingly, the data showed an increase in TTS capacity at 15h PI. Most of the enzymes involved in peptidoglycan biosynthesis were detected along with high levels of muramidase (in EBs) suggesting breakdown of peptidoglycan occurs in the non-dividing form of the microorganism. All the genome-encoded enzymes for glycolysis, pentose phosphate pathway and tricarboxylic acid cycle were identified and quantified; these data supported the observation that the EB is metabolically active. The availability of detailed, accurate quantitative proteomic data will be invaluable for investigations into gene regulation and function.
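The "molecules per cell" figures mentioned above rest on a simple unit conversion, sketched here under stated assumptions: an absolute protein amount in femtomoles is converted to a molecule count with Avogadro's constant and divided by the number of genome copies in the sample, treating one genome copy as one bacterial cell. The function name and the example numbers are hypothetical, and the calculation used in the paper may differ in detail.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_per_cell(protein_fmol, genome_copies):
    """Estimate protein molecules per cell.

    protein_fmol  -- absolute protein amount in the sample, in femtomoles
    genome_copies -- genome copy number in the same sample (e.g. from PCR);
                     assumed here to equal the number of cells
    """
    molecules = protein_fmol * 1e-15 * AVOGADRO  # fmol -> mol -> molecules
    return molecules / genome_copies


# Made-up example: 5 fmol of a protein in a sample containing 1e7 genome copies.
print(molecules_per_cell(5.0, 1e7))
```

Ranking proteins by such per-cell estimates is what allows abundance comparisons between the EB and RB forms.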