Thursday 29 June 2023

Three Pawsey Internship projects available for the Ocean Genomes Project

We have three Pawsey student internships available this summer with the Ocean Genomes Laboratory in the Minderoo OceanOmics Centre at UWA. Closing date: 7 August 2023 at 17:00 AWST (Perth time). This is a 10-week, paid program open to exceptional undergraduate (2nd/3rd year), Honours, Master’s and PhD students. Apply at the CSIRO Application page. Please get in touch if you want to know more and/or are interested in a student research project in the lab.

Optimising workflows for whole genome assembly for marine vertebrates (Project #04)

The biodiversity of marine vertebrates is critical for the health of our ocean’s ecosystem, but is under immediate threat from climate change, pollution, overfishing and habitat destruction. To advance our understanding of how best to protect and sustain our ocean life, global efforts are underway (such as the Vertebrate Genomes Project; VGP) to establish a complete library of high-quality reference genomes for all ~22,000 marine vertebrate species.

Reference genomes are pivotal not only for answering fundamental questions in marine biology and evolution, but also for guiding the conservation of species most at risk within our changing oceans, and for accurately monitoring biodiversity.

This project utilises data generated in-house by either Illumina short-read or PacBio high-fidelity long-read sequencing of Australian marine vertebrate species. The primary objective is to optimise analysis workflows on Pawsey, encompassing the entire life cycle of the data from its raw format to the ultimate outcome of a high-quality assembled genome. We have data across a diverse range of species covering small to large genome sizes.

A containerised Pawsey workflow for Diploidocus (Project #10)

Bioinformatics in general, and genomics specifically, is replete with complex workflows that do not translate easily to HPC. Frequently, genomics pipelines will incorporate many different tools and/or in-built functions with very different computational requirements in terms of multithreading, memory and IO pressure. The Diploidocus genome curation pipeline exemplifies this problem, with some lengthy single-processor steps building on data produced by highly parallelised tools, such as minimap2. As well as adapting a specific mission-critical tool, this project will help identify and establish some general principles for optimising genomics code/workflows for running on Setonix.

Diploidocus is a published genome curation and clean-up tool that utilises several different underlying bioinformatics tools and in-built algorithms. Different steps (and tools) in the pipeline have markedly different CPU, IO and memory requirements, including some lengthy non-parallelised portions. This makes it hard to run efficiently on HPC without wasting resource allocation and/or failing to take advantage of parallelisation when available.
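To make the cost of mixed resource requirements concrete, here is a rough Python sketch. The step names, runtimes and core counts are entirely hypothetical and are not measurements of Diploidocus. It compares the core-hours consumed when the whole pipeline runs inside one allocation sized for its most parallel step against the core-hours consumed when each step requests only what it needs, which is the kind of per-step allocation a workflow manager allows.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str     # hypothetical step label, for illustration only
    hours: float  # assumed wall-clock runtime of the step
    cores: int    # assumed number of cores the step can actually use


# Hypothetical pipeline: one highly parallel step followed by a long
# single-core step and a short lightly threaded step.
STEPS = [
    Step("read_mapping", hours=2.0, cores=64),
    Step("depth_analysis", hours=6.0, cores=1),
    Step("contig_filtering", hours=1.0, cores=4),
]


def monolithic_core_hours(steps):
    """Core-hours when one job holds the peak core count for the whole run."""
    peak_cores = max(step.cores for step in steps)
    total_hours = sum(step.hours for step in steps)
    return peak_cores * total_hours


def per_step_core_hours(steps):
    """Core-hours when each step is submitted with its own core request."""
    return sum(step.cores * step.hours for step in steps)


mono = monolithic_core_hours(STEPS)
split = per_step_core_hours(STEPS)
print(f"single allocation: {mono:.0f} core-hours")
print(f"per-step requests: {split:.0f} core-hours")
print(f"idle fraction of single allocation: {1 - split / mono:.0%}")
```

With these made-up numbers, most of the single allocation sits idle during the long single-core step. Real savings would depend on measured runtimes, queue times and memory requirements, none of which are modelled here.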

The expected outcome of this project is a Nextflow workflow for the deployment of the Diploidocus pipeline on HPC. This will (a) increase in-house efficiency of HPC usage, and (b) make Diploidocus more attractive as a tool to other research groups.

A containerised Pawsey workflow for high-throughput phylogenomics (Project #14)

This project aims to produce a robust and efficient phylogenomics workflow for whole genome sequencing data.

One important application of genome assemblies is to test and improve the taxonomic classification of species using large-scale genome-wide phylogenetics, known as phylogenomics. There is a previously developed Snakemake workflow for the rapid generation of phylogenomic trees from low- to mid-coverage whole genome shotgun sequencing data. This pipeline (1) creates multiple rapid draft assemblies; (2) identifies an optimal set of orthologous genes per species using BUSCO and BUSCOMP; (3) generates a multiple sequence alignment per gene; (4) generates a phylogenetic tree per gene; and (5) generates a consensus tree from all the individual gene trees.
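As an illustration of the idea behind step (2), the Python sketch below keeps only genes that are called as single-copy complete in a chosen fraction of species. It is a simplified stand-in under stated assumptions, not the algorithm used by BUSCOMP or by the existing workflow. The function name `shared_genes`, the `min_fraction` parameter, and the species and gene labels are all hypothetical. The status strings follow the categories BUSCO reports (Complete, Duplicated, Fragmented, Missing).

```python
def shared_genes(calls, min_fraction=1.0):
    """Return gene IDs rated 'Complete' in at least min_fraction of species.

    calls maps species name -> {gene ID -> BUSCO-style status string}.
    This is an illustrative filter, not the BUSCOMP selection method.
    """
    species = list(calls)
    # Collect every gene ID seen in any species.
    all_genes = set().union(*(calls[name] for name in species))
    kept = []
    for gene in sorted(all_genes):
        n_complete = sum(
            1 for name in species if calls[name].get(gene) == "Complete"
        )
        if n_complete / len(species) >= min_fraction:
            kept.append(gene)
    return kept


# Hypothetical calls for three species and three genes.
example_calls = {
    "species_1": {"geneA": "Complete", "geneB": "Complete", "geneC": "Missing"},
    "species_2": {"geneA": "Complete", "geneB": "Duplicated", "geneC": "Complete"},
    "species_3": {"geneA": "Complete", "geneB": "Complete", "geneC": "Complete"},
}

print(shared_genes(example_calls))                    # strict: every species
print(shared_genes(example_calls, min_fraction=0.6))  # tolerate some gaps
```

Relaxing `min_fraction` trades a larger gene set against more missing data per alignment, which is one of the choices a high-coverage version of the pipeline would need to revisit.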

There is now a requirement to (1) update the pipeline to be optimised for the high-coverage draft and reference genomes created by the Ocean Genomes Project, and (2) convert this pipeline from PBS/Snakemake to SLURM/Nextflow, in line with other genomics workflows being developed at the Minderoo OceanOmics Centre at UWA.

This project will adapt the wgs2tree workflow to optionally start from a set of existing genome assemblies and BUSCO orthologue annotations, and will implement a Nextflow/SLURM workflow optimised to run efficiently on Pawsey.

Thursday 22 June 2023

The Minderoo OceanOmics Centre at UWA is hiring - lab manager position available

We are recruiting a new lab manager for the Minderoo OceanOmics Centre at UWA. This is a full-time three-year position, available at Level 6 or 7, depending on experience. Applications close 11:55 PM AWST on Thursday, 13 July 2023. Please see the link below to find out more. Informal enquiries are also welcome - please contact Rich Edwards.

We are seeking a detail-oriented professional who possesses excellent organisational and managerial skills, ideally with a strong scientific background. Your operations experience will ensure smooth functioning of the laboratory, promoting a safe, productive and efficient working environment. As lab manager, you will work closely with the Lead Academic of the OceanOmics Centre to optimise operations to support the goals of Minderoo’s OceanOmics Program.

About the team

The Minderoo OceanOmics Centre at UWA is a partnership between UWA and Minderoo Foundation to undertake research and development under the direction of Minderoo’s OceanOmics Program. Part of Minderoo’s Flourishing Oceans initiative, this ambitious program aims to revolutionise ocean conservation through application of novel environmental DNA technologies. This includes the development of innovative laboratory and computational approaches to optimise and scale collection, processing and analysis of environmental DNA (eDNA) from marine environments, as well as generate a comprehensive reference library of marine vertebrate genome data. All data produced as part of the OceanOmics program will be subject to rigorous QA/QC and released publicly through open access repositories.

Located in the Bayliss Building on the UWA Crawley Campus, the OceanOmics Centre combines a joint Ocean Genomes Laboratory, an OceanOmics/eDNA Laboratory, and Computational Biology Services. Equipped with the latest high-throughput sequencing technology, liquid handling robotics, flow cytometry, and computational infrastructure, the centre is staffed by a collaborative team of scientists (from both Minderoo and UWA) and UWA technical staff. Core centre operations support the Minderoo OceanOmics Program, under the direction of senior Minderoo employees. UWA staff, including this postholder, are part of the UWA Oceans Institute, a multidisciplinary research institution with core offices in the nearby Indian Ocean Marine Research Centre building, and liaise closely with Minderoo employees for day-to-day operations.

For more details, and to apply, visit the UWA jobs site: https://external.jobs.uwa.edu.au/cw/en/job/514000.

Watch this space for some further opportunities coming soon: two laboratory research technicians, and three Pawsey student internships.