Davey NE, Shields DC & Edwards RJ (2006): SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent. Nucleic Acids Res. 34(12):3546-54.
Many important interactions of proteins are facilitated by short, linear motifs (SLiMs) within a protein’s primary sequence. Our aim was to establish robust methods for discovering putative functional motifs. The strongest evidence for such motifs is obtained when the same motifs occur in unrelated proteins, evolving by convergence. In practise, searches for such motifs are often swamped by motifs shared in related proteins that are identical by descent. Prediction of motifs among sets of biologically related proteins, including those both with and without detectable similarity, were made using the TEIRESIAS algorithm. The number of motif occurrences arising through common evolutionary descent were normalized based on treatment of BLAST local alignments. Motifs were ranked according to a score derived from the product of the normalized number of occurrences and the information content. The method was shown to significantly outperform methods that do not discount evolutionary relatedness, when applied to known SLiMs from a subset of the eukaryotic linear motif (ELM) database. An implementation of Multiple Spanning Tree weighting outperformed two other weighting schemes, in a variety of settings.