CAGI5 - Frataxin Challenge

Alexey Strokach, Carles Corbi-Verge, Philip M. Kim

2018-07-06

Overview

  • Discuss the motivation for developing ELASPIC.

  • Present the ELASPIC webserver.

  • Discuss predictions made by ELASPIC and other methods for 8 mutations in Frataxin.

Introduction

  • We need tools for improving protein stability and increasing the affinity of proteins to their targets.
    • Design proteins that are easier to crystallize and that show higher yield and less aggregation when expressed in heterologous systems.
    • Design biocatalysts which remain active in inhospitable conditions, including high temperatures and pressures, acidic or basic pH, different salt types and concentrations, and admixture of organic solvents.
    • Optimize protein and peptide “biologics” to improve their shelf life and increase their affinity and specificity to desired targets.

  • We need tools which provide a mechanistic explanation as to why a variant may be deleterious or benign.
  • ELASPIC uses sequential and structural information to predict the effect of mutations on protein-folding and protein-protein interaction on a genome-wide scale.

Methods

Berliner et al. (2014) PLoS ONE 9(9): e107353.

FoldX energy function


$$ \begin{align} ΔG =& w_1 ⋅ ΔG_\text{vdw} + w_2 ⋅ ΔG_\text{solvH} + w_3 ⋅ ΔG_\text{solvP} + w_4 ⋅ ΔG_\text{hbond} + w_5 ⋅ ΔG_\text{wb} + \\ & w_6 ⋅ ΔG_\text{el}+ w_7 ⋅ ΔG_\text{clash} + w_8 ⋅ TΔS_\text{mc} + w_9 ⋅ TΔS_\text{sc} + w_{10} ⋅ ΔG_\text{kon} \end{align} $$

  • $\{w_1 ... w_{10}\}$ - tuned parameters.
  • $ΔG_\text{vdw}$ - van der Waals contributions, derived from experimental transfer energies from water to vapour.
  • $ΔG_\text{solvH}$ - hydrophobic desolvation score, scaled with the burial of the residue.
  • $ΔG_\text{solvP}$ - hydrophilic desolvation score, scaled with the burial of the residue.
  • $ΔG_\text{hbond}$ - hydrogen bonds, inferred through geometric constraints and empirical data.
  • $ΔG_\text{wb}$ - water molecules that have persistent interactions with the protein.
  • $ΔG_\text{el}$ - electrostatics, calculated using Coulomb's law.
  • $ΔG_\text{clash}$ - steric overlaps between atoms in the structure.
  • $TΔS_\text{mc}$ - the entropic penalty for fixing the backbone in a given conformation.
  • $TΔS_\text{sc}$ - the entropy cost of fixing a side chain in a particular conformation.
  • $ΔG_\text{kon}$ - electrostatic interactions between different chains.
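
The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration rather than the FoldX implementation: the dictionary keys, the function names `weighted_energy` and `ddg`, and every number are hypothetical, and the real weights $w_1 ... w_{10}$ are tuned parameters that are not given here. The order of subtraction (mutant minus wild type) is also an assumption.

```python
from typing import Dict

# One key per term in the energy function above. "mc" and "sc" stand for the
# entropy terms TΔS_mc and TΔS_sc. The key names are hypothetical.
TERMS = ("vdw", "solvH", "solvP", "hbond", "wb", "el", "clash", "mc", "sc", "kon")


def weighted_energy(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of the ten energy terms."""
    missing = [name for name in TERMS if name not in terms or name not in weights]
    if missing:
        raise KeyError(f"missing terms or weights: {missing}")
    return sum(weights[name] * terms[name] for name in TERMS)


def ddg(wild_type: Dict[str, float], mutant: Dict[str, float],
        weights: Dict[str, float]) -> float:
    """Difference in weighted energy, mutant minus wild type (assumed order)."""
    return weighted_energy(mutant, weights) - weighted_energy(wild_type, weights)


# Hypothetical numbers for illustration only; they are not FoldX output.
weights = {name: 1.0 for name in TERMS}
wild_type = {name: 0.0 for name in TERMS}
wild_type.update({"vdw": -10.0, "hbond": -5.0, "clash": 1.0})
mutant = dict(wild_type, vdw=-9.0, clash=2.5)

print(ddg(wild_type, mutant, weights))
```

With unit weights, the made-up mutant loses some van der Waals energy and gains a steric clash, so the difference comes out positive.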

Provean methodology

Source: http://provean.jcvi.org/about.php.

Provean score

Source: http://provean.jcvi.org/about.php.

ELASPIC webserver

http://elaspic.kimlab.org/

Witvliet, Strokach, et al. (2016) Bioinformatics 32(10): 1589-1591.

Performance on core mutations

Figure panels: ΔΔG datasets, phenotype datasets; training / validation, test.

Performance on interface mutations

Figure panels: ΔΔG datasets, phenotype datasets; training / validation, test.

Performance on oncogenes / tumour suppressor genes in DoCM

  • Predictions for passenger mutations (green), mutations in tumour suppressor genes (purple), and mutations in oncogenes (cyan).

Results

ELASPIC results

Figure panels: FoldX, Provean, ELASPIC.

Note: ΔΔG values are changes in the Gibbs free energy of unfolding, in kcal/mol.
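
One way to write out the convention in this note, assuming that "change" means mutant minus wild type (the slides do not state the order of subtraction, so the superscripts below are an assumption):

$$ ΔΔG_\text{unfold} = ΔG_\text{unfold}^\text{mut} - ΔG_\text{unfold}^\text{wt} = -\left( ΔG_\text{fold}^\text{mut} - ΔG_\text{fold}^\text{wt} \right) $$

Under that assumption, a negative value means the mutant unfolds more easily than the wild type, so the mutation is destabilizing. A tool that reports the change in folding free energy needs its sign flipped before it is compared with these values.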

Rosetta results

  • Protocol: ddg_monomer; weights: soft_rep_design
  • Protocol: cartesian_ddg; weights: talaris2013
  • Protocol: cartesian_ddg; weights: beta_nov16

Amber (thermodynamic integration) results
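
As background, thermodynamic integration estimates the free energy of an alchemical transformation as $ΔG = \int_0^1 \langle ∂U/∂λ \rangle_λ \, dλ$, and a thermodynamic cycle turns two such estimates (folded and unfolded state) into a ΔΔG. The sketch below shows only that arithmetic. It is not the protocol used for these slides: the λ schedule, the $\langle ∂U/∂λ \rangle$ values, and the names `trapezoid`, `dudl_folded` and `dudl_unfolded` are all hypothetical, and trapezoidal quadrature is just one possible integration scheme.

```python
from typing import Sequence


def trapezoid(x: Sequence[float], y: Sequence[float]) -> float:
    """Integrate y over x with the trapezoidal rule."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


# Hypothetical lambda windows and ensemble averages <dU/dlambda> in kcal/mol
# for the wild type -> mutant transformation. Not simulation output.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
dudl_folded = [4.0, 3.0, 2.0, 1.0, 0.0]
dudl_unfolded = [2.0, 1.5, 1.0, 0.5, 0.0]

dg_folded = trapezoid(lambdas, dudl_folded)      # wt -> mut in the folded state
dg_unfolded = trapezoid(lambdas, dudl_unfolded)  # wt -> mut in the unfolded state

# Thermodynamic cycle: the change in unfolding free energy (mutant minus wild
# type) is the unfolded-state leg minus the folded-state leg.
ddg_unfolding = dg_unfolded - dg_folded

print(dg_folded, dg_unfolded, ddg_unfolding)
```

In this made-up example the transformation costs more free energy in the folded state than in the unfolded state, so the change in unfolding free energy is negative, which corresponds to a destabilizing mutation.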

Conclusions

  • Evolutionary information can be very useful for protein structure prediction and design.

  • The ELASPIC webserver provides a user-friendly interface for evaluating the structural impact of mutations.

  • The Rosetta cartesian_ddg protocol with the talaris2013 or beta_nov16 weights produces the most accurate results.

Acknowledgements

Supervisor

  • Philip M. Kim

Members of kimlab

  • David Becerra
  • Recep Colak
  • Carles Corbi
  • Michael Garton
  • Clare Juhyun Jeon
  • Mark Sun
  • Joan Teyra

Project students

  • Daniel Witvliet