Instant Notes - ISMB 2022 3DSIG COSI

Here are notes on some interesting articles that appeared in the 3DSIG COSI of ISMB 2022, as well as related work on the same topics.

Protein Structure Comparison

Peter Røgen presented a novel method that applies knot theory to find topological obstructions to superposing one protein backbone onto another. This protein structure comparison method takes self-intersections and self-avoiding morphs into account. He previously used generalized Gauss integrals and proposed the scaled Gauss metric as geometric measures of protein structures.
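To give a feel for what a Gauss integral measures, here is a minimal sketch of the simplest one, the writhe of a backbone treated as a polyline of C-alpha coordinates. It is a crude midpoint approximation of the double integral and not Røgen's implementation; the function name `approximate_writhe` and the toy helix are illustrative assumptions.

```python
import numpy as np


def approximate_writhe(coords):
    """Crude midpoint approximation of the writhe (a Gauss integral).

    coords: (n, 3) array of consecutive backbone points (e.g. C-alpha atoms).
    Hypothetical helper for illustration only.
    """
    coords = np.asarray(coords, dtype=float)
    segments = coords[1:] - coords[:-1]            # tangent vectors t_i
    midpoints = 0.5 * (coords[1:] + coords[:-1])   # sample points r_i

    # Pairwise differences r_i - r_j and their lengths.
    diff = midpoints[:, None, :] - midpoints[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Pairwise cross products t_i x t_j, then the triple product with r_i - r_j.
    cross = np.cross(segments[:, None, :], segments[None, :, :])
    numer = np.einsum("ijk,ijk->ij", cross, diff)

    # Skip the diagonal (i == j), where the distance is zero.
    mask = ~np.eye(len(segments), dtype=bool)
    terms = np.zeros_like(numer)
    terms[mask] = numer[mask] / dist[mask] ** 3
    return terms.sum() / (4.0 * np.pi)


# Toy usage: a two-turn helix and its mirror image.
t = np.linspace(0.0, 4.0 * np.pi, 50)
helix = np.stack([np.cos(t), np.sin(t), 0.3 * t], axis=1)
mirror = helix * np.array([1.0, 1.0, -1.0])
writhe_helix = approximate_writhe(helix)
writhe_mirror = approximate_writhe(mirror)
```

The writhe changes sign under reflection and vanishes for planar curves, which is why such integrals are useful as chirality-aware geometric descriptors.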

Kempen et al. developed a new approach to fast protein structure search that discretizes tertiary interactions into a structural alphabet learned by a VQ-VAE. They emphasized its advantages over alphabets that discretize the local backbone (related to the work of Alexandre G. de Brevern’s group).

Fig 1 of Kempen et al.

However, they did not mention related work based on 3D Zernike polynomials, which supports both monomeric and oligomeric queries.

Fig 3 of Guzenko et al.

Other related works also discretize tertiary interactions:

Fig 1 and Fig 3 of Shi et al. & Fig 1 of Pražnikar et al.

Particularly relevant are the works of Gevorg Grigoryan’s group:

Summary Fig of Zheng et al. & Fig 1 of Mackenzie et al. & Fig 1 of Zhou et al.
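The discretization step shared by these structural-alphabet approaches can be sketched as a nearest-centroid lookup: each residue gets a descriptor vector, and the vector is replaced by the letter of the closest codebook entry. This is only a minimal sketch under stated assumptions; the descriptors, the codebook, and the function name `encode_to_structural_alphabet` are hypothetical, and in a VQ-VAE the codebook would be learned rather than given.

```python
import numpy as np

# Letters used to label the discrete states (an arbitrary 20-letter choice).
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def encode_to_structural_alphabet(descriptors, codebook):
    """Map per-residue descriptor vectors to letters via the nearest centroid.

    descriptors: (n_residues, dim) array, hypothetical per-residue features.
    codebook:    (n_states, dim) array of centroids, n_states <= len(ALPHABET).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    if len(codebook) > len(ALPHABET):
        raise ValueError("codebook has more states than available letters")

    # Squared Euclidean distance from every residue to every centroid.
    sq_dist = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    states = sq_dist.argmin(axis=1)
    return "".join(ALPHABET[s] for s in states)


# Toy usage with a 3-state codebook in a 2D descriptor space.
toy_codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
toy_descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [0.1, 0.8], [0.2, 0.1]])
toy_string = encode_to_structural_alphabet(toy_descriptors, toy_codebook)
```

Once a structure is reduced to such a string, fast sequence search machinery can be reused for structure search, which is the main appeal of the approach.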

New Sequence Alignment

Inspired by the field of protein contact prediction, particularly the Direct Coupling Analysis (DCA) methodology, Talibart et al. used Potts models, which capture direct couplings between positions (i.e. coevolution) in addition to positional composition (i.e. positional conservation). They align two sequences by aligning the two Potts models inferred from the corresponding multiple sequence alignments (MSAs), solving the alignment problem with Integer Linear Programming (ILP). Their method can improve the alignment of remotely related protein sequences in tractable time. Following this idea, a natural extension is to use Restricted Boltzmann Machines (RBMs) or even deep neural networks to build theoretically more powerful models.
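To make the idea of "aligning two Potts models" more concrete, the sketch below scores one given alignment by summing dot-product similarities between aligned field vectors and between aligned coupling matrices. The dot-product similarity, the `coupling_weight` parameter, and the function name are assumptions for illustration; the exact scoring function and the ILP search over alignments used by Talibart et al. are not reproduced here.

```python
import numpy as np


def potts_alignment_score(fields_a, couplings_a, fields_b, couplings_b,
                          aligned_pairs, coupling_weight=1.0):
    """Score a fixed alignment between two Potts models (illustrative only).

    fields_x:      (L_x, q) positional fields h_i(a).
    couplings_x:   (L_x, L_x, q, q) pairwise couplings J_ij(a, b).
    aligned_pairs: list of (i, k) with position i of A aligned to k of B,
                   sorted by increasing i and k.
    """
    # Conservation term: similarity of aligned positional fields.
    field_score = 0.0
    for i, k in aligned_pairs:
        field_score += float(np.dot(fields_a[i], fields_b[k]))

    # Coevolution term: similarity of couplings between aligned position pairs.
    coupling_score = 0.0
    for idx, (i, k) in enumerate(aligned_pairs):
        for j, l in aligned_pairs[idx + 1:]:
            coupling_score += float(np.sum(couplings_a[i, j] * couplings_b[k, l]))

    return field_score + coupling_weight * coupling_score


# Toy usage: align a random model (3 positions, 2 states) with itself.
rng = np.random.default_rng(0)
toy_fields = rng.normal(size=(3, 2))
toy_couplings = rng.normal(size=(3, 3, 2, 2))
identity_alignment = [(0, 0), (1, 1), (2, 2)]
self_score = potts_alignment_score(toy_fields, toy_couplings,
                                   toy_fields, toy_couplings,
                                   identity_alignment)
```

The coupling term is what makes the problem harder than profile-profile alignment: the score of aligning one pair of positions depends on which other pairs are aligned, hence the need for a combinatorial solver such as ILP.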

Interestingly, Petti et al. recently proposed another approach to multiple sequence alignment that is similar in idea but quite different in implementation. They implemented a smooth, differentiable version of the Smith-Waterman pairwise alignment algorithm via differentiable dynamic programming, and designed a method called Smooth Markov Unaligned Random Field (SMURF) that takes unaligned sequences as input and jointly learns the MSA. They also showed that such a differentiable alignment module helps improve structure prediction results over those obtained with the initial MSAs.
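The core trick behind a smooth Smith-Waterman can be sketched by replacing every `max` in the classic recursion with a temperature-scaled log-sum-exp. The version below is a minimal NumPy forward pass with a linear gap penalty, written as an assumption-laden illustration rather than the authors' implementation; the function names and parameters are hypothetical.

```python
import numpy as np


def smooth_max(values, temperature):
    """Temperature-scaled log-sum-exp, a differentiable surrogate for max."""
    scaled = np.asarray(values, dtype=float) / temperature
    peak = scaled.max()  # subtract the peak for numerical stability
    return temperature * (peak + np.log(np.exp(scaled - peak).sum()))


def smooth_smith_waterman(similarity, gap_penalty=1.0, temperature=1.0):
    """Smoothed local alignment score for a residue-residue similarity matrix."""
    similarity = np.asarray(similarity, dtype=float)
    n, m = similarity.shape
    H = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i, j] = smooth_max(
                [0.0,                                          # start a new local alignment
                 H[i - 1, j - 1] + similarity[i - 1, j - 1],   # match / mismatch
                 H[i - 1, j] - gap_penalty,                    # gap in the second sequence
                 H[i, j - 1] - gap_penalty],                   # gap in the first sequence
                temperature,
            )
    # Smooth maximum over all possible alignment end points.
    return smooth_max(H[1:, 1:].ravel(), temperature)


def hard_smith_waterman(similarity, gap_penalty=1.0):
    """Classic Smith-Waterman score with a linear gap penalty, for comparison."""
    similarity = np.asarray(similarity, dtype=float)
    n, m = similarity.shape
    H = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i, j] = max(0.0,
                          H[i - 1, j - 1] + similarity[i - 1, j - 1],
                          H[i - 1, j] - gap_penalty,
                          H[i, j - 1] - gap_penalty)
    return H.max()


# Toy usage: three perfectly matching positions.
toy_similarity = np.array([[2.0, -1.0, -1.0],
                           [-1.0, 2.0, -1.0],
                           [-1.0, -1.0, 2.0]])
hard_score = hard_smith_waterman(toy_similarity)
cold_score = smooth_smith_waterman(toy_similarity, temperature=1e-3)
warm_score = smooth_smith_waterman(toy_similarity, temperature=1.0)
```

As the temperature approaches zero the smooth score approaches the classic one, while at higher temperatures every cell contributes, so gradients with respect to the similarity matrix are well defined. In an autodiff framework those gradients would be obtained automatically; this NumPy sketch only shows the forward computation.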

AlphaFold2 and RoseTTAFold Downstream Analysis

New Fold?

Bordin et al. reported a new CATH-Assign protocol (utilizing Foldseek for fast structure comparison), which is used to analyze AlphaFoldDB and detect new superfamilies. It seems that AlphaFold2 yields a certain number of “novel” structures. However, this should be treated with caution, since the predicted structures are not always “true” and the structure comparison methods may not be robust enough.

Predicting the Impact of Mutations

Sen et al. used both AlphaFold and RoseTTAFold to predict the structures of protein domains without known experimental structures, and performed subsequent functional predictions based on those predicted structures to help estimate the effect of disease-associated missense mutations. Combining two models in this way to obtain better results is a kind of ensemble approach.

Toolbox

Requires further investigation of usability.


Cited as:

@online{zhu2022instant-notes-on-ISMB-2022-3DSIG-COSI,
        title={Instant Notes - ISMB 2022 3DSIG COSI},
        author={Zefeng Zhu},
        year={2022},
        month={July},
        url={https://naturegeorge.github.io/blog/2022/07/instant-notes/},
}