Proteomics Informatics (Task 1)

Requirement

Reproduce the following article:

Potential inhibitors for 2019-nCoV coronavirus M protease from clinically approved medicines. Xin Liu, Xiu-Jie Wang

Material

Sequence Resource

The protein sequences of SARS-CoV Mpro (Accession: 1UK3_A) and 2019-nCoV polyprotein orf1ab (Accession: YP_009724389.1) were downloaded from GenBank (http://www.ncbi.nlm.nih.gov). The protein sequence of 2019-nCoV Mpro was determined by aligning the SARS-CoV Mpro sequence to 2019-nCoV polyprotein orf1ab using BLAST (Altschul et al., 1990); the region of 2019-nCoV orf1ab that aligned best to SARS-CoV Mpro was selected as 2019-nCoV Mpro.

Structure Modeling

The crystal structure of SARS-CoV Mpro (PDB ID: 1UJ1) was downloaded from the Protein Data Bank (PDB, http://www.rcsb.org) (Burley et al., 2019). The structure of 2019-nCoV Mpro was predicted by the Modeller algorithm (Webb and Sali, 2016) using the structure of SARS-CoV Mpro as template. Structural details were visualized with the Visualizer function of Discovery Studio 3.5 (Accelrys Software Inc).

Reproduction Steps

Download Resource

Implement Tools

Personal Note

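Points that the paper's authors still need to justify:

  1. Why was Blastp used for the pairwise alignment (it is normally used for large-scale similarity searches against sequence databases), rather than an alignment tool such as Clustal or Needle, to determine the matching region?
  2. Why was Modeller chosen for the modelling, and what were the modelling parameters?
  3. 1uk3A and 1uj1A have identical SEQRES sequences but differ somewhat in 3D structure; on what basis was 1uj1A chosen as the template?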
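
To make the first point concrete, here is a minimal sketch of a semi-global ("fitting") alignment, which aligns the whole query against the best-matching stretch of a longer target. It is not what the paper did and not what Needle or Clustal implement; the function name and the simple match/mismatch/linear-gap scores are assumptions chosen for illustration (a real run would use a substitution matrix such as BLOSUM62 and affine gaps). The sequences are toy strings, not the real proteins.

```python
def fit_query_to_target(query, target, match=1, mismatch=-1, gap=-2):
    """Align all of `query` to the best region of `target`.

    Returns (score, start, end) where target[start:end] is the matched
    region (0-based, end exclusive). Scoring values are illustrative.
    """
    m, n = len(query), len(target)
    # score[i][j]: best score for query[:i] against a target region ending at j.
    # Row 0 stays 0, so the alignment may start anywhere in the target for free.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        score[i][0] = i * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # The alignment may also end anywhere in the target: take the best last-row cell.
    end = max(range(n + 1), key=lambda j: score[m][j])
    # Trace back to row 0 to find where the matched region starts.
    i, j = m, end
    while i > 0:
        if j > 0:
            sub = match if query[i - 1] == target[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                i, j = i - 1, j - 1
                continue
        if score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return score[m][end], j, end


# Toy example: the query occurs exactly once inside the target.
best_score, start, end = fit_query_to_target("HEAGAW", "PPPPHEAGAWQQQ")
print(best_score, start + 1, end)  # 1-based inclusive region, as in the BLAST output
```

The table needs one cell per (query residue, target residue) pair, so a 306-residue query against a polyprotein of several thousand residues is still manageable in plain Python, though slow compared with a dedicated tool.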

Result of Blastp

fig_blastp_2019-nCoV

Within the aligned region the two sequences are highly similar, so the region is suitable as input for homology modelling.
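
The similarity can be put into a number. The sketch below computes percent identity from two already-aligned, equal-length strings. The function name is hypothetical, and the convention used (identical residues divided by columns where neither sequence has a gap) is only one of several; BLAST, for instance, divides by the full alignment length. The input strings are toy examples, not the real alignment.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


# Toy alignment: one gapped column is skipped, 3 of the remaining 4 columns match.
print(percent_identity("ACD-F", "ACEGF"))
```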

>lcl|Query_64123:3264-3569 YP_009724389.1 orf1ab polyprotein [Severe acute respiratory syndrome coronavirus 2]
SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQA
GNVQLRVIGHSMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSF
LNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYA
AVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRT
ILGSALLEDEFTPFDVVRQCSGVTFQ
>lcl|Query_64123:4217-4233 YP_009724389.1 orf1ab polyprotein [Severe acute respiratory syndrome coronavirus 2]
TDTPKGPKVKYLYFIKG
>lcl|Query_64123:1405-1414 YP_009724389.1 orf1ab polyprotein [Severe acute respiratory syndrome coronavirus 2]
KYKGIKIQEG

Whichever criterion the hits are sorted by (E-value, score, identity, …), the best-aligned region is residues 3264-3569 of YP_009724389.1 (306 residues).

Extract this region and save it in a new PIR-format file, YP_009724389_1UK3A.ali, for Modeller to use.

>P1;YP_009724389_1UK3A
sequence:YP_009724389_1UK3A:::::::0.00: 0.00
SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQA
GNVQLRVIGHSMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSF
LNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYA
AVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRT
ILGSALLEDEFTPFDVVRQCSGVTFQ*
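
The slicing and file writing can also be scripted instead of done by hand. The following is a minimal sketch; `slice_region` and `to_pir` are hypothetical helper names, and the header line simply reuses the field layout of the .ali file shown above. The demo uses a toy sequence rather than the real polyprotein.

```python
def slice_region(sequence, start, end):
    """Return residues start..end of `sequence` (1-based, inclusive, as BLAST reports them)."""
    return sequence[start - 1:end]


def to_pir(code, sequence, width=70):
    """Format one target sequence as a PIR entry; the sequence ends with '*'."""
    header = f">P1;{code}\nsequence:{code}:::::::0.00: 0.00"
    terminated = sequence + "*"
    # Break the terminated sequence into fixed-width lines.
    lines = [terminated[i:i + width] for i in range(0, len(terminated), width)]
    return header + "\n" + "\n".join(lines) + "\n"


# Toy example; for the real case one would pass the full orf1ab sequence
# with start=3264 and end=3569, then write the result to the .ali file.
region = slice_region("MESLVPGFNEKTHVQLSL", 3, 8)
print(to_pir("toy_region", region, width=4))
```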

Modelling Script of Modeller

from modeller import *
from modeller.automodel import *

env = environ()
# Redo the target-template alignment with Modeller's own align2d
aln = alignment(env)
mdl = model(env, file='1uj1', model_segment=('FIRST:A','LAST:A'))
aln.append_model(mdl, align_codes='1uj1A', atom_files='1uj1.pdb')
aln.append(file='YP_009724389_1UK3A.ali', align_codes='YP_009724389_1UK3A')
aln.align2d()
aln.write(file='YP_009724389_1UK3A_modeller.ali', alignment_format='PIR')
# Start modelling
a = automodel(env, alnfile='YP_009724389_1UK3A_modeller.ali',
              knowns='1uj1A', sequence='YP_009724389_1UK3A',
              assess_methods=(assess.DOPE, assess.GA341))
a.starting_model = 1
a.ending_model = 5
a.make()

Modelling result of Modeller (B99990001)
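
The script builds five models and assesses each with DOPE and GA341, but the post does not say how B99990001 was picked. One common approach, sketched here as an assumption rather than the method actually used, is to take the successfully built model with the lowest DOPE score from `a.outputs` after `a.make()`. The helper name `best_by_dope` is hypothetical, and the scores in the demo list are made-up placeholders, not results from this run.

```python
def best_by_dope(outputs):
    """Return the successfully built model with the lowest (best) DOPE score.

    `outputs` is expected to look like Modeller's `a.outputs`: a list of dicts
    with the keys 'name', 'failure' and 'DOPE score'.
    """
    ok_models = [m for m in outputs if m.get("failure") is None]
    if not ok_models:
        raise ValueError("no model was built successfully")
    return min(ok_models, key=lambda m: m["DOPE score"])


# Made-up placeholder values, only to show the shape of the data.
demo_outputs = [
    {"name": "YP_009724389_1UK3A.B99990001.pdb", "failure": None, "DOPE score": -3.0},
    {"name": "YP_009724389_1UK3A.B99990002.pdb", "failure": None, "DOPE score": -2.0},
    {"name": "YP_009724389_1UK3A.B99990003.pdb", "failure": "error", "DOPE score": -9.0},
]
best = best_by_dope(demo_outputs)
print(best["name"], best["DOPE score"])
```

In the real script this would be called as `best_by_dope(a.outputs)` once modelling has finished.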

Another way to build the model: SWISS-MODEL auto-modelling

fig_SWISS-MODEL_2019-nCoV

fig_SWISS-MODEL_2019-nCoV_result

Modelling result of SWISS-MODEL

Comparison & Reflection

Superimposition of 6lu7_A with 1uj1_A (FATCAT)

Superimposition of 6lu7_A with the SWISS-MODEL result (FATCAT)

  • For 6lu7_A with 1uj1_A
    • These two structures are significantly similar, with a P-value of 0.00e+00 (raw score is 884.27).
    • They have 299 equivalent positions with an RMSD of 1.19 Å without twists.
  • For 6lu7_A with the SWISS-MODEL result
    • These two structures are significantly similar, with a P-value of 0.00e+00 (raw score is 884.30).
    • They have 299 equivalent positions with an RMSD of 1.19 Å without twists.
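
For reference, an RMSD over N equivalent positions is the square root of the mean squared distance between paired atoms after superposition. The sketch below shows only that final arithmetic on coordinates assumed to be already superimposed; it does not perform FATCAT's alignment or superposition, and the function name and toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between paired 3D points (already superimposed)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum the squared distance of every equivalent pair, then average and take the root.
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy coordinates: every point is shifted by exactly 1 Å along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(a, b))
```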

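The recently crystallized 6lu7 shows that 1uj1, the modelling template, is itself structurally very similar to it, and the sequence similarity is higher still. So before 6lu7 was crystallized, the homology modelling result was quite valuable.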

Since there remain doubts about the paper mentioned above, one should examine this task more carefully.

  • https://swissmodel.expasy.org/repository/species/2697049
This post is licensed under CC BY 4.0 by the author.