
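trRosetta: Folding a Protein De Novo on a Desktop PC (Preferably One with a Good Graphics Card)

Author: Yuan Liu, distinguished postdoctoral fellow, Chu Wang group, College of Chemistry and Molecular Engineering, Peking University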

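I. Introduction

At last year's CASP, AlphaFold beat many long-established teams and showed what industrial-scale deep learning can do. The method itself, however, was built largely from approaches already known in academia. First, a deep learning approach pioneered by Jinbo Xu [1] uses multiple sequence alignment data to generate geometric restraints for the target protein, including residue-residue contact distance distributions and backbone dihedral angle distributions. The structure is then optimized by direct gradient descent and finally refined with Rosetta. The step that optimizes the initial structure is the notable one: unlike the traditional fragment-based approach, it needs no large-scale random sampling and yields a high-quality fold directly, a clear demonstration of the power of deep learning.

The idea of mining multiple sequence alignments for protein folding information has been developing for decades, and it rests on joint progress in sequencing technology and algorithms. Sequence data has grown exponentially, giving the algorithms enough data to work with. The algorithmic progress is just as significant: the earliest methods considered only the correlation between two positions treated in isolation; later algorithms such as Gremlin [2] consider each position together with the whole sequence; and Jinbo Xu's method brings in supervised learning, which in effect adds information from all known proteins (I recently attended a talk by Prof. Xu and found it very inspiring).

As the input information has grown, the output has become richer too. What started as a binary classification of whether two residues are in contact has developed into predicting distance distributions between residues, and even distributions of various dihedral and torsion angles, all from a single deep learning model. Unlike traditional fragment libraries (databases that store short-range geometric restraints), deep learning essentially builds a predictive model that contains both long-range and short-range geometric restraints (Yaoqi Zhou, Haipeng Gong and colleagues showed that a deep learning model can produce better fragments than a database [3]). The model is also differentiable in principle, so both the precision of the information and the way structures are sampled improve qualitatively.

The recently published trRosetta method [4] (see the work of Jianyi Yang at Nankai University) brings these ideas together: it is reported to exceed AlphaFold in prediction accuracy, and the entire calculation can be run locally. Let us now walk through folding a protein de novo on a desktop machine.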

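How trRosetta works:

trRosetta's deep learning model takes a sequence alignment file and predicts the distributions of inter-residue distances and orientations. These are converted into smooth Rosetta restraints, which are then used for energy-minimization-based modeling.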
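To make the idea of turning a predicted distribution into a restraint more concrete, the sketch below converts the binned distance probabilities of one residue pair into an energy-like profile, using a negative log-ratio against a power-law reference. It is a simplified illustration under stated assumptions, not the trRosetta script's own code: the function name `distance_potential` is hypothetical, the bin layout is assumed, and the defaults simply borrow the `ALPHA`, `MEFF`, `DCUT` and `DSTEP` values from the params.json shown further down in this post.

```python
import numpy as np

def distance_potential(prob, dstep=0.5, dcut=19.5, alpha=1.57, meff=1e-4):
    """Convert binned distance probabilities for one residue pair into an
    energy-like profile (lower is more favourable).

    prob: 1-D array; prob[k] is the probability of the k-th distance bin.
    Assumption: bins are `dstep` wide and the last bin is centred at `dcut`.
    """
    prob = np.asarray(prob, dtype=float)
    nbins = prob.shape[0]
    # Bin centres in ascending order, spaced dstep apart and ending at dcut
    centers = dcut - dstep * np.arange(nbins - 1, -1, -1)
    # Reference state: last-bin probability scaled by (d / dcut) ** alpha
    reference = (prob[-1] + meff) * (centers / dcut) ** alpha
    # Negative log-ratio; meff keeps the logarithm finite for empty bins
    energy = -np.log((prob + meff) / reference)
    return centers, energy

# Toy distribution with a sharp peak in bin 10 (values are made up)
toy = np.full(32, 0.01)
toy[10] = 0.69
centers, energy = distance_potential(toy)
print(centers[np.argmin(energy)])  # distance at which the profile is lowest
```

A profile of this kind is tabulated on a grid, so some form of interpolation is still needed before a minimizer can use it as a smooth restraint.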

II. Using trRosetta

Dependencies:

TensorFlow 1.13 or 1.14
The trRosetta pretrained model
PyRosetta

Install the dependency:

conda install tensorflow==1.14

Download the pretrained model (270.5 MB):

wget https://files.ipd.uw.edu/pub/trRosetta/model2019_07.tar.bz2
tar xf model2019_07.tar.bz2

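To obtain PyRosetta, see: http://www.pyrosetta.org/dow

Note: this article uses the N-terminal SH2 domain of p120RasGAP (PDB ID: 6pxb) as the example; the sequence is 109 residues long.

1. Generate the multiple sequence alignment (MSA)

trRosetta takes a sequence alignment file as input, so we can use the Gremlin2 or MapPred server to quickly generate the contact map and the raw MSA. Several servers can do this:

Gremlin: https://gremlin2.bakerlab.org/

MapPred: http://yanglab.nankai.edu.cn/MapPred/

Here we use Gremlin as the example. The test sequence has an Nf value of 216.8, so the coevolution signal is fairly rich.

Click to download the alignment, which contains 3652 aligned sequences.

Note 1: sequences in the MSA must not be wrapped across lines. Each line must hold one complete aligned sequence, as shown below.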

>seq1
SISTRIGEYRSAQSKEDLIQKYLNQLPGSLCVFFKFLPSVRSFVATHASGI…
>seq2
AFPMRIADYRSAQSKEDMIQRFLNGISGSRCLFFKFLPSVRSFVATHANGI…
>seq3
AISEVINDYRIAESKEDIIRMLFQNLSNLPLLFFKFLPSMNSFVMSHASMP…
>seq4
HVSMRITDYRSAQSKEDLIQKYLNHLPNALCIFFKFLPSVRSFVATHAQGI…
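If a downloaded alignment does wrap sequences over several lines, a small helper can join them back together. The following is a minimal sketch, assuming a plain FASTA-style file in which every record starts with a `>` header; the function name `unwrap_fasta` is hypothetical, and the demo sequences are short fragments of the example above, split by hand.

```python
def unwrap_fasta(lines):
    """Join wrapped sequence lines so that each record becomes exactly
    one header line followed by one sequence line."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)  # continuation of the current sequence
    if header is not None:
        records.append((header, "".join(chunks)))
    return [text for head, seq in records for text in (head, seq)]

# Two records whose sequences were wrapped after 10 residues
wrapped = [">seq1", "SISTRIGEYR", "SAQSKEDLIQ", ">seq2", "AFPMRIADYR", "SAQSKEDMIQ"]
for line in unwrap_fasta(wrapped):
    print(line)
```

The function accepts any iterable of lines, so an open file handle can be passed in directly and the result written back out with `"\n".join(...)`.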

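2. Generate the geometric restraint file with trRosetta

Next, download the part of trRosetta that is hosted on GitHub (https://github.com/gjoni/trRosetta). A single command then produces the geometric information file in npz format.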

python [path1]/network/predict.py -m [model_dir] [msa] [output.npz]
This step takes roughly 5 to 10 minutes. The log at completion looks like this:

Instructions for updating:
Use standard file APIs to check for files with this prefix.
./model2019_07/model.xaa – done
./model2019_07/model.xac – done
./model2019_07/model.xae – done
./model2019_07/model.xab – done
./model2019_07/model.xad – done
This produces the npz file.
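Before moving on, it can be worth checking what the npz file actually contains. The sketch below loads every array in the file and collapses a distance distribution into a contact-probability map. Treat the layout as an assumption rather than documented fact: it assumes a `dist` array of shape (L, L, n_bins) in which bin 0 means "no contact" and bins 1 to 12 cover distances below 8 Å. The helper names are hypothetical, so verify the keys and shapes on your own file first.

```python
import numpy as np

def load_predictions(path):
    """Load every array stored in an npz file into a plain dict."""
    with np.load(path) as data:
        return {name: data[name] for name in data.files}

def contact_map(dist, first_bin=1, last_bin=13):
    """Sum the probability mass of the short-distance bins for every pair.

    dist: array of shape (L, L, n_bins) with per-pair distance-bin
    probabilities. The default slice is an assumption about the bin layout
    (bins 1..12 = distances below 8 Å); adjust it if your file differs.
    """
    dist = np.asarray(dist)
    return dist[:, :, first_bin:last_bin].sum(axis=-1)
```

A typical session would print `{k: v.shape for k, v in load_predictions("output.npz").items()}` to see the available arrays, then pass the distance array to `contact_map` and compare the result with the contact map from the MSA server. The file name `output.npz` is a placeholder.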

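3. Generate the energy-minimized model with PyRosetta

First install PyRosetta (see the tutorial on this site). The minimization script can then be downloaded from the trRosetta website (http://yanglab.nankai.edu.cn/trRosetta/).

On a Mac, the WDIR path has to be changed: open the params.json file in the trRosetta/data folder.

On Linux no change is needed.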

{
"WDIR" : "./",
"PCUT" : 0.05,
"PCUT1" : 0.5,
"EBASE" : -0.5,
"EREP" : [10.0,3.0,0.5],
"DREP" : [0.0,2.0,3.5],
"PREP" : 0.1,
"SIGD" : 10.0,
"SIGM" : 1.0,
"MEFF" : 0.0001,
"DCUT" : 19.5,
"ALPHA" : 1.57,
"DSTEP" : 0.5,
"ASTEP" : 15.0
}
Simply change WDIR to the current path.
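The same change can be made programmatically instead of by hand. This is a minimal sketch, assuming params.json is valid JSON with straight double quotes (curly quotes picked up from a web page will not parse); `set_wdir` is a hypothetical helper, not part of trRosetta.

```python
import json
import os

def set_wdir(params_path, wdir=None):
    """Rewrite the WDIR entry of a params.json file and return the updated dict.

    If wdir is None, the current working directory is used.
    """
    if wdir is None:
        wdir = os.getcwd()
    with open(params_path) as handle:
        params = json.load(handle)
    params["WDIR"] = wdir  # all other parameters are left untouched
    with open(params_path, "w") as handle:
        json.dump(params, handle, indent=4)
    return params
```

Calling `set_wdir("trRosetta/data/params.json", "./")` would reproduce the setting shown above; the path is written relative to wherever the script is run from, so adjust it to your own layout.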

Generate the model with the following command:

python [path2]/trRosetta.py [npz] [fasta] model.pdb

If the MSA is of good quality, generating a few models is enough, since the final structures converge. Each run takes only a few minutes.
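Generating a handful of models is easy to script. The sketch below only builds and runs the command shown above several times with different output names. The positional arguments mirror that command (npz, fasta, output PDB); the helper names and all file names are hypothetical, and the script path has to be supplied by the reader.

```python
import subprocess

def build_commands(script, npz, fasta, n_models=5, prefix="model"):
    """Return one command (as an argument list) per model to generate."""
    return [
        ["python", script, npz, fasta, f"{prefix}_{i}.pdb"]
        for i in range(1, n_models + 1)
    ]

def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)

# Preview the commands without executing anything (file names are placeholders)
for cmd in build_commands("trRosetta.py", "output.npz", "seq.fasta", n_models=3):
    print(" ".join(cmd))
```

Passing the list to `run_all` executes the runs sequentially; the resulting PDB files can then be superimposed to check that they converge.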

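Test result:

Apart from the N- and C-termini, the core of the protein domain is predicted with quite high accuracy.

III. Using the trRosetta online server

We will skip this part. The whole point here is the magic of folding a protein on a personal PC; otherwise it would not be nearly as cool.

See: http://yanglab.nankai.edu.cn/trRosetta/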

References:

  1. Wang S, Sun S, Li Z, Zhang R, Xu J (2017) Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model. PLoS Comput Biol 13(1): e1005324. doi:10.1371/journal.pcbi.1005324
  2. Kamisetty H, Ovchinnikov S, Baker D (2013) Assessing the utility of coevolution-based residue–residue contact predictions in a sequence- and structure-rich era. Proceedings of the National Academy of Sciences 110(39): 15674-15679.
  3. Wang T, Qiao Y, Ding W, et al. (2019) Improved fragment sampling for ab initio protein structure prediction using deep neural networks. Nat Mach Intell 1: 347–355. doi:10.1038/s42256-019-0075-7
  4. Yang J#, Anishchenko I#, Park H, Peng Z, Ovchinnikov S, Baker D* (2019) Improved protein structure prediction using predicted inter-residue orientations. PNAS, in press.
  5. Protein structure determination using metagenome sequence data. Science 355(6322): 294-298 (2017). doi:10.1126/science.aah4043


