生信漫谈 (Bioinformatics Chat): How to Build a Phylogenetic Tree with MEGA7


2023-11-21 06:16 | Source: compiled from the web

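Preface

Biotechnology has advanced rapidly in recent years, and being comfortable with a bioinformatics programming language or a bioinformatics tool makes research and study considerably easier. This post walks through how to build a phylogenetic tree with MEGA7.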

1. The walkthrough uses MEGA7. The download address is given below; pick the build that matches your operating system.

http://www.megasoftware.net/

 

 

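2. Prepare the sequences. They must be in FASTA format and saved with a FASTA file extension (such as .fasta); files with other extensions, such as .txt, are not recognized by the software. The example below takes the protein sequence of the Arabidopsis thaliana SPL15 gene as the query for a homologous sequence search.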

>NP_191351.1 SPL15 [organism=Arabidopsis thaliana] [GeneID=824961]
MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSN
VKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALF
TSHYSRIAPSLYGNPNAAMIKSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPE
MINNNSTDSSCALSLLSNSYPIHQQQLQTPTNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQ
YLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMGGFELYLHQQVLKQYMEPENTRAYDSSPQHF
NWSL
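Because the file format matters for the import step, it can help to check a file before opening it in MEGA7. The following is a minimal Python sketch, not part of MEGA7 itself, that parses FASTA text into (header, sequence) pairs and writes it back with one header line per record and wrapped sequence lines. The function names `parse_fasta` and `format_fasta` and the toy records are illustrative assumptions.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(records, width=70):
    """Return FASTA text with one header line and wrapped sequence lines per record."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Toy input (hypothetical records, not real database entries).
example = """>seq1 toy protein
MELLMGSGQA
ESGGSSS
>seq2 toy protein
MELLMGSG
"""

records = parse_fasta(example)
print(len(records), "records")
print(format_fasta(records, width=10))
```

Saving the output of `format_fasta` under a name ending in .fasta gives a file in the layout described in this step.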

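From the search results, the 19 protein sequences with the highest similarity are selected and downloaded for this demonstration.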

 

The downloaded sequences look like this:

>AST51816.1 Venus [Cloning vector pSTB205]
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKLICTTGKLPVPWPTLVTTLGYGLQCFARYPDHMK
QHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYITADKQKN
GIKANFKIRHNIEDGGVQLADHYQQNTPIGDGPVLLPDNHYLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKE
LLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKVCC
IHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKS
VLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTN
TWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMGGF
ELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>NP_191351.1 squamosa promoter binding protein-like 15 [Arabidopsis thaliana]
MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>KAG7634825.1 SBP domain superfamily [Arabidopsis suecica]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>CAA0387110.1 unnamed protein product [Arabidopsis thaliana]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>CAD5326126.1 unnamed protein product [Arabidopsis thaliana]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKSVLGDP
TAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTNTWRPS
SGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMGGFELYLH
QQVLKQYMEPENTRAYDSSPQHFNWSL
>KAG7561265.1 SBP domain superfamily [Arabidopsis thaliana x Arabidopsis arenosa]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYSRIAPSLYGNPNAAMI
KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>XP_002878178.1 squamosa promoter-binding-like protein 15 [Arabidopsis lyrata subsp. lyrata]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYTRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMRRSGPWQINPERESHQIMNVLSHGSSSFTTCPEITNNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSMISFSDRVTMAQPPPISTHHQYLSQTWDVMAGGKSNSHYMSPVSQISEPAEFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>KAG7566101.1 SBP domain [Arabidopsis suecica]
MELLMGSGHAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQLEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTSSLFTSRYSRIAPSLYGNPNAAMI
KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPAGFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>CAE6076605.1 unnamed protein product [Arabidopsis arenosa]
MRRGRGKGKRQNATAREDRGSGEEEKIPAFRRRGRPQKPVKDEIEEEEVELVKKTEEEEDKDDDTNGSVTSKEDVTENGR
KRKKPVESKESNITEEENGVGSKSSTEDSMKSSSSIGFRQNGSRRKNKPRRAAEAVVECNGAESGGSSSTESSSLSGGLR
FGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQ
LSEFDLEKRSCRRRLACHNERRRKPQSTTSLFTSRYSRIAPSLYGNPNAAMIKSVLGDPMAWSTAKSVMRRSGPWQINPE
RESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTPTNLWRPSSGFDSLISFSDRVTMAQPPPI
STHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGSTMGGFELSLHQQVLRQYMEPENTRAYDSSPQHFNWSL
>XP_006291402.1 squamosa promoter-binding-like protein 15 [Capsella rubella]
MELLMGSGQAESGGSSSTESSLLSGGLRFGQKIYFEDGSGSGSKNRVSTGHKSSMTTVARCQVEGCKMDLSNAKAYYSRH
KVCCIHSKSSKVIVSGLHQRFCQQCSRFHHLSEFDLEKRSCRRRLACHNERRRKPQPATLFTSHYTRIAPSLYGNANAAM
IKSVLGDPTAWSTSRSVMRSSGPWQINPVKESNQLMNVYSQESSSFTITCPEMMNNNSTDSGCALSLLSNSNPIQQQQQQ
PQTQTNIWRSSSGFDSMILDRVTMAQPPPISGHHQYLNQTLAFMAGEKSNSHYMSPVLGPSQISEPDEFQISNGTTMDGF
ELSLHQQVLRQYMEPENTRAYDSSPHYFNWSL
>CAH2063751.1 unnamed protein product [Thlaspi arvense]
MELLMGSGQNRTESYGSSSTESSSLSGGLRFGQKIYFEDGSGSGGGSNKNRVNTGRKSRTARCQVEGCRMDLSNVKTYYS
RHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTSLLTSRYSRIAPSLYGNAN
TAMIRSVLGDPTAWSTARSVMRRSAPWQINPERESHQLMNVFSHDSSSFTTTCPEMMNSNGTDSSCALSLLSNSNTNQQQ
QLLQTSTNIWRPSSGFDSANADRATMAQPPPVSNQHQYLNQTWEFMAGEKSNSHYLSPVLGLSQISEPVDFQISNGTTMG
GFELSIHQQVLRHYMEPENTRAYDSSAQHFNWSL
>XP_010516431.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLIGGSGQTESGGASSTKSSSLSGGLRFGQKIYFEDGSGSGSKNRVGTGHKSSTTTTTARCQVEGCKMDLSNAKAYYS
RHKVCCIHSKSSKVIVSGLRQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTSQYTRIAPSLYGDANA
AMMKSVLGDPTVWSTARSVMRRSGPWQISPVKESHHQLMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNSNP
IQQQQQQLQTQTHIWRPSLGFDSMTVDRVTMAQPPPISSHHQYLNQTLEFMAGEKSSSHYMSPVLGPSQISEPDEFQISN
GTTMDGFELSLHQQVLRQYMEPENTRAYDSSPHHFNWSL
>AKC05620.1 squamosa promoter-binding-like protein 15 [Cardamine hirsuta]
MELLMGSGQSESGASSSNESSSLSGGLRFGQKIYFEDGSGSGSKNRVSSTGRKSSTTTARCQVEGCRMDLSNAKTYYSRH
KVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPATTLFTSRFTRTAPSHYGNANAA
MIKSVLGDPTAWTAERSVMRRSAPWQSNPSHQVMIDFSHGSSSLTTTCPEMMNNTSTDSSCALSLLSNSNQTQQLQQQLQ
TPANIWRASSGFDSMIADRVTMAQPPPISTHHQYLNQSWEFMPGEKNDSHYMSPMSQISEPADLHMRNRTTMGGFEVSLH
QQVMRQYMAPENTRAYDSSPQHFNWSL
>XP_010504729.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLMGGSGQTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVGAGHKSSTTARCQVEGCKMDLSNAKAYYSRHK
VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTRIAASLYGNANAAMIKSVL
GDPTVWSTARSVMRRSGPWQINPVKESHHQHMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNPIQQQQQQLQ
TQTNIWRPSSGFDYMTVDRVTLAQPPPIPSHHQYLNQTLEFMTGEKNSSHYMSPALGPSQISAPDEFQISNGTTMDGFEL
SLHQQVLRQYMAPENTRAYDSSPHHFNWSL
>CAA7060637.1 unnamed protein product [Microthlaspi erraticum]
MELLMDSSQTESGGSSSIESSSLTGGLRFGQKIYFEDGSGSGAKSSKNRVNTARKSSTSTARCQVEGCRMDLSNAKTYYS
RHKVCCIHSKSSNVIVSGLHQRFHLLSEFDLEKRSCRRRLACHNERRRKPHATTNLLTSRYSRIAPSLYENANTAIFRSV
LGDTTAWSAARPVMRRSGPWQINPERESNLNVFSHGSSSFTTCPAMMNNNSTDSSCALSLLSNSNTNTNQQQQQPLQTST
DTWRPSSGFDSMIADRVTMAQPPPVSIHNQYLNQSWDFMEGEKSNSHHMSPVLGLSQISEPADFQLSNGMGGGFELSLHQ
QVLKQYMEPENTRAYDSSPQHFNWSL
>KAG2324838.1 hypothetical protein Bca52824_007566 [Brassica carinata]
MELLMGSGQDHPQSAGSSSTLSGGLRFGQKIYFEDGSGAGLSRNRVNNTGRKSMTARCQVEGCRMDLSNAKTYYSRHKVC
CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQTTTTLLTSHYSSIAPSLYGNAIRSVLG
DPTLWSTARGSSAPWQINPERESHHQLMNIISFGSSSFTNSTDSSCALSLLSNSNRNQQEQQPLQTPTNAWRPSLDFDSI
VADRVTMAQPPPVSIQNQYLNQTWEFMSGEKSNAHCISPVLGLSQISEPVDFQTSNGATMSGVELSLHQQVLRQYLEPEN
TRAYDSSHQHFNWSL
>CAH8384605.1 unnamed protein product [Eruca vesicaria subsp. sativa]
MELEMGSGQKKPESAGSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVSSTGRKSMTARCQVEGCRTDLSNAKTYYSRHKVC
CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTLLTSRYSSLYGNAIRSVLGDPTT
WSTARGSAPWKINQESDRHQLMNVISFGSSSFTTCPEMMNNNSTDSSCALSLLSNSNPNQQEQQPLQTSNTIWRPSLDFD
STVADRVTMAQPPPVSMQNQYLNQTWEFMSGEKSNAQCISPVLGQSQISEPVDFQIGTTMGGGFELSLHQQVLRQYMEPE
NTRAYDTSPQYFNWSL
>KAF8114775.1 hypothetical protein N665_0034s0114 [Sinapis alba]
MELLMGSGQNQPESAGSSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVNTGRKSTTARCQVEGCRMDLSSAKTYYSRHKVC
CIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTFLTSHYSSIAPSLYGNAIRGVLG
DSTTWSTARGSAPLQINPERESHRLMNVFSFGSSSFTNNSTDSSCALSLLSNSNPNQQEQQPLQTPTNTWRPSLDFDSIV
ADRVTMAQPPPVSVQNQYLNQTWEFMSGEKSNGQHYISPVLGLSQISEPVDFQISNGATMSGVELSLHQQVLRQYLEPEN
TRAYDSSPQHFNWSL
>XP_010427684.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLMGGTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVVTGHKSSTTTTTARCQVEGCKMDLSNAKAYYSRHK
VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLFTSHYTRIAPSLYGNANAAMI
KSVLGDPTVWSTARSVMRRSGPWQINPVKESHHQLMNVFSQESSSFTITCPEMMNNNNSTDSSCALSLLSNSNSNPIQQQ
QQQLQTQTNIWRPSLGFDSMTVDRVTLAQPPPILSHHQYMSPVLGPSQISAPDEFQISNVTTMDGFELSLHQQVLRQYME
PQNTRAYDSSPHHFNWSL

3. Import the protein sequences, either through the File menu or by dragging the file directly into the MEGA7 window.

 

 

Either method works.

4. Multiple sequence alignment. After a successful import, the sequences are displayed as shown in the figure below.

 

Choose Alignment > Align by ClustalW > OK, keeping the default parameters.

 

 

The result after alignment is shown in the figure below.
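To make concrete what the alignment feeds into the tree-building step, here is a minimal Python sketch that turns aligned sequences into a pairwise distance matrix using the p-distance (the proportion of differing residues over columns where neither sequence has a gap). MEGA offers several distance models, and p-distance is chosen here only as the simplest illustration, not as the setting used in this walkthrough. The toy alignment and the function names are assumptions.

```python
def p_distance(a, b):
    """Proportion of differing residues over gap-free columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differences / compared


def distance_matrix(alignment):
    """Return (names, matrix) of pairwise p-distances for a dict of aligned sequences."""
    names = list(alignment)
    matrix = [[p_distance(alignment[i], alignment[j]) for j in names] for i in names]
    return names, matrix


# Toy alignment (hypothetical sequences; "-" marks a gap).
aligned = {
    "A": "MELLMGSGQA",
    "B": "MELLMCSGQA",
    "C": "MELL-GSGHA",
}

names, matrix = distance_matrix(aligned)
for name, row in zip(names, matrix):
    print(name, ["%.3f" % value for value in row])
```

Sequences A and B differ at one of ten compared columns, so their distance is 0.1; the gap column is skipped in every comparison that involves C.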

 

5. Build the phylogenetic tree, using the NJ (neighbor-joining) method.
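For readers curious about what the NJ option computes, the sketch below is a small, self-contained Python version of the textbook neighbor-joining procedure operating on a distance matrix. It is an illustration of the algorithm under simplifying assumptions (no bootstrap, ties broken by first occurrence, at least three taxa), not MEGA7's implementation. The function name `neighbor_joining` and the five-taxon matrix are illustrative.

```python
def neighbor_joining(names, dist):
    """Return an unrooted neighbor-joining tree as a Newick string.

    names: list of taxon labels; dist: symmetric distance matrix (list of lists).
    """
    if len(names) < 3:
        raise ValueError("this sketch needs at least three taxa")
    nodes = list(names)
    # Dict-of-dicts so that nodes can be added and dropped by label.
    d = {a: {b: dist[i][j] for j, b in enumerate(names)} for i, a in enumerate(names)}

    while len(nodes) > 3:
        n = len(nodes)
        total = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Choose the pair minimising Q(a, b) = (n - 2) * d(a, b) - total(a) - total(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - total[a] - total[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        la = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        lb = d[a][b] - la
        new = "(%s:%.4f,%s:%.4f)" % (a, la, b, lb)
        # Distances from the new node to every other remaining node.
        d[new] = {new: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]

    # Three nodes remain: connect them to a single central node.
    a, b, c = nodes
    la = 0.5 * (d[a][b] + d[a][c] - d[b][c])
    lb = 0.5 * (d[a][b] + d[b][c] - d[a][c])
    lc = 0.5 * (d[a][c] + d[b][c] - d[a][b])
    return "(%s:%.4f,%s:%.4f,%s:%.4f);" % (a, la, b, lb, c, lc)


# Toy five-taxon distance matrix (hypothetical values).
taxa = ["a", "b", "c", "d", "e"]
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

tree = neighbor_joining(taxa, distances)
print(tree)
```

In this toy example taxa a and b are joined first, then c is attached to their parent, and the remaining three nodes are joined at the centre of the unrooted tree.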

 

 

 

6. Viewing the results

The first style is a tree drawn with branch lengths.

 

The second style is a branch-length tree with the tip labels aligned.

 

Trees in other shapes can be displayed by clicking the buttons shown above.

 

7. Save the tree as a .nwk (Newick) file, which can then be imported into a tree-drawing tool to polish the figure.
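A .nwk file is plain text: nested parentheses describe the tree, and each label may be followed by a colon and a branch length. As a quick check before handing the file to another tool, the Python sketch below writes a toy Newick string to a temporary file, reads it back, and lists the leaf labels. It assumes simple unquoted labels and that every leaf is named; the file name `example_tree.nwk`, the toy tree, and the function `leaf_names` are hypothetical.

```python
import os
import re
import tempfile


def leaf_names(newick):
    """Return leaf labels from a simple Newick string (unquoted labels only)."""
    # A leaf label directly follows "(" or ","; labels after ")" belong to internal nodes.
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Toy tree (hypothetical labels and branch lengths).
newick = "(d:2.0,e:1.0,(c:4.0,(a:2.0,b:3.0):3.0):2.0);"

# Write the tree to a .nwk file and read it back.
path = os.path.join(tempfile.mkdtemp(), "example_tree.nwk")
with open(path, "w") as handle:
    handle.write(newick + "\n")
with open(path) as handle:
    loaded = handle.read().strip()

print(leaf_names(loaded))
```

If the number of leaf labels matches the number of sequences that were aligned, the export most likely contains the whole tree.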

 

 

 

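生信漫谈

生信漫谈: get to know bioinformatics, learn bioinformatics, and get past the hurdles of getting started, so that bioinformatics techniques can clear the stumbling blocks in your research and study.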


