2015-02-05 83 views

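How do I get the scientific name for a given GenBank accession code with Biopython?

Does anyone know how I can get the scientific name (or all of the features) of a GenBank record using only its accession code and Biopython? For example: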

>>> from Bio import Entrez
>>> Entrez.email = "[email protected]"
>>> # someFunction is a placeholder: I don't know which Entrez call to use here
>>> Input = Entrez.someFunction(db="nucleotide", term="AY851612")
>>> output = Entrez.read(Input)
>>> print(output)

"Austrocylindropuntia subulata" 

Or, even better:

>>> print(output)

"LOCUS AY851612 892 bp DNA linear PLN 10-APR-2007 
DEFINITION Opuntia subulata rpl16 gene, intron; chloroplast. 
ACCESSION AY851612 
VERSION AY851612.1 GI:57240072 
KEYWORDS . 
SOURCE chloroplast Austrocylindropuntia subulata 
ORGANISM Austrocylindropuntia subulata 
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; 
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons; 
Caryophyllales; Cactaceae; Opuntioideae; Austrocylindropuntia. 
REFERENCE 1 (bases 1 to 892) 
AUTHORS Butterworth,C.A. and Wallace,R.S. 
..." 

Thanks, everyone! =)

Have you read the [corresponding section](http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc108) of the Biopython tutorial on accessing Entrez resources? – MattDMo 2015-02-05 21:59:55

Yes, I read Chapter 9 on "Accessing NCBI's Entrez databases", but it focuses on GI numbers rather than GB codes (accession codes). =( – 2015-02-05 22:07:23

Answer

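Note that output is a dictionary, so you can access whichever fields you need. Also, you'll want to use efetch rather than esearch: esearch only returns the IDs of matching records, whereas efetch downloads the record itself.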

In [1]: from Bio import Entrez

In [2]: from Bio import SeqIO

In [3]: Entrez.email = '##############' 

In [28]: handle = Entrez.efetch(db="nucleotide", id="AY851612", rettype="gb", retmode="text") 

In [29]: x = SeqIO.read(handle, 'genbank') 

In [30]: print(x) 
ID: AY851612.1 
Name: AY851612 
Description: Opuntia subulata rpl16 gene, intron; chloroplast. 
Number of features: 3 
/date=10-APR-2007 
/sequence_version=1 
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta', 'Tracheophyta', 'Spermatophyta', 'Magnoliophyta', 'eudicotyledons', 'Gunneridae', 'Pentapetalae', 'Caryophyllales', 'Cactineae', 'Cactaceae', 'Opuntioideae', 'Austrocylindropuntia'] 
/data_file_division=PLN 
/references=[Reference(title='Molecular Phylogenetics of the Leafy Cactus Genus Pereskia (Cactaceae)', ...), Reference(title='Direct Submission', ...)] 
/keywords=[''] 
/accessions=['AY851612'] 
/gi=57240072 
/organism=Austrocylindropuntia subulata 
/source=chloroplast Austrocylindropuntia subulata 
Seq('CATTAAAGAAGGGGGATGCGGATAAATGGAAAGGCGAAAGAAAGAAAAAAATGA...AGA', IUPACAmbiguousDNA()) 

In [31]: x.description 
Out[31]: 'Opuntia subulata rpl16 gene, intron; chloroplast.' 
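
If you want this as a reusable script rather than an interactive session, the steps above could be wrapped up roughly like this. It is only a sketch: `fetch_record` and `summarize_record` are names made up for illustration, not part of Biopython, and it assumes the record carries `organism` and `taxonomy` annotations as the one printed above does.

```python
def fetch_record(accession, email):
    """Download one GenBank record by accession and parse it into a SeqRecord.

    Hypothetical helper; needs network access and Biopython installed.
    """
    # Imported inside the function so that summarize_record() below can be
    # tried out on its own, without Biopython or a network connection.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address with each request
    handle = Entrez.efetch(
        db="nucleotide", id=accession, rettype="gb", retmode="text"
    )
    try:
        # SeqIO.read expects exactly one record in the handle
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()


def summarize_record(record):
    """Pick the organism-related fields out of a parsed record.

    Hypothetical helper; works on any object with .id and .annotations.
    """
    annotations = record.annotations
    return {
        "id": record.id,
        # .get() returns None instead of raising if the field is missing
        "organism": annotations.get("organism"),
        "taxonomy": list(annotations.get("taxonomy", [])),
    }
```

You would call it as `summarize_record(fetch_record("AY851612", "you@example.org"))`, with your own address in place of the placeholder. For the record shown above, the `organism` entry should match the `/organism=` line in the printed output. If you would rather have the raw flat-file text from the question, calling `handle.read()` on the efetch handle instead of passing it to `SeqIO.read` should give you that.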

Awesome! I just need to do x.annotations['organism'] to get the scientific name. Thanks! – 2015-02-05 22:31:08
