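Select data with the same node name and merge data from an XML file
I am trying to select some data from a UniProt XML file. I can get most of what I want, but I run into trouble with the output when the same node has more than one entry. Ideally those entries would be combined into one cell.
XML code: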
<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet href="test_will7.xslt" type="text/xsl" ?>
<uniprot>
<entry dataset="Swiss-Prot" created="1993-04-01" modified="2012-11-28" version="118">
<accession>P30443</accession>
<accession>O77964</accession>
<name>1A01_HUMAN</name>
<protein>
<recommendedName>
<fullName>HLA class I histocompatibility antigen, A-1 alpha chain</fullName>
</recommendedName>
</protein>
<gene>
<name type="primary">HLA-A</name>
<name type="synonym">HLAA</name>
</gene>
<comment type="subcellular location">
<subcellularLocation>
<location>Membrane</location>
<topology>Single-pass type I membrane protein</topology>
</subcellularLocation>
</comment>
<dbReference type="GO" id="GO:0031901">
<property type="term" value="C:early endosome membrane"/>
<property type="evidence" value="TAS:Reactome"/>
</dbReference>
<dbReference type="GO" id="GO:0012507">
<property type="term" value="C:ER to Golgi transport vesicle membrane"/>
<property type="evidence" value="TAS:Reactome"/>
</dbReference>
<dbReference type="GO" id="GO:0000139">
<property type="term" value="C:Golgi membrane"/>
<property type="evidence" value="TAS:Reactome"/>
</dbReference>
</entry>
<entry dataset="Swiss-Prot" created="1986-07-21" modified="2012-11-28" version="151">
<accession>P01892</accession>
<accession>O19619</accession>
<accession>P06338</accession>
<name>1A02_HUMAN</name>
<protein>
<recommendedName>
<fullName>HLA class I histocompatibility antigen, A-2 alpha chain</fullName>
</recommendedName>
</protein>
<gene>
<name type="primary">HLA-A</name>
<name type="synonym">HLAA</name>
</gene>
<comment type="subcellular location">
<subcellularLocation>
<location>Membrane</location>
<topology>Single-pass type I membrane protein</topology>
</subcellularLocation>
</comment>
<dbReference type="GO" id="GO:0060333">
<property type="term" value="P:interferon-gamma-mediated signaling pathway"/>
<property type="evidence" value="TAS:Reactome"/>
</dbReference>
</entry>
<entry dataset="Swiss-Prot" created="1987-08-13" modified="2012-11-28" version="124">
<accession>P04439</accession>
<name>1A03_HUMAN</name>
<protein>
<recommendedName>
<fullName>HLA class I histocompatibility antigen, A-3 alpha chain</fullName>
</recommendedName>
</protein>
<gene>
<name type="primary">HLA-A</name>
<name type="synonym">HLAA</name>
</gene>
<comment type="subcellular location">
<subcellularLocation>
<location>Membrane</location>
<topology>Single-pass type I membrane protein</topology>
</subcellularLocation>
</comment>
<dbReference type="GO" id="GO:0005887">
<property type="term" value="C:integral to plasma membrane"/>
<property type="evidence" value="NAS:UniProtKB"/>
</dbReference>
<dbReference type="GO" id="GO:0019048">
<property type="term" value="P:virus-host interaction"/>
<property type="evidence" value="IEA:UniProtKB-KW"/>
</dbReference>
</entry>
</uniprot>
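My XSLT file currently looks like this. I am still doing something wrong, though, because it does not work. Maybe it is because the nodes sit at different levels?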
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<html>
<body>
<h2>My Selection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Name</th>
<th>GeneName</th>
<th>AccessionNr</th>
<th>ProteinName</th>
<th>SubcellularLocation</th>
<th>TissueSpecificity</th>
<th>GOID</th>
<th>GOName</th>
</tr>
<xsl:apply-templates/>
</table>
</body>
</html>
</xsl:template>
<xsl:template match="uniprot/entry">
<tr>
<xsl:apply-templates select="name|gene/name|accession|protein/recommendedName/fullName|comment[@type = 'subcellular location']/subcellularLocation/location|comment[@type = 'tissue specificty']/text|dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value"/>
</tr>
</xsl:template>
<xsl:template match="name|gene/name|accession|protein/recommendedName/fullName|comment[@type = 'subcellular location']/subcellularLocation/location|comment[@type = 'tissue specificty']/text|dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value">
<xsl:choose>
<xsl:when test="name()='dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value' and not(preceding-sibling::dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value)">
<td>
<xsl:value-of select="."/>
<xsl:if test="following-sibling::dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value">
<xsl:text>;</xsl:text>
<xsl:for-each select="following-sibling::dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value">
<xsl:value-of select="."/>
<xsl:if test="position()!=last()">
<xsl:text>;</xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:if>
</td>
</xsl:when>
<xsl:when test="name()='dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value' and preceding-sibling::dbReference[@type = 'GO']/@id|dbReference[@type = 'GO']/property[@type = 'term']/@value"/>
<xsl:otherwise>
<td>
<xsl:value-of select="."/>
</td>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
The output I want:
Name | GeneName | AccessionNr | ProteinName | SubcellularLocation | GOID_GOName
1A01_HUMAN | HLA-A | P30443 | HLA class I histocompatibility antigen, A-1 alpha chain | Membrane | GO:0031901- C:early endosome membrane; GO:0012507- C:ER to Golgi transport vesicle membrane; GO:0000139- C:Golgi membrane
1A02_HUMAN | HLA-A | P01892 | HLA class I histocompatibility antigen, A-2 alpha chain | Membrane | GO:0060333-P:interferon-gamma-mediated signaling pathway
1A03_HUMAN | HLA-A | P04439 | HLA class I histocompatibility antigen, A-3 alpha chain | Membrane | GO:0005887- C:integral to plasma membrane; GO:0019048- P:virus-host interaction
If that is too difficult, it could also look like this:
Name | GeneName | AccessionNr | ProteinName | SubcellularLocation | GOID | GOName
1A01_HUMAN | HLA-A | P30443 | HLA class I histocompatibility antigen, A-1 alpha chain | Membrane | GO:0031901; GO:0012507; GO:0000139 | C:early endosome membrane; C:ER to Golgi transport vesicle membrane; C:Golgi membrane
1A02_HUMAN | HLA-A | P01892 | HLA class I histocompatibility antigen, A-2 alpha chain | Membrane | GO:0060333 | P:interferon-gamma-mediated signaling pathway
1A03_HUMAN | HLA-A | P04439 | HLA class I histocompatibility antigen, A-3 alpha chain | Membrane | GO:0005887; GO:0019048 | C:integral to plasma membrane; P:virus-host interaction
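I know this is a lot, and it is quite hard to keep everything apart. I can read the code, but fixing errors or writing something new is still very difficult for me! (And I am new to XML.) Thanks!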
Please post well-formed XML. In your XML, the 'organism' element has no closing tag. – 2013-05-14 10:07:13
Copy/paste error, thanks. I have corrected it. – user1941884 2013-05-15 00:40:48
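One possible way to get the first desired layout is sketched below. The posted stylesheet likely fails for three reasons: `name()` returns only a node's name (for example `id` or `value`), never a path; the nested single quotes inside `name()='dbReference[@type = 'GO']...'` end the string literal early, so the XPath expression does not parse; and attribute nodes have no siblings, so `preceding-sibling::` and `following-sibling::` are always empty when the context node is `@id` or `@value`. The sketch avoids all of that by writing one fixed `<td>` per column and looping over the `dbReference` elements inside the last cell. It rests on a few assumptions that are not stated in the question: only the first `accession` and the `primary` gene name are wanted (as in the desired output), the `TissueSpecificity` column is left out because the sample has no such comment, and the input has no XML namespace, as in the sample. If the real file declares a default namespace on `<uniprot>`, the element names would need a prefix bound to that namespace.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html" indent="yes"/>

  <!-- Build the page and the header row once, then emit one row per entry. -->
  <xsl:template match="/">
    <html>
      <body>
        <h2>My Selection</h2>
        <table border="1">
          <tr bgcolor="#9acd32">
            <th>Name</th>
            <th>GeneName</th>
            <th>AccessionNr</th>
            <th>ProteinName</th>
            <th>SubcellularLocation</th>
            <th>GOID_GOName</th>
          </tr>
          <xsl:apply-templates select="uniprot/entry"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- One fixed cell per column: a missing element leaves an empty cell
       instead of shifting the remaining columns to the left. -->
  <xsl:template match="entry">
    <tr>
      <td><xsl:value-of select="name"/></td>
      <!-- Assumption: only the primary gene name is wanted. -->
      <td><xsl:value-of select="gene/name[@type = 'primary']"/></td>
      <!-- Assumption: only the first accession is wanted. -->
      <td><xsl:value-of select="accession[1]"/></td>
      <td><xsl:value-of select="protein/recommendedName/fullName"/></td>
      <td><xsl:value-of select="comment[@type = 'subcellular location']/subcellularLocation/location"/></td>
      <td>
        <!-- Merge every GO reference of this entry into a single cell. -->
        <xsl:for-each select="dbReference[@type = 'GO']">
          <xsl:value-of select="@id"/>
          <xsl:text>- </xsl:text>
          <xsl:value-of select="property[@type = 'term']/@value"/>
          <!-- Separator between items, but not after the last one. -->
          <xsl:if test="position() != last()">
            <xsl:text>; </xsl:text>
          </xsl:if>
        </xsl:for-each>
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

For the alternative layout with separate `GOID` and `GOName` columns, the same idea should work with two cells instead of one: each cell gets its own `xsl:for-each` over `dbReference[@type = 'GO']`, the first printing only `@id` and the second only `property[@type = 'term']/@value`, with the same `position() != last()` test for the separator.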