Parsing HTML with PHP: I have a problem. I need to write a parser for a web page whose structure is as follows:
<TABLE WIDTH=80%>
<tr><td colspan=7><BR><BR></td></tr>
<TR>
<Td colspan=7><FONT FACE="arial" align=left><B><A NAME="TEST">Anagrafica</B><br></TH>
</TR>
<tr><td colspan=7></td></tr>
<TR>
<TH ALIGN=LEFT ><FONT COLOR="#AA0000" FACE="arial" SIZE="2">Name</FONT></TH>
<TH></TH>
<TH ALIGN=LEFT ><FONT COLOR="#AA0000" FACE="arial" SIZE="2">Surname</FONT></TH>
<TH></TH>
<TH ALIGN=LEFT ><FONT COLOR="#AA0000" FACE="arial" SIZE="2">ID</FONT></TH>
<TH></TH>
<TH ALIGN=LEFT ><FONT COLOR="#AA0000" FACE="arial" SIZE="2">Code</FONT></TH>
</TR>
<tr>
<TD COLSPAN="7">
<HR SIZE="1" NOSHADE></TD>
<TR>
<TR>
<TD ALIGN="left" VALIGN="TOP" NOWRAP><FONT SIZE="1" FACE="arial">Mario</FONT> </TD>
<TD WIDTH="10"><VALIGN="TOP"><FONT SIZE="1" FACE="arial"> </FONT></TD>
<TD ALIGN="CENTER" VALIGN="TOP" NOWRAP><P ALIGN="CENTER"><FONT SIZE="1" FACE="arial"> Mario </FONT></TD>
<TD WIDTH="10"><VALIGN="TOP"><FONT SIZE="1" FACE="arial"> </FONT></TD>
<TD ALIGN="LEFT" VALIGN="TOP" NOWRAP><FONT SIZE="1" FACE="arial">1</FONT></TD>
<TD WIDTH="10"><VALIGN="TOP"><FONT SIZE="1" FACE="arial">a</FONT></TD>
<TD ALIGN="LEFT" VALIGN="TOP" NOWRAP><FONT SIZE="1" FACE="arial">132</FONT></TD>
<TR>
<TD ALIGN="left" VALIGN="TOP" NOWRAP><FONT SIZE="1" FACE="arial">Mario</FONT> </TD>
<TD WIDTH="10"><VALIGN="TOP"><FONT SIZE="1" FACE="arial"> </FONT></TD>
<TD ALIGN="CENTER" VALIGN="TOP" NOWRAP><P ALIGN="CENTER"><FONT SIZE="1" FACE="arial"> Mario </FONT></TD>
<TD WIDTH="10"><VALIGN="TOP"><FONT SIZE="1" FACE="arial"> </FONT></TD>
<TD ALIGN="LEFT" VALIGN="TOP" NOWRAP><FONT SIZE="1" FACE="arial">1</FONT></TD>
<TD WIDTH="10"><VALIGN="TOP"><FONT SIZE="1" FACE="arial">a</FONT></TD>
<TD ALIGN="LEFT" VALIGN="TOP" NOWRAP><FONT SIZE="1" FACE="arial">132</FONT></TD>
<TR>
I am trying to use this script:
$start = strpos($content,'<Td colspan=7><FONT FACE="arial" align=left><B><A NAME=');
if ($start == TRUE) {
$end = strpos($content,'</TABLE>',$start) + 8;
$table = substr($content,$start,$end-$start);
preg_match_all("|<TD(.*)</TD>|U",$table,$rows);
$x = 1;
$counter = 1;
echo "<table class=\"TFtable\">";
foreach ($rows[0] as $row){
if ((strpos($row,'<TR')===false)){
preg_match_all("|<TD(.*)</TD>|U",$row,$cells);
$status[$x] = strip_tags($cells[0][0]);
$x = $x+1;
$counter = $counter+1;
}
if ($counter % 7 == 1) {
echo "<tr><td>{$status[2]} - {$status[4]} <br> {$status[6]} - {$status[1]}</td></tr>\n";
$x = 1;
}
}
echo "</table>";
}
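This gives me the data from the 4 columns, but the last field, $status[1], shows up on the second row, as if it actually belonged to row 2.
For example,
Mario Rossi 1 213
Mario Bianchi 2 324
is displayed as
Mario Rossi 1
Mario Bianchi 2 213
Where am I going wrong?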
Simple: use [DOM](http://php.net/dom). You should not parse HTML by hand. – 2014-11-25 14:53:02
Obligatory: http://stackoverflow.com/a/1732454/1902010 – ceejayoz 2014-11-25 14:53:54
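To illustrate the DOM suggestion from the comments, here is a minimal sketch rather than a drop-in fix. It assumes each data row has exactly 7 `<td>` cells, with the values in cells 1, 3, 5 and 7 and spacers in between, as in the markup from the question. The function name `extractRows` and the simplified sample HTML are hypothetical, made up for this example; the sample values are taken from the expected output in the question.

```php
<?php
// Hypothetical, simplified sample modeled on the data rows in the question.
$html = <<<HTML
<table>
<tr><th>Name</th><th></th><th>Surname</th><th></th><th>ID</th><th></th><th>Code</th></tr>
<tr><td>Mario</td><td> </td><td> Rossi </td><td> </td><td>1</td><td> </td><td>213</td></tr>
<tr><td>Mario</td><td> </td><td> Bianchi </td><td> </td><td>2</td><td> </td><td>324</td></tr>
</table>
HTML;

// Hypothetical helper: returns one [name, surname, id, code] array per data row.
function extractRows(string $html): array
{
    $doc = new DOMDocument();
    // Legacy markup produces libxml warnings; collect them instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        // Assumption: a data row has 7 cells; indexes 1, 3 and 5 are spacers.
        if (count($cells) === 7) {
            $rows[] = [$cells[0], $cells[2], $cells[4], $cells[6]];
        }
    }
    return $rows;
}

foreach (extractRows($html) as [$name, $surname, $id, $code]) {
    echo "$name $surname $id $code\n";
}
```

Because the cells are grouped per `<tr>` element, no manual counter is needed to decide where a row ends, so a value cannot slip into the following row the way it does with the modulo check.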