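Extracting the first 2 table cells of every table row with Nokogiri

I have a table and want to use Nokogiri to extract the contents of the first two cells in each table row. I'm having some trouble and would appreciate any help. Here is what I have so far: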
irb(main):001:0> require 'nokogiri'
=> true
irb(main):002:0>
irb(main):003:0* @doc = Nokogiri::HTML::DocumentFragment.parse <<-EOHTML
irb(main):004:0" <body>
irb(main):005:0" <div class="c">
irb(main):006:0" <table>
irb(main):007:0" <tr>
irb(main):008:0" <td>test</td><td>test</td><td>test</td><td>test</td>
irb(main):009:0" </tr>
irb(main):010:0" <tr class="even">
irb(main):011:0" <td>test</td><td>test</td><td>test</td><td>test</td>
irb(main):012:0" </tr>
irb(main):013:0" <tr>
irb(main):014:0" <td>test</td><td>test</td><td>test</td><td>test</td>
irb(main):015:0" </tr>
irb(main):016:0" <tr class="even">
irb(main):017:0" <td>test</td><td>test</td><td>test</td><td>test</td>
irb(main):018:0" </tr>
irb(main):019:0" </table>
irb(main):020:0" </div>
irb(main):021:0" </body>
irb(main):022:0" EOHTML
irb(main):026:0> @doc.css("div.c > table").search("table/tr/td")
=> ...
irb(main):027:0> @doc.css("div.c > table").search("table/tr/td[position()>2]")
Nokogiri::CSS::SyntaxError: unexpected '>' after '#<Nokogiri::CSS::Node:0x2b7bc20>'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/css/parser_extras.rb:87:in `on_error'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/css/parser_extras.rb:62:in `parse'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/css/parser_extras.rb:79:in `xpath_for'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/css.rb:23:in `xpath_for'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:111:in `block (2 levels) in css'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:109:in `map'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:109:in `block in css'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:239:in `block in each'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:238:in `upto'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:238:in `each'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:105:in `css'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:83:in `block in search'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:80:in `each'
from C:/RailsInstaller/Ruby1.9.2/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0-x86-mingw32/lib/nokogiri/xml/node_set.rb:80:in `search'
from (irb):27
from C:/RailsInstaller/Ruby1.9.2/bin/irb:12:in `<main>'
irb(main):028:0>
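Hi Vivien, your approach handles all matched table cells in the same way. In my example, the two cells in each row are actually related, and I need both of their values. Is there a way to extract them while keeping that relationship? For example, how can I get the first 2 cells in each row and join them? Thanks. – 2012-02-13 15:00:52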
@Yousui Select the first `td` of each row, then, while iterating over those cells, use [`td.next_element`](http://nokogiri.org/Nokogiri/XML/Node.html#method-i-next_element) to find the second `td` in the same row. – Phrogz 2012-02-13 23:23:18