Working with files in strings and arrays with Ruby

I have a text file ("dict.txt") with 8K+ English words:

apple -- description text 
angry -- description text 
bear -- description text 
...

I need to delete all the text after the "--" on every line of my file.

What is the simplest and fastest way to solve this?

2013-10-30 Calirails

Is your goal to edit the file, or just to read the words into an array? – hirolau

If you read the file into an array `a` (with `a[0] = 'apple -- description text'`), just use `a.map! { |e| e[/.+--/] }`. –
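
To make the comment above concrete, here is a minimal sketch of the in-place `map!` approach. The helper name `keep_through_separator!` and the fallback for lines without a separator are assumptions added for illustration, not part of the original comment.

```ruby
# Hypothetical helper: trims each entry in place so that only the text up to
# and including the "--" separator remains.
def keep_through_separator!(lines)
  # String#[] with a regexp returns the matched substring, or nil when nothing
  # matches; the fallback keeps lines that have no separator at all.
  lines.map! { |e| e[/.+--/] || e.chomp }
end

a = ['apple -- description text', 'angry -- description text']
keep_through_separator!(a)
p a # => ["apple --", "angry --"]
```

With a real file, the array could come from something like `File.readlines('dict.txt')`. Note that this regexp keeps the "--" itself, which matches a literal reading of the question.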

Starting with:

words = [ 
    'apple -- description text', 
    'angry -- description text', 
    'bear -- description text', 
]

If you want only the word preceding the --:

words.map{ |w| w.split(/\s-+\s/).first } # => ["apple", "angry", "bear"]

Or:

words.map{ |w| w[/^(.+) --/, 1] } # => ["apple", "angry", "bear"]

If you want the word and the --:

words.map{ |w| w[/^(.+ --)/, 1] } # => ["apple --", "angry --", "bear --"]

If the goal is to create a version of the file without the descriptions:

File.open('new_dict.txt', 'w') do |fo|
    File.foreach('dict.txt') do |li|
        fo.puts li.split(/\s-+\s/).first
    end
end

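In general, to avoid scalability problems if and when your input file grows to huge proportions, use foreach to iterate over the input file and process it one line at a time. Processing speed is about the same whether you iterate line by line or slurp the whole file in and process it as a buffer or array. Slurping a huge file, however, can slow the machine to a crawl or crash your code, which makes it infinitely slower; line-by-line IO is surprisingly fast and has none of those potential problems.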
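
As a rough illustration of the line-by-line advice, the sketch below wraps the streaming loop in a reusable method. The name `strip_descriptions`, the line counter, and the temporary demo files are assumptions made for this example only.

```ruby
require 'tempfile'

# Hypothetical helper: streams src_path one line at a time and writes only the
# word before the " -- " separator to dst_path. Returns the number of lines written.
def strip_descriptions(src_path, dst_path)
  count = 0
  File.open(dst_path, 'w') do |out|
    # File.foreach yields a single line per iteration, so the whole file is
    # never held in memory at once.
    File.foreach(src_path) do |line|
      # The limit of 2 stops splitting after the first separator.
      out.puts line.chomp.split(/\s-+\s/, 2).first
      count += 1
    end
  end
  count
end

# Small demo on a temporary file standing in for dict.txt.
Tempfile.create('dict') do |src|
  src.write("apple -- description text\nangry -- description text\n")
  src.flush
  out_path = "#{src.path}.out"
  strip_descriptions(src.path, out_path)
  puts File.read(out_path)
  File.delete(out_path)
end
```

For the file in the question, the call would presumably be `strip_descriptions('dict.txt', 'new_dict.txt')`.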

2013-10-30 15:55:02

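Sn, I had some trouble finding the documentation for this. I noticed that (for the given array "words") you could have used `w[0]` instead of `w[, 1]`. Could you provide a reference, or explain? –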

It is part of String: [`String.[]`](http://www.ruby-doc.org/core-2.0.0/String.html#method-i-5B-5D). –
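
For anyone else puzzled by the second argument, here is a short sketch of the `String#[]` forms involved, using the sample line from the answer. The helper name `word_before_separator` is hypothetical.

```ruby
# Hypothetical helper wrapping the capture-group form used in the answer.
def word_before_separator(line)
  # str[regexp, n] returns the n-th capture group of the first match,
  # or nil when the regexp does not match.
  line[/^(.+) --/, 1]
end

w = 'apple -- description text'
p w[/^(.+) --/]     # whole match: "apple --"
p w[/^(.+) --/, 1]  # first capture group: "apple"
p w[0]              # plain integer index, first character: "a"
p word_before_separator('no separator here') # nil
```

So the `1` selects the first parenthesized group, whereas an integer index on its own just picks a character.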

File.read("dict.txt").gsub(/(?<=--).*/, "")

Output

apple -- 
angry -- 
bear -- 
...
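
A brief sketch of what the lookbehind is doing, plus a variant that also removes the separator, in case only the bare words are wanted. Both method names are made up for this example, and the second regexp is my own assumption rather than part of the answer above.

```ruby
# Hypothetical helper: same regexp as the answer above.
def drop_after_separator(text)
  # (?<=--) is a lookbehind: it matches the position right after "--" without
  # consuming it, and .* then removes the rest of that line only, because "."
  # does not match newlines by default.
  text.gsub(/(?<=--).*/, '')
end

# Hypothetical variant: removes the separator and the spaces or tabs before it.
def drop_separator_too(text)
  text.gsub(/[ \t]*--.*/, '')
end

sample = "apple -- description text\nangry -- description text\n"
puts drop_after_separator(sample) # prints "apple --" and "angry --"
puts drop_separator_too(sample)   # prints "apple" and "angry"
```

Note that `File.read` loads the whole file as one string, so this form suits files that fit comfortably in memory.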

2013-10-30 15:44:37 sawa

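# Keep each line up to and including "--"; index('--') avoids cutting at a hyphen inside a word.
lines_without_description = File.readlines('dict.txt').map { |line| line[0..line.index('--') + 1] }
# Write the trimmed lines to a new file, one per line.
File.open('dict2.txt', 'w') { |f| f.write(lines_without_description.join("\n")) }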

2013-10-30 15:49:37 hirolau

If you want speed, you might consider doing it with sed on the command line:

sed 's/ -- .*//' < dict.txt > new_dict.txt

This creates a new file, new_dict.txt, containing only the words.

2013-10-30 15:49:52 tessi
