2011-12-02 76 views
1

So I'm given a text file whose parts I need to extract. It contains:

Blade Runner (1982) [117 min] 
Full Metal Jacket (1987) [116 min] 
Monty Python and the Holy Grail (1975) [91 min] 
The Godfather (1972) [175 min] 

and I have to turn it into this:

Movie name: Blade Runner 
Movie release year: 1982 
Movie length (in mins): 117 

Movie name: Full Metal Jacket 
Movie release year: 1987 
Movie length (in mins): 116 

Movie name: Monty Python and the Holy Grail 
Movie release year: 1975 
Movie length (in mins): 91 

Movie name: The Godfather 
Movie release year: 1972 
Movie length (in mins): 175 

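First, I iterate over each line, and then I thought I should iterate over each part of the string, but that's where I got stuck. How should I do this? Do I use a regex? How do I save the specific strings that a regex matches?

Here is the current shell of the code. It stores the three parts in variables that are used to initialize a Movie class, whose to_s method prints them in the desired format.

I know this is incorrect in many ways, but that's why I'm asking for help. variable = /regex/ stands for the line where a variable is assigned whatever the regex captures, and when /regex/ stands for the case where the regex matches.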

class Movie 
    def initialize (name, year, length) # constructor 
     @name = name 
     @year = year 
     @length = length 
    end 

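    def to_s # returns string representation of object 
     # A line that begins with + does not continue the previous expression,
     # so the string is built from adjacent literals joined with a backslash.
     "Movie name: #{@name}\n" \
     "Movie release year: #{@year}\n" \
     "Movie length (in mins): #{@length}\n" 
    end 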
end 

$movies = [] 
File.open("movies.txt").each do |line| 
    if matches = /(.*)? \((\d+).*?(\d+)/.match(line) 
    $movies << Movie.new(matches[1], matches[2], matches[3]) 
    end 
end 


$movies.each do |movie| # each yields the element itself, not its index 
    print movie.to_s 
end 

Edit:

Fixed version of the code, but the print loop at the end doesn't work properly.

Edit 2: And now it does. Thanks PeterPeiGuo!

+0

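@贏郭88888888: Can you explain your edit? It seems a bit much: it destroyed two-thirds of the question's content and didn't even come with a proper edit summary. – BoltClock

Answers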

2
m = /(.*)? \((\d+).*?(\d+)/.match("Blade Runner (1982) [117 min]") 
+0

And then what? That's not really what I asked. – Portaljacker

+0

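You edited it and made my comment look dumb. :P But thanks. Why does m become an array with those elements? (And yes, I want no whitespace before or after the stored strings, handled by the regex rather than by chomp/trim.) – Portaljacker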

+0

Awesome. So the regex basically looks for the whole expression, but only stores what matched inside the round parentheses? – Portaljacker
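To make the point in this comment concrete, here is a small sketch of what match returns for one line of the sample input, using the regex from the answer above. The variable names are only illustrative. The result is a MatchData object: index 0 holds the whole matched text, and indexes 1 and up hold only what each pair of parentheses captured.

```ruby
# One line from the sample input in the question.
line = "Blade Runner (1982) [117 min]"

# match returns a MatchData object, or nil when nothing matches.
m = /(.*)? \((\d+).*?(\d+)/.match(line)

whole    = m[0]       # everything the pattern consumed
name     = m[1]       # first pair of parentheses
year     = m[2]       # second pair
length   = m[3]       # third pair
captures = m.captures # the groups only, as a plain Array

puts "whole match: #{whole}"
puts "name: #{name}, year: #{year}, length: #{length}"
```

Note that the whole match stops after the last digits, because nothing in the pattern consumes the trailing " min]".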

1

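You could do something like this:

$movies = [] 
File.open("movies.txt").each do |line| 
    # name, then " (year) [length min]"; the space before \( stays out of the captured name 
    if matches = /^(.*) \((\d+)\) \[(\d+)\smin\]/.match(line) 
        $movies << Movie.new(matches[1], matches[2], matches[3]) 
    end 
end 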
+0

What? OK, I tagged this as homework, so I need some explanation. I know regex from JavaScript, but what are you doing here? – Portaljacker
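Since the comment asks for an explanation, here is one way to read the pattern from this answer, written as a sketch with Ruby's x flag so that each piece can carry a comment. The constant name MOVIE_LINE is made up for illustration, and a literal space has been added before the opening parenthesis so the captured name carries no trailing whitespace. With the x flag, unescaped whitespace in the pattern is ignored, which is why the literal spaces are written with a backslash.

```ruby
# MOVIE_LINE is a hypothetical name used only for this illustration.
MOVIE_LINE = /
  ^(.*)     # group 1: the movie name, greedy, trimmed back by what follows
  \ \(      # a literal space, then an opening parenthesis
  (\d+)     # group 2: the release year, one or more digits
  \)\ \[    # closing parenthesis, a space, opening square bracket
  (\d+)     # group 3: the running time in minutes
  \smin\]   # one whitespace character, the word min, closing bracket
/x

line = "Monty Python and the Holy Grail (1975) [91 min]"
m = MOVIE_LINE.match(line)

# captures returns just the three groups, so they can be destructured.
name, year, length = m.captures
puts "#{name} | #{year} | #{length}"
```

A line that does not follow the format makes match return nil, which is why the answer wraps the call in an if.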

1
# create string containing list of movies (numerous ways to load this data) 
movies = <<-MOV 
Blade Runner (1982) [117 min] 
Full Metal Jacket (1987) [116 min] 
Monty Python and the Holy Grail (1975) [91 min] 
The Godfather (1972) [175 min] 
MOV

# split movies into lines, then iterate over each line and use a regex 
# to extract the relevant data (name, year, runtime) 
data = movies.split("\n").map do |s| 
    s.scan(/([\w\s]+)\ \((\d+)\)\ \[(\d+)\ min\]/).flatten 
end 
# => [['Blade Runner', '1982', '117'], ... ] 

# iterate over data, output in desired format. 
data.each do |name, year, length| 
    puts "Movie name: #{name}\nMovie release year: #{year}\nMovie length (in mins): #{length}\n\n" 
end 
# outputs in format you specified
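As a possible variation on the approaches in this thread, the sketch below uses named capture groups, so the parts are read by name rather than by position. LINE_FORMAT, MovieRecord and the two-line input are assumptions made for illustration, not part of the original answers. The squiggly heredoc needs Ruby 2.3 or later.

```ruby
# Hypothetical pattern with named groups: name, year, length.
LINE_FORMAT = /\A(?<name>.+) \((?<year>\d+)\) \[(?<length>\d+) min\]/

# A Struct stands in for the Movie class from the question.
MovieRecord = Struct.new(:name, :year, :length) do
  def to_s
    "Movie name: #{name}\nMovie release year: #{year}\nMovie length (in mins): #{length}\n"
  end
end

# Shortened sample input; the real data would come from movies.txt.
input = <<~LIST
  Blade Runner (1982) [117 min]
  The Godfather (1972) [175 min]
LIST

# Lines that do not match produce nil, which compact drops.
movies = input.each_line.map do |line|
  m = LINE_FORMAT.match(line)
  MovieRecord.new(m[:name], m[:year], m[:length]) if m
end.compact

# Joining with a newline leaves one blank line between records.
puts movies.map(&:to_s).join("\n")
```

Named groups cost a little more typing in the pattern, but the code that builds each record no longer depends on the order of the parentheses.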