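High memory usage in Ruby: is 500B normal for an empty hash?
Our program creates a master hash in which each key is a symbol representing an ID (roughly 10-20 characters) and each value is an empty hash.
The master hash has about 800K records.
Yet we see the Ruby process using almost 400MB of memory.
This suggests that each key/value pair (symbol + empty hash) consumes about 500B.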
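For reference, the 500B figure comes from simple division. A minimal sketch of that arithmetic, assuming the whole ~400MB belongs to the master hash (which overstates it, since the interpreter and everything else in the process also occupy memory); the method name is made up for illustration:

```ruby
# Rough per-record estimate: total process memory divided by record count.
# Assumes all of the memory is attributable to the master hash.
def bytes_per_record(total_bytes, records)
  total_bytes / records
end

total_bytes = 400 * 1024 * 1024 # ~400MB observed for the Ruby process
records     = 800_000           # ~800K entries in the master hash

puts "~#{bytes_per_record(total_bytes, records)} bytes per record"
```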
Is this normal for Ruby? The code is below:
# Fill @apps with one empty hash per application ID
def load_app_ids
  cols = get_columns AppFile
  id_col = cols[:application_id]
  each_record AppFile do |r|
    @apps[r[id_col].intern] = {}
  end
end

# Takes a line, strips the record separator, and returns
# an array of fields
def split_line(line)
  line.gsub(RecordSeperator, "").split(FieldSeperator)
end

# Run a block on each record in a file, up to
# @limit records
def each_record(filename, &block)
  i = 0
  path = File.join(@dir, filename)
  # File.foreach closes the file when done, unlike a bare File.open
  File.foreach(path, RecordSeperator) do |line|
    # Get the line split into columns unless it is
    # a comment
    block.call split_line(line) unless line =~ /^#/
    # This import can take a loooong time.
    print "\r#{i}" if (i += 1) % 1000 == 0
    break if @limit && i >= @limit
  end
  # End the progress line if a counter was printed
  print "\n" if i >= 1000
end

# Return map of column name symbols to column number
def get_columns(filename)
  path = File.join(@dir, filename)
  description = split_line(File.open(path, &:readline))
  # Strip the leading comment character
  description[0].gsub!(/^#/, "")
  # Return map of symbol to column number
  Hash[description.map { |str| [str.intern, description.index(str)] }]
end
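
One way to check how much of the 500B is really the hash entries is to ask Ruby directly. The following is only a sketch: it uses `ObjectSpace.memsize_of` from the standard `objspace` library, and `sample_apps` and `approx_bytes_per_entry` are hypothetical helpers that are not part of the program above. The result covers the outer hash and the empty value hashes only; it does not include the symbol table or allocator overhead, so it will likely come out below what the OS reports for the process.

```ruby
require 'objspace'

# Hypothetical helper: build a hash shaped like @apps, with symbol keys
# of roughly 10-20 characters and empty hashes as values.
def sample_apps(count)
  apps = {}
  count.times { |i| apps["app_id_#{i.to_s.rjust(8, '0')}".intern] = {} }
  apps
end

# Hypothetical helper: approximate bytes per entry, counting the outer
# hash plus each value hash as reported by ObjectSpace.memsize_of.
def approx_bytes_per_entry(apps)
  return 0 if apps.empty?
  outer = ObjectSpace.memsize_of(apps)
  inner = apps.each_value.inject(0) { |sum, v| sum + ObjectSpace.memsize_of(v) }
  (outer + inner) / apps.size
end

apps = sample_apps(10_000)
puts "one empty hash: #{ObjectSpace.memsize_of({})} bytes"
puts "approx. per entry: #{approx_bytes_per_entry(apps)} bytes"
```

Comparing this number with the process-level figure gives a rough idea of how much memory comes from elsewhere, such as the strings created while splitting each line.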
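The hash isn't the only thing in memory, is it? – 2013-03-19 05:14:02
We're reading each line of a file; will post the code as soon as possible. Thanks. – Crashalot 2013-03-19 05:42:37
Updated the code. – Crashalot 2013-03-19 06:35:59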