How do I import large amounts of data into Rails?

For small amounts of data, I have been using a rake task to load data from a CSV file into Rails:

desc "Import users."
task :import_users => :environment do
  # Each line holds tab-separated fields: name, age, profession.
  # File.foreach closes the file once iteration finishes.
  File.foreach("users.txt") do |line|
    name, age, profession = line.strip.split("\t")
    u = User.new(:name => name, :age => age, :profession => profession)
    u.save
  end
end

For larger files (around 50,000 records), though, this is incredibly slow. Is there a faster way to import the data?

Source

2010-10-06 grautur

You might want to take a look at activerecord-import and check out this similar thread.
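As a rough sketch of what a bulk import with activerecord-import might look like: the gem adds an `import` class method to models, which inserts many rows with far fewer SQL statements than one `save` per record. The helper name `parse_user_rows`, the sample lines, and the batch size of 1000 are made up for illustration, and the `User.import` call is guarded so the snippet also runs outside a Rails app.

```ruby
# Hypothetical helper: turn tab-separated lines into arrays of column values,
# dropping blank lines.
def parse_user_rows(lines)
  lines.map { |line| line.strip.split("\t") }
       .reject(&:empty?)
end

columns = [:name, :age, :profession]

# Sample input; in the rake task this would be File.foreach("users.txt").
rows = parse_user_rows(["alice\t30\tengineer\n", "bob\t41\tchef\n", "\n"])

# Only runs inside an app where User exists and activerecord-import is loaded.
if defined?(User) && User.respond_to?(:import)
  rows.each_slice(1000) do |batch|   # 1000 is an arbitrary batch size
    # validate: false skips model validations; drop the option to keep them.
    User.import columns, batch, validate: false
  end
end
```

Whether to pass `validate: false` is a trade-off between speed and safety, so it is worth trying both on a sample of the real file.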

Source

2010-10-06 18:52:28 aNoble

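Without any extra library (I agree that a bulk import with ar-extensions should be faster, although ar-extensions skips model validations), you can add a bit of concurrency and take advantage of a multi-core machine: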

# Returns the number of processors for Linux, OS X or Windows.
def number_of_processors
  if RUBY_PLATFORM =~ /linux/
    return `cat /proc/cpuinfo | grep processor | wc -l`.to_i
  elsif RUBY_PLATFORM =~ /darwin/
    return `sysctl -n hw.logicalcpu`.to_i
  elsif RUBY_PLATFORM =~ /win32/
    # this works for windows 2000 or greater
    require 'win32ole'
    wmi = WIN32OLE.connect("winmgmts://")
    wmi.ExecQuery("select * from Win32_ComputerSystem").each do |system|
      begin
        processors = system.NumberOfLogicalProcessors
      rescue
        processors = 0
      end
      return [system.NumberOfProcessors, processors].max
    end
  end
  raise "can't determine 'number_of_processors' for '#{RUBY_PLATFORM}'"
end

desc "Import users."
task :fork_import_users => :environment do
  procs = number_of_processors
  lines = IO.readlines('users.txt')
  nb_lines = lines.size
  slices = nb_lines / procs
  procs.times do
    # slice! removes the chunk from lines, so each child gets a distinct subset
    subset = lines.slice!(0..slices)
    fork do
      subset.each do |line|
        name, age, profession = line.strip.split("\t")
        u = User.new(:name => name, :age => age, :profession => profession)
        u.save
      end
    end
  end
  # wait for every child process to finish
  Process.waitall
end

On my machine with 2 cores, the fork version gives:

real 1m41.974s 
user 1m32.629s 
sys  0m7.318s

While with the original version:

real 2m56.401s 
user 1m21.953s 
sys  0m7.529s
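The partitioning step of the fork task above can also be sketched on its own. This is a minimal illustration rather than the code that produced the timings: `partition` is a hypothetical helper, and it assumes Ruby 2.2 or later, where the standard library's `Etc.nprocessors` reports the number of logical CPUs and could stand in for a hand-written processor count.

```ruby
require 'etc'

# Hypothetical helper: split items into at most `parts` contiguous chunks
# of near-equal size, covering every item exactly once.
def partition(items, parts)
  return [] if items.empty?
  chunk_size = (items.size.to_f / parts).ceil
  items.each_slice(chunk_size).to_a
end

workers = Etc.nprocessors            # logical CPU count (Ruby 2.2+)
chunks  = partition((1..10).to_a, 3)
# chunks => [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```

Each chunk would then be handed to one `fork` block, followed by `Process.waitall` as in the task above.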

Source

2010-10-06 19:26:27 hellvinz

ar-extensions (and its Rails 3 replacement, activerecord-import) does not have to skip model validations. That is optional, depending on your needs and speed preferences. – 2011-04-29 04:24:40

You should try FasterCSV. It has been very fast and very easy to use for me.
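FasterCSV became Ruby's standard `CSV` library from Ruby 1.9 onward, so on a current Ruby the same idea might look like the sketch below. The sample data is invented, and `col_sep: "\t"` is used on the assumption that the file is tab-separated, as in the question.

```ruby
require 'csv'

# Invented sample standing in for the contents of users.txt.
data = "alice\t30\tengineer\nbob\t41\tchef\n"

# For a file on disk, CSV.foreach("users.txt", col_sep: "\t") yields one row
# at a time instead of loading everything into memory.
rows = CSV.parse(data, col_sep: "\t")

rows.each do |name, age, profession|
  puts "#{name} (#{age}): #{profession}"
end
```

Faster parsing alone does not remove the cost of one INSERT per record, so it may help to combine this with a bulk insert.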

Source

2010-10-07 14:31:21
