Inserting a large dataset into a Django model - how to delay the commit?

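I am working on a project where I need to insert large files into a model (sometimes several gigabytes). Because the files can be large, my approach is to read them line by line and insert each line into a Django model.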

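However, when an error is encountered partway through, how do I cancel the whole operation? What is the right way to make sure that no rows are committed unless the whole file has been processed without errors?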

The other option is to create all the model objects at once and bulk-insert them. Is that feasible for large datasets, and how would it work?

Here is my code:

from django.db import models

class mymodel(models.Model):
    fkey1 = models.ForeignKey(othermodel1, on_delete=models.CASCADE)
    fkey2 = models.ForeignKey(othermodel2, on_delete=models.CASCADE)
    field1 = models.CharField(max_length=25, blank=False)
    field2 = models.DateField(blank=False)
    ...
    field12 = models.FloatField(blank=False)

And inserting the data from Excel into the model:

from openpyxl import load_workbook

wb = load_workbook(datafile, read_only=True, data_only=True)
ws = wb[sheetName]
# First pass: locate the header row.
for row in ws.rows:
    if isthisheaderrow(row):
        # determine column arrangement and pass to next
        break
# Second pass: save every valid data row as a model instance.
for row in ws.rows:
    if isthisheaderrow(row):
        pass
    elif isThisValidDataRow(row):
        relevantRow = ...  # placeholder: create a list of values
        dictionary = dict(zip(columnNames, relevantRow))
        dictionary['fkey1'] = othermodel1Object
        dictionary['fkey2'] = othermodel2Object
        mymodel(**dictionary).save()


2017-03-01 ste_kwr

I should have looked harder: the commit can be delayed with the @transaction.atomic decorator. A more detailed description is given here: https://docs.djangoproject.com/en/1.11/topics/db/transactions/

With it, the code above becomes:

from django.db import transaction
from openpyxl import load_workbook

wb = load_workbook(trfile, read_only=True, data_only=True)
ws = wb[sheetName]
revenueSwitch = True
# Find the header row and remember which columns to keep.
for row in ws.rows:
    if ifHeaderReturnIndex(row, desiredColumns):
        selectedIndex = ifHeaderReturnIndex(row, desiredColumns)
        outputColumnNames = [row[i].value.replace(" ", "") for i in selectedIndex]
        # output_ws.append(outputColumnNames)
        break

@transaction.atomic
def insertrows():
    # Every save() below joins one transaction: it is committed only when
    # this function returns normally and rolled back if an exception escapes.
    for row in ws.rows:
        if ifHeaderReturnIndex(row, desiredColumns):
            pass
        elif isRowValid(row, selectedIndex):
            newrow = [row[i].value for i in selectedIndex]
            dictionary = dict(zip(outputColumnNames, newrow))
            dictionary['UniqueRunID'] = run
            dictionary['SourceFileObject'] = TrFile
            TransactionData(**dictionary).save()

insertrows()
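
A note on what actually cancels the import: transaction.atomic commits when the wrapped block finishes normally and rolls everything back when an exception propagates out of it, so a bad row has to raise an exception, rather than be skipped, if it should abort the whole file. The sketch below illustrates that all-or-nothing behaviour with Python's built-in sqlite3 module instead of Django, purely so that it runs without a project. The table, the load_rows helper and the rule that a missing amount makes a row invalid are assumptions made up for this illustration.

```python
import sqlite3


def load_rows(conn, rows):
    """Insert every row or none of them (hypothetical helper)."""
    # Used as a context manager, a sqlite3 connection commits if the block
    # finishes normally and rolls back if an exception escapes it.
    with conn:
        for name, amount in rows:
            if amount is None:
                # Assumed validation rule: a missing amount aborts the load.
                raise ValueError(f"invalid row: {name!r}")
            conn.execute(
                "INSERT INTO data (name, amount) VALUES (?, ?)", (name, amount)
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (name TEXT, amount REAL)")
conn.commit()

# The second row is invalid, so the first row must not survive either.
try:
    load_rows(conn, [("a", 1.0), ("b", None), ("c", 3.0)])
except ValueError:
    pass
count_after_failure = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]

# A clean file is committed as a whole.
load_rows(conn, [("a", 1.0), ("c", 3.0)])
count_after_success = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
```

In the Django version, the equivalent would be to raise an exception inside insertrows() when a row fails validation, instead of silently skipping it.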


2017-03-01 03:52:18

Just a note: it can also be used as a context manager (https://docs.djangoproject.com/en/1.11/topics/db/transactions/#controlling-transactions-explicitly), i.e. 'with transaction.atomic():', so there is no need to create a function just to wrap some code. – Anonymous
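
On the bulk-insert alternative raised in the question: Django's bulk_create takes a list of unsaved objects, and for files of several gigabytes it is probably safer to feed it fixed-size batches than to build every object first. Below is a minimal sketch of such a batching helper. The name batches, the batch size of 1000 and row_dicts are hypothetical, and the Django calls appear only in a comment because they need a configured project.

```python
from itertools import islice


def batches(iterable, size):
    """Yield lists of at most `size` items without loading the whole input."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical use inside a Django project (not executed here), assuming
# row_dicts is an iterable of dictionaries built as in the answer above:
#
#     with transaction.atomic():
#         objs = (TransactionData(**d) for d in row_dicts)
#         for batch in batches(objs, 1000):
#             TransactionData.objects.bulk_create(batch)

batch_sizes = [len(b) for b in batches(range(2500), 1000)]
print(batch_sizes)  # [1000, 1000, 500]
```

Because the objects come from a generator, only one batch is held in memory at a time, and the surrounding atomic block still keeps the whole file all-or-nothing.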

Very nice, thank you! –
