2017-05-28 601 views

Python: importing data from a large CSV into an SQLite database

I am trying to import data from a CSV file into an SQLite database using a Python script.

My CSV rows look like this (from an Excel worksheet downloaded from Quandl):

Date  Open High  Low Last Close TotTrQt Turnover (Lacs) 
2017-05-26 2625 2626.85 2564.65 2570.05 2578.25 681275 17665.43 
2017-05-25 2577 2637.55 2568 2615.05 2624.6 2047047 53333.77 
2017-05-24 2534.8 2570 2529.65 2567.1 2559.15 1267274 32252.28 
2017-05-23 2533.2 2564.15 2514 2523.7 2521.7 1374298 34776.45 
2017-05-22 2510 2553.75 2510 2535 2531.35 831970 21054.61 
2017-05-19 2536.2 2540.55 2486 2503.85 2507.15 893022 22384.3 
2017-05-18 2450 2572 2442.25 2525 2536.2 2569297 64894.78 
2017-05-17 2433.5 2460.75 2423 2450 2455.35 1438099 35137.29 
2017-05-16 2380 2435 2373.45 2425.1 2429.15 1800513 43397.03 
2017-05-15 2375.1 2377.95 2341.6 2368 2365.1 908802 21380.43 

To create the DB table, I used the following script:

import sqlite3 

try: 
    db = sqlite3.connect('NSETCS') 
    cursor=db.cursor() 
    print 'Executing: Create Table SQL' 
    cursor.execute('''CREATE TABLE NSETCS (DATE TEXT, OPEN REAL, HIGH REAL, LOW REAL, LAST REAL, CLOSE REAL,\ 
    TOTALTRADEQUANTITY REAL, TURNOVER REAL)''') 
    ## since the above statement is DDL, no explicit commit is required
except Exception as E: 
    print "Error=",E 
finally: 
    db.close() 

To insert the data into the table rows I use the script below, but the insert fails with a float conversion error. Any guidance would be much appreciated.

import sqlite3 

try: 
    infile = open (r'F:\mypractise_python\day11\NSE-TCS.csv','r') 
    content = infile.readlines() 
except IOError as E: 
    print "Error: ", E 
try: 
    db = sqlite3.connect('NSETCS') 
    cursor = db.cursor() 

    for line in content: 
     line =line.strip() 
     columns = line.split(',') 
     if line == '' or columns[0] == 'Date': 
      continue 
     date = columns[0].strip() 
     open_stock = float(columns[1].strip()) 
     high = float(columns[2].strip()) 
     low = float(columns[3].strip()) 
     last= float(columns[4].strip()) 
     close= float(columns[5].strip()) 
     tot_trade_qt= float(columns[6].strip()) 
     turnover= float(columns[7].strip()) 
     cursor.execute('''insert into NSETCS values (:date, :open_stock, :high, :low, :last, :close, :tot_trade_qt, :turnover)''',\ 
         {'date':date, 'open_stock':open_stock, 'high':high, 'low':low, 'last':last, 'close':close,\ 
         'tot_trade_qt':tot_trade_qt, 'turnover':turnover}) 

except Exception as E: 
    print "Error:", E 
else: 
    db.commit() 

db.close() 
infile.close() 

Any error message?

Can you give us a sample of your CSV rows?

Error: could not convert string to float.

Answer

The CSV sample does not match the code: the code skips a row (presumably the header) whose first cell is "DATE", whereas in the CSV it is "Date".
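
To make the mismatch concrete, here is a minimal sketch (Python 3) of what a case-sensitive header check does. It uses the header and one data row from the sample above, and it assumes the actual file is comma-separated. The helper name `parse_rows` is hypothetical and only mirrors the loop in the question.

```python
def parse_rows(lines, header_name):
    """Parse comma-separated lines, skipping rows whose first cell equals header_name."""
    parsed = []
    for line in lines:
        line = line.strip()
        columns = line.split(',')
        if line == '' or columns[0] == header_name:
            continue
        # Every cell after the date is expected to be numeric.
        parsed.append((columns[0], [float(c) for c in columns[1:]]))
    return parsed


sample = [
    'Date,Open,High,Low,Last,Close,TotTrQt,Turnover (Lacs)',
    '2017-05-26,2625,2626.85,2564.65,2570.05,2578.25,681275,17665.43',
]

# Matching the header exactly skips it, and the data row converts cleanly.
rows = parse_rows(sample, 'Date')

# A check against 'DATE' lets the header row through, so float('Open') fails.
try:
    parse_rows(sample, 'DATE')
    error = None
except ValueError as exc:
    error = str(exc)
```

If the header is not skipped, the first `float()` call receives the text `Open`, which is one plausible source of the "could not convert string to float" error reported in the question.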

Consider using csv.DictReader: it produces a dict for each source row, with the column names as keys:

import sqlite3
import csv

db = sqlite3.connect('NSETCS')

# 'with db' commits if the block succeeds and rolls back if it raises
with open(r'F:\mypractise_python\day11\NSE-TCS.csv', 'r') as infile, db:
    # DictReader reads the file lazily, one row at a time
    content = csv.DictReader(infile, delimiter=',')

    cursor = db.cursor()

    for line in content:
        # line is a dict keyed by column name
        # no need to skip the header row; DictReader consumes it to get the keys
        date = line['Date']
        open_stock = float(line['Open'])
        high = float(line['High'])
        low = float(line['Low'])
        last = float(line['Last'])
        close = float(line['Close'])
        tot_trade_qt = float(line['TotTrQt'])
        turnover = float(line['Turnover (Lacs)'])
        cursor.execute(
            '''insert into NSETCS values (:date, :open_stock, :high, :low, :last, :close, :tot_trade_qt, :turnover)''',
            {'date': date, 'open_stock': open_stock, 'high': high, 'low': low, 'last': last, 'close': close,
             'tot_trade_qt': tot_trade_qt, 'turnover': turnover})

# no explicit file close: 'with open()' does that automatically
# no explicit commit: the connection context manager commits if there were no exceptions
# the context manager does not close the connection, though, so do that here
db.close()

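My main point: don't parse raw CSV by hand; use the built-in library, which will save you a lot of trouble.

For example, you try to handle empty lines and skip the header row, but CSV has other cases too, such as quoted and escaped data, and the library handles those for you.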
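
As a small illustration of the quoting case (Python 3), compare a naive split with the `csv` module on a field that contains a comma. The quoted row below is made up for this example and does not come from the Quandl file.

```python
import csv
import io

# Hypothetical row: the last field is quoted because it contains a comma.
raw = '2017-05-26,2625,"17,665.43"\n'

# Splitting by hand breaks the quoted field into two pieces.
naive = raw.strip().split(',')

# csv.reader honours the quotes and returns the field intact.
parsed = next(csv.reader(io.StringIO(raw)))
```

Here `naive` ends up with four pieces, two of which still carry a stray quote character, while `parsed` has the three fields a spreadsheet would show.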

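Then my answer still holds. Have you tried it? Also, in the updated question code you compare against "DATE" (uppercase) even though the column in the data sample is "Date". – Todor

Todor, running your code gives the following result: Error: DictReader instance has no attribute '__getitem__'

With the suggested edit it works fine. [csv.DictReader(infile, delimiter=',')] only wraps the reader in a list, whereas list(csv.DictReader(infile, delimiter=',')) converts its rows into a list... right?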
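
The error in the comment above is what one would expect when a `DictReader` is indexed directly: it is an iterator, not a sequence. Here is a minimal Python 3 sketch with an inline sample (an assumption for illustration). In Python 3 the failure is a `TypeError` saying the object is not subscriptable, rather than a message about `__getitem__`.

```python
import csv
import io

SAMPLE = "Date,Open\n2017-05-26,2625\n2017-05-25,2577\n"

reader = csv.DictReader(io.StringIO(SAMPLE))

# Indexing the reader fails: it only supports iteration.
try:
    reader[0]
    indexing_error = None
except TypeError as exc:
    indexing_error = str(exc)

# list() consumes the reader and gives an indexable list of row dicts.
rows = list(reader)
first_open = rows[0]['Open']

# Wrapping in brackets instead yields a one-element list holding the reader itself.
wrapped = [csv.DictReader(io.StringIO(SAMPLE))]
```

Note that `list()` loads every row into memory, so for a large file it is usually preferable to iterate over the reader directly, as the answer's loop does.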