2016-12-15 76 views
0

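Python, MySQL regression, SQL error or wrong conditional?

I have a bucket folder containing CSV files named in the form yy-mm-dd.CSV. Each file has a few header lines, which I can ignore apart from the date at the end of the second line, followed by 151 lines of timestamp:power (kW). Here is a snippet: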

sep=; 
    Version CSV|Tool SunnyBeam11|Linebreaks CR/LF|Delimiter semicolon|Decimalpoint point|Precision 3|Language en-UK|TZO=0|DST|2012.06.21 

    ;SN: removed 
    ;SB removed 
    ;2120138796 
    Time;Power 
    HH:mm;kW 
    00:10;0.000 
    00:20;0.000 
    00:30;0.000 
    00:40;0.000 
    00:50;0.000 
    01:00;0.000 
    01:10;0.000 
    01:20;0.000 
    01:30;0.000 
    01:40;0.000 
    01:50;0.000 
    02:00;0.000 
    02:10;0.000 
    02:20;0.000 
    02:30;0.000 
    02:40;0.000 
    02:50;0.000 
    03:00;0.000 
    03:10;0.000 
    03:20;0.000 
    03:30;0.000 
    03:40;0.000 
    03:50;0.000 
    04:00;0.000 
    04:10;0.000 
    04:20;0.000 
    04:30;0.000 
    04:40;0.000 
    04:50;0.006 
    05:00;0.024 
    05:10;0.006 
    05:20;0.000 
    05:30;0.030 
    05:40;0.036 
    05:50;0.042 
    06:00;0.042 
    06:10;0.042 
    06:20;0.048 
    06:30;0.060 
    06:40;0.114 
    06:50;0.132 
    07:00;0.150 

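I parse these files after checking that their filenames have this format, because the bucket folder contains other files that I do not want to parse. I grab the date from row two of each file and store it. I then connect to the database and process the remaining lines, concatenating the stored date with the timestamp on each line after line 9 (or thereabouts). I also grab the second value on each line (the power in kW). The aim is to insert the concatenated datetime value and the associated power value into the connected MySQL database. When the last line has been read, the file is moved to a subfolder called 'parsed'.

All of this proceeds as expected, except that every line read goes through the except branch of the try/except block (line 107 of the script), the one that reports 'cannot append to Db'. I have checked that the stored database credentials work by logging in to MySQL (actually MariaDB on OpenSuse LEAP 4.2), and I have printed the connection variable. Both lead me to believe that I really am connecting correctly for each file.

I would have trimmed parts of my Python script to make it shorter, but I am not a particularly advanced Python coder and I did not want to risk leaving out a crucial part: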

    #!/usr/bin/python

    from os import listdir
    from datetime import datetime
    import MySQLdb
    import shutil
    import syslog
    #from sys import argv


    def is_dated_csv(filename):
        """
        Return True if filename matches format YY-MM-DD.csv, otherwise False.
        """
        date_format = '%y-%m-%d.csv'

        try:
            date = datetime.strptime(filename, date_format)
            return True
        except ValueError:
            # filename did not match pattern
            syslog.syslog('SunnyData file ' + filename + ' did NOT match')
            #print filename + ' did NOT match'
            pass
        #'return' terminates a function
        return False


    def parse_for_date(filename):
        """
        Read file for the date - from line 2 field 10
        """
        currentFile = open(filename,'r')
        l1 = currentFile.readline() #ignore first line read
        date_line = currentFile.readline() #read second line
        dateLineArray = date_line.split("|")
        day_in_question = dateLineArray[-1] #save the last element (date)
        currentFile.close()
        return day_in_question


    def normalise_date_to_UTF(day_in_question):
        """
        Rather weirdly, some days use YYYY.MM.DD format & others use DD/MM/YYYY
        This function normalises either to UTC with a blank time (midnight)
        """
        if '.' in day_in_question: #it's YYYY.MM.DD
            dateArray = day_in_question.split(".")
            dt = (dateArray[0] + dateArray[1] + dateArray[2].rstrip() + '000000')
        elif '/' in day_in_question: #it's DD/MM/YYYY
            dateArray = day_in_question.split("/")
            dt = (dateArray[2].rstrip() + dateArray[1] + dateArray[0] + '000000')
        theDate = datetime.strptime(dt,'%Y%m%d%H%M%S')
        return theDate #A datetime object


    def parse_power_values(filename, theDate):
        currentFile = open(filename,'r')
        for i, line in enumerate(currentFile):
            if i <= 7:
                doingSomething = True
                print 'header' + str(i) + '/ ' + line.rstrip()
            elif ((i > 7) and (i <= 151)):
                lineParts = line.split(';')
                theTime = lineParts[0].split(':')
                theHour = theTime[0]
                theMin = theTime[1]
                timestamp = theDate.replace(hour=int(theHour),minute=int(theMin))
                power = lineParts[1].rstrip()
                if power == '-.---':
                    power = 0.000
                if (float(power) > 0):
                    print str(i) + '/ ' + str(timestamp) + ' power = ' + power + 'kWh'
                    append_to_database(timestamp,power)
                else:
                    print str(i) + '/ '
            elif i > 151:
                print str(timestamp) + ' DONE!'
                print '----------------------'
                break
        currentFile.close()

    def append_to_database(timestampval,powerval):
        host="localhost", # host
        user="removed", # username
        #passwd="******"
        passwd="removed"
        database_name = 'SunnyData'
        table_name = 'DTP'
        timestamp_column = 'DT'
        power_column = 'PWR'
        #sqlInsert = ("INSERT INTO %s (%s,%s) VALUES('%s','%s')" % (table_name, timestamp_column, power_column, timestampval.strftime('%Y-%m-%d %H:%M:%S'), powerval))
        #sqlCheck = ("SELECT TOP 1 %s.%s FROM %s WHERE %s.%s = %s;" % (table_name, timestamp_column, table_name, table_name, timestamp_column, timestampval.strftime('%Y-%m-%d %H:%M:%S')))
        sqlInsert = ("INSERT INTO %s (%s,%s) VALUES('%s','%s')", (table_name, timestamp_column, power_column, timestampval.strftime('%Y-%m-%d %H:%M:%S'), powerval))
        sqlCheck = ("SELECT TOP 1 %s.%s FROM %s WHERE %s.%s = %s;", (table_name, timestamp_column, table_name, table_name, timestamp_column, timestampval.strftime('%Y-%m-%d %H:%M:%S')))
        cur = SD.cursor()
        try:
            #cur.execute(sqlCheck)
            # Aim here is to see if the datetime for the file has an existing entry in the database_name
            # If it does, do nothing, otherwise add the values to the database
            cur.execute(sqlCheck)
            if cur.fetchone() == "None":
                cur.execute(sqlInsert)
                print ""
            SD.commit()
        except:
            print 'DB append failed!'
            syslog.syslog('SunnyData DB append failed')
            SD.rollback()

    # Main start of program
    path = '/home/greg/currentGenerated/SBEAM/'
    destination = path + '/parsed'
    syslog.syslog('parsing SunnyData CSVs started')
    for filename in listdir(path):
        print filename
        if is_dated_csv(filename):
            #connect and disconnect once per CSV file - wasteful to reconnect for every line in def append_to_database(...)
            SD = MySQLdb.connect(host="localhost", user="root",passwd="removed", db = 'SunnyData')
            print SD
            print filename + ' matched'
            day_in_question = parse_for_date(filename)
            print 'the date is ' + day_in_question
            theDate = normalise_date_to_UTF(day_in_question)
            parse_power_values(filename, theDate)
            SD.close()
            shutil.move(path + '/' + filename, destination)
            syslog.syslog('SunnyData file' + path + '/' + filename + 'parsed & moved to ' + destination)

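It used to work, but that was a long time ago and there have been many updates since I last checked. I worry that a regression may have changed something underneath my code. I am just not sure how to go about working it out.

Apologies that this is not a very clear and specific question, but if you can help me sort it out, it might still serve as a good example for others?

Thanks

Greg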

+1

Consider catching the exception: `except Exception as e: print(e)` instead of `except: print 'DB append failed!'`, because you will then get the actual MySQL/MariaDB or Python error message. – Parfait
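A minimal sketch of that suggestion, written so that it runs without a database server. It assumes Python 3 and uses the standard-library `sqlite3` module purely as a stand-in for `MySQLdb`; the helper name `try_execute` is hypothetical and not part of the original script. The exact exception type and message depend on the driver and Python version.

```python
import sqlite3  # stand-in for MySQLdb so the sketch runs without a server


def try_execute(cursor, query):
    """Run query; on failure, report the real error and return it."""
    try:
        cursor.execute(query)
        return None
    except Exception as e:
        # Printing the exception exposes the driver's (or Python's) own message
        # instead of a generic "append failed" line.
        print('DB append failed: %s: %s' % (type(e).__name__, e))
        return e


conn = sqlite3.connect(':memory:')
cur = conn.cursor()

# Passing a (query, params) tuple as ONE argument, as the script does
err = try_execute(cur, ("SELECT 1", ()))

# Passing a plain string works
ok = try_execute(cur, "SELECT 1")
```

Catching `Exception` rather than using a bare `except:` also lets `KeyboardInterrupt` and `SystemExit` propagate, so the script can still be stopped with Ctrl-C.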

+0

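I added your suggestion and now have much better insight. It turns out to be a TypeError: 'argument 1 must be string or read-only buffer, not tuple'. Now I just need to read up on how to handle it :o) – Greg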
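That TypeError is consistent with how `sqlInsert` and `sqlCheck` are built in the script: each is a `(query, params)` tuple, and the whole tuple is handed to `cur.execute()` as its first argument. A possible fix, sketched below under some assumptions, is to pass the query string and the parameter tuple as two separate arguments. Table and column names cannot be bound as parameters, because the driver quotes bound values as literals, so they are formatted into the string and only the values are left as `%s` placeholders. The function name `build_insert` is hypothetical; the identifiers `DTP`, `DT` and `PWR` are taken from the script.

```python
from datetime import datetime

# Identifiers from the original script; these are fixed constants,
# never user input, so formatting them into the SQL text is acceptable here.
TABLE = 'DTP'
TS_COL = 'DT'
PWR_COL = 'PWR'


def build_insert(timestampval, powerval):
    """Return (query, params) with %s placeholders for the values only."""
    query = "INSERT INTO {t} ({ts}, {p}) VALUES (%s, %s)".format(
        t=TABLE, ts=TS_COL, p=PWR_COL)
    params = (timestampval.strftime('%Y-%m-%d %H:%M:%S'), powerval)
    return query, params


query, params = build_insert(datetime(2012, 6, 21, 7, 0), '0.150')

# With a real MySQLdb cursor the call would then be:
#     cur.execute(query, params)    # two arguments, not one tuple
# or equivalently, keeping the tuple:
#     cur.execute(*build_insert(timestamp, power))
```

Note that the placeholders are not wrapped in quotes: the driver adds the quoting itself when it substitutes the values.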

Answers

0

There is no SELECT TOP ... syntax in MySQL/MariaDB, so your script must be failing when it tries to execute sqlCheck.

It should be SELECT %s.%s FROM %s WHERE %s.%s = %s LIMIT 1
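A rough sketch of the check-then-insert logic using `LIMIT 1`, runnable without a MySQL server. It assumes Python 3 and uses `sqlite3` as a stand-in for `MySQLdb`; the function names, the `placeholder` argument and the table definition are assumptions made for the demonstration. `MySQLdb` uses `%s` as its placeholder, `sqlite3` uses `?`. One more thing worth checking in the original script: `fetchone()` returns the object `None` when there is no row, so comparing its result with the string `"None"` is never true.

```python
import sqlite3  # stand-in for MySQLdb; the real driver uses %s placeholders


def row_exists(cursor, timestamp_text, placeholder='%s'):
    """Return True if a row with this timestamp is already stored."""
    query = "SELECT DT FROM DTP WHERE DT = {ph} LIMIT 1".format(ph=placeholder)
    cursor.execute(query, (timestamp_text,))
    # fetchone() yields None (the object, not the string) when nothing matches
    return cursor.fetchone() is not None


def insert_if_missing(conn, timestamp_text, power, placeholder='%s'):
    """Insert the reading unless its timestamp exists; return True if inserted."""
    cur = conn.cursor()
    if row_exists(cur, timestamp_text, placeholder):
        return False
    query = "INSERT INTO DTP (DT, PWR) VALUES ({ph}, {ph})".format(ph=placeholder)
    cur.execute(query, (timestamp_text, power))
    conn.commit()
    return True


# Demonstration against an in-memory table (column types are assumed)
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE DTP (DT TEXT PRIMARY KEY, PWR REAL)")

first = insert_if_missing(conn, '2012-06-21 07:00:00', 0.150, placeholder='?')
second = insert_if_missing(conn, '2012-06-21 07:00:00', 0.150, placeholder='?')
count = conn.execute("SELECT COUNT(*) FROM DTP").fetchone()[0]
```

If the `DT` column has a unique or primary key, MySQL/MariaDB's `INSERT IGNORE` would be an alternative that avoids the separate SELECT, though that changes the approach used in the question.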

+0

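I changed my version to yours, although it made no difference. I am sure you are right, though, so I will leave the edit in place until I can work out how to feed the SQL in as a _string_ rather than a _tuple_ (who knew!?) – Greg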