Python SQL query execution time

I have very little experience with Python and SQL. I have been teaching myself both in order to complete my Master's thesis.

I just wrote a small script to benchmark about 50 identically structured databases, as follows:

import thesis,pyodbc 

# SQL Server settings 
drvr = '{SQL Server Native Client 10.0}' 
host = 'host_directory' 
user = 'username' 
pswd = 'password' 
table = 'tBufferAux' # Found (by inspection) to be the table containing relevant data 
column = 'Data' 

# Establish a connection to SQL Server 
cnxn = pyodbc.connect(driver=drvr, server=host, uid=user, pwd=pswd) # Setup connection 

endRow = 'SELECT TOP 1 ' + column + ' FROM [' # Query template for ending row 
with open(thesis.db_metadata_path(),'w') as file: 
    for db in thesis.db_list(): 
        # Prepare queries
        countRows_query = 'SELECT COUNT(*) FROM [' + db + '].dbo.' + table
        firstRow_query = endRow + db + '].dbo.' + table + ' ORDER BY ' + column + ' ASC'
        lastRow_query = endRow + db + '].dbo.' + table + ' ORDER BY ' + column + ' DESC'
        # Execute queries
        N_rows = cnxn.cursor().execute(countRows_query).fetchone()[0]
        first_row = cnxn.cursor().execute(firstRow_query).fetchone()
        last_row = cnxn.cursor().execute(lastRow_query).fetchone()
        # Save output to text file
        file.write(db + ' ' + str(N_rows) + ' ' + str(first_row.Data) + ' ' + str(last_row.Data) + '\n')

# Close session 
cnxn.cursor().close() 
cnxn.close()

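I was surprised to find that this simple program takes nearly 10 seconds to run, so I am wondering whether that is normal or whether some part of my code is slowing execution down. (Keep in mind that the for loop runs only 56 times.)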

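Note that the functions from the (custom) thesis module have very little impact, since all of them are just variable assignments (except thesis.db_list(), which is a quick read of a .txt file).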

Edit: This is the output .txt file generated by the program. The second column is the number of records in that table for each database.

2015-02-08 POliveira

As a side note, your column name is hard-coded in 'first_row/last_row.Data'. Use 'getattr(first_row, column)' to avoid this. – 2015-02-08 01:48:50
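
A minimal sketch of the getattr suggestion. A namedtuple stands in for the row object returned by pyodbc (an assumption, so that the snippet runs without a database); the lookup by name works the same way on any object that exposes its columns as attributes.

```python
from collections import namedtuple

# Hypothetical stand-in for a pyodbc row with a single 'Data' column.
Row = namedtuple('Row', ['Data'])

column = 'Data'            # same variable as in the question's script
first_row = Row(Data=17)   # made-up value, for illustration only

# Look the column up by name instead of hard-coding first_row.Data.
value = getattr(first_row, column)
print(value)
```

With this change, pointing the script at a different column only requires editing the `column` variable.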

Is the column you are ordering by indexed? If it is not, finding the first and last rows could take a long time. – 2015-02-08 05:33:03
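
To illustrate why an index on the ORDER BY column matters, here is a rough sketch that uses an in-memory SQLite database as a stand-in for SQL Server (an assumption: the syntax differs, for example LIMIT instead of TOP, and SQL Server has its own execution-plan tooling). The index name `idx_data` and the generated data are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the SQL Server table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE tBufferAux (id INTEGER PRIMARY KEY, Data INTEGER)')
conn.executemany('INSERT INTO tBufferAux (Data) VALUES (?)',
                 [((i * 37) % 1000,) for i in range(1000)])

query = 'SELECT Data FROM tBufferAux ORDER BY Data ASC LIMIT 1'


def plan(sql):
    """Return the human-readable steps of SQLite's query plan."""
    return [row[3] for row in conn.execute('EXPLAIN QUERY PLAN ' + sql)]


plan_without_index = plan(query)
conn.execute('CREATE INDEX idx_data ON tBufferAux (Data)')  # hypothetical name
plan_with_index = plan(query)

print(plan_without_index)   # plan before the index exists
print(plan_with_index)      # plan once the index can be used
conn.close()
```

Once the index exists, the plan mentions it, which means the engine can read the first (or last) entry in order instead of sorting the whole table.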

@ivan_pozdeev Good point! I hadn't noticed that. Thanks. – POliveira 2015-02-08 11:30:37

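timeit is good for measuring and comparing the performance of single statements and blocks of code (note that IPython has a built-in command that makes this easier).
Profilers break the measurements down by each function that gets called (more useful for larger amounts of code).
Note that a standalone program (even more so one written in an interpreted language) has startup (and shutdown) overhead.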
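
As a small illustration of the first point, the sketch below uses timeit to time the string building that the question's script does for every database. The helper `build_query` and the database name 'db01' are hypothetical. In IPython, the `%timeit` magic offers a shorter way to do the same.

```python
import timeit


def build_query(db, table='tBufferAux'):
    """Build the row-count query the same way the question's script does."""
    return 'SELECT COUNT(*) FROM [' + db + '].dbo.' + table


runs = 100000
# timeit.timeit returns the total time in seconds for `runs` executions.
elapsed = timeit.timeit(lambda: build_query('db01'), number=runs)
print('total: %.4f s, per call: %.2e s' % (elapsed, elapsed / runs))
```

Measurements like this help rule out parts of a script, such as query construction, as the cause of a multi-second run time.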

All in all, 10 seconds does not look like much for a program that accesses databases.

As a test, I wrapped your program in a profiler like this:

def main():
    <your program>

if __name__ == '__main__':
    import cProfile
    cProfile.run('main()')
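
For readers who want something runnable, here is a self-contained variant of the same idea. It is only a sketch: the body of `main` is a stand-in workload using an in-memory SQLite database (an assumption, not the original program), and pstats is used to sort the report by cumulative time.

```python
import cProfile
import io
import pstats
import sqlite3


def main():
    # Stand-in workload (assumption): connect, run one query, disconnect.
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE t (Data INTEGER)')
    conn.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(1000)])
    count = conn.execute('SELECT COUNT(*) FROM t').fetchone()[0]
    conn.close()
    return count


profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Sort by cumulative time and keep the ten most expensive entries.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats('cumulative').print_stats(10)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the calls that dominate the run at the top of the report.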

And I ran it from cygwin's bash like this:

T1=`date +%T,%N`; /c/Python27/python.exe ./t.py; echo $T1; date +%T,%N

The resulting table listed connect as the single biggest time hog (my machine is a very fast i7 3.9GHz/8GB with a local MSSQL and an SSD as the system disk):

 7200 function calls (7012 primitive calls) in 0.058 seconds 

ncalls tottime percall cumtime percall filename:lineno(function) 
<...> 
    1 0.003 0.003 0.058 0.058 t.py:1(main) 
<...> 
    1 0.043 0.043 0.043 0.043 {pyodbc.connect} 
<...>

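And the date commands showed that the program as a whole ran for about 300 ms, which leaves roughly 250 ms of total overhead outside the profiled 58 ms: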

<...>:39,782700900 
<...>:40,072717400

(By removing python from the command line, I confirmed that the overhead of the other commands is negligible, about 7us.)
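
The arithmetic behind those figures can be checked with a short sketch. It assumes both timestamps fall within the same minute (the elided prefix hides this), so only the seconds fields are subtracted.

```python
from decimal import Decimal

# Seconds fields of the two `date +%T,%N` stamps quoted above
# (assumption: both stamps fall within the same minute).
start = Decimal('39.782700900')
end = Decimal('40.072717400')
profiled = Decimal('0.058')   # total time reported by cProfile

wall_clock = end - start          # time for the whole python.exe run
overhead = wall_clock - profiled  # time spent outside main()
print('wall clock: %s s' % wall_clock)
print('outside the profiler: %s s' % overhead)
```

The exact difference is a little under the rounded 300 ms and 250 ms quoted in the answer, which is consistent with those being approximate figures.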

2015-02-08 01:12:02

So the connect call has a big impact on the final duration, right? And so there is nothing particularly wrong with my code, right? Thanks – POliveira 2015-02-08 11:33:32

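Yes, there is nothing wrong with your code. Besides connecting and startup, there may be other bottlenecks in your case, such as network or file I/O. The data I tested with was an order of magnitude smaller than yours. – 2015-02-08 11:49:32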
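
One way to see which of these phases dominates is to time each one separately. The sketch below is only an illustration under stated assumptions: Python 3 (for time.perf_counter), an in-memory SQLite database in place of pyodbc and SQL Server, and a hypothetical helper named `timed`.

```python
import os
import sqlite3
import tempfile
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Accumulate the wall-clock time spent inside the block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


with timed('connect'):
    conn = sqlite3.connect(':memory:')   # stand-in for pyodbc.connect(...)
conn.execute('CREATE TABLE t (Data INTEGER)')
conn.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(1000)])

out_path = os.path.join(tempfile.mkdtemp(), 'metadata.txt')
with open(out_path, 'w') as out:
    for _ in range(56):                  # same loop count as in the question
        with timed('query'):
            n_rows = conn.execute('SELECT COUNT(*) FROM t').fetchone()[0]
        with timed('file write'):
            out.write('%d\n' % n_rows)
conn.close()

# Print the phases from slowest to fastest.
for label, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print('%-10s %.6f s' % (label, seconds))
```

Against a real server over a network, the same breakdown would show whether connecting, querying, or writing the output file accounts for most of the run time.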
