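(0) You asked "Why is win32com so much slower than xlrd?" ... This question is a bit like "Have you stopped beating your wife?": it rests on a presumption that may not be true. win32com was written in C by a brilliant programmer, whereas xlrd was written in pure Python by an average programmer. The real difference is that win32com has to call COM, which involves inter-process communication and was written by you-know-who, whereas xlrd reads the Excel file directly. Moreover, there is a fourth party in the scenario: YOU. Please read on.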
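(1) You don't show us the source of the find_last_col() function that you call repeatedly in the COM code. In the xlrd code, you are happy to use the same value (ws.ncols) throughout. So in the COM code, you should call find_last_col(ws) ONCE and then reuse the returned result. Update: see the answer to your separate question on how to get the equivalent of xlrd's Sheet.ncols from COM.
(2) Accessing each cell value TWICE is slowing down both versions of the code. Instead of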
if ws.cell_value(6, cnum):
    wsHeaders[str(ws.cell_value(6, cnum))] = (cnum, ws.ncols)
try
value = ws.cell_value(6, cnum)
if value:
    wsHeaders[str(value)] = (cnum, ws.ncols)
Note: there are 2 cases of this in each code snippet.
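Points (1) and (2) come down to the same fix: fetch each value once and reuse it. Here is a minimal sketch of that idea. The real find_last_col() was not shown, so the version below is a hypothetical stand-in, and FakeSheet is a hypothetical class that replaces a real worksheet so that the number of cell fetches can be counted.

```python
class FakeSheet(object):
    """Hypothetical stand-in for a worksheet; counts cell fetches."""

    def __init__(self, header_row):
        self.header_row = header_row
        self.fetches = 0

    def cell_value(self, rownum, cnum):
        # rownum is ignored: this fake sheet holds only one row.
        self.fetches += 1
        return self.header_row[cnum]


def find_last_col(ws):
    """Hypothetical replacement for the function not shown in the question."""
    return len(ws.header_row)


def build_headers(ws, header_rownum=6):
    last_col = find_last_col(ws)  # called ONCE, outside the loop
    headers = {}
    for cnum in range(last_col):
        value = ws.cell_value(header_rownum, cnum)  # fetched ONCE per cell
        if value:
            # (cnum, last_col) mirrors the tuple built in the snippet above
            headers[str(value)] = (cnum, last_col)
    return headers


ws = FakeSheet(["A", "", "C", "", ""])
headers = build_headers(ws)
print(headers)
print(ws.fetches)  # one fetch per column, not two
```

With a real COM worksheet each avoided fetch is an avoided inter-process call, so the saving should be larger there than with xlrd.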
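(3) The purpose of your nested loops is not evident, but there does seem to be some redundant computation, involving redundant fetches from COM. If you care to tell us, with examples, what you are trying to achieve, we may be able to help you make it run much faster. At the very least, extracting the values from COM once and then processing them in nested loops in Python should be faster. How many columns are there?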
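The "extract once, then loop in Python" suggestion might look roughly like the sketch below. It assumes that ws is an Excel worksheet object obtained through win32com, that the row spans at least two columns, and that reading .Value on a multi-cell range returns a tuple of row tuples with None for empty cells. fetch_row and nonempty_headers are hypothetical names, not part of any library.

```python
def fetch_row(ws, rownum, last_col):
    """Read one worksheet row in a single COM round trip.

    Assumes `ws` is a win32com Excel worksheet and last_col >= 2.
    Excel row and column numbers are 1-based.
    """
    cells = ws.Range(ws.Cells(rownum, 1), ws.Cells(rownum, last_col))
    return cells.Value[0]  # first (and only) row of the 2-D result


def nonempty_headers(row_values):
    """Pure-Python processing: no COM calls inside the loop."""
    return [(cnum, str(value))
            for cnum, value in enumerate(row_values)
            if value]


# Demonstration on an already-fetched row (no Excel needed here).
print(nonempty_headers(("A", None, "C", None)))
```

Whatever the nested loops are meant to compute can then run over the plain tuple, with no further calls into Excel.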
Update 2: Meanwhile, the little elves took to your code with the proctoscope and came up with the following script:
tests = [
    "A/B/C/D",
    "A//C//",
    "A//C//E",
    "A///D",
    "///D",
    ]
for test in tests:
    print "\nTest:", test
    row = test.split("/")
    ncols = len(row)
    # modelling the OP's code
    # (using xlrd-style 0-relative column indexes)
    d = {}
    for cnum in xrange(ncols):
        if row[cnum]:
            k = row[cnum]
            v = (cnum, ncols) #### BUG; should be ncols - 1 ("inclusive")
            print "outer", cnum, k, '=>', v
            d[k] = v
            for cend in xrange(cnum + 1, ncols):
                if row[cend]:
                    k = row[cnum]
                    v = (cnum, cend - 1)
                    print "inner", cnum, cend, k, '=>', v
                    d[k] = v
                    break
    print d
    # modelling a slightly better algorithm
    d = {}
    prev = None
    for cnum in xrange(ncols):
        key = row[cnum]
        if key:
            d[key] = [cnum, cnum]
            prev = key
        elif prev:
            d[prev][1] = cnum
    print d
    # if tuples are really needed (can't imagine why)
    for k in d:
        d[k] = tuple(d[k])
    print d
It outputs this:
Test: A/B/C/D
outer 0 A => (0, 4)
inner 0 1 A => (0, 0)
outer 1 B => (1, 4)
inner 1 2 B => (1, 1)
outer 2 C => (2, 4)
inner 2 3 C => (2, 2)
outer 3 D => (3, 4)
{'A': (0, 0), 'C': (2, 2), 'B': (1, 1), 'D': (3, 4)}
{'A': [0, 0], 'C': [2, 2], 'B': [1, 1], 'D': [3, 3]}
{'A': (0, 0), 'C': (2, 2), 'B': (1, 1), 'D': (3, 3)}
Test: A//C//
outer 0 A => (0, 5)
inner 0 2 A => (0, 1)
outer 2 C => (2, 5)
{'A': (0, 1), 'C': (2, 5)}
{'A': [0, 1], 'C': [2, 4]}
{'A': (0, 1), 'C': (2, 4)}
Test: A//C//E
outer 0 A => (0, 5)
inner 0 2 A => (0, 1)
outer 2 C => (2, 5)
inner 2 4 C => (2, 3)
outer 4 E => (4, 5)
{'A': (0, 1), 'C': (2, 3), 'E': (4, 5)}
{'A': [0, 1], 'C': [2, 3], 'E': [4, 4]}
{'A': (0, 1), 'C': (2, 3), 'E': (4, 4)}
Test: A///D
outer 0 A => (0, 4)
inner 0 3 A => (0, 2)
outer 3 D => (3, 4)
{'A': (0, 2), 'D': (3, 4)}
{'A': [0, 2], 'D': [3, 3]}
{'A': (0, 2), 'D': (3, 3)}
Test: ///D
outer 3 D => (3, 4)
{'D': (3, 4)}
{'D': [3, 3]}
{'D': (3, 3)}
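For reuse, the single-pass algorithm from the script can be wrapped in a function. This is only a sketch: header_spans is a hypothetical name, and it returns inclusive, 0-relative (first_col, last_col) tuples, matching the last dictionary printed for each test above.

```python
def header_spans(row):
    """Map each non-empty header to its inclusive (first_col, last_col) span.

    Empty cells extend the span of the nearest header to their left;
    empty cells before the first header are ignored.
    """
    spans = {}
    prev = None
    for cnum, key in enumerate(row):
        if key:
            spans[key] = [cnum, cnum]
            prev = key
        elif prev is not None:
            spans[prev][1] = cnum
    # Convert the mutable [first, last] lists to tuples on the way out.
    return dict((k, tuple(v)) for k, v in spans.items())


for test in ["A/B/C/D", "A//C//", "A//C//E", "A///D", "///D"]:
    print(header_spans(test.split("/")))
```

Each column is visited exactly once, so the cost grows linearly with the number of columns instead of with the nested scan in the OP's version.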
+1 I agree - the slowdown is not because of COM itself, but because of the way the OP is using it, which makes it look slower. – Cam 2010-06-04 01:55:36
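It is precisely because of the IPC overhead that COM offers ways to read and write multiple values at once via safe arrays. You can do this transparently through win32com by specifying a range instead of a single cell. – 2010-06-04 03:38:46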