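Update
This question has since been thoroughly discussed and updated on OR-Exchange, where I cross-posted it: "CPLEX Python API performance overhead?"
Original question
When running CPLEX 12.5.0.0 from the command line: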
cplex -f my_instance.lp
an optimal integer solution is found in 19056.99 ticks.
But through the Python API, on the very same instance:
import cplex
problem = cplex.Cplex("my_instance.lp")
problem.solve()
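the required time now amounts to 97407.10 ticks (more than 5 times slower).
In both cases, the mode is parallel, deterministic, using up to 2 threads. Wondering whether this poor performance was due to some Python thread overhead, I tried: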
problem = cplex.Cplex("my_instance.lp")
problem.parameters.threads.set(1)
problem.solve()
which required 46513.04 ticks (i.e., using one core was twice as fast as using two!).
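The ratios quoted above can be checked with a few lines of Python. This is only arithmetic on the three tick counts reported in this post; the variable names are illustrative.

```python
# Deterministic tick counts reported in this post.
cli_ticks = 19056.99       # command line, up to 2 threads
api_2_threads = 97407.10   # Python API, up to 2 threads
api_1_thread = 46513.04    # Python API, 1 thread

api_vs_cli = api_2_threads / cli_ticks      # Python API (2 threads) vs command line
two_vs_one = api_2_threads / api_1_thread   # Python API: 2 threads vs 1 thread
one_vs_cli = api_1_thread / cli_ticks       # Python API (1 thread) vs command line

print("API, 2 threads vs command line: {:.2f}x".format(api_vs_cli))
print("API, 2 threads vs 1 thread:     {:.2f}x".format(two_vs_one))
print("API, 1 thread vs command line:  {:.2f}x".format(one_vs_cli))
```

The last ratio shows that even the single-threaded Python run needs roughly 2.4 times as many ticks as the command-line run.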
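Being new to CPLEX and LP in general, I find these results quite confusing. Is there a way to improve the performance of the Python API, or should I switch to a more mature API (i.e., Java or C++)?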
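To repeat the experiment above for several thread counts, a small helper along the following lines could be used. It is a sketch only: `ticks_per_thread_count` and `make_problem` are hypothetical names, and it assumes the installed version of the Python API exposes `Cplex.get_dettime()` for reading the deterministic clock, which I have not verified for 12.5.0.0.

```python
def ticks_per_thread_count(make_problem, thread_counts=(1, 2)):
    """Solve a fresh model for each thread count.

    make_problem: zero-argument callable returning a new problem object,
                  e.g. lambda: cplex.Cplex("my_instance.lp")
    Returns a dict {thread count: deterministic ticks spent in solve()}.
    """
    results = {}
    for n in thread_counts:
        problem = make_problem()            # fresh model, so runs stay independent
        problem.parameters.threads.set(n)   # same call as in the snippet above
        start = problem.get_dettime()       # assumption: available in this version
        problem.solve()
        results[n] = problem.get_dettime() - start
    return results

# Usage with the real API (requires a CPLEX installation):
#   import cplex
#   print(ticks_per_thread_count(lambda: cplex.Cplex("my_instance.lp")))
```

Passing a factory rather than a single problem object keeps each run from reusing the state left behind by a previous solve.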
Annex
Here are the full details of the two resolutions with 2 threads, starting with the (common) preamble:
Tried aggregator 3 times.
MIP Presolve eliminated 2648 rows and 612 columns.
MIP Presolve modified 62 coefficients.
Aggregator did 13 substitutions.
Reduced MIP has 4229 rows, 1078 columns, and 13150 nonzeros.
Reduced MIP has 1071 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.06 sec. (18.79 ticks)
Probing fixed 24 vars, tightened 0 bounds.
Probing time = 0.08 sec. (18.12 ticks)
Tried aggregator 1 time.
MIP Presolve eliminated 87 rows and 26 columns.
MIP Presolve modified 153 coefficients.
Reduced MIP has 4142 rows, 1052 columns, and 12916 nonzeros.
Reduced MIP has 1045 binaries, 7 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.05 sec. (11.67 ticks)
Probing time = 0.01 sec. (1.06 ticks)
Clique table members: 4199.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: deterministic, using up to 2 threads.
Root relaxation solution time = 0.20 sec. (91.45 ticks)
Results from the command line:
GUB cover cuts applied: 1
Clique cuts applied: 3
Cover cuts applied: 2
Implied bound cuts applied: 38
Zero-half cuts applied: 7
Gomory fractional cuts applied: 2
Root node processing (before b&c):
Real time = 5.27 sec. (2345.14 ticks)
Parallel b&c, 2 threads:
Real time = 35.15 sec. (16626.69 ticks)
Sync time (average) = 0.00 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 40.41 sec. (18971.82 ticks)
Results from the Python API:
Clique cuts applied: 33
Cover cuts applied: 1
Implied bound cuts applied: 4
Zero-half cuts applied: 10
Gomory fractional cuts applied: 4
Root node processing (before b&c):
Real time = 6.42 sec. (2345.36 ticks)
Parallel b&c, 2 threads:
Real time = 222.28 sec. (95061.73 ticks)
Sync time (average) = 0.01 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 228.70 sec. (97407.10 ticks)
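The two summaries above can also be compared mechanically. The following is a minimal sketch, assuming the log text has been captured into strings; `parse_summary` is a hypothetical helper, not part of the CPLEX API, and the two strings are excerpts of the logs shown above.

```python
import re

CLI_LOG = """\
GUB cover cuts applied: 1
Clique cuts applied: 3
Cover cuts applied: 2
Implied bound cuts applied: 38
Zero-half cuts applied: 7
Gomory fractional cuts applied: 2
Total (root+branch&cut) = 40.41 sec. (18971.82 ticks)
"""

API_LOG = """\
Clique cuts applied: 33
Cover cuts applied: 1
Implied bound cuts applied: 4
Zero-half cuts applied: 10
Gomory fractional cuts applied: 4
Total (root+branch&cut) = 228.70 sec. (97407.10 ticks)
"""

def parse_summary(log_text):
    """Return ({cut family: count}, total ticks) from an end-of-solve summary."""
    cuts = {}
    for m in re.finditer(r"^[ \t]*(.+?) cuts applied:[ \t]*(\d+)[ \t]*$",
                         log_text, re.MULTILINE):
        cuts[m.group(1)] = int(m.group(2))
    total = re.search(
        r"Total \(root\+branch&cut\)[ \t]*=[ \t]*[\d.]+ sec\. \(([\d.]+) ticks\)",
        log_text)
    ticks = float(total.group(1)) if total else None
    return cuts, ticks

cli_cuts, cli_ticks = parse_summary(CLI_LOG)
api_cuts, api_ticks = parse_summary(API_LOG)

# Print the cut counts side by side; a family missing from a log counts as 0.
for name in sorted(set(cli_cuts) | set(api_cuts)):
    print("{:<18} command line: {:>3}   Python API: {:>3}".format(
        name, cli_cuts.get(name, 0), api_cuts.get(name, 0)))
print("Total ticks        command line: {}   Python API: {}".format(
    cli_ticks, api_ticks))
```

Root node processing costs almost the same number of ticks in both runs (2345.14 vs. 2345.36), yet the cuts applied differ in every family, which suggests that the two runs do not follow the same search path.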
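Sorry about the recently deleted "answer", which was meant to redirect readers to the updated discussion. I have since updated my post with the link. Hope it survives ;-) – Aristide 2013-09-14 13:10:01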