2016-01-22 167 views

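Computing combinations (CROSS JOIN) with the Django ORM

I have three related models: Process, Factor, and Level. A Process has a many-to-many relationship with Factors, and a Factor has one or more Levels. I am trying to compute all combinations of levels involved in a process. This is easy to implement as a model method using Python's itertools, but it runs somewhat slowly, so I would like to figure out how to perform this calculation in SQL using the Django ORM.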

Models:

from django.db import models

# Factor is defined first so that Process and Level can reference it
class Factor(models.Model):
    ...

class Process(models.Model):
    factors = models.ManyToManyField(Factor, blank=True)

class Level(models.Model):
    factor = models.ForeignKey(Factor, on_delete=models.CASCADE)

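Example: a Running process involves three factors (Distance, Climb, Surface), each made up of levels (Long/Short, Flat/Hilly, Road/Mixed/Trail). Computing the combinations in SQL would mean building the query by first determining how many factors are involved (3 in this example) and then performing a CROSS JOIN of all the levels that many times.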

In SQL, this could be implemented like this:

WITH foo AS 
    (SELECT * FROM Level 
    WHERE Level.factor_id IN 
     (SELECT ProcessFactors.factor_id FROM ProcessFactors WHERE process_id = 1) 
    ) 
SELECT a1.*, a2.*, a3.* 
    FROM foo a1 
    CROSS JOIN foo a2 
    CROSS JOIN foo a3 
WHERE (a1.factor_id < a2.factor_id) AND (a2.factor_id < a3.factor_id) 

a1.name | a2.name | a3.name 
-------------------------- 
Long | Flat | Road 
Long | Flat | Mixed 
Long | Flat | Trail 
Long | Hilly | Road 
Long | Hilly | Mixed 
Long | Hilly | Trail 
Short | Flat | Road 
Short | Flat | Mixed 
Short | Flat | Trail 
Short | Hilly | Road 
Short | Hilly | Mixed 
Short | Hilly | Trail 
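
The query above is hard-coded for three factors. As a minimal sketch of the "determine the number of factors, then CROSS JOIN that many times" step, the SQL string can be generated for any factor count. The helper names `build_combination_sql` and `demo_rows` are hypothetical, the table names `Level` and `ProcessFactors` are the simplified ones from the query above (Django's real table names would differ), and an in-memory SQLite database stands in for the real one:

```python
import sqlite3


def build_combination_sql(n_factors):
    """Build the CTE + CROSS JOIN query for a process with n_factors factors."""
    if n_factors < 1:
        raise ValueError("need at least one factor")
    aliases = [f"a{i}" for i in range(1, n_factors + 1)]
    select = ", ".join(f"{a}.name" for a in aliases)
    joins = " CROSS JOIN ".join(f"foo {a}" for a in aliases)
    # Chain a1.factor_id < a2.factor_id < ... so each row holds
    # exactly one level per factor, in factor order.
    conds = " AND ".join(
        f"{a}.factor_id < {b}.factor_id" for a, b in zip(aliases, aliases[1:])
    )
    where = f" WHERE {conds}" if conds else ""
    return (
        "WITH foo AS (SELECT * FROM Level WHERE factor_id IN "
        "(SELECT factor_id FROM ProcessFactors WHERE process_id = ?)) "
        f"SELECT {select} FROM {joins}{where}"
    )


def demo_rows():
    """Run the generated query against stand-in data for the Running example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Level (id INTEGER PRIMARY KEY, name TEXT, factor_id INTEGER);
        CREATE TABLE ProcessFactors (process_id INTEGER, factor_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO ProcessFactors VALUES (1, ?)", [(1,), (2,), (3,)]
    )
    conn.executemany(
        "INSERT INTO Level (name, factor_id) VALUES (?, ?)",
        [
            ("Long", 1), ("Short", 1),
            ("Flat", 2), ("Hilly", 2),
            ("Road", 3), ("Mixed", 3), ("Trail", 3),
        ],
    )
    # The process id is passed as a bound parameter, not interpolated.
    rows = conn.execute(build_combination_sql(3), (1,)).fetchall()
    conn.close()
    return rows
```

With the stand-in data, `demo_rows()` returns the same 12 name triples as the table above, though the row order is up to the database. In a Django project, a string built this way would presumably be run as a raw query, which is the trade-off the question is asking about.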

Currently, I have this implemented as a method on the Process model:

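import itertools

def level_combinations(self):
    # One queryset of levels per factor of this process
    levels_by_factor = []
    for factor in self.factors.all():
        levels_by_factor.append(Level.objects.filter(factor=factor))

    # Cartesian product: one level picked from each factor
    combinations = []
    for levels in itertools.product(*levels_by_factor):
        combination = {}
        combination["levels"] = levels
        combinations.append(combination)

    return combinations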

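Is this possible using Django's ORM, or is it complex enough that it should be implemented as a raw query to improve on the speed of the Python implementation?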

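A similar question about performing CROSS JOIN in Django ORM was asked a few years ago (around Django v1.3, by the look of it), but it did not seem to attract much attention (the author ended up just using Python itertools).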

Answers

from itertools import groupby, product

def level_combinations(self):
    # Order by factor_id: groupby only merges adjacent items with equal keys
    levels = Level.objects.filter(factor__process=self).order_by('factor_id')
    # Level rows, shown here as dicts for illustration:
    # [{'name': 'Long', 'factor_id': 1, ...},
    # {'name': 'Short', 'factor_id': 1, ...},
    # {'name': 'Flat', 'factor_id': 2, ...},
    # {'name': 'Hilly', 'factor_id': 2, ...}]

    groups = [list(group) for _, group in groupby(levels, lambda level: level.factor_id)]
    # [[{'name': 'Long', 'factor_id': 1, ...},
    # {'name': 'Short', 'factor_id': 1, ...}],
    # [{'name': 'Flat', 'factor_id': 2, ...},
    # {'name': 'Hilly', 'factor_id': 2, ...}]]

    # Note: product returns a lazy iterator, not a list
    return product(*groups)

If the order does not matter, then:

def level_combinations(self):
    levels = Level.objects.filter(factor__process=self)
    groups = {}
    for level in levels:
        groups.setdefault(level.factor_id, []).append(level)
    return product(*groups.values())
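
Both variants can be tried without a database. The following is only a sketch: a `namedtuple` called `Level` stands in for the model rows, the function names `combos_grouped` and `combos_dict` are made up for illustration, and `sorted` plays the role of `order_by('factor_id')`.

```python
from collections import namedtuple
from itertools import groupby, product

# Stand-in for Level model instances (assumption: only these two fields matter)
Level = namedtuple("Level", ["name", "factor_id"])

# Deliberately not sorted by factor_id
LEVELS = [
    Level("Long", 1), Level("Flat", 2), Level("Short", 1), Level("Hilly", 2),
    Level("Road", 3), Level("Mixed", 3), Level("Trail", 3),
]


def by_factor(level):
    return level.factor_id


def combos_grouped(levels):
    """groupby variant: input must be sorted by the grouping key first."""
    ordered = sorted(levels, key=by_factor)
    groups = [list(group) for _, group in groupby(ordered, key=by_factor)]
    return list(product(*groups))


def combos_dict(levels):
    """dict variant: no sorting needed, groups follow first-seen factor order."""
    groups = {}
    for level in levels:
        groups.setdefault(level.factor_id, []).append(level)
    return list(product(*groups.values()))
```

On this data both functions return the same 12 combinations. Calling `groupby` on the unsorted `LEVELS` directly would split factors 1 and 2 into separate single-item runs, which is why the first variant needs the ordering and the second does not.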

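Although this is not the pure Django ORM solution I was hoping for (one does not seem to exist as of Django 1.9), it is much faster than the code I was using before (about 60% faster according to 'timeit'). On my production dataset, the unordered algorithm is about 10% faster than the ordered one. – Linville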


If I understand correctly, you could try:

for process in Process.objects.all(): 
    # get all levels for current process 
    levels = Level.objects.filter(factor__in=process.factors.all())