Slow performance bulk updating relationship properties in Neo4j

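I'm struggling to efficiently bulk update relationship properties in Neo4j. The goal is to update ~500,000 relationships (each with roughly 3 properties), which I split into batches of 1,000 and process with a single Cypher statement per batch: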

UNWIND {rows} AS row 
MATCH (s:Entity) WHERE s.uuid = row.source 
MATCH (t:Entity) WHERE t.uuid = row.target 
MATCH (s)-[r:CONSUMED]->(t) 
SET r += row.properties

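However, each batch of 1,000 relationships takes roughly 60 seconds. There is an index on the uuid property for the :Entity label, i.e. I previously ran: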

CREATE INDEX ON :Entity(uuid)

This means that, according to the query plan, matching the relationship is very efficient.

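The plan shows 6 DB hits in total, and the query executes in ~150 ms. I also added a uniqueness constraint on the uuid property, which ensures that each match returns only one element: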

CREATE CONSTRAINT ON (n:Entity) ASSERT n.uuid IS UNIQUE

Does anyone know how I can debug this further to understand why Neo4j takes so long to process the relationships?

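Note that I'm using similar logic to update nodes, which is orders of magnitude faster, even though the nodes have significantly more metadata associated with them.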
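To put these figures in perspective, here is a back-of-the-envelope estimate of the full run. It assumes every batch takes the same ~60 seconds, which is a simplification; the variable names are only illustrative.

```python
import math

total_relationships = 500_000   # ~500,000 relationships to update
batch_size = 1_000              # relationships per Cypher statement
seconds_per_batch = 60          # assumption: every batch takes about 60 s

# Number of batches needed, rounding up for a partial final batch.
batches = math.ceil(total_relationships / batch_size)

# Total wall-clock time if the batches run one after another.
total_hours = batches * seconds_per_batch / 3600

print(batches, round(total_hours, 1))  # prints: 500 8.3
```

At that rate the whole update would take roughly eight hours, which is why the per-batch time matters.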

For reference, I'm using Neo4j 3.0.3, py2neo and Bolt. The Python code is of the form:

for chunk in chunker(relationships):  # 1,000 relationships per chunk
    with graph.begin() as tx:
        statement = """
            UNWIND {rows} AS row
            MATCH (s:Entity) WHERE s.uuid = row.source
            MATCH (t:Entity) WHERE t.uuid = row.target
            MATCH (s)-[r:CONSUMED]->(t)
            SET r += row.properties
        """

        # Build one parameter row per relationship in this chunk.
        rows = []

        for rel in chunk:
            rows.append({
                'properties': dict(rel),
                'source': rel.start_node()['uuid'],
                'target': rel.end_node()['uuid'],
            })

        tx.run(statement, rows=rows)
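The `chunker` helper is not shown in the question. A minimal sketch of what it might look like, assuming it simply yields successive lists of at most 1,000 items from any iterable (the `size` parameter and the implementation are assumptions, not the asker's code):

```python
from itertools import islice


def chunker(items, size=1000):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        # Pull the next `size` items; an empty list means we are done.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Because it works on any iterable, the relationships do not all have to be materialised in a list before batching starts.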

2016-11-21 John

Try this query:

UNWIND {rows} AS row 
WITH row.source as source, row.target as target, row 
MATCH (s:Entity {uuid:source}) 
USING INDEX s:Entity(uuid) 
WITH * WHERE true 
MATCH (t:Entity {uuid:target}) 
USING INDEX t:Entity(uuid) 
MATCH (s)-[r:CONSUMED]->(t) 
SET r += row.properties;

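It uses index hints to force an index lookup for both Entity nodes, followed by an Expand(Into) operator, which should be more performant than the Expand(All) and Filter operators shown in your query plan.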
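To check whether the hinted query actually helps, each batch can be timed from Python. The following is a rough sketch, assuming `tx` is any object with a `run(statement, **params)` method, as in the `tx.run(statement, rows=rows)` call from the question. The names `HINTED_STATEMENT` and `run_batch_timed` are hypothetical, and the statement is the query above without the trailing semicolon.

```python
import time

# Same query as in the answer; only the Python wrapper around it is new.
HINTED_STATEMENT = """
UNWIND {rows} AS row
WITH row.source as source, row.target as target, row
MATCH (s:Entity {uuid:source})
USING INDEX s:Entity(uuid)
WITH * WHERE true
MATCH (t:Entity {uuid:target})
USING INDEX t:Entity(uuid)
MATCH (s)-[r:CONSUMED]->(t)
SET r += row.properties
"""


def run_batch_timed(tx, rows, statement=HINTED_STATEMENT):
    """Run one batch through tx.run and return the elapsed wall-clock seconds."""
    started = time.perf_counter()
    tx.run(statement, rows=rows)
    return time.perf_counter() - started
```

This only measures the time spent inside the `run` call; any work deferred until the transaction commits would not be counted, so timing the whole `with graph.begin()` block may give a more complete picture.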

2016-11-21 18:16:21

@william-lyon I was wondering whether I need the WITH * WHERE true clause? The reason I ask is that the number of DB hits increases from 4 to 8, i.e.

PROFILE 
MATCH (s:Entity {uuid:row.source}) 
USING INDEX s:Entity(uuid) 
MATCH (t:Entity {uuid:row.target}) 
USING INDEX t:Entity(uuid) 
MATCH (s)-[r:CONSUMED]->(t)

returns 4 DB hits, whereas

PROFILE 
MATCH (s:Entity {uuid:row.source}) 
USING INDEX s:Entity(uuid) 
WITH * WHERE true 
MATCH (t:Entity {uuid:row.target}) 
USING INDEX t:Entity(uuid) 
MATCH (s)-[r:CONSUMED]->(t)

returns 8 DB hits.

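Note that using the index hints reduces the number of DB hits from 6 to 4. For context, we have multiple node labels (and indexes), although every node has the :Entity label.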

2016-11-22 05:04:52 John
