MongoDB: return subdocuments and keep track of their parents

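I have collected some tweets, and I am trying to output the retweets (and, similarly, quoted tweets) at root level into a new collection, so that I can later merge them with the original collection (using mongodump and mongorestore). The retweeted status is a subdocument in the tweet document, and several tweets may retweet the same tweet. How can I bring the retweets to root level and add an array named 'retweeted_by' that contains the IDs of all the tweets that retweeted them?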

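Keep in mind that I use the tweet ID as the primary index (_id) to avoid creating duplicates when combining the collections (mongorestore).

My collection has documents of the following form: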

{ 
    "_id" : "123456", 
    "other_fields1" : "values1", 
    "retweeted_status" : { 
        "retweet_id": "159753", 
        "other_fields2" : "values2" 
    } 
}

The desired output would look like this:

{ 
    "_id" : "159753", 
    "other_fields2" : "values2",  
    "retweeted_by" : [ "123456", "974631", "121212"] 
}

Edit for clarification:

The field in the subdocument (other_fields2) stands for multiple fields (~28), not all of which are present in every tweet.

Source

2017-08-07 Ali Abul Hawa

`db.collection.aggregate([{$group: {_id: "$retweeted_status.retweet_id", retweeted_by: {$push: "$_id"}}}])` – felix

@felix Thanks, but this only outputs the id of retweeted_status, not the whole retweeted_status subdocument, which "other_fields2" stands for in my example. I think I need to use $replaceRoot to make the subdocument the newRoot and somehow add the retweeted_by array to it –

Add `other_fields2: {$first: "$retweeted_status.other_fields2"}`. Please see the [MongoDB documentation for $group](https://docs.mongodb.com/manual/reference/operator/aggregation/group/) – felix
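With about 28 optional fields, listing one `$first` per field gets verbose. One possible variation on felix's suggestion is to take the whole subdocument with `$first` and merge the array into it afterwards. The following is a minimal sketch, not something from this thread: it assumes MongoDB 3.6 or later (for `$mergeObjects`), the field names from the example documents in the question, and the collection names `tweets` and `retweets`, which are used only for illustration.

```javascript
// Sketch only: assumes MongoDB 3.6+ ($mergeObjects) and the field names
// from the example documents in the question.
const pipeline = [
  // Keep only tweets that carry a retweeted_status subdocument.
  { $match: { retweeted_status: { $exists: true } } },
  // One group per original tweet: keep the first copy of the whole
  // subdocument and collect the distinct ids of the retweeting tweets.
  {
    $group: {
      _id: "$retweeted_status.retweet_id",
      doc: { $first: "$retweeted_status" },
      retweeted_by: { $addToSet: "$_id" }
    }
  },
  // Promote the subdocument to the root and attach _id and retweeted_by.
  {
    $replaceRoot: {
      newRoot: {
        $mergeObjects: ["$doc", { _id: "$_id", retweeted_by: "$retweeted_by" }]
      }
    }
  },
  // Drop the id field that now duplicates _id.
  { $project: { retweet_id: 0 } },
  // Write the result to a new collection (name is an assumption).
  { $out: "retweets" }
];

// In the mongo shell this would be run as:
// db.tweets.aggregate(pipeline, { allowDiskUse: true });
```

As with any `$first`-based approach, only the fields present in the first copy of each subdocument survive; fields that appear only in later copies are lost.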

OK, so I finally arrived at a solution to my problem. I don't know whether this is the best way to do it, though:

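db.tweets.aggregate([ 
// keep only tweets that have a retweeted_status subdocument 
{ 
    $match: { retweeted_status: {$exists: true}} 
}, 
// store the parent tweet id in the subdocument and give it its own _id 
{ 
    $addFields: { 'retweeted_status.retweeted_by' : '$_id', 'retweeted_status._id' : '$retweeted_status.id_str'} 
}, 
// make the subdocument the new root 
{ 
    $replaceRoot: { newRoot: '$retweeted_status'} 
}, 
// one document per retweeted tweet, with the set of parent ids 
{ 
    $group: { _id: '$_id', doc: { '$first': '$$ROOT' }, retweeted_by: {$addToSet: '$retweeted_by'}} 
}, 
// overwrite the single parent id inside doc with the whole array 
{ 
    $addFields: { 'doc.retweeted_by' : '$retweeted_by'} 
}, 
// make doc the new root 
{ 
    $replaceRoot: { newRoot: '$doc'} 
}, 
// drop the fields that duplicate _id 
{ 
    $project: { id: 0 , id_str: 0 } 
}, 
// write the result to the retweets collection 
{ 
    $out: 'retweets' 
} 
], {allowDiskUse: true})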

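At the start, each document (tweet) has the form:

{parent, subdocument{}}

First I match on the existence of retweeted_status (the subdocument). Then, before grouping by the retweeted_status id, I add a field holding the id of the parent tweet:

{parent, {subdocument, parent_id}}

Then I replace the root with the modified subdocument:

{subdocument, parent_id}

Then I group by the _id of the new root, take the first document of each group, and add a new accumulator to the group (retweeted_by). I use $addToSet rather than $push because the Twitter API sometimes sends duplicates.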
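The difference between the two accumulators can be illustrated in plain JavaScript. This is only a stand-in for their semantics, not MongoDB code; the ids are the example values from the question, with one repeated to mimic a duplicate delivery.

```javascript
// Plain JavaScript stand-ins for the two accumulators; not MongoDB code.
// One id is repeated on purpose to mimic a tweet delivered twice.
const parentIds = ["123456", "974631", "123456", "121212"];

// $push semantics: keep every value, duplicates included.
const pushed = parentIds.slice();

// $addToSet semantics: keep each distinct value once.
// (MongoDB does not guarantee the order of the resulting array.)
const addedToSet = Array.from(new Set(parentIds));

console.log(pushed.length);     // 4
console.log(addedToSet.length); // 3
```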

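At this point the root document contains the _id, the retweet document embedded in the field "doc", and an array containing the parents:

{doc{subdocument, parent_id}, [parent_ids]}

Next, I add the array of parents as a field inside doc (overwriting the retweeted_by field added earlier):

{doc{subdocument, [parent_ids]}, [parent_ids]}

Then I replace the root with the new document, and exclude the fields that hold the same number as _id (id and id_str):

{subdocument, [parent_ids]}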
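To check the document shapes above without a database, the same steps can be emulated in plain JavaScript. This is an illustrative sketch only: `collectRetweets` is a hypothetical helper, not part of any MongoDB API, and the sample documents are made up from the example values in the question, using `id_str` as in the pipeline.

```javascript
// Plain JavaScript emulation of the pipeline stages, for illustration only.
// collectRetweets is a hypothetical helper, not part of any MongoDB API.
function collectRetweets(tweets) {
  const groups = new Map();
  for (const tweet of tweets) {
    const sub = tweet.retweeted_status;
    // Roughly the $match stage (this sketch also skips null values).
    if (sub == null) continue;
    if (!groups.has(sub.id_str)) {
      // $first: keep the first copy of the subdocument, with id_str as _id
      // and without the id and id_str fields that duplicate it.
      const { id, id_str, ...rest } = sub;
      groups.set(sub.id_str, { doc: { _id: id_str, ...rest }, parents: new Set() });
    }
    // $addToSet: record each parent tweet id once.
    groups.get(sub.id_str).parents.add(tweet._id);
  }
  // Final shape: {subdocument, [parent_ids]}
  return Array.from(groups.values(), ({ doc, parents }) => ({
    ...doc,
    retweeted_by: Array.from(parents)
  }));
}

const sample = [
  { _id: "123456", retweeted_status: { id_str: "159753", other_fields2: "values2" } },
  { _id: "974631", retweeted_status: { id_str: "159753", other_fields2: "values2" } },
  // Duplicate delivery of the first tweet.
  { _id: "123456", retweeted_status: { id_str: "159753", other_fields2: "values2" } },
  // Not a retweet, so it is skipped.
  { _id: "555555", other_fields1: "values1" }
];

const result = collectRetweets(sample);
console.log(JSON.stringify(result));
// [{"_id":"159753","other_fields2":"values2","retweeted_by":["123456","974631"]}]
```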

Source

2017-08-08 12:09:17
