Efficiently concatenate n strings from a JSON object?

So I have a JSON object that looks like this:

data = [{"key1": 123, "key2": "this is the first string to concatenate"},
        {"key1": 131, "key2": "C'est la deuxième chaîne à concaténer"},
        {"key1": 152, "key2": "this is the third string to concatenate"},
        {"key1": 152, "key2": "this is the fourth string to concatenate"}]

I want to concatenate all of the English key2 strings together, like this:

"this is the first string to concatenate this is the third string to concatenate this is the fourth string to concatenate"

Based on this question, I did it like this:

all_key2 = " ".join([elem["key2"] for elem in data if langid.classify(elem["key2"])[0] == "en"])

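However, is it possible to limit the number of items that get joined? For example, what if I only want to concatenate at most 2 English key2 strings? That is, I want something like this: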

"this is the first string to concatenate this is the third string to concatenate"

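Basically, once I have concatenated some maximum number of English sentences, I don't want to concatenate any more. I can do this with a for loop, like this: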

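import json
import langid

all_key2 = ""
english_count = 0
data = json.load(json_file)
for p in data:
    if english_count >= 2:  # stop once the maximum has been reached
        break
    # make it all one big string
    # langid.classify() returns a (language_code, score) tuple
    if langid.classify(p["key2"])[0] == "en":
        english_count += 1
        all_key2 = (all_key2 + " " + p["key2"]).strip()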

But I would like to avoid the for loop because of performance problems... is there a way to do this?

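[Edit] The reason I don't just slice the filtered list is that generating the filtered list takes a long time. I want to impose a maximum english_count condition so that I only generate part of the whole list.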

Source

2017-04-23 ocean800

What performance problem? "Premature optimization is the root of all evil" https://xkcd.com/1691/ Vectorized approaches can't stop early; a 'for' loop can (using 'break'). – cco

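@cco I have a bazillion objects, and each one has many long strings. Using a for loop takes too long, and '.join()' gives a significant performance improvement. – ocean800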

A list comprehension can't stop early; it will always run over the whole list. – cco
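To make the point in the comment above concrete, here is a small sketch that counts how many times the classifier is called in each approach. The `is_english` function is a hypothetical stand-in (it just checks a prefix) so the example runs without langid; it is not how the original poster detects language.

```python
calls = 0

def is_english(text):
    """Hypothetical stand-in for a language detector; counts its own calls."""
    global calls
    calls += 1
    return text.startswith("this")

data = [
    {"key1": 123, "key2": "this is the first string to concatenate"},
    {"key1": 131, "key2": "C'est la deuxième chaîne à concaténer"},
    {"key1": 152, "key2": "this is the third string to concatenate"},
    {"key1": 152, "key2": "this is the fourth string to concatenate"},
]

# List comprehension followed by a slice: every element is classified
# before the slice throws the extra matches away.
calls = 0
sliced = [e["key2"] for e in data if is_english(e["key2"])][:2]
comprehension_calls = calls

# Loop with break: classification stops as soon as two matches are found.
calls = 0
picked = []
for e in data:
    if is_english(e["key2"]):
        picked.append(e["key2"])
        if len(picked) >= 2:
            break
loop_calls = calls

print(comprehension_calls, loop_calls)  # prints "4 3"
```

Both approaches produce the same two strings here; the difference is only in how many elements get classified, which is what matters when classification is the expensive step.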

Using a for loop instead of a list comprehension lets you stop early, something like this:

filtered_list = []
for elem in data:
    # langid.classify() returns a (language_code, score) tuple
    if langid.classify(elem["key2"])[0] == "en":
        filtered_list.append(elem["key2"])
        if len(filtered_list) >= 2:  # or whatever your max is
            break
result = " ".join(filtered_list)
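If you would rather keep the single `.join()` call, a lazy generator combined with `itertools.islice` is another way to stop early. This is only a sketch: `join_first_n` and `looks_english` are hypothetical names that are not from the question, and the predicate is a simple stand-in so the example runs without langid. With langid installed, you could presumably pass `lambda t: langid.classify(t)[0] == "en"` instead.

```python
from itertools import islice

def join_first_n(items, n, is_match, sep=" "):
    """Join the "key2" values of the first n items accepted by is_match.

    Hypothetical helper for illustration; is_match is any callable
    that takes a string and returns True or False.
    """
    # Generator expression: nothing is classified until join() asks for it.
    matches = (item["key2"] for item in items if is_match(item["key2"]))
    # islice stops pulling from the generator after n matches.
    return sep.join(islice(matches, n))

data = [
    {"key1": 123, "key2": "this is the first string to concatenate"},
    {"key1": 131, "key2": "C'est la deuxième chaîne à concaténer"},
    {"key1": 152, "key2": "this is the third string to concatenate"},
    {"key1": 152, "key2": "this is the fourth string to concatenate"},
]

def looks_english(text):
    # Stand-in predicate (an assumption), not a real language detector.
    return text.startswith("this")

result = join_first_n(data, 2, looks_english)
print(result)
```

Because the generator is consumed lazily, the fourth element is never classified in this example, which is the same early-stop behaviour as the loop with `break`.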

Source

2017-04-23 22:02:27 cco
