2017-08-24
0

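Replace the comma with an underscore "_" in Python bigrams

I have a list of lists. I was able to generate bigrams within the inner lists, and the result looks like this: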

[[('bacteria', 'agricultur'), ('agricultur', 'soil'), ('soil', 'presenc'), ('presenc', 'sampl')],[('bacteria', 'agricultur'), ('agricultur', 'soil'), ('soil', 'presenc'), ('presenc', 'sampl')],[('nodul', 'uragensi')], [('nodul', 'stem'), ('stem', 'nodul')], [('deform', 'morphoid')]]

Now I need to replace the comma in each bigram tuple with an underscore, and I have not been able to do that. The result should be:

[[(bacteria_agricultur), (agricultur_soil), (soil_presenc), (presenc_sampl)],[(bacteria_agricultur), (agricultur_soil), (soil_presenc), (presenc_sampl)],[(nodul_uragensi)], [(nodul_stem), (stem_nodul)], [('deform'_'morphoid')]]

When I use join, it gives me an error:

texts = ["_".join(word) for word in texts] 

Error:

TypeError: sequence item 0: expected str instance, tuple found 

How can I produce the output above? Thanks.

Answers

1

You can just use a nested list comprehension:

In [446]: [['_'.join(y) for y in x] for x in lst] 
Out[446]: 
[['bacteria_agricultur', 'agricultur_soil', 'soil_presenc', 'presenc_sampl'], 
['bacteria_agricultur', 'agricultur_soil', 'soil_presenc', 'presenc_sampl'], 
['nodul_uragensi'], 
['nodul_stem', 'stem_nodul'], 
['deform_morphoid']] 
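For anyone who wants to run this outside IPython, here is a minimal self-contained sketch. It assumes `texts` is a shortened copy of the list from the question, and the names `error_message` and `joined` are only illustrative. It also shows why the one-level comprehension fails: each `word` there is an inner list of tuples, so `join` receives tuples instead of strings.

```python
# Shortened sample of the data from the question: a list of lists of 2-tuples.
texts = [
    [('bacteria', 'agricultur'), ('agricultur', 'soil')],
    [('nodul', 'uragensi')],
]

# One level of iteration only reaches the inner lists, so join() is handed
# tuples and raises TypeError.
try:
    ["_".join(word) for word in texts]
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Two levels of iteration reach the tuples themselves; each tuple holds
# strings, so join() works on it.
joined = [["_".join(pair) for pair in inner] for inner in texts]
print(joined)
```

The outer comprehension keeps the list-of-lists shape, and the inner one turns each tuple into a single string.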

If you insist on the parentheses, you can create one-element tuples as well:

In [447]: [[('_'.join(y),) for y in x] for x in lst] 
Out[447]: 
[[('bacteria_agricultur',), 
    ('agricultur_soil',), 
    ('soil_presenc',), 
    ('presenc_sampl',)], 
[('bacteria_agricultur',), 
    ('agricultur_soil',), 
    ('soil_presenc',), 
    ('presenc_sampl',)], 
[('nodul_uragensi',)], 
[('nodul_stem',), ('stem_nodul',)], 
[('deform_morphoid',)]] 
0
NewData = []
for bigrams in lists:
    for grams in bigrams:
        # str(grams) looks like "('a', 'b')"; drop the quotes, then swap ", " for "_"
        NewData.append(str(grams).replace("'", "").replace(", ", "_"))
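Note that this loop collects everything into one flat list of strings such as `(nodul_stem)`, so the grouping by inner list is lost. A minimal sketch of the difference, assuming `lists` is a shortened copy of the question's data; `format_bigram` is a hypothetical helper, not part of the original answer:

```python
# Shortened sample of the data from the question.
lists = [[('nodul', 'stem'), ('stem', 'nodul')], [('deform', 'morphoid')]]

# The string-replace approach: one flat list, parentheses kept as text.
flat = []
for bigrams in lists:
    for grams in bigrams:
        flat.append(str(grams).replace("'", "").replace(", ", "_"))
print(flat)

def format_bigram(grams):
    """Hypothetical helper: join the tuple's items and wrap them in parentheses."""
    return "(" + "_".join(grams) + ")"

# Same text per bigram, but the list-of-lists structure is preserved.
nested = [[format_bigram(grams) for grams in bigrams] for bigrams in lists]
print(nested)
```

Building the string with `join` avoids depending on the exact `repr` of a tuple, which matters if a token itself contains a quote or the sequence `, `.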