2017-02-17 288 views
0

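I have a defaultdict with three levels of nesting that I will later use for trigrams. How can I nest the current dictionary inside another dictionary in Python?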

counts = defaultdict(lambda:defaultdict(lambda:defaultdict(lambda:0))) 

Then I have a for loop that goes through a file and builds the count for each letter (as well as the bigram and trigram counts):

counts[letter1][letter2][letter3] = counts[letter1][letter2][letter3] + 1 

I want to add another layer so that I can specify whether a letter is a consonant or a vowel.

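I would like to be able to run my bigrams and trigrams over consonants versus vowels instead of over every letter of the alphabet, but I don't know how to do this.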

+0

Can you provide your current code? – mitoRibo

+0

I'm not sure I understand your question... Why wouldn't simply adding another "layer" to your defaultdict solve the problem? Do you not know how to do that? –

+1

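Good grief, what did `+=` ever do to you to make you hate it so much? Not using it (especially here) is slower and ridiculously more verbose/redundant than `counts[letter1][letter2][letter3] += 1`! – ShadowRanger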

Answers

0

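I'm not sure exactly what you're trying to do, but I think the nested-dictionary approach is not as clean as a flat dictionary keyed by the joined letters as a single string (i.e. d['ab'] instead of d['a']['b']). I also added code that checks whether each bigram/trigram consists only of vowels, only of consonants, or of a mixture.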

CODE:

from collections import defaultdict


def all_ngrams(text, n):
    """Return every n-character substring of text that contains no space."""
    ngrams = [text[ind:ind + n] for ind in range(len(text) - (n - 1))]
    return [ngram for ngram in ngrams if ' ' not in ngram]


counts = defaultdict(int)
text = 'hi hello hi this is hii hello'
vowels = 'aeiouyAEIOUY'  # 'y' is treated as a vowel here
consonants = 'bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ'

for n in [2, 3]:
    for ngram in all_ngrams(text, n):
        if all(let in vowels for let in ngram):
            print(ngram + ' is all vowels')
        elif all(let in consonants for let in ngram):
            print(ngram + ' is all consonants')
        else:
            print(ngram + ' is a mixture of vowels/consonants')

        # Flat key: the whole n-gram string, not nested per-letter dicts
        counts[ngram] += 1

print(counts)

OUTPUT:

hi is a mixture of vowels/consonants 
he is a mixture of vowels/consonants 
el is a mixture of vowels/consonants 
ll is all consonants 
lo is a mixture of vowels/consonants 
hi is a mixture of vowels/consonants 
th is all consonants 
hi is a mixture of vowels/consonants 
is is a mixture of vowels/consonants 
is is a mixture of vowels/consonants 
hi is a mixture of vowels/consonants 
ii is all vowels 
he is a mixture of vowels/consonants 
el is a mixture of vowels/consonants 
ll is all consonants 
lo is a mixture of vowels/consonants 
hel is a mixture of vowels/consonants 
ell is a mixture of vowels/consonants 
llo is a mixture of vowels/consonants 
thi is a mixture of vowels/consonants 
his is a mixture of vowels/consonants 
hii is a mixture of vowels/consonants 
hel is a mixture of vowels/consonants 
ell is a mixture of vowels/consonants 
llo is a mixture of vowels/consonants 
defaultdict(<type 'int'>, {'el': 2, 'his': 1, 'thi': 1, 'ell': 2, 'lo': 2, 'll': 2, 'ii': 1, 'hi': 4, 'llo': 2, 'th': 1, 'hel': 2, 'hii': 1, 'is': 2, 'he': 2}) 
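If the goal is to count by consonant/vowel class rather than by the actual letters, the same flat-key idea could be applied to a class pattern such as 'CVC'. The following is only a rough sketch under some assumptions: `cv_pattern` and `pattern_counts` are hypothetical helper names that do not appear in the code above, 'y' is counted as a vowel as in the answer's `vowels` string, and any n-gram containing a non-letter is skipped.

```python
from collections import Counter

VOWELS = set("aeiouy")  # assumption: 'y' counts as a vowel, as in the answer


def cv_pattern(ngram):
    """Map each letter to 'V' or 'C', e.g. 'thi' -> 'CCV' (hypothetical helper)."""
    return "".join("V" if ch.lower() in VOWELS else "C" for ch in ngram)


def pattern_counts(text, sizes=(2, 3)):
    """Count consonant/vowel patterns of all n-grams of the given sizes."""
    counts = Counter()
    for n in sizes:
        for i in range(len(text) - n + 1):
            ngram = text[i:i + n]
            if ngram.isalpha():  # skip n-grams spanning spaces or punctuation
                counts[cv_pattern(ngram)] += 1
    return counts


counts = pattern_counts("hi hello hi this is hii hello")
print(counts["CV"], counts["CC"], counts["CVC"])
```

With the sample text from the answer, 'hi', 'he' and 'lo' all collapse into the single key 'CV', so the table has at most 4 bigram keys and 8 trigram keys regardless of the alphabet.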
0

Assuming what you need is to keep counts for sequences of vowels and consonants, you can simply keep a separate map for them.

If you have a function is_vowel(letter) that returns True if letter is a vowel and False if it is a consonant, you can do this:

vc_counts[is_vowel(letter1)][is_vowel(letter2)][is_vowel(letter3)] = \ 
vc_counts[is_vowel(letter1)][is_vowel(letter2)][is_vowel(letter3)] + 1 
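For completeness, here is a minimal runnable sketch of that idea. It rests on a few assumptions that are not in the answer: this particular `is_vowel` treats 'y' as a consonant and returns False for anything that is not a vowel, `vc_counts` is built as a three-level defaultdict like the `counts` in the question, and the input text contains only letters.

```python
from collections import defaultdict

VOWELS = set("aeiouAEIOU")  # assumption: 'y' is treated as a consonant here


def is_vowel(letter):
    """Return True for a vowel, False otherwise (illustrative implementation)."""
    return letter in VOWELS


# Same three-level shape as the question's counts, but keyed by True/False
vc_counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

text = "banana"  # sample input, letters only
for l1, l2, l3 in zip(text, text[1:], text[2:]):
    vc_counts[is_vowel(l1)][is_vowel(l2)][is_vowel(l3)] += 1

# consonant-vowel-consonant trigrams: 'ban', 'nan'
print(vc_counts[False][True][False])
# vowel-consonant-vowel trigrams: 'ana', 'ana'
print(vc_counts[True][False][True])
```

Because the keys are only True or False, the structure holds at most 2 x 2 x 2 = 8 counters, however many distinct letters appear in the file.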
+0

Thank you so much! This is great. –

+0

Great, @KatieTetzloff. If it solved your problem, could you please upvote and accept the answer? Thanks! – cjungel
