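I think what you're looking for is a simple dictionary structure. It lets you keep track not only of the words you're looking for, but also of how many times each one occurs.
A dictionary stores things as key/value pairs. So, for example, you could use the key 'alice' (a word you want to find) and set its value to the number of times you have found that word.
The simplest way to check whether something is in a dictionary is Python's `in` keyword, i.e.

    if 'pie' in words_in_my_dict: do something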
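To make the key/value idea concrete, here is a minimal sketch. The dictionary contents and the names `word_counts` and `has_rabbit` are made up for illustration.

```python
# Keys are the words; values are how many times each has been seen.
word_counts = {'alice': 0, 'pie': 2}

# `in` tests for the presence of a key (not a value).
if 'pie' in word_counts:
    word_counts['pie'] += 1  # update the value stored under an existing key

has_rabbit = 'rabbit' in word_counts  # no such key in the dictionary

print(word_counts['pie'])
print(has_rabbit)
```

Note that `in` only looks at keys, so `2 in word_counts` would be false even though 2 was stored as a value.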
With this in hand, building a word counter is easy:

    def get_word_counts(words_to_count, filename):
        # Despite its name, `filename` holds the text itself, not a path.
        words = filename.split(' ')
        for word in words:
            # Only count the words we were asked to track.
            if word in words_to_count:
                words_to_count[word] += 1
        return words_to_count

    if __name__ == '__main__':
        fake_file_contents = (
            "Alice's Adventures in Wonderland (commonly shortened to "
            "Alice in Wonderland) is an 1865 novel written by English"
            " author Charles Lutwidge Dodgson under the pseudonym Lewis"
            " Carroll.[1] It tells of a girl named Alice who falls "
            "down a rabbit hole into a fantasy world populated by peculiar,"
            " anthropomorphic creatures. The tale plays with logic, giving "
            "the story lasting popularity with adults as well as children."
            "[2] It is considered to be one of the best examples of the literary "
            "nonsense genre,[2][3] and its narrative course and structure, "
            "characters and imagery have been enormously influential[3] in "
            "both popular culture and literature, especially in the fantasy genre."
        )
        words_to_count = {
            'alice': 0,
            'and': 0,
            'the': 0
        }
        print(get_word_counts(words_to_count, fake_file_contents))

This gives the output:
{'and': 4, 'the': 5, 'alice': 0}
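The dictionary stores both the words we want to count and the number of times each one appears, so the whole algorithm is just checking whether each word is in the dict and, if it is, adding 1 to that word's value.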
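One detail in the output above: 'alice' comes out as 0 because the comparison is case-sensitive, and the text only contains "Alice" and "Alice's". If you want case-insensitive matching (an assumption on my part; the question doesn't say), a rough sketch could lowercase each word before the lookup. The function name `get_word_counts_ignoring_case` and the sample sentence are hypothetical.

```python
def get_word_counts_ignoring_case(words_to_count, text):
    """Count occurrences of the (lowercase) keys of words_to_count in text."""
    counts = dict(words_to_count)  # copy, so the caller's dict is left untouched
    for word in text.split(' '):
        word = word.lower()  # 'Alice' and 'alice' now compare equal
        if word in counts:
            counts[word] += 1
    return counts

sample = "Alice in Wonderland tells of a girl named Alice and the rabbit"
result = get_word_counts_ignoring_case({'alice': 0, 'and': 0, 'the': 0}, sample)
print(result['alice'])
```

This still treats "Alice's" as a different word from "alice", since only the case is normalized.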
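Read up on dictionaries here.
Edit:
If you want to count all the words and then pick out a specific set of them, dictionaries are still great (and fast!) for the task.
The only change we need to make is to first check whether the key exists in the dictionary and, if it doesn't, add it.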
Example:

    def get_all_word_counts(filename):
        # As before, `filename` holds the text itself, not a path.
        words = filename.split(' ')
        word_counts = {}
        for word in words:
            if word not in word_counts:  # If not already there
                word_counts[word] = 0    # add it in.
            word_counts[word] += 1       # Increment the count accordingly
        return word_counts

This gives the output:
and : 4
shortened : 1
named : 1
popularity : 1
peculiar, : 1
be : 1
populated : 1
is : 2
(commonly : 1
nonsense : 1
an : 1
down : 1
fantasy : 2
as : 2
examples : 1
have : 1
in : 4
girl : 1
tells : 1
best : 1
adults : 1
one : 1
literary : 1
story : 1
plays : 1
falls : 1
author : 1
giving : 1
enormously : 1
been : 1
its : 1
The : 1
to : 2
written : 1
under : 1
genre,[2][3] : 1
literature, : 1
into : 1
pseudonym : 1
children.[2] : 1
imagery : 1
who : 1
influential[3] : 1
characters : 1
Alice's : 1
Dodgson : 1
Adventures : 1
Alice : 2
popular : 1
structure, : 1
1865 : 1
rabbit : 1
English : 1
Lutwidge : 1
hole : 1
Carroll.[1] : 1
with : 2
by : 2
especially : 1
a : 3
both : 1
novel : 1
anthropomorphic : 1
creatures. : 1
world : 1
course : 1
considered : 1
Lewis : 1
Charles : 1
well : 1
It : 2
tale : 1
narrative : 1
Wonderland) : 1
culture : 1
of : 3
Wonderland : 1
the : 5
genre. : 1
logic, : 1
lasting : 1
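Note: as you can see, there are a couple of "misfires" when we `split(' ')` the file. Specifically, some words have opening or closing brackets or other punctuation attached. You will have to account for this in your file processing, but I'll leave that to you!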
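One possible way to handle those misfires, sketched under my own assumptions rather than as the definitive fix: pull words out with a regular expression instead of `split(' ')`, lowercasing the text first. The function name `get_clean_word_counts`, the pattern, and the sample string are all illustrative choices.

```python
import re

def get_clean_word_counts(text):
    """Count every word in text, ignoring case and attached punctuation."""
    # A word here is a run of letters, optionally with internal
    # apostrophes (so "alice's" stays together). Digits are dropped.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    word_counts = {}
    for word in words:
        if word not in word_counts:  # same check-then-add pattern as above
            word_counts[word] = 0
        word_counts[word] += 1
    return word_counts

sample = "Alice in Wonderland) is a novel.[1] It tells of Alice, (commonly a girl."
counts = get_clean_word_counts(sample)
print(counts['alice'])
```

The trade-off is that anything that is not a letter or an apostrophe is thrown away, so tokens like "1865" or "[1]" no longer show up at all. Adjust the pattern if you need numbers counted too.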
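You need a list of the words you want to keep counts for, and you should only add to a count when the input word is on that list. As an optimization, you could initialize the 'cnt' dictionary with a zero count for each "interesting" word, and then in the main loop only increment when the word already has a count. – tripleee 2013-04-11 05:49:01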
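[Please use consistent indentation](http://www.python.org/dev/peps/pep-0008/). But anyway, I don't understand the question. What do you want it to do that your code doesn't? What does it do that you don't want? What does "read only specific words" mean? You can't know whether the word you are about to read is in the "specific words" list until you look at it, i.e. read it. – 2013-04-11 07:11:41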