2017-02-11 77 views
1

NLTK concordance not working

I am trying to run the code below to use NLTK's concordance, but it returns no results. Can someone tell me what I am doing wrong?

import nltk.corpus 
from nltk.text import Text 

sent = '''China is an emerging FinTech hotbed thanks to its expanding middle class, rapid digitization and electronic payments adoption. But a new report from Citi found that, while China may be the market to watch for FinTech investments, the U.S. continues to thrive at the top of the B2B FinTech mountain. 
According to Digital Disruption — Revisited: What FinTech VC Investments Tells Us About A Changing Industry, Citi expects an influx in venture capital across the FinTech startup scape. But not all markets are created equal. China saw more than half of the world’s FinTech investments in the first nine months of 2016, the bank noted.''' 

content = sent.decode('utf-8') #else it throws error 
textList = Text(content) 
textList.concordance('FinTech') 

I get the following output:

No matches 

Thanks in advance for any help.

Answers

2

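A Text instance must be created from a sequence of strings (tokens), not from a single string. Use a tokenizer from nltk.tokenize to tokenize your text: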

> t = nltk.tokenize.WhitespaceTokenizer() # or any other Tokenizer 
> c = Text(t.tokenize(content)) 
> c.concordance(u'FinTech') 
Displaying 6 of 6 matches: 
            FinTech hotbed thanks to its expanding midd 
hina may be the market to watch for FinTech investments, the U.S. continues to 
ues to thrive at the top of the B2B FinTech mountain. According to Digital Disr 
igital Disruption — Revisited: What FinTech VC Investments Tells Us About A Cha 
nflux in venture capital across the FinTech startup scape. But not all markets 
a saw more than half of the world’s FinTech investments in the first nine month
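
To see why the original call prints "No matches", here is a minimal sketch that uses only the standard library. It assumes, as described above, that Text works on a sequence of tokens, so a plain string ends up being treated as a sequence of single characters. The helper `find_offsets` and the shortened sample sentence are hypothetical and only for illustration; this is not NLTK's implementation.

```python
def find_offsets(tokens, word):
    """Return the positions where a token equals `word`, ignoring case.

    Hypothetical helper for illustration; not part of NLTK.
    """
    target = word.lower()
    return [i for i, tok in enumerate(tokens) if tok.lower() == target]


# Shortened sample sentence (assumption, not the original text).
text = "China is an emerging FinTech hotbed and the U.S. leads in B2B FinTech deals"

# Iterating over a str yields single characters,
# so no "token" can ever equal the whole word.
char_tokens = list(text)

# Splitting on whitespace yields word tokens instead.
word_tokens = text.split()

print(char_tokens[:5])
print(find_offsets(char_tokens, "FinTech"))
print(find_offsets(word_tokens, "FinTech"))
```

The character-level list never contains "FinTech" as a single item, which is the situation the question's `Text(content)` call ends up in. Note also that in Python 3 a `str` is already Unicode, so the `.decode('utf-8')` step from the question would not be needed there.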