2016-07-15 68 views

Error when a word occurs zero times

First of all, sorry for my broken English.

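I use this code to count the number of tweets in which the word "LeBron" or "Curry" appears. The problem is that if no tweet contains the word "LeBron" or "Curry", the program crashes. When both words are present, the program runs perfectly.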

import json
import re

import pandas as pd

tweets_data_path = '/Users/HCruz/NetBeansProjects/elections3/data/fetched_tweets.txt'

# Read one JSON-encoded tweet per line, skipping lines that fail to parse.
tweets_data = []
with open(tweets_data_path, "r") as tweets_file:
    for line in tweets_file:
        try:
            tweet = json.loads(line)
            tweets_data.append(tweet)
        except ValueError:
            continue

tweets = pd.DataFrame()

# Build the column from a list so it also works on Python 3, where map() is lazy.
tweets['text'] = [tweet['text'] for tweet in tweets_data]

def word_in_text(word, text):
    """Return True if word occurs in text, ignoring case."""
    word = word.lower()
    text = text.lower()
    match = re.search(word, text)
    if match:
        return True
    return False

tweets['LeBron'] = tweets['text'].apply(lambda tweet: word_in_text('LeBron', tweet))
tweets['Curry'] = tweets['text'].apply(lambda tweet: word_in_text('Curry', tweet))

# Count the matching tweets (this is the lookup that crashes when there are none).
LeBron = tweets['LeBron'].value_counts()[True]
Curry = tweets['Curry'].value_counts()[True]

print("LeBron %s" % LeBron)
print("Curry %s" % Curry)

When there is at least one occurrence each of "Curry" and "LeBron", I get this:

Processing... 
LeBron 1 
Curry 34 

That is perfect.

But if I remove "LeBron" so that it never occurs, the program crashes.

Hectors-iMac:src HCruz$ python process_tweets.py
Processing...
Traceback (most recent call last):
  File "process_tweets.py", line 80, in <module>
    s.run()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sched.py", line 117, in run
    action(*argument)
  File "process_tweets.py", line 54, in processing
    process_tweets()
  File "process_tweets.py", line 44, in process_tweets
    LeBron = tweets['LeBron'].value_counts()[True]
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/series.py", line 491, in __getitem__
    result = self.index.get_value(self, key)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 1038, in get_value
    return tslib.get_value_box(s, key)
  File "tslib.pyx", line 454, in pandas.tslib.get_value_box (pandas/tslib.c:9561)
  File "tslib.pyx", line 469, in pandas.tslib.get_value_box (pandas/tslib.c:9408)
IndexError: index out of bounds
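
The traceback points at the lookup `value_counts()[True]`. Here is a minimal sketch of what seems to be going on, using a small hypothetical Series named `flags` in place of `tweets['LeBron']`. `value_counts()` only lists values that actually occur, so when no tweet matches there is no `True` label to look up. The exact exception type probably depends on the pandas version: the traceback above shows `IndexError`, while newer releases are expected to raise `KeyError`.

```python
import pandas as pd

# Hypothetical stand-in for tweets['LeBron'] when no tweet matches.
flags = pd.Series([False, False, False])

counts = flags.value_counts()

# Only values that occur appear as labels, so True is absent here.
labels = counts.index.tolist()
print(labels)

# Checking for the label first shows why counts[True] has nothing to return.
has_true = True in labels
print(has_true)
```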

Answer

Use exception handling: wrap the code on line 44 in a try/except:

try: 
    LeBron = tweets['LeBron'].value_counts()[True] 
except IndexError: 
    LeBron = 0 
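
As an alternative that avoids the exception altogether, here is a sketch that assumes the column holds plain booleans, as in the question. `Series.get` returns a default when the label is missing, and summing a boolean column counts the `True` values directly. The `flags` Series is a hypothetical stand-in for `tweets['LeBron']`.

```python
import pandas as pd

# Hypothetical stand-in for tweets['LeBron'] with no matching tweets.
flags = pd.Series([False, False, False])

# Option 1: look up the True label, falling back to 0 when it is absent.
count_via_get = flags.value_counts().get(True, 0)

# Option 2: True counts as 1 and False as 0, so the sum is the count.
count_via_sum = int(flags.sum())

print("LeBron %s" % count_via_get)
print("LeBron %s" % count_via_sum)
```

Either form gives 0 for a word that never occurs, so no special case is needed before printing.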

I got an error, but I changed LeBron = None to LeBron = 0 and it works perfectly. Thanks!

Great, glad you figured it out, cheers! – davedwards

If my answer helped you figure it out, please consider accepting it, thanks! – davedwards
