Watson DSX raises "IndexError", but a Docker Jupyter environment does not

I ran the following Python script on Watson DSX and got an error (IndexError: list index out of range).

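(1) The same program runs fine in a Docker Jupyter notebook environment.

(2) If the input file is smaller, it runs fine on Watson DSX as well.

Could you tell me what causes this, and what I should do to avoid the error?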

!pip install janome
# Load the input text (the helper is not shown here; it must be defined in an earlier notebook cell)
data = get_object_storage_file_with_credentials_8b9fb794cc1049b09563d144c8861966('KITDemo', 'kusa-out.txt')
#data = get_object_storage_file_with_credentials_8b9fb794cc1049b09563d144c8861966('KITDemo', 'kusa2-out.txt')
txt = data.getvalue()

word_list = []
from janome.tokenizer import Tokenizer
t = Tokenizer()
# Tokenize lazily and keep only the surface forms of nouns
for token in t.tokenize(txt, stream=True):
    partOfSpeech = token.part_of_speech.split(',')[0]
    if partOfSpeech == u'名詞':
        word_list.append(token.surface)

Here is the full stack trace.

IndexError Traceback (most recent call last) 
<ipython-input-4-9a7681ae1aa6> in <module>() 
     2 from janome.tokenizer import Tokenizer 
     3 t = Tokenizer() 
----> 4 for token in t.tokenize(txt, stream=True): 
     5  partOfSpeech = token.part_of_speech.split(',')[0] 
     6  if partOfSpeech == u'名詞':


Source

2017-10-15 Masanori Akaishi

Can you post the full stack trace?

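There may be something in the input data that causes the tokenizer to produce an empty token, or that triggers the IndexError itself. Have you tried other large input datasets, or only this one? You could add some debug output, for example by printing 'token' at the start of the loop, and then inspect the input data just after the last token that was printed.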
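The debugging idea in the comment above could be sketched roughly as follows. This is only an illustration: `collect_nouns_with_debug` and `FakeToken` are hypothetical names introduced here, and the function only assumes that each token has `surface` and `part_of_speech` attributes, as in the question's code. It records the last token seen before an IndexError escapes from the token stream.

```python
from collections import namedtuple

# Hypothetical stand-in for a tokenizer token; it only carries the two
# attributes that the question's loop actually uses.
FakeToken = namedtuple("FakeToken", ["surface", "part_of_speech"])


def collect_nouns_with_debug(tokens):
    """Collect noun surfaces and remember where the stream failed, if it did.

    Returns (word_list, tokens_seen, last_surface, error).
    """
    word_list = []
    seen = 0
    last_surface = None
    error = None
    try:
        for token in tokens:
            seen += 1
            last_surface = token.surface
            if token.part_of_speech.split(',')[0] == u'名詞':
                word_list.append(token.surface)
    except IndexError as exc:
        # The failure happened while fetching the token *after* last_surface.
        error = exc
        print("IndexError after %d tokens; last token seen: %r" % (seen, last_surface))
    return word_list, seen, last_surface, error
```

In the notebook this would be called as `collect_nouns_with_debug(t.tokenize(txt, stream=True))`; searching the input text for the reported last token should then point at the region that triggers the error.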

"IndexError: list index out of range" was fixed in janome version 0.3.6. https://github.com/mocobeta/janome/blob/0.3.6/CHANGES.txt

Please upgrade janome. If you still have the problem after upgrading, please create an issue here: https://github.com/mocobeta/janome/issues
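Before and after upgrading (for example with `!pip install --upgrade janome` in a notebook cell), it may help to confirm which version each environment actually has. The following is a minimal sketch, not part of janome: `parse_version`, `needs_upgrade` and `installed_janome_version` are hypothetical helpers, the version parser assumes plain dotted numeric versions such as "0.3.6", and the lookup relies on `importlib.metadata`, which is only available on Python 3.8 or later.

```python
def parse_version(version_string):
    """Turn a plain dotted version like '0.3.6' into a comparable tuple."""
    return tuple(int(part) for part in version_string.split('.'))


def needs_upgrade(installed, fixed_in="0.3.6"):
    """True if the installed version is older than the release with the fix."""
    return parse_version(installed) < parse_version(fixed_in)


def installed_janome_version():
    """Return the installed janome version string, or None if unavailable."""
    try:
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Older Python without importlib.metadata: cannot check this way.
        return None
    try:
        return version("janome")
    except PackageNotFoundError:
        return None
```

Running `installed_janome_version()` in both the DSX notebook and the Docker notebook, and passing the result to `needs_upgrade`, would show whether either environment is still on a release older than 0.3.6.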

Source

2017-12-13 08:42:17
