2016-02-19 61 views
3

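MaltParser gives an assertion error when used with NLTK

I am using MaltParser with Python NLTK. I have successfully downloaded the training data and updated to the latest NLTK. When I call MaltParser, it gives me an assertion error. Below is the Python code, together with the traceback.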

mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m']) 

Traceback (most recent call last): 
    File "<pyshell#10>", line 1, in <module> 
    mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m']) 
    File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 131, in __init__ 
    self.malt_jars = find_maltparser(parser_dirname) 
    File "C:\Python34\lib\site-packages\nltk\parse\malt.py", line 72, in find_maltparser 
    assert malt_dependencies.issubset(_jars) 
AssertionError 
>>> 
+0

Have you set it up as described at https://github.com/nltk/nltk/wiki/Installing-Third-Party-Software#malt-parser? – alvas

+0

Do you have ['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'] in 'C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1'? – alvas

+0

What is the output when you type 'dir C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1/' at the command prompt? – alvas

Answers

1

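If all the downloads and environment variables are set up correctly, the most likely cause is how the file/directory path is split in nltk/parse/malt.py, at https://github.com/nltk/nltk/blob/develop/nltk/parse/malt.py#L69, which separates the directory from the file name in a Linux-specific way: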

def find_maltparser(parser_dirname): 
    """ 
    A module to find MaltParser .jar file and its dependencies. 
    """ 
    if os.path.exists(parser_dirname): # If a full path is given. 
        _malt_dir = parser_dirname 
    else: # Try to find path to maltparser directory in environment variables. 
        _malt_dir = find_dir(parser_dirname, env_vars=('MALT_PARSER',)) 
    # Checks that that the found directory contains all the necessary .jar 
    malt_dependencies = ['','',''] 
    _malt_jars = set(find_jars_within_path(_malt_dir)) 
    _jars = set(jar.rpartition('/')[2] for jar in _malt_jars) 
    malt_dependencies = set(['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar']) 

    assert malt_dependencies.issubset(_jars) 
    assert any(filter(lambda i: i.startswith('maltparser-') and i.endswith('.jar'), _jars)) 
    return list(_malt_jars) 

The bug has been fixed in https://github.com/nltk/nltk/pull/1292

The merged change replaces this line:

_jars = set(jar.rpartition('/')[2] for jar in _malt_jars) 

with:

_jars = set(os.path.split(jar)[1] for jar in _malt_jars) 

This should resolve your problem =)
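To illustrate why the `rpartition('/')` line can fail on Windows, here is a minimal sketch. The jar paths are hypothetical, and `ntpath` (the module that `os.path` resolves to on Windows) is used directly so the comparison behaves the same on any operating system. The actual fix calls `os.path.split`.

```python
import ntpath  # Windows path rules; importable on any OS

# Hypothetical jar paths, roughly as a directory walk might report them on Windows.
backslash_jar = r"C:\malt\maltparser-1.8.1\lib\log4j.jar"
mixed_jar = "C:/malt/maltparser-1.8.1" + "\\lib\\log4j.jar"


def old_name(jar):
    """File name as computed by the original line (splits on '/' only)."""
    return jar.rpartition('/')[2]


def new_name(jar):
    """File name as computed with Windows-aware splitting."""
    return ntpath.split(jar)[1]


for jar in (backslash_jar, mixed_jar):
    print(repr(old_name(jar)), repr(new_name(jar)))
```

With either path the old expression returns something longer than the bare `log4j.jar`, so `malt_dependencies.issubset(_jars)` cannot hold, while the Windows-aware split yields `log4j.jar` in both cases.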

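For answers that are not about the code itself but about how you set the environment variables or downloaded and saved the MaltParser directory or files, see https://github.com/nltk/nltk/issues/1294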
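If the assertion still fails after the path fix, it may help to check the directory contents independently of NLTK. The following is a rough sketch, not NLTK's own code: `check_malt_dir` is a hypothetical helper that mirrors the two assertions in `find_maltparser` and reports which one would fail.

```python
import os

# Jar names that find_maltparser asserts on, as quoted in the answer above.
REQUIRED_JARS = {'log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar'}


def check_malt_dir(malt_dir):
    """Return (missing_jars, has_maltparser_jar) for a candidate directory.

    Walks malt_dir recursively and collects bare .jar file names, so the
    result does not depend on which path separator the OS uses.
    """
    jar_names = set()
    for _root, _dirs, files in os.walk(malt_dir):
        for name in files:
            if name.endswith('.jar'):
                jar_names.add(name)
    missing = REQUIRED_JARS - jar_names
    has_main = any(n.startswith('maltparser-') for n in jar_names)
    return missing, has_main


if __name__ == "__main__":
    # Path taken from the question; adjust to your own installation.
    missing, has_main = check_malt_dir(
        "C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1")
    print("missing dependency jars:", sorted(missing))
    print("maltparser-*.jar found:", has_main)
```

A non-empty `missing` set corresponds to the first assertion failing, and `has_main == False` corresponds to the second.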

+0

It didn't work. I changed the line in malt.py and restarted, and it still gives me an assertion error when I load MaltParser – Mustufain

+0

After changing the line in malt.py, it gives me this new assertion error: assert any(filter(lambda i: i.startswith('maltparser-') and i.endswith('.jar'), _jars)) AssertionError – Mustufain

+0

It throws the exception on this line: malt_dependencies = set(['log4j.jar', 'libsvm.jar', 'liblinear-1.8.jar']) – Mustufain

2

TL;DR (in Python 3!!):

import urllib.request 
import zipfile 
urllib.request.urlretrieve('http://www.maltparser.org/mco/english_parser/engmalt.poly-1.7.mco', 'C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco') 
urllib.request.urlretrieve('http://maltparser.org/dist/maltparser-1.8.1.zip', 'C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip') 
zfile = zipfile.ZipFile('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip') 
zfile.extractall('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1\\') 

Then:

from nltk.parse import malt 
mp = malt.MaltParser('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1\\', "C:\\Users\\mustufain\\Desktop\\engmalt.poly-1.7.mco") 
mp.parse_one('I shot an elephant in my pajamas .'.split()).tree() 
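The download step extracts the archive into a directory that is itself named `maltparser-1.8.1`, so if the archive already contains a top-level folder, the jars may end up one level deeper than expected. Whether the archive has such a folder is an assumption here, not something this answer states. A minimal sketch for checking, where `top_level_entries` is a hypothetical helper:

```python
import zipfile


def top_level_entries(zip_path):
    """Return the set of top-level names inside a zip archive.

    Names inside a zip archive use '/' as the separator on every platform.
    """
    with zipfile.ZipFile(zip_path) as zf:
        return {name.split('/')[0] for name in zf.namelist()}
```

For example, calling `top_level_entries('C:\\Users\\mustufain\\Desktop\\maltparser-1.8.1.zip')` and getting back a single folder name would indicate that the extracted jars sit inside that nested folder.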
+0

Thanks @L3viathan for the edit! There is a more extensive answer at https://github.com/nltk/nltk/issues/1294. – alvas

+0

Maybe you should link that in your answer. Greetings to Saarbrücken from me! – L3viathan

+0

@L3viathan No problem ;P – alvas