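The NLTK data directory stays the same, so you don't need to reinstall the data.
The code, however, lives in different places: Python 2 and Python 3 each have their own dist-packages directory.
So all you need to do is install nltk with both pip and pip3: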
pip install -U nltk
pip3 install -U nltk
But you only need to install the nltk_data directory once, for example:
# Let's delete the existing nltk_data directory and start afresh:
~$ ls nltk_data/
chunkers grammars misc sentiment taggers
corpora help models stemmers tokenizers
~$ rm -r nltk_data/
# Install the NLTK code for pip3 (Python3) and pip2 (Python2)
~$ pip3 install -U nltk
Requirement already up-to-date: nltk in /usr/local/lib/python3.5/dist-packages
Requirement already up-to-date: six in ./.local/lib/python3.5/site-packages (from nltk)
~$ pip2 install -U nltk
Requirement already up-to-date: nltk in /usr/local/lib/python2.7/dist-packages
Requirement already up-to-date: six in /usr/local/lib/python2.7/dist-packages (from nltk)
# Now, download the nltk_data directory in Python2
~$ python
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download('popular')
[nltk_data] Downloading collection u'popular'
[nltk_data] |
...
[nltk_data] | Downloading package averaged_perceptron_tagger to
[nltk_data] | /home/alvas/nltk_data...
[nltk_data] | Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] |
[nltk_data] Done downloading collection popular
True
# Now in a Python3 session, when we try to re-download the nltk_data directory,
# we see that it doesn't re-download it =)
>>> import nltk
>>> nltk.download('popular')
[nltk_data] Downloading collection 'popular'
[nltk_data] |
[nltk_data] | Downloading package cmudict to
[nltk_data] | /home/alvas/nltk_data...
[nltk_data] | Package cmudict is already up-to-date!
...
[nltk_data] | /home/alvas/nltk_data...
[nltk_data] | Package averaged_perceptron_tagger is already up-
[nltk_data] | to-date!
[nltk_data] |
[nltk_data] Done downloading collection popular
True
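The second download is a no-op because both interpreters look in the same user-level data directory (/home/alvas/nltk_data in the transcript above). Here is a minimal, stdlib-only sketch for checking what is in that directory from either interpreter. It assumes the default ~/nltk_data location plus any directories listed in the NLTK_DATA environment variable; the helper name candidate_data_dirs is made up for illustration and is not part of NLTK.

```python
import os


def candidate_data_dirs():
    """Hypothetical helper: directories worth inspecting for NLTK data.

    Assumption: data lives in ~/nltk_data or in a directory listed in
    the NLTK_DATA environment variable (entries separated by os.pathsep).
    """
    dirs = [d for d in os.environ.get("NLTK_DATA", "").split(os.pathsep) if d]
    dirs.append(os.path.join(os.path.expanduser("~"), "nltk_data"))
    return dirs


for d in candidate_data_dirs():
    if os.path.isdir(d):
        # List the top-level folders (corpora, taggers, tokenizers, ...)
        print("%s -> %s" % (d, sorted(os.listdir(d))))
    else:
        print("%s -> (not present)" % d)
```

Running the same script under python2 and python3 should print the same directory listing, since the data is not tied to one interpreter.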
Can I install a Python package just once for Python2 and have it work in Python3?
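Unfortunately, no. Packages/libraries installed for Python 2 are independent of the Python 3 environment, and vice versa. This applies not only to nltk but to other libraries as well.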
Don't think of Python 3 as a higher version of Python 2; think of them as two different languages ;P
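To see that separation for yourself, here is a small sketch that reports which interpreter is running, where it keeps pure-Python packages, and whether nltk can be imported from it. The is_installed helper is a hypothetical name for illustration. Also note that on Debian/Ubuntu a system-wide pip may install into a dist-packages directory (as in the transcript above) rather than the path sysconfig reports.

```python
import importlib
import sys
import sysconfig


def is_installed(name):
    """Hypothetical helper: True if `name` can be imported by THIS interpreter."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


print("%s (Python %d.%d)" % (sys.executable, sys.version_info[0], sys.version_info[1]))
print("package directory: %s" % sysconfig.get_paths()["purelib"])
print("nltk importable: %s" % is_installed("nltk"))
```

Save it under any file name and run it once with python2 and once with python3. If nltk was installed with only one of pip2/pip3, the last line will differ between the two runs.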