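Loading a Japanese dictionary as UTF-8 through RMeCab/MeCab

Does anyone have experience encoding MeCab's Japanese dictionary data as UTF-8? I installed MeCab and the RMeCab package in R to build a Japanese word cloud, but POS tagging does not seem to work, apparently because the dictionary data is not encoded as UTF-8.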
library("RMeCab")
library("wordcloud")
setwd('C:\\Users\\sukyu\\Desktop\\JP')
word <- RMeCabFreq("OLS_Japantext.txt")
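# Keep nouns only
word <- subset(word, Info1 == "名詞")
# Drop numerals, non-independent nouns, and suffixes
type <- c("数", "非自立", "接尾")
word <- subset(word, !Info2 %in% type)
# Sort by frequency, most frequent first
word <- word[order(word$Freq, decreasing = TRUE), ]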
pal <- brewer.pal(8,"Spectral")
par(family = "HiraKakuProN-W3")
wordcloud(word$Term, word$Freq, min.freq = 1, colors = pal,
          random.order = TRUE, scale = c(5, 4))