2016-09-14 68 views

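How to find/locate the most frequent word in a .txt file and change it to

I'm trying to figure out how to find the most frequent word in a text file and change that word so that it is wrapped in something else, for example freewordchoice (free + frequent word + choice), everywhere the word occurs in the text. I've been searching like crazy for something like this but can't find it. I'm very new to JavaScript, which is what I want to use for this. Uploading and displaying the text works fine; what I don't understand is how to locate the most frequent word and change it throughout the text before it is actually displayed in the browser. As I see it, I need some kind of variable to find the word and store it somewhere, and another variable to hold whatever should be added to or changed in the target word.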

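Example text: The Project Gutenberg Etext of Aladdin and the Wonderful Lamp

Info/problem update: The code below finds the most frequently used word in the full text of the example above. Right now that word is Aladdin. The problem is getting it to replace the word Aladdin correctly. It prints fooAladdinbar, just as I want, but instead of only changing Aladdin to fooAladdinbar, it puts fooAladdinbar between every letter of the example text.

This is solved; it was a problem with a variable.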

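Do you have some test data? What will the text file look like? What encoding? Which language-specific special characters, etc.? –

I updated my answer to cover the replacement part of your question. –

Here is the demo I'm working on: http://internetstall.nu/demo/demo.html. I just upload a simple .txt file containing text. I'm not sure what you mean by special characters; I'm trying to do this in JavaScript, but something tells me that's not what you meant. The problem has been solved – user3481279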

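Answer

This isn't perfect, but it works. Here is a demo:

(This demo only finds the most frequent word.)

  • It splits the text with a regular expression
  • then counts the words
  • then returns the most frequent word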

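// Read the raw text from the textarea.
var data = document.getElementById("data").value;

// Splitting on word boundaries also yields whitespace and punctuation tokens.
var allWords = data.split(/\b/);
var wordCountList = {};

allWords.forEach(function (word) {
    // Skip tokens without any word character (spaces, ", ", ". ", newlines).
    if (/\w/.test(word)) {
        // Call hasOwnProperty via the prototype so a word named "hasOwnProperty" cannot shadow it.
        if (!Object.prototype.hasOwnProperty.call(wordCountList, word)) {
            wordCountList[word] = {word: word, count: 0};
        }
        wordCountList[word].count++;
    }
});

// Scan the counts for the entry with the highest count.
var maxCountWord = {count: 0};
for (var propName in wordCountList) {
    var currentWord = wordCountList[propName];
    if (maxCountWord.count < currentWord.count) {
        maxCountWord = currentWord;
    }
}

console.info(maxCountWord);
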
textarea {
    width: 100%;
    height: 100px;
}

<textarea id="data">
<!-- start slipsum code -->

The path of the righteous man is beset on all sides by the iniquities of the selfish and the tyranny of evil men. Blessed is he who, in the name of charity and good will, shepherds the weak through the valley of darkness, for he is truly his brother's keeper and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who would attempt to poison and destroy My brothers. And you will know My name is the Lord when I lay My vengeance upon thee.

<!-- end slipsum code -->
</textarea>

<div id="result"></div>
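
The demo above counts "The" and "the" as different words. If that is not what you want, here is a minimal sketch of a case-insensitive variant. The function name `mostFrequentWord` is made up for this example and is not part of the demo; it also assumes plain ASCII words, since `\w` does not match accented letters.

```javascript
// Hypothetical helper (not from the demo above): returns the most frequent
// word in `text`, ignoring case, or null when the text contains no words.
function mostFrequentWord(text) {
    // \w+ matches runs of ASCII letters, digits and underscores only.
    var words = text.toLowerCase().match(/\w+/g) || [];
    var counts = new Map();
    var best = null;
    words.forEach(function (word) {
        var count = (counts.get(word) || 0) + 1;
        counts.set(word, count);
        // Keep the first word that reaches the highest count seen so far.
        if (best === null || count > best.count) {
            best = {word: word, count: count};
        }
    });
    return best;
}

console.info(mostFrequentWord("The lamp, the ring and the genie"));
```

Because it takes a plain string, it should work the same whether the text comes from a textarea or from an uploaded .txt file.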

To replace the word, you can also use a regular expression:
(This demo only replaces the most frequent word.)

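function freewordchoice(free, word, choice) {
    var data = document.getElementById("data").innerHTML;
    // Match the word as a whole word, case-insensitively, everywhere in the text.
    var replaceExpression = new RegExp("\\b" + word + "\\b", "gi");
    console.info(replaceExpression);
    // A replacer function keeps each match's original casing ("The" stays "The").
    data = data.replace(replaceExpression, function (match) {
        return free + match + choice;
    });
    document.getElementById("result").innerHTML = data;
}

freewordchoice("<b>", "the", "</b>");
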
<b>Before:</b>

<div id="data">
<!-- start slipsum code -->

The path of the righteous man is beset on all sides by the iniquities of the selfish and the tyranny of evil men. Blessed is he who, in the name of charity and good will, shepherds the weak through the valley of darkness, for he is truly his brother's keeper and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who would attempt to poison and destroy My brothers. And you will know My name is the Lord when I lay My vengeance upon thee.

<!-- end slipsum code -->
</div>

<br/><br/>

<b>After:</b>

<div id="result">
</div>
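
If the word comes from user text rather than a hard-coded string, it may contain characters that mean something special in a regular expression. Here is a rough sketch of a DOM-free version that escapes the word first. The names `escapeRegExp` and `wrapWord` are made up for this example, and the `foo`/`bar` wrappers are taken from the question.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a
// regular expression so that `word` is matched literally.
function escapeRegExp(word) {
    return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap every whole-word, case-insensitive occurrence
// of `word` in `text` with `prefix` and `suffix`.
function wrapWord(text, word, prefix, suffix) {
    if (!word) {
        // An empty pattern would match at every position in the text.
        return text;
    }
    var pattern = new RegExp("\\b" + escapeRegExp(word) + "\\b", "gi");
    return text.replace(pattern, function (match) {
        return prefix + match + suffix;
    });
}

console.info(wrapWord("Aladdin rubbed the lamp. aladdin smiled.", "Aladdin", "foo", "bar"));
```

The early return for an empty word is deliberate: without it, an empty search word would put the wrapper between every letter.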

Update:

The problem is this assignment:

common =
    'the,a,do,in,with,this,so,that,of,and,not,did,when,what,were,went,was,as,' +
    'if,who,had,at,can,you,which,while,will,to,till,then,them,their,she,' +
    'he,once,out,no,must,many,me,is,it,his,him,her,about,have,i,has,your,' +
    'would,where,whom,s,on,from,for,by,but,all,said,my,';

The problem is at the end of the string: ,said,my,'; Remove the last comma and it should work, like this:

common =
    'the,a,do,in,with,this,so,that,of,and,not,did,when,what,were,went,was,as,' +
    'if,who,had,at,can,you,which,while,will,to,till,then,them,their,she,' +
    'he,once,out,no,must,many,me,is,it,his,him,her,about,have,i,has,your,' +
    'would,where,whom,s,on,from,for,by,but,all,said,my';

Because of the trailing comma, the last word in the list is an empty string.
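
To see why an empty last word can put the replacement between every letter, here is a small sketch. I'm assuming the list is split on commas and joined into one regular expression, which may not be exactly what your script does; `buildPattern` and `buildSafePattern` are made-up names for this illustration.

```javascript
// Assumption: the word list is split on commas and the pieces are joined
// into one alternation, roughly as the asker's script may do.
function buildPattern(list) {
    return new RegExp(list.split(",").join("|"), "g");
}

// Same idea, but empty entries (from a trailing comma) are dropped first.
function buildSafePattern(list) {
    var words = list.split(",").filter(function (word) {
        return word.length > 0;
    });
    return new RegExp(words.join("|"), "g");
}

var withComma = "said,my,";

// The last entry of the split result is an empty string.
console.info(withComma.split(","));
// The empty alternative matches at every position, so X is inserted everywhere.
console.info("lamp".replace(buildPattern(withComma), "X"));
// With the empty entry filtered out, the text is left unchanged.
console.info("lamp".replace(buildSafePattern(withComma), "X"));
```

Filtering out empty entries guards against this even if a stray comma sneaks back into the list later.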

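When I try to run it I only get errors. I see there is a string in the data section, but how would I make it work with a .txt file uploaded in the browser, like in my demo here: http://internetstall.nu/demo/demo.html – user3481279

That script file isn't being loaded; please check the browser console (press F12). –

@user3481279 Did you see my latest comment? The script file is not being loaded. Was my solution useful, or do you need more help? –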
