PHP Word Crawler

How can I get all the unique words from a web page into an array (without the attributes, JavaScript, and so on)?

Can anyone help me?

2010-10-18 Simon

Well: DOM -> DOMDocument -> all the text content (nodeValue) -> explode on whitespace into an array -> then see http://stackoverflow.com/questions/3933760/how-to-remove-all-instances-of-duplicated-values-from-an-array/3933852#3933852 .. have fun – Hannes 2010-10-18 17:02:24
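The pipeline in that comment could be sketched roughly as follows. This is only a minimal illustration, not the commenter's code: the function name `uniqueWordsFromHtml` and the sample markup are made up for the example, and it assumes PHP's bundled DOM extension is enabled. Text nodes are joined with a space so that words from adjacent elements do not run together, which also means a word split across inline tags would be counted as two.

```php
<?php
// Hypothetical helper (not from the thread): unique words in the visible text of an HTML string.
function uniqueWordsFromHtml(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is often malformed; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Collect every text node that is not inside <script> or <style>.
    $xpath = new DOMXPath($doc);
    $parts = [];
    foreach ($xpath->query('//text()[not(ancestor::script) and not(ancestor::style)]') as $textNode) {
        $parts[] = $textNode->nodeValue;
    }

    // Split on whitespace, drop empty pieces, then remove duplicates and reindex.
    $words = preg_split('/\s+/', implode(' ', $parts), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($words));
}

// Sample markup invented for this example.
$html = '<html><head><style>p { color: red }</style></head>'
      . '<body><p class="x">hello world</p><script>var a = 1;</script><p>hello again</p></body></html>';
print_r(uniqueWordsFromHtml($html)); // unique words: hello, world, again
```

For a live page, the HTML string could come from `file_get_contents($url)`. Note that `loadHTML` guesses the encoding when the page does not declare one, so non-ASCII pages may need extra handling.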

Take a look at http://simplehtmldom.sourceforge.net/

Then do something like this:

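<?php

// Simple HTML DOM parser, from http://simplehtmldom.sourceforge.net/
include_once('simplehtmldom/simple_html_dom.php');

// ->plaintext is the page text with the markup stripped
$string = file_get_html('http://www.google.com')->plaintext;
// Split on whitespace, commas and periods; -1 means "no limit"
// (passing null for the limit is deprecated as of PHP 8.1)
$words = preg_split('/[\s,.]+/', $string, -1, PREG_SPLIT_NO_EMPTY);

var_dump(array_unique($words));

?>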
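One thing to keep in mind with the snippet above: `array_unique` compares strings case-sensitively, so "Hello" and "hello" count as two words, and splitting only on whitespace, commas and periods leaves other punctuation attached. If that matters, a normalizing step along these lines might help. This is a sketch under assumptions: `normalizedUniqueWords` is a made-up name, the input is assumed to be UTF-8, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper (not from the answer): case-insensitive unique words.
function normalizedUniqueWords(string $text): array
{
    // Lower-case first so that "Hello" and "hello" collapse into one entry.
    $lower = mb_strtolower($text, 'UTF-8');
    // Split on any run of characters that are neither letters nor digits.
    // preg_split returns false on invalid UTF-8, so fall back to an empty list.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $lower, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    // array_unique keeps the original keys; array_values reindexes from 0.
    return array_values(array_unique($words));
}

print_r(normalizedUniqueWords('Hello, hello world! World?')); // unique words: hello, world
```

Splitting on every non-letter, non-digit character also breaks apart contractions and hyphenated words, so the pattern may need adjusting for the text at hand.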


2010-10-19 00:14:48 Xhantar

Try get_text. This article should help you: http://mel.melaxis.com/devblog/2005/08/06/localizing-php-web-sites-using-gettext/


2010-10-18 17:47:09 key

Can you give an example? I don't understand it right now. – Simon 2010-10-18 22:48:16
