2017-01-21 36 views
1

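Compare two texts and highlight only the differences

I have a script that compares two texts and highlights the words that differ, but it does not work properly. Many words are marked as different when they are not. For example, words such as "that" and "the" are not matched correctly, and if they sit between two words that have changed, they are marked as changed too. I am attaching an image.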

<?php 


$old = 'The one-page order, which Mr. Trump signed in a hastily arranged Oval Office ceremony shortly before departing for the inaugural balls, gave no specifics about which aspects of the law it was targeting. But its broad language gave federal agencies wide latitude to change, delay or waive provisions of the law that they deemed overly costly for insurers, drug makers, doctors, patients or states, suggesting that it could have wide-ranging impact, and essentially allowing the dismantling of the law to begin even before Congress moves to repeal it.'; 


$new = 'The one-page order, which Mr. Trump signed in a unexpectedly organized Oval workplace rite quickly before departing for the inaugural balls, gave no specifics approximately which components of the law it became targeting. But its large language gave federal organizations huge range to exchange, put off or waive provisions of the law that they deemed overly luxurious for insurers, drug makers, docs, sufferers or states, suggesting that it could have wide-ranging effect, and basically permitting the dismantling of the regulation to start even before Congress moves to repeal it.'; 



$oldArr = preg_split('/\s+/', $old);// old (initial) text split into words 
$newArr = preg_split('/\s+/', $new);// new text split into words 
$resArr = array(); 

$oldCount = count($oldArr)-1; 
$newCount = count($newArr)-1; 

$tmpOld = 0;// marker position for old (initial) string 
$tmpNew = 0;// marker position for new (modified) string 
$end = 0;// loop-control flag, the loop stops once it is set to 1 

// loop until $end is set 
while($end == 0){ 
// if marker position is less than or equal to the last index of the initial text 
// to make sure we don't overshoot the max length 
if($tmpOld <= $oldCount){ 
// we check if current words from both string match, at the current marker positions 
if($oldArr[$tmpOld] === $newArr[$tmpNew]){ 
// if they match, nothing has been modified, we push the word into results and increment both markers 
array_push($resArr,$oldArr[$tmpOld]); 
$tmpOld++; 
$tmpNew++; 
}else{ 
// if the words don't match, we need to check for recurrence of the searched word in the entire new string 
$foundKey = array_search($oldArr[$tmpOld],$newArr,TRUE); 
// if we find it 
if($foundKey != '' && $foundKey > $tmpNew){ 
// we get all the words from the new string between the current marker and the foundKey exclusive 
// and we place them into results, marking them as new words 
for($p=$tmpNew;$p<$foundKey;$p++){ 
array_push($resArr,'<span class="new-word">'.$newArr[$p].'</span>'); 
} 
// after that, we insert the found word as unmodified 
array_push($resArr,$oldArr[$tmpOld]); 
// and we increment old marker position by 1 
$tmpOld++; 
// and set the new marker position at the found key position, plus one 
$tmpNew = $foundKey+1; 
}else{ 
// if the word wasn't found it means it has been deleted 
// and we need to add it to results, marked as deleted 
array_push($resArr,'<span class="old-word">'.$oldArr[$tmpOld].'</span>'); 
// and increment the old marker by one 
$tmpOld++; 
} 
} 
}else{ 
$end = 1; 
} 
} 

$textFinal = ''; 
foreach($resArr as $val){ 
$textFinal .= $val.' '; 
} 
echo "<p>".$textFinal."</p>"; 
?> 
<style> 
body { 
background-color: #2A2A2A; 
} 

@font-face { 
font-family: 'Eras Light ITC'; 
font-style: normal; 
font-weight: normal; 
src: local('Eras Light ITC'), url('ERASLGHT.woff') format('woff'); 
} 

p { 
font-family: 'Eras Light ITC', Arial; 
color:white; 
} 

.new-word{background:rgba(1, 255, 133, 0.9);color:black;font-weight: bold;} 
.new-word:after{background:rgba(1, 255, 133, 0.9)} 
.old-word{text-decoration:none; position:relative;background:rgba(215, 40, 40, 0.9);} 
.old-word:after{ 


} 
</style> 

Example:

Example image result

Why are these words marked as different if they have not changed? Regards!

Answer

0

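I went through your code and tried several different cases, and I think your algorithm is wrong.

For example, if you put "one-page" in place of another word, you will see it reported as a mismatch. The reason is that, whenever two words do not match, you search the entire array for the unmatched word. If that word also occurs at a position you have already passed (a lower index), the algorithm fails.

To see this, you can use the following variables.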

$old = 'for costly for insurers.'; 
$new = 'for luxurious for insurers.'; 

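With this setup, once the "costly"/"luxurious" mismatch has been found, your code goes on to match the following word, "for". But the array_search call returns the position of the "for" at the very beginning of your string. That index lies before the current marker in the new text, so the check on $foundKey fails and the unchanged "for" is marked as deleted.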

$foundKey = array_search($oldArr[$tmpOld],$newArr,TRUE); 
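
To make the failure concrete, here is a minimal sketch that reproduces this lookup with the two test strings above. The value 1 for $tmpNew is an assumption obtained by tracing the question's loop by hand up to the second "for" of the old text.

```php
<?php
// Reproduces the lookup from the question for the word "for" at old index 2.
$newArr = preg_split('/\s+/', 'for luxurious for insurers.');
$tmpNew = 1; // marker position in the new text at that point (hand-traced)

$foundKey = array_search('for', $newArr, true);
var_dump($foundKey);           // int(0): the first "for", not the one at index 2
var_dump($foundKey > $tmpNew); // bool(false): so "for" is treated as deleted
```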

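So you should change this part so that it searches differently. You could write your own version of array_search that accepts a "starting_indices" argument. (Or perhaps you could unset the elements of the array that have already been matched successfully.)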
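
One possible shape for the first suggestion is sketched below. The names array_search_from and highlight_diff are hypothetical (neither is a PHP built-in), and the loop is a condensed rewrite of the one in the question, not a drop-in patch. It assumes 0-indexed lists such as the ones preg_split returns, and it also appends any words left over at the end of the new text.

```php
<?php
// Hypothetical helper, not a PHP built-in: strict search that ignores
// every index below $start. Returns the index, or false if not found.
function array_search_from(array $haystack, $needle, int $start = 0)
{
    $count = count($haystack);
    for ($i = max(0, $start); $i < $count; $i++) {
        if ($haystack[$i] === $needle) {
            return $i;
        }
    }
    return false;
}

// Hypothetical wrapper around the question's loop, using the helper above.
function highlight_diff(string $old, string $new): string
{
    // trim() avoids empty tokens from leading or trailing whitespace
    $oldArr = preg_split('/\s+/', trim($old));
    $newArr = preg_split('/\s+/', trim($new));
    $newCount = count($newArr);
    $resArr = [];
    $tmpNew = 0; // marker position in the new text

    foreach ($oldArr as $word) {
        // same word at the current marker: unchanged
        if ($tmpNew < $newCount && $word === $newArr[$tmpNew]) {
            $resArr[] = $word;
            $tmpNew++;
            continue;
        }
        // only look at or after the marker, never behind it
        $foundKey = array_search_from($newArr, $word, $tmpNew);
        if ($foundKey !== false) {
            // everything between the marker and the match is new
            for ($p = $tmpNew; $p < $foundKey; $p++) {
                $resArr[] = '<span class="new-word">' . $newArr[$p] . '</span>';
            }
            $resArr[] = $word;
            $tmpNew = $foundKey + 1;
        } else {
            // not found after the marker: the word was deleted
            $resArr[] = '<span class="old-word">' . $word . '</span>';
        }
    }
    // words remaining in the new text were added at the end
    for ($p = $tmpNew; $p < $newCount; $p++) {
        $resArr[] = '<span class="new-word">' . $newArr[$p] . '</span>';
    }
    return implode(' ', $resArr);
}

echo highlight_diff('for costly for insurers.', 'for luxurious for insurers.');
// for <span class="old-word">costly</span> <span class="new-word">luxurious</span> for insurers.
```

Note the comparison against false with !==, since array_search style lookups can legitimately return index 0. This remains a greedy match: a common word that only reappears much later in the new text can still pull the marker too far forward, so a diff based on the longest common subsequence is likely to be more robust for long texts.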

+0

I can't work out how to fix the fault. I'm stuck and I don't know how to solve it... – JotaMarkes