Performance of str_replace in PHP

2011-12-13

Here I have two ways of using str_replace to replace strings in a given phrase.

// Method 1 
$phrase = "You should eat fruits, vegetables, and fiber every day."; 
$healthy = array("fruits", "vegetables", "fiber"); 
$yummy = array("pizza", "beer", "ice cream"); 
$phrase = str_replace($healthy, $yummy, $phrase); 

// Method 2 
$phrase = "You should eat fruits, vegetables, and fiber every day."; 
$phrase = str_replace("fruits", "pizza", $phrase); 
$phrase = str_replace("vegetables", "beer", $phrase); 
$phrase = str_replace("fiber", "ice cream", $phrase); 

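Which method is more efficient (in terms of execution time and resources used)?

Assume the real phrase is much longer (e.g. 50,000 characters) and there are many more pairs of words to replace.

My thinking is that Method 2 calls str_replace 3 times, which costs more function calls; on the other hand, Method 1 creates 2 arrays, and str_replace has to parse those 2 arrays at runtime.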

+0

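Method 1 is faster. – djot

+1

Neither is a good choice. If you have a very long string that needs str_replace over and over, why not save the result after str_replace? – ajreal

+0

If you create the arrays $healthy and $yummy over and over inside the loop, it will be slower than declaring them outside it. – djot

Answers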

5

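I would prefer Method 1, as it is cleaner and more organized. Method 1 also makes it possible to take the pairs from another source, e.g. a bad-words table in a database. Method 2 would need an extra loop for that.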
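To illustrate the point about taking pairs from another source, here is a minimal sketch. The `$rows` array and its `search`/`replacement` keys are hypothetical stand-ins for a database result set; they are not from the original post.

```php
<?php
// Hypothetical rows, standing in for a result set fetched from a database table.
$rows = array(
    array('search' => 'fruits',     'replacement' => 'pizza'),
    array('search' => 'vegetables', 'replacement' => 'beer'),
    array('search' => 'fiber',      'replacement' => 'ice cream'),
);

// Split the rows into the two parallel arrays that str_replace expects.
$search  = array_column($rows, 'search');
$replace = array_column($rows, 'replacement');

$phrase = "You should eat fruits, vegetables, and fiber every day.";
$result = str_replace($search, $replace, $phrase);

echo $result, PHP_EOL; // You should eat pizza, beer, and ice cream every day.
```

With Method 1 the number of pairs can change without touching the replacement code; only the data changes.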

<?php 
$time_start = microtime(true); 
for($i=0;$i<=1000000;$i++){ 
    // Method 1 
    $phrase = "You should eat fruits, vegetables, and fiber every day."; 
    $healthy = array("fruits", "vegetables", "fiber"); 
    $yummy = array("pizza", "beer", "ice cream"); 
    $phrase = str_replace($healthy, $yummy, $phrase); 
} 
$time_end = microtime(true); 
$time = $time_end - $time_start; 
echo "Did Test 1 in ($time seconds)\n<br />"; 



$time_start = microtime(true); 
for($i=0;$i<=1000000;$i++){ 
    // Method2 
    $phrase = "You should eat fruits, vegetables, and fiber every day."; 
    $phrase = str_replace("fruits", "pizza", $phrase); 
    $phrase = str_replace("vegetables", "beer", $phrase); 
    $phrase = str_replace("fiber", "ice cream", $phrase); 

} 
$time_end = microtime(true); 
$time = $time_end - $time_start; 
echo "Did Test 2 in ($time seconds)\n"; 
?> 

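Did Test 1 in (3.6321988105774 seconds)

Did Test 2 in (2.8234610557556 seconds)


Edit: On further testing, with the string repeated 50,000 times, fewer iterations, and the suggestion from ajreal applied, the difference is very small.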

<?php 
$phrase = str_repeat("You should eat fruits, vegetables, and fiber every day.",50000); 
$healthy = array("fruits", "vegetables", "fiber"); 
$yummy = array("pizza", "beer", "ice cream"); 

$time_start = microtime(true); 
for($i=0;$i<=10;$i++){ 
    // Method 1 
    $phrase = str_replace($healthy, $yummy, $phrase); 
} 
$time_end = microtime(true); 
$time = $time_end - $time_start; 
echo "Did Test 1 in ($time seconds)\n<br />"; 



$time_start = microtime(true); 
for($i=0;$i<=10;$i++){ 
    // Method2 
    $phrase = str_replace("fruits", "pizza", $phrase); 
    $phrase = str_replace("vegetables", "beer", $phrase); 
    $phrase = str_replace("fiber", "ice cream", $phrase); 

} 
$time_end = microtime(true); 
$time = $time_end - $time_start; 
echo "Did Test 2 in ($time seconds)\n"; 
?> 

Did Test 1 in (1.1450328826904 seconds)

Did Test 2 in (1.3119208812714 seconds)

+0

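But your test shows Method 2 performing better? – ajreal

+0

Yes, but I'd sacrifice a fraction of 0.9 seconds over 1 million iterations for better coding and scalability. –

+1

May I suggest you put the array declarations outside the loop? – ajreal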

3

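Even though this is old, this benchmark is incorrect.

Thanks to an anonymous user:

"This test is wrong, because when Test 3 starts, $phrase holds the result of Test 2, in which there is nothing left to replace.

When I add $... before Test 3, the results are: Test 1 in (4.3436799049377 seconds), Test 2 in (5.7581660747528 seconds), Test 3 in (7.5069718360901 seconds)."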

 <?php 
     $time_start = microtime(true); 

     $healthy = array("fruits", "vegetables", "fiber"); 
     $yummy = array("pizza", "beer", "ice cream"); 

     for($i=0;$i<=1000000;$i++){ 
      // Method 1 
      $phrase = "You should eat fruits, vegetables, and fiber every day."; 
      $phrase = str_replace($healthy, $yummy, $phrase); 
     } 
     $time_end = microtime(true); 
     $time = $time_end - $time_start; 
     echo "Did Test 1 in ($time seconds)<br /><br />"; 



     $time_start = microtime(true); 
     for($i=0;$i<=1000000;$i++){ 
      // Method2 
      $phrase = "You should eat fruits, vegetables, and fiber every day."; 
      $phrase = str_replace("fruits", "pizza", $phrase); 
      $phrase = str_replace("vegetables", "beer", $phrase); 
      $phrase = str_replace("fiber", "ice cream", $phrase); 

     } 
     $time_end = microtime(true); 
     $time = $time_end - $time_start; 
     echo "Did Test 2 in ($time seconds)<br /><br />"; 




     $time_start = microtime(true); 
     for($i=0;$i<=1000000;$i++){ 
       foreach ($healthy as $k => $v) { 
        if (strpos($phrase, $healthy[$k]) === FALSE) 
        unset($healthy[$k], $yummy[$k]); 
       }           
       if ($healthy) $new_str = str_replace($healthy, $yummy, $phrase); 

     } 
     $time_end = microtime(true); 
     $time = $time_end - $time_start; 
     echo "Did Test 3 in ($time seconds)<br /><br />"; 

     ?> 

Did Test 1 in (3.5785729885101 seconds)

Did Test 2 in (3.8501658439636 seconds)

Did Test 3 in (0.13844394683838 seconds)
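The anonymous user's objection can be checked in isolation. The following is a minimal sketch, assuming the same phrase and word lists as in the benchmark: once Method 2 has run, none of the search words remain in $phrase, so every strpos check in Test 3 fails and there is nothing left for it to replace.

```php
<?php
$healthy = array("fruits", "vegetables", "fiber");

// State of $phrase at the end of Test 2.
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$phrase = str_replace("fruits", "pizza", $phrase);
$phrase = str_replace("vegetables", "beer", $phrase);
$phrase = str_replace("fiber", "ice cream", $phrase);

// Collect the search words that are still present in the phrase.
$remaining = array_filter($healthy, function ($word) use ($phrase) {
    return strpos($phrase, $word) !== false;
});

echo count($remaining), PHP_EOL; // 0
```

Because Test 3 starts from this already-replaced phrase, its low timing reflects skipped work rather than a faster replacement strategy.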

1

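@djot, you have an error here:

<?php
    foreach ($healthy as $k => $v) {
        if (strpos($phrase, $healthy[$k]) === FALSE)
            // unset() modifies the shared $healthy and $yummy arrays,
            // so every later loop iteration sees the shrunken arrays
            unset($healthy[$k], $yummy[$k]);
    }

Here is a fixed version, plus a better/simpler new Test 4: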

<?php 
$time_start = microtime(true); 

     $healthy = array("fruits", "vegetables", "fiber"); 
     $yummy = array("pizza", "beer", "ice cream"); 

     for($i=0;$i<=1000000;$i++){ 
      // Method 1 
      $phrase = "You should eat fruits, vegetables, and fiber every day."; 
      $phrase = str_replace($healthy, $yummy, $phrase); 
     } 
     $time_end = microtime(true); 
     $time = $time_end - $time_start; 
     echo "Did Test 1 in ($time seconds)". PHP_EOL. PHP_EOL; 



     $time_start = microtime(true); 
     for($i=0;$i<=1000000;$i++){ 
      // Method2 
      $phrase = "You should eat fruits, vegetables, and fiber every day."; 
      $phrase = str_replace("fruits", "pizza", $phrase); 
      $phrase = str_replace("vegetables", "beer", $phrase); 
      $phrase = str_replace("fiber", "ice cream", $phrase); 

     } 
     $time_end = microtime(true); 
     $time = $time_end - $time_start; 
     echo "Did Test 2 in ($time seconds)" . PHP_EOL. PHP_EOL; 




     $time_start = microtime(true); 
     for($i=0;$i<=1000000;$i++){ 
      $a = $healthy; 
      $b = $yummy; 
       foreach ($healthy as $k => $v) { 
        if (strpos($phrase, $healthy[$k]) === FALSE) 
        unset($a[$k], $b[$k]); 
       }           
       if ($a) $new_str = str_replace($a, $b, $phrase); 

     } 
     $time_end = microtime(true); 
     $time = $time_end - $time_start; 
     echo "Did Test 3 in ($time seconds)". PHP_EOL. PHP_EOL; 



     $time_start = microtime(true); 
     for($i=0;$i<=1000000;$i++){ 
      $ree = false; 
      foreach ($healthy as $k) { 
       if (strpos($phrase, $k) !== FALSE) { //something to replace 
        $ree = true; 
        break; 
       } 
      }           
      if ($ree === true) { 
       $new_str = str_replace($healthy, $yummy, $phrase); 
      } 
     } 
     $time_end = microtime(true); 
     $time = $time_end - $time_start; 
     echo "Did Test 4 in ($time seconds)". PHP_EOL. PHP_EOL; 

Did Test 1 in (0.38219690322876 seconds)

Did Test 2 in (0.42352104187012 seconds)

Did Test 3 in (0.47777700424194 seconds)

Did Test 4 in (0.19691610336304 seconds)
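Note that in the script above, Tests 3 and 4 still start from the $phrase left behind by Test 2. As a rough sketch of one way to give every variant the same input, each variant below is a closure that always starts from the untouched phrase. The `bench` helper and the variant names are hypothetical, and the closure call adds its own overhead, so the numbers are only comparable with each other.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns elapsed seconds.
function bench(callable $fn, $iterations) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$original = "You should eat fruits, vegetables, and fiber every day.";
$healthy  = array("fruits", "vegetables", "fiber");
$yummy    = array("pizza", "beer", "ice cream");

$tests = array(
    // Method 1: one call with arrays.
    'array call' => function () use ($original, $healthy, $yummy) {
        return str_replace($healthy, $yummy, $original);
    },
    // Method 2: three separate calls.
    'three calls' => function () use ($original) {
        $p = str_replace("fruits", "pizza", $original);
        $p = str_replace("vegetables", "beer", $p);
        return str_replace("fiber", "ice cream", $p);
    },
    // Test 4 idea: only call str_replace if some search word is present.
    'strpos guard' => function () use ($original, $healthy, $yummy) {
        foreach ($healthy as $word) {
            if (strpos($original, $word) !== false) {
                return str_replace($healthy, $yummy, $original);
            }
        }
        return $original;
    },
);

foreach ($tests as $name => $fn) {
    printf("%s: %.4f seconds\n", $name, bench($fn, 100000));
}
```

Since every closure reads from $original and never writes to it, the order in which the variants run no longer affects what each one has to replace.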

0

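Although it was not directly asked in the question, the OP does state:

Assume the real phrase is much longer (e.g. 50,000 characters) and there are many more pairs of words to replace.

In that case, if you don't need (or want) replacements within replacements, it may be more efficient to use a preg_replace_callback solution, so that the whole string is processed only once rather than once per pair, and no replacement is ever applied to the output of another.

Here is a generic function which, in my case, with a 1.5 MB string and ~20,000 replacement pairs, was about 10 times faster. However, because the replacements have to be split into chunks to avoid the "regular expression is too large" error, indeterminate replacements within replacements may then occur (in my particular case this was not possible).

In my particular case I was able to optimize further, to about a 100x performance gain, because my search strings all followed a particular pattern. (PHP version 7.1.11 on Windows 7 32-bit.)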
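The difference between the two behaviours can be seen on a tiny input. This is a minimal sketch with made-up sample data (the "cat"/"dog" pairs are not from the original post), comparing str_replace with arrays against a single-pass preg_replace_callback:

```php
<?php
$subject = "cat chases dog";
$search  = array("cat", "dog");
$replace = array("dog", "cat");

// str_replace applies each pair in turn to the output of the previous pair,
// so the "dog" produced by the first pair is replaced again by the second.
$sequential = str_replace($search, $replace, $subject);

// Single pass: one alternation pattern, each match looked up exactly once.
$lookup  = array_combine($search, $replace);
$pattern = '/' . implode('|', array_map(function ($s) {
    return preg_quote($s, '/');
}, $search)) . '/';
$singlePass = preg_replace_callback($pattern, function ($m) use ($lookup) {
    return $lookup[$m[0]];
}, $subject);

echo $sequential, PHP_EOL; // cat chases cat
echo $singlePass, PHP_EOL; // dog chases cat
```

Which result is correct depends on the use case; the single-pass form only fits when replacements within replacements are not wanted.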

function str_replace_bulk($search, $replace, $subject, &$count = null) {
    // Assumes $search and $replace are equal sized arrays
    $lookup = array_combine($search, $replace);
    $result = preg_replace_callback(
        '/' .
            implode('|', array_map(
                function($s) {
                    return preg_quote($s, '/');
                },
                $search
            )) .
        '/',
        function($matches) use($lookup) {
            return $lookup[$matches[0]];
        },
        $subject,
        -1,
        $count
    );
    if (
        $result !== null ||
        count($search) < 2 // avoid infinite recursion on error
    ) {
        return $result;
    }
    // With a large number of replacements (> ~2500?),
    // PHP bails because the regular expression is too large.
    // Split the search and replacements in half and process each separately.
    // NOTE: replacements within replacements may now occur, indeterminately.
    $split = (int)(count($search)/2);
    error_log("Splitting into 2 parts with ~$split replacements");
    $result = str_replace_bulk(
        array_slice($search, $split),
        array_slice($replace, $split),
        str_replace_bulk(
            array_slice($search, 0, $split),
            array_slice($replace, 0, $split),
            $subject,
            $count1
        ),
        $count2
    );
    $count = $count1 + $count2;
    return $result;
}