2017-07-24 49 views
-1
<a href="/position/memory1"> kw random</a> 
<a href="/position/memory2"> kw2 random2</a> 
<a href="/position/memory3"> 123 orange</a> 
<a href="/position/memory4"> test apple</a> 
<a href="/position/memory5"> bla</a> 
<div> 
    <a href="//examples.com/position/keyword1"> kw random</a> 
    <a href="//examples.com/position/keyword2"> kw2 random2</a> 
    <a href="//examples.com/position/keyword3" rel="nofollow"> 123 orange</a> 
    <a href="//examples.com/position/keyword4"> test apple</a> 
    <a href="//examples.com/position/keyword5" title="something"> bla</a> 
</div> 

How can I extract keyword1, keyword2, keyword3, keyword4 and keyword5 into a PHP array? (PHP preg_match_all: all matched elements into an array)

Answers

0

If the keyword ALWAYS comes right after <a href="//examples.com/position/, this does the job:

$html = <<<EOD 
<a href="/position/memory1"> kw random</a> 
<a href="/position/memory2"> kw2 random2</a> 
<a href="/position/memory3"> 123 orange</a> 
<a href="/position/memory4"> test apple</a> 
<a href="/position/memory5"> bla</a> 
<div> 
    <a href="//examples.com/position/keyword1"> kw random</a> 
    <a href="//examples.com/position/keyword2"> kw2 random2</a> 
    <a href="//examples.com/position/keyword3" rel="nofollow"> 123 orange</a> 
    <a href="//examples.com/position/keyword4"> test apple</a> 
    <a href="//examples.com/position/keyword5" title="something"> bla</a> 
</div> 
EOD; 

preg_match_all('~<a href="//examples\.com/position/([^"]+)~', $html, $matches); 
var_dump($matches[1]); 

Output:

array(5) { 
    [0]=> 
    string(8) "keyword1" 
    [1]=> 
    string(8) "keyword2" 
    [2]=> 
    string(8) "keyword3" 
    [3]=> 
    string(8) "keyword4" 
    [4]=> 
    string(8) "keyword5" 
} 
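
As an alternative to the regex above, here is a minimal sketch that reads the href attributes through PHP's DOM extension, so attribute order and spacing inside the tag stop mattering. It assumes the DOM extension is enabled; extractKeywords() and its $prefix parameter are hypothetical names chosen for illustration, not part of the original answer.

```php
<?php
// Sketch: collect the part of each href that follows a fixed prefix,
// using DOMDocument instead of a regular expression.
function extractKeywords(string $html, string $prefix = '//examples.com/position/'): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $keywords = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        // Keep only links that start with the prefix; /position/memoryN is skipped.
        if (strncmp($href, $prefix, strlen($prefix)) === 0) {
            $keywords[] = substr($href, strlen($prefix));
        }
    }
    return $keywords;
}
```

Calling extractKeywords($html) on the markup from the question should return the same five keywords as the regex version.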
+0

Thanks a lot, it works! – vagiz

0

Just use the preg_match_all function:

// $lines is your string 
// [^"]+ cannot cross a double quote, so every href yields its own match 
preg_match_all('~(?<=/position/)[^"]+(?=")~', $lines, $output_array); 

var_dump($output_array); 
+0

I added: preg_match_all("/(?<=\/\/examples.com\/position\/).+(?=[^\"])/", $lines, $output_array); but it does not split into a new array element after the first double quote – vagiz
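
The behaviour described in this comment is what a greedy .+ would be expected to produce: it runs to the end of the line and only backtracks as far as the lookahead requires. The sketch below contrasts it with a negated character class on one sample line from the question. The patterns are simplified for illustration and are not the commenter's exact expression.

```php
<?php
// One sample line taken from the question's markup.
$line = '<a href="//examples.com/position/keyword3" rel="nofollow"> 123 orange</a>';

// Greedy: .+ consumes the rest of the line, then backtracks to the LAST double quote.
preg_match('~(?<=//examples\.com/position/).+(?=")~', $line, $greedy);

// Negated class: [^"]+ stops at the FIRST double quote, i.e. the end of the href value.
preg_match('~(?<=//examples\.com/position/)[^"]+~', $line, $bounded);

echo $greedy[0], PHP_EOL;   // keyword3" rel="nofollow
echo $bounded[0], PHP_EOL;  // keyword3
```

With preg_match_all the bounded pattern yields one array element per link, because no single match can run past its own closing quote.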

0

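You can do something like this: capture the href value and the anchor text, then check each captured href against your keywords. It should be self-explanatory.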

<?php 
$data = ' 
<a href="/position/memory1"> Bkw random</a> 
<a href="/position/memory2">B kw2 random2</a> 
<a href="/position/memory3"> 123 orange</a> 
<a href="/position/memory4"> test apple</a> 
<a href="/position/memory5"> bla</a> 
<a href="//examples.com/position/keyword1"> Akw random</a> 
<a href="//examples.com/position/keyword2"> Akw2 random2</a> 
<a href="//examples.com/position/keyword3" rel="nofollow"> 123 orange</a> 
<a href="//examples.com/position/keyword4"> test apple</a> 
<a href="//examples.com/position/keyword5" title="something"> bla</a> 
'; 


$matches = []; 
$needles = ['keyword1', 'keyword2', 'keyword3', 'keyword4', 'keyword5']; 

preg_match_all('#<a\s+href\s*=\s*"([^"]+)"[^>]*>([^<]+)</a>#i', $data, $matches, PREG_SET_ORDER); 

foreach ($matches as $match) {
    foreach ($needles as $needle) {
        if (stristr($match[1], $needle) !== false) {
            echo $match[2];
        }
    }
}

Not sure I follow your comment. The matches contain what I think you need...

//   $match[1]    $match[2] 
//<a href=" |/position/memory1| "> |Bkw random| </a> 
+0

The idea is good, but my keywordX values are dynamic, so a fixed list would only collect some of the keywords – vagiz

+0

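Sorry about that: the keywordX values are dynamic, and I also don't want to collect the other links with href="/position/...", only those with href="domain/position/XXXXX", up to the double quote – vagiz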
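
One way to reconcile the answer above with these two comments, sketched under assumptions: keep the same capture pattern for href and anchor text, but filter on the domain prefix instead of a fixed needle list, so the keywords can be anything. keywordsWithText() and its $prefix parameter are hypothetical names introduced here.

```php
<?php
// Sketch: map each dynamic keyword to its anchor text, keeping only
// hrefs that start with the given domain prefix.
function keywordsWithText(string $html, string $prefix = '//examples.com/position/'): array
{
    preg_match_all(
        '#<a\s+href\s*=\s*"([^"]+)"[^>]*>([^<]+)</a>#i',
        $html,
        $matches,
        PREG_SET_ORDER
    );

    $result = [];
    foreach ($matches as $match) {
        // $match[1] is the href value, $match[2] the anchor text.
        if (strpos($match[1], $prefix) === 0) {
            $keyword = substr($match[1], strlen($prefix));
            $result[$keyword] = trim($match[2]);
        }
    }
    return $result;
}
```

Relative links such as /position/memory1 never start with the prefix, so they are dropped without having to list the keywords in advance.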