2014-09-02 89 views
0

Regex/substring extraction of all matching patterns/groups

I get this as the response to an API hit:

1735 Queries 

Taking 1.001303 to 31.856310 seconds to complete 

SET timestamp=XXX; 
SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 

38 Queries 

Taking 1.007646 to 5.284330 seconds to complete 

SET timestamp=XXX; 
show slave status; 

6 Queries 

Taking 1.021271 to 1.959838 seconds to complete 

SET timestamp=XXX; 
SHOW SLAVE STATUS; 

2 Queries 

Taking 4.825584, 18.947725 seconds to complete 

use marketing; 
SET timestamp=XXX; 
SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 

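I have extracted this part of the response HTML and now hold it as a string. I need to retrieve the values as concisely as possible, as a map of the form (query -> T1 to T2 seconds). Basically, this is the status of all slow queries running on a MySQL slave server, and I am building an alerting system. So, from the whole paragraph as a string, I need to separate out the queries and save the corresponding time range for each one. 1.001303 to 31.856310 is a time range, and the query corresponding to that time range is: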

SET timestamp=XXX; SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 

I want to save this information in a Scala map, of the form (query:String->timeRange:String).

Another example:

("use marketing; SET timestamp=XXX; SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified xyz ;"->"4.825584 to 18.947725 seconds") 

"""(.)###()### \n\n(.*)###""".r.findAllIn(reqSlowQueryData).matchData foreach { m => println("group0" + m.group(1) + "next group" + m.group(2) + m.group(3)) }

I am using the statement above to extract the repeating units so that I can operate on them later, but it does not seem to work.

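Thanks in advance! I know there are several ways to do this, but all the ones that come to mind are inefficient and tedious. I need to do this in Scala. Perhaps I could extract the pieces recursively with the substring method?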
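One likely reason a single regex fails on this input is that `.` does not match line terminators by default in Java/Scala regexes, so a group cannot span the multi-line SQL unless the `(?s)` flag is set. The sketch below assumes every block has the shape shown in the sample ("N Queries", then a "Taking ... seconds to complete" line, then the SQL); the names `BlockRegexDemo`, `Block` and `blocks` are made up for illustration.

```scala
object BlockRegexDemo {
  // (?m): ^ matches at the start of every line.
  // (?s): . also matches newlines, so group 4 can span several SQL lines.
  // The lookahead stops group 4 at the next "N Queries" header or at end of input.
  val Block =
    """(?ms)^(\d+) Queries\s+Taking (\d+\.\d+)(?: to |, )(\d+\.\d+) seconds to complete\s+(.*?)(?=^\d+ Queries|\z)""".r

  // Returns (query count, lower bound, upper bound, SQL text) for each block.
  def blocks(report: String): List[(Int, String, String, String)] =
    Block.findAllMatchIn(report).map { m =>
      (m.group(1).toInt, m.group(2), m.group(3), m.group(4).trim)
    }.toList
}
```

`findAllMatchIn` yields one `Match` per block, so each group can be read with `m.group(n)` without any manual substring bookkeeping.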

+0

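Like the near-duplicate you posted yesterday, it is unclear what you want. Please edit your question and format it more clearly. Then tell us what you have tried and where you are stuck, because this is not a "give me the code" site. – 2014-09-02 07:24:14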

+3

Possible duplicate of [Intelligent and quick way to parse string to get required data](http://stackoverflow.com/questions/25608460/intelligent-and-quick-way-to-parse-string-to-get-required-data) – 2014-09-02 07:24:44

Answers

1

If you want to use Scala, try:

val regex = """(\d+)\.(\d+).*?(\d+)\.(\d+) seconds""".r // extract range, e.g. "1.001303 to 31.856310 seconds"

    val txt = """ 
       |1735 Queries 
       | 
       |Taking 1.001303 to 31.856310 seconds to complete 
       | 
       |SET timestamp=XXX; SELECT * FROM ABC_EM WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 
       | 
       |38 Queries 
       | 
       |Taking 1.007646 to 5.284330 seconds to complete 
       | 
       |SET timestamp=XXX; show slave status; 
       | 
       |6 Queries 
       | 
       |Taking 1.021271 to 1.959838 seconds to complete 
       | 
       |SET timestamp=XXX; SHOW SLAVE STATUS; 
       | 
       |2 Queries 
       | 
       |Taking 4.825584, 18.947725 seconds to complete 
       | 
       |use marketing; SET timestamp=XXX; SELECT * FROM ABC WHERE last_modified >= 'XXX' AND last_modified < 'XXX'; 
    """.stripMargin 


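def logToMap(txt: String): Map[String, String] = {
  // State: (range from the most recent "Taking ..." line, result map so far).
  // linesIterator avoids the clash with java.lang.String#lines on JDK 11+.
  val (_, map) = txt.linesIterator.foldLeft[(Option[String], Map[String, String])]((None, Map.empty)) {
    (acc, el) =>
      val (taking, map) = acc // taking holds the pending range, if any
      taking match {
        case Some(range) if el.trim.nonEmpty => // first non-empty line after a range is the query
          (None, map + (el.trim -> range)) // add query -> range to the map
        case None =>
          regex.findFirstIn(el) match { // look for the next range
            case Some(range) => (Some(range), map)
            case _ => (None, map)
          }
        case _ => (taking, map) // empty line while waiting for the query
      }
  }
  map
}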
0

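Modifying ajozwik's answer to work with multi-line SQL commands. Note that this version keys the map by time range (range -> query), the reverse of the mapping in the question: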

val regex = """(\d+)\.(\d+).*?(\d+)\.(\d+) seconds""".r // extract range
def logToMap(txt: String): Map[String, String] = {
  // linesIterator avoids the clash with java.lang.String#lines on JDK 11+.
  val (_, map) = txt.linesIterator.foldLeft[(Option[String], Map[String, String])]((None, Map.empty)) {
    (accumulator, element) =>
      val (taking, map) = accumulator
      taking match {
        case Some(range) if element.trim.nonEmpty =>
          if (element.contains("Queries"))
            (None, map) // header of the next block: stop collecting for this range
          else // append this SQL line to what was already collected for the range
            (Some(range), map + (range -> (map.getOrElse(range, "") + element)))
        case None =>
          regex.findFirstIn(element) match {
            case Some(range) => (Some(range), map)
            case _ => (None, map)
          }
        case _ => (taking, map) // empty line inside a block
      }
  }
  println(map)
  map
}
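For the query -> range shape the question asks for, the same line-by-line fold can collect every SQL line of a block and emit one entry when the block ends. This is only a sketch under the assumption that the report always follows the sample layout; `SlowQueryParser`, `CountLine`, `TakingLine`, `flush` and `parse` are names invented here, and the range is normalised to "T1 to T2 seconds" even when the log separates the bounds with a comma.

```scala
object SlowQueryParser {
  // Header of a block, e.g. "1735 Queries".
  private val CountLine = """^\d+\s+Queries$""".r
  // Timing line; the two bounds may be separated by "to" or by ",".
  private val TakingLine = """^Taking\s+(\d+\.\d+)\s*(?:to|,)\s*(\d+\.\d+)\s+seconds.*$""".r

  // Adds the collected SQL (joined on one line) under its range, if both exist.
  private def flush(range: Option[String], sql: Vector[String],
                    acc: Map[String, String]): Map[String, String] =
    range match {
      case Some(r) if sql.nonEmpty => acc + (sql.mkString(" ") -> r)
      case _                       => acc
    }

  def parse(report: String): Map[String, String] = {
    // State: (current range, SQL lines collected for it, result so far).
    val init = (Option.empty[String], Vector.empty[String], Map.empty[String, String])
    val (lastRange, lastSql, result) = report.split("\n").foldLeft(init) {
      case ((range, sql, acc), line) =>
        line.trim match {
          case CountLine() =>
            (None, Vector.empty, flush(range, sql, acc)) // a new block starts
          case TakingLine(from, to) =>
            (Some(s"$from to $to seconds"), Vector.empty, flush(range, sql, acc))
          case "" =>
            (range, sql, acc) // skip blank lines
          case stmt =>
            // Keep SQL lines only once a range has been seen.
            if (range.isDefined) (range, sql :+ stmt, acc) else (range, sql, acc)
        }
    }
    flush(lastRange, lastSql, result) // emit the final block
  }
}
```

Because the map is keyed by the query text, two blocks with byte-identical SQL would overwrite each other; `show slave status;` and `SHOW SLAVE STATUS;` stay separate because the keys are case-sensitive.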