Java regex to extract an ID string based on a substring that recurs in every ID

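I am reading a log file and extracting certain data it contains. I can already extract the time from each line of the log file.

Now I want to extract the ID "ieatrcxb4498-1". All IDs start with the substring ieatrcxb, so I tried to search for that substring and return the complete ID string based on it.

I have tried many different suggestions from other posts, but I have had no success with the following patterns: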

(?i)\\b("ieatrcxb"(?:.+?)?)\\b 
(?i)\\b\\w*"ieatrcxb"\\w*\\b" 
^.*ieatrcxb.*$

I have also tried to extract the complete ID by matching a string that starts with i and ends with 1, as all of them do.

Log file

150: 2017-06-14 18:02:21 INFO monitorinfo   :  Info: Lock VCS on node "ieatrcxb4498-1"

Code

Scanner s = new Scanner(new FileReader(new File("lock-unlock.txt"))); 
    //Record currentRecord = null; 
    ArrayList<Record> list = new ArrayList<>(); 

    while (s.hasNextLine()) { 
     String line = s.nextLine(); 

     Record newRec = new Record(); 
     // newRec.time = 
     newRec.time = regexChecker("([0-1]?\\d|2[0-3]):([0-5]?\\d):([0-5]?\\d)", line); 

     newRec.ID = regexChecker("^.*ieatrcxb.*$", line); 

     list.add(newRec); 

    } 


public static String regexChecker(String regEx, String str2Check) { 

    Pattern checkRegex = Pattern.compile(regEx); 
    Matcher regexMatcher = checkRegex.matcher(str2Check); 
    String regMat = ""; 
    while (regexMatcher.find()) { 
     if (regexMatcher.group().length() != 0) { 
      regMat = regexMatcher.group(); 
     } 
     //System.out.println("Inside the "+ regexMatcher.group().trim()); 
    } 

    return regMat; 
}

I need a simple pattern that will do this for me.

Source

2017-07-14 Sean McGrath

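"a log file and extracting certain data it contains" - what kind of data, and what kind of log file? Post an example. – user3734782

@user3734782 150: 2017-06-14 18:02:21 INFO monitorinfo   :  Info: Lock VCS on node "ieatrcxb4498-1" This is a sample of the data from the log file. –

So you want to extract the whole line that contains the ID? – user3734782

Does the ID always have the format "ieatrcxb followed by 4 digits, followed by -, followed by 1 digit"?

If so, you can do this: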

regexChecker("ieatrcxb\\d{4}-\\d", line);

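Note the {4} quantifier, which matches exactly 4 digits (\\d). If the last digit is always 1, you can also use "ieatrcxb\\d{4}-1".

If the number of digits varies, you can use "ieatrcxb\\d+-\\d+", where + means "1 or more".

You can also use the {} quantifier with a minimum and maximum number of occurrences. For example: "ieatrcxb\\d{4,6}-\\d", where {4,6} means "at least 4 and at most 6 occurrences" (this is only an example; I don't know whether it applies to your case). This is useful if you know exactly how many digits the IDs can have.

All of the above work for your case and return ieatrcxb4498-1. Which one to use depends on how your input varies.

If you want only the digits without the ieatrcxb part (4498-1), you can use a lookbehind regex: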
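To compare the patterns above side by side, here is a minimal sketch that runs each of them against the sample log line from the question. The class name IdPatternDemo and the helper firstMatch are made up for this illustration and are not part of the original code; firstMatch simply returns the first match rather than the last one, which is enough when a line holds a single ID.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdPatternDemo {

    // Hypothetical helper (not from the question): returns the first match
    // of regex in input, or an empty string if there is no match.
    public static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        // Sample line taken from the question.
        String line = "150: 2017-06-14 18:02:21 INFO monitorinfo   :  Info: Lock VCS on node \"ieatrcxb4498-1\"";

        String[] patterns = {
            "ieatrcxb\\d{4}-\\d",   // exactly 4 digits, dash, 1 digit
            "ieatrcxb\\d{4}-1",     // exactly 4 digits, dash, literal 1
            "ieatrcxb\\d+-\\d+",    // 1 or more digits on both sides of the dash
            "ieatrcxb\\d{4,6}-\\d"  // 4 to 6 digits, dash, 1 digit
        };

        // Print what each pattern extracts from the sample line.
        for (String p : patterns) {
            System.out.println(p + " -> " + firstMatch(p, line));
        }
    }
}
```

Because none of these patterns contains `.*`, the match stops at the end of the ID instead of swallowing the rest of the line.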

regexChecker("(?<=ieatrcxb)\\d{4,6}-\\d", line);

This keeps ieatrcxb out of the match, so only 4498-1 is returned.

If you don't want the -1 either, just 4498, you can combine it with a lookahead like this:

regexChecker("(?<=ieatrcxb)\\d{4,6}(?=-\\d)", line);

This returns only 4498.
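As an alternative to the lookbehind and lookahead, capturing groups can return the full ID and its parts from a single match. The following is only a sketch of that different technique, not the approach used above; the class name IdPartsDemo and the method parts are hypothetical, and the pattern assumes the digit counts may vary.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdPartsDemo {

    // Group 1 captures the digits before the dash, group 2 the digits after it.
    private static final Pattern ID = Pattern.compile("ieatrcxb(\\d+)-(\\d+)");

    // Hypothetical helper: returns {full ID, number, suffix},
    // or null if the line contains no ID.
    public static String[] parts(String line) {
        Matcher m = ID.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Sample line taken from the question.
        String line = "150: 2017-06-14 18:02:21 INFO monitorinfo   :  Info: Lock VCS on node \"ieatrcxb4498-1\"";
        String[] p = parts(line);
        if (p != null) {
            System.out.println("full ID: " + p[0]);
            System.out.println("number:  " + p[1]);
            System.out.println("suffix:  " + p[2]);
        }
    }
}
```

Lookarounds are convenient when a helper such as regexChecker can only return the whole match; groups are often easier to read when you control the matching code yourself.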

Source

2017-07-14 16:21:25

Thank you very much. –

public static void main(String[] args) { 
    String line = "150: 2017-06-14 18:02:21 INFO monitorinfo   :  Info: Lock VCS on node \"ieatrcxb4498-1\""; 
    String regex ="ieatrcxb.*1"; 
    Pattern p = Pattern.compile(regex); 
    Matcher m = p.matcher(line); 
    while(m.find()){ 
     System.out.println(m.group()); 
    } 
}

Or, if the ID is the only quoted text on the line (note that this keeps the surrounding quotation marks):

String id = line.substring(line.indexOf("\""), line.lastIndexOf("\"")+1); 
System.out.println(id);
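The substring call above keeps the quotation marks in the result and throws a StringIndexOutOfBoundsException when the line contains no quotation mark, because indexOf then returns -1. A slightly more defensive sketch follows; the class name QuotedIdDemo and the helper betweenQuotes are hypothetical, and it still assumes that the ID is the only quoted text on the line.

```java
public class QuotedIdDemo {

    // Hypothetical helper: returns the text between the first and the last
    // double quote, or an empty string if the line has fewer than two quotes.
    public static String betweenQuotes(String line) {
        int first = line.indexOf('"');
        int last = line.lastIndexOf('"');
        if (first < 0 || last <= first) {
            return "";
        }
        // first + 1 skips the opening quote; the end index is exclusive,
        // so the closing quote is left out as well.
        return line.substring(first + 1, last);
    }

    public static void main(String[] args) {
        // Sample line taken from the question.
        String line = "150: 2017-06-14 18:02:21 INFO monitorinfo   :  Info: Lock VCS on node \"ieatrcxb4498-1\"";
        System.out.println(betweenQuotes(line));
    }
}
```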

Source

2017-07-14 15:58:11 Eritrean

You are trying to do this the hard way. If every line of the lock-unlock.txt file looks like the snippet you posted, you can do the following:

File logFile = new File("lock-unlock.txt"); 

List<String> lines = Files.readAllLines(logFile.toPath()); 

List<Integer> ids = lines.stream() 
       .filter(line -> line.contains("ieatrcxb")) 
       .map(line -> line.split("\"")[1]) //"ieatrcxb4498-1" 
       .map(line -> line.replaceAll("\\D+","")) //"44981" 
       .map(Integer::parseInt) // 44981 
       .collect(Collectors.toList());

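If you are not looking for just the ID number, simply remove or comment out the second and third .map() calls; the result is then a list of strings rather than integers, so the variable has to be declared as List<String>.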
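The string-list variant described above could look roughly like the sketch below. To stay self-contained it works on an in-memory list instead of reading lock-unlock.txt; the class name StreamIdsDemo, the method ids, and the second sample line are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamIdsDemo {

    // Hypothetical helper: keeps the lines that mention an ID and returns
    // the quoted ID of each one as a string.
    public static List<String> ids(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains("ieatrcxb"))
                .map(line -> line.split("\"")[1]) // text after the first quote
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            // Sample line taken from the question.
            "150: 2017-06-14 18:02:21 INFO monitorinfo   :  Info: Lock VCS on node \"ieatrcxb4498-1\"",
            // Hypothetical line without an ID, to show that the filter skips it.
            "151: a line that does not mention any node"
        );
        System.out.println(ids(lines));
    }
}
```

Like the original, this assumes that every line containing ieatrcxb has the ID in double quotes; a matching line without quotes would make split("\"")[1] throw an ArrayIndexOutOfBoundsException.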

Source

2017-07-14 16:03:13 user3734782
