2011-12-23 100 views
4

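Regex escape on a character

I have to split up a large list of emails and names. I have to split on commas, but some names contain a comma, so I have to deal with those first. Fortunately, those names are between "quotes".

Currently, my regex gives me output like this, for example (edit: I see the forum does not show the email addresses):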

"Talboom, Esther" 

"Wolde, Jos van der" 

"Debbie Derksen" <[email protected]>, corine <[email protected]>, " 

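The last one goes wrong because the name has no comma, so the match keeps going until it finds one, and that comma is the one I want to split on. So I want the regex to look only until it finds a '<'. How can I do that?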

import java.util.regex.Pattern; 
import java.util.regex.Matcher; 

String test = "\"Talboom, Esther\" <[email protected]>,  \"Wolde, Jos van der\" <[email protected]>, \"Debbie Derksen\" <[email protected]>, corine <[email protected]>, \"Markies Aart\" <[email protected]>"; 

Pattern pattern = Pattern.compile("\".*?,.*?\""); 

Matcher matcher = pattern.matcher(test); 

boolean found = false; 
while (matcher.find()) { 
    System.out.println(matcher.group()); 
} 

Edit: a better test line is this one, because not all names are in quotes:

String test = "\"Talboom, Esther\" <[email protected]>,  DRP - Wouter Haan <[email protected]>, \"Wolde, Jos van der\" <[email protected]>, \"Debbie Derksen\" <[email protected]>, corine <[email protected]>, [email protected], \"Markies Aart\" <[email protected]>"; 

Answers

2

I would simplify the code by using String.split() and String.replaceAll(). This avoids the hassle of working with Pattern and keeps the code short and clear.
Try this:

public static void main(String[] args) { 
    String test = "\"Talboom, Esther\" <[email protected]>,  \"Wolde, Jos van der\" <[email protected]>, \"Debbie Derksen\" <[email protected]>, corine <[email protected]>, \"Markies Aart\" <[email protected]>"; 

    // Split up into each person's details 
    String[] nameEmailPairs = test.split(",\\s*(?=\")"); 
    for (String nameEmailPair : nameEmailPairs) { 
        // Extract exactly the parts you need from the person's details 
        String name = nameEmailPair.replaceAll("\"([^\"]+)\".*", "$1"); 
        String email = nameEmailPair.replaceAll(".*<([^>]+).*", "$1"); 
        System.out.println(name + " = " + email); 
    } 
} 

Output, showing that it actually works :)

Talboom, Esther = [email protected] 
Wolde, Jos van der = [email protected] 
Debbie Derksen = [email protected] 
Markies Aart = [email protected] 
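Note that the split above only breaks before a quote, so an unquoted entry such as corine stays attached to the preceding pair. A minimal sketch of one way to also cover the edited test line from the question (unquoted names and bare addresses) is to split on commas that are followed by an even number of quotes. The class name `AddressSplitter`, the method `parse`, and the `example.com` addresses are hypothetical placeholders, since the real addresses are hidden on this page. It assumes names never contain escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class AddressSplitter {
    // A comma is a separator only when an even number of double quotes follows it,
    // which means the comma itself is not inside a quoted name.
    // Assumption: quoted names contain no escaped quotes.
    private static final Pattern SEPARATOR =
            Pattern.compile(",\\s*(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    /** Returns one "name = email" string per entry; a bare address gets an empty name. */
    static List<String> parse(String input) {
        List<String> result = new ArrayList<>();
        for (String entry : SEPARATOR.split(input)) {
            entry = entry.trim();
            int lt = entry.indexOf('<');
            int gt = entry.lastIndexOf('>');
            String name;
            String email;
            if (lt >= 0 && gt > lt) {
                // "Name" <address> or Name <address>: strip the surrounding quotes
                name = entry.substring(0, lt).trim().replace("\"", "");
                email = entry.substring(lt + 1, gt);
            } else {
                // Bare address without a display name
                name = "";
                email = entry;
            }
            result.add(name + " = " + email);
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder addresses, not the ones from the question
        String test = "\"Talboom, Esther\" <esther@example.com>,  DRP - Wouter Haan <wouter@example.com>, "
                + "\"Debbie Derksen\" <debbie@example.com>, corine <corine@example.com>, plain@example.com";
        parse(test).forEach(System.out::println);
    }
}
```

The lookahead rescans the rest of the string at every comma, so this is fine for a line of addresses but may be slow on very large inputs; a single pass with a Matcher would avoid that.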
+0

I used the same regex as you before - until I saw that it would fail for (legal) names such as "me", ". – fge 2011-12-23 23:23:00

+0

@fge It is usually easier to do these things in *2* steps - it tends to keep the regexes readable, and easy to understand, debug and maintain – Bohemian 2011-12-23 23:25:41

+0

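Also, `.split()` uses a `Pattern` internally; the difference is that it has to create a new one for every `split()` call - you might as well use a single Pattern and the split() method it provides ;) – fge 2011-12-23 23:25:47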
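The suggestion in the last comment could look roughly like this: compile the separator regex from the answer once and reuse it through `Pattern.split()`. The class name `PrecompiledSplit`, the method `splitPairs`, and the `example.com` addresses are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

class PrecompiledSplit {
    // Same separator regex as in the answer, compiled once and reused for every call
    private static final Pattern PAIR_SEPARATOR = Pattern.compile(",\\s*(?=\")");

    static String[] splitPairs(String line) {
        return PAIR_SEPARATOR.split(line);
    }

    public static void main(String[] args) {
        // Placeholder addresses, not the ones from the question
        String line = "\"Talboom, Esther\" <esther@example.com>, \"Debbie Derksen\" <debbie@example.com>";
        for (String pair : splitPairs(line)) {
            System.out.println(pair);
        }
    }
}
```

This mainly pays off when the same split runs many times, for example once per line of a large file.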