
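How to extract data from a file using regular expressions in Python?

I have a file that contains an email ID, a phone number, and a date of birth. Using regular expressions in Python, how can I find these three fields one by one? The expected output looks like this: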

Emailid: [email protected] 
Phoneno: 1234567890 
dateofbirth: xx-xx-xx

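I know how to find each field individually, but I don't know how to find all three at once. The snippet below shows how to find the email ID in the file. Its output looks like this: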

Emailid: [email protected]


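import re
import sys

# Matches an email ID; note that the domain suffix must start with "c".
pattern = r'''(?P<emailid>[a-zA-Z\.]*\@[a-zA-Z]*\.c[a-zA-Z]*)'''

regobj = re.compile(pattern, re.VERBOSE)

for line in sys.stdin:
    results = regobj.finditer(line)
    # This loop has to be nested, otherwise only the last line's matches are printed.
    for result in results:
        sys.stdout.write("%s\n" % result.group('emailid'))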


2013-03-28 lost

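Are all three pieces of information always together, on the same line? In that case you don't need a single regular expression to find them; you can simply parse the line three times. If it is more complicated than that, we need to see some examples of the file you are parsing. – octern 2013-03-28 04:27:03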
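
The idea of checking each line three times could be sketched roughly as follows. This is only an illustration: the email pattern is adapted from the question, while the phone and date patterns, the sample lines, and the helper name `extract_fields` are assumptions rather than anything taken from the actual file.

```python
import re

# Hypothetical patterns, one per field; adjust them to the real file format.
PATTERNS = {
    "Emailid": re.compile(r"[a-zA-Z.]+@[a-zA-Z]+\.c[a-zA-Z]*"),
    "Phoneno": re.compile(r"\b[0-9]{10}\b"),
    "dateofbirth": re.compile(r"\b[0-9]{2}-[0-9]{2}-[0-9]{2}\b"),
}


def extract_fields(lines):
    """Yield (field name, matched text) pairs, scanning each line once per pattern."""
    for line in lines:
        for name, regex in PATTERNS.items():
            for match in regex.finditer(line):
                yield name, match.group(0)


# Invented sample input standing in for the lines of the file.
sample = [
    "contact: jane.doe@example.com\n",
    "mobile 1234567890, born 01-02-90\n",
]

for name, value in extract_fields(sample):
    print("%s: %s" % (name, value))
```

Because each field has its own pattern, the fields may appear in any order and on any line; to read from standard input instead, `extract_fields(sys.stdin)` should work the same way.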

The code looks much better now than before... – lost 2013-03-28 04:38:51

You can iterate over all non-overlapping matches of the RE pattern in a string using the finditer method, as follows:

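import re
import sys

# A single match covers an email ID, then a phone number, then a date of birth.
# re.DOTALL lets ".*?" cross newlines, so the fields may sit on separate lines.
pattern = re.compile(r'''(?P<emailid>[a-zA-Z.]*@[a-zA-Z]*\.c[a-zA-Z]*).*?(?P<phone>\(?[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}).*?(?P<dob>[0-9]{2}-[0-9]{2}-[0-9]{2})''', re.DOTALL)

# finditer needs a string, so read all of stdin first.
for result in pattern.finditer(sys.stdin.read()):
    sys.stdout.write("Emailid: %s\n" % result.group('emailid'))
    sys.stdout.write("Phoneno: %s\n" % result.group('phone'))
    sys.stdout.write("dateofbirth: %s\n" % result.group('dob'))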
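
To see how the combined pattern behaves, it can be run against an in-memory string instead of standard input. The sample text below is invented for illustration. Note that the pattern expects the three fields in the order email, phone, date of birth, and that its email part only accepts domain suffixes starting with "c".

```python
import re

# Same pattern as above, split over several lines for readability.
pattern = re.compile(
    r'''(?P<emailid>[a-zA-Z.]*@[a-zA-Z]*\.c[a-zA-Z]*).*?'''
    r'''(?P<phone>\(?[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}).*?'''
    r'''(?P<dob>[0-9]{2}-[0-9]{2}-[0-9]{2})''',
    re.DOTALL,
)

# Invented sample data: two records, each spread over three lines.
sample = (
    "Emailid: jane.doe@example.com\n"
    "Phoneno: 1234567890\n"
    "dateofbirth: 01-02-90\n"
    "Emailid: john@sample.co\n"
    "Phoneno: 987-654-3210\n"
    "dateofbirth: 11-12-85\n"
)

# groupdict() returns the named groups of each match as a dictionary.
records = [match.groupdict() for match in pattern.finditer(sample)]
for record in records:
    print(record)
```

Collecting `groupdict()` per match keeps the three fields of one record together, which is convenient if the results are processed further rather than printed.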


2013-03-28 05:02:00 SUB0DH

@subodh: I got the following error when running the code... for result in pattern.finditer(sys.stdin): TypeError: expected string or buffer – lost 2013-03-28 05:26:33

After @martin-atkins's edit, the code should work without any errors. – SUB0DH 2013-03-29 15:14:24
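
The error reported in the comments arises because finditer accepts a string (or bytes), not a file object. A small sketch of the difference, using `io.StringIO` as a stand-in for `sys.stdin` (an assumption made so the example is self-contained) and a simplified date-only pattern:

```python
import io
import re

pattern = re.compile(r"[0-9]{2}-[0-9]{2}-[0-9]{2}")

# io.StringIO stands in for sys.stdin in this sketch.
stream = io.StringIO("dateofbirth: 01-02-90\n")

try:
    # Passing the file-like object itself raises TypeError.
    list(pattern.finditer(stream))
    accepted_file_object = True
except TypeError:
    accepted_file_object = False

# Reading the contents first gives finditer the string it expects.
dates = [match.group(0) for match in pattern.finditer(stream.read())]
print(accepted_file_object, dates)
```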
