2016-03-15 143 views

Python: parsing a non-multipart email

I have a script that parses raw emails. It works for multipart emails, but how do I parse a non-multipart email?

import email

mail = email.message_from_string(raw_message)
if mail.is_multipart():
    data = extract(mail)
else:
    payload = mail.get_payload(decode=True)

The raw email:

Return-Path: <> 
X-Original-To: [email protected] 
Delivered-To: [email protected] 
Received: from inmumg01.tcs.com (inmumg01.tcs.com [219.64.33.12]) 
    by smtp.mydomain.com (Postfix) with ESMTP id 603693FE11 
    for <[email protected]>; Tue, 15 Mar 2016 04:39:36 -0400 (EDT) 
Received: from localhost by inmumg01.tcs.com; 
    15 Mar 2016 14:09:38 +0530 
Message-Id: <[email protected]> 
Date: 15 Mar 2016 14:09:38 +0530 
To: [email protected] 
From: "Mail Delivery System" <[email protected]> 
Subject: Undeliverable Message 

The following message to <[email protected]> was undeliverable. 
The reason for the problem: 
5.1.0 - Unknown address error 550-'[email protected] No such user' 

The IP address of the MTA to which the message could not be sent: 
172.17.9.35 

---------- A copy of the message begins below this line ---------- 
X-IronPort-Anti-Spam-Filtered: true 
X-IronPort-Anti-Spam-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE 
X-IPAS-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE 
X-IronPort-AV: E=Sophos;i="5.24,338,1454956200"; 
    d="scan'208,217";a="72486315" 
X-Amp-Result: Clean 
X-Amp-File-Uploaded: False 
Received: from smtp.mydomain.com ([139.59.240.124]) 
    by inmumg01.tcs.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 15 Mar 2016 14:09:37 +0530 
Received: from 128.199.202.14 (unknown [128.199.202.14]) 
    (Authenticated sender: mailsender) 
    by smtp.mydomain.com (Postfix) with ESMTPA id 0D41F3FE11 
    for <[email protected]>; Tue, 15 Mar 2016 04:39:33 -0400 (EDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kitemailer.com; 
    s=kitemail; t=1458031173; 
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=; 
    h=Date:From:To:Subject:List-Unsubscribe; 
    b=Xxaf++WE0B7HL+FN28O76df7gYNEIKzk8eE9VpxrnMBCpGWPKWBMMfVDfCyie3NBJ 
    GJiMxn/Yhn+ey6Mr5R5AK5JO5n72yWlytLm0RepMEydaeHHVQPx7bE+LMDMlORSFin 
    bWdnz58lNMuZ3w9qtqjCXt22Sk5yXfCO71tRgfus= 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com; 
    s=kitemail; t=1458031173; 
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=; 
    h=Date:From:To:Subject:List-Unsubscribe; 
    b=FiGdkSE9LCjYkfYyWq65GbZoMZVCQs5OXXJA35CyGQtjPWbvwIKvx7Z6Ff39EBRLf 
    Vu+6PUrvwyZLFh/1CW0NGOHDgUDjWWQ2jHfnNpJ9QEbHgOwomuMty10HDeZnIr0zM7 
    8mFCgeCbiiyusQkhmXh5aYqqD+Q/1wFcrpLpkBZc= 
Date: Tue, 15 Mar 2016 04:39:31 -0400 (EDT) 
From: Kitemailer Newsletter <[email protected]> 
To: vi[email protected] 
Message-ID: <[email protected].com> 
Subject: KiteMailer | New Features this Week 
MIME-Version: 1.0 
Content-Type: multipart/mixed; 
    boundary="----=_Part_44_1398250960.1458031171306" 
List-Unsubscribe: <http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==> 
Feedback-ID: 19:96:1520615:MyDomain 

Now, in the else branch, I want to extract information. If I try payload['to'], it raises an error: TypeError: string indices must be integers, not str

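You are trying to get an element with the index 'to', but 'payload' only accepts integer indices. Call 'print payload' (Python 2) or 'print(payload)' (Python 3) to see what it contains. In this case 'payload' is a string, and its elements are characters. – nikniknik2016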
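
To illustrate the point in this comment, here is a minimal sketch. The short sample message is made up for the example and is not the asker's email. Headers are looked up on the Message object returned by email.message_from_string, while get_payload() returns only the body, so indexing the payload with a string key fails.

```python
import email

# Hypothetical, shortened sample; not the asker's actual message.
sample = (
    "To: bounce@example.com\n"
    "Subject: Undeliverable Message\n"
    "\n"
    "The following message was undeliverable.\n"
)

mail = email.message_from_string(sample)

# Headers are looked up on the Message object, case-insensitively.
to_header = mail["to"]
subject = mail["Subject"]

# The payload is only the body (bytes on Python 3 when decode=True).
payload = mail.get_payload(decode=True)

try:
    payload["to"]
    error = None
except TypeError as exc:
    # Strings and bytes can only be indexed with integers or slices.
    error = exc

print(to_header, subject, type(payload).__name__, error)
```

Note that this only reaches the headers of the bounce itself. The headers of the bounced message sit inside the body as plain text.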

Yes, you are right, it prints a string, but how do I extract the headers from that string? –

Can you print a sample of the payload so we can help you? – YOBA

Answer

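OK, let's say there is no way to do this with the email library (I don't know of one). You can convert your raw message into a dictionary and get your elements from it: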

This is your raw message:

raw_message='''Return-Path: <> 
X-Original-To: [email protected] 
Delivered-To: [email protected] 
Received: from inmumg01.tcs.com (inmumg01.tcs.com [219.64.33.12]) 
    by smtp.mydomain.com (Postfix) with ESMTP id 603693FE11 
    for <[email protected]>; Tue, 15 Mar 2016 04:39:36 -0400 (EDT) 
Received: from localhost by inmumg01.tcs.com; 
    15 Mar 2016 14:09:38 +0530 
Message-Id: <[email protected]> 
Date: 15 Mar 2016 14:09:38 +0530 
To: [email protected] 
From: "Mail Delivery System" <[email protected]> 
Subject: Undeliverable Message 

The following message to <[email protected]> was undeliverable. 
The reason for the problem: 
5.1.0 - Unknown address error 550-'[email protected] No such user' 

The IP address of the MTA to which the message could not be sent: 
172.17.9.35 

---------- A copy of the message begins below this line ---------- 
X-IronPort-Anti-Spam-Filtered: true 
X-IronPort-Anti-Spam-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE 
X-IPAS-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE 
X-IronPort-AV: E=Sophos;i="5.24,338,1454956200"; 
    d="scan'208,217";a="72486315" 
X-Amp-Result: Clean 
X-Amp-File-Uploaded: False 
Received: from smtp.mydomain.com ([139.59.240.124]) 
    by inmumg01.tcs.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 15 Mar 2016 14:09:37 +0530 
Received: from 128.199.202.14 (unknown [128.199.202.14]) 
    (Authenticated sender: mailsender) 
    by smtp.mydomain.com (Postfix) with ESMTPA id 0D41F3FE11 
    for <[email protected]>; Tue, 15 Mar 2016 04:39:33 -0400 (EDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kitemailer.com; 
    s=kitemail; t=1458031173; 
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=; 
    h=Date:From:To:Subject:List-Unsubscribe; 
    b=Xxaf++WE0B7HL+FN28O76df7gYNEIKzk8eE9VpxrnMBCpGWPKWBMMfVDfCyie3NBJ 
    GJiMxn/Yhn+ey6Mr5R5AK5JO5n72yWlytLm0RepMEydaeHHVQPx7bE+LMDMlORSFin 
    bWdnz58lNMuZ3w9qtqjCXt22Sk5yXfCO71tRgfus= 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com; 
    s=kitemail; t=1458031173; 
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=; 
    h=Date:From:To:Subject:List-Unsubscribe; 
    b=FiGdkSE9LCjYkfYyWq65GbZoMZVCQs5OXXJA35CyGQtjPWbvwIKvx7Z6Ff39EBRLf 
    Vu+6PUrvwyZLFh/1CW0NGOHDgUDjWWQ2jHfnNpJ9QEbHgOwomuMty10HDeZnIr0zM7 
    8mFCgeCbiiyusQkhmXh5aYqqD+Q/1wFcrpLpkBZc= 
Date: Tue, 15 Mar 2016 04:39:31 -0400 (EDT) 
From: Kitemailer Newsletter <[email protected]> 
To: [email protected] 
Message-ID: <[email protected].com> 
Subject: KiteMailer | New Features this Week 
MIME-Version: 1.0 
Content-Type: multipart/mixed; 
    boundary="----=_Part_44_1398250960.1458031171306" 
List-Unsubscribe: <http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==> 
Feedback-ID: 19:96:1520615:MyDomain''' 

I used your code to get the payload:

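# In case the message is not multipart
import email

mail = email.message_from_string(raw_message)
payload = mail.get_payload(decode=True)
if isinstance(payload, bytes):  # Python 3 returns bytes here
    payload = payload.decode("utf-8", "replace")

# Keep only lines shaped like "Name: value" (no space in the name)
mail_dico = {elt.split(":", 1)[0].strip(): elt.split(":", 1)[1].strip()
             for elt in payload.split("\n")
             if ":" in elt and " " not in elt.split(":")[0].strip()}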

Here is your dictionary:

{'Content-Type': 'multipart/mixed;', 
'DKIM-Signature': 'v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com;', 
'Date': 'Tue, 15 Mar 2016 04:39:31 -0400 (EDT)', 
'Feedback-ID': '19:96:1520615:MyDomain', 
'From': 'Kitemailer Newsletter <[email protected]>', 
'List-Unsubscribe': '<http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==>', 
'MIME-Version': '1.0', 
'Message-ID': '<[email protected].com>', 
'Received': 'from 128.199.202.14 (unknown [128.199.202.14])', 
'Subject': 'KiteMailer | New Features this Week', 
'To': '[email protected]', 
'X-Amp-File-Uploaded': 'False', 
'X-Amp-Result': 'Clean', 
'X-IPAS-Result': 'A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE', 
'X-IronPort-AV': 'E=Sophos;i="5.24,338,1454956200";', 
'X-IronPort-Anti-Spam-Filtered': 'true', 
'X-IronPort-Anti-Spam-Result': 'A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE', 
'h=Date': 'From:To:Subject:List-Unsubscribe;'} 

Now you can access your elements:

print(mail_dico["To"]) 
>> '[email protected]' 

print(mail_dico["Subject"]) 
>> 'KiteMailer | New Features this Week' 

This may not be the best way to do it, but I hope it helps.
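
As a possible alternative to splitting lines by hand, the sketch below assumes the bounce always contains the marker line "A copy of the message begins below this line" (as in the sample message) and feeds everything after it back into email.message_from_string. The helper name parse_embedded_copy and the shortened sample bounce are made up for illustration. Parsing the copy as a message keeps folded header lines attached to their header and keeps repeated headers such as Received, which a plain dictionary collapses to the last occurrence.

```python
import email

MARKER = "A copy of the message begins below this line"


def parse_embedded_copy(raw_message):
    """Return the bounced message embedded in a non-multipart bounce, or None."""
    mail = email.message_from_string(raw_message)
    payload = mail.get_payload(decode=True)
    if isinstance(payload, bytes):  # Python 3 returns bytes here
        payload = payload.decode("utf-8", "replace")
    if MARKER not in payload:
        return None
    # Drop everything up to and including the rest of the marker line.
    after_marker = payload.split(MARKER, 1)[1]
    embedded_text = after_marker.split("\n", 1)[1] if "\n" in after_marker else ""
    return email.message_from_string(embedded_text)


# Hypothetical, shortened bounce used only to exercise the helper.
sample = (
    "To: bounce@example.com\n"
    "Subject: Undeliverable Message\n"
    "\n"
    "The following message was undeliverable.\n"
    "\n"
    "---------- A copy of the message begins below this line ----------\n"
    "Received: from a.example.com\n"
    "    by b.example.com; 15 Mar 2016 14:09:37 +0530\n"
    "Received: from c.example.com\n"
    "To: user@example.com\n"
    "Subject: New Features this Week\n"
)

original = parse_embedded_copy(sample)
print(original["To"])
print(original["Subject"])
print(len(original.get_all("Received")))
```

If other mail servers phrase the marker differently, the MARKER constant would need adjusting. The helper returns None when the marker is absent, so the caller can fall back to another approach.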

Thanks, that worked for me :) –