2017-01-09 140 views

Python: CSV to JSON array of objects, with more than one JSON object for rows whose fields hold several values

I have this CSV:

Domain,IP,Server,PoweredBy,MetaGenerator,Email 
http://www.example1.com,1.1.1.1,,,, 
http://www.example2.com,2.2.2.2,Apache,PHP/5.5.9-1ubuntu4.20,, 
http://www.example3.com,3.3.3.3,Apache,PHP/5.5.9-1ubuntu4.20,Easy Digital Downloads v2.4.9;Powered by Visual Composer - drag and drop page builder for WordPress.,[email protected];[email protected] 

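I am trying to build a JSON array of objects in which each object is one unique combination of a row's values, for rows where a field holds several (";"-separated) values.

As we can see, www.example3.com has several MetaGenerators and several emails. For this case, the JSON array of objects should look like this, with each combination as its own JSON object in the array: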

[{'Domain': 'http://www.example1.com', 
    'Email': '', 
    'IP': '1.1.1.1', 
    'MetaGenerator': '', 
    'PoweredBy': '', 
    'Server': ''}, 
{'Domain': 'http://www.example2.com', 
    'Email': '', 
    'IP': '2.2.2.2', 
    'MetaGenerator': '', 
    'PoweredBy': 'PHP/5.5.9-1ubuntu4.20', 
    'Server': 'Apache'}, 
{'Domain': 'http://www.example3.com',
    'Email': '[email protected]',
    'IP': '3.3.3.3',
    'MetaGenerator': 'Easy Digital Downloads v2.4.9',
    'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
    'Server': 'Apache'},
{'Domain': 'http://www.example3.com',
    'Email': '[email protected]',
    'IP': '3.3.3.3',
    'MetaGenerator': 'Powered by Visual Composer - drag and drop page builder for WordPress.',
    'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
    'Server': 'Apache'},
{'Domain': 'http://www.example3.com',
    'Email': '[email protected]',
    'IP': '3.3.3.3',
    'MetaGenerator': 'Easy Digital Downloads v2.4.9',
    'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
    'Server': 'Apache'},
{'Domain': 'http://www.example3.com',
    'Email': '[email protected]',
    'IP': '3.3.3.3',
    'MetaGenerator': 'Powered by Visual Composer - drag and drop page builder for WordPress.',
    'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
    'Server': 'Apache'}]

I have this Python code:

import csv 
import pprint 
import json 

with open("results.csv", 'r') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    out=[]
    d=dict()
    for row in reader:
        if ';' in row['Email']:
            val = row['Email'].split(';')
            for v in val:
                d['Email']=v
                out.append(d)
        if ';' in row['MetaGenerator']:
            val = row['MetaGenerator'].split(';')
            for v in val:
                d['MetaGenerator']=v
                out.append(d)
        else:
            d=row
            out.append(d)


pprint.pprint(out) 

But it does not work correctly.
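One likely contributor, sketched here on made-up in-memory values rather than results.csv: the loop keeps appending the same dict object, so every entry in the list ends up showing the last value assigned. Copying the row for each value keeps the entries distinct. The names `shared`, `aliased`, `copied` and the addresses are hypothetical, chosen only for illustration.

```python
values = "a@x.test;b@x.test".split(";")  # hypothetical ';'-separated field

# Appending the same dict repeatedly: every list entry is one object.
shared = {}
aliased = []
for v in values:
    shared["Email"] = v
    aliased.append(shared)

# Appending a fresh copy per value keeps each entry independent.
row = {"Domain": "http://www.example3.com"}
copied = []
for v in values:
    item = dict(row)  # shallow copy of the row
    item["Email"] = v
    copied.append(item)

print(aliased)
print(copied)
```

Even with copies, handling Email and MetaGenerator in separate loops would not pair every email with every generator, which is what the expected output asks for.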

How can I achieve my goal? Pseudocode is fine too. The order does not matter. Which modules should I use?

Thanks,

Answers


Try this (check the itertools docs):

import csv
import pprint
import json
import itertools

out = []
with open("results.csv", 'r') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    for row in reader:
        # Split every field on ';' (a field without ';' gives a one-element list)
        Domains = row['Domain'].split(";")
        Ips = row['IP'].split(";")
        Servers = row['Server'].split(";")
        Emails = row['Email'].split(";")
        MetaGenerators = row['MetaGenerator'].split(";")
        PoweredBy = row['PoweredBy'].split(";")

        # Emit one object per combination of the split values
        for comb in itertools.product(Domains, Ips, Servers, Emails, MetaGenerators, PoweredBy):
            (cDomain, cIp, cServer, cEmail, cMeta, cPowered) = comb

            out.append({
                'Domain': cDomain,
                'IP': cIp,
                'Server': cServer,
                'Email': cEmail,
                'MetaGenerator': cMeta,
                'PoweredBy': cPowered
            })

pprint.pprint(out)
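To see what `itertools.product` contributes, here is a minimal sketch that expands one hypothetical row held in a plain dict. The addresses and generator names are made up for illustration, and no file is read. A field without `;` splits into a one-element list, so it simply repeats in every combination.

```python
import itertools

# Hypothetical row: two of the fields carry ';'-separated values.
row = {
    "Domain": "http://www.example3.com",
    "MetaGenerator": "GenA;GenB",
    "Email": "a@x.test;b@x.test",
}

keys = list(row)
# Each field becomes a list holding one or more values.
choices = [row[k].split(";") for k in keys]

# product() yields one tuple per combination, ordered like `keys`.
combos = [dict(zip(keys, c)) for c in itertools.product(*choices)]

for item in combos:
    print(item)
```

With one domain, two generators and two emails this gives 1 x 2 x 2 = 4 objects. An empty field also splits into a single empty string, so rows with blank columns still produce exactly one object.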

Here is a less readable but more clever solution that does not depend on the CSV field names:

import csv
import itertools
import pprint

out = []
with open("results.csv", 'r') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    headers = reader.fieldnames

    for row in reader:
        # Split the fields in header order so each value lines up with its column name
        fields = [row[header].split(";") for header in headers]
        # Pair every combination back with the headers
        out += [dict(zip(headers, comb)) for comb in itertools.product(*fields)]

pprint.pprint(out)
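Note that `pprint` prints Python dict syntax (single quotes), not JSON. If an actual JSON array is wanted, the standard `json` module can serialise the list. A minimal sketch, assuming `out` is a list of dicts like the one built from the CSV (the two entries below are shortened placeholders):

```python
import json

# Placeholder standing in for the `out` list built from the CSV.
out = [
    {"Domain": "http://www.example1.com", "IP": "1.1.1.1", "Server": ""},
    {"Domain": "http://www.example2.com", "IP": "2.2.2.2", "Server": "Apache"},
]

# dumps() returns the JSON text; indent only affects readability.
text = json.dumps(out, indent=2)
print(text)

# Round trip: parsing the text gives back the same structure.
restored = json.loads(text)
```

To write straight to a file, `json.dump(out, f)` with an open file handle does the same job.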

Works perfectly! Thanks. I would not have figured it out with itertools...
