2017-08-06

Transposing nested JSON for conversion to CSV

When I use ProPublica's API, I can get a list of the members of the 115th Congress straight from the terminal:

curl "https://api.propublica.org/congress/v1/115/senate/members.json" -H "X-API-Key: MY_API_KEY"

I get a JSON response that looks like this:

{ 
    "status":"OK", 
    "copyright":" Copyright (c) 2017 Pro Publica Inc. All Rights Reserved.", 
    "results":[ 
    { 
    "congress": "115", 
    "chamber": "Senate", 


    "num_results": 101, 
    "offset": 0, 
    "members": [ 
      { 
      "id": "A000360", 
      "api_uri":"https://api.propublica.org/congress/v1/members/A000360.json", 
      "first_name": "Lamar", 
      "middle_name": null, 
      "last_name": "Alexander", 
      "date_of_birth": "1940-07-03", 
      "party": "R", 
      "leadership_role": null, 
      "twitter_account": "SenAlexander", 
      "facebook_account": "senatorlamaralexander", 
      "youtube_account": "lamaralexander", 
      "govtrack_id": "300002", 
      "cspan_id": "5", 
      "votesmart_id": "15691", 
      "icpsr_id": "40304", 
      "crp_id": "N00009888", 
      "google_entity_id": "/m/01rbs3", 
      "url": "https://www.alexander.senate.gov/public", 
      "rss_url": "https://www.alexander.senate.gov/public/?a=RSS.Feed", 
      "contact_form": "http://www.alexander.senate.gov/public/index.cfm?p=Email", 
      "domain": null, 
      "in_office": true, 
      "dw_nominate": 0.323, 
      "ideal_point": null, 
      "seniority": "15", 
      "next_election": "2020", 
      "total_votes": 187, 
      "missed_votes": 7, 
      "total_present": 0, 
      "ocd_id": "ocd-division/country:us/state:tn", 
      "office": "455 Dirksen Senate Office Building", 
      "phone": "202-224-4944", 
      "fax": "202-228-3398", 
      "state": "TN", 
      "senate_class": "2", 
      "state_rank": "senior", 
      "lis_id": "S289" 
      ,"missed_votes_pct": 3.74, 
      "votes_with_party_pct": 98.89 
      }, 
         { 
      "id": "B000575", 
      "api_uri":"https://api.propublica.org/congress/v1/members/B000575.json", 
      "first_name": "Roy", 
      "middle_name": null, 
      "last_name": "Blunt", 
      "date_of_birth": "1950-01-10", 
      "party": "R", 
      "leadership_role": null, 
      "twitter_account": "RoyBlunt", 
      "facebook_account": "SenatorBlunt", 
      "youtube_account": "SenatorBlunt", 
      "govtrack_id": "400034", 
      "cspan_id": "45465", 
      "votesmart_id": "418", 
      "icpsr_id": "29735", 
      "crp_id": "N00005195", 
      "google_entity_id": "/m/034fn4", 
      "url": "https://www.blunt.senate.gov/public", 
      "rss_url": "http://www.blunt.senate.gov/public/?a=RSS.Feed", 
      "contact_form": "https://www.blunt.senate.gov/public/index.cfm/contact-roy", 
      "domain": null, 
      "in_office": true, 
      "dw_nominate": 0.431, 
      "ideal_point": null, 
      "seniority": "7", 
      "next_election": "2022", 
      "total_votes": 187, 
      "missed_votes": 2, 
      "total_present": 0, 
      "ocd_id": "ocd-division/country:us/state:mo", 
      "office": "260 Russell Senate Office Building", 
      "phone": "202-224-5721", 
      "fax": "202-224-8149", 
      "state": "MO", 
      "senate_class": "3", 
      "state_rank": "junior", 
      "lis_id": "S342" 
      ,"missed_votes_pct": 1.07, 
      "votes_with_party_pct": 99.46 
      }, 

and so on.

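However, when I convert this to CSV, I get just two rows (one of which is the column headers) stretching across nearly 4,000 columns. Because of the way the JSON is nested, this seems to be the only way I can convert it to CSV. I want it in CSV so that I can import it into SQL properly.

I counted the headers, and there are 39 per member. They are laid out as members/0/id, members/0/api_uri, and so on through members/100/id, members/100/api_uri, etc.

Is there any way to do this without reshaping it by hand? In an ideal world, I would run my terminal script, output to CSV, and then upload that into SQL to work with it. A result of 100 rows and 39 columns, rather than 1 row and 3,900 columns, would work fine.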

Answer

Here is a solution using jq.

If the file filter.jq contains

.results[].members 
| (.[0] | keys_unsorted) as $keys 
| $keys, (.[] | [ .[$keys[]] ]) 
| @csv 

(keys_unsorted keeps the header in the same order as the fields appear in the JSON, so each column name lines up with its values; plain keys would sort the names alphabetically) and your data is in a file named data.json, then

jq -M -r -f filter.jq data.json 

will produce

"id","api_uri","first_name","middle_name","last_name","date_of_birth","party","leadership_role","twitter_account","facebook_account","youtube_account","govtrack_id","cspan_id","votesmart_id","icpsr_id","crp_id","google_entity_id","url","rss_url","contact_form","domain","in_office","dw_nominate","ideal_point","seniority","next_election","total_votes","missed_votes","total_present","ocd_id","office","phone","fax","state","senate_class","state_rank","lis_id","missed_votes_pct","votes_with_party_pct" 
"A000360","https://api.propublica.org/congress/v1/members/A000360.json","Lamar",,"Alexander","1940-07-03","R",,"SenAlexander","senatorlamaralexander","lamaralexander","300002","5","15691","40304","N00009888","/m/01rbs3","https://www.alexander.senate.gov/public","https://www.alexander.senate.gov/public/?a=RSS.Feed","http://www.alexander.senate.gov/public/index.cfm?p=Email",,true,0.323,,"15","2020",187,7,0,"ocd-division/country:us/state:tn","455 Dirksen Senate Office Building","202-224-4944","202-228-3398","TN","2","senior","S289",3.74,98.89 
"B000575","https://api.propublica.org/congress/v1/members/B000575.json","Roy",,"Blunt","1950-01-10","R",,"RoyBlunt","SenatorBlunt","SenatorBlunt","400034","45465","418","29735","N00005195","/m/034fn4","https://www.blunt.senate.gov/public","http://www.blunt.senate.gov/public/?a=RSS.Feed","https://www.blunt.senate.gov/public/index.cfm/contact-roy",,true,0.431,,"7","2022",187,2,0,"ocd-division/country:us/state:mo","260 Russell Senate Office Building","202-224-5721","202-224-8149","MO","3","junior","S342",1.07,99.46
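
If jq is not available, the same reshaping can be sketched in Python with only the standard library. This is a rough equivalent and not part of the original answer: the function name members_to_csv and the inline sample are made up for illustration, and it assumes the response has the results[].members shape shown in the question.

```python
import csv
import io
import json


def members_to_csv(payload: dict) -> str:
    """Return CSV text: one header row, then one row per member."""
    # Collect the member objects from every entry under "results".
    members = [m for result in payload["results"] for m in result["members"]]
    if not members:
        return ""

    # Take the column names from the first member, in their original order.
    fieldnames = list(members[0].keys())

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        restval="",             # members missing a key get an empty cell
        extrasaction="ignore",  # keys absent from the header are skipped
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(members)
    return buffer.getvalue()


# Tiny made-up sample in the same shape as the API response.
sample = json.loads(
    '{"results": [{"members": ['
    '{"id": "A000360", "middle_name": null, "in_office": true}'
    ']}]}'
)
print(members_to_csv(sample), end="")
```

For the real data, load data.json with json.load and pass the result to members_to_csv. Two differences from the jq output are worth knowing before importing into SQL: Python's csv module writes booleans as True/False rather than true/false, and it only quotes fields that need quoting.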