Python: adding multiple data points to a dictionary from a CSV file

I have a CSV file that looks like this:

CountryCode, NumberCalled, CallPrice, CallDuration 
BS,+1234567,0.20250,29 
BS,+19876544,0.20250,1 
US,+121234,0.01250,4 
US,+1543215,0.01250,39 
US,+145678,0.01250,11 
US,+18765678,None,0

I want to be able to parse the file and work out some statistics from the data:

CountryCode, NumberOfTimesCalled, TotalPrice, TotalCallDuration 
US, 4, 1.555, 54

Currently, I have a dictionary that is set up like this:

CalledStatistics = {}

When I read each line from the CSV, what is the best way to put the data into the dictionary?

CalledStatistics['CountryCode'] = {'CallDuration', 'CallPrice', 'NumberOfTimesCalled'}

Will adding the second US line overwrite the first one, or will the data be added to what is already stored under the key 'CountryCode'?

Source

2016-03-04 Mathew Jenkinson

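What is the problem? You have a dictionary, and every time you read the CSV the country code gets overwritten, so you end up with a dictionary with the keys (BS, US) and value = the most recent entry, i.e. the earlier data is overwritten. – Seekheart

Do you really intend to assign a set to `CalledStatistics['CountryCode']`? – MattDMo

In a dictionary a KEY is a unique value, so yes, doing this will overwrite the VALUE. You are simply assigning a new VALUE to an already existing KEY (US). – catalesia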

Each call to:

CalledStatistics['CountryCode'] = {'CallDuration', 'CallPrice', 'NumberOfTimesCalled'}

will overwrite the value stored by the previous call.
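A minimal sketch of that behaviour, using made-up values and a hypothetical `called_statistics` name. It also shows the point raised in the comments: braces without `key: value` pairs build a set, not a dict.

```python
# Hypothetical values, only to show the overwrite behaviour.
called_statistics = {}

called_statistics['US'] = {'CallDuration': 4, 'CallPrice': 0.0125}
# Assigning to the same key again replaces the first value entirely.
called_statistics['US'] = {'CallDuration': 39, 'CallPrice': 0.0125}

print(called_statistics)       # only the second US entry is left
print(len(called_statistics))  # still a single key

# Braces without key: value pairs create a set, not a dict.
fields = {'CallDuration', 'CallPrice', 'NumberOfTimesCalled'}
print(type(fields).__name__)
```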

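To calculate the sums you need, you can use a dict of dicts. For example, inside a for loop where the data is held in the variables country_code, call_duration and call_price, and where you want to store the data in collected_statistics. (Edit: I added the first line to turn call_price into 0 if it is recorded as None in the data. This piece of code is meant to deal with consistent data, for example integers only. If other types of data may be present, they need to be converted to integers [or to any numbers of the same type] before Python can sum them.)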

call_price = call_price if call_price is not None else 0

if country_code not in collected_statistics: 
    collected_statistics[country_code] = {'CallDuration' : [call_duration], 
              'CallPrice' : [call_price]} 
else: 
    collected_statistics[country_code]['CallDuration'] += [call_duration] 
    collected_statistics[country_code]['CallPrice'] += [call_price]

And after the loop, for each country_code:

number_of_times_called[country_code] = len(collected_statistics[country_code]['CallDuration'])

total_call_duration[country_code] = sum(collected_statistics[country_code]['CallDuration']) 
total_price[country_code] = sum(collected_statistics[country_code]['CallPrice'])
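For reference, the two fragments above can be put together into something runnable. This is only a sketch: the `rows` list is a hand-typed copy of the question's data (prices as floats for brevity), the three result dicts are initialised here because the fragments assume they already exist, and `setdefault` is used as a compact stand-in for the if/else.

```python
# Sample rows copied from the question: (country_code, call_price, call_duration).
# The price is None where the CSV had no usable value.
rows = [
    ('BS', 0.2025, 29),
    ('BS', 0.2025, 1),
    ('US', 0.0125, 4),
    ('US', 0.0125, 39),
    ('US', 0.0125, 11),
    ('US', None, 0),
]

collected_statistics = {}
for country_code, call_price, call_duration in rows:
    call_price = call_price if call_price is not None else 0
    # setdefault creates the inner dict the first time a code is seen
    entry = collected_statistics.setdefault(
        country_code, {'CallDuration': [], 'CallPrice': []})
    entry['CallDuration'].append(call_duration)
    entry['CallPrice'].append(call_price)

number_of_times_called = {}
total_call_duration = {}
total_price = {}
for country_code, entry in collected_statistics.items():
    number_of_times_called[country_code] = len(entry['CallDuration'])
    total_call_duration[country_code] = sum(entry['CallDuration'])
    total_price[country_code] = sum(entry['CallPrice'])

print(number_of_times_called)
print(total_call_duration)
```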

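OK, finally, here is a complete working script that handles the example you gave. Run against a file named CalledData with exactly the same content as what you provided, it produces the output shown after the script: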

#!/usr/bin/env python3

import csv
import decimal

with open('CalledData', newline='') as csvfile:
    csv_r = csv.reader(csvfile, delimiter=',', quotechar='|')

    # btw this creates a dict, not a set
    collected_statistics = {}

    for row in csv_r:

        [country_code, number_called, call_price, call_duration] = row

        # Only to avoid the first line, but would be better to have a list of available
        # (and correct) codes, and check if the country_code belongs to this list:
        if country_code != 'CountryCode':

            call_price = call_price if call_price != 'None' else 0

            if country_code not in collected_statistics:
                collected_statistics[country_code] = {'CallDuration': [int(call_duration)],
                                                      'CallPrice': [decimal.Decimal(call_price)]}
            else:
                collected_statistics[country_code]['CallDuration'] += [int(call_duration)]
                collected_statistics[country_code]['CallPrice'] += [decimal.Decimal(call_price)]

    for country_code in collected_statistics:
        print(str(country_code) + ":")
        print("number of times called: " + str(len(collected_statistics[country_code]['CallDuration'])))
        print("total price: " + str(sum(collected_statistics[country_code]['CallPrice'])))
        print("total call duration: " + str(sum(collected_statistics[country_code]['CallDuration'])))

Output:

$ ./test_script 
BS: 
number of times called: 2 
total price: 0.40500 
total call duration: 30 
US: 
number of times called: 4 
total price: 0.03750 
total call duration: 54
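A possible variation, not part of the answer above: keep running totals instead of lists, and let `csv.DictReader` name the columns. This sketch assumes a header without stray spaces, reads from an inline string (`SAMPLE`) rather than the CalledData file, and uses a hypothetical `summarize` helper.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Inline copy of the question's data; the header is assumed to have no stray spaces.
SAMPLE = """CountryCode,NumberCalled,CallPrice,CallDuration
BS,+1234567,0.20250,29
BS,+19876544,0.20250,1
US,+121234,0.01250,4
US,+1543215,0.01250,39
US,+145678,0.01250,11
US,+18765678,None,0
"""


def summarize(csvfile):
    """Return {country_code: {'calls', 'price', 'duration'}} as running totals."""
    totals = defaultdict(lambda: {'calls': 0, 'price': Decimal('0'), 'duration': 0})
    for row in csv.DictReader(csvfile):
        entry = totals[row['CountryCode']]
        price = row['CallPrice'].strip()
        entry['calls'] += 1
        # A price recorded as the text 'None' is counted as zero.
        entry['price'] += Decimal(price) if price != 'None' else Decimal('0')
        entry['duration'] += int(row['CallDuration'])
    return dict(totals)


totals = summarize(io.StringIO(SAMPLE))
for code, entry in totals.items():
    print(code, entry['calls'], entry['price'], entry['duration'])
```

With a real file, `summarize(open('CalledData', newline=''))` would be the equivalent call, provided the header matches the assumed column names exactly.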

Source

2016-03-04 16:56:30 zezollo

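This won't work, because the **None** value in the last line will raise a **TypeError**. But it's a good idea. – catalesia

Indeed. I think we can assume that a price of None can be treated as zero. So the data needs to be processed before it is used. I have edited my post to reflect this. – zezollo

It's not as simple as you think :) We don't know all the details. The more complicated the case, the more complicated this gets! Have you tested it? Does it work? Imagine that somewhere in the file somebody wrote "five" instead of 5 ;) – catalesia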
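One way to guard against the "five" case raised here is to validate each field before it is summed. The sketch below is an assumption about how that could look, not something posted in the thread: `parse_price` and `parse_duration` are hypothetical helpers, a price of 'None' is treated as zero as suggested above, and any other unparseable row is set aside instead of being silently counted.

```python
from decimal import Decimal, InvalidOperation


def parse_price(text):
    """Return the price as a Decimal, or None if the text is not a usable number."""
    text = text.strip()
    if text == 'None':  # assumption from the thread: a missing price counts as zero
        return Decimal('0')
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    # Decimal also accepts 'NaN' and 'Infinity'; reject those as prices.
    return value if value.is_finite() else None


def parse_duration(text):
    """Return the duration as an int, or None if the text is not an integer."""
    try:
        return int(text)
    except ValueError:
        return None


raw_rows = [
    ('US', '0.01250', '4'),
    ('US', 'None', '0'),
    ('US', '0.01250', 'five'),
]

accepted, rejected = [], []
for country_code, price_text, duration_text in raw_rows:
    price = parse_price(price_text)
    duration = parse_duration(duration_text)
    if price is None or duration is None:
        rejected.append((country_code, price_text, duration_text))
    else:
        accepted.append((country_code, price, duration))

print(len(accepted), "rows accepted,", len(rejected), "rejected")
```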

Dictionaries can contain lists, and lists of dictionaries, so you can get the structure you want like this:

CalledStatistics['CountryCode'] =[ { 
    'CallDuration':cd_val, 
    'CallPrice':cp_val, 
    'NumberOfTimesCalled':ntc_val } ]

Then you can add values like this:

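for line in lines:
    parts = [part.strip() for part in line.split(',')]
    # pop(0) removes and returns the country code; setdefault creates the
    # list the first time a code is seen, so append cannot raise KeyError
    CalledStatistics.setdefault(parts.pop(0), []).append({
        # after the pop, the CSV columns left are: number, price, duration
        'NumberCalled': parts[0],
        'CallPrice': parts[1],
        'CallDuration': parts[2]})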

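By making the value for each countryCode a list, you can add as many separate dictionaries per countryCode as you like.

The pop(i) method returns the value and mutates the list, so what is left is exactly the data you need for the dictionary values. That is why we pop index 0 and then add indexes 0 - 2 to the dictionary.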
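As a rough illustration of the list-of-dictionaries layout, here is a small self-contained sketch. The three `lines` are copied from the question's data; the `called_statistics` name and the conversion to `float` and `int` are assumptions added so that the values can be summed afterwards.

```python
lines = [
    'BS,+1234567,0.20250,29',
    'US,+121234,0.01250,4',
    'US,+1543215,0.01250,39',
]

called_statistics = {}
for line in lines:
    parts = [part.strip() for part in line.split(',')]
    country_code = parts.pop(0)  # removes the code, leaving number, price, duration
    called_statistics.setdefault(country_code, []).append({
        'NumberCalled': parts[0],
        'CallPrice': float(parts[1]),
        'CallDuration': int(parts[2]),
    })

# Each country code now maps to one dictionary per call.
us_calls = called_statistics['US']
print(len(us_calls))
print(sum(call['CallDuration'] for call in us_calls))
```

The statistics are then derived from the list: its length is the number of calls, and the totals come from summing one field across the dictionaries.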

Source

2016-03-04 16:57:04 arctelix

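Your approach could be slightly different. Just read the file and turn it into a list (readlines(), strip("\n"), split(",")).

Leave out the first line and the last line (the last one is most likely empty, so test for that). Then you can use a dictionary as in @zezollo's example and simply add the values to the keys of the dictionary you create. Make sure that all the values you add to the lists are of the same type.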
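A small sketch of that reading step, under a few assumptions: `io.StringIO` stands in for the real file so the example runs on its own, the `fake_file` and `rows` names are made up, and empty lines are skipped wherever they appear rather than only at the end.

```python
import io

# io.StringIO stands in for open('CalledData'); the content mirrors the question.
fake_file = io.StringIO(
    "CountryCode, NumberCalled, CallPrice, CallDuration \n"
    "BS,+1234567,0.20250,29 \n"
    "US,+121234,0.01250,4 \n"
    "US,+18765678,None,0\n"
    "\n"
)

rows = []
for line in fake_file.readlines()[1:]:  # [1:] drops the header line
    line = line.strip()
    if not line:  # skips empty lines such as a trailing one
        continue
    rows.append([field.strip() for field in line.split(',')])

print(rows)
```

Every field is still a string at this point, so prices and durations need converting to one numeric type before they are summed.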

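It is not hard work at all, and you will remember it for a long time ;)

Test, test, test on mock examples. And read the Python help and documentation. It is excellent.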

Source

2016-03-04 18:17:24 catalesia
