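Iterating over a CSV and removing already-analyzed data

Hello, I am trying to work with a CSV file and iterate over each customer's data. To explain, each customer has 12 months of data. I want to analyze each customer's yearly data, save the correlations from that data to a new list, and loop until all customers have been analyzed.
I have been able to get this working for a single customer, producing the correlations for that customer's data from the CSV. However, my data table has thousands of customers. I want to use a nested for loop to collect all the correlation values for every customer into a list/array. The list would hold one row of correlations for a given customer, and the next row would belong to the next customer.
Here is my current code: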
import numpy
from numpy import genfromtxt
overalldata = genfromtxt('C:\Users\User V\Desktop\CUSTDATA.csv', delimiter=',')
emptylist = []
overalldatasubtract = overalldata[13::]
#This is where I try to use the for loop to go through all the customers. I don't know if len will give me all the rows or the number of columns.
for x in range(0,len(overalldata),11):
    for x in range(0,13,1):
        cust_months = overalldata[0:x,1]
        cust_balancenormal = overalldata[0:x,16]
        cust_demo_one = overalldata[0:x,2]
        cust_demo_two = overalldata[0:x,3]
        num_acct_A = overalldata[0:x,4]
        num_acct_B = overalldata[0:x,5]
        #Correlation Calculations
        demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
        demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]
        demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
        demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
        demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
        demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]
        result_correlation = [demo_one_corr_balance, demo_two_corr_balance, demo_one_corr_acct_a, demo_one_corr_acct_b, demo_two_corr_acct_a, demo_two_corr_acct_b]
        result_correlation_combined = emptylist.append(result_correlation)
    #This is where I try to delete the rows I have already analyzed.
    overalldata = overalldata[11**x::]
print result_correlation_combined
print overalldatasubtract
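On the question raised in the code comment about len: for a 2-D NumPy array, len() returns the size of the first axis (the rows), while .shape reports both dimensions. A small sketch, using a made-up array of zeros as a stand-in for the CSV data (the 24 x 17 size is only an assumption for illustration):

```python
import numpy as np

# Hypothetical stand-in for the CSV: 24 rows (2 customers x 12 months), 17 columns.
data = np.zeros((24, 17))

n_rows = len(data)          # size of the first axis, i.e. the number of rows
shape = data.shape          # (rows, columns)
n_customers = n_rows // 12  # assumes exactly 12 consecutive rows per customer

print(n_rows, shape, n_customers)
```

So stepping through the rows one customer at a time would use a step of 12, assuming every customer really has 12 rows.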
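It looked as if my adding and subtracting of rows was working, but when I tried it on my larger data set I realized that my approach is completely wrong.
Would you do this differently? I think it can work, but I can't find my mistake.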
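One possible restructuring, offered as a sketch and not as the definitive fix: slice the array into consecutive 12-row blocks instead of deleting rows that have already been analyzed. It assumes that every customer has exactly 12 consecutive rows, that there is no header row left in the array (genfromtxt has a skip_header parameter for that case), and that the column indices 2, 3, 4, 5 and 16 from the question are right. The function name customer_correlations and the random test data are made up for illustration.

```python
import numpy as np

def customer_correlations(data, months=12):
    """Return one row of six correlations per block of `months` rows."""
    results = []
    # Step through the array one customer (one block of rows) at a time.
    for start in range(0, len(data) - months + 1, months):
        block = data[start:start + months]
        balance = block[:, 16]
        demo_one, demo_two = block[:, 2], block[:, 3]
        acct_a, acct_b = block[:, 4], block[:, 5]
        # append changes `results` in place; its return value is not used.
        results.append([
            np.corrcoef(balance, demo_one)[1, 0],
            np.corrcoef(balance, demo_two)[1, 0],
            np.corrcoef(acct_a, demo_one)[1, 0],
            np.corrcoef(acct_b, demo_one)[1, 0],
            np.corrcoef(acct_a, demo_two)[1, 0],
            np.corrcoef(acct_b, demo_two)[1, 0],
        ])
    return np.array(results)

# Synthetic data standing in for the CSV: 3 customers x 12 months, 17 columns.
rng = np.random.default_rng(0)
fake = rng.normal(size=(36, 17))
corrs = customer_correlations(fake)
print(corrs.shape)
```

Each row of the returned array belongs to one customer, in the same order as the customers appear in the file. Because the blocks are addressed by their start index, nothing has to be removed from the original array.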
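Thanks, this looks like what I was trying to do, but I still don't get any output. I want to save these correlations with: result_correlation_combined = emptylist.append(result_correlation) However, that doesn't seem to save anything, because I keep getting an empty list.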
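A likely cause, shown with a tiny example: list.append changes the list in place and returns None, so the name on the left-hand side of the assignment never holds the data. The variable names here are made up for illustration.

```python
results = []
returned = results.append([0.1, 0.2])  # append mutates the list in place

print(returned)  # None
print(results)   # [[0.1, 0.2]]
```

In the code from the question, that would mean the accumulated rows are in emptylist itself, so that is the list to print after the loop.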