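How to reclassify records from two categories into four categories
I have a pandas DataFrame that contains, for millions of customers, the product names [a,b,c,d,e,f,j,h,i,j,k,l]. For each product, the data records whether the customer used the product in a given month (represented by 1) or did not use it (represented by 0).
I want to reclassify this original use / non-use classification of customers into four categories: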
S: use
M: maintained use (still used in the following months)
N: no use
D: maintained non-use (not used for several consecutive months)
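To make the four categories concrete, here is a minimal sketch of one possible reading of them, using the letters S, M, N and D as they appear in the expected result. The rule is an assumption: a month whose value differs from the previous month (or the first month on record) gets S or N, and a month that repeats the previous value gets M or D. The function name `reclassify` is hypothetical.

```python
def reclassify(usage):
    """Map a month-ordered 0/1 sequence to S/M/N/D labels.

    Assumed rule: the first month of a run of 1s is "S" and later months
    of that run are "M"; the first month of a run of 0s is "N" and later
    months of that run are "D".
    """
    labels = []
    previous = None  # there is no previous month before the first record
    for value in usage:
        repeated = value == previous
        if value == 1:
            labels.append("M" if repeated else "S")
        else:
            labels.append("D" if repeated else "N")
        previous = value
    return labels


# Product d for customer 19509, Jan through Sep
print(reclassify([0, 0, 0, 1, 1, 1, 0, 0, 1]))
# ['N', 'D', 'D', 'S', 'M', 'M', 'N', 'D', 'S']
```

This handles one product of one customer at a time; it only pins down the rule and is not meant to scale to millions of rows.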
The original data looks like this:
+-------------+-------+---+---+---+---+---+---+---+---+---+---+---+---+
| Customer_ID | Month | a | b | c | d | e | f | j | h | i | j | k | l |
+-------------+-------+---+---+---+---+---+---+---+---+---+---+---+---+
| 19509 | Jan | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19509 | Feb | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| 19509 | Mar | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19509 | Apr | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19509 | May | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19509 | Jun | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19509 | Jul | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19509 | Aug | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19509 | Sep | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 |
| 19510 | Jan | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19510 | Feb | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| 19510 | Mar | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19510 | Apr | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19510 | May | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19510 | Jun | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19510 | Jul | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19510 | Aug | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19510 | Sep | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 |
| 19511 | Jan | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19511 | Feb | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| 19511 | Mar | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19511 | Apr | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19511 | May | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19511 | Jun | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 19511 | Jul | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| 19511 | Aug | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19511 | Sep | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 |
+-------------+-------+---+---+---+---+---+---+---+---+---+---+---+---+
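I want to reclassify customers into four categories, to account for those who kept using a product, or kept not using it, over several months.
The result should look like this: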
+-------------+-------+---+---+---+---+---+---+---+---+---+---+---+---+
| Customer_ID | Month | a | b | c | d | e | f | j | h | i | j | k | l |
+-------------+-------+---+---+---+---+---+---+---+---+---+---+---+---+
| 19509 | Jan | S | N | S | N | N | S | N | S | N | S | S | N |
| 19509 | Feb | M | N | N | D | D | M | D | M | D | N | M | D |
| 19509 | Mar | M | S | S | D | D | M | D | M | D | S | M | D |
| 19509 | Apr | N | M | N | S | D | M | D | M | D | N | N | D |
| 19509 | May | D | N | D | M | S | M | D | M | D | D | D | D |
| 19509 | Jun | D | D | D | M | N | M | D | M | D | D | D | D |
| 19509 | Jul | S | S | S | N | D | M | D | M | D | S | S | D |
| 19509 | Aug | N | M | N | D | D | M | D | N | D | N | N | D |
| 19509 | Sep | S | M | S | S | D | M | D | D | S | S | S | D |
| 19510 | Jan | S | N | S | N | N | S | N | S | N | S | S | N |
| 19510 | Feb | M | N | N | D | D | M | D | M | D | N | M | D |
| 19510 | Mar | M | S | S | D | D | M | D | M | D | S | M | D |
| 19510 | Apr | N | M | N | S | D | M | D | M | D | N | N | D |
| 19510 | May | D | N | D | M | S | M | D | M | D | D | D | D |
| 19510 | Jun | D | D | D | M | N | M | D | M | D | D | D | D |
| 19510 | Jul | S | S | S | N | D | M | D | M | D | S | S | D |
| 19510 | Aug | N | M | N | D | D | M | D | N | D | N | N | D |
| 19510 | Sep | S | M | S | S | D | M | D | D | S | S | S | D |
| 19511 | Jan | S | N | S | N | N | S | N | S | N | S | S | N |
| 19511 | Feb | M | N | N | D | D | M | D | M | D | N | M | D |
| 19511 | Mar | M | S | S | D | D | M | D | M | D | S | M | D |
| 19511 | Apr | N | M | N | S | D | M | D | M | D | N | N | D |
| 19511 | May | D | N | D | M | S | M | D | M | D | D | D | D |
| 19511 | Jun | D | D | D | M | N | M | D | M | D | D | D | D |
| 19511 | Jul | S | S | S | N | D | M | D | M | D | S | S | D |
| 19511 | Aug | N | M | N | D | D | M | D | N | D | N | N | D |
| 19511 | Sep | S | M | S | S | D | M | D | D | S | S | S | D |
+-------------+-------+---+---+---+---+---+---+---+---+---+---+---+---+
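The algorithm seems complicated, and I am still working out the right sequence of steps.
I want to do this for every product (column) of every customer. I think we could start like this: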
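for cust_id, group in df.groupby("Customer_ID"):
    for col in df.columns.drop(["Customer_ID", "Month"]):
        pass  # classify group[col] month by month here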
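Nested Python loops over millions of customers are likely to be slow, so a vectorized variant may be worth trying instead. The sketch below is an assumption-laden illustration, not a definitive solution: it assumes rows are already in month order within each customer, that every column other than `Customer_ID` and `Month` is a 0/1 product column with a unique name, and that a month repeating the previous month's value becomes M or D while a changed (or first) month becomes S or N. The names `reclassify_frame` and `demo` are hypothetical.

```python
import numpy as np
import pandas as pd


def reclassify_frame(df, id_col="Customer_ID", month_col="Month"):
    """Return a new frame with the 0/1 product columns relabelled S/M/N/D."""
    products = [c for c in df.columns if c not in (id_col, month_col)]
    values = df[products]
    # Previous month's value for the same customer; NaN on each customer's first row
    previous = values.groupby(df[id_col]).shift()
    repeated = values.eq(previous).to_numpy()  # NaN never compares equal
    used = values.eq(1).to_numpy()
    labels = np.where(
        used,
        np.where(repeated, "M", "S"),
        np.where(repeated, "D", "N"),
    )
    labelled = pd.DataFrame(labels, index=df.index, columns=products)
    return pd.concat([df[[id_col, month_col]], labelled], axis=1)


# Small made-up frame with two customers and two products
demo = pd.DataFrame({
    "Customer_ID": [19509] * 4 + [19510] * 4,
    "Month": ["Jan", "Feb", "Mar", "Apr"] * 2,
    "d": [0, 0, 0, 1, 0, 0, 0, 1],
    "e": [0, 0, 0, 0] * 2,
})
result = reclassify_frame(demo)
print(result)
```

Because the shift is done per customer, the first month of each customer never inherits the last month of the previous customer.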
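Note: this is not really a matter of use versus non-use. Rather, the customer joins (1), cancels (0), stays inactive (0), may join again (1), and so on. So a zero means the customer cancelled the service, and zeros over the following three months mean he is not a customer. He may then join and cancel again, and we should know how many times he cancelled the service. If we only compute totals, that tells us neither how many times the customer joined nor how many times he cancelled a particular product or service.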
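The counting described in this note could be sketched as follows, by comparing each month with the previous month of the same customer. This is only an illustration under assumptions: rows are in month order within each customer, a cancellation is a 1 followed by a 0, a re-join is a 0 followed by a 1, and a customer's first month counts as neither because no earlier month is visible. The names `count_transitions` and `demo` are hypothetical.

```python
import pandas as pd


def count_transitions(df, old, new, id_col="Customer_ID", month_col="Month"):
    """Count old -> new month-to-month changes per customer and product."""
    products = [c for c in df.columns if c not in (id_col, month_col)]
    values = df[products]
    # Previous month's value for the same customer; NaN on each customer's first row
    previous = values.groupby(df[id_col]).shift()
    changed = previous.eq(old) & values.eq(new)
    return changed.groupby(df[id_col]).sum()


# Product k for customer 19509, Jan through Sep
demo = pd.DataFrame({
    "Customer_ID": [19509] * 9,
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"],
    "k": [1, 1, 1, 0, 0, 0, 1, 0, 1],
})
cancellations = count_transitions(demo, 1, 0)  # 1 -> 0
rejoins = count_transitions(demo, 0, 1)        # 0 -> 1
print(cancellations)
print(rejoins)
```

For this customer the sketch counts two cancellations (Mar to Apr, Jul to Aug) and two re-joins (Jun to Jul, Aug to Sep), which a plain column total would not reveal.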
I would appreciate any comments or ideas on how to solve this problem.
Why the 'r' tag? – Sotos