You can use:
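import numpy as np

# Rows with no Code hold the day name in 'id'; forward-fill it into the index
df.index = df['id'].where(df['Code'].isnull()).ffill()
# Drop the repeated header rows and the day-name rows themselves
df = df[(df['Code'] != 'Code') & (df['id'] != df.index)]
df = df.rename_axis('Day').rename_axis('Week', axis=1)
# Turn 'n' into NaN so those cells are dropped when stacking
df = (df.set_index(['id', 'Code'], append=True)
        .replace({'n': np.nan})
        .stack()
        .reset_index(name='val'))
# Pull the week number out of the column label
df['Week'] = df['Week'].str.extract(r'(\d+)', expand=False).astype(int)
cols = ['Code', 'Day', 'Week']
df = df.drop(['val', 'id'], axis=1)[cols].sort_values(['Week', 'Code']).reset_index(drop=True)
print(df)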
    Code      Day  Week
0    100   sunday     1
1    600   Monday     1
2    900  Tuesday     1
3    100   sunday     2
4    200   sunday     2
5    500   Monday     2
6    600   Monday     2
7    800  Tuesday     2
8    300   sunday     3
9    500   Monday     3
10   600   Monday     3
11   800  Tuesday     3
12   900  Tuesday     3
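The first line of the code does most of the work, so here is a minimal sketch of it in isolation. The two toy Series are hypothetical values, not the asker's data; they only assume that day names sit in the id column on rows where Code is missing.

```python
import numpy as np
import pandas as pd

# Toy columns (hypothetical values): day names appear where Code is missing
ids = pd.Series(['sunday', 1, 2, 'Monday', 1], dtype=object)
code = pd.Series([np.nan, 100, 200, np.nan, 500], dtype=object)

# where() keeps an id only on rows with no Code (everything else becomes NaN),
# then ffill() copies each day name down over the rows that follow it
day = ids.where(code.isnull()).ffill()
print(day.tolist())
```

Assigning this result to df.index is what lets the later steps treat the day as an ordinary index level.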
For a more general output that keeps the id column and all y and n values, remove the replace step:
df.index = df['id'].where(df['Code'].isnull()).ffill()
df = df[(df['Code'] != 'Code') & (df['id'] != df.index)]
df = df.rename_axis('Day').rename_axis('Week', axis=1)
df = df.set_index(['id', 'Code'], append=True).stack().reset_index(name='val')
df['Week'] = df['Week'].str.extract(r'(\d+)', expand=False).astype(int)
print(df)
        Day  id  Code  Week val
0    sunday   1   100     1   y
1    sunday   1   100     2   y
2    sunday   1   100     3   n
3    sunday   2   200     1   n
4    sunday   2   200     2   y
5    sunday   2   200     3   n
6    sunday   3   300     1   n
7    sunday   3   300     2   n
8    sunday   3   300     3   y
9    Monday   1   500     1   n
10   Monday   1   500     2   y
11   Monday   1   500     3   y
12   Monday   2   600     1   y
13   Monday   2   600     2   y
14   Monday   2   600     3   y
15  Tuesday   1   800     1   n
16  Tuesday   1   800     2   y
17  Tuesday   1   800     3   y
18  Tuesday   2   900     1   y
19  Tuesday   2   900     2   n
20  Tuesday   2   900     3   y
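To try the steps end to end, here is a self-contained sketch. The input frame is an assumption: its layout (day-name rows with a missing Code, a header repeated as a data row, columns named 'Week 1' to 'Week 3') is rebuilt from the printed output and may not match the real Excel sheet exactly. It also filters on val == 'y' explicitly rather than relying on NaN values being dropped during stacking, which is a swapped-in alternative to the replace step.

```python
import numpy as np
import pandas as pd

# Hypothetical input, reconstructed from the printed output (layout is assumed)
header = ['id', 'Code', 'Week 1', 'Week 2', 'Week 3']
blank = [np.nan] * 4
rows = [
    ['sunday'] + blank,
    [1, 100, 'y', 'y', 'n'],
    [2, 200, 'n', 'y', 'n'],
    [3, 300, 'n', 'n', 'y'],
    ['Monday'] + blank,
    header,                      # header repeated as a data row
    [1, 500, 'n', 'y', 'y'],
    [2, 600, 'y', 'y', 'y'],
    ['Tuesday'] + blank,
    header,
    [1, 800, 'n', 'y', 'y'],
    [2, 900, 'y', 'n', 'y'],
]
df = pd.DataFrame(rows, columns=header, dtype=object)

# Day names live in 'id' where Code is missing; forward-fill them as the index
df.index = df['id'].where(df['Code'].isnull()).ffill()
# Drop day-name rows (no Code) and repeated header rows
df = df[df['Code'].notnull() & (df['Code'] != 'Code')]
df = df.rename_axis('Day').rename_axis('Week', axis=1)

# Wide to long: one row per (Day, id, Code, Week)
long_df = df.set_index(['id', 'Code'], append=True).stack().reset_index(name='val')
long_df['Week'] = long_df['Week'].str.extract(r'(\d+)', expand=False).astype(int)
long_df['Code'] = long_df['Code'].astype(int)

# Explicit filter instead of replace({'n': np.nan}) before stacking
yes_only = (long_df[long_df['val'] == 'y']
            .sort_values(['Week', 'Code'])
            .reset_index(drop=True)[['Code', 'Day', 'Week']])
print(yes_only)
```

Here long_df corresponds to the general output and yes_only to the first, filtered output.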
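Step one, man! Make sure whoever created this dataframe is never allowed to create another one. – piRSquared
@piRSquared LoL. I'm actually reading an Excel file in Python, and the dataframe looks like this :P. That's why I'm stuck. – Shubham
My eyes... they hurt... –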