2017-02-14

Pandas DataFrame with a column of sets of arbitrary length: unpack the set values and repeat the rows

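import numpy as np
import pandas as pd

n = np.nan  # not used below
# Sample frame: the 'overlap' column holds sets of varying length.
stack1 = pd.DataFrame.from_dict(
    {'letter1': ['a', 'b', 'c', 'y'],
     'letter2': ['o', 'p', 'q', 'y'],
     'overlap': [{'v'}, {'c'}, {'c'}, {'v', 'c'}]
    })
stack1.reset_index(inplace=True, drop=True)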

[Image: the stack1 DataFrame]

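Given this DataFrame, how can I unpack the contents of the sets and create a new row for each unpacked element? It would be nice if the solution also worked for other containers, such as lists and tuples.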

Desired result:

[Image: the desired output]

Answer


Try this:

In [32]: col_to_unpack = 'overlap' 

In [33]: df = stack1.copy() 

In [34]: pd.DataFrame({
    ...:     col: np.repeat(df[col].values, df[col_to_unpack].str.len())
    ...:     for col in df.columns.difference([col_to_unpack])
    ...: }).assign(**{col_to_unpack: np.concatenate(df[col_to_unpack].map(list).values)})[df.columns.tolist()]
    ...:
Out[34]:
  letter1 letter2 overlap
0       a       o       v
1       b       p       c
2       c       q       c
3       y       y       c
4       y       y       v
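
The question also asks about lists and tuples. The same repeat-and-flatten idea can be wrapped in a small function, sketched below. The name `unpack_column` is hypothetical and not part of pandas, and the sketch assumes every value in the target column is a sized container (list, tuple, or set). Rows whose container is empty are dropped, and the order of elements taken from a set is arbitrary.

```python
import numpy as np
import pandas as pd


def unpack_column(df, col):
    """Return a new frame with one row per element of the containers in `col`.

    Hypothetical helper, not a pandas API. Assumes each value in `col`
    is a list, tuple, or set; rows with an empty container are dropped.
    """
    # Materialize each container as a list so sets, tuples and lists
    # are all handled the same way.
    containers = [list(c) for c in df[col]]
    lengths = np.array([len(c) for c in containers], dtype=int)
    # Repeat each row position once per element of its container.
    positions = np.repeat(np.arange(len(df)), lengths)
    out = df.iloc[positions].reset_index(drop=True)
    # Overwrite the container column with the flattened elements.
    out[col] = [item for c in containers for item in c]
    return out


# Same data as in the question, but mixing a set, a list and a tuple.
stack1 = pd.DataFrame({
    'letter1': ['a', 'b', 'c', 'y'],
    'letter2': ['o', 'p', 'q', 'y'],
    'overlap': [{'v'}, ['c'], ('c',), {'v', 'c'}],
})
result = unpack_column(stack1, 'overlap')
print(result)
```

Selecting repeated row positions with `iloc` keeps every other column and its dtype without naming the columns one by one, so the column order of the input is preserved.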