您已有ndarray
。你正在尋找的是一個結構化數組,一個是這個複合dtype。首先看看pandas
是否可以爲你做。如果失敗了,我們可能可以通過tolist
和列表理解來做些事情。
In [84]: dt=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<
...: f8'),
...: ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8'),
...: ('NULL_COUNT_B', '<f8')]
In [85]: subset=np.array([[ 2. , 12. , 33.33333333, 2.
...: ,
...: 33.33333333, 12. ],
...: [ 2. , 2. , 33.33333333, 2. ,
...: 33.33333333, 2. ],
...: [ 2.8 , 8. , 45.83333333, 2.75 ,
...: 46.66666667, 13. ],
...: [ 3.11320755, 75. , 56. , 3.24 ,
...: 52.83018868, 33. ]])
In [86]: subset
Out[86]:
array([[ 2. , 12. , 33.33333333, 2. ,
33.33333333, 12. ],
[ 2. , 2. , 33.33333333, 2. ,
33.33333333, 2. ],
[ 2.8 , 8. , 45.83333333, 2.75 ,
46.66666667, 13. ],
[ 3.11320755, 75. , 56. , 3.24 ,
52.83018868, 33. ]])
現在用dt
製作一個數組。輸入一個結構數組必須是一個元組列表 - 所以我使用tolist
和列表理解
In [87]: np.array([tuple(row) for row in subset.tolist()],dtype=dt)
....
ValueError: field 'NULL_COUNT_B' occurs more than once
In [88]: subset.shape
Out[88]: (4, 6)
In [89]: dt
Out[89]:
[('PERCENT_A_NEW', '<f8'),
('JoinField', '<i4'),
('NULL_COUNT_B', '<f8'),
('PERCENT_COMP_B', '<f8'),
('RANKING_A', '<f8'),
('RANKING_B', '<f8'),
('NULL_COUNT_B', '<f8')]
In [90]: dt=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<
...: f8'),
...: ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8')]
In [91]: np.array([tuple(row) for row in subset.tolist()],dtype=dt)
Out[91]:
array([(2.0, 12, 33.33333333, 2.0, 33.33333333, 12.0),
(2.0, 2, 33.33333333, 2.0, 33.33333333, 2.0),
(2.8, 8, 45.83333333, 2.75, 46.66666667, 13.0),
(3.11320755, 75, 56.0, 3.24, 52.83018868, 33.0)],
dtype=[('PERCENT_A_NEW', '<f8'), ('JoinField', '<i4'), ('NULL_COUNT_B', '<f8'), ('PERCENT_COMP_B', '<f8'), ('RANKING_A', '<f8'), ('RANKING_B', '<f8')])
你應該用正確的D型像'np.int16','np.float32','NP。 float64' .... – Chr
你可以在熊貓身上使用['.astype'](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.astype.html)方法。爲什麼不必要地轉換爲數組? – Kartik
@Kartik我正在使用的程序使用numpy數組。 –