2015-09-06 86 views

Convert an array of tuples into a 2D array

I have an array of tuples that was loaded from a CSV file with the np.genfromtxt() function.

import numpy as np
import re
from matplotlib.dates import strpdate2num

def convert_string_to_bigint(x):
    # Turn e.g. "2012/7/2 14:07:00" into the integer 201207021407 (seconds are dropped)
    p = re.compile(r'(\d{4})/(\d{1,2})/(\d{1,2}) (\d{1,2}):(\d{2}):\d{2}')
    m = p.findall(x)
    l = list(m[0])
    l[1] = ('0' + l[1])[-2:]  # zero-pad the month
    l[2] = ('0' + l[2])[-2:]  # zero-pad the day
    return long("".join(l))

#print convert_string_to_bigint("2012/7/2 14:07:00")
csv = np.genfromtxt('sr00-1min.txt', delimiter=',', converters={0:convert_string_to_bigint})

A sample of the data in the CSV file:

2015/9/2 14:54:00,5169,5170,5167,5168 
2015/9/2 14:55:00,5168,5169,5166,5166 
2015/9/2 14:56:00,5167,5170,5165,5169 
2015/9/2 14:57:00,5168,5173,5167,5172 
2015/9/2 14:58:00,5172,5187,5171,5182 
2015/9/2 14:59:00,5182,5183,5171,5176 
2015/9/2 15:00:00,5176,5183,5174,5182 

After loading, it looks like this:

[(201509021455L, 5168.0, 5169.0, 5166.0, 5166.0) 
(201509021456L, 5167.0, 5170.0, 5165.0, 5169.0) 
(201509021457L, 5168.0, 5173.0, 5167.0, 5172.0) 
(201509021458L, 5172.0, 5187.0, 5171.0, 5182.0) 
(201509021459L, 5182.0, 5183.0, 5171.0, 5176.0) 
(201509021500L, 5176.0, 5183.0, 5174.0, 5182.0)] 

I want to convert it into a 2D NumPy array. It should look like this:

[[201509021455L, 5168.0, 5169.0, 5166.0, 5166.0] 
[201509021456L, 5167.0, 5170.0, 5165.0, 5169.0] 
[201509021457L, 5168.0, 5173.0, 5167.0, 5172.0] 
[201509021458L, 5172.0, 5187.0, 5171.0, 5182.0] 
[201509021459L, 5182.0, 5183.0, 5171.0, 5176.0] 
[201509021500L, 5176.0, 5183.0, 5174.0, 5182.0]] 

I used the following code to solve this, but it looks extremely ugly. Could anyone tell me how to do the conversion in a more elegant way?

pool = np.asarray([x for x in csv if x[0] > 201508010000]) 
sj = np.asarray([x[0] for x in pool]) 
kpj = np.asarray([x[1] for x in pool]) 
zgj = np.asarray([x[2] for x in pool]) 
zdj = np.asarray([x[3] for x in pool]) 
spj = np.asarray([x[4] for x in pool]) 
output = np.column_stack((sj,kpj,zgj,zdj,spj)) 
print output.shape 
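
For comparison, the filter-and-stack step above could be sketched with a boolean mask and `tolist()`. This is only an illustration under assumptions: it is written for Python 3 rather than the Python 2 used in the post, the small `csv` array is a hypothetical stand-in for what genfromtxt returns, and its field names (`f0` to `f4`) are chosen here, not taken from the post.

```python
import numpy as np

# Hypothetical stand-in for the structured array returned by genfromtxt.
# The first row is made up so that the filter has something to drop.
csv = np.array(
    [
        (201507311500, 5100.0, 5101.0, 5099.0, 5100.0),  # made-up row before the cutoff
        (201509021455, 5168.0, 5169.0, 5166.0, 5166.0),
        (201509021456, 5167.0, 5170.0, 5165.0, 5169.0),
    ],
    dtype=[("f0", "<i8"), ("f1", "<f8"), ("f2", "<f8"), ("f3", "<f8"), ("f4", "<f8")],
)

# Boolean mask on the timestamp field replaces the list comprehension.
pool = csv[csv["f0"] > 201508010000]

# tolist() yields a list of plain tuples; np.array turns them into one
# homogeneous 2D array (the integer timestamps are upcast to float64).
output = np.array(pool.tolist())
print(output.shape)
```

The timestamps become floats in the result, which is exact for 12-digit values like these.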

What does csv look like? – unutbu

What do you mean by a 2D array? How do you want the output for that same input to look? – sureshvv

What is the expected output? – luoluo

Answer


In convert_string_to_bigint, change

return long("".join(l))

to

return float("".join(l))

Then genfromtxt will recognize all the values as floats and return a 2D array of float dtype:

In [23]: np.genfromtxt ('sr00-1min.txt', delimiter=',', converters={0:convert_string_to_bigint}).shape 
Out[23]: (7, 5) 

instead of a 1D structured array of mixed dtype.
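
A minimal, self-contained sketch of this fix, assuming Python 3 and a NumPy version whose genfromtxt accepts the `encoding` argument. The sample rows from the question are read from an in-memory string instead of sr00-1min.txt, and `convert_string_to_float` is a hypothetical rename of the question's converter, not a name from the post.

```python
import io
import re

import numpy as np

# The seven sample rows from the question, held in memory for the demo.
SAMPLE = """\
2015/9/2 14:54:00,5169,5170,5167,5168
2015/9/2 14:55:00,5168,5169,5166,5166
2015/9/2 14:56:00,5167,5170,5165,5169
2015/9/2 14:57:00,5168,5173,5167,5172
2015/9/2 14:58:00,5172,5187,5171,5182
2015/9/2 14:59:00,5182,5183,5171,5176
2015/9/2 15:00:00,5176,5183,5174,5182
"""

PATTERN = re.compile(r"(\d{4})/(\d{1,2})/(\d{1,2}) (\d{1,2}):(\d{2}):\d{2}")


def convert_string_to_float(x):
    # Depending on the NumPy version and encoding, the converter may get bytes.
    if isinstance(x, bytes):
        x = x.decode()
    year, month, day, hour, minute = PATTERN.match(x).groups()
    # The hour is zero-padded here as well, so every key has 12 digits
    # (the question's converter pads only month and day).
    return float(year + month.zfill(2) + day.zfill(2) + hour.zfill(2) + minute)


data = np.genfromtxt(
    io.StringIO(SAMPLE),
    delimiter=",",
    converters={0: convert_string_to_float},
    encoding="utf-8",  # so the converter receives str
)
print(data.shape, data.dtype)  # (7, 5) float64

# With a plain 2D array, the question's filter becomes a boolean mask.
recent = data[data[:, 0] > 201508010000]
```

Because the converter returns a float, every column has the same type as the default dtype, which is why the result is a regular 2D array and the filter needs no per-column list comprehensions.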