2017-03-08 45 views
2

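I am trying to plot the duration of some programs that run at night. The durations are exported to a CSV file so they can be analysed later (like this). How can I plot a program's duration in Python?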

example

Here is my code and a sample of the CSV:

CSV:

na,programName,totaal,na,startDate,endDate,Date 
?,"to/check.apl",54006,?,2017-02-27T20:04:07.233,2017-02-27T20:05:01.239,2017-02-27T00:00:00.000
?,"to/ibx.apl",143887,?,2017-02-27T20:07:55.627,2017-02-27T20:10:19.514,2017-02-27T00:00:00.000 
?,"to/checker.apl",2039600,?,2017-02-27T20:14:37.662,2017-02-27T20:48:37.262,2017-02-27T00:00:00.000 

Python code:

import matplotlib 
from pandas import * 
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 

matplotlib.style.use('ggplot') 

data = "miFile.csv" 
df = pd.DataFrame.from_csv(data) 
df = df.set_index('totaal') 

newDf = df[['programName','startDate','endDate']] 

So far I get a datetime error, so I tried to fix it like this (still no luck with the plot):

newDf['startDate'] = pd.to_datetime(newDf['startDate']) 
newDf['endDate'] = pd.to_datetime(newDf['endDate']) 

#pd.to_datetime(pd.Series(["2017-02-27T20:04:07.233"]) format= "%d, %m, %y, %H: %M: %S") 

newDf.plot('programName','startDate','endDate') 

plt.show() 

Answers

2

I think you need read_csv to create the df, then take the difference of the two date columns, convert the timedelta to minutes, and plot:

import pandas as pd
import matplotlib.pyplot as plt
from io import StringIO

temp=u"""na,programName,totaal,na,startDate,endDate,Date 
?,"to/check.apl",54006,?,2017-02-27T20:04:07.233,2017-02-27T20:05:01.239,2017-02-27T00:00:00.000 
?,"to/ibx.apl",143887,?,2017-02-27T20:07:55.627,2017-02-27T20:10:19.514,2017-02-27T00:00:00.000 
?,"to/checker.apl",2039600,?,2017-02-27T20:14:37.662,2017-02-27T20:48:37.262,2017-02-27T00:00:00.000""" 
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), index_col=[2], parse_dates=[4,5,6]) 

print (df.dtypes) 
na                     object
programName            object
na.1                   object
startDate      datetime64[ns]
endDate        datetime64[ns]
Date           datetime64[ns]
dtype: object
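# whole minutes as floats on older pandas; newer releases may reject the [m] unit,
# in which case (df['endDate'] - df['startDate']).dt.total_seconds() // 60 is an equivalent
df['duration'] = (df['endDate'] - df['startDate']).astype('timedelta64[m]') 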
newDf = df[['programName','duration']] 
print (newDf) 
            programName  duration
totaal
54006      to/check.apl       0.0
143887       to/ibx.apl       2.0
2039600  to/checker.apl      33.0

newDf.plot() 

plt.show() 
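
As a side note on the duration step above, here is a minimal sketch of an alternative that keeps fractional values instead of whole minutes. It uses the three sample rows from the question; the names `csv_text`, `elapsed`, `duration_s` and `duration_min` are made up for this illustration and do not come from the thread.

```python
from io import StringIO

import pandas as pd

# The three sample rows from the question, inlined so the sketch is self-contained.
csv_text = """na,programName,totaal,na,startDate,endDate,Date
?,"to/check.apl",54006,?,2017-02-27T20:04:07.233,2017-02-27T20:05:01.239,2017-02-27T00:00:00.000
?,"to/ibx.apl",143887,?,2017-02-27T20:07:55.627,2017-02-27T20:10:19.514,2017-02-27T00:00:00.000
?,"to/checker.apl",2039600,?,2017-02-27T20:14:37.662,2017-02-27T20:48:37.262,2017-02-27T00:00:00.000
"""

df = pd.read_csv(StringIO(csv_text), parse_dates=["startDate", "endDate"])

# Subtracting two datetime columns yields a timedelta column.
elapsed = df["endDate"] - df["startDate"]

# total_seconds() returns plain floats, so runs shorter than a minute keep their value.
df["duration_s"] = elapsed.dt.total_seconds()
df["duration_min"] = df["duration_s"] / 60

print(df[["programName", "duration_s", "duration_min"]])
```

With this approach the first program shows up as roughly 54 seconds (0.9 minutes) rather than 0 minutes.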
+0

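Thanks, this works well. I used `newDf.plot('programName', 'duration')` to get it right, and I also used `astype('timedelta64[s]')` to get the durations in seconds. But I only see 7 programs where there should be about 70 – H35am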

+0

And if you test `print(df)`, are there only 7 rows with 7 program names? – jezrael

+0

`print(df)` gives me this: `[70 rows x 6 columns]` – H35am
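
The thread does not settle why only some of the 70 programs appear. One possible explanation, which is only an assumption here, is that a line plot labels just a handful of x ticks even though every row is drawn. A rough sketch of a horizontal bar chart that labels every row follows; the hard-coded durations are the three sample rows from the question, and the figure-size heuristic is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Durations in seconds for the three sample programs from the question.
df = pd.DataFrame(
    {
        "programName": ["to/check.apl", "to/ibx.apl", "to/checker.apl"],
        "duration": [54.006, 143.887, 2039.6],
    }
)

# Scale the figure height with the row count so many labels stay readable
# (0.3 inch per row is an arbitrary starting point, not a recommendation).
fig, ax = plt.subplots(figsize=(8, max(2.0, 0.3 * len(df))))

# A horizontal bar chart draws one labelled bar per row.
df.plot.barh(x="programName", y="duration", ax=ax, legend=False)
ax.set_xlabel("duration (s)")
fig.tight_layout()
```

Calling `plt.show()` afterwards displays the figure; with the real file, `df` would be the frame read from the CSV instead of the hand-built one.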

0

I suggest you use pandas.read_csv() instead of pandas.DataFrame.from_csv(). Then I would look at the T that separates the date from the time.
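
To illustrate the point about the T, here is a small sketch using two timestamps copied from the question's CSV. It assumes nothing beyond `pd.to_datetime`; the variable names are made up for the example.

```python
import pandas as pd

raw = pd.Series(["2017-02-27T20:04:07.233", "2017-02-27T20:05:01.239"])

# ISO 8601 strings like these are recognised without a format argument.
inferred = pd.to_datetime(raw)

# If a format is given, it needs a literal "T" between date and time
# and "%f" for the fractional seconds.
explicit = pd.to_datetime(raw, format="%Y-%m-%dT%H:%M:%S.%f")

# Date and time parts can be pulled apart after parsing if needed.
print(explicit.dt.date.tolist())
print(explicit.dt.time.tolist())
```

The commented-out format string in the question (`"%d, %m, %y, %H: %M: %S"`) does not match this layout, which may be one source of the datetime error.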

0

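Thanks to jezrael, this is what my final solution looks like, and it works properly. I plot in seconds because programs that run for less than 1 minute would otherwise be ignored, which is inaccurate in my case.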

import matplotlib 
import pandas as pd 
import matplotlib.pyplot as plt 

matplotlib.style.use('ggplot') 

data = "miFile.csv" 
df = pd.read_csv(data,index_col=[2], parse_dates=[4,5,6]) 

df['duration'] = (df['endDate'] - df['startDate']).astype('timedelta64[s]') 
newDf = df[['programName','duration']] 

newDf.plot('programName','duration') 
plt.show() 