2017-06-06 60 views
1

I can't figure out how to plot values on a time chart, split by a string column. Plotting values by date, separated by string

This is my data.

Input (from CSV):

Fecha,Pais,count 
"20/05/2017",Brazil,1 
"20/05/2017",China,821 
"20/05/2017",Czechia,31 
"20/05/2017",France,1 
"20/05/2017","Republic of Korea",1 
"21/05/2017",Argentina,5 
"21/05/2017",Australia,2 
"21/05/2017",China,3043 
"21/05/2017",Denmark,1 
"21/05/2017",Egypt,1 
... 
.. 
. 

I have imported the data from the CSV, and the dates, strings and integer values parse fine:

DatetimeIndex(['2017-05-20', '2017-05-20', '2017-05-20', '2017-05-20', 
       '2017-05-20', '2017-05-21', '2017-05-21', '2017-05-21', 
       '2017-05-21', '2017-05-21', '2017-05-21', '2017-05-21', 
       '2017-05-21', '2017-05-21', '2017-05-21', '2017-05-21', 
       '2017-05-21', '2017-05-21', '2017-05-21', '2017-05-21', 
       '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22', 
       '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22', 
       '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22', 
       '2017-05-22', '2017-05-22', '2017-05-22', '2017-05-22'], 
       dtype='datetime64[ns]', freq=None) 
['Brazil' 'China' 'Czechia' 'France' 'Republic of Korea' 'Argentina' 
'Australia' 'China' 'Denmark' 'Egypt' 'France' 'Hungary' 'Netherlands' 
'Oman' 'Republic of Korea' 'Russia' 'Slovak Republic' 'Taiwan' 'Ukraine' 
'United Arab Emirates' 'Argentina' 'Brazil' 'China' 'Czechia' 'Ecuador' 
'France' 'Germany' 'India' 'Latvia' 'Liberia' 'Pakistan' 'Peru' 
'Republic of Korea' 'Russia' 'Taiwan' 'Ukraine'] 
['1' '821' '31' '1' '1' '5' '2' '3043' '1' '1' '1' '1' '1' '1' '1' '1' '1' 
'3' '48' '1' '2' '1' '3759' '79' '2' '1' '3' '1' '192' '1' '1' '1' '1' '2' 
'1' '1'] 

I do already have a plot:

see plot figure

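However, I cannot join the values that belong to the same country, so that each country's history is plotted across the dates that contain data.

Here is my code: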

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
from matplotlib.dates import DateFormatter, DayLocator, AutoDateLocator, AutoDateFormatter 
import datetime 


locator = DayLocator() 
formatter = AutoDateFormatter(locator) 

# Read every column as text; the header row is skipped
date, country, count = np.loadtxt("72hcountcountry.csv",
            delimiter=',',
            unpack=True,
            dtype=str,
            skiprows=1)

# Strip the double quotes left around some fields
date = np.char.replace(date, '"', '')
country = np.char.replace(country, '"', '')
date2 = pd.to_datetime(date, format="%d/%m/%Y")

print(date2)
print(country)
print(count)

fig, ax = plt.subplots() 

ax.plot_date(date2, count) 
ax.xaxis.set_major_locator(locator) 
ax.xaxis.set_major_formatter(formatter) 
ax.autoscale_view() 

ax.grid(True) 
fig.autofmt_xdate() 

plt.show() 

How can I plot each country as a separate line, across the dates for which it has data?

+0

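Also, may I suggest changing the title of the question to something related to your actual problem? Something like "Plotting data from a file separated by the (string) values in one column" might be better, IMHO.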

+0

Thank you very much @Pablo

Answers

0

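If I understand correctly what you are trying to do, you can achieve it with the Pandas library: read the input data into a DataFrame (read_csv can parse the dates for you if you pass parse_dates, plus dayfirst=True for this DD/MM/YYYY format), and then make use of the groupby method (see the documentation here).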
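To make the groupby step concrete, here is a minimal sketch on a few rows taken from the question's data, with no plotting involved. The inline `csv_text` and the name `series_by_country` are only for illustration and are not part of the original code.

```python
import io
import pandas as pd

# Illustrative inline sample: a handful of rows from the question's CSV.
csv_text = """Fecha,Pais,count
"20/05/2017",Brazil,1
"20/05/2017",China,821
"21/05/2017",Argentina,5
"21/05/2017",China,3043
"22/05/2017",Argentina,2
"22/05/2017",Brazil,1
"22/05/2017",China,3759
"""

# parse_dates + dayfirst tell pandas that 'Fecha' holds DD/MM/YYYY dates.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Fecha"], dayfirst=True)

# Iterating over a groupby yields (key, sub-DataFrame) pairs, one per country.
series_by_country = {
    country: group.set_index("Fecha")["count"]
    for country, group in df.groupby("Pais")
}

for country, series in series_by_country.items():
    print(country, series.tolist())
```

Each entry is then a small date-indexed series for one country, which is the shape you want to hand to a plotting call.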

For the case of the CSV file, a simple example is here (you may additionally want to change the formatting of the ticks, etc.):

import pandas as pd
import matplotlib.pyplot as plt

infile = "foo.csv"

# Read in the file to a Pandas 'DataFrame', parsing the
# DD/MM/YYYY strings in the 'Fecha' column as dates
df = pd.read_csv(infile, parse_dates=['Fecha'], dayfirst=True)

# Group the different entries by the content of the
# Country/Pais column
dfg = df.groupby('Pais')

fig, ax = plt.subplots()

# Loop over the groups (one per country name),
# and plot each one separately (assigning the appropriate label)
for country, thisdf in dfg:
    ax.plot(thisdf['Fecha'], thisdf['count'], 'o-', label=country)

ax.legend()
fig.autofmt_xdate()

plt.show()

Here is the result (for a minimal version of your input file): example plot
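If you also want daily ticks with an explicit date format, one possible variation is to pivot the data into a wide table (one column per country) and set the locator and formatter yourself. This is only a sketch under assumptions: the inline `csv_text` is a small illustrative subset of the question's data, the `"%d/%m/%Y"` tick format is my choice, and `pivot` requires that each (date, country) pair appears at most once.

```python
import io
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DayLocator, DateFormatter

# Illustrative inline sample: a handful of rows from the question's CSV.
csv_text = """Fecha,Pais,count
"20/05/2017",Brazil,1
"20/05/2017",China,821
"21/05/2017",Argentina,5
"21/05/2017",China,3043
"22/05/2017",Argentina,2
"22/05/2017",Brazil,1
"22/05/2017",China,3759
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Fecha"], dayfirst=True)

# One row per date, one column per country; dates without data become NaN.
wide = df.pivot(index="Fecha", columns="Pais", values="count")

fig, ax = plt.subplots()
for country in wide.columns:
    # Drop the missing dates so the line connects only the real observations.
    series = wide[country].dropna()
    ax.plot(series.index, series.values, "o-", label=country)

# One major tick per day, labelled in the same DD/MM/YYYY style as the input.
ax.xaxis.set_major_locator(DayLocator())
ax.xaxis.set_major_formatter(DateFormatter("%d/%m/%Y"))
ax.grid(True)
ax.legend()
fig.autofmt_xdate()

plt.show()
```

If your real file can contain the same country twice on one date, `pivot` will raise an error; in that case `pivot_table` with an aggregation such as `aggfunc="sum"` is the usual substitute.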