2017-04-04 53 views

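Google Storage object to pandas DataFrame

I am reading a file from Google Cloud Storage with Google Datalab, which leaves me with a variable holding the data, but I need to convert it into a pandas DataFrame.

I read it with: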

%%gcs read --object $objeto1 --variable prueba 

The variable prueba looks like:

1/1/2016 08:35:56,1,4756798,"7501073831988",1.00,15.00,0.16,"S0394",4388,2,10.43\r\n1,1/1/2016 08:35:56,1,4756798,"850697002395",1.00,13.50,0.00,"S0394",4388,2,10.36\r\n1,1/1/2016 08:35:56,1,4756798,"850697002425",1.00,10.00,0.00,"S0394",4388,2,7.29\r\n1,1/1/2016 08:38:55,2,1013642,"8469760102003",1.00,200.00,0.16,"C0278",2595,1,161.20\r\n 

Any help?


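When I read from a BigQuery query, for example df = bq.Query('SELECT * FROM tabla').to_dataframe(), that is enough to convert my object to a pandas DataFrame. But when I try something similar with the variable read from storage, I get: AttributeError: 'str' object has no attribute 'to_dataframe'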


Wrap your variable in StringIO, as shown here: https://stackoverflow.com/questions/37990467/how-can-i-load-my-csv-from-google-datalab-to-a-pandas-data-frame – Tautvydas
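
For reference, a minimal sketch of the StringIO approach from the comment above. It assumes prueba is a plain str of CSV text with no header row, as the sample in the question suggests. The two rows below are copied from that sample to stand in for the real variable. On Python 2 the import would be `from StringIO import StringIO` rather than `io.StringIO`.

```python
import io

import pandas as pd

# Stand-in for the string that `%%gcs read --variable prueba` produces
# (two rows copied from the question; the real variable holds the whole file).
prueba = (
    '1,1/1/2016 08:35:56,1,4756798,"850697002395",1.00,13.50,0.00,"S0394",4388,2,10.36\r\n'
    '1,1/1/2016 08:38:55,2,1013642,"8469760102003",1.00,200.00,0.16,"C0278",2595,1,161.20\r\n'
)

# Wrap the string in a file-like object so pandas can parse it as CSV.
# header=None because the sample has no header row (an assumption).
# If the variable turns out to be bytes, io.BytesIO(prueba) would be needed instead.
df = pd.read_csv(io.StringIO(prueba), header=None)

print(df.shape)
print(df.head())
```

The columns come back numbered 0 to 11 because no names are given. Passing names=[...] to read_csv would label them, but the real column names are not stated in the question.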

Answer


I suggest copying the GCS file to your Datalab machine and reading it from there:

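import pandas as pd

def read_csv_from_gcs(gcs_path, csv_file_name):  # placeholder name; the original omitted it
    # Copy the file from GCS into the current directory, then load it with pandas.
    get_ipython().system(u'gsutil cp ' + gcs_path + csv_file_name + ' .')
    df = pd.read_csv(csv_file_name)
    return df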