2016-01-20 484 views
0

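I want to load a dataset in an IPython environment and use it. How can I fix the error raised by pickle.load()?

In the directory that contains the dataset, I have these files: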

  • batches.meta
  • data_batch_1
  • data_batch_2
  • data_batch_3
  • data_batch_4
  • data_batch_5
  • readme
  • test_batch

I wrote this code:

import os 
import pickle
import numpy as np 
import matplotlib.pyplot as plt 

# Function definition
def load_CIFAR(ROOT):
    xs = []
    ys = []
    for b in range(5):  # data_batch_1 .. data_batch_5
        f = os.path.join(ROOT, "data_batch_%d" % (b + 1))
        X, Y = load_CIFAR_batch(f)
        xs.append(X)
        ys.append(Y)
    Xtr = np.concatenate(xs)
    Ytr = np.concatenate(ys)

    del X, Y
    Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, "test_batch"))
    return Xtr, Ytr, Xte, Yte

# Function definition
def load_CIFAR_batch(filename):
    with open(filename, 'r') as f:
        # ****** The error occurs on the next line ******
        datadict = pickle.load(f)
        X = datadict['data']
        Y = datadict['labels']
        X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
        Y = np.array(Y)
        return X, Y

However, when I use this function to load the dataset with the command below, I get the error [a bytes-like object is required, not 'str'].

#The directory of my dataset in my hard drive 
url = 'D:\\OTIWU\\data\\cifar10' 
Xtr, Ytr, Xte, Yte = load_CIFAR(url) 

The above is the command I used.

The whole error: 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-f0576df4fbda> in <module>()
----> 1 Xtr, Ytr, Xte, Yte = load_CIFAR(url)

<ipython-input-10-fedf6bd7c144> in load_CIFAR(ROOT)
      4  for b in range(1,6):
      5   f=os.path.join(ROOT, "data_batch_%d" % (b,));
----> 6   X, Y=load_CIFAR_batch(f);
      7   xs.append(X);
      8   ys.append(Y);

<ipython-input-13-368cd3e9d8d2> in load_CIFAR_batch(filename)
      1 def load_CIFAR_batch(filename):
      2  with open(filename, 'r') as f:
----> 3   datadict = pickle.load(f);
      4
      5   X = datadict['data'];

TypeError: a bytes-like object is required, not 'str'

How can I solve this problem?

+0

Where are you pickling the data? It looks like you need to use 'pickle.loads(...)' instead of 'pickle.load(...)' – tglaria

+0

Show us how you created the pickle file. –

+0

@JohnGordon Sorry, do I need to edit the question to add that context? –

Answers

0

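I found the solution. This is a Python 3.x-related problem. When I ran the code with Python 2.x, I could read all the data in the dataset. I should also mention that I changed the source code slightly: I used the cPickle library instead of pickle. Apart from that, the source code is the same as before.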
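For anyone who wants to stay on Python 3: the pickle documentation describes an `encoding` argument for reading pickles written by Python 2, and says that `encoding="latin1"` is required for unpickling NumPy arrays pickled by Python 2. Assuming the batch files were written by Python 2 (which would explain why cPickle reads them), something like the sketch below may work. The helper name `load_py2_pickle` and the hand-written bytes are made up for illustration; they stand in for a Python 2 pickle and are not a real CIFAR batch.

```python
import os
import pickle
import tempfile


def load_py2_pickle(path):
    """Load a Python 2 pickle from Python 3 (hypothetical helper)."""
    # The file still has to be opened in binary mode.
    with open(path, "rb") as f:
        # latin1 maps every byte of a Python 2 str to exactly one
        # character, so 8-bit data does not fail to decode.
        return pickle.load(f, encoding="latin1")


# Hand-written protocol-2 bytes standing in for a Python 2 pickle of
# {'data': '\xff\x00'} (illustrative only, not a real CIFAR batch).
py2_style = b"\x80\x02}U\x04dataU\x02\xff\x00s."
path = os.path.join(tempfile.mkdtemp(), "py2_style_batch")
with open(path, "wb") as f:
    f.write(py2_style)

# The default encoding ("ASCII") cannot decode the 0xff byte.
try:
    with open(path, "rb") as f:
        pickle.load(f)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

loaded = load_py2_pickle(path)
print(default_failed, list(loaded.keys()))
```

With `encoding="latin1"` the dictionary keys come back as ordinary `str`, so lookups such as `datadict['data']` in the question's code would not need to change.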

0

You need to open your file in binary mode:

with open(filename, 'rb') as f:
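To see the difference in isolation, here is a small sketch that uses only the standard library. The file name `demo_batch` and its contents are made up for illustration. Protocol 0 is chosen on purpose so the pickle is plain ASCII text and the text-mode read reaches the unpickler; with a binary protocol, a text-mode read may instead fail earlier with a decoding error, depending on the platform's default encoding.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a batch file (not the real dataset).
path = os.path.join(tempfile.mkdtemp(), "demo_batch")
with open(path, "wb") as f:
    pickle.dump({"labels": [3, 1, 4]}, f, protocol=0)

# Text mode: read() returns str, so pickle.load() raises TypeError.
try:
    with open(path, "r") as f:
        pickle.load(f)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = str(exc)

# Binary mode: read() returns bytes, which is what pickle expects.
with open(path, "rb") as f:
    datadict = pickle.load(f)

print(text_mode_error)
print(datadict["labels"])
```

In the question's code, the equivalent change is the single `'r'` to `'rb'` in `load_CIFAR_batch`.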