2017-05-28 91 views
-3

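IndexError: can an int be used as an index?

I am getting an error: IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices. I am making a sound recognition app. My code is: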

import numpy as np 
import pandas as pd 
import scipy as sp 
import pickle 
from scipy import fft 
from time import localtime, strftime 
import matplotlib.pyplot as plt 
from skimage.morphology import disk,remove_small_objects 
from skimage.filter import rank 
from skimage.util import img_as_ubyte 
import wave 

folder = 'mlsp_contest_dataset/' 


essential_folder = folder+'essential_data/' 
supplemental_folder = folder+'supplemental_data/' 
spectro_folder =folder+'my_spectro/' 
single_spectro_folder =folder+'my_spectro_single/' 
dp_folder = folder+'DP/' 

# Each audio file has a unique recording identifier ("rec_id"), ranging from 0 to 644. 
# The file rec_id2filename.txt indicates which wav file is associated with each rec_id. 
rec2f = pd.read_csv(essential_folder + 'rec_id2filename.txt', sep = ',') 

# There are 19 bird species in the dataset. species_list.txt gives each a number from 0 to 18. 
species = pd.read_csv(essential_folder + 'species_list.txt', sep = ',') 
num_species = 19 

# The dataset is split into training and test sets. 
# CVfolds_2.txt gives the fold for each rec_id. 0 is the training set, and 1 is the test set. 
cv = pd.read_csv(essential_folder + 'CVfolds_2.txt', sep = ',') 

# This is your main label training data. For each rec_id, a set of species is listed. The format is: 
# rec_id,[labels] 
raw = pd.read_csv(essential_folder + 'rec_labels_test_hidden.txt', sep = ';') 
label = np.zeros(len(raw)*num_species) 
label = label.reshape([len(raw),num_species]) 
for i in range(len(raw)): 
    line = raw.iloc[i] 
    labels = line[0].split(',') 
    labels.pop(0) # rec_id == i 
    for c in labels: 
     if(c != '?'): 
      print(label) 
      label[i,c] = 1 

When I run this code, I get the error at `label[i,c] = 1`. I added `print(label)` to see what the `label` variable looks like:

warn(skimage_deprecation('The `skimage.filter` module has been renamed ' 
[[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.] 
..., 
[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.] 
[ 0. 0. 0. ..., 0. 0. 0.]] 

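I thought the error meant that integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays cannot be used as array indices, but I have used an int as an array index many times, so I do not understand why this error happens. Debugging tells me that `labels` holds `labels:: ['?']`, and that `c` in `for c in labels[i]:` holds `'?'`. I really do not understand the `?` type. I think this `?` causes the error, but I do not know how to fix it. How can I fix this?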

+0

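`for c in labels: ...`, but `labels` is a list of strings. Strings are not in the set "*integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays*". (Also note: `np.zeros((len(raw), num_species))` is simpler.) –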

+0

@AndrasDeak Thank you very much! Which part does the `np.zeros((len(raw), num_species))` you mentioned replace? How can I fix this? – user21063

+0

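I only noted that the two lines before the for loop can be written as one line, without the reshape. As for your question: I do not know what you are trying to do, but using characters as numpy array indices will definitely not work. –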

Answer

0

The error message says that, when indexing a numpy array,

only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices 

`label` is a 2D array of floats, initialized to 0:

label = np.zeros([len(raw),num_species]) 
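
As a quick check of that one-line form, here is a minimal sketch comparing it with the two-step version from the question. The sizes `n_rows` and `n_cols` are placeholders chosen for illustration; they stand in for `len(raw)` and `num_species`.

```python
import numpy as np

# Placeholder sizes, standing in for len(raw) and num_species.
n_rows, n_cols = 4, 19

# Two-step version from the question: flat array, then reshape.
two_step = np.zeros(n_rows * n_cols).reshape([n_rows, n_cols])

# One-step version: pass the shape directly.
one_step = np.zeros((n_rows, n_cols))

print(one_step.shape, one_step.dtype)       # (4, 19) float64
print(np.array_equal(two_step, one_step))   # True
```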

The current values in the loop:

for i in range(len(raw)):  # i=0,1,2,... 

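Did you check what `raw` looks like? Coming from `pd.read_csv`, I guess it is a dataframe; `iloc[i]` selects one row, but that row has not yet been split into columns?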

    line = raw.iloc[i] 
    labels = line[0].split(',') 
    labels.pop(0) # rec_id == i 
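
To see why the row still needs `split(',')`, here is a small sketch. The file contents below are invented for illustration (the real `rec_labels_test_hidden.txt` is not shown in the question). It also uses `line.iloc[0]` rather than `line[0]` to make the positional lookup explicit.

```python
import io
import pandas as pd

# Hypothetical stand-in for rec_labels_test_hidden.txt (contents assumed).
text = "rec_id,[labels]\n0,?\n1,3\n2,10,14\n"

# sep=';' never matches, so every line is kept whole in a single column.
raw = pd.read_csv(io.StringIO(text), sep=';')
print(raw.shape)                    # (3, 1)

line = raw.iloc[2]                  # one row: a Series with a single entry
fields = line.iloc[0].split(',')    # split the row text by hand
print(fields)                       # ['2', '10', '14']
```

Every element of `fields` is a string, including the ones that look like numbers.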

What does `labels` look like? I am guessing it is a list of strings:

    for c in labels: 
        if(c != '?'):   # evidently `c` is a string 
            print(label)  # prints the 2d array 
            label[i,c] = 1 

Indexing a 2D array should look like `label[0,1]`. `c` could be any of the other things listed in the error message, but it cannot be a string.
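
One possible fix, sketched under the assumption that every label other than `'?'` is the text of a species number: convert `c` with `int()` before using it as an index. The row text below is made up for illustration.

```python
import numpy as np

label = np.zeros((3, 5))

# Hypothetical row text in the "rec_id,label,label" layout.
labels = "1,0,3".split(',')
labels.pop(0)                   # drop the rec_id, leaving ['0', '3']

for c in labels:
    if c != '?':
        label[1, int(c)] = 1    # int(c) turns the string '3' into the integer 3

print(label[1].tolist())        # [1.0, 0.0, 0.0, 1.0, 0.0]
```

If a label could be something other than digits or `'?'`, `int(c)` would raise `ValueError`, so the input format is worth checking first.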

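Dataframes do allow indexing with strings; that is a pandas feature. But numpy arrays must be indexed with numbers or one of the few alternatives listed in the message. They are not indexed with strings (except in the case of structured arrays).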
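
A brief sketch of the structured-array exception mentioned above. The field names `rec_id` and `score` are invented for this example and do not come from the question.

```python
import numpy as np

# A structured array: each element has named fields.
rec = np.zeros(3, dtype=[('rec_id', 'i4'), ('score', 'f8')])
rec['rec_id'] = [0, 1, 2]       # a field name (a string) selects a whole field
rec['score'][1] = 0.5

print(rec['rec_id'].tolist())   # [0, 1, 2]

# A plain float array has no fields, so a string index is rejected.
plain = np.zeros((3, 5))
raised = False
try:
    plain[0, 'score'] = 1
except IndexError:
    raised = True
print(raised)                   # True
```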


In [209]: label = np.zeros((3,5)) 
In [210]: label 
Out[210]: 
array([[ 0., 0., 0., 0., 0.], 
     [ 0., 0., 0., 0., 0.], 
     [ 0., 0., 0., 0., 0.]]) 
In [211]: label[1,3] 
Out[211]: 0.0 
In [212]: label[1,3]=1  # index with integers OK 
In [213]: label[0,2]=1 
In [214]: label[0,'?'] =1 # index with a string - ERROR 
--------------------------------------------------------------------------- 
IndexError        Traceback (most recent call last) 
<ipython-input-214-3738f623c78e> in <module>() 
----> 1 label[0,'?'] =1 

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices 

In [215]: label[0,:] =2  # index with a slice 
In [216]: label 
Out[216]: 
array([[ 2., 2., 2., 2., 2.], 
     [ 0., 0., 0., 1., 0.], 
     [ 0., 0., 0., 0., 0.]])
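
The session above covers integer and slice indices. The remaining kinds named in the error message, integer and boolean arrays, can be sketched the same way. The parsed labels `['0', '3']` are made-up values, and setting a whole row at once is only one option, not necessarily what the question's code should do.

```python
import numpy as np

label = np.zeros((3, 5))

# Hypothetical parsed labels, converted to ints up front.
cols = [int(c) for c in ['0', '3'] if c != '?']
label[1, cols] = 1          # a list of ints acts as an integer array index

mask = label[1] == 1        # boolean array with one entry per column
label[2, mask] = 7          # boolean array index along the second axis

print(label[1].tolist())    # [1.0, 0.0, 0.0, 1.0, 0.0]
print(label[2].tolist())    # [7.0, 0.0, 0.0, 7.0, 0.0]
```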