2017-10-12 174 views

Python TypeError: 'float' object cannot be interpreted as an index

The code below uses audio files to build a feature matrix in TensorFlow:

import tensorflow as tf 

directory = "audio_dataset/*.wav" 

filenames = tf.train.match_filenames_once(directory) 

init = (tf.global_variables_initializer(), tf.local_variables_initializer()) 

count_num_files = tf.size(filenames) 
filename_queue = tf.train.string_input_producer(filenames) 
reader = tf.WholeFileReader() 
filename, file_contents = reader.read(filename_queue) 

with tf.Session() as sess: 
    sess.run(init) 
    num_files = sess.run(count_num_files) 

    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 

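    for i in range(num_files):
        audio_file = sess.run(filename)
        print(audio_file)

    # Stop the queue-runner threads cleanly before the session closes
    coord.request_stop()
    coord.join(threads)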

This is a toolkit for converting audio from the time domain to the frequency domain:

import numpy as np
from bregman.suite import *


chromo = tf.placeholder(tf.float32) 
max_freqs = tf.argmax(chromo, 0) 


def get_next_chromogram(sess): 
    audio_file = sess.run(filename) 
    F = Chromagram(audio_file, nfft=16384, wfft=8192, nhop=2205) 
    return F.X 


def extract_feature_vector(sess, chromo_data): 
    num_features, num_samples = np.shape(chromo_data) 
    freq_vals = sess.run(max_freqs, feed_dict={chromo: chromo_data}) 
    hist, bins = np.histogram(freq_vals, bins=range(num_features + 1)) 
    return hist.astype(float)/num_samples 


def get_dataset(sess): 
    num_files = sess.run(count_num_files) 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 
    xs = [] 
    for _ in range(num_files):
        chromo_data = get_next_chromogram(sess)
        x = [extract_feature_vector(sess, chromo_data)]
        x = np.matrix(x)
        if len(xs) == 0:
            xs = x
        else:
            xs = np.vstack((xs, x))
    return xs
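
For reference, the feature extraction step above can be reproduced without a TensorFlow session. This is only a minimal NumPy sketch: the function name feature_vector_from_chroma and the toy chromagram values are made up for illustration, and it assumes the chromagram is a plain 2-D array of shape (num_features, num_samples).

```python
import numpy as np

def feature_vector_from_chroma(chroma):
    """Return the fraction of frames in which each chroma bin is the loudest."""
    num_features, num_samples = chroma.shape
    # Index of the strongest bin in every frame (column)
    dominant = np.argmax(chroma, axis=0)
    # Count how often each bin wins; one histogram bin per chroma bin
    hist, _ = np.histogram(dominant, bins=np.arange(num_features + 1))
    return hist.astype(float) / num_samples

# Hypothetical chromagram: 3 bins, 4 frames (values invented for the example)
toy_chroma = np.array([
    [0.90, 0.10, 0.20, 0.80],
    [0.05, 0.70, 0.10, 0.10],
    [0.05, 0.20, 0.70, 0.10],
])
features = feature_vector_from_chroma(toy_chroma)
print(features)
```

Each entry of the result is the share of frames dominated by that bin, so the entries sum to 1.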

This clusters the data around two centroids:

k = 2 
max_iterations = 100 

def initial_cluster_centroids(X, k): 
    return X[0:k, :] 

def assign_cluster(X, centroids): 
    expanded_vectors = tf.expand_dims(X, 0) 
    expanded_centroids = tf.expand_dims(centroids, 1) 
    distances = tf.reduce_sum(tf.square(tf.subtract(expanded_vectors, expanded_centroids)), 2) 
    mins = tf.argmin(distances, 0) 
    return mins 

def recompute_centroids(X, Y): 
    sums = tf.unsorted_segment_sum(X, Y, k) 
    counts = tf.unsorted_segment_sum(tf.ones_like(X), Y, k) 
    return sums/counts 

with tf.Session() as sess: 
    sess.run(init) 
    X = get_dataset(sess) 
    centroids = initial_cluster_centroids(X, k) 
    i, converged = 0, False 
    while not converged and i < max_iterations:
        i += 1
        Y = assign_cluster(X, centroids)
        centroids = sess.run(recompute_centroids(X, Y))
    print(centroids)
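
Note that the loop above never updates converged, so it always runs for max_iterations. The same assign/recompute steps, with a convergence check added, could look roughly like this in plain NumPy. This is a sketch under assumptions, not the code from the question: the names kmeans_np, assign_cluster_np and recompute_centroids_np and the toy data are hypothetical, X is assumed to be a regular ndarray (not np.matrix), and every cluster is assumed to keep at least one sample.

```python
import numpy as np

def assign_cluster_np(X, centroids):
    # diffs[c, n, :] = sample n minus centroid c (broadcast over both axes)
    diffs = X[np.newaxis, :, :] - centroids[:, np.newaxis, :]
    distances = np.sum(diffs ** 2, axis=2)
    # For each sample, pick the index of the closest centroid
    return np.argmin(distances, axis=0)

def recompute_centroids_np(X, Y, k):
    # Assumes no cluster is empty; an empty cluster would produce NaN here
    return np.array([X[Y == c].mean(axis=0) for c in range(k)])

def kmeans_np(X, k, max_iterations=100):
    centroids = X[:k].copy()  # first k samples as initial centroids
    for _ in range(max_iterations):
        Y = assign_cluster_np(X, centroids)
        new_centroids = recompute_centroids_np(X, Y, k)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, Y

# Invented toy data: two well-separated groups
toy_X = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]])
toy_centroids, toy_labels = kmeans_np(toy_X, k=2)
print(toy_centroids)
print(toy_labels)
```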

But I get the following traceback:

Traceback (most recent call last): 
    File "components.py", line 776, in <module> 
    X = get_dataset(sess) 
    File "ccomponents.py", line 745, in get_dataset 
    chromo_data = get_next_chromogram(sess) 
    File "coffee_components.py", line 728, in get_next_chromogram 
    F = Chromagram(audio_file, nfft=16384, wfft=8192, nhop=2205) 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features.py", line 143, in __init__ 
    Features.__init__(self, arg, feature_params) 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 70, in __init__ 
    self.extract() 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 213, in extract 
    self.extract_funs.get(f, self._extract_error)() 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 711, in _chroma 
    if not self._cqft(): 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 588, in _cqft 
    self._make_log_freq_map() 
    File "/Volumes/Dados/Documents/Education/Programming/Machine Learning/Manning/book/BregmanToolkit-master/bregman/features_base.py", line 353, in _make_log_freq_map 
    mxnorm = P.empty(self._cqtN) # Normalization coefficients   
TypeError: 'float' object cannot be interpreted as an index 

As far as I can tell, 'range' is given an int, not a float.

Could someone please point out my mistake?

Where is 'range'? It isn't in the stack trace. It seems to be complaining about the line 'X = get_dataset(sess)'. – Antimony

Yes, 'get_dataset(sess)' is a function (see above) that iterates using 'range()'. This error usually means that a 'float' was passed to range, but I'm not sure that is the case here. – outkast

Maybe you could check the value of 'audio_file' in 'get_next_chromogram()'? It is the only non-integer passed to 'Chromagram()'. – Antimony

Answer

1

The problem is that you are using Python 3, but the Bregman Toolkit is written for Python 2. The error comes from this line:

mxnorm = P.empty(self._cqtN) 

self._cqtN is a float. In Python 2, the pylab library accepts float input:

>>> pylab.empty(5.0)
__main__:1: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future 
array([ 0., 0., 0., 0., 0.]) 

In Python 3, however, you get the same error that you are seeing:

>>> pylab.empty(5.0)
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
TypeError: 'float' object cannot be interpreted as an integer 

You should be able to fix this error simply by editing the line in the file I linked above and converting the value to an int:

mxnorm = P.empty(int(self._cqtN)) 
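
The effect of the cast can be checked in isolation. The snippet below is only a sketch using NumPy directly: cqt_n is a hypothetical stand-in for self._cqtN, and empty_with_int_size is a made-up helper, not part of the toolkit. Keep in mind that int() truncates, so if the value could be something like 4.9999 you may prefer int(round(...)).

```python
import numpy as np

def empty_with_int_size(n):
    """Allocate an uninitialised 1-D array, casting a float size to int first."""
    return np.empty(int(n))

cqt_n = 5.0  # hypothetical stand-in for self._cqtN

# The float size itself may be rejected, depending on the installed NumPy
try:
    np.empty(cqt_n)
    float_size_accepted = True
except TypeError:
    float_size_accepted = False

arr = empty_with_int_size(cqt_n)
print(float_size_accepted, arr.shape)
```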

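However, I would be surprised if you did not run into other errors caused by the version incompatibility. You may want to try using Python 2 or look for an alternative to the Bregman Toolkit.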
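
Before switching interpreters, it may help to confirm which versions the script actually runs under, since whether a float size is accepted can depend on the installed NumPy release as well as on the Python version. A small sketch (describe_environment is a hypothetical helper name):

```python
import sys
import numpy as np

def describe_environment():
    """Collect the interpreter and NumPy versions of the running process."""
    return {
        "python": sys.version.split()[0],
        "numpy": np.__version__,
    }

env = describe_environment()
print(env)
```

Running this inside the same environment that produces the traceback shows whether the expected interpreter is really the one being used.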

I don't understand. I am using a 'Python 2.X' conda environment for this, so that should not be the problem. – outkast