2016-08-01 149 views
1

我有問題向量化多維函數多維函數。
請看下面的例子:numpy的矢量化

def _cost(u): 
    return u[0] - u[1] 

cost = np.vectorize(_cost) 

>>> x = np.random.normal(0, 1,(10, 2)) 
>>> cost(x) 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2218, in __call__ 
    return self._vectorize_call(func=func, args=vargs) 
    File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2281, in _vectorize_call 
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args) 
    File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2243, in _get_ufunc_and_otypes 
    outputs = func(*inputs) 
TypeError: _cost() missing 1 required positional argument: 'v' 

背景資料: 我遇到試圖將下面的代碼(PSO算法)推廣到多變量數據問題:

import numpy as np 
import matplotlib.pyplot as plt 


def pso(cost, sim, space_dimension, n_particles, left_lim, right_lim, f1=1, f2=1, verbose=False): 

    best_scores = np.array([np.inf]*n_particles) 
    best_positions = np.zeros(shape=(n_particles, space_dimension)) 
    particles = np.random.uniform(left_lim, right_lim, (n_particles, space_dimension)) 
    velocities = np.zeros(shape=(n_particles, space_dimension)) 

    for i in range(sim): 
     particles = particles + velocities 
     print(particles) 
     scores = cost(particles).ravel() 
     better_positions = np.argwhere(scores < best_scores).ravel() 
     best_scores[better_positions] = scores[better_positions] 
     best_positions[better_positions, :] = particles[better_positions, :] 
     g = best_positions[np.argmin(best_scores), :] 

     u1 = np.random.uniform(0, f1, (n_particles, 1)) 
     u2 = np.random.uniform(0, f2, (n_particles, 1)) 
     velocities = velocities + u1 * (best_positions - particles) + u2 * (g - particles) 

     if verbose and i % 50 == 0: 
      print('it=', i, ' score=', cost(g)) 


      x = np.linspace(-5, 20, 1000) 
      y = cost(x) 

      plt.plot(x, y) 
      plt.plot(particles, cost(particles), 'o') 
      plt.vlines(g, y.min()-2, y.max()) 
      plt.show() 


    return g, cost(g) 




def test_pso_1_dim(): 

    def _cost(x): 
     if 0 < x < 15: 
      return np.sin(x)*x 
     else: 
      return 15 + np.min([np.abs(x-0), np.abs(x-15)]) 

    cost = np.vectorize(_cost) 

    sim = 100 
    space_dimension = 1 
    n_particles = 5 
    left_lim, right_lim = 0, 15 
    f1, f2 = 1, 1 

    x, cost_x = pso(cost, sim, space_dimension, n_particles, 
        left_lim, right_lim, f1, f2, verbose=False) 

    x0 = 11.0841839 
    assert np.abs(x - x0) < 0.01 

    return 

如有告訴我在這種情況下矢量化不是一個好主意。在筆記vectorize提到

回答

1

如:

矢量化功能主要爲了方便而提供,而不是性能。實現本質上是一個for循環。

因此,儘管矢量化你的代碼可能會通過numpy類型和功能是一個好主意,你可能不應該使用numpy.vectorize做到這一點。

對於你給的例子,你cost可能會簡單而有效地計算爲numpy陣列上運行的功能:

def cost(x): 
    # Create the empty output 
    output = np.empty(x.shape) 
    # Select the first group using a boolean array 
    group1 = (0 < x) & (x < 15) 
    output[group1] = np.sin(x[group1])*x[group1] 
    # Select second group as inverse (logical not) of group1 
    output[~group1] = 15 + np.min(
     [np.abs(x[~group1]-0), np.abs(x[~group1]-15)], 
     axis=0) 
    return output 
0

np.vectorize飼料定標器以你的函數。例如:

In [1090]: def _cost(u): 
     ...:  return u*2 

In [1092]: cost=np.vectorize(_cost) 
In [1093]: cost(np.arange(10) 
     ...:) 
Out[1093]: array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18]) 
In [1094]: cost(np.ones((3,4))) 
Out[1094]: 
array([[ 2., 2., 2., 2.], 
     [ 2., 2., 2., 2.], 
     [ 2., 2., 2., 2.]]) 

但是,您的函數就好像它正在獲取具有2個值的列表或數組。你打算做什麼?

與2次的標量函數:

In [1095]: def _cost(u,v): 
     ...:  return u+v 
     ...:  
     ...: 
In [1096]: cost=np.vectorize(_cost) 

In [1098]: cost(np.arange(3),np.arange(3,6)) 
Out[1098]: array([3, 5, 7]) 

In [1099]: cost([[1],[2]],np.arange(3,6)) 
Out[1099]: 
array([[4, 5, 6], 
     [5, 6, 7]]) 

或與2列x

In [1103]: cost(x[:,0],x[:,1]) 
Out[1103]: 
array([-1.7291913 , -0.46343403, 0.61574928, 0.9864683 , -1.22373097, 
     1.01970917, 0.22862683, -0.11653917, -1.18319723, -3.39580376]) 

這是相同的軸線上做一個陣列總和1

In [1104]: x.sum(axis=1) 
Out[1104]: 
array([-1.7291913 , -0.46343403, 0.61574928, 0.9864683 , -1.22373097, 
     1.01970917, 0.22862683, -0.11653917, -1.18319723, -3.39580376])