How to create a custom random distribution function?

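Normally I generate values using the built-in random functions, but now I need to create a random distribution of the form

f(x) = k*log(x) + m

Is it possible to define a custom random distribution function? For my actual model I have x = [1, 1.4e7), k = -0.905787102751, m = 14.913170454. Ideally I would like it to work the way the current built-in distributions do: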

int main() 
{ 
    std::mt19937 generator; 

    std::uniform_real_distribution<> dist(0.0, 1.0); 
    my_distribution my_dist(0.0, 10.0); // Distribution using f(x) 

    double uni_val = dist(generator); 
    double log_val = my_dist(generator); 
}

Source

2017-04-26 pingul

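This is as much a math question as a C++ one. See, for example, https://en.wikipedia.org/wiki/Inverse_transform_sampling. – jwimberley

What is the domain? –

@YvesDaoust For the original problem, it is between 1 and 1.4e7. I added an answer showing how I solved it. – pingul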

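I followed @jwimberley's idea almost to the letter and thought I would share my results here. I created a class that does the following:

Constructor arguments:
- The CDF (normalised or unnormalised), which is the integral of the PDF.
- The lower and upper bounds of the distribution.
- (Optional) A resolution indicating how many sample points of the CDF to take.
Compute a mapping from CDF value -> random number x. This is our inverse CDF function.
Generate a random point by:
- Generating a random probability p in [0, 1) with std::uniform_real_distribution.
- Binary-searching the mapping for the sample whose CDF value corresponds to p and returning its x, optionally with linear interpolation between neighbouring samples.

Code: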

// sampled_distribution.hh
#ifndef SAMPLED_DISTRIBUTION
#define SAMPLED_DISTRIBUTION

#include <algorithm>
#include <vector>
#include <random>
#include <stdexcept>

template <typename T = double, bool Interpolate = true>
class Sampled_distribution
{
public:
    using CDFFunc = T (*)(T);

    Sampled_distribution(CDFFunc cdfFunc, T low, T high, unsigned resolution = 200)
        : mLow(low), mHigh(high), mRes(resolution), mDist(0.0, 1.0)
    {
        if (mLow >= mHigh) throw InvalidBounds();

        mSampledCDF.resize(mRes + 1);
        const T cdfLow = cdfFunc(low);
        const T cdfHigh = cdfFunc(high);
        T last_p = 0;
        for (unsigned i = 0; i < mSampledCDF.size(); ++i)
        {
            // mRes is a double, so i/mRes is a floating-point division
            const T x = i/mRes*(mHigh - mLow) + mLow;
            const T p = (cdfFunc(x) - cdfLow)/(cdfHigh - cdfLow); // normalising
            if (! (p >= last_p)) throw CDFNotMonotonic();
            mSampledCDF[i] = Sample{p, x};
            last_p = p;
        }
    }

    template <typename Generator>
    T operator()(Generator& g)
    {
        const T cdf = mDist(g);
        // First sample whose probability is strictly greater than cdf
        auto s = std::upper_bound(mSampledCDF.begin(), mSampledCDF.end(), cdf);
        // Guard against rounding leaving the last sample slightly below 1
        if (s == mSampledCDF.end()) return mSampledCDF.back().value;
        if (Interpolate && s != mSampledCDF.begin())
        {
            auto bs = s - 1;
            // r = 0 at the lower sample bs, r = 1 at the upper sample s
            const T r = (cdf - bs->prob)/(s->prob - bs->prob);
            return (1 - r)*bs->value + r*s->value;
        }
        return s->value;
    }

private:
    struct InvalidBounds : public std::runtime_error
    {
        InvalidBounds() : std::runtime_error("low must be less than high") {}
    };
    struct CDFNotMonotonic : public std::runtime_error
    {
        CDFNotMonotonic() : std::runtime_error("CDF is not monotonic") {}
    };

    const T mLow, mHigh;
    const double mRes;

    struct Sample
    {
        T prob, value;
        friend bool operator<(T p, const Sample& s) { return p < s.prob; }
    };

    std::vector<Sample> mSampledCDF;
    std::uniform_real_distribution<> mDist;
};

#endif

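Below are some plots of the results. For a given PDF, we first need to compute the CDF analytically by integration.

[Plot: log-linear PDF]

[Plot: sinusoidal PDF]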
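As a worked example of that integration step, here is how the two PDFs used in the demo integrate. The symbols G (the unnormalised antiderivative) and a, b (the lower and upper bounds) are names introduced here for illustration only; the normalisation in the last line mirrors what the class constructor does with cdfLow and cdfHigh.

```latex
\begin{aligned}
\text{log-linear:}\quad G(x) &= \int \bigl(k \ln t + m\bigr)\,dt
      = k\,(x \ln x - x) + m x
      = x\,\bigl(k \ln x + m - k\bigr) \\
\text{sinusoidal:}\quad G(x) &= \int \bigl(1 - \sin t\bigr)\,dt = x + \cos x \\
\text{normalised:}\quad \mathrm{CDF}(x) &= \frac{G(x) - G(a)}{G(b) - G(a)},
      \qquad a \le x \le b
\end{aligned}
```

Any constant of integration cancels in the normalised form, which is why the unnormalised G can be passed to the class directly.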

You can try this out yourself with the following demo:

// main.cc
#include "sampled_distribution.hh"
#include <cmath>
#include <fstream>
#include <random>

int main()
{
    // Unnormalised CDF of PDF(x) = k*log(x) + m
    auto logFunc = [](double x) {
        const double k = -1.0;
        const double m = 10;
        return x*(k*std::log(x) + m - k);
    };
    // Unnormalised CDF of PDF(x) = 1 - sin(x)
    auto sinFunc = [](double x) { return x + std::cos(x); };

    std::mt19937 gen;
    //Sampled_distribution<> dist(logFunc, 1.0, 1e4);
    Sampled_distribution<> dist(sinFunc, 0.0, 6.28);
    std::ofstream file("d.txt");
    for (int i = 0; i < 100000; i++) file << dist(gen) << '\n';
}

The data is plotted with Python.

# dist_plot.py
import numpy as np
import matplotlib.pyplot as plt

d = np.loadtxt("d.txt")
fig, ax = plt.subplots()
bins = np.arange(d.min(), d.max(), (d.max() - d.min())/50)
ax.hist(d, edgecolor='white', bins=bins)
plt.show()

Run the demo with:

clang++ -std=c++11 -stdlib=libc++ main.cc -o main; ./main; python dist_plot.py

Source

2017-04-26 22:00:06 pingul

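There are a few things that could be said about this code, but that really belongs on Code Review. – Walter

@Walter The post did not ask for a review. It is an answer to my own question, showing how I created a custom random distribution. I am really surprised by the downvote. – pingul

Your code is far from optimal. First, you should at least test the CDF for monotonicity. Second, you could implement a better method of inverting it, for example using spline or polynomial interpolation. Third, if you ask the user for both the PDF and the CDF, you can invert the latter using Newton-Raphson, which can converge to machine precision. Finally, this is overkill for your original problem. – Walter

This is very possible, but it is as much a math problem as a C++ problem. The most general way to create a pseudo-random number generator for a given distribution is Inverse transform sampling. Essentially, the CDF of any PDF, evaluated at a value drawn from that PDF, is uniformly distributed between 0 and 1 (if this is not obvious, just remember that the value of the CDF is a probability and think about it). So you just need to sample a uniform random number between 0 and 1 and apply the inverse of the CDF. In your case, with $f(x) = k \log(x) + m$ (you did not specify bounds, but I assume they are between 1 and some positive number > 1), the CDF and its inverse are pretty messy; I leave that problem to you! The implementation in C++ would look like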

double inverseCDF(double p, double k, double m, double lowerBound, double upperBound) { 
    // do the math, which might include numerically finding roots of equations
}

The code that generates values would then look like

class my_distribution { 
    // ... constructor, private variables, etc. 
    template< class Generator > 
    double operator()(Generator& g) { 
      std::uniform_real_distribution<> dist(0.0, 1.0); 
      double cdf = dist(g); 
      return inverseCDF(cdf,this->k,this->m,this->lowerBound,this->upperBound); 
    } 
};
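One possible way to fill in the stub, as a minimal sketch rather than a definitive implementation: since the CDF is non-decreasing wherever the PDF is non-negative, the inverse can be found by bisection. The helper name antiderivative, the four-argument constructor, and the tolerance are assumptions made for illustration; the sketch also assumes lowerBound > 0 and that f(x) = k*log(x) + m stays non-negative on the whole range.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Unnormalised CDF: an antiderivative of f(x) = k*log(x) + m.
// (Hypothetical helper, not part of the original answer.)
double antiderivative(double x, double k, double m) {
    return x * (k * std::log(x) + m - k);
}

// Inverse CDF by bisection. Assumes f(x) >= 0 on [lowerBound, upperBound],
// so the CDF is non-decreasing there.
double inverseCDF(double p, double k, double m,
                  double lowerBound, double upperBound) {
    assert(0.0 <= p && p <= 1.0);
    assert(0.0 < lowerBound && lowerBound < upperBound);
    const double gLow = antiderivative(lowerBound, k, m);
    const double gHigh = antiderivative(upperBound, k, m);
    // Unnormalised CDF value whose x we are looking for.
    const double target = gLow + p * (gHigh - gLow);
    double lo = lowerBound, hi = upperBound;
    // Halve the bracket until it is small relative to its upper end.
    for (int i = 0; i < 200 && hi - lo > 1e-12 * hi; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (antiderivative(mid, k, m) < target) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

class my_distribution {
public:
    // Assumed constructor signature: PDF parameters plus the bounds.
    my_distribution(double k, double m, double lowerBound, double upperBound)
        : k(k), m(m), lowerBound(lowerBound), upperBound(upperBound) {}

    template <class Generator>
    double operator()(Generator& g) {
        std::uniform_real_distribution<> dist(0.0, 1.0);
        return inverseCDF(dist(g), k, m, lowerBound, upperBound);
    }

private:
    double k, m, lowerBound, upperBound;
};
```

With this sketch, something like my_distribution d(-1.0, 10.0, 1.0, 1e4); followed by d(generator) would draw one value. Bisection is slower than a table lookup but needs no precomputed samples and no resolution parameter.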

Source

2017-04-26 12:41:54 jwimberley

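This is good advice and put me on the right track. Upvoted. I added an answer outlining how I implemented it. Is this what you had in mind? If you see any problems, please suggest improvements. – pingul

As pointed out elsewhere, the standard method for sampling any PDF is to invert its CDF at a point picked uniformly at random from the interval [0,1].

In the case of your particular problem, the CDF is a simple function, but its inverse is not. In this case, one can invert it using traditional numerical tools, such as Newton-Raphson iteration. Unfortunately, you failed to specify the range for x or the allowed choices for the parameters m and k. I have implemented this for general m, k, and range (and posted it on code review) to satisfy the C++ RandomNumberDistribution concept.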
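To illustrate the idea, here is a minimal sketch of a Newton-Raphson inversion. It is not the implementation posted on Code Review and does not attempt to satisfy the full RandomNumberDistribution concept. The names LogPdf and invertByNewton, the tolerance, and the bisection fallback are assumptions made for this example. The PDF serves as the derivative of the CDF, and a bracket [lo, hi] keeps the iteration inside the range when a Newton step overshoots.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper bundling the PDF f(x) = k*log(x) + m
// with its antiderivative (the unnormalised CDF).
struct LogPdf {
    double k, m;
    double pdf(double x) const { return k * std::log(x) + m; }
    double antiderivative(double x) const {
        return x * (k * std::log(x) + m - k);
    }
};

// Solve CDF(x) = p on [low, high] with safeguarded Newton-Raphson.
// Assumes low > 0 and pdf(x) >= 0 on the whole range.
double invertByNewton(const LogPdf& f, double p, double low, double high) {
    assert(0.0 <= p && p <= 1.0);
    assert(0.0 < low && low < high);
    const double gLow = f.antiderivative(low);
    const double target = gLow + p * (f.antiderivative(high) - gLow);
    double lo = low, hi = high;
    double x = 0.5 * (lo + hi);
    for (int i = 0; i < 100; ++i) {
        const double err = f.antiderivative(x) - target;
        if (err == 0.0) return x;
        // Shrink the bracket: the root lies right of x if err < 0.
        if (err < 0.0) lo = x; else hi = x;
        double next = 0.5 * (lo + hi);        // bisection fallback
        const double slope = f.pdf(x);        // derivative of the CDF
        if (slope > 0.0) {
            const double newton = x - err / slope;
            if (newton > lo && newton < hi) next = newton;
        }
        if (std::abs(next - x) <= 1e-14 * std::abs(x)) return next;
        x = next;
    }
    return x;
}
```

Feeding invertByNewton a uniform random p, for example from std::uniform_real_distribution<>(0.0, 1.0), yields samples from the target distribution. Compared with a tabulated inverse, this trades a few function evaluations per sample for accuracy close to machine precision.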

Source

2017-04-27 14:52:00 Walter
