2012-02-02 58 views
4

差分運算符(類似於導數運算符)和求和運算符(類似於積分運算符)可用於更改算法,因爲它們是逆。什麼是其他可用於轉換算法的數學運算符

Sum of (difference of y) = y 
Difference of (sum of y) = y 

下面是一個在c程序中使用它們的例子。

這個c程序演示了三種方法來製作一組正方形。

  1. 第一種方法是簡單明顯的方法y = x*x
  2. 第二種方法使用方程(difference in y) = (x0 + x1)*(difference in x)
  3. 第三種方法是相反的,並使用方程(sum of y) = x(x+1)(2x+1)/6

第二種方法是一致的速度稍快,則第一個,即使我沒有打擾優化它。我想如果我更努力地嘗試,我可以做得更好。

第三種方法一貫是兩倍慢,但這並不意味着基本思想是愚蠢的。我可以想象,除了y = x*x以外的其他功能,這種方法可能會更快。還有一個整數溢出問題。

嘗試所有這些轉換是非常有趣的,所以現在我想知道我可以用來轉換算法的其他幾對數學運算符是什麼?

下面是代碼:

#include <stdio.h> 
#include <time.h> 

#define tries 201 
#define loops 100000 

void printAllIn(unsigned int array[tries]){ 
unsigned int index; 

for (index = 0; index < tries; ++index) 
    printf("%u\n", array[index]); 
} 

int main (int argc, const char * argv[]) { 
     /* 

    Goal, Calculate an array of squares from 0 20 as fast as possible 

     */ 

    long unsigned int obvious[tries]; 
    long unsigned int sum_of_differences[tries]; 
    long unsigned int difference_of_sums[tries]; 

    clock_t time_of_obvious1; 
    clock_t time_of_obvious0; 

    clock_t time_of_sum_of_differences1; 
    clock_t time_of_sum_of_differences0; 

    clock_t time_of_difference_of_sums1; 
    clock_t time_of_difference_of_sums0; 

    long unsigned int j; 
    long unsigned int index; 
    long unsigned int sum1; 
    long unsigned int sum0; 
    long signed int signed_index; 

    time_of_obvious0 = clock(); 
    for (j = 0; j < loops; ++j) 
    for (index = 0; index < tries; ++index) 
     obvious[index] = index*index; 
    time_of_obvious1 = clock(); 

     time_of_sum_of_differences0 = clock(); 
    for (j = 0; j < loops; ++j) 
    for (index = 1, sum_of_differences[0] = 0; index < tries; ++index) 
     sum_of_differences[index] = sum_of_differences[index-1] + 2 * index - 1; 
    time_of_sum_of_differences1 = clock(); 

    time_of_difference_of_sums0 = clock(); 
    for (j = 0; j < loops; ++j) 
    for (signed_index = 0, sum0 = 0; signed_index < tries; ++signed_index) { 
     sum1 = signed_index*(signed_index+1)*(2*signed_index+1); 
     difference_of_sums[signed_index] = (sum1 - sum0)/6; 
     sum0 = sum1; 
    } 
    time_of_difference_of_sums1 = clock(); 

    // printAllIn(obvious); 
    printf(
     "The obvious approach y = x*x took, %f seconds\n", 
     ((double)(time_of_obvious1 - time_of_obvious0))/CLOCKS_PER_SEC 
     ); 
    // printAllIn(sum_of_differences); 
    printf(
     "The sum of differences approach y1 = y0 + 2x - 1 took, %f seconds\n", 
     ((double)(time_of_sum_of_differences1 - time_of_sum_of_differences0))/CLOCKS_PER_SEC 
     ); 
    // printAllIn(difference_of_sums); 
    printf(
     "The difference of sums approach y = sum1 - sum0, sum = (x - 1)x(2(x - 1) + 1)/6 took, %f seconds\n", 
     (double)(time_of_difference_of_sums1 - time_of_difference_of_sums0)/CLOCKS_PER_SEC 
     ); 

    return 0; 
} 

回答

1

有兩類這裏優化:strength reductionpeephole優化。

強度降低是用較便宜的功能替換「昂貴」的數學函數的常用術語 - 比方說,用兩個logarithm table lookups(一個加法)替代乘法,然後用反對數查找來找出最終結果。

窺孔優化是通常的術語,用乘以左乘的兩個乘方來代替乘法。有些CPU對這些操作有簡單的指令,對於乘以2的冪的特定情況,它們的運行速度要比通用整數乘法更快。

您還可以執行單個算法的優化。你可能會寫a * b,但有many different ways to perform multiplication,不同的算法在不同的條件下表現更好或更差。許多這些決定都是由芯片設計製造,但任意精度的整數庫使基於提供給他們的原語的優點,自己的選擇。

0

當我試圖編譯在Ubuntu 10.04的代碼,我得到一個分段錯誤權當主()開始,因爲您聲明價值的堆棧變量很多兆字節。在我將大部分變量移到main之外後,我可以編譯它,使它們成爲全局變量。

然後我得到這些結果:

The obvious approach y = x*x took, 0.000000 seconds 
The sum of differences approach y1 = y0 + 2x - 1 took, 0.020000 seconds 
The difference of sums approach y = sum1 - sum0, sum = (x - 1)x(2(x - 1) + 1)/6 took, 0.000000 seconds 

程序運行如此之快,很難相信它真的做了什麼。我將「-O0」選項設置爲禁用優化,但可能GCC仍可能優化了所有計算。所以我嘗試在數組中添加「volatile」限定符,但仍然得到了類似的結果。

這就是我停止工作的地方。總之,我並不知道你的代碼是怎麼回事,但很可能出現了錯誤。