C++11 async is only using one core

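I'm trying to parallelise a long-running function in C++ using std::async, but it only uses one core.

It's not that the function's run time is too small: the test data I'm currently using takes about 10 minutes to run.

My logic is this: I create NThreads worth of futures (each taking a proportion of the loop rather than an individual cell, so each is a nicely long-running thread), and each of them dispatches an async task. Once they have all been created, the program spin-locks waiting for them to complete. But it always uses one core?!

This isn't me looking at top and saying it looks like roughly one CPU. My zsh config outputs the CPU % of the last command, and it is always exactly 100%, never above.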

auto NThreads = 12; 
auto BlockSize = (int)std::ceil((int)(NThreads/PathCountLength)); 

std::vector<std::future<std::vector<unsigned __int128>>> Futures; 

for (auto I = 0; I < NThreads; ++I) {
    std::cout << "HERE" << std::endl;
    unsigned __int128 Min = I * BlockSize;
    unsigned __int128 Max = I * BlockSize + BlockSize;

    if (I == NThreads - 1)
        Max = PathCountLength;

    Futures.push_back(std::async(
        [](unsigned __int128 WMin, unsigned __int128 Min, unsigned __int128 Max,
           std::vector<unsigned __int128> ZeroChildren,
           std::vector<unsigned __int128> OneChildren,
           unsigned __int128 PathCountLength)
            -> std::vector<unsigned __int128> {
            std::vector<unsigned __int128> LocalCount;
            for (unsigned __int128 I = Min; I < Max; ++I)
                LocalCount.push_back(KneeParallel::pathCountOrStatic(
                    WMin, I, ZeroChildren, OneChildren, PathCountLength));
            return LocalCount;
        },
        WMin, Min, Max, ZeroChildInit, OneChildInit, PathCountLength));
}

for (auto &Future : Futures) { 
    Future.get(); 
}
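
One thing that may be worth checking in the snippet above is the `BlockSize` computation. Assuming `PathCountLength` is an integer type and larger than `NThreads` (its declaration is not shown in the question), `NThreads/PathCountLength` is an integer division that evaluates to 0, so every task except the last would get an empty range and the last one would do all the work. The sketch below is only an illustration: `makeBlocks` and `questionBlockSize` are hypothetical helpers, and `std::uint64_t` stands in for `unsigned __int128`.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: split [0, total) into at most `parts` half-open ranges.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
makeBlocks(std::uint64_t total, std::uint64_t parts) {
    std::vector<std::pair<std::uint64_t, std::uint64_t>> blocks;
    if (total == 0 || parts == 0)
        return blocks;
    // Integer ceiling of total / parts; note the operand order.
    const std::uint64_t blockSize = (total + parts - 1) / parts;
    for (std::uint64_t min = 0; min < total; min += blockSize)
        blocks.emplace_back(min, std::min(min + blockSize, total));
    return blocks;
}

// What the expression in the question evaluates to for integer operands
// (total must be non-zero).
std::uint64_t questionBlockSize(std::uint64_t nThreads, std::uint64_t total) {
    return nThreads / total; // 0 whenever total > nThreads
}
```

With 12 tasks and 1000 items, for example, `makeBlocks(1000, 12)` yields twelve ranges of at most 84 items each, whereas `questionBlockSize(12, 1000)` is 0.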

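Does anyone have any insights?

I'm compiling with clang and LLVM on Arch Linux. Are there any compile flags I need? From what I can tell, C++11 standardised the thread library.

EDIT: In case it gives anyone further clues: when I comment out the local vector, it runs on all cores as it should; when I put it back in, it drops back to one core.

EDIT 2: So I've pinned down a solution, but it seems very bizarre. Returning the vector from the lambda function pinned it to one core, so now I work around this by passing in a shared_ptr to the output vector and manipulating that. And hey presto, it fires up the cores!

I figured it was pointless using futures now, since I don't have a return value, so I'd use threads instead. Nope! Using threads with no return value also uses one core. Weird, eh?

Fine, back to using futures, just returning an int to throw away or something. Yep, you guessed it: even returning an int from the thread sticks the program to one core. Except futures can't have void lambda functions. So my solution is to pass a pointer for storing the output into an int lambda function which never returns anything. Yes, it feels like duct tape, but I can't see a better solution.

This seems... bizarre? As if the compiler were somehow interpreting the lambda incorrectly. Could it be because I'm using the dev release of LLVM rather than a stable branch...?

Anyway, here is my solution, because I hate nothing more than finding my problem on here and having no answer: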
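
On the point about void lambdas: `std::future<void>` is part of the standard library, and `std::async` accepts a callable that returns `void`, so a dummy `int` return type should not be needed. A minimal sketch, assuming each task can write into its own slice of a pre-sized output vector; `fillSquares` and its parameters are made-up names for illustration, not part of the original code.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical example: compute i*i for i in [0, n) using up to `tasks` tasks.
std::vector<unsigned long long> fillSquares(std::size_t n, std::size_t tasks) {
    std::vector<unsigned long long> out(n);
    if (n == 0)
        return out;
    if (tasks == 0)
        tasks = 1;
    const std::size_t block = (n + tasks - 1) / tasks; // ceiling division
    std::vector<std::future<void>> futures;
    for (std::size_t min = 0; min < n; min += block) {
        const std::size_t max = std::min(min + block, n);
        // Each task writes only to out[min, max), so the tasks do not overlap.
        futures.push_back(std::async(std::launch::async, [&out, min, max] {
            for (std::size_t i = min; i < max; ++i)
                out[i] = static_cast<unsigned long long>(i) * i;
        }));
    }
    for (auto &f : futures)
        f.get(); // waits for the task and rethrows any exception it threw
    return out;
}
```

Because the output is sized up front and each task owns a disjoint index range, no locking is needed in this sketch.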

auto NThreads = 4; 
auto BlockSize = (int)std::ceil((int)(NThreads/PathCountLength)); 

auto Futures = std::vector<std::future<int>>(NThreads); 
auto OutputVectors = 
    std::vector<std::shared_ptr<std::vector<unsigned __int128>>>(
     NThreads, std::make_shared<std::vector<unsigned __int128>>()); 

for (auto I = 0; I < NThreads; ++I) {
    unsigned __int128 Min = I * BlockSize;
    unsigned __int128 Max = I * BlockSize + BlockSize;

    if (I == NThreads - 1)
        Max = PathCountLength;

    Futures[I] = std::async(
        std::launch::async,
        [](unsigned __int128 WMin, unsigned __int128 Min, unsigned __int128 Max,
           std::vector<unsigned __int128> ZeroChildren,
           std::vector<unsigned __int128> OneChildren,
           unsigned __int128 PathCountLength,
           std::shared_ptr<std::vector<unsigned __int128>> OutputVector)
            -> int {
            for (unsigned __int128 I = Min; I < Max; ++I) {
                OutputVector->push_back(KneeParallel::pathCountOrStatic(
                    WMin, I, ZeroChildren, OneChildren, PathCountLength));
            }
        },
        WMin, Min, Max, ZeroChildInit, OneChildInit, PathCountLength,
        OutputVectors[I]);
}

for (auto &Future : Futures) { 
    Future.get(); 
}
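
A caveat about the `OutputVectors` initialisation above: the `(count, value)` constructor of `std::vector` copies `value` into every element, so all `NThreads` entries end up pointing at the same vector, and the tasks would then call `push_back` on it concurrently without synchronisation. The sketch below contrasts that with creating one vector per slot; `Buffer`, `sharedBuffers` and `separateBuffers` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

using Buffer = std::vector<int>;

// Fill constructor: one make_shared call, n copies of the same pointer.
std::vector<std::shared_ptr<Buffer>> sharedBuffers(std::size_t n) {
    return std::vector<std::shared_ptr<Buffer>>(n, std::make_shared<Buffer>());
}

// One make_shared call per slot: n independent buffers.
std::vector<std::shared_ptr<Buffer>> separateBuffers(std::size_t n) {
    std::vector<std::shared_ptr<Buffer>> buffers;
    buffers.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        buffers.push_back(std::make_shared<Buffer>());
    return buffers;
}
```

With the second form, each task can own its output buffer and the results can be concatenated after all futures have completed.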

Source

2015-02-23 user3259106

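Asynchronous calls have nothing to do with multiprocessing. – texasbruce 2015-02-24 07:40:36

@texasbruce Do you mean that 'std::async' is not a proper way to do multiprocessing? What is the feature that makes it multi-core? – cppBeginner 2017-12-07 11:17:31

By providing a launch policy as the first argument to std::async, you can configure it to run deferred (std::launch::deferred), to run in its own thread (std::launch::async), or to let the system decide between the two options (std::launch::async | std::launch::deferred). The latter is the default behaviour.

So, to force it to run in another thread, change your call of std::async to std::async(std::launch::async, /*...*/).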
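
The difference between the two policies can be observed by comparing thread ids. A small sketch, where `ranOnCallerThread` is an illustrative name rather than anything from the question: a deferred task runs lazily on whichever thread calls `get()`, while an `std::launch::async` task runs on a separate thread.

```cpp
#include <future>
#include <thread>

// Returns true if a task launched with `policy` executed on the calling thread.
bool ranOnCallerThread(std::launch policy) {
    const std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return f.get() == caller; // for deferred tasks, get() runs the task here
}
```

Calling it with `std::launch::deferred` should report the caller's thread, and with `std::launch::async` a different one. With the default policy the implementation is free to pick either.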

Source

2015-02-23 16:48:52 SebastianK

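Thanks for the suggestion, but it didn't help...? I added std::launch::async at the start, but it exhibits the same behaviour. A small output from zsh after a short run: "13.92s user 5.28s system 100% cpu 19.184 total" – user3259106 2015-02-23 18:07:34

I've edited the original question, adding some information to see if it gives any more insight – user3259106 2015-02-23 19:22:53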
