peewee vs. peewee-async: why is async slower?

I'm trying to wrap my head around Tornado and async connections to Postgresql. I found a library that can do this at http://peewee-async.readthedocs.io/en/latest/.

I devised a small test to compare traditional Peewee with peewee-async, but somehow the async version runs slower.

This is my app:

import peewee 
import tornado.web 
import logging 
import asyncio 
import peewee_async 
import tornado.gen 
import tornado.httpclient 
from tornado.platform.asyncio import AsyncIOMainLoop 

AsyncIOMainLoop().install() 
app = tornado.web.Application(debug=True) 
app.listen(port=8888) 

# =========== 
# Defining Async model 
async_db = peewee_async.PooledPostgresqlDatabase(
    'reminderbot', 
    user='reminderbot', 
    password='reminderbot', 
    host='localhost' 
) 
app.objects = peewee_async.Manager(async_db) 
class AsyncHuman(peewee.Model):
    first_name = peewee.CharField()
    messenger_id = peewee.CharField()

    class Meta:
        database = async_db
        db_table = 'chats_human'


# ========== 
# Defining Sync model 
sync_db = peewee.PostgresqlDatabase(
    'reminderbot', 
    user='reminderbot', 
    password='reminderbot', 
    host='localhost' 
) 
class SyncHuman(peewee.Model):
    first_name = peewee.CharField()
    messenger_id = peewee.CharField()

    class Meta:
        database = sync_db
        db_table = 'chats_human'

# defining two handlers - async and sync 
class AsyncHandler(tornado.web.RequestHandler):

    async def get(self):
        """
        An asynchronous way to create an object and return its ID
        """
        obj = await self.application.objects.create(
            AsyncHuman, messenger_id='12345')
        self.write({
            'id': obj.id,
            'messenger_id': obj.messenger_id
        })


class SyncHandler(tornado.web.RequestHandler):

    def get(self):
        """
        A traditional synchronous way
        """
        obj = SyncHuman.create(messenger_id='12345')
        self.write({
            'id': obj.id,
            'messenger_id': obj.messenger_id
        })


app.add_handlers('', [ 
    (r"/receive_async", AsyncHandler), 
    (r"/receive_sync", SyncHandler), 
]) 

# Run loop 
loop = asyncio.get_event_loop() 
try: 
    loop.run_forever() 
except KeyboardInterrupt: 
    print(" server stopped")

And this is what I get from Apache Benchmark:

ab -n 100 -c 100 http://127.0.0.1:8888/receive_async 

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    4   1.5      5       7
Processing:   621 1049 256.6   1054    1486
Waiting:      621 1048 256.6   1053    1485
Total:        628 1053 255.3   1058    1492

Percentage of the requests served within a certain time (ms)
  50%   1058
  66%   1196
  75%   1274
  80%   1324
  90%   1409
  95%   1452
  98%   1485
  99%   1492
 100%   1492 (longest request)




ab -n 100 -c 100 http://127.0.0.1:8888/receive_sync

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        2    5   1.9      5       8
Processing:     8  476 277.7    479    1052
Waiting:        7  476 277.7    478    1052
Total:         15  481 276.2    483    1060

Percentage of the requests served within a certain time (ms)
  50%    483
  66%    629
  75%    714
  80%    759
  90%    853
  95%    899
  98%   1051
  99%   1060
 100%   1060 (longest request)

Why is sync faster? What bottleneck am I missing?

2016-10-01 kurtgn

For a long explanation:

http://techspot.zzzeek.org/2015/02/15/asynchronous-python-and-databases/

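For a short explanation: synchronous Python code is simple and mostly implemented in the standard library's socket module, which is pure C. Asynchronous Python code is more complex than synchronous code. Each request requires several executions of the main event loop code, which is written in Python (in the asyncio case here), so it carries a lot of overhead compared to C code.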

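Benchmarks like yours show the overhead of async dramatically, because there is no network latency between your application and your database, and you are doing a large number of very small database operations. Since every other aspect of the benchmark is fast, the many executions of the event loop logic add up to a large proportion of the total runtime.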
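To make the overhead concrete, here is a minimal sketch that runs the same trivial operation many times, once as a plain function call and once as a coroutine that yields to the asyncio event loop on every call. The names `tiny_op`, `tiny_op_async`, and the count `N` are made up for illustration, and `asyncio.sleep(0)` only stands in for the point where a real async driver would wait on a socket. It is not a model of peewee-async itself.

```python
import asyncio
import time

N = 10_000  # number of tiny operations; an arbitrary illustrative value


def tiny_op(i):
    """Stand-in for a very small, fast operation."""
    return i + 1


async def tiny_op_async(i):
    # Yield to the event loop once per call, roughly where an async
    # driver would suspend while waiting for the database socket.
    await asyncio.sleep(0)
    return i + 1


def run_sync():
    """Run N operations as plain function calls; return (total, seconds)."""
    start = time.perf_counter()
    total = sum(tiny_op(i) for i in range(N))
    return total, time.perf_counter() - start


async def _run_async_inner():
    total = 0
    for i in range(N):
        total += await tiny_op_async(i)
    return total


def run_async():
    """Run N operations through the event loop; return (total, seconds)."""
    start = time.perf_counter()
    total = asyncio.run(_run_async_inner())
    return total, time.perf_counter() - start


if __name__ == "__main__":
    _, sync_seconds = run_sync()
    _, async_seconds = run_async()
    print(f"sync:  {sync_seconds:.4f} s")
    print(f"async: {async_seconds:.4f} s")
```

Both versions compute the same result; only the time differs. The exact ratio depends on the machine and Python version, so no numbers are quoted here.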

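Mike Bayer's argument, linked above, is that low-latency scenarios like this one are typical for database applications, and therefore database operations should not be run on the event loop.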

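Async is best for high-latency scenarios, like websites and web crawlers, where the application spends most of its time waiting for the peer rather than executing Python. In conclusion: if your application has a good reason to be async (it deals with slow peers), then having an async database driver is a good idea for the sake of consistent code, but expect some overhead.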
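The high-latency case can be sketched the same way. In the example below, `slow_peer` is a hypothetical stand-in for a request to a slow remote service, and `PEER_LATENCY` is an invented delay; many such waits overlap on one event loop, so the total time stays close to a single round trip instead of growing with the number of requests.

```python
import asyncio
import time

PEER_LATENCY = 0.2  # seconds; an invented delay for a slow remote peer


async def slow_peer(i):
    # Simulate waiting on a slow peer; the event loop is free to run
    # other coroutines while this one is suspended.
    await asyncio.sleep(PEER_LATENCY)
    return i


async def fetch_all(n):
    # Start n waits concurrently and collect the results in order.
    return await asyncio.gather(*(slow_peer(i) for i in range(n)))


def timed_fetch(n):
    """Run n simulated requests concurrently; return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(fetch_all(n))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, seconds = timed_fetch(50)
    print(f"{len(results)} requests in {seconds:.2f} s")
```

Run one after another, 50 such requests would need about 50 × 0.2 = 10 seconds of waiting; here the waits overlap, which is the situation where the event loop's overhead pays for itself.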

If you don't need async for another reason, don't use it for database calls, because they'll be a bit slower.

2016-10-01 13:54:58

So could an async web framework like Sanic (https://github.com/channelcat/sanic) speed things up? It uses Python 3.5 + uvloop. – Pegasus

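Database ORMs introduce many complexities for async architectures. There are several places within an ORM where blocking may take place, and converting them all to an async form can be overwhelming. The places where blocking occurs can also vary from database to database. My guess as to why your results are so slow is that there are a lot of unoptimized calls to and from the event loop (I could be severely wrong; I mostly use SQLAlchemy or raw SQL these days). In my experience, it's usually quicker to execute database code in a thread and yield the result when it's available. I can't really speak for Peewee, but SQLAlchemy is well suited to running in a multithreaded environment and there are not too many downsides (but the ones that do exist are very, very annoying).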

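I'd recommend you try using ThreadPoolExecutor with the synchronous Peewee module and run the database functions in a thread. You will have to make changes to your main code, but it will be worth it if you ask me. For example, let's say you opt to use callback code; then your ORM queries might look like this: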

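from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=10)

def queryByName(name):
    # db_model.findOne is a placeholder for your own synchronous ORM call;
    # submit() runs it in a worker thread and returns a Future immediately.
    query = executor.submit(db_model.findOne, name=name)
    query.add_done_callback(processResult)

def processResult(query):
    # Called once the Future has completed. result() returns the value,
    # or re-raises any exception raised inside the worker thread.
    orm_obj = query.result()
    # do stuff with the results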

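You could use coroutines with yield from or await, but that was a bit problematic for me. Also, I'm not well accustomed to coroutines yet. This snippet should work well with Tornado as long as developers are careful about deadlocks, database sessions, and transactions. These factors can really slow down your application if something goes wrong in the thread.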
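For readers who do want the coroutine form, the thread-pool idea can be sketched with asyncio's `loop.run_in_executor`. This is only an illustration under stated assumptions: `blocking_query` is a hypothetical stand-in for a synchronous ORM call (it just sleeps), and the code needs Python 3.7 or later for `asyncio.run` and `asyncio.get_running_loop`.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=10)


def blocking_query(name):
    """Hypothetical stand-in for a synchronous ORM call."""
    time.sleep(0.05)  # pretend to wait for the database
    return {"name": name}


async def query_by_name(name):
    # Hand the blocking call to a worker thread and suspend this
    # coroutine until the thread has produced a result.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, blocking_query, name)


async def main():
    # Several queries run in parallel threads while the event loop
    # stays free to serve other coroutines.
    return await asyncio.gather(*(query_by_name(n) for n in ("a", "b", "c")))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The same caveats about deadlocks, sessions, and transactions apply here, since the database work still happens in worker threads.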

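If you're feeling very adventurous, MagicStack (the company behind asyncio) has a project called asyncpg, which is supposed to be very fast! I've been meaning to try it, but haven't found the time :(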

2016-10-01 20:51:27

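I can agree with most of your answer, but the phrase "MagicStack (the company behind asyncio)" wrongly suggests that they are in charge of asyncio or are its authors. They contributed to async/await, but that makes them contributors among others, one more part of the ecosystem. Anyway, I've upvoted you, because your example is useful and can help the OP and others research this area. – mydaemon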
