2016-10-03 71 views
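Google App Engine and Google Sheets exceeding soft memory limit

I'm writing a simple service that pulls data from a few sources, aggregates it, and sends it to a Google Sheet using the Google API client. Easy peasy, it works well, and the data is not that large.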

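The problem is that calling .spreadsheets() after building the API service (i.e. build('sheets', 'v4', http=auth).spreadsheets()) causes a memory jump of roughly 30 megabytes (I did some profiling to isolate where the memory was being allocated). When deployed to GAE, these spikes stick around for long stretches of time (sometimes hours), creep upward, and after several requests trigger GAE's "Exceeded soft private memory limit" error.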
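For readers who want to reproduce this kind of measurement locally, here is a minimal sketch using `tracemalloc`. It assumes Python 3 (the module is not available on the Python 2.7 GAE runtime), and `measure_peak` and `allocate_buffer` are hypothetical names; `allocate_buffer` is only a stand-in for the real call, such as `build(...).spreadsheets()`.

```python
import tracemalloc


def measure_peak(func, *args, **kwargs):
    """Run func and return (result, peak_bytes) traced during the call."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def allocate_buffer(size):
    # Hypothetical stand-in for the call under test,
    # e.g. build('sheets', 'v4', http=auth).spreadsheets()
    return bytearray(size)


if __name__ == "__main__":
    _, peak = measure_peak(allocate_buffer, 5 * 1024 * 1024)
    print("peak: %.1f MiB" % (peak / (1024.0 * 1024.0)))
```

This only counts memory allocated through Python's allocators, so it may differ from the process-level figure that GAE reports.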

I'm using memcache for the discovery document and urlfetch to grab the data, but those are the only other services I'm using.

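I have tried manual garbage collection, changing threadsafe in app.yaml, and even things like changing where .spreadsheets() is called, and I can't shake the problem. It's also possible that I'm misunderstanding GAE's architecture, but I know the spike is caused by the call to .spreadsheets(), and I'm not storing anything in local caches.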

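Is there a way to 1) reduce the size of the memory spike caused by calling .spreadsheets(), or 2) keep the spike from staying in memory (or preferably both)? A very simplified gist is below to give an idea of the API calls and the request handler; I can provide fuller code if needed. I know similar questions have been asked before, but I haven't been able to resolve this.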

https://gist.github.com/chill17/18f1caa897e6a202aca05239

In fact, I found [issue #7973](https://code.google.com/p/googleappengine/issues/detail?id=7973) and [issue #12220](https://code.google.com/p/googleappengine/issues/detail?id=12220&can=1&q=Exceeded%20soft%20private%20memory&colspec=ID%20Type%20Component%20Status%20Stars%20Summary%20Language%20Priority%20Owner%20Log) in the tracker, both related to the "Exceeded soft private memory limit" problem you are running into. According to those threads, the issue has not been fully resolved, and the workaround given in one of them does not seem to apply to your case. – Teyam

Answer

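I ran into this when using the Spreadsheets API on a small processor with only 20MB of usable RAM. The problem is that the Google API client pulls in the whole API description as a string and stores it in memory as a resource object.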

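If free memory is a concern, you should construct your own http object and make the requests you need by hand. See my Spreadsheet() class below as an example of how to create a new spreadsheet this way.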

import json
import os

import httplib2
from oauth2client import client, tools
from oauth2client.file import Storage

# `flags` was undefined in the original snippet; it is defined here the way
# the Google Sheets Python quickstart does it.
try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

SCOPES = 'https://www.googleapis.com/auth/spreadsheets'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Google Sheets API Python Quickstart'


class Spreadsheet:

    def __init__(self, title):
        # Get credentials from a locally stored JSON file.
        # If the file does not exist, create it.
        self.credentials = self.getCredentials()

        # Authorized HTTP object that will be used to push/pull data.
        self.service = self.credentials.authorize(httplib2.Http())
        self.headers = {
            'content-type': 'application/json',
            'accept-encoding': 'gzip, deflate',
            'accept': 'application/json',
            'user-agent': 'google-api-python-client/1.6.2 (gzip)',
        }

        print("CREDENTIALS: " + str(self.credentials))

        self.baseUrl = "https://sheets.googleapis.com/v4/spreadsheets"
        self.spreadsheetInfo = self.create(title)
        self.spreadsheetId = self.spreadsheetInfo['spreadsheetId']

    def getCredentials(self):
        """Gets valid user credentials from storage.

        If nothing has been stored, or if the stored credentials are invalid,
        the OAuth2 flow is completed to obtain the new credentials.

        Returns:
            Credentials, the obtained credential.
        """
        home_dir = os.path.expanduser('~')
        credential_dir = os.path.join(home_dir, '.credentials')
        if not os.path.exists(credential_dir):
            os.makedirs(credential_dir)
        credential_path = os.path.join(
            credential_dir, 'sheets.googleapis.com-python-quickstart.json')

        store = Storage(credential_path)
        credentials = store.get()
        if not credentials or credentials.invalid:
            flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
            flow.user_agent = APPLICATION_NAME
            if flags:
                credentials = tools.run_flow(flow, store, flags)
            else:  # Needed only for compatibility with Python 2.6
                credentials = tools.run(flow, store)
            print('Storing credentials to ' + credential_path)
        return credentials

    def create(self, title):
        # Only put the title in the request body; nothing else is needed for now.
        requestBody = {
            "properties": {
                "title": title
            },
        }

        print("BODY: " + str(requestBody))
        url = self.baseUrl

        # The body must be valid JSON: str() on a dict produces a Python
        # repr with single quotes, so serialize with json.dumps instead.
        response, content = self.service.request(url,
                                                 method="POST",
                                                 headers=self.headers,
                                                 body=json.dumps(requestBody))
        print("\n\nRESPONSE\n" + str(response))
        print("\n\nCONTENT\n" + str(content))

        return json.loads(content)
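
The same manual approach should extend to other endpoints. As a minimal sketch, assuming the Sheets API v4 `values.append` REST endpoint and an already authorized httplib2-style object (anything with a compatible `request()` method), rows could be appended without loading the discovery document. The helper names `build_append_request` and `append_rows` are hypothetical and not part of any library.

```python
import json

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

BASE_URL = "https://sheets.googleapis.com/v4/spreadsheets"


def build_append_request(spreadsheet_id, cell_range, rows):
    """Return (url, body) for a values.append call (hypothetical helper)."""
    url = "%s/%s/values/%s:append?valueInputOption=RAW" % (
        BASE_URL,
        quote(spreadsheet_id, safe=""),
        quote(cell_range, safe="!:"),  # keep A1 notation such as Sheet1!A1:B2
    )
    body = json.dumps({"values": rows})
    return url, body


def append_rows(http, spreadsheet_id, cell_range, rows):
    """POST rows through an authorized httplib2-style object; return parsed JSON."""
    url, body = build_append_request(spreadsheet_id, cell_range, rows)
    response, content = http.request(
        url,
        method="POST",
        body=body,
        headers={"content-type": "application/json"},
    )
    # httplib2 returns bytes on Python 3; decode before parsing.
    if isinstance(content, bytes):
        content = content.decode("utf-8")
    return json.loads(content)
```

With the class above, a call might look like `append_rows(sheet.service, sheet.spreadsheetId, "Sheet1!A1", [["a", 1]])`. The sketch does no status checking, so a real caller would probably want to inspect `response` before parsing the content.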