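Problem defining a cron job in a Procfile (Heroku) using apscheduler in a Django project
Hi Stack Overflow community, I have a problem scheduling a cron job that needs to scrape a website and store the results in the database as instances of a model (Movie). The problem is that the model seems to be loaded before Django itself is ready when the Procfile process runs. How should I create a cron job that runs in the background and stores the scraped information in the database? Here is my code: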
Procfile:
web: python manage.py runserver 0.0.0.0:$PORT
scheduler: python cinemas/scheduler.py
scheduler.py:
# More code above
from cinemas.models import Movie
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

# The cron trigger's keyword is `minute` (singular)
@sched.scheduled_job('cron', day_of_week='mon-fri', hour=0, minute=26)
def get_movies_playing_now():
    global url_movies_playing_now
    Movie.objects.all().delete()
    while url_movies_playing_now:
        title = []
        description = []
        # Create a BeautifulSoup object from the page at the current URL
        s = requests.get(url_movies_playing_now, headers=headers)
        soup = bs4.BeautifulSoup(s.text, "html.parser")
        movies = soup.find_all('ul', class_='w462')[0]
        # Find each movie's title
        for movie_title in movies.find_all('h3'):
            title.append(movie_title.text)
        # Find each movie's description
        for movie_description in movies.find_all('p'):
            description.append(movie_description.text.replace(" [More]", "."))
        for t, d in zip(title, description):
            m = Movie(movie_title=t, movie_description=d)
            m.save()
        # Go to the next page to find more movies
        paging = soup.find(class_='pagenating').find_all(
            'a', class_=lambda x: x != "inactive")
        href = ""
        for p in paging:
            if "next" in p.text.lower():
                href = p['href']
        url_movies_playing_now = href

sched.start()
# More code below
cinemas/models.py:
from django.db import models

# Create your models here.
class Movie(models.Model):
    movie_title = models.CharField(max_length=200)
    movie_description = models.CharField(max_length=20200)
This is the error I get when the job runs:
2016-11-17T17:57:06.074914+00:00 app[scheduler.1]: Traceback (most recent call last):
2016-11-17T17:57:06.074931+00:00 app[scheduler.1]:   File "cinemas/scheduler.py", line 2, in <module>
2016-11-17T17:57:06.075058+00:00 app[scheduler.1]:     import cineplex
2016-11-17T17:57:06.075060+00:00 app[scheduler.1]:   File "/app/cinemas/cineplex.py", line 1, in <module>
2016-11-17T17:57:06.075173+00:00 app[scheduler.1]:     from cinemas.models import Movie
2016-11-17T17:57:06.075196+00:00 app[scheduler.1]:   File "/app/cinemas/models.py", line 5, in <module>
2016-11-17T17:57:06.075295+00:00 app[scheduler.1]:     class Movie(models.Model):
2016-11-17T17:57:06.075297+00:00 app[scheduler.1]:   File "/app/.heroku/python/lib/python3.5/site-packages/django/db/models/base.py", line 105, in __new__
2016-11-17T17:57:06.075414+00:00 app[scheduler.1]:     app_config = apps.get_containing_app_config(module)
2016-11-17T17:57:06.075440+00:00 app[scheduler.1]:   File "/app/.heroku/python/lib/python3.5/site-packages/django/apps/registry.py", line 237, in get_containing_app_config
2016-11-17T17:57:06.075585+00:00 app[scheduler.1]:     self.check_apps_ready()
2016-11-17T17:57:06.075586+00:00 app[scheduler.1]:   File "/app/.heroku/python/lib/python3.5/site-packages/django/apps/registry.py", line 124, in check_apps_ready
2016-11-17T17:57:06.075703+00:00 app[scheduler.1]:     raise AppRegistryNotReady("Apps aren't loaded yet.")
2016-11-17T17:57:06.075726+00:00 app[scheduler.1]: django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
If I don't include the Model object, the cron job works fine. How can I run this job every day using the Model object without it failing?
Thanks
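One possible way to avoid AppRegistryNotReady in a standalone script, sketched here under some assumptions: the script points Django at its settings and calls django.setup() before any model is imported. The settings path "myproject.settings" and the helper names bootstrap_django and main are hypothetical placeholders, not taken from the question; substitute the project's real settings module.

```python
import os


def bootstrap_django(settings_module="myproject.settings"):
    """Configure Django for a script started outside manage.py.

    "myproject.settings" is a hypothetical placeholder for the real
    settings module of the project.
    """
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)
    import django  # imported here so the settings variable is set first
    django.setup()  # populates the app registry; models are importable after this


def main():
    bootstrap_django()

    # Model and scheduler imports come only after django.setup() has run.
    from apscheduler.schedulers.blocking import BlockingScheduler
    from cinemas.models import Movie

    sched = BlockingScheduler()

    @sched.scheduled_job('cron', day_of_week='mon-fri', hour=0, minute=26)
    def get_movies_playing_now():
        # Placeholder body: the scraping loop from the question goes here.
        Movie.objects.all().delete()

    sched.start()  # blocks and runs the scheduled jobs


if __name__ == "__main__":
    main()
```

With this layout the Procfile entry from the question (scheduler: python cinemas/scheduler.py) can stay unchanged, provided any other module that imports models (such as cineplex.py in the traceback) is also imported only after bootstrap_django() has been called.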
Thanks! This worked for me :)