2017-07-21

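Tweepy StreamListener for extended tweets

I am using Tweepy's StreamListener and want to retrieve tweets about the current political debate in the UK. Unfortunately, for retweets and replies I only receive truncated tweets.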

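For example:

RT @ZaidJilani: Schumer (a sponsor of the antiBDS bill) said we should strangle Gaza. Jeremy Corbyn says suppressing them will...

when the full tweet should be:

Schumer (a sponsor of the antiBDS bill) said we should strangle Gaza. Jeremy Corbyn says suppressing them will only radicalize people.

I have seen that the regular Twitter API can return extended tweets with `tweet_mode=extended`, but I cannot find anything similar for the Streaming API. Does anyone have a solution for this? My code is below: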

from tweepy import Stream 
from tweepy import OAuthHandler 
from tweepy.streaming import StreamListener 
from redis import Redis 
from rq import Queue 
import requests 
import time 
import io 
import os 
import json 
import threading 
import multiprocessing 
from datetime import datetime, timedelta 
import _credentials 

# twitter OAuth 
ckey = _credentials.ckey 
consumer_secret = _credentials.consumer_secret 
access_token_key = _credentials.access_token_key 
access_token_secret = _credentials.access_token_secret 



# Listener class override
class listener(StreamListener):

    def __init__(self, start_time, time_limit):
        super().__init__()
        self.time = start_time
        self.limit = time_limit
        self.tweet_data = []

    def on_data(self, data):
        localtime = datetime.now().strftime("%Y-%b-%d--%H-%M-%S")
        print(localtime)

        # Keep collecting raw JSON strings until the time limit is reached
        if (time.time() - self.time) < self.limit:
            self.tweet_data.append(data)
            return True

        # Time limit reached: write the collected tweets as one JSON array
        with io.open('raw_tweets_{}.json'.format(localtime), 'w', encoding='utf-8') as saveFile:
            saveFile.write(u'[\n')
            saveFile.write(','.join(self.tweet_data))
            saveFile.write(u'\n]')
        return False  # returning False from on_data stops the stream

    def on_error(self, status):
        print(status)

    def on_disconnect(self, notice):
        print('bye')



#Beginning of the specific code 
keyword_list = ['Theresa May', 'Jeremy Corbyn', 'GE2017', 'Labour', 'Tory','Tories'] #track list 

start_time=time.time() 
auth = OAuthHandler(ckey, consumer_secret) #OAuth object 
auth.set_access_token(access_token_key, access_token_secret) 
twitterStream = Stream(auth, listener(start_time, time_limit=10)) #initialize   Stream object with a time out limit 
twitterStream.filter(track=keyword_list, languages=['en']) #call the filter method to run the Stream Listener 

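The [Twitter Dev Documentation](http://dev.twitter.com/overview/api/upcoming-changes-to-tweets) states that it is not available in the Streaming API: "The Streaming API does not provide [...] Tweets rendered in compatibility mode via the public REST API will not contain the extended_tweet field." – davedwards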

Answers


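Some time has passed since this was asked, and I believe full text is now supported.

At this link:

https://developer.twitter.com/en/docs/tweets/tweet-updates

it says that compatibility mode is supported by default, and support for `tweet_mode='extended'` appears to have been added. My (possibly ugly) code showing how I handle it is here: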

if 'extended_tweet' in raw_tweepy_data_object:
    if 'full_text' in raw_tweepy_data_object['extended_tweet']:
        self.text = raw_tweepy_data_object['extended_tweet']['full_text']
    else:
        pass  # i need to figure out what is possible here
elif 'text' in raw_tweepy_data_object:
    self.text = raw_tweepy_data_object['text']
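
For the retweet case in the question, the check above may not be enough on its own: in Streaming API payloads the top-level `text` of a retweet is the shortened "RT @user: ..." string, while the original tweet sits under `retweeted_status`, which can carry its own `extended_tweet`. A minimal sketch of a helper that looks there first follows. The function name `get_full_text` and the sample payload are made up for illustration and are not part of Tweepy.

```python
def get_full_text(tweet):
    """Return the untruncated text of a tweet payload given as a dict."""
    # A retweet's own text is "RT @user: ..." cut short; the complete
    # original tweet is nested under 'retweeted_status'.
    source = tweet.get('retweeted_status', tweet)
    extended = source.get('extended_tweet')
    if extended and 'full_text' in extended:
        return extended['full_text']
    # REST payloads fetched with tweet_mode='extended' carry 'full_text'
    # at the top level; otherwise fall back to the plain 'text' field.
    return source.get('full_text', source.get('text', ''))


# Hypothetical, heavily trimmed payload, for illustration only
sample = {
    'text': 'RT @someone: a long tweet that gets cut...',
    'retweeted_status': {
        'text': 'a long tweet that gets cut...',
        'truncated': True,
        'extended_tweet': {'full_text': 'a long tweet that gets cut off no more'},
    },
}
print(get_full_text(sample))
```

Note that this returns the original author's text without the "RT @user:" prefix; prepend it yourself if you need to keep the retweet attribution.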

Update.

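# Create the stream with tweet_mode='extended' (self is the listener)
self.stream = Stream(auth=auth, listener=self, tweet_mode='extended')

# In on_data: parse the raw JSON and use the full text when it is present
tweet_data = json.loads(data)
if "extended_tweet" in tweet_data:
    tweet = tweet_data['extended_tweet']['full_text']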
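
The snippet above leaves `tweet` unset when there is no `extended_tweet` key, which happens for short tweets and for non-tweet messages such as `limit` or `delete` notices that the stream also delivers. Here is a rough sketch of a more defensive version that works on the raw string passed to `on_data`. The helper name `text_from_raw` is an assumption for illustration and is not a Tweepy function.

```python
import json


def text_from_raw(data):
    """Parse one raw stream message; return its text, or None for non-tweets."""
    try:
        message = json.loads(data)
    except ValueError:
        return None  # not valid JSON
    if not isinstance(message, dict) or 'text' not in message:
        return None  # e.g. 'limit' or 'delete' notices carry no text
    # Prefer the extended full text, fall back to the plain 'text' field
    extended = message.get('extended_tweet', {})
    return extended.get('full_text', message['text'])


# Hypothetical raw messages, for illustration only
long_tweet = '{"text": "cut...", "extended_tweet": {"full_text": "cut off no more"}}'
notice = '{"limit": {"track": 5}}'
print(text_from_raw(long_tweet))
print(text_from_raw(notice))
```

Inside `on_data` you could call this and skip the message whenever it returns `None`.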

PS. Please excuse the formatting, spelling mistakes, etc. I am new to Stack Overflow and just hope to help others facing this problem.