2016-07-22 67 views
-3

Parsing usernames to extract user location from Twitter

I am trying to infer users' locations from their Twitter usernames.

Input: a user list with more than 50K usernames

AkkiPritam,6.77E+17,12/15/2015,#chennaifloods 
AkkiPritam,6.77E+17,12/15/2015,#bhoomikatrust 
AkkiPritam,6.77E+17,12/15/2015,#akshaykumar 
gischethans,6.77E+17,12/15/2015,#chennaifloods 
mid_day,6.77E+17,12/15/2015,#bollywood 
mid_day,6.77E+17,12/15/2015,#chennaifloods 
Nanthivarman16,6.77E+17,12/15/2015,#admkfails 
Nanthivarman16,6.77E+17,12/15/2015,#jayafails 
Nanthivarman16,6.77E+17,12/15/2015,#stickergovt 
Nanthivarman16,6.77E+17,12/15/2015,#chennaifloods 
AdilaMatra,6.77E+17,12/15/2015,#chennaifloods 
AdilaMatra,6.77E+17,12/15/2015,#climatechange 
AdilaMatra,6.77E+17,12/15/2015,#delhichokes 
AdilaMatra,6.77E+17,12/15/2015,#smog 
HDFCERGOGIC,6.77E+17,12/15/2015,#chennaifloods 
HDFCERGOGIC,6.77E+17,12/15/2015,#tnfloods 
ImSoorej,6.77E+17,12/15/2015,#chennaifloods 
ImSoorej,6.77E+17,12/15/2015,#chennaimicr 

Code: I want to find each user's location, possibly as geographic coordinates.

from __future__ import print_function 
import tweepy 
from tweepy import OAuthHandler 
from tweepy import Stream 
from tweepy.streaming import StreamListener 
import pandas as pd 
import csv 

consumer_key = 'xyz' 
consumer_secret = 'xyz' 
access_token = 'xyz' 
access_token_secret = 'xyz' 

data = pd.read_csv('user_keyword.csv') 
df = ['user_name', 'user_id', 'date', 'keyword'] 

def get_user_details(username): 
     userobj = api.get_user(username) 
     return userobj 

if __name__ == '__main__': 
    #authenticating the app (https://apps.twitter.com/) 
    auth = tweepy.auth.OAuthHandler(consumer_key, consumer_secret) 
    auth.set_access_token(access_token, access_token_secret) 
    api = tweepy.API(auth) 

    username = df['user_name'] 
    userOBJ = get_user_details(username) 
    print(userOBJ.location) 

Error: I cannot pass the usernames into the program.

Traceback (most recent call last): 
    File "user_profile_location.py", line 38, in <module> 
    username = df['user_name'] 
TypeError: list indices must be integers, not str 
+1

Umm. 'df' is not a dictionary, it is a list of strings. You need to use integer indices to access the elements of 'df'.

+0

@ChitharanjanDas Thanks! What should I change?

+0

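In your code, 'data = pd.read_csv('user_keyword.csv')' creates a DataFrame. 'df = ['user_name', 'user_id', 'date', 'keyword']' creates a Python list and assigns that list to the variable 'df'. If the header of your CSV matches the items in the list, then you need to use 'data['user_name']'. My best advice is to read the pandas [docs](http://pandas.pydata.org/pandas-docs/stable/) – toasteez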

Answers

1

You are using 'data' for your DataFrame, and 'df' for what I guess should be the DataFrame's column names:

data = pd.read_csv('user_keyword.csv') 
df = ['user_name', 'user_id', 'date', 'keyword'] 

I assume the user_keyword.csv file has no header row. Try adding:

data.columns = df 

This changes the column names to the values stored in df. Then, later on, instead of:

username = df['user_name'] 

try:

username = data['user_name'] 

Keep in mind that username is now a whole column, so get_user_details(username) should not expect a single string.
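The point about the header-less file and the whole column can be sketched as follows. This is only an illustration under stated assumptions: it assumes Python 3 and that the file really has no header row, it passes the column names from the question to read_csv directly rather than assigning data.columns afterwards, and the helper load_usernames and the in-memory SAMPLE text (a few rows copied from the question's input) are made up for the example.

```python
import io

import pandas as pd

# Column names taken from the question.
COLUMNS = ['user_name', 'user_id', 'date', 'keyword']

# A few rows copied from the question's input, used instead of the real file.
SAMPLE = """AkkiPritam,6.77E+17,12/15/2015,#chennaifloods
AkkiPritam,6.77E+17,12/15/2015,#bhoomikatrust
gischethans,6.77E+17,12/15/2015,#chennaifloods
mid_day,6.77E+17,12/15/2015,#bollywood
mid_day,6.77E+17,12/15/2015,#chennaifloods
"""


def load_usernames(source):
    """Read a header-less CSV and return each distinct user name once, in file order.

    `source` can be a file path or a file-like object (hypothetical helper).
    """
    # header=None tells pandas the first row is data; names= supplies the labels.
    data = pd.read_csv(source, header=None, names=COLUMNS)
    # data['user_name'] is a whole column (a Series), not a single string.
    return data['user_name'].drop_duplicates().tolist()


usernames = load_usernames(io.StringIO(SAMPLE))
print(usernames)  # ['AkkiPritam', 'gischethans', 'mid_day']
```

With the real file, load_usernames('user_keyword.csv') would return a plain list of names. Dropping duplicates first means each account is looked up only once, even though most users appear on several rows.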

+0

I get this error: ' File "user_profile_location.py", line 40, in <module> userOBJ = get_user_details(username) File "user_profile_location.py", line 29, in get_user_details userobj = api.get_user(username) File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 245, in _call return method.execute() File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 229, in execute raise TweepError(error_msg, resp, api_code=api_error_code) tweepy.error.TweepError: [{u'message': u'Could not authenticate you.', u'code': 32}]'

+0

It looks like you have an authentication error – toasteez

+1

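'userOBJ = get_user_details(username)' Try replacing username with your own username. If that works, then the error occurs because you are trying to authenticate users you do not have credentials for. – toasteez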
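Once a lookup with a single known username works, the remaining names have to be passed one at a time, because get_user_details expects one username rather than a column. A minimal sketch of such a loop is below. It rests on assumptions: collect_locations, fake_lookup, and FakeUser are hypothetical names introduced here so the example runs offline, and the location value is a placeholder, not real data. In the question's script, the lookup argument would be get_user_details.

```python
from collections import namedtuple

# Stand-in for the user object returned by the API; only `location` is read here.
FakeUser = namedtuple('FakeUser', ['location'])


def collect_locations(usernames, lookup):
    """Call lookup(name) for every name and record failures instead of stopping."""
    locations, failures = {}, {}
    for name in usernames:
        try:
            user = lookup(name)
        except Exception as exc:  # with the question's setup this would be a TweepError
            failures[name] = str(exc)
            continue
        # Profile location is free text and may be empty or missing.
        locations[name] = getattr(user, 'location', None)
    return locations, failures


def fake_lookup(name):
    """Hypothetical offline replacement for get_user_details, for demonstration only."""
    if name == 'mid_day':
        raise RuntimeError('Could not authenticate you.')
    return FakeUser(location='placeholder location')


locations, failures = collect_locations(['AkkiPritam', 'mid_day'], fake_lookup)
print(locations)  # {'AkkiPritam': 'placeholder location'}
print(failures)   # {'mid_day': 'Could not authenticate you.'}
```

Keeping failures in their own dictionary makes it possible to retry only those names later, which matters with a list of more than 50K usernames.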