2016-08-30

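Retrieve geocodes from the Google API and append them to the original table (Python)

I want to retrieve geocodes for a batch of addresses through the Google Geocoding API and append them to my address table.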

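After two days of searching I have not found a simple way to do this, although it should not be that hard. In particular, I am having trouble parsing the JSON output and appending it to my original table. I am using Python 3.5 on Windows. I originally pull the data from a database and load it into a DataFrame in Python, but to post it here it is easier to convert it to a dictionary and then back to a DataFrame: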

data_dict={'street': {0: 'ROMULO', 1: 'SAN BARTOLOME', 2: 'GARBI', 3: 'SAN JOSE'}, 
'concat': {0: '3+ROMULO+CALLE+ALMERIA', 
    1: '5+SAN BARTOLOME+CALLE+TOLEDO', 
    2: '48+GARBI+CALLE+CASTELLON', 
    3: '30+SAN JOSE+CALLE+SANTA CRUZ DE TENERIFE'}, 
'number': {0: '3', 1: '5', 2: '48', 3: '30'}, 
'province': {0: 'ALMERIA', 
    1: 'TOLEDO', 
    2: 'CASTELLON', 
    3: 'SANTA CRUZ DE TENERIFE'}, 
'region': {0: 'ANDALUCIA', 
    1: 'CASTILLA LA MANCHA', 
    2: 'COMUNIDAD VALENCIANA', 
    3: 'CANARIAS'}} 

Converting it back to a DataFrame:

import pandas as pd 

table=pd.DataFrame.from_dict(data_dict) 

Now I retrieve the data from Google's Geocoding API:

import requests 
import json 

key="MyKey" 
jsonout=[] 
for i in table.loc[:,'concat']:
    try:
        url="https://maps.googleapis.com/maps/api/geocode/json?address=%s&key=%s" % (i, key)
        response = requests.get(url)
        jsonf = response.json()
        jsonout.append(jsonf)
    except Exception:
        continue

I get output like this:

jsonout=[{'results': [{'address_components': [{'long_name': '3', 
     'short_name': '3', 
     'types': ['street_number']}, 
    {'long_name': 'Calle Rómulo', 
     'short_name': 'Calle Rómulo', 
     'types': ['route']}, 
    {'long_name': 'Adra', 
     'short_name': 'Adra', 
     'types': ['locality', 'political']}, 
    {'long_name': 'Almería', 
     'short_name': 'AL', 
     'types': ['administrative_area_level_2', 'political']}, 
    {'long_name': 'Andalucía', 
     'short_name': 'AL', 
     'types': ['administrative_area_level_1', 'political']}, 
    {'long_name': 'Spain', 
     'short_name': 'ES', 
     'types': ['country', 'political']}, 
    {'long_name': '04770', 'short_name': '04770', 'types': ['postal_code']}], 
    'formatted_address': 'Calle Rómulo, 3, 04770 Adra, Almería, Spain', 
    'geometry': {'location': {'lat': 36.7593, 'lng': -2.97818}, 
    'location_type': 'ROOFTOP', 
    'viewport': {'northeast': {'lat': 36.76064898029149, 
     'lng': -2.976831019708498}, 
     'southwest': {'lat': 36.7579510197085, 'lng': -2.979528980291502}}}, 
    'partial_match': True, 
    'place_id': 'ChIJG39VNzNOcA0R2f8Ek3E12AY', 
    'types': ['street_address']}], 
    'status': 'OK'}, 
{'results': [{'address_components': [{'long_name': '5', 
     'short_name': '5', 
     'types': ['street_number']}, 
    {'long_name': 'Calle de San Bartolomé', 
     'short_name': 'Calle de San Bartolomé', 
     'types': ['route']}, 
    {'long_name': 'Toledo', 
     'short_name': 'Toledo', 
     'types': ['locality', 'political']}, 
    {'long_name': 'Toledo', 
     'short_name': 'TO', 
     'types': ['administrative_area_level_2', 'political']}, 
    {'long_name': 'Castilla-La Mancha', 
     'short_name': 'CM', 
     'types': ['administrative_area_level_1', 'political']}, 
    {'long_name': 'Spain', 
     'short_name': 'ES', 
     'types': ['country', 'political']}, 
    {'long_name': '45002', 'short_name': '45002', 'types': ['postal_code']}], 
    'formatted_address': 'Calle de San Bartolomé, 5, 45002 Toledo, Spain', 
    'geometry': {'location': {'lat': 39.8549781, 'lng': -4.026267199999999}, 
    'location_type': 'ROOFTOP', 
    'viewport': {'northeast': {'lat': 39.85632708029149, 
     'lng': -4.024918219708497}, 
     'southwest': {'lat': 39.85362911970849, 'lng': -4.027616180291502}}}, 
    'partial_match': True, 
    'place_id': 'ChIJ4bse1aALag0RJ5RxxfyDxUI', 
    'types': ['street_address']}], 
    'status': 'OK'}, 
{'results': [{'address_components': [{'long_name': '48', 
     'short_name': '48', 
     'types': ['street_number']}, 
    {'long_name': 'Carrer de Garbí', 
     'short_name': 'Carrer de Garbí', 
     'types': ['route']}, 
    {'long_name': 'Peníscola', 
     'short_name': 'Peníscola', 
     'types': ['locality', 'political']}, 
    {'long_name': 'Castelló', 
     'short_name': 'Castelló', 
     'types': ['administrative_area_level_2', 'political']}, 
    {'long_name': 'Comunidad Valenciana', 
     'short_name': 'Comunidad Valenciana', 
     'types': ['administrative_area_level_1', 'political']}, 
    {'long_name': 'Spain', 
     'short_name': 'ES', 
     'types': ['country', 'political']}, 
    {'long_name': '12598', 'short_name': '12598', 'types': ['postal_code']}], 
    'formatted_address': 'Carrer de Garbí, 48, 12598 Peníscola, Castelló, Spain', 
    'geometry': {'location': {'lat': 40.3634529, 'lng': 0.3963583}, 
    'location_type': 'ROOFTOP', 
    'viewport': {'northeast': {'lat': 40.3648018802915, 
     'lng': 0.397707280291502}, 
     'southwest': {'lat': 40.3621039197085, 'lng': 0.395009319708498}}}, 
    'partial_match': True, 
    'place_id': 'ChIJHVNHcelGoBIRogILRMno_wk', 
    'types': ['street_address']}, 
    {'address_components': [{'long_name': '48', 
     'short_name': '48', 
     'types': ['street_number']}, 
    {'long_name': 'Carrer Garbí', 
     'short_name': 'Carrer Garbí', 
     'types': ['route']}, 
    {'long_name': 'Vila-real', 
     'short_name': 'Vila-real', 
     'types': ['locality', 'political']}, 
    {'long_name': 'Castelló', 
     'short_name': 'Castelló', 
     'types': ['administrative_area_level_2', 'political']}, 
    {'long_name': 'Comunidad Valenciana', 
     'short_name': 'Comunidad Valenciana', 
     'types': ['administrative_area_level_1', 'political']}, 
    {'long_name': 'Spain', 
     'short_name': 'ES', 
     'types': ['country', 'political']}, 
    {'long_name': '12540', 'short_name': '12540', 'types': ['postal_code']}], 
    'formatted_address': 'Carrer Garbí, 48, 12540 Vila-real, Castelló, Spain', 
    'geometry': {'bounds': {'northeast': {'lat': 39.955829, 'lng': -0.110409}, 
     'southwest': {'lat': 39.9558231, 'lng': -0.1104261}}, 
    'location': {'lat': 39.9558231, 'lng': -0.110409}, 
    'location_type': 'RANGE_INTERPOLATED', 
    'viewport': {'northeast': {'lat': 39.9571750302915, 
     'lng': -0.109068569708498}, 
     'southwest': {'lat': 39.9544770697085, 'lng': -0.111766530291502}}}, 
    'partial_match': True, 
    'place_id': 'EjRDYXJyZXIgR2FyYsOtLCA0OCwgMTI1NDAgVmlsYS1yZWFsLCBDYXN0ZWxsw7MsIFNwYWlu', 
    'types': ['street_address']}], 
    'status': 'OK'}, 
{'results': [{'address_components': [{'long_name': '30', 
     'short_name': '30', 
     'types': ['street_number']}, 
    {'long_name': 'Calle San José', 
     'short_name': 'Calle San José', 
     'types': ['route']}, 
    {'long_name': 'Santa Cruz de la Palma', 
     'short_name': 'Santa Cruz de la Palma', 
     'types': ['locality', 'political']}, 
    {'long_name': 'Santa Cruz de Tenerife', 
     'short_name': 'TF', 
     'types': ['administrative_area_level_2', 'political']}, 
    {'long_name': 'Canarias', 
     'short_name': 'CN', 
     'types': ['administrative_area_level_1', 'political']}, 
    {'long_name': 'Spain', 
     'short_name': 'ES', 
     'types': ['country', 'political']}, 
    {'long_name': '38700', 'short_name': '38700', 'types': ['postal_code']}], 
    'formatted_address': 'Calle San José, 30, 38700 Santa Cruz de la Palma, Santa Cruz de Tenerife, Spain', 
    'geometry': {'location': {'lat': 28.6864347, 'lng': -17.7624433}, 
    'location_type': 'ROOFTOP', 
    'viewport': {'northeast': {'lat': 28.6877836802915, 
     'lng': -17.7610943197085}, 
     'southwest': {'lat': 28.6850857197085, 'lng': -17.7637922802915}}}, 
    'partial_match': True, 
    'place_id': 'ChIJ8ZFx6__rawwRV3dc118gEgE', 
    'types': ['street_address']}, 
    {'address_components': [{'long_name': '30', 
     'short_name': '30', 
     'types': ['street_number']}, 
    {'long_name': 'Calle San José', 
     'short_name': 'Calle San José', 
     'types': ['route']}, 
    {'long_name': 'San Andrés', 
     'short_name': 'San Andrés', 
     'types': ['locality', 'political']}, 
    {'long_name': 'Santa Cruz de Tenerife', 
     'short_name': 'Santa Cruz de Tenerife', 
     'types': ['administrative_area_level_4', 'political']}, 
    {'long_name': 'Santa Cruz de Tenerife', 
     'short_name': 'TF', 
     'types': ['administrative_area_level_2', 'political']}, 
    {'long_name': 'Canarias', 
     'short_name': 'CN', 
     'types': ['administrative_area_level_1', 'political']}, 
    {'long_name': 'Spain', 
     'short_name': 'ES', 
     'types': ['country', 'political']}, 
    {'long_name': '38120', 'short_name': '38120', 'types': ['postal_code']}], 
    'formatted_address': 'Calle San José, 30, 38120 San Andrés, Santa Cruz de Tenerife, Spain', 
    'geometry': {'location': {'lat': 28.505875, 'lng': -16.1930036}, 
    'location_type': 'ROOFTOP', 
    'viewport': {'northeast': {'lat': 28.5072239802915, 
     'lng': -16.1916546197085}, 
     'southwest': {'lat': 28.5045260197085, 'lng': -16.1943525802915}}}, 
    'partial_match': True, 
    'place_id': 'ChIJsfd-ITjKQQwRjFHLI0XPSok', 
    'types': ['street_address']}], 
    'status': 'OK'}]  

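What I ultimately want is my original table, as a DataFrame, with the latitude and longitude coordinates

(i['results'][0]['geometry']['location']['lat'],
    i['results'][0]['geometry']['location']['lng'])

and the formatted_address from the response.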

Are you asking how to parse the JSON and turn it into a pandas DataFrame the way you want? – wwii

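Yes, but also how to append it to my original table, matching the coordinates to the corresponding rows. I have managed to get all the lats and lngs into a DataFrame, but they do not seem to match the correct rows of my original DataFrame. I would also like to add the 'formatted_address' key from the JSON to check whether the address matches my input. – vdBurg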
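One possible cause of the mismatch is the `except Exception: continue` in the request loop: a failed request is skipped, so `jsonout` ends up shorter than the table and every later result shifts up by one row. The sketch below keeps exactly one entry per input row. The helper names `extract_fields` and `geocode_rows`, the injected `fetch` callable, and the fake data are illustrative assumptions, not part of the original code.

```python
def extract_fields(response):
    """Return (lat, lng, formatted_address) from one Geocoding API response dict.

    Returns (None, None, None) when the response is missing or has no results.
    """
    if not response or response.get("status") != "OK" or not response.get("results"):
        return (None, None, None)
    first = response["results"][0]  # only the first candidate is used
    location = first["geometry"]["location"]
    return (location["lat"], location["lng"], first["formatted_address"])


def geocode_rows(addresses, fetch):
    """Call fetch(address) once per address and return one tuple per input row."""
    rows = []
    for address in addresses:
        try:
            response = fetch(address)
        except Exception:
            response = None  # keep a placeholder instead of skipping the row
        rows.append(extract_fields(response))
    return rows


# Offline demo with a fake fetch function (hypothetical; the second call fails on purpose).
sample = {"status": "OK", "results": [{
    "formatted_address": "Calle Rómulo, 3, 04770 Adra, Almería, Spain",
    "geometry": {"location": {"lat": 36.7593, "lng": -2.97818}}}]}

def fake_fetch(address):
    if address == "bad":
        raise ValueError("simulated request failure")
    return sample

rows = geocode_rows(["3+ROMULO+CALLE+ALMERIA", "bad"], fake_fetch)
```

Because `rows` has the same length and order as the input, it could then be attached to the table with something like `table.join(pd.DataFrame(rows, columns=['lat', 'lng', 'formatted_address'], index=table.index))`.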

Answers

I use this package (geopy) for my geocoding; it takes care of parsing the JSON for you.

from geopy.geocoders import GoogleV3 

googleGeo = GoogleV3('googleKey') 

# create a geocoded list containing geocode objects 
geocoded = [] 
for address in mydata['location']: # assumes mydata is a pandas df 
    geocoded.append(googleGeo.geocode(address)) # geocode function returns a geocoded object 

# append geocoded list to mydata 
mydata['geocoded'] = geocoded 

# create coordinates column (geocode returns None when nothing is found)
mydata['coords'] = mydata['geocoded'].apply(lambda x: (x.latitude, x.longitude) if x is not None else None)

# if you want to split out your lat and long then do
# mydata['lat'] = mydata['geocoded'].apply(lambda x: x.latitude if x is not None else None)
# mydata['long'] = mydata['geocoded'].apply(lambda x: x.longitude if x is not None else None)

Based on your comment that you are using Google's API without an API key, it may help to include a random pause between geocoding calls.

from time import sleep 
from random import randint 
from geopy.geocoders import GoogleV3 

googleGeo = GoogleV3() 

def geocode(address): 
    location = googleGeo.geocode(address) 
    sleep(randint(5,10)) # give the API a break 
    return location 

Then you use this custom function to do your geocoding.
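A minimal sketch of how such a throttled function might be applied to every address in order. The wrapper name `make_throttled`, the stand-in `fake_geocode`, and the injected `sleep` parameter are assumptions made so the example runs offline without waiting; in real use the wrapped function would be `googleGeo.geocode` and the default `time.sleep` would be kept.

```python
import random
import time

def make_throttled(geocode_fn, min_pause=5, max_pause=10, sleep=time.sleep):
    """Wrap geocode_fn so that every call is followed by a random pause."""
    def throttled(address):
        result = geocode_fn(address)
        sleep(random.randint(min_pause, max_pause))  # give the API a break
        return result
    return throttled

# Stand-in for googleGeo.geocode so the sketch runs offline (hypothetical data).
def fake_geocode(address):
    return {"address": address, "lat": 0.0, "lng": 0.0}

pauses = []  # record the pauses instead of actually sleeping in this demo
geocode = make_throttled(fake_geocode, sleep=pauses.append)

addresses = ["3+ROMULO+CALLE+ALMERIA", "5+SAN BARTOLOME+CALLE+TOLEDO"]
geocoded = [geocode(a) for a in addresses]  # one result per address, same order
```

With a pandas DataFrame the same wrapper could be applied as `mydata['geocoded'] = mydata['location'].apply(geocode)`, which keeps each result on the row it came from.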


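Building on my earlier answer, you can even make use of multiple map API services. Here is a function I wrote for one of my projects that uses Nominatim's API first and falls back on Google's API if Nominatim either returns an error or returns nothing: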

from geopy.geocoders import Nominatim, GoogleV3
from geopy.exc import GeocoderTimedOut, GeocoderAuthenticationFailure
from random import randint
from time import sleep

nomiGeo = Nominatim()  # Nominatim geolocator (newer geopy releases may require a user_agent argument)
googleGeo = GoogleV3('myKey')  # Google Maps v3 API geolocator

def geocode(address):
    """Geocode an address.

    Args:
        address (str): the physical address

    Returns:
        the geocoded location object, or None if every attempt failed
    """
    location = None
    attempt = 0
    useGoogle = False  # set to True to use Google only
    while (location is None) and (attempt <= 8):
        try:
            attempt += 1
            if useGoogle:
                location = googleGeo.geocode(address, timeout=10)
            else:
                location = nomiGeo.geocode(address, timeout=10)
                if location is None:
                    # Nominatim found nothing, so fall back on Google
                    useGoogle = True
                    location = googleGeo.geocode(address, timeout=10)
            sleep(randint(5, 10))  # give the API a break
        except GeocoderAuthenticationFailure:
            print('Error: GeocoderAuthenticationFailure while geocoding {} during attempt #{}'.format(address, attempt))
            if attempt % 2 == 0:  # switch between services for every attempt
                useGoogle = True
            else:
                useGoogle = False
                sleep(60)
        except GeocoderTimedOut:
            sleep(randint(3, 5))  # give the API a break
            print('Error: GeocoderTimedOut while geocoding {} during attempt #{}'.format(address, attempt))
    return location

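Note that I also import a couple of exceptions specific to geopy (from geopy.exc), because in my experience Nominatim sometimes throws random errors, and these are the two I have run into. Also, in my experience with both APIs, Google seems to interpolate coordinates even when an address is not found, whereas Nominatim must have the address in its database to return anything.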
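The sample responses in the question suggest one way to spot such interpolated hits: each result carries a `location_type` (`ROOFTOP` versus `RANGE_INTERPOLATED`) and a `partial_match` flag. The helper below is only a sketch over one entry of a response's `results` list; the name `match_quality` is an assumption, and with geopy the same dict should be reachable through the location object's `raw` attribute.

```python
def match_quality(result):
    """Summarise how much to trust one entry of a response's 'results' list."""
    location_type = result.get("geometry", {}).get("location_type")
    return {
        "location_type": location_type,
        "is_rooftop": location_type == "ROOFTOP",
        "partial_match": result.get("partial_match", False),
    }

# Trimmed copy of the Vila-real candidate from the question's output.
vila_real = {
    "formatted_address": "Carrer Garbí, 48, 12540 Vila-real, Castelló, Spain",
    "geometry": {"location": {"lat": 39.9558231, "lng": -0.110409},
                 "location_type": "RANGE_INTERPOLATED"},
    "partial_match": True,
}

quality = match_quality(vila_real)
```

Rows where `is_rooftop` is false or `partial_match` is true are reasonable candidates for a manual check against the input address.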

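Yes! That was easy. It did not accept my API key, though, apparently because it only works if you have the premium API. But with googleGeo = GoogleV3() I can get a limited number of coordinates without a key. – vdBurg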

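Oh, purrfect! Then I suggest you write a custom function that forces the geocoding function to pause for a random number of seconds, to avoid spamming the API; otherwise Google might block your IP –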

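Thanks! Yes, I know I should not make too many calls. I want to geocode 300,000+ addresses, but I have to get it done within a few days. I have read about the Nominatim geolocator, but I prefer Google because I guess that in Spain it won't be like google – vdBurg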
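A rough back-of-the-envelope check of that volume, assuming one call at a time, the 5 to 10 second random pause suggested in the answer, no request latency, and "a few days" taken to mean three (all of these are assumptions):

```python
addresses = 300000
avg_pause_s = (5 + 10) / 2            # mean of randint(5, 10), which is inclusive
seconds_per_day = 24 * 60 * 60

# Time spent pausing alone, in days
total_days = addresses * avg_pause_s / seconds_per_day

# Largest average time per address that still fits into three days
budget_per_address_s = 3 * seconds_per_day / addresses

print(round(total_days, 1))            # 26.0
print(round(budget_per_address_s, 3))  # 0.864
```

So with that pause the job would take roughly 26 days; fitting it into three days leaves under a second per address, which makes any per-key or per-IP usage limits the real constraint.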
