2017-04-11

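Unable to query AWS Athena from Python 2.7 when passing an AWS session token to pyathenajdbc.connect()

I am trying to connect to Athena using pyathenajdbc.connect(). My AWS credentials are set up with multi-factor authentication. When I do not include the AWS token in the connection string, I get the following error.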

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 0d488c0b-1eed-11e7-bad8-711e54af6b73)

When I include the AWS token in the connection string, I get the following error:

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 91751051-1eed-11e7-8347-153dfe3d84a6)

Does anyone know what is wrong here?

Here is my entire code.

from pyathenajdbc import connect 
from pyathenajdbc.util import as_pandas 
from boto3 import Session 
import jpype 
jvm_path = jpype.getDefaultJVMPath() 

_current_credentials = Session().get_credentials() 
AWS_KEY_ID = _current_credentials.access_key 
AWS_SECRET = _current_credentials.secret_key 
AWS_SESSION_TOKEN = _current_credentials.token 
REGION = "us-east-2" 

#athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION) 

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION) 

cursor = athena_conn.cursor(); 
query = 'SELECT * FROM xyz.ABC limit 1;' 
cursor.execute(query) 
df = as_pandas(cursor) 
print(df) 

Answers

0
from pyathenajdbc import connect 
from pyathenajdbc.util import as_pandas 
from boto3 import Session 
import os 

_current_credentials = Session().get_credentials() 

os.environ['AWS_ACCESS_KEY_ID'] = _current_credentials.access_key 
os.environ['AWS_SECRET_ACCESS_KEY'] = _current_credentials.secret_key 
os.environ['AWS_SESSION_TOKEN'] = _current_credentials.token 


athena_conn = connect(s3_staging_dir='s3://your-bucket/', 
      region_name='us-west-2', 
      aws_credentials_provider_class='com.amazonaws.athena.jdbc.shaded.com.amazonaws.auth.EnvironmentVariableCredentialsProvider') 

cursor = athena_conn.cursor(); 
query = 'SELECT * FROM schema.table_name limit 1;' 
cursor.execute(query) 
df = as_pandas(cursor) 
print(df) 
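
One caveat with the snippet above: `os.environ` only accepts strings, so assigning `AWS_SESSION_TOKEN` raises a `TypeError` when the session holds long-term credentials and `token` is `None`. A minimal sketch of a guard follows. `export_credentials` and the `Credentials` stand-in are hypothetical names used for illustration, not part of pyathenajdbc or boto3.

```python
import os
from collections import namedtuple

# Stand-in with the same three fields as the object returned by
# boto3's Session().get_credentials(); it lets the sketch run offline.
Credentials = namedtuple("Credentials", ["access_key", "secret_key", "token"])


def export_credentials(creds, environ=None):
    """Copy AWS credentials into environment variables.

    Hypothetical helper: the session token is exported only when present,
    and a stale token left over from an earlier run is removed otherwise.
    """
    environ = os.environ if environ is None else environ
    environ["AWS_ACCESS_KEY_ID"] = creds.access_key
    environ["AWS_SECRET_ACCESS_KEY"] = creds.secret_key
    if creds.token:
        environ["AWS_SESSION_TOKEN"] = creds.token
    else:
        environ.pop("AWS_SESSION_TOKEN", None)
    return environ


# Demo against a plain dict so the real process environment is untouched.
env = export_credentials(Credentials("ASIAEXAMPLE", "secret", "token123"), environ={})
print(sorted(env))
```

In real use you would pass `Session().get_credentials()` and leave `environ` unset, then call `connect()` with the environment-variable credentials provider as shown above.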
0

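The problem is not a simple one, but my guess is that it is related to your credentials. You should investigate: try printing your keys and verify that they are valid.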
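
One way to do that check without pasting secrets into a terminal or a forum post is sketched below. The helper names `mask`, `summarize`, and `check_with_sts` are made up for this example. The STS call is an extra step that goes beyond simply printing the keys, and it needs network access and boto3.

```python
def mask(value, visible=4):
    """Return value with all but the last `visible` characters replaced by '*'."""
    if not value:
        return "<missing>"
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def summarize(access_key, secret_key, token):
    """Build a printable summary that does not reveal the full secrets."""
    return {
        "access_key": mask(access_key),
        "secret_key": mask(secret_key),
        "token": mask(token),
        # Temporary (STS/MFA) access keys start with "ASIA",
        # long-term ones with "AKIA".
        "temporary": bool(access_key) and access_key.startswith("ASIA"),
    }


def check_with_sts():
    """Ask AWS which identity the current credentials resolve to."""
    import boto3  # imported lazily so the helpers above work without boto3

    session = boto3.Session()
    creds = session.get_credentials()
    if creds is None:
        raise RuntimeError("boto3 found no credentials")
    # Take one consistent snapshot of key, secret and token.
    frozen = creds.get_frozen_credentials()
    print(summarize(frozen.access_key, frozen.secret_key, frozen.token))
    print(session.client("sts").get_caller_identity()["Arn"])


print(summarize("ASIAABCDEFGH1234", "exampleSecretValue", None))
```

If `temporary` is true but `token` shows `<missing>`, the key pair cannot work on its own, which would match the "security token ... is invalid" error in the question.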

Here is the alternative I use to load my credentials:

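import configparser
import os  # needed for os.path.expanduser below

aws_config_file = '~/.aws/config'

# Parse the INI-style AWS config file
Config = configparser.ConfigParser()
Config.read(os.path.expanduser(aws_config_file))

# Read the key pair from the [default] section
access_key_id = Config['default']['aws_access_key_id']
secret_key_id = Config['default']['aws_secret_access_key']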
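
The snippet above reads only the key pair. When a profile stores temporary credentials, the file can also carry an `aws_session_token` entry, and that value has to be passed along as well. A rough sketch, assuming the section is named `default` and using an inline sample instead of a real file (`read_profile` and the sample values are invented for illustration):

```python
import configparser

# Inline sample in the same INI layout as the AWS config/credentials files.
SAMPLE = """\
[default]
aws_access_key_id = ASIAEXAMPLEKEY
aws_secret_access_key = exampleSecret
aws_session_token = exampleToken
"""


def read_profile(parser, profile="default"):
    """Return (access_key, secret_key, token); token is None when absent."""
    section = parser[profile]
    return (
        section["aws_access_key_id"],
        section["aws_secret_access_key"],
        section.get("aws_session_token"),
    )


parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
access_key, secret_key, token = read_profile(parser)
print(access_key, token is not None)
```

With a real file you would call `parser.read(os.path.expanduser(path))` instead of `read_string`.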

Otherwise, just to make sure the problem is not related to the JDBC driver, paste the output of the following commands:

import pyathenajdbc 

print(pyathenajdbc.ATHENA_CONNECTION_STRING) 
print(pyathenajdbc.ATHENA_DRIVER_CLASS_NAME) 
print(pyathenajdbc.ATHENA_DRIVER_DOWNLOAD_URL) 
print(pyathenajdbc.ATHENA_JAR) 
Hi, thanks for your reply. Here is the output: 'jdbc:awsathena://athena.{region}.amazonaws.com:443/hive/{schema}/ com.amazonaws.athena.jdbc.AthenaDriver https://s3.amazonaws.com/athena-downloads/drivers/AthenaJDBC41-1.0.0.jar AthenaJDBC41-1.0.0.jar' – Guddi

Also, I confirmed that the keys are valid – Guddi

I see. I think this is related to your permissions on AWS Athena. Please verify that you can access the Athena console with the same credentials –

1

Assuming you have the region defined in the config file in your ~/.aws folder, you can use Session().region_name

The following works just fine (no need to import os):

from pyathenajdbc import connect 
from pyathenajdbc.util import as_pandas 
from boto3 import Session 
import jpype 
jvm_path = jpype.getDefaultJVMPath() 

_current_credentials = Session().get_credentials() 
AWS_KEY_ID = _current_credentials.access_key 
AWS_SECRET = _current_credentials.secret_key 
REGION = Session().region_name 

athena_conn = connect(access_key=AWS_KEY_ID, 
       secret_key=AWS_SECRET, 
       s3_staging_dir='path_to_staging_dir', 
       region_name=REGION) 

cursor = athena_conn.cursor(); 

query = 'SELECT current_date;' 

cursor.execute(query) 
df = as_pandas(cursor) 
print(df)