2017-05-27 364 views

I wrote a program that uses Spark Streaming to insert data into a Kerberos-enabled HBase. In one batch, a task failed with the following error:

java.io.IOException: Login failure for [email protected] from keytab ./user.keytab 
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1160) 
    at com.framework.common.HbaseUtil$.InsertToHbase(HbaseUtil.scala:81) 
    at com.framework.realtime.RDDUtil$$anonfun$dwsTodwd$2.apply(RDDUtil.scala:203) 
    at com.framework.realtime.RDDUtil$$anonfun$dwsTodwd$2.apply(RDDUtil.scala:202) 
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920) 
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920) 
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858) 
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858) 
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) 
    at org.apache.spark.scheduler.Task.run(Task.scala:89) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: javax.security.auth.login.LoginException: Receive timed out 
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:767) 
    at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584) 
    at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:762) 
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203) 
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690) 
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687) 
    at javax.security.auth.login.LoginContext.login(LoginContext.java:595) 
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytabAndReturnUGI(UserGroupInformation.java:1149) 
    ... 13 more 
Caused by: java.net.SocketTimeoutException: Receive timed out 
    at java.net.PlainDatagramSocketImpl.receive0(Native Method) 
    at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:146) 
    at java.net.DatagramSocket.receive(DatagramSocket.java:816) 
    at sun.security.krb5.internal.UDPClient.receive(NetClient.java:207) 
    at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:390) 
    at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:343) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at sun.security.krb5.KdcComm.send(KdcComm.java:327) 
    at sun.security.krb5.KdcComm.send(KdcComm.java:219) 
    at sun.security.krb5.KdcComm.send(KdcComm.java:191) 
    at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:319) 
    at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:364) 
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:735) 
    ... 25 more 

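On the second attempt, however, the task succeeded. My guess is that the authentication took too long on the first attempt and timed out, while the second attempt was fast enough to succeed. Am I right? If so, how can I fix this? My code is as follows: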

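import java.security.PrivilegedAction
import org.apache.hadoop.hbase.client.{HConnection, HConnectionManager, HTableInterface}
import org.apache.hadoop.security.UserGroupInformation

// Log in from the keytab; this is the call that contacts the KDC and timed out.
val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(princ, keytab)

ugi.doAs(new PrivilegedAction[Unit]() {
  def run(): Unit = {
    var conn: HConnection = null
    var htable: HTableInterface = null
    try {
      conn = HConnectionManager.createConnection(conf)
      htable = conn.getTable(tableName)
      // Buffer puts on the client instead of sending them one by one.
      htable.setAutoFlushTo(false)
      for (record <- partitionOfRecords) {
        htable.put(record)
      }
    } finally {
      // Closing the table flushes any buffered puts; then release the connection.
      if (htable != null) htable.close()
      if (conn != null) conn.close()
    }
  }
})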

Could you first share the code that produces the error? – mtoto

Answer


From "Hadoop and Kerberos - the Madness beyond the Gate", in the chapter "Error Messages to Fear" ...

Receive timed out

Usually in a stack trace like

Caused by: java.net.SocketTimeoutException: Receive timed out
at java.net.PlainDatagramSocketImpl.receive0(Native Method)
...
at sun.security.krb5.internal.UDPClient.receive(NetClient.java:207)

... UDP socket ... Switch to TCP; at the very least, it will fail faster.

And, just above that:

Switching Kerberos to use TCP rather than UDP
In /etc/krb5.conf:

[libdefaults]
udp_preference_limit = 1


In general, many unstable Kerberos problems seem to occur only with UDP, so it is a pity that it is the default ...
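Because every executor host reads its own krb5.conf, it can help to confirm that the setting is actually present where the tasks run. The following is a minimal sketch of such a check; `udpPreferenceLimit` is a hypothetical helper, not part of Hadoop or the JDK, and it handles only the simple `key = value` form shown above.

```scala
// Hypothetical helper (an assumption for illustration): returns the value of
// udp_preference_limit found in the [libdefaults] section, if there is one.
def udpPreferenceLimit(lines: Seq[String]): Option[Int] = {
  var section = ""
  var found: Option[Int] = None
  for (raw <- lines) {
    val line = raw.trim
    if (line.startsWith("[") && line.endsWith("]")) {
      // A new section header such as [libdefaults] or [realms].
      section = line
    } else if (section == "[libdefaults]" && !line.startsWith("#")) {
      line.split("=", 2).map(_.trim) match {
        case Array("udp_preference_limit", value)
            if value.nonEmpty && value.forall(_.isDigit) =>
          found = Some(value.toInt)
        case _ => // some other setting; ignore it
      }
    }
  }
  found
}

val sample = Seq("[libdefaults]", "udp_preference_limit = 1")
println(udpPreferenceLimit(sample)) // prints Some(1)
```

On a real host the lines could come from `scala.io.Source.fromFile("/etc/krb5.conf").getLines().toSeq`, assuming the JVM uses that file and not one supplied through `java.security.krb5.conf`.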


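Note that Java also supports a kdc_timeout configuration parameter, but it is a real mess:

  • not mentioned in the MIT Kerberos documentation
  • not mentioned in the Unix/Linux documentation, except for BSD
  • mentioned only in the darkest corners of the Java documentation, here for Java 9, with an interesting side note that the default value changed at some point from 30 s expressed implicitly in milliseconds to 30 s expressed in seconds
  • a few weeks ago, the Cloudera support team issued a recommendation about that setting, because the default 30 s timeout may cause cascading failures in HDFS High Availability or something similar; but the poor guys really had no idea what they were recommending, and suggested "3", "3s" or "3000" at random as the explicit timeout value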


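Note also that if you have multiple KDCs for high availability, and these KDCs are listed explicitly in krb5.conf (or implicitly, for example via a DNS alias with a round-robin rule), then on a "KDC timeout" Java should retry with the next KDC in line, unless you have already reached the global timeout.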
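Since the failure in the question was transient and the retried task succeeded, one option is to retry the keytab login inside the task instead of letting the whole task fail. Below is a minimal sketch of a generic retry wrapper; `retry`, its parameters, and the simulated failure are assumptions for illustration, not a Hadoop or Spark API.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper: runs `op`, and on a non-fatal failure sleeps `delayMs`
// and tries again until `attemptsLeft` is used up; the last error is rethrown.
def retry[T](attemptsLeft: Int, delayMs: Long)(op: => T): T =
  Try(op) match {
    case Success(value) => value
    case Failure(_) if attemptsLeft > 1 =>
      Thread.sleep(delayMs)
      retry(attemptsLeft - 1, delayMs)(op)
    case Failure(error) => throw error
  }

// Simulated login that times out once and then succeeds.
var calls = 0
val result = retry(3, 10L) {
  calls += 1
  if (calls < 2) throw new java.io.IOException("Receive timed out")
  "logged in"
}
println(s"$result after $calls attempts") // prints "logged in after 2 attempts"
```

In the question's code this would wrap the login call, for example `retry(3, 1000L) { UserGroupInformation.loginUserFromKeytabAndReturnUGI(princ, keytab) }`. The attempt count and delay here are arbitrary, and a retry only hides the UDP timeout, so it complements the switch to TCP and does not replace it.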
