2016-07-04

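Spark Streaming 1.6.1 does not work with Kinesis ASL 1.6.1 or ASL 2.0.0-preview

I am trying to run a Spark Streaming job on EMR that reads from Kinesis, using Spark 1.6.1 and Kinesis ASL 1.6.1. The job is a simple word count example, built with these dependencies: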

 <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-streaming-kinesis-asl_2.10</artifactId> 
     <version>1.6.1</version> 
    </dependency> 


    <dependency> 
     <groupId>com.amazonaws</groupId> 
     <artifactId>amazon-kinesis-client</artifactId> 
     <version>1.6.3</version> 
    </dependency> 
    <dependency> 
     <groupId>com.amazonaws</groupId> 
     <artifactId>amazon-kinesis-producer</artifactId> 
     <version>0.10.2</version> 
    </dependency> 

This throws the following exception:

java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: com/google/protobuf/ProtocolStringList 
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.checkAndSubmitNextTask(ShardConsumer.java:157) 
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.consumeShard(ShardConsumer.java:126) 

Upgrading to 2.0.0-preview:

 <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-streaming-kinesis-asl_2.10</artifactId> 
     <version>2.0.0-preview</version> 
    </dependency> 

gives the following exception instead:

java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging 

    at org.apache.spark.streaming.kinesis.KinesisUtils$$anonfun$createStream$1.apply(KinesisUtils.scala:74)

Answers


I have a very similar problem, one that has been mentioned in several places.

When I try to run my Spark Streaming application on AWS EMR, it still results in:

java.lang.NoSuchMethodError: com.google.protobuf.LazyStringList.getUnmodifiableView()Lcom/google/protobuf/LazyStringList; 

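getUnmodifiableView() is not available in some versions of protobuf, so I guess the wrong version of protobuf is still being loaded. I tried several combinations of dependency versions, with the same result. Everything works fine on my development machine, but as soon as I submit the application on the master node, this error appears. The last version of my POM file that I tried is: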

<dependencies> 
    <!-- https://mvnrepository.com/artifact/com.google.protobuf/protobuf-java --> 
    <dependency> 
     <groupId>com.google.protobuf</groupId> 
     <artifactId>protobuf-java</artifactId> 
     <version>2.6.1</version> 
    </dependency> 

    <dependency> 
     <groupId>com.amazonaws</groupId> 
     <artifactId>amazon-kinesis-client</artifactId> 
     <version>1.6.1</version> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10 --> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-core_2.10</artifactId> 
     <version>2.1.0</version> 
     <scope>provided</scope> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client --> 
    <dependency> 
     <groupId>org.apache.hadoop</groupId> 
     <artifactId>hadoop-client</artifactId> 
     <version>2.7.3</version> 
     <scope>provided</scope> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.10 --> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-mllib_2.10</artifactId> 
     <version>2.1.0</version> 
     <scope>provided</scope> 
    </dependency> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-sql_2.10</artifactId> 
     <version>2.1.0</version> 
     <scope>provided</scope> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive_2.10 --> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-hive_2.10</artifactId> 
     <version>2.1.0</version> 
     <scope>provided</scope> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kinesis-asl_2.11 --> 
    <dependency> 
     <groupId>org.apache.spark</groupId> 
     <artifactId>spark-streaming-kinesis-asl_2.11</artifactId> 
     <version>2.0.0</version> 
    </dependency> 


    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws --> 
    <dependency> 
     <groupId>org.apache.hadoop</groupId> 
     <artifactId>hadoop-aws</artifactId> 
     <version>2.7.3</version> 
     <scope>provided</scope> 
    </dependency> 


    <!--https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk--> 
    <dependency> 
     <groupId>com.amazonaws</groupId> 
     <artifactId>aws-java-sdk</artifactId> 
     <version>1.10.77</version> 
     <exclusions> 
      <exclusion> 
       <artifactId>jackson-core</artifactId> 
       <groupId>com.fasterxml.jackson.core</groupId> 
      </exclusion> 
      <exclusion> 
       <artifactId>jackson-databind</artifactId> 
       <groupId>com.fasterxml.jackson.core</groupId> 
      </exclusion> 
      <exclusion> 
       <artifactId>jackson-annotations</artifactId> 
       <groupId>com.fasterxml.jackson.core</groupId> 
      </exclusion> 
     </exclusions> 
     <scope>provided</scope> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-annotations --> 
    <dependency> 
     <groupId>com.fasterxml.jackson.core</groupId> 
     <artifactId>jackson-annotations</artifactId> 
     <version>2.6.7</version> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core --> 
    <dependency> 
     <groupId>com.fasterxml.jackson.core</groupId> 
     <artifactId>jackson-core</artifactId> 
     <version>2.6.7</version> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind --> 
    <dependency> 
     <groupId>com.fasterxml.jackson.core</groupId> 
     <artifactId>jackson-databind</artifactId> 
     <version>2.6.7</version> 
    </dependency> 
    <!-- https://mvnrepository.com/artifact/net.java.dev.jets3t/jets3t --> 
    <dependency> 
     <groupId>net.java.dev.jets3t</groupId> 
     <artifactId>jets3t</artifactId> 
     <version>0.9.4</version> 
    </dependency> 

</dependencies> 
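
One way to test the guess that the wrong protobuf version is being loaded is to print which jar each suspect class actually comes from at runtime. The following is only a minimal sketch, not something from my project: the class name `WhichJar` and the method `locationOf` are made up for illustration, and the two default class names are simply the ones that appear in the errors in this thread.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns where the named class was loaded from, or a short note if that cannot be determined. */
    static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap class loader report no code source.
            return source == null
                    ? "(bootstrap / unknown source)"
                    : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing or could not be linked on this classpath.
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names may be passed as arguments; otherwise check the protobuf classes from the errors.
        String[] suspects = args.length > 0 ? args : new String[] {
            "com.google.protobuf.LazyStringList",
            "com.google.protobuf.ProtocolStringList"
        };
        for (String name : suspects) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

The result is only meaningful when this runs with the same classpath as the job, for example by calling `locationOf` from the driver code itself, so that it reflects what Spark really loads on the cluster rather than what the development machine has.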

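It is caused by a protobuf-java dependency conflict. Use mvn dependency:tree to find the protobuf-java version that KCL and KPL depend on. Then look in the Spark lib directory, where you will find a different version. Use the maven-shade-plugin to relocate the conflicting classes: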

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.3</version>
    <executions>
     <execution>
      <phase>package</phase>
      <goals>
       <goal>shade</goal>
      </goals>
      <configuration>
       <outputFile>${project.build.directory}/${project.artifactId}-${project.version}-selfcontained.jar</outputFile>
       <relocations>
        <relocation>
         <pattern>com.google.protobuf</pattern>
         <shadedPattern>shade.com.google.protobuf</shadedPattern>
        </relocation>
        <relocation>
         <pattern>com.amazonaws</pattern>
         <shadedPattern>shade.com.amazonaws</shadedPattern>
        </relocation>
       </relocations>
       <filters>
        <filter>
         <artifact>*:*</artifact>
         <excludes>
          <exclude>META-INF/*.SF</exclude>
          <exclude>META-INF/*.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
         </excludes>
        </filter>
       </filters>
       <transformers>
        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
       </transformers>
      </configuration>
     </execution>
    </executions>
</plugin>
