2017-10-18 88 views
1

I want to read some files from HDFS and/or from the local file system with Spark from Java, and I get this "unread block data" exception:

Driver stacktrace:] 
      [unread block data] 
    ]org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, C-4073.CM.ES, executor 1): java.lang.IllegalStateException: unread block data 
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) 
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) 
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76) 
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:222) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:745) 

Code to read from the local file system:

JavaRDD<String> textFile = sc.textFile("file:///PathToFile"); 

Code to read from HDFS:

JavaRDD<String> textFile = sc.textFile("hdfs:///PathToFile"); 

I have been searching, and people usually say this error can be caused by mismatched Java versions, but I have already checked that:

My cluster:

$ java -version 
java version "1.7.0_67" 
Java(TM) SE Runtime Environment (build 1.7.0_67-b01) 
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode) 

My local machine:

$ java -version 
java version "1.7.0_76" 
Java(TM) SE Runtime Environment (build 1.7.0_76-b13) 
Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode) 

My pom.xml:

<properties> 
    <jdk.version>1.7</jdk.version> 
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> 
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding> 
    <springboot.version>1.5.3.RELEASE</springboot.version> 
    <springint.version>4.3.10.RELEASE</springint.version> 
    <cdh.version>5.10.1</cdh.version> 
    <solr.version>4.10.3-cdh${cdh.version}</solr.version> 
    <hbase.version>1.2.0-cdh${cdh.version}</hbase.version> 
    <kafka.version>0.9.0-kafka-2.0.2</kafka.version> 
    <rt-framework.version>2.3.5</rt-framework.version> 
    <tas.version>4.0.0</tas.version> 
</properties> 

I don't know whether my problem is related to the Java version, because I have no trouble writing to or reading from Kafka, or querying Hive.
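Since `java -version` only reports the JVM found on the shell's PATH, one way to rule the version question in or out is to log the runtime properties from inside the application itself and compare the output on each machine. A minimal sketch using only standard system properties; the class name `JvmReport` is hypothetical and not part of the project above:

```java
// Hypothetical helper (not from the original post): describes the JVM that
// is actually executing this code, using standard system properties only.
final class JvmReport {

    /** Returns the runtime version, class-file version and install path. */
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.class.version=" + System.getProperty("java.class.version")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        // Run the same jar on the local machine and on a cluster node,
        // then compare the two printed lines.
        System.out.println(describe());
    }
}
```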

Thanks in advance, and sorry for my bad English.

+0

Are you using the same version of Java to serialize and deserialize your objects? –

+0

Yes, I have the same Java version everywhere, but in this example I am not even serializing or deserializing anything, just trying to read a file. – Cfuentes

Answers

0

Finally, I found the solution. It seems to have been a dependency problem, specifically with this file in the packaged jar:

/META-INF/spring.factories 

I had to add all the dependencies to it manually. I don't know whether this is the best solution, but at least it works.
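To see why the packaged jar behaves differently from the unpacked classpath, it can help to count how many copies of that resource the class loader can see. On a normal classpath each Spring jar may contribute its own copy, while a single merged jar effectively holds one. A minimal sketch using only `ClassLoader.getResources` from the JDK; the class name `SpringFactoriesLocator` is hypothetical:

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic (not from the original post): lists every copy of
// META-INF/spring.factories that the given class loader can see.
final class SpringFactoriesLocator {

    static final String RESOURCE = "META-INF/spring.factories";

    /** Returns the URLs of all visible copies; the list is empty if there are none. */
    static List<URL> locate(ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(RESOURCE));
        } catch (IOException e) {
            throw new IllegalStateException("Cannot scan classpath for " + RESOURCE, e);
        }
    }

    public static void main(String[] args) {
        List<URL> found = locate(SpringFactoriesLocator.class.getClassLoader());
        System.out.println(found.size() + " copies of " + RESOURCE);
        for (URL url : found) {
            System.out.println("  " + url);
        }
    }
}
```

Running it once from the IDE or unpacked classpath and once from the packaged jar shows whether copies were lost during packaging.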

This is my new spring.factories:

#beans ---------------------------------------------------------------------------------------------------------------------- 
    org.springframework.beans.BeanInfoFactory=org.springframework.beans.ExtendedBeanInfoFactory 

    #boot ----------------------------------------------------------------------------------------------------------------------- 
    # PropertySource Loaders 
    org.springframework.boot.env.PropertySourceLoader=\ 
    org.springframework.boot.env.PropertiesPropertySourceLoader,\ 
    org.springframework.boot.env.YamlPropertySourceLoader 

    # Run Listeners 
    org.springframework.boot.SpringApplicationRunListener=\ 
    org.springframework.boot.context.event.EventPublishingRunListener 

    # Application Context Initializers 
    org.springframework.context.ApplicationContextInitializer=\ 
    org.springframework.boot.context.ConfigurationWarningsApplicationContextInitializer,\ 
    org.springframework.boot.context.ContextIdApplicationContextInitializer,\ 
    org.springframework.boot.context.config.DelegatingApplicationContextInitializer,\ 
    org.springframework.boot.context.embedded.ServerPortInfoApplicationContextInitializer 

    # Application Listeners 
    org.springframework.context.ApplicationListener=\ 
    org.springframework.boot.ClearCachesApplicationListener,\ 
    org.springframework.boot.builder.ParentContextCloserApplicationListener,\ 
    org.springframework.boot.context.FileEncodingApplicationListener,\ 
    org.springframework.boot.context.config.AnsiOutputApplicationListener,\ 
    org.springframework.boot.context.config.ConfigFileApplicationListener,\ 
    org.springframework.boot.context.config.DelegatingApplicationListener,\ 
    org.springframework.boot.liquibase.LiquibaseServiceLocatorApplicationListener,\ 
    org.springframework.boot.logging.ClasspathLoggingApplicationListener,\ 
    org.springframework.boot.logging.LoggingApplicationListener 

    # Environment Post Processors 
    org.springframework.boot.env.EnvironmentPostProcessor=\ 
    org.springframework.boot.cloud.CloudFoundryVcapEnvironmentPostProcessor,\ 
    org.springframework.boot.env.SpringApplicationJsonEnvironmentPostProcessor 

    # Failure Analyzers 
    org.springframework.boot.diagnostics.FailureAnalyzer=\ 
    org.springframework.boot.diagnostics.analyzer.BeanCurrentlyInCreationFailureAnalyzer,\ 
    org.springframework.boot.diagnostics.analyzer.BeanNotOfRequiredTypeFailureAnalyzer,\ 
    org.springframework.boot.diagnostics.analyzer.BindFailureAnalyzer,\ 
    org.springframework.boot.diagnostics.analyzer.ConnectorStartFailureAnalyzer,\ 
    org.springframework.boot.diagnostics.analyzer.NoUniqueBeanDefinitionFailureAnalyzer,\ 
    org.springframework.boot.diagnostics.analyzer.PortInUseFailureAnalyzer,\ 
    org.springframework.boot.diagnostics.analyzer.ValidationExceptionFailureAnalyzer 

    # FailureAnalysisReporters 
    org.springframework.boot.diagnostics.FailureAnalysisReporter=\ 
    org.springframework.boot.diagnostics.LoggingFailureAnalysisReporter 




    # boot autoconfigure --------------------------------------------------------------------------------------------------------------------- 
    # Initializers 
    org.springframework.context.ApplicationContextInitializer=\ 
    org.springframework.boot.autoconfigure.SharedMetadataReaderFactoryContextInitializer,\ 
    org.springframework.boot.autoconfigure.logging.AutoConfigurationReportLoggingInitializer 

    # Application Listeners 
    org.springframework.context.ApplicationListener=\ 
    org.springframework.boot.autoconfigure.BackgroundPreinitializer 

    # Auto Configuration Import Listeners 
    org.springframework.boot.autoconfigure.AutoConfigurationImportListener=\ 
    org.springframework.boot.autoconfigure.condition.ConditionEvaluationReportAutoConfigurationImportListener 

    # Auto Configuration Import Filters 
    org.springframework.boot.autoconfigure.AutoConfigurationImportFilter=\ 
    org.springframework.boot.autoconfigure.condition.OnClassCondition 

    # Auto Configure 
    org.springframework.boot.autoconfigure.EnableAutoConfiguration=\ 
    org.springframework.boot.autoconfigure.admin.SpringApplicationAdminJmxAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.aop.AopAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.amqp.RabbitAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.batch.BatchAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.cache.CacheAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.cassandra.CassandraAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.cloud.CloudAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.context.ConfigurationPropertiesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.context.MessageSourceAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.context.PropertyPlaceholderAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.couchbase.CouchbaseAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.dao.PersistenceExceptionTranslationAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.cassandra.CassandraDataAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.cassandra.CassandraRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.couchbase.CouchbaseDataAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.couchbase.CouchbaseRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.elasticsearch.ElasticsearchAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.elasticsearch.ElasticsearchDataAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.elasticsearch.ElasticsearchRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.jpa.JpaRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.ldap.LdapDataAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.ldap.LdapRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.mongo.MongoDataAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.mongo.MongoRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.neo4j.Neo4jDataAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.neo4j.Neo4jRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.solr.SolrRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.redis.RedisAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.redis.RedisRepositoriesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.rest.RepositoryRestMvcAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.data.web.SpringDataWebAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.elasticsearch.jest.JestAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.freemarker.FreeMarkerAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.gson.GsonAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.h2.H2ConsoleAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.hateoas.HypermediaAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.hazelcast.HazelcastAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.hazelcast.HazelcastJpaDependencyAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.info.ProjectInfoAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.integration.IntegrationAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jackson.JacksonAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jdbc.JdbcTemplateAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jdbc.JndiDataSourceAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jdbc.XADataSourceAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jdbc.DataSourceTransactionManagerAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jms.JmsAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jmx.JmxAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jms.JndiConnectionFactoryAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jms.activemq.ActiveMQAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jms.artemis.ArtemisAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.flyway.FlywayAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.groovy.template.GroovyTemplateAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jersey.JerseyAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.jooq.JooqAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.kafka.KafkaAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.ldap.embedded.EmbeddedLdapAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.ldap.LdapAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.liquibase.LiquibaseAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mail.MailSenderAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mail.MailSenderValidatorAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mobile.DeviceResolverAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mobile.DeviceDelegatingViewResolverAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mobile.SitePreferenceAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mongo.embedded.EmbeddedMongoAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mongo.MongoAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.mustache.MustacheAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.reactor.ReactorAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.security.SecurityAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.security.SecurityFilterAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.security.FallbackWebSecurityAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.security.oauth2.OAuth2AutoConfiguration,\ 
    org.springframework.boot.autoconfigure.sendgrid.SendGridAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.session.SessionAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.social.SocialWebAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.social.FacebookAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.social.LinkedInAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.social.TwitterAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.solr.SolrAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.thymeleaf.ThymeleafAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.transaction.TransactionAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.transaction.jta.JtaAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.validation.ValidationAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.DispatcherServletAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.EmbeddedServletContainerAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.ErrorMvcAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.HttpEncodingAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.HttpMessageConvertersAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.MultipartAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.ServerPropertiesAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.WebClientAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.web.WebMvcAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.websocket.WebSocketAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.websocket.WebSocketMessagingAutoConfiguration,\ 
    org.springframework.boot.autoconfigure.webservices.WebServicesAutoConfiguration,\ 
    com.MY.MAIN.CLASS 


    # Failure analyzers 
    org.springframework.boot.diagnostics.FailureAnalyzer=\ 
    org.springframework.boot.autoconfigure.diagnostics.analyzer.NoSuchBeanDefinitionFailureAnalyzer,\ 
    org.springframework.boot.autoconfigure.jdbc.DataSourceBeanCreationFailureAnalyzer,\ 
    org.springframework.boot.autoconfigure.jdbc.HikariDriverConfigurationFailureAnalyzer 

    # Template availability providers 
    org.springframework.boot.autoconfigure.template.TemplateAvailabilityProvider=\ 
    org.springframework.boot.autoconfigure.freemarker.FreeMarkerTemplateAvailabilityProvider,\ 
    org.springframework.boot.autoconfigure.mustache.MustacheTemplateAvailabilityProvider,\ 
    org.springframework.boot.autoconfigure.groovy.template.GroovyTemplateAvailabilityProvider,\ 
    org.springframework.boot.autoconfigure.thymeleaf.ThymeleafTemplateAvailabilityProvider,\ 
    org.springframework.boot.autoconfigure.web.JspTemplateAvailabilityProvider 
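
The listing above repeats some keys: `org.springframework.context.ApplicationListener`, for example, appears in both the boot part and the boot autoconfigure part. Under plain `java.util.Properties` semantics a repeated key keeps only its last value, so when merging by hand it may be safer to join the lists under a single key. The sketch below assumes the file is parsed that way; the class `FactoriesMerge` and the keys `example.Listener`, `a.First`, `a.Second` and `b.Third` are made up for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper (not from the original post): merges several
// properties-style fragments so that equal keys end up with one joined value.
final class FactoriesMerge {

    /** Parses one text with java.util.Properties; a repeated key keeps its last value. */
    static Properties loadOnce(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    /** Parses each fragment separately and joins the values of equal keys with commas. */
    static Map<String, String> merge(String... fragments) {
        Map<String, String> merged = new TreeMap<>();
        for (String fragment : fragments) {
            Properties props = loadOnce(fragment);
            for (String key : props.stringPropertyNames()) {
                String value = props.getProperty(key).trim();
                String previous = merged.get(key);
                merged.put(key, previous == null ? value : previous + "," + value);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        String boot = "example.Listener=\\\n  a.First,\\\n  a.Second\n";
        String autoconfigure = "example.Listener=\\\n  b.Third\n";

        // Naive concatenation: the second definition replaces the first.
        System.out.println(loadOnce(boot + autoconfigure).getProperty("example.Listener"));
        // prints "b.Third"

        // Key-wise merge: all three class names survive.
        System.out.println(merge(boot, autoconfigure).get("example.Listener"));
        // prints "a.First,a.Second,b.Third"
    }
}
```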

It probably doesn't need all of these dependencies, so I will check and update my answer later.

Thank you all for your answers.

1

OK, these two lines are very different.

JavaRDD<String> textFile = sc.textFile("file:///PathToFile"); 
JavaRDD<String> textFile=sc.textFile("hdfs:///PathToFile"); 

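The first line ("file:///...") assumes that your file is available on every machine at the same location, and that the copies are in fact exactly identical. Otherwise, all sorts of strange things can happen during partitioning and reading.

The second line means you are trying to read from the preconfigured HDFS, which is actually fine.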
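
The only textual difference between the two calls is the URI scheme, and in Hadoop-style paths the scheme is what selects the file system. A small sketch with `java.net.URI` only (no Spark involved; the class name `SchemeDemo` is hypothetical) that pulls the two locations apart:

```java
import java.net.URI;

// Hypothetical illustration (not from the original post): shows which parts
// of the two locations differ when parsed as URIs.
final class SchemeDemo {

    /** Returns the scheme of a location string, e.g. "file" or "hdfs". */
    static String schemeOf(String location) {
        return URI.create(location).getScheme();
    }

    /** Returns the path component, which is identical for both locations. */
    static String pathOf(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        for (String location : new String[] {"file:///PathToFile", "hdfs:///PathToFile"}) {
            System.out.println(schemeOf(location) + " -> " + pathOf(location));
        }
        // prints "file -> /PathToFile" and then "hdfs -> /PathToFile"
    }
}
```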

If you want to read a local file that exists only on the master machine, just do something like this:

List<String> myData = ... 
JavaRDD<String> myRdd = sc.parallelize(myData); 

More details here: https://spark.apache.org/docs/2.2.0/api/java/org/apache/spark/SparkContext.html#parallelize-scala.collection.Seq-int-scala.reflect.ClassTag-
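
One way the `myData` list in the snippet above could be produced is by reading the file on the driver with plain Java I/O. This is only a sketch under that assumption: the class `DriverLocalRead` and the temporary sample file are hypothetical, and the Spark call is left as a comment because it needs the `sc` context from the snippet:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original answer): loads a file that
// exists only on the driver machine into memory, one element per line.
final class DriverLocalRead {

    /** Reads all lines of a local file as UTF-8. */
    static List<String> readLines(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read " + file, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a small sample file so the example runs anywhere.
        Path sample = Files.createTempFile("driver-local", ".txt");
        Files.write(sample, Arrays.asList("first line", "second line"), StandardCharsets.UTF_8);

        List<String> myData = readLines(sample);
        System.out.println(myData.size() + " lines read on the driver");

        // With the JavaSparkContext `sc` from the snippet above, the list
        // would then be distributed with: sc.parallelize(myData)

        Files.delete(sample);
    }
}
```

This keeps the whole file in driver memory, so it only suits files that are small compared with the driver heap.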

+0

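Hi! I know the two lines are different. I only included both to show that the error occurs wherever I read from, so it is not just an HDFS problem; in the production environment I will be reading from HDFS. I will edit my question to make this point clear. But thank you very much for your answer. – Cfuentes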