2011-06-02 165 views

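Cassandra Pig insert exception

I am using Pig's CassandraStorage() to insert a large dataset into Cassandra. After running for 4 hours, the job crashed with the following exception: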

java.lang.NullPointerException 
     at org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:134) 
     at org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:36) 
     at org.apache.cassandra.client.RingCache.getRange(RingCache.java:129) 
     at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:127) 
     at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:62) 
     at org.apache.cassandra.hadoop.pig.CassandraStorage.putNext(Unknown Source) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97) 
     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:498) 
     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53) 
     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) 
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) 
     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177) 

Any idea why this happens?


OK, I found that this was caused by one entry in my dataset having a null key. – 2011-06-06 16:18:49
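
For anyone hitting the same trace: the NullPointerException is raised inside RandomPartitioner.getToken, which fits the null-key explanation above, since the partitioner has to read the key to compute a token. In a Pig script the usual guard would be a FILTER ... BY ... IS NOT NULL on the key field before the STORE. The check itself can be sketched in plain Java as below. This is only an illustration under stated assumptions: KeyGuard, isWritableKey and keepWritable are hypothetical names, not part of Cassandra or Pig, and rejecting empty keys as well as null ones is an extra assumption beyond what the comment reports.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a Cassandra or Pig class.
final class KeyGuard {

    // A key is treated as usable only if it is non-null and has bytes left to read.
    // Rejecting empty keys is an assumption added here, not something the thread states.
    static boolean isWritableKey(ByteBuffer key) {
        return key != null && key.remaining() > 0;
    }

    // Returns the usable keys in their original order, dropping null and empty ones.
    static List<ByteBuffer> keepWritable(List<ByteBuffer> keys) {
        List<ByteBuffer> kept = new ArrayList<>();
        for (ByteBuffer key : keys) {
            if (isWritableKey(key)) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // One valid key, one null key, one empty key.
        List<ByteBuffer> keys = Arrays.asList(
                ByteBuffer.wrap("row1".getBytes(StandardCharsets.UTF_8)),
                null,
                ByteBuffer.allocate(0));
        System.out.println(keepWritable(keys).size() + " of " + keys.size() + " keys kept");
    }
}
```

Filtering (or logging) bad keys before the write means a single malformed record is dropped early instead of failing the job hours into the run.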

Answers


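Although it was not the cause of the problem in your case, it is worth noting that this error can also occur when you try to insert into a column family for which the specified partition key does not exist.

In that case, the exception is thrown the first time the reducer class is encountered.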