I am running the Hive query below for Twitter sentiment analysis, and it fails with an error after execution:

SELECT t.retweeted_screen_name, SUM(retweets) AS total_retweets, COUNT(*) AS tweet_count 
FROM (SELECT retweeted_status.user.screen_name AS retweeted_screen_name, 
             retweeted_status.text, 
             MAX(retweet_count) AS retweets 
      FROM mytweets 
      GROUP BY retweeted_status.user.screen_name, retweeted_status.text) t 
GROUP BY t.retweeted_screen_name 
ORDER BY total_retweets DESC 
LIMIT 10; 

Log:

Query ID = root_20161114205033_e1736dca-0999-431a-b301-4d1a3bfbaa00 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.Driver - Query ID = root_20161114205033_e1736dca-0999-431a-b301-4d1a3bfbaa00 
Total jobs = 2 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.Driver - Total jobs = 2 
Launching Job 1 out of 2 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.Driver - Launching Job 1 out of 2 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.Driver - Starting task [Stage-1:MAPRED] in serial mode 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Utilities - Cache Content Summary for hdfs://localhost:9000/user/flume/tweets length: 1858 file count: 1 directory count: 1 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Utilities - BytesPerReducer=256000000 maxReducers=1009 totalInputFileSize=1858 
Number of reduce tasks not specified. Estimated from input data size: 1 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Task - Number of reduce tasks not specified. Estimated from input data size: 1 

In order to change the average load for a reducer (in bytes): 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Task - In order to change the average load for a reducer (in bytes): 
set hive.exec.reducers.bytes.per.reducer=<number> 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Task - set hive.exec.reducers.bytes.per.reducer=<number> 

In order to limit the maximum number of reducers: 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Task - In order to limit the maximum number of reducers: 
set hive.exec.reducers.max=<number> 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Task - set hive.exec.reducers.max=<number> 

In order to set a constant number of reducers: 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Task - In order to set a constant number of reducers: 
set mapreduce.job.reduces=<number> 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Task - set mapreduce.job.reduces=<number> 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO hive.ql.Context - New scratch dir is hdfs://localhost:9000/tmp/hive/root/5bd85c7b-bb35-4557-9053-1b7d248538a3/hive_2016-11-14_20-50-33_177_2178154594719247302-1 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.mr.ExecDriver - Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.mr.ExecDriver - adding libjars: hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Utilities - Processing alias t:mytweets 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Utilities - Adding input file hdfs://localhost:9000/user/flume/tweets 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Utilities - Content Summary hdfs://localhost:9000/user/flume/tweetslength: 1858 num files: 1 num directories: 1 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO hive.ql.Context - New scratch dir is hdfs://localhost:9000/tmp/hive/root/5bd85c7b-bb35-4557-9053-1b7d248538a3/hive_2016-11-14_20-50-33_177_2178154594719247302-1 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.SerializationUtilities - Serializing MapWork using kryo 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.SerializationUtilities - Serializing ReduceWork using kryo 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Utilities - PLAN PATH = hdfs://localhost:9000/tmp/hive/root/5bd85c7b-bb35-4557-9053-1b7d248538a3/hive_2016-11-14_20-50-33_177_2178154594719247302-1/-mr-10006/b726734e-92c6-42bf-abb0-5853ae53bf3d/map.xml 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.exec.Utilities - PLAN PATH = hdfs://localhost:9000/tmp/hive/root/5bd85c7b-bb35-4557-9053-1b7d248538a3/hive_2016-11-14_20-50-33_177_2178154594719247302-1/-mr-10006/b726734e-92c6-42bf-abb0-5853ae53bf3d/reduce.xml 
java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar 
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309) 
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) 
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57) 
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:179) 
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98) 
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193) 
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) 
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) 
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:433) 
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:138) 
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) 
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) 
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) 
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) 
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) 
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) 
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) 
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar)' 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] ERROR org.apache.hadoop.hive.ql.exec.Task - Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar)' 
java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar 
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309) 
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) 
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99) 
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57) 
at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:179) 
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:98) 
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:193) 
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) 
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) 
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:433) 
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:138) 
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) 
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) 
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072) 
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) 
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) 
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) 
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) 
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:498) 
at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. File does not exist: hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. File does not exist: hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.Driver - Completed executing command(queryId=root_20161114205033_e1736dca-0999-431a-b301-4d1a3bfbaa00); Time taken: 8.332 seconds 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.conf.HiveConf - Using the default value passed in for log id: 5bd85c7b-bb35-4557-9053-1b7d248538a3 
[5bd85c7b-bb35-4557-9053-1b7d248538a3 main] INFO org.apache.hadoop.hive.ql.session.SessionState - Resetting thread name to main 

Please tell me how to fix it.
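For context, the job submission in the log above fails because the libjar path it tries to upload does not exist on HDFS. One way to verify that, and to put the SerDe jar where the job expects it, is sketched here (the local source path /usr/lib/... is an assumption; adjust it to wherever the jar actually sits on the local filesystem):

# check whether the jar really exists at the HDFS path from the error
hdfs dfs -ls hdfs://localhost:9000/usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar

# if it is missing, copy it up from the local filesystem ...
hdfs dfs -mkdir -p /usr/lib
hdfs dfs -put /usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar /usr/lib/

# ... or, inside the Hive CLI, register the local copy instead of the HDFS path
ADD JAR /usr/lib/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar;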

Answer

1) In hadoop-env.sh under the Hadoop conf dir, add (see the sketch below)

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:ABSOLUTE_PATH_TO_slf4j-simple-1.7.5.jar

or 2) add slf4j-simple-1.7.5.jar to the Hadoop lib path.
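A concrete sketch of option 1 (the jar location shown is an assumed example; use the actual absolute path to slf4j-simple-1.7.5.jar on your node):

# $HADOOP_CONF_DIR/hadoop-env.sh -- append slf4j-simple to the classpath picked up by the launch scripts
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/lib/hadoop/lib/slf4j-simple-1.7.5.jar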

It says: SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/apache-maven-3.3.9/lib/slf4j-simple-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] –

rm /usr/lib/hive/lib/log4j-slf4j-impl-2.4.1.jar and retry –

I have updated the error. Could you please check it once more? –