Problem running a Python script on Hive 0.8
Published: 2019-05-10

I recently ran into the following problem when running a Python script from Hive. In the Hive command line, the query failed with this output:

hive> from records
    > select transform(year,temperature,quality)
    > using 'python /user/hive/script/is_good_quality.py'
    > as year,temperature;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201112291016_0023, Tracking URL = http://10.200.187.26:50030/jobdetails.jsp?jobid=job_201112291016_0023
Kill Command = /opt/hadoop-0.20.205.0/libexec/../bin/hadoop job  -Dmapred.job.tracker=10.200.187.26:9001 -kill job_201112291016_0023
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2011-12-29 14:56:34,192 Stage-1 map = 0%,  reduce = 0%
2011-12-29 14:57:16,405 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201112291016_0023 with errors
Error during job, obtaining debugging information...
Examining task ID: task_201112291016_0023_m_000002 (and more) from job job_201112291016_0023
Exception in thread "Thread-248" java.lang.RuntimeException: Error while reading from task log url
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at java.net.Socket.connect(Socket.java:478)
at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:395)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:530)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:234)
at sun.net.www.http.HttpClient.New(HttpClient.java:307)
at sun.net.www.http.HttpClient.New(HttpClient.java:324)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
at java.net.URL.openStream(URL.java:1010)
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
... 3 more
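
The query above pipes every row of records through an external script: Hive writes the selected columns to the script's standard input as tab-separated text, one row per line, and reads the script's standard output back as the result columns. The post does not show is_good_quality.py itself, so the following is only a hypothetical sketch of such a script, written for Python 3; the sentinel value and the accepted quality codes are assumptions made for illustration. Running it locally on a few sample rows is one way to rule out the script before looking at the cluster.

```python
#!/usr/bin/env python3
# Hypothetical sketch of a TRANSFORM script. The real is_good_quality.py is
# not shown in the post, so the filtering rule below is an assumption.
import io
import sys

MISSING = "9999"              # assumed sentinel for a missing temperature
GOOD_QUALITY = set("01459")   # assumed set of acceptable quality codes


def keep_row(line):
    """Return 'year<TAB>temperature' if the row passes the filter, else None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return None  # malformed row: skip it rather than crash the task
    year, temperature, quality = fields
    if temperature != MISSING and quality in GOOD_QUALITY:
        return "%s\t%s" % (year, temperature)
    return None


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Filter tab-separated rows from stdin and write the kept rows to stdout."""
    for line in stdin:
        row = keep_row(line)
        if row is not None:
            stdout.write(row + "\n")


if __name__ == "__main__":
    # Local dry run on in-memory sample rows. When the script is run by
    # Hive, this block would just call main() so that it reads real stdin.
    sample = io.StringIO("1950\t22\t1\n1950\t9999\t1\n1949\t111\t2\n")
    main(stdin=sample)
```

With the sample rows above, only the first row is written out, because the second has the assumed missing-value sentinel and the third has a quality code outside the assumed set.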

The Hadoop log file (/opt/hadoop-0.20.205.0/logs/hadoop-root-jobtracker-chenyi3.log) showed the following error:

2011-12-29 14:57:06,865 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201112291016_0023_m_000000_3: java.lang.RuntimeException: Hive Runtime Error while closing operators

at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hit error while closing ..
at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:452)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
... 7 more

An answer I had found earlier suggested the following:

A few things I'd check for if I were debugging this:

1) Is the python file set to be executable (chmod +x file.py)

2) Make sure the python file is in the same place on all machines. Probably better - put the file in hdfs then you can use " using 'hdfs://path/to/file.py' " instead of a local path

3) Take a look at your job on the hadoop dashboard (http://master-node:9100), if you click on a failed task it will give you the actual java error and stack trace so you can see what actually went wrong with the execution

4) make sure python is installed on all the slave nodes! (I always overlook this one)

Hope that helps.....
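
Checks 1 and 4 from the list above can be run on each node with a few lines of Python. This is a minimal sketch for Python 3, not part of the original answer: the helper name check_script is hypothetical, and it only inspects the machine it runs on, so it would have to be run on every node separately.

```python
import os
import shutil


def check_script(path, interpreter="python"):
    """Return a list of problems found on this machine (empty if none)."""
    problems = []
    # Check 4: the interpreter must be installed and on PATH on this node.
    if shutil.which(interpreter) is None:
        problems.append("interpreter %r not found on PATH" % interpreter)
    # Checks 1 and 2: the script must be present at the same local path
    # and carry the execute bit.
    if not os.path.isfile(path):
        problems.append("script %r does not exist on this node" % path)
    elif not os.access(path, os.X_OK):
        problems.append("script %r is not executable (chmod +x)" % path)
    return problems


if __name__ == "__main__":
    # The path is the one used in the query above.
    for problem in check_script("/user/hive/script/is_good_quality.py"):
        print(problem)
```

An empty result only means that these local checks passed on that node. It says nothing about how the cluster nodes reach each other.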

The query still would not run, and for the moment I had no solution (if any reader knows the answer, please tell me)...

After several days of effort I finally solved the problem. The cause was the configuration of the /etc/hosts file; please see my other article 《》.
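
The post does not say which /etc/hosts entry was wrong. One commonly reported pitfall on Hadoop clusters is a machine's own hostname being bound to a loopback address, so that URLs built from that hostname cannot be reached from other nodes. The sketch below (Python 3) shows how such entries could be spotted. The function names are hypothetical and the sample file content is invented for illustration, reusing the hostname and IP address that appear in the logs above.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to the list of IPs it is bound to."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        ip, names = parts[0], parts[1:]
        for name in names:
            mapping.setdefault(name, []).append(ip)
    return mapping


def loopback_bound(mapping, ignore=("localhost", "localhost.localdomain")):
    """Return hostnames (other than localhost aliases) bound to a loopback IP."""
    return sorted(
        name
        for name, ips in mapping.items()
        if name not in ignore
        and any(ip.startswith("127.") or ip == "::1" for ip in ips)
    )


if __name__ == "__main__":
    # Invented sample content, not the author's actual file.
    sample = """
    127.0.0.1      localhost chenyi3   # hostname bound to loopback
    10.200.187.26  master
    """
    print(loopback_bound(parse_hosts(sample)))  # prints ['chenyi3']
```

To check a real node, the same functions could be applied to the text read from /etc/hosts on that machine.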

Reposted from: http://cvzob.baihongyu.com/
