I am trying to run a simple word count as a MapReduce job. Everything works fine when I run it locally (all the work is done on the name node). But when I try to run it on a cluster using YARN (adding mapreduce.framework.name=yarn to mapred-site.xml), the job hangs.
I came across a similar problem here: MapReduce job stuck in Accepted state
Job output:
*** START ***
15/12/25 17:52:50 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/25 17:52:51 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/12/25 17:52:51 INFO input.FileInputFormat: Total input paths to process : 5
15/12/25 17:52:52 INFO mapreduce.JobSubmitter: number of splits:5
15/12/25 17:52:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1451083949804_0001
15/12/25 17:52:53 INFO impl.YarnClientImpl: Submitted application application_1451083949804_0001
15/12/25 17:52:53 INFO mapreduce.Job: The url to track the job: http://hadoop-droplet:8088/proxy/application_1451083949804_0001/
15/12/25 17:52:53 INFO mapreduce.Job: Running job: job_1451083949804_0001
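While it hangs, the application can also be inspected from the shell (a minimal sketch, not part of the original question, using the application ID from the log above; standard Hadoop 2.x commands):

    # Poll the stuck application
    yarn application -status application_1451083949804_0001

    # List everything currently sitting in the ACCEPTED state
    yarn application -list -appStates ACCEPTED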
In mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.job.tracker</name>
  <value>localhost:54311</value>
</property>
yarn-site.xml:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
// The options I left commented out did not solve the problem
YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and registered with RM.
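This state means the ResourceManager has accepted the submission but no NodeManager has hosted the ApplicationMaster container yet. A quick first check is whether any NodeManagers are registered with the ResourceManager at all (standard Hadoop 2.x CLI, not part of the original question):

    # List every NodeManager known to the RM, whatever its state
    yarn node -list -all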
What could be the problem?
EDIT:
I tried this configuration (as commented above) on these machines: NameNode (8 GB RAM) + 2x DataNode (4 GB RAM). I get the same effect: the job hangs in the ACCEPTED state.
EDIT 2: Changed the configuration (thanks @Manjunath Ballur) to:
yarn-site.xml:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop-droplet</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>hadoop-droplet:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>hadoop-droplet:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>hadoop-droplet:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>hadoop-droplet:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>hadoop-droplet:8088</value>
</property>
<property>
  <description>Classpath for typical applications.</description>
  <name>yarn.application.classpath</name>
  <value>
    $HADOOP_CONF_DIR,
    $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
    $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
    $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
    $YARN_HOME/*,$YARN_HOME/lib/*
  </value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/1/yarn/local,/data/2/yarn/local,/data/3/yarn/local</value>
</property>
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/data/1/yarn/logs,/data/2/yarn/logs,/data/3/yarn/logs</value>
</property>
<property>
  <description>Where to aggregate logs</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/var/log/hadoop-yarn/apps</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>50</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>390</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>390</value>
</property>
In mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>50</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx40m</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>50</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>50</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx40m</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx40m</value>
</property>
Still not working. Additional info: I cannot see any nodes in the cluster preview (similar problem here: Slave nodes not in Yarn ResourceManager)
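For what it's worth, the memory settings above should not be the blocker: each NodeManager advertises 390 MB (yarn.nodemanager.resource.memory-mb) and the ApplicationMaster asks for only 50 MB (yarn.app.mapreduce.am.resource.mb), so if the job still sits in ACCEPTED the scheduler most likely sees no live nodes at all. The ResourceManager REST API can confirm this (a sketch, assuming the webapp address configured in yarn-site.xml above):

    # Cluster-wide view: activeNodes and availableMB should be non-zero
    curl -s http://hadoop-droplet:8088/ws/v1/cluster/metrics

    # Per-node view: every NodeManager the RM knows about, with state and health
    curl -s http://hadoop-droplet:8088/ws/v1/cluster/nodes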
You should check the status of the Node Managers in your cluster. If the NM nodes are short on disk space, the RM will mark them as "unhealthy", and unhealthy NMs cannot allocate new containers.
1) Check the unhealthy nodes: http://
If the "Health report" tab shows "local-dirs is bad", it means you need to free up some disk space on those nodes.
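The same check works from the command line on Hadoop 2.x, which also prints the health report per node (a sketch; <node-id> is a placeholder for the host:port printed by the listing):

    # List only the nodes the RM currently considers unhealthy
    yarn node -list -states UNHEALTHY

    # Full status for a single node, including its health report
    yarn node -status <node-id>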
2) Check the dfs.data.dir property in hdfs-site.xml. It points to the location on the local file system where the HDFS data is stored.
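A quick way to read that property on a node and then check the free space behind it (a sketch; on Hadoop 2.x, dfs.data.dir is a deprecated alias for dfs.datanode.data.dir, and getconf should resolve either key; the df path below is a placeholder):

    # Print the configured HDFS data directory (or directories)
    hdfs getconf -confKey dfs.data.dir

    # Then check free space on the file system holding that path, e.g.:
    df -h /path/printed/above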
3) Log in to those machines and measure the occupied space with the df -h and hadoop fs -du -h commands.
4) Check the Hadoop trash and remove it if it is what is blocking you:

hadoop fs -du -h /user/user_name/.Trash
hadoop fs -rm -r /user/user_name/.Trash/*
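After cleaning up, it is worth confirming that the DataNodes report free space again (a standard command, not from the original answer); once disk usage falls back under the NodeManager's disk health threshold, the nodes should rejoin the active list and the stuck job can finally get its AM container:

    # Capacity, used and remaining space per DataNode, as seen by HDFS
    hdfs dfsadmin -report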