I am trying to get a single-node development cluster of Hadoop set up on my Mac OS X 10.9.2 machine. I have tried various online tutorials, the most recent being this one. To summarize what I did:
1) $ brew install hadoop
This installed Hadoop 2.2.0 in /usr/local/Cellar/hadoop/2.2.0.
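Note that brew keeps only wrapper scripts at the top level and puts the actual distribution under libexec. The listing below is illustrative of the layout (based on the stock 2.2.0 tarball, not captured from my machine):

$ ls /usr/local/Cellar/hadoop/2.2.0/libexec
bin  etc  include  lib  libexec  sbin  share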
2) Configured environment variables. Here is the relevant part of my .bash_profile:
### JAVA_HOME
export JAVA_HOME="$(/usr/libexec/java_home)"

### HADOOP environment variables
export HADOOP_PREFIX="/usr/local/Cellar/hadoop/2.2.0"
export HADOOP_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/libexec/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX

export CLASSPATH=$CLASSPATH:.
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/libexec/share/hadoop/common/hadoop-common-2.2.0.jar
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/libexec/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar
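(A couple of sanity checks after editing the profile; these are my own additions, not from the tutorial:)

source ~/.bash_profile
echo $JAVA_HOME        # should print a JDK home path
ls $HADOOP_CONF_DIR    # should list core-site.xml, hdfs-site.xml, yarn-site.xml, etc.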
3) Configured HDFS (hdfs-site.xml):
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/Cellar/hadoop/2.2.0/hdfs/datanode</value>
    <description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/Cellar/hadoop/2.2.0/hdfs/namenode</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
  </property>
</configuration>
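A fresh brew install does not create the namenode/datanode directories referenced above, so they need to exist before formatting; a minimal sketch, using the same paths as the config:

mkdir -p /usr/local/Cellar/hadoop/2.2.0/hdfs/namenode
mkdir -p /usr/local/Cellar/hadoop/2.2.0/hdfs/datanode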
4) Configured core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost/</value>
    <description>NameNode URI</description>
  </property>
</configuration>
5) Configured yarn-site.xml:
<configuration>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
    <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
    <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
    <description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
    <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
    <description>Physical memory, in MB, to be made available to running containers.</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
    <description>Number of CPU cores that can be allocated for containers.</description>
  </property>
</configuration>
6) Then I tried to format the namenode with:
$HADOOP_PREFIX/bin/hdfs namenode -format
This gave me the error: Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode.
I looked at the hdfs script, and the line that runs it basically amounts to calling
$ java org.apache.hadoop.hdfs.server.namenode.NameNode
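For reference, bin/hdfs maps the subcommand to a class name and then execs java on it; paraphrased from memory of the 2.2.0 script, so the exact lines may differ:

# Paraphrased from $HADOOP_PREFIX/bin/hdfs in 2.2.0; not a verbatim copy.
if [ "$COMMAND" = "namenode" ] ; then
  CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
fi
# ... classpath assembly elided ...
exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"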
So, thinking this was a classpath issue, I tried a few things:
a) Adding hadoop-common-2.2.0.jar and hadoop-hdfs-2.2.0.jar to the classpath, as shown in the .bash_profile above.
b) Adding the line
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
to my .bash_profile, as suggested by this tutorial. (I later removed it, since it didn't seem to help at all.)
c) I also considered writing a shell script that adds every jar in $HADOOP_HOME/libexec/share/hadoop to $HADOOP_CLASSPATH, but that seemed unnecessary and prone to future problems. (A sketch of what I had in mind is below.)
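Something along these lines (untested; the find invocation is my own guess):

for jar in $(find "$HADOOP_HOME/libexec/share/hadoop" -name '*.jar'); do
  export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$jar"
done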
Any idea why I keep getting Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode? Thanks in advance.
Because of the way the brew package is laid out, you need to point HADOOP_PREFIX at the libexec folder inside the package:
export HADOOP_PREFIX="/usr/local/Cellar/hadoop/2.2.0/libexec"
Then remove libexec from the declaration of your conf directory:
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
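With those two lines changed, the wrapper scripts can assemble their classpath again. A quick way to verify (a sketch; hadoop classpath prints the classpath the scripts resolve):

source ~/.bash_profile
$HADOOP_PREFIX/bin/hadoop classpath        # should list jars under .../libexec/share/hadoop
$HADOOP_PREFIX/bin/hdfs namenode -format   # should now find the NameNode class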